Embedded SQL Programming Guide

Using Buffered Inserts

A buffered insert is an insert statement that takes advantage of table queues to buffer the rows being inserted, thereby gaining a significant performance improvement. To use a buffered insert, an application must be prepared or bound with the INSERT BUF option.

Buffered inserts can substantially improve performance in applications that perform many inserts. They are typically used where a single INSERT statement (and no other database modification statement) executes within a loop to insert many rows, and where the source of the data is a VALUES clause in the INSERT statement. The INSERT statement usually references one or more host variables whose values change during successive executions of the loop. The VALUES clause can specify a single row or multiple rows.

Typical decision support applications require the loading and periodic insertion of new data. This data could be hundreds of thousands of rows. You can prepare and bind applications to use buffered inserts when loading tables.

To cause an application to use buffered inserts, use the PREP command to process the application program source file, or use the BIND command on the resulting bind file. In both situations, you must specify the INSERT BUF option. For more information about binding an application, see "Binding". For more information about preparing an application, see "Creating and Preparing the Source Files".
Note: Buffered inserts cause the following to occur:

The database manager opens one 4 KB buffer for each node on which the table resides.
The INSERT statement with the VALUES clause issued by the application causes the row (or rows) to be placed into the appropriate buffer (or buffers).
The database manager returns control to the application.
The rows in the buffer are sent to the partition when the buffer becomes full, or an event occurs that causes the rows in a partially filled buffer to be sent. A partially filled buffer is flushed when one of the following occurs:

The application issues a COMMIT (explicitly, or implicitly through application termination) or a ROLLBACK.
The application issues another statement that causes a savepoint to be taken. OPEN, FETCH, and CLOSE cursor statements do not cause a savepoint to be taken, nor do they close an open buffered insert.
The following SQL statements will close an open buffered insert:

BEGIN COMPOUND SQL
COMMIT
DDL
DELETE
END COMPOUND SQL
EXECUTE IMMEDIATE
GRANT
INSERT
PREPARE of the same dynamic statement (by name) doing buffered inserts
REDISTRIBUTE NODEGROUP
REORG
REVOKE
ROLLBACK
RUNSTATS
SELECT INTO
UPDATE
Execution of any other statement, but not another (looping) execution of the buffered INSERT
End of application

The following APIs will close an open buffered insert:

BIND (API)
REBIND (API)
RUNSTATS (API)
REORG (API)
REDISTRIBUTE (API)

In any of these situations where another statement closes the buffered insert, the coordinator node waits until every node receives the buffers and the rows are inserted. It then executes the other statement (the one closing the buffered insert), provided all the rows were successfully inserted. See "Considerations for Using Buffered Inserts" for additional details.
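
The buffering and flushing behavior described above can be modeled with a small simulation. The following Python sketch is not DB2 code and calls no DB2 API. The class name BufferedInsertModel, the modulo partitioning function, and the fixed row size are all assumptions made for illustration. It only shows when buffers would be sent: when one has no room for another row, or when a closing statement arrives.

```python
BUFFER_BYTES = 4096  # one 4 KB buffer per partition, as described above


class BufferedInsertModel:
    """Toy model of a buffered insert; not a DB2 interface."""

    def __init__(self, partitions, row_bytes):
        self.row_bytes = row_bytes                    # assumed fixed row size
        self.buffers = {p: [] for p in range(partitions)}
        self.sent = []                                # (partition, rows) messages

    def insert(self, key):
        # Assumption: modulo stands in for the real hashing scheme.
        part = key % len(self.buffers)
        buf = self.buffers[part]
        buf.append(key)
        # Send the buffer once it has no room for another row.
        if (len(buf) + 1) * self.row_bytes > BUFFER_BYTES:
            self._send(part)

    def close(self):
        # A closing statement (COMMIT, UPDATE, ...) flushes partial buffers.
        for part, buf in self.buffers.items():
            if buf:
                self._send(part)

    def _send(self, part):
        self.sent.append((part, list(self.buffers[part])))
        self.buffers[part].clear()


model = BufferedInsertModel(partitions=2, row_bytes=1024)
for key in range(10):
    model.insert(key)
full_sends = len(model.sent)   # buffers sent because they filled up
model.close()
total_sends = len(model.sent)  # includes the partial buffers flushed on close
print(full_sends, total_sends)
```

In this model, the application never waits on insert; only close has to account for every outstanding buffer, which mirrors the coordinator node waiting for all partitions before it executes the closing statement.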

The standard interface in a partitioned environment (without a buffered insert) loads one row at a time, performing the following steps (assuming that the application is running locally on one of the partitions):

The coordinator node passes the row to the database manager that is on the same node.
The database manager uses indirect hashing to determine the partition where the row should be placed:
1. The target partition receives the row.
2. The target partition inserts the row locally.
3. The target partition sends a response to the coordinator node.
The coordinator node receives the response from the target partition.
The coordinator node gives the response to the application.
The insertion is not committed until the application issues a COMMIT.

Any INSERT statement containing the VALUES clause is a candidate for a buffered insert, regardless of the number of rows or the type of elements in the rows. That is, the elements can be constants, special registers, host variables, expressions, functions and so on.

For a given INSERT statement with the VALUES clause, the DB2 SQL compiler may not buffer the insert based on semantic, performance, or implementation considerations. If you prepare or bind your application with the INSERT BUF option, ensure that it is not dependent on a buffered insert. This means:

Errors may be reported asynchronously for buffered inserts, or synchronously for regular inserts. If reported asynchronously, an insert error may be reported on a subsequent insert within the buffer, or on the other statement that closes the buffer. The statement that reports the error is not executed. For example, consider using a COMMIT statement to close a buffered insert loop. The commit reports an SQLCODE -803 (SQLSTATE 23505) due to a duplicate key from an earlier insert. In this scenario, the commit is not executed. If you want your application to commit anyway (for example, to commit updates that were performed before it entered the buffered insert loop), you must reissue the COMMIT statement.
Rows inserted may be immediately visible through a SELECT statement using a cursor without a buffered insert. With a buffered insert, the rows will not be immediately visible. Do not write your application to depend on these cursor-selected rows if you precompile or bind it with the INSERT BUF option.

Buffered inserts result in the following performance advantages:

Only one message is sent from the target partition to the coordinator node for each buffer received by the target partition.
A buffer can contain a large number of rows, especially if the rows are small.
Parallel processing occurs as insertions are being done across partitions while the coordinator node is receiving new rows.
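
To give a rough sense of the first two advantages, the following Python sketch counts the response messages sent back to the coordinator node with and without buffering. It is arithmetic only, not a measurement. It assumes that rows are spread evenly across partitions and that rows pack into the 4 KB buffer with no per-row overhead; neither assumption comes from DB2 itself. The function name response_messages is hypothetical.

```python
import math

BUFFER_BYTES = 4096  # 4 KB buffer per partition


def response_messages(rows, row_bytes, partitions, buffered):
    """Estimate responses sent from target partitions to the coordinator."""
    if not buffered:
        return rows  # one response per inserted row
    rows_per_buffer = BUFFER_BYTES // row_bytes
    rows_per_partition = math.ceil(rows / partitions)
    # One response per buffer received by each partition.
    return partitions * math.ceil(rows_per_partition / rows_per_buffer)


unbuffered = response_messages(100_000, 40, 4, buffered=False)
buffered = response_messages(100_000, 40, 4, buffered=True)
print(unbuffered, buffered)
```

Under these assumptions, smaller rows mean more rows per buffer and therefore fewer responses, which is why the gain is largest for narrow tables.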

An application that is bound with INSERT BUF should be written so that the same INSERT statement with VALUES clause is iterated repeatedly before any statement or API that closes a buffered insert is issued.
Note: You should do periodic commits to prevent the buffered inserts from filling the transaction log.

Considerations for Using Buffered Inserts

Buffered inserts exhibit behaviors that can affect an application program. These behaviors are caused by the asynchronous nature of buffered inserts. Based on the values of the row's partitioning key, each inserted row is placed in a buffer destined for the correct partition. These buffers are sent to their destination partitions as they become full, or when an event causes them to be flushed. You must be aware of the following behaviors, and account for them when designing and coding the application:

Certain error conditions for inserted rows are not reported when the INSERT statement is executed. They are reported later, when the first statement other than the INSERT (or INSERT to a different table) is executed, such as DELETE, UPDATE, COMMIT, or ROLLBACK. Any statement or API that closes the buffered insert statement can see the error report. Also, any invocation of the insert itself may see an error of a previously inserted row. Moreover, if a buffered insert error is reported by another statement, such as UPDATE or COMMIT, DB2 will not attempt to execute that statement.
An error detected during the insertion of a group of rows causes all the rows of that group to be backed out. A group of rows is defined as all the rows inserted through executions of a buffered insert statement:
- From the beginning of the unit of work,
- Since the statement was prepared (if it is dynamic), or
- Since the previous execution of another updating statement. For a list of statements that close (or flush) a buffered insert, see "Using Buffered Inserts".
An inserted row may not be immediately visible to SELECT statements issued after the INSERT by the same application program, if the SELECT is executed using a cursor.

A buffered INSERT statement is either open or closed. The first invocation of the statement opens the buffered INSERT, the row is added to the appropriate buffer, and control is returned to the application. Subsequent invocations add rows to the buffer, leaving the statement open. While the statement is open, buffers may be sent to their destination partitions, where the rows are inserted into the target table's partition. If any statement or API that closes a buffered insert is invoked while a buffered INSERT statement is open (including invocation of a different buffered INSERT statement), or if a PREPARE statement is issued against an open buffered INSERT statement, the open statement is closed before the new request is processed. If the buffered INSERT statement is closed, the remaining buffers are flushed. The rows are then sent to the target partitions and inserted. Only after all the buffers are sent and all the rows are inserted does the new request begin processing.

If errors are detected during the closing of the INSERT statement, the SQLCA for the new request is filled in to describe the error, and the new request is not executed. Also, the entire group of rows inserted through the buffered INSERT statement since it was opened is removed from the database. The state of the application will be as defined for the particular error detected. For example:

If the error is a deadlock, the transaction is rolled back (including any changes made before the buffered insert section was opened).
If the error is a unique key violation, the state of the database is the same as before the statement was opened. The transaction remains active, and any changes made before the statement was opened are not affected.
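
The difference between these two outcomes can be written as a small function. This Python sketch is a model of the rules just stated, not DB2 behavior captured from a real system. The function name rows_after_close_error and the string error kinds are hypothetical.

```python
def rows_after_close_error(changes, group_start, error):
    """Return the changes still pending in the transaction after the error.

    changes: every change made so far in the unit of work, in order.
    group_start: index of the first row added by the open buffered insert.
    """
    if error == "deadlock":
        return []                      # whole transaction is rolled back
    if error == "unique_key":
        return changes[:group_start]   # only the buffered group is removed
    raise ValueError("error kind not covered by this sketch")


work = ["update t1", "row 1", "row 2", "row 3"]
after_deadlock = rows_after_close_error(work, 1, "deadlock")
after_duplicate = rows_after_close_error(work, 1, "unique_key")
print(after_deadlock, after_duplicate)
```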

For example, consider the following application that is bound with the buffered insert option:

    EXEC SQL UPDATE t1 SET COMMENT='about to start inserts';
    DO UNTIL EOF OR SQLCODE < 0;
      READ VALUE OF hv1 FROM A FILE;
      EXEC SQL INSERT INTO t2 VALUES (:hv1);
      IF 1000 INSERTS DONE, THEN DO
         EXEC SQL INSERT INTO t3 VALUES ('another 1000 done');
         RESET COUNTER;
      END;
    END;
    EXEC SQL COMMIT;

Suppose the file contains 8 000 values, but value 3 258 is not legal (for example, it causes a unique key violation). After every 1 000 inserts, another SQL statement is executed, which closes the INSERT INTO t2 statement. During the fourth group of 1 000 inserts, the error for value 3 258 will be detected. It may be detected after the insertion of more values (not necessarily the next one). In this situation, an error code is returned for the INSERT INTO t2 statement.

The error may also be detected when an insertion is attempted on table t3, which closes the INSERT INTO t2 statement. In this situation, the error code is returned for the INSERT INTO t3 statement, even though the error applies to table t2.

Suppose, instead, that you have 3 900 rows to insert. Before being told of the error on row number 3 258, the application may exit the loop and attempt to issue a COMMIT. The unique-key-violation return code will be issued for the COMMIT statement, and the COMMIT will not be performed. If the application wants to commit the 3 000 rows that are in the database thus far (the last execution of EXEC SQL INSERT INTO t3 ... ends the savepoint for those 3 000 rows), then the COMMIT must be reissued. Similar considerations apply to ROLLBACK as well.
Note: When using buffered inserts, you should carefully monitor the SQLCODE values returned to avoid leaving the table in an indeterminate state. For example, if you remove the SQLCODE < 0 clause from the DO UNTIL statement in the above example, the table could end up containing an indeterminate number of rows.
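
The 3 900-row scenario can be replayed against a toy model to see why the COMMIT has to be reissued. The Python sketch below is a simulation, not embedded SQL, and calls no DB2 API. The ToyDatabase class and the load function are hypothetical, and the model makes one simplifying assumption: an error surfaces only when the buffered insert is closed, whereas a real system may also report it on a later execution of the same INSERT.

```python
DUPLICATE_KEY = -803  # SQLCODE for a duplicate key, as cited earlier


class ToyDatabase:
    """Simplified model: errors surface only when the buffered insert closes."""

    def __init__(self):
        self.t2 = set()        # rows already inserted into t2
        self.group = []        # rows buffered since the insert was opened
        self.t3_rows = 0
        self.committed = False

    def insert_t2(self, value):
        self.group.append(value)      # buffered; control returns at once
        return 0

    def _close(self):
        # Flush the open group; a duplicate backs out the whole group.
        group, self.group = self.group, []
        if len(set(group)) < len(group) or self.t2.intersection(group):
            return DUPLICATE_KEY
        self.t2.update(group)
        return 0

    def insert_t3(self):
        sqlcode = self._close()
        if sqlcode == 0:
            self.t3_rows += 1         # runs only if the close succeeded
        return sqlcode

    def commit(self):
        sqlcode = self._close()
        if sqlcode == 0:
            self.committed = True     # not performed if an error is reported
        return sqlcode


def load(db, values):
    for count, value in enumerate(values, start=1):
        sqlcode = db.insert_t2(value)
        if sqlcode == 0 and count % 1000 == 0:
            sqlcode = db.insert_t3()
        if sqlcode < 0:
            break                     # the SQLCODE < 0 exit condition
    sqlcode = db.commit()
    if sqlcode < 0:
        sqlcode = db.commit()         # reissue: the first COMMIT did not run
    return sqlcode


values = list(range(3900))
values[3257] = 0                      # value 3 258 duplicates an earlier key
db = ToyDatabase()
final = load(db, values)
print(final, len(db.t2), db.committed)
```

In this run the first COMMIT reports the duplicate key and the last 900 rows are backed out; the reissued COMMIT then makes the first 3 000 rows permanent.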

Restrictions on Using Buffered Inserts

The following restrictions apply:

For an application to take advantage of the buffered inserts, one of the following must be true:
- The application must be prepared with the PREP command or bound with the BIND command, with the INSERT BUF option specified.
- The application must be processed using the BIND or the PREP API with the SQL_INSERT_BUF option.
If the INSERT statement with VALUES clause includes long fields or LOBs in the explicit or implicit column list, the INSERT BUF option is ignored for that statement and a normal insert section is used instead of a buffered insert. This is not an error condition, and no error or warning message is issued.
INSERT with fullselect is not affected by INSERT BUF. A buffered INSERT does not improve the performance of this type of INSERT.
Buffered inserts can be used only in applications, and not through CLP-issued inserts, as these are done through the EXECUTE IMMEDIATE statement.

An application prepared or bound in this way can then be run from any supported client platform.
