Friday, October 06, 2006

SQL Tips for Developers: Performance Improvements with the APPEND hint

In all PeopleSoft products, it is common to find batch programs populating large working storage or reporting tables. When using Oracle RDBMS, the append hint can be used to reduce the overhead and thus improve the performance of the insert operation. Thus

INSERT /*+ APPEND*/ INTO ... SELECT ...

It works by permitting the process to insert the data directly into new data blocks at the end of the object, rather than using space currently allocated to the object. So the new data is appended to the table. This is much faster because it permits Oracle to by-pass some of the processes that occur for each row inserted, such as index maintenance which only occurs when the insert is committed. Significantly, the hint also reduces the volume of redo written.

Every time a piece of data is changed in an Oracle database, the new data is written to the redo logs. This information can then be used to reapply changes to a database during media recovery, or it can be applied to a standby database as a part of a disaster recover solution (Oracle call this technique 'Physical DataGuard'). Every change, even to working storage tables and including indexes, generates redo information.

The reduction in redo can be very significant in systems using DataGuard across a WAN. Bandwidth constraints can cause a backlog of redo information waiting to be transmitted. In most systems the majority of redo is generated by overnight batch processes and it can take time during the day for that backlog to clear and for the DR system to catch up with production. Hence, reducing redo is important.

Therefore the APPEND hint should be used in insert statements when both of the following criteria are met:
  1. The table has been created or truncated prior to the insert. It is not sufficient to delete all the rows because the empty blocks remain allocated to the table.
  2. The table should be populated in a single monolithic insert statement. Multiple statements should be combined using the UNION ALL operator between the queries. Otherwise most of the benefit will be lost during multiple index maintenance phases.
It is easy to demonstrate this with some a simple test. I will create a simple table with a unique index, just as PeopleSoft would. Then I will truncate it and populate it, first without the APPEND hint, and then with it.

CREATE TABLE t
(a NUMBER
,b VARCHAR2(200));

CREATE UNIQUE INDEX t ON t(a);

TRUNCATE TABLE t;
INSERT INTO t
SELECT ROWNUM, RPAD('42',200,'.')
FROM dba_objects;

TRUNCATE TABLE t;
INSERT /*+ APPEND*/ INTO t
SELECT ROWNUM, RPAD('42',200,'.')
FROM dba_objects;

The session statistics show the savings in a number of areas.

Session Stats               No Hint Append Hint
------------------------ ---------- -----------
db block changes 202,926 3,192
db block gets 159,148 4,796
db block gets direct 1,413
db block gets from cache 159,148 3,383
DB time 1,308 828
redo entries 104,842 1,853
redo size 32,391,936 2,838,148

Fewer blocks are changed because they are not written and then rewritten as each row is inserted. Table blocks are only written once during the insert and the index is only maintained when the insert is committed.

Other recommendations:
  • When using the APPEND hint, the insert should be committed immediately. It must be committed before the table can be referenced, otherwise an error will be generated.

  • The data in the table will have been completely replaced by this point, so it is almost certainly appropriate to refresh the optimiser statistics. In Applicaition Engine use the %UpdateStats MetaSQL.

  • The APPEND hint cannot be used in conjuction with the %InsertSelect MetaSQL because the hint MUST be placed between the 'insert' and 'into' keywords. NB: The statement will not fail if you get the hint wrong.

No comments :