Wednesday, May 02, 2007

%UpdateStats() -v- Optimizer Dynamic Sampling

My previous post about the changes to DDL models in PeopleTools 8.48 made me to think about whether %UpdateStats() PeopleCode macro is the best solution to managing statistics on working storage tables in Oracle.

Optimizer Dynamic Sampling was introduced in Oracle 9.0.2. as a solution to the same problem. When a query is compiled Oracle can collect some optimiser statistics based upon a small random sample of blocks for tables that do not have statistics and that meet certain other criteria depending upon the parameter OPTIMIZER_DYNAMIC_SAMPLING. In Oracle 10g the default value for this parameter changed from 1 to 2 and so dynamic sampling applies to ALL unanalysed tables.

Thus, it should be possible to resolve the problem of incorrect statistics on a working storage table without explicitly collecting statistics during an Application Engine program, and therefore without needing a code change to add %UpdateStats(). Instead, simply delete statistics from the table, and lock them. A subsequent GATHER_SCHEMA_STATS will skip any locked tables. When a query references the table it will dynamically sample statistics and use them in determining the execution plan.

However, there is one more problem to overcome. GATHER_TABLE_STATS will raise an exception on a table with locked statistics. If you want to use Dynamic Sampling on a table where %UpdateStats() is already used to update the statistics, the PeopleCode macro will raise an exception that will cause Application Engine programs to terminate with an error. The workaround is to encapsulate GATHER_TABLE_STATS in a procedure that handles the exception, and reference the procedure in the DDL model. It is not possible to put a PL/SQL block in DDL model.


CREATE OR REPLACE PACKAGE BODY wrapper AS
PROCEDURE ps_stats(p_ownname VARCHAR2, p_tabname VARCHAR2, p_estpct NUMBER) IS
 table_stats_locked EXCEPTION;
 PRAGMA EXCEPTION_INIT(table_stats_locked,-20005);
 l_temporary VARCHAR2(1 CHAR);
 l_force BOOLEAN := TRUE;
BEGIN
 BEGIN
  SELECT temporary
  INTO   l_temporary
  FROM   all_tables
  WHERE  owner = p_ownname
  AND    table_name = p_tabname
  ;
 EXCEPTION WHEN no_data_found THEN
  RAISE_APPLICATION_ERROR(-20001,'Table '||p_ownname||'.'||p_tabname||' does not exist');
 END;

 IF l_temporary = 'Y' THEN
  l_force := FALSE; --don't force stats collect on GTTs
 ELSE
  l_force := TRUE; --don't force stats collect on GTTs
 END IF;

 IF p_estpct = 0 THEN
  sys.dbms_stats.gather_table_stats
  (ownname=>p_ownname
  ,tabname=>p_tabname
  ,estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE
  ,method_opt=>'FOR ALL COLUMNS SIZE AUTO'
  ,cascade=>TRUE
  ,force=>l_force
  );
 ELSE
  sys.dbms_stats.gather_table_stats
  (ownname=>p_ownname
  ,tabname=>p_tabname
  ,estimate_percent=>p_estpct
  ,method_opt=>'FOR ALL COLUMNS SIZE AUTO'
  ,cascade=>TRUE
  ,force=>l_force
  );
 END IF;
EXCEPTION
 WHEN table_stats_locked THEN NULL;
END ps_stats;
END wrapper;
/

At this time I have no data to determine which method is more likely to produce the better execution plan. However, when performance problems occur in production they are instinctively routed to the DBA, who is likely to have difficulty introducing a code change at short notice. Dynamic Sampling has some clear advantages.

Update 12.2.2009: Since writing this note I have come across some scenarios on Oracle 10.2.0.3 where the optimiser does not chose the best execution plan with dynamically sampled statistics, even with OPTIMIZER_DYNAMIC_SAMPLING set to the highest level, and I have had to explicitly collect statistics on working storage tables.

I have adjusted the wrapper procedure to call dbms_stats with the force option on permanent tables (so it collects statistics on tables whose statistics are locked and doesn't raise an exception, although it does not use the force option on Global Temporary Tables.

I still recommend that statistics should still be locked on tables related to PeopleSoft temporary records. The rationale for this is that you don't want any schema- or database-wide process that gathers statistics to process any working storage tables. Either the data in the table at the time the schema-wide statistics are gathered is not the same as when process runs and so you still want to use dynamic sampling, or else the process that populates the table has been coded to explicitly call %UpdateStats, and the new version of the wrapper will update these statistics.

2 comments :

Bruce Hubbell said...

OK. I understand the package that you have created.... and you can't put a PL/SQL block in DDL model
but what do we put into the PSDDLMODEL to call this new procedure ? We are 10G and 8.47.03 of Tools.

David Kurtz said...

You can't put an anonymous PL/SQL block in the DDL model, but you can call a PL/SQL procedure - either one supplied by Oracle, or written by yourself.
In PT8.47 and earlier, you cannot call DBMS_STATS.GATHER_TABLE_STATS directly because [TBNAME] contains both the schema and table name and must be separated. In PT8.48 the schema name moves to [DBNAME], and [TBNAME] contains only the table name. A different version of the wrapper is required. You can download wrapper.sql as a part of the code examples for PeopleSoft for the Oracle DBA (if you have the book see page 122).
The wrapper in the book does not handle the exception raised when analysing a table with locked statistics. I'll leave that as a exercise.