Thursday, June 25, 2009

Controlling How %UpdateStats Collects Optimizer Statistics

I have written a number of entries on this blog about updating database statistics on tables during Application Engine processes.
I proposed a PL/SQL package (wrapper.sql) called from the DDL model to intercept the call from the %UpdateStats macro in Application Engine. By default it
  • Gathers statistics on regular tables.
  • Refreshed only stale statistics on partitioned tables.
  • Does not gather statistics on Global Temporary Records.
I have now published an enhanced version: wrapper848meta.sql.
  • A table PS_GFC_STATS_OVRD holds meta-data to override the default behaviour of the script for certain records. The meta-data can also specify the size of the sample, and the options to control the collection of histograms.
  • If a private instance of a Temporary Records is a Global Temporary Tables, the wrapper may still collect statistics (normally this would be suppressed because of the risk of one session using the statistics collected by another session, but this will not happen for these tables).
Now, it is possible to specify the few tables where statistics must still be explicitly gathered, or whether to do this only if the current statistics on the table are stale. The DBA is probably the person best placed to decide whether and how to collect statistics on which tables, and these decisions can be implemented with the meta-data, but without code change.

Thus, it is possible to
  • Reduce the number of calls to dbms_stats,
  • to reduce the overhead of the remaining calls
  • and at least preserve, if not improve performance of batch processes without making any code changes.

Oracle Statistics History Retention in PeopleSoft

I have been working on a system where many Application Engine programs are running throughout the day, and are frequently collecting Optimizer statistics with the %UpdateStats macro on many working storage tables. Concurrent calls to dbms_stats are typical.

There are two new behaviours in Oracle 10g RDBMS that can in extreme cases, in combination with a system that calls dbms_stats very frequently, create a significant performance overhead.

From Oracle 10g, histograms may, by default, be collected automatically. That means that rows are concurrently deleted from and inserted into histgrm$ and hist_head$, leading to contention and consistent read.
  • Also from Oracle 10g, every time you collect statistics on a table the old statistics are retained in the SYS.WRI$_OPTSTAT%HISTORY tables. If histograms have previously been collected, these are also copied. DBMS_STATS has the additional overhead of writing this history. I found in excess of 10,000 versions of previous statistics for some tables, because the batch processes have updated statistics on working storage tables that many times.
  • dbms_stats also appears to be responsible for purging history older than the retention limit. The default retention period is 31 days. I have seen concurrent calls to dbms_stats blocked on row level locks on the statistics history tables. For me, this occurred 31 days after the system went live on a significantly increased volume.
  • SELECT dbms_stats.get_stats_history_retention FROM dual;
    GET_STATS_HISTORY_RETENTION 
    --------------------------- 
                             31
Statistics history was designed to work in conjunction with schema wide statistics jobs that only refreshed stale statistics. There is an option on gather_schema_stats to collect only statistics on tables where the current statistics are stale. However, there is no such option on gather_table_stats. If you have decided to call this procedure for a particular table, then it is assumed you know you need to refresh the statistics. However, by calling dbms_stats from a batch program you can end up calling it much more frequently than is really necessary.

Recommendations
  • Disable statistics history by using dbms_stats.alter_stats_history_retention to set the retention period to zero. Unfortunately this can only be set at database level. The statistics history is there in case you want to revert to a previous version of the statistics should a new set of statistics produce a problem, but it is only used rarely, and I think this is a necessary sacrifice.
  • EXECUTE dbms_stats.alter_stats_history_retention(retention=>0);
  • Use Oracle Optimizer Dynamic Sampling. However, I suggest increasing the level from the default of 2 to 4 to increase the situations in which it is used.
  • Introduce the new version of the PL/SQL wrapper package for dbms_stats so that you can specify the records for which statistics will be explicitly collected, and whether histograms are to be collect. Thus you can reduce the number of calls to dbms_stats.
If you have allowed the statistics history to grow before you disable history retention, then you might like to read John Hallas' posting on Purging statistics from the SYSAUX tablespace.