My previous post about the changes to DDL models in PeopleTools 8.48 made me think about whether the %UpdateStats() PeopleCode macro is the best way to manage statistics on working storage tables in Oracle.
Optimizer Dynamic Sampling was introduced in Oracle 9i Release 2 as a solution to the same problem. When a query is compiled, Oracle can collect some optimiser statistics based upon a small random sample of blocks from tables that do not have statistics and that meet certain other criteria, depending upon the setting of the parameter OPTIMIZER_DYNAMIC_SAMPLING. In Oracle 10g the default value of this parameter changed from 1 to 2, so dynamic sampling now applies to ALL unanalysed tables.
Thus, it should be possible to resolve the problem of incorrect statistics on a working storage table without explicitly collecting statistics during an Application Engine program, and therefore without needing a code change to add %UpdateStats(). Instead, simply delete the statistics from the table and lock them. A subsequent GATHER_SCHEMA_STATS will skip any table with locked statistics, and when a query references the table Oracle will dynamically sample statistics and use them to determine the execution plan.
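For example, the statistics might be deleted and locked like this in SQL*Plus (the schema SYSADM and table PS_GFC_WRK_TAO are only illustrative):
exec dbms_stats.delete_table_stats(ownname=>'SYSADM', tabname=>'PS_GFC_WRK_TAO');
exec dbms_stats.lock_table_stats(ownname=>'SYSADM', tabname=>'PS_GFC_WRK_TAO');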
However, there is one more problem to overcome. GATHER_TABLE_STATS raises an exception (ORA-20005) on a table whose statistics are locked. So, if you want to use Dynamic Sampling on a table for which %UpdateStats() is already coded, the PeopleCode macro will raise an exception that causes the Application Engine program to terminate with an error. The workaround is to encapsulate GATHER_TABLE_STATS in a procedure that handles the exception, and to reference that procedure in the DDL model; it is not possible to put a PL/SQL block in a DDL model.
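A minimal sketch of such a wrapper, under the assumption that it mirrors the parameters of the delivered low DDL model (the procedure name wrapper_table_stats is hypothetical):
CREATE OR REPLACE PROCEDURE wrapper_table_stats
(p_ownname IN VARCHAR2
,p_tabname IN VARCHAR2
) IS
  e_stats_locked EXCEPTION;
  PRAGMA EXCEPTION_INIT(e_stats_locked, -20005); -- ORA-20005: object statistics are locked
BEGIN
  dbms_stats.gather_table_stats
  (ownname          => p_ownname
  ,tabname          => p_tabname
  ,estimate_percent => dbms_stats.auto_sample_size
  ,method_opt       => 'FOR ALL INDEXED COLUMNS SIZE 1' -- as in the delivered DDL model
  ,cascade          => TRUE);
EXCEPTION
  WHEN e_stats_locked THEN
    NULL; -- statistics are locked: leave them for Optimizer Dynamic Sampling
END wrapper_table_stats;
/
The DDL model could then reference this procedure in place of DBMS_STATS.GATHER_TABLE_STATS.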
At this time I have no data to determine which method is more likely to produce the better execution plan. However, when performance problems occur in production they are instinctively routed to the DBA, who is likely to have difficulty introducing a code change at short notice. Dynamic Sampling has some clear advantages.
Update 12.2.2009: Since writing this note I have come across some scenarios on Oracle 10.2.0.3 where the optimiser does not choose the best execution plan with dynamically sampled statistics, even with OPTIMIZER_DYNAMIC_SAMPLING set to the highest level, and I have had to explicitly collect statistics on working storage tables.
I have adjusted the wrapper procedure to call DBMS_STATS with the FORCE option on permanent tables, so that it collects statistics on tables whose statistics are locked without raising an exception. It does not, however, use the FORCE option on Global Temporary Tables.
I still recommend locking the statistics on tables related to PeopleSoft temporary records. The rationale is that you do not want any schema- or database-wide statistics gathering process to touch working storage tables. Either the data in the table when the schema-wide statistics are gathered is not the same as when the process runs, in which case you still want dynamic sampling, or else the process that populates the table has been coded to call %UpdateStats explicitly, in which case the new version of the wrapper will update those statistics anyway.
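A hedged sketch of how the adjusted wrapper might decide when to pass FORCE; this is not the author's published code, and it assumes that permanent tables and Global Temporary Tables can be distinguished via the TEMPORARY column of ALL_TABLES (schema and table names are examples):
DECLARE
  l_temporary VARCHAR2(1);
BEGIN
  SELECT temporary INTO l_temporary
  FROM   all_tables
  WHERE  owner = 'SYSADM'                -- example schema
  AND    table_name = 'PS_GFC_WRK_TAO';  -- example working storage table
  dbms_stats.gather_table_stats
  (ownname => 'SYSADM'
  ,tabname => 'PS_GFC_WRK_TAO'
  ,force   => (l_temporary = 'N'));      -- use FORCE only on permanent tables, not GTTs
END;
/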
Wednesday, May 02, 2007
Changes to DDL Models in PeopleTools 8.48
A recent thread on Oracle-L led me to look at how Oracle has changed the way that PeopleTools 8.48 collects Oracle Cost-Based Optimiser statistics. It now uses DBMS_STATS instead of the ANALYZE command. This has also caused me to reconsider some options for managing statistics for a PeopleSoft system.
Application Engine programs can collect Cost-Based Optimiser statistics on specific tables by calling the %UpdateStats([,high/low]); PeopleCode macro. This uses one of two DDL models depending on whether the high or low option is specified. However, these DDL models only exist for Oracle and DB2/MVS. %UpdateStats() has no function on other platforms.
This was PeopleSoft’s platform-generic solution (before their takeover by Oracle, and before Dynamic Sampling was available in the Oracle database) to the very real problem that occurs when statistics on a working storage or reporting table, which is emptied, repopulated and used during a batch process, do not accurately represent the content of the table and hence cause the optimiser to choose an inappropriate execution plan. PeopleSoft provided a method of refreshing the statistics during the process, and introduced new DDL models because each database platform would have its own command. However, this approach relies upon developers adding the %UpdateStats() macro wherever the data has changed sufficiently to require refreshing the statistics. Unfortunately, developers are not always best placed to make that decision, and there are still plenty of places in delivered PeopleSoft code where this macro could usefully be added.
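For illustration, a hedged example of how the macro might appear in an Application Engine SQL step (the record name GFC_WRK_TAO is purely hypothetical):
%UpdateStats(GFC_WRK_TAO, LOW)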
Up to PeopleTools 8.47, PeopleSoft delivered two DDL models that used the ANALYZE command. The %UpdateStats(,high) ran a full compute of the table:
ANALYZE TABLE [TBNAME] COMPUTE STATISTICS;
While %UpdateStats(,low) estimated statistics with the default sample size:
ANALYZE TABLE [TBNAME] ESTIMATE STATISTICS;
From PeopleTools 8.48, these DDL models now call the Oracle supplied PL/SQL package DBMS_STATS. The high option still performs a full compute of statistics.
DBMS_STATS.GATHER_TABLE_STATS (ownname=> [DBNAME], tabname=>[TBNAME], estimate_percent=>1, method_opt=> 'FOR ALL COLUMNS SIZE 1',cascade=>TRUE);
While the low option estimates statistics with the sample size determined by the constant DBMS_STATS.AUTO_SAMPLE_SIZE.
DBMS_STATS.GATHER_TABLE_STATS (ownname=> [DBNAME], tabname=>[TBNAME], estimate_percent=> dbms_stats.auto_sample_size, method_opt=> 'FOR ALL INDEXED COLUMNS SIZE 1',cascade=>TRUE);
So it would appear that PeopleSoft now follow the recommendations that Oracle have been making since version 8i of the database to use DBMS_STATS instead of the ANALYZE command. This is certainly a step in the right direction. It also makes good sense to use the automatic sample size. ESTIMATE_PERCENT defaults to DBMS_STATS.AUTO_SAMPLE_SIZE from Oracle 10g. Previously it was NULL, which caused a full compute.
However, there is a problem.
PeopleSoft have chosen to specify METHOD_OPT as FOR ALL INDEXED COLUMNS SIZE 1. If you have created histograms on any of your columns, or generated them automatically with DBMS_STATS, this command will remove the histograms from indexed columns, while leaving any histograms on unindexed columns unchanged, and potentially out of date.
The default in Oracle 9i is FOR ALL COLUMNS SIZE 1. This removes all histograms on all columns, although this is at least the same behaviour as the ANALYZE command.
In Oracle 10g, METHOD_OPT defaults to FOR ALL COLUMNS SIZE AUTO. The Oracle manual states that the database ‘…determines the columns to collect histograms based on data distribution and the workload of the columns’. So, Oracle may remove histograms if it judges that they are not necessary.
I have no hesitation in recommending that the METHOD_OPT value in the delivered DDL models should be changed. I would suggest using FOR ALL COLUMNS SIZE REPEAT.
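For illustration only (this is my suggestion, not a delivered model), the low DDL model might then become:
DBMS_STATS.GATHER_TABLE_STATS (ownname=> [DBNAME], tabname=>[TBNAME], estimate_percent=> dbms_stats.auto_sample_size, method_opt=> 'FOR ALL COLUMNS SIZE REPEAT',cascade=>TRUE);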
Addendum 12.6.2007: I have noticed that the DDL models changed again in PT8.48.07. The full compute is now for ALL COLUMNS, but the estimate is still for ALL INDEXED COLUMNS! Closer, but still no cigar!
Labels: Analyze, DBMS_STATS, DDL Model, PeopleTools 8.48
Thursday, February 22, 2007
Reset Global Unique Identifier when cloning PeopleSoft databases
When cloning PeopleSoft databases there are a number of data values that need to be changed to reflect the new database name. The most obvious is DBNAME on PS.PSDBOWNER, which maps the name of the database, which on Oracle must also match the TNS service name, to the schema that contains the PeopleSoft objects.
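A minimal sketch of that change (the old and new database names HCMPROD and HCMTEST are, of course, just examples):
UPDATE ps.psdbowner SET dbname = 'HCMTEST' WHERE dbname = 'HCMPROD';
COMMIT;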
However, a piece of data that is often forgotten is GUID on PSOPTIONS. It is documented in PeopleBooks, and is used by both the Performance Monitor and the Environment Management hub. The GUID uniquely identifies a particular PeopleSoft system: PeopleSoft assigns a unique value to each application installation, and this value cannot be customised.
When a Performance Monitor agent registers with the PeopleSoft Performance Monitor, it provides this GUID. The first time the monitoring system receives information from a monitored system it detects the GUID, and for each new GUID detected it creates a new monitored system definition. Unless the specified monitoring system is changed, the new database will therefore be monitored by the same instance of the PeopleSoft Performance Monitor that monitors the source database. The monitor assumes that the agents of both systems belong to the same system, and data for the two systems will be mixed up, making it unreliable.
When an Environment Management agent notifies the hub that it has found a manageable component belonging to an environment, if the GUID of the environment is not recognized, the hub creates a new environment representation. Otherwise the hub will assume that the two environments are the same, leading to confusion.
To resolve these problems, set the value of the GUID field in the PSOPTIONS table to a single space in the newly cloned database. The next time an application server connects to the database, the system generates a new, unique GUID. You can set the blank value in the PSOPTIONS table using the SQL tool at your site.
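For example, a minimal sketch in SQL*Plus (assuming the PeopleSoft owner schema is SYSADM):
UPDATE sysadm.psoptions SET guid = ' ';
COMMIT;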
Wednesday, February 07, 2007
Use of Windows Service Dependency in PeopleSoft
If you run the PeopleSoft Application Server on Windows, there are advantages to configuring the domains to run as Windows services. The processes do not run in DOS windows, where they can accidentally be shut down, and they can run with the privileges of a different NT user. Even if you don't want the services to start automatically when the server boots, it can simplify startup scripts simply to use the NET START/STOP commands. You might then want to define interdependencies between the services to make sure that dependent services are started in the right order. Service dependencies are specified by adding the string value DependOnService to the registry key for the service under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\ (see Microsoft support note 193888).

The services that start either the application server or the process scheduler should be made dependent upon the BEA ProcMGR V8.1 (Tuxedo IPC Helper) service, otherwise the domain startup will fail. I have made the PeopleSoft service dependent on both the Oracle database service and the Tuxedo IPC Helper service by making DependOnService a multi-string (REG_MULTI_SZ) value.
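A hedged sketch of how that registry value might be set from the command line; all of the service key names used here are merely examples, so substitute the actual service names on your server:
reg add "HKLM\SYSTEM\CurrentControlSet\Services\PeopleSoft HR88 AppServer" /v DependOnService /t REG_MULTI_SZ /d "OracleServiceHR88\0BEA ProcMGR V8.1" /f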

The Dependencies tab of the service properties window shows which services this service depends upon, and which services depend on this service.

When the PeopleSoft service was made dependent upon the BEA ProcMGR service, the BEA ProcMGR service also reported that the PeopleSoft service depends upon it.

Now, when the PeopleSoft service is started, it will also start the BEA ProcMGR service if it is not already running. Shutting down the BEA ProcMGR service will also cause the PeopleSoft service to shut down first.
Monday, December 04, 2006
Retrieving Oracle trace files via an external table
December 21st 2006: Correction to script
When talking about performance tuning, I will often tell developers that they need to use Oracle SQL*Trace and TKPROF because that is an absolutely reliable way to find out how their SQL executed, how long it took, and hence where they should focus their attention to improve performance. However, that requires access to the trace files in the USER_DUMP_DEST directory on the database server, which is something that database administrators are justifiably reluctant to provide. When I am on a customer site, I often experience the same difficulty.
Even if developers can get a logon to a development database server, on Unix the trace files will be readable only by the DBA group unless _trace_files_public is set to true.
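For completeness, a hedged example of setting that hidden parameter; it is undocumented, so check with Oracle Support before using it, and being a static parameter it only takes effect after an instance restart (this sketch assumes an spfile is in use):
ALTER SYSTEM SET "_trace_files_public" = TRUE SCOPE = SPFILE;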
Another option is to retrieve trace files from the user_dump_dest directory via an external table. The idea is not mine: I've seen this technique demonstrated by Tanel Poder, there is an article by Jared Still that demonstrates how to retrieve the alert log, and there is also a posting on Ask Tom using utl_dir. However, I couldn't find a script to do trace files, so I wrote one.
The script assumes the trace file is created by the current session. The first thing it does is to create a database directory that maps to the USER_DUMP_DEST directory, and an external table that corresponds to the trace file. Then you can just query the trace file in SQL*Plus and spool the output to a local file.
Of course, this also shows that external tables are a significant security risk. The privilege to create and read from directories should be carefully controlled.
REM user_dump_dest.sql
REM (c) Go-Faster Consultancy Ltd.
REM 30.11.2006 initial version
REM trace the thing you are interested in.
ALTER SESSION SET tracefile_identifier = 'gfctest';
ALTER SESSION SET sql_trace = true;
SELECT * FROM dual;
ALTER SESSION SET sql_Trace = false;
REM determine the path for user_dump_dest and create a database directory
set echo off feedback off verify on timi off
column dir new_value dir format a18
column path new_value path format a60
SELECT name dir, value path
FROM v$parameter
WHERE name = 'user_dump_dest'
/
CREATE OR REPLACE DIRECTORY &dir AS '&path';
REM determine the name of the trace file from the shadow process ID, database name and tracefile_identifier parameter
column tracefile_name new_value tracefile_name
SELECT LOWER(d.name)||'_ora_'||pr.spid
       ||DECODE(par.value,'','','_'||par.value) tracefile_name
FROM   v$parameter par, v$database d, sys.v_$session s, sys.v_$process pr
,      (SELECT sid FROM v$mystat WHERE rownum=1) m
WHERE  par.name = 'tracefile_identifier'
AND    s.paddr = pr.addr
AND    s.sid = m.sid
/
REM create an external table that corresponds to the trace file
DROP TABLE &tracefile_name;
CREATE TABLE &tracefile_name
(trace_line VARCHAR2(4000))
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER DEFAULT DIRECTORY user_dump_dest
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE NOBADFILE NODISCARDFILE NOLOGFILE
FIELDS MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(trace_line CHAR(4000)))
LOCATION ('&tracefile_name..trc')
);
REM just query the trace file back to a local spool file in SQL*Plus
set head off pages 0 termout off
spool &tracefile_name..trc
SELECT * FROM &tracefile_name;
spool off
set termout on
DROP TABLE &tracefile_name;