Thursday, September 06, 2018

How Not to Collect Optimizer Statistics in an Application Engine Program

I was told about a PeopleSoft customer experiencing an Oracle error when collecting statistics during an Application Engine.
ORA-06533: Subscript beyond count 
ORA-06512: at "SYS.DBMS_STATS"…
It is possibly a manifestation of a database bug. The workaround was not to use AUTO_SAMPLE_SIZE, so instead, this customer initially coded an explicit call to DBMS_STATS.GATHER_TABLE_STATS with a specific sample size.
This blog is not about the bug, but how to manage the workaround.

DO NOT TRY THIS AT HOME!


I think that there are a number of problems with this approach
  1. Using a fixed sample size rather than AUTO_SAMPLE_SIZE should only be considered as a temporary workaround. The new hash-based number-of-distinct-values (NDV) algorithm in Oracle 12c only works with AUTO_SAMPLE_SIZE, and it produces more accurate statistics and runs faster because saves a large sort operation. Coding a fixed sample size into an Application Engine requires a managed change to be made, tested and released into production, and then when the underlying problem is resolved the customisation needs to be removed by the same managed process. 
  2. DBMS_STATS.GATHER_TABLE_STATS implies a commit. That can lead to problems that PeopleSoft avoids by only calling statistics via the %UpdateStats macro and controlling when that macro does and does not execute. 

Committing and Restart Checkpointing in Application Engine

If a restartable Application Engine program fails it rolls back to the last commit point and it can then be restarted from that point. Committing is managed by Application Engine at section and step levels where the program state record is updated accordingly. If an error occurs in a step after the implicit commit in DBMS_STATS, it can result in the data in the application tables being different to where the state record indicates the program can be restarted. The program may not restart, or it could conceivably execute but produce erroneous results.
Committing inside a do while loop, including any other Application Engine program called from inside the loop is suppressed at Application Engine section/step level and therefore the execution of %UpdateStats macro is also suppressed. Otherwise, you could get rogue ORA-01555 Snapshot Too Old errors. Suppression of %UpdateStats is reported in the Application Engine step trace.
-- 19:07:37.199 .(AEMINITEST.MAIN.LOOP) (Do Select)
%Select(recname) SELECT recname FROM psrecdefn WHERE recname = 'JOB'
/
-- Buffers:
--   1) JOB
-- 19:07:37.204 Iteration 1 of .(AEMINITEST.MAIN.LOOP) (Do Select) loop

-- 19:07:37.208 .(AEMINITEST.MAIN.LOOP) (SQL)
RECSTATS PS_JOB LOW
/
-- 19:07:37.211 UpdateStats ignored - COMMIT required

-- 19:07:37.212 .(AEMINITEST.MAIN.LOOP) (Do Fetch)
Even a previously uncommitted SQL step can lead to %UpdateStats being suppressed.
-- 19:07:35.205 .(AEMINITEST.MAIN.Step01) (SQL)
UPDATE PS_AETEMPTBLMGR SET AE_DEDICATED = 1 WHERE 0 = 1
/
-- Row(s) affected: 0

-- 19:07:35.213 .(AEMINITEST.MAIN.Step02) (PeopleCode)

-- 19:07:35.220 .(AEMINITEST.MAIN.US1) (SQL)
RECSTATS PS_JOB LOW
/
-- 19:07:35.227 UpdateStats ignored - COMMIT required
If you code DBMS_STATS.GATHER_TABLE_STATS explicitly, Application Engine will not recognise the step as having committed. In the following example, you can see the %UpdateStats on the last step has been suppressed because it Application Engine does not recognise that the update in the first step has been committed by the call to DBMS_STATS.
-- 19:12:06.690 .(AEMINITEST.MAIN.Step01) (SQL)
UPDATE PS_AETEMPTBLMGR SET AE_DEDICATED = 1 WHERE 0 = 1
/
-- Row(s) affected: 0
-- 19:12:06.696 .(AEMINITEST.MAIN.Step02) (PeopleCode)
-- 19:12:06.700 .(AEMINITEST.MAIN.GTS) (SQL)
%Execute(/) BEGIN 
dbms_stats.gather_table_stats('SYSADM','PS_JOB',estimate_percent=>1); END; /
/
-- Row(s) affected: 1

-- 19:12:09.871 .(AEMINITEST.MAIN.US1) (SQL)
RECSTATS PS_JOB LOW
/
-- 19:12:09.877 UpdateStats ignored - COMMIT required
/
Perhaps, the safest form of this workaround would be to have the step with DBMS_STATS and the immediately preceding step explicitly commit as in the following example.  I have also made the program restartable.  Now restart data is checkpointed, and the %UpdateStats macro executes at step US1.
Restart Data CheckPointed
/
COMMIT
/

-- 19:20:24.792 .(AEMINITEST.MAIN.GTS) (SQL)
%Execute(/) BEGIN 
dbms_stats.gather_table_stats('SYSADM','PS_JOB',estimate_percent=>1); END; /
/
-- Row(s) affected: 1
/ 
/
Restart Data CheckPointed
/
COMMIT
/

-- 19:20:29.153 .(AEMINITEST.MAIN.US1) (SQL)
RECSTATS PS_JOB LOW
/
-- Row(s) affected: 1
/
However, you have to consider the state the application data after an error, whether you wish to restart or cancel the Application Engine because you can no longer rollback.

Doing the Right Thing

I recommend that:
  • You should only ever collect stats in Application Engine with the %UpdateStats macro that in turn executes the command in the DDL model.
  • From Oracle 11g both PeopleSoft statistics gathering DDL models should be the same and should ultimately call DBMS_STATS without any parameters other than the table name. The default value of ESTIMATE_PERCENT is AUTO_SAMPLE_SIZE.
There are two DDL models in PeopleSoft because %UpdateStats can be invoked with a second parameter to collect the statistics HIGH or LOW. This dates back to Oracle's ANALYZE command that could either compute or estimate statistics (and other database platforms had similar options). Collecting optimizer statistics with ANALYZE has been deprecated for many years, but the command still has other valid uses.  It was superceded by DBMS_STATS in Oracle 8i (released in 1998).
Automatic sample size was introduced in Oracle 9i.  In Oracle 9i and 10g, it was usual to use automatic sample size in the high statistics gathering model and a small fixed sample size in the low model for use on very large tables.  The LOW parameter was specified on %Updatestats in Application Engine programs as necessary.
This approach became redundant from Oracle 11g with the introduction of table preferences. If you need to collect statistics with a specific rather than the automatic sample size or specify any other parameters, then a table preference should be created. Preferences apply wherever statistics are gathered on that table and not overridden in the call of DBMS_STATS., including schema and database-wide operations such as the maintenance window.  If there are multiple places where statistics are collected on a table, a preference assures that the statistics will always be collected will be consistently.
From Oracle 12c, as the new NDV algorithm only works with AUTO_SAMPLE_SIZE, you should always use the default unless you have an overarching reason to the contrary, and then you should use a table preference. This approach does not require any application code change because the preference is an attribute of a table in the database.
I recommend using GFCPSSTATS package, calling it from the DDL model (see previous blog Managing Cost-Based Optimizer Statistics for PeopleSoft). The package also includes a mechanism to specify table preferences in metadata, so that they are automatically instantiated when a table is created or altered by Application Designer, and are applied to every instance of a temporary record.
gfcpsstats11.ps_stats(p_ownname=>[DBNAME], p_tabname=>[TBNAME]);
Alternatively, just call DBMS_STATS with no additional parameters (other than FORCE, in case you lock statistics on temporary tables) but then you must manage table preferences manually.
DBMS_STATS.GATHER_TABLE_STATS(ownname=>[DBNAME], tabname=>[TBNAME], force=>TRUE);

Further Reading

Friday, June 29, 2018

Managing Cost-Based Optimizer Statistics for PeopleSoft

I gave this presentation to UKOUG PeopleSoft Roadshow 2018

PeopleSoft presents some special challenges when it comes to collecting and maintaining the object statistics used by the cost-based optimizer.

I have previously written and blogged on this subject.  This presentation focuses exclusively on the Oracle database and draws together the various concepts into a single consistent picture.  It makes clear recommendations for Oracle 12c that will help you work with the cost-based optimizer, rather than continually fight against it.

It looks at collecting statistics for permanent and temporary working storage tables and considers some other factors that can affect optimizer statistics.

This presentation also discusses PSCBO_STATS, that is going to be shipped with PeopleTools, and compares and contrasts it with GFCPSSTATS11.

Saturday, April 14, 2018

Application Engine in Process Scheduler: PSAESRV Server Process -v- Standalone PSAE executable

Whether to use the Application Engine server process (PSAESRV) in the process scheduler tuxedo domain or the standalone PSAE executable is a frequently discussed point amongst PeopleSoft administrator.  Over the years, I have written various things on the subject.  I am going to draw them together in this blog, and restate Oracle’s now clear advice about when to use which option.

In PeopleTools 8.4, the Process Scheduler became a fully fledged Tuxedo domain.  The PSAESRV process was also introduced at this time.  It is a persistent process that handles both Application Engine and Optimization Engine requests.  Each server process creates persistent database connections.  A number of these server processes are started with the domain.   The PSAESRV process does not spawn like other Tuxedo server processes.  Instead, you must configure the number of server processes to match the maximum number of concurrent Application Engine process requests and concurrent Optimization Engine requests that the process scheduler can run.  The server was introduced to handle very short-lived Application Engine programs thus avoiding the overhead of instantiating a new process and new database sessions for each process request.  CRM typically uses Application Engine in this fashion, but generally, you do not see this in other PeopleSoft products.

Oracle has not always been clear what they mean by a short-lived process.  It has been suggested that if Application Engine processes are typically taking less than 10-30 seconds, or if you run more than 1000 Application Engine processes requests per hour (note 651970.1) you should use PSAESRVs.
PeopleBooks advises you should use PSAESRV because it delivers improved system performance.  However, PeopleTools Performance Guidelines Red Paper (Doc ID 747389.1) contradicts this somewhat.  Ultimately, if you have any doubts, you should it test each way and determine whether one way gives a clear advantage over the other.

Oracle Support Note "What will be the impact of disabling PSAESRV on the Process Scheduler (Doc ID 651970.1)" explains that if PSAESRV is disabled in the Tuxedo domain configuration, the Process Scheduler goes back to the legacy behaviour and spawns a stand-alone PSAE process for each Application Engine request.  “The Application Engine will take a bit longer to start, [the] time delay may be range from millisecond to seconds” depending on hardware and configuration.

The stand-alone process has several advantages.
  • At the end of the Application Engine program, it disconnects from the database and terminates.  Thus releasing resources from the process and the database session.  Whereas the persistent Application Engine process has been reported to accumulate allocated memory over time.
  • If you are using Oracle database Global Temporary Tables in an application engine, then you should not use PSAESRV because the tables are always created PRESERVE ON COMMIT and so are only released when the database session terminates.
  • If you set any session parameters within an Application Engine program run via PSAESRV, or enable database trace, then these settings will carry forward from one Application Program to the next unless you reset the parameter at the end of the program, or the start of the next.  This is not a concern with standalone PSAE processes.
However, there is at least one case where you must use the server process:
  • If you are using Oracle Active Data Guard and wish to mark some Application Engine programs as read-only then they must be run via the PSAESRV process

Conclusion

  • PeopleTools Performance Guidelines Red Paper (Doc ID 747389.1 sums it up very nicely: “PSAE is as good as PSAESRV for most practical purposes.  If you have an application engine job that runs longer than 10 seconds, PSAE is equivalent to PSAESRV.  PSAE has the added advantage of being recycled at the end of each application engine job, cleaning up any outstanding SQL cursors to the database that may have been left behind.  Because PSAE recycles after each use, PSAE does not have any possible memory leakage problem that may occupy the precious system memory.  In short, PSAE is a cleaner workhorse.”
  • I think it is reasonable to use PSAESRV in CRM.  For all other products, I recommend that PSAESRV should be disabled from all Process Schedulers.
    • If you do have some Application Processes that are both short-lived (i.e. less than 10 seconds) and run frequently, then consider creating other process schedulers with PSAESRV processes that are dedicated to running only these process.  You can then move these processes to a new Process Scheduler category that only runs on these new Process Scheduler.
  • PSAESRV is configured by default, so if you don’t want to use it, and mostly you won’t, then you have to remember to disable it.

Tuesday, March 20, 2018

Resetting High Water Marks on On-line Temporary Table Instances

PeopleSoft has always used regular database tables for temporary working storage in batch processes.   Up to PeopleTools 7.x working storage tables were shared by all instances of a program.  That led to consistent read contention when multiple processes concurrently used the same table, resulting in much higher high water marks that increased durations of full scans.
From PeopleTools 8, many copies of each temporary working storage table are created.  Application Engines that run on the Process Scheduler are allocated exclusive use of a particular copy of the table.  This avoids the inter-process contention.  They start by truncating each allocated table, which resets the high-water mark.
Some delivered processing uses batch programs that are run apparently synchronously from the PIA.  On-line edit and post in Financials General Ledger is a common example.  Up to PeopleTools 7, the application server would synchronously spawn a batch process and wait for it to complete.  From PeopleTools 8 the process is submitted to the process scheduler, and the PIA polls the Scheduler tables waiting for the process to complete.  However, Application Engine can be run within the component processor.  In Financials General Ledger, this can be chosen by a setting an installation configuration option.  The truly on-line method can perform better because you are no longer waiting for the process scheduler to pick up the process request.  A separate process Application Engine is not spawned, but the Application Engine program is executed by the PSAPPSRV application server process.  One of the limitations is that the Application Engine program cannot commit.  Committing after steps or sections is suppressed, and the %TruncateTable macro generates a delete statement instead.  Therefore, on-line temporary table instances are never truncated by any process and their high-water marks can be raised by processes that handle larger volumes of data.  This can have impacts for subsequent processes with smaller data volumes but that still have to full-scan working storage tables up to their high water marks.

Truncating On-line Temporary Table Instances

The answer is to implement a periodic process that truncates working storage tables, but only doing so when the table is not currently being used by a process.  Every on-line Application Engine program is allocated a temporary table instance number, it locks the corresponding row on the table PS_AEONLINEINST.  If it is allocated to instance 1, it locks the row where CURTEMPINSTANCE is 1 and uses instance 1 of each temporary record that it needs.  
Therefore the proposed truncate process must also lock the row on PS_AEONLINEINST that corresponds to each table that is to be truncated.  The truncate must be done in an autonomous transaction so that the implicit commit does not release that lock.  The lock can be released after the truncate completes.  Thus, the truncate process waits for any online process to complete before truncating a table with the same instance number, and no process can start while the truncate process is holding the lock.  However, each truncate will be very quick, and so each lock will only be held briefly, and it will have only a minimal effect on any online process that may be running at the time.  

I have written a PL/SQL packaged procedure (to perform this process for all temporary records.  It is available on Github as a part of my repository of miscellaneous PeopleSoft scripts.

Package Usage

Usually, the package will be run without any parameters. The default behaviour will be to truncate tables with more than a single extent.  Information on what the package does is emitted to the server output.
Set serveroutput on 
EXECUTE xx_onlineinsthwmreset.main;
The package can be run in test mode when it will list the commands without executing them.  Thus you can see what it will do without actually doing it.
EXECUTE xx_onlineinsthwmreset.main(p_testmode=>TRUE);
The package can optionally deallocate any physical storage. Storage will be reallocated next time the table is used.
EXECUTE xx_onlineinsthwmreset.main(p_drop_storage=>TRUE, p_min_extents=>0);
The package can be run for certain tables that match a particular pattern.
EXECUTE xx_onlineinsthwmreset.main(p_recname_like=>'JP%');
I recommend that the package is run daily. However, it can be run safely while the users are doing on-line edit/post processing, but it would be sensible to choose a quiet time.

Wednesday, March 07, 2018

Fewer Platform Flags on Indexes from PeopleTools 8.55

It has always been possible in Application Designer to specify upon which databases platforms each index should be built.  This is really a feature that is used by PeopleSoft development, rather than customers to deliver indexes that are more appropriate for a particular platform due to differences in the optimizer.
Over the years, the number of supported PeopleSoft platforms has declined.  In PeopleTools 8.45, it went down from 9 to 6 and in PeopleTools 8.55 it has gone down to just 4, but there are still 9 columns on PSINDEXDEFN that correspond to the original 9 supported platforms.
I explained in a previous blog that you can have all or none of the platform flags set to the same value, but with the platform radio button on the index properties dialogue box is still set to 'some' because one or more of the platform flag columns for some of the unsupported platforms is set differently.  Of course, this is a purely cosmetic problem, but one that can cause confusion in Application Designer.
PeopleTools 8.45 PeopleTools 8.55
I fix this by updating PeopleTools tables as follows. The first query reports on the indexes where the supported platform flags all have the same value and one of the unsupported platform flags are different.
column RECNAME format a15
column INDEXID format a3 heading 'Idx|ID'
column DDLCOUNT format 999 heading 'DDL|Cnt'
column CUSTKEYORDER format 9999 heading 'Cust|Key|Order'
column KEYCOUNT format 999 heading 'Key|Cnt'
column PLATFORM_SBS format 999 heading 'SBS'
column PLATFORM_DB2 format 999 heading 'DB2'
column PLATFORM_ORA format 999 heading 'ORA'
column PLATFORM_INF format 999 heading 'INF'
column PLATFORM_DBX format 999 heading 'DBx'
column PLATFORM_ALB format 999 heading 'ALB'
column PLATFORM_SYB format 999 heading 'SYB'
column PLATFORM_MSS format 999 heading 'MSS'
column PLATFORM_DB4 format 999 heading 'DB4'
column ACTIVEFLAG Format 999 heading 'Active'
column CLUSTERFLAG format 999 heading 'Clst|Flg'
column UNIQUEFLAG format 999 heading 'Uniq|Flg'
column INDEXTYPE format 999 heading 'Idx|Type'
column IDXCOMMENTS format a60
spool platformfix855
SELECT *
FROM PSINDEXDEFN
WHERE PLATFORM_DB2=PLATFORM_DBX
AND PLATFORM_DBX=PLATFORM_ORA
AND PLATFORM_ORA=PLATFORM_MSS
AND (PLATFORM_ORA!=PLATFORM_SBS
  OR PLATFORM_ORA!=PLATFORM_ALB
  OR PLATFORM_ORA!=PLATFORM_SYB
  OR PLATFORM_ORA!=PLATFORM_INF
  OR PLATFORM_ORA!=PLATFORM_DB4)
;
These are the indexes that have inconsistent platform flags.  In this case, PS_PSPMTRANSHIST is to be disabled on DB2/AS400.  You can't update the flag via Application Designer, but you could set the radio button to ALL.
                                           Cust
                Idx  Idx Uniq Clst          Key  Key  DDL
RECNAME         ID  Type  Flg  Flg Active Order  Cnt  Cnt  SBS  DB2  ORA  INF  DBx  ALB  SYB  MSS  DB4
--------------- --- ---- ---- ---- ------ ----- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
IDXCOMMENTS
------------------------------------------------------------
PSPMTRANSHIST   _      1    1    1      1     0    4    1    1    1    1    1    1    1    1    1    0
…
It could be a tedious process to do this for a lot of indexes.  So the following SQL commands correct all indexes.  They set the SQL flags for the unsupported platforms to the value for the supported platforms if they are all the same. The version number on the record definition is updated so that Application Designer refreshes the object.
UPDATE PSVERSION
SET VERSION = VERSION + 1
WHERE OBJECTTYPENAME IN('SYS','RDM');

UPDATE PSLOCK
SET VERSION = VERSION + 1
WHERE OBJECTTYPENAME IN('SYS','RDM');

UPDATE PSRECDEFN
SET VERSION = (
  SELECT VERSION
  FROM PSVERSION
  WHERE OBJECTTYPENAME = 'RDM')
WHERE RECNAME IN (
  SELECT RECNAME
  FROM PSINDEXDEFN
  WHERE PLATFORM_DB2=PLATFORM_DBX
  AND PLATFORM_DBX=PLATFORM_INF
  AND PLATFORM_INF=PLATFORM_ORA
  AND PLATFORM_ORA=PLATFORM_SYB
  AND ( PLATFORM_ORA!=PLATFORM_SBS
    OR PLATFORM_ORA!=PLATFORM_ALB
    OR PLATFORM_ORA!=PLATFORM_DB4)
);

UPDATE psindexdefn
SET PLATFORM_DB4=PLATFORM_ORA
WHERE PLATFORM_DB2=PLATFORM_DBX
AND PLATFORM_DBX=PLATFORM_INF
AND PLATFORM_INF=PLATFORM_ORA
AND PLATFORM_ORA=PLATFORM_SYB
AND PLATFORM_SYB=PLATFORM_MSS
AND PLATFORM_ORA!=PLATFORM_DB4;

UPDATE psindexdefn
SET PLATFORM_ALB=PLATFORM_ORA
WHERE PLATFORM_DB2=PLATFORM_DBX
AND PLATFORM_DBX=PLATFORM_INF
AND PLATFORM_INF=PLATFORM_ORA
AND PLATFORM_ORA=PLATFORM_SYB
AND PLATFORM_SYB=PLATFORM_MSS
AND PLATFORM_ORA!=PLATFORM_ALB;

UPDATE psindexdefn
SET PLATFORM_SBS=PLATFORM_ORA
WHERE PLATFORM_DB2=PLATFORM_DBX
AND PLATFORM_DBX=PLATFORM_INF
AND PLATFORM_INF=PLATFORM_ORA
AND PLATFORM_ORA=PLATFORM_SYB
AND PLATFORM_SYB=PLATFORM_MSS
AND PLATFORM_ORA!=PLATFORM_SBS;
The platform flags now say 'ALL'. Not a tremendous change, but at least I can immediately see that these indexes do build on all platforms without having to open each one.

The new platformfix855.sql script is available on collection of miscellaneous scripts on Github.