Friday, October 24, 2014

Minimising Parse Time in Application Engine with ReUseStatement

This article explains how to determine the overhead of using literal values rather than bind variables in SQL submitted by PeopleSoft Application Engine programs. Using a combination of PeopleSoft Application Engine Batch Timings and Oracle Active Session History (ASH), it is possible to determine where it is most effective to change to alter this behaviour by setting the ReUse attribute.

ReUse Statement Flag

I originally wrote about the Performance Benefits of ReUseStatement in Application Engine over 5 years ago, and the problem is no less significant today than it was then.  There are still many places in the delivered PeopleSoft application that would benefit from it.  However, it is not possible to simply set the attribute across all Application Engine programs because it will cause errors in steps where SQL is dynamically generated, so each case must be considered.  Where the ReUse attributed is changed, it must be tested and migrated like any other customisation.

Recently, I have been encountering problems in a number of places on a Financials system which were resolved by enabling ReUseStatement.  So I started by calculating how much time was lost by not setting it.

Application Engine Batch Timings

If an Application Engine step is not reused, then it is compiled prior to every execution during which the Application Engine bind variables are resolved to literals. The number and duration of compilations and executions are reported in the Batch Timings, the production of which are controlled with the TraceAE flag in the Process Scheduler configuration file (psprcs.cfg).  

;-------------------------------------------------------------------------
; AE Tracing Bitfield
;
; Bit       Type of tracing
; ---       ---------------
...
; 128       - Timings Report to AET file
...
; 1024      - Timings Report to tables
...
TraceAE=1152
;------------------------------------------------------------------------

Whatever other TraceAE flags you set, I would always recommend that you set 128 and 1024 so that batch timings are always emitted to both file and database.  The overhead of enabling them is negligible because they are managed in memory at run time and physically written once when the Application Engine program ends.

NB: The settings in the configuration file can be overridden by command line options.  If you specify -TRACE parameter in the Process Scheduler definitions remember to also set these flags.

Batch timings are written to the AE Trace file at end of an Application Engine program, and to PS_BAT_TIMINGS PeopleTools tables at the successful end of an Application Engine program. 
It can be useful to have batch timings in the trace file of an Application Engine that failed because they are not written to the database.  As your system runs, you will build up batch timings for all of your successful Application Engine processes (which will be most of them.  This is a useful source of performance metrics.

Compilations, Execution and ReUse

In this example, the number of compilations is equal to the number of executions because ReUseStatement is not set.

                          PeopleSoft Application Engine Timings
                              (All timings in seconds)

                               C o m p i l e    E x e c u t e    F e t c h        Total
SQL Statement                  Count   Time     Count   Time     Count   Time     Time
------------------------------ ------- -------- ------- -------- ------- -------- --------
99XxxXxx.Step02.S                 8453      2.8    8453    685.6       0      0.0    688.4
...

With ReUse Statement enabled, there is now only a single compilation, and most of the time is saved in execution not compilation.

                               C o m p i l e    E x e c u t e    F e t c h        Total
SQL Statement                  Count   Time     Count   Time     Count   Time     Time
------------------------------ ------- -------- ------- -------- ------- -------- --------
99XxxXxx.Step02.S                    1      0.0    8453    342.3       0      0.0    342.3
...

So we can use the batch timings to identify steps where ReUseStatement is not set because they have as many compilations as executions, and then we can profile the top statements.

 Profile Compilations

This query produces a simple profile of batch timings for statements. 
  • In sub-query x it extract batch timings for statements with more than one compilation in processes that ended in the last 7 days.
  • There is a long-standing bug in batch timings where negative timings can be returned when the clock that returns milliseconds recycles back to zero every 216 milliseconds.  Sub-query y calculates an adjustment that is applied in sub-query z if the timing is negative.
  • Arbitrarily, I am only looking at statements with more than a total of 10000 compilations.

REM ReUseCand.sql
REM (c)Go-Faster COnsultancy Ltd. 2014
COLUMN detail_id FORMAT a32
COLUMN step_time FORMAT 999990 HEADING 'AE|Step|Secs'
COLUMN compile_count HEADING 'AE|Compile|Count'
COLUMN execute_count HEADING 'AE|Execute|Count'
COLUMN processes HEADING 'Num|Process|Instances'
COLUMN process_name HEADING 'Process|Name'
SPOOL ReUseCand
WITH x AS (
SELECT  l.process_instance, l.process_name
,  l.time_elapsed/1000 time_elapsed
, l.enddttm-l.begindttm diffdttm
,  d.bat_program_name||'.'||d.detail_id detail_id
, d.compile_count, d.compile_time/1000 compile_time
, d.execute_time/1000 execute_time
FROM ps_bat_Timings_dtl d
, ps_bat_timings_log l
WHERE d.process_instance = l.process_instance
AND d.compile_count = d.execute_count 
AND d.compile_count > 1 
AND l.enddttm > SYSDATE-7
), y as (
SELECT  x.*
, GREATEST(0,60*(60*(24*EXTRACT(day FROM diffdttm)
                             +EXTRACT(hour FROM diffdttm))
                             +EXTRACT(minute FROM diffdttm))
                             +EXTRACT(second FROM diffdttm)-x.time_elapsed) delta 
FROM  x)
, z as (
SELECT process_instance, process_name, detail_id
,  CASE WHEN time_elapsed < 0 THEN time_elapsed+delta  
               ELSE time_elapsed END time_elapsed
,  compile_count
, CASE WHEN compile_time < 0 THEN compile_time+delta
      ELSE compile_time END AS compile_time
, CASE WHEN execute_time < 0 THEN execute_time+delta
      ELSE execute_time END AS execute_time
FROM y
), a as (
SELECT  process_name, detail_id
, SUM(compile_time+execute_time) step_time
, SUM(compile_count) compile_count
, COUNT(DISTINCT process_instance) processes
FROM  z
GROUP BY process_name, detail_id)
SELECT * FROM a
WHERE compile_count >= 10000 
ORDER BY step_time DESC
/
SPOOL OFF

So now I have a list of steps with lots of compilations.  I know how long they took, but I don't know how much time I might save by enabling ReUseStatement. That will save some time in Application Engine, but it will also cut down database parse time.  So now I need determine the parse time from ASH data.

Process                                         Step    Compile    Process
Name         DETAIL_ID                          SEcs      Count  Instances
------------ -------------------------------- ------ ---------- ----------
AP_PSTVCHR   AP_PSTCOMMON.ACCT_UPD.Step02.S    12001      40704         10
AP_VCHRBLD   APVEDTMOVE.UPDQTY03.Step03.S       4313      49536         28
FS_VATUPDFS  FS_VATUPDFS.Seq0-b.DJ300-2.S       4057     203704          3
AP_VCHRBLD   APVEDTMOVE.UPDQTY03.Step02.S       2799      49536         28
PC_BI_TO_PC  PC_BI_TO_PC.UPD_PRBI.UPDATE.S      1974      23132         10
FS_VATUPDFS  FS_VATUPDFS.Seq0-a.X0001.D         1960      37368          3
GL_JEDIT_0   FS_CEDT_ECFS.iTSELog.iTSELog.S     1628      13104        519
AP_APY2015   AP_APY2015.V_CREATE.Step14.H       1192      11318         19

This query is based on the previous one, but includes scalar queries on the ASH data for each step.
  • WARNING: This query can run for a long period because it has to scan ASH data for each entry in BAT_TIMINGS_DTL.
  • This time in sub-query x I am looking for a rolling 7-day period up to the last hour, because ASH data will have been flushed to the ASH repository.
  • In sub-query y there are two scalar queries that retrieve the DB time spent on hard parse, and all DB time for each batch timing entry.  These queries count 10 seconds for each distinct sample ID to estimate elapsed time rather than total DB time.
  • The query does not group by process name because one step can be called from many Application Engine programs and I want to aggregate the total time across all of them.

REM ReUseCandASH.sql
REM ReUseCandASH.sql
REM (c)Go-Faster Consultancy Ltd. 2014
COLUMN processes       HEADING 'Num|Process|Instances'
COLUMN process_name    HEADING 'Process|Name'
COLUMN detail_id       FORMAT a32
COLUMN step_time       HEADING 'AE|Step|SEcs'    FORMAT 999990 
COLUMN compile_count   HEADING 'AE|Compile|Count'
COLUMN execute_count   HEADING 'AE|Execute|Count'
COLUMN hard_parse_secs HEADING 'Hard|Parse|Secs' FORMAT 99990
COLUMN ash_secs        HEADING 'DB|Time'         FORMAT 99990
SPOOL ReUseCandASH
WITH x AS (
SELECT  l.process_instance, l.process_name
,  l.time_elapsed/1000 time_elapsed
, l.begindttm, l.enddttm
, l.enddttm-l.begindttm diffdttm
,  d.bat_program_name||'.'||d.detail_id detail_id
, d.compile_count, d.compile_time/1000 compile_time
, d.execute_time/1000 execute_time
FROM ps_bat_timings_dtl d
, ps_bat_timings_log l
WHERE d.process_instance = l.process_instance
AND d.compile_count = d.execute_count
AND d.compile_count > 1
AND l.enddttm >= TRUNC(SYSDATE-7,'HH24')
AND l.enddttm < TRUNC(SYSDATE,'HH24') 
), y as (
SELECT  x.*
, GREATEST(0,60*(60*(24*EXTRACT(day    FROM diffdttm)
                             +EXTRACT(hour   FROM diffdttm))
                             +EXTRACT(minute FROM diffdttm))
                             +EXTRACT(second FROM diffdttm)-x.time_Elapsed) delta
FROM  x)
, z as  (
SELECT process_instance, process_name, detail_id
,  CASE WHEN time_elapsed < 0 THEN time_elapsed+delta 
             ELSE time_elapsed END AS time_elapsed
,  compile_count
, CASE WHEN compile_time < 0 THEN compile_time+delta
      ELSE compile_time END AS compile_time
, CASE WHEN execute_time < 0 THEN execute_time+delta
      ELSE execute_time END AS execute_time
, (
 SELECT 10*COUNT(DISTINCT h.sample_id) 
 FROM psprcsque q
 , dba_hist_snapshot x
  , dba_hist_active_sess_history h
 WHERE q.prcsinstance = y.process_instance
  AND  x.begin_interval_time <= y.enddttm
  AND  X.END_INTERVAL_TIME >= y.begindttm
  AND  h.sample_time between y.begindttm and y.enddttm
  AND  h.SNAP_id = x.SNAP_id
  AND  h.dbid = x.dbid
  AND  h.instance_number = x.instance_number
  AND  h.module = 'PSAE.'|| y.process_name||'.'||q.sessionidnum
  AND  h.action = y.detail_id
 AND h.in_hard_parse = 'Y'
 ) hard_parse_secs
, (
 SELECT 10*COUNT(DISTINCT h.sample_id)
 FROM psprcsque q
 , dba_hist_snapshot x
  , dba_hist_active_sess_history h
 WHERE q.prcsinstance = y.process_instance
  AND  x.begin_interval_time <= y.enddttm
  AND  X.END_INTERVAL_TIME >= y.begindttm
  AND  h.sample_time between y.begindttm and y.enddttm
  AND  h.SNAP_id = X.SNAP_id
  AND  h.dbid = x.dbid
  AND  h.instance_number = x.instance_number
  AND  h.module = 'PSAE.'|| y.process_name||'.'||q.sessionidnum
  AND  h.action = y.detail_id
 ) ash_secs
FROM  y
), a AS (
SELECT  /*process_name ,*/ detail_id
, SUM(compile_time+execute_time) step_time
, SUM(compile_count) compile_count
, COUNT(DISTINCT process_instance) processes
, SUM(hard_parse_secs) hard_parse_secs
, SUM(ash_secs) ash_secs
FROM  z
GROUP BY /*process_name,*/ detail_id)
SELECT  a.* 
FROM  a
WHERE  compile_count >= 10000
ORDER BY step_time DESC
/
spool off

Now we can also see how much time the database is spending on hard parse on each step, and it is this time that is likely to be saved by enabling ReUseStatement.
However, before we can enable the ReUseStatement attribute we must first manually inspect each step in Application Designer and determine whether doing so would break anything.  The Comment column in this profile was added manually as I did that.  Some statements I can change, some I have to accept the overhead.

                                   Step    Compile    Process      Parse         DB
DETAIL_ID                          Secs      Count  Instances       Secs       Time Comment
-------------------------------- ------ ---------- ---------- ---------- ---------- …………………………………………………………………………………………………………………………………
AP_PSTCOMMON.ACCT_UPD.Step02.S    12001      40704         10      11820      11920 Set ReUseStatement
FS_CEDT_ECMB.4EditCDT.uValCDT.S    5531      10289        679        620       5870 Dynamic SQL, can't set ReUseStatement
APVEDTMOVE.UPDQTY03.Step03.S       4306      49471         27       4020       4100 Set ReUseStatement
FS_VATUPDFS.Seq0-b.DJ300-2.S       4057     203704          3       3150       3860 Dynamic SQL, can't set ReUseStatement
FS_CEDT_ECFS.iTSELog.iTSELog.S     3332      19073        716       2130       3520 Dynamic SQL, can't set ReUseStatement
APVEDTMOVE.UPDQTY03.Step02.S       2796      49471         27       2730       2820 Set ReUseStatement
PC_BI_TO_PC.UPD_PRBI.UPDATE.S      1974      23132         10        230       1920 Set ReUseStatement 
FS_VATUPDFS.Seq0-a.X0001.D         1960      37368          3          0          0 Dynamic SQL, can't set ReUseStatement
FS_CEDT_ECMB.4uAnchCT.uAnchCDT.S   1319      10289        679        510       1290 Dynamic SQL, can't set ReUseStatement
AP_APY2015.V_CREATE.Step14.H       1169      11094         19          0          0 Set ReUseStatement
GL_JETSE.GA100.CHKEDT.S            1121      15776        569        860        930 Dynamic SQL, can't set ReUseStatement
FS_CEDT_ECMB.iTSELog.iTSELog.S      988      10289        679        450        990 Dynamic SQL, can't set ReUseStatement
FS_CEDT_ECMB.uMarkVal.uMarkVal.S    711      10289        679         50        670 Dynamic SQL, can't set ReUseStatement
FS_CEDT_ECMB.uMarkInv.uMarkInv.S    668      10289        679         40        790 Dynamic SQL, can't set ReUseStatement
  • Due to a bug in the instrumentation of Application Engine, the session's action attribute is not set for Do Select (type D) and Do When (type H) steps.  ASH data cannot therefore be matched for them.
  • More DB Time is reported for FS_CEDT_ECMB.uMarkInv.uMarkInv.S than is reported by batch timings.  This is a consequence of ASH sampling, where we count 10 seconds for each sample.

Conclusion

Setting ReUseStatement is very simple because it doesn't involve changing SQL, but there are lots of places where it could be set.  This technique picks out the relatively few places where doing so could potentially have a significant effect.