Monday, May 26, 2008

Performance Tuning the Performance Monitor Archive Process

I have been working with a PeopleSoft Performance Monitor (PPM) system that is collecting data for 7 monitored systems. Three of those systems are very active. We want to keep data for a rolling 7 days. We reached the point where the PPM archive process could not rows as fast as they were arriving! I have, of course, reduced the sample frequency of events and transactions, but the PPM is receiving between 20 and 30 million rows of history data per week.

It is also worth remarking here that since the application servers, web servers and Process Schedulers were moved to commodity Intel hardware there are now a larger number of these components, and PPM is receiving more event data.

I have taken a look inside the PPM archive process. PPM history data is removed by a PeopleCode program (PSPM_ARCHIVE.ARCHIVE.GBL.default.ARCPCODE.OnExecute) that is called from the Application Engine.

Rows of transaction date to be archived are identified by the following SQL (and there is a similar one for event data):

&TransHistSQL.Open("SELECT X.PM_INSTANCE_ID, X.PM_TRANS_DEFN_SET, X.PM_TRANS_DEFN_ID, X.PM_AGENTID, X.PM_TRANS_STATUS, X.OPRID, X.PM_PERF_TRACE, X.PM_CONTEXT_VALUE1, X.PM_CONTEXT_VALUE2, X.PM_CONTEXT_VALUE3, X.PM_CONTEXTID_1, X.PM_CONTEXTID_2, X.PM_CONTEXTID_3, X.PM_PROCESS_ID, %DateTimeOut(X.PM_AGENT_STRT_DTTM), %DateTimeOut(X.PM_MON_STRT_DTTM), X.PM_TRANS_DURATION, X.PM_PARENT_INST_ID, X.PM_TOP_INST_ID, X.PM_METRIC_VALUE1, X.PM_METRIC_VALUE2, X.PM_METRIC_VALUE3, X.PM_METRIC_VALUE4, X.PM_METRIC_VALUE5, X.PM_METRIC_VALUE6, X.PM_METRIC_VALUE7, X.PM_ADDTNL_DESCR, Z.PM_ARCHIVE_MODE FROM PSPMTRANSHIST X, PSPMAGENT Y, PSPMSYSDEFN Z WHERE X.PM_AGENTID=Y.PM_AGENTID AND Y.PM_SYSTEMID=Z.PM_SYSTEMID AND (Z.PM_ARCHIVE_MODE='1' OR Z.PM_ARCHIVE_MODE='2') AND %DateTimeDiff(X.PM_MON_STRT_DTTM, %CurrentDateTimeIn) >= (PM_MAX_HIST_AGE * 24 * 60)");

Lets look at the last condition in the WHERE clause (in bold). It uses the %DateTimeDiff PeopleCode macro to calculate the age of the row of data. This expands to.

… AND ROUND((CAST(( SYSDATE) AS DATE) - CAST((X.PM_MON_STRT_DTTM) AS DATE)) * 1440, 0) >= (PM_MAX_HIST_AGE * 24 * 60)

However, putting a function around a column prevents the database from using any index on that column. In this case it is not possible to create an function-based index to satisfy this expression because it contains SYSDATE.

The reaction of Oracle's Optimizer is to full scan the PSPMTRANSHIST table. With millions of rows of data in this table, a full scan is simply not going to perform adequately.

Instead, I would rather code something simple, but Oracle specific (and if I am going to do Oracle specific date arithmetic then there is no need to use %CurrentDateTimeIn).

… AND X.PM_MON_STRT_DTTM <= SYSDATE - PM_MAX_HIST_AGE


The PeopleCode can be adjusted to be sensitive to the database platform like this.


/*-- SELECT THE ROWS FROM PSPMTRANSHIST ELIGLIBLE FOR ARCHIVING */
If %DbType = "ORACLE" Then /*dmk 19.5.2008 oracle specific performance tuning*/
&TransHistSQL.Open("SELECT X.PM_INSTANCE_ID, X.PM_TRANS_DEFN_SET, X.PM_TRANS_DEFN_ID, X.PM_AGENTID, X.PM_TRANS_STATUS, X.OPRID, X.PM_PERF_TRACE, X.PM_CONTEXT_VALUE1, X.PM_CONTEXT_VALUE2, X.PM_CONTEXT_VALUE3, X.PM_CONTEXTID_1, X.PM_CONTEXTID_2, X.PM_CONTEXTID_3, X.PM_PROCESS_ID, %DateTimeOut(X.PM_AGENT_STRT_DTTM), %DateTimeOut(X.PM_MON_STRT_DTTM), X.PM_TRANS_DURATION, X.PM_PARENT_INST_ID, X.PM_TOP_INST_ID, X.PM_METRIC_VALUE1, X.PM_METRIC_VALUE2, X.PM_METRIC_VALUE3, X.PM_METRIC_VALUE4, X.PM_METRIC_VALUE5, X.PM_METRIC_VALUE6, X.PM_METRIC_VALUE7, X.PM_ADDTNL_DESCR, Z.PM_ARCHIVE_MODE FROM PSPMTRANSHIST X, PSPMAGENT Y, PSPMSYSDEFN Z WHERE X.PM_AGENTID=Y.PM_AGENTID AND Y.PM_SYSTEMID=Z.PM_SYSTEMID AND (Z.PM_ARCHIVE_MODE='1' OR Z.PM_ARCHIVE_MODE='2') AND X.PM_MON_STRT_DTTM <= SYSDATE - PM_MAX_HIST_AGE


and similarly for event data


/*-- SELECT THE ROWS FROM PSPMEVENTHIST ELIGLIBLE FOR ARCHIVING */
If %DbType = "ORACLE" Then
/*dmk 19.5.2008 - oracle specific performance tuning*/
&EventHistSQL.Open("SELECT X.PM_INSTANCE_ID, X.PM_EVENT_DEFN_SET, X.PM_EVENT_DEFN_ID, X.PM_AGENTID, %DateTimeOut(X.PM_AGENT_DTTM), %DateTimeOut(X.PM_MON_DTTM), X.PM_PROCESS_ID, X.PM_FILTER_LEVEL,X.PM_METRIC_VALUE1, X.PM_METRIC_VALUE2, X.PM_METRIC_VALUE3, X.PM_METRIC_VALUE4, X.PM_METRIC_VALUE5, X.PM_METRIC_VALUE6, X.PM_METRIC_VALUE7, X.PM_ADDTNL_DESCR, Z.PM_ARCHIVE_MODE FROM PSPMEVENTHIST X, PSPMAGENT Y, PSPMSYSDEFN Z WHERE X.PM_AGENTID=Y.PM_AGENTID AND Y.PM_SYSTEMID=Z.PM_SYSTEMID AND (Z.PM_ARCHIVE_MODE='1' OR Z.PM_ARCHIVE_MODE='2') AND X.PM_MON_DTTM <= SYSDATE - PM_MAX_HIST_AGE


The resulting query needs an index on PM_AGENTID and PM_MON_STRT_DTTM (on both PSPMTRANSHIST and PSPMEVENTHIST). There are delivered indexes that lead on the PM_AGENTID and have PM_MON_STRT_DTTM further down, but a dedicated index is beneficial.

Sunday, May 18, 2008

Enabling Oracle Database Trace on PeopleSoft processes with a Trigger (improved)

In Chapter 11 of PeopleSoft for the Oracle DBA, I suggested using a database trigger to enable Oracle's SQL Trace facility. Then, in chapter 14, I introduced a further trigger to disable trace for Application Engine processes, in case they are run with the PSAESRV process.

This is not a good idea because the cursors opened by the Application Engine program are still open when the AE program completed, and process updates its status from 7 (processing) to 9 (success) or 3 (error). This causes the unset_trace trigger to fire and disable SQL Trace. Consequently the STAT lines for these cursors are never emitted to the trace file. The STAT lines contain the execution plans and a wealth of other information.

Instead, I have a new version of the trace trigger (trace_trigger.sql can be down loaded from the Go-Faster website).

spool trace_trigger
rem (c)Go-Faster Consultancy Ltd. 2008

rem 1.5.2008 - moved logic to disable trace from unset_trace trigger into main trigger 
rem because cursors in AE not shut until after status is changed away from processing

ROLLBACK;
----------------------------------------------------------------
REM explicit grants by sys required on following privileges
----------------------------------------------------------------
GRANT ALTER SESSION TO sysadm;
GRANT EXECUTE ON sys.dbms_monitor TO sysadm;
GRANT EXECUTE ON sys.dbms_application_info TO SYSADM;
----------------------------------------------------------------

CREATE OR REPLACE TRIGGER sysadm.set_trace
BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
FOR EACH ROW
WHEN (new.runstatus = 7 AND old.runstatus != 7 AND new.prcstype != 'PSJob')(1)
DECLARE
  l_waits BOOLEAN := TRUE;(2)
  l_binds BOOLEAN := FALSE;
BEGIN
  --set module and action whether we are tracing or not
  sys.dbms_application_info.set_module((3)
 module_name => :new.prcsname,
 action_name => 'PI='||:new.prcsinstance
  );
   
  IF ( :new.runcntlid LIKE 'TRACEME%'(5)
  ----------------------------------------------------------------
  --code conditions for enabling trace here instead of when clause
  ----------------------------------------------------------------
  --  OR (    SUBSTR(:new.prcsname,1,3) = 'TL_'
  --      AND :new.rqstdttm <= TO_DATE('20080509','YYYYMMDD'))
  ----------------------------------------------------------------
     ) THEN

    --if we are going to trace, then set tracefile identifier
    EXECUTE IMMEDIATE 'ALTER SESSION SET TRACEFILE_IDENTIFIER(4) = '''||
      TRANSLATE(:new.prcstype     ,' -','__')||'_'||
      TRANSLATE(:new.prcsname     ,' -','__')||'_'||
      :new.prcsinstance||'_'||
       TRANSLATE(:new.servernamerun,' -','__')||
       '''';

    EXECUTE IMMEDIATE 'ALTER SESSION SET TIMED_STATISTICS = TRUE';
    EXECUTE IMMEDIATE 'ALTER SESSION SET MAX_DUMP_FILE_SIZE = 2097152'; --1Gb
    EXECUTE IMMEDIATE 'ALTER SESSION SET STATISTICS_LEVEL=ALL';

    ----------------------------------------------------------------
    --logic to determine whether you want to trace binds also
    ----------------------------------------------------------------
    IF :new.runcntlid LIKE 'TRACE%BIND%' THEN(6)
      l_binds := TRUE;
    END IF;
    ----------------------------------------------------------------

    sys.dbms_monitor.session_trace_enable(waits=>TRUE,binds=>l_binds);(7)
--  EXECUTE IMMEDIATE 'ALTER SESSION SET EVENTS ''10046 TRACE NAME CONTEXT FOREVER, LEVEL 12''';

  ELSIF :new.prcstype = 'Application Engine' THEN(8)
    --explicitly disable trace if application server process
    sys.dbms_monitor.session_trace_disable;

    --reset max dump file size AFTER disabling trace
    EXECUTE IMMEDIATE 'ALTER SESSION SET MAX_DUMP_FILE_SIZE = 5M';

    --if not tracing, then reset tracefile identifier
    EXECUTE IMMEDIATE 'ALTER SESSION SET TRACEFILE_IDENTIFIER(4) = ''''';
  END IF;
EXCEPTION WHEN OTHERS THEN NULL;
END;
/

show errors

DROP TRIGGER sysadm.unset_trace;

rem test that the trigger fires by updating something
UPDATE  sysadm.psprcsrqst new
SET     runstatus = 7
WHERE   runstatus != 7
AND     new.prcstype != 'PSJob'
AND    rownum < 1
--AND 1=2
;

ROLLBACK;
spool off
1. The trigger fires when any process begins. Previously I used logic in the WHEN clause to restrict when it fired.
2. Wait events are collected in the trace by default.
3. Irrespective of whether trace is enabled, the trigger also calls the dbms_application_info package to set the module name to the PeopleSoft Process Name, and action to the PeopleSoft Process Instance number.
4. Also irrespective of whether trace is enabled, the trace file identifier is set so that the trace file name includes the Process Type, Process Name, Process Instance Number, and the name of the Process Scheduler.
(Updated 6.2.2009) Previous versions of this trigger set tracefile identifier irrespective of whether trace was enabled. Unfortunately, this causes a small trace file to be generated for every process.5. Logic in the trigger determines whether a SQL Trace will be enabled. The supplied version of the trigger traces proceses whose run controls begin with 'TRACE'
6. Additional logic has been added to also enable bind variables to be captured if the run control begins with 'TRACEBIND'.
7. The dbms_monitor package is new in Oracle 10g, and is the only officially supported way to enable extended trace. If using the trigger on earlier versions of the database, it should be replaced with the commented code to set event 10046.
8. If it is not required to SQL Trace a process, and the process is an Application Engine program, then trace is explicitly disabled. Thus, if a PSAESRV process has traced an AE program, trace is disabled when it next handles a request. This has the advantage that the stat lines for the cursors are written to the trace file. However, the disadvantage is that the process continues to trace between requests, and the time between requests is reported as 'SQL*Net message from client'.