Thursday, June 15, 2006

Additional Batch Timings from Application Engine

I am doing some Application Engine development at the moment as a part of a project to replatform to Oracle. I go through an iterative process of changing SQL Server specific SQL (in a custom AE program) to run on Oracle, and then running the process until it fails again. However, not only must the processes run correctly on Oracle, but the new code needs to be efficient as possible in order to meet performance targets. So, I need to know how long each step in the AE program took to execute.

Application Engine batch timings are very useful metrics when tuning processes, but they are only written when the program ends, and they are only written to the timings tables in the database when a program completes successfully. I am repeatedly restarting the same process, gradually moving towards the end, so I will never get a complete timings report.

However, Application Engine programs writes a status string to table PSPRCSRQSTSTRNG which contains the step about to be executed. The value of this string can be seen in Process Monitor in the Process Detail page.

desc PSPRCSRQSTSTRNG
Name Null? Type
-------------------- -------- ---------------
PRCSINSTANCE NOT NULL NUMBER(38)
RQSTSTRINGTYPE NOT NULL VARCHAR2(2)
PRCSRQSTSTRING NOT NULL VARCHAR2(254)

I created a table to which a trigger logs what steps are executed. I have also used a sequence number to produce a reliable primary key on that table because values in the string will repeat when Application Engine is in a loop or when a process is restarted process in which case it keeps the same process instance number. The sequence number is generated by an Oracle Sequence to avoid locking problems caused by using a table to generate a sequence (as PeopleSoft generally does). I am not worried by gaps in the sequence.

CREATE TABLE dmk_prcsrqststrng_log
(prcsinstance NUMBER NOT NULL
,sequence_nbr NUMBER NOT NULL
,rqststringtype VARCHAR2(2) NOT NULL
,prcsrqststring VARCHAR2(254) NOT NULL
,datestamp DATE NOT NULL
,datetimestamp TIMESTAMP NOT NULL
,CONSTRAINT dmk_prcsrqststrng_log_pk
PRIMARY KEY (prcsinstance, sequence_nbr)
);

CREATE SEQUENCE dmk_prcsrqststrng_seq;

A trigger on PSPRCSRQSTSTRNG writes new values for PRCSRQSTSTRING to the log table. I have used an autonomous transaction in case the update rolls back after an error, and so that I can see the log record before Application Engine commits. I am also storing two timestamps when the audit record is written. The Oracle date type is only accurate to the second, but I have also used the newer timestamp datatype. On Oracle 9.2 (on Windows) I get millisecond accuracy, but on Oracle 10.2 I get microsecond accuracy.

CREATE OR REPLACE TRIGGER dmk_prcsrqststrng_log
AFTER INSERT OR UPDATE OR DELETE
ON sysadm.psprcsrqststrng
FOR EACH ROW
DECLARE
PRAGMA autonomous_transaction;
BEGIN
IF INSERTING OR UPDATING THEN
INSERT INTO dmk_prcsrqststrng_log
(prcsinstance
,sequence_nbr
,rqststringtype
,prcsrqststring
,datestamp
,datetimestamp)
VALUES
(:new.prcsinstance
,dmk_prcsrqststrng_seq.nextval
,:new.rqststringtype
,:new.prcsrqststring
,SYSDATE
,SYSTIMESTAMP);
ELSIF DELETING THEN
INSERT INTO dmk_prcsrqststrng_log
(prcsinstance
,sequence_nbr
,rqststringtype
,prcsrqststring
,datestamp
,datetimestamp)
VALUES
(:old.prcsinstance
,dmk_prcsrqststrng_seq.nextval
,:old.rqststringtype
,''
,SYSDATE
,SYSTIMESTAMP);
END IF;
COMMIT;
EXCEPTION
WHEN OTHERS THEN NULL;
END;
/
show errors

Unfortunately, the status record is not always updated or deleted when the Application Engine program ends. Therefore, another trigger is required to delete the process record when the application updates its status to no longer processing.

CREATE OR REPLACE TRIGGER dmk_prcsrqststrng_del
AFTER UPDATE OF runstatus ON sysadm.psprcsrqst
FOR EACH ROW
WHEN (new.runstatus != 7 AND old.runstatus = 7
AND new.prcstype IN ('Application Engine'))
DECLARE
PRAGMA autonomous_transaction;
BEGIN
DELETE FROM psprcsrqststrng WHERE prcsinstance = :new.prcsinstance;
COMMIT;
EXCEPTION WHEN OTHERS THEN NULL;
END;
/
show errors

This table can be used is various ways.

i) A simple report can be written to show the last 10 steps for each process instance, and their duration. I can run this while the program is executing to see how far it has got, and how long the current step has been executing.

COLUMN ranking FORMAT 990 HEADING 'Rank'
COLUMN sequence_nbr FORMAT 990 HEADING 'Seq'
COLUMN prcsrqststring FORMAT a25
COLUMN ratio FORMAT 90.0 HEADING '%'
COLUMN sum_duration FORMAT 999,990.000000 HEADING 'Total|Duration'
COLUMN duration FORMAT 99,990.000000
COLUMN avg_duration FORMAT 99,990.000000 HEADING 'Average|Duration'
COLUMN max_duration FORMAT 99,990.000000 HEADING 'Maximum|Duration'
COLUMN datetimestamp FORMAT a25
COLUMN executions FORMAT 990 HEADING 'Execs'
BREAK ON REPORT ON PRCSINSTANCE SKIP 1

SELECT y.prcsinstance
, y.sequence_nbr
, y.prcsrqststring
, y.datetimestamp
, (TO_DATE(TO_CHAR(y.nextdatetimestamp,'YYYYMMDDHH24MISS'),'YYYYMMDDHH24MISS')
-TO_DATE(TO_CHAR(y.datetimestamp,'YYYYMMDDHH24MISS'),'YYYYMMDDHH24MISS'))*86400
+TO_NUMBER(TO_CHAR(y.nextdatetimestamp,'xFF'))
-TO_NUMBER(TO_CHAR(y.datetimestamp,'xFF')) duration
FROM (
SELECT x.prcsinstance
, x.sequence_nbr
, x.prcsrqststring
, x.datetimestamp
, CASE WHEN x.prcsrqststring = '' THEN x.nextdatetimestamp
ELSE NVL(x.nextdatetimestamp,SYSTIMESTAMP)
END as nextdatetimestamp
, x.ranking
FROM (
SELECT x.prcsinstance
, x.sequence_nbr
, x.prcsrqststring
, x.datetimestamp
, LEAD(x.datetimestamp,1) OVER (PARTITION BY x.prcsinstance ORDER BY x.sequence_nbr) AS nextdatetimestamp
, RANK() OVER (ORDER BY x.datetimestamp desc) AS ranking
FROM dmk_prcsrqststrng_log x
) x
) y
WHERE y.ranking <= 10 ORDER BY prcsinstance, sequence_nbr; PRCSINSTANCE Seq PRCSRQSTSTRING DATETIMESTAMP DURATION ------------ ---- ------------------------------ ------------------------- -------------- 242797 688 Main 15-JUN-06 09.56.49.000721 15.227588 689 XXXIF_PAY_FI.WRITEOUT.Step01 15-JUN-06 09.57.04.228309 15.233425 690 XXX_GEN_MD5.MAIN.Step01 15-JUN-06 09.57.19.461734 45.355060 691 XXXIF_PAY_FI.WRITEOUT.Step03 15-JUN-06 09.58.04.816794 2,738.654582 242814 681 XXXIF_PAYA.Z_DelOld.? 14-JUN-06 18.00.14.747175 8.575205 682 14-JUN-06 18.00.23.322380

242815 683 Main 14-JUN-06 18.01.15.606713 19.816770
684 XXXIF_PAYH.Insert.Ins01 14-JUN-06 18.01.35.423483 15.516303
685 XXXIF_PAYH.Insert.Ins04 14-JUN-06 18.01.50.939786 15.160971
686 XXXIF_PAYH.Process.Hire 14-JUN-06 18.02.06.100757 163.917761
687 14-JUN-06 18.04.50.018518

ii) I can also aggregate the time for all the steps to determine what steps account for the longest execution time.

SELECT *
FROM ( /*rank lines in report, also calculate ration to report*/
SELECT RANK() OVER (ORDER BY sum_duration DESC) AS ranking
, d.*
, 100*ratio_to_report(sum_duration) OVER () AS ratio
FROM ( /*calculate sum and durations*/
SELECT c.prcsrqststring
, SUM(c.duration) sum_duration
, COUNT(*) executions
, AVG(c.duration) avg_duration
, MAX(c.duration) max_duration
FROM (
SELECT b.prcsrqststring
, (TO_DATE(TO_CHAR(b.nextdatetimestamp
,'YYYYMMDDHH24MISS'),'YYYYMMDDHH24MISS')
-TO_DATE(TO_CHAR(b.datetimestamp
,'YYYYMMDDHH24MISS'),'YYYYMMDDHH24MISS'))*86400
+TO_NUMBER(TO_CHAR(b.nextdatetimestamp,'xFF'))
-TO_NUMBER(TO_CHAR(b.datetimestamp,'xFF')) duration
FROM (
SELECT a.prcsinstance
, a.sequence_nbr
, a.prcsrqststring
, a.datetimestamp
, LEAD(a.datetimestamp,1)
OVER (PARTITION BY a.prcsinstance
ORDER BY a.sequence_nbr) AS nextdatetimestamp
, RANK() OVER (ORDER BY a.datetimestamp desc) AS ranking
FROM dmk_prcsrqststrng_log a
) b
WHERE b.nextdatetimestamp IS NOT NULL
AND prcsrqststring != ''
) c
GROUP BY c.prcsrqststring
HAVING SUM(c.duration) > 0
) d
) e
WHERE ranking <= 10 ;

                                    Total                 Average        Maximum
Rank PRCSRQSTSTRING Duration Execs Duration Duration %
---- ------------------------------ --------------- ----- -------------- -------------- -----
1 XXXIF_PAYC.ADDR_LD.060 11,509.840101 15 767.322673 980.808154 29.0
2 XXXIF_PAYC.ADDR_LD.050 7,476.656270 15 498.443751 679.516699 18.8
3 XXXIF_PAYC.ADDR_LD.020 3,669.344222 14 262.096016 669.283625 9.2
4 XXXIF_PAYC.ASGN_LD.076 3,076.882638 13 236.683280 2,729.356869 7.7
5 XXXIF_PAYC.ASGN_LD.010 2,721.361198 15 181.424080 332.187057 6.8
6 XXXIF_PAYC.PURGE.020 2,644.848544 1 2,644.848544 2,644.848544 6.7
7 XXXIF_PAYC.ASGN_LD.020 624.892364 15 41.659491 155.971344 1.6
8 XXXIF_PAYC.ADDR_LD.010 623.354434 14 44.525317 136.919156 1.6
9 XXXIF_PAYC.ASGN_LD.030 583.222200 10 58.322220 425.172493 1.5
10 XXXIF_PAYC.ASGN_LD.060 543.455165 14 38.818226 125.473966 1.4

This method, although useful, is not foolproof. The status string is only updated if AE Restart is enabled, and then not for for every step followed by a COMMIT. I haven't yet been able to work out all the criteria for this.