Tuesday, August 28, 2007

Performance Metrics for Scheduled Queries in PeopleSoft

The ability to log executions of ad-hoc Queries executed via the Query Viewer in the PIA was introduced in PeopleTools 8.44. However, not all queries are executed directly, they can also be scheduled via the Process Scheduler. Queries whose result sets are larger than the maximum fetch size for the PSQRYSRV server raise an error, and they must be scheduled.

Don't be tempted to increase the maximum fetch size because the entire result set for queries run in the PIA is set to the Java pool in the web server, and so it can impact all other users of that web server.

Scheduling queries is a good thing because:
i) You no longer copy the results to the web server Java pool.
ii) You can restrict the number of queries that can execute concurrently with standard Process Scheduler functionality.

However, Query logging does apply to scheduled queries, and recently I needed to find a way to work out which scheduled queries were executed most frequently and took the longest time. With most batch processes, it is just a matter of querying the process scheduler request table PSPRCSRQST, but scheduled queries are executed by an Application Engine program called PSQUERY. The query name is specified on a run control, but operators tend to recycle the run controls, so I can't use them. All queries will look the same. However, the name of the query output file includes the name of the query. The list of posted output files is held on the table PS_CDM_FILE_LIST. By joining these two tables, I know how long it took to execute each query, and I can construct the following query for Oracle RDBMS (updated 10.10.2011).

column qryname   format a30     heading 'Query Name'
column avg_secs format 9,999.9 heading 'Average|Exec|(s)'
column sum_secs format 999,990 heading 'Total|Exec|(s)'
column num_execs format 999,990 heading 'Number|of|Execs'
SELECT QRYNAME
, AVG(EXEC_SECS) AVG_SECS
, SUM(EXEC_SECS) SUM_SECS
, COUNT(*) NUM_EXECS
FROM (
SELECT
SUBSTR(F.FILENAME,1,INSTR(FILENAME,'-'||LTRIM(TO_CHAR(F.PRCSINSTANCE))||'.')-1
) QRYNAME
,((ENDDTTM+0)-(BEGINDTTM+0))*86400 EXEC_SECS
FROM PSPRCSRQST P, PS_CDM_FILE_LIST F
WHERE P.PRCSNAME = 'PSQUERY'
AND P.RUNSTATUS = 9
AND P.PRCSINSTANCE = F.PRCSINSTANCE
AND NOT F.CDM_FILE_TYPE IN('LOG','AET','TRC')
)
GROUP BY QRYNAME
ORDER BY SUM_SECS DESC
/


Average Total Number
Exec Exec of
Query Name (s) (s) Execs
------------------------------ -------- -------- --------
PYRL_SEGMENTATION 3,846.6 38,466 10
PYRL_MISC_ELEMENT 3,363.3 20,180 6
PYRL_ELEMENT_MISC 1,674.8 20,097 12
PYRL_NO_DED_RECIP 2,366.3 18,930 8
PYRL_LOAN_BY_PERIOD 2,888.0 5,776 2
PYRL_OPAY_AMT 1,237.0 4,948 4
PYRL_NET 4,893.0 4,893 1
Q_ELEMENT_AMOUNTS 1,083.5 2,167 2
PYRL_NI_EXCEP_LEAVERS 412.0 2,060 5
PYRL_BANK_DETAILS 179.5 1,795 10
...

But there some limitations:
i) Once a query has been purged from the Process Monitor is will no longer be included in the statistics.
ii) This technique cannot distinguish between a public and a private query of the same name. If any operators copy a public query to a private one of the same name, this query will report them all as one.

I have also produced a version of this query for SQL Server:

SELECT
SUBSTRING(F.FILENAME,1,PATINDEX('%-'+LTRIM(STR(F.PRCSINSTANCE))+'.%',FILENAME)-1) QRYNAME
, AVG(DATEDIFF(s,BEGINDTTM,ENDDTTM)) AVG_SECS
, SUM(DATEDIFF(s,BEGINDTTM,ENDDTTM)) SUM_SECS
, COUNT(*) NUM_EXECS
FROM PSPRCSRQST P, PS_CDM_FILE_LIST F
WHERE P.PRCSNAME = 'PSQUERY'
AND P.RUNSTATUS = 9
AND P.PRCSINSTANCE = F.PRCSINSTANCE
AND F.CDM_FILE_TYPE IN('CSV','XLS') --there might be more
GROUP BY SUBSTRING(F.FILENAME,1,PATINDEX('%-'+LTRIM(STR(F.PRCSINSTANCE))+'.%',FILENAME)-1)
ORDER BY SUM_SECS DESC