Batch processing is like opera (and baseball) - "It ain't over till the fat lady sings". Users care about when it starts and when it finishes. If the last process finishes earlier, then that is an improvement in performance.
This note describes a method of additionally prioritising processes queued to run on the process scheduler in PeopleSoft by their requesting operator ID and run control. Where processing consists of more instances of the same process than can run concurrently, it can be used to make the process scheduler run longer-running processes before shorter-running processes that were scheduled earlier, thus completing batch processing earlier.
In PeopleSoft, without customisation, it is only possible to prioritise processes queued to run on the process scheduler by assigning a priority to the process definition or their process category. Higher priority processes are selected to be run in preference to lower priorities. Otherwise, processes are run in the order of the time at which they are requested to run.
Problem Statement
This chart shows the database activity when the batch runs. We often see what has come to be called the 'long tail' while we wait for just a few long-running processes to complete.
Solution
- process_prioritisation_by_cumulative_runtime.sql - master script that creates metadata table and trigger and then:
- nvision_prioritisation_by_cumulative_runtime.sql - example script to create a procedure to populate metadata for nVision batch.
- gppdprun_prioritisation_by_cumulative_runtime.sql - example script to create a procedure to populate metadata for Payroll/Absence calculation batch.
- process_prioritisation_by_cumulative_runtime_test.sql - test trigger by inserting dummy data into process queue table.
- process_prioritisation_by_cumulative_runtime_report.sql - example of a report to compare median execution time with last execution time.
Metadata Table: PS_XX_GFCPRCSPRTY
We need a table that will hold the priority for each combination of process type, process name, operation ID, and run control ID. A corresponding record should be created using the Application Designer project in the GitHub repository.
create table sysadm.ps_xx_gfcprcsprty
(prcstype VARCHAR2(30 CHAR) NOT NULL
,prcsname VARCHAR2(12 CHAR) NOT NULL
,oprid VARCHAR2(30 CHAR) NOT NULL
,runcntlid VARCHAR2(30 CHAR) NOT NULL
,prcsprty NUMBER NOT NULL
--------------------optional columns
,avg_duration NUMBER NOT NULL
,med_duration NUMBER NOT NULL
,max_duration NUMBER NOT NULL
,cum_duration NUMBER NOT NULL
,tot_duration NUMBER NOT NULL
,num_samples NUMBER NOT NULL
) tablespace ptapp;
create unique index sysadm.ps_xx_gfcprcsprty
on sysadm.ps_xx_gfcprcsprty(prcstype, prcsname, oprid, runcntlid)
tablespace psindex compress 3;
Trigger Before Insert into PSPRCSQUE
As processes are scheduled in PeopleSoft, a row is inserted into the process scheduler queue table PSPRCSQUE. A trigger will be created on this table that fires after the insert. It will look for a matching row on the metadata table, PS_XX_GFCPRCSPRTY for the combination of process type, process name, operator ID, and run control ID. If found, the trigger will assign the specified priority to the process request. Otherwise, it will take no action.
CREATE OR REPLACE TRIGGER sysadm.psprcsque_set_prcsprty
BEFORE INSERT ON sysadm.psprcsque
FOR EACH ROW
WHEN (new.prcsname = 'RPTBOOK')
DECLARE
l_prcsprty NUMBER;
BEGIN
SELECT prcsprty
INTO l_prcsprty
FROM ps_xx_gfcprcsprty
WHERE prcstype = :new.prcstype
AND prcsname = :new.prcsname
AND oprid = :new.oprid
AND runcntlid = :new.runcntlid;
:new.prcsprty := l_prcsprty;
EXCEPTION
WHEN no_data_found THEN NULL;
WHEN others THEN NULL;
END;
/
show errors
In this case, I am only assigning priorities to RPTBOOK processes, so I have added a when clause to the trigger so that it only fires for RPTBOOK process requests. This can either be changed for other processes or removed entirely.
Priority Metadata
How the priorities should be defined will depend on the specific use case. In some cases, you may choose to create a set of metadata that remains unchanged.
In this case, the objective is that the processes to take the longest to run should be executed first. Therefore, I decided that the priority of each nVision report book process (by operator ID and run control ID) will be determined by the median elapsed execution time in the last two months. The priorities are allocated such that the sum of the median execution times for each priority will be as even as possible.
I have created a PL/SQL procedure GFCPRCSPRIORITY to truncate the metadata table and then repopulate it using a query on the process scheduler table (although, an Application Engine program could have been written to do this instead). Some details of that query will vary with the exact use case. The procedure is executed daily, thus providing a feedback loop so if the run time varies over time, or new processes are added to the batch, it will be reflected in the priorities.
REM nvision_prioritisation_by_cumulative_runtime.sql
set serveroutput on
create or replace procedure sysadm.gfcprcspriority as
PRAGMA AUTONOMOUS_TRANSACTION; --to prevent truncate in this procedure affecting calling session
l_hist INTEGER := 61 ; --consider nVision processes going back this many days
begin
EXECUTE IMMEDIATE 'truncate table ps_xx_gfcprcsprty';
--populate priorty table with known nVision processes
insert into ps_xx_gfcprcsprty
with r as (
select r.prcstype, r.prcsname, r.prcsinstance, r.oprid, r.runcntlid, r.runstatus, r.servernamerun
, CAST(r.rqstdttm AS DATE) rqstdttm
, CAST(r.begindttm AS DATE) begindttm
, CAST(r.enddttm AS DATE) enddttm
from t, psprcsrqst r
inner join ps.psdbowner p on r.dbname = p.dbname -- in test exclude any history copied from another database
where r.prcstype like 'nVision%' --limit to nVision processes
and r.prcsname like 'RPTBOOK' -- limit to report books
and r.enddttm>r.begindttm --it must have run to completion
and r.oprid IN('NVISION','NVISION2','NVISION3','NVISION4') --limit to overnight batch operator IDs
and r.begindttm >= TRUNC(SYSDATE)+.5-l_hist --consider process going back l_hist days from midday today
and r.runstatus = '9' --limit to successful processes
and r.begindttm BETWEEN ROUND(r.begindttm)-5/24 AND ROUND(r.begindttm)+5/24 --started between 7pm and 5am
), x as (
select r.*, CEIL((enddttm-begindttm)*1440) duration -–rounded up to the next minute
from r
), y as (
select prcstype, prcsname, oprid, runcntlid
, AVG(duration) avg_duration
, MEDIAN(CEIL(duration)) med_duration
, MAX(duration) max_duration
, SUM(CEIL(duration)) sum_duration
, COUNT(*) num_samples
from x
group by prcstype, prcsname, oprid, runcntlid
), z as (
select y.*
, sum(med_duration) over (order by med_duration rows between unbounded preceding and current row) cum_duration
, sum(med_duration) over () tot_duration
from y
)
select prcstype, prcsname, oprid, runcntlid
, avg_duration, med_duration, max_duration, cum_duration, tot_duration, num_samples
--, CEIL(LEAST(tot_duration,cum_duration)/tot_duration*3)*4-3 prcsprty --3 priorities
, CEIL(LEAST(tot_duration,cum_duration)/tot_duration*9) prcsprty --9 priorities
--, DENSE_RANK() OVER (order by med_duration) prcsprty --unlimited priorities
from z
order by prcsprty, cum_duration;
dbms_output.put_line(sql%rowcount||' rows inserted');
commit;
end gfcprcspriority;
/
show errors
Metadata
This is the metadata produced on a test system by the above query. It will vary depending on what has been run recently, and how it performed. There are more, shorter processes in the lower priority groups, and fewer, longer processes in the higher priority groups.
PRCSTYPE PRCSNAME OPRID RUNCNTLID PRCSPRTY
------------------------------ ------------ ------------ ------------------------------ ----------
…
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_XXX1 1
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_XXX3 1
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_LLLL8 1
…
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_LLLL9 2
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_STAT8 2
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_STAT1 2
…
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_TEMPXX 6
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_3 6
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_17 6
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_STAT4 7
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_28 7
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_INCXXX 8
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_MORYYY1 8
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_24 9
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_16 9
A Test Script
This test script inserts some dummy rows into PSPRCSQUE to check whether a priority is assigned by the trigger. The insert is then rolled back.
INSERT INTO psprcsque (prcsinstance, prcstype, prcsname, oprid, runcntlid)
VALUES (-42, 'nVision-ReportBook', 'RPTBOOK', 'NVISION','NVS_RPTBOOK_17');
INSERT INTO psprcsque (prcsinstance, prcstype, prcsname, oprid, runcntlid)
VALUES (-43, 'nVision-ReportBook', 'RPTBOOK', 'NVISION','NVS_RPTBOOK_STAT1');
select prcsinstance, prcstype, prcsname, oprid, runcntlid, prcsprty from psprcsque where prcsinstance IN(-42,-43);
rollback;
You can see that it was successful because priorities 2 and 7 were assigned.
PRCSINSTANCE PRCSTYPE PRCSNAME OPRID RUNCNTLID PRCSPRTY
------------ ------------------------------ ------------ ------------ ------------------------------ ----------
-43 nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_STAT1 2
-42 nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_17 7
Monitoring Script
This query in script process_prioritisation_by_cumulative_runtime_report.sql reports on the average, median, and cumulative median execution time for each nVision process that ran to success during the overnight processing window as calculated by the package GFCPRCSPRIORITY and stored in PS_XX_GFCPRCSPRTY. It also compares that to the priority and last actual run time for that process.
Example Output
Cum.
Average Median Median Total Last Run Actual
Prcs Duration Duration Duration Duration Duration Num Process Duration Duration Duration Priorty
PRCSTYPE PRCSNAME OPRID RUNCNTLID Prty (mins) (mins) (mins) (mins) (mins) Samples Priority (mins) Diff % Diff Diff
-------------------- ---------- ------------ -------------------- ---- -------- -------- -------- -------- -------- ------- -------- -------- -------- -------- -------
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_4 9 90.65 131 209 1834 1997 23 6 189 58 44 3
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_16 9 159.17 163 209 1997 1997 23 9 177 14 9 0
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_14 8 89.26 127 215 1703 1997 23 6 167 40 31 2
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_24 8 115.87 117 165 1576 1997 23 9 144 27 23 -1
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_MORYYY1 7 93.13 85 165 1459 1997 23 8 158 73 86 -1
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_28 7 88.30 80 172 1374 1997 23 8 108 28 35 -1
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_INCXXX 6 83.61 79 149 1294 1997 18 7 118 39 49 -1
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_17 6 70.96 69 105 1143 1997 23 7 81 12 17 -1
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_STAT4 6 68.00 72 81 1215 1997 8 7 81 9 13 -1
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_MMMMMM 5 52.45 46 119 914 1997 22 5 91 46 100 0
nVision-ReportBook RPTBOOK NVISION2 NVS_RPTBK_TEMPXX 5 50.48 49 104 963 1997 23 5 94 45 92 0
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_3 5 55.52 55 99 1018 1997 23 6 79 24 44 -1
nVision-ReportBook RPTBOOK NVISION NVS_RPTBOOK_1 5 47.70 56 137 1074 1997 23 4 56 0 0 1
…
Monitoring Query
The query in prcsmap.sql is used to produce the data for a map of the processes, showing request time, time spent queuing, and time spent executing. It is the basis of the second chart above. I normally run this in SQL Developer and export the data as an Excel workbook. There is an example spreadsheet in the Github repository.
No comments :
Post a Comment