Friday, November 10, 2023

Prioritising Scheduled Processes by Operator ID/Run Control

Batch processing is like opera (and baseball) - "It ain't over till the fat lady sings".  Users care about when it starts and when it finishes.  If the last process finishes earlier, then that is an improvement in performance. 

This note describes a method of additionally prioritising processes queued to run on the process scheduler in PeopleSoft by their requesting operator ID and run control.  Where processing consists of more instances of the same process than can run concurrently, it can be used to make the process scheduler run longer-running processes before shorter-running processes that were scheduled earlier, thus completing batch processing earlier.

In PeopleSoft, without customisation, it is only possible to prioritise processes queued to run on the process scheduler by assigning a priority to the process definition or their process category.  Higher priority processes are selected to be run in preference to lower priorities.  Otherwise, processes are run in the order of the time at which they are requested to run.

Problem Statement

During an overnight batch, many nVision report books are scheduled to run on the Windows process schedulers by one of several specific batch operator IDs.  Many more reports are scheduled than can run concurrently, so some execute while others queue.  Inevitably, the reports have widely varying execution times.  The maximum concurrency of the nVision report book (RPTBOOK) process definition has been set, and the Oracle database resource manager has also been configured, to prevent too many of these processes from overloading the database.

CPU Utilisation of BatchThis chart shows the database activity when the batch runs.  We often see what has come to be called the 'long tail' while we wait for just a few long-running processes to complete.

The next chart shows the processing time of each nVision report process.  The blue bars run from when the started to when it ended, ordered by start time.   The clear boxes below run from the time when it was requested to when it started, thus showing the period for which the processes were queued on a process scheduler, but were blocked because the maximum number of processes were already processing.
Process Map (without prioritisation)
All these processes run with the same priority because they are the same process definition.  Some long-running jobs execute earlier in the batch simply because they were scheduled earlier, but others that started later, run on beyond the end of the batch.  
It would be better if the longest-running processes were executed earlier, irrespective of the order in which they were requested.  Thus the shorter processes can run later as slots on the scheduler become free, and thus all processes should finished both closer together and earlier.
There is nothing delivered in the PeopleSoft process scheduler configuration that will let you assign different priorities to different executions of the same process.  In PeopleSoft, only three priorities are defined PRCSPRIORITY (1=Low, 5=Medium, 9=High) on the process definition and the server category.  These priorities are transferred to the process request queue (PSPRCSQUE.PRCSPRTY).  
If we could put our own priority into that column we could control the priority of the request.  


Introduce a database trigger that fires on insert into PSPRCSQUE and sets a priority specified on a new metadata table.  PRCSPRTY is not validated by PeopleSoft, and therefore any value can be specified.  
The files are available in a Github repository davidkurtz/psprcspty. The exact metadata will vary with the use case and requirements, but I provided some examples of how it might be generated.


We need a table that will hold the priority for each combination of process type, process name, operation ID, and run control ID.  A corresponding record should be created using the Application Designer project in the GitHub repository.

create table sysadm.ps_xx_gfcprcsprty
(prcstype  VARCHAR2(30 CHAR) NOT NULL
,prcsname  VARCHAR2(12 CHAR) NOT NULL
,oprid     VARCHAR2(30 CHAR) NOT NULL
,runcntlid VARCHAR2(30 CHAR) NOT NULL
,prcsprty  NUMBER NOT NULL
--------------------optional columns
,avg_duration NUMBER NOT NULL
,med_duration NUMBER NOT NULL
,max_duration NUMBER NOT NULL
,cum_duration NUMBER NOT NULL
,tot_duration NUMBER NOT NULL
,num_samples  NUMBER NOT NULL
) tablespace ptapp;

create unique index sysadm.ps_xx_gfcprcsprty
on sysadm.ps_xx_gfcprcsprty(prcstype, prcsname, oprid, runcntlid) 
tablespace psindex compress 3;

Trigger Before Insert into PSPRCSQUE

As processes are scheduled in PeopleSoft, a row is inserted into the process scheduler queue table PSPRCSQUE.  A trigger will be created on this table that fires after the insert.  It will look for a matching row on the metadata table, PS_XX_GFCPRCSPRTY for the combination of process type, process name, operator ID, and run control ID.  If found, the trigger will assign the specified priority to the process request.  Otherwise, it will take no action.

CREATE OR REPLACE TRIGGER sysadm.psprcsque_set_prcsprty
BEFORE INSERT ON sysadm.psprcsque
WHEN (new.prcsname = 'RPTBOOK')
  l_prcsprty NUMBER;
  SELECT prcsprty
  INTO   l_prcsprty
  FROM   ps_xx_gfcprcsprty
  WHERE  prcstype = :new.prcstype
  AND    prcsname = :new.prcsname
  AND    oprid = :new.oprid
  AND    runcntlid = :new.runcntlid;
  :new.prcsprty := l_prcsprty;
  WHEN no_data_found THEN NULL;
  WHEN others THEN NULL;
show errors

In this case, I am only assigning priorities to RPTBOOK processes, so I have added a when clause to the trigger so that it only fires for RPTBOOK process requests.  This can either be changed for other processes or removed entirely.

Priority Metadata

How the priorities should be defined will depend on the specific use case.  In some cases, you may choose to create a set of metadata that remains unchanged.

In this case, the objective is that the processes to take the longest to run should be executed first.  Therefore, I decided that the priority of each nVision report book process (by operator ID and run control ID) will be determined by the median elapsed execution time in the last two months.  The priorities are allocated such that the sum of the median execution times for each priority will be as even as possible.  

I have created a PL/SQL procedure GFCPRCSPRIORITY to truncate the metadata table and then repopulate it using a query on the process scheduler table (although, an Application Engine program could have been written to do this instead).  Some details of that query will vary with the exact use case. The procedure is executed daily, thus providing a feedback loop so if the run time varies over time, or new processes are added to the batch, it will be reflected in the priorities.

REM nvision_prioritisation_by_cumulative_runtime.sql

set serveroutput on
create or replace procedure sysadm.gfcprcspriority as
  PRAGMA AUTONOMOUS_TRANSACTION; --to prevent truncate in this procedure affecting calling session
  l_hist INTEGER := 61 ; --consider nVision processes going back this many days
  EXECUTE IMMEDIATE 'truncate table ps_xx_gfcprcsprty';

--populate priorty table with known nVision processes 
insert into ps_xx_gfcprcsprty
with r as (
select r.prcstype, r.prcsname, r.prcsinstance, r.oprid, r.runcntlid, r.runstatus, r.servernamerun
, CAST(r.rqstdttm AS DATE) rqstdttm
, CAST(r.begindttm AS DATE) begindttm
, CAST(r.enddttm AS DATE) enddttm
from t, psprcsrqst r
  inner join ps.psdbowner p on r.dbname = p.dbname -- in test exclude any history copied from another database
where r.prcstype like 'nVision%' --limit to nVision processes
and r.prcsname like 'RPTBOOK' -- limit to report books
and r.enddttm>r.begindttm  --it must have run to completion
and r.oprid IN('NVISION','NVISION2','NVISION3','NVISION4')  --limit to overnight batch operator IDs
and r.begindttm >= TRUNC(SYSDATE)+.5-l_hist --consider process going back l_hist days from midday today
and r.runstatus = '9' --limit to successful processes
and r.begindttm BETWEEN ROUND(r.begindttm)-5/24 AND ROUND(r.begindttm)+5/24 --started between 7pm and 5am
), x as (
select r.*, CEIL((enddttm-begindttm)*1440) duration -–rounded up to the next minute
from r
), y as (
select prcstype, prcsname, oprid, runcntlid
, AVG(duration) avg_duration
, MEDIAN(CEIL(duration)) med_duration 
, MAX(duration) max_duration
, SUM(CEIL(duration)) sum_duration
, COUNT(*) num_samples
from x
group by prcstype, prcsname, oprid, runcntlid
), z as (
select y.* 
, sum(med_duration) over (order by med_duration rows between unbounded preceding and current row) cum_duration 
, sum(med_duration) over () tot_duration
from y
select prcstype, prcsname, oprid, runcntlid 
, avg_duration, med_duration, max_duration, cum_duration, tot_duration, num_samples
--, CEIL(LEAST(tot_duration,cum_duration)/tot_duration*3)*4-3 prcsprty  --3 priorities
, CEIL(LEAST(tot_duration,cum_duration)/tot_duration*9) prcsprty  --9 priorities
--, DENSE_RANK() OVER (order by med_duration) prcsprty --unlimited priorities
from z
order by prcsprty, cum_duration;

  dbms_output.put_line(sql%rowcount||' rows inserted');

end gfcprcspriority;
show errors
In testing, I found that using just the 3 delivered levels of priority was not sufficiently granular to prioritise the jobs adequately, so I chose to use 9 levels (1 to 9).  The process priority on PRCSQUE is not validated, so I can use any value.  I also found I could just rank the processes from 1 to n by duration, and that would also work.

It is possible to create additional priority levels for process categories (see also More Process Priority Levels for the Process Scheduler), but that still only works for prioritising different processes over each other.


This is the metadata produced on a test system by the above query.  It will vary depending on what has been run recently, and how it performed.  There are more, shorter processes in the lower priority groups, and fewer, longer processes in the higher priority groups.  

PRCSTYPE                       PRCSNAME     OPRID        RUNCNTLID                        PRCSPRTY
------------------------------ ------------ ------------ ------------------------------ ----------
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_XXX1                          1
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_XXX3                          1
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_LLLL8                         1
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_LLLL9                         2
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_STAT8                       2
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_STAT1                       2
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_TEMPXX                        6
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_3                           6
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_17                          6
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_STAT4                       7
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_28                          7
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_INCXXX                        8
nVision-ReportBook             RPTBOOK      NVISION2     NVS_RPTBK_MORYYY1                       8
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_24                          9
nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_16                          9

A Test Script

This test script inserts some dummy rows into PSPRCSQUE to check whether a priority is assigned by the trigger. The insert is then rolled back.

INSERT INTO psprcsque (prcsinstance, prcstype, prcsname, oprid, runcntlid)
VALUES (-42, 'nVision-ReportBook', 'RPTBOOK', 'NVISION','NVS_RPTBOOK_17');
INSERT INTO psprcsque (prcsinstance, prcstype, prcsname, oprid, runcntlid)
VALUES (-43, 'nVision-ReportBook', 'RPTBOOK', 'NVISION','NVS_RPTBOOK_STAT1');
select prcsinstance, prcstype, prcsname, oprid, runcntlid, prcsprty from psprcsque where prcsinstance IN(-42,-43);

You can see that it was successful because priorities 2 and 7 were assigned.

PRCSINSTANCE PRCSTYPE                       PRCSNAME     OPRID        RUNCNTLID                        PRCSPRTY
------------ ------------------------------ ------------ ------------ ------------------------------ ----------
         -43 nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_STAT1                       2
         -42 nVision-ReportBook             RPTBOOK      NVISION      NVS_RPTBOOK_17                          7

Monitoring Script

This query in script process_prioritisation_by_cumulative_runtime_report.sql reports on the average, median, and cumulative median execution time for each nVision process that ran to success during the overnight processing window as calculated by the package GFCPRCSPRIORITY and stored in PS_XX_GFCPRCSPRTY.  It also compares that to the priority and last actual run time for that process.

Example Output

                                                                        Average   Median            Median    Total         Last Run   Actual                   
                                                                  Prcs Duration Duration Duration Duration Duration     Num  Process Duration Duration Duration Priorty
PRCSTYPE             PRCSNAME   OPRID        RUNCNTLID            Prty   (mins)   (mins)   (mins)   (mins)   (mins) Samples Priority   (mins)    Diff    % Diff    Diff
-------------------- ---------- ------------ -------------------- ---- -------- -------- -------- -------- -------- ------- -------- -------- -------- -------- -------
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_4           9    90.65      131      209     1834     1997      23        6      189       58       44       3 
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_16          9   159.17      163      209     1997     1997      23        9      177       14        9       0

nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_14          8    89.26      127      215     1703     1997      23        6      167       40       31       2
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_24          8   115.87      117      165     1576     1997      23        9      144       27       23      -1

nVision-ReportBook   RPTBOOK    NVISION2     NVS_RPTBK_MORYYY1       7    93.13       85      165     1459     1997      23        8      158       73       86      -1
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_28          7    88.30       80      172     1374     1997      23        8      108       28       35      -1

nVision-ReportBook   RPTBOOK    NVISION2     NVS_RPTBK_INCXXX        6    83.61       79      149     1294     1997      18        7      118       39       49      -1
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_17          6    70.96       69      105     1143     1997      23        7       81       12       17      -1
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_STAT4       6    68.00       72       81     1215     1997       8        7       81        9       13      -1

nVision-ReportBook   RPTBOOK    NVISION2     NVS_RPTBK_MMMMMM        5    52.45       46      119      914     1997      22        5       91       46      100       0
nVision-ReportBook   RPTBOOK    NVISION2     NVS_RPTBK_TEMPXX        5    50.48       49      104      963     1997      23        5       94       45       92       0
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_3           5    55.52       55       99     1018     1997      23        6       79       24       44      -1
nVision-ReportBook   RPTBOOK    NVISION      NVS_RPTBOOK_1           5    47.70       56      137     1074     1997      23        4       56        0        0       1

Monitoring Query

The query in prcsmap.sql is used to produce the data for a map of the processes, showing request time, time spent queuing, and time spent executing.  It is the basis of the second chart above. I normally run this in SQL Developer and export the data as an Excel workbook.  There is an example spreadsheet in the Github repository.

9 levels of Prioritisation

With prioritisation, we can see that the long-running jobs with higher priority ran earlier.  

We can also see that some of the higher-priority jobs that are scheduled later are running earlier than those scheduled earlier, and are thus finishing earlier.

There is no longer any tail of processing.  Instead, load drops quickly at the end of the batch, and the batch as a whole finishes earlier.

Thursday, April 27, 2023

Querying the PeopleSoft Message Log with SQL

It is easy to access the PeopleSoft message log in Process Monitor component, but it can be a little difficult to work with in SQL because it is stored in multiple tables.

This started when I wanted to generate a PeopleSoft log message as a single string of text, so I could investigate shared pool memory errors by searching for ORA-04031 errors.  Ultimately, the string 'ORA-04031' is stored in PS_MESSAGE_LOGPARM, but I wanted to see the whole error message.

  • Each process, has a request record on PSPRCSRQST, it can have many messages.
  • Each message is stored in the message log table PS_MESSAGE_LOG
  • The text of each message is stored in the message catalogue table PSMSGCATDEFN.  It can have up to 9 substitution strings (%1, %2, etc).
  • These correspond to up to 9 parameters stored on PS_MESSAGE_LOGPARM that are substituted into the message string.

I assemble the message text in a PL/SQL function exactly as PeopleTools programs do, substituting the variables in the message string from the message catalogue with the parameter values.  The PL/SQL function is put into the SQL query as a common table expression so that I don't have to create a function or package in the database.  The function returns the full message text in a CLOB, thus I can then easily manipulate the message string in SQL.

In this case, I wrote a SQL query to search for ORA-04031 (see psmsglogora4031.sql on Github), but the same PL/SQL function can be used in various queries.  

It can be slow to search the generated message string.  It can be faster to search PS_MESSAGE_LOGPARM directly. 

WITH FUNCTION psmsgtext(p_process_instance INTEGER, p_message_seq INTEGER) RETURN CLOB IS
  l_message_log ps_message_log%ROWTYPE;
  l_message_text CLOB;
  INTO   l_message_log
  FROM   ps_message_log 
  WHERE  process_instance = p_process_instance
  AND    message_seq = p_message_seq;

  SELECT message_text
  INTO   l_message_text
  FROM   psmsgcatdefn
  WHERE  message_set_nbr = l_message_log.message_set_nbr
  AND    message_nbr     = l_message_log.message_nbr;

  FOR i IN (
    SELECT *
    FROM   ps_message_logparm
    WHERE  process_instance = p_process_instance
    AND    message_seq = p_message_seq
    ORDER BY parm_seq
  ) LOOP
    l_message_text := REPLACE(l_message_text,'%'||i.parm_seq,i.message_parm);

  --and tidy up the unused replacements at the end
  RETURN REGEXP_REPLACE(l_message_text,'%[1-9]','');
x as (
select r.prcstype, r.prcsname, r.oprid, r.runcntlid
, l.*, psmsgtext(l.process_instance, l.message_seq) message_text
from ps_message_log l
LEFT OUTER JOIN psprcsrqst r ON r.prcsinstance = l.process_instance
select *
from x
ORDER BY dttm_stamp_sec
Now, I can easily produce a report of messages, like this:

Process              Process         Operator                             Process  Msg                                   Msg         Msg
Type                 Name            ID         Run Control              Instance  Seq JOBID           PROGRAM_NAME     Set#  Msg#   Sev DTTM_STAMP_SEC
-------------------- --------------- ---------- ---------------------- ---------- ---- --------------- --------------- ----- ----- ----- ----------------------------
nVision-ReportBook   RPTBOOK         VP1        NVS_XXXXXXX_99            1234567    1 PRCS SCHDL      psprschd           65    70     0 01/04/2023 20.37.21
Process Request shows status of 'INITIATED' or 'PROCESSING' but no longer running

nVision-ReportBook   RPTBOOK         VP1        NVS_XXXXXXX_99            1234567    2 PRCS SCHDL      psprschd           65    73     0 01/04/2023 20.37.23
PSNT1 failed to post files to the report repository.  Server scheduled to try again on 2023-04-01-  See log

Friday, April 14, 2023

Purging PeopleTools Physical Cache Files after Database Restore or Database Flashback

I have written previously about how to clear the physical cache files on a PeopleTools process, but I have found myself explaining it a few times recently, so I am going to post about it again.

When you refresh the database of a PeopleSoft system, you need to clear the physical cache files of the PeopleTools processes.  The files are outside the database and they no longer reflect what is inside the database.  This includes all application servers and process schedulers and anywhere where Application Designer or a client process is used.

It is common to refresh a database when testing a PeopleSoft system.  For example, to copy production to a performance test environment.  It is also increasingly common to use Oracle database flashback during testing.  A guaranteed restore point is taken, a test is performed, and then the database is flashed back to that restore point.  Flashback returns the whole database not just to the same logical state, but also the same physical state.  Block for block, the entire database is physically the same as when the restore point was taken.  Thus a test is completely repeatable with the same initial conditions.  Although the database instance will have been restarted during the flashback so the content of the database memory will have been cleared.

It is also common, after the flashback to then make small changes or corrections, take a new restore point and repeat the test.  Some of those changes might include Application Designer projects that will then be loaded into the physical cache.  Flashing the database back won't change the physical cache files stored outside the database, so they need to be cleared too.  Otherwise, they may have higher version numbers than the objects in the database, and caching won't work correctly.  When you retest, your changes may not be loaded and executed by PeopleTools processes.

The officially approved method is to go around each server and use either the purge option in the psadmin utility or manually delete the files.  See:

However, since at least PeopleTools 5, it has been possible to invalidate all physical cache files on all servers by updating the LASTREFRESHDTTM on the single row in table PSSTATUS.  Any cached object older than the value of LASTREFRESHDTTM will be purged from the cache when the process that reference that cache is started. Therefore, if immediately after a restore or flashback that value is updated to the current system time, all caches will be purged as the processes are restarted.


Wednesday, April 12, 2023

Programmatically Suspending and Restarting the Process Scheduler

I found this question on a message forum, wrote a note, and forgot about it:

Anyone have any tricks on suspending the process schedulers programmatically?  I just tried using the following but the weird thing was I saw at least one of my process schedulers unsuspend itself, so I must be missing an update to a table.


The process scheduler writes its status to SERVERSTATUS so that it can be seen in the Process Monitor.  Instructions to the process scheduler are read from SERVERACTION, so this is the column that must be updated.  Both columns have a set of XLAT values that translate the status.

PeopleSoft Field NameDescription
5=Running With No Report Node
6=Suspended - Disk Low
7=Suspended - Offline
8=Running - Report Rep. Full
SERVERACTIONProcess Server Action

You can see how PeopleSoft does this by tracing the Process Monitor component as it issues commands to the process scheduler.  Hence you can issue commands as follows:
  • Stop
update psserverstat
set    serveraction = 1 /*Stop*/
where  serverstatus = 3 /*Running*/
and    servername = …
  • Suspend
update psserverstat
set    serveraction = 2 /*Suspend*/
where  serverstatus = 3 /*Running*/
and    servername = …
  • Restart (after suspension)
update psserverstat
set    serveraction = 3 /*Restart*/
where  serverstatus = 2 /*Suspended*/
and    servername = …
  • Startup (if the Tuxedo domain is running)
update psserverstat
set    serveraction = 3
where  servername = …

Tuesday, April 11, 2023

Oracle SQL Tracing Processes from Startup

Sometimes, ASH and AWR are not enough.  SQL may not be sampled by ASH if it is short-lived, and even if it is sampled, the SQL may not be captured by AWR.  Sometimes, in order to investigate a problem effectively, it is necessary to use database session SQL trace.  

It is easy to trace a process initiated by the process scheduler with a trigger (see Enabling Oracle Database Trace on PeopleSoft processes with a Trigger).

Another tactic is to use an AFTER LOGON trigger with logic to look at the program name.  The program name can be read using SYS_CONTEXT().  If it matches what I am looking for, I can enable session trace.

Here is an example I used for the OpenXML nVision server PSNVSSRV

  • I want to trace SQL and not any wait events or bind variables.  Therefore, I will set event 10046 at level 1.
  • I also set a tracefile_identifier that will be included in the trace file name, so I can more easily identify the trace file.
REM additional SQL trace triggers
CREATE OR REPLACE TRIGGER sysadm.gfc_nvision_trace_on_logon
ON sysadm.schema
  l_process_instance INTEGER;
  l_program          VARCHAR2(64 CHAR);
  l_sql              VARCHAR2(100);

  INTO   l_program
  FROM   dual;

  IF l_program like 'PSNVSSRV%' THEN --then this is a NVISION session
    EXECUTE IMMEDIATE 'ALTER SESSION SET tracefile_identifier = ''PSNVSSRV''';

show errors
ALTER TRIGGER sysadm.gfc_nvision_trace_on_logon ENABLE;
See also Reading Trace files with SQL