Thursday, November 04, 2010

PeopleTools 8.50 uses DBMS_APPLICATION_INFO to Identify Database Sessions

I recently worked on a PeopleTools 8.50 system in production for the first time and was able to make use of the new Oracle specific instrumentation in PeopleTools.

PeopleTools now uses the DBMS_APPLICATION_INFO package to set module and action session attributes.  This data is then copied into the Active Session History (ASH).
  • Within the PIA, the application server sets module and action to the name of the current component and page within the current components
  • For Integration Broker messages they are set to service and queue name.
  • For Application Engine processes the module is set to PSAE and the action is set to the program name.
The first two of these three changes are very welcome, but I think the instrumentation of Application Engine is inadequate. I have a better suggestion which can be implemented with a database trigger.

Enterprise Manager
In Oracle Enterprise Manager, I can see the top SQL statements and group them by Module and Action, so I now can see which components are consuming the most time.  I can also produce an AWR for a specific component or page within a component.

OEM Screenshot of a PeopleTools 8.50 system.

I can query the ASH data to profile which pages consume the most time on the database.

MODULE                 ACTION                   ASH_SECS 
---------------------- ---------------------- ---------- 
RECV_PO                PO_PICK_ORDERS                240 
XXX_REQ_INQUIRY        xyzzy                         170 
XXX_REQ_WRKLST         XXX_REQ_WORKLIST              170 
VCHR_EXPRESS           VCHR_LINE_RECV_WRK            170 
XXX_FIN_WORKLIST       XXX_FIN_WORKLIST              160 
VCHR_EXPRESS           VCHR_EXPRESS1                 160 
PURCHASE_ORDER_EXP     PO_EXPRESS                    140 
XXX_HOME_PAGE          XXX_HOME_PAGE                 140 
RECV_PO                RECV_WPO                      130 
VCHR_EXPRESS           xyzzy                         120 
XXX_PUR_WORKLIST       XXX_PUR_WRKLST                120 
CDM_RPT                CDM_RPT_INDEX                 100 
sum                                                  2820

(Updated 19.11.2010) For some components the action is set to ‘xyzzy’. This seems to be a default value set when the component is opened, but before any of the pages are processed.  Therefore, it refers to activity in the search dialogue, including processing of :
  • look ups to obtain values for search criteria
  • SQL issued during SearchSave PeopleCode to validate the search criteria.
  • the query on the Component Search record
Batch Processes
Now, I want to look at what happens in batch processes. In previous versions of PeopleTools, the module was set to the same value as the program name, and action was left blank. This is not particularly helpful. In 8.50 module is set to PSAE for Application Engine processes, and action is set to the name of the program.

MODULE                       ACTION                      ASH_SECS
---------------------------- ------------------------- ----------
PSNVS.EXE                                                   10220
PSAE                         AP_PSTPYMNT                     3100
PSAE                         AP_MATCH                        2690
sqrw.exe                                                     1770
PSAE                         PO_RECVACCR                     1390
PSQRYSRV.exe                                                  880
PSAE                         FS_STREAMLN                      870
PSPUBDSP.exe                                                  850
PSBRKDSP.exe                                                  740
PSPRCSRV.exe                                                  690
PSSUBDSP.exe                                                  620

This is certainly better than in previous versions. However, its weakness is that if I have multiple concurrent instances of the same process, although I could tell that the ASH data had come from different sessions, I would not be determine which came from which session. This situation could occur, for example, in Global Payroll if ‘streaming’ was configured, where the payroll calculation can be broken into many processes that run concurrently.

In another blog posting, Using Oracle Enterprise Manager (Grid Control) with PeopleSoft, I proposed a trigger that writes name of the PeopleSoft process name to module, and the Process Instance number into action. Furthermore, this trigger works for all processes scheduled by the Process Scheduler.  I think that that trigger is still useful in PeopleTools 8.50.

Wednesday, September 15, 2010

Performance Overhead of Multiple SQL calls in SQR

I was asked to look at a fairly simple SQR program that reported on payroll data. It pivoted data for each employee on which it reported. It makes 21 calls to two procedures that each obtain a value by summing data across different sets of data in one of two payroll result tables.

The ASH data shows that most of the database time, 184 out of 192 seconds is spent in the two statements that aggregate that data. These statements are in the procedures that are called within the driving query.

------------- ------------------- ---------- ----------
515d3buvaf8us          1162018321        615        133
55a20fnkwv0ht          3972836246        615         51
sum                                                 192

However, more significantly, only 192 seconds out of a total elapsed run time of 615 seconds is spent in the database. That is just 31%. So most of the time is spent executing code within the SQR program.

We need to look at the code to see exactly what is happening.

This is the driving query. It returns each employee who is paid in a specified payroll, and then for each row the procedure get_gp_acum is used to fetch the sum of certain accumulators for that payroll and employee


  Let $pin_name = 'GBR AC GRTX SEG'  !Taxable Gross 
  do get_gp_acum
  Let #Taxable_gross = &b.calc_rslt_val
  Let $pin_name = 'GBR AC NIBL SEG'  !Nlable Gross
  do get_gp_acum
  Let #Niable_gross1 = &b.calc_rslt_val
from ps_gpgb_ee_rslt a, ps_person_name n
where a.emplid = n.emplid
and a.cal_run_id = $PNL_CAL_RUN_ID1
and a.empl_rcd = 0
and a.gp_paygroup = $PNL_PYGRP1
and a.cal_id = $PNL_CAL_ID1
and a.orig_cal_run_id = a.cal_run_id
order by a.emplid,a.gp_paygroup

This is one of the two procedures that is called to obtain each value.  It simply sums the data for that employee.

Begin-Procedure get_gp_acum
sum(b.calc_rslt_val) &b.calc_rslt_val
from ps_xx_gpacum_rpt1 b, ps_gp_pin c
where b.emplid    = &a.emplid
and b.empl_rcd    = 0
and b.cal_run_id  = &a.cal_run_id
and b.gp_paygroup = &a.gp_paygroup
and b.cal_id      = &a.cal_id
and b.orig_cal_run_id = &a.orig_cal_run_id
and b.rslt_seg_num    = &a.rslt_seg_num  
and b.orig_cal_run_id = b.cal_run_id
and b.pin_num = c.pin_num
and c.pin_nm = $pin_name

This code is very clear, well structured, and easy to maintain. The only trouble is that it is slow. Each SQL calls makes SQR do a lot of work, and that takes time.

In this case there was not much procedural code in the SQR and so I was able to coalesce the SQL from the called procedures into the driving query.

If the called procedures had been simple single row look-ups I could have used an outer-join. However, as they are using a group function (sum), I put the query into a scalar query (a query within in the select clause that returns only one row and one column). Each call to a procedure was replaced with a separate scalar query. I ended up with 21 scalar queries.

During this rewrite I encountered an SQR quirk; if the scalar query was placed in the main select clause, SQR produces errors because it is expecting an expression, and it complains that the SELECT keyword is not a variable. I then had to wrap the query in an in-line view. Each scalar query must be given a column alias, and the column alias can be referenced in the SQR select clause.

Begin-Procedure MAIN-REPORT

 Let #Taxable_gross = &a.gbr_ac_grtx_seg

 Let #Niable_gross1 = &a.gbr_ac_nibl_seg

from (
select a.emplid, a.cal_run_id, a.cal_id, a.gp_paygroup, a.orig_cal_run_id, a.rslt_seg_num, a.tax_code_uk, a.gpgb_ni_category
,NVL((SELECT sum(b.calc_rslt_val) from ps_xx_gpacum_rpt1 b, ps_gp_pin c 
      where b.emplid = a.emplid and b.empl_rcd = 0 and b.cal_run_id = a.cal_run_id 
      and b.gp_paygroup = a.gp_paygroup and b.cal_id = a.cal_id 
      and b.orig_cal_run_id = a.orig_cal_run_id and b.rslt_seg_num = a.rslt_seg_num 
      and b.orig_cal_run_id = a.cal_run_id and b.pin_num = c.pin_num 
      and c.pin_nm = 'GBR AC GRTX SEG'),0) gbr_ac_grtx_seg
,NVL((SELECT sum(b.calc_rslt_val) from ps_xx_gpacum_rpt1 b, ps_gp_pin c 
      where b.emplid = a.emplid and b.empl_rcd = 0 and b.cal_run_id = a.cal_run_id 
      and b.gp_paygroup = a.gp_paygroup and b.cal_id = a.cal_id 
      and b.orig_cal_run_id = a.orig_cal_run_id and b.rslt_seg_num = a.rslt_seg_num 
      and b.orig_cal_run_id = a.cal_run_id and b.pin_num = c.pin_num 
      and c.pin_nm = 'GBR AC NIBL SEG'),0) gbr_ac_nibl_seg
from ps_gpgb_ee_rslt a
where a.empl_rcd = 0
and a.orig_cal_run_id = a.cal_run_id
) a
,ps_person_name n
where a.cal_run_id = $PNL_CAL_RUN_ID1
and a.gp_paygroup = $PNL_PYGRP1
and a.cal_id = $PNL_CAL_ID1
and a.emplid = n.emplid
order by a.emplid,a.gp_paygroup

The SQL looks much more complicated, as does the execution plan.  However, the effect on performance was dramatic.

------------- ------------------- ---------- ----------
f9d03ffbftv81          1694704409        154        114
5d2x9mqvvyrjk           989254841        154          2
3v550ghn6z8jv          1521271881        154          1
sum                                                 117

The response of just the combined SQL at 117 seconds is better than the separate SQLs at 154 seconds. Much more significantly, the amount of time spent in SQR (rather than the database) has fallen from 432 seconds to just 37. Therefore, 90% of the SQR response time was spent on submitting the SQL calls in the called procedures.


SQL calls in SQR are expensive. The cost of making lots of calls inside a loop or another driving query can add up to a significant amount of time. SQRs that consume time in this way will also be consuming CPU and memory on the server where the Process Scheduler is located. 

In this case, combining SQL statements also improved SQL performance, but that will not always be the case.

There are times when better performance can be achieved at the cost of more convoluted code. In each case there is a judgement to be made as to whether improvement in performance is worth the increase in complexity.

Friday, September 10, 2010

PeopleSoft Run Control Purge Utility

Run Control records are used to pass parameters into processes scheduled processes. These tables tend to grow, and are rarely purged. Generally, once created a run control is not deleted.  When operator accounts are deleted, the Run Controls remain, but are no longer accessible to anyone else.

I have worked on systems where new Run Controls, whose IDs contain either a date or sequence number, are generated for each process. The result is that the Run Control tables, especially child tables, grow quickly and if not regularly managed will become very large. On one system, I found 18 million rows on one table!

--------------- ------------------ ---------- ----------
TL_RUN_CTRL_GRP RUN_CNTL_ID          18424536     126377
AEREQUESTPARM   RUN_CNTL_ID           1742676      19280
AEREQUESTTBL    RUN_CNTL_ID            333579       3271
XPQRYRUNPARM    RUN_CNTL_ID            121337       1630
TL_TA_RUNCTL    RUN_CNTL_ID            112920        622

I have written a simple Application Engine process, GFC_RC_ARCH, that purges old Run Controls from these tables.  It can be downloaded from my website.

Run Control records are easily identified. They are characterised by:
  • the first column of these tables is always OPRID, and the second is either RUNCNTLID or RUN_CNTL_ID,
  • these two columns are also the first two columns of the unique key,
  • the Run Control tables appear on pages of components that are declared as the process security component for that process.
I have decided that if the combination of OPRID and RUN_CNTL_ID does not appear in the process scheduler request table, PSPRCSRQST, then the Run Control record should be deleted. Thus, as the delivered Process Scheduler Purge process, PRCSPURGE, deletes rows from the Process Scheduler tables, so my purge process will delete rows from the Run Control tables.

I have chosen to make these two Application Engine processes mutually exclusive, so the Process Scheduler will not run both at the same time, but that configuration cannot be delivered in an Application Designer project.

Monday, July 12, 2010

Announcing the Co-Operative PeopleTools Table Reference

Update 20.6.2019: I have published on Github the code used to generate the PeopleTools Table reference described here.  The element of co-operation is should be easier via updates to the metadata scripts in Github.  See blog post PeopleTools Table Reference Generator.

I have created a reference to the PeopleTools tables and views on my website. 

In the course of my work on PeopleSoft, I spend a lot of time looking at the PeopleTools tables. They contain meta-data about the PeopleSoft application. Much of the application is stored in various tables that are maintained by Application Designer. Some tables provide information about the Data Model. Others contain configuration data that is maintained via PeopleTools components in the PIA.

Many of my utility scripts query information from PeopleTools tables and some also update them. Of course, that is strictly not supported, but if you understand how the tables fit together it can be done relatively safely.  So, it is very helpful to be able to understand what is in these tables.

In PeopleSoft for the Oracle DBA, I discussed some of the PeopleTools tables that are of regular interest. I included the tables that correspond to the database catalogue, and I discussed what happens during the PeopleSoft login procedure, submission of process requests to the Process Scheduler and PS/Query. The tables that are maintained by the process scheduler are valuable because they contain information about who ran what process when, and how long they ran.

I am not the only person to have started to document the PeopleTools tables on their website or blog, most people have picked a few tables that are of particular interest. However, I want to tackle the problem in a slightly different way. There are over 3000 PeopleTools tables and views (as defined by the PeopleTools object security group in PSOBJGROUP). Tackling all of them manually would be a monumental task.

Nevertheless, I do want a complete reference. So, I have written code to dynamically generate a page for each PeopleTools table and view, and I have put as much information about these records as I can find in the PeopleTools tables themselves. Reference to related objects, including objects referenced in the text of views, appear as links to those pages.

I have started to manually add my own annotation to the generated pages.  So far I have only added descriptions to a few tables (marked with an asterisk). However, I would like to make this a collaborative project, and I have already had updates to some pages.
  • There is a page for each PeopleTools table and view. If you save that page, add descriptions, and return it to me, I will upload it to the site.
  • You can also add links to related websites and blog pages to your entries.
  • Please put your name and, if you wish, a link to your website to the bottom of the pages you author.
  • Let me know if you think you have found a mistake. 
I hope you find it useful.

    Thursday, June 17, 2010

    Configuring Large PeopleSoft Application Servers

    Occasionally, I see very large PeopleSoft systems running on large proprietary Unix servers with many CPUs.  In an extreme case, I needed to configure application server domains with up to 14 PSAPPSRV processes per domain (each domain was on a virtual server with 8 CPU cores, co-resident with the Process Scheduler).

    The first and most important point to make is don't have too many server processes.  If you run out of CPU or if you fully utilise all the physical memory and start to page memory from disk, then you have too many server processes.  It is better to queue on a Tuxedo queue rather than the CPU run queue, or disk queue during paging.

    Multiple APPQ/PSAPPSRV Queues

    A piece of advice that I originally got from BEA (prior to their acquisition by Oracle) was that you should not have more than 10 server processes on a single queue in Tuxedo.  Otherwise, you are likely to suffer from contention on the IPC queue structure because processes must acquire exclusive access to the queue in order to enqueue a service request to the queue or dequeue a request from it.  Instead multiple queues should be configured that are both serviced by the same server processes and so advertise the same services. 

    If you look at the 'large' template delivered by PeopleSoft, you will see that it produces a domain that runs between 9 and 15 PSAPPSRV processes.  This does not conform to the advice I received from BEA.  I repeated this advice in PeopleSoft for the Oracle DBA.  Though I cannot now find the source for it, I stand by it.  I have recently been able to conduct some analysis to confirm it on a real production system.  Domains with two queues of 8 PSAPPPSRV server process each out performed domains with only a single queue.

    Load Balancing Across Queues

    If the same service is advertised on multiple queues, then Tuxedo recommended that you should specify realistic service loads and use Tuxedo load balancing to determine where to enqueue requests.  I want to emphasise that I am talking about load balancing across queues within a Tuxedo domain, and not about load balancing across Tuxedo domains in the web server.

    This is what the Tuxedo documentation says about load balancing:

    "Load balancing is a technique used by the BEA Tuxedo system for distributing service requests evenly among servers that offer the same service. Load balancing avoids overburdening some servers while leaving others idle or infrequently used. Before sending a request to a service routine, the BEA Tuxedo system identifies all servers capable of handling the request and selects the one most appropriate for maintaining a balanced load across all the servers in the configuration.

    You can control whether a load-balancing algorithm is used on the system as a whole. Such as algorithm should be used only when necessary, that is, only when a service is offered by servers that use more than one queue. Services offered by only one server, or by multiple servers in a Multiple Server, Single Queue (MSSQ) do not need load balancing. The LDBAL parameter for these services should be set to N. In other cases, you may want to set LDBAL to Y."

    It doesn't state that load balancing is mandatory for multi-queue domains, and only hints that it might improve performance.  If load balancing is not used, the listener process puts the messages on the first empty queue (one where no requests are queued).  If all queues have requests the listener round-robins between the queues.

    You could consider giving ICScript, GetCertificate and other services with small service times a higher Tuxedo Service priority.  This means they jump the queue 9 times out of 10.  ICScript is generally used during navigation, GetCertificate is used at log on.  Giving these services higher priority will mean they perform well even when the system is busy.  Users often need to do several mouse clicks to navigate around the system, but these services are usually quick.  This will improve the user experience without changing the overall performance of the system.


    I have recently been able to test the performance of a domains with up to 14 PSAPPSRVs on a single IPC queue, versus domains with two queues with up to 7 PSAPPSRVs each, both with and without Tuxedo queue balancing.   These results come from a real production system where the multiple queue configuration was implemented on 2 of the 4 application servers.  The system has a short-lived weekly peak period of on-line processing.  During that time Tuxedo spawns additional PSAPPSRV processes, and so I get different sets of times for different numbers of process. 

    The timings are produced from transactions sampled by PeopleSoft Performance Monitor.  I capture the number of spawned processes using the Tuxmon scripts on my website that use tmadmin to collect Tuxedo metrics.

    1 Queue2 Queue

    Server Processes per Queue

    Number of Services

    Mean ICPanel Service Time

    Server Processes per Queue

    Number of Services

    Mean ICPanel Service Time
















































    The first thing to acknowledge is that this data is quite noisy because it comes from a real production system, and the effects we are looking for are quite small.

    I am satisfied that the domains with two PSAPPSRV queues generally perform better under high load, than those under 1.  Not only does the queue time increase on the single queue domain, the service time also increases.

    However, I cannot demonstrate that Tuxedo Load Balancing makes a significant difference in either direction.

    My results suggest that domains with multiple queues for requests handled by PSAPPSRV process perform slightly better without load balancing if there is no queue of requests, but perform slightly better if there is a queue of pending requests.  However, the difference is small.  It is not large enough to be statistically significant in my test data.


    If you have a busy system with lots of on-line users, and sufficient hardware to resource it, then you might reach a point when you need more than 10 PSAPPSRVs.  In which case, I recommend that you configure multiple Tuxedo queues.

    On the whole, I would recommend that Tuxedo Load Balancing should be configured.  I would not expect it to improve performance, but it will not degrade it either.

    Friday, June 11, 2010

    Life Cycle of a Process Request

    Oracle's Flashback Query facility lets you query a past version of a row by using the information in the undo segment.  The VERSIONS option lets you seen all the versions that are available. Thus, it is possible to write a simple query to retrieve the all values that changed on a process request record through its life cycle.

    The Oracle parameter undo_retention determines how long that data remains in the undo segment. In my example, it is set to 900 seconds, so I can only query versions in the last 15 minutes. If I attempt to go back further than this I will get an error.

    column prcsinstance heading 'P.I.' format 9999999
    column rownum heading '#' format 9 
    column versions_starttime format a22
    column versions_endtime format a22
    SELECT rownum, prcsinstance
    , begindttm, enddttm
    , runstatus, diststatus
    , versions_operation, versions_xid
    , versions_starttime, versions_endtime
    FROM sysadm.psprcsrqst
    VERSIONS BETWEEN timestamp
    systimestamp - INTERVAL '15' MINUTE AND
    WHERE prcsinstance = 2920185
    -- -------- ------------------- ------------------- -- -- - ---------------- --------------------- ----------------------
     1  2920185 10:51:13 11/06/2010 10:52:11 11/06/2010 9  5  U 000F00070017BD63 11-JUN-10 10.52.10 AM
     2  2920185 10:51:13 11/06/2010 10:52:11 11/06/2010 9  5  U 001A002C001CB1FF 11-JUN-10 10.52.10 AM 11-JUN-10 10.52.10 AM
     3  2920185 10:51:13 11/06/2010 10:52:11 11/06/2010 9  7  U 002C001F000F87C0 11-JUN-10 10.52.10 AM 11-JUN-10 10.52.10 AM
     4  2920185 10:51:13 11/06/2010 10:52:11 11/06/2010 9  1  U 000E000A001771CE 11-JUN-10 10.52.10 AM 11-JUN-10 10.52.10 AM
     5  2920185 10:51:13 11/06/2010 10:52:02 11/06/2010 9  1  U 002A000F00125D89 11-JUN-10 10.52.01 AM 11-JUN-10 10.52.10 AM
     6  2920185 10:51:13 11/06/2010                     7  1  U 0021000B00132582 11-JUN-10 10.51.10 AM 11-JUN-10 10.52.01 AM
     7  2920185                                         6  1  U 0004002000142955 11-JUN-10 10.51.10 AM 11-JUN-10 10.51.10 AM
     8  2920185                                         5  1  I 0022002E0013F260 11-JUN-10 10.51.04 AM 11-JUN-10 10.51.10 AM

    Now, I can see each of the committed versions of the record. Note that each version is the result of a different transaction ID.
    Reading up from the last and earliest row in the report, you can see the history of this process request record.

    • At line 8 it was inserted (the value of psuedocolumn VERSION_OPERATION is 'I') at RUNSTATUS 5 (queued) by the component the operator used to submit the record.
    • At line 7, RUNSTATUS was updated to status 6 (Initiated) by the process scheduler.
    • At line 6 the process begins and updates the BEGINDTTM with the current database time, and sets RUNSTATUS to 7 (processing).
    • At line 5 the process completes, updates ENDDTTM to the current database time, and sets RUNSTATUS to 9 (success).
    • At line 4 the ENDDTTM is updated again. This update is performed by the Distribution Server process in the Process Scheduler domain as report output is posted to the report repository.  Note that the value is 1 second later than the VERSIONS_ENDTIME, therefore this time stamp is based on the operating system time for the host running the process scheduler. This server's clock is slightly out of sync with that of the database server.
    • At lines 3 to 1 there are 3 further updates as the distribution status is updated twice more.

    For me, the most interesting point is that ENDDTTM is updated twice; first with the database time at which the process ended, and then again with the time at which any report output was successfully completed.

    I frequently want measure the performance of a processes. I often write script that calculate the duration of the process as being the difference between ENDDTTM and BEGINDTTM, but now it is clear that this includes the time taken to post the report and log files to the report repository.

    For Application Engine processes, you can still recover the time when the process ended. If batch timings are enabled and written to the database, the BEGINDTTM and ENDDTTM are logged in PS_BAT_TIMINGS_LOG.

    select * from ps_bat_timings_log where process_instance = 2920185
    ---------------- ------------ ------------------------------ ------------------------------
    ------------------- ------------------- ------------ ---------- ----------- -----------
             2920185 XXX_XXX_XXXX 52630500                       16023
    10:51:12 11/06/2010 10:52:02 11/06/2010        49850      35610       13730        1159

    You can see above that ENDDTTM is the time when the process ended.

    That gives me some opportunities. For Application Engine programs, I can measure the amount of time taken to posting report content, separately from the process execution time.  This query shows me that this particular process took 49 seconds, but the report output took a further 9 seconds to post.

    SELECT r.begindttm begindttm
    , NVL(l.enddttm, r.enddttm) enddttm
    , (NVL(l.enddttm, r.enddttm)-r.begindttm)*86400 exec_secs
    , r.enddttm posttime
    , (r.enddttm-NVL(l.enddttm, r.enddttm))*86400 post_secs
    FROM sysadm.psprcsrqst r
        LEFT OUTER JOIN sysadm.ps_bat_timings_log l
        ON l.process_instance = r.prcsinstance
    WHERE r.prcsinstance = 2920185
    BEGINDTTM           ENDDTTM              EXEC_SECS POSTTIME             POST_SECS
    ------------------- ------------------- ---------- ------------------- ----------
    10:51:13 11/06/2010 10:52:02 11/06/2010         49 10:52:11 11/06/2010          9  

    For more detail on the Flashback Query syntax see the Oracle SQL Reference.

    Wednesday, March 31, 2010

    Oracle Plan Stability (Stored Outlines) in PeopleSoft Global Payroll

    A longer version of this posting, with experimental results, is available on my website.

    In PeopleSoft for the Oracle DBA, I wrote a page (p. 291) explaining why stored outlines were not suitable for use in PeopleSoft.  Five years later, my view has not significantly changed.  Essentially, stored outlines work best with shared SQL, and there isn't much shared SQL in PeopleSoft, because a lot of it is dynamically generated.
    • Code generated by the component processor is dynamically generated.  At save time, only fields that have changed are updated.
    • PeopleCode can written in such a way that where clauses are dynamically assembled
    • nVision reports have variable numbers of criteria on literal tree node IDs in the queries.
    • By default in Application Engine, bind variables are converted to literals before the SQL is submitted to the database.  Even if this is overridden by enabling ReUseStatement in Application Engine or by using Cursor Sharing in the database, the code still wouldn’t be sharable.  Different instances of Application Engine executing the same program can use different instances on non-shared temporary records, so the tables in otherwise identical SQL statements are not the same.  You would get one version of the statement per temporary table instance.
    However, there are very limited exceptions to this rule (otherwise I wouldn't have a story to tell).  The SQL in COBOL and SQR programs are more likely to be shareable.   Although some programs are coded to generate SQL dynamically, bind variables are passed through to SQL statements, and they use regular tables for working storage and not PeopleSoft temporary records.
    A Global Payroll customer came to me with a problem where the payroll calculation (GPPDPRUN) would usually run well, but sometimes, the execution plan of a statement would change and the calculation would take additional several hours.  It is significant that the Global Payroll engine is written in COBOL.  My usual response to this sort of problem in Global Payroll is to add a hint to the stored statement.  Usually, I find that only a few statements that are affected.  However, after this happened a couple of times in production, it was clear that we couldn't continue to react to these problems. We needed to proactively stop this happening again.  This is exactly what stored outlines are designed to do.

    Using Stored Outlines in the PeopleSoft GP Engine

    Earlier I said that we could apply stored outlines to the Global Payroll calculation engine (GPPDPRUN) because it generally doesn’t use dynamic code with embedded literal values.  
    While outlines are being created, the following privilege needs to be granted.  It can be revoked later.


    We can create a trigger to collect the stored outlines for a payroll calculation, thus:
    • The trigger fires when a payroll calculation process starts or finishes. 
    • At the start a payroll process it starts collecting stored outlines in a category called the same as the process; GPPDPRUN.
    • When the process finishes, outline collection is disabled by setting it back to false.
    CREATE OR REPLACE TRIGGER sysadm.gfc_create_stored_outlines
    BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
    WHEN (new.prcsname = 'GPPDPRUN' AND (new.runstatus = 7 OR old.runstatus = 7))
     l_sql VARCHAR2(100);
      l_sql := 'ALTER SESSION SET create_stored_outlines = ';
      IF :new.runstatus = 7 THEN
        EXECUTE IMMEDIATE l_sql||:new.prcsname;
      ELSIF :old.runstatus = 7 THEN
      END IF;
    EXCEPTION WHEN OTHERS THEN NULL; --because I dont want to crash the process scheduler  

    The exact number of outlines that are collected during this process will vary depending upon configuration, and which employees are processed as different payroll rules are invoked.
    If no more outlines are to be collected the CREATE ANY OUTLINE privilege can be revoked.  This does not prevent the outlines from being used.
    Then, the category of outlines can be used in subsequent executions by replacing the trigger above with the one below, and the execution plans cannot change so long as the SQL doesn’t change.
    CREATE OR REPLACE TRIGGER sysadm.gfc_use_stored_outlines
    BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
    WHEN (new.prcsname = 'GPPDPRUN' AND (new.runstatus = 7 OR old.runstatus = 7))
     l_sql VARCHAR2(100);
      l_sql := 'ALTER SESSION SET use_stored_outlines = ';
      IF :new.runstatus = 7 THEN
    EXECUTE IMMEDIATE l_sql||:new.prcsname;
      ELSIF :old.runstatus = 7 THEN
      END IF;
    EXCEPTION WHEN OTHERS THEN NULL; --because I dont want to crash the process scheduler 

    After running an identify-and-calc and a cancel, we can see how many of the outlines are actually used.
    SELECT category, count(*) outlines
    , sum(decode(used,'USED',1,0)) used
    FROM user_outlines
    GROUP BY category
    ORDER BY 1

    I have a large number of unused outlines because of additional recursive SQL generated because OPTIMIZER_DYNAMIC_SAMPLING was set 4. This does not occur if this parameter is set to the default of 2.
    CATEGORY                         OUTLINES       USED
    ------------------------------ ---------- ----------
    GPPDPRUN                              572        281

    I can then remove the unused outlines.
    EXECUTE dbms_outln.drop_unused;

    Used flags on the outlines can be reset, so we later we can see the outlines being used again.
     FOR i IN (SELECT * FROM user_outlines WHERE category = 'GPPDPRUN') LOOP
     END LOOP;

    If I want to go back running without outlines, I just disable the trigger
    ALTER TRIGGER sysadm.stored_outlines DISABLE;

    To re-enable outlines, just re-enable the trigger.
    ALTER TRIGGER sysadm.stored_outlines ENABLE;


    Stored Outlines have very limited application in a PeopleSoft system.  However, they can easily be collected and used with the PeopleSoft Global Payroll engine.  It is just a matter of granting a privilege and using the database triggers on the process request table.
    Testing that they actually have the desired effect is quite difficult, because you are trying to prove a negative.  I don’t think it is adequate simply to say that the outline has been used.
    • First you would need an environment where payroll calculation performs well, where you could collect outlines.
    • Then you would need a payroll calculation that performs poorly because the execution plan for at least one SQL statement is different
      • Either, on a second environment with exactly the same code.
      • Or in the same environment on a different set of data.
    • Then, it would be possible to demonstrate that applying the outline causes the execution plan to revert and restores the performance.  This can be confirmed by comparison of the ASH data for the various scenarios.
    Even if you don’t want to use a stored outline immediately, it might be advantageous to collect them, and have them available when you do encounter a problem.

    Thursday, March 25, 2010

    Capturing DDL for Database Objects Not Managed by PeopleTools

    I have written before about the challenges of managing database objects and attributes of database objects that are not managed by PeopleTools Application Designer. I proposed a DDL trigger to prevent such objects being dropped or altered. However, sometimes it is necessary to temporarily disable this DDL trigger, such as during patch or upgrade release it is necessary to disable this trigger to apply the changes.

    Now, I have another DDL trigger and a packaged procedure that captures the DDL to recreate objects that are recursively dropped (such as DML triggers on tables). The DDL is stored in a database table. This trigger can remain permanently enabled, and the table it maintains can be used to see what objects are missing, as well as holding the SQL to rebuild them.

    An common example of where this is can be valuable is where a system uses database triggers to capture audit data.  This method is often preferred because it generally performs better than having the application server generate additional DML to the audit table, and also captures updates made in other processes.  PeopleSoft even deliver processes to generate the DML triggers that write to the audit tables.  However, if you alter the table in Application Designer, perhaps only because you are applying a PeopleSoft fix, and apply the changes to the database by recreating the table, then the trigger will be lost.  It is then up to the customer to make sure the audit trigger is replaced.  There is absolutely nothing to warn you that the trigger is lost, and the application will still function without the trigger, but your updates will not be audited.

    When a table is dropped, the trigger calls a procedure in the package that checks for:
    • indexes that are not managed by PeopleTools (such as function-based indexes),
    • triggers not managed by PeopleTools (other than the PSU triggers created for mobile agents),
    • materialised view logs.
    • If the table is partitioned or global temporary the DDL for the object being dropped is also captured.
    When an index is dropped the index check is performed. Similarly the DDL to rebuild partitioned indexes or indexes on Global Temporary tables is also captured.

    When an object for which the DDL has been captured is explicitly dropped, this is indicated on the table GFC_REL_OBJ by storing the time at which it was dropped. When it is recreated this time-stamp is cleared.  Thus it is possible to decide whether something was deliberately or accidentally dropped.

    Thursday, March 04, 2010

    Hinting Dynamically Generated SQL in Application Engine

    One of the clever things you can do in Application Engine is dynamically generate parts of a SQL statement.  However, this can produce challenges should decide that you also need to add an optimiser hint to a SQL statement.  In this post I want to demonstrate how you can also use the same techniques to dynamically generate valid Optimizer hints.

    The following statement was generated by a step in a delivered Application Engine that had not been previously been changed.  The PS_XXX tables are custom tables, but the references are in custom meta-data that is used to dynamically generate the SQL.  I have added the hints and comments. 

    The specific hints and comments are not important here, the point is how I managed to generate them.

    Note that:
    • I have added a comment that contains the name of the application engine step that generated this SQL.  This has can be added automatically with the aeid.sql script on my website.  It can be difficult to identify which step generated which static SQL statement.  It can be impossible to do that with dynamic SQL.  This identifying comment appears in the SQL.
    • I have added a QB_NAME hint to the sub-query, and then the hint can refer to that sub-query (see related blog entry Hinting Sub-Queries on Oracle).
    • Rather than specify the index name in the INDEX hint I have use the table_name(column_list) construction to specify an index on a named table that starts with the named columns.  This is good general practice, the hint remains valid in the case that an index name changes.  However, it is also useful here because there is no meta-data that I can use to construct the name of the index.
    UPDATE /*+INDEX(@SUB1 ln@SUB1 PS_XXX_SPLT_TA2(process_instance,resource_id))          INDEX(PS_XXX_SPLT_TA1 PS_XXX_SPLT_TA1(process_instance,iu_line_type))*/    /*ID-IU_PROCESSOR.P150.P150-010.S*/     PS_XXX_SPLT_TA1 SET    iu_line_type='U'WHERE iu_line_type='2' AND process_instance=12345678 AND setid_iu='XXXXX' AND ledger_group = 'ACTUALS' AND EXISTS (    SELECT /*+QB_NAME(SUB1)*/ 'X'     FROM PS_XXX_SPLT_TA2 ln    where ln.EMPLID2 = PS_XXX_SPLT_TA1.EMPLID2     and ln.SEQ_NUM = PS_XXX_SPLT_TA1.SEQ_NUM     and ln.BUSINESS_UNIT = PS_XXX_SPLT_TA1.BUSINESS_UNIT     and ln.RESOURCE_ID = PS_XXX_SPLT_TA1.RESOURCE_ID     AND ln.setid_iu = 'XXXXX'    AND ln.ledger_group = 'ACTUALS'     AND ln.process_instance = 12345678     GROUP BY ln.BUSINESS_UNIT_GL , ln.ledger, ln.OPERATING_UNIT, ln.BUSINESS_UNIT, ln.RESOURCE_ID, ln.EMPLID2, ln.SEQ_NUM     HAVING SUM(ln.RESOURCE_AMOUNT) <> 0)

    So how did I get those hints into the SQL?

    First of all remember that Application Engine doesn't know anything about SQL.  An Application Engine step is just a string of characters that will be submitted to the database.  Any PeopleCode macros are executed as the step is prepared and the resultant string is then set to the database.

    The name of the PeopleSoft record being updated is in a variable line_wrk2_rec.  I can introduce the contents of the variable line_wrk2_rec with the %BIND() macro.  Normally a string bind variable is delimited by single quotes because it is used in a function or predicate, but the quotes can be suppressed with the NOQUOTES option.  I can convert the PeopleSoft record name to the database table name with the %Table() PeopleCode macro. 

    So, in this example
    Code in Application Engine Step Expansion
    %BIND(line_wrk_rec) 'XXX_SPLT_TA1'
    %BIND(line_wrk_rec,NOQUOTES)) XXX_SPLT_TA1
    %Table(%BIND(line_wrk_rec,NOQUOTES)) PS_XXX_SPLT_TA1

    Note that delivered PeopleSoft SQL never specifies a row source alias on the table being updated because this would be invalid SQL on SQL Server.  Thus one SQL statement can be used on multiple platforms.  Although it is possible to have platform specific steps in Application Engine, PeopleSoft development avoid this wherever possible because it increases their development overheads.  So the row source alias is the table name. 

    I have used the expression %Table(%BIND(line_wrk_rec,NOQUOTES)) twice; once for the table alias and then again when I specify the index.  I want to force the use of an index on PS_XXX_SPLT_TA1 that leads on columns PROCESS_INSTANCE and RESOURCE_ID.  This is the code in the Application Engine step that generates the SQL statement above.

    UPDATE /*+INDEX(@SUB1 ln@SUB1 %Table(%BIND(line_wrk2_rec,NOQUOTES))(process_instance,resource_id))        INDEX(%Table(%BIND(line_wrk_rec,NOQUOTES)) %Table(%BIND(line_wrk_rec,NOQUOTES))(process_instance,iu_line_type))*/    /*ID-IU_PROCESSOR.P150.P150-010.S*/%Table(%BIND(line_wrk_rec,NOQUOTES))  SET iu_line_type='U'  WHERE iu_line_type='2'    AND process_instance=%BIND(process_instance)    AND setid_iu=%BIND(iu_proc_002_aet.setid_ledger)    AND ledger_group = %BIND(iu_proc_002_aet.ledger_group)    AND EXISTS (     SELECT /*+QB_NAME(SUB1)*/ 'X'       FROM %Table(%BIND(line_wrk2_rec,NOQUOTES)) ln         %BIND(iu_where_aet.iu_where_sql,NOQUOTES)%BIND(iu_group_by_aet.iu_group_by_sql,NOQUOTES)         %BIND(iu_proc_002_aet.where_bu,NOQUOTES)        AND ln.setid_iu = %BIND(iu_proc_002_aet.setid_ledger)        AND ln.ledger_group = %BIND(iu_proc_002_aet.ledger_group)        AND ln.process_instance = %ProcessInstance      GROUP BY ln.%BIND(iu_sys_tran_aet.fieldname_bu_gl,NOQUOTES), ln.ledger%BIND(iu_proc_001_aet.iu_group_by_sql,NOQUOTES)      HAVING SUM(ln.%BIND(iu_sys_tran_aet.fieldname_base_amt,NOQUOTES)) <> 0)

    There is a problem here (at least there is in theory). If this statement executes for a different table the index hint will instruct the SQL to look for an index on that different table on the same columns.  The predicates in the where clauses are also derived from dynamic code.  If I was being absolutely rigorous, I would have added some procedural code in preceding steps to build this part of the hints dynamically too, however, I am also trying to keep the customisation to a minimum in an area of code that is otherwise vanilla.

    I have accepted a compromise.  If the step executes for a different table, the hint will probably be invalid because there is probably no such index on these columns on that table.  There is a risk that such an index does exist and so the hint could be valid but totally inappropriate to the situation because the predicates are totally different.  The result could be very poor performance.  However, in this case, in practice, this risk is zero, and the hint results in appropriate behaviour in all scenarios.  Although this is something that you need to consider on a case by case basis

    Wednesday, January 20, 2010

    Performance Metrics and XML Reporting in PeopleSoft

    I am working with a PeopleSoft system that makes extensive use of XML Publisher reporting.  Generally these reports are based on SQL queries that are defined in the PeopleSoft PS/Query utility. 

    These queries are run by an Application Engine program PSXPQRYRPT that runs the SQL query and produces the report in a single process.  This is a generic application engine program that runs any XML report.  Line 45 (in PeopleTools 8.49) of step MAIN.ExecRPT executes the report with the ProcessReport PeopleCode command.

    &oRptDefn.ProcessReport(&TemplateId, &Languaged, &AsOfDate, &oRptDefn.GetOutDestFormatString(&OutDestFormat));

    Batch Timings
    Analysis of the Application Engine batch timings indicate that nearly all the time in this Application Engine program is spent in PeopleCode, and that this not SQL execution time.  This is misleading.  The ProcessReport command is PeopleCode, but behind the scenes it also issues the SQL in the report data source.  Not all the time is SQL, but the Application Engine Batch Timings does not count any of this as SQL because it is not in the PeopleCode SQL Class. 

    Let’s look at an example Batch Timings report (I have edited it down, removing zero and insignificant timings).

    PeopleSoft Application Engine Timings
                                     (All timings in seconds)
                                   C o m p i l e    E x e c u t e    F e t c h        Total           
    SQL Statement                  Count   Time     Count   Time     Count   Time     Time    
    ------------------------------ ------- -------- ------- -------- ------- -------- --------
    Record.SelectByKey PSDBFIELD                          8      0.1       8      0.0      0.1
    SELECT PSPRCSRQST                                     2      0.1       2      0.0      0.1
    SELECT PSXPTMPLDEFN                                   3      0.1       3      0.0      0.1
    AE Program: PSXPQRYRPT
    MAIN.ExecRpt.H                       1      0.0       1      0.0       1      0.0      0.0
                                   Call    Non-SQL  SQL      Total   
    PeopleCode                     Count   Time     Time     Time    
    ------------------------------ ------- -------- -------- --------
    AE Program: PSXPQRYRPT
    MAIN.ExecRpt                         1    643.2      0.3    643.5
                                           -------- -------- --------
                                              643.2      0.3    643.5
                                                            E x e c u t e   
    PEOPLECODE Builtin/Method                               Count   Time    
    ------------------------------------------------------- ------- --------
    Boolean(Type 5) BuiltIns                                     90      0.1
    DateTime(Type 11) BuiltIns                                    1      0.1
    SQL(Type 524290) Methods                                     19      0.1
    SQL(Type 524290) BuiltIns                                     9      0.1
    Record(Type 524291) Methods                                1104      0.1
    Session(Type 524303) Methods                                207    633.2
    JavaObject(Type 524315) BuiltIns                              6      2.2
    PostReport(Type 524324) Methods                               2      0.7
    Total run time                :      644.0
    Total time in application SQL :        0.3   Percent time in application SQL :        0.0%
    Total time in PeopleCode      :      643.2   Percent time in PeopleCode      :       99.9%
    • The total execution time of is 644 seconds.
    • 643.2s are reported as being in PeopleCode, and as coming from the MAIN.ExecRpt step.
    • Only 0.3s of that is SQL time, and that comes from PeopleCode functions in the SQL Class  (CreateRecord , CreateSQL etc.)
    • The ProcessReport method does not appear in the PeopleCode SQL analysis section of the report, because it is not classed as SQL.
    Identifying the Report ID and Report Source
    So, the next question is how can I find out which report this process is running.  Different instances of this report may be running different queries.

    You can get the list files generated by a process from the Content Management PeopleTools table PS_CDM_FILE_LIST.  Application Engine processes usually produce various log files (with extensions .aet, .trc and .log), the name of the other file is the same as the name of the report followed by an extension that will vary depending on format.

    You can look at the report definition on-line under Reporting Tools -> XML Publisher -> Report Definition, and that will show you the Data Source ID

    The Data Source is defined in a component accessible from the previous entry in the same menu.

    Analysing Execution Times
    I have written this query to aggregate execution time for PSXPQRYRPT by output file name. 
    • PSPRCSRQST is the process request table, from which I get the run time of the process.
    • PS_CDM_FILE_LIST lists the files generated by the process which would be posted to the Report Repository. If I exclude the usual trace files, I am left with the report ID.
    • PSXPRPTDEFN is the report definition record, from which I can get the data source ID
    • PSXPDATSRC specifies the data source.  A data source type (DS_TYPE) of QRY indicates a PeopleSoft PS/Query.  If it is a private query, the OPRID will have a value.
    column report_defn_id heading 'Report ID'      format a12 
    column ds_type        heading 'Type'           format a4 
    column ds_id          heading 'Data Source ID' format a30 
    column oprid          heading 'Owner'          format a10 
    column processes      heading 'Prcs'           format 990 
    column reprots        heading 'Reps'           format 9,990 
    column secs           heading 'Total|Seconds'  format 999,990 
    column median         heading 'Median|Seconds' format 999,990 
    column variance       heading 'Variance'       format 9990.0
    SELECT d.report_defn_id, d.ds_type, d.ds_id, d.oprid 
    ,      SUM(y.processes) processes
    ,      SUM(y.reports) reports
    ,      SUM(y.secs) secs
    ,      median(y.secs) median
    ,      variance(y.secs) variance
    FROM (
     SELECT x.prcsinstance
     ,      x.filename
     ,      COUNT(DISTINCT x.prcsinstance) processes
     ,      COUNT(*) reports
     ,      SUM(x.secs) secs
     FROM   (
      SELECT r.prcsinstance
      ,      f.filename
      ,      86400*(r.enddttm-r.begindttm)*ratio_to_report(1) over (partition by r.prcsinstance) secs
      FROM   sysadm.psprcsrqst r
      ,      sysadm.ps_cdm_file_list f
      WHERE  r.prcsname = 'PSXPQRYRPT'
      AND    r.prcsinstance = f.prcsinstance
      AND    NOT f.cdm_file_type IN('AET','TRC','LOG')
      AND    r.begindttm >= TRUNC(SYSDATE)
      ) x
     GROUP BY x.prcsinstance, x.filename
     ) y
    ,      sysadm.psxprptdefn d
    WHERE  d.report_defn_id = SUBSTR(y.filename,1,instr(y.filename,'.')-1)
    GROUP BY d.report_defn_id, d.ds_type, d.ds_id, d.oprid 
    ORDER BY secs DESC

    I can now see which report process has been executed run how many times, how many copies of the report have been produced, and where the processing time is being spent, and so which one I should look at first.

    Total   Median
    Report ID    Type Data Source ID                 Owner      Prcs   Reps  Seconds  Seconds Variance
    ------------ ---- ------------------------------ ---------- ---- ------ -------- -------- --------
    XXX_WK_LATE  QRY  XXX_WKLY_LATENESS_RPT                       20     20    2,973      148   3702.9
    XXX_HRAM_CON QRY  XXX_HRPD_CNT_AMD_RPT_QRY                    92     92    2,677       27    108.4
    XXX_CKOUT_RP QRY  XXX_CHECKOUT_REPORT                         47     47    2,043       41    347.7
    XXX_BNK_RPT  QRY  XXX_BNK_DTLS_SCH_QRY                         1     75      141      141      0.0


    If you have a SQL performance problem with an XML report because the source PS/Query performs poorly, the batch timings will lead you to believe that you have a PeopleCode problem and not a SQL problem. 

    Beware, this may not be the case.

    Most of the execution time for XML Reports sourced from PS/Queries is almost certain to be SQL execution time.  The above query will tell you which queries are consuming how much time, and so inform your tuning effort.

    Sunday, January 10, 2010

    Automatically Identifying Stored Statements & Using DBMS_STATS in PeopleSoft Cobol

    It is not possible to reference a long column in a Oracle database trigger, but it is possible to reference LOBs. From PeopleSoft v9 Applications, the long columns have become CLOBs. Hence, it is now possible to have a database trigger fire on insert into the Stored Statement table and so automatically make certain changes to Stored Statements as they are loaded by Data Mover. Previously, these changes could be made by a PL/SQL script, but you had to remember to run in after any Stored Statements were reloaded. I have published a new script (gfc_stmtid_trigger.sql) on my website that creates two such database triggers.

    Trigger GFC_STMTID

    In PeopleSoft for the Oracle DBA (listing 11-23, page 280), I proposed a PL/SQL procedure to add identifying comments to stored statements (see blog entry: Identifying Application Engine Source Code), so that the statements can be identified in traces or Oracle Enterprise Manager.


    PeopleSoft Cobol programs do not use the PeopleSoft DDL models that Application Engine users when it processes the %UPDATESTATS macro. In another blog posting, I showed how to change the stored statements to call the wrapper package, so that the program uses the Oracle supplied dbms_stats procedure instead of the depreciated ANALYZE command. This trigger automatically replaces the %UPDATESTATS macro in the stored statement with a call to the wrapper package.