Friday, November 23, 2012

PeopleTools 8.52 Application Engine sets MODULE and ACTION

Update 4.3.2015: See also Undocumented Application Engine Parameter: EnableAEMonitoring

I have written and spoken often about the huge importance of instrumentation in an application. PeopleSoft introduced internal instrumentation to PeopleTools 8.44 for its own Performance Monitor that works on any platform. Since PeopleTools 7.53, when running on an Oracle database, PeopleTools also calls the Oracle supplied package dbms_application_info package to set the module and action for the session. The values set were not particularly useful, so I wrote a PL/SQL package (psftapi) and trigger to set the module and action attributes when a process started.

In PeopleTools 8.50 this instrumentation was enhanced to set module and action to the component and page name in the PIA. These values appear in Oracle Enterprise Manager and Active Session History (ASH). They can be used to profile components.

Now, in PeopleTools 8.52 Application Engine has been enhanced to set the action to the full Application Engine step name.

To illustrate what happens, I have produced an Oracle Extended SQL Trace for the process, but I have shown only the lines in a SQL Trace file with 3 asterisks. So you can see MODULE and ACTION being set (and some other timestamp information). The lines in italic were emitted when module and action were set by my psftapi package, in which I simply set the MODULE to the Application Engine program name.
*** 2012-11-22 21:47:38.282
*** SESSION ID:(7.2237) 2012-11-22 21:47:38.282
*** CLIENT ID:(PS) 2012-11-22 21:47:38.282
*** SERVICE NAME:(HCM91) 2012-11-22 21:47:38.282
*** MODULE NAME:(PSPMCSOSUM) 2012-11-22 21:47:38.282
*** ACTION NAME:(PI=867:Processing) 2012-11-22 21:47:38.282
*** MODULE NAME:(PSAE.PSPMCSOSUM.1448) 2012-11-22 21:47:38.419
*** ACTION NAME:(PSPMCSOSUM) 2012-11-22 21:47:38.419
*** ACTION NAME:(PSPMCSOSUM.MAIN.STATS.S) 2012-11-22 21:47:38.420
*** 2012-11-22 21:47:38.588
*** ACTION NAME:(PSPMCSOSUM.MAIN.CTL.P) 2012-11-22 21:47:38.919
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:47:38.937
*** ACTION NAME:(PSPMCSOSUM.GETCNT.CNT.P) 2012-11-22 21:47:38.963
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:48:15.168
*** ACTION NAME:(PSPMCSOSUM.GETCNT.CNT.P) 2012-11-22 21:48:15.168
*** ACTION NAME:(PSPMCSOSUM.MAIN.Close.P) 2012-11-22 21:48:15.197
*** ACTION NAME:(PI=867:Success) 2012-11-22 21:48:15.201

Application Engine also sets the MODULE attribute.  In the above example it was set to PSAE.PSPMCSOSUM.1448.  The string is composed of three parts.
  • PSAE indicates that it was running Application Engine. It sets the same value for both stand-alone and Tuxedo server Application Engine processes.
  • PSPMCSOSUM was the name of the Application Engine Program
  • 1448 was the operating system process ID of the Application Engine process.  This value is recorded in the column SESSIONIDNUM on the table PSPRCSQUE.
The inclusion of the SESSIONIDNUM is useful because it is possible to determine the process instance.  If you have multiple instances of the same Application Engine program running concurrently, you can work out which ASH samples correspond to which process instance by matching the session ID.

SELECT  r.prcsinstance, h.module, h.action, sum(1) ash_secs
FROM v$active_Session_history h
, psprcsque q
, psprcsrqst r
WHERE r.prcsinstance = q.prcsinstance
AND h.module = 'PSAE.'||q.prcsname||'.'||q.sessionidnum
AND h.sample_time BETWEEN r.begindttm AND NVL(r.enddttm,SYSDATE)
and r.prcsinstance = 867
GROUP BY r.prcsinstance, h.module, h.action
ORDER BY ash_secs DESC

PRCSINSTANCE MODULE               ACTION                    ASH_SECS
------------ -------------------- ------------------------- --------
         867 PSAE.PSPMCSOSUM.1448 PSPMCSOSUM.GETCNT.CNT.P          7
         867 PSAE.PSPMCSOSUM.1448 PSPMCSOSUM.MAIN.STATS.S          1

Unfortunately, this change to MODULE also means that it is no longer possible to enable trace by setting a watchpoint (as I blogged previously) now does not work with Application Engine because it is not possible to predict the value of the SESSIONIDNUM in MODULE!

I am testing with 8.52.12 and have found a significant problem. Application Engine doesn't set the action on DO SELECT statements. The first statement in the trace extract below does come from PSPMCSOSUM.MAIN.CTL PeopleCode, however, the second SQL statement actually comes from PSPMCSOSUM.MAIN.SELECT.D, but the absence of another action line would lead you to believe the statement came from the preceding PeopleCode step.
*** ACTION NAME:(PSPMCSOSUM.MAIN.CTL.P) 2012-11-22 21:47:38.919
WAIT #0: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220814597
WAIT #0: nam='SQL*Net message from client' ela= 740 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220815371
CLOSE #348647416:c=0,e=33,dep=0,type=1,tim=257220815447
PARSING IN CURSOR #348528704 len=98 dep=0 uid=45 oct=3 lid=45 tim=257220815496 hv=1424819941 ad='7ff2559e550' sqlid='98af7ppafu1r5'
PARSE #348528704:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1148792852,tim=257220815496
EXEC #348528704:c=0,e=30,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1148792852,tim=257220815572
WAIT #348528704: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220815612
FETCH #348528704:c=0,e=2824,p=0,cr=16,cu=0,mis=0,r=1,dep=0,og=1,plh=1148792852,tim=257220818462
STAT #348528704 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=16 pr=0 pw=0 time=2828 us)'
STAT #348528704 id=2 cnt=3075 pid=1 pos=1 obj=228740 op='INDEX FULL SCAN PS_PSPMCSO_CHART (cr=16 pr=0 pw=0 time=333 us cost=0 size=13 card=1)'
WAIT #348528704: nam='SQL*Net message from client' ela= 123 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220818671
FETCH #348528704:c=0,e=1,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=1148792852,tim=257220818696
WAIT #348528704: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220818714
WAIT #348528704: nam='SQL*Net message from client' ela= 702 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220819434
CLOSE #349389936:c=0,e=16,dep=0,type=1,tim=257220819496
PARSING IN CURSOR #349702624 len=555 dep=0 uid=45 oct=3 lid=45 tim=257220819548 hv=1888777338 ad='7ff2559ce90' sqlid='5phgqq9s98x3u'
PARSE #349702624:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4226533831,tim=257220819547
EXEC #349702624:c=0,e=47,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4226533831,tim=257220819669
WAIT #349702624: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220819723
WAIT #349702624: nam='SQL*Net more data to client' ela= 76 driver id=1413697536 #bytes=8145 p3=0 obj#=-40016373 tim=257220831271
FETCH #349702624:c=15600,e=11621,p=0,cr=214,cu=0,mis=0,r=201,dep=0,og=1,plh=4226533831,tim=257220831375
WAIT #349702624: nam='SQL*Net message from client' ela= 988 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832412
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832481
WAIT #0: nam='SQL*Net message from client' ela= 40 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832541
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:47:38.937

However, MAIN.SELECT.D duly appears in the AE batch timings report.

                               C o m p i l e    E x e c u t e    F e t c h        Total           
SQL Statement                  Count   Time     Count   Time     Count   Time     Time    
------------------------------ ------- -------- ------- -------- ------- -------- --------

MAIN.CTL.D                           1      0.0       1      0.0       2      0.0      0.0
MAIN.SELECT.D                        1      0.0       1      0.0     556      0.0      0.0
MAIN.STATS.S                         1      0.0       1      0.5       0      0.0      0.5

This is a serious problem  If you profile the top SQL statements in an Application Engine (using either ASH as I have done above, or by profiling a trace file and looking for the top SQL statement by searching through the raw trace file) it will lead you to the wrong conclusion!  Time spent in PSPMCSOSUM.MAIN.SELECT.D will be accounted as having been spent in PSPMCSOSUM.GETCNT.CNT.P.  It is not until you look for the specific SQL statement in Application Designer that you realise that the source code doesn't match the instrumentation.
Nevertheless, this instrumentation is a very welcome and significant improvement for Application Engine. It will help to localise performance problems. However, until the omissions are fixed, you will need to be aware of their implications and avoid falling into the traps.

My thanks to Wolfgang Breitling for telling me about the feature.

Friday, November 16, 2012

Using Two Temporary Tablespaces in PeopleSoft

A longer version of this posting, with all necessary code and a demonstration test, is available as a technical note on the Go-Faster website.  I am working with two different PeopleSoft customers who have had challenges with the size of the temporary tablespaces.  Critical batch processes have failed because they have exhausted space in the temporary tablespace.
ORA-01652 unable to extend temporary segment...
  • In one case, the one and only temporary tablespace in a Payroll system has over time been extended to in excess of 360 GB.  This has happen in response to PeopleSoft processes that failed because they cannot allocate temporary tablespace because somebody else has consumed it.  This treated the symptom rather than the cause.  This system has a number of other Oracle database users who are have read-only access to the PeopleSoft data to perform adhoc queries.  These users all share the one temporary tablespace.  Occasionally, a query will be submitted that runs for many hours, writing many gigabytes of data to the temporary tablespace, when it would have been better to terminate the process
  • Another system has 64 GB in the PSTEMP temporary tablespace used by SYSADM.  All other users already use another temporary tablespace, but PeopleSoft processes sometimes still fail because most of the temporary tablespace has been consumed by an adhoc PS/Query process, and there is nothing left for other processes. This system also has other Oracle database users with read-only access, but here they use the default TEMP temporary tablespace.
If a PeopleSoft system has database users executing adhoc queries, then allocating those users to separate temporary tablespace is a sensible first step.

However, in this document I suggest going further.  I propose switching some PeopleSoft processes to use a different temporary tablespace.  Regular PeopleSoft processing will continue to use the first temporary tablespace, but adhoc queries will use the second temporary tablespace.  Thus, the first temporary tablespace can be sized to cater for normal processes safe in the knowledge that it won’t be consumed by adhoc queries and you won’t get failures due to space errors.  Meanwhile, the second tablespace can be limited to a reasonable size  and queries that make unreasonable demands on the temporary tablespace will error.

I am going to exploit two features:
  • When a session sets CURRENT_SCHEMA to another schema it uses the temporary tablespace assigned to that schema rather than its own.
  • From PeopleTools 8.51 you can direct read-only activity via a second connection to a standby database. If you are using Oracle Active Data Guard you must configure something that looks like a second PeopleSoft database in another schema, but which actually is composed of database synonyms that point to the first schema.
To make PeopleSoft process use an alternative temporary tablespace:
  • Create a second schema (I'll call it SYSADMRP) in the same database, and specify a different temporary tablespace for the user.
  • Create synonyms in the second schema for every table and view in the PeopleSoft (SYSADM) schema.
If you are not using Active DataGuard or are below PeopleTools 8.51 then you can set CURRENT_SCHEMA with two triggers.
  • Create an AFTER LOGON trigger that will set CURRENT_SCHEMA to SYSADM for PSQRYSRV query server processes only.

ON sysadm.schema
  l_module VARCHAR2(64);
  SELECT sys_context('USERENV', 'MODULE') 
  INTO   l_module
  FROM   dual;

  IF UPPER(l_module) LIKE 'PSQRYSRV%' THEN --then this is a PSQRYSRV session
  • Create another trigger to set current schema for the application engine process that runs scheduled queries (you might want to add others to this list).

CREATE OR REPLACE TRIGGER sysadm.set_current_schema
BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
WHEN (  (new.runstatus = '7' OR old.runstatus != '7') --if starting or terminating
     AND new.prcsname IN('PSQUERY')) --restrict to certain programs
  IF :new.runstatus = '7' THEN --if starting set alternative schema
  ELSE --when process terminates reset to standard schema in case this is a PSAESRV process

If you are using at least PeopleTools 8.51, you don't need these triggers.  Instead you can configure the standby connection but to the same database
  • PS/Queries will always use the standby connection and hence the alternative tablespace.  
  • Batch processes can be marked as read only to make them connect to the standby connection. Remember that you must use PSAESRV processes.
  • If you want to make specific PeopleSoft components use the alternative temporary tablespace then you can only do this by perverting the PeopleTools Active Data Guard support and marking components as read only.


Now, you can choose an appropriate size for the one temporary tablespaces that will be sufficient for the regular operation of the application.  Adhoc queries will use alternative temporary tablespace.  You might choose to set a temporary tablespace size that may still cause queries to fail with a temproray tablespace error, but at least that won’t cause business-as-usual processes to crash.

You could even choose to run with three temporary tablespaces, one for PeopleSoft processes, one for PeopleSoft queries, and one for adhoc users accessing the database directly.

Sunday, September 30, 2012

Maintaining Optimizer Statistics on PeopleSoft on Oracle 11g

I have been considering how to collect optimizer statistics for a PeopleSoft system running on an Oracle 11g database.  Despite 11g being several years old, most of my current customers are still using 10g, though some are looking at the upgrade to 11g.  I believe a slightly different approach is required.

In 2009, I wrote a series of blog postings on the subject of collecting statistics.  However these were all based on Oracle 10g.  I proposed a PL/SQL package that would use meta-data in a database table to determine how to collect statistics on a table, or deliberate supress collection of statistics.
I also recommended that statistics on tables created for use as temporary records in Application Engine programs should have their statistics deleted and locked  to prevent system-wide jobs refreshing their statistics. I proposed a package that collected statistics according to meta-data defined in a table.

IN 2011, Oracle published document 1322888.1 “pscbo_stats - Improving Statistics in Oracle RDBMS for PeopleSoft Enterprise”.  It takes a similar approach to my 10g package.  A PL/SQL package is used to collect statistics.  A number of tables control whether, when and how statistics are collected on each record.  Oracle's package is also intended to be used to collect schema-wide statistics.

The psbo_stats package is a attempt to solve a genuine problem, and it has continued to evolve since its initial release.  However, I have a number of objections to it.
  • You are expected to replace the default automatic jobs that collect statistics with pscbo_stats, so it is a move away from standard default maintenance procedures.
  • pscbo_stats is still fundamentally a 10g solution. It does not use 11g table preferences.
  • It does use the Oracle automatic sample size in 11g if histograms are not to be collected.  Otherwise, it defaults to the previous behaviour of either using 100% sample size for when called by %UpdateStats with the ‘high’ sample size, or a variety of fixed sample sizes based on internal rules and the number of rows in the table.
  • The package contains a procedure that collects statistics on all objects in the schema that also refreshes statistics that are not stale but which have not been refreshed for a period of time determined by the size of the table. 
  • When it collects histograms it always sets the maximum bucket size of 254. This may not always be desirable for height balanced histograms.
  • There is no support for collecting aggregated or incremental statistics on partitioned objects.
I think that the 11g table preferences offer better control over collection of statistics.

Oracle considerably enhanced the delivered DBMS_STATS package in 11g.  It became possible to specify default values for parameters in the dbms_stats.gather_table_stats program for for each table.

The guiding principle in 11g, and one that is not specific to PeopleSoft, is that instead of calling dbms_stats.gather_table_stats with the desired parameters, we should set table preferences with the desired parameters and then just call dbms_stats without table specific parameters. We can then just leave the default database and schema-wide procedures get on with the job of collecting statistics.    

A document that describes the updated approach and the various scripts are available on the Go-Faster website at

CAVEAT: This document represents some experimental work that is in progress.  It has not been tested against a customer system, let alone been used in production. I would welcome any feedback, and the opportunity to work with someone on a PeopleSoft system on Oracle 11g.

Update 4.2.2015: I have put this solution into PeopleSoft systems at two customer systems last year, so it is no longer just theory.

Thursday, September 20, 2012

Enabling Oracle Extended SQL Trace by Module and Action

I have written previously about the value of assigning meaningful values to the module and action attributes on a database session (see Using Oracle Enterprise Manager (Grid Control) with PeopleSoft). Oracle added instrumentation to PeopleTools 8.50 (see PeopleTools 8.50 uses DBMS_APPLICATION_INFO to Identify Database Sessions) that sets module and action for on-line and batch sessions.  However, I still use my own trigger to set these attributes for processes initiated by the Process Scheduler.

I originally became interested in module and action because it made it possible to analyse performance problems in specific processes with Active Session History (see Practical Use of ASH). However, since Oracle 10g there is also a very easy way to enable Oracle's extended session trace in the sessions relating to  specific processes.

In the Oracle supplied PL/SQL package DBMS_MONITOR, there are two programs to specify combinations of module and action for which SQL trace will be enabled.  It is rather like setting a watchpoint in a debugger. SERV_MOD_ACT_TRACE_ENABLE is used to create a watchpoint, and SERV_MOD_ACT_TRACE_DISABLE removes it. Trace is then enabled in a session when the module and action is set to a value that matches a watchpoint, and disabled when the module and/or action is changed to a value for which there is no watchpoint.

This screenshot from OEM shows that a query is being run within a PeopleSoft component RECV_PO, page PO_PICK_ORDERS.

It would be a simple matter to have Oracle trace the session for that component by setting a watchpoint.

 ,module_name =>'RECV_PO'

Note that the service name must be specified as part of the watchpoint.  This will usually be the same as the PeopleSoft database name.  Thus, watchpoints copied to another database by cloning won't be effective because the service name will be different.

I could also have specified an action in the example above, but I want to trace all pages in the component and the search dialogue.

As with the other programs in dbms_monitor, information on wait events are included in traces by default.  In the above example, I have also requested information on bind variables to be included in the trace.

When setting a SQL trace for an on-line component it is possible that many different users could trigger tracing.  Trace will be enabled and disabled for each user. The result is that you will get a trace file for each application server process, and each trace file might contain the activity for more than one user if the application server process handled more than one service request.

Watchpoints can be removed in a similar way.
 ,module_name =>'RECV_PO'
 ,action_name =>'PO_PICK_ORDERS');

Watchpoints are held the table SYS.WRI$_TRACING_ENABLED.

SELECT * FROM sys.wri$_tracing_enabled

---------- ---------- ------------- --------------- ------------- -----
         5 P1PPP      RECV_PO       PO_PICK_ORDERS                   12

This technique is also effective for processes initiated by the Process Scheduler. The default behaviour from PeopleTools 8.50 is to set Action to the process name, my trigger sets Module to the process name.
If I specify only the Module in SERV_MOD_ACT_TRACE_ENABLE, then it trace will be enabled for that module and for any action.
 ,module_name =>'GPPDPRUN');
I choose to set the action to the Process Instance number because I find it useful to relate ASH data to a specific batch process. The above screenshot from OEM shows a streamed Global Payroll calculation running with many concurrent GPPDPRUN processes. The watchpoint would cause each and every processes to trace.

However, it has been suggested that if the Action was set to the process run control ID then the watchpoint could be set to the specific run control value used by a user or scheduled job. It would be a simple matter of changing the trigger (I'll leave that as an excercise).  Then, a user could be told to use a specific value for run control ID that would then invoke trace.
 ,action_name =>'TRACEME');

Update 24.11.2012 This technique will not work for Application Engine from PeopleTools 8.52 because it now sets MODULE to a value that includes a session number and so cannot be predicted.

Thursday, July 19, 2012

Gathering Aggregated Cost-Based Optimiser Statistics on Partitioned Objects

Recently, I have been looking into how to gather cost-based optimizer statistics on composite partitioned objects. 

Database partitioning is not used by default in PeopleSoft because it is database specific.  Not all databases support partitioning, and where they do it is done in a platform specific manner.  Application Designer has no capability to create partitioned objects (although from PeopleTools 8.51 it will preserve partitioning in tables and index that are already partitioned in an Oracle database). It is therefore no surprise that it is not used widely by PeopleSoft customers.  However, partitioning is essential in Global Payroll systems that use 'streamed'  processing (many concurrent payroll processes).  By definition, these are the payroll systems that have large volumes of data where collecting statistics on the results tables can quickly become a challenge.  Partitioning can also have application in General Ledger reporting in Financials, particularly where nVision is in use.

It is important that statistics on partitioned tables are both up to date and accurate.  However, collecting global statistics on a large partitioned object can be a time-consuming and resource intensive business as Oracle samples all the physical partitions or sub-partitions. Briefly, if you do not collect global statistics on a partitioned table, Oracle will aggegrate the statistics on the physical partitons or sub-partitions to calculate statistics on the logical table and partition segments.

Oracle 10g makes a number of mistakes in its calculation of these aggregated statistics.  In particular the number of distinct values on columns by which the table is partitioned have impossibly low values.  This is can affect cardinality calculations and so lead the optimizer to choose the wrong execution plan.

I have now published the second of two documents on my website that examine aspects of statistics on partitioned, and in particular composite partitioned tables. The first document examines the problems in 10g, and proposes a procedure to 'correct' the aggregated statistics to at least minimum possible values.
The second document looks at the same issue in 11g and shows that while most of the issues are fixed, one problem remains.