Friday, November 23, 2012

PeopleTools 8.52 Application Engine sets MODULE and ACTION

Update 4.3.2015: See also Undocumented Application Engine Parameter: EnableAEMonitoring

I have written and spoken often about the huge importance of instrumentation in an application. PeopleSoft introduced internal instrumentation to PeopleTools 8.44 for its own Performance Monitor that works on any platform. Since PeopleTools 7.53, when running on an Oracle database, PeopleTools also calls the Oracle supplied package dbms_application_info package to set the module and action for the session. The values set were not particularly useful, so I wrote a PL/SQL package (psftapi) and trigger to set the module and action attributes when a process started.

In PeopleTools 8.50 this instrumentation was enhanced to set module and action to the component and page name in the PIA. These values appear in Oracle Enterprise Manager and Active Session History (ASH). They can be used to profile components.

Now, in PeopleTools 8.52 Application Engine has been enhanced to set the action to the full Application Engine step name.

To illustrate what happens, I have produced an Oracle Extended SQL Trace for the process, but I have shown only the lines in a SQL Trace file with 3 asterisks. So you can see MODULE and ACTION being set (and some other timestamp information). The lines in italic were emitted when module and action were set by my psftapi package, in which I simply set the MODULE to the Application Engine program name.
*** 2012-11-22 21:47:38.282
*** SESSION ID:(7.2237) 2012-11-22 21:47:38.282
*** CLIENT ID:(PS) 2012-11-22 21:47:38.282
*** SERVICE NAME:(HCM91) 2012-11-22 21:47:38.282
*** MODULE NAME:(PSPMCSOSUM) 2012-11-22 21:47:38.282
*** ACTION NAME:(PI=867:Processing) 2012-11-22 21:47:38.282
*** MODULE NAME:(PSAE.PSPMCSOSUM.1448) 2012-11-22 21:47:38.419
*** ACTION NAME:(PSPMCSOSUM) 2012-11-22 21:47:38.419
*** ACTION NAME:(PSPMCSOSUM.MAIN.STATS.S) 2012-11-22 21:47:38.420
*** 2012-11-22 21:47:38.588
*** ACTION NAME:(PSPMCSOSUM.MAIN.CTL.P) 2012-11-22 21:47:38.919
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:47:38.937
*** ACTION NAME:(PSPMCSOSUM.GETCNT.CNT.P) 2012-11-22 21:47:38.963
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:48:15.168
*** ACTION NAME:(PSPMCSOSUM.GETCNT.CNT.P) 2012-11-22 21:48:15.168
*** ACTION NAME:(PSPMCSOSUM.MAIN.Close.P) 2012-11-22 21:48:15.197
*** ACTION NAME:(PI=867:Success) 2012-11-22 21:48:15.201

Application Engine also sets the MODULE attribute.  In the above example it was set to PSAE.PSPMCSOSUM.1448.  The string is composed of three parts.
  • PSAE indicates that it was running Application Engine. It sets the same value for both stand-alone and Tuxedo server Application Engine processes.
  • PSPMCSOSUM was the name of the Application Engine Program
  • 1448 was the operating system process ID of the Application Engine process.  This value is recorded in the column SESSIONIDNUM on the table PSPRCSQUE.
The inclusion of the SESSIONIDNUM is useful because it is possible to determine the process instance.  If you have multiple instances of the same Application Engine program running concurrently, you can work out which ASH samples correspond to which process instance by matching the session ID.

SELECT  r.prcsinstance, h.module, h.action, sum(1) ash_secs
FROM v$active_Session_history h
, psprcsque q
, psprcsrqst r
WHERE r.prcsinstance = q.prcsinstance
AND h.module = 'PSAE.'||q.prcsname||'.'||q.sessionidnum
AND h.sample_time BETWEEN r.begindttm AND NVL(r.enddttm,SYSDATE)
and r.prcsinstance = 867
GROUP BY r.prcsinstance, h.module, h.action
ORDER BY ash_secs DESC

PRCSINSTANCE MODULE               ACTION                    ASH_SECS
------------ -------------------- ------------------------- --------
         867 PSAE.PSPMCSOSUM.1448 PSPMCSOSUM.GETCNT.CNT.P          7
         867 PSAE.PSPMCSOSUM.1448 PSPMCSOSUM.MAIN.STATS.S          1

Unfortunately, this change to MODULE also means that it is no longer possible to enable trace by setting a watchpoint (as I blogged previously).  It no longer works with Application Engine because it is not possible to predict the value of the SESSIONIDNUM in MODULE.

I am testing with 8.52.12 and have found a significant problem. Application Engine doesn't set the action on DO SELECT statements. The first statement in the trace extract below does come from PSPMCSOSUM.MAIN.CTL PeopleCode, however, the second SQL statement actually comes from PSPMCSOSUM.MAIN.SELECT.D, but the absence of another action line would lead you to believe the statement came from the preceding PeopleCode step.
*** ACTION NAME:(PSPMCSOSUM.MAIN.CTL.P) 2012-11-22 21:47:38.919
WAIT #0: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220814597
WAIT #0: nam='SQL*Net message from client' ela= 740 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220815371
CLOSE #348647416:c=0,e=33,dep=0,type=1,tim=257220815447
PARSING IN CURSOR #348528704 len=98 dep=0 uid=45 oct=3 lid=45 tim=257220815496 hv=1424819941 ad='7ff2559e550' sqlid='98af7ppafu1r5'
PARSE #348528704:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1148792852,tim=257220815496
EXEC #348528704:c=0,e=30,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1148792852,tim=257220815572
WAIT #348528704: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220815612
FETCH #348528704:c=0,e=2824,p=0,cr=16,cu=0,mis=0,r=1,dep=0,og=1,plh=1148792852,tim=257220818462
STAT #348528704 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=16 pr=0 pw=0 time=2828 us)'
STAT #348528704 id=2 cnt=3075 pid=1 pos=1 obj=228740 op='INDEX FULL SCAN PS_PSPMCSO_CHART (cr=16 pr=0 pw=0 time=333 us cost=0 size=13 card=1)'
WAIT #348528704: nam='SQL*Net message from client' ela= 123 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220818671
FETCH #348528704:c=0,e=1,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=1148792852,tim=257220818696
WAIT #348528704: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220818714
WAIT #348528704: nam='SQL*Net message from client' ela= 702 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220819434
CLOSE #349389936:c=0,e=16,dep=0,type=1,tim=257220819496
PARSING IN CURSOR #349702624 len=555 dep=0 uid=45 oct=3 lid=45 tim=257220819548 hv=1888777338 ad='7ff2559ce90' sqlid='5phgqq9s98x3u'
PARSE #349702624:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4226533831,tim=257220819547
EXEC #349702624:c=0,e=47,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4226533831,tim=257220819669
WAIT #349702624: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220819723
WAIT #349702624: nam='SQL*Net more data to client' ela= 76 driver id=1413697536 #bytes=8145 p3=0 obj#=-40016373 tim=257220831271
FETCH #349702624:c=15600,e=11621,p=0,cr=214,cu=0,mis=0,r=201,dep=0,og=1,plh=4226533831,tim=257220831375
WAIT #349702624: nam='SQL*Net message from client' ela= 988 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832412
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832481
WAIT #0: nam='SQL*Net message from client' ela= 40 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832541
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:47:38.937

However, MAIN.SELECT.D duly appears in the AE batch timings report.

                               C o m p i l e    E x e c u t e    F e t c h        Total           
SQL Statement                  Count   Time     Count   Time     Count   Time     Time    
------------------------------ ------- -------- ------- -------- ------- -------- --------

MAIN.CTL.D                           1      0.0       1      0.0       2      0.0      0.0
MAIN.SELECT.D                        1      0.0       1      0.0     556      0.0      0.0
MAIN.STATS.S                         1      0.0       1      0.5       0      0.0      0.5

This is a serious problem  If you profile the top SQL statements in an Application Engine (using either ASH as I have done above, or by profiling a trace file and looking for the top SQL statement by searching through the raw trace file) it will lead you to the wrong conclusion!  Time spent in PSPMCSOSUM.MAIN.SELECT.D will be accounted as having been spent in PSPMCSOSUM.GETCNT.CNT.P.  It is not until you look for the specific SQL statement in Application Designer that you realise that the source code doesn't match the instrumentation.
Nevertheless, this instrumentation is a very welcome and significant improvement for Application Engine. It will help to localise performance problems. However, until the omissions are fixed, you will need to be aware of their implications and avoid falling into the traps.

My thanks to Wolfgang Breitling for telling me about the feature.

Friday, November 16, 2012

Using Two Temporary Tablespaces in PeopleSoft

A longer version of this posting, with all necessary code and a demonstration test, is available as a technical note on the Go-Faster website.  I am working with two different PeopleSoft customers who have had challenges with the size of the temporary tablespaces.  Critical batch processes have failed because they have exhausted space in the temporary tablespace.
ORA-01652 unable to extend temporary segment...
  • In one case, the one and only temporary tablespace in a Payroll system has over time been extended to in excess of 360 GB.  This has happened in response to PeopleSoft processes that failed because they cannot allocate temporary tablespace because somebody else has consumed it.  This treated the symptom rather than the cause.  This system has a number of other Oracle database users who have read-only access to the PeopleSoft data to perform ad-hoc queries.  These users all share the one temporary tablespace.  Occasionally, a query will be submitted that runs for many hours, writing many gigabytes of data to the temporary tablespace, when it would have been better to terminate the process
  • Another system has 64 GB in the PSTEMP temporary tablespace used by SYSADM.  All other users already use another temporary tablespace, but PeopleSoft processes sometimes still fail because most of the temporary tablespace has been consumed by an ad-hoc PS/Query process, and there is nothing left for other processes. This system also has other Oracle database users with read-only access, but here they use the default TEMP temporary tablespace.
If a PeopleSoft system has database users executing ad-hoc queries, then allocating those users to separate temporary tablespace is a sensible first step.

However, in this post, I suggest going further.  I propose switching some PeopleSoft processes to use a different temporary tablespace.  Regular PeopleSoft processing will continue to use the first temporary tablespace, but ad-hoc queries will use the second temporary tablespace.  Thus, the first temporary tablespace can be sized to cater for normal processes safe in the knowledge that it won’t be consumed by ad-hoc queries and you won’t get failures due to space errors.  Meanwhile, the second tablespace can be limited to a reasonable size, and queries that make unreasonable demands on the temporary tablespace will error.

I am going to exploit two features:
  • When a session sets CURRENT_SCHEMA to another schema it uses the temporary tablespace assigned to that schema rather than its own.
  • From PeopleTools 8.51 you can direct read-only activity via a second connection to a standby database. If you are using Oracle Active Data Guard you must configure something that looks like a second PeopleSoft database in another schema, but that is composed of database synonyms that point to the first schema.
To make PeopleSoft process use an alternative temporary tablespace:
  • Create a second schema (I'll call it SYSADMRP) in the same database, and specify a different temporary tablespace for the user.
  • Create synonyms in the second schema for every table and view in the PeopleSoft (SYSADM) schema.
If you are not using Active DataGuard or are below PeopleTools 8.51 then you can set CURRENT_SCHEMA with two triggers.
  • Create an AFTER LOGON trigger that will set CURRENT_SCHEMA to SYSADM for PSQRYSRV query server processes only.
ON sysadm.schema
  l_module VARCHAR2(64);
  SELECT sys_context('USERENV', 'MODULE') 
  INTO   l_module
  FROM   dual;

  IF UPPER(l_module) LIKE 'PSQRYSRV%' THEN --then this is a PSQRYSRV session
  • Create another trigger to set the current schema for the application engine process that runs scheduled queries (you might want to add others to this list).
CREATE OR REPLACE TRIGGER sysadm.set_current_schema
BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
WHEN (  (new.runstatus = '7' OR old.runstatus != '7') --if starting or terminating
     AND new.prcsname IN('PSQUERY')) --restrict to certain programs
  IF :new.runstatus = '7' THEN --if starting set alternative schema
  ELSE --when process terminates reset to standard schema in case this is a PSAESRV process

If you are using at least PeopleTools 8.51, you don't need these triggers.  Instead, you can configure the standby connection to the same database
  • PS/Queries will always use the standby connection and hence the alternative tablespace.  
  • Batch processes can be marked as read-only to make them connect to the standby connection. Remember that you must use PSAESRV processes.
  • If you want to make specific PeopleSoft components use the alternative temporary tablespace then you can only do this by perverting the PeopleTools Active Data Guard support and marking components as read-only.


Now, you can choose an appropriate size for the one temporary tablespaces that will be sufficient for the regular operation of the application.  Adhoc queries will use alternative temporary tablespace.  You might choose to set a temporary tablespace size that may still cause queries to fail with a temporary tablespace error, but at least that won’t cause business-as-usual processes to crash.

You could even choose to run with three temporary tablespaces, one for PeopleSoft processes, one for PeopleSoft queries, and one for ad-hoc users accessing the database directly.