EMEA PeopleSoft Roadshow 2014

Thursday, January 30, 2014

Implementing Deferred Segment Creation After an Upgrade

I have written previously about Deferred Segment Creation. Each empty table and its indexes use only 64Kb (assuming an 8Kb blocksize and locally managed tablespaces), but in a PeopleSoft there can be be tens of thousands of such tables and that adds up to a saving worth making.

If you are upgrading your database to 11gR2 , you might want to make sure that you are using it.  Deferred segment creation was introduced in Oracle 11.2.0.2 and it became the default in 11.2.0.3.  However, any table created in a previous version will have a physical segment.

This problem could affect any system, but it also manifests itself in PeopleSoft in a particular way.

When you run the alter scripts in PeopleTools a table may be recreated.  If it is a regular table (record type 0) then the CREATE TABLE command will not specify a segment creation clause and so the segment creation will be deferred until rows are inserted.

CREATE TABLE PS_TL_IPT1 (PROCESS_INSTANCE DECIMAL(10) NOT NULL,
   EMPLID VARCHAR2(11) NOT NULL,
...
   INITIAL_SEQ_NBR DECIMAL(15) NOT NULL) TABLESPACE TLWORK STORAGE
 (INITIAL 40000 NEXT 100000 MAXEXTENTS UNLIMITED PCTINCREASE 0)
 PCTFREE 10 PCTUSED 80
/
However, from PeopleTools 8.51, Application Designer uses the Oracle delivered DBMS_METADATA package to extract the DDL to recreate the object from the actual object.  However, this behaviour only occurs for Temporary working storage tables (record type 7).  Yes, these are exactly the tables that would benefit most from defered segment creation because in many systems there are many unused temporary table instances.  If table was created under a version of the database prior to 11.2.0.2 then the segment will exist and DBMS_METADATA will generate the DDL with the SEGMENT CREATION IMMEDIATE clause. 

-- Create temporary table 
CREATE TABLE PSYPERSON (EMPLID VARCHAR2(11) NOT NULL,
...
   LAST_CHILD_UPDDTM tIMESTAMP) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 80 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 40960 NEXT 106496 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "HRLARGE"
/
You can use DBMS_SPACE_ADMIN.DROP_EMPTY_SEGMENTS to remove the segments for any empty tables (and their indexes) for which the segment has been previously created.  There would be no harm in simply running this program for every table in the system.  If there are rows then DBMS_SPACE_ADMIN will take no action.

The following script identifies candidate tables where the statistics suggest that there are no rowa or where there are no statistics.  I am indebted to Tim Hall for the idea for this tip.

set serveroutput on 
BEGIN
 FOR i IN (
  SELECT owner, table_name
  FROM   all_tables
  WHERE  owner = 'SYSADM'
  AND    segment_created = 'YES'
  AND    temporary = 'N'
  AND   (num_rows = 0 OR num_rows IS NULL)
 ) LOOP
  dbms_output.put_line(i.owner||'.'||i.table_name);
  dbms_space_admin.drop_empty_segments (
    schema_name    => i.owner,
    table_name     => i.table_name);
 END LOOP;
END;
/
As this package drops the empty segments, the SEGMENT_CREATED column on USER_TABLES changes to NO and if you were to extract the DDL with DBMS_METADATA the SEGMENT CREATION clause would have changed to DEFERRED.

As soon as any data is inserted, the segment is created, SEGMENT_CREATED changes to YES and the DDL generated by DBMS_METADATA would have SEGMENT CREATION IMMEDIATE.

The result is that 64Kb of space (assuming a block size of 8Kb) will be freed up for each empty table and index segment that is dropped. Your mileage may vary, but in my demo HR database that is over 20000 tables and 25000 indexes. 2.7Gb isn't a vast amount these days, but it is an easy win.

Added 1.2.2014:
To answer Noons' question below.  So long as the table or partition doesn't have any rows, the segment will be dropped it will as if the segment creation had been deferred.  You don't have to do anything special to the table.  There is no problem applying this to any empty tables create with their segments.  Here is a simple test with my results on 11.2.0.3:

I will create a table and I have explicitly created the segment immediately, then I insert a row, commit the insert and delete the row.  I haven't even bothered to commit the delete.
SQL> create table t(a number) segment creation immediate;
SQL> insert into t values(42);
SQL> commit;
SQL> delete from t;
SQL> select segment_type, segment_name, tablespace_name from user_Segments where segment_name = 'T';

SEGMENT_TYPE       SEGMENT_NAME    TABLESPACE_NAME
------------------ --------------- ------------------------------
TABLE              T               PSDEFAULT

SQL> select table_name, segment_created from user_tables where table_name = 'T';

TABLE_NAME                     SEG
------------------------------ ---
T                              YES

SQL> select dbms_metadata.get_ddl('TABLE','T') from dual;

DBMS_METADATA.GET_DDL('TABLE','T')
--------------------------------------------------------------------------------
  CREATE TABLE "SYSADM"."T"
   ( "A" NUMBER
   ) SEGMENT CREATION IMMEDIATE


So, at the moment the segment exists, it has had rows in it, but they have been deleted and the table is empty. If I run DBMS_SPACE_ADMIN.DROP_EMPTY_SEGMENTS the segment is dropped.

SQL> execute dbms_space_admin.drop_empty_segments (user,'T');
SQL> select segment_type, segment_name, tablespace_name from user_Segments where segment_name = 'T';

no rows selected

SQL> select table_name, segment_created from user_tables where table_name = 'T';

TABLE_NAME                     SEG
------------------------------ ---
T                              NO

SQL> select dbms_metadata.get_ddl('TABLE','T') from dual;

DBMS_METADATA.GET_DDL('TABLE','T')
--------------------------------------------------------------------------------
  CREATE TABLE "SYSADM"."T"
   ( "A" NUMBER
   ) SEGMENT CREATION DEFERRED

Friday, January 03, 2014

Date to Timestamp Conversion during PeopleTools Upgrade

This blog posting describes a script to convert Oracle date columns to Timestamps as used from PeopleTools 8.50 but only rebilding those indexes that reference those columns, rather than drop and recreate every index in the system, thus producing a significant saving of time during the upgrade.

(A longer version of this arcticle is available on my website)

I am working on a PeopleSoft upgrade project.  We are going from PeopleTools 8.49 to 8.53.  One of the things that happens is that some date columns in the Oracle database become timestamps.

Timestamps were introduced by Oracle in version 10g of the database, and provide the ability to store times accurate to the nanosecond (although the default is microsecond). Dates are accurate to the whole second.

There are 3 types of temporal column in PeopleSoft as defined on PSDBFIELD.  Prior to PeopleTools 8.50 they all become Oracle data columns in the database.  However, from PeopleTools 8.50; Time and DateTime fields are built as TimeStamp columns.if bit 5 of PSSTATUS.DATABASE_OPTIONS (value 32) is set.

PeopleTools Field Type
Database Column Type
PeopleTools <= 8.49 PeopleTools >= 8.50
4 Date DATE DATE
5 Time DATE TIMESTAMP
6 DateTime DATE TIMESTAMP

Timestamps must be handled differently to dates in SQL.  Some date arithmetic must be done differently, in particular the difference between two timestamps is a timestamp rather than a number of days.  Therefore this setting also controls how PeopleCode date macros expand on Oracle.

During the upgrade, PeopleSoft Change Assistant simply alters all the Time and DateTime columns from dates to timestamps.  This generally works well.  The data value doesn't appear to get longer, so the block doesn't run out of room leading to row migratition, and so it isn't necessary to rebuild every table that is affected.

However, there are some limitations. If the column being converted to a timestamp falls into one of the following categories you will get an error.
  • The column is a key column in a function-based index.
  • The table or index is partitioned by the column.
The functional key index issue has not affected many customers because the upgrade template drops all the indexes before altering the tables and rebuilding them again.

However, dropping and recreating all these indexes can be very time consuming and increases the duration of the outage required to perform the upgrade.  This has been my incentive to find a better way.

Function-Based Indexes
PeopleSoft Application Desginer defines some key and search fields as descending.  The rows in components and the results of search dialogue are sorted on the key fields in the order specified.  Application Designer then indexes these columns in descending order (prior to PeopleTools 8 and since PeopleTools 8.47).  If any column in an Oracle index is in decending order the index is created as a function-based index.  Consequently, there can be a lot of descending indexes in a PeopleSoft system!  HR systems are particularly affected because many tables are effective-dated, and the field EFFDT is usually a descending key field.

It is not possible to alter a column to a timestamp if it appears anywhere in a function-based index. You get the following error:
ORA-30556: functional index is defined on the column to be modified 

Partitioning
Partitioning is not something that you usually encounter in a vanilla PeopleSoft system, but it can be added by customisation.  You generate the necessary DDL yourself if you want to use it. However, from PeopleTools 8.51 Application Designer will preserve existing partitioning.

In the system on which I am working, when partitioned the audit tables by AUDIT_STAMP which is a DateTime field.
ORA-14060: data type or length of an table partitioning column may not be changed 
We have had no alternative but to rebuild these tables and repopulate the data. This has also dealt with all locally partitioned indexes.

We have also found that we have one global index partitioned on a timestamp.
ORA-14061: data type or length of an index partitioning column may not be changed 
We also have had to drop this index in order to alter the table.

My Approach
We have had no alternative but to rebuild and repopulate our partitioned audit tables which are partitioned by a DateTime field.  However, that is what we did when we first partitioned them.  The scripts are very similar to those generated by Application Designer.  The table is renamed, a new one is built, and the data is copied.  In our case these scripts are built with a PL/SQL utility.  This also addressed the need to rebuild the locally partitioned indexes..

To minimize the number of indexes which must be rebuilt I have written a PL/SQL script (http://www.go-faster.co.uk/scripts/gfc_desc_timestamp_index.sql) that:
  • identifies the indexes that need to be dropped.
  • captures the DDL to recreate the indexes using DBMS_METADATA and stores it in a table,
  • drops the indexes,
  • alters the columns that cannot be altered with the index in place,
  • recreates the index.
The script successfully handles partitioned function-based indexes.

On this particular HR system we only rebuilt about 400 indexes instead of over 10000. Now the standard PeopleSoft upgrade template can be run without dropping or recreating any further indexes.

Thursday, January 02, 2014

Minimum Number of Recycling Server Processes

When I rebuilt my demo system (some while ago) with PeopleTools 8.52, I noticed a new message generated by ubbgen in PeopleTools 8.52 when the minimum number of recycling servers is set to 1.

WARNING: PSAPPSRV, PSSAMSRV, PSQRYSRV, PSQCKSRV, PSPPMSRV and PSANALYTICSRV are configured with Min instance set to 1. 
To avoid loss of service, configure Min instance to at least 2.

What Produces the Message?
ubbgen is the PeopleSoft utility that merges the template file (psappsrv.ubx) with the configuration file (psappsrv.cfg) file to produce the Tuxedo configuration file (psappsrv.ubb) and the environment file (psappsrv.env). It is invoked by psadmin during Tuxedo domain configuration.
ubbgen -t psappsrv.ubx -c psappsrv.cfg -o psappsrv.ubb -v psappsrv.val -q y -u PUBSUB=n/QUICKSRV=n/QUERYSRV=n/JOLT=y/JRAD=n/DBGSRV=n/RENSRV=n/MCF=n/PPM=n/ANALYTICSRV=n
The message is produced at this time.

Recycling Servers
Several servers in a peopleSoft application Server domain recycle after they have handled a number of services. Recycling is a PeopleSoft behaviour and not a Tuxedo behaviour.  It is controlled by the Recycle Count parameter in the PeopleSoft configuration file (psappsrv.cfg).  This parameter is not referenced in the template file (psappsrv.ubx).

[PSAPPSRV]
;=========================================================================
; Settings for PSAPPSRV
;=========================================================================

;-------------------------------------------------------------------------
; UBBGEN settings
Min Instances=2
Max Instances=3
Service Timeout=0

;-------------------------------------------------------------------------
; Number of services after which PSAPPSRV will automatically restart.
; If the recycle count is set to zero, PSAPPSRV will never be recycled.
; The default value is zero.
; Dynamic change allowed for Recycle Count
Recycle Count=1000

PeopleSoft first started using BEA Tuxedo (as it was then) in PeopleTools 6 to remote call Cobol processes in the Financials product.  The Application Server was introduced in PeopleTools 7.  PSAPPSRV had recycling from the first release.  Legend has it that the engineers at Tuxedo where horrified when they heard that PeopleSoft had introduced recycling to resolve problems created by dynamic memory allocation and deallocation by the Panel Processesor (now known as the component Processor).

Tuxedo servers are supposed to be robust, long lived and not require to be regularly restarted.  Hence the server restart functionality in Tuxedo is only designed to be invoked in the rare occasions when a server process crashes.  Server processes are started by the Restart Server (restartsrv), and only one process can be started concurrently.

The minimum of 2 applies to the following servers because they can both recycle and be configured to spawn additional instances on demand PSAPPSRV, PSANALYTICSRV, PSSAMSRV, PSQCKSRV, PSQRYSRV, PSPUBHND, PSSUBHND, PSBRKHND.  In the delivered configuration file for the developer domain, recycling is disabled for several servers by setting the recycle count to zero.  However, the message is produced by ubbgen irrespective of the value of recycle count.

The message handler servers can be set to have just a single instance, without producing any warning.  They only consume messages from the dispatcher processes so their temporary disapearance will not cause user errors.

What Happens When The Only Server Recycles?
Tuxedo server processes consume service requests placed on queues.  When a server starts up, it advertises its services on the Bulletin Board.  Tuxedo processes that submit requests (mostly the JSH processes, but also the message dispatcher processes, the Process Scheduler) look up on the Bulletin Board where a service is advertised and place it on the appropriate queue.

When a server process performs an orderly shutdown (as it does during a recycle) it removes the adverts for its services (the command is unadvertise).  If a process crashes the Bulletin Board Liason process (BBL) detects the crash and cleans the Bullentin Board. When all services advertised on a queue have been unadvertised, the queue is also removed from the Bulletin Board.  If the process submitting the service request cannot find any server advertising a service it generates an error.  
074001.GO-FASTER-6!JSH.3124.5736.-2: JOLT_CAT:1043: "ERROR: tpacall() call failed, tperrno = 6"
This is why the minimum is 2.  If you recycled the only server process, it is possible for someone to produce this error (see also Minimum Number of Application Server Processes).

Should I set the minimum higher than 2?
The number of services handled by different server processes on the same queue is usually uneven because the service is handled by the first free server.  Therefore it is rare for the processes will reach the recycle count simultaneously, but it can still happen.  Even in a quiet system that doesn't have sufficient activity to justify 3 PSAPPSRVs, I prefer to set the minimum number of servers to at least 3. 

Debugger
NB: If the debugger process is enabled then PSADMIN will force the minimum and maximum number of PSAPPSRV processes to at least 2.  ubbgen will actually update psappsrv.cfg.
Warning:  PSAPPSRV Min Instances too small, setting to 2 for debugger.

Friday, November 23, 2012

PeopleTools 8.52 Application Engine sets MODULE and ACTION

I have written and spoken often about the huge importance of instrumentation in an application. PeopleSoft introduced internal instrumentation to PeopleTools 8.44 for its own Performance Monitor that works on any platform. Since PeopleTools 7.53, when running on an Oracle database, PeopleTools also calls the Oracle supplied package dbms_application_info package to set the module and action for the session. The values set were not particularly useful, so I wrote a PL/SQL package (psftapi) and trigger to set the module and action attributes when a process started.

In PeopleTools 8.50 this instrumentation was enhanced to set module and action to the component and page name in the PIA. These values appear in Oracle Enterprise Manager and Active Session History (ASH). They can be used to profile components.

Now, in PeopleTools 8.52 Application Engine has been enhanced to set the action to the full Application Engine step name.

To illustrate what happens, I have produced an Oracle Extended SQL Trace for the process, but I have shown only the lines in a SQL Trace file with 3 asterisks. So you can see MODULE and ACTION being set (and some other timestamp information). The lines in italic were emitted when module and action were set by my psftapi package, in which I simply set the MODULE to the Application Engine program name.
*** 2012-11-22 21:47:38.282
*** SESSION ID:(7.2237) 2012-11-22 21:47:38.282
*** CLIENT ID:(PS) 2012-11-22 21:47:38.282
*** SERVICE NAME:(HCM91) 2012-11-22 21:47:38.282
*** MODULE NAME:(PSPMCSOSUM) 2012-11-22 21:47:38.282
*** ACTION NAME:(PI=867:Processing) 2012-11-22 21:47:38.282
*** MODULE NAME:(PSAE.PSPMCSOSUM.1448) 2012-11-22 21:47:38.419
*** ACTION NAME:(PSPMCSOSUM) 2012-11-22 21:47:38.419
*** ACTION NAME:(PSPMCSOSUM.MAIN.STATS.S) 2012-11-22 21:47:38.420
*** 2012-11-22 21:47:38.588
*** ACTION NAME:(PSPMCSOSUM.MAIN.CTL.P) 2012-11-22 21:47:38.919
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:47:38.937
*** ACTION NAME:(PSPMCSOSUM.GETCNT.CNT.P) 2012-11-22 21:47:38.963
…
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:48:15.168
*** ACTION NAME:(PSPMCSOSUM.GETCNT.CNT.P) 2012-11-22 21:48:15.168
*** ACTION NAME:(PSPMCSOSUM.MAIN.Close.P) 2012-11-22 21:48:15.197
*** ACTION NAME:(PI=867:Success) 2012-11-22 21:48:15.201

Application Engine also sets the MODULE attribute.  In the above example it was set to PSAE.PSPMCSOSUM.1448.  The string is composed of three parts.
  • PSAE indicates that it was running Application Engine. It sets the same value for both stand-alone and Tuxedo server Application Engine processes.
  • PSPMCSOSUM was the name of the Application Engine Program
  • 1448 was the operating system process ID of the Application Engine process.  This value is recorded in the column SESSIONIDNUM on the table PSPRCSQUE.
The inclusion of the SESSIONIDNUM is useful because it is possible to determine the process instance.  If you have multiple instances of the same Application Engine program running concurrently, you can work out which ASH samples correspond to which process instance by matching the session ID.

SELECT  r.prcsinstance, h.module, h.action, sum(1) ash_secs
FROM v$active_Session_history h
, psprcsque q
, psprcsrqst r
WHERE r.prcsinstance = q.prcsinstance
AND h.module = 'PSAE.'||q.prcsname||'.'||q.sessionidnum
AND h.sample_time BETWEEN r.begindttm AND NVL(r.enddttm,SYSDATE)
and r.prcsinstance = 867
GROUP BY r.prcsinstance, h.module, h.action
ORDER BY ash_secs DESC
/

PRCSINSTANCE MODULE               ACTION                    ASH_SECS
------------ -------------------- ------------------------- --------
         867 PSAE.PSPMCSOSUM.1448 PSPMCSOSUM.GETCNT.CNT.P          7
         867 PSAE.PSPMCSOSUM.1448 PSPMCSOSUM.MAIN.STATS.S          1

Unfortunately, this change to MODULE also means that it is no longer possible to enable trace by setting a watchpoint (as I blogged previously) now does not work with Application Engine because it is not possible to predict the value of the SESSIONIDNUM in MODULE!

I am testing with 8.52.12 and have found a significant problem. Application Engine doesn't set the action on DO SELECT statements. The first statement in the trace extract below does come from PSPMCSOSUM.MAIN.CTL PeopleCode, however, the second SQL statement actually comes from PSPMCSOSUM.MAIN.SELECT.D, but the absence of another action line would lead you to believe the statement came from the preceding PeopleCode step.
*** ACTION NAME:(PSPMCSOSUM.MAIN.CTL.P) 2012-11-22 21:47:38.919
WAIT #0: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220814597
WAIT #0: nam='SQL*Net message from client' ela= 740 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220815371
CLOSE #348647416:c=0,e=33,dep=0,type=1,tim=257220815447
=====================
PARSING IN CURSOR #348528704 len=98 dep=0 uid=45 oct=3 lid=45 tim=257220815496 hv=1424819941 ad='7ff2559e550' sqlid='98af7ppafu1r5'
END OF STMT
PARSE #348528704:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1148792852,tim=257220815496
EXEC #348528704:c=0,e=30,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=1148792852,tim=257220815572
WAIT #348528704: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220815612
FETCH #348528704:c=0,e=2824,p=0,cr=16,cu=0,mis=0,r=1,dep=0,og=1,plh=1148792852,tim=257220818462
STAT #348528704 id=1 cnt=1 pid=0 pos=1 obj=0 op='SORT AGGREGATE (cr=16 pr=0 pw=0 time=2828 us)'
STAT #348528704 id=2 cnt=3075 pid=1 pos=1 obj=228740 op='INDEX FULL SCAN PS_PSPMCSO_CHART (cr=16 pr=0 pw=0 time=333 us cost=0 size=13 card=1)'
WAIT #348528704: nam='SQL*Net message from client' ela= 123 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220818671
FETCH #348528704:c=0,e=1,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=0,plh=1148792852,tim=257220818696
WAIT #348528704: nam='SQL*Net message to client' ela= 0 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220818714
WAIT #348528704: nam='SQL*Net message from client' ela= 702 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220819434
CLOSE #349389936:c=0,e=16,dep=0,type=1,tim=257220819496
=====================
PARSING IN CURSOR #349702624 len=555 dep=0 uid=45 oct=3 lid=45 tim=257220819548 hv=1888777338 ad='7ff2559ce90' sqlid='5phgqq9s98x3u'
SELECT DISTINCT OPRID , TO_CHAR(CAST((LOGINDTTM) AS TIMESTAMP),'YYYY-MM-DD-HH24.MI.SS.FF') , TO_CHAR(CAST((LOGOUTDTTM) AS TIMESTAMP),'YYYY-MM-DD-HH24.MI.SS.FF') , PM_SIGNON_TYPE , PM_SIGNOFF_TYPE FROM PSPMCSOLOG_VW WHERE '1' <> '2' AND TO_DATE(TO_CHAR(LOGINDTTM,'YYYY-MM-DD'),'YYYY-MM-DD') >= TO_DATE('2005-07-06','YYYY-MM-DD') AND TO_DATE(TO_CHAR(LOGOUTDTTM,'YYYY-MM-DD'),'YYYY-MM-DD') >= TO_DATE('2005-07-06','YYYY-MM-DD') AND OPRID NOT IN ( SELECT OPRID FROM PSPMCSOCHRTXOP) AND LOGIPADDRESS NOT IN ( SELECT LOGIPADDRESS FROM PSPMCSOCHRTXIP) ORDER BY 2
END OF STMT
PARSE #349702624:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4226533831,tim=257220819547
EXEC #349702624:c=0,e=47,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=4226533831,tim=257220819669
WAIT #349702624: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220819723
WAIT #349702624: nam='SQL*Net more data to client' ela= 76 driver id=1413697536 #bytes=8145 p3=0 obj#=-40016373 tim=257220831271
FETCH #349702624:c=15600,e=11621,p=0,cr=214,cu=0,mis=0,r=201,dep=0,og=1,plh=4226533831,tim=257220831375
WAIT #349702624: nam='SQL*Net message from client' ela= 988 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832412
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832481
WAIT #0: nam='SQL*Net message from client' ela= 40 driver id=1413697536 #bytes=1 p3=0 obj#=-40016373 tim=257220832541
*** ACTION NAME:(PSPMCSOSUM.MAIN.SELECT.C) 2012-11-22 21:47:38.937

However, MAIN.SELECT.D duly appears in the AE batch timings report.


                               C o m p i l e    E x e c u t e    F e t c h        Total           
SQL Statement                  Count   Time     Count   Time     Count   Time     Time    
------------------------------ ------- -------- ------- -------- ------- -------- --------
…
AE Program: PSPMCSOSUM

MAIN.CTL.D                           1      0.0       1      0.0       2      0.0      0.0
MAIN.SELECT.D                        1      0.0       1      0.0     556      0.0      0.0
MAIN.STATS.S                         1      0.0       1      0.5       0      0.0      0.5
…

This is a serious problem  If you profile the top SQL statements in an Application Engine (using either ASH as I have done above, or by profiling a trace file and looking for the top SQL statement by searching through the raw trace file) it will lead you to the wrong conclusion!  Time spent in PSPMCSOSUM.MAIN.SELECT.D will be accounted as having been spent in PSPMCSOSUM.GETCNT.CNT.P.  It is not until you look for the specific SQL statement in Application Designer that you realise that the source code doesn't match the instrumentation.
 
Nevertheless, this instrumentation is a very welcome and significant improvement for Application Engine. It will help to localise performance problems. However, until the ommisions are fixed, you will need to be aware of their implications and avoid falling into the traps.

My thanks to Wolfgang Breitling for telling me about the feature.

Friday, November 16, 2012

Using Two Temporary Tablespaces in PeopleSoft

A longer version of this posting, with all necessary code and a demonstration test, is available as a technical note on the Go-Faster website.  I am working with two different PeopleSoft customers who have had challenges with the size of the temporary tablespaces.  Critical batch processes have failed because they have exhausted space in the temporary tablespace.
ORA-01652 unable to extend temporary segment...
  • In one case, the one and only temporary tablespace in a Payroll system has over time been extended to in excess of 360 GB.  This has happen in response to PeopleSoft processes that failed because they cannot allocate temporary tablespace because somebody else has consumed it.  This treated the symptom rather than the cause.  This system has a number of other Oracle database users who are have read-only access to the PeopleSoft data to perform adhoc queries.  These users all share the one temporary tablespace.  Occasionally, a query will be submitted that runs for many hours, writing many gigabytes of data to the temporary tablespace, when it would have been better to terminate the process
  • Another system has 64 GB in the PSTEMP temporary tablespace used by SYSADM.  All other users already use another temporary tablespace, but PeopleSoft processes sometimes still fail because most of the temporary tablespace has been consumed by an adhoc PS/Query process, and there is nothing left for other processes. This system also has other Oracle database users with read-only access, but here they use the default TEMP temporary tablespace.
If a PeopleSoft system has database users executing adhoc queries, then allocating those users to separate temporary tablespace is a sensible first step.

However, in this document I suggest going further.  I propose switching some PeopleSoft processes to use a different temporary tablespace.  Regular PeopleSoft processing will continue to use the first temporary tablespace, but adhoc queries will use the second temporary tablespace.  Thus, the first temporary tablespace can be sized to cater for normal processes safe in the knowledge that it won’t be consumed by adhoc queries and you won’t get failures due to space errors.  Meanwhile, the second tablespace can be limited to a reasonable size  and queries that make unreasonable demands on the temporary tablespace will error.

I am going to exploit two features:
  • When a session sets CURRENT_SCHEMA to another schema it uses the temporary tablespace assigned to that schema rather than its own.
  • From PeopleTools 8.51 you can direct read-only activity via a second connection to a standby database. If you are using Oracle Active Data Guard you must configure something that looks like a second PeopleSoft database in another schema, but which actually is composed of database synonyms that point to the first schema.
To make PeopleSoft process use an alternative temporary tablespace:
  • Create a second schema (I'll call it SYSADMRP) in the same database, and specify a different temporary tablespace for the user.
  • Create synonyms in the second schema for every table and view in the PeopleSoft (SYSADM) schema.
If you are not using Active DataGuard or are below PeopleTools 8.51 then you can set CURRENT_SCHEMA with two triggers.
  • Create an AFTER LOGON trigger that will set CURRENT_SCHEMA to SYSADM for PSQRYSRV query server processes only.

CREATE OR REPLACE TRIGGER sysadm.psqrysrv
AFTER LOGON
ON sysadm.schema
DECLARE
  l_module VARCHAR2(64);
BEGIN
  SELECT sys_context('USERENV', 'MODULE') 
  INTO   l_module
  FROM   dual;

  IF UPPER(l_module) LIKE 'PSQRYSRV%' THEN --then this is a PSQRYSRV session
    EXECUTE IMMEDIATE 'ALTER SESSION SET CURRENT_SCHEMA=SYSADMRP';
  END IF;
EXCEPTION WHEN NO_DATA_FOUND THEN NULL;
END;
/ 
  • Create another trigger to set current schema for the application engine process that runs scheduled queries (you might want to add others to this list).

CREATE OR REPLACE TRIGGER sysadm.set_current_schema
BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
FOR EACH ROW
WHEN (  (new.runstatus = '7' OR old.runstatus != '7') --if starting or terminating
     AND new.prcsname IN('PSQUERY')) --restrict to certain programs
DECLARE
BEGIN
  IF :new.runstatus = '7' THEN --if starting set alternative schema
    EXECUTE IMMEDIATE 'ALTER SESSION SET CURRENT_SCHEMA=SYSADMRP';
  ELSE --when process terminates reset to standard schema in case this is a PSAESRV process
    EXECUTE IMMEDIATE 'ALTER SESSION SET CURRENT_SCHEMA=SYSADM';
  END IF;
END;
/

If you are using at least PeopleTools 8.51, you don't need these triggers.  Instead you can configure the standby connection but to the same database
  • PS/Queries will always use the standby connection and hence the alternative tablespace.  
  • Batch processes can be marked as read only to make them connect to the standby connection. Remember that you must use PSAESRV processes.
  • If you want to make specific PeopleSoft components use the alternative temporary tablespace then you can only do this by perverting the PeopleTools Active Data Guard support and marking components as read only.

Conlusion

Now, you can choose an appropriate size for the one temporary tablespaces that will be sufficient for the regular operation of the application.  Adhoc queries will use alternative temporary tablespace.  You might choose to set a temporary tablespace size that may still cause queries to fail with a temproray tablespace error, but at least that won’t cause business-as-usual processes to crash.

You could even choose to run with three temporary tablespaces, one for PeopleSoft processes, one for PeopleSoft queries, and one for adhoc users accessing the database directly.