Wednesday, May 08, 2019

PeopleSoft Administrator Podcast: #183 – Effective Performance Monitoring

I recently recorded a podcast with Dan Iverson and Kyle Benson for the PeopleSoft Administrator Podcast, this time about instrumentation, monitoring the performance of PeopleSoft system, and Performance Monitor.  There is also just a little about cursor sharing.

(3 May 2019) #183 – Effective Performance Monitoring

You can listen to the podcast on psadmin.io, or subscribe with your favourite podcast player, or in iTunes.

Friday, March 08, 2019

Effective PeopleSoft Performance Monitoring

This advice note describes how to configure PeopleSoft systems on Oracle so that performance metrics are collected that are useful performance monitoring and subsequent performance tuning.

Contents

  • Oracle RDBMS Instrumentation
    • On-Line Component/Page Information
    • Application Engine Step Information
    • Cobol/nVision Instrumentation
    • nVision Layout Instrumentation
    • 2nd Database Connection Instrumentation
  • PeopleTools Performance Metrics
    • Cobol and Application Engine Batch Timings
    • PeopleSoft Performance Monitor

Summary of Recommendations

  • Set EnableAEMonitoring=1 in all Application Server and process scheduler domains in order to enable PeopleSoft instrumentation on-line, in the integration broker, and in Application Engine programs.
  • Implement the PSFTAPI package and trigger described above to set module and action at the start of all processes.
  • Implement Fine-Grained Audit policy and handler to instrument nVision processes if required.
  • Reduce AWR snapshot frequency from the default of 1 hour to every 15 minutes.
  • Enable Application Engine batch timings to at least database and preferably also file by setting AETrace=1152 in all Application Server and Process Scheduler domain configuration files.
  • Enable Cobol statement timings report should be written to log file by setting TraceSQL=128 in all Process Scheduler domain configuration files.
  • Performance Monitor is complex to set up and the delivered analytics are limited.  Nonetheless, the information that can be obtained from the data collected can be worth the effort of configuration to address on-line configuration and performances issues.

Oracle RDBMS Instrumentation

Oracle provides the dbms_application_info package to set certain attributes on the current database session.  These attributes are visible in some of the dynamic performance version, and are picked up by Active Session History (ASH) and can also be seen in AWR reports, Enterprise Manager screens, SQL trace files, and other performance utilities such as EDB360.  The package was first documented in Oracle 7.3.3
"Application developers can use the DBMS_APPLICATION_INFO package to record the name of the executing module or transaction in the database for use later when tracking the performance of various modules… System administrators can also use this information to track resource usage by module. When an application registers with the database, its name and actions are recorded in the V$SESSION and V$SQLAREA views. Your applications should set the name of the module and name of the action automatically each time a user enters that module. The module name could be the name of a … code segment in an … application. The action name should usually be the name or description of the current transaction within a module."
Oracle 7 Tuning, release 7.3.3 ©Oracle 1997, Chapter 23 Registering Applications
See also One of my Favourite Database Things: DBMS_APPLICATION_INFO
If module and action are consistently set to meaningful values as the application executes it is then possible to determine from where SQL statements originated and how different parts of the application are performing.
ASH is separately licenced as part the Diagnostics Pack, that is only available on Enterprise Edition of the database.  However, most PeopleSoft customers running on Oracle do so on Enterprise Edition and are licenced for the diagnostics pack.
PeopleSoft has set client_info since PeopleTools 7.53 so the PeopleSoft operator ID can be associated with the database session, mainly to allow auditing to be done with database triggers.  However, this attribute is not persisted to the ASH data.  However, client_id is also set to the operator ID, and this is collected by ASH.
PeopleTools has set module and action since PeopleTools 8.50, though the exact values used have changed sometimes with the PeopleTools version.
A new application server/process scheduler domain parameter EnableAEMonitoring was introduced in PeopleTools 8.54 to control this behaviour and it is not enabled by default.  If monitoring is not enabled module defaults to the program name and action remains blank.
Recommendation: Set EnableAEMonitoring=1 in all application server and process scheduler domains in order to enable PeopleSoft instrumentation on-line, in the integration broker, and in Application Engine programs.
See also:

On-Line Component/Page Information

In the online application, module and action are set to the component and page name respectively.  In a search dialogue, Action is set to 'xyzzy'.
In the Integration Broker module and action are set to service name and queue name.

Application Engine Step Information

In Application Engine, module is set to a string that includes the name of the Application Engine main program the was called and the sessionid_num for the current process instance number recorded on the process scheduler request table PSPRCSQUE.  For example: PSAE.PSPMCSOSUM.1448
Later on, it may be necessary to use regular expressions in SQL to process this string before profiling the ASH data.
Action is set to string concatenated from the Application Engine program, section, step name, and step type.  For example: PSPMCSOSUM.GETCNT.CNT.P
The program name may be different from that shown in module if one AE program calls another.
Note: Application Engine still doesn't reset ACTION on DO SELECT steps

Cobol/nVision Instrumentation

Cobol, nVision, and SQR do not set module and action.  Instead, they can be set at the start of every program initiated by Process Scheduler using a trigger on PSPRCSRQST.  The first thing a process does when it is initiated is to set the run status on its scheduler request record to 7, indicating that it is processing.  This is visible in the Process Monitor component.  A trigger on this transition can set module and action for the session to the program and process instance number on that row of data.
This technique was used prior to PeopleSoft adding instrumentation to PeopleTools.  It can still be applied to all processes, including Application Engine because any PeopleSoft instrumentation will simply overwrite the value set by the trigger.
Recommendation: Implement the PSFTAPI package and trigger described above in order to set module and action at the start of all processes.
The same technique was also used prior to the introduction of ASH to enable Oracle SQL Trace if a particular run control was specified, and is still occasionally useful.

nVision Layout Instrumentation

One of the challenges of tuning and monitoring nVision is to be able to identify each report being run.   nVision reports always run as the same process name, either NVSRUN for a single report, RPTBOOK for a report book of many reports, or DRILLDWN for a nVision drill-down query.  Knowing the process instance is useful because then we can look up the operator and run control ID
However, it would also be useful to know the report ID being run.  When each individual nVision report starts it queries the runtime parameters from the PS_NVS_REPORT PeopleTools table.  There is no update, so it is not possible to capture this with a DML trigger.  Instead, it is possible to define a fine-grained audit policy on the query and set module within a PL/SQL handler package that is invoked by the fine-grained audit.
Recommendation: Implement Fine-Grained Audit policy and handler to instrument nVision processes if required

2nd Database Connection Information

PeopleSoft programs use a second database connection to increment sequence numbers to minimise the row level locking on such tables.  It is like an AUTONOMOUS_TRANSACTION in PL/SQL.  There is no PeopleSoft instrumentation on this session.  It is possible to use an AFTER LOGON trigger to set client_info, module and action.

Oracle Automatic Workload Repository (AWR) Snapshots

PeopleSoft generates a lot of non-shareable SQL.
  • Dynamically generated SQL, often in PeopleCode, concatenates strings of SQL with bind variables, thus the bind variables become literals in the final SQL statement.  Statements with different literal values are considered to be different statements with different SQL_IDs.
  • Similarly, dynamic Cobol statements result in literal values in the SQL statement.
  • %BIND() variables in Application Engine will also become literal values in the SQL Statement unless the ReUseStatement attribute is set on the AE step, however, this cannot be set on statements with dynamic fragments code are introduced with %BIND(…,NOQUOTES).
  • Application Engine programs that use temporary records can use different non-shared instances of the record in different executions, and this also results in different statements with different SQL_IDs.
  • See also Performance Benefits of ReUse Statement Flag in Application Engine
Consequently, the library caching is not particularly efficient in PeopleSoft, and dynamic SQL statements are often quickly aged out of the library cache.  AWR snapshots can only capture the SQL that is in the library cache at the time of the snapshot.  If the SQL statement, or at least a statement with the same force matching signature or plan hash value, cannot be found the AWR it cannot be identified or analysed.  Therefore, it is advantageous to increase the snapshot frequency on PeopleSoft systems.
Recommendation: Reduce AWR snapshot frequency from the default of 1 hour to every 15 minutes.  This change results in only a modest increase overhead in processing and space on AWR, but it is worth the additional information that is captured.
This advice also applies to Statspack that may be used if you are not licenced for the Diagnostics Pack.

PeopleTools Performance Metrics

Batch Timings

PeopleSoft Application Engine and Cobol programs can emit batch timings reports on successful completion.
Application Engine
Application Engine batch timings are controlled by the AETrace flag in the Process Scheduler domain configuration file and for on-line AE programs in the Application Server domain configuration files.
  • AETrace=128: batch timings report is written to the AE Trace file to
  • AETrace=1024: batch timings are written to PS_BAT_TIMINGS% tables in the database
The overhead of batch timings is negligible while the program is running because it is accounted in memory and only written to file or database when the process completes successfully.
Recommendation: Enable Application Engine batch timings to at least database and preferably also file by setting AETrace=1152 in all Application Server and Process Scheduler domain configuration files.
The trace setting in the process scheduler configuration can be overridden by setting process specific command line parameter overrides in the process definition.  This is often done to set other trace settings, it is also common to see these unintentionally left in place longer than necessary.  If trace is set in this way it should always also set the batch timings flags.
See PeopleBooks -> Development Tools -> Application Engine -> Enabling Application Engine Tracing
Cobol
PeopleSoft Cobol programs can only write batching timings reports to file and not to the database.  This is controlled by a different parameter.
  • TraceSQL = 128: Enable Cobol statement timings report 
Recommendation: Enable Cobol statement timings report should be written to log file by setting TraceSQL=128 in all Process Scheduler domain configuration files.
This trace setting can also be overridden by setting process specific command line parameter overrides in the process definition.  If trace is set in this way it should always also set the batch timings flags.

PeopleSoft Performance Monitor

This provides information about the performance of the PIA including response times for the online transactions.  Metrics are stored in a separate monitoring PeopleSoft system to minimize the effect of measurement intrusion.  It optionally samples the state of each web server, application server and process scheduler collecting operating system and Tuxedo metrics.  It also has a PIA session trace capability.
The sampled data includes the number of busy application server processes and length of inbound Tuxedo service queues.  This data can be used to validate the sizing of the application servers.
Recommendation: Performance Monitor is complex to set up and the delivered analytics are limited.  Nonetheless, the information that can be obtained from the data collected can be worth the effort of configuration to address on-line configuration and performances issues.

Thursday, September 06, 2018

How Not to Collect Optimizer Statistics in an Application Engine Program

I was told about a PeopleSoft customer experiencing an Oracle error when collecting statistics during an Application Engine.
ORA-06533: Subscript beyond count 
ORA-06512: at "SYS.DBMS_STATS"…
It is possibly a manifestation of a database bug. The workaround was not to use AUTO_SAMPLE_SIZE, so instead, this customer initially coded an explicit call to DBMS_STATS.GATHER_TABLE_STATS with a specific sample size.
This blog is not about the bug, but how to manage the workaround.

DO NOT TRY THIS AT HOME!


I think that there are a number of problems with this approach
  1. Using a fixed sample size rather than AUTO_SAMPLE_SIZE should only be considered as a temporary workaround. The new hash-based number-of-distinct-values (NDV) algorithm in Oracle 12c only works with AUTO_SAMPLE_SIZE, and it produces more accurate statistics and runs faster because saves a large sort operation. Coding a fixed sample size into an Application Engine requires a managed change to be made, tested and released into production, and then when the underlying problem is resolved the customisation needs to be removed by the same managed process. 
  2. DBMS_STATS.GATHER_TABLE_STATS implies a commit. That can lead to problems that PeopleSoft avoids by only calling statistics via the %UpdateStats macro and controlling when that macro does and does not execute. 

Committing and Restart Checkpointing in Application Engine

If a restartable Application Engine program fails it rolls back to the last commit point and it can then be restarted from that point. Committing is managed by Application Engine at section and step levels where the program state record is updated accordingly. If an error occurs in a step after the implicit commit in DBMS_STATS, it can result in the data in the application tables being different to where the state record indicates the program can be restarted. The program may not restart, or it could conceivably execute but produce erroneous results.
Committing inside a do while loop, including any other Application Engine program called from inside the loop is suppressed at Application Engine section/step level and therefore the execution of %UpdateStats macro is also suppressed. Otherwise, you could get rogue ORA-01555 Snapshot Too Old errors. Suppression of %UpdateStats is reported in the Application Engine step trace.
-- 19:07:37.199 .(AEMINITEST.MAIN.LOOP) (Do Select)
%Select(recname) SELECT recname FROM psrecdefn WHERE recname = 'JOB'
/
-- Buffers:
--   1) JOB
-- 19:07:37.204 Iteration 1 of .(AEMINITEST.MAIN.LOOP) (Do Select) loop

-- 19:07:37.208 .(AEMINITEST.MAIN.LOOP) (SQL)
RECSTATS PS_JOB LOW
/
-- 19:07:37.211 UpdateStats ignored - COMMIT required

-- 19:07:37.212 .(AEMINITEST.MAIN.LOOP) (Do Fetch)
Even a previously uncommitted SQL step can lead to %UpdateStats being suppressed.
-- 19:07:35.205 .(AEMINITEST.MAIN.Step01) (SQL)
UPDATE PS_AETEMPTBLMGR SET AE_DEDICATED = 1 WHERE 0 = 1
/
-- Row(s) affected: 0

-- 19:07:35.213 .(AEMINITEST.MAIN.Step02) (PeopleCode)

-- 19:07:35.220 .(AEMINITEST.MAIN.US1) (SQL)
RECSTATS PS_JOB LOW
/
-- 19:07:35.227 UpdateStats ignored - COMMIT required
If you code DBMS_STATS.GATHER_TABLE_STATS explicitly, Application Engine will not recognise the step as having committed. In the following example, you can see the %UpdateStats on the last step has been suppressed because it Application Engine does not recognise that the update in the first step has been committed by the call to DBMS_STATS.
-- 19:12:06.690 .(AEMINITEST.MAIN.Step01) (SQL)
UPDATE PS_AETEMPTBLMGR SET AE_DEDICATED = 1 WHERE 0 = 1
/
-- Row(s) affected: 0
-- 19:12:06.696 .(AEMINITEST.MAIN.Step02) (PeopleCode)
-- 19:12:06.700 .(AEMINITEST.MAIN.GTS) (SQL)
%Execute(/) BEGIN 
dbms_stats.gather_table_stats('SYSADM','PS_JOB',estimate_percent=>1); END; /
/
-- Row(s) affected: 1

-- 19:12:09.871 .(AEMINITEST.MAIN.US1) (SQL)
RECSTATS PS_JOB LOW
/
-- 19:12:09.877 UpdateStats ignored - COMMIT required
/
Perhaps, the safest form of this workaround would be to have the step with DBMS_STATS and the immediately preceding step explicitly commit as in the following example.  I have also made the program restartable.  Now restart data is checkpointed, and the %UpdateStats macro executes at step US1.
Restart Data CheckPointed
/
COMMIT
/

-- 19:20:24.792 .(AEMINITEST.MAIN.GTS) (SQL)
%Execute(/) BEGIN 
dbms_stats.gather_table_stats('SYSADM','PS_JOB',estimate_percent=>1); END; /
/
-- Row(s) affected: 1
/ 
/
Restart Data CheckPointed
/
COMMIT
/

-- 19:20:29.153 .(AEMINITEST.MAIN.US1) (SQL)
RECSTATS PS_JOB LOW
/
-- Row(s) affected: 1
/
However, you have to consider the state the application data after an error, whether you wish to restart or cancel the Application Engine because you can no longer rollback.

Doing the Right Thing

I recommend that:
  • You should only ever collect stats in Application Engine with the %UpdateStats macro that in turn executes the command in the DDL model.
  • From Oracle 11g both PeopleSoft statistics gathering DDL models should be the same and should ultimately call DBMS_STATS without any parameters other than the table name. The default value of ESTIMATE_PERCENT is AUTO_SAMPLE_SIZE.
There are two DDL models in PeopleSoft because %UpdateStats can be invoked with a second parameter to collect the statistics HIGH or LOW. This dates back to Oracle's ANALYZE command that could either compute or estimate statistics (and other database platforms had similar options). Collecting optimizer statistics with ANALYZE has been deprecated for many years, but the command still has other valid uses.  It was superceded by DBMS_STATS in Oracle 8i (released in 1998).
Automatic sample size was introduced in Oracle 9i.  In Oracle 9i and 10g, it was usual to use automatic sample size in the high statistics gathering model and a small fixed sample size in the low model for use on very large tables.  The LOW parameter was specified on %Updatestats in Application Engine programs as necessary.
This approach became redundant from Oracle 11g with the introduction of table preferences. If you need to collect statistics with a specific rather than the automatic sample size or specify any other parameters, then a table preference should be created. Preferences apply wherever statistics are gathered on that table and not overridden in the call of DBMS_STATS., including schema and database-wide operations such as the maintenance window.  If there are multiple places where statistics are collected on a table, a preference assures that the statistics will always be collected will be consistently.
From Oracle 12c, as the new NDV algorithm only works with AUTO_SAMPLE_SIZE, you should always use the default unless you have an overarching reason to the contrary, and then you should use a table preference. This approach does not require any application code change because the preference is an attribute of a table in the database.
I recommend using GFCPSSTATS package, calling it from the DDL model (see previous blog Managing Cost-Based Optimizer Statistics for PeopleSoft). The package also includes a mechanism to specify table preferences in metadata, so that they are automatically instantiated when a table is created or altered by Application Designer, and are applied to every instance of a temporary record.
gfcpsstats11.ps_stats(p_ownname=>[DBNAME], p_tabname=>[TBNAME]);
Alternatively, just call DBMS_STATS with no additional parameters (other than FORCE, in case you lock statistics on temporary tables) but then you must manage table preferences manually.
DBMS_STATS.GATHER_TABLE_STATS(ownname=>[DBNAME], tabname=>[TBNAME], force=>TRUE);

Further Reading

Friday, June 29, 2018

Managing Cost-Based Optimizer Statistics for PeopleSoft

I gave this presentation to UKOUG PeopleSoft Roadshow 2018

PeopleSoft presents some special challenges when it comes to collecting and maintaining the object statistics used by the cost-based optimizer.

I have previously written and blogged on this subject.  This presentation focuses exclusively on the Oracle database and draws together the various concepts into a single consistent picture.  It makes clear recommendations for Oracle 12c that will help you work with the cost-based optimizer, rather than continually fight against it.

It looks at collecting statistics for permanent and temporary working storage tables and considers some other factors that can affect optimizer statistics.

This presentation also discusses PSCBO_STATS, that is going to be shipped with PeopleTools, and compares and contrasts it with GFCPSSTATS11.

Saturday, April 14, 2018

Application Engine in Process Scheduler: PSAESRV Server Process -v- Standalone PSAE executable

Whether to use the Application Engine server process (PSAESRV) in the process scheduler tuxedo domain or the standalone PSAE executable is a frequently discussed point amongst PeopleSoft administrator.  Over the years, I have written various things on the subject.  I am going to draw them together in this blog, and restate Oracle’s now clear advice about when to use which option.

In PeopleTools 8.4, the Process Scheduler became a fully fledged Tuxedo domain.  The PSAESRV process was also introduced at this time.  It is a persistent process that handles both Application Engine and Optimization Engine requests.  Each server process creates persistent database connections.  A number of these server processes are started with the domain.   The PSAESRV process does not spawn like other Tuxedo server processes.  Instead, you must configure the number of server processes to match the maximum number of concurrent Application Engine process requests and concurrent Optimization Engine requests that the process scheduler can run.  The server was introduced to handle very short-lived Application Engine programs thus avoiding the overhead of instantiating a new process and new database sessions for each process request.  CRM typically uses Application Engine in this fashion, but generally, you do not see this in other PeopleSoft products.

Oracle has not always been clear what they mean by a short-lived process.  It has been suggested that if Application Engine processes are typically taking less than 10-30 seconds, or if you run more than 1000 Application Engine processes requests per hour (note 651970.1) you should use PSAESRVs.
PeopleBooks advises you should use PSAESRV because it delivers improved system performance.  However, PeopleTools Performance Guidelines Red Paper (Doc ID 747389.1) contradicts this somewhat.  Ultimately, if you have any doubts, you should it test each way and determine whether one way gives a clear advantage over the other.

Oracle Support Note "What will be the impact of disabling PSAESRV on the Process Scheduler (Doc ID 651970.1)" explains that if PSAESRV is disabled in the Tuxedo domain configuration, the Process Scheduler goes back to the legacy behaviour and spawns a stand-alone PSAE process for each Application Engine request.  “The Application Engine will take a bit longer to start, [the] time delay may be range from millisecond to seconds” depending on hardware and configuration.

The stand-alone process has several advantages.
  • At the end of the Application Engine program, it disconnects from the database and terminates.  Thus releasing resources from the process and the database session.  Whereas the persistent Application Engine process has been reported to accumulate allocated memory over time.
  • If you are using Oracle database Global Temporary Tables in an application engine, then you should not use PSAESRV because the tables are always created PRESERVE ON COMMIT and so are only released when the database session terminates.
  • If you set any session parameters within an Application Engine program run via PSAESRV, or enable database trace, then these settings will carry forward from one Application Program to the next unless you reset the parameter at the end of the program, or the start of the next.  This is not a concern with standalone PSAE processes.
However, there is at least one case where you must use the server process:
  • If you are using Oracle Active Data Guard and wish to mark some Application Engine programs as read-only then they must be run via the PSAESRV process

Conclusion

  • PeopleTools Performance Guidelines Red Paper (Doc ID 747389.1 sums it up very nicely: “PSAE is as good as PSAESRV for most practical purposes.  If you have an application engine job that runs longer than 10 seconds, PSAE is equivalent to PSAESRV.  PSAE has the added advantage of being recycled at the end of each application engine job, cleaning up any outstanding SQL cursors to the database that may have been left behind.  Because PSAE recycles after each use, PSAE does not have any possible memory leakage problem that may occupy the precious system memory.  In short, PSAE is a cleaner workhorse.”
  • I think it is reasonable to use PSAESRV in CRM.  For all other products, I recommend that PSAESRV should be disabled from all Process Schedulers.
    • If you do have some Application Processes that are both short-lived (i.e. less than 10 seconds) and run frequently, then consider creating other process schedulers with PSAESRV processes that are dedicated to running only these process.  You can then move these processes to a new Process Scheduler category that only runs on these new Process Scheduler.
  • PSAESRV is configured by default, so if you don’t want to use it, and mostly you won’t, then you have to remember to disable it.