Sunday, September 30, 2012

Maintaining Optimizer Statistics on PeopleSoft on Oracle 11g (and beyond)

Update 20.4.2023: Nearly everything in this article also applies to later versions of Oracle too.
I have been considering how to collect optimizer statistics for a PeopleSoft system running on an Oracle 11g database.  Despite 11g being several years old, most of my current customers are still using 10g, though some are looking at the upgrade to 11g.  I believe a slightly different approach is required.

In 2009, I wrote a series of blog postings on the subject of collecting statistics.  However, these were all based on Oracle 10g.  I proposed a PL/SQL package that would use meta-data in a database table to determine how to collect statistics on a table, or deliberately suppress the collection of statistics.
I also recommended that statistics on tables created for use as temporary records in Application Engine programs should have their statistics deleted and locked to prevent system-wide jobs from refreshing their statistics. I proposed a package that collected statistics according to meta-data defined in a table.

IN 2011, Oracle published document 1322888.1 “pscbo_stats - Improving Statistics in Oracle RDBMS for PeopleSoft Enterprise”.  It takes a similar approach to my 10g package.  A PL/SQL package is used to collect statistics.  A number of tables control whether, when, and how statistics are collected on each record.  Oracle's package is also intended to be used to collect schema-wide statistics.

The psbo_stats package is an attempt to solve a genuine problem, and it has continued to evolve since its initial release.  However, I have a number of objections to it.
  • You are expected to replace the default automatic jobs that collect statistics with pscbo_stats, so it is a move away from standard default maintenance procedures.
  • pscbo_stats is still fundamentally a 10g solution. It does not use 11g table preferences.
  • It does use the Oracle automatic sample size in 11g if histograms are not to be collected.  Otherwise, it defaults to the previous behaviour of either using 100% sample size for when called by %UpdateStats with the ‘high’ sample size, or a variety of fixed sample sizes based on internal rules and the number of rows in the table.
  • The package contains a procedure that collects statistics on all objects in the schema that also refreshes statistics that are not stale but which have not been refreshed for a period of time determined by the size of the table. 
  • When it collects histograms it always sets the maximum bucket size of 254. This may not always be desirable for height-balanced histograms.
  • There is no support for collecting aggregated or incremental statistics on partitioned objects.
I think that the 11g table preferences offer better control over collection of statistics.

Oracle considerably enhanced the delivered DBMS_STATS package in 11g.  It became possible to specify default values for parameters in the dbms_stats.gather_table_stats program for each table.

The guiding principle in 11g, and one that is not specific to PeopleSoft, is that instead of calling dbms_stats.gather_table_stats with the desired parameters, we should set table preferences with the desired parameters and then just call dbms_stats without table-specific parameters. We can then just leave the default database and schema-wide procedures to get on with the job of collecting statistics.    

A document that describes the updated approach and the various scripts is available on the Go-Faster website at http://www2.go-faster.co.uk/docs.htm#Managing.Statistics.11g.

CAVEAT: This document represents some experimental work that is in progress.  It has not been tested against a customer system, let alone been used in production. I would welcome any feedback, and the opportunity to work with someone on a PeopleSoft system on Oracle 11g.

Update 4.2.2015: I have put this solution into PeopleSoft systems at two customer systems last year, so it is no longer just theory.

Update 5.4.2017: PeopleTools 8.55 set the PeopleSoft owner ID (usually SYSADM) in variable [DBNAME] in lowercase. Thus it is passed in lowercase to the PL/SQL package has been updated to handle this.

Thursday, September 20, 2012

Enabling Oracle Extended SQL Trace by Module and Action

I have written previously about the value of assigning meaningful values to the module and action attributes on a database session (see Using Oracle Enterprise Manager (Grid Control) with PeopleSoft). Oracle added instrumentation to PeopleTools 8.50 (see PeopleTools 8.50 uses DBMS_APPLICATION_INFO to Identify Database Sessions) that sets module and action for on-line and batch sessions.  However, I still use my own trigger to set these attributes for processes initiated by the Process Scheduler.

I originally became interested in module and action because it made it possible to analyse performance problems in specific processes with Active Session History (see Practical Use of ASH). However, since Oracle 10g there is also a very easy way to enable Oracle's extended session trace in the sessions relating to specific processes.

In the Oracle supplied PL/SQL package DBMS_MONITOR, there are two programs to specify combinations of module and action for which SQL trace will be enabled.  It is rather like setting a watchpoint in a debugger. SERV_MOD_ACT_TRACE_ENABLE is used to create a watchpoint, and SERV_MOD_ACT_TRACE_DISABLE removes it. Trace is then enabled in a session when the module and action is set to a value that matches a watchpoint and disabled when the module and/or action is changed to a value for which there is no watchpoint.

This screenshot from OEM shows that a query is being run within a PeopleSoft component RECV_PO, page PO_PICK_ORDERS.

It would be a simple matter to have Oracle trace the session for that component by setting a watchpoint.

BEGIN
 sys.dbms_monitor.serv_mod_act_trace_enable
 (service_name=>'P1PPP'
 ,module_name =>'RECV_PO'
 ,waits=>TRUE
 ,binds=>TRUE);
END;
/

Note that the service name must be specified as part of the watchpoint.  This will usually be the same as the PeopleSoft database name.  Thus, watchpoints copied to another database by cloning won't be effective because the service name will be different.

I could also have specified an action in the example above, but I want to trace all pages in the component and the search dialogue.

As with the other programs in dbms_monitor, information on wait events is included in traces by default.  In the above example, I have also requested information on bind variables to be included in the trace.

When setting a SQL trace for an on-line component it is possible that many different users could trigger tracing.  Trace will be enabled and disabled for each user. The result is that you will get a trace file for each application server process, and each trace file might contain the activity for more than one user if the application server process handled more than one service request.

Watchpoints can be removed in a similar way.
BEGIN
 sys.dbms_monitor.serv_mod_act_trace_disable
 (service_name=>'P1PPP'
 ,module_name =>'RECV_PO'
 ,action_name =>'PO_PICK_ORDERS');
END;
/

Watchpoints are held the table SYS.WRI$_TRACING_ENABLED.

SELECT * FROM sys.wri$_tracing_enabled
/

TRACE_TYPE PRIMARY_ID QUALIFIER_ID1 QUALIFIER_ID2   INSTANCE_NAME FLAGS
---------- ---------- ------------- --------------- ------------- -----
         5 P1PPP      RECV_PO       PO_PICK_ORDERS                   12

This technique is also effective for processes initiated by the Process Scheduler. The default behaviour from PeopleTools 8.50 is to set Action to the process name, my trigger sets Module to the process name.
If I specify only the Module in SERV_MOD_ACT_TRACE_ENABLE, then it trace will be enabled for that module and for any action.
BEGIN
 sys.dbms_monitor.serv_mod_act_trace_disable
 (service_name=>'AAAAAAA'
 ,module_name =>'GPPDPRUN');
END;
/
I choose to set the action to the Process Instance number because I find it useful to relate ASH data to a specific batch process. The above screenshot from OEM shows a streamed Global Payroll calculation running with many concurrent GPPDPRUN processes. The watchpoint would cause each and every process to trace.

However, it has been suggested that if the Action was set to the process run control ID then the watchpoint could be set to the specific run control value used by a user or scheduled job. It would be a simple matter of changing the trigger (I'll leave that as an exercise).  Then, a user could be told to use a specific value for run control ID that would then invoke trace.
BEGIN
 sys.dbms_monitor.serv_mod_act_trace_disable
 (service_name=>'AAAAAAA'
 ,action_name =>'TRACEME');
END;
/

Update 24.11.2012 This technique will not work for Application Engine from PeopleTools 8.52 because it now sets MODULE to a value that includes a session number and so cannot be predicted.

Thursday, July 19, 2012

Gathering Aggregated Cost-Based Optimiser Statistics on Partitioned Objects

Recently, I have been looking into how to gather cost-based optimizer statistics on composite partitioned objects. 

Database partitioning is not used by default in PeopleSoft because it is database specific.  Not all databases support partitioning, and where they do it is done in a platform specific manner.  Application Designer has no capability to create partitioned objects (although from PeopleTools 8.51 it will preserve partitioning in tables and index that are already partitioned in an Oracle database). It is therefore no surprise that it is not used widely by PeopleSoft customers.  However, partitioning is essential in Global Payroll systems that use 'streamed'  processing (many concurrent payroll processes).  By definition, these are the payroll systems that have large volumes of data where collecting statistics on the results tables can quickly become a challenge.  Partitioning can also have application in General Ledger reporting in Financials, particularly where nVision is in use.

It is important that statistics on partitioned tables are both up to date and accurate.  However, collecting global statistics on a large partitioned object can be a time-consuming and resource intensive business as Oracle samples all the physical partitions or sub-partitions. Briefly, if you do not collect global statistics on a partitioned table, Oracle will aggegrate the statistics on the physical partitons or sub-partitions to calculate statistics on the logical table and partition segments.

Oracle 10g makes a number of mistakes in its calculation of these aggregated statistics.  In particular the number of distinct values on columns by which the table is partitioned have impossibly low values.  This is can affect cardinality calculations and so lead the optimizer to choose the wrong execution plan.

I have now published the second of two documents on my website that examine aspects of statistics on partitioned, and in particular composite partitioned tables. The first document examines the problems in 10g, and proposes a procedure to 'correct' the aggregated statistics to at least minimum possible values.
The second document looks at the same issue in 11g and shows that while most of the issues are fixed, one problem remains.

Sunday, June 26, 2011

Deferred Segment Creation in PeopleSoft

This note has been in my to do folder for a while since I found these two excellent blog entries about Deferred Segment Creation by Chistian Antognini.
They made me think about the use of this feature in PeopleSoft. In most PeopleSoft systems there are lots of empty tables, sometimes because not all the modules delivered in the database are in use, but also because not all the temporary table instances have ever been used. This blog entry from Tom Kyte feels very close to home
Deferred segment creation is available from Oracle 11gR1. From Oracle 11.2.0.2 this becomes the default, so there is no need to reconfigure anything in PeopleSoft in order use this Oracle feature. You may want to drop empty segments created under previous versions of the database.

Thursday, March 03, 2011

More Process Priority Levels for the Process Scheduler

This note started out as an idea for how to add more priority levels to the PeopleSoft Process Scheduler to improve control over prioritisation of processes in a complex batch. While testing I found some interesting behaviour that I had not expected. I will also explain how I did some of my tests, because they are easily reproducible.

More Priority Levels

By default, PeopleSoft process can be given priority 1 (low), 5 (medium), and 9 (high). If the Process Scheduler is too busy to start all the processes scheduled to be started, it starts the higher priority ones in preference to the lower priority.

That’s fine, but sometimes in a very complex batch environment 3 process levels are not enough, and it would be useful to have more priority levels so that it is possible to define the hierarchy of processing to a finer level.

There does not appear to be anything special about the priorities delivered by PeopleSoft. They do not appear to be hard-coded anywhere within the SQL submitted Process Scheduler. As you might expect, some of the queries are sorted in descending order of priority.

The priority of a process in the Process Scheduler is defined by the single character field PRCSPRIORITY on the record PRCSDEFN. All you have to do is define addition translate values

The new priorities can be assigned via the delivered components.

I started with the intention of applying the new priority level to processes through the delivered component. However, I encountered some quirky behaviour from the Process Scheduler and Master Process Scheduler. There is something happening inside the programs rather than the SQL where they do not always take the process with the highest priority if that priority is not one of the delivered values. Therefore, I do not recommend using the new priorities on process definitions.

However, Server Categories also have priorities. Testing has shown that the priority of the Process Category take precedence over the priority on the Process Definition.

The new server categories should be defined in the Process Category Administration page of the Process Scheduler System Settings component


Then priorities and maximum concurrencies can be assigned to the categories in each Process Scheduler Server definition.


Testing the Additional Priorities
I have (with some assistance) constructed a simple test harness. I have an Application Engine program, AE_SLEEP, that does nothing, but sleeps for a period of time by calling the Java Sleep() method in a PeopleCode step.

Local JavaObject &Obj;Local ProcessRequest &RQST;
MessageBox(0, "", 0, 0, "Sleep Begin for " | GFC_SLEEP_AET.DURATION.Value | " seconds ");

&Obj = CreateJavaObject("java.lang.Thread");&Obj.start();
&Obj.sleep(GFC_SLEEP_AET.DURATION.Value * 1000);

MessageBox(0, "", 0, 0, "Sleep END");

I have a number of other Application engine programs that have different priorities, or that are in categories with different priorities, that call AE_SLEEP.

Finally, there is a process AE_SLEEP_RUN which submits requests for the other processes based on a control table. In this test I will be setting up some test data to demonstrate how the Process Scheduler behaves, in the next article, I will be modeling the behaviour of a real batch load from a real system

TRUNCATE TABLE ps_gfc_sleep_rc
/
INSERT INTO ps_gfc_sleep_rc
(run_cntl_id, prcstype, prcsname, offset_amount, duration, rundttm, last_run_cntl_id, next_run_cntl_id)
SELECT level, 'Application Engine', 'AE_SLEEP'||level, 5*level, 175, NULL, ' ', ' '
FROM dual CONNECT BY level <= 9
/
COMMIT
/ 
ALTER SESSION SET NLS_DATE_FORMAT='hh24:mi:ss';
set lines 120
COLUMN run_cntl_id FORMAT a8 heading 'Run|Control'
COLUMN last_run_cntl_id FORMAT a8 heading 'Last|Run|Control'
COLUMN next_run_cntl_id FORMAT a8 heading 'Next|Run|Control'
COLUMN prcsinstance FORMAT 99999 HEADING 'PRCS|INSTANCE'
COLUMN offset_amount FORMAT 9999 HEADING 'Offset|Amount'
COLUMN rownum FORMAT 99 HEADING 'Row|#'
COLUMN prcstype FORMAT a18
COLUMN prcsname FORMAT a12
COLUMN prcsprty FORMAT 9 HEADING 'PRCS|PRTY'
COLUMN prcscategory FORMAT a11 HEADING 'PRCS|CATEGORY'
COLUMN serverassign FORMAT a6 HEADING 'SERVER|ASSIGN'
COLUMN runstatus FORMAT a6 HEADING 'RUN|STATUS' 
 
SELECT run_cntl_id, prcsname, offset_amount, duration, rundttm 
FROM ps_gfc_sleep_rc
/

So the AE_SLEEP_RUN will schedule one process every 5 seconds. The lowest priority processes are scheduled first.

Run                            Offset
Control  PRCSNAME     RUNDTTM  Amount   DURATION
-------- ------------ -------- ------ ----------
1        AE_SLEEP1                  5        175
2        AE_SLEEP2                 10        175
3        AE_SLEEP3                 15        175
4        AE_SLEEP4                 20        175
5        AE_SLEEP5                 25        175
6        AE_SLEEP6                 30        175
7        AE_SLEEP7                 35        175
8        AE_SLEEP8                 40        175
9        AE_SLEEP9                 45        175

I can query what actually happened with the following SQL.

SELECT rownum, a.* 
FROM (
   SELECT SYSDATE, q.prcsinstance, q.prcsname, q.prcsprty, q.rundttm
   , r.begindttm, r.enddttm, q.serverassign, q.runstatus
   --, r.prcscategory
   FROM psprcsrqst r, psprcsque q
   WHERE r.prcsinstance = q.prcsinstanceAND r.prcsname like 'AE_SLEEP_'
   AND r.prcsinstance > (
      SELECT MAX(prcsinstance)
      FROM psprcsrqst
      WHERE prcsname = 'AE_SLEEP_RUN' AND runstatus = '9')
ORDER BY r.begindttm, q.prcsprty DESC, q.rundttm, r.prcsinstance) a
/

In all of the examples below, the first three low-priority processes run first because they are scheduled to run and therefore start before the higher priority processes are scheduled. The Process Scheduler is then blocked until all the other processes are due to have run, and so priority determines while processes the scheduler starts next.

However, due to a quirk in Process Scheduler, the processes are not assigned to the Process Scheduler in time order. In the following example, AE_SLEEP5 was executed before AE_SLEEP6.

Row           PRCS
#   SYSDATE   INSTANCE PRCSNAME     RUNDTTM  BEGINDTTM ENDDTTM
--- --------- -------- ------------ -------- --------- --------
  1 13:14:17      4583 AE_SLEEP1    13:03:27 13:03:55  13:07:08
  2 13:14:17      4584 AE_SLEEP2    13:03:32 13:03:55  13:07:08
  3 13:14:17      4585 AE_SLEEP3    13:03:37 13:03:55  13:07:08
  4 13:14:17      4591 AE_SLEEP9    13:04:07 13:07:12  13:10:27
  5 13:14:17      4587 AE_SLEEP5    13:03:47 13:07:13  13:10:27
  6 13:14:17      4588 AE_SLEEP6    13:03:52 13:07:13  13:10:27
  7 13:14:17      4586 AE_SLEEP4    13:03:42 13:10:31  13:13:46
  8 13:14:17      4589 AE_SLEEP7    13:03:57 13:10:47  13:14:01
  9 13:14:17      4590 AE_SLEEP8    13:04:02 13:10:47  13:14:01

But if the non-standard priority jobs are put in a different category, they are executed in the right order. It would appear that there is some logic in the Master Process Scheduler (and not in the SQL queries it submits) that works category by category.

Row              PRCS              PRCS                            SERVER RU
  # SYSDATE  INSTANCE PRCSNAME     PRTY RUNDTTM  BEGINDTT ENDDTTM  ASSIGN ST PRCSCATEGORY
--- -------- -------- ------------ ---- -------- -------- -------- ------ -- ------------
  1 17:44:21     4681 AE_SLEEP5       5 17:31:10 17:31:24 17:34:39 PSNT   9  Default
  2 17:44:21     4682 AE_SLEEP5       5 17:31:13 17:31:24 17:34:39 PSNT   9  Default
  3 17:44:21     4683 AE_SLEEP5       5 17:31:16 17:31:40 17:34:56 PSNT   9  Default
  4 17:44:21     4690 AE_SLEEP7       7 17:32:43 17:34:43 17:37:57 PSNT   9  Priority 9
  5 17:44:21     4689 AE_SLEEP7       7 17:32:44 17:34:43 17:37:57 PSNT   9  Priority 9
  6 17:44:21     4688 AE_SLEEP7       7 17:32:45 17:34:59 17:38:14 PSNT   9  Priority 9
  7 17:44:21     4687 AE_SLEEP7       7 17:32:46 17:38:01 17:41:16 PSNT   9  Priority 9
  8 17:44:21     4684 AE_SLEEP5       5 17:31:19 17:38:01 17:41:16 PSNT   9  Default
  9 17:44:21     4685 AE_SLEEP5       5 17:31:22 17:38:17 17:41:33 PSNT   9  Default 
 10 17:44:21     4686 AE_SLEEP5       5 17:31:25 17:41:20 17:44:20 PSNT   9  Default

If there are two new priorities in the new category, then they do not run in the right order.

Row              PRCS              PRCS                            SERVER RUN
  # SYSDATE  INSTANCE PRCSNAME     PRTY RUNDTTM  BEGINDTT ENDDTTM  ASSIGN STATUS PRCSCATEGORY
--- -------- -------- ------------ ---- -------- -------- -------- ------ ------ ------------
  1 19:01:58     4716 AE_SLEEP5       5 18:06:58 18:07:26 18:10:40 PSNT   9      Default
  2 19:01:58     4718 AE_SLEEP5       5 18:07:04 18:07:26 18:10:40 PSNT   9      Default
  3 19:01:58     4719 AE_SLEEP5       5 18:07:07 18:07:26 18:10:40 PSNT   9      Default
  4 19:01:58     4723 AE_SLEEP6       6 18:08:25 18:10:44 18:13:43 PSNT   9      Priority 9
  5 19:01:58     4722 AE_SLEEP6       6 18:08:30 18:10:44 18:13:43 PSNT   9      Priority 9
  6 19:01:58     4717 AE_SLEEP5       5 18:07:01 18:10:44 18:13:43 PSNT   9      Default
  7 19:01:58     4724 AE_SLEEP7       7 18:08:40 18:14:02 18:17:00 PSNT   9      Priority 9
  8 19:01:58     4725 AE_SLEEP7       7 18:08:45 18:14:02 18:17:00 PSNT   9      Priority 9
  9 19:01:58     4720 AE_SLEEP5       5 18:07:10 18:14:02 18:17:00 PSNT   9      Default
 10 19:01:58     4721 AE_SLEEP5       5 18:07:13 18:17:18 18:20:17 PSNT   9      Default

However, if I create a separate category for each priority, and assign the priority to that category in the server definition, then the processes run in the desired order.

Row              PRCS              PRCS                            SERVER RUN
  # SYSDATE  INSTANCE PRCSNAME     PRTY RUNDTTM  BEGINDTT ENDDTTM  ASSIGN STATUS PRCSCATEGORY
--- -------- -------- ------------ ---- -------- -------- -------- ------ ------ ------------
  1 19:38:00     4740 AE_SLEEP5       5 19:25:10 19:25:23 19:28:23 PSNT   9      Default
  2 19:38:00     4741 AE_SLEEP5       5 19:25:13 19:25:39 19:28:38 PSNT   9      Default
  3 19:38:00     4742 AE_SLEEP5       5 19:25:16 19:25:39 19:28:38 PSNT   9      Default
  4 19:38:00     4748 AE_SLEEP7       7 19:26:52 19:28:41 19:31:41 PSNT   9      Priority 7
  5 19:38:00     4749 AE_SLEEP7       7 19:26:57 19:28:58 19:31:56 PSNT   9      Priority 7
  6 19:38:00     4747 AE_SLEEP6       6 19:26:37 19:28:58 19:31:56 PSNT   9      Priority 6
  7 19:38:00     4746 AE_SLEEP6       6 19:26:42 19:32:00 19:34:59 PSNT   9      Priority 6
  8 19:38:00     4743 AE_SLEEP5       5 19:25:19 19:32:16 19:35:15 PSNT   9      Default
  9 19:38:00     4744 AE_SLEEP5       5 19:25:22 19:32:16 19:35:16 PSNT   9      Default
 10 19:38:00     4745 AE_SLEEP5       5 19:25:25 19:35:02 19:37:59 PSNT   9      Default

Conclusion
The Process Scheduler does not reliably take process priorities into account when scheduling processes,  so I do not recommend using the additional priorities on processes, just the three delivered priorities (1, 5, and 9).

However, all the priorities do work properly on process categories. The priority of the server class does not appear to have any effect (I have not produced results for those tests here).

The priority of the category to which a process belongs takes precedence over the priority of the process. Thus a low priority process in a high priority category will be executed by the Process Scheduler in preference to a high priority process in a low priority category. Naturally if two processes are in the same category, or categories of equal priority, the higher priority process will be scheduled first.

Acknowledgements
My thanks to Andy Mason of Business Integrations for contributing to the initial concept and the Java code to make Application Engine sleep and assistance with the test harness.

Further Reading
I have published a paper on my website that describes how to extend this technique to model a real production batch, and predict the impact of Process Scheduler configuration changes.