Monday, December 04, 2006

Retrieving Oracle trace files via an external table

December 21st 2006: Correction to script

When talking about performance tuning, I will often tell developers that they need to use Oracle SQL*Trace and TKPROF, because they are an absolutely reliable way to find out how their SQL executed, how long it took, and hence where they should focus their attention to improve performance. However, that requires access to the trace files in the USER_DUMP_DEST directory on the database server, which is something that database administrators are justifiably reluctant to provide. When I am on a customer site, I often experience the same difficulties.

Even if developers can log on to a development database server, trace files on Unix servers will only be readable within the DBA group unless _trace_file_public is set to true.
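If the DBA agrees, trace files can be made world-readable by setting that parameter (a sketch; _trace_file_public is an unsupported underscore parameter, so take advice before setting it):

ALTER SYSTEM SET "_trace_file_public" = TRUE SCOPE=SPFILE;
REM it is a static parameter, so the instance must be restarted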

Another option is to retrieve trace files from the user_dump_dest directory via an external table. The idea is not mine. I've seen this technique demonstrated by Tanel Poder, there is an article by Jared Still that demonstrates how to retrieve the alert log, and there is also a posting on Ask Tom using utl_dir. However, I couldn't find a script to do trace files, so I wrote one.

The script assumes the trace file is created by the current session. The first thing it does is to create a database directory that maps to the USER_DUMP_DEST directory, and an external table that corresponds to the trace file. Then you can just query the trace file in SQL*Plus and spool the output to a local file.

Of course, this also shows that external tables are a significant security risk. The privilege to create and read from directories should be carefully controlled.
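For that reason, the relevant privileges should be granted narrowly, along these lines (the user and grants are illustrative):

REM CREATE ANY DIRECTORY effectively exposes any file readable by the
REM Oracle software owner, so grant it sparingly
REVOKE CREATE ANY DIRECTORY FROM scott;
GRANT READ ON DIRECTORY user_dump_dest TO scott;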

REM user_dump_dest.sql
REM (c) Go-Faster Consultancy Ltd.
REM 30.11.2006 initial version

REM trace the thing you are interested in.
ALTER SESSION SET tracefile_identifier = 'gfctest';
ALTER SESSION SET sql_trace = true;
SELECT * FROM dual;
ALTER SESSION SET sql_Trace = false;

REM determine path for user_dump_dest and create an database directory
set echo off feedback off verify on timi off
column dir new_value dir format a18
column path new_value path format a60
SELECT name dir, value path
FROM v$parameter
WHERE name = 'user_dump_dest'
/
CREATE OR REPLACE DIRECTORY &dir AS '&path';

REM determine the name of the trace file from the server process ID, the database name and the session parameters
column tracefile_name new_value tracefile_name
SELECT LOWER(d.name)||'_ora_'||pr.spid
||DECODE(p.value,'','','_'||p.value) tracefile_name
FROM v$parameter p, v$database d, sys.v_$session s, sys.v_$process pr
,(SELECT sid FROM v$mystat WHERE rownum=1) m
WHERE p.name = 'tracefile_identifier'
AND s.paddr = pr.addr
AND s.sid = m.sid
/

REM create an external table that corresponds to the trace file
DROP TABLE &tracefile_name;
CREATE TABLE &tracefile_name
(trace_line VARCHAR2(4000))
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER DEFAULT DIRECTORY user_dump_dest
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE NOBADFILE NODISCARDFILE NOLOGFILE
FIELDS MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(trace_line CHAR(4000)))
LOCATION ('&tracefile_name..trc')
);

REM just query the trace file back to a local spool file in SQL Plus
set head off pages 0 termout off
spool &tracefile_name..trc
SELECT * FROM &tracefile_name;
spool off
set termout on
DROP TABLE &tracefile_name;

Wednesday, November 01, 2006

Truncating a Table does not affect the Materialized View Log

I am working on a site that is replicating tables between databases using Materialized Views. I have realised that if a replicated table is truncated on the source database, the rows remain in the materialized view on the target until a complete refresh is performed. Worse, if a ROWID-based materialized view is used, the fast refresh process will fail with 'ORA-12034: materialized view log on "SYSADM"."T_R" younger than last refresh'. In PeopleSoft, there are no primary keys, and it is not possible to build them if any of the key columns are nullable, which is the case with non-required date fields in PeopleSoft. Here is an example. I will create two tables and replicate them with Materialized Views, one by primary key and the other by ROWID.
CREATE TABLE t_pk
(a NUMBER
,b VARCHAR2(20)
,CONSTRAINT t PRIMARY KEY(a));

CREATE TABLE t_r
(a NUMBER
,b VARCHAR2(20));

CREATE MATERIALIZED VIEW LOG ON t_pk WITH PRIMARY KEY;
CREATE MATERIALIZED VIEW t_pk_mv REFRESH FAST WITH PRIMARY KEY
AS SELECT * FROM t_pk ;

CREATE MATERIALIZED VIEW LOG ON t_r WITH ROWID;
CREATE MATERIALIZED VIEW t_r_mv REFRESH FAST WITH ROWID
AS SELECT * FROM t_r ;

INSERT INTO t_pk VALUES(1,'Old');
INSERT INTO t_pk VALUES(2,'Old');
INSERT INTO t_r VALUES(1,'Old');
INSERT INTO t_r VALUES(2,'Old');

BEGIN dbms_mview.refresh(list => 'T_PK_MV', method => 'f'); END;
BEGIN dbms_mview.refresh(list => 'T_R_MV', method => 'f'); END;

SELECT * FROM t_pk_mv;
A B
- ---
1 Old
2 Old

SELECT * FROM t_r_mv;
A B
- ---
1 Old
2 Old
So the data has replicated and everything seems to be working fine. Now let's truncate the source tables and put some new data in. One row has the same key value, one does not.
TRUNCATE TABLE t_pk;
TRUNCATE TABLE t_r;

INSERT INTO t_pk VALUES(2,'New');
INSERT INTO t_pk VALUES(3,'New');
INSERT INTO t_r VALUES(2,'New');
INSERT INTO t_r VALUES(3,'New');
The fast refresh of the Materialized View with the primary key appears to work. The new rows are inserted into the Materialized View, replacing existing rows with the same key values, but the old row that was not re-inserted remains.
BEGIN dbms_mview.refresh(list => 'T_PK_MV', method => 'f'); END;
PL/SQL procedure successfully completed.

SELECT * FROM t_pk;
A B
- ---
2 New
3 New

SELECT * FROM t_pk_mv;
A B
- ---
1 Old
2 New
3 New
The fast refresh of the Materialized View by ROWID fails, and the old data remains in place.
BEGIN dbms_mview.refresh(list => 'T_R_MV', method => 'f'); END;
*
ERROR at line 1:
ORA-12034: materialized view log on "SYSADM"."T_R" younger than last refresh
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 803
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 860
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 841
ORA-06512: at line 1
But rows 1 and 2 remain in the materialized view, even though they are no longer in the source table.
SELECT * FROM t_r;
A B
- ---
2 New
3 New

SELECT * FROM t_r_mv;
A B
- ---
1 Old
2 Old
However, a full refresh corrects the discrepancy.
BEGIN dbms_mview.refresh(list => 'T_PK_MV', method => 'c'); END;
BEGIN dbms_mview.refresh(list => 'T_R_MV', method => 'c'); END;

SELECT * FROM t_pk_mv;
A B
- ---
2 New
3 New

SELECT * FROM t_r_mv;
A B
- ---
2 New
3 New
It seems to me that it would be appropriate to prevent a truncate command from executing if there is a materialized view log on the table. I have written such a trigger:
CREATE OR REPLACE TRIGGER mvtrunc_lock
BEFORE TRUNCATE
ON SYSADM.SCHEMA
DECLARE
 e_generate_message EXCEPTION;
 l_recname  VARCHAR2(15 CHAR);
 l_msg      VARCHAR2(100 CHAR) := 'No Message.';
 l_msg2     VARCHAR2(100 CHAR) := 'Cannot '||ora_sysevent
   ||' '||lower(ora_dict_obj_type)
   ||' '||ora_dict_obj_owner||'.'||ora_dict_obj_name||'.  ';

 sql_text ora_name_list_t;
 l_sql_stmt VARCHAR2(1000 CHAR) := '';
 n          INTEGER;
 i          INTEGER;

BEGIN
 /*extract the originating SQL statement into a string variable*/
 n := ora_sql_txt(sql_text);
 FOR i IN 1..n LOOP
  l_sql_stmt := SUBSTR(l_sql_stmt || sql_text(i),1,1000);
 END LOOP;

 IF ora_dict_obj_type = 'TABLE' AND 
    ora_sysevent = 'TRUNCATE' THEN

  BEGIN /*if a materialized view log exists*/
   SELECT 'There is a materialized view log.'
   INTO   l_msg
   FROM   all_mview_logs l
   WHERE  ROWNUM = 1
   AND    l.master = ora_dict_obj_name
   AND    l.log_owner = ora_dict_obj_owner
   ;
   RAISE e_generate_message;
  EXCEPTION
   WHEN NO_DATA_FOUND THEN NULL;
  END;
 END IF;

 EXCEPTION
  WHEN NO_DATA_FOUND THEN NULL;
  WHEN e_generate_message THEN
   RAISE_APPLICATION_ERROR(-20042,
     'MVTRUNC_LOCK:'||l_msg2||l_msg||CHR(10)||'SQL:'||l_sql_stmt);
END;
/

show errors
The trigger MVTRUNC_LOCK can be downloaded from the Go-Faster website.
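For illustration, with the trigger in place, truncating a table with a materialized view log should now fail along these lines (indicative output; the message text is built in the trigger):

TRUNCATE TABLE t_r;

ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-20042: MVTRUNC_LOCK:Cannot TRUNCATE table SYSADM.T_R.  There is a materialized view log.
SQL:TRUNCATE TABLE t_r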

Wednesday, October 25, 2006

DDL Triggers to prevent loss of database objects not managed by PeopleTools

Sometimes you have to use certain database techniques or create database objects on tables that are maintained by PeopleTools, but which themselves are not maintained by PeopleTools. This is often a result of performance tuning activities, where you choose to use features of the Oracle database that PeopleSoft does not use because they are not available on other database platforms.
  • Function Based Indexes: In Oracle it is possible to implement an index on a function. A typical example would be an index on UPPER(NAME) on PS_NAMES to facilitate case
    insensitive searching.
  • Record based auditing can be done with a database DML trigger, instead of the default functionality of the Application Server. This was implemented by PeopleSoft to improve the performance of auditing, and is a rare example of PeopleSoft writing specific code for each platform, because the trigger DDL is slightly different on each.
However, these objects are not created by the DDL scripts built by Application Designer, and can accidentally be lost when the table is altered with an 'alter by recname' script generated by Application Designer. The implications can be quite serious: if the auditing trigger were lost, the application would continue to run, but no audit data would be produced, and no error would be raised.
In Oracle it is possible to build DDL triggers. Just as DML triggers fire when the data is changed, DDL triggers fire when particular DDL commands are issued. I have created a trigger called PSFT_DDL_LOCK (available from the Go-Faster website) that fires when an object is altered or dropped. In certain cases the trigger will raise an error, this causes the original DDL command to fail, and thus prevents loss of the unmanaged objects. If the table related to the object being dropped or altered is not managed by PeopleSoft (if it can't be found in PSRECDEFN), the trigger does not raise any error. Otherwise,
  • If a trigger is being dropped or altered, and the name of that trigger does not start with PSU, then an error is raised. Triggers named PSU% are created by Application Designer for use with Mobile Agents.
  • If an index is dropped or altered, the trigger checks that it is defined in PeopleTools. Indexes that correspond to Unique, Duplicate and Alternate Keys in Application Designer (where the index name is 'PS', followed by either a digit or an underscore, followed by the record name) are ignored.
  • If a table is dropped or altered, the DDL trigger checks that there are no user indexes or triggers that are not defined in PeopleSoft, nor any primary key constraints, materialized views or materialized view logs on the table. It also checks that the table or index is not partitioned, clustered, global temporary or index organised.
If any of the tests fail, then the trigger raises an exception and the DDL statement fails with an error message generated in the trigger. The SQL that generated the error is also included in the error text.
When an error is generated by this trigger during development or migration activities, it usually indicates that there is another database object that you need to consider before issuing the command that errored. It is not simply a matter of disabling the trigger and trying again.
The trigger does have a couple of side effects.
  • There are a number of SQL statements that are run in the trigger, and this does impact the performance of DDL commands. If you are dropping all the tables in a schema, it would be advisable to disable the trigger. The trigger definitely needs the following function-based index on PSRECDEFN, because it looks up the PeopleSoft record from the table name.
CREATE INDEX pszpsrecdefn
ON psrecdefn (DECODE(sqltablename,' ','PS_'||recname,sqltablename))
TABLESPACE PSINDEX PCTFREE 0;
  • If a user index is removed from the PeopleTools definition before it is dropped, the trigger will raise an error. However, in such cases it would be better to keep the definition of the index in Application Designer and set the platform radio button to 'None' so that PeopleTools does not build it. That way a comment can be preserved to explain why the index is no longer necessary.
Updated 2.6.2011: The checking behaviour of the trigger can be disabled for just the current session by calling a packaged procedure that sets a package global variable, which is read by a function called by the trigger. For the one session where this is done, the checks are not performed:
execute psft_ddl_lock.set_ddl_permitted(TRUE);

The behaviour can be re-enabled like this:
execute psft_ddl_lock.set_ddl_permitted(FALSE);

The previous version of this trigger (called T_LOCK) did not have this capability, so I suggested disabling the trigger, but that affected all sessions.
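The mechanism is easy to sketch (a minimal illustration of the idea only; the real PSFT_DDL_LOCK package on the Go-Faster website differs in detail, and the function name here is illustrative):

CREATE OR REPLACE PACKAGE psft_ddl_lock AS
 PROCEDURE set_ddl_permitted(p_permitted IN BOOLEAN);
 FUNCTION ddl_permitted RETURN INTEGER; /*called by the trigger*/
END psft_ddl_lock;
/
CREATE OR REPLACE PACKAGE BODY psft_ddl_lock AS
 g_ddl_permitted BOOLEAN := FALSE; /*package state is private to each session*/

 PROCEDURE set_ddl_permitted(p_permitted IN BOOLEAN) IS
 BEGIN
  g_ddl_permitted := p_permitted;
 END set_ddl_permitted;

 FUNCTION ddl_permitted RETURN INTEGER IS
 BEGIN
  RETURN CASE WHEN g_ddl_permitted THEN 1 ELSE 0 END;
 END ddl_permitted;
END psft_ddl_lock;
/

Because package state is held per session, setting the flag in one session has no effect on any other, which is exactly the behaviour that T_LOCK lacked.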

Thursday, October 19, 2006

Nullable VARCHAR2 columns in PeopleSoft

Since time immemorial, we have been told by PeopleSoft that all numeric, character or required fields in their applications are created as NOT NULL columns in the database. And so we confidently repeat the mantra and build our SQL accordingly.

But it is not completely true!

Long Character fields in PeopleSoft that are not defined as 'required' in Application Designer are nullable in the database. Long Character fields are usually created as LONG columns in the database, unless their length is defined as between 1 and 2000 characters, in which case they are created as VARCHAR2 columns.

So a Long Character field of between 1 and 2000 characters becomes a nullable VARCHAR2 column in the database (it has taken me 10 years to find this, and it's yet another piece of errata in my book!).

The edit and key options in PeopleTools are restricted on a long field, but nonetheless this opens some interesting possibilities, including sparse indexes on nullable VARCHAR2 columns, where you identify rows for a query with non-null flags, and set the flag back to null afterwards, thus removing the entries from the index.
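A sketch of that technique (table, column and index names are hypothetical):

REM a single-column index contains no entries for rows where the key is null
CREATE INDEX pszworklist_flag ON ps_worklist_tbl (process_flag);

REM the driving query finds only the flagged rows via the small index
SELECT emplid FROM ps_worklist_tbl WHERE process_flag = 'Y';

REM setting the flag back to null removes the entries from the index
UPDATE ps_worklist_tbl SET process_flag = NULL WHERE process_flag = 'Y';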

Wednesday, October 11, 2006

What not to Code: The evils of tables as sequence generators and timestamps for uniqueness

This is a story about the dangers of using tables as sequence generators and then the risks of using date/time functions to generate unique keys. In this case, a simple and initially effective change resulted in a time-delayed logic bomb of remarkable subtlety and devastating effect.
The story happens to be about a PeopleSoft system, but it could have happened to any application. It starts with a pair of ERP systems. Changes in the HR system generate cases in the CRM system that the back office HR helpdesk processes. The HR system sends a message to the CRM system, where it may turn into one or more workflow messages. Some HR actions can give rise to many hundreds or even thousands of CRM workflows. For each Workflow, a run control record is created on a table that is keyed on the operator ID and a Run Control ID (an arbitrary ID to identify the control record).
The vanilla version of this code used a sequence number generated, in the usual PeopleSoft platform-agnostic way, from a table. In extreme situations, a batch HR process can generate many thousands of workflows. These messages are processed by the subscription servers of an application server domain. This system is busy enough to have several of these domains, each with several subscription handler processes. So it wasn't long before contention on the row-level lock taken during sequence generation became an issue. It was noticed that adding additional subscription handlers didn't increase throughput, and the database locking also impeded online users who execute the same code.
So a developer changed the code, and now the run control is a string derived by concatenating the operator ID (originally it used just the first three characters of the operator ID) and the current time. In case this run control already exists (usually because a message generated several workflows), the code goes into a loop, generating a new run control from the operator and time each time. Adding more subscription handlers still doesn't increase message throughput; if anything, it makes it worse.
Three more facts to bear in mind:
  1. The time is in the format HH24:MI:SS.
  2. The publish and subscribe application server processes all connect as the operator PSAPPS.
  3. The run control table is not purged.
Function CreateRunControl(...) Returns string

REM &run_cntl_id = String(GetNextNumberWithGaps( RB_WF_RTEKEYMAX.WF_SCHED_NBR_NEXT, 1999999999, 1));
/* workaround to avoid dead locks ... */
&run_cntl_id = %OperatorId | String(%Time);
&chk_rc = "N";
SQLExec("SELECT 'Y' FROM PS_RB_RUN_CNTL_WF WHERE RUN_CNTL_ID = :1", &run_cntl_id, &chk_rc);
While &chk_rc = "Y"
&chk_rc = "N";
&run_cntl_id = %OperatorId | String(%Time);
SQLExec("SELECT 'Y' FROM PS_RB_RUN_CNTL_WF WHERE RUN_CNTL_ID = :1", &run_cntl_id, &chk_rc);
End-While;
...
&return = &recRouteKeys.Insert(); /*Inserts run control*/
So yesterday, when user FRED generated a workflow at 10.27.42, the run control was 'FRED10.27.42'. If today he generates another workflow at 10.27.42, the code will loop until 10.27.43 before it can generate a run control that has not been used before. In fact, Fred can never generate a run control at 10.27.42 ever again. Also, if a single operation generates three run controls, it will take at least two extra seconds to allocate the second and third run controls.
The situation for the message processing application servers is even more severe. They can generate hundreds or even thousands of workflows in an evening, and they all connect as the same operator, so the subscription handlers execute the loop in the code until they reach a time of day at which a message has not been processed before.
On a day that I was on this site, I heard that a worker in the HR call centre had had nothing to do all day, and then at 5.30 in the evening, as she was about to go home, 15 new cases appeared on her list, and she was not happy (call centre workers are measured on the number of cases they process and the length of their queues). Sure enough, we found over 50,000 records for user PSAPPS: 3,600 records per hour (one per second), continuously from 3am to 5.30pm. When we got to 86,400 records (one for every second of the day), processing of workflow from inbound messages would have stopped completely! We also found that some operators had generated over 40 messages in a given minute, and so the save time of some of their operations was starting to be seriously affected.
The correct solution to all of this would be to use an Oracle sequence. You get a unique number every time, and there is no contention on its allocation. In this application, there is no objection to gaps in the sequence.
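A sketch of that approach (the sequence name is illustrative):

CREATE SEQUENCE rb_wf_run_cntl_seq CACHE 1000;

And in the PeopleCode, something like:

/* allocate a guaranteed-unique number; no locking, and gaps don't matter */
SQLExec("SELECT RB_WF_RUN_CNTL_SEQ.NEXTVAL FROM DUAL", &seq_nbr);
&run_cntl_id = %OperatorId | String(&seq_nbr);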
The original code suffered from database locking problems that effectively serialised run control processing. The code change worked well in test, and initially worked fairly well in production, although each message processing application server could only produce a maximum of one workflow per second before it would start to execute the loop, which it could do over 100 times per second. Thus database locking was replaced with high CPU consumption in the application server as it looped, and additional SQL execution and logical I/O in the database, leading to increased latch contention and increased database CPU consumption. As the problem became more severe, other parts of the application also suffered as users waited for their requests to be processed by an available process in the application server. It all amounts to a vicious downward spiral.

Conclusions

  • Oracle recognised the problems of sequence generation, so they created Sequences. You've paid good money for them, so use them.
  • While date and time functions can appear to produce unique values, you need to think carefully about what will happen in a busy multi-user system.

Tuesday, October 10, 2006

Global nVision performance options

Another back door into PeopleSoft: I've been working on nVision recently. In PeopleTools 8.x, the nVision performance options are stored on the tree definition table, PSTREEDEFN (in previous versions they were stored in the layout). The performance options should be set as follows:

  • Use Literal Values: The SQL generated by nVision uses a list of literal values instead of joining to one or more tree selector tables. This simplifies the query by reducing the number of tables queried, although the query text will be much longer because there will be many criteria on literal values selected from those same tree selector tables. The fewer tables in the from clause, the fewer ways the optimizer has to execute the statement incorrectly!
  • Use Static Selectors: nVision queries data from the tree selector tables rather than building an extract of part of the trees. Sometimes nVision doesn't clear out this temporary data, usually when it errors for some reason. Hence, debris builds up in the tree selector tables over time, and this degrades the performance of the queries.
  • Use Between on the Selection Criteria: This doesn't improve performance, but merely shortens the length of the SQL statement.

spool nvperfopts
SELECT T.SETID, T.TREE_NAME, T.EFFDT
, T.TREE_ACC_SELECTOR --static selectors
, X1.XLATSHORTNAME
, T.TREE_ACC_SEL_OPT --between
, X2.XLATSHORTNAME
, T.TREE_ACC_METHOD --literals
, X3.XLATSHORTNAME
FROM PSTREEDEFN T
, PSXLATITEM X1
, PSXLATITEM X2
, PSXLATITEM X3
WHERE (T.TREE_ACC_SELECTOR != 'S'
OR T.TREE_ACC_SEL_OPT != 'B'
OR T.TREE_ACC_METHOD != 'L')
AND X1.FIELDNAME = 'TREE_ACC_SELECTOR'
AND X1.FIELDVALUE = TREE_ACC_SELECTOR
AND X2.FIELDNAME = 'TREE_ACC_SEL_OPT'
AND X2.FIELDVALUE = TREE_ACC_SEL_OPT
AND X3.FIELDNAME = 'TREE_ACC_METHOD'
AND X3.FIELDVALUE = TREE_ACC_METHOD;

/*increment the version numbers*/
UPDATE PSLOCK
SET VERSION = VERSION + 1
WHERE OBJECTTYPENAME IN('SYS','TDM');

UPDATE PSVERSION
SET VERSION = VERSION + 1
WHERE OBJECTTYPENAME IN('SYS','TDM');

/*update nvision flags and version number on trees*/
UPDATE PSTREEDEFN
SET TREE_ACC_SELECTOR = 'S' --static selectors
, TREE_ACC_SEL_OPT = 'B' --between
, TREE_ACC_METHOD = 'L' --literals
, VERSION = (SELECT VERSION
FROM PSLOCK WHERE OBJECTTYPENAME = 'TDM')
WHERE TREE_ACC_SELECTOR != 'S'
OR TREE_ACC_SEL_OPT != 'B'
OR TREE_ACC_METHOD != 'L';

SELECT T.SETID, T.TREE_NAME, T.EFFDT
, T.TREE_ACC_SELECTOR --static selectors
, X1.XLATSHORTNAME
, T.TREE_ACC_SEL_OPT --between
, X2.XLATSHORTNAME
, T.TREE_ACC_METHOD --literals
, X3.XLATSHORTNAME
FROM PSTREEDEFN T
, PSXLATITEM X1
, PSXLATITEM X2
, PSXLATITEM X3
WHERE (T.TREE_ACC_SELECTOR = 'S'
OR T.TREE_ACC_SEL_OPT = 'B'
OR T.TREE_ACC_METHOD = 'L')
AND X1.FIELDNAME = 'TREE_ACC_SELECTOR'
AND X1.FIELDVALUE = TREE_ACC_SELECTOR
AND X2.FIELDNAME = 'TREE_ACC_SEL_OPT'
AND X2.FIELDVALUE = TREE_ACC_SEL_OPT
AND X3.FIELDNAME = 'TREE_ACC_METHOD'
AND X3.FIELDVALUE = TREE_ACC_METHOD;

SETID TREE_NAME EFFDT T XLATSHORTN T XLATSHORTN T XLATSHORTN
----- ------------------ ------------------- - ---------- - ---------- - ----------
QUERY_TREE_WF 00:00:00 01/01/1990 S Static B Between L Literal
QUERY_TREE_PT 00:00:00 01/01/1900 S Static B Between L Literal
QUERY_TREE_OLAP 00:00:00 01/01/1900 S Static B Between L Literal
SPN01 SPANISH_ACCOUNT 00:00:00 01/01/1900 S Static B Between L Literal
SHR03 ALL_PURCHASE_ITEMS 00:00:00 01/01/1900 S Static B Between L Literal
SHR02 ALL_PURCHASE_ITEMS 00:00:00 01/01/1900 S Static B Between L Literal
SHARE VENDOR_TYPE1 00:00:00 01/01/1900 S Static B Between L Literal
SHARE USER_LIMITS 00:00:00 01/01/1900 S Static B Between L Literal
SHARE TR_INSTRUMENTS 00:00:00 01/01/1900 S Static B Between L Literal
SHARE TR_CASH_POSITION 00:00:00 01/01/1900 S Static B Between L Literal
SHARE STOCK_AVAILABILITY 00:00:00 01/01/1900 S Static B Between L Literal
SHARE SALES_PERSON 00:00:00 01/01/1900 S Static B Between L Literal


spool off

Friday, October 06, 2006

SQL Tips for Developers: Performance Improvements with the APPEND hint

In all PeopleSoft products, it is common to find batch programs populating large working storage or reporting tables. When using Oracle RDBMS, the append hint can be used to reduce the overhead and thus improve the performance of the insert operation. Thus

INSERT /*+ APPEND*/ INTO ... SELECT ...

It works by permitting the process to insert the data directly into new data blocks at the end of the object, rather than using space currently allocated to the object. So the new data is appended to the table. This is much faster because it permits Oracle to bypass some of the processing that occurs for each row inserted; in particular, index maintenance is deferred until the end of the load instead of being performed row by row. Significantly, the hint also reduces the volume of redo written.

Every time a piece of data is changed in an Oracle database, the new data is written to the redo logs. This information can then be used to reapply changes to a database during media recovery, or it can be applied to a standby database as part of a disaster recovery solution (Oracle calls this technique 'Physical Data Guard'). Every change, even to working storage tables and including indexes, generates redo information.

The reduction in redo can be very significant in systems using Data Guard across a WAN. Bandwidth constraints can cause a backlog of redo information waiting to be transmitted. In most systems the majority of redo is generated by overnight batch processes, and it can take time during the day for that backlog to clear and for the DR system to catch up with production. Hence, reducing redo is important.

Therefore the APPEND hint should be used in insert statements when both of the following criteria are met:
  1. The table has been created or truncated prior to the insert. It is not sufficient to delete all the rows because the empty blocks remain allocated to the table.
  2. The table should be populated in a single monolithic insert statement. Multiple statements should be combined using the UNION ALL operator between the queries (as sketched below). Otherwise most of the benefit will be lost during multiple index maintenance phases.
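For example, a sketch of two loads combined into a single direct-path insert (table and column names are illustrative):

INSERT /*+ APPEND*/ INTO ps_wrk_tbl
SELECT emplid, effdt FROM ps_src_tbl_a
UNION ALL
SELECT emplid, effdt FROM ps_src_tbl_b;
COMMIT;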
It is easy to demonstrate this with a simple test. I will create a simple table with a unique index, just as PeopleSoft would. Then I will truncate it and populate it, first without the APPEND hint, and then with it.

CREATE TABLE t
(a NUMBER
,b VARCHAR2(200));

CREATE UNIQUE INDEX t ON t(a);

TRUNCATE TABLE t;
INSERT INTO t
SELECT ROWNUM, RPAD('42',200,'.')
FROM dba_objects;

TRUNCATE TABLE t;
INSERT /*+ APPEND*/ INTO t
SELECT ROWNUM, RPAD('42',200,'.')
FROM dba_objects;

The session statistics show the savings in a number of areas.

Session Stats               No Hint Append Hint
------------------------ ---------- -----------
db block changes            202,926       3,192
db block gets               159,148       4,796
db block gets direct                      1,413
db block gets from cache    159,148       3,383
DB time                       1,308         828
redo entries                104,842       1,853
redo size                32,391,936   2,838,148

Fewer blocks are changed because they are not written and then rewritten as each row is inserted. Table blocks are only written once during the insert, and the index is maintained in bulk at the end of the load.

Other recommendations:
  • When using the APPEND hint, the insert should be committed immediately. The table cannot be queried again in the same transaction until the insert is committed, otherwise an error (ORA-12838) will be generated.

  • The data in the table will have been completely replaced by this point, so it is almost certainly appropriate to refresh the optimiser statistics. In Application Engine use the %UpdateStats MetaSQL.

  • The APPEND hint cannot be used in conjunction with the %InsertSelect MetaSQL, because the hint must be placed between the 'INSERT' and 'INTO' keywords, and the MetaSQL generates them together. NB: the statement will not fail if you get the hint wrong; invalid hints are simply ignored.

Thursday, August 17, 2006

PeopleTools Platform Flags on Indexes

I have noticed on databases that have been upgraded to PT8.45 that most of the indexes say they will build on SOME database platforms, when they will actually build on either all of them or none of them.

This is because PSINDEXDEFN still contains flags for platforms that are no longer supported by PeopleSoft/Oracle and can no longer be set individually in Application Designer.
  • PLATFORM_SBS: SQL Base
  • PLATFORM_ALB: Allbase
  • PLATFORM_DB4: DB2/AS400
However, Application Designer still checks the values of these flags, and if any of the nine platform flags have different values, then the SOME radio button is selected, even though all six supported platform flags have the same value.

This is a nuisance because PeopleSoft delivers indexes for specific platforms, and some indexes are suppressed on certain platforms, so it is not easy to see from the indexes view of Application Designer whether a given index will be built. It would be easier if it said ALL or NONE when it means that.

I have fixed this by updating the PeopleTools tables as follows. The first query reports on the indexes where the supported platform flags all have the same value and at least one of the unsupported platform flags is different.

column RECNAME format a15
column INDEXID format a5 heading 'IndexID'
column DDLCOUNT format 999 heading 'DDLCnt'
column CUSTKEYORDER format 9999 heading 'CustKeyOrder'
column KEYCOUNT format 999 heading 'KeyCnt'
column PLATFORM_SBS format 999 heading 'SBS'
column PLATFORM_DB2 format 999 heading 'DB2'
column PLATFORM_ORA format 999 heading 'ORA'
column PLATFORM_INF format 999 heading 'INF'
column PLATFORM_DBX format 999 heading 'DBx'
column PLATFORM_ALB format 999 heading 'ALB'
column PLATFORM_SYB format 999 heading 'SYB'
column PLATFORM_MSS format 999 heading 'MSS'
column PLATFORM_DB4 format 999 heading 'DB4'
column ACTIVEFLAG Format 999 heading 'Active'
column CLUSTERFLAG format 999 heading 'ClustFlag'
column UNIQUEFLAG format 999 heading 'UniqueFlag'
column INDEXTYPE format 999 heading 'IndexType'
column IDXCOMMENTS format a60

spool fixindexdefn
SELECT *
FROM PSINDEXDEFN
WHERE PLATFORM_DB2=PLATFORM_DBX
AND PLATFORM_DBX=PLATFORM_INF
AND PLATFORM_INF=PLATFORM_ORA
AND PLATFORM_ORA=PLATFORM_SYB
AND PLATFORM_SYB=PLATFORM_MSS
AND (PLATFORM_ORA!=PLATFORM_SBS
  OR PLATFORM_ORA!=PLATFORM_ALB
  OR PLATFORM_ORA!=PLATFORM_DB4)
;


                                                 Cust
                Index Index Unique Clust          Key  Key  DDL
RECNAME         ID     Type   Flag  Flag Active Order  Cnt  Cnt  SBS  DB2  ORA  INF  DBx  ALB  SYB  MSS  DB4
--------------- ----- ----- ------ ----- ------ ----- ---- ---- ---- ---- ---- ---- ---- ---- ---- -
IDXCOMMENTS
------------------------------------------------------------
PSGATEWAYDEL    A         4      0     0      1     1    1    0    1    1    1    1    1    1    1    1    0
for clearlisting

PSLOCALEOPTNDFN _         1      1     1      1     0    3    0    1    1    1    1    1    1    1    1    0


PSLOCK          _         1      1     1      1     0    1    0    1    1    1    1    1    1    1    1    0


The following SQL commands set the flags for the unsupported platforms to the value of the supported platforms. The version number on the record definition is updated so that PeopleTools recaches the object.

UPDATE PSVERSION
SET VERSION = VERSION + 1
WHERE OBJECTTYPENAME IN('SYS','RDM');

UPDATE PSLOCK
SET VERSION = VERSION + 1
WHERE OBJECTTYPENAME IN('SYS','RDM');

UPDATE PSRECDEFN
SET VERSION = (
  SELECT VERSION
  FROM PSVERSION
  WHERE OBJECTTYPENAME = 'RDM')
WHERE RECNAME IN (
  SELECT RECNAME
  FROM PSINDEXDEFN
  WHERE PLATFORM_DB2=PLATFORM_DBX
  AND PLATFORM_DBX=PLATFORM_INF
  AND PLATFORM_INF=PLATFORM_ORA
  AND PLATFORM_ORA=PLATFORM_SYB
  AND ( PLATFORM_ORA!=PLATFORM_SBS
    OR PLATFORM_ORA!=PLATFORM_ALB
    OR PLATFORM_ORA!=PLATFORM_DB4)
);

UPDATE psindexdefn
SET PLATFORM_DB4=PLATFORM_ORA
WHERE PLATFORM_DB2=PLATFORM_DBX
AND PLATFORM_DBX=PLATFORM_INF
AND PLATFORM_INF=PLATFORM_ORA
AND PLATFORM_ORA=PLATFORM_SYB
AND PLATFORM_SYB=PLATFORM_MSS
AND PLATFORM_ORA!=PLATFORM_DB4;

UPDATE psindexdefn
SET PLATFORM_ALB=PLATFORM_ORA
WHERE PLATFORM_DB2=PLATFORM_DBX
AND PLATFORM_DBX=PLATFORM_INF
AND PLATFORM_INF=PLATFORM_ORA
AND PLATFORM_ORA=PLATFORM_SYB
AND PLATFORM_SYB=PLATFORM_MSS
AND PLATFORM_ORA!=PLATFORM_ALB;

UPDATE psindexdefn
SET PLATFORM_SBS=PLATFORM_ORA
WHERE PLATFORM_DB2=PLATFORM_DBX
AND PLATFORM_DBX=PLATFORM_INF
AND PLATFORM_INF=PLATFORM_ORA
AND PLATFORM_ORA=PLATFORM_SYB
AND PLATFORM_SYB=PLATFORM_MSS
AND PLATFORM_ORA!=PLATFORM_SBS;

COMMIT;
spool off


The platform flags now say 'ALL'. Not a tremendous change, but at least I can immediately see that these indexes do build on all platforms without having to open each one.

Thursday, June 15, 2006

Additional Batch Timings from Application Engine

I am doing some Application Engine development at the moment as part of a project to replatform to Oracle. I go through an iterative process of changing SQL Server-specific SQL (in a custom AE program) to run on Oracle, and then running the process until it fails again. However, not only must the processes run correctly on Oracle, but the new code needs to be as efficient as possible in order to meet performance targets. So I need to know how long each step in the AE program took to execute.

Application Engine batch timings are very useful metrics when tuning processes, but they are only written when the program ends, and they are only written to the timings tables in the database when a program completes successfully. I am repeatedly restarting the same process, gradually moving towards the end, so I will never get a complete timings report.

However, an Application Engine program writes a status string to the table PSPRCSRQSTSTRNG containing the step about to be executed. The value of this string can be seen in Process Monitor on the Process Detail page.

desc PSPRCSRQSTSTRNG
Name                 Null?    Type
-------------------- -------- ---------------
PRCSINSTANCE         NOT NULL NUMBER(38)
RQSTSTRINGTYPE       NOT NULL VARCHAR2(2)
PRCSRQSTSTRING       NOT NULL VARCHAR2(254)

I created a table to which a trigger logs the steps that are executed. I have used a sequence number to produce a reliable primary key on that table, because values in the string will repeat when Application Engine is in a loop, or when a process is restarted, in which case it keeps the same process instance number. The sequence number is generated by an Oracle sequence, to avoid the locking problems caused by using a table to generate a sequence (as PeopleSoft generally does). I am not worried by gaps in the sequence.

CREATE TABLE dmk_prcsrqststrng_log
(prcsinstance NUMBER NOT NULL
,sequence_nbr NUMBER NOT NULL
,rqststringtype VARCHAR2(2) NOT NULL
,prcsrqststring VARCHAR2(254) /*nullable: a delete is logged with a null string*/
,datestamp DATE NOT NULL
,datetimestamp TIMESTAMP NOT NULL
,CONSTRAINT dmk_prcsrqststrng_log_pk
PRIMARY KEY (prcsinstance, sequence_nbr)
);

CREATE SEQUENCE dmk_prcsrqststrng_seq;

A trigger on PSPRCSRQSTSTRNG writes new values for PRCSRQSTSTRING to the log table. I have used an autonomous transaction in case the update rolls back after an error, and so that I can see the log record before Application Engine commits. I am also storing two timestamps when the audit record is written. The Oracle date type is only accurate to the second, but I have also used the newer timestamp datatype. On Oracle 9.2 (on Windows) I get millisecond accuracy, but on Oracle 10.2 I get microsecond accuracy.
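For example, the difference in precision can be seen with a simple query (illustrative; the fractional precision depends on version and platform):

SELECT TO_CHAR(SYSDATE,'HH24:MI:SS') date_precision
, TO_CHAR(SYSTIMESTAMP,'HH24:MI:SS.FF6') timestamp_precision
FROM dual;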

CREATE OR REPLACE TRIGGER dmk_prcsrqststrng_log
AFTER INSERT OR UPDATE OR DELETE
ON sysadm.psprcsrqststrng
FOR EACH ROW
DECLARE
PRAGMA autonomous_transaction;
BEGIN
IF INSERTING OR UPDATING THEN
INSERT INTO dmk_prcsrqststrng_log
(prcsinstance
,sequence_nbr
,rqststringtype
,prcsrqststring
,datestamp
,datetimestamp)
VALUES
(:new.prcsinstance
,dmk_prcsrqststrng_seq.nextval
,:new.rqststringtype
,:new.prcsrqststring
,SYSDATE
,SYSTIMESTAMP);
ELSIF DELETING THEN
INSERT INTO dmk_prcsrqststrng_log
(prcsinstance
,sequence_nbr
,rqststringtype
,prcsrqststring
,datestamp
,datetimestamp)
VALUES
(:old.prcsinstance
,dmk_prcsrqststrng_seq.nextval
,:old.rqststringtype
,NULL
,SYSDATE
,SYSTIMESTAMP);
END IF;
COMMIT;
EXCEPTION
WHEN OTHERS THEN NULL;
END;
/
show errors

Unfortunately, the status record is not always updated or deleted when the Application Engine program ends. Therefore, another trigger is required to delete the status record when the process updates its run status to show that it is no longer processing.

CREATE OR REPLACE TRIGGER dmk_prcsrqststrng_del
AFTER UPDATE OF runstatus ON sysadm.psprcsrqst
FOR EACH ROW
WHEN (new.runstatus != 7 AND old.runstatus = 7
AND new.prcstype IN ('Application Engine'))
DECLARE
PRAGMA autonomous_transaction;
BEGIN
DELETE FROM psprcsrqststrng WHERE prcsinstance = :new.prcsinstance;
COMMIT;
EXCEPTION WHEN OTHERS THEN NULL;
END;
/
show errors

This table can be used in various ways.

i) A simple report can be written to show the last 10 steps for each process instance, and their duration. I can run this while the program is executing to see how far it has got, and how long the current step has been executing.

COLUMN ranking FORMAT 990 HEADING 'Rank'
COLUMN sequence_nbr FORMAT 990 HEADING 'Seq'
COLUMN prcsrqststring FORMAT a25
COLUMN ratio FORMAT 90.0 HEADING '%'
COLUMN sum_duration FORMAT 999,990.000000 HEADING 'Total|Duration'
COLUMN duration FORMAT 99,990.000000
COLUMN avg_duration FORMAT 99,990.000000 HEADING 'Average|Duration'
COLUMN max_duration FORMAT 99,990.000000 HEADING 'Maximum|Duration'
COLUMN datetimestamp FORMAT a25
COLUMN executions FORMAT 990 HEADING 'Execs'
BREAK ON REPORT ON PRCSINSTANCE SKIP 1

SELECT y.prcsinstance
, y.sequence_nbr
, y.prcsrqststring
, y.datetimestamp
, (TO_DATE(TO_CHAR(y.nextdatetimestamp,'YYYYMMDDHH24MISS'),'YYYYMMDDHH24MISS')
-TO_DATE(TO_CHAR(y.datetimestamp,'YYYYMMDDHH24MISS'),'YYYYMMDDHH24MISS'))*86400
+TO_NUMBER(TO_CHAR(y.nextdatetimestamp,'xFF'))
-TO_NUMBER(TO_CHAR(y.datetimestamp,'xFF')) duration
FROM (
SELECT x.prcsinstance
, x.sequence_nbr
, x.prcsrqststring
, x.datetimestamp
, CASE WHEN x.prcsrqststring IS NULL THEN x.nextdatetimestamp
ELSE NVL(x.nextdatetimestamp,SYSTIMESTAMP)
END as nextdatetimestamp
, x.ranking
FROM (
SELECT x.prcsinstance
, x.sequence_nbr
, x.prcsrqststring
, x.datetimestamp
, LEAD(x.datetimestamp,1) OVER (PARTITION BY x.prcsinstance ORDER BY x.sequence_nbr) AS nextdatetimestamp
, RANK() OVER (ORDER BY x.datetimestamp desc) AS ranking
FROM dmk_prcsrqststrng_log x
) x
) y
WHERE y.ranking <= 10
ORDER BY prcsinstance, sequence_nbr;

PRCSINSTANCE  Seq PRCSRQSTSTRING                 DATETIMESTAMP                   DURATION
------------ ---- ------------------------------ ------------------------- --------------
      242797  688 Main                           15-JUN-06 09.56.49.000721      15.227588
              689 XXXIF_PAY_FI.WRITEOUT.Step01   15-JUN-06 09.57.04.228309      15.233425
              690 XXX_GEN_MD5.MAIN.Step01        15-JUN-06 09.57.19.461734      45.355060
              691 XXXIF_PAY_FI.WRITEOUT.Step03   15-JUN-06 09.58.04.816794   2,738.654582

      242814  681 XXXIF_PAYA.Z_DelOld.?          14-JUN-06 18.00.14.747175       8.575205
              682                                14-JUN-06 18.00.23.322380

      242815  683 Main                           14-JUN-06 18.01.15.606713      19.816770
              684 XXXIF_PAYH.Insert.Ins01        14-JUN-06 18.01.35.423483      15.516303
              685 XXXIF_PAYH.Insert.Ins04        14-JUN-06 18.01.50.939786      15.160971
              686 XXXIF_PAYH.Process.Hire        14-JUN-06 18.02.06.100757     163.917761
              687                                14-JUN-06 18.04.50.018518

ii) I can also aggregate the time for all the steps to determine what steps account for the longest execution time.

SELECT *
FROM ( /*rank lines in report, also calculate ration to report*/
SELECT RANK() OVER (ORDER BY sum_duration DESC) AS ranking
, d.*
, 100*ratio_to_report(sum_duration) OVER () AS ratio
FROM ( /*calculate sum and durations*/
SELECT c.prcsrqststring
, SUM(c.duration) sum_duration
, COUNT(*) executions
, AVG(c.duration) avg_duration
, MAX(c.duration) max_duration
FROM (
SELECT b.prcsrqststring
, (TO_DATE(TO_CHAR(b.nextdatetimestamp
,'YYYYMMDDHH24MISS'),'YYYYMMDDHH24MISS')
-TO_DATE(TO_CHAR(b.datetimestamp
,'YYYYMMDDHH24MISS'),'YYYYMMDDHH24MISS'))*86400
+TO_NUMBER(TO_CHAR(b.nextdatetimestamp,'xFF'))
-TO_NUMBER(TO_CHAR(b.datetimestamp,'xFF')) duration
FROM (
SELECT a.prcsinstance
, a.sequence_nbr
, a.prcsrqststring
, a.datetimestamp
, LEAD(a.datetimestamp,1)
OVER (PARTITION BY a.prcsinstance
ORDER BY a.sequence_nbr) AS nextdatetimestamp
, RANK() OVER (ORDER BY a.datetimestamp desc) AS ranking
FROM dmk_prcsrqststrng_log a
) b
WHERE b.nextdatetimestamp IS NOT NULL
AND b.prcsrqststring IS NOT NULL
) c
GROUP BY c.prcsrqststring
HAVING SUM(c.duration) > 0
) d
) e
WHERE ranking <= 10 ;

                                               Total                  Average        Maximum
Rank PRCSRQSTSTRING                          Duration Execs          Duration       Duration     %
---- ------------------------------ --------------- ----- -------------- -------------- -----
   1 XXXIF_PAYC.ADDR_LD.060           11,509.840101    15     767.322673     980.808154  29.0
   2 XXXIF_PAYC.ADDR_LD.050            7,476.656270    15     498.443751     679.516699  18.8
   3 XXXIF_PAYC.ADDR_LD.020            3,669.344222    14     262.096016     669.283625   9.2
   4 XXXIF_PAYC.ASGN_LD.076            3,076.882638    13     236.683280   2,729.356869   7.7
   5 XXXIF_PAYC.ASGN_LD.010            2,721.361198    15     181.424080     332.187057   6.8
   6 XXXIF_PAYC.PURGE.020              2,644.848544     1   2,644.848544   2,644.848544   6.7
   7 XXXIF_PAYC.ASGN_LD.020              624.892364    15      41.659491     155.971344   1.6
   8 XXXIF_PAYC.ADDR_LD.010              623.354434    14      44.525317     136.919156   1.6
   9 XXXIF_PAYC.ASGN_LD.030              583.222200    10      58.322220     425.172493   1.5
  10 XXXIF_PAYC.ASGN_LD.060              543.455165    14      38.818226     125.473966   1.4

This method, although useful, is not foolproof. The status string is only updated if AE restart is enabled, and then not for every step that is followed by a COMMIT. I haven't yet been able to work out all the criteria for this.

Tuesday, May 16, 2006

Identifying Candidate Records for the Lookup Exclusion Table

Recently, I wrote about the lookup exclusion table, and then I started to think about how to identify automatic lookups that should be suppressed. The following query will identify all the records that are used as lookup records.

SELECT DISTINCT r.recname
FROM pspnlfield p
, psrecfielddb f
, psrecdefn r
WHERE p.fieldname = f.fieldname
AND p.recname = f.recname
AND r.recname = f.edittable

I started on the basis that this is a performance issue. I took a minimalist approach: I would only suppress lookups that were consuming a significant amount of time. So I looked at whether it was possible to use Performance Monitor data to identify long-running lookups. When you capture a PMU, either through sampling or a performance trace, you get a hierarchy of the following transactions when you perform a lookup.
  • 401:Entry and exit for Component search and processing on AppSrv
  • 403:Modal component/secondary page/lookup page from a Level 1
  • 408:All SQL calls excluding PMUs 406 and 407
Note: Transaction 408 is only collected if you have verbose trace enabled for PSAPPSRV. Be aware that this level of tracing can have an impact on performance.

Performance Monitor transactions are held in the table PSPMTRANSHIST. This table can become rather large, so I found it easier to extract the transaction types that are of interest to me into another table, and then analyse that data.


DROP TABLE dmktranshist;

CREATE TABLE dmktranshist PCTFREE 0 as
SELECT pm_trans_defn_id
, pm_instance_id, pm_parent_inst_id, pm_top_inst_id
, pm_context_value1, pm_context_value2, pm_context_value3
, pm_metric_value7
--, pm_addtnl_descr
, pm_trans_duration
FROM pspmtranshist
WHERE pm_trans_defn_set = 1
AND pm_trans_defn_id IN(401,403,408)
AND ( pm_trans_defn_id = 401
OR (pm_trans_defn_id = 403 AND pm_context_value3 = 'Launch Look up Page')
OR (pm_trans_defn_id = 408 AND pm_metric_value7 LIKE 'SELECT PS%' AND SUBSTR(pm_metric_value7,10,1) = '_')
);

CREATE UNIQUE INDEX dmktranshist1
ON dmktranshist(pm_instance_id) PCTFREE 0;

CREATE INDEX dmktranshist2
ON dmktranshist(pm_trans_defn_id, pm_parent_inst_id, pm_metric_value7)
PCTFREE 0 COMPRESS 1;

CREATE INDEX dmktranshist3
ON dmktranshist(pm_trans_defn_id, pm_context_value3)
PCTFREE 0 COMPRESS 1;

ANALYZE TABLE dmktranshist ESTIMATE STATISTICS FOR ALL COLUMNS;


I have constructed two queries to identify component lookups that are accounting for a large amount of response time. This first query simply sums the duration and counts the number of executions of each lookup record. Details of the component and page are also available on the table.

COLUMN pm_instance_id    FORMAT 999999999999999
COLUMN pm_parent_inst_id FORMAT 999999999999999
COLUMN pm_top_inst_id FORMAT 999999999999999
COLUMN component FORMAT a25 HEADING 'Component'
COLUMN page FORMAT a35 HEADING 'Page'
COLUMN pm_context_value3 HEADING 'Context'
ttitle 'Total Time grouped by Lookup Record'
SELECT
-- a.pm_context_value1 component,
-- a.pm_context_value2 page,
a.pm_context_value3
, SUM(b.pm_trans_duration)/1000 duration
, COUNT(*) num_lookups
, SUM(b.pm_trans_duration)/1000/COUNT(*) avg_duration
FROM dmktranshist a
, dmktranshist b
WHERE a.pm_trans_defn_id = 401
AND b.pm_trans_defn_id = 403
AND a.pm_top_inst_id = b.pm_top_inst_id
AND a.pm_instance_id = b.pm_parent_inst_id
AND b.pm_context_value3 = 'Launch Look up Page'
GROUP BY a.pm_context_value3 --,a.pm_context_value1 ,a.pm_context_value2
--HAVING SUM(b.pm_trans_duration)>=50000 /*use this restrict runs to top lookups.*/
ORDER BY 2 DESC
/

Total Time grouped by Lookup Record
Context
--------------------------------------------------------------------
DURATION NUM_LOOKUPS AVG_DURATION
---------- ----------- ------------
Click Prompt Button/Hyperlink for Field DEPOSIT_CONTROL.BANK_CD
.849 1 .849
Click Prompt Button/Hyperlink for Field DEPOSIT_CONTROL.BANK_ACCT_KEY
.319 1 .319
...

The other query additionally retrieves the SQL that was run, but this data is only available if the PMU trace was at Verbose level on the PSAPPSRV process.
ttitle 'SQL run by lookup - only with verbose or debug level tracing'

SELECT a.pm_context_value1 component
, a.pm_context_value2 page
, a.pm_context_value3
, c.pm_metric_value7
, b.pm_trans_duration/1000 duration
FROM dmktranshist a, dmktranshist b, dmktranshist c
WHERE a.pm_trans_defn_id = 401
AND b.pm_trans_defn_id = 403
AND c.pm_trans_defn_id = 408
AND a.pm_top_inst_id = b.pm_top_inst_id
AND b.pm_top_inst_id = c.pm_top_inst_id
AND c.pm_top_inst_id = a.pm_top_inst_id
AND a.pm_instance_id = b.pm_parent_inst_id
AND b.pm_instance_id = c.pm_parent_inst_id
AND b.pm_context_value3 = 'Launch Look up Page'
AND c.pm_metric_value7 LIKE 'SELECT PS%'
AND SUBSTR(c.pm_metric_value7,10,1) = '_'
AND c.pm_instance_id = (
SELECT MAX(c1.pm_instance_id)
FROM dmktranshist c1
WHERE c1.pm_parent_inst_id = c.pm_parent_inst_id
AND c1.pm_top_inst_id = c.pm_top_inst_id
AND c1.pm_metric_value7 LIKE 'SELECT PS%'
AND c1.pm_trans_defn_id = 408
AND SUBSTR(c1.pm_metric_value7,10,1) = '_')
/


Component Page
------------------------- -----------------------------------
Context
--------------------------------------------------------------------
PM_METRIC_VALUE7
--------------------------------------------------------------------
DURATION
----------
PAYMENT_EXPRESS.GBL PAYMENT_EXPRESS1
Click Prompt Button/Hyperlink for Field DEPOSIT_CONTROL.BANK_CD
SELECT PS_BANK_AR_BD_VW A
.849

PAYMENT_EXPRESS.GBL PAYMENT_EXPRESS1
Click Prompt Button/Hyperlink for Field DEPOSIT_CONTROL.BANK_ACCT_KEY
SELECT PS_BANK_AR_D_VW A
.319

Suppressing the automatic lookup saves the user from waiting for an unnecessary query. The question is: what criteria should be used for adding a record to the lookup exclusion table?
  • Certainly, if a lookup record returns more than 300 rows it should not be executed automatically, because only the first 300 rows will be returned; the user will probably not see the row they are looking for, and will have to add search criteria and repeat the search.
  • There is a good case for suppressing the automatic lookup when it returns more than 100 rows. The results will return up to 3 pages of 100 rows per page. Then the user may well have to navigate to a different page in order to find what they are looking for.
  • Having discussed this issue with a few people, there is some support for the suggestion that the automatic search should be suppressed for records with more than 25-50 rows. If there are more rows than this, the user is likely to have to scroll down the page to find the desired row.
The number of rows that a lookup record may retrieve is therefore a good criterion for deciding whether to suppress the automatic lookup. So I have constructed the following SQL script to identify all candidate records and count the number of rows that each returns. First, I need a table in which to store the results.

CREATE TABLE lux_cand
(recname VARCHAR2(15) NOT NULL
,sqltablename VARCHAR2(18)
,rectype NUMBER NOT NULL
,num_lookups NUMBER NOT NULL
,num_rows NUMBER
,lead_key VARCHAR2(18)
,num_lead_keys NUMBER
,lastupddttm DATE )
/

CREATE UNIQUE INDEX lux_cand ON lux_cand(recname)
/


Then all the records that are used as lookup records are copied into the new working storage table.

INSERT INTO lux_cand (recname, sqltablename, rectype, lead_key, num_lookups)
SELECT r.recname
, CASE WHEN r.rectype IN(0,1,6) THEN /*regular records*/
DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename)
END
, r.rectype
, k.fieldname
, count(*) num_lookups
FROM pspnlfield p
, psrecfielddb f
, psrecdefn r
LEFT OUTER JOIN pskeydefn k
ON k.recname = r.recname
AND k.indexid = '_'
AND k.keyposn = 1
WHERE p.fieldname = f.fieldname
AND p.recname = f.recname
AND f.edittable > ' '
AND r.recname = f.edittable
GROUP BY r.recname, r.rectype, k.fieldname
, DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename)
/

COMMIT
/

/*if SETID is a key, then we will group by SETID*/
UPDATE lux_cand l
SET l.lead_key = 'SETID'
WHERE l.lead_key IS NULL
AND EXISTS(
SELECT 'x'
FROM psrecfielddb k
WHERE k.recname = l.recname
AND k.fieldname = 'SETID'
AND MOD(k.useedit,2) = 1 /*bit 1 of useedit is set to one for key columns*/
);

/*store first defined key column*/
UPDATE lux_cand l
SET l.lead_key = (
SELECT k.fieldname
FROM psrecfielddb k
WHERE k.recname = l.recname
AND MOD(k.useedit,2) = 1
AND k.fieldnum = (
SELECT MIN(k1.fieldnum)
FROM psrecfielddb k1
WHERE k1.recname = k.recname
AND MOD(k1.useedit,2) = 1 ))
WHERE l.lead_key IS NULL
;

/*clear sqltablename where table does not exist*/
UPDATE lux_cand l
SET l.sqltablename = ''
WHERE l.sqltablename IS NOT NULL
AND l.rectype = 0
AND NOT EXISTS(
SELECT 'x'
FROM user_tables o
WHERE o.table_name = l.sqltablename) ;

/*clear sqltablename where the view does not exist*/
UPDATE lux_cand l
SET l.sqltablename = ''
WHERE l.sqltablename IS NOT NULL
AND l.rectype IN(1,6)
AND NOT EXISTS(
SELECT 'x'
FROM user_views o
WHERE o.view_name = l.sqltablename)
/

/*deliberately exclude this view from the test, it takes a long time to count the rows and it should be on the lookup exclusion table regardless*/
UPDATE lux_cand l
SET l.lastupddttm = SYSDATE, l.num_rows = 301
WHERE l.recname IN('PSFIELDRECDVW')
AND l.lastupddttm IS NULL
/

COMMIT
/

/*count number of rows in each table*/
DECLARE
l_num_rows INTEGER;
l_lead_keys INTEGER;
BEGIN
FOR l IN (
SELECT l.* FROM lux_cand l
WHERE l.lastupddttm IS NULL
AND l.sqltablename IS NOT NULL
ORDER BY SIGN(l.rectype) /*do tables first*/
, case when 'PS_'||l.recname = l.sqltablename then 0 else 1 end /*tools tables last*/
, l.rectype /*order by rectype*/
, l.recname
) LOOP
l_num_rows := 0;
l_lead_keys := 0;
IF l.lead_key = 'SETID' THEN /*count max rows per setid*/
EXECUTE IMMEDIATE 'SELECT MAX(num_rows), COUNT(*) FROM (SELECT COUNT(*) num_rows FROM '||l.sqltablename||' GROUP BY '||l.lead_key||')' INTO l_num_rows, l_lead_keys;
UPDATE lux_cand
SET num_rows = NVL(l_num_rows,0)
, num_lead_keys = NVL(l_lead_keys,0)
, lastupddttm = SYSDATE
WHERE recname = l.recname;
ELSE /*count number of rows*/
EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM '||l.sqltablename
||' WHERE rownum <= 301' INTO l_num_rows;
UPDATE lux_cand
SET num_rows = NVL(l_num_rows,0)
, lastupddttm = SYSDATE
WHERE recname = l.recname;
END IF;
COMMIT;
END LOOP;
END;
/


Now the table LUX_CAND contains a list of all the tables and views used as lookups, and the number of rows that they return (I cannot do anything about PeopleSoft dynamic views). The next thing is to add any record that returns too many rows to the lookup exclusion table.

/*put records with too many rows into the exclusion table*/
INSERT INTO psrecxl
SELECT recname FROM lux_cand WHERE num_rows > 100
MINUS
SELECT recname FROM psrecxl
/

Don't worry about any negative effects of having a large number of records in the lookup exclusion table. This information is cached by the application server. Although there is no version number on PSRECXL, when you save an update to the lookup exclusion table online, component PeopleCode updates the SYS and PPC version numbers, which causes the application server to recache this table, so that changes take effect immediately. So, when updating PSRECXL with this script, it is also necessary to increment the version numbers.

UPDATE PSVERSION SET VERSION = VERSION + 1 WHERE OBJECTTYPENAME IN('SYS','PPC');
UPDATE PSLOCK SET VERSION = VERSION + 1 WHERE OBJECTTYPENAME IN('SYS','PPC');
COMMIT;

On my Financials 8.4 demo system I found 4089 records being used as lookup records. Of those, 557 return over 300 rows, 1058 return over 100 rows, and 1711 return over 25 rows.
On balance, I think that I would be prepared to add all 1711 records to the lookup exclusion table. It would save unnecessary searches.

Saturday, May 06, 2006

Lookup Exclusion Table

One of the catchphrases that I have adopted (I think from Cary Millsap of Hotsos) is 'the fastest way to do something is not to do it at all'. Instead of making something take less time by tuning, it is much better to make it take no time at all by not doing it in the first place!

This applies to PeopleSoft as much as anything else. In this posting I want to draw attention to the Lookup exclusion table. Lots of fields in PeopleSoft applications have lookups associated with them, indicated by the magnifying glass icon.

When you click on the magnifying glass, you are taken to a lookup dialogue, where normally the search fires automatically. Sometimes the operator must wait a while for the SQL queries to return the data. If there is a lot of data, only 300 rows are retrieved into the component buffer, and only the first 100 rows of that set are shown. Here is an example from Application Hire in HR, illustrated on the demo database.

These results are usually useless to the operator, who must enter criteria into the lookup search dialogue and search again. It would be better if the first automatic search could be suppressed. Fortunately, this can be done by putting the prompt table associated with the field on the lookup exclusion table.
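For reference, an exclusion can also be created directly in SQL, assuming the delivered single-column structure of PSRECXL (the record name here is illustrative, and the version numbers must then be incremented as described in the posting above):

INSERT INTO psrecxl VALUES('PERSONAL_DATA');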

Now the user goes straight to a page that waits for them to enter search criteria to find the relevant data. Even better, the benefit to the user is immediate. You don't have to restart any servers for this configuration setting to be recognised and for the behaviour of the lookup to change.

The lookup exclusion table is a very simple and quick way of improving performance by cutting out unnecessary work. It is also much easier to implement these changes, because they are configuration rather than customisation.

Thursday, April 27, 2006

Measuring Network Latency with the PeopleSoft Performance Monitor Transaction 116

One of the things I am using this blog for is as an addendum to the book. When I wrote the chapter about performance monitor, I had never used it in a real production scenario. Since publication, I have learnt a lot more about it. One of the things that I missed was transaction 116, 'Redirected round trip time (network latency)'.
When you log into the PIA, the browser is redirected to another page. I understand this transaction to measure the time taken for the login page with the redirection tag to go from the web server to the browser, to be processed by the browser, and for the browser to request the page to which it was redirected. Thus it measures the latency of a return trip on the network. This and PeopleSoft Ping are the only metrics PeopleSoft generates that indicate the performance of the network between the web server and a user's workstation.
This transaction holds various pieces of information about the user's PC, including its IP address (although that might be the address of a firewall or load balancer), the user's Session ID on the web server, and the user agent information, from which you can determine the browser and version, and the operating system of their workstation.
So now, if you know where a user is physically located, you may be able to detect a correlation between location and round trip duration. One limitation is that this transaction is only collected when a user logs in, so you may not collect data points when you need them.
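There is no delivered analytic specifically for this transaction, but the raw data can be queried directly; a sketch (the column list is an assumption to be verified against your PeopleTools release):

SELECT h.pm_agentid
, h.pm_trans_duration /*round trip in milliseconds*/
, h.pm_context_value1 /*assumed to hold session/user detail*/
FROM pspmtranshist h
WHERE h.pm_trans_defn_set = 1
AND h.pm_trans_defn_id = 116
ORDER BY h.pm_trans_duration DESC;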

Wednesday, April 26, 2006

Migrating DDL Overrides with Application Designer

Here is a simple tip which (probably because I don't often migrate PeopleSoft projects myself) I think is rather significant.

Ideally, if you are setting non-default storage options such as PCTFREE on tables or indexes, or if you are compressing indexes, then you should use DDL Overrides in PeopleSoft Application Designer to manage these options, so that when you generate a build script it contains the correct options. Otherwise, these settings could be lost when a build script is run.


However, by default Application Designer does not migrate these DDL overrides. When you are about to copy a project, select the copy options, and under the 'General Options' tab select the 'Take DDL from Source' radio button.


Unfortunately, Application Designer does not remember this option, and you need to remember to set it every time you migrate a project.

So now you can put DDL overrides into the development environment and migrate them through into production.
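As a sanity check after a migration, you can compare the stored override parameters between source and target databases. A sketch, assuming (these names should be verified against your PeopleTools release) that the index DDL overrides are held in the PeopleTools table PSIDXDDLPARM:

REM a sketch: list index DDL override parameters (table and column names are assumptions)
SELECT recname, indexid, platformid, parmname, parmvalue
FROM   psidxddlparm
ORDER BY recname, indexid, parmname
/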

Tuesday, April 25, 2006

Performance Tuning the Performance Monitor

I am a big fan of the PeopleSoft Performance Monitor, but the more I use it, the more I discover that it has a few rough edges. Some of the analytic components do not perform well when there is either a large amount of data in the PSPMTRANSHIST table, or a lot of agents in a system. So I have used the Performance Monitor to trace the Performance Monitor components.

You have to configure the Performance Monitor to monitor itself, ignoring all the warning messages. However, you are not interested in the sampled statistics on this database, so disable that monitoring by setting the filter level on all agents to standby. Instead, enable a Performance Trace with either the Verbose or Debug filter level. The component trace indicates which analytic queries take the most time.
I have changed and added the following indexes to the performance tables.

CREATE INDEX PSDPSPMEVENTHIST ON PSPMEVENTHIST
(PM_EVENT_DEFN_SET, PM_EVENT_DEFN_ID, PM_AGENTID, PM_AGENT_DTTM)
TABLESPACE PSINDEX ... PCTFREE 1 COMPRESS 3
/
CREATE INDEX PSCPSPMTRANSHIST ON PSPMTRANSHIST
(PM_TRANS_DEFN_SET, PM_TRANS_DEFN_ID, PM_PERF_TRACE, PM_METRIC_VALUE7
/*,PM_MON_STRT_DTTM*/)
TABLESPACE PSINDEX ... PCTFREE 1 COMPRESS 4
/
CREATE INDEX PSDPSPMTRANSHIST ON PSPMTRANSHIST
(PM_TOP_INST_ID, PM_TRANS_DEFN_SET, PM_TRANS_DEFN_ID)
TABLESPACE PSINDEX ... PCTFREE 1 COMPRESS 3
/


The Performance Monitor tables (PSPMTRANSHIST, PSPMTRANSARCH, PSPMEVENTHIST, PSPMEVENTARCH) are never updated. The performance collator application server process (PSPPMSRV) inserts data into the performance monitor tables, and later the performance monitor archive Application Engine process (PSPM_ARCHIVE) will insert the rows into the %ARCH tables and delete them from the %HIST tables. Therefore, it is sensible to pack the data by minimising free space in these tables, and so reduce both logical and physical I/O. So I have reduced the free space in the data blocks (PCTFREE) to 0% on tables and indexes, and increased PCTUSED on the tables to 99%.
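Bear in mind that new PCTFREE/PCTUSED settings only apply to blocks formatted after the change. To repack the rows that are already there, the segments have to be rebuilt, along these lines (a sketch; the table move leaves the indexes unusable until they are rebuilt):

ALTER TABLE pspmtranshist MOVE PCTFREE 0 PCTUSED 99;
REM the move invalidates the indexes, so rebuild them too
ALTER INDEX pscpspmtranshist REBUILD PCTFREE 1;
ALTER INDEX psdpspmtranshist REBUILD PCTFREE 1;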

I have also found it beneficial to collect histograms on certain columns:

BEGIN
 sys.dbms_stats.gather_table_stats
 (ownname => 'SYSADM'
 ,tabname => 'PSPMTRANSHIST'
 ,estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE
 ,method_opt => 'FOR COLUMNS PM_PERF_TRACE, PM_TRANS_DEFN_ID, PM_METRIC_VALUE7'
 ,cascade => TRUE
 );
END;
/

If you have performance problems with Performance Monitor, remember that you can also use Performance Monitor to analyse its own analytics.

Using DDL Triggers to protect database objects not managed by Application Designer

Sometimes it is necessary to create certain database objects manually, rather than manage them via Application Designer. However, this can introduce some management problems. When an object is altered in Application Designer, it is then necessary to build an alter script that may rebuild the whole table. It is easy for the additional objects to be accidentally lost. There are two main scenarios.
  • Additional and function-based indexes: During the course of performance tuning it is often necessary to build additional indexes. Ideally these indexes should be added via Application Designer. However, a DBA might add an index directly to the production environment, and it may take time to get a project moved into that environment. Furthermore, Application Designer cannot build function-based indexes at all. These are particularly useful for building upper-case indexes on columns, to support case-insensitive searching.
  • DML audit triggers: PeopleSoft delivers a mechanism to build triggers to perform DML auditing. However, if you then rebuild the underlying table, the trigger will be lost, and there is nothing in PeopleSoft to warn you of this, or to audit for this situation.

My solution to both scenarios is to create a DDL trigger to prevent accidental ALTER and DROP commands on objects not defined by PeopleTools. When you do want to alter these objects you can disable the trigger. So you now have a way of controlling when you can and can't drop additional triggers and indexes. You still have to remember to switch the trigger back on again!
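In practice that just means wrapping planned maintenance in something like this (using the trigger name T_LOCK from the listing below):

ALTER TRIGGER t_lock DISABLE;
REM ... make the planned changes to the additional indexes and triggers ...
ALTER TRIGGER t_lock ENABLE;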

CREATE OR REPLACE TRIGGER t_lock 
BEFORE DROP OR ALTER
ON SYSADM.SCHEMA
DECLARE
l_generate_message EXCEPTION;
l_recname VARCHAR2(15 CHAR);
l_msg VARCHAR2(100 CHAR) := 'No Message.';
l_msg2 VARCHAR2(100 CHAR) := 'Cannot '||ora_sysevent||' '||lower(ora_dict_obj_type)||' '||ora_dict_obj_owner||'.'||ora_dict_obj_name;
BEGIN
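-- Tables: look up the record in PSRECDEFN; if the table is not a PeopleTools
-- table, NO_DATA_FOUND propagates to the outer handler and the DDL is allowed.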
IF ora_dict_obj_type = 'TABLE' THEN
SELECT r.recname
INTO l_recname
FROM psrecdefn r
WHERE DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename) = ora_dict_obj_name
;

BEGIN
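-- Block the DDL if any trigger other than the PSU (mobile agent) trigger exists on the table.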
SELECT 'Trigger '||t.trigger_name||' exists on table '||ora_dict_obj_name||'.'
INTO l_msg
FROM all_triggers t
WHERE ROWNUM = 1
AND t.table_name = ora_dict_obj_name
AND t.table_owner = ora_dict_obj_owner
AND t.trigger_name != 'PSU'||l_recname
;
RAISE l_generate_message;
EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
END;

BEGIN
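-- Block the DDL if the table has an index that is not defined in PeopleTools.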
SELECT 'Index '||i.index_name||' on table '||ora_dict_obj_name||' is managed outside PeopleTools.'
INTO l_msg
FROM all_indexes i
WHERE ROWNUM = 1
AND i.table_name = ora_dict_obj_name
AND i.table_owner = ora_dict_obj_owner
AND NOT EXISTS(
SELECT 'x'
FROM psrecdefn r, psrecfielddb p, psrecfield f, psindexdefn j
WHERE DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename) = i.table_name
AND r.recname = p.recname
AND p.fieldname = f.fieldname
AND (f.recname = p.recname_parent
OR (MOD(f.useedit/1,2)=1 /*key*/
OR MOD(f.useedit/2,2)=1 /*dup*/
OR MOD(f.useedit/16,2)=1 /*alt*/))
AND j.recname = p.recname_parent
AND 'PS'||j.indexid||r.recname = i.index_name
)
;
RAISE l_generate_message;
EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
END;

ELSIF ora_dict_obj_type = 'TRIGGER' THEN

BEGIN
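-- Triggers: protect any trigger on a PeopleSoft table, except the PSU trigger built by Application Designer.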
SELECT 'Trigger '||t.trigger_name||' exists on PeopleSoft record '||r.recname||'.'
INTO l_msg
FROM all_triggers t, psrecdefn r
WHERE ROWNUM = 1
AND t.trigger_name = ora_dict_obj_name
AND t.owner = ora_dict_obj_owner
AND t.table_owner = ora_dict_obj_owner
AND DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename) = t.table_name
AND t.trigger_name != 'PSU'||r.recname
;
RAISE l_generate_message;
EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
END;

ELSIF ora_dict_obj_type = 'INDEX' THEN

BEGIN
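-- Indexes: protect any index that is not defined in PeopleTools.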
SELECT 'Index '||i.index_name||' on table '||i.table_name||' is managed outside PeopleTools.'
INTO l_msg
FROM all_indexes i
WHERE ROWNUM = 1
AND i.index_name = ora_dict_obj_name
AND i.owner = ora_dict_obj_owner
AND NOT EXISTS(
SELECT 'x'
FROM psrecdefn r, psrecfielddb p, psrecfield f, psindexdefn j
WHERE DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename) = i.table_name
AND r.recname = p.recname
AND p.fieldname = f.fieldname
AND (f.recname = p.recname_parent
OR (MOD(f.useedit/1,2)=1 /*key*/
OR MOD(f.useedit/2,2)=1 /*dup*/
OR MOD(f.useedit/16,2)=1 /*alt*/))
AND j.recname = p.recname_parent
AND 'PS'||j.indexid||r.recname = i.index_name
)
;
RAISE l_generate_message;
EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
END;

END IF;

EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
WHEN l_generate_message THEN
Raise_application_error(-20042,'T_LOCK: '||l_msg||' '||l_msg2);
END;
/

Notes:

  • The trigger performs a number of test queries on PeopleTools or Oracle Catalogue views. There are different queries depending upon the type of object being ALTERed or DROPped. If a query returns a row then there is a problem, and an exception is raised. The queries also generate part of the error message that is returned.
  • The trigger only affects operations on tables that are specified in PSRECDEFN as type 0 or 7 records. I have not added any handling for the additional instances of a PeopleSoft temporary table.
  • The ora_% attribute functions are described in the Oracle RDBMS documentation in the Application Developer's Guide - Fundamentals (in the 9.2 documentation this is Chapter 16: Working with System Events, Event Attribute Functions).

So here are a few tests. I have created a function-based index and two triggers on PS_RT_RATE_TBL. I have chosen this table for the example because it also has a PeopleSoft-generated trigger for mobile agents.

CREATE INDEX DMK_RT_RATE_TBL 
ON PS_RT_RATE_TBL(UPPER(RT_RATE_INDEX));
Index created.
CREATE OR REPLACE TRIGGER pstrt_rate_tbl
AFTER INSERT OR UPDATE OR DELETE
ON ps_rt_rate_tbl
FOR EACH ROW
BEGIN
raise_application_error(-20042,'No DML on PS_RT_RATE_TBL');
END;
/
Trigger created.
CREATE OR REPLACE TRIGGER psurt_rate_tbl
AFTER INSERT OR UPDATE OR DELETE
ON ps_rt_rate_tbl
FOR EACH ROW
BEGIN
raise_application_error(-20042,'No DML on PS_RT_RATE_TBL');
END;
/
Trigger created.

So the t_lock trigger prevents me from dropping or altering either the function-based index or the table.

DROP TABLE PS_RT_RATE_TBL
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-20042: T_LOCK: Trigger PSTRT_RATE_TBL exists on table PS_RT_RATE_TBL. Cannot DROP table
SYSADM.PS_RT_RATE_TBL
ORA-06512: at line 99

ALTER TABLE PS_RT_RATE_TBL RENAME to DMK
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-20042: T_LOCK: Trigger PSTRT_RATE_TBL exists on table PS_RT_RATE_TBL. Cannot ALTER table
SYSADM.PS_RT_RATE_TBL
ORA-06512: at line 99

DROP INDEX DMK_RT_RATE_TBL
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-20042: T_LOCK: Index DMK_RT_RATE_TBL on table PS_RT_RATE_TBL is managed outside PeopleTools. Cannot DROP index
SYSADM.DMK_RT_RATE_TBL
ORA-06512: at line 99

But I can drop any other index on the table that is maintained by PeopleTools.

DROP INDEX PS_RT_RATE_TBL
Index dropped.

I can't alter any trigger on the table except the PSU trigger that is created by Application Designer for objects that are maintained by mobile agents.

ALTER TRIGGER PSTRT_RATE_TBL DISABLE
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-20042: T_LOCK: Trigger PSTRT_RATE_TBL exists on PeopleSoft record RT_RATE_TBL. Cannot ALTER trigger
SYSADM.PSTRT_RATE_TBL
ORA-06512: at line 99

DROP TRIGGER PSTRT_RATE_TBL
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-20042: T_LOCK: Trigger PSTRT_RATE_TBL exists on PeopleSoft record RT_RATE_TBL. Cannot DROP trigger
SYSADM.PSTRT_RATE_TBL
ORA-06512: at line 99

ALTER TRIGGER PSURT_RATE_TBL DISABLE;
Trigger altered.

DROP TRIGGER PSURT_RATE_TBL;
Trigger dropped.

Tuesday, April 18, 2006

%FirstRows MetaSQL uses FIRST_ROWS instead of FIRST_ROWS(n) hint on Oracle

More MetaSQL madness. The following SQL was identified as a long running query from a PeopleTools SQL trace. Note the FIRST_ROWS hint.

PSAPPSRV.904 1-22827 13.52.01 0.000 Cur#2.904.CRPRD01 RC=0 Dur=0.000 COM Stmt=SELECT /*+ FIRST_ROWS */ CASE_ID, BUSINESS_UNIT, RC_VERTICAL, SETID_CUSTOMER, BO_ID_CUST, ROLE_TYPE_ID_CUST, BO_NAME, BO_ID_CONTACT, ROLE_TYPE_ID_CNTCT, BO_NAME_2, ROLE_TYPE_DESCR, CM_TYPE_ID, PROFILE_CM_SEQ, RC_PURP_TYPE_DESCR, SITE_ID, DESCR, PERSON_PIN, SIN, CASE_CONTACT, RC_CONTACT_NAME, SETID_ENTL, SETID_PRODUCT, SETID_PROV_GRP, PROVIDER_GRP_ID, SECURE_CASE_FLG, CUST_STATUS, EMPL_STATUS, EMPLID, SETID_DEPT, DEPTID, RC_DEPT_DESCR, SETID_LOCATION, LOCATION, RC_LOC_DESCR, PHYSICAL_LOCATION, RC_PHONE, EXTENSION, EMAIL_ADDR, RC_SUMMARY, RC_STATUS, RC_SHORT_DESCR, TO_CHAR(CREATION_DATE,'YYYY-MM-DD'), TO_CHAR(CLOSED_DATE,'YYYY-MM-DD'), COMPETENCY, START_DT, END_DT, RBTACCTID, RBTACCTNO, RCTINSTPHONENO, BO_NAME_3, BO_ID_SITE, ROLE_TYPE_ID_SITE, ADDRESS1, ADDRESS2, ADDRESS3, ADDRESS4, ADDRESS1_AC, ADDRESS2_AC, ADDRESS3_AC, ADDRESS4_AC, CITY, CITY_AC, STATE, POSTAL, COUNTRY, BO_ID_AGR, ROLE_TYPE_ID_AGR
FROM PS_RC_CASE_HD_VW2
WHERE BUSINESS_UNIT = 'HRSUK'
AND RC_CONTACT_NAME = 'Hpolite'
AND RC_VERTICAL = 'HD'
AND MARKET = 'HHD'
ORDER BY CASE_ID DESC

The SQL statement comes from a SQLExec in the DERIVEDRCSEARCH.SEARCH_BUTTON.FieldChange PeopleCode, but it contains the %FirstRows MetaSQL, which controls how the rows are fetched.

…%FirstRows(" String(&rc_case_qry_rows + 51) ") …

The %FirstRows MetaSQL introduces the /*+ FIRST_ROWS */ hint on Oracle. On Microsoft SQL Server it evaluates to TOP(n), causing the query to return only the first n rows of the data set. This is a change in behaviour introduced in PeopleTools 8.46; in previous releases this MetaSQL evaluated to a blank string when PeopleSoft runs on an Oracle database.

The solution to my performance problem was simply to remove the hint. The following table shows timings in SQL*Plus for the statement with and without the hint.

Hint                                     Execution Time (UAT system)  Execution Time (PRD system)
FIRST_ROWS                               69s                          37.08s
No hint, any FIRST_ROWS(n), or ALL_ROWS  5s                           <1s


This hint was deprecated in Oracle 9i, having been replaced with FIRST_ROWS(n). The problem with FIRST_ROWS is that it uses a mix of cost and heuristics to find a best plan for fast delivery of the first few rows. The warning in Oracle's Performance Tuning and Planning manual was that "Using heuristics sometimes leads the CBO to generate a plan with a cost that is significantly larger than the cost of a plan without applying the heuristic. FIRST_ROWS is available for backward compatibility and plan stability".

The problem is that the FIRST_ROWS hint includes some rules that override the usual cost-based behaviour, including that an index can be used to avoid a sort operation, no matter how expensive the path may be. In most situations FIRST_ROWS is simply an inappropriate hint in Oracle 9i.

I have found two support cases (200991709 and 200769258) where this hint was causing a problem, and it is probably in response to these that there is now a workaround available in PeopleTools 8.46. A new parameter, OracleDisableFirstRowsHint, has been added to the Database Options section of both the Application Server configuration file (psappsrv.cfg) and the Process Scheduler configuration file (psprcs.cfg). This flag defaults to 0, and should be set to 1 in order to suppress the hint.
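For example, in psappsrv.cfg (a sketch; check the delivered comments in your own configuration file):

[Database Options]
OracleDisableFirstRowsHint=1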

However, PeopleTools 8.46 is not certified on any release of Oracle prior to 9.2.0.6.0, and the FIRST_ROWS hint was deprecated in Oracle 9i when it was replaced by FIRST_ROWS(n). MetaSQLs are there to produce appropriate platform-specific code. It would be much better if this MetaSQL generated a FIRST_ROWS(n) hint in the first place, perhaps resulting in this:

…Stmt=SELECT /*+ FIRST_ROWS(100)*/ …

Wednesday, April 12, 2006

%CurrentDateIn MetaSQL prevents Oracle CBO from correctly evaluating selectivity of predicate (up to Oracle 9i)

I thought this might be an interesting subject for the first post to a new blog.

Update 6.2.2009: The problem described in this posting is not an issue in Oracle 10g - see Changes in Calculation of Predicate Selectivity in Oracle 10g

I recently discovered the Oracle Cost Based Optimizer choosing a poor execution plan for a particular critical SQL statement in a CRM system, because the expansion of the %CurrentDateIn macro is excessively complicated. The problem occurs in a delivered view PS_RBC_PACKAGE_VW and a custom view PS_XX_RBCPKCLTR_VW. Both views contain a pair of date conditions, which are coded in line with PeopleSoft standards.

AND A.FROM_DATE <= %CurrentDateIn AND A.TO_DATE >= %CurrentDateIn
AND E.FROM_DATE <= %CurrentDateIn AND E.TO_DATE >= %CurrentDateIn

On an Oracle RDBMS, this expands to

AND A.FROM_DATE <= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD') AND A.TO_DATE >= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD')
AND E.FROM_DATE <= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD') AND E.TO_DATE >= TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD')

The result was that this statement took over 15 seconds to execute. However, if the view is recoded as follows

AND A.FROM_DATE <= TRUNC(SYSDATE) AND A.TO_DATE >= TRUNC(SYSDATE)
AND E.FROM_DATE <= TRUNC(SYSDATE) AND E.TO_DATE >= TRUNC(SYSDATE)

then the execution time fell to less than 1 second.

To explain why, I shall use a very simple example that is easy to reproduce. In the following script, I create a table with 1000 rows and a few columns. Column B is just for padding so that the rows are not unrealistically small. Columns C, D and E contain dates. The data is evenly distributed in C and D. In E the distribution is deliberately uneven: there are more dates further in the past.

DROP TABLE t1;
CREATE TABLE t1
(a NUMBER NOT NULL
,b VARCHAR2(2000) NOT NULL
,c DATE NOT NULL
,d DATE NOT NULL
,e DATE NOT NULL);

INSERT INTO t1
SELECT rownum
, RPAD(TO_CHAR(TO_DATE(rownum,'J'),'Jsp'),1000,'.') padd
, SYSDATE-rownum+4.2
, SYSDATE-rownum+42
, SYSDATE-sqrt(rownum)+4.2
FROM dba_objects
WHERE rownum <= 1000 ; 
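Before looking at the plans, gather optimizer statistics on the table so that the CBO knows the column low and high values, and use AUTOTRACE to display the plans without executing the queries. The original script does not show these steps, so this is a sketch:

REM gather statistics so the CBO has column low/high values to work with
BEGIN
 sys.dbms_stats.gather_table_stats(ownname=>user, tabname=>'T1');
END;
/
REM display execution plans in SQL*Plus without running the queries
SET AUTOTRACE TRACEONLY EXPLAIN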

Let's look at the execution plans of a few simple SQL statements. Both of the following queries return 4 rows, but the calculated cardinalities are very different. In the first statement I have used just the simple TRUNC(SYSDATE), and the optimizer has correctly worked out that the query will return 4 rows. However, in the second I have used the expansion of the %CurrentDateIn macro. Because the predicate contains this more complicated function expression, the optimizer falls back on a hard-coded guess that the selectivity of the condition is 5% of the table: 5% of 1000 rows gives the Card=50 seen in the plan.

SELECT * FROM t1 WHERE c > TRUNC(SYSDATE);
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=4 Bytes=4096)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=4 Bytes=4096)

SELECT * FROM t1 WHERE c > TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD');
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=50 Bytes=51200)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=50 Bytes=51200)

If I repeat the queries on column D, the calculated cardinality goes up to 41 (the query will in fact return 42 rows, so that is close), but with the %CurrentDateIn expansion the cardinality is still 50.

SELECT * FROM t1 WHERE d > TRUNC(SYSDATE);
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=41 Bytes=41984)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=41 Bytes=41984)

SELECT * FROM t1 WHERE d > TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD');
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=50 Bytes=51200)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=50 Bytes=51200)

If I now try column E, the calculated cardinality is now 115, although the queries actually return 22 rows. The optimizer makes a mistake because the data values are not evenly distributed between the high and low values for the column.

SELECT * FROM t1 WHERE e > TRUNC(SYSDATE);
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=115 Bytes=117760)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=115 Bytes=117760)

SELECT * FROM t1 WHERE e > TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD');

Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=50 Bytes=51200)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=50 Bytes=51200)

So this test illustrates that the expansion of %CurrentDateIn prevents Oracle's CBO from evaluating the selectivity correctly, and that this causes it to use a guess that is sometimes too high and sometimes too low, but always incorrect. In some cases this will lead to an inappropriate execution plan. If TRUNC(SYSDATE) were used instead, the optimizer would have better information about the SQL and would be able to make better decisions.
I say that this is a bug in PeopleTools. There is, of course, a simple workaround: code TRUNC(SYSDATE) in the application as a customisation where necessary. However, there is no reason why the expansion of this MetaSQL could not be changed and delivered as a PeopleTools patch.