Wednesday, October 25, 2006

DDL Triggers to prevent loss of database objects not managed by PeopleTools

Sometimes you have to use certain database techniques, or create database objects on tables that are maintained by PeopleTools, but which are not themselves maintained by PeopleTools. This is often a result of performance tuning activities, where you choose to use features of the Oracle database that PeopleSoft do not use because they are not available on other database platforms.
  • Function-Based Indexes: In Oracle it is possible to implement an index on a function. A typical example would be an index on UPPER(NAME) on PS_NAMES to facilitate case-insensitive searching (see the example index after this list).
  • Record-based auditing can be done with a database DML trigger instead of the default functionality of the Application Server. This was implemented by PeopleSoft to improve the performance of auditing, and is a rare example of PeopleSoft writing platform-specific code, because the trigger DDL differs slightly between platforms.
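For example, the case-insensitive search index on PS_NAMES might be created as follows (the index name and tablespace here are illustrative assumptions, not PeopleSoft standards):
CREATE INDEX psznames
ON ps_names (UPPER(name))
TABLESPACE psindex;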
However, these objects are not created by the DDL scripts built by the Application Designer, and can accidentally be lost when the table is altered with an 'alter by recname' script generated by Application Designer. The implications can be quite serious. If the auditing trigger were to be lost, then the application would continue to run, but no audit data would be produced, and no error would be raised.
In Oracle it is possible to build DDL triggers. Just as DML triggers fire when data is changed, DDL triggers fire when particular DDL commands are issued. I have created a trigger called PSFT_DDL_LOCK (available from the Go-Faster website) that fires when an object is altered or dropped. In certain cases the trigger will raise an error; this causes the original DDL command to fail, and thus prevents loss of the unmanaged objects. If the table related to the object being dropped or altered is not managed by PeopleSoft (if it can't be found in PSRECDEFN), the trigger does not raise any error. Otherwise:
  • If a trigger is being dropped or altered, and the name of that trigger does not start with PSU, then an error is raised. Triggers named PSU% are created by Application Designer for use with Mobile Agents.
  • If an index is dropped or altered, the trigger checks that it is defined in PeopleTools. Indexes that correspond to Unique, Duplicate and Alternate Keys in Application Designer (where the index name is 'PS', followed by either a digit or an underscore, followed by the record name) are ignored.
  • If a table is dropped or altered, the DDL trigger checks that there are no user indexes or triggers not defined in PeopleSoft, nor any primary key constraints, materialized views or materialized view logs on the table. It also checks that the table or index is not partitioned, clustered, global temporary or index organised.
If any of the tests fail, the trigger raises an exception and the DDL statement fails with an error message generated in the trigger. The SQL that caused the error is also included in the error text.
When an error is generated by this trigger during development or migration activities, it usually indicates that there is another database object that you need to consider before issuing the command that errored. It is not simply a matter of disabling the trigger and trying again.
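The general shape of such a DDL trigger is sketched below. This is a minimal illustration of the technique showing only the trigger check, not the actual PSFT_DDL_LOCK code, and the logic is deliberately simplified:
CREATE OR REPLACE TRIGGER psft_ddl_lock_sketch
BEFORE ALTER OR DROP ON DATABASE
DECLARE
  l_recname psrecdefn.recname%TYPE;
BEGIN
  -- only police triggers not named PSU%; the real trigger makes many more checks
  IF ora_dict_obj_type = 'TRIGGER' AND ora_dict_obj_name NOT LIKE 'PSU%' THEN
    BEGIN
      -- is the trigger's table managed by PeopleTools?
      SELECT r.recname INTO l_recname
      FROM   psrecdefn r
      WHERE  DECODE(r.sqltablename,' ','PS_'||r.recname,r.sqltablename) =
             (SELECT t.table_name FROM dba_triggers t
              WHERE  t.owner = ora_dict_obj_owner
              AND    t.trigger_name = ora_dict_obj_name);
      -- raising an exception causes the original DDL statement to fail
      RAISE_APPLICATION_ERROR(-20000, ora_sysevent||' of trigger '
        ||ora_dict_obj_name||' on PeopleSoft record '||l_recname||' is not permitted');
    EXCEPTION
      WHEN NO_DATA_FOUND THEN NULL; -- table not managed by PeopleTools, allow the DDL
    END;
  END IF;
END;
/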
The trigger does have a couple of side effects.
  • There are a number of SQL statements that are run in the trigger, and this does impact the performance of DDL commands. If you are dropping all the tables in a schema, it would be advisable to disable the trigger. The trigger needs the following function-based index on PSRECDEFN, because it must look up the PeopleSoft record from the table name:
CREATE INDEX pszpsrecdefn
ON psrecdefn (DECODE(sqltablename,' ','PS_'||recname,sqltablename))
TABLESPACE PSINDEX PCTFREE 0;
  • If a user index is removed from the PeopleTools definition before it is dropped, the trigger will raise an error. However, in such cases it would be better to keep the definition of the index in Application Designer and set the platform radio button to 'None' so that PeopleTools does not build it. That way a comment can be preserved to explain why the index is no longer necessary.
Updated 2.6.2011: The checking behaviour of the trigger can be disabled for just the current session by calling a packaged procedure that sets a package global variable, which is read by a function called by the trigger. The checks are suspended only for the session that makes this call:
execute psft_ddl_lock.set_ddl_permitted(TRUE);

The behaviour can be re-enabled like this:
execute psft_ddl_lock.set_ddl_permitted(FALSE);

The previous version of this trigger (called T_LOCK) did not have this capability, and so I suggested disabling the trigger, but this affected all sessions.
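
A sketch of how such a package might be structured is shown below. The package and procedure names come from the post, but the body is an illustrative assumption rather than the published code; package state is private to each database session, which is what confines the effect to the current session:

CREATE OR REPLACE PACKAGE psft_ddl_lock AS
  PROCEDURE set_ddl_permitted(p_permitted BOOLEAN);
  FUNCTION  is_ddl_permitted RETURN BOOLEAN;
END psft_ddl_lock;
/

CREATE OR REPLACE PACKAGE BODY psft_ddl_lock AS
  g_ddl_permitted BOOLEAN := FALSE;  -- package global, one copy per session

  PROCEDURE set_ddl_permitted(p_permitted BOOLEAN) IS
  BEGIN
    g_ddl_permitted := p_permitted;
  END set_ddl_permitted;

  FUNCTION is_ddl_permitted RETURN BOOLEAN IS
  BEGIN
    RETURN g_ddl_permitted;  -- called by the DDL trigger before running its checks
  END is_ddl_permitted;
END psft_ddl_lock;
/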

Thursday, October 19, 2006

Nullable VARCHAR2 columns in PeopleSoft

Since time immemorial, we have been told by PeopleSoft that all numeric, character and required fields in their applications are created as NOT NULL columns in the database. And so we confidently repeat the mantra and build our SQL accordingly.

But it is not completely true!

Long Character fields in PeopleSoft that are not defined as 'required' in the Application Designer are nullable in the database. Long Character fields are usually created as LONG columns in the database, unless their length is defined as between 1 and 2000 characters, in which case they are created as VARCHAR2 columns.

So a Long Character field of between 1 and 2000 characters becomes a nullable VARCHAR2 column in the database (it has taken me 10 years to find this, and it's yet another item for the errata in my book!).

The edit and key options in PeopleTools are restricted on a long field, but nonetheless this opens some interesting possibilities, including sparse indexes on nullable VARCHAR2 columns: you identify rows for a query with a non-null flag, and set the flag back to null afterwards, thus removing those rows from the index.
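
Here is a sketch of the sparse index technique, using a hypothetical table and column (Oracle omits a row from a single-column B-tree index when that column is null):

CREATE TABLE ps_my_trans
(trans_id     NUMBER NOT NULL
,process_flag VARCHAR2(1)); -- nullable, as a non-required Long Character field would be

CREATE INDEX pszmy_trans ON ps_my_trans (process_flag);

-- Rows with a NULL flag do not appear in the index at all, so the index
-- stays small and a query on the flag visits only the flagged rows.
UPDATE ps_my_trans SET process_flag = 'Y' WHERE trans_id <= 100;

SELECT trans_id FROM ps_my_trans WHERE process_flag = 'Y';

-- Setting the flag back to NULL removes the rows from the index again.
UPDATE ps_my_trans SET process_flag = NULL WHERE process_flag = 'Y';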

Wednesday, October 11, 2006

What not to Code: The evils of tables as sequence generators and timestamps for uniqueness

This is a story about the dangers of using tables as sequence generators and then the risks of using date/time functions to generate unique keys. In this case, a simple and initially effective change resulted in a time-delayed logic bomb of remarkable subtlety and devastating effect.
The story happens to be about a PeopleSoft system, but it could have happened to any application. It starts with a pair of ERP systems. Changes in the HR system generate cases in the CRM system that the back office HR helpdesk processes. The HR system sends a message to the CRM system, where it may turn into one or more workflow messages. Some HR actions can give rise to many hundreds or even thousands of CRM workflows. For each Workflow, a run control record is created on a table that is keyed on the operator ID and a Run Control ID (an arbitrary ID to identify the control record).
The vanilla version of this code used a sequence number generated, in the usual PeopleSoft platform-agnostic way, from a table. In extreme situations, a batch HR process can generate many thousands of workflows. These messages are processed by the subscription servers of an application server domain. This system is busy enough to have several of these domains, each with several subscription handler processes. So it wasn't long before contention on the row-level lock taken on the sequence generation table became an issue. It was noticed that adding additional subscription handlers didn't increase throughput, and that database locking also impeded online users who would also execute this code.
So a developer changed the code, and now the run control is a string derived by concatenating the operator ID (originally it used just the first three characters of the operator ID) and the current time. In case this run control already exists (usually because a message generated several workflows), the code goes into a loop, generating a new run control from the operator and time each time. Adding more subscription handlers still doesn't increase message throughput; if anything, it makes it worse.
Three more facts to bear in mind:
  1. The time is in the format HH24:MI:SS.
  2. The publish and subscribe application server processes all connect as the operator PSAPPS.
  3. The run control table is not purged.
Function CreateRunControl(...) Returns string

   REM &run_cntl_id = String(GetNextNumberWithGaps( RB_WF_RTEKEYMAX.WF_SCHED_NBR_NEXT, 1999999999, 1));
   /* workaround to avoid dead locks ... */
   &run_cntl_id = %OperatorId | String(%Time);
   &chk_rc = "N";
   SQLExec("SELECT 'Y' FROM PS_RB_RUN_CNTL_WF WHERE RUN_CNTL_ID = :1", &run_cntl_id, &chk_rc);
   While &chk_rc = "Y"
      &chk_rc = "N";
      &run_cntl_id = %OperatorId | String(%Time);
      SQLExec("SELECT 'Y' FROM PS_RB_RUN_CNTL_WF WHERE RUN_CNTL_ID = :1", &run_cntl_id, &chk_rc);
   End-While;
   ...
   &return = &recRouteKeys.Insert(); /*Inserts run control*/
So yesterday, when user FRED generated a workflow at 10.27.42, the run control was 'FRED10.27.42'. If today he generates another workflow at 10.27.42, the code will loop until 10.27.43 before it can generate a run control that has not been used. In fact, Fred can never generate a run control at 10.27.42 ever again. Also, if a single operation generates 3 run controls, it will take at least 2 extra seconds to allocate the second and third run controls.
The situation for the message processing application server is even more severe. It can generate hundreds or even thousands of workflows in an evening. The subscription handlers execute the loop in the code until they reach a time of day at which they have not processed a message before.
On a day that I was on this site, I heard that a worker in the HR call centre had had nothing to do all day, and then at 5.30 in the evening, as she was about to go home, 15 new cases appeared on her list, and she was not happy (call centre workers are measured on the number of cases they process and the length of their queues). Sure enough, we found that there were over 50,000 records for user PSAPPS, 3,600 records per hour continuously from 3am to 5.30pm. When we reached 86,400 records (one for every second of the day), processing of workflow from inbound messages would have stopped completely! We also found that some operators had generated over 40 messages in a given minute, and so the save time of some of their operations was starting to be seriously affected.
The correct solution to all of this would have been to use an Oracle sequence. You get a unique number every time, and there is no contention on its allocation. In this application, there is no objection to gaps in the sequence.
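For example (the sequence name is a hypothetical illustration; CACHE keeps allocation cheap under concurrency, at the cost of gaps, which are acceptable here):
CREATE SEQUENCE rb_wf_run_cntl_seq CACHE 100;
-- each session gets a distinct value with no lock contention between sessions
SELECT rb_wf_run_cntl_seq.NEXTVAL FROM dual;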
The original code suffered from database locking problems that effectively serialised run control processing. The code change worked well in test, and initially worked fairly well in production, although the message processing application servers could only produce a maximum of one workflow per second before they would start to execute the loop, which they could do over 100 times per second. Thus database locking was replaced with high CPU consumption in the application server as it looped, and additional SQL execution and logical I/O in the database, leading to increased latch contention and increased database CPU consumption. As the problem became more severe, other parts of the application also suffered as users waited for their requests to be processed by an available process in the application server. It all amounted to a vicious downward spiral.

Conclusions

  • Oracle recognised the problems of sequence generation, so they created Sequences. You've paid good money for them, so use them.
  • While date and time functions can appear to produce unique values, you need to think carefully about what will happen in a busy multi-user system.

Tuesday, October 10, 2006

Global nVision performance options

Another back door into PeopleSoft: I've been working on nVision recently. In PeopleTools 8.x, the nVision performance options are stored on the tree definition table, PSTREEDEFN (in previous versions, they were stored in the layout). The performance options should be set as follows:

  • Use Literal Values: The SQL generated by nVision uses a list of literal values instead of joining to one or more tree selector tables. This simplifies the query by reducing the number of tables queried, although the statement will be much longer because there will be many criteria on literal values selected from those same tree selector tables. The fewer tables in the from clause, the fewer ways the optimizer has to execute the statement incorrectly! (See the illustrative SQL after this list.)
  • Use Static Selectors: nVision queries data from static tree selector tables rather than building a fresh extract of part of the tree each time. Sometimes nVision doesn't clear out this temporary data, usually when it errors for some reason. Hence, debris builds up in the tree selector tables over time, and this degrades the performance of the queries.
  • Use Between on the Selection Criteria: This doesn't improve performance, but merely shortens the length of the SQL statement.
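
As an illustration only (simplified, with hypothetical values; this shows the shape of the SQL, not actual nVision output), the two forms of query look like this:

-- Without literal values: the ledger is joined to a tree selector table
SELECT SUM(L.POSTED_TOTAL_AMT)
FROM PS_LEDGER L, PSTREESELECT10 S
WHERE S.SELECTOR_NUM = 10423 --hypothetical selector number
AND L.ACCOUNT BETWEEN S.RANGE_FROM_10 AND S.RANGE_TO_10;

-- With literal values: the join is replaced by criteria built from values
-- previously selected from the same selector table
SELECT SUM(L.POSTED_TOTAL_AMT)
FROM PS_LEDGER L
WHERE (L.ACCOUNT BETWEEN '40000' AND '44999'
OR L.ACCOUNT BETWEEN '45000' AND '49999');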

spool nvperfopts
SELECT T.SETID, T.TREE_NAME, T.EFFDT
, T.TREE_ACC_SELECTOR --static selectors
, X1.XLATSHORTNAME
, T.TREE_ACC_SEL_OPT --between
, X2.XLATSHORTNAME
, T.TREE_ACC_METHOD --literals
, X3.XLATSHORTNAME
FROM PSTREEDEFN T
, PSXLATITEM X1
, PSXLATITEM X2
, PSXLATITEM X3
WHERE (T.TREE_ACC_SELECTOR != 'S'
OR T.TREE_ACC_SEL_OPT != 'B'
OR T.TREE_ACC_METHOD != 'L')
AND X1.FIELDNAME = 'TREE_ACC_SELECTOR'
AND X1.FIELDVALUE = T.TREE_ACC_SELECTOR
AND X2.FIELDNAME = 'TREE_ACC_SEL_OPT'
AND X2.FIELDVALUE = T.TREE_ACC_SEL_OPT
AND X3.FIELDNAME = 'TREE_ACC_METHOD'
AND X3.FIELDVALUE = T.TREE_ACC_METHOD;

/*increment the version numbers*/
UPDATE PSLOCK
SET VERSION = VERSION + 1
WHERE OBJECTTYPENAME IN('SYS','TDM');

UPDATE PSVERSION
SET VERSION = VERSION + 1
WHERE OBJECTTYPENAME IN('SYS','TDM');

/*update nvision flags and version number on trees*/
UPDATE PSTREEDEFN
SET TREE_ACC_SELECTOR = 'S' --static selectors
, TREE_ACC_SEL_OPT = 'B' --between
, TREE_ACC_METHOD = 'L' --literals
, VERSION = (SELECT VERSION
FROM PSLOCK WHERE OBJECTTYPENAME = 'TDM')
WHERE TREE_ACC_SELECTOR != 'S'
OR TREE_ACC_SEL_OPT != 'B'
OR TREE_ACC_METHOD != 'L';

SELECT T.SETID, T.TREE_NAME, T.EFFDT
, T.TREE_ACC_SELECTOR --static selectors
, X1.XLATSHORTNAME
, T.TREE_ACC_SEL_OPT --between
, X2.XLATSHORTNAME
, T.TREE_ACC_METHOD --literals
, X3.XLATSHORTNAME
FROM PSTREEDEFN T
, PSXLATITEM X1
, PSXLATITEM X2
, PSXLATITEM X3
WHERE (T.TREE_ACC_SELECTOR = 'S'
OR T.TREE_ACC_SEL_OPT = 'B'
OR T.TREE_ACC_METHOD = 'L')
AND X1.FIELDNAME = 'TREE_ACC_SELECTOR'
AND X1.FIELDVALUE = T.TREE_ACC_SELECTOR
AND X2.FIELDNAME = 'TREE_ACC_SEL_OPT'
AND X2.FIELDVALUE = T.TREE_ACC_SEL_OPT
AND X3.FIELDNAME = 'TREE_ACC_METHOD'
AND X3.FIELDVALUE = T.TREE_ACC_METHOD;

SETID TREE_NAME          EFFDT               T XLATSHORTN T XLATSHORTN T XLATSHORTN
----- ------------------ ------------------- - ---------- - ---------- - ----------
      QUERY_TREE_WF      00:00:00 01/01/1990 S Static     B Between    L Literal
      QUERY_TREE_PT      00:00:00 01/01/1900 S Static     B Between    L Literal
      QUERY_TREE_OLAP    00:00:00 01/01/1900 S Static     B Between    L Literal
SPN01 SPANISH_ACCOUNT    00:00:00 01/01/1900 S Static     B Between    L Literal
SHR03 ALL_PURCHASE_ITEMS 00:00:00 01/01/1900 S Static     B Between    L Literal
SHR02 ALL_PURCHASE_ITEMS 00:00:00 01/01/1900 S Static     B Between    L Literal
SHARE VENDOR_TYPE1       00:00:00 01/01/1900 S Static     B Between    L Literal
SHARE USER_LIMITS        00:00:00 01/01/1900 S Static     B Between    L Literal
SHARE TR_INSTRUMENTS     00:00:00 01/01/1900 S Static     B Between    L Literal
SHARE TR_CASH_POSITION   00:00:00 01/01/1900 S Static     B Between    L Literal
SHARE STOCK_AVAILABILITY 00:00:00 01/01/1900 S Static     B Between    L Literal
SHARE SALES_PERSON       00:00:00 01/01/1900 S Static     B Between    L Literal


spool off

Friday, October 06, 2006

SQL Tips for Developers: Performance Improvements with the APPEND hint

In all PeopleSoft products, it is common to find batch programs populating large working storage or reporting tables. When using the Oracle RDBMS, the APPEND hint can be used to reduce the overhead and thus improve the performance of the insert operation. Thus:

INSERT /*+ APPEND*/ INTO ... SELECT ...

It works by permitting the process to insert the data directly into new data blocks at the end of the object, rather than using space currently allocated to the object. So the new data is appended to the table. This is much faster because it permits Oracle to bypass some of the processing that occurs for each row inserted, such as index maintenance, which is instead deferred until the end of the insert and performed in bulk. Significantly, the hint also reduces the volume of redo written.

Every time a piece of data is changed in an Oracle database, the new data is written to the redo logs. This information can then be used to reapply changes to a database during media recovery, or it can be applied to a standby database as part of a disaster recovery solution (Oracle call this technique 'Physical DataGuard'). Every change, even to working storage tables and including indexes, generates redo information.

The reduction in redo can be very significant in systems using DataGuard across a WAN. Bandwidth constraints can cause a backlog of redo information waiting to be transmitted. In most systems the majority of redo is generated by overnight batch processes and it can take time during the day for that backlog to clear and for the DR system to catch up with production. Hence, reducing redo is important.

Therefore the APPEND hint should be used in insert statements when both of the following criteria are met:
  1. The table has been created or truncated prior to the insert. It is not sufficient to delete all the rows because the empty blocks remain allocated to the table.
  2. The table should be populated in a single monolithic insert statement. Multiple statements should be combined using the UNION ALL operator between the queries, as sketched below; otherwise most of the benefit will be lost during multiple index maintenance phases.
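For example, a sketch with hypothetical source and target tables, combining two result sets into a single direct-path insert:

INSERT /*+ APPEND*/ INTO ps_wrk_totals
SELECT business_unit, SUM(posted_amt) FROM ps_src_ledger GROUP BY business_unit
UNION ALL
SELECT business_unit, SUM(posted_amt) FROM ps_src_budget GROUP BY business_unit;
COMMIT;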
It is easy to demonstrate this with a simple test. I will create a simple table with a unique index, just as PeopleSoft would. Then I will truncate it and populate it, first without the APPEND hint, and then with it.

CREATE TABLE t
(a NUMBER
,b VARCHAR2(200));

CREATE UNIQUE INDEX t ON t(a);

TRUNCATE TABLE t;
INSERT INTO t
SELECT ROWNUM, RPAD('42',200,'.')
FROM dba_objects;

TRUNCATE TABLE t;
INSERT /*+ APPEND*/ INTO t
SELECT ROWNUM, RPAD('42',200,'.')
FROM dba_objects;

The session statistics show the savings in a number of areas.

Session Stats                No Hint  Append Hint
------------------------  ---------- -----------
db block changes              202,926       3,192
db block gets                 159,148       4,796
db block gets direct                        1,413
db block gets from cache      159,148       3,383
DB time                         1,308         828
redo entries                  104,842       1,853
redo size                  32,391,936   2,838,148

Fewer blocks are changed because they are not written and then rewritten as each row is inserted. Table blocks are only written once during the insert, and the index is only maintained, in bulk, when the insert completes.

Other recommendations:
  • When using the APPEND hint, the insert should be committed immediately. It must be committed before the table can be referenced again in the same session, otherwise an error (ORA-12838) will be generated.

  • The data in the table will have been completely replaced by this point, so it is almost certainly appropriate to refresh the optimiser statistics. In Application Engine, use the %UpdateStats MetaSQL.

  • The APPEND hint cannot be used in conjunction with the %InsertSelect MetaSQL, because the hint MUST be placed between the 'insert' and 'into' keywords, and the MetaSQL generates both keywords together. NB: the statement will not fail if you get the hint wrong, as illustrated below.
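
A minimal illustration of correct and incorrect placement, using the test table created above (a misplaced hint is treated as an ordinary comment and silently ignored):

INSERT /*+ APPEND*/ INTO t SELECT ROWNUM, '42' FROM dual;   -- direct-path insert
COMMIT;
INSERT INTO /*+ APPEND*/ t SELECT ROWNUM+1, '42' FROM dual; -- hint ignored, conventional insert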