I have been doing some on-line performance tuning on a PeopleSoft Financials system using PeopleSoft Performance Monitor (PPM). End-users have collect verbose PPM traces. Usually, when I use PPM in a production system, all the components are fully cached by the normal activity of the user (except when the application server caches have recently been cleared). However, when working in a user test environment it is common to find that the components are not fully cached. This presents two problems.
- The application servers spend quite a lot of time executing queries on the PeopleTools tables to load the components, pages and PeopleCode into their caches. We can see in the screenshot of the component trace that there is a warning message that component objects are not fully cached, and that these cache misses skew timings.
- In verbose mode, the PPM traces collect a lot of additional transactions capturing executions and fetches against PeopleTools tables. The PPM analytic components cannot always manage the resultant volume of transactions.
|Figure 1. Component trace as collected by PPM|
|Figure 2. SQL Summary of PPM trace with PeopleTools SQL|
|Figure 3: Details of longest PMU with PeopleTools SQL|
Another consequence of the PeopleTools data is that it can take a long time to open the PMU tree. There is no screenshot of the PMU tree here because in this case I had so much data that I couldn't open it before the transaction timed out!
SolutionMy solution to this problem is to delete the transactions that relate to PeopleTools SQL and correct the durations, and the number of executions and fetches held in summary transactions. The rationale is that these transactions would not normally occur in significant quantities in a real production system, and there is not much I can do about them when they do.
The first step is to clone the trace. I could work on the trace directly, but I want to preserve the original data.
PPM transactions are held in the table PSPMTRANSHIST. They have a unique identifier PM_INSTANCE_ID. A single server round trip, also called a Performance Monitoring Unit (PMU), will consist of many transactions. They can be shown as a tree and each transaction has another field PM_PARENT_INST_ID which holds the instance of the parent. This links the data together and we can use hierarchical queries in Oracle SQL to walk the tree. Another field PM_TOP_INST_ID identifies the root transaction in the tree.
Cloning a PPM trace is simply a matter of inserting data into PSPMTRANSHIST. However, when I clone a PPM trace I have to make sure that the instance numbers are distinct but still link correctly. In my system I can take a very simple approach. All the instance numbers actually collected by PPM are greater than 1016. So, I will simply use the modulus function to consistently alter the instances to be different. This approach may break down in future, but it will do for now.
On an Oracle database, PL/SQL is a simple and effective way to write simple procedural processes. I have written two anonymous blocks of code.
Note that the cloned trace will be purged from PPM like any other data by the delivered PPM archive process.
Now I will work on the cloned trace. I want to remove certain transaction.
- PeopleTools SQL. Metric value 7 reports the SQL operation and SQL table name. So if the first word is SELECT and the second word is a PeopleTools table name then it is a PeopleTools SQL operation. A list of PeopleTools tables can be obtained from the object security table PSOBJGROUP.
- Implicit Commit transactions. This is easy - it is just transaction type 425.
- Correct transaction duration for any parents of transaction. I work up the hierarchy of transactions and deduct the duration of the transaction that I am deleting from all of the parent.
- Transaction types 400, 427 and 428 all record PeopleTools SQL time (metric 66). When I come to that transaction I also deduct the duration of the deleted transaction from the PeopleTools SQL time metric in an parent transaction.
- Delete any children of the transactions that I delete.
- I must also count each PeopleTools SQL Execution transaction (type 408) and each PeopleTools SQL Fetch transaction (type 414) that I delete. These counts are also deducted from the summaries on the parent transaction 400.
Although this technique of first cloning the whole trace and then deleting the PeopleTools operations can be quite slow, it is not something that you are going to do very often.
Now, I have a second PPM trace that I can open in the analytic component.
|Figure 4: Original and Cloned PPM traces|
When I open the cloned trace, both timings in the duration summary have reduced as have the number of executions and fetches. The durations of the individual server round trips have also reduced.
|Figure 5: Component Trace without PeopleTools transactions|
All of the PeopleTools SQL operations have disappeared from the SQL summary.
|Figure 6: SQL Summary of PPM trace after removing PeopleTools SQL transactions|
The SQL summary now only has 125 rows of data.
|Figure 7: SQL Summary of PMU without PeopleTools SQL|
Now, the PPM tree component opens quickly and without error.
|Figure 8: PMU Tree after removing PeopleTools SQL|
There may still be more transactions in a PMU than I can show in a screenshot, but I can now find the statement that took the most time quite quickly.
|Figure 9: Long SQL transaction further down same PMU tree|
ConclusionsI think that it is reasonable and useful to remove PeopleTools SQL operations from a PPM trace.
In normal production operation, components will mostly be cached, and this approach renders traces collected in non-production environments both usable in the PPM analytic components and more realistic for performance tuning. However, it is essential that when deleting some transactions from a PMU, that summary data held in other transactions in the same PMU are also corrected so that the metrics remain consistent.