I have been doing some on-line performance tuning on a PeopleSoft Financials system using PeopleSoft Performance Monitor (PPM). End-users have collect verbose PPM traces. Usually, when I use PPM in a production system, all the components are fully cached by the normal activity of the user (except when the application server caches have recently been cleared). However, when working in a user test environment it is common to find that the components are not fully cached. This presents two problems.
- The application servers spend quite a lot of time executing queries on the PeopleTools tables to load the components, pages and PeopleCode into their caches. We can see in the screenshot of the component trace that there is a warning message that component objects are not fully cached, and that these cache misses skew timings.
- In verbose mode, the PPM traces collect a lot of additional transactions capturing executions and fetches against PeopleTools tables. The PPM analytic components cannot always manage the resultant volume of transactions.
Figure 1. Component trace as collected by PPM |
Figure 2. SQL Summary of PPM trace with PeopleTools SQL |
Figure 3: Details of longest PMU with PeopleTools SQL |
Another consequence of the PeopleTools data is that it can take a long time to open the PMU tree. There is no screenshot of the PMU tree here because in this case I had so much data that I couldn't open it before the transaction timed out!
Solution
My solution to this problem is to delete the transactions that relate to PeopleTools SQL and correct the durations, and the number of executions and fetches held in summary transactions. The rationale is that these transactions would not normally occur in significant quantities in a real production system, and there is not much I can do about them when they do.The first step is to clone the trace. I could work on the trace directly, but I want to preserve the original data.
PPM transactions are held in the table PSPMTRANSHIST. They have a unique identifier PM_INSTANCE_ID. A single server round trip, also called a Performance Monitoring Unit (PMU), will consist of many transactions. They can be shown as a tree and each transaction has another field PM_PARENT_INST_ID which holds the instance of the parent. This links the data together and we can use hierarchical queries in Oracle SQL to walk the tree. Another field PM_TOP_INST_ID identifies the root transaction in the tree.
Cloning a PPM trace is simply a matter of inserting data into PSPMTRANSHIST. However, when I clone a PPM trace I have to make sure that the instance numbers are distinct but still link correctly. In my system I can take a very simple approach. All the instance numbers actually collected by PPM are greater than 1016. So, I will simply use the modulus function to consistently alter the instances to be different. This approach may break down in future, but it will do for now.
On an Oracle database, PL/SQL is a simple and effective way to write simple procedural processes. I have written two anonymous blocks of code.
Note that the cloned trace will be purged from PPM like any other data by the delivered PPM archive process.
REM xPT.sql BEGIN --duplicate PPM traces FOR i IN ( SELECT h.* FROM pspmtranshist h WHERE pm_perf_trace != ' ' /*rows must have a trace name*/ -- AND pm_perf_trace = '9b. XXXXXXXXXX' /*I could specify a specific trace by name*/ AND pm_instance_id > 1E16 /*only look at instance > 1e16 so I do not clone cloned traces*/ ) LOOP INSERT INTO pspmtranshist (PM_INSTANCE_ID, PM_TRANS_DEFN_SET, PM_TRANS_DEFN_ID, PM_AGENTID, PM_TRANS_STATUS, OPRID, PM_PERF_TRACE, PM_CONTEXT_VALUE1, PM_CONTEXT_VALUE2, PM_CONTEXT_VALUE3, PM_CONTEXTID_1, PM_CONTEXTID_2, PM_CONTEXTID_3, PM_PROCESS_ID, PM_AGENT_STRT_DTTM, PM_MON_STRT_DTTM, PM_TRANS_DURATION, PM_PARENT_INST_ID, PM_TOP_INST_ID, PM_METRIC_VALUE1, PM_METRIC_VALUE2, PM_METRIC_VALUE3, PM_METRIC_VALUE4, PM_METRIC_VALUE5, PM_METRIC_VALUE6, PM_METRIC_VALUE7, PM_ADDTNL_DESCR) VALUES (MOD(i.PM_INSTANCE_ID,1E16) /*apply modulus to instance number*/ ,i.PM_TRANS_DEFN_SET, i.PM_TRANS_DEFN_ID, i.PM_AGENTID, i.PM_TRANS_STATUS, i.OPRID, SUBSTR('xPT'||i.PM_PERF_TRACE,1,30) /*adjust trace name*/, i.PM_CONTEXT_VALUE1, i.PM_CONTEXT_VALUE2, i.PM_CONTEXT_VALUE3, i.PM_CONTEXTID_1, i.PM_CONTEXTID_2, i.PM_CONTEXTID_3, i.PM_PROCESS_ID, i.PM_AGENT_STRT_DTTM, i.PM_MON_STRT_DTTM, i.PM_TRANS_DURATION, MOD(i.PM_PARENT_INST_ID,1E16), MOD(i.PM_TOP_INST_ID,1E16), /*apply modulus to parent and top instance number*/ i.PM_METRIC_VALUE1, i.PM_METRIC_VALUE2, i.PM_METRIC_VALUE3, i.PM_METRIC_VALUE4, i.PM_METRIC_VALUE5, i.PM_METRIC_VALUE6, i.PM_METRIC_VALUE7, i.PM_ADDTNL_DESCR); END LOOP; COMMIT; END; /
- PeopleTools SQL. Metric value 7 reports the SQL operation and SQL table name. So if the first word is SELECT and the second word is a PeopleTools table name then it is a PeopleTools SQL operation. A list of PeopleTools tables can be obtained from the object security table PSOBJGROUP.
- Implicit Commit transactions. This is easy - it is just transaction type 425.
- Correct transaction duration for any parents of transaction. I work up the hierarchy of transactions and deduct the duration of the transaction that I am deleting from all of the parent.
- Transaction types 400, 427 and 428 all record PeopleTools SQL time (metric 66). When I come to that transaction I also deduct the duration of the deleted transaction from the PeopleTools SQL time metric in an parent transaction.
- Delete any children of the transactions that I delete.
- I must also count each PeopleTools SQL Execution transaction (type 408) and each PeopleTools SQL Fetch transaction (type 414) that I delete. These counts are also deducted from the summaries on the parent transaction 400.
Although this technique of first cloning the whole trace and then deleting the PeopleTools operations can be quite slow, it is not something that you are going to do very often.
REM xPT.sql REM (c)Go-Faster Consultancy Ltd. 2014 set serveroutput on echo on DECLARE l_pm_instance_id_m4 INTEGER; l_fetch_count INTEGER; l_exec_count INTEGER; BEGIN /*now remove PeopleTools SQL transaction and any children and adjust trans durations*/ FOR i IN ( WITH x AS ( /*returns PeopleTools tables as defined in Object security*/ SELECT o.entname recname FROM psobjgroup o WHERE o.objgroupid = 'PEOPLETOOLS' AND o.enttype = 'R' ) SELECT h.pm_instance_id, h.pm_parent_inst_id, h.pm_trans_duration, h.pm_trans_defn_id FROM pspmtranshist h LEFT OUTER JOIN x ON h.pm_metric_value7 LIKE 'SELECT '||x.recname||'%' AND x.recname = upper(regexp_substr(pm_metric_value7,'[^ ,]+',8,1)) /*first word after select*/ WHERE pm_perf_trace like 'xPT%' /*restrict to cloned traces*/ -- AND pm_perf_trace = 'xPT9b. XXXXXXXXXX' /*work on a specific trace*/ AND pm_instance_id < 1E16 /*restrict to cloned traces*/ AND ( x.recname IS NOT NULL OR h.pm_trans_defn_id IN(425 /*Implicit Commit*/)) ORDER BY pm_instance_id DESC ) LOOP l_pm_instance_id_m4 := TO_NUMBER(NULL);
IF i.pm_parent_inst_id>0 AND i.pm_trans_duration>0 THEN FOR j IN( SELECT h.pm_instance_id, h.pm_parent_inst_id, h.pm_top_inst_id, h.pm_trans_defn_id , d.pm_metricid_3, d.pm_metricid_4 FROM pspmtranshist h INNER JOIN pspmtransdefn d ON d.pm_trans_defn_set = h.pm_trans_defn_set AND d.pm_trans_defn_id = h.pm_trans_Defn_id START WITH h.pm_instance_id = i.pm_parent_inst_id CONNECT BY prior h.pm_parent_inst_id = h.pm_instance_id ) LOOP /*decrement parent transaction times*/ IF j.pm_metricid_4 = 66 /*PeopleTools SQL Time (ms)*/ THEN --decrement metric 4 on transaction 400 --dbms_output.put_line('ID:'||i.pm_instance_id||' Type:'||i.pm_trans_defn_id||' decrement metric_value4 by '||i.pm_trans_duration); UPDATE pspmtranshist SET pm_metric_value4 = pm_metric_value4 - i.pm_trans_duration WHERE pm_instance_id = j.pm_instance_id AND pm_trans_Defn_id = j.pm_trans_defn_id AND pm_metric_value4 >= i.pm_trans_duration RETURNING pm_instance_id INTO l_pm_instance_id_m4; ELSIF j.pm_metricid_3 = 66 /*PeopleTools SQL Time (ms)*/ THEN --SQL time on serialisation --dbms_output.put_line('ID:'||i.pm_instance_id||' Type:'||i.pm_trans_defn_id||' decrement metric_value3 by '||i.pm_trans_duration); UPDATE pspmtranshist SET pm_metric_value3 = pm_metric_value3 - i.pm_trans_duration WHERE pm_instance_id = j.pm_instance_id AND pm_trans_Defn_id = j.pm_trans_defn_id AND pm_metric_value3 >= i.pm_trans_duration; END IF; UPDATE pspmtranshist SET pm_trans_duration = pm_trans_duration - i.pm_trans_duration WHERE pm_instance_id = j.pm_instance_id AND pm_trans_duration >= i.pm_trans_duration; END LOOP; END IF; l_fetch_count := 0; l_exec_count := 0; FOR j IN( /*identify transaction to be deleted and any children*/ SELECT pm_instance_id, pm_parent_inst_id, pm_top_inst_id, pm_trans_defn_id, pm_metric_value3 FROM pspmtranshist START WITH pm_instance_id = i.pm_instance_id CONNECT BY PRIOR pm_instance_id = pm_parent_inst_id ) LOOP IF j.pm_trans_defn_id = 408 THEN /*if PeopleTools SQL*/ l_exec_count := l_exec_count + 1; ELSIF j.pm_trans_defn_id = 414 THEN /*if PeopleTools SQL Fetch*/ l_fetch_count := l_fetch_count + j.pm_metric_value3; END IF; DELETE FROM pspmtranshist h /*delete tools transaction*/ WHERE h.pm_instance_id = j.pm_instance_id; END LOOP; IF l_pm_instance_id_m4 > 0 THEN --dbms_output.put_line('ID:'||l_pm_instance_id_m4||' Decrement '||l_exec_Count||' executions, '||l_fetch_count||' fetches'); UPDATE pspmtranshist SET pm_metric_value5 = pm_metric_value5 - l_exec_count , pm_metric_value6 = pm_metric_value6 - l_fetch_count WHERE pm_instance_id = l_pm_instance_id_m4; l_fetch_count := 0; l_exec_count := 0; END IF; END LOOP; END; /
Figure 4: Original and Cloned PPM traces |
When I open the cloned trace, both timings in the duration summary have reduced as have the number of executions and fetches. The durations of the individual server round trips have also reduced.
Figure 5: Component Trace without PeopleTools transactions |
All of the PeopleTools SQL operations have disappeared from the SQL summary.
Figure 6: SQL Summary of PPM trace after removing PeopleTools SQL transactions |
The SQL summary now only has 125 rows of data.
Figure 7: SQL Summary of PMU without PeopleTools SQL |
Now, the PPM tree component opens quickly and without error.
Figure 8: PMU Tree after removing PeopleTools SQL |
There may still be more transactions in a PMU than I can show in a screenshot, but I can now find the statement that took the most time quite quickly.
Figure 9: Long SQL transaction further down same PMU tree |
Conclusions
I think that it is reasonable and useful to remove PeopleTools SQL operations from a PPM trace.In normal production operation, components will mostly be cached, and this approach renders traces collected in non-production environments both usable in the PPM analytic components and more realistic for performance tuning. However, it is essential that when deleting some transactions from a PMU, that summary data held in other transactions in the same PMU are also corrected so that the metrics remain consistent.
No comments :
Post a Comment