Saturday, October 21, 2017

nVision Performance Tuning: 2. Effects of Performance Options

This blog post is part of a series that discusses how to get optimal performance from PeopleSoft nVision reporting as used in General Ledger.

In this post, I examine the effect of different settings for the tree performance options on how the SQL is generated.  It is common, to see different performance options in use on different trees in the same SQL query.  It is important to be able to look at a piece of SQL generated by nVision and to be able to work out which performance options have been set on which trees.
  • Access Method: join to tree selector –v- use literal values
  • Tree Selectors: statics –v- dynamic
  • Selector Options: single values –v- inequalities/between

Access Method

This option allows you control how the tree selector table is combined with the ledger table.  The choices of interest are whether to join the tree selector, or whether to use literal values where feasible.

Join

Here, two tree selectors are joined to the ledger table
SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT)
FROM PS_LEDGER A, PSTREESELECT10 L4, … PSTREESELECT10 L 
WHERE A.LEDGER='ACTUALS'
AND A.FISCAL_YEAR=2017 
…
AND L.SELECTOR_NUM=73130 
AND A.CHARTFIELD1=L.RANGE_FROM_10
AND A.CURRENCY_CD='GBP'
GROUP BY L4.TREE_NODE_NUM
It is not feasible to use literal values for L4 because the data is grouped by L4.TREE_NODE, however, it is feasible to use literal values for L because no column from this table is included in either the SELECT or GROUP BY clauses.

Literal Values

When 'use literal values' is selected nVision replaces the join to the tree selector table with a set of literal values generated from the contents of the tree selector table.
In this example, the join to the CHARTFIELD1 tree selector has been replaced with a series of literal value predicates.  There is one for every selected tree leaf.  A simple equality condition for single value leaves, and a BETWEEN condition for ranged leaves.
SELECT L2.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT)
FROM PS_LEDGER A, PSTREESELECT10 L2
WHERE A.LEDGER='ACTUALS'
AND A.FISCAL_YEAR=2017
AND A.ACCOUNTING_PERIOD=1
…
AND L2.SELECTOR_NUM=101142
AND A.ACCOUNT=L2.RANGE_FROM_10
AND (A.CHARTFIELD1='0070700'
OR A.CHARTFIELD1='0078999'
…
OR A.CHARTFIELD1 BETWEEN '0091100' AND '0091101'
OR A.CHARTFIELD1='0150204')
AND A.CURRENCY_CD='GBP'
GROUP BY L2.TREE_NODE_NUM

Recommendations

In general, the literal values option is very beneficial.
  • It simplifies the queries generated by nVision by removing a table from the from clause and leaves the optimizer with fewer choices when determining the execution plan.
  • Data is filtered on the scan of the ledger table rather than when it is joined to the tree selector.  On an engineered system these predicates can be pushed down to the storage cells.
However, there are some disadvantages
  • As the number of leaves on a tree increase, so the number of literal predicates in the query increases, and the time taken to parse the SQL increases.  As a rough guideline, I would say that the parse time starts to become significant as the tree exceeds 2000 leaves, and you might be better using the Join Option.
  • Also, as the number of literal predicates increase the time taken to evaluate them increases.  This applies to both conventional and engineered systems.
  • Some PeopleSoft customers have found it generally beneficial to the system to raise the value of OPTIMIZER_DYNAMIC_SAMPLING from the default of 2 to 4 so that Oracle collects dynamic statistics if there are expressions in where clause predicates or multiple predicates on the same table (see Database SQL Tuning Guide -> Managing Optimizer Statistics).  However, this can significantly increase the parse overhead of nVision SQL using many predicates.  If necessary, this parameter can be reset at session level for nVision with a trigger.
  • The nVision client also has to do more work to generate the SQL.
  • Literal values increase the variability of SQL generated, reports using different tree nodes will generate SQL with different numbers of predicates and there is effectively no chance that any two statements would be similar enough to use any of Oracle's plan stability technologies (SQL baselines, SQL profiles, SQL plan directives).
  • In theory, cursor sharing could reduce the parse, but in my experience is that the optimizer does not correctly cost the predicates resulting in different execution plans with poorer performance.

Tree Selectors

This option allows you to choose whether a tree selector is managed in a static or dynamic manner.  As you will see, the Tree Selectors performance option is tied in with the Selector Options performance option.

Static

Where a tree uses a static selector, the entire tree is extracted into the tree selector.  There will be no tree node number criteria in the extract statement.
INSERT INTO PSTREESELECT10
(SELECTOR_NUM,TREE_NODE_NUM,RANGE_FROM_10,RANGE_TO_10) 
SELECT 78722,L.TREE_NODE_NUM, SUBSTR(L.RANGE_FROM,1,10),SUBSTR(L.RANGE_TO,1,10)
FROM PSTREELEAF L 
WHERE L.SETID='GLOBE' 
AND L.SETCNTRLVALUE=' ' 
AND L.TREE_NAME='FUNCTION' 
AND L.EFFDT=TO_DATE('1901-01-01','YYYY-MM-DD')
Note that the tree leaf was not joined to the underlying detail table.  Ranged leaves become ranged selectors.  This is because there is no mechanism to determine when the detail table has changed and thus when the static selector would be out of date and need to be refreshed.  This is why single value joins cannot be performed with static selectors.
Tree node number criteria will appear in the nVision queries to identify the required selectors.
SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT) 
FROM PS_LEDGER A, PSTREESELECT10 L4, PSTREESELECT10 L2, PSTREESELECT10 L 
WHERE A.LEDGER='ACTUALS' 
AND A.FISCAL_YEAR=2017 
AND A.ACCOUNTING_PERIOD=1 
AND L4.SELECTOR_NUM=89052 
AND A.CHARTFIELD3>= L4.RANGE_FROM_10 
AND A.CHARTFIELD3 <= L4.RANGE_TO_10 
AND L4.TREE_NODE_NUM BETWEEN 1000000000 AND 2000000000
A control row is maintained on the table PSTREESELCTL.  This row indicates that a particular selector is static, and the version number is compared to the version number of PSTREEDEFN to determine whether the selector is valid, or whether the tree has changed.  If the tree has changed it will have a higher version number, and the selector will be extracted again.  The selector number from the query can be looked up on this table to identify the tree.

Dynamic

Dynamic selectors are built on-the-fly by nVision as the report runs.  They select just the rows from the tree that are required by the report.
INSERT INTO PSTREESELECT10
(SELECTOR_NUM,TREE_NODE_NUM,RANGE_FROM_10,RANGE_TO_10) 
SELECT DISTINCT 108090,L.TREE_NODE_NUM,D.ACCOUNT,D.ACCOUNT FROM
PS_GL_ACCOUNT_TBL D, PSTREELEAF L 
WHERE L.SETID='GLOBE' AND L.SETCNTRLVALUE=' ' 
AND L.TREE_NAME='ACCOUNT' 
AND L.EFFDT=TO_DATE('1901-01-01','YYYY-MM-DD') 
AND L.RANGE_FROM<>L.RANGE_TO 
AND D.ACCOUNT BETWEEN L.RANGE_FROM AND L.RANGE_TO 
AND D.SETID='GLOBE' AND
(L.TREE_NODE_NUM BETWEEN 789473386 AND 793372019 
OR L.TREE_NODE_NUM BETWEEN 810915873 AND 812865189 
…
OR L.TREE_NODE_NUM BETWEEN 1089668204 AND 1095516154 
OR L.TREE_NODE_NUM BETWEEN 1113060008 AND 1115009324)
All the tree node number predicates are found in the query that populates the selector table and not in the query that generates the report.
SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT) 
FROM PS_LEDGER A, PSTREESELECT10 L4, PSTREESELECT10 L2, PSTREESELECT10 L 
WHERE A.LEDGER='S_USMGT' 
AND A.FISCAL_YEAR=2017 
AND A.ACCOUNTING_PERIOD=1 
…
AND L.SELECTOR_NUM=96774 
AND A.CHARTFIELD1=L.RANGE_FROM_10 AND
A.CURRENCY_CD='USD' GROUP BY L4.TREE_NODE_NUM
nVision will delete dynamic selectors after they have been used.  However, if the report crashes or times out, rows will be left in the selector table, and this debris can build up over time.  So there is a requirement to manage these tables.
Note that in this example when the selector was extracted, the tree leaf table was joined to the underlying table on which the tree was based (in this case GL_ACCOUNT_TBL).  This occurs when the single value join option is selected.  The tree selector contains a row for each value rather than for each tree leaf.  This will result in larger tree selectors where ranged leaves are used.
PeopleSoft do not provide any way to identify the tree used to create a dynamic selector.  From the SQL query, all we can see is the field to which it was joined.  The above example is a tree related to CHARTFIELD1.

Recommendations

Although static selectors are easier to manage, the decision to use dynamic selectors is driven by the need to use single value joins.
  • It is important to maintain up-to-date statistics on the selector tables.  As new selectors are extracted, a series of ever-increasing selector numbers are allocated from a sequence maintained on the table PSTREESELNUM.   If not maintained, the selector numbers will exceed the high value recorded in the column statistics for column SELECTOR_NUM, and the database will not correctly calculate the number of rows expected, resulting in inappropriate execution plans.
  • It is easier to manage optimizer statistics on static selectors because the PSTREESELCTL table records when they have been updated.  In fact, it is possible to put a trigger on that table to maintain statistics.  However, static selectors imply range predicates that bring other problems that I discuss in the next section.
  • Using compound triggers on the selector tables it is possible to maintain a log of selectors, maintain statistics on them and purge selectors left behind by nVision reports that have not cleaned up after themselves.  This is also used to identify the trees used in dynamic selectors.  I will describe this solution later.
Recommendation: Prior to Oracle 11g, I have recommended using static selectors and ranged joins, and by careful management of statistics on the tree selector and indexing of the ledger tables good results can be obtained.  The tree selectors are Cartesian joined together before using index lookups on the ledger tables.
However, this approach is generally not effective for queries that reference more than 3 tree selector, nor for very large trees.  The size of the Cartesian product is the product of the number of rows extracted from each tree selectors table.  For example, three selectors with 100 rows each will result in 1 million lookups on the ledger table.  As the size of the Cartesian product grows the number of index look-ups also grows.  There comes a point when it is better to remove the least selective tree from the Cartesian and hash joining it after the looking up ledger.  Striking the correct balance can be difficult, and leads to using different performance options in different reports.
From Oracle 11g, and especially on engineered systems, I now prefer dynamic selectors because they permit the use of single value joins that can use Bloom filters and Extended Statistics.  Although, where smaller trees are in use that do not significantly cut down the query result set, range joins may still be preferable, in which cases, so might static selectors.

Selector Options

This performance option controls how the tree selector is joined to the ledger table.  The choice is between single value equality joins or inequality joins.

Ranged Joins

There are two forms of ranged joins, either using the BETWEEN keyword or using a pair of inequalities.  There is no difference to Oracle because the optimizer rewrites BETWEEN as a pair of inequalities.
NB: If you specify single value joins on a static selector you still get an inequality ranged join because only the tree leaves are extracted to the selector.  .  nVision still does this even if all of the leaves in the tree are single values and not ranges.
SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT) 
FROM PS_LEDGER A, PSTREESELECT10 L4, PSTREESELECT10 L2, PSTREESELECT10 L 
WHERE A.LEDGER='ACTUALS' 
AND A.FISCAL_YEAR=2017 
AND A.ACCOUNTING_PERIOD=1 
AND L4.SELECTOR_NUM=89052 
AND A.CHARTFIELD3>= L4.RANGE_FROM_10 
AND A.CHARTFIELD3 <= L4.RANGE_TO_10 
…
AND A.CURRENCY_CD='GBP' 
GROUP BY L4.TREE_NODE_NUM

Single Value Joins

If single value joins are selected, nVision generates an equality join between the tree selector and the ledger table.  The tree selector must be dynamic, and nVision extracts all the intermediate values between the tree node ranges on the underlying detail table.
SELECT L4.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT) 
FROM PS_LEDGER A, PSTREESELECT10 L4, PSTREESELECT10 L2, PSTREESELECT10 L 
WHERE A.LEDGER='ACTUALS' 
AND A.FISCAL_YEAR=2017 
AND A.ACCOUNTING_PERIOD=1 
…
AND L.SELECTOR_NUM=96774 
AND A.CHARTFIELD1=L.RANGE_FROM_10 
AND A.CURRENCY_CD='GBP' 
GROUP BY L4.TREE_NODE_NUM

Recommendations

In general, I would recommend using single value joins and therefore also dynamic selectors.  The main advantage is that they enable two particular optimisations in Oracle.
  • Bloom filters only work with equality predicates, and therefore only with single value joins.  They enable a filter on one column of one table to be roughly mapped to another column of another table upon which there is an equality join.  This filters data earlier in a query, cutting down the size of the eventual join operation.  It is a rough match in that it may return some false positives to the subsequent hash operation, but it will not omit any results.  The hash join then does an exact match of the rows to produce the correct result set.  On an Engineered Oracle system this filter can also be pushed down to the storage cells. 
  • This is an example of the kind of execution plan that we should see when a Bloom filter is used.  A filter has been created on PSTREESELECT10 at line 3, and it is applied to PS_LEDGER at line 6.
-------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                    |       |       | 12876 (100)|          |       |       |
|   1 |  HASH GROUP BY                           |                    |   501 | 35571 | 12876  (42)| 00:00:01 |       |       |
|   2 |   HASH JOIN                              |                    |   975K|    66M| 12822  (41)| 00:00:01 |       |       |
|   3 |    JOIN FILTER CREATE                    | :BF0000            |  2577 | 59271 |     7   (0)| 00:00:01 |       |       |
|   4 |     PARTITION RANGE SINGLE               |                    |  2577 | 59271 |     7   (0)| 00:00:01 | 87968 | 87968 |
|   5 |      INDEX FAST FULL SCAN                | PS_PSTREESELECT10  |  2577 | 59271 |     7   (0)| 00:00:01 | 87968 | 87968 |
|   6 |    JOIN FILTER USE                       | :BF0000            |  1715K|    78M| 12804  (41)| 00:00:01 |       |       |
|   7 |     PARTITION RANGE ITERATOR             |                    |  1715K|    78M| 12804  (41)| 00:00:01 |    27 |    34 |
|   8 |      MAT_VIEW REWRITE ACCESS STORAGE FULL| PS_LEDGER          |  1715K|    78M| 12804  (41)| 00:00:01 |    27 |    34 |
-------------------------------------------------------------------------------------------------------------------------------
  • Extended statistics, (i.e. on groups of columns) also only work with equality predicates.  Single value join queries can, therefore, benefit from them.
There are drawbacks to be overcome.
  • Single value joins require the use dynamic selectors.  That requires management of statistics on the dynamic selectors, and debris left in the tree selectors table from failed reports that have not cleaned up after themselves.
  • Single value joins can result in larger tree selector tables with many more rows than tree leaves.  For large trees or trees with many values in leaf ranges, resulting in more than about 2000 rows on the tree selector tables, it can be advantageous to change the Access Method performance option from using literal values to join because the parse overhead becomes significant.

No comments :