Thursday, November 30, 2017

nVision Performance Tuning: 11. Excel -v- OpenXML

This blog post is part of a series that discusses how to get optimal performance from PeopleSoft nVision reporting as used in General Ledger.

The general objective of the performance tuning changes described in this series of blog posts has been not only to improve the performance of individual nVision reports, but also to allow many reports to execute concurrently.
However, if you use Excel 2010 or later, you may notice that run times are significantly longer than with Excel 2007.  Also, from PeopleTools 8.54, Excel 2007 is no longer certified.
The problem is discussed in Oracle support note E-NV: nVision Performance using Microsoft Office 2010 and above (Doc ID 1537491.1).  Essentially, from Excel 2010 onwards, Excel only runs single-threaded.  Only one Excel nVision process that is not waiting for a database call to return can be running on any one Windows server at any one time.  So, if you want to run 10 concurrent nVision reports, you would need one on each of 10 process schedulers on 10 different Windows servers.
From PT8.54, OpenXML is the default and preferred engine for executing nVision reports on the process scheduler.  This uses a different PeopleSoft executable (PSNVSSRV).  It does not suffer from the single-threading problem, so multiple processes can run concurrently.  It can also be run on non-Windows environments.
However, there are some limitations with OpenXML:
  • Excel macros are ignored during report generation, although macros can be put into a layout that will execute when the generated report is subsequently opened in Excel.
  • There are problems with nPlosion.  
  • Any print area set in the layout is lost.
  • When rerunning nVision to file, any pre-existing file is not overwritten.
Therefore, it may be necessary to continue to run some nVision reports on Excel.  This would require:
  • Separate process schedulers configured to run Excel rather than OpenXML on each available Windows server.  Excel is selected by setting the variable Excel Automation = 1 in the nVision section of the process scheduler configuration file (psprcs.cfg), as shown in the example after this list.  
  • A new Excel nVision process type should be configured to run specific layouts or reportbooks on Excel.  
  • That new process type should only run on these additional process schedulers.  It should have a maximum concurrency of 1, or at most 2, on each Process Scheduler.  These schedulers should be configured to run this new process type (and a single Application Engine so that the purge process can run).
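For reference, a minimal sketch of the relevant fragment of psprcs.cfg on one of the Excel-only process schedulers might look like the following.  The parameter name is as described above; the exact section layout, comments and defaults vary by PeopleTools release, so treat this only as an illustration.
[nVision]
;Use Excel automation rather than the OpenXML engine on this scheduler
Excel Automation=1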

Tuesday, November 28, 2017

nVision Performance Tuning: 10. Maintaining Statistics on Non-Partitioned Tree Selector Tables

This blog post is part of a series that discusses how to get optimal performance from PeopleSoft nVision reporting as used in General Ledger.

If you do not partition your selector tables, and if you predominantly use static selectors, then you can maintain the statistics on the selector tables as new static selectors are created or updated.  PeopleTools table PSTREESELCTL is used to track static selectors.  It is keyed on selector number.  A row is inserted when a new selector is created, and the version number is updated when an existing selector is re-extracted.  Therefore, a trigger on this table can be used to submit a job to refresh the statistics on the corresponding tree selector table.
CREATE OR REPLACE TRIGGER sysadm.pstreeselector_stats
BEFORE INSERT OR UPDATE ON sysadm.pstreeselctl
FOR EACH ROW
DECLARE
  l_jobno      NUMBER;
  l_cmd        VARCHAR2(1000);
  l_table_name VARCHAR2(18);
  l_suffix     VARCHAR2(2);
BEGIN
  l_table_name := 'PSTREESELECT'||LTRIM(TO_CHAR(:new.length,'00'));
  l_suffix     := SUBSTR(l_table_name,-2);
  l_cmd := 'dbms_stats.gather_table_stats(ownname=>user,tabname=>'''||l_table_name||''',force=>TRUE);'
       ||'dbms_stats.set_column_stats(ownname=>user,tabname=>'''||l_table_name||''',colname=>''RANGE_FROM_'||l_suffix||''',density=>1,force=>TRUE);'
       ||'dbms_stats.set_column_stats(ownname=>user,tabname=>'''||l_table_name||''',colname=>''RANGE_TO_'||l_suffix||''',density=>1,force=>TRUE);';
  dbms_output.put_line(l_cmd);
  dbms_job.submit(l_jobno,l_cmd); --the job only becomes runnable when the triggering transaction commits
EXCEPTION WHEN OTHERS THEN NULL; --deliberately suppress all errors so that tree maintenance is never blocked
END;
/
When all the selectors exist in a single segment, the optimizer will not correctly calculate the cardinality of the expressions in the nVision queries on the tree selector tables.  Therefore, I have found it necessary to manually set the density on the range columns to 1 in order to get effective execution plans at least some of the time.
N.B. This trigger should not be used if the selector tables are partitioned.


Friday, November 24, 2017

nVision Performance Tuning: 9. Using Compression without the Advanced Compression Licence

This blog post is part of a series that discusses how to get optimal performance from PeopleSoft nVision reporting as used in General Ledger.

Table compression can significantly decrease the size of tables and reduce the volume of I/O required to retrieve data.  Compression of the ledger, ledger budget, and summary ledger tables can significantly improve the performance of scans in nVision.
The Advanced Compression licence enables compression features to be used where segments are still updated by the application.  Without this licence, only basic table compression can be used; Hybrid Columnar Compression (HCC) is also available without the licence, but only on an engineered system.  Neither form of compression can be used for online transaction processing.  A table that is marked for compression, but that is populated in conventional path mode, will not be compressed, and any compressed blocks will no longer be compressed after they are updated.
However, in most systems, ledger rows cease to be updated within a month or two of the period closing.  The ledger tables will usually be partitioned by fiscal year and accounting period, so a month after the period has closed the corresponding partition will cease to be updated.  Therefore, it could then be compressed.
I usually compress historical partitions when I introduce partitioning.  The following example comes from an engineered system, so Hybrid Columnar Compression has been used.
  • There is one range partition each for the whole of fiscal years 2014 and 2015.  This data is historical and rarely queried, and when it is queried it is not queried by a single period, so the more aggressive QUERY HIGH compression has been used.  
  • Monthly partitioning is used for the previous fiscal year, 2016.  These partitions are compressed, but using QUERY LOW which should deliver better performance with lower CPU overhead than QUERY HIGH (although usually there isn't much in it).
  • Partition LEDGER_2017_05 is not compressed because it was current at the time of building this script and could still be updated by the application.
CREATE TABLE sysadm.ps_ledger
(...)
TABLESPACE GLLARGE
PCTFREE 10 PCTUSED 80
PARTITION BY RANGE(FISCAL_YEAR,ACCOUNTING_PERIOD)
SUBPARTITION BY LIST (LEDGER)
(PARTITION ledger_2014 VALUES LESS THAN (2015,0) PCTFREE 0 COMPRESS FOR QUERY HIGH
(SUBPARTITION ledger_2014_actuals VALUES ('ACTUALS')
…
,SUBPARTITION ledger_2014_z_others VALUES (DEFAULT)
)
,PARTITION ledger_2015 VALUES LESS THAN (2016,0) PCTFREE 0 COMPRESS FOR QUERY HIGH
(SUBPARTITION ledger_2015_actuals VALUES ('ACTUALS')
…
,SUBPARTITION ledger_2015_z_others VALUES (DEFAULT)
)
,PARTITION ledger_2016_bf VALUES LESS THAN (2016,1) PCTFREE 0 COMPRESS FOR QUERY LOW
(SUBPARTITION ledger_2016_bf_actuals VALUES ('ACTUALS')
…
,SUBPARTITION ledger_2016_bf_z_others VALUES (DEFAULT)
)
…
,PARTITION ledger_2017_05 VALUES LESS THAN (2017,6)
(SUBPARTITION ledger_2017_05_actuals VALUES ('ACTUALS')
…
,SUBPARTITION ledger_2017_05_z_others VALUES (DEFAULT)
)
…
)
ENABLE ROW MOVEMENT
PARALLEL
NOLOGGING
/
As periods close, more partitions become candidates for compression.  However, it may not be practical to take an outage to rebuild ledger partitions each month, and it certainly isn't practical to rebuild the entire ledger table every month.  Instead, from Oracle 12.1, partitions can be rebuilt in place as an online operation.  The compression attribute can only be specified on the partition, and then the sub-partitions can be rebuilt.  The data in the partition can still be read while it is being moved.
ALTER TABLE ps_ledger MODIFY PARTITION ledger_2017_05 PCTFREE 0 COMPRESS FOR QUERY LOW;
ALTER TABLE ps_ledger MOVE SUBPARTITION ledger_2017_05_actuals  ONLINE PARALLEL 32;
…
ALTER TABLE ps_ledger MOVE SUBPARTITION ledger_2017_05_z_others ONLINE PARALLEL 32;
NB: Moving a partition online can fail in database versions 12.1.0.1 and later due to bug 2070300, documented in MOS Doc ID 2040742.1.
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00932: inconsistent datatypes: expected NUMBER got BINARY
The bug is fixed in 12.2 and can be addressed in 12.1 by patch 2070300.
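Once the sub-partitions have been rebuilt, the compression attributes can be verified in the data dictionary.  This is just a quick sanity check that I find useful; run it as the table owner.  COMPRESSION and COMPRESS_FOR are standard columns on this view.
SELECT   partition_name, subpartition_name, compression, compress_for
FROM     user_tab_subpartitions
WHERE    table_name = 'PS_LEDGER'
ORDER BY partition_name, subpartition_position
/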

Monday, November 20, 2017

nVision Performance Tuning: 8. Interval Partitioning and Statistics Maintenance of Tree Selector Tables

This blog post is part of a series that discusses how to get optimal performance from PeopleSoft nVision reporting as used in General Ledger.

The decision to use interval partitioning on the tree selector tables came from the need to have accurate statistics for the optimizer when parsing nVision queries.  It is not possible to introduce hints into nVision SQL.  The dynamic nature of the code means it is not viable to consider any of the forms of database plan stability across the whole application (although it may be possible to use SQL Profiles in limited cases).  Therefore, as far as possible, the optimizer has to choose the best plan on its own.  Without accurate optimizer statistics, I have found that the optimizer will usually not choose to use a Bloom filter.
If the selector tables are not partitioned, then each table will usually contain rows for many different selectors.  Even with perfectly up-to-date statistics, including a histogram on SELECTOR_NUM and extended statistics on SELECTOR_NUM and RANGE_FROM_nn, I have found that Oracle miscosts the join between RANGE_FROM_nn and the corresponding attribute on the ledger table.
I propose that the tree selector tables should be interval partitioned such that each selector goes into its own partition.
CREATE TABLE PSYPSTREESELECT10 
(SELECTOR_NUM INTEGER NOT NULL,
 TREE_NODE_NUM INTEGER NOT NULL,
 RANGE_FROM_10 VARCHAR2(10) NOT NULL,
 RANGE_TO_10   VARCHAR2(10) NOT NULL) 
PARTITION BY RANGE (selector_num) INTERVAL (1)
(PARTITION VALUES LESS THAN(2))
TABLESPACE "PTWORK"
STORAGE(INITIAL 128K NEXT 128K)
/
INSERT INTO PSYPSTREESELECT10 
(       SELECTOR_NUM, TREE_NODE_NUM, RANGE_FROM_10, RANGE_TO_10)
SELECT  SELECTOR_NUM, TREE_NODE_NUM, RANGE_FROM_10, RANGE_TO_10
FROM PSTREESELECT10
/
DROP TABLE PSTREESELECT10 
/
ALTER TABLE PSYPSTREESELECT10 RENAME TO PSTREESELECT10
/
  • nVision queries will reference a single selector with a literal value, and therefore Oracle will eliminate all but that single partition at parse time and will use the statistics on that partition to determine how to join it to other tables (see the simplified example after this list).
  • Statistics only have to be maintained at partition level, and not at table level. 
  • Now that there is only a single selector number in any one partition, there is no need for extended statistics. 
  • Dynamic selectors are needed in order to get equality joins between the selectors and the ledger tables, and so make use of the Bloom filter, but this means that statistics on the selector tables would inevitably be out of date.  Therefore, the PL/SQL triggers and package that log the selector usage are also used to maintain statistics on the partition. 
  • Partitions do not have to be created in advance. They will be created automatically by Oracle as they are required by the application. 
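For illustration, this is a greatly simplified query of the general shape that nVision produces with a dynamic selector and single value join; the real statements have many more criteria, and the column list and literal values here are only examples.  Because the selector number is a literal, the optimizer prunes to a single interval partition at parse time and costs the join with that partition's statistics.
SELECT /* simplified nVision-style example, not generated code */
       l.account, SUM(l.posted_total_amt)
FROM   ps_ledger l
,      pstreeselect10 s
WHERE  s.selector_num = 10233        /*literal selector number => parse-time partition pruning*/
AND    l.account = s.range_from_10   /*single value equality join from the dynamic selector*/
AND    l.ledger = 'ACTUALS'
AND    l.fiscal_year = 2017
GROUP BY l.account
/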

Compound Triggers on Tree Selector Tables 

There are a pair of compound DML triggers on each tree selector table, one for insert and one for delete; a simplified sketch of the insert trigger follows this list.
  • The after row section captures the current selector number.  The insert trigger also counts the number of rows and tracks the minimum and maximum values of the RANGE_FROM_nn and RANGE_TO_nn columns. 
  • The after statement section updates the selector log. The insert trigger directly updates the statistics on the partition, including the minimum and maximum values of the range columns.
    • It is not possible to collect statistics in a trigger in the conventional manner because dbms_stats includes an implicit commit. If dbms_stats was called within an autonomous transaction it could not see the uncommitted insert into the tree selector that fired the trigger. Hence the trigger calls the XX_NVISION_SELECTORS package that uses dbms_stats.set_table_stats and dbms_stats.set_column_stats to set values directly. 
    • The trigger then submits a job to the database job scheduler, via dbms_job, that will collect statistics on the partition in the conventional way.  The job number is recorded on the selector log.  The job will be picked up by the scheduler when the insert commits.  However, there can be a short lag between the job being scheduled and it running, so the first query in the nVision report can be parsed before the fresh statistics are available. 
    • The procedure that directly sets the statistics has to make some sensible assumptions about the data. These mostly lead the optimizer to produce good execution plans. However, testing has shown that performance can be better with conventionally collected statistics. Therefore, the trigger both directly sets statistics and submits the database job to collect the statistics.
    • It is important that table level statistics are not maintained by either technique as this would lead to locking between sessions. Locking during partition statistics maintenance will not occur as no two sessions populate the same selector number, and each selector is in a different partition. A table statistics preference for granularity is set to PARTITION on each partitioned tree selector table. 
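The following is a minimal sketch of the shape of the insert trigger on PSTREESELECT10, assuming the structure described above.  The procedure name and parameters in the call to the XX_NVISION_SELECTORS package are illustrative assumptions only; the real package also performs the logging, direct statistics maintenance and dbms_job submission described in the bullets above.
CREATE OR REPLACE TRIGGER sysadm.pstreeselect10_insert
FOR INSERT ON sysadm.pstreeselect10
COMPOUND TRIGGER
  -- state shared between the row-level and statement-level sections
  l_selector_num pstreeselect10.selector_num%TYPE;
  l_num_rows     INTEGER := 0;
  l_min_range    pstreeselect10.range_from_10%TYPE;
  l_max_range    pstreeselect10.range_to_10%TYPE;

  AFTER EACH ROW IS
  BEGIN --capture the selector number, count rows, track min/max of the range columns
    l_selector_num := :new.selector_num;
    l_num_rows     := l_num_rows + 1;
    l_min_range    := LEAST(NVL(l_min_range,:new.range_from_10),:new.range_from_10);
    l_max_range    := GREATEST(NVL(l_max_range,:new.range_to_10),:new.range_to_10);
  END AFTER EACH ROW;

  AFTER STATEMENT IS
  BEGIN --log the selector, set statistics directly and submit a job to gather them
    xx_nvision_selectors.after_insert( --hypothetical procedure name, for illustration
      p_selector_num => l_selector_num,
      p_length       => 10,
      p_num_rows     => l_num_rows,
      p_min_range    => l_min_range,
      p_max_range    => l_max_range);
  EXCEPTION WHEN OTHERS THEN NULL; --deliberately suppress errors so nVision is never blocked
  END AFTER STATEMENT;
END pstreeselect10_insert;
/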
The combination of dynamic selectors, single value joins, interval partitioning of the selector tables, and logging triggers on the selector tables driving timely statistics maintenance on the partitions delivers execution plans that perform well and that make effective use of engineered system features.
However, there are two problems that then have to be worked around. 

Library Cache Contention 

Some data warehouse systems can need new partitions in tables daily or even hourly. If partitions were not created in a timely fashion, the application would either break because the partition was missing, or performance would degrade as data accumulated in a single partition. Oracle intended interval partitions to free the DBA from the need to actively manage such partitioning on a day-to-day basis by creating them automatically as the data was inserted. 
However, on a busy nVision system, this solution could create thousands of new selectors in a single day, and therefore thousands of new partitions. This is certainly not how Oracle intended interval partitioning to be used.  I freely admit that I am abusing the feature.
If you have multiple concurrent nVision reports running, using dynamic selectors, you will have multiple database sessions concurrently inserting rows into the tree selector tables each with a different selector number, and therefore each creating new partitions mostly into the same tables.
The recursive code that creates the new partitions, and maintains the data dictionary, acquires a lock on the object handle in the library cache to prevent other sessions from changing it at the same time.  As the number of concurrent nVision processes increases, you will start to see nVision sessions waiting on the library cache lock event during the insert into the tree selector table while the new partition is being created. Perversely, as the performance of the nVision queries improves (as you refine tree selector settings) you may find this contention increases. 
The workaround to this is to create multiple database schemas, each with copies of the partitioned tree selector tables (similarly interval partitioned) and the PSTREESELCTL table (to manage static selectors in those schemas). Synonyms will be required for all other tables referenced by nVision queries. 
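For example, the synonyms for one of the additional schemas could be generated with something like the following; the schema name NVEXEC01 is illustrative, and the exclusion list must match whatever tables are physically copied into each schema.
SELECT 'CREATE SYNONYM NVEXEC01.'||table_name||' FOR SYSADM.'||table_name||';'
FROM   all_tables
WHERE  owner = 'SYSADM'
AND    table_name NOT LIKE 'PSTREESELECT%' /*copied into each schema, not synonymed*/
AND    table_name != 'PSTREESELCTL'        /*copied into each schema, not synonymed*/
ORDER BY table_name
/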
Then a trigger on the process scheduler request table PSPRCSRQST will arbitrarily set the current schema of the nVision batch processes to one of those schemas. The nVision reports still connect and run with the privileges of the Owner ID (usually SYSADM), but unqualified table references are resolved against the current schema. 
I have used a hash function to distribute nVision processes between schemas. I suggest the number of schemas should be a power of 2 (i.e. 2, 4, 8, etc.).
CREATE OR REPLACE TRIGGER sysadm.nvision_current_schema
BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
FOR EACH ROW
WHEN (new.runstatus IN('7') AND new.prcsname = 'RPTBOOK' AND new.prcstype like 'nVision%')
BEGIN
  EXECUTE IMMEDIATE 'ALTER SESSION SET current_schema = NVEXEC'||LTRIM(TO_CHAR(dbms_utility.get_hash_value(:new.prcsinstance,1,8),'00'));
EXCEPTION WHEN OTHERS THEN NULL; --exception deliberately coded to suppress all exceptions
END;
/
Thus, different nVision reports use tree selector tables in different schemas rather than all trying to create partitions in the same tables, which avoids the library cache locking.

Limitation on the Maximum Number of Partitions 

In Oracle, it is not possible to have more than 1048575 (1024K-1) partitions in a table. That applies to all forms of partitioning.
The tree selector tables are interval partitioned on selector number with an interval of 1, starting with 1, so the highest value of SELECTOR_NUM that they can store is 1048575.
INSERT INTO pstreeselect05 VALUES(1048576,0,' ',' ')
            *
ERROR at line 1:
ORA-14300: partitioning key maps to a partition outside maximum permitted number of partitions
New selector numbers are allocated sequentially from PSTREESELNUM. Left unattended, the selector numbers used by nVision will increase until they eventually hit this limit, and then nVision and ad-hoc queries that use the tree-exists operator will start to fail with this error.
Therefore, a procedure RESET_SELECTOR_NUM has been added to the PL/SQL package to reset the selector number allocation table PSTREESELNUM back to 0, delete any tree selector entries for which there is no logging entry, and then run the regular selector PURGE procedure in the same package to drop the unwanted interval partitions.

Recommendation: XX_NVISION_SELECTORS.RESET_SELECTOR_NUM should be scheduled to run sufficiently frequently to prevent the selector number from reaching the maximum.  A sketch of such a job is shown below.  
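The following is a minimal sketch of such a job using the database scheduler.  The job name, schedule and frequency are illustrative assumptions; how often it needs to run depends on how quickly selector numbers are consumed on your system.
BEGIN
  dbms_scheduler.create_job(
     job_name        => 'SYSADM.RESET_NVISION_SELECTOR_NUM' /*illustrative job name*/
    ,job_type        => 'PLSQL_BLOCK'
    ,job_action      => 'BEGIN sysadm.xx_nvision_selectors.reset_selector_num; END;'
    ,start_date      => SYSTIMESTAMP
    ,repeat_interval => 'FREQ=WEEKLY;BYDAY=SUN;BYHOUR=6' /*e.g. early every Sunday morning*/
    ,enabled         => TRUE
    ,comments        => 'Periodically reset nVision tree selector numbers');
END;
/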

Monday, November 13, 2017

nVision Performance Tuning: 7. Analysis of Tree Usage with the Selector Log

This blog post is part of a series that discusses how to get optimal performance from PeopleSoft nVision reporting as used in General Ledger.

Over time, the selector log will build up a picture of how each tree is used in a system. Here are two examples of how it can be used.

You may be looking at a piece of SQL generated by nVision, which will contain a literal value for the selector number, and want to know more about that particular selector.
REM treeanal.sql
WITH t as (
SELECT DISTINCT d.tree_name, s.dtl_fieldname, d.tree_acc_method, d.tree_acc_Selector, d.tree_acc_sel_opt
FROM pstreedefn d, pstreestrct s
WHERE d.tree_Strct_id = s.tree_strct_id
), l as (
SELECT  *
FROM ps_nvs_treeslctlog l
WHERE l.selector_Num = &selector_num
)
SELECT l.*, t.dtl_fieldname, t.tree_acc_method, t.tree_acc_Selector, t.tree_acc_sel_opt
FROM  t, l
WHERE t.tree_name = l.tree_name
/
Now you can see various pieces of information about the selector and the report in which it was used.
  • Process instance of the report, although it is also in the ACTION string
  • Length of the selector.  Here it was 10 characters so it was in PSTREESELECT10.
  • The number of rows inserted into the selector.  This is useful if you want to recreate the conditions at runtime by populating the selector table manually.
  • Time at which the selector was populated.
  • Session module, usually the process name.  
  • Session action string, which contains the process instance, report ID and business unit.
  • The client info string, containing the operator ID, database name, the name of the host running nVision and the name of the executable.
    • PSNVS is nVision running on Microsoft Excel
    • PSNVSSRV is nVision running in OpenXML
    • psae is a PS/Query using the IN TREE operator that has been scheduled to run on the Process Scheduler.
    • PSQED is the windows query client running in 2-tier mode and using an IN TREE operator.
    • PSSAMSRV indicates either nVision or PS/Query running either through the PIA or on the windows client in 3-tier mode.
  • Status of the selector.  X indicates that the selector has been deleted and the partition has been dropped.
  • Name of the tree.
  • Name of the schema in which the selector table is located.
  • Partition name in the tree selector table in which the data was held.  This will be blank if the partition has been dropped.
  • The current values of the three tree selector flags on the tree definition are also reported.
SELECTOR_NUM PROCESS_INSTANCE Len   NUM_ROWS TIMESTAMP                    MODULE       APPINFO_ACTION
------------ ---------------- --- ---------- ---------------------------- ------------ ----------------------------------------------------------------
CLIENT_INFO                                                      S TREE_NAME          OWNERID  PARTITION_NAME           JOB_NO DTL_FIELDNAME      T T T
---------------------------------------------------------------- - ------------------ -------- -------------------- ---------- ------------------ - - -
       10233          1780069  10        362 10-NOV-17 02.40.50.755038 AM RPTBOOK      PI=1780069:UKGL123I:UK001
GBNVISION,PSFINPRD,UKLONWIN001,,PSNVSSRV.EXE,                    X UKGL_ACCOUNT       SYSADM                             33052 ACCOUNT            J D S

The following query aggregates log entries to report the number of times each tree was used over the last 7 days, and provides various statistics about the number of rows extracted from trees into the selector tables, together with the current tree performance options.
REM tree_usage.sql
WITH t AS (
SELECT DISTINCT d.tree_name, s.dtl_fieldname, d.tree_acc_method
,      d.tree_acc_Selector, d.tree_acc_sel_opt
FROM   pstreedefn d, pstreestrct s
WHERE  d.tree_Strct_id = s.tree_strct_id
), l AS (
SELECT tree_name, length
, COUNT(*) num_uses
, MIN(num_rows) min_rows
, AVG(num_rows) avg_rows
, MEDIAN(num_Rows) med_rows
, MAX(num_rowS) max_rows
, STDDEV(num_Rows) stddev_rows
, SUM(num_rows) sum_rows
, COUNT(distinct process_instance) processes
FROM ps_nvs_treeslctlog l
WHERE num_rows>0
AND timestamp >= sysdate-7
GROUP BY tree_name, length
)
SELECT l.*, t.dtl_fieldname, t.tree_acc_method, t.tree_acc_Selector, t.tree_acc_sel_opt
FROM   t, l
WHERE  t.tree_name = l.tree_name
ORDER BY sum_rows
/
The default recommendation is that all trees should use:
  • Literal values, where possible, when working with fewer than about 2000 rows in the selector.  Above roughly 2000 rows, it may be better to join the selector table instead because of the parse and execution overhead of each literal criterion.  
  • Dynamic selectors 
  • Single Value joins 
This report can help to identify trees where extreme volumes mean that different options should be considered.
                          Num    Min Average Median    Max Std Dev       Sum   Num
TREE_NAME          Len   Uses   Rows    Rows   Rows   Rows    Rows      Rows Procs DTL_FIELDNAME      T T T
------------------ --- ------ ------ ------- ------ ------ ------- --------- ----- ------------------ - - -
…
CORP_ACCT           10      5   1147    2839   2616   6668    2263     14194     1 ACCOUNT            J D S
FUNCTION            10    480      9      32     35     35       8     15474    43 CHARTFIELD2        L D S
INT_SUP              6      7    225    2463   2838   2838     987     17243     1 PRODUCT            L D S
STAT_PRODUCT         6      8   2889    2889   2889   2889       0     23112     1 PRODUCT            J D S
AFFILIATE            5     43    215     576    509    938     223     24789    15 AFFILIATE          L D S
INT_GAAP_CON         5     62     82     486    522    730     225     30153    10 BUSINESS_UNIT      L D S
BU_GAAP_CON          5     96     44     619    614    731     115     59461    48 BUSINESS_UNIT      L D S
STAT_ACCOUNT        10     45     23    4204   6516   6516    2905    189182     6 ACCOUNT            J D S
INT_REP1            10    135    149    1563   1664   1664     379    211005     1 CHARTFIELD1        L D S
COMBO_CODE          10    172     17    1592   1532   2430     809    273846    18 CHARTFIELD1        L D S
UKGL_ACCOUNT        10   2586      2    1713   1147   7797    1793   4430262   110 ACCOUNT            J D S
  • The account trees have been set to join the tree to the ledger table rather than use literal values because sometimes in excess of 6000 rows are extracted.  A query with 6000 literal terms would be extremely large, would take time to generate in nVision, and would take time on the database to parse and execute each criterion. 
  • STAT_PRODUCT has been set to join partly because it is large (it always extracts 2889 rows), but also because the whole tree is extracted every time, so the criterion does not cut down the result set.
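For example, the following variation on the query above identifies trees whose typical extract size over the last 7 days exceeds the approximately 2000-row guideline, and which are therefore candidates for being joined rather than resolved to literal values.  The 2000-row threshold is only a rule of thumb.
SELECT   tree_name, length
,        COUNT(*) num_uses
,        MEDIAN(num_rows) med_rows
,        MAX(num_rows) max_rows
FROM     ps_nvs_treeslctlog
WHERE    num_rows > 0
AND      timestamp >= sysdate-7
GROUP BY tree_name, length
HAVING   MEDIAN(num_rows) > 2000
ORDER BY med_rows DESC
/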