Tuesday, November 26, 2024

PeopleSoft PS/Query: Finding Users' Cartesian Joins

Many PeopleSoft users like its ad hoc query tool because they can write their own queries directly on the system, without having to learn to write structured query language (SQL), or getting a developer to write it for them.   

What is the Problem?

It is easy for users to create poor queries, that either don't work as intended or can run for long periods, even indefinitely, consuming resources without ever producing results.  This can consume significant amounts of CPU, and in the cloud, that is mostly what you pay for!  The effect can be mitigated with the database's resource manager, but it is better not to do it in the first place.

One cause of long-running queries that I come across is missing join criteria leading the database to perform Cartesian Merge Joins.  I should stress that not all Cartesian joins are evil.  For example, in some data warehouse queries (e.g. GL nVision reporting), it can be a very effective strategy to Cartesian join dimension tables before visiting the fact table, especially if you can use Bloom filter a full scan on the fact table.  It works well with parallel query, and on engineered systems this can also be pushed down to the storage cells.

Finding Execution Plans with Cartesian Joins

The following query profiles database time by execution plan from ASH for SQL statements from PS/Queries run via the PSQUERY application engine program on a process scheduler.  It returns the longest-running statement for each execution plan.

The data is generated and processed through several common table expressions.

  • R returns the PSQUERY processes that ran in the time window of interest
  • P returns the execution plans captured by AWR that generate Cartesian products for which the SQL text is also captured.  
  • X returns the ASH data for Cartesian join executions. When P is joined with the ASH data, then we just get the queries that performed Cartesian joins.
  • Y sums and groups the ASH data by statement and process
  • Z sums the data by execution plan and identifies the longest-running SQL statement for that plan.

REM qry_cartesianplans.sql
WITH r as ( /*processes of interest*/
SELECT /*+MATERIALIZE*/ r.oprid, r.prcsinstance, r.prcsname, r.begindttm, r.enddttm
,      DECODE(c.private_query_flag,'Y','Private','N','Public') private_query_flag, c.qryname
FROM   psprcsrqst r
       LEFT OUTER JOIN ps_query_run_cntrl c ON c.oprid = r.oprid AND c.run_cntl_id = r.runcntlid
WHERE prcsname = 'PSQUERY'
AND r.begindttm >= trunc(SYSDATE)-0+8/24
AND r.begindttm <= trunc(SYSDATE)-0+19/24
), p as ( /*known Cartesian plans with SQL text*/
SELECT /*+MATERIALIZE*/ p.plan_hash_value, MAX(p.options) options
FROM   dbA_hist_sql_plan p
,      dba_hist_sqltext t
WHERE  t.sql_id = p.sql_id
AND    (p.id = 0 OR p.options = 'CARTESIAN')
GROUP BY p.plan_hash_Value
), x AS ( /*ASH for processes*/
SELECT /*+materialize leading(r x)*/  r.prcsinstance, r.oprid, r.private_query_flag, r.qryname
,      h.event, x.dbid, h.sample_id, h.sample_time, h.instance_number
,      CASE WHEN h.module IS NULL       THEN REGEXP_SUBSTR(h.program, '[^@]+',1,1)
            WHEN h.module LIKE 'PSAE.%' THEN REGEXP_SUBSTR(h.module, '[^.]+',1,2) 
            ELSE                             REGEXP_SUBSTR(h.module, '[^.@]+',1,1) 
       END AS module
,      h.action
,      NULLIF(h.top_level_sql_id, h.sql_id) top_level_sql_id
,      h.sql_id, h.sql_plan_hash_value, h.force_matching_signature, h.sql_exec_id
,      h.session_id, h.session_serial#, h.qc_instance_id, h.qc_Session_id, h.qc_Session_serial#
,      f.name, p.options
,      NVL(usecs_per_row,1e7) usecs_per_row
,      CASE WHEN p.plan_hash_value IS NOT NULL THEN NVL(usecs_per_row,1e7) ELSE 0 END usecs_per_row2
FROM   dba_hist_snapshot x
,      dba_hist_active_sess_history h
       LEFT OUTER JOIN p ON p.plan_hash_value = h.sql_plan_hash_value
       LEFT OUTER JOIN dba_sql_profiles f ON h.force_matching_signature = f.signature
,      r
,      sysadm.psprcsque q
WHERE  h.SNAP_id = X.SNAP_id
AND    h.dbid = x.dbid
AND    h.instance_number = x.instance_number
AND    x.end_interval_time >= r.begindttm
AND    x.begin_interval_time <= NVL(r.enddttm,SYSDATE)
AND    h.sample_time BETWEEN r.begindttm AND NVL(r.enddttm,SYSDATE)
AND    q.prcsinstance = r.prcsinstance
AND    (  (h.module = r.prcsname AND h.action like 'PI='||r.prcsinstance||':Processing')
       OR  h.module like 'PSAE.'||r.prcsname||'.'||q.sessionidnum)
), y as( /*profile time by statement/process*/
SELECT prcsinstance, oprid, private_query_flag, qryname, sql_plan_hash_value, sql_id, force_matching_signature, name
,      dbid, module, action, top_level_sql_id
,      count(distinct qc_session_id||qc_session_serial#||sql_id||sql_exec_id) execs
,      sum(usecs_per_row)/1e6 ash_Secs
,      sum(usecs_per_Row2)/1e6 awr_secs
,      avg(usecs_per_row)/1e6*count(distinct sample_time) elapsed_secs
,      count(distinct instance_number||session_id||session_serial#) num_procs
,      max(options) options
FROM   x 
GROUP BY prcsinstance, oprid, private_query_flag, qryname, sql_plan_hash_value, sql_id, force_matching_signature, name
,      dbid, module, action, top_level_sql_id, qc_instance_id, qc_session_id, qc_session_serial#
), z as ( /*find top statement per plan and sum across all executions*/
SELECT row_number() over (partition by force_matching_signature, sql_plan_hash_value order by awr_secs desc) plan_seq
,      prcsinstance, oprid, name, private_query_flag, NVL(qryname,action) qryname, options
,      sql_id, sql_plan_hash_Value, force_matching_signature
,      count(distinct sql_id) over (partition by force_matching_signature, sql_plan_hash_value) sql_ids
,      sum(execs) over (partition by force_matching_signature, sql_plan_hash_value) plan_execs
,      sum(ash_Secs) over (partition by force_matching_signature, sql_plan_hash_value) plan_ash_secs
,      sum(awr_Secs) over (partition by force_matching_signature, sql_plan_hash_value) plan_awr_secs
,      sum(elapsed_Secs) over (partition by force_matching_signature, sql_plan_hash_value) elap_secs
,      sum(num_procs) over (partition by force_matching_signature, sql_plan_hash_value) max_procs
FROM   y
)
Select z.*, z.plan_ash_secs/z.elap_secs eff_para
from   z
where  plan_seq = 1
and    sql_id is not null
and    plan_ash_secs >= 300
ORDER BY plan_ash_secs DESC
FETCH FIRST 50 ROWS ONLY
/
There are two SQL statements for the same private query. XXX_GL_BJU run by user BXXXXXX that exhibited a Cartesian join.

Plan    Plan
     Process                                          Private                                                           SQL Plan        Force Matching  SQL   Plan     ASH     AWR Elapsed  Max  Eff.
 #  Instance OPRID     NAME                           Query   QRYNAME                        OPTIONS   SQL_ID         Hash Value             Signature  IDs  Execs    Secs    Secs Seconds  Prc  Para
-- --------- --------- ------------------------------ ------- ------------------------------ --------- ------------- ----------- --------------------- ---- ------ ------- ------- ------- ---- -----
 1  12344342 NXXXXXX                                  Public  XXX_TRIAL_BALANCE_BY_BU_XXX_V2           c4zfcub2bnju8  2128864041   4468535744829993986    4      4  103473  103473  103473    4   1.0
 1  12344471 FXXXXXX                                  Public  XXXAM_FIN_GL_AP                          d8jnxzmgx20mq  4189069557  16033793374717384734    1      1   32599   32599   32599    1   1.0
 1  12344448 VXXXXXX                                  Private XXX1_LEDGERBAL1_UPRDAC_XXXX1             ftn7nz1xafh5z           0  15193759933860031914    2      2   20615   20615   20615    2   1.0
 1  12345574 BXXXXXX                                  Private XXX_GL_BJU                     CARTESIAN ab2v91h9zj3hv   603930234   4189289347608449750    1      1   16862   16862   16862    1   1.0
 1  12345681 BXXXXXX                                  Private XXX_GL_BJU                     CARTESIAN 05tphb379fu8j   603930234   6203431496815450503    1      1   15452   15452   15452    1   1.0
 1  12345852 WXXXXXX                                  Public  XXXINSOLVENTS_JRNL_DETAIL                51aw4ahxba0gq  3918624993  11145663850623390044    1      1   13435   13435   13435    1   1.0
 1  12345863 CXXXXXX                                  Public  XXX_COMMUTATIONS_JRNL_DTL                7q9kt75bh35dg           0  11985643849566057390    1      1   13283   13283   13283    1   1.0
 1  12344773 WXXXXXX                                  Private XXX_COMMUTATION_JRNL_DETAIL_2            361gck3w3mak7           0  18367721225324700858    1      2   12883   12883   12883    2   1.0
 1  12344682 DXXXXXX                                  Private COMBINED_JE_DETAIL_DV                    2gchgaf465ku5           0   5375582220398622005    1      1    9279    9279    9279    1   1.0
 1  12345618 DXXXXXX                                  Private COMBINED_JE_DETAIL_DV_NO_AFF             2q2faj9c6003u           0  15355473744647942117    1      1    5079    5079    5079    1   1.0
…
The SQL statement and execution plan can be extracted from AWR using DBMS_XPLAN.DISPLAY_WORKLOAD_REPSITORY.  

SELECT * FROM table(dbms_xplan.display_workload_repository('ab2v91h9zj3hv',603930234,'ADVANCED +ADAPTIVE'));

In this example, there are two similar SQL statements, with different force matching signatures, that produce the same execution plan.  The difference is that one has an IN list of 3 accounts, and the other has an equi-join to just one account.  This is enough to produce a different force matching signature.  This is why I often group ASH data by execution plan hash value.  Even if the SQL statement is different, if the execution plan is the same, then the issues and solutions tend to be the same.

The statements have been reformated to make them easier to read.  Both are just joins between two objects.  There are criteria on PS_JRNL_DRILL_VW (a view on PS_JRNL_LN), but there are no join criteria between it and its parent table JRNL_HEADER, thus a meaningless Cartesian product that joins every journal line to every journal header was created and sorted.

SQL_ID ab2v91h9zj3hv 
--------------------
SELECT A.BUSINESS_UNIT, A.JOURNAL_ID,
TO_CHAR(A.JOURNAL_DATE,'YYYY-MM-DD'), B.DESCR254, A.ACCOUNT,
A.LINE_DESCR, SUM( A.MONETARY_AMOUNT), A.LEDGER, B.ACCOUNTING_PERIOD,
B.SOURCE, B.OPRID, A.PRODUCT, A.CLASS_FLD, A.PROGRAM_CODE,
A.CHARTFIELD1, A.CHARTFIELD3, A.CURRENCY_CD, A.FOREIGN_CURRENCY 
FROM PS_JRNL_DRILL_VW A, PS_JRNL_HEADER B 
WHERE ( A.BUSINESS_UNIT IN('12341','12347') 
AND A.LEDGER IN ('CORE','LOCAL_ADJ','LOCAL_ADJ2') 
AND A.ACCOUNT IN ('1234510040','1234510000','1234510060') 
AND A.ACCOUNTING_PERIOD BETWEEN 1 AND 12 AND A.FISCAL_YEAR = 2023) 
GROUP BY A.BUSINESS_UNIT, A.JOURNAL_ID, A.JOURNAL_DATE, B.DESCR254, A.ACCOUNT,
A.LINE_DESCR, A.LEDGER, B.ACCOUNTING_PERIOD, B.SOURCE, B.OPRID,
A.PRODUCT, A.CLASS_FLD, A.PROGRAM_CODE, A.CHARTFIELD1, A.CHARTFIELD3,
A.CURRENCY_CD, A.FOREIGN_CURRENCY ORDER BY 11

SQL_ID 05tphb379fu8j
--------------------
SELECT A.BUSINESS_UNIT, A.JOURNAL_ID,
TO_CHAR(A.JOURNAL_DATE,'YYYY-MM-DD'), B.DESCR254, A.ACCOUNT,
A.LINE_DESCR, SUM( A.MONETARY_AMOUNT), A.LEDGER, B.ACCOUNTING_PERIOD,
B.SOURCE, B.OPRID, A.PRODUCT, A.CLASS_FLD, A.PROGRAM_CODE,
A.CHARTFIELD1, A.CHARTFIELD3, A.CURRENCY_CD, A.FOREIGN_CURRENCY 
FROM PS_JRNL_DRILL_VW A, PS_JRNL_HEADER B 
WHERE ( A.BUSINESS_UNIT IN('12341','12347') 
AND A.LEDGER IN ('CORE','LOCAL_ADJ','LOCAL_ADJ2') 
AND A.ACCOUNT = '1234510000' 
AND A.ACCOUNTING_PERIOD BETWEEN 1 AND 12 
AND A.FISCAL_YEAR = 2023) 
GROUP BY A.BUSINESS_UNIT, A.JOURNAL_ID, A.JOURNAL_DATE, B.DESCR254, A.ACCOUNT
, A.LINE_DESCR, A.LEDGER, B.ACCOUNTING_PERIOD, B.SOURCE, B.OPRID
, A.PRODUCT, A.CLASS_FLD,A.PROGRAM_CODE, A.CHARTFIELD1, A.CHARTFIELD3
, A.CURRENCY_CD, A.FOREIGN_CURRENCY ORDER BY 11
Line 2 of the execution plan reports a MERGE JOIN CARTESIAN operation that feeds into the SORT GROUP operation at line 1.

Plan hash value: 603930234

---------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                        | Name           | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop | Inst   |IN-OUT|
---------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                 |                |       |       | 84648 (100)|          |       |       |        |      |
|   1 |  SORT GROUP BY                                   |                |    57 |  9063 | 84648   (1)| 00:00:04 |       |       |        |      |
|   2 |   MERGE JOIN CARTESIAN                           |                |    57 |  9063 | 84647   (1)| 00:00:04 |       |       |        |      |
|   3 |    NESTED LOOPS                                  |                |     1 |   145 |  1636   (0)| 00:00:01 |       |       |        |      |
|   4 |     VIEW                                         | PS_JRNL_HEADER |   112 |  4032 |  1188   (0)| 00:00:01 |       |       |        |      |
|   5 |      UNION-ALL                                   |                |       |       |            |          |       |       |        |      |
|   6 |       REMOTE                                     | PS_JRNL_HEADER |    76 |  5624 |    18   (0)| 00:00:01 |       |       | FSARC~ | R->S |
|   7 |       INLIST ITERATOR                            |                |       |       |            |          |       |       |        |      |
|   8 |        TABLE ACCESS BY INDEX ROWID BATCHED       | PS_JRNL_HEADER | 16679 |   586K| 11634   (1)| 00:00:01 |       |       |        |      |
|*  9 |         INDEX RANGE SCAN                         | PSEJRNL_HEADER | 16679 |       |   347   (0)| 00:00:01 |       |       |        |      |
|  10 |     VIEW                                         | PS_JRNL_LN     |     1 |   109 |     4   (0)| 00:00:01 |       |       |        |      |
|  11 |      UNION-ALL PARTITION                         |                |       |       |            |          |       |       |        |      |
|* 12 |       FILTER                                     |                |       |       |            |          |       |       |        |      |
|  13 |        REMOTE                                    | PS_JRNL_LN     |     1 |   217 |     5   (0)| 00:00:01 |       |       | FSARC~ | R->S |
|* 14 |       FILTER                                     |                |       |       |            |          |       |       |        |      |
|  15 |        PARTITION RANGE SINGLE                    |                |     1 |   109 |     5   (0)| 00:00:01 |   KEY |   KEY |        |      |
|* 16 |         TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| PS_JRNL_LN     |     1 |   109 |     5   (0)| 00:00:01 |   KEY |   KEY |        |      |
|* 17 |          INDEX RANGE SCAN                        | PS_JRNL_LN     |     1 |       |     4   (0)| 00:00:01 |   KEY |   KEY |        |      |
|  18 |    BUFFER SORT                                   |                |  7749K|   103M| 84644   (1)| 00:00:04 |       |       |        |      |
|  19 |     VIEW                                         | PS_JRNL_HEADER |  7749K|   103M| 83011   (1)| 00:00:04 |       |       |        |      |
|  20 |      UNION-ALL                                   |                |       |       |            |          |       |       |        |      |
|  21 |       REMOTE                                     | PS_JRNL_HEADER |  5698K|  1880M| 50467   (1)| 00:00:02 |       |       | FSARC~ | R->S |
|  22 |       TABLE ACCESS STORAGE FULL                  | PS_JRNL_HEADER |  2050K|    86M| 32544   (1)| 00:00:02 |       |       |        |      |
---------------------------------------------------------------------------------------------------------------------------------------------------
A profile of database time for that execution plan by event and plan line ID shows that most of the time is spent on line 1, sorting the output of the Cartesian product.
                                                                                                                    Stmt    Stmt
   SQL Plan SQL Plan                                                                  H   E I     ASH                ASH    Elap
 Hash Value  Line ID EVENT                                                            P P x M    Secs  ELAP_SECS    Secs    Secs
----------- -------- ---------------------------------------------------------------- - - - - ------- ---------- ------- -------
  603930234        1 CPU+CPU Wait                                                     N N Y N  217091 23405.3608  299088   32314
                  18 direct path read temp                                            N N Y N   64395 7034.44748  299088   32314
                  18 CPU+CPU Wait                                                     N N Y N   16998 1812.39445  299088   32314
                   1 ASM IO for non-blocking poll                                     N N Y N     195 20.4802032  299088   32314
                  21 CPU+CPU Wait                                                     N N Y N     195   20.47995  299088   32314
                  16 CPU+CPU Wait                                                     N N Y N     113   10.24021  299088   32314
                     CPU+CPU Wait                                                     N N Y N     103   10.25244  299088   32314
This query never finished because the Cartesian product was so large.  The time recorded was spent in two executions that were eventually cancelled by system operators.

The answer in this particular case is to fix the code.  We have to go back to the user, explain why it is necessary to join parent and child tables and get them to correct their PS/Query.

Finding PS/Queries Without Joins on Related Records

In PeopleSoft, the parent of a child record is recorded on PSRECDEFN in the column PARENTRECNAME.  However, this does not translate into a foreign key relationship in any database supported by PeopleSoft.  This is part of PeopleSoft's original platform-agnosticism.  Not all databases previously supported by PeopleSoft supported database enforced referential integrity.  Therefore it never became part of the implementation, and there is no guarantee that the applications were written in such a way to honour foreign-key constraints (i.e. insert parents before children, delete children before parents etc.).

The below query looks at pairs of parent-child records in each select block of each PS/Query and counts the number of key columns for which there are criteria on the child record that are joined to the parent record.  It is restricted to just the journal header/line tables and views.

It returns rows where no joined key columns are found.  These queries are therefore suspected of being faulty.  However, there may be false positives where child records are joined to grandparents rather than immediate parents.  Such an approach in SQL is perfectly valid, and can even result in better performance.  

WITH x as (
SELECT r1.oprid, r1.qryname, r1.selnum
, r1.rcdnum rcdnum1, r1.recname recname1, r1.corrname corrname1
, r2.rcdnum rcdnum2, r2.recname recname2, r2.corrname corrname2
, (SELECT count(*) 
   FROM psqryfield qf1 --INNER JOIN psrecfielddb f1 ON f1.recname = r1.recname AND f1.fieldname = qf1.fieldname
   ,    psqryfield qf2 INNER JOIN psrecfielddb f2 ON f2.recname = r2.recname AND f2.fieldname = qf2.fieldname 
                                                 AND MOD(f2.useedit,2)=1 /*key fields only*/
   , psqrycriteria c
   WHERE qf1.oprid = r1.oprid AND qf1.qryname = r1.qryname AND qf1.selnum = r1.selnum AND qf1.recname = r1.recname AND qf1.fldrcdnum = r1.rcdnum
   AND   qf2.oprid = r2.oprid AND qf2.qryname = r2.qryname AND qf2.selnum = r2.selnum AND qf2.recname = r2.recname AND qf2.fldrcdnum = r2.rcdnum
   AND    c.oprid = r1.oprid AND c.qryname = r1.qryname AND  c.selnum = r1.selnum 
   AND   (  (c.lcrtselnum = r1.selnum AND c.lcrtfldnum = qf1.fldnum AND c.r1crtselnum = r2.selnum AND c.r1crtfldnum = qf2.fldnum)
         OR (c.lcrtselnum = r2.selnum AND c.lcrtfldnum = qf2.fldnum AND c.r1crtselnum = r1.selnum AND c.r1crtfldnum = qf1.fldnum))
-- AND rownum = 1
  ) num_key_fields
FROM psrecdefn r
, psqryrecord r1
  INNER JOIN psqryrecord r2 ON r1.oprid = r2.oprid AND r1.qryname = r2.qryname AND r1.selnum = r2.selnum AND r1.rcdnum != r2.rcdnum --AND r1.corrname < r2.corrname
WHERE r.recname = r2.recname AND r.parentrecname = r1.recname
)
SELECT x.* FROM x
WHERE num_key_fields = 0
AND recname1 IN('JRNL_HEADER')
AND recname2 IN('JRNL_LN','JRNL_DRILL_VW')
ORDER BY 1,2,3
/

However, these queries may not have been run recently.  Users tend to write queries, save a modification as a new version, and then abandon the old version.

                                          Sel Rec1                    Cor Rec2                    Cor #Key
OPRID     QRYNAME                           #    # Record 1           #1     # Record             #2  Flds
--------- ------------------------------ ---- ---- ------------------ --- ---- ------------------ --- ----
          2_XX_CHI_JOURNAL_MES2_RE          1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
          12300_GL_ACCOUNT_DETAIL           1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
          123_DK                            1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
          123_NEW                           1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
          12345_ACCRUAL_JE_DETAILS          1    1 JRNL_HEADER        A      2 JRNL_LN            C      0
          12345_ACCRUAL_JE_DETAILS_V2       1    1 JRNL_HEADER        A      2 JRNL_LN            C      0
          12345_ACCRUAL_JE_DETAILS_V3       1    1 JRNL_HEADER        A      2 JRNL_LN            C      0
          12345_HARDSOFT_JE_DETAILS_V3      1    1 JRNL_HEADER        A      2 JRNL_LN            C      0
          12345_BM_CURR_ACTIVITY2           1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
          AAIC_CBP_POOLS                    1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
…

I demonstrated how to identify long-running PS/Queries on the process schedulers in an earlier blog post. The following query merges in that query, so that it only considers queries that have run on a process scheduler within the purge period, for which join criteria may be missing.  They are sorted by descending execution time.

REM qry_missingjoins.sql
WITH q1 as (
SELECT r.prcsinstance
, r.oprid runoprid, r.runcntlid
, DECODE(c.private_query_flag,'Y','Private','N','Public') private_query_flag
, DECODE(c.private_query_flag,'Y',r.oprid,' ') oprid
, c.qryname
, CAST(begindttm AS DATE) begindttm
, CAST(enddttm AS DATE) enddttm
, runstatus
, (CAST(NVL(enddttm,SYSDATE) AS DATE)-CAST(begindttm AS DATE))*86400 exec_Secs
FROM psprcsrqst r
  LEFT OUTER JOIN ps_query_run_cntrl c ON c.oprid = r.oprid AND c.run_cntl_id = r.runcntlid
WHERE prcsname = 'PSQUERY'
AND dbname IN(select DISTINCT dbname from ps.psdbowner)
--AND r.begindttm >= trunc(SYSDATE)-2+8/24
--AND r.begindttm <= trunc(SYSDATE)-2+19/24
), q as (
Select /*+MATERIALIZE*/ oprid, qryname
, SUM(exec_secs) exec_secs
, COUNT(*) num_execs
, COUNT(DECODE(runstatus,'9',1,NULL)) complete_execs
, COUNT(DISTINCT runoprid) runoprids
FROM q1
GROUP BY oprid, qryname
), x as (
SELECT r1.oprid, r1.qryname, r1.selnum
, r1.rcdnum rcdnum1, r1.recname recname1, r1.corrname corrname1
, r2.rcdnum rcdnum2, r2.recname recname2, r2.corrname corrname2
, (SELECT count(*) 
   FROM psqryfield qf1 --INNER JOIN psrecfielddb f1 ON f1.recname = r1.recname AND f1.fieldname = qf1.fieldname
   ,    psqryfield qf2 INNER JOIN psrecfielddb f2 ON f2.recname = r2.recname AND f2.fieldname = qf2.fieldname AND MOD(f2.useedit,2)=1
   , psqrycriteria c
   WHERE qf1.oprid = r1.oprid AND qf1.qryname = r1.qryname AND qf1.selnum = r1.selnum AND qf1.recname = r1.recname AND qf1.fldrcdnum = r1.rcdnum
   AND   qf2.oprid = r2.oprid AND qf2.qryname = r2.qryname AND qf2.selnum = r2.selnum AND qf2.recname = r2.recname AND qf2.fldrcdnum = r2.rcdnum
   AND    c.oprid = r1.oprid AND  c.qryname = r1.qryname AND  c.selnum = r1.selnum 
   AND   (  (c.lcrtselnum = r1.selnum AND c.lcrtfldnum = qf1.fldnum AND c.r1crtselnum = r2.selnum AND c.r1crtfldnum = qf2.fldnum)
         OR (c.lcrtselnum = r2.selnum AND c.lcrtfldnum = qf2.fldnum AND c.r1crtselnum = r1.selnum AND c.r1crtfldnum = qf1.fldnum))
   AND rownum = 1
  ) num_key_fields
FROM psrecdefn r
, psqryrecord r1
  INNER JOIN psqryrecord r2 ON r1.oprid = r2.oprid AND r1.qryname = r2.qryname AND r1.selnum = r2.selnum AND r1.rcdnum != r2.rcdnum --AND r1.corrname < r2.corrname
WHERE r.recname = r2.recname AND r.parentrecname = r1.recname
)
SELECT /*+LEADING(Q)*/ q.*, x.selnum
, x.rcdnum1, x.recname1, x.corrname1
, x.rcdnum2, x.recname2, x.corrname2, x.num_key_fields
FROM x
  INNER JOIN q ON q.oprid = x.oprid AND q.qryname = x.qryname
WHERE num_key_fields = 0
AND exec_secs >= 600
ORDER BY exec_secs desc

/

Now I have a list of candidate queries that have been used recently and may be missing joins that I investigate further.

                                                                                          Sel Rec1                    Cor Rec2                    Cor #Key
OPRID     QRYNAME                         EXEC_SECS  NUM_EXECS COMPLETE_EXECS  RUNOPRIDS    #    # Record 1           #1     # Record 2           #2  Flds
--------- ------------------------------ ---------- ---------- -------------- ---------- ---- ---- ------------------ --- ---- ------------------ --- ----
UKXXXXXXX AR_VENDOR_LOCATION_DETAILB         264317        361            360          1    1    1 VENDOR             A      8 VNDR_LOC_SCROL     H      0
          XX_COL_MOV_ALT_ACCT2_PERIO         193692       2096           2051         14    1    1 JRNL_HEADER        A      3 OPEN_ITEM_GL       C      0
          APC_123_LEDGER_ACTIVITY_BY_BU      151438       2959           2938         73    2    1 JRNL_HEADER        B      2 JRNL_LN            C      0
MXXXXXX   MT_AUSTRALIA_TAX_PMTS              137471         36             28          1    1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
          XX_PAN_ASIA_JOURNALS_REF           135825         48             47          4    1    1 JRNL_HEADER        A      5 JRNL_OPENITM_VW    E      0
          XXX_STKCOMP_LIFE                   120537        526            523          1    1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
          XXX_123_TB_LEDGER_BAL_BU           100848       2093           2044         17    3    1 JRNL_HEADER        B      2 JRNL_LN            C      0
KXXXXXX   XXX_JRNL_LIST_AUDIT_KL              99843        489            482          1    1    2 JRNL_HEADER        B      1 JRNL_DRILL_VW      A      0
          XXX_JE_ID_QUERY                     86106        156            151          1    1    1 JRNL_HEADER        A      2 JRNL_LN            C      0
          XXX_ACTIVITY_DETAILS_2              85356        336            302          5    1    1 JRNL_HEADER        A      2 JRNL_LN            B      0
…

Anyone can inspect any public queries, but you must be logged in as the owner of a private query to be able to see it.

The scripts in this article can be downloaded from GitHub davidkurtz/psscripts.

No comments :