Thursday, December 17, 2009

Hinting Sub-Queries on Oracle

This posting is a purely Oracle RDBMS discussion about how to correctly apply hints to sub-queries. However, it is particularly relevant to PeopleSoft, which it makes extensive use of correlated sub-queries. The point of this story is not the particular hints that I applied, but where I placed the hints, and how I scoped them to the sub-queries.

The following SQL extract is from the delivered Global Payroll GPGB_EDI process (although I have already made some minor changes, and PS_GP_RSLT_PIN is partitioned). Notice that each sub-query joins two tables together. PS_GP_RSLT_PIN is second largest of Global Payroll result tables. PS_GP_PIN is a look-up table, and the criterion on PIN_CODE will only return a single row.

UPDATE Table(GPGB_EDIE_TMP) X
  SET X.GPGB_WK53_IND = ( 
 SELECT %TrimSubstr(%Sql(FUNCLIB_HR_CHAR,A.CALC_RSLT_VAL),1,2) 
  FROM PS_GP_RSLT_PIN A 
  ,PS_GP_PIN B 
 WHERE A.EMPLID = %Table(GPGB_EDIE_TMP).EMPLID 
...
AND A.PIN_NUM = B.PIN_NUM 
   AND B.PIN_CODE = 'TAX VR PERIOD GBR' 
   AND A.SLICE_BGN_DT = ( 
 SELECT MAX(D.SLICE_BGN_DT) 
  FROM PS_GP_RSLT_PIN D 
 WHERE D.EMPLID = %Table(GPGB_EDIE_TMP).EMPLID 
...
   AND D.INSTANCE = A.INSTANCE 
   AND D.PIN_NUM = B.PIN_NUM) 
...
   )
 WHERE EXISTS ( 
...
   )
AND PROCESS_INSTANCE = %Bind(PROCESS_INSTANCE)

This is part of the initial SQL execution plan. The problem is that the sub-query starts by scanning through the PS_GP_RSLT_PIN table, 4 times because there are three correlated sub-queries, and only at the very end does it look up the PIN_CODE by the PIN_NUM.

---------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| 
---------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                              |                   |     1 |    65 |   113M (93)| 39:44:51 |       |     
|   1 |  UPDATE                                       | PS_GPGB_EDIE_TMP4 |       |       |            |          |       |     
|*  2 |   FILTER                                      |                   |       |       |            |          |       |     
|*  3 |    TABLE ACCESS FULL                          | PS_GPGB_EDIE_TMP4 |   673K|    41M|  9967   (6)| 00:00:13 |      
|   4 |    NESTED LOOPS                               |                   |     1 |   108 |     4   (0)| 00:00:01 |       |     
|   5 |     PARTITION RANGE SINGLE                    |                   |     1 |    83 |     3   (0)| 00:00:01 |   KEY | 
|   6 |      PARTITION LIST SINGLE                    |                   |     1 |    83 |     3   (0)| 00:00:01 |   KEY | 
|*  7 |       INDEX RANGE SCAN                        | PS_GP_RSLT_PIN    |     1 |    83 |     3   (0)| 00:00:01 |   KEY | 
|   8 |        SORT AGGREGATE                         |                   |     1 |    72 |            |          |       |     
|   9 |         PARTITION RANGE SINGLE                |                   |     1 |    72 |     3   (0)| 00:00:01 |   KEY | 
|  10 |          PARTITION LIST SINGLE                |                   |     1 |    72 |     3   (0)| 00:00:01 |   KEY | 
|* 11 |           INDEX RANGE SCAN                    | PS_GP_RSLT_PIN    |     1 |    72 |     3   (0)| 00:00:01 |   KEY | 
|  12 |            SORT AGGREGATE                     |                   |     1 |    83 |            |          |       |     
|  13 |             PARTITION RANGE SINGLE            |                   |     1 |    83 |     3   (0)| 00:00:01 |   KEY | 
|  14 |              PARTITION LIST SINGLE            |                   |     1 |    83 |     3   (0)| 00:00:01 |   KEY | 
|  15 |               FIRST ROW                       |                   |     1 |    83 |     3   (0)| 00:00:01 |       |     
|* 16 |                INDEX RANGE SCAN (MIN/MAX)     | PS_GP_RSLT_PIN    |     1 |    83 |     3   (0)| 00:00:01 | 
|  17 |                 SORT AGGREGATE                |                   |     1 |    83 |            |          |       |     
|  18 |                  PARTITION RANGE SINGLE       |                   |     1 |    83 |   158  (99)| 00:00:01 |   KEY | 
|  19 |                   PARTITION LIST SINGLE       |                   |     1 |    83 |   158  (99)| 00:00:01 |   KEY | 
|  20 |                    FIRST ROW                  |                   |     1 |    83 |   158  (99)| 00:00:01 |       |     
|* 21 |                     INDEX RANGE SCAN (MIN/MAX)| PS_GP_RSLT_PIN    |     1 |    83 |   158  (99)| 00:00:01 |   KEY |
|* 22 |     INDEX RANGE SCAN                          | PSAGP_PIN         |     1 |    25 |     1   (0)| 00:00:01 |       |     
...
---------------------------------------------------------------------------------------------------------------------------

It would be much better if we started with the PS_GP_PIN table, looked up the PIN_NUM with the PIN_CODE (there is a suitable index on this column) and used the PIN_NUM value as a part of the lookup on PS_GP_RSLT_PIN (again PIN_CODE is in the unique index on that table).

It is tempting to add LEADING hints to the sub-queries, but such a hint does not work because the hint is not scoped to the sub-query and is considered to be invalid because the entire query cannot start at this point.

The only supported place in the query to add a LEADING hint would be the first query, in this case after the UPDATE keyword.

In this case, I have named the query blocks in the sub-queries with the QB_NAME hint. It is valid to put this hint into the sub-query. Then I added LEADING hints for each sub-query after the UPDATE keyword, but I specified their scope using the name of the sub-query specified in the QB_NAME hint. Each sub-query must now start with the PS_GP_PIN table.

UPDATE /*+LEADING(@SUB1 B@SUB1) LEADING(@SUB2 B@SUB2)*/ %Table(GPGB_EDIE_TMP) X
  SET X.GPGB_WK53_IND = ( 
 SELECT /*+QB_NAME(SUB1)*/ %TrimSubstr(%Sql(FUNCLIB_HR_CHAR,A.CALC_RSLT_VAL),1,2) 
  FROM PS_GP_RSLT_PIN A 
  ,PS_GP_PIN B 
 WHERE A.EMPLID = %Table(GPGB_EDIE_TMP).EMPLID 
...
   AND A.PIN_NUM = B.PIN_NUM 
   AND B.PIN_CODE = 'TAX VR PERIOD GBR' 
   AND A.SLICE_BGN_DT = ( 
 SELECT MAX(D.SLICE_BGN_DT) 
  FROM PS_GP_RSLT_PIN D 
 WHERE D.EMPLID = %Table(GPGB_EDIE_TMP).EMPLID 
...
   AND D.INSTANCE = A.INSTANCE 
   AND D.PIN_NUM = B.PIN_NUM) 
...
   )
 WHERE EXISTS ( 
 SELECT /*+QB_NAME(SUB2)*/ 'X' 
  FROM PS_GP_RSLT_PIN A1 
  ,PS_GP_PIN B1 
 WHERE A1.EMPLID = %Table(GPGB_EDIE_TMP).EMPLID 
...
   AND A1.PIN_NUM = B1.PIN_NUM 
   AND B1.PIN_CODE = 'TAX VR PERIOD GBR' 
...
   )
AND PROCESS_INSTANCE = %Bind(PROCESS_INSTANCE)

This is the new execution plan. The sub-query starts with an access of index PSAGP_PIN (which leads on PIN_CODE) on PS_GP_PIN, and then does the lookups on PS_GP_RSLT_PIN. The cost is the same, but the execution time was considerably reduced.

----------------------------------------------------------------------------------------------------
| Id  | Operation                                     | Name              | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| 
----------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                              |                   |     1 |    65 |   113M (93)| 39:44:51 |       |     
|   1 |  UPDATE                                       | PS_GPGB_EDIE_TMP4 |       |       |            |          |       |     
|*  2 |   FILTER                                      |                   |       |       |            |          |       |     
|*  3 |    TABLE ACCESS FULL                          | PS_GPGB_EDIE_TMP4 |   673K|    41M|  9967   (6)| 00:00:13 |       |
|   4 |    NESTED LOOPS                               |                   |     1 |   108 |     4   (0)| 00:00:01 |       |     
|*  5 |     INDEX RANGE SCAN                          | PSAGP_PIN         |     1 |    25 |     2   (0)| 00:00:01 |       |     
|   6 |     PARTITION RANGE SINGLE                    |                   |     1 |    83 |     2   (0)| 00:00:01 |   KEY | 
|   7 |      PARTITION LIST SINGLE                    |                   |     1 |    83 |     2   (0)| 00:00:01 |   KEY | 
|*  8 |       INDEX RANGE SCAN                        | PS_GP_RSLT_PIN    |     1 |    83 |     2   (0)| 00:00:01 |   KEY | 
|   9 |        SORT AGGREGATE                         |                   |     1 |    72 |            |          |       |     
|  10 |         PARTITION RANGE SINGLE                |                   |     1 |    72 |     3   (0)| 00:00:01 |   KEY | 
|  11 |          PARTITION LIST SINGLE                |                   |     1 |    72 |     3   (0)| 00:00:01 |   KEY | 
|* 12 |           INDEX RANGE SCAN                    | PS_GP_RSLT_PIN    |     1 |    72 |     3   (0)| 00:00:01 |   KEY | 
|  13 |            SORT AGGREGATE                     |                   |     1 |    83 |            |          |       |     
|  14 |             PARTITION RANGE SINGLE            |                   |     1 |    83 |     3   (0)| 00:00:01 |   KEY | 
|  15 |              PARTITION LIST SINGLE            |                   |     1 |    83 |     3   (0)| 00:00:01 |   KEY | 
|  16 |               FIRST ROW                       |                   |     1 |    83 |     3   (0)| 00:00:01 |       |     
|* 17 |                INDEX RANGE SCAN (MIN/MAX)     | PS_GP_RSLT_PIN    |     1 |    83 |     3   (0)| 00:00:01 |       |
|  18 |                 SORT AGGREGATE                |                   |     1 |    83 |            |          |       |     
|  19 |                  PARTITION RANGE SINGLE       |                   |     1 |    83 |   158  (99)| 00:00:01 |   KEY | 
|  20 |                   PARTITION LIST SINGLE       |                   |     1 |    83 |   158  (99)| 00:00:01 |   KEY | 
|  21 |                    FIRST ROW                  |                   |     1 |    83 |   158  (99)| 00:00:01 |       |     
|* 22 |                     INDEX RANGE SCAN (MIN/MAX)| PS_GP_RSLT_PIN    |     1 |    83 |   158  (99)| 00:00:01 |   KEY |
|  23 |   TABLE ACCESS BY LOCAL INDEX ROWID           | PS_GP_RSLT_PIN    |     1 |    86 |     3   (0)| 00:00:01 |       |
...
---------------------------------------------------------------------------------------------------------------------------