Wednesday, April 12, 2006

%CurrentDateIn MetaSQL prevents Oracle CBO from correctly evaluating selectivity of predicate (up to Oracle 9i)

I thought this might be an interesting subject for the first post to a new blog.

Update 6.2.2009: The problem described in this posting is not an issue in Oracle 10g - see Changes in Calculation of Predicate Selectivity in Oracle 10g

I recently discovered the Oracle Cost Based Optimizer choosing a poor execution plan for a particular critical SQL statement in a CRM system because the expansion of the %CurrentDateIn macro is excessively complicated. The problem occurs in a delivered view PS_RBC_PACKAGE_VW and and a custom view PS_XX_RBCPKCLTR_VW. The views both contain a pair of date conditions, which are coded in line with PeopleSoft standards.

AND A.FROM_DATE <= %CurrentDateIn AND A.TO_DATE >= %CurrentDateIn
AND E.FROM_DATE <= %CurrentDateIn AND E.TO_DATE >= %CurrentDateIn

On an Oracle RDBMS, this expands to

AND A.FROM_DATE <= TO_DATE(TO_CHAR(SYSDATE,’YYYY-MM-DD’),’YYYY-MM-DD’) AND A.TO_DATE >= TO_DATE(TO_CHAR(SYSDATE,’YYYY-MM-DD’),’YYYY-MM-DD’)
AND E.FROM_DATE <= TO_DATE(TO_CHAR(SYSDATE,’YYYY-MM-DD’),’YYYY-MM-DD’) AND E.TO_DATE >= TO_DATE(TO_CHAR(SYSDATE,’YYYY-MM-DD’),’YYYY-MM-DD’)

With the result that this statement took over 15 seconds to execute. However, if the view is recoded as follows

AND A.FROM_DATE <= TRUNC(SYSDATE) AND A.TO_DATE >= TRUNC(SYSDATE)
AND E.FROM_DATE <= TRUNC(SYSDATE) AND E.TO_DATE >= TRUNC(SYSDATE)

Then the execution time fell to less than 1 seconds.

To explain why, I shall use a very simple example that is easy to reproduced. In the following script, I have created a table with a 1000 rows and a few columns. Column B is just for padding so that the rows are not unrealistically small. Columns C, D and E contain some dates. The data is evenly distributed in C and D. In E the distribution is deliberately uneven, there are more dates further in the past.

DROP TABLE t1;
CREATE TABLE t1
(a NUMBER NOT NULL
,b VARCHAR2(2000) NOT NULL
,c DATE NOT NULL
,d DATE NOT NULL
,e DATE NOT NULL);

INSERT INTO t1
SELECT rownum
, RPAD(TO_CHAR(TO_DATE(rownum,'J'),'Jsp'),1000,'.') padd
, SYSDATE-rownum+4.2
, SYSDATE-rownum+42
, SYSDATE-sqrt(rownum)+4.2
FROM dba_objects
WHERE rownum <= 1000 ; 

Lets look at the execution plans of a few simple SQLs. Both of the following queries return 4 rows, but the cardinality is very different. In the first statement I have used just the simple TRUNC(SYSDATE), the optimizer has correctly worked out that the query will return 4 rows. However, in the second I have used the expansion of the %CurrentDateIn macro. Because the predicate contains a function, the optimizer uses a hard coded guess that the selectivity of the condition is 5% of the table.

SELECT * FROM t1 WHERE c > TRUNC(SYSDATE);
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=4 Bytes=4096)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=4 Bytes=4096)

SELECT * FROM t1 WHERE c > TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD');
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=50 Bytes=51200)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=50 Bytes=51200)

If I repeat the queries on column D, the cardinality goes up to 41 (it will in fact return 42 rows, but its close), but the cardinality is still 50.

SELECT * FROM t1 WHERE d > TRUNC(SYSDATE);
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=41 Bytes=41984)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=41 Bytes=41984)

SELECT * FROM t1 WHERE d > TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD');
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=50 Bytes=51200)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=50 Bytes=51200)

If I now try column E, the calculated cardinality the calculated is now 115, although the queries actually return 22 rows. The optimizer makes a mistake because the data values are not evenly distributed between the high and low values for the column.

SELECT * FROM t1 WHERE e > TRUNC(SYSDATE);
Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=115 Bytes=117760)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=115 Bytes=117760)

SELECT * FROM t1 WHERE e > TO_DATE(TO_CHAR(SYSDATE,'YYYY-MM-DD'),'YYYY-MM-DD');

Execution Plan
----------------------------------------------------------
  0 SELECT STATEMENT Optimizer=CHOOSE (Cost=257 Card=50 Bytes=51200)
1 0 TABLE ACCESS (FULL) OF 'T1' (Cost=257 Card=50 Bytes=51200)

So this test illustrates that the expansion of %CurrentDateIn prevents Oracle’s CBO from evaluating the selectivity correctly, and that this causes it to use a guess that is sometimes too high, and sometimes too low, but always incorrect. In some cases this will lead to an inappropriate execution plan. If TRUNC(SYSDATE) were used, the optimizer would have better information about the SQL and be able to make better decisions.
I say that this is a bug in PeopleTools. There is, of course, a simple workaround. Simply code TRUNC(SYSDATE) in the application as a customisation where necessary. However, there is no reason why the expansion of this metaSQL could not be changed, and delivered as a PeopleTools patch.

1 comment :

David Kurtz said...

The problem with %CurrentDateIn described in this posting applies to Oracle 9i, but is no longer a problem in 10g. There has been a change to the way Oracle costs predicates containing expressions (see http://blog.psftdba.com/2007/12/changes-in-calculation-of-predicate.html).