More Implicit Date Conversions

A DBA colleague recently approached me with a rather confusing query execution plan. The query itself was relatively simple, somewhat similar to:

SELECT /*+ index (some_table some_table_i1) */
       *
FROM   some_table
WHERE  date_col >= :X

The column DATE_COL was indexed and the query was hinted to use the index as the bind variable value supplied would return a small subset of rows from the table. The execution plan desired was:

-----------------------------------------------------
| Id  | Operation                   | Name          |
-----------------------------------------------------
|   0 | SELECT STATEMENT            |               |
|   1 |  TABLE ACCESS BY INDEX ROWID| SOME_TABLE    |
|*  2 |   INDEX RANGE SCAN          | SOME_TABLE_I1 |
-----------------------------------------------------

but instead Oracle was using:

-----------------------------------------------------
| Id  | Operation                   | Name          |
-----------------------------------------------------
|   0 | SELECT STATEMENT            |               |
|   1 |  TABLE ACCESS BY INDEX ROWID| SOME_TABLE    |
|*  2 |   INDEX FULL SCAN           | SOME_TABLE_I1 |
-----------------------------------------------------

Simulating the query in SQL*Plus showed the index range scan, not the index full scan, which was puzzling. The key to the problem, as identified by Connor McDonald, lay in the Predicate Information section of the plan:

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(INTERNAL_FUNCTION("DATE_COL")>=:X)

Oracle wasn’t using a range scan on the date column because it was applying a function to the column in order to compare it to the bind variable. As it turned out, the bind variable data type supplied by the external application was not a DATE but a TIMESTAMP. So, in doing the comparison, Oracle was casting the DATE column to TIMESTAMP, which defeated the effective use of the index. At this point I kicked myself for not looking closely enough at the detail, especially as I had diagnosed and resolved a near-identical problem a year earlier on a different system.
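One way to work around the problem, assuming the external application cannot be changed to bind a true DATE, is to push the cast onto the bind variable side of the predicate so that the indexed column itself is left untouched. A sketch of the rewritten query, reusing the names from the original example:

```sql
SELECT /*+ index (some_table some_table_i1) */
       *
FROM   some_table
WHERE  date_col >= CAST(:X AS DATE)
```

With the bind cast down to DATE, the comparison is DATE-to-DATE, no function is applied to DATE_COL, and the optimizer is free to use an index range scan.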

We can construct a simple test to illustrate the problem using the following set-up:

CREATE TABLE date_test
   (id        NUMBER(10) NOT NULL
   ,log_date  DATE       NOT NULL
   ,padding   CHAR(30)   NOT NULL
   )
/

INSERT INTO date_test
SELECT ROWNUM
,      TRUNC(SYSDATE) - 100 + (ROWNUM/100)
,      'X'
FROM   dual
CONNECT BY ROWNUM <= 10000
/

EXEC dbms_stats.gather_table_stats ('','date_test',method_opt=>'for all indexed columns size auto')

CREATE UNIQUE INDEX date_test_pk
   ON date_test (id)
/

CREATE INDEX date_test_i1
   ON date_test (log_date)
/

So, we have a simple table with an indexed date column, LOG_DATE, containing 10,000 rows. The values of LOG_DATE evenly span the last 100 days. The execution plan for a simple query using a DATE bind variable can be obtained with:

EXPLAIN PLAN
FOR
SELECT *
FROM   date_test
WHERE  log_date >= TO_DATE(:some_date)
/

SELECT *
FROM   TABLE(dbms_xplan.display())
/

and this results in the following output:

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
Plan hash value: 2693408247

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              | 50000 |  2148K|    87   (0)| 00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DATE_TEST    | 50000 |  2148K|    87   (0)| 00:00:02 |
|*  2 |   INDEX RANGE SCAN          | DATE_TEST_I1 |  9000 |       |    26   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("LOG_DATE">=TO_DATE(:SOME_DATE))

In the above query, note how we explicitly cast the bind as a DATE data type. If we repeat the test but cast the bind variable to a TIMESTAMP, using:

EXPLAIN PLAN
FOR
SELECT /*+ index (date_test date_test_i1) */
       *
FROM   date_test
WHERE  log_date >= TO_TIMESTAMP(:some_date)
/

SELECT *
FROM   TABLE(dbms_xplan.display())
/

we get the following output:

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
Plan hash value: 2054711204

--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              | 50000 |  2148K|  3036   (2)| 00:00:37 |
|   1 |  TABLE ACCESS BY INDEX ROWID| DATE_TEST    | 50000 |  2148K|  3036   (2)| 00:00:37 |
|*  2 |   INDEX FULL SCAN           | DATE_TEST_I1 | 50000 |       |  2697   (2)| 00:00:33 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(INTERNAL_FUNCTION("LOG_DATE")>=TO_TIMESTAMP(:SOME_DATE))

We can see that the Predicate Information section shows a filter operation via the index. So, Oracle is scanning all the entries in the index, casting the date value found in each one to a TIMESTAMP and using that to perform the bind variable comparison. The earlier query using the DATE data type contains an access operation, meaning that Oracle used the index branch structure to traverse directly down to the start of the entries that satisfy the comparison criteria. From there it can move through the index entries knowing that every entry encountered will satisfy the search criteria; no data type casting is required to perform the comparisons. When the bind variable value filters out the majority of the rows in the table, the index range scan operation will be far faster.

In the TIMESTAMP example I had to explicitly hint the query to use the index, otherwise Oracle would choose a full table scan. However, this is exactly what the problematic application query had done, so perhaps the original developers had noticed that the query was not using the index and hinted it without realising why Oracle had avoided the index in the first place.

As noted earlier, I had previously encountered a similar problem in a different application. In that situation the impact was more apparent from a performance perspective, as the table in question was partitioned on the date column. Altering our example table, we can see the impact:

DROP TABLE date_test PURGE
/

CREATE TABLE date_test
   (id        NUMBER(10) NOT NULL
   ,log_date  DATE       NOT NULL
   ,padding   CHAR(30)   NOT NULL
   )
PARTITION BY RANGE (log_date) INTERVAL (INTERVAL '7' DAY)
   (PARTITION p1 VALUES LESS THAN (TO_DATE('01/01/2012','dd/mm/yyyy')))
/

INSERT INTO date_test
SELECT ROWNUM
,      TRUNC(SYSDATE) - 100 + (ROWNUM/100)
,      'X'
FROM   dual
CONNECT BY ROWNUM <= 10000
/

EXEC dbms_stats.gather_table_stats ('','date_test',method_opt=>'for all indexed columns size auto')

CREATE UNIQUE INDEX date_test_pk
   ON date_test (id)
/

CREATE INDEX date_test_i1
   ON date_test (log_date)
   LOCAL
/

Repeating our EXPLAIN PLAN using a TIMESTAMP (I’ve modified the plan output to exclude the row and byte details to keep the overall width down):

EXPLAIN PLAN
FOR
SELECT /*+ INDEX (date_test date_test_i1) */
       *
FROM   date_test
WHERE  log_date >= TO_TIMESTAMP(:some_date)
/

SELECT *
FROM   TABLE(dbms_xplan.display(format=>'-BYTES,-ROWS'))
/

gives the following output:

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
Plan hash value: 600373160

---------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name         | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |              |    35   (3)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ALL               |              |    35   (3)| 00:00:01 |     1 |1048575|
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| DATE_TEST    |    35   (3)| 00:00:01 |     1 |1048575|
|*  3 |    INDEX FULL SCAN                 | DATE_TEST_I1 |    30   (0)| 00:00:01 |     1 |1048575|
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(INTERNAL_FUNCTION("LOG_DATE")>=TO_TIMESTAMP(:SOME_DATE))

The key details in this plan are the Pstart and Pstop columns, which have values 1 and 1048575 respectively. The Pstop value of 1048575 is simply Oracle’s way of saying “the last partition”, as the table is INTERVAL partitioned. So the Pstart and Pstop details tell us that Oracle will read all partitions of the local index. The actual operation performed, an INDEX FULL SCAN, remains the same as for the non-partitioned table.

Contrast this to the plan that would be displayed if the query utilised a DATE value:

EXPLAIN PLAN
FOR
SELECT *
FROM   date_test
WHERE  log_date >= TO_DATE(:some_date)
/

SELECT *
FROM   TABLE(dbms_xplan.display(format=>'-BYTES,-ROWS'))
/

gives the following output:

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
Plan hash value: 146305644

---------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name         | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |              |     3   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE ITERATOR          |              |     3   (0)| 00:00:01 |   KEY |1048575|
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| DATE_TEST    |     3   (0)| 00:00:01 |   KEY |1048575|
|*  3 |    INDEX RANGE SCAN                | DATE_TEST_I1 |     2   (0)| 00:00:01 |   KEY |1048575|
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("LOG_DATE">=TO_DATE(:SOME_DATE))

In this plan the Pstart column now displays KEY, which means that Oracle will use the value of the bind variable supplied to eliminate partitions that it knows cannot contain relevant data. Once again, the operation performed is the same as for the non-partitioned table: an INDEX RANGE SCAN. So, in this example, Oracle is not only using the index more efficiently to access the data in each partition, it is also ensuring that it only accesses partitions that might contain relevant data. For the system where this situation was occurring, the process execution time went from a couple of minutes to sub-second.

Sparse Collections

Anyone familiar with programming languages will know of the array data type. Oracle PL/SQL supports three collection types: associative arrays, nested tables and varrays. The Oracle database engine recognises only two of these, nested tables and varrays; associative arrays are PL/SQL-only constructs.
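As a quick reference, here is a sketch of how each of the three collection types is declared (the type names are illustrative only):

```sql
DECLARE
   -- associative array: PL/SQL only, indexed by PLS_INTEGER or VARCHAR2
   TYPE t_assoc_tab  IS TABLE OF NUMBER INDEX BY PLS_INTEGER;
   -- nested table: no declared upper bound, can also be a database column type
   TYPE t_nested_tab IS TABLE OF NUMBER;
   -- varray: fixed maximum size, can also be a database column type
   TYPE t_varray_tab IS VARRAY(10) OF NUMBER;

   l_assoc  t_assoc_tab;                       -- ready to use as declared
   l_nested t_nested_tab := t_nested_tab();    -- nested tables and varrays
   l_varray t_varray_tab := t_varray_tab();    -- must be initialised
BEGIN
   NULL;
END;
/
```

Note that nested tables and varrays must be initialised via their constructor before use, whereas associative arrays need no initialisation.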

A common way to populate collections is to SELECT…BULK COLLECT into them, as the following example illustrates:

DECLARE
   -- define a nested table type and variable of that type
   TYPE t_emp_tab IS TABLE OF emp%ROWTYPE;
   l_emp_tab t_emp_tab;
BEGIN
   -- populate the collection
   SELECT e.*
   BULK COLLECT INTO l_emp_tab
   FROM emp e;

   -- display the names in the collection
   FOR i IN 1..l_emp_tab.LAST
   LOOP
      dbms_output.put_line (l_emp_tab(i).ename);
   END LOOP;
END;
/

which simply displays the employee names from the EMP table.

SMITH
ALLEN
WARD
JONES
MARTIN
BLAKE
CLARK
SCOTT
KING
TURNER
ADAMS
JAMES
FORD
MILLER

PL/SQL procedure successfully completed.

The above example uses a nested table. We can do exactly the same with an associative array:

DECLARE
   -- define an associative array type and variable of that type
   TYPE t_emp_tab IS TABLE OF emp%ROWTYPE INDEX BY PLS_INTEGER;
   l_emp_tab t_emp_tab;
BEGIN
   -- populate the collection
   SELECT e.*
   BULK COLLECT INTO l_emp_tab
   FROM emp e;

   -- display the names in the collection
   FOR i IN 1..l_emp_tab.LAST
   LOOP
      dbms_output.put_line (l_emp_tab(i).ename);
   END LOOP;
END;
/

and get the same result.

So, why this discussion on collections? Well, I nearly introduced a hard-to-spot bug into an application recently. In both of the examples above you will notice that the FOR loop counter, i, iterated through the values 1 to l_emp_tab.LAST to output the details from the collection. The logic here is that each iteration of the loop gives an index that can be used to access an element in the collection…

… but that logic isn’t quite correct, as it assumes that the collection index starts at 1 and that each element is sequentially numbered such that l_emp_tab.LAST gives the highest index in the collection. This will only hold true when the collection is dense, i.e. has no gaps in its indices. In this particular situation we can safely make the assumption, as SELECT…BULK COLLECT populates the collection in exactly this manner.
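The distinction between the COUNT and LAST collection methods makes the dense assumption easy to demonstrate; a minimal sketch using a nested table of numbers:

```sql
DECLARE
   TYPE t_num_tab IS TABLE OF NUMBER;
   l_tab t_num_tab := t_num_tab(10, 20, 30, 40);
BEGIN
   -- remove the second element, leaving a gap in the indices
   l_tab.DELETE(2);
   -- COUNT reflects the elements that remain...
   dbms_output.put_line ('COUNT = ' || l_tab.COUNT);   -- COUNT = 3
   -- ...while LAST still reports the highest index in use
   dbms_output.put_line ('LAST  = ' || l_tab.LAST);    -- LAST  = 4
END;
/
```

Once COUNT and LAST disagree, a FOR i IN 1..l_tab.LAST loop is guaranteed to reference a deleted index at some point.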

A collection with gaps in its indices is known as a sparse collection. We can see what would happen to our example script if the collection were sparse by simply removing an element from the collection between the SELECT and the FOR loop:

DECLARE
   -- define a nested table type and variable of that type
   TYPE t_emp_tab IS TABLE OF emp%ROWTYPE;
   l_emp_tab t_emp_tab;
BEGIN
   -- populate the collection
   SELECT e.*
   BULK COLLECT INTO l_emp_tab
   FROM emp e;
   
   -- remove the fifth element
   l_emp_tab.DELETE(5);

   -- display the names in the collection
   FOR i IN 1..l_emp_tab.LAST
   LOOP
      dbms_output.put_line (l_emp_tab(i).ename);
   END LOOP;
END;
/

which results in:

SMITH
ALLEN
WARD
JONES
DECLARE
*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at line 17

So, back to the bug I nearly introduced. The code I was dealing with was rather more complex than the scenario above. A routine in one package populated a nested table collection. This collection was part of a larger data structure and was passed back out of the package to the caller. Further along in the code the data structure was passed to a routine in yet another package, which iterated through the collection using a FOR loop in a manner similar to the examples above, i.e. it assumed that the collection was dense.

The code change I was required to make was to remove from the original collection elements that satisfied rather specific criteria. Generally there would not be any elements removed but in rare situations it could occur. My initial solution was to do exactly as I did in the last example above and use the DELETE method to remove the necessary elements. However, realising that this would result in a sparse collection I decided to check whether this would have unintended consequences and that was when I discovered the problem.

So, while I nearly introduced a bug, the assumption by the later routine that the collection provided would be dense is also a type of bug… something I would describe as “a bug in waiting”. Anyone familiar with the principles of coupling and cohesion as they apply to structured programming will understand this. In this situation the coupling between the routine that populated the collection and the routine that consumed the collection went beyond the collection data type itself by the assumption that the collection would be dense. This kind of assumption isn’t represented in the data structure itself and is therefore hard to detect.

So, if a collection is sparse, how can it be processed? One solution is to use the FIRST and NEXT methods to control the loop structure:

DECLARE
   -- define a nested table type and variable of that type
   TYPE t_emp_tab IS TABLE OF emp%ROWTYPE;
   l_emp_tab t_emp_tab;
   i INTEGER;
BEGIN
   -- populate the collection
   SELECT e.*
   BULK COLLECT INTO l_emp_tab
   FROM emp e;
   
   -- remove the fifth element
   l_emp_tab.DELETE(5);
   
   -- obtain the first index in the collection
   i := l_emp_tab.FIRST;

   LOOP
      -- explicitly exit when we've run out of elements in the collection
      EXIT WHEN i IS NULL;
      dbms_output.put_line (l_emp_tab(i).ename);
      -- move from the current element to the next one
      i := l_emp_tab.NEXT(i);
   END LOOP;
END;
/

which gives us the required output:

SMITH
ALLEN
WARD
JONES
BLAKE
CLARK
SCOTT
KING
TURNER
ADAMS
JAMES
FORD
MILLER

PL/SQL procedure successfully completed.

The ideal solution to the code I was dealing with would have been to rewrite the routine that consumed the collection to handle sparse collections. This approach would result in more robust code, as the unnecessary assumption that the collection is dense would have been removed. However, I decided to take the approach of ensuring the collection passed back by the producer was always dense (I believe “chickened out” is the appropriate term). This seemed a more prudent choice in the interests of managing the overall scale of change required, although it does mean that the “bug in waiting” is still present. Hopefully the next developer to touch that bit of code will read the comments that I have left behind though…
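For completeness, the approach I took, compacting the collection so that the consumer always receives a dense one, can be sketched by copying the surviving elements into a new collection (a simplified sketch, not the actual application code):

```sql
DECLARE
   TYPE t_num_tab IS TABLE OF NUMBER;
   l_sparse t_num_tab := t_num_tab(10, 20, 30, 40);
   l_dense  t_num_tab := t_num_tab();
   i INTEGER;
BEGIN
   -- create a gap, as the application logic might
   l_sparse.DELETE(2);

   -- copy the surviving elements into a new, dense collection
   i := l_sparse.FIRST;
   WHILE i IS NOT NULL
   LOOP
      l_dense.EXTEND;
      l_dense(l_dense.LAST) := l_sparse(i);
      i := l_sparse.NEXT(i);
   END LOOP;

   -- l_dense can now be safely iterated with FOR i IN 1..l_dense.LAST
   dbms_output.put_line ('Dense elements: ' || l_dense.COUNT);   -- Dense elements: 3
END;
/
```

The copy costs memory and time proportional to the collection size, which was acceptable in this case given how rarely elements were actually removed.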