TOP n Queries

So you’ve been given a relatively simple task of getting the last 5 modified rows from a table, which contains a column that holds the modified date. How do you write the query to do that?

Back in pre-10 versions of Oracle it was likely that the query was written:

SELECT *
FROM   (SELECT *
        FROM   my_table
        ORDER  BY
               created_date DESC)
WHERE  ROWNUM <= 5

… and this works pretty well. If you had written it:

SELECT *
FROM   my_table
WHERE  ROWNUM <= 5
ORDER  BY
       created_date DESC

you would (hopefully) have observed some strange output during your testing and adjusted your query accordingly. For the Oracle newcomers, in the above query Oracle selects 5 rows and then sorts them, not the other way around, so it’s the equivalent of telling Oracle “get any 5 rows from the table and give them to me sorted descending by the created date”.

Somewhere around Oracle 10g the recommendation was to use the ROW_NUMBER analytic function in place of ROWNUM, i.e.

SELECT *
FROM   (SELECT t.*
        ,      ROW_NUMBER() OVER (ORDER BY created_date DESC) rn
        FROM   my_table t)
WHERE  rn <= 5

Now in version 12c of the database Oracle has introduced the FETCH FIRST ROWS syntax for doing exactly this kind of query. It makes things quite simple and clear:

SELECT *
FROM   my_table
ORDER  BY
       created_date DESC
FETCH FIRST 5 ROWS ONLY

Now let’s take a peek under the covers and see what Oracle is actually doing when faced with these queries. To start with we’ll create a simple table with 1,000,000 rows:

CREATE TABLE big_table
   (id            NUMBER(8)   NOT NULL
   ,cat           NUMBER(4)   NOT NULL
   ,padding       CHAR(30)    NOT NULL
   ,last_mod_date DATE        NOT NULL)
/

INSERT INTO big_table
SELECT ROWNUM
,      MOD(ROWNUM,1000) AS cat
,      'x' AS padding
,      TO_DATE('01/01/2000','dd/mm/yyyy') + dbms_random.value(0,5000) AS last_mod_date
FROM   (SELECT 'x'
        FROM   dual
        CONNECT BY ROWNUM <= 1000) x
,      (SELECT 'x'
        FROM   dual
        CONNECT BY ROWNUM <= 1000) y
/        

COMMIT
/

EXEC dbms_stats.gather_table_stats ('','big_table')

CREATE UNIQUE INDEX  big_table_pk
   ON big_table (id)
/

ALTER TABLE big_table
   ADD CONSTRAINT big_table_pk
   PRIMARY KEY (id)
/

Obviously we want to access our last created rows as quick as possible so we’ll index that column:

CREATE INDEX big_table_ix1
   ON big_table(last_mod_date)
/   

We’ll run each query against our table with AUTOTRACE enabled to see the execution plan and cost. As with anything related to Oracle performance it’s important to keep in mind the version that you’re using as things can change across versions. In light of that statement, the following examples were run against a 12.1.0.2 database. First up is our ROWNUM approach:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   (SELECT b.id
  6          ,      b.padding
  7          ,      b.last_mod_date
  8          ,      b.cat
  9          FROM   big_table b
 10          ORDER  BY
 11                 b.last_mod_date DESC) r
 12  WHERE  ROWNUM <= 5
 13  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    545616        616 x                              08/09/2013 23:59:05
    557331        331 x                              08/09/2013 23:57:45
      5220        220 x                              08/09/2013 23:54:28
    874232        232 x                              08/09/2013 23:50:34
    610984        984 x                              08/09/2013 23:39:15


Execution Plan
----------------------------------------------------------
Plan hash value: 2877194421

-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |     5 |   335 |     8   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |               |       |       |            |          |
|   2 |   VIEW                        |               |     5 |   335 |     8   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| BIG_TABLE     |  1000K|    45M|     8   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN DESCENDING| BIG_TABLE_IX1 |     5 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
          0  redo size
        954  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

As we can see from the execution plan, Oracle traversed the index on the LAST_MODIFIED_DATE in a descending fashion, returned each table entry found until, and this is the COUNT STOPKEY bit, it had returned the requested number of rows. All up the query required 9 logical IO operations so it’s very efficient.

Next up we do the same with the ROW_NUMBER query:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   (SELECT b.id
  6          ,      b.cat
  7          ,      b.padding
  8          ,      b.last_mod_date
  9          ,      ROW_NUMBER() OVER (ORDER BY b.last_mod_date DESC) AS rn
 10          FROM   big_table b) r
 11  WHERE  rn <= 5
 12  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    545616        616 x                              08/09/2013 23:59:05
    557331        331 x                              08/09/2013 23:57:45
      5220        220 x                              08/09/2013 23:54:28
    874232        232 x                              08/09/2013 23:50:34
    610984        984 x                              08/09/2013 23:39:15


Execution Plan
----------------------------------------------------------
Plan hash value: 2679878340

----------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           |     5 |   400 |       | 13907   (1)| 00:00:01 |
|*  1 |  VIEW                    |           |     5 |   400 |       | 13907   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|           |  1000K|    45M|    57M| 13907   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | BIG_TABLE |  1000K|    45M|       |  1984   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RN"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("B"."LAST_MOD_DATE")
              DESC )<=5)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7302  consistent gets
          0  physical reads
          0  redo size
        954  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

Well, at least we got back the same rows so our query worked but Oracle completely ignored our index and chose to perform a full table scan, costing a total of 7,302 logical IOs. Ouch!

Moving onto the FETCH FIRST ROWS syntax:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   big_table r
  6  ORDER  BY
  7         r.last_mod_date DESC
  8  FETCH FIRST 5 ROWS ONLY
  9  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    545616        616 x                              08/09/2013 23:59:05
    557331        331 x                              08/09/2013 23:57:45
      5220        220 x                              08/09/2013 23:54:28
    874232        232 x                              08/09/2013 23:50:34
    610984        984 x                              08/09/2013 23:39:15


Execution Plan
----------------------------------------------------------
Plan hash value: 2679878340

----------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           |     5 |   445 |       | 13907   (1)| 00:00:01 |
|*  1 |  VIEW                    |           |     5 |   445 |       | 13907   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|           |  1000K|    45M|    57M| 13907   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | BIG_TABLE |  1000K|    45M|       |  1984   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("R"."LAST_MOD_DATE")
              DESC )<=5)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7302  consistent gets
          0  physical reads
          0  redo size
        954  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

We can see from the execution plan that the FETCH FIRST ROWS syntax has used exactly the same execution plan as the ROW_NUMBER approach; a full table scan. Indeed, the predicate information of the plan shows that it has used the ROW_NUMBER function.

So far we have seen that the ROW_NUMBER and FETCH FIRST ROWS approaches have ignored a suitable index. Let’s look at another scenario that might be a bit more realistic. The table contains a CAT column (for CATegory), which contains 1,000 distinct values. The query we’ll run is “Get the last 5 modified rows for category 42”. For this exercise we’ll create a new index on the category and the last modified date:

CREATE INDEX big_table_ix2
   ON big_table(cat, last_mod_date)

First up the ROWNUM approach:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   (SELECT b.id
  6          ,      b.padding
  7          ,      b.last_mod_date
  8          ,      b.cat
  9          FROM   big_table b
 10          WHERE  cat = 42
 11          ORDER  BY
 12                 b.last_mod_date DESC) r
 13  WHERE  ROWNUM <= 5
 14  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    156042         42 x                              17/08/2013 23:57:57
    118042         42 x                              17/08/2013 12:44:38
    266042         42 x                              11/08/2013 20:13:13
    805042         42 x                              04/08/2013 08:45:18
    151042         42 x                              30/07/2013 06:46:54


Execution Plan
----------------------------------------------------------
Plan hash value: 200163764

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     5 |   335 |     9   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |               |       |       |            |          |
|   2 |   VIEW                         |               |     5 |   335 |     9   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | BIG_TABLE     |  1000 | 48000 |     9   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| BIG_TABLE_IX2 |     5 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)
   4 - access("CAT"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
          0  redo size
        941  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

Similar to our previous query, the ROWNUM approach does uses the new index, keeping the number of logical IO operations down to just 9. Let’s see how the ROW_NUMBER version of our query fares:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   (SELECT b.id
  6          ,      b.cat
  7          ,      b.padding
  8          ,      b.last_mod_date
  9          ,      ROW_NUMBER() OVER (ORDER BY b.last_mod_date DESC) AS rn
 10          FROM   big_table b
 11          WHERE  cat = 42) r
 12  WHERE  rn <= 5
 13  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    156042         42 x                              17/08/2013 23:57:57
    118042         42 x                              17/08/2013 12:44:38
    266042         42 x                              11/08/2013 20:13:13
    805042         42 x                              04/08/2013 08:45:18
    151042         42 x                              30/07/2013 06:46:54

Execution Plan
----------------------------------------------------------
Plan hash value: 1296513801

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     5 |   400 |  1006   (0)| 00:00:01 |
|*  1 |  VIEW                          |               |     5 |   400 |  1006   (0)| 00:00:01 |
|*  2 |   WINDOW NOSORT STOPKEY        |               |  1000 | 48000 |  1006   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | BIG_TABLE     |  1000 | 48000 |  1006   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| BIG_TABLE_IX2 |  1000 |       |     6   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RN"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("B"."LAST_MOD_DATE") DESC
              )<=5)
   4 - access("CAT"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
        941  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

Well, that’s a bit better. With the ROW_NUMBER query Oracle did elect to use the new index. The presence of the number of the INDEX RANGE SCAN DESCENDING and WINDOW NOSORT STOPKEY lines in the execution plan show that Oracle is only accessing the minimum number of entries it needs to satisfy the query, as opposed to all the entries with a CAT value of 42. This is confirmed by the statistics report showing the logical IO operations is just 10.

Now to see if the FETCH FIRST ROWS syntax does the same:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   big_table r
  6  WHERE  cat = 42
  7  ORDER  BY
  8         r.last_mod_date DESC
  9  FETCH FIRST 5 ROWS ONLY
 10  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    156042         42 x                              17/08/2013 23:57:57
    118042         42 x                              17/08/2013 12:44:38
    266042         42 x                              11/08/2013 20:13:13
    805042         42 x                              04/08/2013 08:45:18
    151042         42 x                              30/07/2013 06:46:54


Execution Plan
----------------------------------------------------------
Plan hash value: 1296513801

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     5 |   445 |  1006   (0)| 00:00:01 |
|*  1 |  VIEW                          |               |     5 |   445 |  1006   (0)| 00:00:01 |
|*  2 |   WINDOW NOSORT STOPKEY        |               |  1000 | 48000 |  1006   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | BIG_TABLE     |  1000 | 48000 |  1006   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| BIG_TABLE_IX2 |  1000 |       |     6   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("R"."LAST_MOD_DATE") DESC
              )<=5)
   4 - access("CAT"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
        941  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

As per our previous query, the FETCH FIRST ROWS approach appears to be nothing more than a rewrite of the ROW_NUMBER one. The plan and the statistics are all the same.

Admittedly both of the above query scenarios are somewhat contrived. It is unlikely that the last modified date column would be indexed unless there is a driving need for top n type queries against the data. A more likely scenario would be to have an index on the CAT column only. With just this index in place all three queries performed near identical plans of:

--------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |               |     5 |   335 |  1006   (1)| 00:00:01 |
|*  1 |  COUNT STOPKEY                         |               |       |       |            |          |
|   2 |   VIEW                                 |               |  1000 | 67000 |  1006   (1)| 00:00:01 |
|*  3 |    SORT ORDER BY STOPKEY               |               |  1000 | 48000 |  1006   (1)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| BIG_TABLE     |  1000 | 48000 |  1005   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | BIG_TABLE_IX3 |  1000 |       |     5   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

and had statistics of:

Statistics
---------------------------------------------------
          0  recursive calls
          0  db block gets
       1005  consistent gets
          0  physical reads
          0  redo size
        941  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

In this scenario any approach to finding the last modified rows is the same from a resource consumption perspective. However, it is particularly interesting how efficient a suitable index makes the query though. Without the LAST_MOD_DATE column being indexed the queries now require just over 1,000 logical IO operations, which is 100 times what they required when the column was included from the index. This provides a nice example of creating indexes that are appropriate to the queries being run against the data.

In summary, we have multiple approaches to writing a TOP n type query; a sorted in-line view with ROWNUM filtering, an in-line view with ROW_NUMBER filter and, with 12c, the FETCH FIRST ROWS syntax. If a suitable access path index is in place then all approaches seem roughly equivalent in terms of cost, except for the cast where the query run against the entire table. In this situation only the ROWNUM approach made use of an index on the LAST_MOD_DATE. As per many things related to the query optimiser, check what you might expect is actually happening and adjust accordingly.