Index Compression

Whilst preparing this post I noticed that I’ve been beaten to it by Connor McDonald’s recent post on index compression. Since my post was virtually ready I’m going to post it anyway but make sure to read Connor’s and follow the links to Richard Foote’s articles too.

Index compression is a feature that’s been around since Oracle 8i. If an index has repeating values in its leading columns (or “attributes”, since we can index expressions… I’ll stick with the term “columns” for the rest of the post), Oracle can optimise the storage of the index entries and fit more entries per block, producing an index with a smaller on-disk footprint.

The syntax for compressing an index is:

CREATE INDEX <index_name> 
   ON <table_name> (<columns>)
   COMPRESS <n>

where n is the number of leading columns to compress. Obviously n cannot exceed the number of columns in the index, and since it may not be worth compressing on every column, n is often less than the number of columns in the index.
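
For example, on a hypothetical two-column index where only the leading column has repeating values, we might compress just that column:

CREATE INDEX emp_dept_ix
   ON emp (deptno, ename)
   COMPRESS 1
/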

“How much smaller will the index be?” I hear you ask. Well, as with many things, results will vary. Thankfully Oracle will tell you what you can expect to achieve, so you can decide whether to proceed before actually performing the operation.

Let’s take a look at how we would go about assessing an index for compression. We’ll start with a table containing 4 million rows:

CREATE TABLE index_demo
    (id         NUMBER(10)  NOT NULL
    ,val1       NUMBER(4) NOT NULL
    ,val2       NUMBER(4) NOT NULL)
/

INSERT INTO index_demo
WITH r AS (SELECT ROWNUM rn FROM dual CONNECT BY ROWNUM <= 2000)
SELECT ROWNUM
,      MOD(ROWNUM,100)
,      TRUNC(ROWNUM/1001)
FROM   r, r
/

COMMIT
/

EXEC dbms_stats.gather_table_stats ('','index_demo')

With this table, ID is unique, VAL1 has 100 distinct values and VAL2 has roughly 4,000 distinct values, each repeating about 1,000 times. Next we create 4 different indexes; two non-unique and two unique:

CREATE INDEX index_demo_i1
   ON index_demo (val1)
/

CREATE INDEX index_demo_i2
   ON index_demo (val2, val1)
/

CREATE UNIQUE INDEX index_demo_i3
   ON index_demo (val1, id)
/

CREATE UNIQUE INDEX index_demo_i4
   ON index_demo (id, val2)
/

Next we’ll get some basic stats for our indexes:

SELECT index_name
,      blevel
,      leaf_blocks
,      distinct_keys
,      avg_leaf_blocks_per_key AS avg_lf_key
,      num_rows
FROM   user_ind_statistics
WHERE  table_name = 'INDEX_DEMO'
ORDER  BY
       index_name
/

On my 12.1.0.2 database with an 8 KB block size, this gave the following:

INDEX_NAME       BLEVEL LEAF_BLOCKS DISTINCT_KEYS AVG_LF_KEY   NUM_ROWS
--------------- ------- ----------- ------------- ---------- ----------
INDEX_DEMO_I1         2        7792           100         77    4000000
INDEX_DEMO_I2         2       10006        399605          1    4000000
INDEX_DEMO_I3         2       10431       4000000          1    4000000
INDEX_DEMO_I4         2       10982       4000000          1    4000000

The key metric we’re interested in is LEAF_BLOCKS: when we compress an index we want to see a reduction in it. Oracle will tell us what to expect if we analyse the index and then look at the statistics gathered:

ANALYZE INDEX index_demo_i1 VALIDATE STRUCTURE
/

SELECT opt_cmpr_count
,      opt_cmpr_pctsave
FROM   index_stats
/

The two measures from INDEX_STATS are the recommended number of columns to compress and the percentage saving in space that can be expected. Note that INDEX_STATS only ever holds one row, for the most recently analysed index, so the ANALYZE must be run once per index. Repeating the above for each index we end up with:

INDEX_NAME       OPT_CMPR_COUNT OPT_CMPR_PCTSAVE
--------------- --------------- ----------------
INDEX_DEMO_I1                 1               21
INDEX_DEMO_I2                 2               31
INDEX_DEMO_I3                 1               15
INDEX_DEMO_I4                 0                0

So, from the above, Oracle is recommending we use COMPRESS 1 for index INDEX_DEMO_I1 and we should see a 21% reduction in space used by our index. Compression on the first of our unique indexes, INDEX_DEMO_I3, is still possible as the first column has repeating values. Oracle is saying we should see a 15% reduction in space usage. Our second unique index, INDEX_DEMO_I4, leads with a column that is unique and so Oracle says that compression isn’t worthwhile on that index (note, it’s actually still possible to apply compression to the index but you will end up increasing the size of the index!).
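
If you want to see that last point for yourself, rebuild INDEX_DEMO_I4 with compression and re-query its statistics; LEAF_BLOCKS will go up rather than down. Just remember to rebuild it NOCOMPRESS afterwards to undo the experiment:

ALTER INDEX index_demo_i4
   REBUILD
   COMPRESS 1
/

SELECT leaf_blocks
FROM   user_ind_statistics
WHERE  index_name = 'INDEX_DEMO_I4'
/

ALTER INDEX index_demo_i4
   REBUILD
   NOCOMPRESS
/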

Now let’s test out Oracle’s recommendations and see how accurate they are. We could simply drop and recreate the indexes with the compress option but we can also rebuild them on-line, which is great for a system with high availability requirements.

ALTER INDEX index_demo_i1
   REBUILD
   COMPRESS 1
   ONLINE
/   

ALTER INDEX index_demo_i2
   REBUILD
   COMPRESS 2
   ONLINE
/   

ALTER INDEX index_demo_i3
   REBUILD
   COMPRESS 1
   ONLINE
/   

Rerunning our query against USER_IND_STATISTICS we get:

INDEX_NAME       BLEVEL LEAF_BLOCKS DISTINCT_KEYS AVG_LF_KEY   NUM_ROWS
--------------- ------- ----------- ------------- ---------- ----------
INDEX_DEMO_I1         2        6145           100         61    4000000
INDEX_DEMO_I2         2        6863        399605          1    4000000
INDEX_DEMO_I3         2        8782       4000000          1    4000000
INDEX_DEMO_I4         2       10982       4000000          1    4000000

Some quick calculations show the compression actually achieved:

INDEX_NAME       PCT COMPRESSION
--------------- ----------------
INDEX_DEMO_I1                 21
INDEX_DEMO_I2                 30
INDEX_DEMO_I3                 16
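
For the record, these percentages come straight from the leaf block counts before and after the rebuild, e.g. (values hard-coded from the two listings above):

SELECT ROUND(100 * (7792 - 6145) / 7792)   AS i1_pct   -- 21
,      ROUND(100 * (10431 - 8782) / 10431) AS i3_pct   -- 16
FROM   dual
/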

We can see that Oracle’s estimates of compression were quite accurate.

Index compression is a feature that can result in a significant space saving for any moderately large database. By the time the saving is multiplied across development and test environments and backup storage, it’s pretty much a no-brainer to use it wherever possible.

Flashback Data Archive

I recently started work on a small APEX application. The owner of the application requested that all data changes be tracked, which was not an unreasonable request given the nature of the application. In days of yore, i.e. pre-Oracle 12c, this would have involved something like custom PL/SQL code within a data access API or a (*shudder*) trigger-based strategy. However, recalling Bjoern Rost’s presentation at the 2014 AusOUG conference I decided to look into using Flashback Data Archive (FDA) instead… and I’m so glad I did!

Flashback Data Archive essentially involves defining a retention policy, e.g. something like “1 year”, and associating tables with that policy. Oracle then handles the rest. Let’s take a look at how easy it is to set this up. We will:

  • define a new tablespace
  • create an FDA policy (specifying the new tablespace as the storage)
  • create an FDA application (which is nothing more than a container for tables to allow easy FDA management; 12c only)
  • associate the application with the FDA policy
  • enable the FDA

An example is in order, and we’ll use Oracle’s standard DEPT and EMP tables for it (on a 12.1.0.2 database). Assuming you’ve already got the tables set up in the SCOTT account, we’ll create an FDA to track 2 years of data changes:

-- Create the FDA tablespace
CREATE TABLESPACE scott_fda_data 
   DATAFILE '<<path>>/scott_fda_data.dbf'
   SIZE 1M 
   AUTOEXTEND ON NEXT 1M
   MAXSIZE 200M
/

-- Grant SCOTT access to the new tablespace
ALTER USER scott 
   QUOTA UNLIMITED ON scott_fda_data
/
   
-- Create the flashback data archive policy
CREATE FLASHBACK ARCHIVE scott_fda 
   TABLESPACE scott_fda_data
   QUOTA 200M 
   RETENTION 2 YEAR
/   

-- Set up an Application so we can control FDA for all tables (Oracle 12c only)
EXEC dbms_flashback_archive.register_application('SCOTT','SCOTT_FDA')

EXEC dbms_flashback_archive.add_table_to_application ('SCOTT','DEPT','SCOTT')
EXEC dbms_flashback_archive.add_table_to_application ('SCOTT','EMP','SCOTT')

EXEC dbms_flashback_archive.enable_application ('SCOTT')

Now that we’ve got our FDA in place, let’s make some data changes. The script below makes use of DBMS_LOCK.SLEEP to insert delays of 1 minute. This is done so that later on we can query the tables as they looked back in time using semi-realistic examples.

UPDATE emp
SET    sal = sal + 10;
COMMIT;

EXEC dbms_lock.sleep (60)

INSERT INTO EMP VALUES
(8000,'MARK','MANAGER',7839,to_date('01-01-2015','dd-mm-yyyy'),2000,NULL,40);
COMMIT;

EXEC dbms_lock.sleep (60)

INSERT INTO EMP VALUES
(8010,'JAMES','DEVELOPER',8000,to_date('15-01-2015','dd-mm-yyyy'),2500,NULL,40);
COMMIT;

EXEC dbms_lock.sleep (60)

INSERT INTO EMP VALUES
(8020,'HEATHER','DEVELOPER',8000,to_date('20-01-2015','dd-mm-yyyy'),2200,NULL,40);
COMMIT;

EXEC dbms_lock.sleep (60)

UPDATE emp
SET    sal = sal + 500
WHERE  empno = 8000;
COMMIT;

Of course, there’s little point tracking data changes if you can’t easily access them. Simple flashback query clauses on SELECT allow us to query the source tables either as they looked at a specific point in time, or to show all the changes that have occurred over time.

For example, to view the data in a table at a specific point in time we use:

SELECT *
FROM   <table_name> AS OF TIMESTAMP <timestamp_expr>

So, for our example table we could run something like the following to find out what the EMP data looked like 5 minutes ago, just before we did the operations shown above:

SELECT *
FROM   emp AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '5' MINUTE)
ORDER  BY
       empno;

If you are wanting to get a full version history of the changes made then you can use the VERSIONS BETWEEN clause:

SELECT e.*
,      versions_starttime
,      versions_endtime
FROM   emp VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE e
WHERE  empno = 8000
ORDER  BY
       versions_starttime;

The above query includes the pseudo-columns VERSIONS_STARTTIME and VERSIONS_ENDTIME, which provide the date range for which a specific row version was effective.

Now for some FDA details not mentioned above:

  • Flashback Data Archive is available in all versions of Oracle from 11.2.0.4 without additional licensing. In earlier 11g releases it required the Advanced Compression licence, but that restriction has been removed now that the archive tables are, by default, not compressed.
  • Only Oracle 12c supports the ability to create flashback applications. In 11g you associate tables directly with the FDA one by one using ALTER TABLE <<table>> FLASHBACK ARCHIVE <<fda>>. This means the FDA tracking won’t be started/stopped at precisely the same moment in time for all tables.
  • You’ll need to disable FDA whilst making DDL changes to the tracked tables, via dbms_flashback_archive.disable_application (see the sketch after this list).
  • In order to query the tables as at an earlier time period you require the FLASHBACK privilege on the table, or the FLASHBACK ANY TABLE privilege.
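
For example, a DDL change could be wrapped along the following lines (a minimal sketch using the 12c application-level routines from the example above; the added column is purely illustrative):

EXEC dbms_flashback_archive.disable_application ('SCOTT')

ALTER TABLE emp ADD (middle_name VARCHAR2(30))
/

EXEC dbms_flashback_archive.enable_application ('SCOTT')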

In summary, Flashback Data Archive provides an effective means of tracking data changes on a table. Not only is the implementation simple, the interface for accessing the historical data, via extensions to the SELECT statement, is intuitive and easy to harness.

Storing Active Session History

If you’ve ever had to diagnose a production problem in Oracle you will know the huge benefit that Active Session History (commonly known as ASH) gives you. In summary, ASH provides you with a one second sampling of all active sessions in the database. As noted previously this data is stored in a memory buffer that is overwritten in a circular fashion. This means that there’s no defined period of time that the data is retained; a busy system will age out data quicker than one with little activity. Clearly it would be beneficial to retain the ASH data according to a time-based policy. That way when an application developer comes to you with a slow query problem two days after the fact you won’t be left wondering if the ASH data for that period of time is available or not.

If you take a look at v$active_session_history you will note that the first column is named SAMPLE_ID. The Oracle Database Reference simply notes this to be “ID of the sample” but that’s enough to indicate that it should be unique. Armed with this knowledge it isn’t hard to regularly capture the contents of v$active_session_history into a separate table so that it’s available should you require it. Furthermore, if you range partition the table on SAMPLE_TIME then old data can be removed by simply dropping partitions.

CREATE TABLE ash_hist
PARTITION BY RANGE (sample_time) INTERVAL (INTERVAL '1' DAY)
   (PARTITION ash_hist_p1 VALUES LESS THAN (TO_TIMESTAMP('01/01/2015','dd/mm/yyyy')))
AS
SELECT *
FROM   v$active_session_history;

A capture routine that can be scheduled using the database scheduler, DBMS_JOB or an external job scheduling application can be as simple as:

DECLARE
   l_max_sample_id   ash_hist.sample_id%TYPE;
BEGIN
   SELECT MAX(sample_id)
   INTO   l_max_sample_id
   FROM   ash_hist
   -- add a filter to stop scanning all partitions. just make sure
   -- the interval falls within the capture time period
   WHERE  sample_time >= CAST(SYSTIMESTAMP AS TIMESTAMP) - INTERVAL '1' DAY;

   INSERT INTO ash_hist
   SELECT *
   FROM   v$active_session_history
   -- strictly greater than, so the most recent sample already
   -- captured isn't inserted a second time
   WHERE  sample_id > NVL(l_max_sample_id,-1);

   COMMIT;
END;
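
If you wanted to use the database scheduler, a job along these lines would run the capture every 10 minutes (a sketch that assumes the block above has been wrapped in a procedure named capture_ash_hist; the job name and interval are arbitrary):

BEGIN
   dbms_scheduler.create_job (
      job_name        => 'CAPTURE_ASH_HIST'
     ,job_type        => 'PLSQL_BLOCK'
     ,job_action      => 'BEGIN capture_ash_hist; END;'
     ,repeat_interval => 'FREQ=MINUTELY;INTERVAL=10'
     ,enabled         => TRUE);
END;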

Dropping the old partitions once they exceed their usefulness can be done via a routine similar to the following:

DECLARE
   -- A cursor that returns the partition details for
   -- the ASH_HIST table
   CURSOR tab_partition_cur
   IS
   SELECT tp.partition_name
   ,      tp.high_value
   ,      tp.high_value_length
   FROM   user_tab_partitions tp
   WHERE  tp.table_name     = 'ASH_HIST'
   -- we sort the cursor from the earliest partition to the latest
   -- once we've hit one that we will not drop then we can stop
   -- processing the cursor
   ORDER  BY
          partition_position;
          
   tab_partition_rec    tab_partition_cur%ROWTYPE;          

   c_retention_days      CONSTANT NUMBER(4) := 60;
   l_earliest_timestamp  ash_hist.sample_time%TYPE;
   l_partition_timestamp ash_hist.sample_time%TYPE;
   l_continue_processing BOOLEAN := TRUE;   
BEGIN
   
   l_earliest_timestamp := CAST(SYSTIMESTAMP AS TIMESTAMP) - 
                           NUMTODSINTERVAL(c_retention_days,'DAY');
                                 
   -- reset the start of the interval partitioning so we can drop
   -- the old partitions
   EXECUTE IMMEDIATE 'ALTER TABLE ash_hist SET INTERVAL(INTERVAL ''1'' DAY)';
   
   OPEN tab_partition_cur;
   
   WHILE (l_continue_processing)
   LOOP

      FETCH tab_partition_cur INTO tab_partition_rec;

      -- we stop if we run out of partitions to process
      IF (tab_partition_cur%NOTFOUND) THEN
         l_continue_processing := FALSE;
         
      ELSE
      
         -- extract the high value as a TIMESTAMP
         EXECUTE IMMEDIATE 'SELECT ' || SUBSTR (tab_partition_rec.high_value
                                               ,1
                                               ,tab_partition_rec.high_value_length) || 
                           ' FROM dual' INTO l_partition_timestamp;

         -- if the partition is older than our retention period we drop it
         IF (l_partition_timestamp <= l_earliest_timestamp)
         THEN
            EXECUTE IMMEDIATE 'ALTER TABLE ash_hist DROP PARTITION ' || 
                              tab_partition_rec.partition_name;

         -- we've hit a cursor entry for a partition that we won't be dropping so
         -- we can stop as the remaining cursor entries will be the same      
         ELSE
            l_continue_processing := FALSE;
         END IF;
         
      END IF;
   
   END LOOP;
   
   CLOSE tab_partition_cur;
   
END;

Note, Active Session History requires an Enterprise Edition database and is licensed as part of the Diagnostics Pack. Please make sure you have the appropriate licences before using this facility.

Credits: Thanks go to Connor McDonald for setting up a system like this where I currently work. I’ve found it to be the single most invaluable tool for troubleshooting.

TOP n Queries

So you’ve been given a relatively simple task of getting the last 5 modified rows from a table, which contains a column that holds the modified date. How do you write the query to do that?

Back in pre-10g versions of Oracle it was likely that the query was written:

SELECT *
FROM   (SELECT *
        FROM   my_table
        ORDER  BY
               created_date DESC)
WHERE  ROWNUM <= 5

… and this works pretty well. If you had written it:

SELECT *
FROM   my_table
WHERE  ROWNUM <= 5
ORDER  BY
       created_date DESC

you would (hopefully) have observed some strange output during your testing and adjusted your query accordingly. For the Oracle newcomers, in the above query Oracle selects 5 rows and then sorts them, not the other way around, so it’s the equivalent of telling Oracle “get any 5 rows from the table and give them to me sorted descending by the created date”.

Somewhere around Oracle 10g the recommendation was to use the ROW_NUMBER analytic function in place of ROWNUM, i.e.

SELECT *
FROM   (SELECT t.*
        ,      ROW_NUMBER() OVER (ORDER BY created_date DESC) rn
        FROM   my_table t)
WHERE  rn <= 5

Now in version 12c of the database Oracle has introduced the FETCH FIRST ROWS syntax for doing exactly this kind of query. It makes things quite simple and clear:

SELECT *
FROM   my_table
ORDER  BY
       created_date DESC
FETCH FIRST 5 ROWS ONLY

Now let’s take a peek under the covers and see what Oracle is actually doing when faced with these queries. To start with we’ll create a simple table with 1,000,000 rows:

CREATE TABLE big_table
   (id            NUMBER(8)   NOT NULL
   ,cat           NUMBER(4)   NOT NULL
   ,padding       CHAR(30)    NOT NULL
   ,last_mod_date DATE        NOT NULL)
/

INSERT INTO big_table
SELECT ROWNUM
,      MOD(ROWNUM,1000) AS cat
,      'x' AS padding
,      TO_DATE('01/01/2000','dd/mm/yyyy') + dbms_random.value(0,5000) AS last_mod_date
FROM   (SELECT 'x'
        FROM   dual
        CONNECT BY ROWNUM <= 1000) x
,      (SELECT 'x'
        FROM   dual
        CONNECT BY ROWNUM <= 1000) y
/        

COMMIT
/

EXEC dbms_stats.gather_table_stats ('','big_table')

CREATE UNIQUE INDEX  big_table_pk
   ON big_table (id)
/

ALTER TABLE big_table
   ADD CONSTRAINT big_table_pk
   PRIMARY KEY (id)
/

Obviously we want to access the most recently modified rows as quickly as possible so we’ll index that column:

CREATE INDEX big_table_ix1
   ON big_table(last_mod_date)
/   

We’ll run each query against our table with AUTOTRACE enabled to see the execution plan and cost. As with anything related to Oracle performance it’s important to keep in mind the version that you’re using as things can change across versions. In light of that statement, the following examples were run against a 12.1.0.2 database. First up is our ROWNUM approach:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   (SELECT b.id
  6          ,      b.padding
  7          ,      b.last_mod_date
  8          ,      b.cat
  9          FROM   big_table b
 10          ORDER  BY
 11                 b.last_mod_date DESC) r
 12  WHERE  ROWNUM <= 5
 13  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    545616        616 x                              08/09/2013 23:59:05
    557331        331 x                              08/09/2013 23:57:45
      5220        220 x                              08/09/2013 23:54:28
    874232        232 x                              08/09/2013 23:50:34
    610984        984 x                              08/09/2013 23:39:15


Execution Plan
----------------------------------------------------------
Plan hash value: 2877194421

-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |     5 |   335 |     8   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |               |       |       |            |          |
|   2 |   VIEW                        |               |     5 |   335 |     8   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| BIG_TABLE     |  1000K|    45M|     8   (0)| 00:00:01 |
|   4 |     INDEX FULL SCAN DESCENDING| BIG_TABLE_IX1 |     5 |       |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
          0  redo size
        954  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

As we can see from the execution plan, Oracle traversed the index on LAST_MOD_DATE in descending order and returned each table row found until (this is the COUNT STOPKEY bit) it had returned the requested number of rows. All up the query required 9 logical IO operations, so it’s very efficient.

Next up we do the same with the ROW_NUMBER query:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   (SELECT b.id
  6          ,      b.cat
  7          ,      b.padding
  8          ,      b.last_mod_date
  9          ,      ROW_NUMBER() OVER (ORDER BY b.last_mod_date DESC) AS rn
 10          FROM   big_table b) r
 11  WHERE  rn <= 5
 12  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    545616        616 x                              08/09/2013 23:59:05
    557331        331 x                              08/09/2013 23:57:45
      5220        220 x                              08/09/2013 23:54:28
    874232        232 x                              08/09/2013 23:50:34
    610984        984 x                              08/09/2013 23:39:15


Execution Plan
----------------------------------------------------------
Plan hash value: 2679878340

----------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           |     5 |   400 |       | 13907   (1)| 00:00:01 |
|*  1 |  VIEW                    |           |     5 |   400 |       | 13907   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|           |  1000K|    45M|    57M| 13907   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | BIG_TABLE |  1000K|    45M|       |  1984   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RN"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("B"."LAST_MOD_DATE")
              DESC )<=5)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7302  consistent gets
          0  physical reads
          0  redo size
        954  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

Well, at least we got back the same rows so our query worked but Oracle completely ignored our index and chose to perform a full table scan, costing a total of 7,302 logical IOs. Ouch!

Moving onto the FETCH FIRST ROWS syntax:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   big_table r
  6  ORDER  BY
  7         r.last_mod_date DESC
  8  FETCH FIRST 5 ROWS ONLY
  9  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    545616        616 x                              08/09/2013 23:59:05
    557331        331 x                              08/09/2013 23:57:45
      5220        220 x                              08/09/2013 23:54:28
    874232        232 x                              08/09/2013 23:50:34
    610984        984 x                              08/09/2013 23:39:15


Execution Plan
----------------------------------------------------------
Plan hash value: 2679878340

----------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           |     5 |   445 |       | 13907   (1)| 00:00:01 |
|*  1 |  VIEW                    |           |     5 |   445 |       | 13907   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|           |  1000K|    45M|    57M| 13907   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | BIG_TABLE |  1000K|    45M|       |  1984   (1)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("R"."LAST_MOD_DATE")
              DESC )<=5)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7302  consistent gets
          0  physical reads
          0  redo size
        954  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

We can see from the execution plan that the FETCH FIRST ROWS syntax has used exactly the same execution plan as the ROW_NUMBER approach; a full table scan. Indeed, the predicate information of the plan shows that it has used the ROW_NUMBER function.

So far we have seen that the ROW_NUMBER and FETCH FIRST ROWS approaches have ignored a suitable index. Let’s look at another scenario that might be a bit more realistic. The table contains a CAT column (for CATegory), which contains 1,000 distinct values. The query we’ll run is “Get the last 5 modified rows for category 42”. For this exercise we’ll create a new index on the category and the last modified date:

CREATE INDEX big_table_ix2
   ON big_table(cat, last_mod_date)
/

First up the ROWNUM approach:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   (SELECT b.id
  6          ,      b.padding
  7          ,      b.last_mod_date
  8          ,      b.cat
  9          FROM   big_table b
 10          WHERE  cat = 42
 11          ORDER  BY
 12                 b.last_mod_date DESC) r
 13  WHERE  ROWNUM <= 5
 14  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    156042         42 x                              17/08/2013 23:57:57
    118042         42 x                              17/08/2013 12:44:38
    266042         42 x                              11/08/2013 20:13:13
    805042         42 x                              04/08/2013 08:45:18
    151042         42 x                              30/07/2013 06:46:54


Execution Plan
----------------------------------------------------------
Plan hash value: 200163764

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     5 |   335 |     9   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                 |               |       |       |            |          |
|   2 |   VIEW                         |               |     5 |   335 |     9   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | BIG_TABLE     |  1000 | 48000 |     9   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| BIG_TABLE_IX2 |     5 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=5)
   4 - access("CAT"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
          0  redo size
        941  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

Similar to our previous query, the ROWNUM approach uses the new index, keeping the number of logical IO operations down to just 9. Let’s see how the ROW_NUMBER version of our query fares:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   (SELECT b.id
  6          ,      b.cat
  7          ,      b.padding
  8          ,      b.last_mod_date
  9          ,      ROW_NUMBER() OVER (ORDER BY b.last_mod_date DESC) AS rn
 10          FROM   big_table b
 11          WHERE  cat = 42) r
 12  WHERE  rn <= 5
 13  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    156042         42 x                              17/08/2013 23:57:57
    118042         42 x                              17/08/2013 12:44:38
    266042         42 x                              11/08/2013 20:13:13
    805042         42 x                              04/08/2013 08:45:18
    151042         42 x                              30/07/2013 06:46:54

Execution Plan
----------------------------------------------------------
Plan hash value: 1296513801

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     5 |   400 |  1006   (0)| 00:00:01 |
|*  1 |  VIEW                          |               |     5 |   400 |  1006   (0)| 00:00:01 |
|*  2 |   WINDOW NOSORT STOPKEY        |               |  1000 | 48000 |  1006   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | BIG_TABLE     |  1000 | 48000 |  1006   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| BIG_TABLE_IX2 |  1000 |       |     6   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("RN"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("B"."LAST_MOD_DATE") DESC
              )<=5)
   4 - access("CAT"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
        941  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

Well, that’s a bit better. With the ROW_NUMBER query Oracle did elect to use the new index. The presence of the INDEX RANGE SCAN DESCENDING and WINDOW NOSORT STOPKEY operations in the execution plan shows that Oracle is only accessing the minimum number of entries it needs to satisfy the query, as opposed to all the entries with a CAT value of 42. This is confirmed by the statistics report, which shows just 10 logical IO operations.

Now to see if the FETCH FIRST ROWS syntax does the same:

SQL>SELECT r.id
  2  ,      r.cat
  3  ,      r.padding
  4  ,      r.last_mod_date
  5  FROM   big_table r
  6  WHERE  cat = 42
  7  ORDER  BY
  8         r.last_mod_date DESC
  9  FETCH FIRST 5 ROWS ONLY
 10  /

        ID        CAT PADDING                        LAST_MOD_DATE
---------- ---------- ------------------------------ -------------------
    156042         42 x                              17/08/2013 23:57:57
    118042         42 x                              17/08/2013 12:44:38
    266042         42 x                              11/08/2013 20:13:13
    805042         42 x                              04/08/2013 08:45:18
    151042         42 x                              30/07/2013 06:46:54


Execution Plan
----------------------------------------------------------
Plan hash value: 1296513801

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     5 |   445 |  1006   (0)| 00:00:01 |
|*  1 |  VIEW                          |               |     5 |   445 |  1006   (0)| 00:00:01 |
|*  2 |   WINDOW NOSORT STOPKEY        |               |  1000 | 48000 |  1006   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID | BIG_TABLE     |  1000 | 48000 |  1006   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN DESCENDING| BIG_TABLE_IX2 |  1000 |       |     6   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("R"."LAST_MOD_DATE") DESC
              )<=5)
   4 - access("CAT"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
        941  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

As per our previous query, the FETCH FIRST ROWS approach appears to be nothing more than a rewrite of the ROW_NUMBER one. The plan and the statistics are all the same.

Admittedly both of the above query scenarios are somewhat contrived. It is unlikely that the last modified date column would be indexed unless there is a driving need for top n type queries against the data. A more likely scenario would be to have an index on the CAT column only. With just this index in place all three queries produced near-identical plans:
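
(The index used here, BIG_TABLE_IX3, isn’t created in the listings above; to reproduce this scenario you would do something along these lines, leaving just the primary key and the new index in place:)

DROP INDEX big_table_ix1
/

DROP INDEX big_table_ix2
/

CREATE INDEX big_table_ix3
   ON big_table(cat)
/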

--------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |               |     5 |   335 |  1006   (1)| 00:00:01 |
|*  1 |  COUNT STOPKEY                         |               |       |       |            |          |
|   2 |   VIEW                                 |               |  1000 | 67000 |  1006   (1)| 00:00:01 |
|*  3 |    SORT ORDER BY STOPKEY               |               |  1000 | 48000 |  1006   (1)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| BIG_TABLE     |  1000 | 48000 |  1005   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN                  | BIG_TABLE_IX3 |  1000 |       |     5   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

and had statistics of:

Statistics
---------------------------------------------------
          0  recursive calls
          0  db block gets
       1005  consistent gets
          0  physical reads
          0  redo size
        941  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

In this scenario all approaches to finding the last modified rows are the same from a resource consumption perspective. However, it is particularly interesting how efficient a suitable index makes the query. Without the LAST_MOD_DATE column in the index the queries now require just over 1,000 logical IO operations, roughly 100 times what they required when the column was included in the index. This provides a nice example of creating indexes that are appropriate to the queries being run against the data.

In summary, we have multiple approaches to writing a TOP n type query; a sorted in-line view with ROWNUM filtering, an in-line view with a ROW_NUMBER filter and, with 12c, the FETCH FIRST ROWS syntax. If a suitable access path index is in place then all approaches seem roughly equivalent in terms of cost, except for the case where the query runs against the entire table. In that situation only the ROWNUM approach made use of the index on LAST_MOD_DATE. As with many things related to the query optimiser, check that what you expect is actually happening and adjust accordingly.

ORA-00001 unique key violated with MERGE

I arrived at work one morning to find a system generated email telling me that a user session had thrown a unique constraint violation exception. Curious given the code had been running in production for 6 months…

The line number for the PL/SQL package that had thrown the exception showed that it came from a MERGE statement. The fact that it was a MERGE that had thrown an ORA-00001 immediately points to a concurrency issue. At first the developer of the code didn’t understand how a MERGE could throw ORA-00001 until I showed them the following example.

Starting with a simple two column table with 10 rows (running under Oracle 12.1.0.2):

CREATE TABLE merge_test
    (id     NUMBER(6) NOT NULL
    ,val    NUMBER(6) NOT NULL
    ,CONSTRAINT merge_test_pk PRIMARY KEY (id))
/

INSERT INTO merge_test
SELECT ROWNUM
,      ROWNUM
FROM   dual
CONNECT BY ROWNUM <= 10
/

COMMIT
/

EXEC dbms_stats.gather_table_stats ('','merge_test')

we run the following MERGE in two sessions, operating against an existing row in the table:

MERGE INTO merge_test dest
USING (SELECT 5    AS id
       ,      20   AS val
       FROM   dual) src
ON (dest.id = src.id)
WHEN MATCHED THEN UPDATE
SET dest.val = src.val
WHEN NOT MATCHED THEN INSERT
   (id, val)
VALUES
   (src.id, src.val)

The first session immediately reports:

1 row merged.

while the second session, attempting to obtain a lock on the same row, is blocked. Once the first session issues a commit or rollback, the second session also reports 1 row merged.

Now let’s repeat this exercise using an ID value for a row that does not exist in the table:

MERGE INTO merge_test dest
USING (SELECT 20   AS id
       ,      20   AS val
       FROM   dual) src
ON (dest.id = src.id)
WHEN MATCHED THEN UPDATE
SET dest.val = src.val
WHEN NOT MATCHED THEN INSERT
   (id, val)
VALUES
   (src.id, src.val)

Session 1 again reports 1 row merged while session 2, after session 1 commits, reports:

MERGE INTO merge_test dest
*
ERROR at line 1:
ORA-00001: unique constraint (DEVELOPER.MERGE_TEST_PK) violated

The reason for this is all related to Oracle’s multi-version read consistency model. At the point in time that session 2 executed the MERGE, session 1 had not committed its data, so session 2 cannot “see” the row that session 1 is about to insert. As a result the MERGE that session 2 executes also attempts to insert a new row. Oracle then blocks session 2 as it detects a uniqueness conflict with the row that session 1 is inserting. It does not immediately report the exception as it needs to wait until session 1 commits or rolls back its transaction. When session 1 commits, session 2 throws the ORA-00001 exception. If session 1 had instead rolled back, session 2 would have been permitted to insert the new row.

The solution to this problem is to simply remove the MERGE and replace it with an INSERT/UPDATE combination:

BEGIN
   INSERT INTO merge_test (id, val)
   VALUES (20, 20);
EXCEPTION
   WHEN DUP_VAL_ON_INDEX THEN
      UPDATE merge_test
      SET    val = 20
      WHERE  id = 20;
END;

With this code session 2 will block on its attempt to perform the INSERT. Once session 1 issues a commit, session 2’s INSERT throws DUP_VAL_ON_INDEX, which is caught and the operation is retried as an UPDATE.

Back to the production problem: a scan of the application log files and database entries did indeed show that there were two near-simultaneous requests that would have resulted in the MERGE being run for the same key, confirming the diagnosis. Another problem solved and, for the developer of the code, another lesson learned.

String constants and package state

I have previously discussed package state and avoiding the “ORA-04068: existing state of packages has been discarded” exception. Connor McDonald has pointed out that come Oracle 11.2 constants do not create package state, except when they involve a calculation. This post expands on that by noting that string constants always create package state (thanks to my work colleague Vlad for pointing this out, after finding out the hard way…).
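
For reference, the kind of constant Connor found still creates state is one whose value involves a calculation, e.g. (a hypothetical illustration of his observation, not taken from his post):

CREATE OR REPLACE PACKAGE pkg_calc_state
AS
   -- the value must be computed, so this constant DOES create package state
   c_seconds_per_day   CONSTANT NUMBER := 24 * 60 * 60;
END pkg_calc_state;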

First up, let’s take a look at how numeric constants don’t create package state. We start with the simplest of packages compiled in session 1:

CREATE OR REPLACE PACKAGE pkg_state
AS
   c_number   CONSTANT NUMBER(5) := 123;
END pkg_state;

and in another session, session 2, we run a short program that makes reference to the package constant:

SET SERVEROUTPUT ON
BEGIN
   dbms_output.put_line ('Package value is: ' || TO_CHAR(pkg_state.c_number));
END;

Not surprisingly we get the output:

Package value is: 123

If we go back to session 1 and recompile the package changing the constant to a different value:

CREATE OR REPLACE PACKAGE pkg_state
AS
   c_number   CONSTANT NUMBER(5) := 789;
END pkg_state;

and rerun our simple output script back in session 2 we don’t get ORA-04068 but instead our session displays the new value:

Package value is: 789

Now let’s repeat the entire exercise but with a string constant. In session 1:

CREATE OR REPLACE PACKAGE pkg_state
AS
   c_string   CONSTANT VARCHAR2(5) := 'ABC';
END pkg_state;

Running this in session 2:

SET SERVEROUTPUT ON
BEGIN
   dbms_output.put_line ('Package value is: ' || pkg_state.c_string);
END;

we get:

Package value is: ABC

Compiling the package in session 1 to:

CREATE OR REPLACE PACKAGE pkg_state
AS
   c_string   CONSTANT VARCHAR2(5) := 'XYZ';
END pkg_state;

gives us the following when we rerun our display routine in session 2:

BEGIN
*
ERROR at line 1:
ORA-04068: existing state of packages has been discarded
ORA-04061: existing state of package "DEVELOPER.PKG_STATE" has been invalidated
ORA-04065: not executed, altered or dropped package "DEVELOPER.PKG_STATE"
ORA-06508: PL/SQL: could not find program unit being called: "DEVELOPER.PKG_STATE"
ORA-06512: at line 2

This occurs in both Oracle 11.2 and 12.1. So, while these versions have taken steps to stop constants creating package state, Connor’s post and this one show that the optimisation doesn’t apply to all constants. Hence some care needs to be taken when planning your deployments. You wouldn’t want to be in a situation where you think you don’t have package state but in reality you do. Right Vlad?

ORA-01001: invalid cursor with REF CURSOR output parameter

A short post to note an Oracle database bug I encountered recently. Whilst testing some code the exception:

ORA-01001: invalid cursor

was produced. The error was originating from a pipelined function. Tracking things down it seems that Oracle will throw this exception when the following occur:

  • A PL/SQL routine passes out a ref cursor variable
  • The ref cursor passed out uses a pipelined function
  • The pipelined function itself has a ref cursor as an input parameter

Having replicated the problem on Linux, AIX and Windows I believe it is not platform specific. It occurs in database versions 11.2 and 12.1. Earlier versions have not been tested and so may be affected too.

The bug appears to be similar to Bug 2968664 : ORA-1001 IN FUNCTION CALLED FROM SQL WHICH FETCHES FROM A REFCURSOR PARAMETER but that bug is noted to be fixed in version 10.1.0.1.

I created a simple test package, imaginatively named pipeline_test, for Oracle Support to replicate the problem. The package contains 3 callable routines:

  • run_test1 – a routine with an OUT ref cursor parameter that matches the criteria noted earlier. Calling this routine throws ORA-01001 (a minimal sketch of its shape follows this list).
  • run_test2 – a routine that opens a ref cursor similar to run_test1 but consumes the cursor itself rather than passing it out to a caller. This routine does not generate ORA-01001.
  • run_test3 – a routine that calls run_test1 to obtain a ref cursor and then consumes it, similar to run_test2. This routine does generate ORA-01001.
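
To give an idea of the shape of the failing case, run_test1 looks something along these lines (an illustrative sketch with hypothetical names, not the actual package contents):

CREATE OR REPLACE TYPE t_num_tab AS TABLE OF NUMBER
/

CREATE OR REPLACE PACKAGE pipeline_sketch
AS
   FUNCTION pipe_nums (p_cur IN SYS_REFCURSOR)
      RETURN t_num_tab PIPELINED;
   PROCEDURE run_test1 (p_out OUT SYS_REFCURSOR);
END pipeline_sketch;
/

CREATE OR REPLACE PACKAGE BODY pipeline_sketch
AS
   -- a pipelined function that takes a ref cursor as an input parameter
   FUNCTION pipe_nums (p_cur IN SYS_REFCURSOR)
      RETURN t_num_tab PIPELINED
   IS
      l_num   NUMBER;
   BEGIN
      LOOP
         FETCH p_cur INTO l_num;
         EXIT WHEN p_cur%NOTFOUND;
         PIPE ROW (l_num);
      END LOOP;
      RETURN;
   END pipe_nums;

   -- passes out a ref cursor that uses the pipelined function;
   -- fetching from the cursor returned here throws ORA-01001
   PROCEDURE run_test1 (p_out OUT SYS_REFCURSOR)
   IS
   BEGIN
      OPEN p_out FOR
         SELECT *
         FROM   TABLE(pipe_nums(CURSOR(SELECT 1 FROM dual)));
   END run_test1;
END pipeline_sketch;
/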

The package and test script are available for download. The setup.sql file creates the package and collection object required by the pipelined function while the run.sql script is a basic SQL*Plus script to call the test routines.

For my situation I was able to work around the problem by substituting a collection for the ref cursor that the pipelined function used, as sketched below. I also had some luck with hinting the query that used the pipelined function with the MATERIALIZE hint.
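
In terms of the hypothetical names used in the sketch above, the workaround amounts to having the pipelined function accept a collection instead of a ref cursor, with the caller bulk collecting the driving rows first:

   -- the pipelined function now takes a collection rather than a ref cursor
   FUNCTION pipe_nums (p_nums IN t_num_tab)
      RETURN t_num_tab PIPELINED
   IS
   BEGIN
      FOR i IN 1 .. p_nums.COUNT
      LOOP
         PIPE ROW (p_nums(i));
      END LOOP;
      RETURN;
   END pipe_nums;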

I’ll update this post with the details that I get back from Oracle Support. At the moment they have acknowledged that there is a problem and they are investigating.


Update 01 May 2015: Oracle have created a new bug, Bug 20405099, to track this problem. However, access to the bug details on Oracle Support is restricted so you’ll have to take my word on it. :-)