Identity Columns

We’ve previously looked at the use of sequence NEXTVAL as a column default in Oracle 12c and the ability to use column defaults when an inserted value is NULL. Oracle has another change in 12c to help out in the area of column default; identity columns.

Developers coming to Oracle from other database platforms are sometimes perplexed about the need to create sequence objects to populate simple integer primary key columns. The complaint typically is one of “Why doesn’t Oracle just do this for me?” and identity columns are directly aimed at this complaint.

In it’s simplest form, an identity column is declared as:

CREATE TABLE test_table
   (id      NUMBER(12) GENERATED AS IDENTITY
   ,padding CHAR(30) NOT NULL)

Oracle will automatically add a NOT NULL constraint to the identity column. No unique index or primary key constraint is added though so you will need to define those yourself, assuming the column will be used as a table key.

Behind the scenes, Oracle creates a sequence and will draw from that sequence whenever an insert into the table is done, via a default on the column, as illustrated below:

SQL>SELECT sequence_name
  2  ,      min_value
  3  ,      max_value
  4  ,      cache_size
  5  FROM   user_sequences
  6  /

no rows selected

SQL>CREATE TABLE test_table
  2     (id      NUMBER(12) GENERATED AS IDENTITY
  3     ,padding CHAR(30) NOT NULL)
  4  /

Table created.

SQL>INSERT INTO test_table (padding)
  2  VALUES ('x')
  3  /

1 row created.

SQL>SELECT *
  2  FROM   test_table
  3  /

        ID PADDING
---------- ----------
         1 x

SQL>SELECT sequence_name
  2  ,      min_value
  3  ,      max_value
  4  ,      cache_size
  5  FROM   user_sequences
  6  /

SEQUENCE_NAME         MIN_VALUE  MAX_VALUE CACHE_SIZE
-------------------- ---------- ---------- ----------
ISEQ$$_91711                  1 1.0000E+28         20

SQL>SELECT column_name
  2  ,      data_default
  3  FROM   user_tab_columns
  4  WHERE  table_name = 'TEST_TABLE'
  5  AND    column_name = 'ID'
  6  /

COLUMN_NAME     DATA_DEFAULT
--------------- -----------------------------------
ID              "DEVELOPER"."ISEQ$$_91711".nextval

Like any normal sequence, the automatically created sequence can be drawn from explicitly, although this is probably not recommended:

SQL>SELECT ISEQ$$_91711.NEXTVAL FROM dual
  2  /

   NEXTVAL
----------
         2

Sequence properties can be explicitly specified when creating identity columns:

SQL>CREATE TABLE test_table
  2     (id      NUMBER(12) GENERATED AS IDENTITY
  3                         (CACHE 1000
  4                          MINVALUE 1000
  5                          MAXVALUE 999999999999)
  6     ,padding CHAR(30) NOT NULL)
  7  /

Table created.

SQL>SELECT sequence_name
  2  ,      min_value
  3  ,      max_value
  4  ,      cache_size
  5  FROM   user_sequences
  6  /

SEQUENCE_NAME         MIN_VALUE  MAX_VALUE CACHE_SIZE
-------------------- ---------- ---------- ----------
ISEQ$$_91725               1000 1.0000E+12       1000

Unlike a traditional column with a default, the identify column cannot be directly inserted into:

SQL>INSERT INTO test_table (id, padding)
  2  VALUES (999, 'x')
  3  /
INSERT INTO test_table (id, padding)
                        *
ERROR at line 1:
ORA-32795: cannot insert into a generated always identity column

However, the column can be altered to replace the “ALWAYS GENERATED” clause with the “BY DEFAULT”. In this case the sequence will behave just like any other sequence allocated column default:

SQL>ALTER TABLE test_table
  2     MODIFY id GENERATED BY DEFAULT AS IDENTITY
  3  /

Table altered.

SQL>INSERT INTO test_table (id, padding)
  2  VALUES (999, 'x')
  3  /

1 row created.

Finally, if the table is dropped then the sequence used for the identity column is also dropped:

SQL>DROP TABLE test_table PURGE
  2  /

Table dropped.

SQL>SELECT sequence_name
  2  ,      min_value
  3  ,      max_value
  4  ,      cache_size
  5  FROM   user_sequences
  6  /

no rows selected

Column defaults and NULL

More than once I have encountered tables with columns defined as:

   ...
   column_x VARCHAR2(10) DEFAULT ' '
   ...

Yep, that’s a nullable column with a default of a single space! Any time I see a nullable column with a default value I immediately think it’s a logical bug. After all, why define a default for a column and still permit NULLs? I realise that there are scenarios where this combination is legitimate but generally speaking these are rare. The use of a space as the default is to denote “no data” as client UIs will typically trim this down to en empty string.

A nullable column with a default of a space character will end up with a NULL if an insert specifies the column but sets the inserted value to NULL. This could be either explicitly assigning NULL or via an expression that evaluates to NULL.

The problem of having a column declared this way is that when writing queries you need to handle both scenarios. As a result, queries end up either wrapping column references with NVL(column_x, ‘ ‘) or trimming the column convert the single space entries to NULL. Either way, the queries end up looking confusing and can easily hamper the query optimiser.

It may appear that the obvious solution to the “does the column have a value or not” problem is to perform an update and translate all NULLs to spaces and then mark the column as NOT NULL. However, the problem is then how to find and modify the SQL statements that were inserting the NULLs in the first place… and here’s where the new ON NULL clause for column defaults in Oracle 12c helps us out.

In Oracle 12c, when a column default is defined with the ON NULL clause the default is also used if the inserted value evaluates to NULL and not only when no value for the column is specified. For our problematic column, we would define it as:

   ...
   column_x VARCHAR2(10) DEFAULT ' ' ON NULL
   ...

We can easily demonstrate the impact of this clause. First we use a default without the ON NULL clause to illustrate how the column can end up with a NULL value by creating a table and explicitly inserting two rows; one that uses the default and one that sets the column to NULL:

SQL>SET NULL <<null>>

SQL>CREATE TABLE test_table
  2     (id         NUMBER(10)
  3     ,column_x   VARCHAR2(10) DEFAULT ' ')
  4  /

Table created.

SQL>INSERT INTO test_table (id)
  2  VALUES (1)
  3  /

1 row created.

SQL>INSERT INTO test_table (id, column_x)
  2  VALUES (2, NULL)
  3  /

1 row created.

SQL>SELECT id
  2  ,      column_x
  3  ,      NVL(LENGTH(column_x),0) AS len_column_x
  4  FROM   test_table
  5  /

        ID COLUMN_X   LEN_COLUMN_X
---------- ---------- ------------
         1                       1
         2 <<null>>              0

We can see from the above how NULLs manage to get into our column even with a default in place. Now we repeat the test with the ON NULL clause:

SQL>ALTER TABLE test_table
  2     MODIFY column_x DEFAULT ON NULL ' '
  3  /

Table altered.

SQL>INSERT INTO test_table (id)
  2  VALUES (1)
  3  /

1 row created.

SQL>INSERT INTO test_table (id, column_x)
  2  VALUES (2, NULL)
  3  /

1 row created.

SQL>SELECT id
  2  ,      column_x
  3  ,      NVL(LENGTH(column_x),0) AS len_column_x
  4  FROM   test_table
  5  /

        ID COLUMN_X   LEN_COLUMN_X
---------- ---------- ------------
         1                       1
         2                       1

The above shows that the default value, a single space, is now being used when the column is not specified in the INSERT or if it is specified but the value inserted is NULL.

So, the solution to our problem of nullable columns with a default is to update the table and set all the NULLs to the default value and then change the default to make use of the ON NULL clause. No search of the codebase for SQL that generated the NULLs in the first place is required. Of course, you might have to upgrade to Oracle 12c first…

It should also be noted that Oracle has taken the liberty of defining a column with an ON NULL default to be NOT NULL:

SQL>DESCRIBE test_table
Name                  Null?    Type
--------------------- -------- ------------
ID                             NUMBER(10)
COLUMN_X              NOT NULL VARCHAR2(10)

which means that attempting to enable the ON NULL clause when there exists NULLs in the column will result in an error, ORA-02296.