Oracle Concurrency 101

My first post takes a look at concurrency in Oracle data processing. Concurrency is one key aspect of data handling that is necessary to maintain data consistency but an area that is far too regularly overlooked. In my experience, even people who term themselves database developers make fundamental mistakes.

The following illustrates a very simple scenario of a single row change on a table upon which resides a primary key constraint. The requirement is to add a new entry if the key attribute doesn’t exist or update the existing row if it does. We start off defining a simple table that will be used for the examples:

-- create a simple table with an ID column
CREATE TABLE test_table
   (id      NUMBER(10)   NOT NULL
   ,padding VARCHAR2(50) NOT NULL
   )
/

-- create the primary key index and constraint
ALTER TABLE test_table
ADD CONSTRAINT test_table_pk
PRIMARY KEY (id)
/

The following script shows a simple implementation of the requirement. It initially runs a query against the table to see if a row already exists. If no row is found then it will perform an insert operation, otherwise it will perform an update operation. The script has been liberally instrumented to report on the steps it performs.

-- a simple anonymous block that performs an
-- insert if a row doesn't already exist, otherwise
-- performs an update
SET ECHO ON
SET SERVEROUTPUT ON
DECLARE
   l_id         test_table.id%TYPE := 1;
   l_padding    test_table.padding%TYPE := RPAD('A',50,'A');
   l_row_count  NUMBER(10);
BEGIN

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Start of test');

   SELECT COUNT(*)
   INTO   l_row_count
   FROM   test_table
   WHERE  id = l_id;

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After SELECT');

   IF (l_row_count = 0) THEN
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Before INSERT');
      INSERT INTO test_table
         (id, padding)
      VALUES
         (l_id, l_padding);
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After INSERT');
   ELSE
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Before UPDATE');
      UPDATE test_table
      SET    padding = l_padding
      WHERE  id = l_id;
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After UPDATE');
   END IF;

   COMMIT;

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - End of test');

END;
/

Running the script for the first time shows that it has inserted the new row as desired:

04:49:21.361 - Start of test
04:49:21.361 - After SELECT
04:49:21.361 - Before INSERT
04:49:21.361 - After INSERT
04:49:21.361 - End of test

PL/SQL procedure successfully completed.

Running the script for a second time shows that it now performs an update as required:

04:49:27.352 - Start of test
04:49:27.352 - After SELECT
04:49:27.352 - Before UPDATE
04:49:27.352 - After UPDATE
04:49:27.352 - End of test

PL/SQL procedure successfully completed.

On the surface it would appear that the code fulfils our requirement. However, things begin to fall apart when concurrency considerations are taken into account. What happens if two people run this script at the same time?

I find it strange that even some people who consider themselves to be database developers fail to see the concurrency flaw in the above strategy. Some will acknowledge that while the script executed by two sessions at precisely the same point in time will cause problems the probability of this occurring is remote. They will point out that the length of time between the test for existence of the row (i.e. the SELECT) and the operation (i.e. the INSERT or UPDATE) is so small that in practical terms this won’t happen (the script output above shows this to be less than 1 millisecond). This reasoning is flawed though.

Let’s see what happens if we simulate subsequent processing after the insert or update by introducing a delay after the row operation but before the commit point. The modified script becomes:

-- a simple anonymous block that performs an
-- insert if a row doesn't already exist, otherwise
-- performs an update. This test includes a delay
-- before the commit of the changes
SET ECHO ON
SET SERVEROUTPUT ON
DECLARE
   l_id         test_table.id%TYPE := 2;
   l_padding    test_table.padding%TYPE := RPAD('A',50,'A');
   l_row_count  NUMBER(10);
BEGIN

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Start of test');

   SELECT COUNT(*)
   INTO   l_row_count
   FROM   test_table
   WHERE  id = l_id;

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After SELECT');

   IF (l_row_count = 0) THEN
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Before INSERT');
      INSERT INTO test_table
         (id, padding)
      VALUES
         (l_id, l_padding);
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After INSERT');
   ELSE
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Before UPDATE');
      UPDATE test_table
      SET    padding = l_padding
      WHERE  id = l_id;
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After UPDATE');
   END IF;

   dbms_lock.sleep (30);

   COMMIT;

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - End of test');

END;
/

We run the script in one session and then again in another session about 10 seconds later. The first session completes  with:

04:58:05.335 - Start of test
04:58:05.335 - After SELECT
04:58:05.335 - Before INSERT
04:58:05.335 - After INSERT
04:58:35.365 - End of test

PL/SQL procedure successfully completed.

whilst the second session produces the following:

04:58:16.738 - Start of test
04:58:16.738 - After SELECT
04:58:16.738 - Before INSERT
DECLARE
*
ERROR at line 1:
ORA-00001: unique constraint (DEVELOPER.TEST_TABLE_PK) violated
ORA-06512: at line 18

Looking at the timestamps involved we can see that the second session performed its test for existence (the SELECT) after the first session had performed its operation (the INSERT) and yet it still generated an error. The reason for this is that the insert performed by the first session is not visible to the second session unit the session has issued a commit. This means the opportunity for error is not limited to the duration between the test for existence and the subsequent operation but from the test for existence to the commit point.

In a production system this type of problem will probably manifest itself rarely. When it does occur it will often be written off as a “one-off system glitch” due to the “same process having been run for months without error”. To end users, the system will be labelled “buggy” and “unreliable”.

So, what can be done to avoid this scenario? One possibility is to look at eliminating the SELECT/INSERT/UPDATE structure with a simple MERGE statement. As many developers would know that MERGE performs either an insert or update according to whether a row exists so it appears to match our requirements exactly. Let’s test this:

-- a simple anonymous block that performs an
-- merge to insert or update on a row according
-- to whether it exists or not
SET ECHO ON
SET SERVEROUTPUT ON
DECLARE
   l_id         test_table.id%TYPE := 3;
   l_padding    test_table.padding%TYPE := RPAD('A',50,'A');
   l_row_count  NUMBER(10);
BEGIN

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Start of test');

   MERGE INTO test_table dest
   USING (SELECT l_id AS id
          ,      l_padding AS padding
          FROM   dual) src
   ON (src.id = dest.id)
   WHEN MATCHED THEN
      UPDATE
      SET dest.padding = src.padding
   WHEN NOT MATCHED THEN
      INSERT
         (id
         ,padding)
      VALUES
         (src.id
         ,src.padding);

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After MERGE');

   dbms_lock.sleep (30);

   COMMIT;

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - End of test');

END;
/

The output from the two sessions are:

05:02:15.013 - Start of test
05:02:15.013 - After MERGE
05:02:45.028 - End of test

PL/SQL procedure successfully completed.

and

05:02:24.015 - Start of test
DECLARE
*
ERROR at line 1:
ORA-00001: unique constraint (DEVELOPER.TEST_TABLE_PK) violated
ORA-06512: at line 9

Hmmm… that didn’t work either. No doubt the MERGE in the second session still doesn’t “see” the row inserted by the first session so it attempts to perform an insert too, rather than an update. So, how about we try a strategy that doesn’t require one session to “see’ what the other session is doing:

-- a simple anonymous block that performs an
-- update on a row and if the row doesn't already exist
-- it will perform an insert
SET ECHO ON
SET SERVEROUTPUT ON
DECLARE
   l_id         test_table.id%TYPE := 4;
   l_padding    test_table.padding%TYPE := RPAD('A',50,'A');
   l_row_count  NUMBER(10);
BEGIN

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Start of test');

   BEGIN
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Before INSERT');
      INSERT INTO test_table
         (id, padding)
      VALUES
         (l_id, l_padding);
      dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After INSERT');
   EXCEPTION
      WHEN DUP_VAL_ON_INDEX THEN
         dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - Before UPDATE');
         UPDATE test_table
         SET    padding = l_padding
         WHERE  id = l_id;
         dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - After UPDATE');
   END;

   dbms_lock.sleep (30);

   COMMIT;

   dbms_output.put_line (TO_CHAR(SYSTIMESTAMP,'hh:mi:ss.ff3') || ' - End of test');

END;
/

With this script we simply attempt to create a new entry and if that fails with an error that tells us the row already exists we perform an update instead. Running this script in two sessions we get the following output:

05:04:20.484 - Start of test
05:04:20.484 - Before INSERT
05:04:20.484 - After INSERT
05:04:50.499 - End of test

PL/SQL procedure successfully completed.

and:

05:04:28.487 - Start of test
05:04:28.487 - Before INSERT
05:04:50.499 - Before UPDATE
05:04:50.499 - After UPDATE
05:05:20.513 - End of test

PL/SQL procedure successfully completed.

Finally, success!

A close look at the timestamps on the messages tells us what has happened. The first session performed the insert operation as we would expect. The timing on the second session messages shows that it also attempted to perform an insert but had to wait until the first session committed its changes before failing that operation. It then switched to an update operation. Concurrency is properly handled.

Of course, in a real system there are other data processing aspects to consider. The second session overwrote the details of the first session without so much as batting an eyelid. Depending on the situation we may have a lost update problem on our hands. We’ll defer discussion of that type of problem to a later post though.

In summary, processes that may appear to be be robust when examined from the perspective of a single session may not properly handle concurrency correctly. Developing data processes that correctly handle concurrency issues is not necessary any more difficult, it just requires different considerations and an understanding of the types of problems that exist and the strategies available to avoid them.