New columns and row migration

As a database developer you can get away without knowing a lot about how the Oracle database works. The internals of data blocks, tablespaces and data files tends to sit firmly in the DBA realm leaving developers to simply consider things in terms of tables, columns and rows. That being said, certain development activities will have an impact on the underlying storage and its efficiency. This post looks at the impact of row migration that might result from the common activity of adding a column to a table and back-populating it.

For those not familiar with the concept of row migration, when Oracle goes to update a row in a data block if there is insufficient free space in the block to support the update then Oracle will migrate the row to another data block. Oracle is required to leave behind a pointer in the original block to the new block where the row now resides. This is necessary as all indexes refer to the block using the original location. Row migration can happen at any time and isn’t a particular problem unless it affects a significant number of rows.

Now for an example of adding a row to a table, back-populating it and seeing how many rows get migrated:

SQL>CREATE TABLE row_growth
  2     (id     NUMBER (10) NOT NULL
  3     ,val_1  NUMBER(3)   NOT NULL
  4     ,col_1  CHAR(50)    NOT NULL)
  5  /

Table created.

SQL>CREATE INDEX row_growth_pk
  2     ON row_growth (id)
  3  /

Index created.

SQL>CREATE INDEX row_growth_ix1
  2     ON row_growth (val_1)
  3  /

Index created.

SQL>INSERT INTO row_growth
  2  SELECT ROWNUM
  3  ,      MOD(ROWNUM,500)
  4  ,      'X'
  5  FROM   dual
  6  CONNECT BY ROWNUM <= 10000
  7  /

10000 rows created.

SQL>COMMIT
  2  /

Commit complete.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tables
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000         92          0

Here we’ve created a table of three columns and two indexes. After populating the tables with 10,000 rows and analysing it we can see that the table is using 92 blocks and 0 chained/migrated rows. As an aside, we need to use the ANALYZE TABLE command to populate the CHAIN_CNT column rather than the usual DBMS_STATS routine.

Now let’s query the table on the indexed VAL_1 column using AUTOTRACE:

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1180 |    21   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1180 |    21   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         24  consistent gets
          0  physical reads
          0  redo size
       2145  bytes sent via SQL*Net to client
        530  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

From the above output we can see that it required 24 logical IO operations to look up the 20 rows where VAL_1 has a value of 10. Now let’s add a new column to the table and back-populate it:

SQL>ALTER TABLE row_growth
  2     ADD more_data VARCHAR2(30)
  3  /

Table altered.

SQL>UPDATE row_growth
  2     SET more_data = RPAD('X',30,'X')
  3  /

10000 rows updated.

SQL>COMMIT
  2  /

Commit complete.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tables
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000        142       2849

After adding and populating a new 30 character column we analysed the table again. The table statistics show that our table is now using 142 blocks and 2,849 of the rows have been migrated.

So what’s the impact of this? Let’s repeat our query on the VAL_1 column and also monitor the session statistic “table fetch continued row”, which is Oracle’s internal count of how many migrated/chained rows that it has encountered:

SQL>SELECT m.value
  2  FROM   v$mystat m
  3  ,      v$statname s
  4  WHERE  s.statistic# = m.statistic#
  5  AND    s.name = 'table fetch continued row'
  6  /

     VALUE
----------
       513

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1720 |    27   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1720 |    27   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         32  consistent gets
          0  physical reads
          0  redo size
       2884  bytes sent via SQL*Net to client
        530  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

SQL>SELECT m.value
  2  FROM   v$mystat m
  3  ,      v$statname s
  4  WHERE  s.statistic# = m.statistic#
  5  AND    s.name = 'table fetch continued row'
  6  /

     VALUE
----------
       521

This is the same query against the same number of rows and following the same execution plan but it’s now done 32 logical IO operations instead of 24, an increase of 8 that corresponds to the increase seen in the “table fetch continued row” statistic (521-513). This is because Oracle is required to follow the migrated row pointers from where they originally resided to their new location.

The solution to the migrated rows problem is to rebuild the table with the ALTER TABLE MOVE command, which in turn invalidates the indexes requiring them to be rebuilt too:

SQL>ALTER TABLE row_growth
  2     MOVE
  3  /

Table altered.

SQL>ALTER INDEX row_growth_pk
  2     REBUILD
  3  /

Index altered.

SQL>ALTER INDEX row_growth_ix1
  2     REBUILD
  3  /

Index altered.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tab_statistics
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000        132          0

We’re now back down to 0 migrated rows. Running our query again:

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1720 |    21   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1720 |    21   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         24  consistent gets
          0  physical reads
          0  redo size
       2836  bytes sent via SQL*Net to client
        530  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

After the move and index rebuild our query is back down to the original 24 logical IO operations.

As noted earlier in this post, migrated rows can occur during normal DML activities and are not normally a problem as they impose a trivial overhead. Activities such as adding new columns may cause a significant proportion of a tables rows to be migrated though. This is something to look out for in system deployments and incorporate a table move and index rebuild to mitigate any adverse impact.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s