Increasing Row Length and PCTFREE

In the last post we looked at how adding a new column can cause row migration and its impact this can have on our queries. In that post I noted that row migration may occur through normal DML so this post looks at that scenario and what we can do to manage it.

A relatively common scenario with table data is that a row is initially created with only a subset of the columns populated. Over time the rest of the columns are then populated with values as the data becomes available. Tables that follow this pattern are prone to suffer from migrated rows much like tables that get new columns added to them. The solution to this problem is different though as we know in advance the columns that exist in the table and can plan accordingly.

We can illustrate the problem using the same table from the previous post. Instead of adding a new column we will start off with all columns defined in the table. Our process first populates all columns except the MORE_DATA column and then goes back and adds data to that column:

SQL>CREATE TABLE row_growth
  2     (id        NUMBER (10) NOT NULL
  3     ,val_1     NUMBER(3)   NOT NULL
  4     ,col_1     CHAR(50)    NOT NULL
  5     ,more_data VARCHAR2(30))
  6  /

Table created.

SQL>CREATE INDEX row_growth_pk
  2     ON row_growth (id)
  3  /

Index created.

SQL>CREATE INDEX row_growth_ix1
  2     ON row_growth (val_1)
  3  /

Index created.

SQL>INSERT INTO row_growth
  2     (id, val_1, col_1)
  3  SELECT ROWNUM
  4  ,      MOD(ROWNUM,500)
  5  ,      'X'
  6  FROM   dual
  7  CONNECT BY ROWNUM <= 10000
  8  /

10000 rows created.

SQL>COMMIT
  2  /

Commit complete.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tables
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000         92          0

Once again we have a table with 92 block and no migrated rows. It’s as if the MORE_DATA column does not exist in the table, which is close to the truth as the column has no data in it. Querying the table we get:

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1240 |    21   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1240 |    21   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         24  consistent gets
          0  physical reads
          0  redo size
       2216  bytes sent via SQL*Net to client
        531  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

Similar to last time our query is consuming 24 logical IOs to return the 20 rows. Let’s now populate the MORE_DATA column:

SQL>UPDATE row_growth
  2     SET more_data = RPAD('X',30,'X')
  3  /

10000 rows updated.

SQL>COMMIT
  2  /

Commit complete.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tables
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000        142       2849

SQL>SELECT m.value
  2  FROM   v$mystat m
  3  ,      v$statname s
  4  WHERE  s.statistic# = m.statistic#
  5  AND    s.name = 'table fetch continued row'
  6  /

     VALUE
----------
       105

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1720 |    27   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1720 |    27   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         32  consistent gets
          0  physical reads
          0  redo size
       2884  bytes sent via SQL*Net to client
        531  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

SQL>SELECT m.value
  2  FROM   v$mystat m
  3  ,      v$statname s
  4  WHERE  s.statistic# = m.statistic#
  5  AND    s.name = 'table fetch continued row'
  6  /

     VALUE
----------
       113

Exactly as the scenario where we added a new column, we now have 2,849 migrated rows across 142 blocks. The impact to our query if the same; 32 logical IOs, which is an increase of 8 due to the impact of the migrated rows as confirmed by the change in the “table fetch continued row” statistic.

Previously we needed to do an ALTER TABLE MOVE command to resolve this problem but this situation is different. Since we have all our columns determined in advance and we know that we will be populating the MORE_DATA column after the other rows we can tell Oracle to reserve space for it in the data blocks. We do this by setting the PCTFREE value when creating the table.

PCTFREE specifies the amount of space in a block Oracle will leave free when inserting rows into it (not updating existing rows, only inserting new rows). By default PCTFREE is set to 10, meaning that Oracle will stop inserting row into a block if the new row would drop the free space in the black to 10% or less. A quick calculation based on the column lengths and our understanding of the data shows that our initial insert is about 60% of the overall row length so we would need to set PCTFREE to 40 to reserve space for the MORE_DATA column.

Let’s repeat our example with PCTFREE set to 40:

SQL>CREATE TABLE row_growth
  2     (id        NUMBER (10) NOT NULL
  3     ,val_1     NUMBER(3)   NOT NULL
  4     ,col_1     CHAR(50)    NOT NULL
  5     ,more_data VARCHAR2(30))
  6  PCTFREE 40
  7  /

Table created.

SQL>CREATE INDEX row_growth_pk
  2     ON row_growth (id)
  3  /

Index created.

SQL>CREATE INDEX row_growth_ix1
  2     ON row_growth (val_1)
  3  /

Index created.

SQL>INSERT INTO row_growth
  2     (id, val_1, col_1)
  3  SELECT ROWNUM
  4  ,      MOD(ROWNUM,500)
  5  ,      'X'
  6  FROM   dual
  7  CONNECT BY ROWNUM <= 10000
  8  /

10000 rows created.

SQL>COMMIT
  2  /

Commit complete.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tables
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000        137          0

Our table is now consuming 137 blocks after the initial insert of data instead of 92 previously. This is due to each block only being populated up to 60% of available space. So does this impact our query?

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1240 |    21   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1240 |    21   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         24  consistent gets
          0  physical reads
          0  redo size
       2216  bytes sent via SQL*Net to client
        531  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

We can see that our query is still using 24 logical IOs. So, while the rows of the table a spread across more block we still only need to access the same number for this query. The reason our query did require any extra work was because it is using an index so it is directly accessing the blocks required.

Now let’s see what happens when we update the MORE_DATA column:

SQL>UPDATE row_growth
  2     SET more_data = RPAD('X',30,'X')
  3  /

10000 rows updated.

SQL>COMMIT
  2  /

Commit complete.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tables
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000        137          0

We can see from the CHAIN_CNT entry that we don’t have any migrated rows. Our estimate of setting PCTFREE to 40 was sufficient to ensure that there was adequate free space to handle the update of the MORE_DATA column. Since none of our rows have been migrated our query will not have been impacted and it will perform the same 24 logical IO operations.

In summary, when designing database tables consideration of the manner in which the rows will be populated and setting the PCTFREE clause accordingly is fairly important. Resolving a problem with migrated rows is an annoyance that can easily be avoided with a small amount of forethought and a quick calculation.

New columns and row migration

As a database developer you can get away without knowing a lot about how the Oracle database works. The internals of data blocks, tablespaces and data files tends to sit firmly in the DBA realm leaving developers to simply consider things in terms of tables, columns and rows. That being said, certain development activities will have an impact on the underlying storage and its efficiency. This post looks at the impact of row migration that might result from the common activity of adding a column to a table and back-populating it.

For those not familiar with the concept of row migration, when Oracle goes to update a row in a data block if there is insufficient free space in the block to support the update then Oracle will migrate the row to another data block. Oracle is required to leave behind a pointer in the original block to the new block where the row now resides. This is necessary as all indexes refer to the block using the original location. Row migration can happen at any time and isn’t a particular problem unless it affects a significant number of rows.

Now for an example of adding a row to a table, back-populating it and seeing how many rows get migrated:

SQL>CREATE TABLE row_growth
  2     (id     NUMBER (10) NOT NULL
  3     ,val_1  NUMBER(3)   NOT NULL
  4     ,col_1  CHAR(50)    NOT NULL)
  5  /

Table created.

SQL>CREATE INDEX row_growth_pk
  2     ON row_growth (id)
  3  /

Index created.

SQL>CREATE INDEX row_growth_ix1
  2     ON row_growth (val_1)
  3  /

Index created.

SQL>INSERT INTO row_growth
  2  SELECT ROWNUM
  3  ,      MOD(ROWNUM,500)
  4  ,      'X'
  5  FROM   dual
  6  CONNECT BY ROWNUM <= 10000
  7  /

10000 rows created.

SQL>COMMIT
  2  /

Commit complete.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tables
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000         92          0

Here we’ve created a table of three columns and two indexes. After populating the tables with 10,000 rows and analysing it we can see that the table is using 92 blocks and 0 chained/migrated rows. As an aside, we need to use the ANALYZE TABLE command to populate the CHAIN_CNT column rather than the usual DBMS_STATS routine.

Now let’s query the table on the indexed VAL_1 column using AUTOTRACE:

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1180 |    21   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1180 |    21   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         24  consistent gets
          0  physical reads
          0  redo size
       2145  bytes sent via SQL*Net to client
        530  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

From the above output we can see that it required 24 logical IO operations to look up the 20 rows where VAL_1 has a value of 10. Now let’s add a new column to the table and back-populate it:

SQL>ALTER TABLE row_growth
  2     ADD more_data VARCHAR2(30)
  3  /

Table altered.

SQL>UPDATE row_growth
  2     SET more_data = RPAD('X',30,'X')
  3  /

10000 rows updated.

SQL>COMMIT
  2  /

Commit complete.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tables
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000        142       2849

After adding and populating a new 30 character column we analysed the table again. The table statistics show that our table is now using 142 blocks and 2,849 of the rows have been migrated.

So what’s the impact of this? Let’s repeat our query on the VAL_1 column and also monitor the session statistic “table fetch continued row”, which is Oracle’s internal count of how many migrated/chained rows that it has encountered:

SQL>SELECT m.value
  2  FROM   v$mystat m
  3  ,      v$statname s
  4  WHERE  s.statistic# = m.statistic#
  5  AND    s.name = 'table fetch continued row'
  6  /

     VALUE
----------
       513

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1720 |    27   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1720 |    27   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         32  consistent gets
          0  physical reads
          0  redo size
       2884  bytes sent via SQL*Net to client
        530  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

SQL>SELECT m.value
  2  FROM   v$mystat m
  3  ,      v$statname s
  4  WHERE  s.statistic# = m.statistic#
  5  AND    s.name = 'table fetch continued row'
  6  /

     VALUE
----------
       521

This is the same query against the same number of rows and following the same execution plan but it’s now done 32 logical IO operations instead of 24, an increase of 8 that corresponds to the increase seen in the “table fetch continued row” statistic (521-513). This is because Oracle is required to follow the migrated row pointers from where they originally resided to their new location.

The solution to the migrated rows problem is to rebuild the table with the ALTER TABLE MOVE command, which in turn invalidates the indexes requiring them to be rebuilt too:

SQL>ALTER TABLE row_growth
  2     MOVE
  3  /

Table altered.

SQL>ALTER INDEX row_growth_pk
  2     REBUILD
  3  /

Index altered.

SQL>ALTER INDEX row_growth_ix1
  2     REBUILD
  3  /

Index altered.

SQL>EXEC dbms_stats.gather_table_stats ('','row_growth')

PL/SQL procedure successfully completed.

SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS
  2  /

Table analyzed.

SQL>SELECT num_rows
  2  ,      blocks
  3  ,      chain_cnt
  4  FROM   user_tab_statistics
  5  WHERE  table_name = 'ROW_GROWTH'
  6  /

  NUM_ROWS     BLOCKS  CHAIN_CNT
---------- ---------- ----------
     10000        132          0

We’re now back down to 0 migrated rows. Running our query again:

SQL>SET AUTOTRACE TRACEONLY

SQL>SELECT *
  2  FROM   row_growth
  3  WHERE  val_1 = 10
  4  /

20 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1672052865

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |    20 |  1720 |    21   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ROW_GROWTH     |    20 |  1720 |    21   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ROW_GROWTH_IX1 |    20 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("VAL_1"=10)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         24  consistent gets
          0  physical reads
          0  redo size
       2836  bytes sent via SQL*Net to client
        530  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

SQL>SET AUTOTRACE OFF

After the move and index rebuild our query is back down to the original 24 logical IO operations.

As noted earlier in this post, migrated rows can occur during normal DML activities and are not normally a problem as they impose a trivial overhead. Activities such as adding new columns may cause a significant proportion of a tables rows to be migrated though. This is something to look out for in system deployments and incorporate a table move and index rebuild to mitigate any adverse impact.