In the last post we looked at how adding a new column can cause row migration and its impact this can have on our queries. In that post I noted that row migration may occur through normal DML so this post looks at that scenario and what we can do to manage it.
A relatively common scenario with table data is that a row is initially created with only a subset of the columns populated. Over time the rest of the columns are then populated with values as the data becomes available. Tables that follow this pattern are prone to suffer from migrated rows much like tables that get new columns added to them. The solution to this problem is different though as we know in advance the columns that exist in the table and can plan accordingly.
We can illustrate the problem using the same table from the previous post. Instead of adding a new column we will start off with all columns defined in the table. Our process first populates all columns except the MORE_DATA column and then goes back and adds data to that column:
SQL>CREATE TABLE row_growth 2 (id NUMBER (10) NOT NULL 3 ,val_1 NUMBER(3) NOT NULL 4 ,col_1 CHAR(50) NOT NULL 5 ,more_data VARCHAR2(30)) 6 / Table created. SQL>CREATE INDEX row_growth_pk 2 ON row_growth (id) 3 / Index created. SQL>CREATE INDEX row_growth_ix1 2 ON row_growth (val_1) 3 / Index created. SQL>INSERT INTO row_growth 2 (id, val_1, col_1) 3 SELECT ROWNUM 4 , MOD(ROWNUM,500) 5 , 'X' 6 FROM dual 7 CONNECT BY ROWNUM <= 10000 8 / 10000 rows created. SQL>COMMIT 2 / Commit complete. SQL>EXEC dbms_stats.gather_table_stats ('','row_growth') PL/SQL procedure successfully completed. SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS 2 / Table analyzed. SQL>SELECT num_rows 2 , blocks 3 , chain_cnt 4 FROM user_tables 5 WHERE table_name = 'ROW_GROWTH' 6 / NUM_ROWS BLOCKS CHAIN_CNT ---------- ---------- ---------- 10000 92 0
Once again we have a table with 92 block and no migrated rows. It’s as if the MORE_DATA column does not exist in the table, which is close to the truth as the column has no data in it. Querying the table we get:
SQL>SET AUTOTRACE TRACEONLY SQL>SELECT * 2 FROM row_growth 3 WHERE val_1 = 10 4 / 20 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1672052865 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 1240 | 21 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| ROW_GROWTH | 20 | 1240 | 21 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | ROW_GROWTH_IX1 | 20 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("VAL_1"=10) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 24 consistent gets 0 physical reads 0 redo size 2216 bytes sent via SQL*Net to client 531 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 20 rows processed SQL>SET AUTOTRACE OFF
Similar to last time our query is consuming 24 logical IOs to return the 20 rows. Let’s now populate the MORE_DATA column:
SQL>UPDATE row_growth 2 SET more_data = RPAD('X',30,'X') 3 / 10000 rows updated. SQL>COMMIT 2 / Commit complete. SQL>EXEC dbms_stats.gather_table_stats ('','row_growth') PL/SQL procedure successfully completed. SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS 2 / Table analyzed. SQL>SELECT num_rows 2 , blocks 3 , chain_cnt 4 FROM user_tables 5 WHERE table_name = 'ROW_GROWTH' 6 / NUM_ROWS BLOCKS CHAIN_CNT ---------- ---------- ---------- 10000 142 2849 SQL>SELECT m.value 2 FROM v$mystat m 3 , v$statname s 4 WHERE s.statistic# = m.statistic# 5 AND s.name = 'table fetch continued row' 6 / VALUE ---------- 105 SQL>SET AUTOTRACE TRACEONLY SQL>SELECT * 2 FROM row_growth 3 WHERE val_1 = 10 4 / 20 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1672052865 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 1720 | 27 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| ROW_GROWTH | 20 | 1720 | 27 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | ROW_GROWTH_IX1 | 20 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("VAL_1"=10) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 32 consistent gets 0 physical reads 0 redo size 2884 bytes sent via SQL*Net to client 531 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 20 rows processed SQL>SET AUTOTRACE OFF SQL>SELECT m.value 2 FROM v$mystat m 3 , v$statname s 4 WHERE s.statistic# = m.statistic# 5 AND s.name = 'table fetch continued row' 6 / VALUE ---------- 113
Exactly as the scenario where we added a new column, we now have 2,849 migrated rows across 142 blocks. The impact to our query if the same; 32 logical IOs, which is an increase of 8 due to the impact of the migrated rows as confirmed by the change in the “table fetch continued row” statistic.
Previously we needed to do an ALTER TABLE MOVE command to resolve this problem but this situation is different. Since we have all our columns determined in advance and we know that we will be populating the MORE_DATA column after the other rows we can tell Oracle to reserve space for it in the data blocks. We do this by setting the PCTFREE value when creating the table.
PCTFREE specifies the amount of space in a block Oracle will leave free when inserting rows into it (not updating existing rows, only inserting new rows). By default PCTFREE is set to 10, meaning that Oracle will stop inserting row into a block if the new row would drop the free space in the black to 10% or less. A quick calculation based on the column lengths and our understanding of the data shows that our initial insert is about 60% of the overall row length so we would need to set PCTFREE to 40 to reserve space for the MORE_DATA column.
Let’s repeat our example with PCTFREE set to 40:
SQL>CREATE TABLE row_growth 2 (id NUMBER (10) NOT NULL 3 ,val_1 NUMBER(3) NOT NULL 4 ,col_1 CHAR(50) NOT NULL 5 ,more_data VARCHAR2(30)) 6 PCTFREE 40 7 / Table created. SQL>CREATE INDEX row_growth_pk 2 ON row_growth (id) 3 / Index created. SQL>CREATE INDEX row_growth_ix1 2 ON row_growth (val_1) 3 / Index created. SQL>INSERT INTO row_growth 2 (id, val_1, col_1) 3 SELECT ROWNUM 4 , MOD(ROWNUM,500) 5 , 'X' 6 FROM dual 7 CONNECT BY ROWNUM <= 10000 8 / 10000 rows created. SQL>COMMIT 2 / Commit complete. SQL>EXEC dbms_stats.gather_table_stats ('','row_growth') PL/SQL procedure successfully completed. SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS 2 / Table analyzed. SQL>SELECT num_rows 2 , blocks 3 , chain_cnt 4 FROM user_tables 5 WHERE table_name = 'ROW_GROWTH' 6 / NUM_ROWS BLOCKS CHAIN_CNT ---------- ---------- ---------- 10000 137 0
Our table is now consuming 137 blocks after the initial insert of data instead of 92 previously. This is due to each block only being populated up to 60% of available space. So does this impact our query?
SQL>SET AUTOTRACE TRACEONLY SQL>SELECT * 2 FROM row_growth 3 WHERE val_1 = 10 4 / 20 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1672052865 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 1240 | 21 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| ROW_GROWTH | 20 | 1240 | 21 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | ROW_GROWTH_IX1 | 20 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("VAL_1"=10) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 24 consistent gets 0 physical reads 0 redo size 2216 bytes sent via SQL*Net to client 531 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 20 rows processed SQL>SET AUTOTRACE OFF
We can see that our query is still using 24 logical IOs. So, while the rows of the table a spread across more block we still only need to access the same number for this query. The reason our query did require any extra work was because it is using an index so it is directly accessing the blocks required.
Now let’s see what happens when we update the MORE_DATA column:
SQL>UPDATE row_growth 2 SET more_data = RPAD('X',30,'X') 3 / 10000 rows updated. SQL>COMMIT 2 / Commit complete. SQL>EXEC dbms_stats.gather_table_stats ('','row_growth') PL/SQL procedure successfully completed. SQL>ANALYZE TABLE row_growth COMPUTE STATISTICS 2 / Table analyzed. SQL>SELECT num_rows 2 , blocks 3 , chain_cnt 4 FROM user_tables 5 WHERE table_name = 'ROW_GROWTH' 6 / NUM_ROWS BLOCKS CHAIN_CNT ---------- ---------- ---------- 10000 137 0
We can see from the CHAIN_CNT entry that we don’t have any migrated rows. Our estimate of setting PCTFREE to 40 was sufficient to ensure that there was adequate free space to handle the update of the MORE_DATA column. Since none of our rows have been migrated our query will not have been impacted and it will perform the same 24 logical IO operations.
In summary, when designing database tables consideration of the manner in which the rows will be populated and setting the PCTFREE clause accordingly is fairly important. Resolving a problem with migrated rows is an annoyance that can easily be avoided with a small amount of forethought and a quick calculation.