Extracting data from a database is a common activity for any system. In our information-driven world we are constantly pulling data out for reporting systems, data warehouses, feeds to other systems or simple reports. Getting data out of an Oracle database is as simple as writing one or more SQL queries. Things get trickier when we need to consider the consistency of the data we're extracting.
As an example of data extract inconsistency, one place I worked at had a requirement to extract data into CSV files for a third party on a daily basis. The data consisted primarily of account details and transactions on those accounts. The third party started to complain that the account balances did not match up with the transactions that were reported. The cause of the difference was easily diagnosed. The overall process simply ran one query after another, dumping the results to a CSV file, with the transaction extract running after the account details one. By the time the account details extract had finished, new transactions had made their way into the system and were picked up by the transaction extract, but these new transactions were not reflected in the account balances of the account extract. What the extract process required was to run all the extracts as of the same point in time, and Oracle provides this functionality via flashback query.
Flashback query has been part of Oracle since version 9i. In its simplest form it's a matter of specifying the point in time at which you want to query a table, e.g.:
SELECT *
FROM   transactions AS OF TIMESTAMP CAST(TRUNC(SYSDATE) AS TIMESTAMP);
which displays the contents of the TRANSACTIONS table as at the start of the day. Activity on the table since the start of the day is rolled back for the purpose of the query. Using this approach it's possible to construct a set of queries that are all consistent as of the same point in time, which is ideal for extracts that need to be consistent with each other.
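For example, the daily account and transaction extracts from the earlier scenario could both be run as of the same moment. The sketch below uses AS OF SCN rather than AS OF TIMESTAMP, since capturing the current SCN via DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER avoids the roughly 3-second granularity of timestamp-to-SCN mapping. The ACCOUNTS and TRANSACTIONS table names are assumptions for illustration:

VARIABLE extract_scn NUMBER

BEGIN
   -- Capture the current SCN once; every extract then queries as of this SCN
   :extract_scn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER;
END;
/

-- Both queries see the database exactly as it was at the captured SCN,
-- so the account balances and transactions are guaranteed to line up
SELECT * FROM accounts     AS OF SCN :extract_scn;
SELECT * FROM transactions AS OF SCN :extract_scn;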
Using PKG_CSV, creating a data-consistent set of extracts is as simple as:
DECLARE
   l_directory         VARCHAR2(30) := 'EXTRACT_DIR';
   l_extract_timestamp TIMESTAMP    := CAST(TRUNC(SYSDATE) AS TIMESTAMP);
   l_emp_cur           SYS_REFCURSOR;
   l_dept_cur          SYS_REFCURSOR;
BEGIN
   OPEN l_emp_cur FOR
      SELECT *
      FROM   emp AS OF TIMESTAMP l_extract_timestamp;
   OPEN l_dept_cur FOR
      SELECT *
      FROM   dept AS OF TIMESTAMP l_extract_timestamp;
   pkg_csv.to_file (p_cursor    => l_emp_cur
                   ,p_directory => l_directory
                   ,p_filename  => 'emp_' || TO_CHAR(l_extract_timestamp,'yyyymmddhh24miss') || '.csv');
   pkg_csv.to_file (p_cursor    => l_dept_cur
                   ,p_directory => l_directory
                   ,p_filename  => 'dept_' || TO_CHAR(l_extract_timestamp,'yyyymmddhh24miss') || '.csv');
END;
So how far back can you go with this type of query? Oracle uses the undo (or rollback) segments to work out what the data looked like in the past. The obvious question then is "how much undo does Oracle retain?". Unfortunately the answer isn't clear cut. Generally speaking, Oracle will retain undo for at least the duration specified by the initialisation parameter UNDO_RETENTION, which is specified in seconds. This parameter defaults to 900, or 15 minutes, although many systems will have it set to a larger value. The Oracle documentation contains this warning on the way Oracle manages the undo available:
The UNDO_RETENTION parameter can only be honored if the current undo tablespace has enough space. If an active transaction requires undo space and the undo tablespace does not have available space, then the system starts reusing unexpired undo space. This action can potentially cause some queries to fail with a "snapshot too old" message.
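To get a feel for how far back flashback query can reach on a given database, you can compare the configured retention with the retention Oracle reports it is actually achieving. One way to do so is sketched below, assuming you have access to V$PARAMETER and V$UNDOSTAT (the TUNED_UNDORETENTION column is available from Oracle 10g onwards):

-- The configured minimum undo retention, in seconds
SELECT value
FROM   v$parameter
WHERE  name = 'undo_retention';

-- The retention (in seconds) Oracle has auto-tuned to and is
-- currently attempting to honour, per 10-minute sampling interval
SELECT MAX(tuned_undoretention)
FROM   v$undostat;

If the tuned retention is comfortably larger than the gap between your extract timestamp and the time the extracts actually run, the flashback queries should succeed; otherwise you risk the "snapshot too old" error quoted above.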
You can read more about flashback query in the Oracle documentation.