I’ve previously commented that the best way to monitor database processes is for the processes to log their progress. Unlike traditional applications, where logs are written to simple text files, database processes have at their disposal a far more useful avenue for writing logs: the database itself. This post outlines a simple logging mechanism and shows how it can be combined with other process monitoring features to provide in-depth insight into what a process is doing.
I am a firm believer in logging to a database table. The reasoning is simple: at some stage you’ll want to query your logs to find out what is currently happening or what has previously happened. The key word in the previous sentence is “query”… and nothing handles querying of data better than the database itself.
Many Oracle developers seem to think that process instrumentation starts and ends with DBMS_OUTPUT. While this might be suitable for ad-hoc output during development, it’s mostly useless in an active system where a process is not being executed from the command line.
The following are the basic requirements for the logging mechanism:
- logs are written to a database table
- fast and efficient, so as not to create significant overhead when recording log messages
- minimalistic interface so it can be called easily
- support for the classification of different types of log message
- captures the process attributes of MODULE, ACTION and CLIENT_IDENTIFIER
The DDL for the LOGS table is:

```sql
CREATE TABLE logs
   (log_id            NUMBER         NOT NULL
   ,log_timestamp     TIMESTAMP(4)   NOT NULL
   ,msg_type          VARCHAR2(1)    NOT NULL
   ,msg               VARCHAR2(500)  NOT NULL
   ,sid               NUMBER         NOT NULL
   ,unique_sid        VARCHAR2(24)   NOT NULL
   ,module            VARCHAR2(48)
   ,action            VARCHAR2(32)
   ,client_identifier VARCHAR2(64)
   ,err_no            NUMBER(7)
   ,msg_detail        CLOB)
LOB (msg_detail) STORE AS (DISABLE STORAGE IN ROW)
PARTITION BY RANGE (log_timestamp)
INTERVAL (NUMTODSINTERVAL(1,'DAY'))
(PARTITION logs_20121231 VALUES LESS THAN (TO_DATE('01/01/2013','dd/mm/yyyy')));
```
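The LOG_ID values are generated from a sequence (described below). A minimal sketch of a supporting sequence might look like this; the name LOGS_SEQ and the cache size are my assumptions, not part of the original design:

```sql
-- Hypothetical sequence backing the LOG_ID column. A generous cache keeps
-- ID generation cheap, which matters for a table written to very frequently.
CREATE SEQUENCE logs_seq
   CACHE 1000;
```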
The attributes used in the LOGS table are:
- LOG_ID – a sequence generated ID. Fairly useless except for differentiating the order in which entries were created by a single process
- LOG_TIMESTAMP – date/time the entry was created
- MSG_TYPE – contains a value of:
- D – debug message
- I – information message
- W – warning message
- E – error message
- MSG – the log message itself
- SID – the session ID of the session creating the message
- UNIQUE_SID – the unique session ID, as reported by dbms_session.unique_session_id, which can be useful in a RAC environment
- MODULE – the MODULE setting for the session
- ACTION – the ACTION setting for the session
- CLIENT_IDENTIFIER – the CLIENT_IDENTIFIER setting for the session
- ERR_NO – the error number associated with any error message entry
- MSG_DETAIL – an optional large message store. I added this for a project where the developers wanted to store an application stack dump for errors, and it seemed like a good thing to retain.
Unlike a number of logging mechanisms, I have chosen to avoid a numeric “log level” attribute in favour of a type classification that is roughly similar to what is used in some application development environments. The key reason for this is that an outsider is unlikely to know what levels 1, 2, 3, etc. mean or where they should be used. Sticking with defined concepts might be slightly less flexible but is far more descriptive. The API package described next goes into this further.
Attributes aside, the only other feature of the table is that it’s partitioned by LOG_TIMESTAMP, with each partition covering one day. The idea here is that we will want to purge old data from the table, and dropping partitions is the easiest way to do that. Furthermore, there are no indexes on the table, so the partitioning will also help when it comes to querying the data, assuming the queries restrict themselves to a fairly narrow time period. Obviously, if you are not licensed for partitioning then you will need to disable this feature. Potentially, sub-partitioning by MSG_TYPE would permit selective retention of the different message types, e.g. debug messages might be purged after a couple of days but error messages might be retained for months.
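As a sketch of what the purge might look like, a daily job could drop any partition older than the retention window with a single DDL statement. The syntax below assumes Oracle 11g or later, where DROP PARTITION FOR locates an interval partition by value (interval partitions receive system-generated names such as SYS_P123, so addressing them by value is easier than by name):

```sql
-- Illustrative purge of one day's worth of log data; the timestamp literal
-- identifies the partition to drop and would be derived from the retention
-- policy in a real job.
ALTER TABLE logs DROP PARTITION FOR (TIMESTAMP '2013-01-01 00:00:00.0000');
```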
As noted, there are no indexes on the table. This is deliberate, so as to reduce the overhead of creating entries. Querying the LOGS table is not likely to be a common activity in a production system but creating entries will be very common.
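To give an idea of what each log write involves, the core insert might be sketched as follows. This is my illustration, not the downloadable code: the routine and parameter names are assumptions, and it uses an autonomous transaction (discussed later) so a log entry survives a rollback of the business transaction:

```sql
-- Hypothetical internal write routine. SYS_CONTEXT('userenv',...) supplies
-- the session attributes captured in the table, and LOGS_SEQ is an assumed
-- sequence name for LOG_ID.
PROCEDURE write_log (p_msg_type IN logs.msg_type%TYPE
                    ,p_msg      IN logs.msg%TYPE
                    ,p_err_no   IN logs.err_no%TYPE     DEFAULT NULL
                    ,p_detail   IN logs.msg_detail%TYPE DEFAULT NULL)
AS
   PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
   INSERT INTO logs
      (log_id, log_timestamp, msg_type, msg, sid, unique_sid,
       module, action, client_identifier, err_no, msg_detail)
   VALUES
      (logs_seq.NEXTVAL
      ,SYSTIMESTAMP
      ,p_msg_type
      ,p_msg
      ,SYS_CONTEXT('userenv','sid')
      ,dbms_session.unique_session_id
      ,SYS_CONTEXT('userenv','module')
      ,SYS_CONTEXT('userenv','action')
      ,SYS_CONTEXT('userenv','client_identifier')
      ,p_err_no
      ,p_detail);
   COMMIT; -- required to complete the autonomous transaction
END write_log;
```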
So, let’s take a look at the API for creating the log entries…
The primary focus is on a short and simple interface for creating log messages. At its heart PKG_LOG contains 4 logging routines:
- DEBUG – creates a debug log entry
- INFO – creates an information log entry. This is the “standard” type of entry
- WARN – creates a warning log entry
- ERR – creates an error log entry
With the exception of ERR the routines accept a log message and, optionally, an extended message detail parameter. Creating a standard information message is as simple as:
```sql
pkg_log.info ('Some text to write out to the LOGS table');
```
The ERR routine takes a log message and error number, so calling it will probably resemble:

```sql
pkg_log.err (SQLERRM, SQLCODE);
```
Other routines provided by PKG_LOG include:
- enable_logging / disable_logging – used to enable and disable message logging entirely (calls to ERR will still be logged though). By default logging is enabled.
- enable_autonomous / disable_autonomous – used to enable or disable the writing of log messages as autonomous transactions. This functionality was introduced to work around issues Oracle had in some earlier versions with autonomous transactions inside distributed transactions, so it may no longer be relevant. The default is to use autonomous transactions.
- set_level_normal / set_level_debug – By default debug messages are not logged. Calling set_level_debug causes debug messages to be logged.
- logging_enabled – A function that returns TRUE if the logging facility is enabled
- debug_logging – a function that returns TRUE if the logging facility is set to write out debug messages
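Pulling the routines above together, the package specification implied by this interface might look roughly like the following. This is a sketch reconstructed from the description only; parameter names and defaults are my assumptions, and the actual downloadable code may differ:

```sql
CREATE OR REPLACE PACKAGE pkg_log
AS
   -- one routine per message classification; the optional second parameter
   -- feeds the MSG_DETAIL CLOB column
   PROCEDURE debug (p_msg IN logs.msg%TYPE, p_msg_detail IN logs.msg_detail%TYPE DEFAULT NULL);
   PROCEDURE info  (p_msg IN logs.msg%TYPE, p_msg_detail IN logs.msg_detail%TYPE DEFAULT NULL);
   PROCEDURE warn  (p_msg IN logs.msg%TYPE, p_msg_detail IN logs.msg_detail%TYPE DEFAULT NULL);
   PROCEDURE err   (p_msg IN logs.msg%TYPE, p_err_no IN logs.err_no%TYPE);

   -- global on/off switch (ERR entries are always written)
   PROCEDURE enable_logging;
   PROCEDURE disable_logging;

   -- control whether entries are written as autonomous transactions
   PROCEDURE enable_autonomous;
   PROCEDURE disable_autonomous;

   -- control whether DEBUG entries are written
   PROCEDURE set_level_normal;
   PROCEDURE set_level_debug;

   FUNCTION logging_enabled RETURN BOOLEAN;
   FUNCTION debug_logging   RETURN BOOLEAN;
END pkg_log;
```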
As an example of general usage, combining PKG_LOG with PKG_APP_INFO for setting the MODULE and ACTION, PKG_DB_TYPE for standard type definitions and PKG_ERR for logging the last error generated, the normal way I would instrument a routine would be:
```sql
PROCEDURE proc
AS
   c_proc CONSTANT pkg_db_type.t_object_name := 'proc';
BEGIN
   pkg_app_info.set_action (c_proc);
   pkg_log.info ('Entering ' || c_proc);

   ...

   pkg_log.debug ('Exiting ' || c_proc);
   pkg_app_info.restore_action;
EXCEPTION
   WHEN OTHERS THEN
      pkg_err.log_last_err;
      pkg_app_info.restore_action;
      RAISE;
END proc;
```
So, you can see that routine entry is logged as an INFO message and exit is logged as a DEBUG message. What else would I log? Any important steps and logical flows through the code would receive an INFO message. Likewise, the row count of any bulk SQL statement would get logged. I also include DEBUG message logging for minor steps in a process, which can easily be enabled for when in-depth analysis is required. Naturally exceptions thrown are logged too.
Combining a simple logging mechanism with the MODULE, ACTION and CLIENT_IDENTIFIER process identifiers will allow you to determine with precision the exact series of steps a process has gone through. Furthermore, the details can be cross-referenced against the process history in Active Session History to diagnose most logical and performance related problems. The overhead of setting up effective logging is very low but the benefits it provides are enormous.
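As an illustration of that diagnostic querying, tracing a single business run might look like the query below. The CLIENT_IDENTIFIER value is made up for the example; the LOG_TIMESTAMP predicate is what lets Oracle prune down to the relevant daily partitions despite the absence of indexes:

```sql
-- Trace one process run over the last day, in the order entries were written
SELECT log_timestamp
,      msg_type
,      module
,      action
,      msg
,      err_no
FROM   logs
WHERE  log_timestamp >= SYSTIMESTAMP - INTERVAL '1' DAY
AND    client_identifier = 'batch_run_42'   -- illustrative identifier
ORDER  BY
       log_timestamp
,      log_id;
```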
Download code shown in this post.