Interval Aggregation

Interval data types were introduced in Oracle 9i. The idea is that they store an “amount” of time, e.g. “5 minutes”, “1 year and 2 months”. Before interval data types it was common to track time units using simple numeric values. Oracle natively supports date arithmetic on the DATE data types with a value of 1 denoting “1 day”. This mechanism allows us to do things like determine the following week by adding 7 to a DATE variable.

Oracle now provides 2 interval data types; YEAR TO MONTH and DAY TO SECOND. Two data types are necessary to be able to handle the fact that months have a different number of days.

Interval data types have a number of advantages over the dates and a number offset approach. Oracle “knows” about the time they represent and is able to handle operations correctly. For example, Oracle “knows” that “1 day, 6 hours” is equivalent to “30 hours” if a comparison is made between them, Furthermore, it also “knows” that it is the same as “adding 18 hours and 12 hours”. Adding an interval data type to a timestamp data type results in a timestamp that is the interval value beyond the original timestamp. All this operates in a nice and logical manner! You can find more about how Oracle handles datetime / interval arithmetic in the Oracle documentation.

One area that is lacking for interval data types is with the SQL aggregate functions. As mentioned earlier, Oracle can correctly add two intervals together:

SELECT INTERVAL '18' HOUR + INTERVAL '12' HOUR AS time_period
FROM   dual
/

and this correctly gives us:

TIME_PERIOD
-----------------------------
+000000001 06:00:00.000000000

However, if we try to use the SUM aggregate function to add together a column of an interval data type:

SELECT SUM(time_period) AS sum_time_period
FROM   (SELECT INTERVAL '18' HOUR AS time_period
        FROM   dual
		UNION ALL
		SELECT INTERVAL '12' HOUR
		FROM   dual)
/

we get:

SELECT SUM(time_period) AS sum_time_period
           *
ERROR at line 1:
ORA-00932: inconsistent datatypes: expected NUMBER got INTERVAL DAY TO SECOND

This behaviour is rather annoying but can be worked around by creating our own aggregate function. Documentation on creating aggregate functions can be found in Chapter 11 of the Oracle Database Data Cartridge Developer’s Guide.

So, how do we create an aggregate function that we can use to sum interval data types? We will need an object type and a function, which for this example we’ll name t_sum_interval and sum_interval respectively. We’ll use the DAY TO SECOND interval data type as I have found that this data type is more common than the YEAR TO MONTH type, although this could easily be adapted if your requirements differ.

The core work is done by the object type. There are 4 routines and the Oracle provided documentation is quite clear on what purpose they serve. For our interval aggregation function our object type specification is simply:

CREATE OR REPLACE TYPE t_SumInterval AS OBJECT
(
--------------------------------------------------------------------------------
--  Type          : t_SumInterval
--  Author        : Mark Hoxey
--  Creation Date : 01/09/2009
--
--  Overview      : Object type to implement the SumInterval aggregate function
--------------------------------------------------------------------------------
-- Source Control : $Id$
--------------------------------------------------------------------------------
--  Modification History :
--
--  Initials  Date        Description
--  --------  ----------  ---------------------------------------
--  M Hoxey   01/09/2009  Initial version
--------------------------------------------------------------------------------
 
--------------------------------------------------------------------------------
-- Member Properties

-- Used to track the sum of intervals
l_interval  INTERVAL DAY(9) TO SECOND(9),

--------------------------------------------------------------------------------
-- Function      : ODCIAggregateInitialize
--
-- Description   : Initialisation function for the aggregate function which
--                 simply calls the default constructor.
--------------------------------------------------------------------------------
STATIC FUNCTION ODCIAggregateInitialize (p_SumInterval IN OUT t_SumInterval) 
RETURN NUMBER,
   
--------------------------------------------------------------------------------
-- Function      : ODCIAggregateIterate
--
-- Description   : Adds the interval specified to the interval member property
--                 after ensuring that either the provided interval or the
--                 member property is NOT NULL. If both are NULL then no action
--                 is taken and the member property is left as NULL.
--------------------------------------------------------------------------------
MEMBER FUNCTION ODCIAggregateIterate (self         IN OUT t_SumInterval, 
                                      p_interval   IN     DSINTERVAL_UNCONSTRAINED) 
RETURN NUMBER,
   
--------------------------------------------------------------------------------
-- Function      : ODCIAggregateTerminate
--
-- Description   : Returns the interval member property on termination of the 
--                 aggregation operation.
--------------------------------------------------------------------------------
MEMBER FUNCTION ODCIAggregateTerminate (self        IN  t_SumInterval, 
                                        p_interval  OUT DSINTERVAL_UNCONSTRAINED, 
                                        flags       IN  NUMBER) 
RETURN NUMBER,
   
--------------------------------------------------------------------------------
-- Function      : ODCIAggregateMerge
--
-- Description   : Adds an interval member property to the interval member 
--                 property of the current object, preserving NULLs are
--                 required for aggregate processing.
--------------------------------------------------------------------------------
MEMBER FUNCTION ODCIAggregateMerge  (self            IN OUT t_SumInterval, 
                                     p_SumInterval   IN     t_SumInterval) 
RETURN NUMBER
   
);
/

Other than the prescribed functions, our object type has a single member property, which is used to accumulate the sum of the intervals we’re processing. If we were creating an average aggregate function then we would need to keep track of two values; the sum of values and the number of values.

The logic required for the 4 prescribed routines is:

ODCIAggregateInitialize – return, via the out parameter, a new object type instance with the member property initialised to NULL
ODCIAggregateIterate – add the specified interval onto the member property of the current instance
ODCIAggregateTerminate – return the member property of the current instance
ODCIAggregateMerge – take the member property of the input object type and add it onto the member property of the current instance

Our object type body implementation is:

CREATE OR REPLACE TYPE BODY t_SumInterval IS

--------------------------------------------------------------------------------
--  Type          : t_SumInterval
--  Author        : Mark Hoxey
--  Creation Date : 01/09/2009
--
--  Overview      : Object type to implement the SumInterval aggregate function
--------------------------------------------------------------------------------
-- Source Control : $Id$
--------------------------------------------------------------------------------
--  Modification History :
--
--  Initials  Date        Description
--  --------  ----------  ---------------------------------------
--  M Hoxey   01/09/2009  Initial version
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
-- Function      : ODCIAggregateInitialize
--
-- Description   : Initialisation function for the aggregate function which
--                 simply calls the default constructor.
--------------------------------------------------------------------------------
STATIC FUNCTION ODCIAggregateInitialize (p_SumInterval IN OUT t_SumInterval) 
RETURN NUMBER 
AS 
BEGIN
   p_SumInterval := t_SumInterval(NULL);
   RETURN ODCIConst.Success;
END ODCIAggregateInitialize;

--------------------------------------------------------------------------------
-- Function      : ODCIAggregateIterate
--
-- Description   : Adds the interval specified to the interval member property
--                 after ensuring that either the provided interval or the
--                 member property is NOT NULL. If both are NULL then no action
--                 is taken and the member property is left as NULL.
--------------------------------------------------------------------------------
MEMBER FUNCTION ODCIAggregateIterate (self         IN OUT t_SumInterval, 
                                      p_interval   IN     DSINTERVAL_UNCONSTRAINED) 
RETURN NUMBER IS
BEGIN
   IF (self.l_interval IS NOT NULL OR p_interval IS NOT NULL) THEN
      self.l_interval := NVL(self.l_interval ,INTERVAL '0' SECOND) + NVL(p_interval ,INTERVAL '0' SECOND);
   END IF;
   RETURN ODCIConst.Success;
END ODCIAggregateIterate;

--------------------------------------------------------------------------------
-- Function      : ODCIAggregateTerminate
--
-- Description   : Returns the interval member property on termination of the 
--                 aggregation operation.
--------------------------------------------------------------------------------
MEMBER FUNCTION ODCIAggregateTerminate (self        IN  t_SumInterval, 
                                        p_interval  OUT DSINTERVAL_UNCONSTRAINED, 
                                        flags       IN  NUMBER) 
RETURN NUMBER 
AS
BEGIN
   p_interval := self.l_interval;
   RETURN ODCIConst.Success;
END ODCIAggregateTerminate;

--------------------------------------------------------------------------------
-- Function      : ODCIAggregateMerge
--
-- Description   : Adds an interval member property to the interval member 
--                 property of the current object, preserving NULLs are
--                 required for aggregate processing.
--------------------------------------------------------------------------------
MEMBER FUNCTION ODCIAggregateMerge (self            IN OUT t_SumInterval, 
                                    p_SumInterval   IN     t_SumInterval) 
RETURN NUMBER 
AS
BEGIN
   IF (self.l_interval IS NOT NULL OR p_SumInterval.l_interval IS NOT NULL) THEN
      self.l_interval := NVL(self.l_interval ,INTERVAL '0' SECOND) + NVL(p_SumInterval.l_interval ,INTERVAL '0' SECOND);
   END IF;
   RETURN ODCIConst.Success;
END ODCIAggregateMerge;

END;
/

The only part remaining is the creation of the aggregate function that invokes our object type:

CREATE OR REPLACE FUNCTION SumInterval (p_interval DSINTERVAL_UNCONSTRAINED) 
--------------------------------------------------------------------------------
--  Function      : SumInterval
--  Author        : Mark Hoxey
--  Creation Date : 01/09/2009
--
--  Overview      : SUM aggregate function for DAY TO SECOND interval data type
--                  using the t_SumInterval object type
--------------------------------------------------------------------------------
-- Source Control : $Id$
--------------------------------------------------------------------------------
--  Modification History :
--
--  Initials  Date        Description
--  --------  ----------  ---------------------------------------
--  M Hoxey   01/09/2009  Initial version
--------------------------------------------------------------------------------
RETURN DSINTERVAL_UNCONSTRAINED 
PARALLEL_ENABLE
AGGREGATE USING t_SumInterval;
/

Quite simple really. The only potential stumbling point is with the handling of NULLs. If we look at how the SUM function works with respect to NULL entries we can ensure we mimic the correct behaviour (that is assuming we don’t already know how SQL aggregate functions are required to handle NULLs…).

Let’s create some tests for our new aggregate function to check that it’s working correctly, comparing the output to the standard SUM function operating on simple numeric values. Rather than creating a table of values I’ve SELECTed from DUAL to obtain a small set within an in-line view:

SELECT SumInterval(interval_val) AS sum_interval_val
,      SUM(number_val)           AS sum_number_val
FROM   (SELECT INTERVAL '2' DAY  AS interval_val
        ,      2                 AS number_val
        FROM   dual
        UNION ALL
        SELECT INTERVAL '3' DAY
        ,      3                
        FROM   dual
        UNION ALL
        SELECT INTERVAL '4' DAY
        ,      4               
        FROM   dual)
/

which gives us an output of:

SUM_INTERVAL_VAL               SUM_NUMBER_VAL
------------------------------ --------------
+000000009 00:00:00.000000000               9

So, our aggregated interval value appear to be correct. Now, how what happens if we introduce NULLs into the picture:

SELECT SumInterval(interval_val) AS sum_interval_val
,      SUM(number_val)           AS sum_number_val
FROM   (SELECT INTERVAL '2' DAY  AS interval_val
        ,      2                 AS number_val
        FROM   dual
        UNION ALL
        SELECT INTERVAL '3' DAY
        ,      3               
        FROM   dual
        UNION ALL
        SELECT NULL
        ,      NULL
        FROM   dual
        UNION ALL	
        SELECT INTERVAL '4' DAY
        ,      4               
        FROM   dual)
/

which gives us:

SUM_INTERVAL_VAL               SUM_NUMBER_VAL
------------------------------ --------------
+000000009 00:00:00.000000000               9

From the above we can see that we’ve correctly handled the behaviour of NULLs in that we ignored it. What about if all values are NULL though:

SELECT SumInterval(interval_val) AS sum_interval_val
,      SUM(number_val)           AS sum_number_val 
FROM   (SELECT NULL  AS interval_val
        ,      NULL  AS number_val
        FROM   dual
        UNION ALL
        SELECT NULL
        ,      NULL
        FROM   dual)
/

which gives us:

SUM_INTERVAL_VAL               SUM_NUMBER_VAL
------------------------------ --------------

So, it would appear that our interval aggregate function if behaving as expected. Normally I would recommend a larger test suite but I won’t include that for this post. We could easily implement other aggregate functions, MIN, MAX, AVG, etc, for interval data types in a similar manner. The hard part is remembering that the standard aggregate functions cannot be used with interval data types and that we will need to invoke our custom functions instead.


Download scripts shown in this post.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s