Another situation from work worth creating a post; how Oracle SQL operators and functions handle NULL arguments. Quite simple stuff to an experienced Oracle developer but perhaps not so obvious to a newcomer as the concept of NULL seems to be frequently overlooked. I have previously commented on aggregate functions and their handling of NULL and so I’ll focus on standard operators and common in-built functions in the first part of the post and look into specific NULL handling functions in the latter part.
Note, all of the following was done on Oracle 11.2.0.3.
SQL Operators
Let’s start off by looking at how the in-built operators handle NULL. The follow operations were performed using a simple:
SELECT <operation> FROM dual
Where the result is NULL then it’s been specified in italics as null:
Operator | Operation | Result |
---|---|---|
Addition | 1 + NULL | null |
Subtraction | 1 – NULL | null |
Multiplication | 1 * NULL | null |
Division | 1 / NULL | null |
String Concatenation | ‘A’ || NULL | ‘A’ |
So, with the exception of string concatenation, all the operators return NULL when one of their arguments is NULL. Hopefully there are no surprises there; it’s quite logical really… how can an expression return a valid result when the value of one component is unknown?. String concatenation is the exception; NULL is ignored and so the end result will be the concatenation of the non-NULL arguments.
Pretty simple so far…
SQL Functions
I don’t intend to go through all functions here but will simply show the output of a handful that are not specific to dealing with NULL parameters initially. Later I’ll go through some functions that are designed specifically for NULL parameters. Similar to the previous section the results shown here were obtained via:
SELECT <operation> FROM dual
Function | Operation | Result |
---|---|---|
TO_NUMBER | TO_NUMBER(NULL) | null |
TO_CHAR | TO_CHAR(NULL) | null |
TRUNC | TRUNC(NULL) | null |
SUBSTR | SUBSTR(NULL,1,2) | null |
SIGN | SIGN(NULL) | null |
LEAST | LEAST(1,2,NULL) | null |
RPAD | RPAD(NULL,5,’X’) | null |
LENGTH | LENGTH(NULL) | null |
Hmmm… I see a pattern emerging…
Quite simply SQL functions return NULL if an input parameter is NULL. Even functions that you might expect to return a value, such as LENGTH, return NULL.
Okay, the above statement is a bit of a generalisation as there are functions that are specifically designed to handle NULL. Let’s have a look at some of them. For these functions I’ll go into a bit more detail. The functions examined are:
- NVL
- NVL2
- COALESCE
- NULLIF
- LNNVL
- DECODE
NVL
You won’t have been developing with Oracle for long before encountering this function. NVL takes 2 parameters. If the first parameter is not NULL then it returns that value, otherwise it returns the second parameter. Rewritten as a CASE expression it would be:
CASE WHEN <parameter 1> IS NOT NULL THEN <parameter 1> ELSE <parameter 2> END
So:
NVL (5, 4)
returns the value 5, whilst:
NVL(NULL, 4)
returns the value 4. So, NVL is a simple function for assigning a non-NULL value to an expression that might be NULL. However, even with such a simple function there are a few things to keep in mind:
- NVL will return NULL if both parameters evaluate to NULL.
- Oracle will attempt to perform an implicit data type conversion on the second parameter if it does not match the data type of the first parameter. So, the expression NVL(5,’NULL’) will raise an exception, whereas the expression NVL(5,’3′) will not.
NVL2
NVL2 takes 3 parameters. If the first parameter evaluates to not NULL then it returns the second parameter, otherwise it returns the third parameter. Expressed using CASE, NVL2 would be:
CASE WHEN <parameter 1> IS NOT NULL THEN <parameter 2> ELSE <parameter 3> END
From the above, it can be seen that NVL2 will return NULL in the following scenarios:
- Parameter 1 is not NULL and Parameter 2 is NULL
- Parameter 1 and 3 are NULL
The data type returned by NVL2 is driven by the second parameter and an implicit conversion is done on the third parameter if it does not match the data type of the second parameter.
COALESCE
COALESCE takes a variable number of parameters and returns the first non-NULL parameter. Expressed using CASE this would be:
CASE WHEN <parameter 1> IS NOT NULL THEN <parameter 1> WHEN <parameter 2> IS NOT NULL THEN <parameter 2> WHEN <parameter 3> IS NOT NULL THEN <parameter 3> ... WHEN <parameter n> IS NOT NULL THEN <parameter n> END
So, how many parameters can COALESCE accept? According to the following script, the answer is 65535:
DECLARE l_no_parameters NUMBER(6) := 65535; l_result NUMBER(5); l_sql CLOB; BEGIN dbms_lob.createtemporary(l_sql,FALSE); l_sql := 'SELECT COALESCE ('; FOR i IN 1..l_no_parameters-1 LOOP l_sql := l_sql || 'NULL,'; END LOOP; l_sql := l_sql || TO_CHAR(l_no_parameters) || ') INTO :1 FROM dual'; EXECUTE IMMEDIATE l_sql INTO l_result; dbms_output.put_line ('Result is: ' || TO_CHAR(l_result)); END;
Using 65535 as the number of parameters, we get a response of:
Result is: 65535
However, if we increment the number of parameters to 65536 we get a response of:
DECLARE * ERROR at line 1: ORA-00939: too many arguments for function ORA-06512: at line 15
Still, I would expect that 64k parameters should be enough for anyone…
NULLIF
NULLIF accepts 2 parameters and returns NULL if both parameters are the same, otherwise it returns the first parameter. Expressed using CASE this would be:
CASE WHEN <parameter 1> = <parameter 2> THEN NULL ELSE <parameter 1> END
An interesting situation exists if the first parameter is NULL. According to the NULLIF documentation, the first parameter cannot be specified as a literal NULL but testing shows that it can be an expression that results in NULL:
SELECT NULLIF(TO_CHAR(NULL),'1') FROM dual
which results in NULL.
The parameters of NULLIF need to be of the same basic data type; numeric, character or date. NULLIF will reurn an error if they are not.
LNNVL
LNNVL is an interesting function in that it takes a single logical expression rather than a more traditional scalar parameter(s). If the expression evaluates to TRUE then LNNVL returns FALSE, otherwise it returns TRUE. Expressed using CASE, it would be:
CASE WHEN <logical expression> THEN FALSE ELSE TRUE END
“Hang on” I hear you say, “isn’t that what NOT does?”, i.e.:
X > 5
could be just as easily expressed as:
NOT (X > 5)
to obtain the reverse without needing LNNVL. The answer is “No, not quite”. This is where the complexity of handling NULL comes in. Let’s take a closer look at very simple scenario of a data set involving 3 rows; values 1, 10 and NULL. First we select the entries where the value is greater than 5, as per our previous expression:
WITH demo_data AS (SELECT 1 AS id, 1 AS val FROM dual UNION ALL SELECT 2 AS id, 10 AS val FROM dual UNION ALL SELECT 3 AS id, NULL AS val FROM dual) SELECT * FROM demo_data WHERE val > 5
which gives:
ID VAL ---------- ---------- 2 10
Then we’ll select the entries where the values is not greater than 5:
WITH demo_data AS (SELECT 1 AS id, 1 AS val FROM dual UNION ALL SELECT 2 AS id, 10 AS val FROM dual UNION ALL SELECT 3 AS id, NULL AS val FROM dual) SELECT * FROM demo_data WHERE NOT(val > 5)
which gives:
ID VAL ---------- ---------- 1 1
So, despite one WHERE clauses being the exact opposite of the other from a simple boolean perspective we have excluded the row with the NULL from both result-sets. Enter LNNVL. If we use LNNVL in place of the NOT operator in the second statement:
WITH demo_data AS (SELECT 1 AS id, 1 AS val FROM dual UNION ALL SELECT 2 AS id, 10 AS val FROM dual UNION ALL SELECT 3 AS id, NULL AS val FROM dual) SELECT * FROM demo_data WHERE LNNVL(val > 5)
we get:
ID VAL ---------- ---------- 1 1 3
Ah, so now we do get the exact opposite of the initial result-set, which includes those entries where VAL is NULL.
DECODE
I’m including DECODE in the list of NULL handling functions due to peculiar property that it has in relation to NULL. DECODE is written as:
DECODE (expression ,value_1, result_1 ,value_2, result_2 ,value_3, result_3 ,... ,default_result)
So, the expression is evaluated and compared to the value 1 parameter, if the values match then result_1 parameter is returned. Otherwise a comparision is made to the value_2 parameter, if matched then result_2 is returned and so on down the parameter list. If no match is made then the final parameter is returned, assuming one is provided, otherwise DECODE returns NULL. Using CASE, DECODE is:
CASE <expression> WHEN <value_1> THEN <result_1> WHEN <value_2> THEN <result_2> WHEN <value_3> THEN <result_3> ... ELSE <default_result> END
So, what’s so special with DECODE in relation to NULLs? If the expression evaluates to NULL and a value parameter is NULL then DECODE returns the corresponding result parameter. Normally NULL is never considered equal to anything, including another NULL, but DECODE treats it as if it is. This can be best illustrated using the following:
WITH demo_data AS (SELECT 1 AS id, 1 AS val FROM dual UNION ALL SELECT 2 AS id, 10 AS val FROM dual UNION ALL SELECT 3 AS id, NULL AS val FROM dual) SELECT id , val , DECODE (val, NULL, 999, -1) AS calc_val FROM demo_data
which gives the following result set:
ID VAL CALC_VAL ---------- ---------- ---------- 1 1 -1 2 10 -1 3 999
So, here DECODE mapped the NULL in the third row to a calculated value of 999. This also illustrates a bug with how I represented DECODE using CASE. If I replace DECODE with a CASE expression I initially showed above:
WITH demo_data AS (SELECT 1 AS id, 1 AS val FROM dual UNION ALL SELECT 2 AS id, 10 AS val FROM dual UNION ALL SELECT 3 AS id, NULL AS val FROM dual) SELECT id , val , DECODE (val, NULL, 999, -1) AS calc_val_decode , CASE val WHEN NULL THEN 999 ELSE -1 END AS calc_val_case FROM demo_data
the result would be:
ID VAL CALC_VAL_DECODE CALC_VAL_CASE ---------- ---------- --------------- ------------- 1 1 -1 -1 2 10 -1 -1 3 999 -1
Notice how the CASE expression did not consider the NULL column value to equal the NULL result scenario so it defaults to the ELSE clause instead. So, in order to mimic DECODE via CASE we would need a bit more complex logic:
CASE -- resolve the scenario where the expression evaluates to NULL WHEN <expression> IS NULL THEN <result_null> -- now handle the not NULL scenarios ELSE CASE expression WHEN <value_1> THEN <result_1> WHEN <value_2> THEN <result_2> WHEN <value_3> THEN <result_3> ... ELSE <default_result> END END
… or something similar as there are various ways this logic could be implemented.
So, in summary, I hope this post has illustrated the basics of how NULL is handled by SQL operators and functions. The handling of NULL is not overly complex but it is an area that quite regularly causes problems when overlooked in applications.