Creating indexes on expressions using in-built SQL functions can be a very useful approach for a solving certain problems. For example, an index can be created to enforce uniqueness on a subset of rows or for solving an important, unanticipated query that contains complex logic.
When creating function-based indexes we are not limited to using the in-built SQL functions but can use PL/SQL functions too, which is a feature that I prefer to avoid and consequently forget about. This post looks at some of the basics associated with PL/SQL function-based indexes, noting a couple of concerns you need to be aware of.
To start with, let’s create a table for the following examples:
CREATE TABLE plsql_fbi (id NUMBER(10) NOT NULL ,cat VARCHAR2(10) NOT NULL) / ALTER TABLE plsql_fbi ADD CONSTRAINT plsql_fbi_chk1 CHECK (UPPER(cat) = cat) / INSERT INTO plsql_fbi (id, cat) SELECT ROWNUM , dbms_random.string('U',10) FROM dual CONNECT BY ROWNUM <= 100 / EXEC dbms_stats.gather_table_stats ('','plsql_fbi')
So, we now have a table named PLSQL_FBI, which contains 2 columns and 100 rows. The CAT column is character-based and each entry is randomly generated 10 upper case character strings. Now assume we have a requirement to filter the data in the table according to the last character of the CAT column, i.e.:
SELECT * FROM plsql_fbi WHERE SUBSTR(cat,-1,1) = 'C' ORDER BY cat
We could certainly create a function-based index without involving a PL/SQL function, as shown below:
CREATE INDEX plsql_fbi_ix1 ON plsql_fbi (SUBSTR(cat,-1,1))
However, for this exercise we will adopt a “developer mentality” and instead choose to:
CREATE OR REPLACE FUNCTION plsql_fbi_cat (p_cat IN plsql_fbi.cat%TYPE) RETURN VARCHAR2 DETERMINISTIC AS BEGIN RETURN SUBSTR(p_cat,1,1); END plsql_fbi_cat; / CREATE INDEX plsql_fbi_ix1 ON plsql_fbi (plsql_fbi_cat(cat)) /
We’ve now created a PL/SQL function that accepts a string and returns the last character of that string. We then create an index on our table using that function. A key thing to note in the definition of the function above is the DETERMINISTIC keyword, which basically means for any given input the function output will yeild the same result. Many functions are deterministic but if DETERMINISTIC isn’t specified then Oracle will prevent you from using the function for index creation and return the following error instead:
SQL>CREATE INDEX plsql_fbi_ix1 2 ON plsql_fbi (plsql_fbi_cat(cat)) 3 / ON plsql_fbi (plsql_fbi_cat(cat)) * ERROR at line 2: ORA-30553: The function is not deterministic
Certainly this what will happen in Oracle 11g release 2. I seem to recall that in earlier versions of Oracle the DETERMINISTIC keyword was not required but I may be wrong. What is important to note is that even though the function has been declared as DETERMINISTIC Oracle does not actually verify that the function is in fact deterministic. This means that the following functoin which is decidedly non-deterministic will be assumed to be deterministic and could be used to create a function-based index:
CREATE OR REPLACE FUNCTION plsql_fbi_cat (p_cat IN plsql_fbi.cat%TYPE) RETURN VARCHAR2 DETERMINISTIC AS BEGIN RETURN dbms_random.string('U',1); END plsql_fbi_cat;
Okay, back to our example, we now have our PL/SQL function and an index on our table that utilises that function. A simple test, using autotrace, will show us whether Oracle will use it or not:
SET AUTOTRACE ON SELECT * FROM plsql_fbi WHERE plsql_fbi_cat(cat) = 'C' ORDER BY cat / SET AUTOTRACE OFF
which gives us:
SQL>SELECT * 2 FROM plsql_fbi 3 WHERE plsql_fbi_cat(cat) = 'C' 4 ORDER BY 5 cat 6 / ID CAT ---------- ---------- 18 FSAHXSSXCC 68 IEMPRBPTCC 56 JGDDOSKQYC Execution Plan ---------------------------------------------------------- Plan hash value: 3845859686 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 2 (50)| 00:00:01 | | 1 | SORT ORDER BY | | 1 | 14 | 2 (50)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| PLSQL_FBI | 1 | 14 | 1 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | PLSQL_FBI_IX1 | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("DEVELOPER"."PLSQL_FBI_CAT"("CAT")='C') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 2 consistent gets 0 physical reads 0 redo size 680 bytes sent via SQL*Net to client 520 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 3 rows processed
From the above results, we have 3 rows in that table that end in the character C (since the data was randomly generated, you will get different results). The execution plan shows that Oracle chose to use the index to select the relevant rows, which is exactly what we would hope for.
So far so good. We’ve eliminated what could be an expensive calculation against our data and stored the calculated values in our index so we can quickly look up data based on that calculation efficiently. Any hidden dangers with doing this?
Well, what happens if our function changes, e.g. we change the calculation so that we extract the first character from the input string instead of the last:
CREATE OR REPLACE FUNCTION plsql_fbi_cat (p_cat IN plsql_fbi.cat%TYPE) RETURN VARCHAR2 DETERMINISTIC AS BEGIN RETURN SUBSTR(p_cat,1,1); END plsql_fbi_cat;
Now we rerun our query and we get the following result:
SQL>SELECT * 2 FROM plsql_fbi 3 WHERE plsql_fbi_cat(cat) = 'C' 4 ORDER BY 5 cat 6 / ID CAT ---------- ---------- 18 FSAHXSSXCC 68 IEMPRBPTCC 56 JGDDOSKQYC
Clearly this result is incorrect. We’ve changed our function but the query is still returning the same results as the pre-modified function. When we recompiled our PL/SQL function our index that is based on that function was not updated. This might come as a surprise to some but that’s just the way it is. In order to get the index to reflect the revised calculation we need to rebuild the index:
SQL>ALTER INDEX plsql_fbi_ix1 REBUILD 2 / Index altered. SQL>SELECT * 2 FROM plsql_fbi 3 WHERE plsql_fbi_cat(cat) = 'C' 4 ORDER BY 5 cat 6 / ID CAT ---------- ---------- 19 CMBJCOCCPD 62 CRUDSSFRRO
This time we get the results that we would expect from the revised function. What would happen if we forget to rebuild our index and data in the table changed:
SQL>INSERT INTO plsql_fbi 2 VALUES (200,'ACDC') 3 / 1 row created. SQL> SQL>INSERT INTO plsql_fbi 2 VALUES (201,'CDCA') 3 / 1 row created. SQL>COMMIT 2 / Commit complete. SQL>SELECT * 2 FROM plsql_fbi 3 WHERE plsql_fbi_cat(cat) = 'C' 4 ORDER BY 5 cat 6 / ID CAT ---------- ---------- 201 CDCA 18 FSAHXSSXCC 68 IEMPRBPTCC 56 JGDDOSKQYC
Whoa, that’s completely inconsistent! What we now have are entries in the index where some have been calculated using the initial version of the PL/SQL function and some with the modified version. So, the recommendation here would be that if a function used by an index is modified then the index should immediately be rebuilt. This might sound quite simple and obvious but in a complex environment where function calls might be nested one within another then it can be hard to work out the precise impact of any single code change. My recommendation would be that if you decide to use a PL/SQL function for an index then keep the function logic self-contained and don’t start calling other PL/SQL functions. Generating incorrect results from a query is probably the worst type of error that can occur within a database application so it is best to avoid scenarios that can lead to this.
What happens if we modify the function and it ends up invalid through some typo?
SQL>CREATE OR REPLACE FUNCTION plsql_fbi_cat (p_cat IN plsql_fbi.cat%TYPE) 2 RETURN VARCHAR2 3 DETERMINISTIC 4 AS 5 BEGIN 6 RETURN xSUBSTR(p_cat,-1,1); 7 END plsql_fbi_cat; 8 / Warning: Function created with compilation errors. SQL>SELECT * 2 FROM plsql_fbi 3 WHERE plsql_fbi_cat(cat) = 'C' 4 ORDER BY 5 cat 6 / FROM plsql_fbi * ERROR at line 2: ORA-06575: Package or function PLSQL_FBI_CAT is in an invalid state
Again, probably no surprises here. If the function goes invalid then we get an error when we try to run our SELECT query. Things change a bit though if our function exists within a package:
SQL>CREATE OR REPLACE PACKAGE pkg_plsql_fbi 2 AS 3 4 FUNCTION cat (p_cat IN plsql_fbi.cat%TYPE) 5 RETURN VARCHAR2 6 DETERMINISTIC; 7 8 END pkg_plsql_fbi; 9 / Package created. SQL> SQL>CREATE OR REPLACE PACKAGE BODY pkg_plsql_fbi 2 AS 3 4 FUNCTION cat (p_cat IN plsql_fbi.cat%TYPE) 5 RETURN VARCHAR2 6 DETERMINISTIC 7 AS 8 BEGIN 9 RETURN SUBSTR(p_cat,-1,1); 10 END cat; 11 12 END pkg_plsql_fbi; 13 / Package body created. SQL> SQL>DROP INDEX plsql_fbi_ix1 2 / Index dropped. SQL> SQL>CREATE INDEX plsql_fbi_ix1 2 ON plsql_fbi (pkg_plsql_fbi.cat(cat)) 3 / Index created. SQL> SQL>SELECT * 2 FROM plsql_fbi 3 WHERE pkg_plsql_fbi.cat(cat) = 'C' 4 ORDER BY 5 cat 6 / ID CAT ---------- ---------- 200 ACDC 18 FSAHXSSXCC 68 IEMPRBPTCC 56 JGDDOSKQYC SQL> SQL>DROP PACKAGE BODY pkg_plsql_fbi 2 / Package body dropped. SQL> SQL>SELECT * 2 FROM plsql_fbi 3 WHERE pkg_plsql_fbi.cat(cat) = 'C' 4 ORDER BY 5 cat 6 / ID CAT ---------- ---------- 200 ACDC 18 FSAHXSSXCC 68 IEMPRBPTCC 56 JGDDOSKQYC
So long as the package specification remain valid then we can continue to run SELECT queries. This makes sense as Oracle is not actually executing the query but using the pre-calculated values that exist in the index. Naturally, if we tell Oracle to ignore our index and force it to execute the function then we get:
SQL>SELECT /*+ FULL (plsql_fbi) */ 2 * 3 FROM plsql_fbi 4 WHERE pkg_plsql_fbi.cat(cat) = 'C' 5 ORDER BY 6 cat 7 / SELECT /*+ FULL (plsql_fbi) */ * ERROR at line 1: ORA-04067: not executed, package body "DEVELOPER.PKG_PLSQL_FBI" does not exist ORA-06508: PL/SQL: could not find program unit being called: "DEVELOPER.PKG_PLSQL_FBI" ORA-06512: at line 1
Likewise, if we attempt to modify the data we will generate an error as Oracle has to actually execute the function in order to update the index:
SQL>INSERT INTO plsql_fbi 2 VALUES (203,'ABCD') 3 / INSERT INTO plsql_fbi * ERROR at line 1: ORA-04067: not executed, package body "DEVELOPER.PKG_PLSQL_FBI" does not exist ORA-06508: PL/SQL: could not find program unit being called: "DEVELOPER.PKG_PLSQL_FBI" ORA-06512: at line 1
So, when using PL/SQL function-based indexes the things to keep in mind are:
- Ensure the function you use is deterministic. Despite the DETERMINISTIC keyword it is up to you to ensure this is true.
- If you modify the function used by the function-based index then you will need to rebuild the index. I would go sofar as to say you should be disabling the index prior to making any change to the function.