One thing guaranteed to draw my ire in a code review is the use of COUNT(*) to test for the existence of data. This can be done in a number of ways. For example, a simple count via a SELECT statement and then checking if the result was 1 or greater:
... SELECT COUNT(*) INTO l_rows_exist FROM some_table WHERE ... blah,blah,blah; IF (l_rows_exist >= 1) THEN ... blah,blah,blah; END IF;
or as some sort of logical sub-query construct inside a query:
SELECT ... blah,blah,blah FROM some_table WHERE (SELECT COUNT(*) FROM some_other_table WHERE ... blah,blah,blah) >= 1
Unless there’s some constraint involved that will guarantee that the query will return at the most one row counting all entries to simply detect if one entry exists is asking for trouble. Oracle will cheerfully table scan a billion row table looking for all rows that satisfy the query even after it found a matching entry in the first block examined.
Let’s see that in action using a 1,000,000 row table, using AUTOTRACE to obtain some statistics:
SQL>CREATE TABLE lots_of_rows 2 (id NUMBER(10) NOT NULL 3 ,cat NUMBER(2) NOT NULL 4 ,padding CHAR(30) NOT NULL) 5 / Table created. SQL>-- generate 1,000,000 rows SQL>INSERT INTO lots_of_rows 2 WITH row_src AS 3 (SELECT ROWNUM r FROM dual CONNECT BY ROWNUM <= 1000) 4 SELECT ROWNUM 5 , MOD(ROWNUM,100) 6 , 'X' 7 FROM row_src r2 8 , row_src r2 9 / 1000000 rows created. SQL>COMMIT 2 / Commit complete. SQL>EXEC dbms_stats.gather_table_stats ('','lots_of_rows') PL/SQL procedure successfully completed. SQL>SET AUTOTRACE TRACEONLY SQL>SELECT 'X' 2 FROM dual 3 WHERE (SELECT COUNT(*) 4 FROM lots_of_rows 5 WHERE cat = 42) >= 1 6 / Execution Plan ---------------------------------------------------------- Plan hash value: 3376271402 ------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 1652 (1)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | 3 | SORT AGGREGATE | | 1 | 3 | | | |* 4 | TABLE ACCESS FULL| LOTS_OF_ROWS | 10000 | 30000 | 1650 (1)| 00:00:01 | ------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( (SELECT COUNT(*) FROM "LOTS_OF_ROWS" "LOTS_OF_ROWS" WHERE "CAT"=42)>=1) 4 - filter("CAT"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 6069 consistent gets 0 physical reads 0 redo size 536 bytes sent via SQL*Net to client 544 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
The execution plan above clearly shows a full table scan and the consistent gets statistic would confirm that the entire table was indeed scanned. The solution to this situation is trivial: rewrite the query to use EXISTS. In our example above, this would be to use:
SELECT ... blah,blah,blah FROM some_table WHERE EXISTS (SELECT COUNT(*) FROM some_other_table WHERE ... blah,blah,blah)
The impact of this simple change can be seen using AUTOTRACE:
SQL>SELECT 'X' 2 FROM dual 3 WHERE EXISTS(SELECT 'X' 4 FROM lots_of_rows 5 WHERE cat = 42) 6 / Execution Plan ---------------------------------------------------------- Plan hash value: 4135382906 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 4 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| LOTS_OF_ROWS | 2 | 6 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( EXISTS (SELECT 0 FROM "LOTS_OF_ROWS" "LOTS_OF_ROWS" WHERE "CAT"=42)) 3 - filter("CAT"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 536 bytes sent via SQL*Net to client 544 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
While the execution plan still shows a full table scan the consistent gets has dropped to just 3. This would have been the number that Oracle performed before it found the fist “42” entry and stopped scanning the table. The query also has be advantage of being better documented as EXISTS describes perfectly what if being sought.
However, perhaps my ire at the use of COUNT(*) is somewhat misplaced. The Oracle optimiser is capable of detecting subqueries in the form of:
(SELECT COUNT(*) FROM ....) > 0
and automatically translate it into the equivalent EXISTS form:
EXISTS (SELECT 'X' FROM ....)
We can see this in action using our million row table:
SQL>SELECT 'X' 2 FROM dual 3 WHERE (SELECT COUNT(*) 4 FROM lots_of_rows 5 WHERE cat = 42) > 0 6 / Execution Plan ---------------------------------------------------------- Plan hash value: 4135382906 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 4 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| LOTS_OF_ROWS | 2 | 6 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( EXISTS (SELECT 0 FROM "LOTS_OF_ROWS" "LOTS_OF_ROWS" WHERE "CAT"=42)) 3 - filter("CAT"=42) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 536 bytes sent via SQL*Net to client 544 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
So, while we wrote the query to use SELECT COUNT(*) the predicate information section in the above AUTOTRACE output shows that Oracle actually used EXISTS. Note, this rewrite only occurs if the test is “> 0”. The rewrite will not take place using a test of “>= 1”, which is what the original example above used.
I’m not sure what version of Oracle that this automatic rewrite appeared. It certainly works in 11.2 and above. It’s a nice feature but I would still argue that it would not be good practice to rely of this kind of thing and to explicitly write the query using EXISTS.