CBO ... So much words has been said for Oracle most curious technology today and still most of us know very little about it.
Until now, I was pretty sure that any SQL that was parsed without errors cannot be wrong at run time... To be more clear I was sure that SQL result may not depend on physical data in tables.
But I was wrong! Following example came from real live problem on 10g database.
The problem
I wanted to write a script which they can query TEMP usage through SQLPlus as ordinary sql query. After some time I managed to get satisfactory query.I run it as an test in my environment:
SQL> select 2 s.username, 3 s.sid, s.serial#, 4 round(((su.blocks*pa.value)/1024/1024),2) size_mb, 5 s.status, 6 su.segfile#, 7 su.segblk# 8 from v$session s, 9 v$sort_usage su, 10 v$process pr, 11 v$parameter pa 12 where 13 s.saddr = su.session_addr 14 and s.paddr = pr.addr 15 and pa.name='db_block_size' 16 and ((nvl(su.blocks,0)*pa.value)/1024/1024) > 0 17 order by size_mb; no rows selected SQL>Result "no rows selected" force me to occupy some TEMP space to test properly. So I run in other SQLPlus session script which do exactly that-occupy TEMP space:
SQL> select * from dba_source order by text desc;After some time (when previous script start to output data) I run again mine TEMP script and get an error:
SQL> select 2 s.username, 3 s.sid, s.serial#, 4 round(((su.blocks*pa.value)/1024/1024),2) size_mb, 5 s.status, 6 su.segfile#, 7 su.segblk# 8 from v$session s, 9 v$sort_usage su, 10 v$process pr, 11 v$parameter pa 12 where 13 s.saddr = su.session_addr 14 and s.paddr = pr.addr 15 and pa.name='db_block_size' 16 and ((nvl(su.blocks,0)*pa.value)/1024/1024) > 0 17 order by size_mb; and ((nvl(su.blocks,0)*pa.value)/1024/1024) > 0 * ERROR at line 16: ORA-01722: invalid number SQL>I was surprised with an error.
Firstly I have rechecked return value from v$parameter query:
SQL> select value 2 from v$parameter pa 3 where pa.name='db_block_size'; VALUE -------------------------------- 8192 SQL>
Then I retest that returned value is correct (it is a number):
SQL> select value * 10 2 from v$parameter pa 3 where pa.name='db_block_size'; VALUE*10 ---------- 81920 SQL>
Then I tried to place to_number() to explicitly cast problematic part-no success!
Then I thought that Oracle database statistic (data dictionary) was too old so I run it manually (dbms_stats.gather_database_stats)-no success!
After that I run explain plan to see what is going on with the query:
SQL> explain plan for 2 select 3 s.username, 4 s.sid, s.serial#, 5 round(((su.blocks*pa.value)/1024/1024),2) size_mb, 6 s.status, 7 su.segfile#, 8 su.segblk# 9 from v$session s, 10 v$sort_usage su, 11 v$process pr, 12 v$parameter pa 13 where 14 s.saddr = su.session_addr 15 and s.paddr = pr.addr 16 and pa.name='db_block_size' 17 and ((nvl(su.blocks,0)*pa.value)/1024/1024) > 0 18 order by size_mb; Explained. SQL> set lines 2000 SQL> set head off SQL> select plan_table_output from table(dbms_xplan.display('PLAN_TABLE',null,'advanced')); -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 2461 | 3 (100)| 00:00:01 | | 1 | SORT ORDER BY | | 1 | 2461 | 3 (100)| 00:00:01 | | 2 | NESTED LOOPS | | 1 | 2461 | 2 (100)| 00:00:01 | | 3 | NESTED LOOPS | | 1 | 2448 | 2 (100)| 00:00:01 | |* 4 | HASH JOIN | | 1 | 2422 | 2 (100)| 00:00:01 | |* 5 | HASH JOIN | | 1 | 2390 | 2 (100)| 00:00:01 | | 6 | NESTED LOOPS | | 1 | 2283 | 1 (100)| 00:00:01 | | 7 | NESTED LOOPS | | 1 | 2270 | 1 (100)| 00:00:01 | |* 8 | HASH JOIN | | 1 | 2244 | 1 (100)| 00:00:01 | | 9 | NESTED LOOPS | | 1 | 2176 | 1 (100)| 00:00:01 | |* 10 | HASH JOIN | | 1 | 142 | 1 (100)| 00:00:01 | |* 11 | FIXED TABLE FULL | X$KSUSE | 1 | 71 | 0 (0)| 00:00:01 | |* 12 | FIXED TABLE FULL | X$KTSSO | 1 | 71 | 0 (0)| 00:00:01 | |* 13 | FIXED TABLE FULL | X$KSPPCV | 5 | 10170 | 0 (0)| 00:00:01 | |* 14 | FIXED TABLE FULL | X$KSPPI | 1 | 68 | 0 (0)| 00:00:01 | |* 15 | FIXED TABLE FIXED INDEX| X$KSLWT (ind:1) | 1 | 26 | 0 (0)| 00:00:01 | |* 16 | FIXED TABLE FIXED INDEX | X$KSLED (ind:2) | 1 | 13 | 0 (0)| 00:00:01 | |* 17 | FIXED TABLE FULL | X$KSUSE | 1 | 107 | 0 (0)| 00:00:01 | |* 18 | FIXED TABLE FULL | X$KSUPR | 1 | 32 | 0 (0)| 00:00:01 | |* 19 | FIXED TABLE FIXED INDEX | X$KSLWT (ind:1) | 1 | 26 | 0 (0)| 00:00:01 | |* 20 | FIXED TABLE FIXED INDEX | X$KSLED (ind:2) | 1 | 13 | 0 (0)| 00:00:01 |As it is easy to see, the problem rely in view v$parameter and column value which has mixed values (numbers as well as varchars). So when Oracle's CBO perform Full Table Scan (FTS) (regardless value I was retrieving was pure number) it breaks.
The whole explain plan output can be downloaded from one file .
Solution
Solution was relatively easy when we know the cause. So I'll show one most acceptable way. Placing hint will always prevent Oracle's CBO to go wrong-to make FTS!SQL> select /*+ rule */ 2 s.username, 3 s.sid, s.serial#, 4 round(((su.blocks*pa.value)/1024/1024),2) size_mb, 5 s.status, 6 su.segfile#, 7 su.segblk# 8 from v$session s, 9 v$sort_usage su, 10 v$process pr, 11 v$parameter pa 12 where 13 s.saddr = su.session_addr 14 and s.paddr = pr.addr 15 and pa.name='db_block_size' 16 and ((nvl(su.blocks,0)*pa.value)/1024/1024) > 0 17 order by size_mb; USERNAME SID SERIAL# SIZE_MB STATUS SEGFILE# SEGBLK# -------------- --------- ---------- ---------- -------- ---------- ---------- TAB 307 905 226 INACTIVE 81 60681 SQL>The same could be done with rule FIRST_ROWS.
To see that realy CBO plan changed, let us look into it:
SQL> explain plan for 2 select /*+ rule */ 3 s.username, 4 s.sid, s.serial#, 5 round(((su.blocks*pa.value)/1024/1024),2) size_mb, 6 s.status, 7 su.segfile#, 8 su.segblk# 9 from v$session s, 10 v$sort_usage su, 11 v$process pr, 12 v$parameter pa 13 where 14 s.saddr = su.session_addr 15 and s.paddr = pr.addr 16 and pa.name='db_block_size' 17 and ((nvl(su.blocks,0)*pa.value)/1024/1024) > 0 18 order by size_mb; Explained. SQL> set lines 2000 SQL> set head off SQL> select plan_table_output from table(dbms_xplan.display('PLAN_TABLE',null,'advanced')); -------------------------------------------------------- | Id | Operation | Name | -------------------------------------------------------- | 0 | SELECT STATEMENT | | | 1 | SORT ORDER BY | | | 2 | MERGE JOIN | | | 3 | SORT JOIN | | | 4 | MERGE JOIN | | | 5 | SORT JOIN | | | 6 | MERGE JOIN | | | 7 | SORT JOIN | | | 8 | MERGE JOIN | | | 9 | SORT JOIN | | | 10 | MERGE JOIN | | | 11 | SORT JOIN | | | 12 | MERGE JOIN | | | 13 | SORT JOIN | | | 14 | MERGE JOIN | | | 15 | SORT JOIN | | | 16 | NESTED LOOPS | | | 17 | MERGE JOIN | | | 18 | FIXED TABLE FULL | X$KSPPCV | |* 19 | FILTER | | |* 20 | SORT JOIN | | |* 21 | FIXED TABLE FULL| X$KSPPI | |* 22 | FIXED TABLE FULL | X$KTSSO | |* 23 | SORT JOIN | | |* 24 | FIXED TABLE FULL | X$KSUSE | |* 25 | SORT JOIN | | | 26 | FIXED TABLE FULL | X$KSLWT | |* 27 | SORT JOIN | | | 28 | FIXED TABLE FULL | X$KSLED | |* 29 | SORT JOIN | | |* 30 | FIXED TABLE FULL | X$KSUSE | |* 31 | SORT JOIN | | |* 32 | FIXED TABLE FULL | X$KSUPR | |* 33 | SORT JOIN | | | 34 | FIXED TABLE FULL | X$KSLWT | |* 35 | SORT JOIN | | | 36 | FIXED TABLE FULL | X$KSLED | --------------------------------------------------------The whole explain plan output can be downloaded from one file
The conclusion
This problem I have reproduced on 10g database as well as 11g (32/64 bit Win/Lin versions) with default init ora parameter optimizer_mode defined as ALL_ROWS.Because "bug" was too obvious finally I search on Metalink and found exactly the same case: "Bug 6263716: QUERY WORKS FOR ORACLE 9I BUT IT DOESN'T WORK FOR ORACLE 10G-ORA-1722" but it is closed as not a bug but expected behaviour.
So, be careful on tables with mixed values in one column-they might reproduce an unexpected error at run time ... and this is expected behaviour...
;-)
Cheers!