Oracle的USE_NL的说明
之前一直都不是很清楚这个怎么在Nest Loop中指定驱动表(外部表) 和 内部表,经过了一些实验对这一部分有了一定的认识: 首先对于nest loop有两种机制: 内部表全表扫描和内部表索引扫描,9i开始对于内部表索引扫描的情况有存在两种方式: 内部表索引range scan & 内部表索引uniqu scan, 索引range scan这种方式可以减少大的嵌套循环连接中的逻辑IO,减少锁存,甚至物理IO, 这个过程称为Table Prefetching. 另外对于nest loop, 驱动表和内部表位置如下:
两个表全表扫描 : NESTED LOOPS TABLE ACCESS (FULL) OF ‘Driving table’ TABLE ACCESS (FULL) OF ‘Inner table’
Inner 表索引扫描: NESTED LOOPS TABLE ACCESS (FULL) OF ‘Driving table’ INDEX RANGE SCAN OF ‘Index of inner table’
问题一: 如果通过hint指定Nest loop的驱动表? 1) /*+ ORDERED USE_NL(a b) */ 注意这时并非a 为驱动表, b为内部表,ORDERED和Where子句的顺序相关:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 drop table tmpa; drop table tmpb; create table tmpa as select rownum a, object_id b from DBA_objects where rownum<10000; create table tmpb as select rownum a, object_id b from user_objects where rownum<10; kl@k02> exec dbms_stats.gather_table_stats('kl','TMPB'); PL/SQL procedure successfully completed. kl@k02> exec dbms_stats.gather_table_stats('kl','TMPB'); PL/SQL procedure successfully completed. set autotrace traceonly exp stat
该顺序可以看出驱动表TMPB是个小表, TMPA是个大表:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 kl@k02> select /*+ use_nl(a b) */ * from tmpa a,tmpb b where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=38 Card=9 Bytes=144) 1 0 NESTED LOOPS (Cost=38 Card=9 Bytes=144) 2 1 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=9 Bytes=72) 3 1 TABLE ACCESS (FULL) OF 'TMPA' (Cost=4 Card=1 Bytes=8) ----use_nl(a b) kl@k02> select /*+ ordered use_nl(a b) */ * from tmpa a,tmpb b where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=20002 Card=9 Bytes=144) 1 0 NESTED LOOPS (Cost=20002 Card=9 Bytes=144) 2 1 TABLE ACCESS (FULL) OF 'TMPA' (Cost=4 Card=9999 Bytes=79992) 3 1 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=1 Bytes=8) ----顺序并没有变 kl@k02> select /*+ ordered use_nl(b a) */ * from tmpa a,tmpb b where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=20002 Card=9 Bytes=144) 1 0 NESTED LOOPS (Cost=20002 Card=9 Bytes=144) 2 1 TABLE ACCESS (FULL) OF 'TMPA' (Cost=4 Card=9999 Bytes=79992) 3 1 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=1 Bytes=8) ---- 看看顺序实际是安from子句的顺序 kl@k02> select /*+ ordered use_nl(b a) */ * from tmpb b,tmpa a where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=38 Card=9 Bytes=144) 1 0 NESTED LOOPS (Cost=38 Card=9 Bytes=144) 2 1 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=9 Bytes=72) 3 1 TABLE ACCESS (FULL) OF 'TMPA' (Cost=4 Card=1 Bytes=8) ---- Leading可以选择inner table,这个地方我之前一直不清楚, 目前看来Leading的表并非驱动表,而是内部表 : kl@k02> select /*+ leading(b) use_nl(a) */ * from tmpb a,tmpa b where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=20002 Card=9 Bytes=144) 1 0 NESTED LOOPS (Cost=20002 Card=9 Bytes=144) 2 1 TABLE ACCESS (FULL) OF 'TMPA' (Cost=4 Card=9999 Bytes=79992) --驱动表 3 1 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=1 Bytes=8) kl@k02> select /*+ leading(a) use_nl(b) */ * from tmpb a,tmpa b where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=38 Card=9 Bytes=144) 1 0 NESTED LOOPS (Cost=38 Card=9 Bytes=144) 2 1 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=9 Bytes=72) ---驱动表 3 1 TABLE ACCESS (FULL) OF 'TMPA' (Cost=4 Card=1 Bytes=8)
问题二: 如果没有统计信息,会怎样?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 kl@k02> exec dbms_stats.delete_table_stats('kl','TMPA'); PL/SQL procedure successfully completed. kl@k02> exec dbms_stats.delete_table_stats('kl','TMPB'); PL/SQL procedure successfully completed. kl@k02> select /*+ ordered use_nl(a b) */ * from tmpb a,tmpa b where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1310 Card=1879 Bytes=97708) 1 0 NESTED LOOPS (Cost=1310 Card=1879 Bytes=97708) 2 1 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=327 Bytes=8502) 3 1 TABLE ACCESS (FULL) OF 'TMPA' (Cost=4 Card=6 Bytes=156) kl@k02> select /*+ ordered use_nl(a b) */ * from tmpb b,tmpa a where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1310 Card=1879 Bytes=97708) 1 0 NESTED LOOPS (Cost=1310 Card=1879 Bytes=97708) 2 1 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=327 Bytes=8502) 3 1 TABLE ACCESS (FULL) OF 'TMPA' (Cost=4 Card=6 Bytes=156) ---- 可见如果没有统计信息,执行计划都是一样的;
问题三: 如果有索引,内部表作index range scan
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 kl@k02> create index b_ind on tmpa (a); Index created. kl@k02> exec dbms_stats.gather_table_stats('kl','TMPA'); PL/SQL procedure successfully completed. kl@k02> exec dbms_stats.gather_table_stats('kl','TMPB'); PL/SQL procedure successfully completed. kl@k02> select /*+ USE_NL(a b) */ * from tmpb b,tmpa a where a.a=b.a; Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=20 Card=9 Bytes=144) 1 0 TABLE ACCESS (BY INDEX ROWID) OF 'TMPA' (Cost=2 Card=1 Bytes=8) 2 1 NESTED LOOPS (Cost=20 Card=9 Bytes=144) 3 2 TABLE ACCESS (FULL) OF 'TMPB' (Cost=2 Card=9 Bytes=72) 4 2 INDEX (RANGE SCAN) OF 'B_IND' (NON-UNIQUE) (Cost=1 Card=1) (看到index range scan,并非card=1的表是驱动表。) 综上可以理解什么驱动表,什么是内部表, 另外怎么指定驱动表,ordered hint指定from 后先出现的表为驱动表, leading hint 指定内部表。对于 HASH_JOIN也是类似, HASH JOIN TABLE ACCESS (FULL) OF 'TMPB' ---TMPB是HASH表 TABLE ACCESS (FULL) OF 'TMPA' ---TMPA是Prune表。
另一个结论
在使用use_nl时,最好和ordered同时使用,这时在from子句中排列在前的表为驱动表; 或者不用ordered,但在use_nl中将进行嵌套循环连接的两表列出,例如use_nl(e,d),两表的 顺序无关紧要,系统自动选择小表作为驱动表。