1. 创建测试表和两个复合索引,idx_alex_t04_ab前导列区分度低,idx_alex_t04_cd前导列区分度高。
-- Create table drop table alex_t04 purge; create table alex_t04 (id number, a number, b number, c number, d number, name varchar2(100)); -- Create/Recreate primary, unique and foreign key constraints alter table alex_t04 add constraint pk_alex_t04 primary key (id) using index; create index idx_alex_t04_ab on alex_t04 (a, b); create index idx_alex_t04_cd on alex_t04 (c, d); -- Insert table declare begin for i in 1..100000 loop insert into alex_t04 values (i, mod(i,2), mod(i,20000), mod(i,20000), mod(i,20000), 'alex'); end loop; commit; end; / --Gather Statistics exec dbms_stats.gather_table_stats('dbmgr','alex_t04') exec dbms_stats.gather_index_stats('dbmgr','pk_alex_t04') exec dbms_stats.gather_index_stats('dbmgr','idx_alex_t04_ab') exec dbms_stats.gather_index_stats('dbmgr','idx_alex_t04_cd')
2. 查看表的相关信息
SQL> desc alex_t04 名称 是否为空? 类型 ----------------------------------------- -------- ---------------------------- ID NOT NULL NUMBER A NUMBER B NUMBER C NUMBER D NUMBER NAME VARCHAR2(100) SQL> select count(*) tab_cnt, count(distinct id) id_dist, count(distinct a) a_dist, count(distinct b) b_dist, count(distinct c) c_dist, count(distinct d) d_dist from alex_t04; TAB_CNT ID_DIST A_DIST B_DIST C_DIST D_DIST ---------- ---------- ---------- ---------- ---------- ---------- 100000 100000 2 20000 20000 20000 SQL> select table_name, index_name, column_name from user_ind_columns where table_name='ALEX_T04'; TABLE_NAME INDEX_NAME COLUMN_NAME -------------------- -------------------- -------------------- ALEX_T04 PK_ALEX_T04 ID ALEX_T04 IDX_ALEX_T04_AB A ALEX_T04 IDX_ALEX_T04_AB B ALEX_T04 IDX_ALEX_T04_CD C ALEX_T04 IDX_ALEX_T04_CD D
3. 接下来看一下低区分度前导列的复合索引上的查询行为吧。
############################ # 低区分度前导列的复合索引 ############################ SQL> select * from alex_t04 where a=1 and b=600; ---------------------------------------------------------- Plan hash value: 684560531 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 75 | 6 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| ALEX_T04 | 3 | 75 | 6 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_ALEX_T04_AB | 5 | | 1 (0)| 00:00:01 | -----------------------------------------------------------------------------------------------
查询条件覆盖了所有复合索引列,走了INDEX RANGE SCAN,达到较好的预期效果。
SQL> select * from alex_t04 where a=1; ---------------------------------------------------------- Plan hash value: 1888389458 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 50000 | 1220K| 115 (5)| 00:00:02 | |* 1 | TABLE ACCESS FULL| ALEX_T04 | 50000 | 1220K| 115 (5)| 00:00:02 | ------------------------------------------------------------------------------
查询条件覆盖复合索引前导列,由于区分度低,走了全表扫描。
SQL> select * from alex_t04 where b=600; ---------------------------------------------------------- Plan hash value: 1342997078 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 125 | 8 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| ALEX_T04 | 5 | 125 | 8 (0)| 00:00:01 | |* 2 | INDEX SKIP SCAN | IDX_ALEX_T04_AB | 5 | | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- SQL> select /*+ full(alex_t04) */ * from alex_t04 where b=600; ---------------------------------------------------------- Plan hash value: 1888389458 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 5 | 125 | 115 (5)| 00:00:02 | |* 1 | TABLE ACCESS FULL| ALEX_T04 | 5 | 125 | 115 (5)| 00:00:02 | ------------------------------------------------------------------------------
查询条件覆盖复合索引第二列,由于前导列区分度低,走INDEX SKIP SCAN成本更低,执行计划更优,强制走全表扫描,反而成本高。
4. 再来看一下高区分度前导列的复合索引上的查询行为吧。
############################ # 高区分度前导列的复合索引 ############################ SQL> select * from alex_t04 where c=1 and d=600; ---------------------------------------------------------- Plan hash value: 1785261929 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 25 | 6 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| ALEX_T04 | 1 | 25 | 6 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_ALEX_T04_CD | 5 | | 1 (0)| 00:00:01 | -----------------------------------------------------------------------------------------------
查询条件覆盖了所有复合索引列,同样走了INDEX RANGE SCAN,达到较好的预期效果。
SQL> select * from alex_t04 where c=1; ---------------------------------------------------------- Plan hash value: 1785261929 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 125 | 7 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| ALEX_T04 | 5 | 125 | 7 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_ALEX_T04_CD | 5 | | 2 (0)| 00:00:01 | -----------------------------------------------------------------------------------------------
查询条件覆盖复合索引前导列,由于区分度高,仍走了INDEX RANGE SCAN,达到较好的效果。
SQL> select * from alex_t04 where d=600; ---------------------------------------------------------- Plan hash value: 1888389458 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 5 | 125 | 115 (5)| 00:00:02 | |* 1 | TABLE ACCESS FULL| ALEX_T04 | 5 | 125 | 115 (5)| 00:00:02 | ------------------------------------------------------------------------------ SQL> select /*+ index_ss(alex_t04 idx_alex_t04_cd) */ * from alex_t04 where d=600; ---------------------------------------------------------- Plan hash value: 888274063 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 125 | 20029 (1)| 00:04:01 | | 1 | TABLE ACCESS BY INDEX ROWID| ALEX_T04 | 5 | 125 | 20029 (1)| 00:04:01 | |* 2 | INDEX SKIP SCAN | IDX_ALEX_T04_CD | 5 | | 20024 (1)| 00:04:01 | -----------------------------------------------------------------------------------------------
查询条件覆盖复合索引第二列,由于前导列区分度高,走全表扫描成本更低,执行计划更优,强制走INDEX SKIP SCAN,反而成本高。
5. 总结
(1)在复合索引的应用中,查询条件覆盖所有的索引列,查询效果较优;
(2)INDEX SKIP SCAN的索引扫描方式,只有在前导列区分度低的情况下,效果比较好,反之则是被优化项;
(3)一般来说,在复合索引创建的时候,前导列都建议选择区分度较高的,故INDEX SKIP SCAN往往就是被优化对象。
no comment untill now