0. 创建测试环境

pro 请输入表的记录数
define tabcnt=&tabcnt
pro 请输入Distinct值的数量
define discnt=&discnt

-- Create table
drop table alex_t03 purge;
create table alex_t03 (
id number, 
col1 number, 
name varchar2(100)
);

-- Create/Recreate primary, unique and foreign key constraints 
alter table alex_t03 add constraint pk_alex_t03 primary key (id) using index;
create index idx_alex_t03_col1 on alex_t03 (col1);

-- Insert table
declare
seq number:=1;
begin
for i in 1..&tabcnt loop
insert into alex_t03 values (i, seq, 'alex');
if mod(i,trunc(&tabcnt/&discnt))=0 then 
seq:=seq+1;
end if;
end loop;
commit;
end;
/

--Gather Statistics
exec dbms_stats.gather_table_stats('dbmgr','alex_t03',method_opt=>'FOR ALL COLUMNS SIZE 1')
exec dbms_stats.gather_index_stats('dbmgr','pk_alex_t03')
exec dbms_stats.gather_index_stats('dbmgr','idx_alex_t03_col1')

1. 查询测试表的数据相关信息
(1)测试表字段描述

SQL> desc alex_t03
 名称                                      是否为空? 类型
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL NUMBER
 COL1                                               NUMBER
 NAME                                               VARCHAR2(100)

(2)查看数据分布情况

SQL> select count(*) tcount, count(distinct col1) coldc_col1 from alex_t03;
    TCOUNT COLDC_COL1
---------- ----------
    100000          5

(3)索引理想选择度(索引理想选择度 = 1 / 索引列的distinct个数)

SQL> select trunc((1/count(distinct col1))*100,2)||'%' col_ratio from alex_t03;
COL_RATIO
--------------------
20%

(4)索引实际选择度(索引实际选择度 = 返回结果集行数 / 全部行数)

SQL> select trunc((select count(*) from alex_t03 where col1=1)/(select count(*) from alex_t03)*100,2)||'%' select_ratio from dual;
SELECT_RATIO
--------------------
20%

(5)查看索引情况

SQL> select table_name, index_name, column_name from user_ind_columns where table_name='ALEX_T03';
TABLE_NAME           INDEX_NAME           COLUMN_NAME
-------------------- -------------------- --------------------
ALEX_T03             PK_ALEX_T03          ID
ALEX_T03             IDX_ALEX_T03_COL1    COL1

2. 索引列查询

SQL> select * from alex_t03 where col1=1;
----------------------------------------------------------
Plan hash value: 4180973395
------------------------------------------------------------------------------
| Id  | Operation         | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |          | 19673 |   230K|    87   (5)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| ALEX_T03 | 19673 |   230K|    87   (5)| 00:00:02 |
------------------------------------------------------------------------------

上述查询过程,为什么会走索引,而不是走全表扫描呢?下面我们借助10053 trace工具来分析一下吧。

3. 分析10053 trace文件
(1)筛选trace文件的关键部分,考察一下CBO优化器是如何计算的

*****************************
SYSTEM STATISTICS INFORMATION
*****************************
  Using NOWORKLOAD Stats
  CPUSPEED: 509 millions instruction/sec
  IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
  IOSEEKTIM: 10 milliseconds (default is 10)
***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: ALEX_T03  Alias: ALEX_T03
    #Rows: 100945  #Blks:  370  AvgRowLen:  13.00
Index Stats::
  Index: IDX_ALEX_T03_COL1  Col#: 2
    LVLS: 1  #LB: 251  #DK: 1000  LB/K: 1.00  DB/K: 100.00  CLUF: 100000.00
***************************************
SINGLE TABLE ACCESS PATH
***********************
  Access Path: TableScan
    Cost:  87.07  Resp: 87.07  Degree: 0
      Cost_io: 83.00  Cost_cpu: 24844853
      Resp_io: 83.00  Resp_cpu: 24844853
  Access Path: index (AllEqRange)
    Index: IDX_ALEX_T03_COL1
    resc_io: 103.00  resc_cpu: 774138
    ix_sel: 0.0010115  ix_sel_with_filters: 0.0010115
    Cost: 103.13  Resp: 103.13  Degree: 1
  Best:: AccessPath: TableScan
         Cost: 87.07  Degree: 1  Resp: 87.07  Card: 101.15  Bytes: 0

(2)COST计算公式如下

sreadtime = ioseektim + db_block_size/iotfrspeed = 10+8192/4096 = 12
mreadtim = ioseektim + db_file_multiblock_read_count * db_block_size / iotrfspeed = 10+16*8192/4096 = 42
#MRds = #Blks/MBRC = 370/16

(3)计算全表扫描的COST开销

I/O Cost = 1 + ceil(#MRds * (mreadtim/sreadtime)) = 1+ceil(370/16*42/12) = 82
CPU Cost = round(#CPUCycles/cpuspeed/1000/sreadtim) = round(24844853/509/1000/12) = 4
Cost = I/O Cost + CPU Cost = 82+4=86

(4)计算索引扫描的COST开销

Index Access I/O Cost = LVLS + ceil(#LB * ix_sel) = 1+ceil(251*0.0010115) = 2
Table Access I/O Cost = ceil(CLUF * ix_sel_with_filters) = ceil(100000*0.0010115) = 102
I/O Cost = Index Access I/O Cost + Table Access I/O Cost = 2+102 =104
CPU Cost = round(#CPUCycles/cpuspeed/1000/sreadtim) = round(774138/509/1000/12) = 0
CBO Cost = I/O Cost + CPU Cost = 104+0 =104

4. 总结
综上所述,可以看到是否走索引,取决于CBO优化器的COST算法,CBO算法又取决于数据的分布情况。
本文同时提出两个概念:

索引理想选择度 = 1 / 索引列的distinct个数 
索引实际选择度 = 返回结果集行数 / 全部行数

本文实例中,数据分布是均匀的理想的,故两者相等,CBO在缺少直方图统计信息的前提上,默认数据均匀分布的,即走理想选择度的。
如果分布不均匀,又缺少直方图统计信息,则两者就不等了,CBO就可能计算错误,接下来的BLOG再展开。

Trackback

no comment untill now

Add your comment now

切换到手机版