count(*)用不到索引

正常情况下,count(*)效率最高时一定是用到索引,因为count(*)并不返回任何一个列,只需返回一个总记录数即可,这种情况由于无需回表,所以索引可看成一个瘦表,这样同样扫描全部,全扫描小的多的索引块能回答记录数,当然效率更高。
生产中,不少应用是有加count(*)代码,要对此引起重视,测试如下:

Connected to Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 
Connected as ljb
SQL>  drop table test;
Table dropped

建测试表和索引

SQL> create table test as select * from dba_objects where object_id is not null;
Table created
SQL> create index idx_test on test (object_id);
Index created
SQL> analyze table test compute statistics for table for all indexes for all indexed columns;
Table analyzed
SQL> explain  plan for select count(*) from test ;
Explained

发现该查询根本没有用到索引

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
Plan hash value: 1950795681
--------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |   201   (2)| 00:00:03 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| TEST | 62640 |   201   (2)| 00:00:03 |
--------------------------------------------------------------------------------------------------
9 rows selected

那如果加上提示,让系统强制使用索引可以吗?

SQL> explain  plan for select /*+index(a,idx_test)*/ count(*) from test a ;
Explained

发现加上hint后,根本起不了作用,系统仍然走全表扫描

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
Plan hash value: 1950795681
---------------------------------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |   201   (2)| 00:00:03 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| TEST | 62640 |   201   (2)| 00:00:03 |
---------------------------------------------------------------------------------------------------
9 rows selected

问题出在哪呢?原来真正的原因是索引不存储空值,这是非常重要的知识点!如果数据库记录中索引列有空值,那索引就不会存储空的那个列的信息,这样,如果想count(*)从索引这个“瘦表”回答记录数问题就可能出错,因为有可能统计少了,oracle当然无论如何都不会走索引,即便你加上hint强制执行,它都不答应,它不能冒统计错误的风险。
那怎么办呢,那试着给object_id 加上is not null,明确告诉ORACLE该索引列值不会为空,看看能否走索引

SQL> explain  plan for select  count(*) from test where object_id is not null ;
Explained

这下真走索引了!

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------
Plan hash value: 1190062564
-------------------------------------------------------------------------------------------------------------
| Id  | Operation        | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |          |     1 |     4 |    14   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |          |     1 |     4 |            |          |
|*  2 |   INDEX FULL SCAN| IDX_TEST | 62640 |   244K|    14   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("OBJECT_ID" IS NOT NULL)
14 rows selected

当然,在代码中加where object_id is not null是一个方法,最好的方法就是,如果该列确实不允许为空,就把它的索引改为主键,主键列不允许为空,这点ORACLE它明白!

SQL> drop index idx_test;
Index dropped
SQL> alter table test add constraint test_pk primary key (object_id);
Table altered
SQL> analyze table test compute statistics for table for all indexes for all indexed columns;
Table analyzed
SQL> explain  plan for select  count(*) from test ;
Explained

再看加上主键后的表的执行计划,select count(*) from test 也走索引了

SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
Plan hash value: 3830891886
---------------------------------------------------------------------------------------------------
| Id  | Operation        | Name    | Rows  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |         |     1 |    13   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |         |     1 |            |          |
|   2 |   INDEX FULL SCAN| TEST_PK | 62640 |    13   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
9 rows selected

总结:通过此例可以看出,应用中如果某列不允许为空,并且应用中有大量对该表count(*)的情况,最好选择主键,或者指定该列is not null,这样就可以让count(*)用上索引,提高效率。

Trackback

no comment untill now

Add your comment now

切换到手机版