概述
当where子句后指定了一个表的多个列条件时,优化器通常会将多个列的选择性(selectivity)相乘得到where语句的选择性,导致优化器做出错误判断!
Oracle 11g引入了多列统计信息概念,如果上面情况列关联性很好,我们可以做多列统计信息收集,让优化器做出正确判断。
例如:
SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA';
COUNT(*) ---------- 3341
SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;
COUNT(*) ---------- 3341 SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52775;
COUNT(*) ---------- 0 |
可以看出表customers的列CUST_STATE_PROVINCE、country_id是有关系的,只有多列统计信息才能让优化器知道他们的关系,从而有一个更精确的选择性(selectiveity)。
1、创建Column Group:
DECLARE cg_name varchar2(30); BEGIN cg_name := dbms_stats.create_extended_stats(null,'customers', '(cust_state_province,country_id)'); END; / |
2、查看column group name:
select sys.dbms_stats.show_extended_stats_name('sh','customers', '(cust_state_province,country_id)') col_group_name from dual;
COL_GROUP_NAME -------------------------------------------------------------------------------- SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ |
3、删除Column Group
exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)'); |
4、监控Column Groups
--查询多列统计信息 Select extension_name, extension from user_stat_extensions where table_name='CUSTOMERS';
EXTENSION_NAME EXTENSION ------------------------------ ---------------------------------------- SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE","COUNTRY_ID")
--查看distinct数和柱状图使用情况 select e.extension col_group, t.num_distinct, t.histogram from user_stat_extensions e, user_tab_col_statistics t where e.extension_name = t.column_name and e.table_name = t.table_name and t.table_name = 'CUSTOMERS'; COL_GROUP NUM_DISTINCT HISTOGRAM ---------------------------------------- ------------ --------------- ("CUST_STATE_PROVINCE","COUNTRY_ID") 145 FREQUENCY |
5、收集多列统计信息
1)收集已存在的列组统计信息
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE AUTO'); |
2)收集新指定的列组统计信息
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY'); |
实验:
1)当不使用多列统计信息时,真实结果是3341,执行计划是1107.
SQL> exec dbms_stats.drop_extended_stats('sh','customers','(cust_state_province,country_id)');
PL/SQL procedure successfully completed.
SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;
COUNT(*) ---------- 3341
Execution Plan ---------------------------------------------------------- Plan hash value: 296924608
-------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 16 | 406 (1)| 00:00:05 | | 1 | SORT AGGREGATE | | 1 | 16 | | | |* 2 | TABLE ACCESS FULL| CUSTOMERS | 1107 | 17712 | 406 (1)| 00:00:05 | --------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)
Statistics ---------------------------------------------------------- 120 recursive calls 0 db block gets 1588 consistent gets 1454 physical reads 0 redo size 423 bytes sent via SQL*Net to client 420 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 7 sorts (memory) 0 sorts (disk) 1 rows processed |
2)当使用多列统计信息时,真实结果是3341,执行计划是3294.
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');
PL/SQL procedure successfully completed.
SQL> SQL> select count(*) from sh.customers where CUST_STATE_PROVINCE = 'CA' and country_id=52790;
COUNT(*) ---------- 3341
Execution Plan ---------------------------------------------------------- Plan hash value: 296924608
-------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 16 | 406 (1)| 00:00:05 | | 1 | SORT AGGREGATE | | 1 | 16 | | | |* 2 | TABLE ACCESS FULL| CUSTOMERS | 3294 | 52704 | 406 (1)| 00:00:05 | --------------------------------------------------------------------------------
Predicate Information (identified by operation id): ---------------------------------------------------
2 - filter("CUST_STATE_PROVINCE"='CA' AND "COUNTRY_ID"=52790)
Statistics ---------------------------------------------------------- 8 recursive calls 0 db block gets 1458 consistent gets 1454 physical reads 0 redo size 423 bytes sent via SQL*Net to client 420 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed |
3)即以上情况,使用多列统计信息能让优化器得到更准确的判断!
出处:http://blog.itpub.net/27126919/viewspace-1655727/
最后
以上就是哭泣泥猴桃为你收集整理的Oracle 11g新特性:多列统计信息(MultiColumn Statistics)的全部内容,希望文章能够帮你解决Oracle 11g新特性:多列统计信息(MultiColumn Statistics)所遇到的程序开发问题。
如果觉得靠谱客网站的内容还不错,欢迎将靠谱客网站推荐给程序员好友。
发表评论 取消回复