Overview
Oracle's cost-based optimizer (CBO) uses statistics to calculate the selectivity of predicates (the fraction of rows in a table that a SQL statement's predicate selects) and to estimate the cost of each execution plan. The CBO then uses the selectivity of a predicate to estimate the cost of a particular access method and to determine the optimal join order.
Statistics quantify the data distribution and storage characteristics of tables, columns, indexes and partitions. The CBO uses these statistics to estimate how much I/O and memory are required to execute a SQL statement with a particular execution plan. Statistics are stored in the data dictionary, and they can be exported from one database and imported into another. For example, you might transfer production statistics to a test system to simulate the real environment, even though the test system holds only a small sample of the data.
To give the cost-based optimizer the most up-to-date information about schema objects (and the best chance of choosing a good execution plan), all application tables and indexes that will be accessed must be analyzed. New statistics should be gathered on schema objects whose statistics are out of date. Loading or deleting large amounts of data obviously changes the number of rows. Other changes, such as updating a large number of rows, do not affect the number of rows but may affect the average row length.
Statistics can be generated with the ANALYZE statement or with the DBMS_STATS package (introduced in Oracle8i). DBMS_STATS is a convenient tool for DBAs, but it manages only the statistics used by the CBO. The package lets the DBA create, modify, view and delete statistics through a standard, well-defined set of procedures. Statistics can be gathered on tables, indexes, columns, partitions and schemas, but note that DBMS_STATS does not generate statistics for clusters.
DBMS_STATS provides a mechanism for viewing and modifying the optimizer statistics gathered for database objects. The statistics can reside in two different locations:
- The data dictionary.
- A table created in the user's schema for this purpose.
Only statistics stored in the dictionary itself have an impact on the cost-based optimizer.
When you generate statistics for a table, column, or index and the data dictionary already contains statistics for the object, Oracle updates the existing statistics. Oracle also invalidates any currently parsed SQL statements that access the object. The next time such a statement executes, the optimizer automatically chooses a new execution plan based on the new statistics. Distributed statements issued on remote databases that access the analyzed objects use the new statistics the next time Oracle parses them.
When you associate a statistics type with a column or domain index, Oracle calls the statistics collection method in that statistics type whenever you analyze the column or domain index.
Missing statistics
When statistics do not exist on schema objects, the optimizer uses the following default values.
Tables | |
Statistic | Default Value Used by Optimizer |
Cardinality | 100 rows |
Avg. row len | 20 bytes |
No. of blocks | 100 |
Remote cardinality | 2000 rows |
Remote average row length | 100 bytes |
Indexes | |
Statistic | Default Value Used by Optimizer |
Levels | 1 |
Leaf blocks | 25 |
Leaf blocks/key | 1 |
Data blocks/key | 1 |
Distinct keys | 100 |
Clustering factor | 800 (8*no. of blocks) |
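To find the objects for which the optimizer would fall back on these defaults, one option is to look for rows with no analysis timestamp. The queries below are a minimal sketch that assumes you are connected as the owning schema and that an empty LAST_ANALYZED column is a good enough signal that no statistics exist.

```sql
-- Tables in the current schema that have never been analyzed.
SELECT table_name
FROM   user_tables
WHERE  last_analyzed IS NULL
ORDER  BY table_name;

-- Indexes in the current schema that have never been analyzed.
SELECT index_name, table_name
FROM   user_indexes
WHERE  last_analyzed IS NULL
ORDER  BY table_name, index_name;
```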
Analyze vs DBMS_STATS
The following is a quick overview of the two.
ANALYZE
- The only method available for collecting statistics in Oracle 8.0 and lower.
- Can only run serially.
- Cannot overwrite or delete certain types of statistics that were generated by DBMS_STATS.
- Calculates global statistics for partitioned tables and indexes instead of gathering them directly. For partitioned tables and indexes, ANALYZE gathers statistics for the individual partitions and then calculates the global statistics from the partition statistics. For composite partitioning, it gathers statistics for the subpartitions and then calculates the partition and global statistics from the subpartition statistics. This can lead to inaccuracies for some statistics, such as the number of distinct values.
- Can gather additional information that is not used by the optimizer, such as information about chained rows and the structural integrity of indexes, tables, and clusters. DBMS_STATS does not gather this information.
- Offers no easy way of knowing which tables have changed, or how much of their data has changed. The DBA would generally re-analyze all tables on a semi-regular basis.
DBMS_STATS
- Only available in Oracle 8i and higher.
- Statistics can be generated into a statistics table, exported and imported between databases, and re-loaded into the data dictionary at any time. This allows the DBA to experiment with various sets of statistics.
- Routines can run via parallel query or operate serially.
- Can gather statistics for partitions and subpartitions.
- Certain DDL commands (e.g. CREATE INDEX) automatically generate statistics, eliminating the need to gather statistics explicitly after the DDL command.
- Does not generate information about chained rows or the structural integrity of segments.
- The DBA can set a particular table, a whole schema, or the entire database to be monitored automatically for modifications. When monitoring is enabled, any change to a table (insert, update, delete, direct load, truncate, etc.) is tracked in the SGA. The SMON process writes this information to the data dictionary at a pre-set interval (every 3 hours in Oracle 8.1.x, every 15 minutes in Oracle 9i). The collected information can be seen in the DBA_TAB_MODIFICATIONS view.
- Oracle 9i introduced a new DBMS_STATS procedure, FLUSH_DATABASE_MONITORING_INFO, which the DBA can call to flush the monitoring data more frequently. Oracle 9i also calls this procedure automatically before DBMS_STATS gathers statistics. The procedure is not included with Oracle 8i.
- DBMS_STATS provides a more efficient, scalable solution for statistics gathering and should be used in preference to the traditional ANALYZE command, which does not support features such as parallelism and stale statistics collection.
- Table monitoring combined with DBMS_STATS stale-object statistics gathering is highly recommended for environments with large, random, or sporadic data changes. These features let the database determine which tables should be re-analyzed, rather than the DBA forcing statistics collection for all tables, including those that have not changed enough to merit a re-scan.
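The division of labour described above can be sketched as follows: ANALYZE for the structural checks that DBMS_STATS does not perform, and DBMS_STATS for optimizer statistics gathered in parallel. This is only an illustration under assumptions: the index name EMP_PK is hypothetical, the CHAINED_ROWS table is assumed to exist already (it is normally created with Oracle's utlchain.sql script), and the degree of 4 is an arbitrary choice.

```sql
-- Structural integrity check of an index (not used by the optimizer).
-- EMP_PK is a hypothetical index name.
ANALYZE INDEX emp_pk VALIDATE STRUCTURE;

-- Record chained/migrated rows; assumes the CHAINED_ROWS table already exists.
ANALYZE TABLE emp LIST CHAINED ROWS INTO chained_rows;

-- Optimizer statistics gathered with parallel query (degree is an arbitrary example value).
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'SCOTT',
    tabname => 'EMP',
    degree  => 4,
    cascade => TRUE);   -- also gather statistics on the table's indexes
END;
/
```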
What gets collected?
Table Statistics
Oracle collects the following statistics for a table. Statistics marked with an asterisk are always computed exactly. Table statistics, including the status of domain indexes, appear in the data dictionary views USER_TABLES, ALL_TABLES, and DBA_TABLES in the columns shown in parentheses.
Number of rows (NUM_ROWS)
* Number of data blocks below the high water mark, that is, the number of data blocks that have been formatted to receive data, regardless of whether they currently contain data or are empty (BLOCKS)
* Number of data blocks allocated to the table that have never been used, i.e. blocks above the high water mark (EMPTY_BLOCKS)
Average available free space in each data block, in bytes (AVG_SPACE)
Number of chained rows. [Not collected by DBMS_STATS] (CHAIN_CNT)
Average row length, including the row's overhead, in bytes (AVG_ROW_LEN)
Note on the high water mark (HWM): under manual segment space management (MSSM), Oracle manages the segment through freelists. When the segment runs out of space, Oracle allocates new blocks below the HWM, formats them, and places them on the freelist, so every block below the HWM is formatted even if it is not yet used. Under automatic segment space management (ASSM), newly allocated blocks below the HWM are not formatted until they are first used. This introduces a second marker, the low HWM: all blocks below the low HWM are formatted, while blocks between the low HWM and the HWM may or may not be. Once every block between the two is formatted, the low HWM moves up to the HWM.
Index Statistics
Oracle collects the following statistics for an index. Statistics marked with an asterisk are always computed exactly. For conventional indexes, the statistics appear in the data dictionary views USER_INDEXES, ALL_INDEXES, and DBA_INDEXES in the columns shown in parentheses.
* Depth of the index from its root block to its leaf blocks, counted from 0 (BLEVEL)
Number of leaf blocks (LEAF_BLOCKS)
Number of distinct index values (DISTINCT_KEYS)
Average number of leaf blocks per index value, usually 1 (AVG_LEAF_BLOCKS_PER_KEY)
Average number of data blocks per index value, for an index on a table, usually 1 (AVG_DATA_BLOCKS_PER_KEY)
Clustering factor, i.e. how well ordered the table rows are with respect to the indexed values (CLUSTERING_FACTOR)
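The index statistics listed above can be read side by side with the table statistics they are usually compared against. The query below is a sketch that assumes the EMP table in the current schema; it places CLUSTERING_FACTOR next to the table's BLOCKS and NUM_ROWS so the two extremes (well ordered versus randomly ordered) are easy to judge.

```sql
-- Index statistics for EMP, alongside the table's block and row counts.
-- A clustering factor close to TABLE_BLOCKS suggests well-ordered rows;
-- a value close to TABLE_ROWS suggests randomly ordered rows.
SELECT i.index_name,
       i.blevel,
       i.leaf_blocks,
       i.distinct_keys,
       i.clustering_factor,
       t.blocks   AS table_blocks,
       t.num_rows AS table_rows
FROM   user_indexes i,
       user_tables  t
WHERE  t.table_name = i.table_name
AND    i.table_name = 'EMP'
ORDER  BY i.index_name;
```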
Where are the statistics stored?
Statistics are stored in the Oracle data dictionary, in tables owned by SYS. Views are created on these tables to make the data easier to retrieve. These views are prefixed with DBA_, ALL_ or USER_. For ease of reading, we will use the DBA_% views, but the ALL_% or USER_% views could be used as well.
Conventions Used
- Statistics available only since 8.0.X rdbms release : (*)
- Statistics available only since 8.1.X rdbms release : (**)
- Statistics not available at partition or subpartition level : (G)
- Statistics not available at subpartition level : (GP)

Table level statistics can be retrieved from:
DBA_ALL_TABLES - (8.X onwards)
DBA_OBJECT_TABLES - (8.X onwards)
DBA_TABLES - (all versions)
DBA_TAB_PARTITIONS - (8.X onwards)
DBA_TAB_SUBPARTITIONS - (8.1 onwards)

Columns to look at are:
NUM_ROWS : Number of rows (always exact even when computed with ESTIMATE method)
BLOCKS : Number of blocks which have been used, even if they are empty due to delete statements
EMPTY_BLOCKS : Number of empty blocks (these blocks have never been used)
AVG_SPACE : Average amount of FREE space in bytes in blocks allocated to the table (Blocks + Empty Blocks)
CHAIN_CNT : Number of chained or migrated rows
AVG_ROW_LEN : Average length of rows in bytes
AVG_SPACE_FREELIST_BLOCKS (*)(G) : Average free space of blocks in the freelist
NUM_FREELIST_BLOCKS (*)(G) : Number of blocks in the freelist
SAMPLE_SIZE : Sample defined in ESTIMATE method (0 if COMPUTE)
LAST_ANALYZED : Timestamp of last analysis
GLOBAL_STATS (**) : For partitioned tables, YES means statistics are collected for the TABLE as a whole; NO means statistics are estimated from statistics on underlying table partitions or subpartitions
USER_STATS (**) : YES if statistics were entered directly by the user

Index level statistics can be retrieved from:
DBA_INDEXES - (all versions)
DBA_IND_PARTITIONS - (8.X onwards)
DBA_IND_SUBPARTITIONS - (8.1 onwards)

Columns to look at are:
BLEVEL : B*Tree level, the depth of the index from its root block to its leaf blocks (counted from 0)
LEAF_BLOCKS : Number of leaf blocks
DISTINCT_KEYS : Number of distinct keys
AVG_LEAF_BLOCKS_PER_KEY : Average number of leaf blocks in which each distinct key appears (1 for a UNIQUE index)
AVG_DATA_BLOCKS_PER_KEY : Average number of data blocks in the table that are pointed to by a distinct key
CLUSTERING_FACTOR :
  - if near the number of blocks, the table is ordered: index entries in a single leaf block tend to point to rows in the same data block
  - if near the number of rows, the table is randomly ordered: index entries in a single leaf block are unlikely to point to rows in the same data block
SAMPLE_SIZE : Sample defined in ESTIMATE method (0 if COMPUTE)
LAST_ANALYZED : Timestamp of last analysis
GLOBAL_STATS (**) : For partitioned indexes, YES means statistics are collected for the INDEX as a whole; NO means statistics are estimated from statistics on underlying index partitions or subpartitions
USER_STATS (**) : YES if statistics were entered directly by the user
PCT_DIRECT_ACCESS (**)(GP) : For secondary indexes on IOTs, percentage of rows with a VALID guess (can be refreshed with ALTER INDEX index_name UPDATE BLOCK REFERENCES)

Column level statistics can be retrieved from:
DBA_TAB_COLUMNS - (all versions)
DBA_TAB_COL_STATISTICS - (Version 8.X onwards)
DBA_PART_COL_STATISTICS - (Version 8.X onwards)
DBA_SUBPART_COL_STATISTICS - (Version 8.1 onwards)

The last three views extract statistics data from DBA_TAB_COLUMNS.
Columns to look at are:
NUM_DISTINCT : Number of distinct values
LOW_VALUE : Lowest value
HIGH_VALUE : Highest value
DENSITY : Density
NUM_NULLS : Number of rows having a NULL value in the column
AVG_COL_LEN : Average length in bytes
NUM_BUCKETS : Number of buckets in the histogram for the column
SAMPLE_SIZE : Sample defined in ESTIMATE method (0 if COMPUTE)
LAST_ANALYZED : Timestamp of last analysis
GLOBAL_STATS (**) : For partitioned tables, YES means statistics are collected for the TABLE as a whole; NO means statistics are estimated from statistics on underlying table partitions or subpartitions
USER_STATS (**) : YES if statistics were entered directly by the user
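As an illustration of reading these column-level statistics, the query below pulls them for one table. It is a sketch that assumes the EMP table in the current schema and uses the USER_ variant of the view rather than the DBA_ one, so no extra privileges are needed.

```sql
-- Column-level statistics for EMP from the current schema.
SELECT column_name,
       num_distinct,
       density,
       num_nulls,
       num_buckets,
       sample_size,
       last_analyzed
FROM   user_tab_col_statistics
WHERE  table_name = 'EMP'
ORDER  BY column_name;
```

Comparing NUM_DISTINCT and NUM_NULLS against the table's NUM_ROWS gives a quick feel for how selective an equality predicate on each column is likely to be.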
Compute statistics vs. Estimate statistics
Both computed and estimated statistics are used by the Oracle optimizer to choose the execution plan for SQL statements that access analyzed objects. These statistics may also be useful to application developers who write such statements.
COMPUTE STATISTICS
COMPUTE STATISTICS instructs Oracle to compute exact statistics about the analyzed object and store them in the data dictionary. When computing statistics, Oracle scans the entire object to gather data about it and uses that data to compute exact statistics, so even slight variances throughout the object are accounted for. Because the entire object is scanned, the larger the object, the more work is required to gather the necessary information.
To perform an exact computation, Oracle requires enough space to scan and sort the table. If there is not enough space in memory, temporary space on disk may be required. For estimations, Oracle requires enough space to scan and sort only the rows in the requested sample of the table. For indexes, computation does not take up as much time or space, so it is best to perform a full computation.
Some statistics are always computed exactly, such as the number of data blocks currently containing data in a table or the depth of an index from its root block to its leaf blocks.
Use estimation for tables and clusters rather than computation, unless you need exact values. Because estimation rarely sorts, it is often much faster than computation, especially for large tables.
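A minimal sketch of exact computation with the ANALYZE statement follows. It assumes the EMP table from the SCOTT schema; the index name EMP_PK is hypothetical.

```sql
-- Exact statistics for the table: the whole table is scanned.
ANALYZE TABLE emp COMPUTE STATISTICS;

-- Exact statistics for a single index; computation is comparatively cheap for indexes.
-- EMP_PK is a hypothetical index name.
ANALYZE INDEX emp_pk COMPUTE STATISTICS;
```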
ESTIMATE STATISTICS
ESTIMATE STATISTICS instructs Oracle to estimate statistics about the analyzed object and store them in the data dictionary. When estimating statistics, Oracle gathers representative information from portions of an object. This subset provides reasonable, estimated statistics about the object. The accuracy of estimated statistics depends on how representative the sample is. Only parts of an object are scanned, so an object can be analyzed quickly. You can optionally specify the number or percentage of rows that Oracle should use in making the estimate.
To estimate statistics, Oracle selects a random sample of data. You can specify the sampling percentage and whether sampling should be based on rows or blocks.
- Row sampling reads rows without regard to their physical placement on disk. This provides the most random data for estimates, but it can result in reading more data than necessary. For example, in the worst case a row sample might select one row from each block, requiring a full scan of the table or index.
- Block sampling reads a random sample of blocks and uses all of the rows in those blocks for estimates. This reduces the amount of I/O for a given sample size, but it can reduce the randomness of the sample if rows are not randomly distributed on disk. Block sampling is not available for index statistics.

Notes on estimating statistics
The default estimate of the ANALYZE command reads approximately the first 1064 rows of the table, so the results often leave a lot to be desired. The general consensus is that the default of 1064 rows is not sufficient for accurate statistics on tables of any real size. Many reports suggest that estimating on 30 percent of the rows produces very accurate results. I personally run estimates at 35 percent, which seems to produce very accurate numbers and saves a lot of time over full scans.
Note that if an estimate covers 50% or more of a table, Oracle converts it to a full COMPUTE STATISTICS.
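The sampling options discussed above can be sketched as follows, first with ANALYZE and then with DBMS_STATS. The 35 percent figure is simply the value mentioned in the text, the 5000-row sample is an arbitrary illustration, and the SCOTT.EMP table is assumed.

```sql
-- Estimate from a percentage of the rows.
ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 35 PERCENT;

-- Estimate from a fixed number of rows (5000 is an arbitrary example value).
ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 5000 ROWS;

-- The DBMS_STATS equivalent, using block sampling instead of row sampling.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SCOTT',
    tabname          => 'EMP',
    estimate_percent => 35,
    block_sample     => TRUE);  -- sample random blocks rather than random rows
END;
/
```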
DBMS_STATS functions and variable definitions
Most of the DBMS_STATS procedures include the three parameters statown, stattab, and statid. These parameters allow you to store statistics in your own tables (outside of the dictionary), where they do not affect the optimizer. You can therefore maintain and experiment with sets of statistics.
The stattab parameter specifies the name of a table in which to hold statistics. The table is assumed to reside in the same schema as the object for which statistics are collected, unless the statown parameter is specified. Users may create multiple tables with different stattab identifiers to hold separate sets of statistics.
Additionally, users can maintain different sets of statistics within a single stattab by using the statid parameter, which helps avoid cluttering the user's schema.
For all of the SET or GET procedures, if stattab is not provided (i.e., NULL), the operation works directly on the dictionary statistics; users therefore do not need to create statistics tables if they only plan to modify the dictionary directly. If stattab is not NULL, the SET or GET operation works on the specified user statistics table, and not on the dictionary.
Create Stats Table
    DBMS_STATS.CREATE_STAT_TABLE (
      ownname   VARCHAR2,
      stattab   VARCHAR2,
      tblspace  VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.
stattab : Name of the table to create. This value should be passed as the stattab parameter to other procedures when the user does not want to modify the dictionary statistics directly.
tblspace : Tablespace in which to create the stat tables. If none is specified, then they are created in the user's default tablespace.
Drop Stats Table
    DBMS_STATS.drop_stat_table (
      ownname  VARCHAR2,
      stattab  VARCHAR2);
ownname : Name of the schema.
stattab : User stat table identifier.
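A short sketch of the two calls just described, creating a user statistics table and dropping it again. The table name MY_STATS is hypothetical, and the SCOTT schema is assumed; the tablespace is left to default.

```sql
-- Create a user statistics table in SCOTT's default tablespace.
-- MY_STATS is a hypothetical table name.
BEGIN
  DBMS_STATS.CREATE_STAT_TABLE(
    ownname => 'SCOTT',
    stattab => 'MY_STATS');
END;
/

-- Remove the statistics table once it is no longer needed.
BEGIN
  DBMS_STATS.DROP_STAT_TABLE(
    ownname => 'SCOTT',
    stattab => 'MY_STATS');
END;
/
```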
Gather Schema Stats
(Author's note: in my testing, this procedure still updated the data dictionary even when stattab was specified.)
    DBMS_STATS.gather_schema_stats (
      ownname           VARCHAR2,
      estimate_percent  NUMBER   DEFAULT NULL,
      block_sample      BOOLEAN  DEFAULT FALSE,
      method_opt        VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1',
      degree            NUMBER   DEFAULT NULL,
      granularity       VARCHAR2 DEFAULT 'DEFAULT',
      cascade           BOOLEAN  DEFAULT FALSE,
      stattab           VARCHAR2 DEFAULT NULL,
      statid            VARCHAR2 DEFAULT NULL,
      options           VARCHAR2 DEFAULT 'GATHER',
      objlist           OUT      ObjectTab,
      statown           VARCHAR2 DEFAULT NULL);
(In method_opt, SIZE 1 means that no histogram is created on the column; a value greater than 1 creates a histogram.)
ownname : Schema to analyze (NULL means current schema).
estimate_percent : Percentage of rows to estimate (NULL means compute). The valid range is [0.000001,100).
block_sample : Whether or not to use random block sampling instead of random row sampling. Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated. Only pertinent when doing an estimate statistics.
method_opt : Method options of the following format (the phrase 'SIZE 1' is required to ensure gathering statistics in parallel and for use with the phrase hidden):
FOR ALL [INDEXED | HIDDEN] COLUMNS [SIZE integer]
This value is passed to all of the individual tables.
degree : Degree of parallelism (NULL means use table default value).
granularity : Granularity of statistics to collect (only pertinent if the table is partitioned).
DEFAULT: Gather global- and partition-level statistics.
SUBPARTITION: Gather subpartition-level statistics.
PARTITION: Gather partition-level statistics.
GLOBAL: Gather global statistics.
ALL: Gather all (subpartition, partition, and global) statistics.
cascade : Gather statistics on the indexes as well.
Index statistics gathering is not parallelized. Using this option is equivalent to running the gather_index_stats procedure on each of the indexes in the schema in addition to gathering table and column statistics.
stattab : User stat table identifier describing where to save the current statistics.
statid : Identifier (optional) to associate with these statistics within stattab.
options : Further specification of which objects to gather statistics for:
GATHER: Gather statistics on all objects in the schema.
GATHER STALE: Gather statistics on stale objects as determined by looking at the *_tab_modifications views. Also, return a list of objects found to be stale.
GATHER EMPTY: Gather statistics on objects which currently have no statistics. Also, return a list of objects found to have no statistics.
LIST STALE: Return list of stale objects as determined by looking at the *_tab_modifications views.
LIST EMPTY: Return list of objects which currently have no statistics.
objlist : List of objects found to be stale or empty.
statown : Schema containing stattab (if different than ownname).
Export Schema Stats (from the data dictionary into a user stat table)
    DBMS_STATS.export_schema_stats (
      ownname  VARCHAR2,
      stattab  VARCHAR2,
      statid   VARCHAR2 DEFAULT NULL,
      statown  VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.
stattab : User stat table identifier describing where to store the statistics.
statid : Identifier (optional) to associate with these statistics within stattab.
statown : Schema containing stattab (if different than ownname).
Import Schema Stats (from a user stat table into the data dictionary)
    DBMS_STATS.import_schema_stats (
      ownname  VARCHAR2,
      stattab  VARCHAR2,
      statid   VARCHAR2 DEFAULT NULL,
      statown  VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.
stattab : User stat table identifier describing from where to retrieve the statistics.
statid : Identifier (optional) to associate with these statistics within stattab.
statown : Schema containing stattab (if different than ownname).
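The export and import procedures can be combined to move statistics from one database to another, as in the production-to-test scenario. The sketch below makes several assumptions: a user stat table named MY_STATS (hypothetical) already exists in the SCOTT schema, the statid PROD_BASELINE is an arbitrary label, and the stat table is copied between databases by some separate means, such as Oracle's export/import utilities, which is not shown.

```sql
-- On the source database: copy SCOTT's dictionary statistics into the stat table.
-- MY_STATS and PROD_BASELINE are hypothetical names.
BEGIN
  DBMS_STATS.EXPORT_SCHEMA_STATS(
    ownname => 'SCOTT',
    stattab => 'MY_STATS',
    statid  => 'PROD_BASELINE');
END;
/

-- (Transfer the MY_STATS table to the target database here; not shown.)

-- On the target database: load the saved statistics into the data dictionary.
BEGIN
  DBMS_STATS.IMPORT_SCHEMA_STATS(
    ownname => 'SCOTT',
    stattab => 'MY_STATS',
    statid  => 'PROD_BASELINE');
END;
/
```

Using a statid lets several sets of statistics share one stat table, so a later export under a different label does not overwrite this baseline.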
Delete Schema Stats
    DBMS_STATS.delete_schema_stats (
      ownname  VARCHAR2,
      stattab  VARCHAR2 DEFAULT NULL,
      statid   VARCHAR2 DEFAULT NULL,
      statown  VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.
stattab : User stat table identifier describing from where to delete the statistics. If stattab is NULL, then the statistics are deleted directly in the dictionary.
statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).
statown : Schema containing stattab (if different than ownname).
Set Table Stats
    DBMS_STATS.set_table_stats (
      ownname   VARCHAR2,
      tabname   VARCHAR2,
      partname  VARCHAR2 DEFAULT NULL,
      stattab   VARCHAR2 DEFAULT NULL,
      statid    VARCHAR2 DEFAULT NULL,
      numrows   NUMBER   DEFAULT NULL,
      numblks   NUMBER   DEFAULT NULL,
      avgrlen   NUMBER   DEFAULT NULL,
      flags     NUMBER   DEFAULT NULL,
      statown   VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.
tabname : Name of the table.
partname : Name of the table partition in which to store the statistics. If the table is partitioned and partname is NULL, then the statistics are stored at the global table level.
stattab : User stat table identifier describing where to store the statistics. If stattab is NULL, then the statistics are stored directly in the dictionary.
statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).
numrows : Number of rows in the table (partition).
numblks : Number of blocks the table (partition) occupies.
avgrlen : Average row length for the table (partition).
flags : For internal Oracle use (should be left as NULL).
statown : Schema containing stattab (if different than ownname).
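As an illustration of setting statistics by hand, the block below writes made-up values for SCOTT.EMP straight into the dictionary (stattab is left NULL). The numbers are arbitrary examples chosen to make a small test table look large to the optimizer; they are not measurements.

```sql
-- Pretend EMP is a large table; values are arbitrary illustrations.
-- With stattab omitted, the statistics are stored directly in the dictionary.
BEGIN
  DBMS_STATS.SET_TABLE_STATS(
    ownname => 'SCOTT',
    tabname => 'EMP',
    numrows => 1000000,
    numblks => 20000,
    avgrlen => 92);
END;
/
```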
Get Table Stats
    DBMS_STATS.get_table_stats (
      ownname   VARCHAR2,
      tabname   VARCHAR2,
      partname  VARCHAR2 DEFAULT NULL,
      stattab   VARCHAR2 DEFAULT NULL,
      statid    VARCHAR2 DEFAULT NULL,
      numrows   OUT NUMBER,
      numblks   OUT NUMBER,
      avgrlen   OUT NUMBER,
      statown   VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.
tabname : Name of the table.
partname : Name of the table partition from which to get the statistics. If the table is partitioned and if partname is NULL, then the statistics are retrieved from the global table level.
stattab : User stat table identifier describing from where to retrieve the statistics. If stattab is NULL, then the statistics are retrieved directly from the dictionary.
statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).
numrows : Number of rows in the table (partition).
numblks : Number of blocks the table (partition) occupies.
avgrlen : Average row length for the table (partition).
statown : Schema containing stattab (if different than ownname).
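Reading the values back follows the same pattern, with OUT variables receiving the results. This is a sketch for SQL*Plus that assumes statistics already exist for SCOTT.EMP in the dictionary; the l_ variable names are my own.

```sql
SET SERVEROUTPUT ON

DECLARE
  l_numrows  NUMBER;  -- number of rows
  l_numblks  NUMBER;  -- number of blocks
  l_avgrlen  NUMBER;  -- average row length
BEGIN
  -- stattab is omitted, so the values come from the dictionary.
  DBMS_STATS.GET_TABLE_STATS(
    ownname => 'SCOTT',
    tabname => 'EMP',
    numrows => l_numrows,
    numblks => l_numblks,
    avgrlen => l_avgrlen);

  DBMS_OUTPUT.PUT_LINE('rows='   || l_numrows ||
                       ' blocks=' || l_numblks ||
                       ' avg_row_len=' || l_avgrlen);
END;
/
```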
Get Index Stats
    DBMS_STATS.GET_INDEX_STATS (
      ownname   VARCHAR2,
      indname   VARCHAR2,
      partname  VARCHAR2 DEFAULT NULL,
      stattab   VARCHAR2 DEFAULT NULL,
      statid    VARCHAR2 DEFAULT NULL,
      numrows   OUT NUMBER,
      numlblks  OUT NUMBER,
      numdist   OUT NUMBER,
      avglblk   OUT NUMBER,
      avgdblk   OUT NUMBER,
      clstfct   OUT NUMBER,
      indlevel  OUT NUMBER,
      statown   VARCHAR2 DEFAULT NULL);
ownname : Name of the schema.
indname : Name of the index.
partname : Name of the index partition for which to get the statistics. If the index is partitioned and if partname is NULL, then the statistics are retrieved for the global index level.
stattab : User stat table identifier describing from where to retrieve the statistics. If stattab is NULL, then the statistics are retrieved directly from the dictionary.
statid : Identifier (optional) to associate with these statistics within stattab (Only pertinent if stattab is not NULL).
numrows : Number of rows in the index (partition).
numlblks : Number of leaf blocks in the index (partition).
numdist : Number of distinct keys in the index (partition).
avglblk : Average integral number of leaf blocks in which each distinct key appears for this index (partition).
avgdblk : Average integral number of data blocks in the table pointed to by a distinct key for this index (partition).
clstfct : Clustering factor for the index (partition).
indlevel : Height of the index (partition).
statown : Schema containing stattab (if different than ownname).
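A corresponding sketch for index statistics, using the parameter list shown above. The index name EMP_PK is hypothetical, the l_ variable names are my own, and the index is assumed to have statistics in the dictionary already.

```sql
SET SERVEROUTPUT ON

DECLARE
  l_numrows   NUMBER;
  l_numlblks  NUMBER;
  l_numdist   NUMBER;
  l_avglblk   NUMBER;
  l_avgdblk   NUMBER;
  l_clstfct   NUMBER;
  l_indlevel  NUMBER;
BEGIN
  -- EMP_PK is a hypothetical index name; stattab is omitted, so the dictionary is read.
  DBMS_STATS.GET_INDEX_STATS(
    ownname  => 'SCOTT',
    indname  => 'EMP_PK',
    numrows  => l_numrows,
    numlblks => l_numlblks,
    numdist  => l_numdist,
    avglblk  => l_avglblk,
    avgdblk  => l_avgdblk,
    clstfct  => l_clstfct,
    indlevel => l_indlevel);

  DBMS_OUTPUT.PUT_LINE('leaf_blocks='        || l_numlblks ||
                       ' distinct_keys='     || l_numdist  ||
                       ' clustering_factor=' || l_clstfct  ||
                       ' level='             || l_indlevel);
END;
/
```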
Automated table monitoring and stale statistics gathering example
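Note: in Oracle 10g, the STATISTICS_LEVEL initialization parameter acts as a global setting that controls table monitoring. The alter_schema_tab_monitoring procedure used below is no longer needed there; calling it raises no error, but nothing happens.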
You can automatically gather statistics or create lists of tables that have stale or no statistics.
To automatically gather statistics, run the DBMS_STATS.GATHER_SCHEMA_STATS and DBMS_STATS.GATHER_DATABASE_STATS procedures with the OPTIONS and objlist parameters. Use the following values for the options parameter:
GATHER STALE : Gathers statistics on tables with stale statistics (as determined from the *_tab_modifications views).
GATHER : Gathers statistics on all tables. (default)
GATHER EMPTY : Gathers statistics only on tables without statistics.
LIST STALE : Creates a list of tables with stale statistics (as determined from the *_tab_modifications views).
LIST EMPTY : Creates a list of tables that do not have statistics.
The objlist parameter identifies an output parameter for the LIST STALE and LIST EMPTY options. The objlist parameter is of type DBMS_STATS.OBJECTTAB.
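The LIST options can be sketched as follows: the call gathers nothing and only fills the output collection, which is then printed. This assumes the SCOTT schema, table monitoring already enabled on its tables, and a SQL*Plus session; the variable name l_stale is my own.

```sql
SET SERVEROUTPUT ON

DECLARE
  l_stale  DBMS_STATS.OBJECTTAB;  -- receives the list of stale objects
BEGIN
  -- LIST STALE only reports; no statistics are gathered by this call.
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname => 'SCOTT',
    options => 'LIST STALE',
    objlist => l_stale);

  FOR i IN 1 .. l_stale.COUNT LOOP
    DBMS_OUTPUT.PUT_LINE(l_stale(i).objtype || ' ' || l_stale(i).objname);
  END LOOP;
END;
/
```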
Step 1 : Perform a quick analyze to load in base statistics
    BEGIN
      DBMS_STATS.GATHER_SCHEMA_STATS (
        ownname          => 'scott',
        estimate_percent => null,              -- Small table, let's compute
        block_sample     => false,
        method_opt       => 'FOR ALL COLUMNS',
        degree           => null,              -- No parallelism used in this example
        granularity      => 'ALL',
        cascade          => true,              -- Make sure we include indexes
        options          => 'GATHER'           -- Gather mode
      );
    END;
    /

    PL/SQL procedure successfully completed.

Step 2 : Examine the current statistics
    SELECT table_name, num_rows, blocks, avg_row_len
    FROM   user_tables
    WHERE  table_name='EMP';

    TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
    ------------------------------ ---------- ---------- -----------
    EMP                                  1500         28          92

Step 3 : Turn on Automatic Monitoring
Now turn on automatic monitoring for the emp table. This can be done with the ALTER TABLE method. Starting with Oracle 9i, you can also do this at the schema level and at the level of the entire database. The syntax for all three methods is shown below. The monitoring results are stored in the *_tab_modifications views.
Monitor only the EMP table.
alter table emp monitoring;

Table altered.

Monitor all of the tables within Scott's schema. (Oracle 9i and higher)

BEGIN
  DBMS_STATS.alter_schema_tab_monitoring('scott', true);
END;
/

PL/SQL procedure successfully completed.

Monitor all of the tables within the database. (Oracle 9i and higher)

Note: Although the option to collect statistics for SYS tables is available via ALTER_DATABASE_TAB_MONITORING, Oracle continues to recommend against this practice until the next major release after 9i Release 2. Also note that the ALTER_DATABASE_TAB_MONITORING procedure in the DBMS_STATS package only monitors tables; there is an ALTER INDEX ... MONITORING USAGE statement which can be used to monitor index usage. Thanks to Nabil Nawaz for providing this and pointing out an error I made in the previous version of this article.
BEGIN
  DBMS_STATS.alter_database_tab_monitoring (
    monitoring => true,
    sysobjs    => false);  -- Don't set to true, see note above.
END;
/

PL/SQL procedure successfully completed.

Step 4 : Verify that monitoring is turned on.
Note: The results of the following query are from running the alter table ... statement on the emp table only.
The MONITORING column of the *_tables views shows whether automatic monitoring is enabled for a table.
SELECT table_name, monitoring
FROM user_tables
ORDER BY monitoring;

TABLE_NAME                     MONITORING
------------------------------ ----------
DEPT                           NO
EMP                            YES

Step 5 : Delete some rows from the database.
SQL> DELETE FROM emp WHERE rownum < 501;

500 rows deleted.

SQL> commit;

Commit complete.

Step 6 : Wait until the monitored data is flushed.
Data can be flushed in several ways.
In Oracle 8i, you can wait it out for 3 hours.
In Oracle 9i and higher, you only need to wait 15 minutes.
In either version, restart the database.
For immediate results in Oracle 9i and higher, use the DBMS_STATS.flush_database_monitoring_info procedure.
OK, I'm impatient...

exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

Step 7 : Check what has been collected.
As user "scott", check USER_TAB_MODIFICATIONS to see what was collected.

SELECT * FROM user_tab_modifications;

TABLE_NAME PARTITION_NAME SUBPARTITION_NAME INSERTS UPDATES DELETES TIMESTAMP TRUNCATED
---------- -------------- ----------------- ------- ------- ------- --------- ---------
EMP                                               0       0     500 18-SEP-02 NO

Step 8 : Execute DBMS_STATS to gather stats on all "stale" tables.
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname          => 'scott',
    estimate_percent => null,
    block_sample     => false,
    method_opt       => 'FOR ALL COLUMNS',
    degree           => null,
    granularity      => 'ALL',
    cascade          => true,
    options          => 'GATHER STALE');
END;
/

PL/SQL procedure successfully completed.

Step 9 : Verify that the table is no longer listed in USER_TAB_MODIFICATIONS.
SQL> SELECT * FROM user_tab_modifications;

no rows selected.

Step 10 : Examine some of the new statistics collected.
SELECT table_name, num_rows, blocks, avg_row_len
FROM user_tables
WHERE table_name='EMP';

TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
EMP                                  1000         28          92
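To judge ahead of time whether a table is likely to be picked up by GATHER STALE, the recorded changes can be compared with the row count from the last analyze. The query below is only a sketch: the column alias pct_changed is made up for this example, and the commonly cited staleness threshold of about 10% of rows changed is an assumption here, since this article does not state one. Run it after flushing the monitoring data and before gathering, because gathering clears the rows from USER_TAB_MODIFICATIONS.

```sql
-- Changes recorded since the last analyze, as a percentage of NUM_ROWS.
-- NULLIF avoids a divide-by-zero for tables analyzed while empty.
SELECT m.table_name,
       t.num_rows,
       m.inserts + m.updates + m.deletes AS changes,
       ROUND(100 * (m.inserts + m.updates + m.deletes)
             / NULLIF(t.num_rows, 0), 1) AS pct_changed
FROM   user_tab_modifications m
       JOIN user_tables t ON t.table_name = m.table_name
WHERE  m.partition_name IS NULL      -- table-level rows only
ORDER  BY pct_changed DESC;
```

For the EMP example above, 500 deletes against 1500 rows works out to roughly a third of the table, well past that assumed threshold.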
How to determine if dictionary statistics are RDBMS-generated or user-defined
The following section explains how to determine if your dictionary statistics are RDBMS-generated or set by users through one of the DBMS_STATS.SET_xx_STATS procedures. This is crucial for development environments that are testing the performance of SQL statements with various sets of statistics. The DBA will need to know whether the underlying statistics are RDBMS-generated or user-defined.
RDBMS-generated statistics are generated by the following:
ANALYZE SQL command
DBMS_UTILITY.ANALYZE_SCHEMA procedure
DBMS_UTILITY.ANALYZE_DATABASE procedure
DBMS_DDL.ANALYZE_OBJECT procedure
8.1 DBMS_STATS.GATHER_xx_STATS procedures
User-generated statistics can only be set through the DBMS_STATS.SET_xx_STATS procedures.
The USER_STATS column of DBA_TABLES, ALL_TABLES, and USER_TABLES displays:
YES, when statistics are entered directly by a user.
NO, when statistics are generated by the RDBMS through an ANALYZE statement or the DBMS_STATS.GATHER_xx_STATS procedures.
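The difference can be observed on a test system by setting statistics by hand and then reading the USER_STATS column back. This is a hedged sketch: the row and block counts are invented values for illustration, the SCOTT.EMP table is borrowed from the earlier walkthrough, and the call overwrites the table's existing statistics, so it should not be run against a table whose real statistics matter.

```sql
-- Manually set table statistics (illustrative, made-up values).
BEGIN
  DBMS_STATS.SET_TABLE_STATS(
    ownname => 'SCOTT',
    tabname => 'EMP',
    numrows => 1000000,   -- hypothetical row count
    numblks => 20000,     -- hypothetical block count
    avgrlen => 92);       -- average row length in bytes
END;
/

-- USER_STATS is expected to show YES for EMP after the call above,
-- and NO for tables analyzed by ANALYZE or DBMS_STATS.GATHER_xx_STATS.
SELECT table_name, num_rows, blocks, user_stats
FROM   user_tables
ORDER  BY table_name;
```

Re-gathering statistics on the table with one of the GATHER procedures should bring USER_STATS back to NO.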
Reposted from: http://www.bitscn.com/pdb/otherdb/201504/491562.html