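Part 1 of "A Brief Look at MS SQL Statistics" gave an overall introduction to SQL Server database statistics. As my understanding of statistics has grown, I wrote this Part 2. Below are some of my explorations and observations about SQL Server statistics; if anything is wrong, corrections are welcome.
Questions about the conditions that trigger a statistics update
I have seen the conditions that trigger a statistics update described in many sources, for example the book Microsoft SQL Server 企業級平台管理實踐, and I explained them the same way in Part 1.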
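1: On a regular table, an automatic statistics update is triggered when:
1. Data is modified in an empty table.
2. The table had 500 rows or fewer when the statistics were created, and the leading column of the statistics object (the first column of the statistics) has since been modified more than 500 times.
3. The table had more than 500 rows when the statistics were gathered, and the leading column of the statistics object (the first column of the statistics) has since been modified more than 500 + 20% of the table's row count times.
2: Temporary tables
If the statistics object is defined on a temporary table, it is out of date as discussed above, except that there is an additional threshold for recomputation at 6 rows, with a test otherwise identical to test 2 in the previous list.
3: Table variables
Table variables have no statistics.
The official documentation at http://msdn.microsoft.com/en-us/library/dd535534%28v=sql.100%29.aspx explains it the same way.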
A statistics object is considered out of date in the following cases:
If the statistics is defined on a regular table, it is out of date if:
- The table size has gone from 0 to >0 rows (test 1).
- The number of rows in the table when the statistics were gathered was 500 or less, and the colmodctr of the leading column of the statistics object has changed by more than 500 since then (test 2).
- The table had more than 500 rows when the statistics were gathered, and the colmodctr of the leading column of the statistics object has changed by more than 500 + 20% of the number of rows in the table when the statistics were gathered (test 3).
· For filtered statistics, the colmodctr is first adjusted by the selectivity of the filter before these conditions are tested. For example, for filtered statistics with predicate selecting 50% of the rows, the colmodctr is multiplied by 0.5.
· One limitation of the automatic update logic is that it tracks changes to columns in the statistics, but not changes to columns in the predicate. If there are many changes to the columns used in predicates of filtered statistics, consider using manual updates to keep up with the changes.
· If the statistics object is defined on a temporary table, it is out of date as discussed above, except that there is an additional threshold for recomputation at 6 rows, with a test otherwise identical to test 2 in the previous list.
Table variables do not have statistics at all.
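The thresholds quoted above can be turned into a quick check. The query below is a minimal sketch, assuming a build that exposes sys.dm_db_stats_properties (SQL Server 2008 R2 SP2, 2012 SP1, or later) and using the dbo.TEST1 table from the experiment in this post. The column aliases are my own, and the special handling of filtered statistics and temporary tables is ignored.

```sql
-- Sketch: compare modifications against the classic auto-update thresholds.
-- Assumes sys.dm_db_stats_properties is available; dbo.TEST1 is the demo table.
SELECT  s.name                  AS stats_name,
        sp.last_updated,                        -- NULL: statistics not built yet
        sp.rows                 AS rows_when_gathered,
        sp.modification_counter AS leading_column_changes,
        CASE
            WHEN sp.rows <= 500 THEN 500        -- test 2
            ELSE 500 + 0.20 * sp.rows           -- test 3
        END                     AS classic_threshold
FROM    sys.stats AS s
OUTER APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE   s.object_id = OBJECT_ID('dbo.TEST1');
```

A row whose leading_column_changes exceeds classic_threshold is only a candidate for an automatic update; whether the update actually happens depends on a query asking for those statistics.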
However, one of my experiments showed something different. If you are interested, try the following SQL statements:
CREATE TABLE TEST1
(
ID INT ,
NAME VARCHAR(8) ,
CONSTRAINT PK_TEST1 PRIMARY KEY(ID)
)
GO
SELECT name AS index_name,
STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated
FROM sys.indexes
WHERE OBJECT_ID = OBJECT_ID('dbo.TEST1')
GO
index_name StatsUpdated
------------------------------------ -----------------------
PK_TEST1 NULL
INSERT INTO TEST1
SELECT 1001, 'Kerry' ;
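Checking the statistics update date at this point shows that inserting a row into the empty table did not cause the database to update its statistics.
INSERT INTO TEST1
SELECT 1002, 'Jimmy' ;
Even after I insert one or several more rows, the statistics are still not updated, and DBCC SHOW_STATISTICS(TEST1, PK_TEST1) shows the same result.
This clearly contradicts the first rule, that a data change in an empty table triggers a statistics update. Why? Is the official documentation wrong? So I went on to verify the second rule.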
DECLARE @Index INT;
SET @Index =2;
WHILE @Index <= 510
BEGIN
INSERT INTO TEST1 VALUES(@Index, 'k'+LTRIM(STR(@Index)));
SET @Index = @Index + 1;
END
SELECT name AS index_name,
STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated
FROM sys.indexes
WHERE OBJECT_ID = OBJECT_ID('dbo.TEST1')
DBCC SHOW_STATISTICS(TEST1, PK_TEST1)
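To my surprise, the statistics still had not been updated. Rule 2 did not seem to take effect either. How could that be? Some readers may suspect that I had not enabled the database options "Auto Update Statistics" and "Auto Create Statistics". I suspected that myself at the time, and even suspected a database version issue. If you run this experiment you will probably be just as puzzled. The reason is that one triggering condition is still missing. Drop the whole table and start again from a simple case by inserting a single row: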
CREATE TABLE TEST1
(
ID INT ,
NAME VARCHAR(8) ,
CONSTRAINT PK_TEST1 PRIMARY KEY(ID)
)
GO
INSERT INTO TEST1
SELECT 1001, 'Kerry' ;
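Now run SELECT * FROM TEST1 and check whether the statistics were updated: they were not. But after you run SELECT * FROM TEST1 WHERE ID=1001, you will find that the statistics have been updated. So the official documentation was right after all.
In other words, a statistics update requires not only that the conditions above are met, but also a specific SQL statement to trigger it. Take these two statements: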
SELECT * FROM TEST1
SELECT * FROM TEST1 WHERE ID=1001
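The first statement does not trigger the update, while the second does. The difference between them is the predicate: the column ID in the second statement's predicate is covered by the statistics PK_TEST1. So take a guess: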
SELECT * FROM TEST1 WHERE Name='Kerry'
Will this statement trigger a statistics update? In my experiment it did not. So I designed another small experiment to test another idea of mine, as follows:
IF EXISTS(SELECT 1 FROM SYSOBJECTS WHERE NAME = 'TEST1' AND XTYPE = 'U')
BEGIN
DROP TABLE TEST1;
END
GO
CREATE TABLE TEST1
(
ID INT ,
NAME VARCHAR(8) ,
Sex VARCHAR(2) ,
CONSTRAINT PK_TEST1 PRIMARY KEY(ID,NAME)
)
GO
INSERT INTO TEST1
SELECT 1001, 'Kerry','男' ;
SELECT * FROM TEST1 WHERE Name='Kerry'
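The result shows that this statement still does not trigger a statistics update. We can therefore conclude that a SQL statement must use the first column of the statistics as a predicate in order to trigger an update of those statistics. This fits how automatic updates work: statistics are refreshed when the optimizer needs them to compile a query, not at the moment the data changes.
Where statistics are stored
To find out which system tables store statistics, let us start with something simple, using a plain table named Employee as the example.
SP_HELPSTATS shows which statistics a table has. If you have read Part 1, you know from analyzing the stored procedure SP_HELPSTATS that the statistics listed here are read from sys.stats, joined with sys.stats_columns and sys.columns.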
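The join that SP_HELPSTATS relies on can be written out directly. This is a minimal sketch rather than the procedure's actual source; it assumes the Employee table lives in the dbo schema, and the column aliases are my own.

```sql
-- Sketch: list the statistics on dbo.Employee together with their columns.
SELECT  s.name            AS stats_name,
        s.auto_created,
        s.user_created,
        sc.stats_column_id,                 -- 1 = leading column of the statistics
        c.name            AS column_name
FROM    sys.stats AS s
JOIN    sys.stats_columns AS sc
        ON  sc.object_id = s.object_id
        AND sc.stats_id  = s.stats_id
JOIN    sys.columns AS c
        ON  c.object_id = sc.object_id
        AND c.column_id = sc.column_id
WHERE   s.object_id = OBJECT_ID('dbo.Employee')
ORDER BY s.name, sc.stats_column_id;
```

The stats_column_id column makes it easy to see which column is the leading one, which is the column that mattered in the trigger experiments earlier in this post.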
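But where is the statistics data itself kept? In SQL SERVER 2000, sysindexes has a column named statblob, and the statistics should be stored there. The official documentation states that the statblob column is the statistics binary large object (BLOB).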
select name, statblob from sysindexes where id= object_id('Login_Log')
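Starting with SQL SERVER 2005, however, this column returns NULL, and the statblob itself is stored in an internal catalog table.
So how can we look at the statistics? There are two ways:
Method 1: DBCC SHOW_STATISTICS('Employee', 'PK_Employee_ID_Name') WITH STATS_STREAM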
0x0100000002000000000000000000000084A41ED3000000001103000000000000B902000000000000380300003800000004000A00000000000000000000000000E7030480E7000000280000000000000024D0000000000000070000007E3A170111A300000F270000000000000F270000000000000000803F76BCD13876BCD1380000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000030000000200000014000000873ADE41003C1C460000000000008040873ABE4100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000190000000000000000000000000000005D0000000000000045010000000000004D0100000000000018000000000000002F000000000000004600000000000000100014000000803F000000000000803F10270000040000100014000000803F00301C460000803F1D4E0000040000100014000000803F000000000000803F1E4E00000400000300000078DC020011A30000000000008087C340778200000000000003000000000000C08E371A3F00000000000000000000000000000000000000000000000000000000000000000000000000000000A003CA000FA30000000000008087C340778200000000000003000000000000C08E371A3F0000000000000000000000000000000000000000000000000000000000000000000000000000000049805F010DA30000000000008087C340778200000000000003000000000000C08E371A3F000000000000000000000000000000000000000000000000000000000000000000000000000000000F27000000000000
Method 2: This query only works when you are connected to the database through the DAC (Dedicated Administrator Connection).
SELECT name, imageval
FROM sys.stats AS s
INNER JOIN sys.sysobjvalues AS o
ON s.object_id = o.objid
AND s.stats_id = o.subobjid
WHERE
s.object_id = OBJECT_ID('dbo.Employee');
0x070000007E3A170111A300000F270000000000000F270000000000000000803F76BCD13876BCD1380000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000030000000200000014000000873ADE41003C1C460000000000008040873ABE4100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000190000000000000000000000000000005D0000000000000045010000000000004D0100000000000018000000000000002F000000000000004600000000000000100014000000803F000000000000803F10270000040000100014000000803F00301C460000803F1D4E0000040000100014000000803F000000000000803F1E4E00000400000300000078DC020011A30000000000008087C340778200000000000003000000000000C08E371A3F00000000000000000000000000000000000000000000000000000000000000000000000000000000A003CA000FA30000000000008087C340778200000000000003000000000000C08E371A3F0000000000000000000000000000000000000000000000000000000000000000000000000000000049805F010DA30000000000008087C340778200000000000003000000000000C08E371A3F000000000000000000000000000000000000000000000000000000000000000000000000000000000F27000000000000
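Put the two binary strings above on two lines in UE or another text editor and right-align them, and you will find that the second hexadecimal string is part of the first. A partial screenshot is shown below.
As for the difference between the two, I have not figured it out yet.
An expert named joe-chang describes how to decode statistics in this blog post: http://sqlblog.com/blogs/joe_chang/archive/2012/05/05/decoding-stats-stream.aspx (the article is indeed deep and dry). I compared the output of the p_Stat2008c stored procedure he provides with the output of DBCC SHOW_STATISTICS, as shown below. Much of the information can be matched up; I could not draw many lines on the picture, so I marked only a few.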
DBCC SHOW_STATISTICS(Employee,PK_Employee_ID_Name);
exec [dbo].[p_Stat2008c] @Table='Employee', @Index ='PK_Employee_ID_Name'
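In other words, the statistics can essentially be decoded; it is just rather complicated.
Maintaining and updating database statistics effectively
If you want to maintain and update database statistics effectively, here are some suggestions for reference.
1: It is generally recommended to enable the "Auto Create Statistics" and "Auto Update Statistics" options (both are on by default) and let the database maintain and update statistics automatically. On a busy OLTP system, enabling the "Auto Update Statistics Asynchronously" option is recommended; otherwise this option should be turned off, especially on OLAP systems.
The recommendation on enabling asynchronous statistics updates: Use asynchronous statistics update if synchronous update causes undesired delay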
If you have a large database and an OLTP workload, and if you enable AUTO_UPDATE_STATISTICS, some transactions that normally run in a fraction of a second may very infrequently take several seconds or more because they cause statistics to be updated. If you want to avoid the possibility of this noticeable delay, enable AUTO_UPDATE_STATISTICS_ASYNC. For workloads with long-running queries, getting the best plan is more important than an infrequent delay in compilation. In such cases, use synchronous rather than asynchronous auto update statistics.
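For reference, these options can be inspected and changed with plain T-SQL. The snippet below is a minimal sketch; [YourDatabase] is a placeholder name of my own and not something from the whitepaper.

```sql
-- Sketch: inspect the statistics options of a database.
-- [YourDatabase] is a placeholder name.
SELECT  name,
        is_auto_create_stats_on,
        is_auto_update_stats_on,
        is_auto_update_stats_async_on
FROM    sys.databases
WHERE   name = N'YourDatabase';

-- Asynchronous updates only take effect while AUTO_UPDATE_STATISTICS is ON.
ALTER DATABASE [YourDatabase] SET AUTO_UPDATE_STATISTICS ON;
ALTER DATABASE [YourDatabase] SET AUTO_UPDATE_STATISTICS_ASYNC ON;
```

With the asynchronous option on, the query that notices stale statistics compiles with the old ones and the refresh runs in the background, which is the trade-off the quoted passage describes.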
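2: Where needed, selectively update statistics with FULLSCAN.
Updating statistics consumes resources, especially on large tables. The SQL SERVER engine therefore often updates statistics from a sample of the rows (for example, 80%) rather than from every row, and the accuracy of the statistics then depends heavily on the sampling rate. SQL SERVER generally strikes a balance between the accuracy of the statistics and reasonable resource consumption. There are special cases, though, the most obvious being tables with a heavily skewed data distribution, where the accuracy of the statistics has a very large effect on the execution plan. For such tables you can selectively update statistics with FULLSCAN.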
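As an illustration, here is a minimal sketch of a full-scan update on the Employee example used earlier in this post; it assumes the table is in the dbo schema.

```sql
-- Sketch: rebuild one statistics object from every row of the table.
UPDATE STATISTICS dbo.Employee PK_Employee_ID_Name WITH FULLSCAN;

-- In the header, "Rows" and "Rows Sampled" should now be equal.
DBCC SHOW_STATISTICS ('dbo.Employee', 'PK_Employee_ID_Name') WITH STAT_HEADER;
```

Comparing "Rows" with "Rows Sampled" before and after is a simple way to see how much of the table the previous update actually read.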
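3: Consider more frequent statistics gathering for ascending keys
For ascending key columns, such as IDENTITY columns or datetime columns that represent real-world timestamps, frequent INSERTs can leave the table's statistics inaccurate because the newly inserted values fall outside the histogram. Statistics on ascending keys therefore need to be gathered more often.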
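One way to see the problem is to compare the highest key in the histogram with the highest key in the table. The script below is a rough sketch under assumptions: dbo.Orders, its datetime column OrderDate, and the statistics name IX_Orders_OrderDate are hypothetical names chosen for illustration.

```sql
-- Sketch: detect an ascending key that has outrun its histogram.
-- dbo.Orders, OrderDate and IX_Orders_OrderDate are hypothetical names.
CREATE TABLE #Histogram
(
    RANGE_HI_KEY        sql_variant,
    RANGE_ROWS          float,
    EQ_ROWS             float,
    DISTINCT_RANGE_ROWS float,
    AVG_RANGE_ROWS      float
);

-- Capture the histogram rows returned by DBCC SHOW_STATISTICS.
INSERT INTO #Histogram
EXEC ('DBCC SHOW_STATISTICS (''dbo.Orders'', ''IX_Orders_OrderDate'') WITH HISTOGRAM');

SELECT  (SELECT MAX(CONVERT(datetime, RANGE_HI_KEY)) FROM #Histogram) AS highest_histogram_key,
        (SELECT MAX(OrderDate) FROM dbo.Orders)                       AS highest_actual_key;

DROP TABLE #Histogram;

-- If the actual key is well ahead of the histogram, refresh the statistics:
-- UPDATE STATISTICS dbo.Orders IX_Orders_OrderDate;
```

Rows newer than highest_histogram_key are invisible to the histogram, so queries on the most recent data are the ones most likely to get poor estimates.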
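4: Consider using fewer table-valued functions and table variables
Table variables and (multi-statement) table-valued functions have no statistics, so the optimizer has to guess their cardinality, which makes the resulting execution plans quite unreliable.
5: In scripts, consider using literals instead of local variables; in stored procedures, consider using parameters instead of local variables
If you use a local variable in a query predicate instead of a parameter or literal, the optimizer resorts to a reduced-quality estimate, or a guess for selectivity of the predicate. With parameters or literals in the query instead of local variables, the optimizer can choose a better query plan. Consider the following example.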
declare @StartOrderDate datetime
set @StartOrderDate = '20040731'
select * from Sales.SalesOrderHeader h, Sales.SalesOrderDetail d
WHERE h.SalesOrderID = d.SalesOrderId
AND h.OrderDate >= @StartOrderDate
SELECT * FROM Sales.SalesOrderHeader h, Sales.SalesOrderDetail d
WHERE h.SalesOrderID = d.SalesOrderId
AND h.OrderDate >= '20040731'
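The comparison of the two is shown below. The plan obtained with the literal is clearly better than the one obtained with the local variable (note that the execution plan may differ with the database version or the available memory, so rely on your own experiments).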
consider (1) rewriting the query to use literals instead of variables, (2) using sp_executesql with parameters that replace your use of local variables, or (3) using a stored procedure with parameters that replace your use of local variables. Dynamic SQL via EXEC may also be useful for eliminating local variables, but it typically results in higher compilation overhead and more complex programming. A new enhancement in SQL Server 2008 is that the OPTION(RECOMPILE) hint
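Two of the alternatives in the quoted advice can be sketched against the same AdventureWorks query. This is an illustration, not the whitepaper's own sample, and how well OPTION (RECOMPILE) exposes the variable's value to the optimizer may depend on the SQL Server 2008 build you run.

```sql
-- Option (2): pass the value as a parameter through sp_executesql.
DECLARE @sql nvarchar(max);
SET @sql = N'
SELECT *
FROM   Sales.SalesOrderHeader AS h
JOIN   Sales.SalesOrderDetail AS d
       ON h.SalesOrderID = d.SalesOrderID
WHERE  h.OrderDate >= @StartOrderDate;';

EXEC sp_executesql
     @sql,
     N'@StartOrderDate datetime',
     @StartOrderDate = '20040731';

-- OPTION (RECOMPILE): keep the local variable but recompile the statement.
DECLARE @StartOrderDate datetime;
SET @StartOrderDate = '20040731';

SELECT *
FROM   Sales.SalesOrderHeader AS h
JOIN   Sales.SalesOrderDetail AS d
       ON h.SalesOrderID = d.SalesOrderID
WHERE  h.OrderDate >= @StartOrderDate
OPTION (RECOMPILE);
```

The first form lets the plan be reused for other parameter values, while the second pays a compilation on every execution in exchange for an estimate based on the current value.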
6: Consider filtered statistics for heterogeneous data
Sometimes rows with different schema are mapped to a single physical table, with multipurpose columns such as ntext1, ntext2, bigint1, bigint2 storing semantically unrelated data. Typically, there is also a special-purpose rowtype column that defines what is the semantic meaning of the data stored in each column. Such design is useful for storing arbitrary user-defined lists without changing the underlying database schema. As a result, the same column may end up storing telephone numbers and city names, and a histogram on such column may not be very useful, due to the limit of 200 steps. To avoid this, define separate statistics for each rowtype in this table.
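A minimal sketch of what "separate statistics for each rowtype" could look like follows. The table, its columns, and the statistics names are hypothetical and only illustrate the idea; filtered statistics require SQL Server 2008 or later.

```sql
-- Hypothetical generic-list table; all names are illustrative only.
CREATE TABLE dbo.GenericList
(
    ListItemID int           NOT NULL PRIMARY KEY,
    RowType    int           NOT NULL,   -- 1 = phone number, 2 = city name
    Text1      nvarchar(100) NULL
);
GO
-- One filtered statistics object per row type, each with its own histogram.
CREATE STATISTICS Stat_GenericList_Text1_Phone
    ON dbo.GenericList (Text1)
    WHERE RowType = 1;

CREATE STATISTICS Stat_GenericList_Text1_City
    ON dbo.GenericList (Text1)
    WHERE RowType = 2;
```

Each statistics object gets its own histogram of up to 200 steps, so phone numbers and city names no longer compete for the same steps.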
7: Consider filtered statistics for partitioned tables
Statistics are defined at the table level. Changes to partitions affect statistics only indirectly, through the column modification counters. Switching in a partition is treated as an insert of its rows into the table, triggering statistics update based on the 20% rule, as outlined above. Filtered statistics, through their predicates, can target only rows in certain partition or partitions. There is no requirement to align to the boundaries of the partitions when defining the statistics.
Often, customers partition a table by the Date column, keeping a partition for every month, and updating only the last partition; older months receive the majority of the complex, read-only queries. In this scenario, creating separate, full-scan statistics on the read-only region of the table results in more accurate cardinality estimates. In order to benefit from the separate statistics object, queries must be contained within the read-only region. Similarly, separate statistics objects can be created for different regions based on the different access patterns.
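The read-only region described above might be covered like this. It is a sketch under assumptions: dbo.Orders, CustomerID, OrderDate, and the cut-off date are hypothetical, chosen only to show the shape of the statement.

```sql
-- Hypothetical table dbo.Orders partitioned by month on OrderDate.
-- Full-scan statistics over the months that no longer change.
CREATE STATISTICS Stat_Orders_CustomerID_ReadOnly
    ON dbo.Orders (CustomerID)
    WHERE OrderDate < '20140401'
    WITH FULLSCAN;
```

As the quoted text notes, a query benefits only if its own OrderDate predicate falls entirely inside the filtered range.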
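8: Update statistics automatically on a regular schedule. I recommend YourSQLDba: the tool is open source and easy to extend, and most importantly you can see exactly how it does the job.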
USE [YourSQLDba]
GO
/****** Object: StoredProcedure [yMaint].[UpdateStats] Script Date: 04/25/2014 14:26:10 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER proc [yMaint].[UpdateStats]
@JobNo Int
, @SpreadUpdStatRun Int
as
Begin
declare @seqStatNow Int
declare @cmptlevel Int
declare @dbName sysname
declare @sql nvarchar(max)
declare @lockResult int
Declare @seq Int -- row sequence for row by row processing
Declare @scn sysname -- schema name
Declare @tb sysname -- table name
declare @sampling Int -- sample percentage, chosen from the table's page count
Declare @idx sysname -- index name
Declare @object_id int -- a proof that an object exists
Begin Try
Create table #TableNames
(
scn sysname
, tb sysname
, sampling nvarchar(3)
, seq int
, primary key clustered (seq)
)
Update Maint.JobSeqUpdStat
Set @seqStatNow = (seq + 1) % @SpreadUpdStatRun, seq = @seqStatNow
Set @DbName = ''
While(1 = 1) -- simple do loop
Begin
Select top 1 -- first next in alpha sequence after the last one.
@DbName = DbName
, @cmptLevel = CmptLevel
From #Db
Where DbName > @DbName
Order By DbName
-- exit if nothing after the last one processed
If @@rowcount = 0 Break --
-- If database is not updatable, skip update stats for this database
If DATABASEPROPERTYEX(@DbName, 'Updateability') = N'READ_ONLY'
Continue
-- If database is in emergency, skip update stats for this database
If DatabasepropertyEx(@DbName, 'Status') IN (N'Emergency')
Continue
-- build the query boilerplate with replaceable parameters identified by
-- labels between "<" and ">"
-- this query selects the tables on which to perform update statistics
truncate table #TableNames
set @sql =
'
set nocount on
;With
TableSizeStats as
(
select
object_schema_name(Ps.object_id, db_id("<DbName>")) as scn --collate <srvCol>
, object_name(Ps.object_id, db_id("<DbName>")) as tb --collate <srvCol>
, Sum(Ps.Page_count) as Pg
From
sys.dm_db_index_physical_stats (db_id("<DbName>"), NULL, NULL, NULL, "LIMITED") Ps
Group by
Ps.object_id
)
Insert into #tableNames (scn, tb, seq, sampling)
Select
scn
, tb
, row_number() over (order by scn, tb) as seq
, Case
When Pg > 200000 Then "10"
When Pg between 50001 and 200000 Then "20"
When Pg between 5001 and 50000 Then "30"
else "100"
End
From
TableSizeStats
where (abs(checksum(tb)) % <SpreadUpdStatRun>) = <seqStatNow>
'
set @sql = replace(@sql,'<srvCol>',convert(nvarchar(100), Serverproperty('collation')))
Set @sql = replace(@sql,'<seqStatNow>', convert(nvarchar(20), @seqStatNow))
Set @sql = replace(@sql,'<SpreadUpdStatRun>', convert(nvarchar(20), @SpreadUpdStatRun))
set @sql = replace(@sql,'"','''') -- to avoid doubling of quotes in boilerplate
set @sql = replace(@sql,'<DbName>',@DbName)
Exec yExecNLog.LogAndOrExec
@context = 'yMaint.UpdateStats'
, @Info = 'Table selection for update statistics'
, @sql = @sql
, @JobNo = @JobNo
, @forDiagOnly = 1
Exec yMaint.LockMaintDb @jobNo, 'U', @DbName, @LockResult output
If @lockResult < 0 -- messages are issued from yMaint.LockMaintDb
Continue
set @seq = 0
While (1 = 1)
begin
Select top 1 @scn = scn, @tb = tb, @sampling = sampling, @seq = seq
from #TableNames where seq > @seq order by seq
if @@rowcount = 0 break
Set @sql = 'Select @object_id = object_id("<DbName>.<scn>.<tb>") '
set @sql = replace (@sql, '<DbName>', @DbName)
set @sql = replace (@sql, '<scn>', @scn)
set @sql = replace (@sql, '<tb>', @tb)
set @sql = replace (@sql, '"', '''')
Exec sp_executeSql @Sql, N'@object_id int output', @object_id output
If @object_id is not null
Begin
Set @sql = 'update statistics [<DbName>].[<scn>].[<tb>] WITH sample <sampling> PERCENT'
set @sql = replace (@sql, '<DbName>', @DbName)
set @sql = replace (@sql, '<scn>', @scn)
set @sql = replace (@sql, '<tb>', @tb)
set @sql = replace (@sql, '<sampling>', @sampling)
set @sql = replace (@sql, '"', '''')
Exec yExecNLog.LogAndOrExec
@context = 'yMaint.UpdateStats'
, @Info = 'update statistics selected'
, @sql = @sql
, @JobNo = @JobNo
End
end -- While
Exec yMaint.UnLockMaintDb @jobNo, @DbName
End -- While loop, database by database
End try
Begin catch
Exec yExecNLog.LogAndOrExec @jobNo = @jobNo, @context = 'yMaint.UpdateStats Error', @err = '?'
End Catch
End -- yMaint.UpdateStats
References:
http://msdn.microsoft.com/en-us/library/dd535534.aspx
http://sqlblog.com/blogs/joe_chang/archive/2012/05/05/decoding-stats-stream.aspx


![clipboard[1] clipboard[1]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwMzU5MTA3MzcxMC5wbmc=.png)
![clipboard[2] clipboard[2]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDAxMDkxMzgzOC5wbmc=.png)
![clipboard[3] clipboard[3]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDAyNTQ1NDU5NC5wbmc=.png)
![clipboard[4] clipboard[4]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDA0MzI2ODU0OS5wbmc=.png)
![clipboard[5] clipboard[5]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDA3MjYzNzUxOS5wbmc=.png)
![clipboard[6] clipboard[6]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDA5MDQ1MjQ3NS5wbmc=.png)
![clipboard[7] clipboard[7]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDExODczNjI0NC5wbmc=.png)
![clipboard[8] clipboard[8]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDEzMjAxNDI3MS5wbmc=.png)
![clipboard[9] clipboard[9]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDE2Nzk1OTY0MC5wbmc=.png)
![clipboard[35] clipboard[35]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDIyMTA3MTc1MC5wbmc=.png)
![clipboard[36] clipboard[36]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDI0MDkxMTg3OS5wbmc=.png)
![clipboard[37] clipboard[37]](/image/aHR0cHM6Ly9pbWFnZXMwLmNuYmxvZ3MuY29tL2Jsb2cvNzM1NDIvMjAxNDA0LzI1MTUwNDI3NjU0OTc4OC5wbmc=.png)