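What? A SQL statement took 8 seconds to run!
What went wrong? Beats me; go ask the DBA.
And where is the DBA? Quit!! Damn!!!
When programmers have nowhere to turn for help, they have to rescue themselves and work at becoming a "pseudo DBA".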
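Index
- Location and size of data files and log files
- Size and available space of a specific database's files
- Server disk capacity and mount information
- Remaining disk space
- Recovery model configured for each database
- Most recent full backups
- VLF count for every database
- SQL Server error log location
- Recent error log entries
- I/O requests in the error log that took longer than 15 seconds
- Disk performance metrics
- Which database files have the worst I/O bottleneck
- Ranking files by write I/O latency
- I/O usage by database
- I/O status of a specific database's files
- Statements with the highest average I/O
- Pending I/O requests and how long they have waited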
Location and size of data files and log files
SELECT DB_NAME([database_id]) AS [Database Name]
    ,[file_id]
    ,[name]
    ,physical_name
    ,type_desc
    ,state_desc
    ,is_percent_growth
    ,growth
    ,CONVERT(BIGINT, growth / 128.0) AS [Growth in MB] -- only meaningful when is_percent_growth = 0
    ,CONVERT(BIGINT, size / 128.0) AS [Total Size in MB] -- size is counted in 8 KB pages
FROM sys.master_files WITH (NOLOCK)
WHERE ([database_id] > 4 AND [database_id] <> 32767) -- user databases
    OR [database_id] = 2 -- plus tempdb
ORDER BY DB_NAME([database_id])
OPTION (RECOMPILE);
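Things to check here:
- Are the data files and log files on different disks?
- Has everything been dumped onto the C: drive?
- Does tempdb have its own dedicated disk?
- How many tempdb files are there?
- How large has each tempdb file grown?
- Does the database consist of multiple files?
- Is the growth increment of the data files reasonable?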
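Two of the questions above (how many tempdb files, and whether anything sits on C:) can be answered directly from sys.master_files. The following is only a minimal sketch: it assumes tempdb is database_id 2, as in the query above, and that C: is the system drive on your server.

```sql
-- Count tempdb data (ROWS) and log (LOG) files and total their size.
-- size is stored in 8 KB pages, so dividing by 128.0 gives MB.
SELECT type_desc
    ,COUNT(*) AS [File Count]
    ,CONVERT(BIGINT, SUM(CONVERT(BIGINT, size)) / 128.0) AS [Total Size in MB]
FROM sys.master_files WITH (NOLOCK)
WHERE [database_id] = 2 -- tempdb
GROUP BY type_desc;

-- List every database file whose path starts with C:\ (assumed system drive).
SELECT DB_NAME([database_id]) AS [Database Name]
    ,[name]
    ,physical_name
    ,type_desc
FROM sys.master_files WITH (NOLOCK)
WHERE physical_name LIKE N'C:\%';
```

Whether the LIKE comparison matches a lowercase c: depends on the collation of the instance.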
Size and available space of a specific database's files
Run this in the context of the target database, for example by switching to it first with USE (USE TEST).
SELECT f.[name] AS [File Name]
    ,f.physical_name AS [Physical Name]
    ,CAST((f.size / 128.0) AS DECIMAL(15, 2)) AS [Total Size in MB]
    ,CAST(f.size / 128.0 - CAST(FILEPROPERTY(f.[name], 'SpaceUsed') AS INT) / 128.0 AS DECIMAL(15, 2)) AS [Available Space In MB]
    ,[file_id]
    ,fg.[name] AS [Filegroup Name]
FROM sys.database_files AS f WITH (NOLOCK)
LEFT OUTER JOIN sys.data_spaces AS fg WITH (NOLOCK) ON f.data_space_id = fg.data_space_id
OPTION (RECOMPILE);
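The division by 128.0 in these queries is just a unit conversion: size and SpaceUsed are counted in 8 KB pages, and 128 pages make 1 MB. Here is a small worked example with made-up page counts (both variables are hypothetical, not values from any real file):

```sql
-- Hypothetical numbers, for illustration only.
DECLARE @size_pages INT = 12800; -- file size in 8 KB pages
DECLARE @used_pages INT = 9600;  -- what FILEPROPERTY(..., 'SpaceUsed') might return

SELECT CAST(@size_pages / 128.0 AS DECIMAL(15, 2)) AS [Total Size in MB]
    ,CAST(@size_pages / 128.0 - @used_pages / 128.0 AS DECIMAL(15, 2)) AS [Available Space In MB];
-- 12800 / 128 = 100.00 MB in total, 9600 / 128 = 75 MB used, so 25.00 MB available.
```

Dividing by 128.0 rather than 128 matters: with two integers T-SQL performs integer division and silently drops the fractional megabytes.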
Server disk capacity and mount information
-- One row per volume that hosts at least one database file
SELECT DISTINCT vs.volume_mount_point
    ,vs.file_system_type
    ,vs.logical_volume_name
    ,CONVERT(DECIMAL(18, 2), vs.total_bytes / 1073741824.0) AS [Total Size (GB)]
    ,CONVERT(DECIMAL(18, 2), vs.available_bytes / 1073741824.0) AS [Available Size (GB)]
    ,CAST(CAST(vs.available_bytes AS FLOAT) / CAST(vs.total_bytes AS FLOAT) AS DECIMAL(18, 2)) * 100 AS [Space Free %]
FROM sys.master_files AS f WITH (NOLOCK)
CROSS APPLY sys.dm_os_volume_stats(f.database_id, f.[file_id]) AS vs
OPTION (RECOMPILE);

-- One row per database file, with the attributes of the volume it lives on
SELECT DB_NAME(vs.database_id) AS DatabaseName
    ,vs.file_id
    ,vs.volume_mount_point
    ,vs.volume_id
    ,vs.logical_volume_name
    ,vs.file_system_type
    ,(vs.total_bytes / 1024 / 1024 / 1024) AS [TotalSize(GB)] -- integer division, whole GB only
    ,(vs.available_bytes / 1024 / 1024 / 1024) AS [AvailableSize(GB)]
    ,vs.supports_compression
    ,vs.supports_alternate_streams
    ,vs.supports_sparse_files
    ,vs.is_read_only
    ,vs.is_compressed
FROM sys.master_files AS mf
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs;
Remaining disk space
-- Free space (MB) on every fixed drive
EXEC master.dbo.xp_fixeddrives;

-- Total and free space (MB) on the drives that host database files
SELECT DISTINCT SUBSTRING(volume_mount_point, 1, 1) AS Volume_mount_point
    ,total_bytes / 1024 / 1024 AS Total_MB
    ,available_bytes / 1024 / 1024 AS Available_MB
FROM sys.master_files AS f
CROSS APPLY sys.dm_os_volume_stats(f.database_id, f.file_id);
Recovery model configured for each database
SELECT db.[name] AS [Database Name]
    ,db.recovery_model_desc AS [Recovery Model]
    ,db.state_desc
    ,db.log_reuse_wait_desc AS [Log Reuse Wait Description]
    ,CONVERT(DECIMAL(18, 2), ls.cntr_value / 1024.0) AS [Log Size (MB)]
    ,CONVERT(DECIMAL(18, 2), lu.cntr_value / 1024.0) AS [Log Used (MB)]
    ,CAST(CAST(lu.cntr_value AS FLOAT) / CAST(ls.cntr_value AS FLOAT) AS DECIMAL(18, 2)) * 100 AS [Log Used %]
    ,db.[compatibility_level] AS [DB Compatibility Level]
    ,db.page_verify_option_desc AS [Page Verify Option]
    ,db.is_auto_create_stats_on
    ,db.is_auto_update_stats_on
    ,db.is_auto_update_stats_async_on
    ,db.is_parameterization_forced
    ,db.snapshot_isolation_state_desc
    ,db.is_read_committed_snapshot_on
    ,db.is_auto_close_on
    ,db.is_auto_shrink_on
    ,db.target_recovery_time_in_seconds
    ,db.is_cdc_enabled
FROM sys.databases AS db WITH (NOLOCK)
INNER JOIN sys.dm_os_performance_counters AS lu WITH (NOLOCK) ON db.[name] = lu.instance_name
INNER JOIN sys.dm_os_performance_counters AS ls WITH (NOLOCK) ON db.[name] = ls.instance_name
WHERE lu.counter_name LIKE N'Log File(s) Used Size (KB)%'
    AND ls.counter_name LIKE N'Log File(s) Size (KB)%'
    AND ls.cntr_value > 0
OPTION (RECOMPILE);
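Things to look at here:
- How many databases have been created on the instance?
- Which recovery model does each of them use?
- What is log reuse waiting on (Log Reuse Wait Description)?
- How large is the transaction log right now?
- What is the compatibility level set to?
- What is the Page Verify Option set to? It should normally be CHECKSUM.
- Is the Auto Update Statistics Asynchronously option enabled?
- Make sure the auto_shrink and auto_close options are not turned on.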
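The last few checks in the list can be turned into a filter, so that only the databases needing attention show up. This is a minimal sketch that reuses columns already queried above; treating anything other than CHECKSUM as suspicious is an assumption drawn from the checklist, not a hard rule.

```sql
-- Databases that have auto_shrink or auto_close enabled,
-- or whose page verify option is not CHECKSUM.
SELECT db.[name] AS [Database Name]
    ,db.recovery_model_desc AS [Recovery Model]
    ,db.page_verify_option_desc AS [Page Verify Option]
    ,db.is_auto_shrink_on
    ,db.is_auto_close_on
FROM sys.databases AS db WITH (NOLOCK)
WHERE db.is_auto_shrink_on = 1
    OR db.is_auto_close_on = 1
    OR db.page_verify_option_desc <> N'CHECKSUM'
ORDER BY db.[name];
```

An empty result means none of the databases trips these three checks.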
Most recent full backups
SELECT TOP (30) bs.machine_name
    ,bs.server_name
    ,bs.database_name AS [Database Name]
    ,bs.recovery_model
    ,CONVERT(BIGINT, bs.backup_size / 1048576) AS [Uncompressed Backup Size (MB)]
    ,CONVERT(BIGINT, bs.compressed_backup_size / 1048576) AS [Compressed Backup Size (MB)]
    ,CONVERT(NUMERIC(20, 2), (CONVERT(FLOAT, bs.backup_size) / CONVERT(FLOAT, bs.compressed_backup_size))) AS [Compression Ratio]
    ,DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date) AS [Backup Elapsed Time (sec)]
    ,bs.backup_finish_date AS [Backup Finish Date]
FROM msdb.dbo.backupset AS bs WITH (NOLOCK)
WHERE DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date) > 0
    AND bs.backup_size > 0
    AND bs.type = 'D' -- Change to L if you want Log backups
    AND bs.database_name = DB_NAME(DB_ID()) -- current database only
ORDER BY bs.backup_finish_date DESC
OPTION (RECOMPILE);
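VLF count for every database
VLF: Virtual Log File
SQL Server divides the log file (LDF) into multiple VLFs to make log processing more efficient. The number and size of VLFs cannot be configured directly; SQL Server decides them on its own, and too many VLFs cause performance problems. Choosing a sensible initial size and growth increment for the log file is an effective way to keep the VLF count down. Each time the log file is created or grows, the number of VLFs added depends on the size of that chunk:
- 1 MB to 64 MB: 4 VLFs
- 64 MB to 1 GB: 8 VLFs
- more than 1 GB: 16 VLFs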
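To make the rule above concrete, here is a small self-contained sketch that applies it to a few sample growth sizes. The growth values are made up, and the handling of the exact 64 MB and 1 GB boundaries is an assumption read off the list above; newer SQL Server versions also apply an extra rule for growths that are small relative to the current log size, so treat this as the classic rule only.

```sql
-- Apply the chunk-size rule to hypothetical growth increments (in MB).
SELECT g.growth_mb AS [Growth (MB)]
    ,CASE
        WHEN g.growth_mb < 64 THEN 4     -- below 64 MB
        WHEN g.growth_mb <= 1024 THEN 8  -- 64 MB up to 1 GB
        ELSE 16                          -- more than 1 GB
     END AS [VLFs Added]
FROM (VALUES (10), (64), (512), (1024), (4096)) AS g(growth_mb);
```

The arithmetic explains why tiny increments hurt: under this rule, a log that reaches 10 GB in 10 MB steps has grown 1024 times and picked up 1024 × 4 = 4096 VLFs, whereas the same log grown in 1 GB steps picks up only 10 × 8 = 80.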
-- Holds the output of DBCC LOGINFO (one row per VLF) for one database at a time
CREATE TABLE #VLFInfo (
    RecoveryUnitID INT
    ,FileID INT
    ,FileSize BIGINT
    ,StartOffset BIGINT
    ,FSeqNo BIGINT
    ,[Status] BIGINT
    ,Parity BIGINT
    ,CreateLSN NUMERIC(38)
    );

CREATE TABLE #VLFCountResults (
    DatabaseName SYSNAME
    ,VLFCount INT
    );

EXEC sp_MSforeachdb N'Use [?];
    INSERT INTO #VLFInfo EXEC sp_executesql N''DBCC LOGINFO([?])'';
    INSERT INTO #VLFCountResults SELECT DB_NAME(), COUNT(*) FROM #VLFInfo;
    TRUNCATE TABLE #VLFInfo;';

SELECT DatabaseName
    ,VLFCount
FROM #VLFCountResults
ORDER BY VLFCount DESC;

DROP TABLE #VLFInfo;
DROP TABLE #VLFCountResults;
A high VLF count hurts write performance and slows down database recovery; as a rule of thumb, keep the VLF count below 200.
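On newer versions the same count can be obtained without temporary tables. The sketch below assumes your instance has the sys.dm_db_log_info function (SQL Server 2016 SP2 and later, as far as I know); on older versions, stick with the DBCC LOGINFO approach. The 200 threshold is simply the rule of thumb mentioned above.

```sql
-- One row per VLF is returned by sys.dm_db_log_info, so COUNT(*) is the VLF count.
SELECT d.[name] AS DatabaseName
    ,COUNT(*) AS VLFCount
FROM sys.databases AS d
CROSS APPLY sys.dm_db_log_info(d.database_id) AS li
GROUP BY d.[name]
HAVING COUNT(*) > 200 -- only databases above the rule-of-thumb limit
ORDER BY VLFCount DESC;
```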
SQL Server error log location
SELECT is_enabled
    ,[path]
    ,max_size
    ,max_files
FROM sys.dm_os_server_diagnostics_log_configurations WITH (NOLOCK)
OPTION (RECOMPILE);
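Strictly speaking, the DMV above describes the server diagnostics log, whose path is normally the instance's Log directory, which is also where ERRORLOG lives by default. If you want the error log file itself, the following two alternatives may help; both are sketches that assume your version supports the ErrorLogFileName property and that the startup message still has its usual wording.

```sql
-- Full path of the current error log file.
SELECT SERVERPROPERTY('ErrorLogFileName') AS [Error Log File];

-- The error log also records its own location at startup;
-- search the current log (0) of type SQL Server (1) for that message.
EXEC master.dbo.xp_readerrorlog 0, 1, N'Logging SQL Server messages in file';
```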
Recent error log entries
DECLARE @Time_Start DATETIME;
DECLARE @Time_End DATETIME;

SET @Time_Start = GETDATE() - 2; -- last two days
SET @Time_End = GETDATE();

-- Create the temporary table
CREATE TABLE #ErrorLog (
    logdate DATETIME
    ,processinfo VARCHAR(255)
    ,Message NVARCHAR(4000) -- wide enough that long messages do not fail the insert
    );

-- Populate the temporary table
INSERT #ErrorLog (
    logdate
    ,processinfo
    ,Message
    )
EXEC master.dbo.xp_readerrorlog 0
    ,1
    ,NULL
    ,NULL
    ,@Time_Start
    ,@Time_End
    ,N'desc';

-- Filter the temporary table
SELECT logdate
    ,Message
FROM #ErrorLog
WHERE (
        Message LIKE '%error%'
        OR Message LIKE '%failed%'
        )
    AND processinfo NOT LIKE 'logon'
ORDER BY logdate DESC;

-- Drop the temporary table
DROP TABLE #ErrorLog;
I/O requests in the error log that took longer than 15 seconds
CREATE TABLE #IOWarningResults (
    LogDate DATETIME
    ,ProcessInfo SYSNAME
    ,LogText NVARCHAR(1000)
    );

-- Search the current error log (0) and the four most recent archives (1 to 4)
INSERT INTO #IOWarningResults
EXEC xp_readerrorlog 0, 1, N'taking longer than 15 seconds';

INSERT INTO #IOWarningResults
EXEC xp_readerrorlog 1, 1, N'taking longer than 15 seconds';

INSERT INTO #IOWarningResults
EXEC xp_readerrorlog 2, 1, N'taking longer than 15 seconds';

INSERT INTO #IOWarningResults
EXEC xp_readerrorlog 3, 1, N'taking longer than 15 seconds';

INSERT INTO #IOWarningResults
EXEC xp_readerrorlog 4, 1, N'taking longer than 15 seconds';

SELECT LogDate
    ,ProcessInfo
    ,LogText
FROM #IOWarningResults
ORDER BY LogDate DESC;

DROP TABLE #IOWarningResults;
If this returns any rows, there is an I/O performance problem, but finding out what causes it takes further digging.
Disk performance metrics
SELECT [Drive]
    ,CASE WHEN num_of_reads = 0 THEN 0 ELSE (io_stall_read_ms / num_of_reads) END AS [Read Latency (ms)]
    ,CASE WHEN num_of_writes = 0 THEN 0 ELSE (io_stall_write_ms / num_of_writes) END AS [Write Latency (ms)]
    ,CASE WHEN (num_of_reads = 0 AND num_of_writes = 0) THEN 0 ELSE (io_stall / (num_of_reads + num_of_writes)) END AS [Overall Latency (ms)]
    ,CASE WHEN num_of_reads = 0 THEN 0 ELSE (num_of_bytes_read / num_of_reads) END AS [Avg Bytes/Read]
    ,CASE WHEN num_of_writes = 0 THEN 0 ELSE (num_of_bytes_written / num_of_writes) END AS [Avg Bytes/Write]
    ,CASE WHEN (num_of_reads = 0 AND num_of_writes = 0) THEN 0 ELSE ((num_of_bytes_read + num_of_bytes_written) / (num_of_reads + num_of_writes)) END AS [Avg Bytes/Transfer]
FROM (
    -- Aggregate the per-file counters by drive letter
    SELECT LEFT(UPPER(mf.physical_name), 2) AS Drive
        ,SUM(num_of_reads) AS num_of_reads
        ,SUM(io_stall_read_ms) AS io_stall_read_ms
        ,SUM(num_of_writes) AS num_of_writes
        ,SUM(io_stall_write_ms) AS io_stall_write_ms
        ,SUM(num_of_bytes_read) AS num_of_bytes_read
        ,SUM(num_of_bytes_written) AS num_of_bytes_written
        ,SUM(io_stall) AS io_stall
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    INNER JOIN sys.master_files AS mf WITH (NOLOCK) ON vfs.database_id = mf.database_id
        AND vfs.file_id = mf.file_id
    GROUP BY LEFT(UPPER(mf.physical_name), 2)
    ) AS tab
ORDER BY [Overall Latency (ms)]
OPTION (RECOMPILE);
As a rule of thumb, a latency above 20-25 ms suggests a possible performance problem.
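The latency figures are plain averages: accumulated stall time divided by the number of I/O operations. The sketch below walks through that arithmetic on made-up counters and labels the result with the 20-25 ms rule of thumb. The drive letters, counter values, and label names are all hypothetical.

```sql
-- Hypothetical cumulative counters per drive, for illustration only.
SELECT s.drive
    ,l.latency_ms AS [Read Latency (ms)]
    ,CASE
        WHEN s.num_of_reads = 0 THEN 'no reads'
        WHEN l.latency_ms > 25 THEN 'investigate'
        WHEN l.latency_ms >= 20 THEN 'borderline'
        ELSE 'ok'
     END AS [Assessment]
FROM (VALUES
        ('C:', 120000, 10000) -- 120000 ms / 10000 reads = 12 ms
        ,('D:', 900000, 30000) -- 900000 ms / 30000 reads = 30 ms
        ,('E:', 0, 0)          -- no reads at all
    ) AS s(drive, io_stall_read_ms, num_of_reads)
-- NULLIF turns a zero read count into NULL instead of a divide-by-zero error.
CROSS APPLY (SELECT s.io_stall_read_ms / NULLIF(s.num_of_reads, 0) AS latency_ms) AS l;
```

Keep in mind that the real counters accumulate from the moment the instance started, so the averages smooth over both quiet periods and spikes.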
Which database files have the worst I/O bottleneck
SELECT DB_NAME(fs.database_id) AS [Database Name]
    ,CAST(fs.io_stall_read_ms / (1.0 + fs.num_of_reads) AS NUMERIC(10, 1)) AS [avg_read_stall_ms]
    ,CAST(fs.io_stall_write_ms / (1.0 + fs.num_of_writes) AS NUMERIC(10, 1)) AS [avg_write_stall_ms]
    ,CAST((fs.io_stall_read_ms + fs.io_stall_write_ms) / (1.0 + fs.num_of_reads + fs.num_of_writes) AS NUMERIC(10, 1)) AS [avg_io_stall_ms]
    ,CONVERT(DECIMAL(18, 2), mf.size / 128.0) AS [File Size (MB)]
    ,mf.physical_name
    ,mf.type_desc
    ,fs.io_stall_read_ms
    ,fs.num_of_reads
    ,fs.io_stall_write_ms
    ,fs.num_of_writes
    ,fs.io_stall_read_ms + fs.io_stall_write_ms AS [io_stalls]
    ,fs.num_of_reads + fs.num_of_writes AS [total_io]
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS fs
INNER JOIN sys.master_files AS mf WITH (NOLOCK) ON fs.database_id = mf.database_id
    AND fs.[file_id] = mf.[file_id]
ORDER BY avg_io_stall_ms DESC
OPTION (RECOMPILE);
To improve performance, consider moving the affected database files to a different disk or to a faster disk array.
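For a user database, relocating a file usually comes down to the four steps sketched below. Everything named here is hypothetical (the TEST database, the TEST_log logical file name, and the E:\SQLLogs path), the database is unavailable while it is offline, and system databases such as tempdb follow a different procedure, so rehearse this on a non-production copy first.

```sql
-- 1. Take the database offline, rolling back any open transactions.
ALTER DATABASE [TEST] SET OFFLINE WITH ROLLBACK IMMEDIATE;

-- 2. Move the physical file to the new disk at the operating system level
--    (outside SQL Server), keeping the service account's permissions intact.

-- 3. Point SQL Server at the new location of the file.
ALTER DATABASE [TEST] MODIFY FILE (
    NAME = N'TEST_log'
    ,FILENAME = N'E:\SQLLogs\TEST_log.ldf'
    );

-- 4. Bring the database back online.
ALTER DATABASE [TEST] SET ONLINE;
```

Afterwards, the first query in this article (against sys.master_files) should show the new physical_name.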
Ranking files by write I/O latency
SELECT [ReadLatency] = CASE WHEN [num_of_reads] = 0 THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END
    ,[WriteLatency] = CASE WHEN [num_of_writes] = 0 THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END
    ,[Latency] = CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0) THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END
    ,[AvgBytesPerRead] = CASE WHEN [num_of_reads] = 0 THEN 0 ELSE ([num_of_bytes_read] / [num_of_reads]) END
    ,[AvgBytesPerWrite] = CASE WHEN [num_of_writes] = 0 THEN 0 ELSE ([num_of_bytes_written] / [num_of_writes]) END
    ,[AvgBytesPerTransfer] = CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0) THEN 0 ELSE (([num_of_bytes_read] + [num_of_bytes_written]) / ([num_of_reads] + [num_of_writes])) END
    ,LEFT([mf].[physical_name], 2) AS [Drive]
    ,DB_NAME([vfs].[database_id]) AS [DB]
    ,[mf].[physical_name]
    ,[mf].[file_id]
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS [vfs]
INNER JOIN sys.master_files AS [mf] ON [vfs].[database_id] = [mf].[database_id]
    AND [vfs].[file_id] = [mf].[file_id]
ORDER BY [WriteLatency] DESC;
I/O usage by database
WITH Aggregate_IO_Statistics
AS (
    -- Total bytes read and written per database, in MB
    SELECT DB_NAME(database_id) AS [Database Name]
        ,CAST(SUM(num_of_bytes_read + num_of_bytes_written) / 1048576 AS DECIMAL(12, 2)) AS io_in_mb
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS [DM_IO_STATS]
    GROUP BY database_id
    )
SELECT ROW_NUMBER() OVER (ORDER BY io_in_mb DESC) AS [I/O Rank]
    ,[Database Name]
    ,io_in_mb AS [Total I/O (MB)]
    ,CAST(io_in_mb / SUM(io_in_mb) OVER () * 100.0 AS DECIMAL(5, 2)) AS [I/O Percent]
FROM Aggregate_IO_Statistics
ORDER BY [I/O Rank]
OPTION (RECOMPILE);
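The percentage column relies on SUM(...) OVER (), a window aggregate that puts the grand total on every row so each database can be divided by it. Here is the same calculation on three made-up databases (the names and MB figures are hypothetical):

```sql
WITH io
AS (
    SELECT v.[Database Name]
        ,v.io_in_mb
    FROM (VALUES
            (N'Sales', 600.00)
            ,(N'Reports', 300.00)
            ,(N'tempdb', 100.00)
        ) AS v([Database Name], io_in_mb)
    )
SELECT ROW_NUMBER() OVER (ORDER BY io_in_mb DESC) AS [I/O Rank]
    ,[Database Name]
    ,io_in_mb AS [Total I/O (MB)]
    -- SUM(io_in_mb) OVER () is 1000.00 on every row, so the shares are 60, 30 and 10 percent.
    ,CAST(io_in_mb / SUM(io_in_mb) OVER () * 100.0 AS DECIMAL(5, 2)) AS [I/O Percent]
FROM io
ORDER BY [I/O Rank];
```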
I/O status of a specific database's files
Run this in the context of the target database, for example by switching to it first with USE (USE TEST).
SELECT DB_NAME(DB_ID()) AS [Database Name]
    ,df.[name] AS [Logical Name]
    ,vfs.[file_id]
    ,df.physical_name AS [Physical Name]
    ,vfs.num_of_reads
    ,vfs.num_of_writes
    ,vfs.io_stall_read_ms
    ,vfs.io_stall_write_ms
    -- NULLIF yields NULL instead of a divide-by-zero error for files with no activity
    ,CAST(100. * vfs.io_stall_read_ms / NULLIF(vfs.io_stall_read_ms + vfs.io_stall_write_ms, 0) AS DECIMAL(10, 1)) AS [IO Stall Reads Pct]
    ,CAST(100. * vfs.io_stall_write_ms / NULLIF(vfs.io_stall_write_ms + vfs.io_stall_read_ms, 0) AS DECIMAL(10, 1)) AS [IO Stall Writes Pct]
    ,(vfs.num_of_reads + vfs.num_of_writes) AS [Writes + Reads]
    ,CAST(vfs.num_of_bytes_read / 1048576.0 AS DECIMAL(10, 2)) AS [MB Read]
    ,CAST(vfs.num_of_bytes_written / 1048576.0 AS DECIMAL(10, 2)) AS [MB Written]
    ,CAST(100. * vfs.num_of_reads / NULLIF(vfs.num_of_reads + vfs.num_of_writes, 0) AS DECIMAL(10, 1)) AS [# Reads Pct]
    ,CAST(100. * vfs.num_of_writes / NULLIF(vfs.num_of_reads + vfs.num_of_writes, 0) AS DECIMAL(10, 1)) AS [# Write Pct]
    ,CAST(100. * vfs.num_of_bytes_read / NULLIF(vfs.num_of_bytes_read + vfs.num_of_bytes_written, 0) AS DECIMAL(10, 1)) AS [Read Bytes Pct]
    ,CAST(100. * vfs.num_of_bytes_written / NULLIF(vfs.num_of_bytes_read + vfs.num_of_bytes_written, 0) AS DECIMAL(10, 1)) AS [Written Bytes Pct]
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
INNER JOIN sys.database_files AS df WITH (NOLOCK) ON vfs.[file_id] = df.[file_id]
OPTION (RECOMPILE);
This helps you see, from an I/O perspective, how much pressure each database file is under.
Statements with the highest average I/O
SELECT TOP (50) OBJECT_NAME(qt.objectid, qt.[dbid]) AS [SP Name]
    ,(qs.total_logical_reads + qs.total_logical_writes) / qs.execution_count AS [Avg IO]
    ,qs.execution_count AS [Execution Count]
    -- Offsets are byte positions in Unicode text, hence the division by 2
    ,SUBSTRING(qt.[text], qs.statement_start_offset / 2 + 1, (
            CASE
                WHEN qs.statement_end_offset = -1
                    THEN DATALENGTH(qt.[text])
                ELSE qs.statement_end_offset
                END - qs.statement_start_offset
            ) / 2 + 1) AS [Query Text]
FROM sys.dm_exec_query_stats AS qs WITH (NOLOCK)
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.[dbid] = DB_ID()
ORDER BY [Avg IO] DESC
OPTION (RECOMPILE);
Pending I/O requests and how long they have waited
SELECT DB_NAME(iovfs.database_id) AS [DBNAME]
    ,iovfs.file_id
    ,iovfs.io_stall
    ,iopior.io_pending_ms_ticks
    ,iopior.scheduler_address
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS iovfs
INNER JOIN sys.dm_io_pending_io_requests AS iopior ON iovfs.file_handle = iopior.io_handle;
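This series, 《人人都是 DBA》 (Everyone Is a DBA), was published by Dennis Gao on 博客園 (cnblogs). Reproduction in any form without the author's consent is prohibited, and any reposting by crawler, automated or manual, is plain hooliganism.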