I. The RECOVERY PENDING state
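Today I changed the password of the SQL Server service account and then restarted the SQL Server service, and found a database in the Recovery Pending state.
Recovery Pending means that the database hit a resource-related error during recovery. The database is not damaged, but a file may be missing, or a system resource limit may be preventing recovery from starting. A database in the Recovery Pending state has its recovery suspended: it cannot begin recovering its data and log. This is not a failed recovery, because recovery has not started yet. The most likely cause is a missing data file or log file.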
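To see which databases are affected, the state can be read from the sys.databases catalog view. This is a minimal sketch; [DB_Name] in the second query is a placeholder for the database you are checking, not a name from this incident.

```sql
-- List every database that is not ONLINE, with its current state
-- (for example RECOVERY_PENDING, SUSPECT, RECOVERING, OFFLINE).
SELECT name,
       state_desc,
       user_access_desc
FROM sys.databases
WHERE state_desc <> N'ONLINE';

-- Check a single database; DB_Name is a placeholder.
SELECT DATABASEPROPERTYEX(N'DB_Name', 'Status') AS DatabaseStatus;
```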
How to repair a database in the Recovery Pending state:
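-- Put the database into EMERGENCY mode, then restrict it to a single user.
ALTER DATABASE [DB_Name] SET EMERGENCY;
ALTER DATABASE [DB_Name] SET SINGLE_USER WITH NO_WAIT;
-- REPAIR_ALLOW_DATA_LOSS may discard data in order to make the database consistent.
DBCC CHECKDB ([DB_Name], REPAIR_ALLOW_DATA_LOSS);
-- Bring the database back online and reopen it to all users.
ALTER DATABASE [DB_Name] SET ONLINE;
ALTER DATABASE [DB_Name] SET MULTI_USER WITH NO_WAIT;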
Before running the CHECKDB repair, check the size of the database:
SELECT DB_NAME(mf.database_id) AS DatabaseName,
       mf.type_desc AS FileType,
       mf.name AS FileLogicName,
       mf.physical_name AS FilePhysicalName,
       mf.size AS PagesCount,                  -- size is counted in 8 KB pages
       mf.size * 8 / 1024 AS Size_MB,
       mf.size * 8 / 1024 / 1024.0 AS Size_GB
FROM sys.master_files AS mf
WHERE mf.database_id = DB_ID(N'dbname');
Running the commands produced several errors:
1. User does not have permission to alter database 'Office365', the database does not exist, or the database is not in a state that allows access checks.
2. Database 'Office365' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.
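The second error points to the SQL Server error log. One way to follow that hint is to search the log for the database name with sp_readerrorlog, the same procedure used later in this post. This is only a sketch: it assumes the relevant entries are in the current log file.

```sql
-- First argument:  0 = current error log file (1, 2, ... are older files).
-- Second argument: 1 = SQL Server error log (2 would be the SQL Server Agent log).
-- Third argument:  only return lines that contain this text.
EXEC master..sp_readerrorlog 0, 1, 'Office365';
```

Lines that name a file which could not be opened usually show which physical file to look for on disk.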
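In the end I went to the physical path of the files and could not find the MDF file. The log file was there, but it had last been modified two years earlier, so this looked like an abandoned database. Changing the service account does not delete an 18 GB MDF file, so I asked my lead, who confirmed that the database had been retired. It was a false alarm. When the MDF file is gone but the log file is still there, the data file must have been deleted by force.
It was a close call, and a hard-earned lesson: before restarting the service, make sure no database is running update operations, and run CHECKPOINT to write dirty pages to disk.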
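The pre-restart check described above could be sketched as follows. This is an illustration rather than a complete procedure; [DB_Name] is a placeholder, and the choice of DMVs is my assumption about what "no update operations are running" should mean in practice.

```sql
-- 1. Look for user requests that are still executing.
SELECT r.session_id,
       DB_NAME(r.database_id) AS DatabaseName,
       r.command,
       r.status,
       r.start_time
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id
WHERE s.is_user_process = 1
  AND r.session_id <> @@SPID;   -- ignore this query itself

-- 2. Report the oldest open transaction in the database, if there is one.
USE [DB_Name];
DBCC OPENTRAN;

-- 3. Write the dirty pages of the current database to disk.
CHECKPOINT;
```

CHECKPOINT applies to the current database only, so it has to be repeated in each database that matters.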
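II. Estimating the remaining recovery time
While a database is in the In Recovery state, users cannot access it. If recovery takes a long time, the wait is painful for a DBA, who needs to know how much recovery time is left. How can you predict how long a database will take to go from In Recovery to a normal Online state? SQL Server does not give a direct answer, but it writes recovery progress to the ErrorLog while recovery runs, so the remaining time can be estimated from those progress records.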

DECLARE @DBName VARCHAR(64) = 'databasename';

DECLARE @ErrorLog AS TABLE
(
    [LogDate]     DATETIME,      -- datetime, so ORDER BY sorts chronologically
    [ProcessInfo] VARCHAR(64),
    [TEXT]        VARCHAR(MAX)
);

-- Read the current SQL Server error log, keeping only the recovery progress lines for this database.
INSERT INTO @ErrorLog
EXEC master..sp_readerrorlog 0, 1, 'Recovery of database', @DBName;

-- Show the most recent progress messages with the estimate converted to minutes and hours.
SELECT TOP 11
       [LogDate],
       SUBSTRING([TEXT], CHARINDEX(') is ', [TEXT]) + 4,
                 CHARINDEX(' complete (', [TEXT]) - CHARINDEX(') is ', [TEXT]) - 4) AS PercentComplete,
       CAST(SUBSTRING([TEXT], CHARINDEX('approximately', [TEXT]) + 13,
                      CHARINDEX(' seconds remain', [TEXT]) - CHARINDEX('approximately', [TEXT]) - 13) AS FLOAT) / 60.0 AS MinutesRemaining,
       CAST(SUBSTRING([TEXT], CHARINDEX('approximately', [TEXT]) + 13,
                      CHARINDEX(' seconds remain', [TEXT]) - CHARINDEX('approximately', [TEXT]) - 13) AS FLOAT) / 60.0 / 60.0 AS HoursRemaining,
       [TEXT]
FROM @ErrorLog
ORDER BY [LogDate] DESC;
The messages that SQL Server writes to the log look like this:
Recovery of database 'database name' (16) is 0% complete (approximately 303767 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
Recovery of database 'database name' (16) is 0% complete (approximately 396166 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
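To make the string handling concrete, the same parsing can be tried on a shortened copy of the first message above without touching the error log. The variable names here are mine and only serve the illustration.

```sql
-- A literal copy of the start of the first sample message.
DECLARE @Text VARCHAR(MAX) =
    'Recovery of database ''database name'' (16) is 0% complete (approximately 303767 seconds remain). Phase 1 of 3.';

-- The number sits between 'approximately' (13 characters) and ' seconds remain'.
DECLARE @Start INT = CHARINDEX('approximately', @Text) + 13;
DECLARE @Length INT = CHARINDEX(' seconds remain', @Text) - @Start;

SELECT CAST(SUBSTRING(@Text, @Start, @Length) AS FLOAT)          AS SecondsRemaining,
       CAST(SUBSTRING(@Text, @Start, @Length) AS FLOAT) / 60.0   AS MinutesRemaining,
       CAST(SUBSTRING(@Text, @Start, @Length) AS FLOAT) / 3600.0 AS HoursRemaining;
```

For this message, 303767 seconds works out to roughly 84 hours. The estimate in the second sample line is higher than in the first, so the figure is best read as a rough guide that can move between phases.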
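III. Database in the Suspect state
After installing Windows updates on a physical machine and rebooting it, I found a database on that server in the Suspect state. The database's files are spread across different servers, so I suspected that a remote server was restarting at the time, leaving the database unable to reach its remote files, and that SQL Server marked the database Suspect as a result.
The Windows event log showed the following error: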
The operating system returned error 53(The network path was not found.) to SQL Server during a read at offset 0x000001bed08000 in file '\\RemoteServerName\ShareFolder\xxxx.ndf'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
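The error occurred because the remote server was restarting and the database could not reach the files stored on it; the database files themselves were not damaged. The fix is to wait until all remote servers have finished restarting, then take the database offline and bring it back online. SQL Server checks the database automatically when it comes online, and if all of its files are accessible, the database returns to the normal Online state.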
ALTER DATABASE database_name SET OFFLINE;
-- wait for some seconds
ALTER DATABASE database_name SET ONLINE;
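After bringing the database back online, it may help to confirm the result at both the database and the file level. A small sketch, with database_name as the same placeholder used above:

```sql
-- Database-level state: expected to read ONLINE once recovery has finished.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'database_name';

-- File-level state: every file, including those on remote shares, should be ONLINE.
SELECT name, physical_name, state_desc
FROM sys.master_files
WHERE database_id = DB_ID(N'database_name');
```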
Appendix:
Database states and their descriptions:
- ONLINE: Database is available for access. The primary filegroup is online, although the undo phase of recovery may not have been completed.
- OFFLINE: Database is unavailable. A database becomes offline by explicit user action and remains offline until additional user action is taken. For example, the database may be taken offline in order to move a file to a new disk. The database is then brought back online after the move has been completed.
- RESTORING: One or more files of the primary filegroup are being restored, or one or more secondary files are being restored offline. The database is unavailable.
- RECOVERING: Database is being recovered. The recovering process is a transient state; the database will automatically become online if the recovery succeeds. If the recovery fails, the database will become suspect. The database is unavailable.
- RECOVERY PENDING: SQL Server has encountered a resource-related error during recovery. The database is not damaged, but files may be missing or system resource limitations may be preventing it from starting. The database is unavailable. Additional action by the user is required to resolve the error and let the recovery process be completed.
- SUSPECT: At least the primary filegroup is suspect and may be damaged. The database cannot be recovered during startup of SQL Server. The database is unavailable. Additional action by the user is required to resolve the problem.
- EMERGENCY: User has changed the database and set the status to EMERGENCY. The database is in single-user mode and may be repaired or restored. The database is marked READ_ONLY, logging is disabled, and access is limited to members of the sysadmin fixed server role. EMERGENCY is primarily used for troubleshooting purposes. For example, a database marked as suspect can be set to the EMERGENCY state. This could permit the system administrator read-only access to the database. Only members of the sysadmin fixed server role can set a database to the EMERGENCY state.
Recommended reading:
How to resolve the issue of a database that was in Recovery Pending mode
Troubleshooting: SCOM DW Database is in a Suspect State
Search Engine Q&A #4: Using EMERGENCY mode to access a RECOVERY PENDING or SUSPECT database
Corruption: Last resorts that people try first…
How To Repair A Suspect Database In MSSQL
Recovering a SQL Server Database from Suspect Mode