Oracle的實例恢復解析


在數據庫服務器異常斷電重啟后,數據庫會進行實例恢復,那么實例恢復的過程中Oracle做了什么操作呢?參考官網在這里做一下解釋,菜鳥水平有限,歡迎勘正。

首先說下實例恢復的定義:

Instance recovery is the process of applying records in the online redo log to data files to reconstruct changes made after the most recent checkpoint. Instance recovery occurs automatically when an administrator attempts to open a database that was previously shut down inconsistently.

Oracle Database performs instance recovery automatically in the following situations:

  • The database opens for the first time after the failure of a single-instance database or all instances of an Oracle RAC database. This form of instance recovery is also called crash recovery. Oracle Database recovers the online redo threads of the terminated instances together.

  • Some but not all instances of an Oracle RAC database fail. Instance recovery is performed automatically by a surviving instance in the configuration.

The SMON background process performs instance recovery, applying online redo automatically. No user intervention is required.

因此我們知道非一致性關閉會引發實例恢復(一致性關閉不會,參考shutdown immediate的官方定義)同時RAC節點宕機也會在一個存活節點進行實例恢復,其過程就是重構內存中的臟塊並提交,同時對未提交的做出回滾,這個過程由smon后台進程負責。

實例恢復分兩階段:
1.前滾:Rolling Forward 
Oracle根據redo日志中的記載,重現checkpoint之后的臟塊隊列:
    1)對於提交的事務,根據日志進行內存中的臟塊重現,然后進行commit,並按正常機制正常寫入磁盤。
    2)對於未提交的事務,也根據redo進行臟塊重現(為何會有未提交的事務日志被寫入磁盤呢?因為日志的寫入是按時間排序的,一些已提交事務的寫日志操作會引發之前的一些未提交事務日志的寫入),對以此類臟塊只是重現,oracle在此階段完全不對此類臟塊做其他操作。
由於一些未提交大事務的更改已經被寫入磁盤(但依然會保持嚴格的日志先寫機制),以及前滾過程中生成的未提交事務的臟塊,oracle必須進行第二步的回滾。
2.回滾:Rolling Back
   對於所有未提交的臟塊,oracle根據undo的前鏡像進行回滾(行級別的邏輯反操作),重新將內存中緩存的相關數據臟塊換為非臟塊,並寫入磁盤上的data block。

這里上一幅官網的圖:

圖:Basic Instance Recovery Steps: Rolling Forward and Rolling Back

圖解:
我們看到實例恢復前redo日志中記載的日志對應着四種更改塊(redo只記載更改):
1)已提交且被寫入磁盤的更改塊,oracle對這種塊無需做任何操作。
2)已提交但未被寫入磁盤的更改塊,oracle會在前滾過程中在內存重現臟塊,然后按正常機制提交。
3)未提交且未被寫入磁盤的更改塊。
4)未提交但已被寫入磁盤的更改塊。
由於回滾是按事務為單位進行處理的,因此對於3、4兩種塊的處理全部是在回滾階段,oracle根據undo進行所有未提交事務的回滾操作,用前鏡像覆蓋掉磁盤or內存中的數據,這樣就會處理掉第3、4種塊。
此外,從上不難看出oracle默認undo中記載的事務進度是和redo日志中的完全一致的,不存在undo記載了事務被提交但是redo日志記錄未提交的情況。
但是並不是所有情況都符合Oracle默認的預期,有時候數據庫頻繁掉電就可能出現無法成功進行實例恢復的情況,此時只能采取一些特殊手段對數據文件頭和SCN做一些改動。
一般除非特別緊急的狀況,否則不要用BBED、強制推進SCN等”偏方“去打開數據庫,對於一個成功的DBA來說,做好備份和災備永遠是最重要的工作。

Instance Recovery Phases
The first phase of instance recovery is called cache recovery or rolling forward, and involves reapplying all of the changes recorded in the online redo log to the data files. Because rollback data is recorded in the online redo log, rolling forward also regenerates the corresponding undo segments.

Rolling forward proceeds through as many online redo log files as necessary to bring the database forward in time. After rolling forward, the data blocks contain all committed changes recorded in the online redo log files. These files could also contain uncommitted changes that were either saved to the data files before the failure, or were recorded in the online redo log and introduced during cache recovery.

After the roll forward, any changes that were not committed must be undone. Oracle Database uses the checkpoint position, which guarantees that every committed change with an SCN lower than the checkpoint SCN is saved on disk. Oracle Database applies undo blocks to roll back uncommitted changes in data blocks that were written before the failure or introduced during cache recovery. This phase is called rolling back or transaction recovery.

 


免責聲明!

本站轉載的文章為個人學習借鑒使用,本站對版權不負任何法律責任。如果侵犯了您的隱私權益,請聯系本站郵箱yoyou2525@163.com刪除。



 
粵ICP備18138465號   © 2018-2025 CODEPRJ.COM