先來看看這幾個參數在官方文檔中定義是怎樣的
Property |
Description |
Parameter type |
Integer |
Default value |
0 |
Modifiable |
ALTER SYSTEM |
Range of values |
0 to 3600 seconds |
Basic |
No |
Real Application Clusters |
Multiple instances can have different values, and you can change the values at runtime. |
FAST_START_MTTR_TARGET enables you to specify the number of seconds the database takes to perform crash recovery of a single instance. When specified, FAST_START_MTTR_TARGET is overridden by LOG_CHECKPOINT_INTERVAL.
Property |
Description |
Parameter type |
Integer |
Default value |
0 |
Modifiable |
ALTER SYSTEM |
Range of values |
0 to 231 - 1 |
Basic |
No |
Real Application Clusters |
Multiple instances can have different values. |
LOG_CHECKPOINT_INTERVAL specifies the frequency of checkpoints in terms of the number of redo log file blocks that can exist between an incremental checkpoint and the last block written to the redo log. This number refers to physical operating system blocks, not database blocks.
Regardless of this value, a checkpoint always occurs when switching from one online redo log file to another. Therefore, if the value exceeds the actual redo log file size, checkpoints occur only when switching logs. Checkpoint frequency is one of the factors that influence the time required for the database to recover from an unexpected failure.
Notes:
· Specifying a value of 0 (zero) for LOG_CHECKPOINT_INTERVAL has the same effect as setting the parameter to infinity and causes the parameter to be ignored. Only nonzero values of this parameter are considered meaningful.
· Recovery I/O can also be limited by setting the LOG_CHECKPOINT_TIMEOUT parameter or by the size specified for the smallest redo log. For information on which mechanism is controlling checkpointing behavior, query the V$INSTANCE_RECOVERY view.
Property |
Description |
Parameter type |
Integer |
Default value |
1800 |
Modifiable |
ALTER SYSTEM |
Range of values |
0 to 231 - 1 |
Basic |
No |
Real Application Clusters |
Multiple instances can have different values. |
LOG_CHECKPOINT_TIMEOUT specifies (in seconds) the amount of time that has passed since the incremental checkpoint at the position where the last write to the redo log (sometimes called the tail of the log) occurred. This parameter also signifies that no buffer will remain dirty (in the cache) for more than integer seconds.
Specifying a value of 0 for the timeout disables time-based checkpoints. Hence, setting the value to 0 is not recommended unless FAST_START_MTTR_TARGET is set.
Notes:
· A checkpoint scheduled to occur because of this parameter is delayed until the completion of the previous checkpoint if the previous checkpoint has not yet completed.
· Recovery I/O can also be limited by setting the LOG_CHECKPOINT_INTERVAL parameter or by the size specified for the smallest redo log. For information on which mechanism is controlling checkpointing behavior, query the V$INSTANCE_RECOVERY view.
實際上還有一個參數FAST_START_IO_START,不過我在我的10G官方文檔Reference中已經找不到了,估計已經不支持了吧。在一個帖子里看到過一個朋友寫過關於FAST_START_IO_START、LOG_CHECKPOINT_INTERVAL、LOG_CHECKPOINT_TIMEOUT這三者的區別,覺得挺好的,在這里引用一下。
#########################################################################
首先明確幾個概念:
1.Data block 是包含幾個os block,也就是一對多的關系。
2.Data files 中的是 data block
Redo log files 中的 redo block 是os block
3.Data block 中記錄的是完整的信息
Redo block 中記錄的是最簡單的信息
在8i 以前的版本,只有兩個參數用來影響recovery,那就是log_checkpoint_interval和log_checkpoint_time。以log_checkpoint_interval=10000舉例。意思很簡單,就是經過10000個redo block后就引發checkpoint,這樣恢復時,就絕對是在這10000個redo block中。
但是這樣做有個缺點,就是因為redo block(os block)不等於data block。假如datablock中的數據很小,比如1,修改成2,那么10000個redo block中包含的信息卻可以是遠遠大於10000個data block,假如data block中的數據很大,比如123456789,修改成987654321,那么10000個redo block中包含的信息卻可以是遠遠小於10000個data block。
這樣,在恢復時,盡管是讀10000個redo blcok,但是這個10000個redo block中所包含的data block有可能是很少,也有可能是很多,很難把握恢復的時間。
於是,在8i中就引進了fast_start_io_target這個參數來彌補這個不足。計算機自動計算redo block中所包含的data block的多少。例如fast_start_io_target=10000,t1(第一次checkpoint),在t1后有5000個redo block記錄了,但是這個5000個redo block中只包含了7000個data block,那么redo block繼續記錄,到了redo block到8000個時,計算機發現這8000個redo block中包含了10000個data block,那么就引起t2(第二次checkpoint),凡是在t1和t2之間的任何crash,都可以保證recovery的時間肯定在這 10000個data block的讀寫時間之內(因為1個data block的I/O 是可以估算的)
所以說,設置fast_start_io_target比設置log_checkpoint_interval,log_checkpoint_time這個兩個參數更為准確。假如3個參數一起設置,那么只要達到任一參數值時,就引發checkpoint。
例如:fast_start_io_target=10000,log_checkpoint_interval=10000,log_checkpoint_time=1800。
在redo block=10000時,所包含的data block卻只有8000,就觸發log_checkpoint_interval
在redo block=8000時,所包含的data block 卻有10000,就觸發fast_start_io_target
#########################################################################
他這里說說的log_checkpoint_time估計就是log_checkpoint_timeout,因為我在文檔里也查不到log_checkpoint_time了,呵呵。
而現在一般都推薦使用FAST_START_MTTR_TARGET參數,直接將ORACLE在恢復的過程中花費的時間限定在一個特定的時間段內。
然而我在查看我數據庫中此參數的時候又產生了新的疑惑
SQL> show parameter mttr
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
fast_start_mttr_target integer 0
SQL>
為什么我的fast_start_mttr_target 會是0呢?如果是0的話,那么我數據庫啟動時進行恢復的時間又是被哪種機制來控制的呢?總不能無限期長時間恢復吧?上網搜到EYGLE寫過的一篇文章,這里涉及到一個新的概念:自動調整的檢查點
下面摘取文章中部分內容看看
#########################################################################
從Oracle10gR2開始,數據庫可以實現自動調整的檢查點.
使用自動調整的檢查點,Oracle數據庫可以利用系統的低I/O負載時段寫出內存中的臟數據,從而提高數據庫的效率。
因此,即使數據庫管理員設置了不合理的檢查點相關參數,Oracle仍然能夠通過自動調整將數據庫的Crash Recovery時間控制在合理的范圍之內。
當FAST_START_MTTR_TARGET參數未設置時,自動檢查點調整生效。
通常,如果我們必須嚴格控制實例或節點恢復時間,那么我們可以設置FAST_START_MTTR_TARGET為期望時間值;如果恢復時間不需要嚴格控制,那么我們可以不設置FAST_START_MTTR_TARGET參數,從而啟用Oracle10g的自動檢查點調整特性。
當取消FAST_START_MTTR_TARGET參數設置之后:
SQL> show parameter fast_start_mttr NAME TYPE VALUE |
在啟動數據庫的時候,我們可以從alert文件中看到如下信息:
Wed Jan 11 16:28:12 2006 |
檢查v$instance_recovery視圖,我們可以發現Oracle10g中的改變:
SQL> select RECOVERY_ESTIMATED_IOS REIOS,TARGET_MTTR TMTTR, REIOS TMTTR EMTTR WMTTR WOSET CKPTBW WAUTO WFTCKPT |
在以上視圖中,WRITES_AUTOTUNE字段值就是指由於自動調整檢查點執行的寫出次數,
而CKPT_BLOCK_WRITES指的則是由於檢查點寫出的Block的數量。
#########################################################################
了解了吧,這里雖然我的FAST_START_MTTR_TARGET沒有設置,值為0,可是ORACLE是會通過自動檢查點調整特性來控制啟動時恢復的時間的,使它不會太久。
文檔中還有兩個相關的參數,既然寫了,在這里也順便提一下吧。
Property |
Description |
Parameter type |
Boolean |
Default value |
false |
Modifiable |
ALTER SYSTEM |
Range of values |
true | false |
Basic |
No |
LOG_CHECKPOINTS_TO_ALERT lets you log your checkpoints to the alert file. Doing so is useful for determining whether checkpoints are occurring at the desired frequency.
Property |
Description |
Parameter type |
String |
Syntax |
FAST_START_PARALLEL_ROLLBACK = { HI | LO | FALSE } |
Default value |
LOW |
Modifiable |
ALTER SYSTEM |
Basic |
No |
FAST_START_PARALLEL_ROLLBACK determines the maximum number of processes that can exist for performing parallel rollback. This parameter is useful on systems in which some or all of the transactions are long running.
Values:
- FALSE indicates that parallel rollback is disabled
- LOW limits the number of rollback processes to 2 * CPU_COUNT
- HIGH limits the number of rollback processes to 4 * CPU_COUNT
第一個參數LOG_CHECKPOINTS_TO_ALERT就不說了,我們來看看第二個參數FAST_START_PARALLEL_ROLLBACK。
講到這個參數,就不能不談一下INSTANCE RECOVERY的過程。
Instance and crash recovery occur in two steps: cache recovery followed by transaction recovery.
看一下CACHE RECOVERY都做了些什么
The database can be opened as soon as cache recovery completes, so improving the performance of cache recovery is important for increasing availability.
The duration of cache recovery processing is determined by two factors: the number of data blocks that have changes at SCNs higher than the SCN of the checkpoint, and the number of log blocks that need to be read to find those changes.
Frequent checkpointing writes dirty buffers to the datafiles more often than otherwise, and so reduces cache recovery time in the event of an instance failure. If checkpointing is frequent, then applying the redo records in the redo log between the current checkpoint position and the end of the log involves processing relatively few data blocks. This means that the cache recovery phase of recovery is fairly short.
However, in a high-update system, frequent checkpointing can reduce runtime performance, because checkpointing causes DBWn processes to perform writes.
從這里可以看到,CHECKPOINT也並不是越頻繁越好,當然,也不是間隔越久越好,關鍵是要根據系統的具體情況來把握一個合理的“度”。
Cache Recovery (Rolling Forward)
During the cache recovery step, Oracle applies all committed and uncommitted changes in the redo log files to the affected data blocks. The work required for cache recovery processing is proportional to the rate of change to the database (update transactions each second) and the time between checkpoints.
Transaction Recovery (Rolling Back)
To make the database consistent, the changes that were not committed at the time of the crash must be undone (in other words, rolled back). During the transaction recovery step, Oracle applies the rollback segments to undo the uncommitted changes.
ORACLE在恢復的過程中,首先讀取日志,從最后完成的檢查點開始,應用所有重做記錄,這個過程成為前滾。也就是CACHE RECOVERY過程,完成前滾之后,就可以打開數據庫提供訪問和使用了。
此后進入實例恢復的第二階段,ORACLE回滾未提交的事務,也就是TRANSACTION RECOVERY。ORACLE使用兩個特點來增加這個恢復階段的效率,這兩個特點是FAST-START ON-DEMAND ROLLBACK和FAST-START PARALLEL ROLLBACK(這些特點是FAST-START FAULT RECOVERY的組成部分,僅在oracle 8i之后的企業版中可用)。
使用FAST-START ON-DEMAND ROLLBACK特點,ORACLE自動允許在數據庫打開之后開始新的事務,這通常只需要很短的CACHE RECOVERY時間。如果一個用戶試圖訪問被異常中止進程鎖定的記錄,ORACLE回滾那些新事務請求的記錄,也就是說,因需求而回滾。因而,新事務不需要等待漫長的事務回滾時間。在FAST-START ON-DEMAND ROLLBACK中,后台進程SMON充當一個調度員,使用多個服務器進程並行回滾一個事務集。
FAST-START PARALLEL ROLLBACK主要對於長時間運行的未提交事務有效,尤其是並行INSERT,UPDATE和DELETE操作。SMON自動決定何時開始並行回滾並且自動在多個進程之間分散工作。
FAST-START PARALLEL ROLLBACK的一個特殊形式是內部事務恢復(INTRA-TRANSACTION RECOVERY)。在內部事務恢復中,一個大的事務可以被拆分,分配給幾個服務器進程並行回滾。這個時候就可以通過我們前面說到的那個FAST_START_PARALLEL_ROLLBACK來控制並行回滾。
前面還涉及到一個概念,FAST-START FAULT RECOVERY,在這里引用網上一個朋友的文章簡單的解釋一下。
######################################################
從Oracle8i開始,Oracle在企業版中引入了Fast-Start Fault Recovery選項。
該選項包含三個主要增強:
1.Fast-Start Checkpointing
2.Fast-Start On-Demand Rollback
3.Fast-Start Parallel Rollback.
這三個選項,都是為了加快系統在故障后的恢復,提高系統的可用性。
從v$option視圖中,我們可以找到這個選項:
SQL> select * from v$version where rownum <2;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.4.0 - Production
SQL> select * from v$option
2 where Parameter='Fast-Start Fault Recovery';
PARAMETER VALUE
-------------------------------------------------- ---------------
Fast-Start Fault Recovery TRUE
######################################################
在使用FAST_START_MTTR_TARGET的時候需要注意的事情
You must disable or remove the FAST_START_IO_TARGET, LOG_CHECKPOINT_INTERVAL, and LOG_CHECKPOINT_TIMEOUT initialization parameters when using FAST_START_MTTR_TARGET. Setting these parameters interferes with the mechanisms used to manage cache recovery time to meet FAST_START_MTTR_TARGET.
Practical Values for FAST_START_MTTR_TARGET
The maximum value for FAST_START_MTTR_TARGET is 3600 seconds (one hour). If you set the value to more than 3600, then Oracle rounds it to 3600.