I. RDB (Redis DataBase)
The RDB persistence performs point-in-time snapshots of your dataset at specified intervals.
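At specified intervals, Redis writes a snapshot of the in-memory dataset to disk; in the jargon this is simply called a Snapshot. On recovery, the snapshot file is read straight back into memory.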
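II. How is the backup performed?
Redis forks a separate child process to do the persistence. The child first writes the data to a temporary file and, once the whole pass has finished, uses that temporary file to replace the previously persisted file. Nothing is appended to the existing file; every snapshot is a full backup. The main process performs no disk I/O for persistence during the whole procedure, which keeps performance very high. If you need to recover a large dataset and are not very sensitive to the completeness of the recovered data, RDB is more efficient than AOF.
- The drawback of RDB is that data written after the last persistence pass may be lost. Snapshots are taken at intervals, so if the server goes down, data may be lost; on a standalone instance, whatever was written since the last snapshot is lost.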
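The fork, temporary file and replace sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: JSON stands in for the real RDB format, the function names `start_snapshot` and `wait_snapshot` are made up for this example, and `os.fork` requires a Unix-like system.

```python
import json
import os
import tempfile


def start_snapshot(dataset: dict, path: str) -> int:
    """Fork a child that dumps `dataset` to `path`; return the child's pid."""
    pid = os.fork()
    if pid != 0:
        return pid  # parent: returns at once and does no disk I/O itself

    # Child: it sees the dataset exactly as it was at the moment of fork().
    exit_code = 1
    try:
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".rdb")
        with os.fdopen(fd, "w") as tmp:
            json.dump(dataset, tmp)   # full dump, not an incremental update
            tmp.flush()
            os.fsync(tmp.fileno())    # push the bytes to disk before the swap
        os.replace(tmp_path, path)    # atomically replace the previous snapshot
        exit_code = 0
    finally:
        os._exit(exit_code)           # never fall back into the parent's code


def wait_snapshot(pid: int) -> bool:
    """Wait for the child and report whether the snapshot succeeded."""
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

A write the parent makes after calling `start_snapshot` does not appear in that snapshot, which is the same window in which RDB can lose data if the server goes down before the next snapshot.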
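III. About fork (to branch, to diverge)
On Linux, fork() creates a child process identical to its parent, but the child usually goes on to call exec afterwards. For efficiency, Linux uses copy-on-write: the parent and child normally share the same physical memory, and only when the contents of a segment of the process address space are about to change is the parent's content copied for the child. Writing to disk and forking a large process both put heavy pressure on memory and can be a performance killer.
Compare this with fork on GitHub.
Fork creates a copy of the current process. All of the new process's data (variables, environment variables, program counter and so on) have the same values as in the original process, but it is an entirely new process and runs as a child of the original.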
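A small sketch of that behaviour, assuming a Unix-like system where `os.fork` is available (the function name `fork_demo` is invented for this example): the child starts with the same variable values as the parent, but a change made in the child stays in the child.

```python
import os
from typing import Tuple


def fork_demo() -> Tuple[int, int]:
    """Return (value seen by the parent, value seen by the child)."""
    counter = 100
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with counter == 100, then changes its own private copy.
        try:
            os.close(read_fd)
            counter += 1
            os.write(write_fd, str(counter).encode())
            os.close(write_fd)
        finally:
            os._exit(0)
    # Parent: read what the child reported, then reap the child.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 32).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    return counter, child_value
```

Calling `fork_demo()` returns `(100, 101)`: the parent's value is untouched. The copy-on-write mechanism itself is handled by the kernel and is not visible from this code.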
IV. RDB settings in the SNAPSHOTTING section of the configuration file
RDB save policy

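An RDB file is a compressed snapshot of the entire in-memory dataset. You can configure a compound set of snapshot trigger conditions. The defaults in the version these notes cover (check your own redis.conf, since other releases may differ) are:
- 10,000 changes within 1 minute (save 60 10000),
- or 10 changes within 5 minutes (save 300 10),
- or 1 change within 15 minutes (save 900 1).
Disabling: to turn off RDB persistence, either set no save directive at all or pass save an empty string argument.
To drop all RDB save rules at runtime: redis-cli config set save ""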
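The way several save points combine can be modelled as follows. This is a simplified sketch, not Redis's actual implementation; the names `DEFAULT_SAVE_POINTS` and `should_snapshot` are made up for this example, and the values are the ones listed in these notes.

```python
from typing import Iterable, Tuple

# (seconds, minimum number of changes), as listed in the notes above
DEFAULT_SAVE_POINTS = ((900, 1), (300, 10), (60, 10000))


def should_snapshot(
    seconds_since_last_save: int,
    changes_since_last_save: int,
    save_points: Iterable[Tuple[int, int]] = DEFAULT_SAVE_POINTS,
) -> bool:
    """True if any single save point is satisfied (the conditions are OR-ed)."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= min_changes
        for seconds, min_changes in save_points
    )
```

With an empty list of save points, `should_snapshot` never returns True, which corresponds to disabling RDB with `save ""`.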
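V. How an RDB snapshot is triggered
1. The default snapshot configuration in the configuration file

The snapshot file name is set in redis.conf (the dbfilename directive); the default is dump.rdb
2. The save and bgsave commands
SAVE: does nothing but save, on the main process; every other operation is blocked until it finishes. A performance killer.
BGSAVE (background save): Redis takes the snapshot asynchronously in the background and can still respond to client requests while it runs. The lastsave command returns the time of the last successful snapshot.
3. Running the flushall command also produces a dump.rdb file, but it is empty and therefore meaningless.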

Settings for the saved RDB file

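When Redis cannot write the snapshot to disk, it stops accepting write operations (the stop-writes-on-bgsave-error directive).
If this is not enabled, data consistency problems are likely: background save errors that are not noticed and fixed in time can end in disaster.
Incident example: during a disk cleanup on a minicomputer, the background save reported errors while the data was being backed up; the backup never actually succeeded, and data was lost.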

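When saving an RDB file, Redis can compress it, at the cost of some CPU.
For snapshots stored on disk you can choose whether they are compressed (the rdbcompression directive). If enabled, Redis compresses with the LZF algorithm. If you would rather not spend CPU on compression, turn this feature off.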

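After storing a snapshot, Redis can also verify the data with a CRC64 checksum (the rdbchecksum directive), but this adds roughly 10% performance overhead; turn it off if you want the maximum performance.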

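The directory where the RDB file is saved can also be changed (the dir directive). By default it is the directory from which Redis was started on the command line.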
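To see the settings from this section side by side, here is a sketch that reads them from a redis.conf-style fragment. The directive names (save, stop-writes-on-bgsave-error, rdbcompression, rdbchecksum, dbfilename, dir) are real redis.conf directives; the values in `SAMPLE_CONF` are only illustrative, and `parse_conf` is a hypothetical helper that handles only simple one-line directives.

```python
import shlex
from typing import Dict, List

SAMPLE_CONF = """
# SNAPSHOTTING (illustrative values, not a recommendation)
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename dump.rdb
dir ./
"""


def parse_conf(text: str) -> Dict[str, List[List[str]]]:
    """Map each directive to the list of argument lists given for it."""
    settings: Dict[str, List[List[str]]] = {}
    for line in text.splitlines():
        tokens = shlex.split(line, comments=True)  # drops '#' comments
        if not tokens:
            continue
        # A directive such as `save` may appear several times, so keep them all.
        settings.setdefault(tokens[0], []).append(tokens[1:])
    return settings
```

For example, `parse_conf(SAMPLE_CONF)["save"]` yields the three save points as string pairs, and `parse_conf('save ""')["save"]` yields a single empty-string argument, the form used to disable RDB.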
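VI. RDB backup and restore
Backup: use config get dir to find the directory that holds the RDB file, then copy the *.rdb files somewhere else.
Restore: shut down Redis, copy the backup file into the working directory, and start Redis again; the backed-up data is loaded automatically.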
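The backup step can be scripted. The sketch below assumes you already know the RDB directory (for example from config get dir); `backup_rdb` and the timestamped folder layout are choices made for this example, not anything Redis prescribes.

```python
import glob
import os
import shutil
import time
from typing import List


def backup_rdb(rdb_dir: str, backup_root: str) -> List[str]:
    """Copy every *.rdb file in rdb_dir into a new timestamped folder."""
    target = os.path.join(backup_root, time.strftime("%Y%m%d-%H%M%S"))
    os.makedirs(target, exist_ok=True)
    copied = []
    for src in sorted(glob.glob(os.path.join(rdb_dir, "*.rdb"))):
        # copy2 keeps the file's modification time along with its contents
        copied.append(shutil.copy2(src, target))
    return copied
```

Restoring is the reverse copy, done while Redis is stopped: place the chosen file back in the working directory under the configured file name and start the server.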
VII. RDB summary

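Advantages:
- Saves disk space.
- Fast recovery: the file is simply an image, so it suits large-scale data recovery.
- A good fit when the requirements for data completeness and consistency are not strict.
Disadvantages:
- Backups are taken once per configured interval, so if Redis goes down unexpectedly, every change made after the last snapshot is lost.
- Although Redis relies on copy-on-write when it forks, a very large dataset still costs noticeable CPU.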
Redis Persistence
Redis provides a different range of persistence options:
- The RDB persistence performs point-in-time snapshots of your dataset at specified intervals.
- The AOF persistence logs every write operation received by the server; these are replayed at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself, in an append-only fashion. Redis is able to rewrite the log in the background when it gets too big.
- If you wish, you can disable persistence altogether, if you want your data to exist only as long as the server is running.
- It is possible to combine both AOF and RDB in the same instance. Notice that, in this case, when Redis restarts the AOF file will be used to reconstruct the original dataset since it is guaranteed to be the most complete.
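To make "the same format as the Redis protocol itself" concrete, here is a sketch that encodes a command as a RESP array of bulk strings and appends it to a file. The helper names `encode_command` and `append_to_log` are invented for this example, and a real AOF file contains more than this bare sequence of commands.

```python
from typing import Union

Arg = Union[str, bytes]


def encode_command(*args: Arg) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]            # array header: number of arguments
    for arg in args:
        data = arg if isinstance(arg, bytes) else arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))  # byte length, then payload
    return b"".join(parts)


def append_to_log(path: str, *args: Arg) -> None:
    """Append one encoded command to the end of a log file (append-only)."""
    with open(path, "ab") as log:
        log.write(encode_command(*args))
```

Replaying such a log means reading the commands back in order and executing them again, which is how the dataset is reconstructed at startup.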
The most important thing to understand is the different trade-offs between the RDB and AOF persistence. Let's start with RDB:
RDB advantages
- RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups. For instance you may want to archive your RDB files every hour for the latest 24 hours, and to save an RDB snapshot every day for 30 days. This allows you to easily restore different versions of the data set in case of disasters.
- RDB is very good for disaster recovery: being a single compact file, it can be transferred to remote data centers or to Amazon S3 (possibly encrypted).
- RDB maximizes Redis performance since the only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest. The parent instance will never perform disk I/O or the like.
- RDB allows faster restarts with big datasets compared to AOF.
RDB disadvantages
- RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example after a power outage). You can configure different save points where an RDB is produced (for instance after at least five minutes and 100 writes against the data set, but you can have multiple save points). However you'll usually create an RDB snapshot every five minutes or more, so in case of Redis stopping working without a correct shutdown for any reason you should be prepared to lose the latest minutes of data.
- RDB needs to fork() often in order to persist on disk using a child process. fork() can be time consuming if the dataset is big, and may cause Redis to stop serving clients for some milliseconds, or even for one second if the dataset is very big and the CPU performance is not great. AOF also needs to fork(), but you can tune how often you want to rewrite your logs without any trade-off on durability.
http://redis.io/topics/persistence
