LevelDB Snapshots Explained


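  To understand LevelDB snapshots, you first need to understand the SequenceNumber. Every inserted entry is assigned a SequenceNumber, and the number grows monotonically: inserting key1, key2, key3, key4 one at a time gives them the SequenceNumbers 1, 2, 3, 4. It is not always that simple, though. When writes are merged into a batch, for example key1, then a batch of key2, key3, key4, then key5, key1 gets SequenceNumber 1, the batch is stamped with SequenceNumber 2 as its starting number (its entries take 2, 3, 4 in order), and key5 gets SequenceNumber 5.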
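  The numbering above can be sketched as follows. This is a minimal illustration, not LevelDB's code: `Batch` and `AssignSequences` are hypothetical names, and the sketch assumes each batch is stamped with `last_sequence + 1` and that the counter then advances by the number of entries in the batch, which is why key5 ends up at 5.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using SequenceNumber = uint64_t;

// Hypothetical stand-in for a write batch: an ordered list of keys.
using Batch = std::vector<std::string>;

// Assigns a sequence number to every key. Assumption: a batch starts at
// last_sequence + 1 and each entry inside it takes the next number.
std::vector<std::pair<std::string, SequenceNumber>> AssignSequences(
    const std::vector<Batch>& batches, SequenceNumber last_sequence = 0) {
  std::vector<std::pair<std::string, SequenceNumber>> out;
  for (const Batch& batch : batches) {
    SequenceNumber seq = last_sequence + 1;  // number stamped on the batch
    for (const std::string& key : batch) {
      out.emplace_back(key, seq++);
    }
    last_sequence += batch.size();  // advance by the batch's entry count
  }
  return out;
}
```

  Feeding it the batches {key1}, {key2, key3, key4}, {key5} reproduces the example: the second batch starts at 2 and the last key lands on 5.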

  A key-value pair is inserted into the memtable in the following format:

  internal_key_size | internal_key | value_size | value

  The internal_key itself carries the SequenceNumber. Its format is:

  key | SequenceNumber | type (value type)

  In other words, the SequenceNumber is stored together with every key-value pair.
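  A small sketch of how such an internal key could be built and parsed. The field widths are not given in this article; the sketch assumes, based on my reading of LevelDB's dbformat code, that the sequence number and type share one 8-byte little-endian tag (56 bits of sequence, 8 bits of type) appended to the user key. `MakeInternalKey` and `ExtractSequence` are illustrative helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

using SequenceNumber = uint64_t;
enum ValueType : uint8_t { kTypeDeletion = 0x0, kTypeValue = 0x1 };

// Packs the sequence number (upper 56 bits) and the type (low 8 bits)
// into a single 64-bit tag. Assumed layout, see the lead-in.
uint64_t PackSequenceAndType(SequenceNumber seq, ValueType t) {
  return (seq << 8) | t;
}

// Builds user_key followed by the 8-byte little-endian tag.
std::string MakeInternalKey(const std::string& user_key, SequenceNumber seq,
                            ValueType t) {
  std::string result = user_key;
  const uint64_t tag = PackSequenceAndType(seq, t);
  for (int i = 0; i < 8; ++i) {
    result.push_back(static_cast<char>((tag >> (8 * i)) & 0xff));
  }
  return result;
}

// Recovers the sequence number from the trailing 8 bytes.
// Precondition: internal_key.size() >= 8.
SequenceNumber ExtractSequence(const std::string& internal_key) {
  uint64_t tag = 0;
  const std::size_t n = internal_key.size();
  for (int i = 0; i < 8; ++i) {
    const auto byte = static_cast<unsigned char>(internal_key[n - 8 + i]);
    tag |= static_cast<uint64_t>(byte) << (8 * i);
  }
  return tag >> 8;  // drop the type byte
}
```

  Because the tag travels with the key, any reader of the entry can recover the sequence number without consulting a separate index.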

  

  Next, let's look at the snapshot API. The interface and its implementation are as follows:

const Snapshot* DBImpl::GetSnapshot() {
  MutexLock l(&mutex_);
  return snapshots_.New(versions_->LastSequence());
}

void DBImpl::ReleaseSnapshot(const Snapshot* s) {
  MutexLock l(&mutex_);
  snapshots_.Delete(reinterpret_cast<const SnapshotImpl*>(s));
}

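  snapshots_ is a doubly linked list that tracks the live snapshots. Each time a snapshot is acquired, a new snapshot object is created with the current SequenceNumber and inserted into the list. When a snapshot is released, it is removed from the list.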
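  To make the list mechanics concrete, here is a minimal sketch of such a doubly linked list. It is an illustration under assumptions, not LevelDB's SnapshotList: `SnapshotNode` is a hypothetical node type, and the sketch uses a circular list with a dummy head so that the oldest snapshot is always `head_.next`.

```cpp
#include <cassert>
#include <cstdint>

using SequenceNumber = uint64_t;

// Hypothetical node type: one live snapshot pinned at `number`.
struct SnapshotNode {
  SequenceNumber number = 0;
  SnapshotNode* prev = nullptr;
  SnapshotNode* next = nullptr;
};

// Circular doubly linked list with a dummy head; oldest first.
class SnapshotList {
 public:
  SnapshotList() { head_.prev = &head_; head_.next = &head_; }
  ~SnapshotList() { while (!empty()) Delete(oldest()); }
  SnapshotList(const SnapshotList&) = delete;
  SnapshotList& operator=(const SnapshotList&) = delete;

  bool empty() const { return head_.next == &head_; }
  const SnapshotNode* oldest() const { return head_.next; }
  const SnapshotNode* newest() const { return head_.prev; }

  // Appends a snapshot at the tail (the newest position).
  const SnapshotNode* New(SequenceNumber seq) {
    SnapshotNode* s = new SnapshotNode;
    s->number = seq;
    s->next = &head_;
    s->prev = head_.prev;
    s->prev->next = s;
    s->next->prev = s;
    return s;
  }

  // Unlinks and frees a snapshot previously returned by New().
  void Delete(const SnapshotNode* s) {
    s->prev->next = s->next;
    s->next->prev = s->prev;
    delete s;
  }

 private:
  SnapshotNode head_;  // dummy node, never holds a real snapshot
};
```

  Since sequence numbers only grow, appending at the tail keeps the list ordered, so the oldest snapshot can be found in constant time.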

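  So how is the data that a snapshot can see kept from being deleted? In LevelDB, the only place where data is actually removed is compaction, so let's look at the core of DBImpl::DoCompactionWork: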

 1 Status DBImpl::DoCompactionWork(CompactionState* compact) {
 2   //...................
 3   if (snapshots_.empty()) {
 4     compact->smallest_snapshot = versions_->LastSequence();
 5   } else {
 6     compact->smallest_snapshot = snapshots_.oldest()->number_;
 7   }
 8 
 9   // Release mutex while we're actually doing the compaction work
10   mutex_.Unlock();
11 
12   Iterator* input = versions_->MakeInputIterator(compact->compaction);
13   input->SeekToFirst();
14   Status status;
15   ParsedInternalKey ikey;
16   std::string current_user_key;
17   bool has_current_user_key = false;
18   SequenceNumber last_sequence_for_key = kMaxSequenceNumber;
19   for (; input->Valid() && !shutting_down_.Acquire_Load(); ) {
20     //..............................
21     // Handle key/value, add to state, etc.
22     bool drop = false;
23     if (!ParseInternalKey(key, &ikey)) {
24       // Do not hide error keys
25       current_user_key.clear();
26       has_current_user_key = false;
27       last_sequence_for_key = kMaxSequenceNumber;
28     } else {
29       if (!has_current_user_key ||
30           user_comparator()->Compare(ikey.user_key,
31                                      Slice(current_user_key)) != 0) {
32         // First occurrence of this user key
33         current_user_key.assign(ikey.user_key.data(), ikey.user_key.size());
34         has_current_user_key = true;
35         last_sequence_for_key = kMaxSequenceNumber;
36       }
37 
38       if (last_sequence_for_key <= compact->smallest_snapshot) {
39         // Hidden by an newer entry for same user key
40         drop = true;    // (A)
41       } else if (ikey.type == kTypeDeletion &&
42                  ikey.sequence <= compact->smallest_snapshot &&
43                  compact->compaction->IsBaseLevelForKey(ikey.user_key)) {
44         // For this user key:
45         // (1) there is no data in higher levels
46         // (2) data in lower levels will have larger sequence numbers
47         // (3) data in layers that are being compacted here and have
48         //     smaller sequence numbers will be dropped in the next
49         //     few iterations of this loop (by rule (A) above).
50         // Therefore this deletion marker is obsolete and can be dropped.
51         drop = true;
52       }
53 
54       last_sequence_for_key = ikey.sequence;
55     }
56 
57     if (!drop) {
58     //..............................
59     }
60 
61     input->Next();
62   }
63 }

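  On line 6, compact->smallest_snapshot is set to the SequenceNumber of the oldest snapshot (if there are no snapshots, line 4 uses the latest SequenceNumber instead). An iterator over the compaction inputs is then created. For a single key, key_a, the iteration may encounter its entries in the following order, newest first:

  (key_a, value5) -> (key_a, value4) -> (key_a, value3) -> (key_a, value2) -> (key_a, value1)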

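  When the iteration reaches (key_a, value5), lines 33-35 run because this is the first occurrence of the user key, and line 54 then sets last_sequence_for_key to the sequence number of (key_a, value5). On the next step, at (key_a, value4), last_sequence_for_key is compared with compact->smallest_snapshot. If last_sequence_for_key is less than or equal to compact->smallest_snapshot, the newer entry (key_a, value5) is already visible to the oldest snapshot, so no snapshot can ever need (key_a, value4), and it can be dropped during compaction. Otherwise, (key_a, value4) can still be dropped when three conditions all hold: it is a deletion, its sequence number is no greater than the oldest snapshot's SequenceNumber, and no level higher than the ones being compacted contains the same key. In every other case the entry must not be dropped.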
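  The drop rule can be tried out in isolation with a small simulation. This sketch mirrors the two branches of the listing for the versions of a single user key; `Entry` and `SurvivingVersions` are hypothetical names, and `is_base_level` stands in for the IsBaseLevelForKey check.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

using SequenceNumber = uint64_t;
const SequenceNumber kMaxSeq = std::numeric_limits<uint64_t>::max();

// Hypothetical record: one version of a single user key.
struct Entry {
  SequenceNumber sequence;
  bool is_deletion;
};

// `entries` holds every version of ONE user key, newest first.
// Returns the sequence numbers that survive the compaction.
std::vector<SequenceNumber> SurvivingVersions(
    const std::vector<Entry>& entries, SequenceNumber smallest_snapshot,
    bool is_base_level) {
  std::vector<SequenceNumber> kept;
  SequenceNumber last_sequence_for_key = kMaxSeq;  // first occurrence
  for (const Entry& e : entries) {
    bool drop = false;
    if (last_sequence_for_key <= smallest_snapshot) {
      // Rule (A): a newer version is already visible to every snapshot.
      drop = true;
    } else if (e.is_deletion && e.sequence <= smallest_snapshot &&
               is_base_level) {
      // Obsolete deletion marker: nothing older exists in higher levels.
      drop = true;
    }
    last_sequence_for_key = e.sequence;
    if (!drop) kept.push_back(e.sequence);
  }
  return kept;
}
```

  With five versions numbered 5 down to 1 and the oldest snapshot at 3, versions 5, 4, and 3 are kept and versions 2 and 1 are dropped. With no snapshot held (smallest_snapshot equal to the latest sequence, 5), only version 5 survives.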

  This compaction logic is what lets an old snapshot keep reading the old value, unaffected by later updates, which is exactly the purpose of a snapshot.

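  On Get, a snapshot can be passed in through the read options (ReadOptions::snapshot). During the lookup, the seek skips every key-value pair whose SequenceNumber is greater than the snapshot's, so the read returns the value as of the snapshot rather than a later one.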
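  The visibility rule on the read path can be sketched the same way. This is a simplified model, not LevelDB's Get: `Version` and `GetAtSnapshot` are hypothetical names, and a linear scan over the versions of one key stands in for the actual seek.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using SequenceNumber = uint64_t;

// Hypothetical record: one stored version of a single user key.
struct Version {
  SequenceNumber sequence;
  bool is_deletion;
  std::string value;
};

// `versions` holds every version of ONE user key, newest first.
// Returns true and fills *value if the key exists as of `snapshot`.
bool GetAtSnapshot(const std::vector<Version>& versions,
                   SequenceNumber snapshot, std::string* value) {
  for (const Version& v : versions) {
    if (v.sequence > snapshot) continue;  // written after the snapshot
    if (v.is_deletion) return false;      // key was deleted at that point
    *value = v.value;
    return true;
  }
  return false;  // key did not exist yet
}
```

  The first version at or below the snapshot's SequenceNumber decides the result, which is why the compaction logic must keep that version alive for as long as the snapshot is held.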

