RocksDB Rate Limiter Source Code Analysis


This time we focus on one component of RocksDB: the Rate Limiter. The idea behind it is also widely used in many other systems.

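In RocksDB, compaction and flush run continuously in the background, and both write heavily to disk. The Rate Limiter caps their maximum write rate, because in some workloads a sudden burst of writes causes high read latency and degrades overall system performance.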

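The Rate Limiter is based on the token bucket algorithm: the system adds tokens to a bucket at a fixed rate (one token every 1/QPS seconds, until the bucket is full), and a write request is processed only after it obtains a token. When the bucket is empty, the request is denied, which in RocksDB means it blocks until tokens are refilled. The RocksDB implementation is in util/rate_limiter.cc.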
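
To make the token bucket idea concrete, here is a minimal sketch in C++. It is not RocksDB's code: the class name TokenBucket and its methods are hypothetical, and time is passed in explicitly (in microseconds) so the behaviour is deterministic. A real limiter would read a clock and block the caller instead of returning false.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified token bucket (not RocksDB's implementation).
class TokenBucket {
 public:
  // The bucket starts full.
  TokenBucket(int64_t tokens_per_sec, int64_t capacity)
      : tokens_per_sec_(tokens_per_sec),
        capacity_(capacity),
        available_(capacity),
        last_refill_us_(0) {
    assert(tokens_per_sec > 0 && capacity > 0);
  }

  // Consumes `n` tokens and returns true if enough are available at
  // `now_us`; otherwise leaves the bucket untouched and returns false.
  bool TryAcquire(int64_t n, int64_t now_us) {
    Refill(now_us);
    if (n > available_) return false;  // caller has to wait or give up
    available_ -= n;
    return true;
  }

  int64_t available() const { return available_; }

 private:
  // Adds the tokens earned since the last refill, capped at capacity.
  void Refill(int64_t now_us) {
    if (now_us <= last_refill_us_) return;
    const int64_t elapsed_us = now_us - last_refill_us_;
    const int64_t added = tokens_per_sec_ * elapsed_us / 1000000;
    if (added == 0) return;  // too little time passed; keep accumulating
    available_ = std::min(capacity_, available_ + added);
    last_refill_us_ = now_us;
  }

  int64_t tokens_per_sec_;
  int64_t capacity_;
  int64_t available_;
  int64_t last_refill_us_;
};
```

With 1000 tokens per second and a capacity of 100, a caller that drains the bucket at time 0 can acquire 50 more tokens after 50 ms, and no amount of idle time lets the bucket hold more than 100.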

The Rate Limiter exposes the following tunable parameters:

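  • int64_t rate_bytes_per_sec: the upper bound on the total bytes written per second by compaction and flush. In most cases this is the only parameter you need to tune.
  • int64_t refill_period_us: how often tokens are refilled. For example, if rate_bytes_per_sec is 10MB/s and refill_period_us is 100ms, then 1MB of tokens is refilled every 100ms.
  • int32_t fairness: balances high-priority and low-priority requests so that low-priority requests are not starved.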
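
The relationship between rate_bytes_per_sec and refill_period_us is plain arithmetic, sketched below. The helper names RefillBytesPerPeriod and RefillsPerSecond are hypothetical and chosen only for illustration; the sketch also ignores the overflow handling a production implementation would need for very large rates.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of tokens added to the bucket on each refill tick.
int64_t RefillBytesPerPeriod(int64_t rate_bytes_per_sec,
                             int64_t refill_period_us) {
  assert(rate_bytes_per_sec > 0 && refill_period_us > 0);
  return rate_bytes_per_sec * refill_period_us / 1000000;
}

// Number of refill ticks per second for a given period.
int64_t RefillsPerSecond(int64_t refill_period_us) {
  assert(refill_period_us > 0);
  return 1000000 / refill_period_us;
}
```

For 10MB/s (10 * 1024 * 1024 bytes) and a 100ms period this gives 1048576 bytes per tick and 10 ticks per second. A shorter period means smaller, more frequent refills (smoother writes, more wake-ups); a longer period means larger, less frequent refills (burstier writes).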

For a more detailed description, read the comments in rate_limiter.h directly:

// Create a RateLimiter object, which can be shared among RocksDB instances to
// control write rate of flush and compaction.
// @rate_bytes_per_sec: this is the only parameter you want to set most of the
// time. It controls the total write rate of compaction and flush in bytes per
// second. Currently, RocksDB does not enforce rate limit for anything other
// than flush and compaction, e.g. write to WAL.
// @refill_period_us: this controls how often tokens are refilled. For example,
// when rate_bytes_per_sec is set to 10MB/s and refill_period_us is set to
// 100ms, then 1MB is refilled every 100ms internally. Larger value can lead to
// burstier writes while smaller value introduces more CPU overhead.
// The default should work for most cases.
// @fairness: RateLimiter accepts high-pri requests and low-pri requests.
// A low-pri request is usually blocked in favor of hi-pri request. Currently,
// RocksDB assigns low-pri to request from compaction and high-pri to request
// from flush. Low-pri requests can get blocked if flush requests come in
// continuously. This fairness parameter grants low-pri requests permission by
// 1/fairness chance even though high-pri requests exist to avoid starvation.
// You should be good by leaving it at default 10.
// @mode: Mode indicates which types of operations count against the limit.
// @auto_tuned: Enables dynamic adjustment of rate limit within the range
// `[rate_bytes_per_sec / 20, rate_bytes_per_sec]`, according to
// the recent demand for background I/O.
extern RateLimiter* NewGenericRateLimiter(
    int64_t rate_bytes_per_sec, int64_t refill_period_us = 100 * 1000,
    int32_t fairness = 10,
    RateLimiter::Mode mode = RateLimiter::Mode::kWritesOnly,
    bool auto_tuned = false);
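
The fairness rule in the comment above ("permission by 1/fairness chance") can be read as a random choice of which queue is served first in each refill round. The sketch below illustrates that reading only. The names Priority, PickFirstQueue and LowFirstFraction are hypothetical, and the real scheduling logic in RocksDB may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

enum class Priority { kLow, kHigh };

// Decides which queue is drained first in one refill round. With
// probability 1/fairness the low-priority queue goes first; otherwise the
// high-priority queue does.
template <typename Rng>
Priority PickFirstQueue(int32_t fairness, Rng& rng) {
  assert(fairness >= 1);
  std::uniform_int_distribution<int32_t> dist(0, fairness - 1);
  return dist(rng) == 0 ? Priority::kLow : Priority::kHigh;
}

// Helper for experiments: the fraction of `rounds` in which the
// low-priority queue was picked first.
double LowFirstFraction(int32_t fairness, int rounds, uint32_t seed) {
  assert(rounds > 0);
  std::mt19937 rng(seed);
  int low_first = 0;
  for (int i = 0; i < rounds; ++i) {
    if (PickFirstQueue(fairness, rng) == Priority::kLow) ++low_first;
  }
  return static_cast<double>(low_first) / rounds;
}
```

With the default fairness of 10, low-priority requests go first in roughly one round out of ten, so a continuous stream of flush requests cannot block compaction indefinitely. A fairness of 1 would always serve the low-priority queue first.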

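Note the bool auto_tuned parameter: it enables the auto-tuned Rate Limiter module that ships with RocksDB. The write rate cap rate_bytes_per_sec is hard to tune by hand. Set it too high and the limiter has no effect; set it too low and a large number of writes cannot make progress. RocksDB therefore provides this module to adjust the limit automatically. When it is enabled, rate_bytes_per_sec becomes the upper bound of the rate limit itself ("In this case rate_bytes_per_sec will indicate the upper-bound of the window within which a rate limit will be picked dynamically."). The auto tuner then periodically monitors the I/O write volume and raises or lowers the current limit accordingly, by resetting rate_bytes_per_sec_ and refill_bytes_per_period_. Benchmark results reported for this feature show that the auto-tuned Rate Limiter effectively reduces the size of sudden write spikes.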
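
A single tuning step might look like the sketch below. It is illustrative only: the function name TuneRate, the drained_pct input (the percentage of recent refill intervals in which the limiter ran out of tokens), and the watermark and step constants are all assumptions made for this example. The only value taken from the header comment is the allowed range [rate_bytes_per_sec / 20, rate_bytes_per_sec].

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// One hypothetical tuning step. Returns the new rate limit, kept within
// [max_rate / 20, max_rate].
int64_t TuneRate(int64_t current_rate, int64_t max_rate, int drained_pct) {
  assert(current_rate > 0 && max_rate > 0);
  assert(drained_pct >= 0 && drained_pct <= 100);
  // Illustrative thresholds, not confirmed RocksDB values.
  const int kLowWatermarkPct = 50;
  const int kHighWatermarkPct = 90;
  const int kAdjustPct = 5;
  const int64_t min_rate = max_rate / 20;

  int64_t next = current_rate;
  if (drained_pct < kLowWatermarkPct) {
    // Tokens were rarely exhausted: demand is low, so tighten the limit.
    next = current_rate * 100 / (100 + kAdjustPct);
  } else if (drained_pct > kHighWatermarkPct) {
    // Tokens were almost always exhausted: demand is high, so relax it.
    next = current_rate * (100 + kAdjustPct) / 100;
  }
  return std::max(min_rate, std::min(max_rate, next));
}
```

Calling such a function periodically moves the limit up under sustained demand and down when background I/O is quiet, while the clamp keeps it from leaving the configured window.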

 


 

Using the Rate Limiter

A Rate Limiter is created with the NewGenericRateLimiter function. You can create a separate one for each RocksDB instance, or share a single one across all instances to control the aggregated write rate of flush and compaction.

RateLimiter* rate_limiter = NewGenericRateLimiter(
    rate_bytes_per_sec /* int64_t */,
    refill_period_us /* int64_t */,
    fairness /* int32_t */);

The three parameters are the ones described above.

Tokens are refilled at the interval set by refill_period_us, but the number of bytes that can be granted in a single burst still has to be bounded. Otherwise tokens could accumulate for a long time and then be consumed by one large request, which defeats the purpose of rate limiting. GetSingleBurstBytes() returns this upper bound.
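
One consequence of this bound is that a caller with a large write has to break it into pieces no larger than the single-burst limit and request tokens for each piece. The sketch below shows only that splitting step. SplitIntoBursts is a hypothetical helper, and burst_bytes stands in for whatever GetSingleBurstBytes() returns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Splits a write of `total_bytes` into request sizes that never exceed
// `burst_bytes`. Each element would be passed to one token request.
std::vector<int64_t> SplitIntoBursts(int64_t total_bytes,
                                     int64_t burst_bytes) {
  assert(total_bytes >= 0 && burst_bytes > 0);
  std::vector<int64_t> chunks;
  while (total_bytes > 0) {
    const int64_t n = std::min(total_bytes, burst_bytes);
    chunks.push_back(n);
    total_bytes -= n;
  }
  return chunks;
}
```

A 2500-byte write with a 1000-byte burst limit becomes three requests of 1000, 1000 and 500 bytes, so no single request can drain more than one burst's worth of tokens.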

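Before each write, the caller must request tokens. If the request cannot be satisfied right away (that is, it is being throttled), the call blocks until enough tokens have been refilled to cover it.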

// block if tokens are not enough
rate_limiter->Request(1024 /* bytes */, rocksdb::Env::IO_HIGH);

// perform a write operation (Flush takes a FlushOptions argument)
Status s = db->Flush(rocksdb::FlushOptions());

The rate can also be changed at runtime with SetBytesPerSecond(int64_t bytes_per_second).

 

 

 


