New in Kafka 1.1: Migrating Data Between Log Directories


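  People often ask why disk usage is so uneven across the disks of a production Kafka machine, frequently to the point of being completely lopsided. The reason is that Kafka only guarantees that the number of partitions is spread evenly across disks. It has no knowledge of how much space each partition actually occupies, so a few partitions with very large message volumes can easily consume most of a disk. Before version 1.1 users could do nothing about this, because Kafka only supported reassigning partition data between brokers, not between disks on the same broker. Version 1.1 officially supports moving replicas between log directories; see KIP-113 for the implementation details. This post briefly demonstrates how to use the new feature.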

  Suppose I have configured several paths (representing several disks) in the Kafka broker's server.properties file, as shown below:

...

############################# Log Basics #############################

# A comma separated list of directories under which to store log files

log.dirs=/Users/huxi/SourceCode/newenv/datalogs/kafka_1,/Users/huxi/SourceCode/newenv/datalogs/kafka_2,/Users/huxi/SourceCode/newenv/datalogs/kafka_3

...

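  I then created a topic with 9 partitions and sent 9 million messages to it. Listing these directories shows that Kafka spread the 9 partitions evenly across the three paths: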

ll kafka_1/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-3

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-4

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-5

ll kafka_2/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-0

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-1

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-2

ll kafka_3/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-6

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-7

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-8
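The listing above only shows that the partition count is balanced; it says nothing about bytes on disk. As a rough sketch (the function names `dir_size` and `usage_by_log_dir` are made up for this illustration, and it assumes the script runs on the broker host with read access to the log directories), the per-partition sizes in each path could be summed like this:

```python
import os


def dir_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def usage_by_log_dir(log_dirs, topic=None):
    """Return {log_dir: {partition_dir_name: bytes}}.

    Partition directories are assumed to be named <topic>-<partition>,
    as in the listing above. If topic is given, other topics are skipped.
    """
    usage = {}
    for log_dir in log_dirs:
        per_partition = {}
        for entry in sorted(os.listdir(log_dir)):
            full = os.path.join(log_dir, entry)
            if not os.path.isdir(full):
                continue  # skip checkpoint files and other plain files
            if topic is not None and entry.rsplit("-", 1)[0] != topic:
                continue
            per_partition[entry] = dir_size(full)
        usage[log_dir] = per_partition
    return usage
```

Calling `usage_by_log_dir([...the three paths from log.dirs...], topic="test-topic")` and summing the values for each path gives a quick picture of which disk is carrying the most data.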

  Now suppose we want to move partitions 6, 7, and 8 of test-topic to the kafka_2 path, and partition 1 of test-topic to kafka_1. To do this, we first write a JSON file, say migrate-replica.json:

{"partitions":[
{"topic": "test-topic","partition": 1,"replicas": [0],"log_dirs": ["/Users/huxi/SourceCode/newenv/datalogs/kafka_1"]},
{"topic": "test-topic","partition": 6,"replicas": [0],"log_dirs": ["/Users/huxi/SourceCode/newenv/datalogs/kafka_2"]},
{"topic": "test-topic","partition": 7,"replicas": [0],"log_dirs": ["/Users/huxi/SourceCode/newenv/datalogs/kafka_2"]},
{"topic": "test-topic","partition": 8,"replicas": [0],"log_dirs": ["/Users/huxi/SourceCode/newenv/datalogs/kafka_2"]}
],"version":1}

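Here, the 0 in replicas is the broker ID. Since this post runs only a single broker with broker.id = 0, writing 0 is enough. You can also list several brokers to move replicas on multiple brokers at the same time; in that case each entry in log_dirs corresponds by position to the broker in replicas. The version field is currently always 1.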
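Writing this JSON by hand gets tedious once more than a few partitions are involved. The following is a minimal sketch of generating it, assuming the single-broker, single-replica setup used in this post; `build_reassignment` is a hypothetical helper name, not part of Kafka.

```python
import json


def build_reassignment(topic, moves, broker_id=0):
    """Build a reassignment plan for single-replica partitions.

    moves maps partition number -> target log directory on broker_id.
    With more replicas, "replicas" and "log_dirs" would each need one
    entry per broker; that case is not handled in this sketch.
    """
    partitions = [
        {
            "topic": topic,
            "partition": partition,
            "replicas": [broker_id],
            "log_dirs": [log_dir],
        }
        for partition, log_dir in sorted(moves.items())
    ]
    return {"partitions": partitions, "version": 1}


base = "/Users/huxi/SourceCode/newenv/datalogs"
plan = build_reassignment(
    "test-topic",
    {
        1: base + "/kafka_1",
        6: base + "/kafka_2",
        7: base + "/kafka_2",
        8: base + "/kafka_2",
    },
)
# Print the plan; redirect the output to migrate-replica.json to save it.
print(json.dumps(plan))
```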

After saving this JSON file, run the following command to perform the replica migration:

bin/kafka-reassign-partitions.sh  --zookeeper localhost:2181 --bootstrap-server localhost:9092 --reassignment-json-file ../migrate-replica.json --execute

Current partition replica assignment

 

{"version":1,"partitions":[{"topic":"test-topic","partition":8,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":4,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":5,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":2,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":6,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":3,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":1,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":7,"replicas":[0],"log_dirs":["any"]},{"topic":"test-topic","partition":0,"replicas":[0],"log_dirs":["any"]}]}

 

Save this to use as the --reassignment-json-file option during rollback

Successfully started reassignment of partitions.
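The tool only reports that the reassignment has started, not that it has finished. One way to confirm the result on a single-broker setup like this one is to check that each partition directory now exists under its target path. The sketch below assumes it runs on the broker host and that partition directories are named `<topic>-<partition>`; `load_plan` and `check_reassignment` are hypothetical helper names.

```python
import json
import os


def load_plan(path):
    """Read a reassignment JSON file such as migrate-replica.json."""
    with open(path) as f:
        return json.load(f)


def check_reassignment(plan):
    """Return a list of (partition_dir, log_dir, present) tuples.

    Entries whose log_dir is "any" are skipped because they name no
    specific target directory.
    """
    results = []
    for entry in plan["partitions"]:
        dirname = "{}-{}".format(entry["topic"], entry["partition"])
        for log_dir in entry["log_dirs"]:
            if log_dir == "any":
                continue
            present = os.path.isdir(os.path.join(log_dir, dirname))
            results.append((dirname, log_dir, present))
    return results
```

For example, `check_reassignment(load_plan("../migrate-replica.json"))` should return `True` in the last field of every tuple once the move has completed.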

Check the replica distribution across the paths again:

ll kafka_1/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:31 test-topic-1

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-3

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-4

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-5

ll kafka_2/ |grep test-topic

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-0

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:21 test-topic-2

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:31 test-topic-6

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:31 test-topic-7

drwxr-xr-x   6 huxi  staff  192 Jun 22 17:31 test-topic-8

ll kafka_3/ |grep test-topic

<empty>

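Clearly, partitions 6, 7, and 8 have been moved to kafka_2, and partition 1 has been moved to kafka_1. It is worth noting that not only are all log segments and index files moved; the checkpoint files at the top level of each log directory are updated as well. For example, the replication-offset-checkpoint file under kafka_2 now contains the offsets for partitions 6, 7, and 8: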

cat replication-offset-checkpoint 

0

7

test-topic 8 1000000

test-topic 2 1000000

test 0 1285714

test-topic 6 1000000

test-topic 7 1000000

test-topic 0 1000000

test 2 1285714
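Judging from the output above, the file consists of a version line, a line with the number of entries, and then one `topic partition offset` line per entry. A small sketch of a parser under that assumption (`parse_offset_checkpoint` is a hypothetical name, and the format is inferred from this output rather than from a formal specification):

```python
def parse_offset_checkpoint(text):
    """Parse checkpoint file content into (version, {(topic, partition): offset}).

    Blank lines are ignored. Raises ValueError if the entry count on the
    second line does not match the number of entries that follow.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    version = int(lines[0])
    expected = int(lines[1])
    offsets = {}
    for line in lines[2:]:
        topic, partition, offset = line.split()
        offsets[(topic, int(partition))] = int(offset)
    if len(offsets) != expected:
        raise ValueError(
            "expected {} entries, found {}".format(expected, len(offsets))
        )
    return version, offsets
```

Reading the kafka_2 file and looking up `("test-topic", 6)`, `("test-topic", 7)`, and `("test-topic", 8)` in the returned dictionary is a quick way to confirm that the moved partitions are now tracked there.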

 

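That concludes this quick trial of the 1.1 feature "moving replicas between log directories." I hope it helps anyone who has run into this problem.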

