Fixing a full etcd disk (space quota exhausted)


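Handling the etcd disk space alarm

etcd's default storage quota is 2 GB. Once the backend database exceeds the quota, etcd raises a NOSPACE alarm and stops accepting writes (only reads and deletes still work), so old data has to be cleaned up regularly.

Check the etcd logs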

8月 04 17:00:04 1.novalocal etcd[24848]: read-only range request "key:\"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:1 size:3775" took too long (1.354750458s) to execute
8月 04 17:00:05 1.novalocal etcd[24848]: read-only range request "key:\"XXXXXXXXXXXXXXXXXXXXX" range_end:\"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:2303873 size:1274241272" took too long (11.31986XXXXXXXXXXXXXXXXXXXXX
8月 04 17:05:09 1.novalocal etcd[24848]: read-only range request "key:\"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:1 size:3775" took too long (1.136787261s) to execute
8月 04 17:05:10 1.novalocal etcd[24848]: read-only range request "key:\"XXXXXXXXXXXXXXXXXXXXX" range_end:\"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:2303873 size:1274241272" took too long (11.68081XXXXXXXXXXXXXXXXXXXXX
8月 04 17:05:11 1.novalocal etcd[24848]: WARNING: 2020/08/04 17:05:11 grpc: Server.processUnaryRPC failed to write status connection error: desc = "transport is closing"
8月 04 17:10:14 1.novalocal etcd[24848]: read-only range request "key:\"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:1 size:3775" took too long (1.173390639s) to execute
8月 04 17:10:15 1.novalocal etcd[24848]: read-only range request "key:\"XXXXXXXXXXXXXXXXXXXXX" range_end:\"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:2303873 size:1274241272" took too long (11.42705XXXXXXXXXXXXXXXXXXXXX
8月 04 17:15:19 1.novalocal etcd[24848]: read-only range request "key:\"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:1 size:3775" took too long (1.311071626s) to execute
8月 04 17:15:20 1.novalocal etcd[24848]: read-only range request "key:\"XXXXXXXXXXXXXXXXXXXXX" range_end:\"XXXXXXXXXXXXXXXXXXXXX" " with result "range_response_count:2303873 size:1274241272" took too long (11.22721XXXXXXXXXXXXXXXXXXXXX

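The log is full of "took too long" entries such as took too long (11.42705XXXXXXXXXXXXXXXXXXXXX. The slowest ones are range requests that return 2,303,873 keys and about 1.27 GB of data each.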
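To get a rough measure of how widespread the slow requests are, the log text can be piped through two small helpers. This is only a sketch: the function names are made up for illustration, and the input is whatever log text you feed in on stdin.

```shell
# Hypothetical helpers; both read etcd log text from stdin.

# Print the number of log lines that report a slow request.
count_slow_requests() {
  # grep -c exits non-zero when the count is 0, so mask the exit status.
  grep -c 'took too long' || true
}

# Print the largest response size (in bytes) seen in the log, if any.
largest_response_size() {
  grep -Eo 'size:[0-9]+' | cut -d: -f2 | sort -n | tail -n 1
}
```

Assuming the logs come from the systemd journal and the unit is named rio-etcd (taken from the unit file path used later in this post), a call might look like `journalctl -u rio-etcd --since today | count_slow_requests`.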

Check the etcd cluster status

  • Check the endpoint status
ETCDCTL_API=3 ./etcdctl --endpoints=$ip:$port --write-out=table endpoint status

+------------------------+------------------+---------+---------+-----------+-----------+------------+
|        ENDPOINT        |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+------------------------+------------------+---------+---------+-----------+-----------+------------+
| http://127.0.0.1:2379  | 728d3145169b227d |  3.3.10 |  2.1 GB |     false |         6 |    3616392 |
+------------------------+------------------+---------+---------+-----------+-----------+------------+
  • Check the etcd cluster alarms
ETCDCTL_API=3 ./etcdctl --endpoints=$ip:$port alarm list

memberID:XXXXXXXXXXXXXXX alarm:NOSPACE
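The DB SIZE column in the status table is the number to compare against the quota. As a rough sketch, the same check can be scripted from the JSON output of `endpoint status`. The helper names are hypothetical, and the "dbSize" field name is an assumption about the JSON that etcdctl emits, so check it against your version's output first.

```shell
# Hypothetical helpers for comparing the DB size with the quota.

# Extract every "dbSize" value (bytes) from the JSON on stdin.
# Assumes the status JSON contains a numeric "dbSize" field.
extract_db_size() {
  grep -Eo '"dbSize":[0-9]+' | grep -Eo '[0-9]+'
}

# Print usage as a whole-number percentage.
# $1 = DB size in bytes, $2 = quota in bytes
db_usage_percent() {
  echo $(( $1 * 100 / $2 ))
}
```

With the default quota of 2147483648 bytes (2 × 1024³), a call such as `db_usage_percent "$size" 2147483648` gives a figure that is easy to alert on. The table appears to show sizes in decimal units, which would explain why a database at the 2 GiB quota is displayed as about 2.1 GB.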

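The alarm reports NOSPACE, so either the etcd storage quota has to be raised (it defaults to 2 GB) or old data has to be compacted away. After space has been added or freed, the alarm must also be cleared explicitly with etcdctl; until then the cluster stays unusable for writes.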
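For scripting or monitoring, the alarm check can be reduced to an exit status. The sketch below assumes the `alarm list` output format shown above (one `memberID:... alarm:NOSPACE` line per raised alarm); the function name is hypothetical.

```shell
# Hypothetical helper: succeeds (exit 0) if the `alarm list` output
# on stdin contains a NOSPACE alarm, fails otherwise.
has_nospace_alarm() {
  grep -q 'alarm:NOSPACE'
}
```

A possible use is `ETCDCTL_API=3 etcdctl --endpoints=$ip:$port alarm list | has_nospace_alarm && echo "NOSPACE alarm is raised"`.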

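Raise the etcd quota from 2 GB to 8 GB by adding the following three flags to the etcd start command in the unit file. The new flags only take effect after etcd has been restarted.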

vi /etc/systemd/system/rio-etcd.service
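## --auto-compaction-retention is read as hours only with --auto-compaction-mode=periodic;
## with --auto-compaction-mode=revision it would be the number of revisions to keep

--auto-compaction-mode=periodic --auto-compaction-retention=24 --quota-backend-bytes=8589934592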
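The value given to --quota-backend-bytes is a plain byte count, which is easy to mistype. A small sketch for deriving it from a size in GiB (the function name is hypothetical):

```shell
# Hypothetical helper: convert a quota in GiB to bytes.
quota_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

quota_bytes 8   # prints 8589934592
```

The result for 8 GiB matches the value used in the flags above.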

Get the current revision of the etcd data

rev=$(ETCDCTL_API=3 etcdctl --endpoints=$ip:$port endpoint status --write-out="json" | grep -Eo '"revision":[0-9]+' | grep -Eo '[0-9]+')

echo "$rev"
  • Compact away all revisions older than the current one
ETCDCTL_API=3 etcdctl --endpoints=$ip:$port compact $rev
  • Defragment to release the freed space back to the file system
ETCDCTL_API=3 etcdctl --endpoints=$ip:$port defrag
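Defragmentation works on one member's local database at a time, so in a multi-member cluster it is typically repeated for each endpoint. A minimal sketch of such a loop follows; the function name and the ETCDCTL override variable are hypothetical conveniences, and the endpoint addresses in the usage note are placeholders.

```shell
# Hypothetical helper: defragment each endpoint passed as an argument,
# one at a time, stopping at the first failure.
# Set ETCDCTL=echo for a dry run that only prints the arguments.
defrag_each() {
  for ep in "$@"; do
    ETCDCTL_API=3 ${ETCDCTL:-etcdctl} --endpoints="$ep" defrag || return 1
  done
}
```

For example, `defrag_each 10.0.0.1:2379 10.0.0.2:2379 10.0.0.3:2379` would defragment three members in turn. Running them sequentially rather than in parallel avoids having every member busy defragmenting at once.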

Clear the alarm

ETCDCTL_API=3 etcdctl --endpoints=$ip:$port alarm disarm

Verify that new data can be written

ETCDCTL_API=3 etcdctl --endpoints=$ip:$port put newkeytestfornospace 123

Reference

https://www.cnblogs.com/lvcisco/p/10775021.html

