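I. Problem
While checking our production Elasticsearch (es) cluster recently, I found that the indices created in the last two days had no replicas and that the cluster status was yellow.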
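A symptom like this can be confirmed from the command line before looking for the cause. The following is a minimal sketch, not the exact commands we ran: the helper names (es_url, cluster_health and so on) are hypothetical, and ES_HOST is assumed to be the node address that appears later in this post. The endpoints themselves (/_cluster/health, /_cat/shards, /_cat/allocation, /_cluster/allocation/explain) are standard es APIs.

```shell
# Assumption: the node address used later in this post; override via ES_HOST.
ES_HOST="${ES_HOST:-172.1.2.208:9200}"

# Hypothetical helper: build the full URL for an API path.
es_url() {
  printf 'http://%s%s\n' "$ES_HOST" "$1"
}

# Cluster status (green / yellow / red) and the count of unassigned shards.
cluster_health() {
  curl -s "$(es_url '/_cluster/health?pretty')"
}

# Header line plus every shard that is currently UNASSIGNED.
unassigned_shards() {
  curl -s "$(es_url '/_cat/shards?v')" | grep -E '^index|UNASSIGNED'
}

# Disk usage per data node, as the shard allocator sees it.
disk_per_node() {
  curl -s "$(es_url '/_cat/allocation?v')"
}

# Ask es why the first unassigned shard it finds cannot be allocated.
explain_unassigned() {
  curl -s "$(es_url '/_cluster/allocation/explain?pretty')"
}
```

Calling cluster_health first and then unassigned_shards shows whether the yellow status really comes from missing replicas; disk_per_node and explain_unassigned point at disk usage as the reason.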
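II. Cause
The disk on the es server still had free space; usage was only about 89%. Intuitively, es should have kept using the disk and created the replicas for the new indices. The official documentation explains why it did not: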
cluster.routing.allocation.disk.watermark.low
Controls the low watermark for disk usage. It defaults to 85%, meaning that Elasticsearch will not allocate shards to nodes that have more than 85% disk used. It can also be set to an absolute byte value (like 500mb) to prevent Elasticsearch from allocating shards if less than the specified amount of space is available. This setting has no effect on the primary shards of newly-created indices or, specifically, any shards that have never previously been allocated.
cluster.routing.allocation.disk.watermark.high
Controls the high watermark. It defaults to 90%, meaning that Elasticsearch will attempt to relocate shards away from a node whose disk usage is above 90%. It can also be set to an absolute byte value (similarly to the low watermark) to relocate shards away from a node if it has less than the specified amount of free space. This setting affects the allocation of all shards, whether previously allocated or not.
cluster.routing.allocation.disk.watermark.flood_stage
Controls the flood stage watermark. It defaults to 95%, meaning that Elasticsearch enforces a read-only index block (index.blocks.read_only_allow_delete) on every index that has one or more shards allocated on the node that has at least one disk exceeding the flood stage. This is a last resort to prevent nodes from running out of disk space. The index block must be released manually once there is enough disk space available to allow indexing operations to continue.
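From this we can see that, with the default settings, once disk usage on a node exceeds 85%, es no longer allocates replica shards to that node; once it exceeds 90%, es tries to relocate shards from that node to other nodes; and once it exceeds 95%, every index that has a shard on that node is set to read-only (deletes are still allowed). Our node was at about 89%, above the low watermark, which is why the replicas of the newest indices were never allocated.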
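The three thresholds can be expressed as a small shell function. This is only an illustrative sketch of the rules quoted above, not code from es: the function name watermark_stage and its output labels are made up for this post, and it only handles whole-number percentages.

```shell
# Classify a disk usage percentage against the watermarks.
# Usage: watermark_stage USED [LOW] [HIGH] [FLOOD]
# Defaults are the documented es defaults: 85 / 90 / 95.
# The body runs in a subshell so the variables do not leak into the caller.
watermark_stage() (
  used=$1
  low=${2:-85}
  high=${3:-90}
  flood=${4:-95}
  if [ "$used" -gt "$flood" ]; then
    echo "above_flood_stage"   # indices on the node become read-only
  elif [ "$used" -gt "$high" ]; then
    echo "above_high"          # es tries to move shards off the node
  elif [ "$used" -gt "$low" ]; then
    echo "above_low"           # no new replica shards are allocated here
  else
    echo "below_low"           # normal allocation
  fi
)

# watermark_stage 89           prints: above_low
# watermark_stage 89 90 95 98  prints: below_low
```

With the defaults, our 89% falls into above_low; with the values we switch to later in this post (90 / 95 / 98), the same node is back below the low watermark.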
III. Solutions
1. Expand the disk
……
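2. Delete some historical indices
3. Change the es settings
- Edit the configuration file (requires restarting es)
- Change them dynamically (via the API, no restart needed)
The es defaults for the low and high watermarks are 85% and 90%; we change them to 90% and 95%.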
3.1 Edit the configuration file (requires restarting es)
Add the following to elasticsearch.yml:
cluster.routing.allocation.disk.threshold_enabled: true
cluster.routing.allocation.disk.watermark.low: 90%
cluster.routing.allocation.disk.watermark.high: 95%
cluster.routing.allocation.disk.watermark.flood_stage: 98%
3.2 Dynamic change
A dynamic change is one made through the es API. A transient change is temporary; a persistent change is permanent.
API endpoint: /_cluster/settings
Note: the cluster.routing.allocation.disk.watermark.flood_stage setting only exists from version 6.0 onward. Version 5 does not have it and does not support it. When I included it while changing the settings on a 5.6 cluster, the request failed with: "reason":"persistent setting [cluster.routing.allocation.disk.watermark.flood_stage], not dynamically updateable"},"status":400
Official documentation for version 5.6: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/disk-allocator.html
View the current es settings
To view the current es settings, send a GET request to /_cluster/settings.
curl 172.1.2.208:9200/_cluster/settings
{
  "persistent": {
    "xpack": {
      "monitoring": {
        "collection": {
          "enabled": "true"
        }
      }
    }
  },
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "low": "90%",
              "high": "95%"
            }
          }
        }
      },
      "info": {
        "update": {
          "interval": "1m"
        }
      }
    }
  }
}
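The response above lists only settings that were set explicitly. To also see the defaults that are still in effect, the same endpoint accepts the include_defaults and flat_settings query parameters on versions that support them. The sketch below is an assumption-laden convenience wrapper: the function names filter_disk_settings and show_disk_settings are hypothetical, and the grep filter relies on the pretty-printed output having one setting per line.

```shell
# Assumption: the node address used in this post; override via ES_HOST.
ES_HOST="${ES_HOST:-172.1.2.208:9200}"

# Keep only the disk-allocation lines from a flat, pretty-printed settings dump
# read on standard input.
filter_disk_settings() {
  grep -E '"cluster\.routing\.allocation\.disk\.|"cluster\.info\.update\.interval"'
}

# Fetch explicit settings plus defaults and show only the disk-related ones.
show_disk_settings() {
  curl -s "http://$ES_HOST/_cluster/settings?include_defaults=true&flat_settings=true&pretty" \
    | filter_disk_settings
}
```

Running show_disk_settings before and after a change makes it easy to confirm which watermark values the cluster is actually using.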
Permanent change: persistent
The settings remain in effect after a restart.
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.info.update.interval": "1m"
  }
}
Temporary change: transient
The settings are lost after a restart.
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.info.update.interval": "1m"
  }
}
Example:
root@111:~# curl -H "Content-Type: application/json" -XPUT 172.1.2.208:9200/_cluster/settings -d '{"transient": { "cluster.routing.allocation.disk.watermark.low": "90%", "cluster.routing.allocation.disk.watermark.high": "95%", "cluster.info.update.interval": "1m"}}'
{"acknowledged":true,"persistent":{},"transient":{"cluster":{"routing":{"allocation":{"disk":{"watermark":{"low":"90%","high":"95%"}}}},"info":{"update":{"interval":"1m"}}}}}
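IV. Further notes
As the official documentation quoted above also shows, the watermarks do not have to be percentages. They can also be set as absolute sizes such as 500mb, in which case the value is the amount of free space that must remain on the disk.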
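A request body with absolute sizes might be built as in the sketch below. The function name watermark_body and the sizes 50gb and 20gb are purely illustrative, not values we used. Because byte values describe free space rather than used space, the low watermark has to be the larger of the two, and percentages and byte values should not be mixed in the same cluster.

```shell
# Build a /_cluster/settings request body.
# Usage: watermark_body SCOPE LOW HIGH   (SCOPE is "transient" or "persistent")
watermark_body() {
  printf '{"%s": {"cluster.routing.allocation.disk.watermark.low": "%s", "cluster.routing.allocation.disk.watermark.high": "%s"}}\n' \
    "$1" "$2" "$3"
}

# Example call (not executed here; the host is the one used in this post):
# curl -H "Content-Type: application/json" -XPUT 172.1.2.208:9200/_cluster/settings \
#   -d "$(watermark_body transient 50gb 20gb)"
```

With these illustrative values, es would stop allocating replica shards to a node once it has less than 50gb free and would start relocating shards away once it has less than 20gb free.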