Elasticsearch Cluster Operations

Reposted article, 2018-08-25 10:42. Tags: Distributed System / Elasticsearch / Operations / Distributed / Big Data

I. Index Management

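1. Create an index

Create an index with 10 primary shards and 1 replica, restricted to nodes whose custom attribute type is hot:

PUT test-2019-03
{
  "settings": {
    "index": {
      "number_of_shards": 10,
      "number_of_replicas": 1,
      "routing": {
        "allocation": {
          "include": {
            "type": "hot"
          }
        }
      }
    }
  }
}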
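
The same request body can be generated in code, which is convenient when time-based indices are created by a scheduled job. A minimal Python sketch, assuming a hypothetical helper named build_index_settings (it is not part of any Elasticsearch client):

```python
import json


def build_index_settings(shards: int, replicas: int, node_type: str) -> dict:
    """Build a create-index body that keeps shards on nodes tagged node_type.

    build_index_settings is a hypothetical helper; the setting names mirror
    the create-index request in this section.
    """
    if shards < 1 or replicas < 0:
        raise ValueError("shards must be >= 1 and replicas must be >= 0")
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
                "routing": {"allocation": {"include": {"type": node_type}}},
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body to send with: PUT test-2019-03
    print(json.dumps(build_index_settings(10, 1, "hot"), indent=2))
```

Keeping the body in one function means the shard count and the node tag are changed in a single place.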

2. Delete an index

DELETE test-2019-03

DELETE test*

The * wildcard is supported.

3. Modify an index

Change the number of replicas:

PUT test-2019-03/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}

4. Rebuild an index with Reindex

POST _reindex
{
  "source": {
    "index": ["test-2018-07-*"]
  },
  "dest": {
    "index": "test-2018-07"
  }
}

List the running reindex tasks:

GET _tasks?detailed=true&actions=*reindex
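
The detailed task listing reports, for each reindex task, a status object with total, created, updated and deleted counters, and the sum of the last three approaches total as the task runs. A minimal Python sketch that turns a parsed response into a progress fraction; the function name reindex_progress and the sample data are made up for illustration:

```python
def reindex_progress(tasks_response: dict) -> dict:
    """Map each task id to the fraction of documents processed so far.

    Expects the JSON returned by GET _tasks?detailed=true&actions=*reindex,
    already parsed into a dict.
    """
    progress = {}
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            status = task.get("status", {})
            total = status.get("total", 0)
            done = (
                status.get("created", 0)
                + status.get("updated", 0)
                + status.get("deleted", 0)
            )
            # total stays 0 until the task knows how many documents it has.
            progress[task_id] = done / total if total else 0.0
    return progress


if __name__ == "__main__":
    # Made-up sample in the shape of a tasks API response.
    sample = {
        "nodes": {
            "node-a": {
                "tasks": {
                    "node-a:101": {
                        "action": "indices:data/write/reindex",
                        "status": {"total": 1000, "created": 250,
                                   "updated": 0, "deleted": 0},
                    }
                }
            }
        }
    }
    print(reindex_progress(sample))
```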

5. Delete data with delete_by_query

POST indexApple-2019-02/_delete_by_query?conflicts=proceed
{
  "query": {
    "bool": {
      "must": {
        "term": { "appIndex": "apple" }
      },
      "filter": {
        "range": {
          "timestamp": {
            "gte": "2019-02-23 08:00:00",
            "lte": "2019-02-23 22:00:00",
            "time_zone": "+08:00"
          }
        }
      }
    }
  }
}

List the running delete_by_query tasks:

GET _tasks?detailed=true&actions=*/delete/byquery
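
Because delete_by_query is destructive, it can help to build the body in code and validate the time window before sending it. A minimal Python sketch; build_delete_query is a hypothetical helper, and the field names appIndex and timestamp are taken from the request in this section:

```python
import json
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_delete_query(app_index: str, start: str, end: str,
                       time_zone: str = "+08:00") -> dict:
    """Build a delete_by_query body for one appIndex value and a time window."""
    # strptime raises ValueError on malformed timestamps, so typos fail early.
    if datetime.strptime(start, TIME_FORMAT) > datetime.strptime(end, TIME_FORMAT):
        raise ValueError("start must not be later than end")
    return {
        "query": {
            "bool": {
                "must": {"term": {"appIndex": app_index}},
                "filter": {
                    "range": {
                        "timestamp": {
                            "gte": start,
                            "lte": end,
                            "time_zone": time_zone,
                        }
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    body = build_delete_query("apple", "2019-02-23 08:00:00", "2019-02-23 22:00:00")
    print(json.dumps(body, indent=2))
```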

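II. Cluster Settings

Cluster-level settings are updated through the cluster settings API. Each JSON body in this section is sent with: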

curl -XPUT http://<domain>:<port>/_cluster/settings
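
The same PUT can be prepared from Python with only the standard library. The sketch below builds the request without sending it; cluster_settings_request is a hypothetical helper and http://localhost:9200 is an assumed address:

```python
import json
import urllib.request


def cluster_settings_request(base_url: str, scope: str,
                             settings: dict) -> urllib.request.Request:
    """Prepare a PUT to _cluster/settings; scope is persistent or transient."""
    if scope not in ("persistent", "transient"):
        raise ValueError("scope must be 'persistent' or 'transient'")
    body = json.dumps({scope: settings}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    req = cluster_settings_request(
        "http://localhost:9200",  # assumed address
        "transient",
        {"cluster.routing.allocation.enable": "all"},
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it to a real cluster: urllib.request.urlopen(req)
```

A Python None value is serialized as JSON null, which is how a setting is reset to its default.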

1. Shard Allocation Settings

{"persistent":{"cluster.routing.allocation.enable": "all"}}

Controls which kinds of shards the cluster may allocate. There are four options:

all - (default) Allows shard allocation for all kinds of shards.

primaries - Allows shard allocation only for primary shards.

new_primaries - Allows shard allocation only for primary shards for new indices.

none - No shard allocations of any kind are allowed for any indices.

{"persistent":{"cluster.routing.allocation.node_concurrent_recoveries": 8}}
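Sets the number of concurrent shard recoveries allowed on a node (both incoming and outgoing).

{"persistent":{"cluster.routing.allocation.node_initial_primaries_recoveries": 16}}
Sets how many unassigned primary shards a node may recover from local disk in parallel after a restart.

{"persistent":{"indices.recovery.max_bytes_per_sec": "500mb"}}
Limits the throughput of index recovery in bytes per second.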

2. Shard Rebalancing Settings

{"persistent":{"cluster.routing.rebalance.enable": "all"}}

Controls which kinds of shards the cluster may rebalance. There are four options:

all - (default) Allows shard balancing for all kinds of shards.

primaries - Allows shard balancing only for primary shards.

replicas - Allows shard balancing only for replica shards.

none - No shard balancing of any kind is allowed for any indices.

{"persistent":{"cluster.routing.allocation.allow_rebalance": "always"}}

Controls when rebalancing is allowed. There are three options:

always - Always allow rebalancing.

indices_primaries_active - Only when all primaries in the cluster are allocated.

indices_all_active - (default) Only when all shards (primaries and replicas) in the cluster are allocated.

{"transient":{"cluster.routing.allocation.cluster_concurrent_rebalance": 8}}
Sets how many shard rebalances may run concurrently across the cluster. It limits only rebalancing and has no effect on the concurrency of recovery or of other shard movements.

{"transient":{"cluster.routing.allocation.cluster_concurrent_rebalance": 0}}
Disables cluster rebalancing.

{"transient":{"cluster.routing.allocation.cluster_concurrent_rebalance": null}}
Removes the override and restores the default, which enables rebalancing again.

3. Disk-based Shard Allocation

# Set the low watermark for data nodes to 80%
{"transient":{"cluster.routing.allocation.disk.watermark.low":"80%"}}
# Set the high watermark for data nodes to 90%
{"transient":{"cluster.routing.allocation.disk.watermark.high":"90%"}}
# Remove the user-defined values so the cluster falls back to its defaults
{"transient":{"cluster.routing.allocation.disk.watermark.low": null}}
{"transient":{"cluster.routing.allocation.disk.watermark.high": null}}
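
The two watermarks split disk usage into three zones: below the low watermark a node accepts new shards, above it Elasticsearch stops allocating new shards to the node, and above the high watermark it tries to move shards away. A minimal Python sketch of that decision, using the Elasticsearch defaults of 85% and 90%; the function and the state names are invented for illustration:

```python
def disk_watermark_state(used_percent: float, low: float = 85.0,
                         high: float = 90.0) -> str:
    """Classify a node's disk usage against the low and high watermarks."""
    if not 0.0 <= used_percent <= 100.0:
        raise ValueError("used_percent must be between 0 and 100")
    if low > high:
        raise ValueError("low watermark must not exceed high watermark")
    if used_percent > high:
        return "relocate_away"   # shards are moved off this node
    if used_percent > low:
        return "no_new_shards"   # existing shards stay, new ones go elsewhere
    return "ok"


if __name__ == "__main__":
    for used in (70.0, 82.0, 95.0):
        # low=80 matches the transient setting shown in this section
        print(used, disk_watermark_state(used, low=80.0, high=90.0))
```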

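4. Allocation Policy

These settings state explicitly whether shards may be allocated to particular nodes. They exist at the index level and at the cluster level:

index.routing.allocation.require.{attribute}
index.routing.allocation.include.{attribute}
index.routing.allocation.exclude.{attribute}
cluster.routing.allocation.require.{attribute}
cluster.routing.allocation.include.{attribute}
cluster.routing.allocation.exclude.{attribute}

require means shards must be allocated to the specified nodes, include means shards may be allocated to the specified nodes, and exclude means shards must not be allocated to the specified nodes. Cluster-level settings take precedence over index-level ones: for example, if an index includes a node and the cluster excludes that node, the node ends up excluded.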
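
The interaction of the three filter types can be illustrated with a small model. This is a simplified Python sketch and not Elasticsearch's implementation: it matches exact values only (no wildcards) and treats the two levels as checks a node must both pass, which reproduces the include-versus-exclude example. All names are hypothetical:

```python
def node_allowed(node_attrs: dict, require: dict = None,
                 include: dict = None, exclude: dict = None) -> bool:
    """Apply one level of filters; each filter maps attribute -> set of values."""
    require, include, exclude = require or {}, include or {}, exclude or {}

    def matches(attr: str, values: set) -> bool:
        return node_attrs.get(attr) in values

    if any(matches(a, v) for a, v in exclude.items()):
        return False  # exclude: no listed attribute may match
    if not all(matches(a, v) for a, v in require.items()):
        return False  # require: every listed attribute must match
    if include and not any(matches(a, v) for a, v in include.items()):
        return False  # include: at least one listed attribute must match
    return True


def shard_may_use_node(node_attrs: dict, index_filters: dict,
                       cluster_filters: dict) -> bool:
    """A node is usable only if it passes the index and the cluster filters."""
    return (node_allowed(node_attrs, **index_filters)
            and node_allowed(node_attrs, **cluster_filters))


if __name__ == "__main__":
    node = {"_ip": "10.100.0.11", "type": "hot"}
    index_filters = {"include": {"_ip": {"10.100.0.11"}}}
    cluster_filters = {"exclude": {"_ip": {"10.100.0.11"}}}
    # The index includes the node, the cluster excludes it: exclusion wins.
    print(shard_may_use_node(node, index_filters, cluster_filters))
```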

# Exclude one node from the cluster by IP: 10.100.0.11
{"transient":{"cluster.routing.allocation.exclude._ip":"10.100.0.11"}}
# Exclude several nodes from the cluster by IP: 10.100.0.11,10.100.0.12
{"transient":{"cluster.routing.allocation.exclude._ip":"10.100.0.11,10.100.0.12"}}
# Remove the node exclusion
{"transient":{"cluster.routing.allocation.exclude._ip": null}}

Keep an index off nodes with certain IPs:

PUT test/_settings
{
  "index.routing.allocation.exclude._ip": "192.168.2.*"
}

Attributes supported by default:

_name Match nodes by node name

_host_ip Match nodes by host IP address (IP associated with hostname)

_publish_ip Match nodes by publish IP address

_ip Match either _host_ip or _publish_ip

_host Match nodes by hostname

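5. Shard allocation problems

1. Find out why shards in the cluster are unassigned
GET _cluster/allocation/explain?pretty

2. Check the recovery status of an index, using the index user as an example
GET user/_recovery?active_only=true

3. Use reroute to retry allocations that failed earlier. The cluster gives up on a shard after index.allocation.max_retries attempts (5 by default)
POST /_cluster/reroute?retry_failed=true

4. List the indices whose health is red
GET _cat/indices?health=red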

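III. Rolling Cluster Restart

1. Preparation
## Keep the following APIs at hand beforehand. Some expose metrics to watch during the restart (stop the restart if a problem appears); the rest support the checks:
## Reasons for UNASSIGNED shards
curl "http://0.0.0.0:9200/_cluster/allocation/explain?pretty"

### Cluster settings
curl "http://0.0.0.0:9200/_cluster/settings?pretty"

### Pending tasks
curl "http://0.0.0.0:9200/_cluster/pending_tasks?pretty"

### Cluster health
curl "http://0.0.0.0:9200/_cluster/health?pretty"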
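
The health response can be reduced to a go or no-go decision before each node is touched. A minimal Python sketch that inspects a parsed _cluster/health response; the function name restart_blockers and the thresholds are assumptions, while the field names status, number_of_nodes and number_of_pending_tasks come from the health API:

```python
def restart_blockers(health: dict, expected_nodes: int,
                     max_pending_tasks: int = 0) -> list:
    """Return the reasons to pause a rolling restart; an empty list means go."""
    blockers = []
    status = health.get("status")
    if status != "green":
        blockers.append(f"cluster status is {status}")
    nodes = health.get("number_of_nodes", 0)
    if nodes < expected_nodes:
        blockers.append(f"only {nodes} of {expected_nodes} nodes have joined")
    pending = health.get("number_of_pending_tasks", 0)
    if pending > max_pending_tasks:
        blockers.append(f"{pending} pending tasks")
    return blockers


if __name__ == "__main__":
    # Made-up health response for illustration.
    health = {"status": "yellow", "number_of_nodes": 5,
              "number_of_pending_tasks": 3}
    for reason in restart_blockers(health, expected_nodes=6):
        print("pause:", reason)
```
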
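2. Restart the client nodes
#start
Step 1: Shut down one of the client nodes
Step 2: Start the node again
Step 3: Check that the node has rejoined the cluster
Step 4: Repeat steps 1-3 for the other client nodes
#end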

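3. Restart the master nodes
#start
Step 1: Identify the IP of the current master node
Step 2: Shut down one node in the master-node group that is not the current master
Step 3: Start the node again
Step 4: Check that the node has rejoined the cluster (make sure it really has)
Step 5: Repeat steps 2-4 for the other node in the master-node group that is not the current master
Step 6: Shut down the current master node
Step 7: Start the master node again
## While a new master is being elected the cluster is unavailable (indexing, search and API calls all block). A master is not elected instantly: the default election time is 3s, and network problems often make the election take longer.
Step 8: Check the cluster status and confirm that the node has rejoined the cluster.
## Once a master has been elected, all cluster functions return to normal.
#end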
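
The ordering rule in these steps (every other master-eligible node first, the current master last) is easy to encode so that a script cannot pick the wrong node. A minimal Python sketch; the function name and the node names are hypothetical, and the current master is assumed to be known already, for example from GET _cat/master:

```python
def master_restart_order(master_eligible: list, current_master: str) -> list:
    """Return the restart order: non-master nodes first, current master last."""
    if current_master not in master_eligible:
        raise ValueError("current_master must be one of the master-eligible nodes")
    others = [node for node in master_eligible if node != current_master]
    return others + [current_master]


if __name__ == "__main__":
    # Hypothetical node names.
    print(master_restart_order(["master-1", "master-2", "master-3"], "master-2"))
```

Restarting the current master last keeps the number of master elections during the procedure to one.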

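4. Restart the data nodes
#start
Step 1: Restrict shard allocation
curl -X PUT "http://0.0.0.0:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.enable": "new_primaries"}}'
## While allocation is restricted, replica shards of newly created indices cannot be allocated; primary shards of newly created indices still can
Step 2: Run a synced flush
curl -XPOST "http://0.0.0.0:9200/_flush/synced?pretty"
## For indices that are not being updated at this moment, the synced marker lets the cluster confirm that primary and replica copies hold the same data, which shortens the time shards need to rejoin the cluster. For shards that are changing at this moment, the synced flush does not speed up recovery when the node rejoins
Step 3: Shut down one data node
Step 4: Start the node again
Step 5: Check that the node has rejoined the cluster
Step 6: Re-enable shard allocation
curl -X PUT "http://0.0.0.0:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.enable": "all"}}'
Step 7: Check that the cluster status is green
## After allocation is re-enabled, the number of UNASSIGNED shards drops sharply (not straight to 0, because in a large cluster every node holds shards of indices that are still being updated). Some initializing shards then appear, and their number takes a while to reach 0 while the shards synchronize
Step 8: Repeat steps 1-7 for the other data nodes
Step 9: After all nodes have been restarted, check the cluster settings and make sure shard allocation is not left disabled
#end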
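
The data-node procedure can be sketched as a loop. This Python outline is an assumption-laden illustration and not a ready-made tool: ops is a hypothetical adapter object that wraps the curl calls and the service manager, and the loop restricts allocation again before every node because step 6 re-enables it:

```python
import time


def rolling_restart(nodes, ops, poll_seconds: float = 5.0,
                    max_polls: int = 120) -> None:
    """Restart data nodes one at a time.

    ops is a hypothetical adapter that must provide:
      set_allocation(value), synced_flush(), stop(node), start(node),
      has_joined(node) -> bool, is_green() -> bool
    """
    def wait_for(check, what: str) -> None:
        for _ in range(max_polls):
            if check():
                return
            time.sleep(poll_seconds)
        raise TimeoutError(f"timed out waiting for {what}")

    for node in nodes:
        ops.set_allocation("new_primaries")                           # step 1
        ops.synced_flush()                                            # step 2
        ops.stop(node)                                                # step 3
        ops.start(node)                                               # step 4
        wait_for(lambda: ops.has_joined(node), f"{node} to rejoin")   # step 5
        ops.set_allocation("all")                                     # step 6
        wait_for(ops.is_green, "green cluster status")                # step 7
```

Raising TimeoutError stops the loop at the first node that fails to rejoin or at the first failure to reach green, which matches the advice to halt the restart when a problem appears.
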
References:

Official Elasticsearch restart guide: https://www.elastic.co/guide/en/elasticsearch/reference/1.4/cluster-nodes-shutdown.html#_rolling_restart_of_nodes_full_cluster_restart

Elasticsearch 6.2 reference: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/index.html
