Elasticsearch usage notes (continuously updated)

Reposted article, dated 2019-03-27 11:34

This post records some lessons learned from operating Elasticsearch (ES).

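1. Disk usage on a node is too high, so the ES cluster cannot allocate shards. Has data been lost?

Two settings govern shard allocation with respect to disk usage.

Parameter	Default	Meaning
cluster.routing.allocation.disk.watermark.low	85%	Once a node's disk usage exceeds 85%, no further shards are allocated to that node
cluster.routing.allocation.disk.watermark.high	90%	Once a node's disk usage exceeds 90%, ES tries to relocate shards from that node to other nodes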

How to configure

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
    "transient": {
      "cluster.routing.allocation.disk.watermark.low": "90%"
    }
}'

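Recommendation: keep a close eye on the performance metrics of the ES cluster nodes, so that potential risks are noticed before they cause trouble.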
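As a rough illustration of how the two watermarks partition disk usage, here is a minimal sketch. The function name watermark_state and its labels are hypothetical; the defaults of 85 and 90 come from the table above, and "exceeds" is read as strictly greater than.

```shell
# Classify a node's disk usage (integer percent) against the two watermarks.
# Usage: watermark_state USED_PCT [LOW] [HIGH]
# LOW and HIGH default to the documented 85 and 90.
watermark_state() {
    used=$1
    low=${2:-85}
    high=${3:-90}
    if [ "$used" -gt "$high" ]; then
        echo "above-high"   # ES tries to move shards off this node
    elif [ "$used" -gt "$low" ]; then
        echo "above-low"    # no new shards are allocated to this node
    else
        echo "ok"
    fi
}
```

The percentage for each node could be taken from the disk.percent column of the _cat/allocation API (for example curl -s 'localhost:9200/_cat/allocation?h=node,disk.percent') and passed to the function one node at a time.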

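2. Template management

The template mechanism is quite useful, especially when managing a large number of indices. Here is a demo template, starting with its main fields.

order: 10. The template's priority; a template with a higher priority (a larger order value) overrides the fields of lower-priority templates.

template: test*. This template matches every index whose name starts with test.

index.number_of_shards: 20 (one of several index-level settings)

index.number_of_replicas: 1

index.refresh_interval: 5s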

{
    "aliases": {},
    "order": 10,
    "template": "test*",
    "settings": {
        "index": {
            "priority": "5",
            "merge": {
                "scheduler": {
                    "max_thread_count": "1"
                }
            },
            "search": {
                "slowlog": {
                    "threshold": {
                        "query": {
                            "warn": "10s",
                            "debug": "1s",
                            "info": "5s",
                            "trace": "500ms"
                        },
                        "fetch": {
                            "warn": "1s",
                            "debug": "500ms",
                            "info": "800ms",
                            "trace": "200ms"
                        }
                    }
                }
            },
            "unassigned": {
                "node_left": {
                    "delayed_timeout": "5m"
                }
            },
            "max_result_window": "10000",
            "number_of_shards": "20",
            "number_of_replicas": "1", 
            "translog": {
                "durability": "async"
            },
            "requests": {
                "cache": {
                    "enable": "true"
                }
            },
            "mapping": {
                "ignore_malformed": "true"
            },
            "refresh_interval": "5s"
        }
    }
}

How to configure

curl -XPUT localhost:9200/_template/template_1 -d '
{
    "template" : "test*",
    "order" : 0,
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "type1" : {
            "_source" : { "enabled" : false }
        }
    }
}
'

Creating an index once the template is in place

# Create the index
curl -XPUT http://35.1.4.127:9200/index_name
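Before creating an index, it can help to check whether its name would be picked up by a template pattern such as test*. The sketch below uses a hypothetical helper, template_matches, and assumes the pattern contains only the * wildcard, so that shell glob matching behaves like the template pattern.

```shell
# Return 0 if INDEX matches PATTERN, 1 otherwise.
# Assumes PATTERN uses only "*" as a wildcard (as in "test*").
template_matches() {
    pattern=$1
    index=$2
    case "$index" in
        $pattern) return 0 ;;   # unquoted on purpose: treated as a glob
        *)        return 1 ;;
    esac
}
```

For example, template_matches 'test*' 'test_2019' succeeds, while template_matches 'test*' 'prod_test' fails because the name does not start with test.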

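3. Things to watch when creating mappings

When creating the type mapping of an index, handle _all and _source with care, otherwise indexing performance suffers.

_all: when enabled, all fields of a type are merged into one large field, which increases indexing time and index size.

_source: when enabled, the original document is stored and responses return it as the _source structure.

We usually disable _all and enable _source.

For dates, a mapping like the one below accepts a wide range of awkward time formats.

How to configure

curl -XPUT http://35.1.4.129:9200/index_name/RELATION/_mapping -d '{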
    "RELATION": {
        "_all": {
            "enabled": "false"
        },
        "_source": {
            "enabled": "true"
        },
        "properties": {
            "FROM_SFZH": {
                "type": "keyword"
            },
            "TO_SFZH": {
                "type": "keyword"
            },
            "CREATE_TIME": {
                "type": "date",
                "format": "yyyy-MM-dd HH:mm:ss.SSS Z||yyyy-MM-dd HH:mm:ss.SSS||yyyy-MM-dd HH:mm:ss,SSS||yyyy/MM/dd HH:mm:ss||yyyy-MM-dd HH:mm:ss,SSS Z||yyyy/MM/dd HH:mm:ss,SSS Z||strict_date_optional_time||epoch_millis||yyyy-MM-dd HH:mm:ss"
            }
        }
    }
}'

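4. Disable replicas and refresh when bulk-loading data into ES

When importing data in bulk at large scale, disable replicas and refresh. If an index has replicas, ES synchronizes them while indexing, which adds load.

Restore the replicas once indexing has finished.

How to configure

# Disable
curl -XPUT http://35.1.4.129:9200/_settings -d '{
    "index": {
        "number_of_replicas": 0,
        "refresh_interval": -1
    }
}'

# Re-enable
curl -XPUT http://35.1.4.129:9200/_settings -d '{
    "index": {
        "number_of_replicas": 1,
        "refresh_interval": "5s"
    }
}'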
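Since the disable and re-enable requests differ only in two values, the body can be generated instead of typed by hand. This is a minimal sketch; index_settings_body is a hypothetical helper name, and the refresh interval is emitted as a JSON string (an assumption that covers both "-1" and "5s").

```shell
# Print an index settings body for the given replica count and refresh interval.
# Usage: index_settings_body REPLICAS REFRESH_INTERVAL
index_settings_body() {
    printf '{"index":{"number_of_replicas":%d,"refresh_interval":"%s"}}\n' "$1" "$2"
}
```

It could then be used as curl -XPUT http://35.1.4.129:9200/_settings -d "$(index_settings_body 0 -1)" before the import and with the arguments 1 5s afterwards.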

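5. JVM-level monitoring and tuning

Elasticsearch is written in Java, so you can load-test it and watch how the JVM behaves, for example by connecting remotely with jconsole.

config/jvm.options holds the JVM settings; adjust them to fit the available hardware. JVM tuning itself is not covered here.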

-Djava.rmi.server.hostname=192.168.1.152
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=9110
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false

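6. Tuning ES thread pools for highly concurrent queries

As query concurrency grows, you may occasionally see the following exception

EsRejectedExecutionException[rejected execution(queue capacity 50) on.......]

The cause is that in newer Elasticsearch versions these thread pools are of type fixed, that is, pools with a fixed number of threads derived from the number of available processors. When every thread is busy and the queue is full, ES rejects the request.

Different request types use different thread pools

index: used for index and delete operations. The type defaults to fixed, the size defaults to the number of available processors, and the queue size defaults to 200.
search: used for search and count requests. The type defaults to fixed, the size defaults to int((number of available processors * 3) / 2) + 1, and the queue size defaults to 1000.
suggest: used for suggester requests. The type defaults to fixed, the size defaults to the number of available processors, and the queue size defaults to 1000.
get: used for real-time GET requests. The type defaults to fixed, the size defaults to the number of available processors, and the queue size defaults to 1000.
bulk: used for bulk operations. The type defaults to fixed, the size defaults to the number of available processors, and the queue size defaults to 50.
percolate: used for percolator operations. The type defaults to fixed, the size defaults to the number of available processors, and the queue size defaults to 1000.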
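To make the search pool formula concrete, the default size can be computed for a given processor count. A minimal sketch, where search_pool_size is a hypothetical helper and shell integer division stands in for the int() truncation:

```shell
# Default size of the search thread pool for a given number of processors,
# following int((processors * 3) / 2) + 1 from the list above.
search_pool_size() {
    echo $(( ($1 * 3) / 2 + 1 ))
}
```

With 8 processors this gives 13 search threads; with 4 processors it gives 7.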

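Taking index as an example, the thread pool configuration can be changed in elasticsearch.yml (the threadpool.* prefix shown here applies to versions before 5.0; 5.0 and later use thread_pool.*)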

threadpool.index.type: fixed
threadpool.index.size: 100
threadpool.index.queue_size: 500

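Via the API (versions before 5.0 only; from 5.0 on, thread pool settings are node-level and can no longer be updated through the cluster settings API)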

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
    "transient": {
        "threadpool.index.type": "fixed",
        "threadpool.index.size": 100,
        "threadpool.index.queue_size": 500
    }
}'

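7. Some replica shards cannot be allocated and the cluster status is yellow

7.1 Check the cluster status first

curl -XGET 'http://10.96.78.164:9200/_cluster/health?pretty'

The response looks like the following. When shards are unallocated, unassigned_shards is non-zero and status is yellow. (The sample below was taken from a healthy cluster, so it shows green and 0.)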

{
"cluster_name": "elasticsearch",
"status": "green",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 575,
"active_shards": 575,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 0,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 100
}
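For scripted checks, the status field can be pulled out of the health response. The sketch below is one way to do it with sed; health_status is a hypothetical helper that reads the JSON on standard input and assumes the status value is a lowercase word such as green, yellow or red.

```shell
# Print the value of the "status" field from a _cluster/health response on stdin.
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}
```

A possible use is curl -s 'http://10.96.78.164:9200/_cluster/health?pretty' | health_status, followed by an alert when the result is not green.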

7.2 Find out which index the unassigned shards belong to, and which machine is the allocation target.

curl -XGET http://localhost:9200/_cat/shards | grep UNASSIGNED

Result

xiankan_xk_qdhj                  3 r UNASSIGNED    0    261b 10.96.78.164 yfbf9D3
xiankan_xk_qdhj                  2 r UNASSIGNED    0    261b 10.96.78.164 yfbf9D3
xiankan_xk_qdhj                  1 r UNASSIGNED    0    261b 10.96.78.164 yfbf9D3
xiankan_xk_qdhj                  4 r UNASSIGNED    0    261b 10.96.78.164 yfbf9D3

r marks a replica shard, p marks a primary shard, and the IP is the machine the shard is to be allocated to
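When many indices are affected, a per-index count is easier to read than the raw grep output. A minimal sketch, assuming the default _cat/shards column order shown above (index, shard, primary or replica, state); count_unassigned is a hypothetical helper name.

```shell
# Read _cat/shards output on stdin and print "INDEX COUNT" for every index
# that has shards in the UNASSIGNED state, sorted by index name.
count_unassigned() {
    awk '$4 == "UNASSIGNED" { n[$1]++ } END { for (i in n) print i, n[i] }' | sort
}
```

It can be fed directly from the API, for example curl -s http://localhost:9200/_cat/shards | count_unassigned.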

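7.3 Attempt 1: reallocate replicas at the index level

For the problematic index, first drop its replicas, then re-enable them so that they are allocated afresh.

Disable

curl -XPUT http://35.1.4.129:9200/xiankan_xk_zjhj/_settings -d '{
    "index": {
        "number_of_replicas": 0
    }
}'

Re-enable

curl -XPUT http://10.96.78.164:9200/xiankan_xk_zjhj/_settings -d '{
  "index": {
    "number_of_replicas": 1
  }
}'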

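7.4 Attempt 2: reallocate replicas at the node level

Restart the nodes on which shard allocation failed. If the affected shards sit on only a handful of nodes, restart the ES instance on each of those nodes, identified by IP.

Kill ES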

ps -ef | grep elasticsearch | grep -v grep | awk '{print $2}' | xargs kill -9

Start ES

./bin/elasticsearch -d

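7.5 Attempt 3: reroute the shards of each index one by one

Take care with allow_primary: forcing allocation of a primary whose data is gone yields an empty shard. In Elasticsearch 5.0 and later the allocate command is split into allocate_replica, allocate_stale_primary and allocate_empty_primary.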

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "xiankan_xk_zjhj",
"shard" : 1,
"node" : "yfbf9D3",
"allow_primary" : true
}
}
] }'
