Elasticsearch Index Tuning

Reposted article · 2021-02-22 11:53 · elk

1. `index.refresh_interval`: "30s" (consider raising it)

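This setting controls how long after a write the data becomes searchable; the default is 1s. Every refresh creates a new Lucene segment, which leads to frequent merging. If the business does not need near-real-time search, raise this value. In my own tuning this setting made a big difference: CPU usage dropped sharply.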
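A minimal sketch of how the change might be expressed, assuming a hypothetical index named `logs` (not from the original). It only builds the request for the index settings endpoint and does not send it; the second helper shows why a longer interval means fewer new segments.

```python
import json


def refresh_interval_request(index, interval="30s"):
    """Build (but do not send) a settings update for index.refresh_interval."""
    body = {"index": {"refresh_interval": interval}}
    return {"method": "PUT", "path": f"/{index}/_settings", "body": json.dumps(body)}


def max_refreshes_per_hour(interval_seconds):
    """Upper bound on refreshes (and so on new segments) per shard per hour."""
    return 3600 // interval_seconds


if __name__ == "__main__":
    # "logs" is a hypothetical index name used only for illustration.
    print(refresh_interval_request("logs"))
    print(max_refreshes_per_hour(1), max_refreshes_per_hour(30))
```

Going from 1s to 30s lowers the upper bound from 3600 to 120 refreshes per shard per hour, which is where the reduction in merge work comes from.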

2. Translog tuning

To avoid losing data, ES by default syncs the translog to disk every time an index, bulk, delete, or update request completes. This improves data safety at some cost in performance.

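The settings below raise the flush threshold (translog file size) to 1gb, lengthen the sync interval to 30s, and make translog writes asynchronous:

    {
      "index": {
        "translog": {
          "flush_threshold_size": "1gb",
          "sync_interval": "30s",
          "durability": "async"
        }
      }
    }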

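Lucene persists changes to disk only on commit (writing to disk on every operation would be too expensive), so if a failure occurs before a commit, every change made since the previous commit is lost.
To guard against this, every change is also recorded in the translog. After a failure, the changes recorded in the translog since the last commit can be replayed, which keeps data loss to a minimum.
A flush performs a Lucene commit and starts a new translog, so the translog holds the operations between one commit and the next.
Flush frequency is governed by translog size: once the translog reaches a threshold, a flush runs. The setting is index.translog.flush_threshold_size; the default is 512mb, and raising it to 1gb here reduces the number of flushes.
The translog is itself a file that has to reach disk, and how it does so is controlled by index.translog.durability and index.translog.sync_interval. The default, index.translog.durability=request, writes the translog to disk on every request. That lowers the risk of data loss but costs more disk I/O.
Here the translog is persisted asynchronously instead, with a write to disk every 30 seconds.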
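The trade-off described above can be put into numbers. The sketch below is illustrative arithmetic only, with hypothetical helper names: it counts size-triggered flushes for a given volume of translog data and reports the window of acknowledged writes that may not have been synced yet under each durability mode.

```python
def parse_size_mb(size):
    """Convert a size such as '512mb' or '1gb' to megabytes."""
    units = {"mb": 1, "gb": 1024}
    text = size.strip().lower()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    raise ValueError(f"unsupported size: {size!r}")


def flushes_triggered(translog_mb_written, threshold):
    """Size-triggered flushes for a given volume of translog data."""
    return translog_mb_written // parse_size_mb(threshold)


def worst_case_unsynced_seconds(durability, sync_interval_seconds):
    """Window of acknowledged writes that may be lost if the node crashes."""
    if durability == "request":
        return 0  # synced before each request is acknowledged
    if durability == "async":
        return sync_interval_seconds  # synced only once per interval
    raise ValueError(f"unknown durability: {durability!r}")


if __name__ == "__main__":
    for threshold in ("512mb", "1gb"):
        print(threshold, flushes_triggered(10 * 1024, threshold))
    print(worst_case_unsynced_seconds("async", 30))
```

For 10 GB of translog data, doubling the threshold halves the flush count (20 versus 10), while the async setting accepts that roughly the last 30 seconds of writes may be lost in a crash.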

3. index.store.throttle.type: "none"

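When you are doing a bulk import and do not care about search performance, you want indexing to reach the limit of the disk, so disable merge throttling entirely. Re-enable it once indexing has finished.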

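Indexing first produces many small segments, which asynchronous logic then merges.
Merging is I/O-heavy; when the system's I/O performance is poor, merges hurt both query and indexing performance.
index.store.throttle.type and index.store.throttle.max_bytes_per_sec limit, at the index level, the disk bandwidth that merges may consume (the node-level equivalents are indices.store.throttle.*), so that merging does not saturate the disk and affect other operations.
Another article on ES 2.x index tuning notes that if query performance is not a concern, index.store.throttle.type can be set to none, meaning merges are not rate-limited.
By default, merges are limited to 20 MB/s of disk bandwidth.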

4. indices.store.throttle.max_bytes_per_sec: "100mb" (when using SSDs)

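The ES defaults are deliberate here: search performance should not suffer because of merging. In some cases, though (SSDs, or index-heavy workloads), the limit is too low. The default is 20 MB/s, which is very reasonable for spinning disks. On SSDs, consider raising it to 100-200 MB/s.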
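To see what the limit means in practice, the following back-of-the-envelope sketch estimates how long a throttled merge takes to write its output. The 5 GB merged segment is a hypothetical figure chosen for illustration, and the calculation ignores read time and everything else competing for the disk.

```python
def merge_write_seconds(merged_size_mb, limit_mb_per_sec):
    """Minimum time to write a merged segment under a bandwidth limit."""
    if limit_mb_per_sec <= 0:
        raise ValueError("limit must be positive")
    return merged_size_mb / limit_mb_per_sec


if __name__ == "__main__":
    size_mb = 5 * 1024  # hypothetical 5 GB merged segment
    for limit in (20, 100, 200):
        print(f"{limit} MB/s: {merge_write_seconds(size_mb, limit):.1f} s")
```

At the default 20 MB/s that merge needs at least 256 seconds, which is why merging can fall behind on index-heavy workloads even when the disk could write much faster.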

5. index.merge.scheduler.max_thread_count: 1 (when using spinning disks rather than SSDs)

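Goal: reduce the disk load caused by concurrent merges.
An index consists of multiple shards, and each shard is split into many segments; the segment is the smallest unit of index storage.
When indexing, ES first produces small segments.
Too many segments slow down search (every segment has to be queried), so ES merges small segments in the background to improve query performance. Merging consumes a lot of disk I/O, however, and can itself affect query performance.
index.merge.scheduler.max_thread_count controls the number of concurrent merge threads. On SSDs, which handle concurrency well, the system default of max(1, min(4, availableProcessors / 2)) is fine; on ordinary disks, set it to 1.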
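The default quoted above is easy to evaluate for a given machine. This sketch mirrors that formula in Python, using integer division as Java would for `availableProcessors / 2`; the `disk` argument and function names are hypothetical conveniences, not Elasticsearch options.

```python
import os


def default_merge_threads(available_processors):
    """max(1, min(4, availableProcessors / 2)) with integer division."""
    return max(1, min(4, available_processors // 2))


def merge_threads_for(disk, available_processors):
    """Pick max_thread_count following the advice in the text."""
    if disk == "spinning":
        return 1
    if disk == "ssd":
        return default_merge_threads(available_processors)
    raise ValueError(f"unknown disk type: {disk!r}")


if __name__ == "__main__":
    cpus = os.cpu_count() or 1
    print("ssd:", merge_threads_for("ssd", cpus))
    print("spinning:", merge_threads_for("spinning", cpus))
```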
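An overall explanation:

Segment merging is very expensive and consumes a great deal of disk I/O. Merges are scheduled in the background because they can take a long time, especially for large segments. That is acceptable, because merges of large segments are relatively rare.

Sometimes, however, merging falls behind the ingestion rate. When that happens, ES automatically throttles index requests to a single thread, which prevents a segment explosion (a large number of segments accumulating before they can be merged). When ES detects that merging is falling behind, it logs an INFO message beginning with "now throttling indexing".

The ES defaults are deliberate here: search performance should not suffer because of merging. In some cases, though (SSDs, or index-heavy workloads), the limit is too low. The default is 20 MB/s, which is very reasonable for spinning disks. On SSDs, consider raising it to 100-200 MB/s; the corresponding setting is indices.store.throttle.max_bytes_per_sec: "100mb". Test several values to find the one that works best.

If you are doing a bulk import and do not care about search performance, you want indexing to reach the limit of the disk, so set indices.store.throttle.type: "none". This disables merge throttling entirely; re-enable it when indexing finishes.

If you are using spinning disks rather than SSDs, add this to elasticsearch.yml: index.merge.scheduler.max_thread_count: 1. Spinning disks are slow at concurrent I/O, so the number of threads accessing the disk at once should be reduced. The setting allows max_thread_count + 2 threads to operate on the disk at a time, so a value of 1 allows three threads. On SSDs you can ignore this setting; the default, which in the version this passage describes is Math.min(3, Runtime.getRuntime().availableProcessors() / 2), works well.

Finally, raise index.translog.flush_threshold_size from its default (200mb in the version this passage describes) to something larger, such as 1gb. This lets larger segments accumulate before a flush occurs. Building larger segments means fewer flushes and fewer merges.

All of these measures aim to reduce disk I/O and thereby improve indexing performance.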

6. indices.memory.index_buffer_size: "20%"

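Goal: reduce the likelihood of forced writes to disk.
This setting specifies how much memory is used for indexing. Indexed data is first held in memory; when the buffer fills up, its contents are written to disk as a segment. A larger buffer therefore means fewer forced writes to disk.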
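A percentage value is taken relative to the JVM heap, so its absolute size depends on how the node is configured. The sketch below does that arithmetic; the 8 GB heap is a hypothetical example, and the 10% figure is included only as a smaller value for comparison.

```python
def index_buffer_mb(heap_mb, percent):
    """Indexing buffer size in MB for a heap size and a setting like '20%'."""
    if not percent.endswith("%"):
        raise ValueError("expected a percentage such as '20%'")
    return heap_mb * int(percent[:-1]) // 100


if __name__ == "__main__":
    heap_mb = 8 * 1024  # hypothetical 8 GB heap
    for setting in ("10%", "20%"):
        print(setting, index_buffer_mb(heap_mb, setting), "MB")
```

The buffer is shared by the shards that are actively indexing on the node, so the per-shard share shrinks as more shards receive writes.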

7. `index.number_of_replicas: 0`

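If you are doing a large bulk import, note that when a document is replicated, the whole document is sent to the replica node and the indexing process is repeated verbatim. Each replica therefore also performs the analysis, the indexing, and potentially the merging. If instead you index with zero replicas and enable replicas once the import finishes, recovery is essentially a byte-for-byte network transfer, which is far more efficient than repeating the indexing process.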
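The procedure amounts to two settings updates around the import. As a rough sketch, assuming a hypothetical index called `logs` and a final replica count of 1, the two requests could be built like this (nothing is sent to a cluster):

```python
import json


def replicas_request(index, replicas):
    """Build (method, path, body) for changing index.number_of_replicas."""
    body = {"index": {"number_of_replicas": replicas}}
    return ("PUT", f"/{index}/_settings", json.dumps(body))


def bulk_import_plan(index, final_replicas):
    """Requests to issue before and after a large bulk import."""
    return {
        "before": replicas_request(index, 0),            # index with no replicas
        "after": replicas_request(index, final_replicas),  # then re-enable them
    }


if __name__ == "__main__":
    plan = bulk_import_plan("logs", 1)  # hypothetical index and replica count
    print(plan["before"])
    print(plan["after"])
```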

8. Thread pool tuning (omitted)

Common Elasticsearch configuration and performance parameters

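cluster.name: estest  # cluster name
node.name: "testanya"  # node name

node.master: false  # whether the node is master-eligible
node.data: true  # whether the node stores data

index.store.type: niofs  # how index files are read and written
index.cache.field.type: soft  # cache type

bootstrap.mlockall: true  # lock memory so the process is not swapped out

gateway.type: local  # local storage

gateway.recover_after_nodes: 3  # start recovery once 3 data nodes are present

gateway.recover_after_time: 5m  # start recovering data after 5 minutes

gateway.expected_nodes: 4  # start recovery once 4 ES nodes are present

cluster.routing.allocation.node_initial_primaries_recoveries: 8  # concurrent primary shard recoveries
cluster.routing.allocation.node_concurrent_recoveries: 2  # concurrent recoveries per node

indices.recovery.max_bytes_per_sec: 250mb  # maximum bandwidth for data transfer between nodes
indices.recovery.concurrent_streams: 8  # concurrent streams reading data files

discovery.zen.ping.multicast.enabled: false  # disable multicast
discovery.zen.ping.unicast.hosts: ["192.168.169.11:9300", "192.168.169.12:9300"]

discovery.zen.fd.ping_interval: 10s  # interval between node liveness checks
discovery.zen.fd.ping_timeout: 120s  # liveness check timeout
discovery.zen.fd.ping_retries: 6  # retries after a liveness check times out

http.cors.enabled: true  # used for monitoring

index.analysis.analyzer.ik.type: "ik"  # IK analyzer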
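The three gateway settings listed above interact, which is easier to see as a decision function. This is a simplified reading of their documented behaviour, not Elasticsearch's actual implementation: recovery starts immediately once the expected number of nodes is present, and otherwise waits until the minimum node count is met and the wait time has elapsed. Function and parameter names are hypothetical.

```python
def should_start_recovery(nodes_joined, seconds_waited, *,
                          recover_after_nodes=3,
                          recover_after_time_s=300,
                          expected_nodes=4):
    """Simplified model of when gateway recovery may begin."""
    if nodes_joined >= expected_nodes:
        return True  # everyone expected is here, no need to wait
    enough_nodes = nodes_joined >= recover_after_nodes
    waited_long_enough = seconds_waited >= recover_after_time_s
    return enough_nodes and waited_long_enough


if __name__ == "__main__":
    print(should_start_recovery(4, 0))     # all four nodes joined
    print(should_start_recovery(3, 60))    # three nodes, one minute in
    print(should_start_recovery(3, 300))   # three nodes, five minutes in
```

The defaults in the signature are the values from the listing (3 nodes, 5m, 4 nodes).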

Thread pool settings

threadpool.index.type: fixed  # indexing thread pool type
threadpool.index.size: 64  # pool size (suggested: 2-3x the number of CPUs)
threadpool.index.queue_size: 1000  # queue size

threadpool.search.size: 64  # search thread pool size
threadpool.search.type: fixed  # search thread pool type
threadpool.search.queue_size: 1000  # queue size

threadpool.get.type: fixed  # get thread pool type
threadpool.get.size: 32  # get thread pool size
threadpool.get.queue_size: 1000  # queue size

threadpool.bulk.type: fixed  # bulk request thread pool type
threadpool.bulk.size: 32  # bulk request thread pool size
threadpool.bulk.queue_size: 1000  # queue size

threadpool.flush.type: fixed  # flush thread pool type
threadpool.flush.size: 32  # flush thread pool size
threadpool.flush.queue_size: 1000  # queue size

indices.store.throttle.type: merge
indices.store.throttle.type: none  # disk write throttling type
indices.store.throttle.max_bytes_per_sec: 500mb  # maximum disk write bandwidth

index.merge.scheduler.max_thread_count: 8  # maximum number of index merge threads
index.translog.flush_threshold_size: 600MB  # translog size threshold that triggers a flush


Use the bulk API to speed up ingestion.
For the initial indexing, set the replica count to 0.

Increase threadpool.index.queue_size: 1000
Increase indices.memory.index_buffer_size: 20%
index.translog.durability: async (writes to disk asynchronously, which speeds up writes)
Increase index.translog.flush_threshold_size: 600MB
Increase index.translog.flush_threshold_ops: 500000
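For the first tip, a bulk request body is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline. The helper below is a minimal sketch that builds such a body for a hypothetical `logs` index. It assumes an action line with only `_index` and `_id`; older releases, such as the ones these settings come from, also expect a `_type`.

```python
import json


def bulk_index_body(index, docs):
    """Build an NDJSON body for the _bulk endpoint from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk endpoint requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [("1", {"msg": "first"}), ("2", {"msg": "second"})]
    print(bulk_index_body("logs", docs), end="")
```

Sending many documents per request in this form avoids the per-request overhead of indexing them one at a time.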

curl -XPUT '127.0.0.1:9200/_cluster/settings' -d '{
    "transient": {
        "index.indexing.slowlog.threshold.index.warn": "10s",
        "index.indexing.slowlog.threshold.index.info": "5s",
        "index.indexing.slowlog.threshold.index.debug": "2s",
        "index.indexing.slowlog.threshold.index.trace": "500ms",
        "index.indexing.slowlog.level": "info",
        "index.indexing.slowlog.source": "1000",
        "indices.memory.index_buffer_size": "20%"
    }
}'

curl -XPUT '127.0.0.1:9200/_cluster/settings' -d '{
    "transient": {
        "index.search.slowlog.threshold.query.warn": "10s",
        "index.search.slowlog.threshold.query.info": "5s",
        "index.search.slowlog.threshold.query.debug": "2s",
        "index.search.slowlog.threshold.query.trace": "500ms",
        "index.search.slowlog.threshold.fetch.warn": "1s",
        "index.search.slowlog.threshold.fetch.info": "800ms",
        "index.search.slowlog.threshold.fetch.debug": "500ms",
        "index.search.slowlog.threshold.fetch.trace": "200ms"
    }
}'
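The thresholds form a ladder: a slow operation is logged at the most severe level whose threshold it reaches. The sketch below models that with the query thresholds from the command above. It is a simplified illustration with hypothetical names, and whether the comparison at the exact boundary is inclusive is an assumption.

```python
# Query-phase thresholds from the settings above, in milliseconds,
# ordered from most to least severe.
QUERY_THRESHOLDS_MS = [
    ("warn", 10_000),
    ("info", 5_000),
    ("debug", 2_000),
    ("trace", 500),
]


def slowlog_level(took_ms, thresholds=QUERY_THRESHOLDS_MS):
    """Return the slow log level for a duration, or None if it is fast enough."""
    for level, limit_ms in thresholds:
        if took_ms >= limit_ms:
            return level
    return None


if __name__ == "__main__":
    for took in (12_000, 6_000, 3_000, 600, 100):
        print(took, "ms ->", slowlog_level(took))
```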

When taking nodes offline, exclude every node whose name ends in -2 from the cluster:

curl -XPUT http://127.0.0.1:9200/_cluster/settings -d '{
    "transient": {
        "cluster.routing.allocation.enable": "all",
        "cluster.routing.allocation.exclude._name": "*-2"
    }
}'

curl -XPUT ip:9200/_cluster/settings -d '{
    "transient": {
        "logger.discovery": "DEBUG"
    },
    "persistent": {
        "discovery.zen.minimum_master_nodes": 2
    }
}'

Take a specified batch of nodes offline:

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{ "transient": { "cluster.routing.allocation.exclude._name": "atest11-2,atest12-2,anatest13-2,antest14-2" } }'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{ "transient": { "cluster.routing.allocation.exclude._name": "test_aa73_2,test_aa73" } }'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{ "transient": { "cluster.routing.allocation.exclude._name": "" } }'
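The three commands above differ only in the node list: a comma-separated string excludes those nodes, and an empty string clears the exclusion. A small sketch that builds the same request body from a list of names (the helper name is hypothetical, and nothing is sent to a cluster):

```python
import json


def exclude_nodes_body(node_names):
    """Build the transient settings body that excludes nodes by name.

    An empty list yields an empty string, which clears the exclusion.
    """
    value = ",".join(node_names)
    return json.dumps(
        {"transient": {"cluster.routing.allocation.exclude._name": value}}
    )


if __name__ == "__main__":
    print(exclude_nodes_body(["test_aa73_2", "test_aa73"]))
    print(exclude_nodes_body([]))  # clears the exclusion again
```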

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{ "transient": { "cluster.routing.allocation.cluster_concurrent_rebalance": 10 } }'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{ "transient": { "indices.store.throttle.type": "none", "index.store.type": "niofs", "index.cache.field.type": "soft", "indices.store.throttle.max_bytes_per_sec": "500mb", "index.translog.flush_threshold_size": "600MB", "threadpool.flush.type": "fixed", "threadpool.flush.size": 32, "threadpool.flush.queue_size": 1000 } }'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d '{ "transient": { "index.indexing.slowlog.level": "warn" } }'

Moving shards
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands" : [ {
        "move" :
            {
              "index" : "test_aa_20160529", "shard" : 4,
              "from_node" : "node1", "to_node" : "node2"
            }
        },
        {
          "allocate" : {
              "index" : "test", "shard" : 1, "node" : "node3"
          }
        }
    ]
}'
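When several shards have to be moved, writing the command list by hand gets error-prone. The sketch below generates the same `move` structure as the request above from plain arguments. The helper names are hypothetical, and the output is only the JSON body for `_cluster/reroute`, not a call to the cluster.

```python
import json


def move_command(index, shard, from_node, to_node):
    """One 'move' entry for the reroute API's command list."""
    return {
        "move": {
            "index": index,
            "shard": shard,
            "from_node": from_node,
            "to_node": to_node,
        }
    }


def reroute_body(commands):
    """Wrap a list of reroute commands in the request body."""
    return json.dumps({"commands": list(commands)})


if __name__ == "__main__":
    moves = [move_command("test_aa_20160529", s, "node1", "node2") for s in (3, 4)]
    print(reroute_body(moves))
```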

curl -XPUT 127.0.0.1:9200/_cluster/settings -d ' { "transient": { "logger.indices.recovery": "DEBUG" } }'

curl -XPUT 127.0.0.1:9200/_cluster/settings -d ' { "transient": { "cluster.routing.allocation.node_initial_primaries_recoveries": "100" } }'

curl -XPUT '127.0.0.1:9200/_cluster/settings' -d '{ "transient" : { "indices.memory.index_buffer_size": "20%" } }'

curl -XPUT '127.0.0.1:9200/_cluster/settings' -d '{ "transient" : { "index.indexing.slowlog.level" : "info" } }'

References:

https://www.elastic.co/guide/cn/elasticsearch/guide/current/indexing-performance.html

http://www.voidcn.com/article/p-cpddjjyf-kv.html

http://www.voidcn.com/article/p-kehaizma-mb.html

http://www.voidcn.com/article/p-ypzfonym-bcq.html

http://www.voidcn.com/article/p-ubmbspny-od.html

http://www.voidcn.com/article/p-bwwyyoyx-mc.html

https://cloud.tencent.com/developer/article/1511890

https://www.it610.com/article/1280062656044089344.htm
