Elasticsearch Optimization

Why does ES need optimization?

  Answer: out of the box, ES ships with fairly conservative defaults (open-file limits, JVM heap size, shard and replica counts, logging, and so on) that do not hold up well under production workloads, so each of them needs to be tuned as described below. For example, the default limit on open file descriptors is only 1024, which is far too low for ES; it can be raised for the current shell session with ulimit -n, as shown below:

[root@master elasticsearch-2.4.0]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 6661
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 6661
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
[root@master elasticsearch-2.4.0]# ulimit -n 32000
[root@master elasticsearch-2.4.0]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 6661
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 32000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 6661
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
[root@master elasticsearch-2.4.0]# 

  This must be set on every machine in the 3-node ES cluster; master, slave1 and slave2 all need the same change. Note that ulimit -n only affects the current shell session; the persistent fix via /etc/security/limits.conf is described in Approach 1 below.


How do we go about optimizing ES?

Approach 1: Fix the warning printed at ES startup (the "Too many open files" problem in ES)

  max file descriptors [4096] for elasticsearch process likely too low, consider increasing to at least [65536]

  Edit /etc/security/limits.conf (for example with vi /etc/security/limits.conf) and add the following two lines:

  * soft nofile 65536

  * hard nofile 131072

 In other words, raise these limits and restart the ES process for the change to take effect.
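
  A quick way to confirm the new limit actually reached the ES process, as a minimal sketch (the user name, host and pgrep pattern are assumptions from this article's setup; adjust them to your own installation):

# limit seen by a fresh shell for the user that runs ES
su - hadoop -c 'ulimit -n'                         # expect 65536 (the soft nofile value)

# limit actually applied to the running ES process
ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n 1)
grep 'Max open files' /proc/${ES_PID}/limits

# ask ES itself: in 2.x the nodes stats API reports max_file_descriptors
curl -s 'http://localhost:9200/_nodes/stats/process?pretty' | grep max_file_descriptors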


Approach 2: Adjust the ES JVM heap size in the configuration file

  Edit ES_MIN_MEM and ES_MAX_MEM in bin/elasticsearch.in.sh. Setting them to the same value avoids repeated heap resizing; depending on the server's memory, roughly 60% of it is a common allocation (the default is 256M).

  Note: never allocate more than 32 GB (see the excerpt from the Elastic documentation below).

  Once you cross that magical 32 GB boundary, pointers switch back to ordinary object pointers. Every pointer grows in size and eats more CPU-memory bandwidth; in fact a 40-50 GB heap gives you about the same effective memory as a heap just under 32 GB.

 

 

Link: https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html#compressed_oops

Don’t Cross 32 GB!
There is another reason to not allocate enormous heaps to Elasticsearch. As it turns out, the HotSpot JVM uses a trick to compress object pointers when heaps are less than around 32 GB.
In Java, all objects are allocated on the heap and referenced by a pointer. Ordinary object pointers (OOP) point at these objects, and are traditionally the size of the CPU’s native word: either 32 bits or 64 bits, depending on the processor. The pointer references the exact byte location of the value.
For 32-bit systems, this means the maximum heap size is 4 GB. For 64-bit systems, the heap size can get much larger, but the overhead of 64-bit pointers means there is more wasted space simply because the pointer is larger. And worse than wasted space, the larger pointers eat up more bandwidth when moving values between main memory and various caches (LLC, L1, and so forth).
Java uses a trick called compressed oops to get around this problem. Instead of pointing at exact byte locations in memory, the pointers reference object offsets. This means a 32-bit pointer can reference four billion objects, rather than four billion bytes. Ultimately, this means the heap can grow to around 32 GB of physical size while still using a 32-bit pointer.
Once you cross that magical ~32 GB boundary, the pointers switch back to ordinary object pointers. The size of each pointer grows, more CPU-memory bandwidth is used, and you effectively lose memory. In fact, it takes until around 40–50 GB of allocated heap before you have the same effective memory of a heap just under 32 GB using compressed oops.
The moral of the story is this: even when you have memory to spare, try to avoid crossing the 32 GB heap boundary. It wastes memory, reduces CPU performance, and makes the GC struggle with large heaps.

   Note: the 32 GB limit applies to each individual ES instance, not to all of them combined.


[hadoop@master bin]$ pwd
/home/hadoop/app/elasticsearch-2.4.0/bin
[hadoop@master bin]$ ll
total 324
-rwxr-xr-x 1 hadoop hadoop   5551 Aug 24  2016 elasticsearch
-rw-rw-r-- 1 hadoop hadoop    909 Aug 24  2016 elasticsearch.bat
-rw-rw-r-- 1 hadoop hadoop   3307 Aug 24  2016 elasticsearch.in.bat
-rwxr-xr-x 1 hadoop hadoop   2814 Aug 24  2016 elasticsearch.in.sh
-rw-rw-r-- 1 hadoop hadoop 104448 Jul 27  2016 elasticsearch-service-mgr.exe
-rw-rw-r-- 1 hadoop hadoop 103936 Jul 27  2016 elasticsearch-service-x64.exe
-rw-rw-r-- 1 hadoop hadoop  80896 Jul 27  2016 elasticsearch-service-x86.exe
-rwxr-xr-x 1 hadoop hadoop   2992 Aug 24  2016 plugin
-rw-rw-r-- 1 hadoop hadoop   1303 Aug 24  2016 plugin.bat
-rw-rw-r-- 1 hadoop hadoop   6872 Aug 24  2016 service.bat
[hadoop@master bin]$ vim elasticsearch.in.sh


  Set it to roughly 60% of your own machine's physical memory.
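
  For instance, on a machine with 8 GB of RAM the edit in bin/elasticsearch.in.sh might look like the following sketch (the 5g figure is simply about 60% of an assumed 8 GB; pick a value that matches your own hardware):

ES_MIN_MEM=5g          # default was 256m
ES_MAX_MEM=5g          # keep it equal to ES_MIN_MEM so the heap is never resized

  Alternatively, exporting ES_HEAP_SIZE=5g in the environment before starting ES sets both values at once.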


Approach 3: Lock the process's physical memory with memory_lock

  Preventing the heap from being swapped out improves performance.

  Edit the file config/elasticsearch.yml:

  bootstrap.mlockall: true   (this is the ES 2.x name of the setting; it was renamed to bootstrap.memory_lock in ES 5.0)

   I will not go into further detail here.


[hadoop@master config]$ pwd
/home/hadoop/app/elasticsearch-2.4.0/config
[hadoop@master config]$ ll
total 12
-rw-rw-r-- 1 hadoop hadoop 3393 Jul  5 22:19 elasticsearch.yml
-rw-rw-r-- 1 hadoop hadoop 2571 Aug 24  2016 logging.yml
drwxrwxr-x 2 hadoop hadoop 4096 Apr 21 15:43 scripts
[hadoop@master config]$ vim elasticsearch.yml

 

 

   Uncomment the setting.

  All three nodes of the ES cluster (master, slave1 and slave2) need this change.
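
  After restarting the nodes you can check whether the memory lock actually took effect; a minimal sketch (the address comes from this article's cluster, and in ES 2.x the nodes info API reports the mlockall status):

curl -s 'http://192.168.80.10:9200/_nodes/process?pretty' | grep mlockall

  A value of "mlockall" : true on every node means the lock succeeded. If it stays false, the memlock ulimit is usually too low; adding "* soft memlock unlimited" and "* hard memlock unlimited" to /etc/security/limits.conf is the usual fix (verify this for your own distribution).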


Approach 4: More shards increase indexing capacity; 5-20 shards per index is a reasonable range.

  Too few or too many shards both make searches slower.

  Too many shards means more files have to be opened during a search, plus more communication between servers.

  Too few shards makes each individual shard's index too large, which also slows searches down.

  Keep a single shard to at most roughly 20 GB of index data, so: number of shards = total data volume / 20 GB. A creation-time sketch follows.
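
  As a sketch, an index expected to hold about 200 GB of data would get roughly 10 shards at creation time (the index name and figures below are purely illustrative):

curl -XPUT 'http://192.168.80.10:9200/zhouls_logs' -d '{
  "settings": {
    "number_of_shards": 10,
    "number_of_replicas": 1
  }
}'

  Note that number_of_shards cannot be changed after the index is created, so it pays to estimate the data volume up front.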


Approach 5: More replicas improve search capacity, but many replicas also put extra pressure on the servers, because the primary shard has to sync its data to every replica. One or two replicas are usually enough; a sketch of adjusting the count follows.
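
  Unlike the shard count, the replica count can be changed at any time on a live index, for example (the index name is illustrative):

curl -XPUT 'http://192.168.80.10:9200/zhouls/_settings' -d '{
  "index": { "number_of_replicas": 2 }
}'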


Approach 6: The Elastic documentation recommends keeping no more than about three shards on a single ES instance; if you need "more shards", the only option is to add more machines. If a server is powerful enough, you can also run several ES instances on the same host.


Approach 7: Merge-optimize indices on a regular schedule; the more segments an index accumulates, the more segment memory it uses and the worse query performance becomes.

  If the index volume is not very large, it can be merged down to a single segment.

  Before ES 2.1.0 this was done through the _optimize API; it was later renamed to _forcemerge.

  curl -XPOST 'http://localhost:9200/zhouls/_forcemerge?max_num_segments=1'

  client.admin().indices().prepareForceMerge("zhouls").setMaxNumSegments(1).get();
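
  To see how many segments an index currently has, before and after the merge, the cat segments API can be used (a sketch; the index name is illustrative):

curl -XGET 'http://192.168.80.10:9200/_cat/segments/zhouls?v'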


Approach 8: Close indices that are not in use to reduce memory consumption. As long as an index is open, its segments occupy memory; once it is closed it only takes up disk space.

curl -XPOST 'localhost:9200/zhouls/_close'
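
  When the index is needed again it can simply be reopened:

curl -XPOST 'localhost:9200/zhouls/_open'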


Approach 9: Deleted documents. When a document is deleted in ES, the data is not removed from disk immediately; a .del marker file is created in the index instead. Those documents still take part in searches, and ES has to check at query time whether each hit has been deleted and filter it out, which lowers search efficiency. You can therefore purge the deleted documents explicitly:

curl -XPOST 'http://192.168.80.10:9200/zhouls/_forcemerge?only_expunge_deletes=true'

client.admin().indices().prepareForceMerge("zhouls").setOnlyExpungeDeletes(true).get();


Approach 10: If you need to bulk-load a large amount of data at the start of a project, set the replica count to 0.

  While indexing, ES immediately syncs the data to any replicas that exist, which adds pressure on the cluster. Load with replicas disabled, then change the replica count back to what you need after indexing is complete; this raises indexing efficiency. A sketch follows.
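
  A minimal sketch of the two settings calls around the bulk load (the index name is illustrative):

# before the bulk load: no replicas
curl -XPUT 'http://192.168.80.10:9200/zhouls/_settings' -d '{
  "index": { "number_of_replicas": 0 }
}'

# ... run the bulk indexing here ...

# after the bulk load: restore the replicas you actually want
curl -XPUT 'http://192.168.80.10:9200/zhouls/_settings' -d '{
  "index": { "number_of_replicas": 1 }
}'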


Approach 11: Drop the _all field from the mapping. By default every index has an _all field into which the contents of all other fields are copied. It makes querying convenient, but it increases both indexing time and index size.

  Disable the _all field:  "_all": {"enabled": false}

  If you only want to keep a particular field out of _all, use "include_in_all": false on that field. Mapping sketches follow.
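
  Two mapping sketches for ES 2.x, one disabling _all for the whole type and one only excluding a single field (index, type and field names are illustrative):

# disable _all for the whole type
curl -XPUT 'http://192.168.80.10:9200/zhouls' -d '{
  "mappings": {
    "article": {
      "_all": { "enabled": false },
      "properties": {
        "title": { "type": "string" }
      }
    }
  }
}'

# keep _all but leave one field out of it
curl -XPUT 'http://192.168.80.10:9200/zhouls2' -d '{
  "mappings": {
    "article": {
      "properties": {
        "title":   { "type": "string" },
        "content": { "type": "string", "include_in_all": false }
      }
    }
  }
}'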


Approach 12: The log level defaults to trace, so any query slower than 500 ms counts as a slow query and gets logged, which drives CPU, memory and I/O load up. Changing the log level to info relieves this pressure on the server.

  Edit the ES_HOME/config/logging.yml file.
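
  A minimal sketch of the relevant lines in config/logging.yml for ES 2.x (verify against your own file; only the level value needs to change):

es.logger.level: INFO
rootLogger: ${es.logger.level}, console, file

  The slow-query thresholds themselves (the index.search.slowlog.threshold.query.warn/info/debug/trace settings) live in elasticsearch.yml and can be raised there as well if slow-query logging is still too noisy.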


Approach 1 in detail: reproducing and resolving the ES warning message

   If, from the ES_HOME directory, you start ES in the foreground with bin/elasticsearch rather than in the background with bin/elasticsearch -d, you will see the following:

   Note: in my setup ES depends on a service running in Tomcat (the IK analyzer fetches a hot-update dictionary over HTTP at startup, as the log below shows), so starting ES in the foreground without Tomcat running fails. Therefore, start Tomcat first:

[hadoop@HadoopMaster bin]$ pwd
/home/hadoop/app/tomcat-7.0.73/bin
[hadoop@HadoopMaster bin]$ ./startup.sh
Using CATALINA_BASE: /home/hadoop/app/tomcat-7.0.73
Using CATALINA_HOME: /home/hadoop/app/tomcat-7.0.73
Using CATALINA_TMPDIR: /home/hadoop/app/tomcat-7.0.73/temp
Using JRE_HOME: /home/hadoop/app/jdk1.7.0_79/jre
Using CLASSPATH: /home/hadoop/app/tomcat-7.0.73/bin/bootstrap.jar:/home/hadoop/app/tomcat-7.0.73/bin/tomcat-juli.jar
Tomcat started.
[hadoop@HadoopMaster bin]$ jps
2916 Jps
2906 Bootstrap
[hadoop@HadoopMaster bin]$ cd ..
[hadoop@HadoopMaster tomcat-7.0.73]$ cd ..
[hadoop@HadoopMaster app]$ cd elasticsearch-2.4.3/
[hadoop@HadoopMaster elasticsearch-2.4.3]$ bin/elasticsearch
[2017-02-28 22:08:49,862][WARN ][bootstrap ] unable to install syscall filter: seccomp unavailable: requires kernel 3.5+ with CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER compiled in
[2017-02-28 22:08:51,324][INFO ][node ] [Dragonwing] version[2.4.3], pid[2930], build[d38a34e/2016-12-07T16:28:56Z]
[2017-02-28 22:08:51,324][INFO ][node ] [Dragonwing] initializing ...
[2017-02-28 22:08:55,760][INFO ][plugins ] [Dragonwing] modules [lang-groovy, reindex, lang-expression], plugins [analysis-ik, kopf, head], sites [kopf, head]
[2017-02-28 22:08:55,846][INFO ][env ] [Dragonwing] using [1] data paths, mounts [[/home (/dev/sda5)]], net usable_space [23.4gb], net total_space [26.1gb], spins? [possibly], types [ext4]
[2017-02-28 22:08:55,846][INFO ][env ] [Dragonwing] heap size [1015.6mb], compressed ordinary object pointers [true]
[2017-02-28 22:08:55,848][WARN ][env ] [Dragonwing] max file descriptors [4096] for elasticsearch process likely too low, consider increasing to at least [65536]
[2017-02-28 22:09:00,957][INFO ][ik-analyzer ] try load config from /home/hadoop/app/elasticsearch-2.4.3/config/analysis-ik/IKAnalyzer.cfg.xml
[2017-02-28 22:09:00,959][INFO ][ik-analyzer ] try load config from /home/hadoop/app/elasticsearch-2.4.3/plugins/ik/config/IKAnalyzer.cfg.xml
[2017-02-28 22:09:01,925][INFO ][ik-analyzer ] [Dict Loading] custom/mydict.dic
[2017-02-28 22:09:01,926][INFO ][ik-analyzer ] [Dict Loading] custom/single_word_low_freq.dic
[2017-02-28 22:09:01,932][INFO ][ik-analyzer ] [Dict Loading] custom/zhouls.dic
[2017-02-28 22:09:01,933][INFO ][ik-analyzer ] [Dict Loading] http://192.168.80.10:8081/zhoulshot.dic
[2017-02-28 22:09:09,451][INFO ][ik-analyzer ] 好記性不如爛筆頭感嘆號博客園熱更新詞
[2017-02-28 22:09:09,550][INFO ][ik-analyzer ] 桂林不霧霾
[2017-02-28 22:09:09,615][INFO ][ik-analyzer ] [Dict Loading] custom/ext_stopword.dic
[2017-02-28 22:09:13,620][INFO ][node ] [Dragonwing] initialized
[2017-02-28 22:09:13,621][INFO ][node ] [Dragonwing] starting ...
[2017-02-28 22:09:13,932][INFO ][transport ] [Dragonwing] publish_address {192.168.80.10:9300}, bound_addresses {[::]:9300}
[2017-02-28 22:09:13,960][INFO ][discovery ] [Dragonwing] elasticsearch/eKzsH0g5QoGl6pQlCG4mOQ
[2017-02-28 22:09:17,357][INFO ][cluster.service ] [Dragonwing] detected_master {Carrie Alexander}{98-Mux6mQsu1oE__EJN7yQ}{192.168.80.11}{192.168.80.11:9300}, added {{Carrie Alexander}{98-Mux6mQsu1oE__EJN7yQ}{192.168.80.11}{192.168.80.11:9300},{Shocker}{u_IYMF3ISe6_iki9KwxPCA}{192.168.80.12}{192.168.80.12:9300},}, reason: zen-disco-receive(from master [{Carrie Alexander}{98-Mux6mQsu1oE__EJN7yQ}{192.168.80.11}{192.168.80.11:9300}])
[2017-02-28 22:09:17,637][INFO ][http ] [Dragonwing] publish_address {192.168.80.10:9200}, bound_addresses {[::]:9200}
[2017-02-28 22:09:17,638][INFO ][node ] [Dragonwing] started
[2017-02-28 22:09:19,812][INFO ][ik-analyzer ] 重新加載詞典...
[2017-02-28 22:09:19,816][INFO ][ik-analyzer ] try load config from /home/hadoop/app/elasticsearch-2.4.3/config/analysis-ik/IKAnalyzer.cfg.xml
[2017-02-28 22:09:19,820][INFO ][ik-analyzer ] try load config from /home/hadoop/app/elasticsearch-2.4.3/plugins/ik/config/IKAnalyzer.cfg.xml
[2017-02-28 22:09:23,102][WARN ][monitor.jvm ] [Dragonwing] [gc][young][8][7] duration [1.6s], collections [1]/[1.9s], total [1.6s]/[5.2s], memory [121.7mb]->[79.4mb]/[1015.6mb], all_pools {[young] [59.9mb]->[457kb]/[66.5mb]}{[survivor] [8.2mb]->[8.3mb]/[8.3mb]}{[old] [53.5mb]->[70.6mb]/[940.8mb]}
[2017-02-28 22:09:23,946][INFO ][ik-analyzer ] [Dict Loading] custom/mydict.dic
[2017-02-28 22:09:23,947][INFO ][ik-analyzer ] [Dict Loading] custom/single_word_low_freq.dic
[2017-02-28 22:09:23,953][INFO ][ik-analyzer ] [Dict Loading] custom/zhouls.dic
[2017-02-28 22:09:23,955][INFO ][ik-analyzer ] [Dict Loading] http://192.168.80.10:8081/zhoulshot.dic
[2017-02-28 22:09:23,996][INFO ][ik-analyzer ] 好記性不如爛筆頭感嘆號博客園熱更新詞
[2017-02-28 22:09:23,997][INFO ][ik-analyzer ] 桂林不霧霾
[2017-02-28 22:09:24,000][INFO ][ik-analyzer ] [Dict Loading] custom/ext_stopword.dic
[2017-02-28 22:09:24,002][INFO ][ik-analyzer ] 重新加載詞典完畢...

  

  For more detail on starting ES in the foreground and in the background, see:

Elasticsearch startup (foreground and background)

  How to do it, as follows:

 

Updates to follow.
