Elasticsearch Cluster
This setup was built on the following environment:
CentOS Linux release 7.6.1810 (Core)
elasticsearch-7.6.2-x86_64.rpm
IP | Hostname |
---|---|
192.168.1.1 | es1 |
192.168.1.2 | es2 |
192.168.1.3 | es3 |
Installation
# Run on all 3 nodes
wget https://mirrors.huaweicloud.com/elasticsearch/7.6.2/elasticsearch-7.6.2-x86_64.rpm
rpm -ivh elasticsearch-7.6.2-x86_64.rpm
Configuration
All data and log directories live under /data/elasticsearch
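The RPM package does not create /data/elasticsearch, and the configuration further down points path.data, path.logs, the heap dump path and the GC log at it. The source does not show how the directory was prepared, so the following is only a minimal sketch of that presumably missing step. `prepare_es_dirs` is a hypothetical helper name; the default owner `elasticsearch:elasticsearch` is the user and group the RPM creates.

```shell
# Create the Elasticsearch base directory plus its log subdirectory and
# give both to the service account.
# $1: base directory; $2: owner (defaults to elasticsearch:elasticsearch).
prepare_es_dirs() {
    local base="$1"
    local owner="${2:-elasticsearch:elasticsearch}"
    mkdir -p "$base/log" || return 1
    chown -R "$owner" "$base"
}

# Usage on each of the 3 nodes (run as root):
# prepare_es_dirs /data/elasticsearch
```

Without write access for the elasticsearch user, the service would likely fail at startup when it tries to open its data or log path.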
/etc/elasticsearch/jvm.options
# Run on all 3 nodes
[root@es1 elasticsearch]# egrep -v "^#|^$" /etc/elasticsearch/jvm.options
-Xms16g
-Xmx16g
8-13:-XX:+UseConcMarkSweepGC
8-13:-XX:CMSInitiatingOccupancyFraction=75
8-13:-XX:+UseCMSInitiatingOccupancyOnly
14-:-XX:+UseG1GC
14-:-XX:G1ReservePercent=25
14-:-XX:InitiatingHeapOccupancyPercent=30
-Djava.io.tmpdir=${ES_TMPDIR}
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/data/elasticsearch
-XX:ErrorFile=/data/elasticsearch/log/hs_err_pid%p.log
8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:/data/elasticsearch/log/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m
9-:-Xlog:gc*,gc+age=trace,safepoint:file=/data/elasticsearch/log/gc.log:utctime,pid,tags:filecount=32,filesize=64m
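The source does not say how the 16g heap was chosen. A common rule of thumb from the Elasticsearch documentation is to give the heap no more than half of physical memory and to stay below roughly 32 GB so compressed object pointers remain enabled. The sketch below applies that rule; `suggest_heap_gb` is a hypothetical helper, and the 31 GB cap is a conservative assumption rather than a value from the source.

```shell
# Print a heap size in whole GB for a host with the given memory in kB:
# half of RAM (rounded down), capped at 31 GB, with a floor of 1 GB.
suggest_heap_gb() {
    local mem_kb="$1"
    local half_gb=$(( mem_kb / 1024 / 1024 / 2 ))
    if [ "$half_gb" -gt 31 ]; then half_gb=31; fi
    if [ "$half_gb" -lt 1 ]; then half_gb=1; fi
    echo "$half_gb"
}

# Usage: read MemTotal (in kB) from /proc/meminfo on the node.
# suggest_heap_gb "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

Whatever value is chosen, -Xms and -Xmx should stay equal, as they are in the file above.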
/etc/elasticsearch/elasticsearch.yml
[root@es1 elasticsearch]# egrep -v "^#|^$" /etc/elasticsearch/elasticsearch.yml
cluster.name: smy
# Change this per host: es2 on host es2, es3 on host es3
node.name: es1
path.data: /data/elasticsearch
path.logs: /data/elasticsearch/log
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["192.168.1.1", "192.168.1.2", "192.168.1.3"]
cluster.initial_master_nodes: ["es1", "es2", "es3"]
# Required by elasticsearch-head
http.cors.enabled: true
http.cors.allow-origin: "*"
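Since node.name is the only setting that differs between the three hosts, one way to avoid hand-editing is to copy the same file everywhere and rewrite that single line. This is a sketch, not the author's procedure; `set_node_name` is a hypothetical helper. It writes back through the existing file so the ownership and permissions set by the RPM are preserved.

```shell
# Rewrite the node.name line of an elasticsearch.yml file in place.
# $1: path to the config file; $2: the node name to set.
set_node_name() {
    local conf="$1" name="$2" tmp
    tmp=$(mktemp) || return 1
    # Edit into a temp file, then copy the content back into the original
    # file so its owner and mode stay untouched.
    sed "s/^node\.name:.*/node.name: ${name}/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    local rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Usage on host es2 (run as root):
# set_node_name /etc/elasticsearch/elasticsearch.yml es2
```

The names passed here must match the entries in cluster.initial_master_nodes exactly, otherwise the first cluster bootstrap may not complete.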
Startup
systemctl start elasticsearch.service
systemctl status elasticsearch.service
Verification
# The main thing to check is whether status is green
curl -X GET "127.0.0.1:9200/_cat/health?v"
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1589273020 08:43:40 smy green 3 3 769 769 0 0 0 0 - 100.0%
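For scripted checks, the status column can be pulled out of the same output instead of being read by eye. The sketch below locates the column by its header name, so it does not depend on the column order; `cat_health_status` is a hypothetical helper name.

```shell
# Read `_cat/health?v` output on stdin and print the value of the
# "status" column (green, yellow or red) from the first data row.
cat_health_status() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "status") col = i }
         NR == 2 && col { print $col }'
}

# Usage:
# status=$(curl -s "127.0.0.1:9200/_cat/health?v" | cat_health_status)
# [ "$status" = "green" ] || echo "cluster status: ${status:-unknown}" >&2
```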
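Installing elasticsearch-head
elasticsearch-head is a client tool for monitoring the state of Elasticsearch. It covers data visualization as well as create, read, update and delete operations.
Starting with ES 5.x it no longer runs as an ES plugin; it runs as a standalone server: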
- for Elasticsearch 5.x, 6.x, and 7.x: site plugins are not supported. Run as a standalone server
Install node and npm
# Install node
wget https://mirrors.huaweicloud.com/nodejs/latest-v10.x/node-v10.20.1-linux-x64.tar.gz
tar zxvf node-v10.20.1-linux-x64.tar.gz -C /usr/local/
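# Rename the extracted directory; it was unpacked into /usr/local, not the current directory
mv /usr/local/node-v10.20.1-linux-x64 /usr/local/node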
cat <<'EOF'> /etc/profile.d/node.sh
export NODE_HOME=/usr/local/node
export PATH=$NODE_HOME/bin:$PATH
EOF
source /etc/profile
# Upgrade npm (the node tarball already bundles a copy)
npm install npm -g
Running elasticsearch-head
# GitHub no longer serves the unauthenticated git:// protocol, so clone over https
git clone https://github.com/mobz/elasticsearch-head.git
cd elasticsearch-head
npm install
npm run start &
[root@wlj174 elasticsearch-head]# npm start &
[1] 17292
[root@wlj174 elasticsearch-head]#
> elasticsearch-head@0.0.0 start /data/elasticsearch/elasticsearch-head
> grunt server
[root@wlj174 elasticsearch-head]# (node:17302) ExperimentalWarning: The http2 module is an experimental API.
Running "connect:server" (connect) task
Waiting forever...
Started connect web server on http://localhost:19100
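Monitoring Elasticsearch with Prometheus
This uses https://github.com/vvanholl/elasticsearch-prometheus-exporter
Its official documentation is already quite detailed.
# Install the plugin on every node (with the RPM package, run this from /usr/share/elasticsearch)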
./bin/elasticsearch-plugin install -b https://github.com/vvanholl/elasticsearch-prometheus-exporter/releases/download/7.6.2.0/prometheus-exporter-7.6.2.0.zip
# Restart ES
systemctl restart elasticsearch.service
Visit http://192.168.1.1:9200/_prometheus/metrics to confirm the metrics are exposed
prometheus.yaml
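  # Replace node1..node3 with names or addresses Prometheus can resolve
  # for the three nodes, e.g. 192.168.1.1:9200
  - job_name: elasticsearch
    scrape_interval: 10s
    metrics_path: "/_prometheus/metrics"
    static_configs:
      - targets:
          - node1:9200
          - node2:9200
          - node3:9200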
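Before pointing Prometheus at the nodes, it can help to confirm that each one actually serves metrics, since the plugin has to be installed on every node being scraped. The sketch below counts the sample lines in the Prometheus text format; `count_samples` is a hypothetical helper, and the IP addresses are the ones from the host table at the top.

```shell
# Count sample lines (neither comments nor blank lines) in Prometheus
# text-format output read from stdin.
count_samples() {
    grep -cv -e '^#' -e '^$'
}

# Usage: curl -f prints nothing on an HTTP error, so a count of 0 means the
# endpoint is missing or the node is unreachable.
# for ip in 192.168.1.1 192.168.1.2 192.168.1.3; do
#     printf '%s: ' "$ip"
#     curl -sf "http://${ip}:9200/_prometheus/metrics" | count_samples
# done
```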
Grafana
https://grafana.com/grafana/dashboards/266
Pitfalls
The following error showed up frequently:
org.elasticsearch.cluster.block.ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
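Connecting elasticsearch-head to each of the 3 nodes in turn showed that only wlj174 had this problem. Its error log contained the address 172.18.0.1, while neither of the other two machines had an address in the 172.18 network. Changing network.host: 0.0.0.0 to the machine's own address and restarting ES resolved the problem. My guess is that 0.0.0.0 binds to every interface, so the node can end up advertising an address that the other machines cannot reach, and the cluster then fails to connect. Take care when setting 0.0.0.0 on servers with more than one network interface.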
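To find the address to bind to, one option is to list the node's IPv4 addresses and pick the one inside the cluster subnet. The sketch below does that from `ip -o -4 addr show` output; `pick_ipv4` is a hypothetical helper, and the prefix 192.168.1. is assumed from the host table at the top.

```shell
# Read `ip -o -4 addr show` output on stdin and print the first IPv4
# address that starts with the given prefix, without its /mask suffix.
pick_ipv4() {
    local prefix="$1"
    awk -v p="$prefix" '
        $3 == "inet" {
            split($4, a, "/")
            if (index(a[1], p) == 1) { print a[1]; exit }
        }'
}

# Usage on a node:
# ip -o -4 addr show | pick_ipv4 "192.168.1."
# Put the printed address in network.host in
# /etc/elasticsearch/elasticsearch.yml, then restart Elasticsearch.
```

Running `ip -o -4 addr show` on its own also makes any extra interface, such as the one holding 172.18.0.1 here, easy to spot.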