Ubuntu 12.04 + Elasticsearch 2.3.3: Pseudo-Distributed Configuration and Adjusting Shards by Cluster Status

Reposted article, 2016-06-28 13:51. Tags: elasticsearch, ubuntu, cloud computing

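1. What is Elasticsearch?

1.1 Introduction to Elasticsearch

Elasticsearch is an open-source search engine built on Apache Lucene(TM). It can quickly search billions of documents and petabytes of data, structured or unstructured. With most databases, scaling horizontally means making major changes to your application so that it can use the newly added machines. Elasticsearch, by contrast, is distributed by design: it knows how to manage nodes to provide scalability and high availability, so your application does not have to care.

Elasticsearch is written in Java and uses Lucene at its core for all indexing and search functionality, but its goal is to hide Lucene's complexity behind a simple RESTful API and so make full-text search easy.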

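1.2 Basic Elasticsearch concepts

Near real time (NRT)
Elasticsearch is a near-real-time search platform: there is a slight delay (normally about one second) between indexing a document and that document becoming searchable.

Cluster
A cluster is one or more nodes that together hold all of your data and jointly provide indexing and search. A cluster is identified by a unique name, which defaults to "elasticsearch". The name matters because a node can only join a cluster by specifying that cluster's name. Setting the name explicitly is good practice in production; the default is fine for testing and development.

Node
A node is a single server in your cluster. It stores your data and takes part in the cluster's indexing and search. Like a cluster, a node is identified by a name, which by default is a random Marvel character name assigned at startup. The name is useful for administration, where you need to work out which servers on your network correspond to which nodes in the Elasticsearch cluster.
A node joins a specific cluster by being configured with that cluster's name. By default every node is set up to join a cluster named "elasticsearch", so if you start several nodes on your network and they can discover one another, they automatically form and join a single cluster named "elasticsearch".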
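Index
An index is a collection of documents with broadly similar characteristics. For example, you might have one index for customer data, another for a product catalog, and another for order data. An index is identified by a name (which must be all lowercase), and you use that name whenever you index, search, update, or delete documents in it.

Type
Within an index you can define one or more types. A type is a logical category or partition of the index whose semantics are entirely up to you. Typically a type is defined for documents that share a set of fields. Suppose you run a blogging platform and store all of its data in one index: you could define one type for user data, another for blog posts, and another for comments.

Document
A document is the basic unit of information that can be indexed. For example, you can have a document for a single customer, another for a single product, and another for a single order. Documents are expressed in JSON (JavaScript Object Notation), a ubiquitous data interchange format on the Internet.
You can store as many documents as you like in an index/type. Note that although a document physically lives in an index, it must be indexed into (assigned to) a type within that index.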
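Shards & replicas
An index can hold more data than the hardware of a single node can handle. For example, an index of one billion documents taking up 1 TB of disk may not fit on any single node, or a single node may be too slow to serve search requests on its own.
To solve this, Elasticsearch can split an index into multiple pieces called shards. When you create an index, you specify how many shards you want. Each shard is itself a fully functional, independent "index" that can be hosted on any node in the cluster.
By default, each index in Elasticsearch gets 5 primary shards and 1 replica. So if the cluster has at least two nodes, the index has 5 primary shards and 5 replica shards (one complete copy), for a total of 10 shards per index.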
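
The shard arithmetic above can be checked with a short Python sketch. The function name `total_shards` is made up for illustration and is not part of any Elasticsearch API.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Total shard copies for one index: every primary plus its replicas."""
    if primaries < 1 or replicas < 0:
        raise ValueError("need at least one primary and a non-negative replica count")
    return primaries * (1 + replicas)


# The 2.x defaults described above: 5 primaries, 1 replica of each.
print(total_shards(5, 1))  # 10
```

With 2 replicas instead of 1, the same index would need 15 shard copies, which is the situation section 4.5 creates.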
　　
With these concepts clear, we can get Elasticsearch up and running.

2. What environment does Elasticsearch need?

2.1 Installing the Java runtime

Use the root account for the installation:

root@ubuntu1:~# sudo apt-get install python-software-properties
root@ubuntu1:~# sudo apt-get install software-properties-common
root@ubuntu1:~# sudo add-apt-repository ppa:webupd8team/java
root@ubuntu1:~# sudo apt-get update && sudo apt-get install oracle-java8-installer

　　
Check the installed Java version:

root@ubuntu1:~# java -version
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)

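3. Installing Elasticsearch and its plugins

3.1 Test machines used in this article

There are three machines, and Elasticsearch will be deployed on all three later in the article. Their IP addresses are:
ubuntu1 192.168.0.25
ubuntu2 192.168.0.26
ubuntu3 192.168.0.27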
　　

3.2 Downloading Elasticsearch 2.3.3 on ubuntu1

Switch to a regular account at this point; the test account in this article is lion:

lion@ubuntu1:~$ mkdir tar
lion@ubuntu1:~$ cd tar
lion@ubuntu1:~/tar$ wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/zip/elasticsearch/2.3.3/elasticsearch-2.3.3.zip

3.3 Installing Elasticsearch 2.3.3 on ubuntu1

Elasticsearch needs no separate installation step; unzip it and it is ready to use:

lion@ubuntu1:~/tar$ unzip elasticsearch-2.3.3.zip

3.4 Installing the elasticsearch-head plugin on ubuntu1

elasticsearch-head is a cluster management tool for Elasticsearch. It is a standalone web application written entirely in HTML5.

lion@ubuntu1:~/tar/elasticsearch-2.3.3$ pwd
/home/lion/tar/elasticsearch-2.3.3
lion@ubuntu1:~/tar/elasticsearch-2.3.3$ bin/plugin install mobz/elasticsearch-head

3.5 Installing the elasticsearch-sql plugin on ubuntu1

elasticsearch-sql lets you query Elasticsearch with SQL statements.
The project home page is https://github.com/NLPchina/elasticsearch-sql/

　　
Online installation of elasticsearch-sql for Elasticsearch 2.3.3:

lion@ubuntu1:~/tar/elasticsearch-2.3.3$ ./bin/plugin install https://github.com/NLPchina/elasticsearch-sql/releases/download/2.3.3.0/elasticsearch-sql-2.3.3.0.zip

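Offline installation of elasticsearch-sql for Elasticsearch 2.3.3:

First download the package: https://github.com/NLPchina/elasticsearch-sql/releases/download/2.3.3.0/elasticsearch-sql-2.3.3.0.zip
Note: I needed a proxy to download it. Once downloaded, copy the file into the elasticsearch-2.3.3 root directory.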

lion@ubuntu1:~/tar/elasticsearch-2.3.3$ ll
total 3868
drwxr-xr-x 7 lion lion    4096 Jun 23 15:42 ./
drwxrwxr-x 3 lion lion    4096 Jun 23 15:00 ../
drwxr-xr-x 2 lion lion    4096 May 17 15:48 bin/
drwxr-xr-x 2 lion lion    4096 May 17 15:48 config/
-rw-r--r-- 1 lion lion 3901121 Jun 23 15:42 elasticsearch-sql-2.3.3.0.zip
drwxrwxr-x 2 lion lion    4096 Jun 23 15:00 lib/
-rw-rw-r-- 1 lion lion   11358 Jan 27 12:53 LICENSE.txt
drwxrwxr-x 5 lion lion    4096 May 17 15:48 modules/
-rw-rw-r-- 1 lion lion     150 May 12 13:24 NOTICE.txt
drwxrwxr-x 3 lion lion    4096 Jun 23 15:03 plugins/
-rw-rw-r-- 1 lion lion    8700 May 12 13:24 README.textile

　　
Run the following command to install offline:

lion@ubuntu1:~/tar/elasticsearch-2.3.3$ bin/plugin install file:/home/lion/tar/elasticsearch-2.3.3/elasticsearch-sql-2.3.3.0.zip

　　
On success, the following is printed:

-> Installing from file:/home/lion/tar/elasticsearch-2.3.3/elasticsearch-sql-2.3.3.0.zip...
Trying file:/home/lion/tar/elasticsearch-2.3.3/elasticsearch-sql-2.3.3.0.zip ...
Downloading .......................................DONE
Verifying file:/home/lion/tar/elasticsearch-2.3.3/elasticsearch-sql-2.3.3.0.zip checksums if available ...
NOTE: Unable to verify checksum for downloaded plugin (unable to find .sha1 or .md5 file to verify)
Installed sql into /home/lion/tar/elasticsearch-2.3.3/plugins/sql

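3.6 Editing the Elasticsearch configuration file on ubuntu1

Edit the Elasticsearch 2.3.3 configuration file config/elasticsearch.yml: find the network.host line and change the IP address to 192.168.0.25 so the node is reachable from the internal network. The file below also sets cluster.name and node.name. After saving, the modified file looks like this: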

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please see the documentation for further information on configuration options:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html>
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
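# Sets the cluster name (default: elasticsearch). A node only joins a cluster whose name matches its own, so give each cluster a distinct name when several clusters share a network.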
 cluster.name: idoall_org
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
# Node name.
 node.name: node-1
#
# Add custom attributes to the node:
#
# node.rack: r1
#
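# Whether this node is eligible to be elected master (default: true). By default the first node started in the cluster becomes master; if it goes down, a new master is elected.
# node.master: true
#
# Whether this node stores index data (default: true).
# node.data: true
#
# Combining node.master and node.data gives three special node roles:
#         1) master: false, data: true  - the node only holds data and carries the heavy load;
#         2) master: true,  data: false - the node acts only as a coordinator and holds no data;
#         3) master: false, data: false - the node acts as a load balancer.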
# ----------------------------------- Paths ------------------------------------
#
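# Path for index data (default: the data folder under the Elasticsearch root directory). Multiple paths can be given, separated by commas.
# path.data: /path/to/data
#
# Path for log files (default: the logs folder under the Elasticsearch root directory).
# path.logs: /path/to/logs
#
# Path for configuration files (default: the config folder under the Elasticsearch root directory).
# path.conf: /path/to/conf
#
# Path for plugins (default: the plugins folder under the Elasticsearch root directory).
# path.plugins: /path/to/plugins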
# ----------------------------------- Memory -----------------------------------
#
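# Set to true to lock the process memory. Elasticsearch slows down once the JVM starts swapping, so make sure it never swaps: set the ES_MIN_MEM and ES_MAX_MEM environment variables to the same value and make sure the machine has enough memory for Elasticsearch. The Elasticsearch process must also be allowed to lock memory; on Linux, use `ulimit -l unlimited`.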
# bootstrap.mlockall: true
#
# Make sure that the `ES_HEAP_SIZE` environment variable is set to about half the memory
# available on the system and that the owner of the process is allowed to use this limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
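# Address to bind to. 0.0.0.0 means all interfaces; for security, bind to an internal IP.
 network.host: 192.168.0.25
#
# Port for the HTTP service (default: 9200). For security, consider changing it from the default.
# http.port: 9200
#
# Port for node-to-node communication over TCP (default: 9300-9400).
# transport.tcp.port: 9300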
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html>
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# Initial list of master-eligible nodes in the cluster; a newly started node contacts these hosts to discover and join the cluster.
# discovery.zen.ping.unicast.hosts: ["host1", "host2"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
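# Minimum number of master-eligible nodes a node must see before a master can be elected. With the formula above, a three-node cluster like the one in this article would use 2.
# discovery.zen.minimum_master_nodes: 3
#
# Ping timeout when discovering other nodes (default: 3s). On a poor network, use a higher value to avoid discovery failures.
# discovery.zen.ping.timeout: 3s
#
# Whether to enable multicast discovery. From Elasticsearch 2.0 on, multicast is no longer part of the core (it moved to a separate plugin) and unicast is the default.
# discovery.zen.ping.multicast.enabled: false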
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html>
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
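# Number of nodes expected in the cluster (default: 0). As soon as this many nodes have started, recovery begins immediately.
# gateway.expected_nodes: 2
#
# How long to wait before starting recovery when fewer than the expected nodes have joined (default: 5 minutes).
# gateway.recover_after_time: 5m
#
# Start recovery only once this many nodes have started.
# gateway.recover_after_nodes: 3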
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway.html>
#
# ---------------------------------- Various -----------------------------------
#
# Disable starting multiple nodes on a single system:
#
# node.max_local_storage_nodes: 1
#
# Require explicit names when deleting indices:
#
# action.destructive_requires_name: true

3.7 Starting Elasticsearch 2.3.3 on ubuntu1

Starting Elasticsearch takes a single command:

lion@ubuntu1:~/tar/elasticsearch-2.3.3$ bin/elasticsearch

To run it as a system service, see the article "Managing system processes with Supervisor 3.2.1 on Mac 10.10.3".

Browse to the installed elasticsearch-head: http://192.168.0.25:9200/_plugin/head/

Browse to the installed elasticsearch-sql: http://192.168.0.25:9200/_plugin/sql/

4. Elasticsearch cluster configuration

4.1 Editing the hosts file on all three machines

On ubuntu1, ubuntu2, and ubuntu3, edit /etc/hosts as follows:

lion@ubuntu1:~$ cat /etc/hosts
127.0.0.1	localhost
# 127.0.1.1	ubuntu1
192.168.0.25	ubuntu1
192.168.0.26	ubuntu2
192.168.0.27	ubuntu3

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

4.2 Editing the Elasticsearch configuration on all three machines

Copy the Elasticsearch folder from ubuntu1 to ubuntu2 and ubuntu3.
Then edit config/elasticsearch.yml on each of the three machines.
On ubuntu1, config/elasticsearch.yml is the file shown in section 3.6 with one change: in the Discovery section, discovery.zen.ping.unicast.hosts is uncommented and lists the other two machines. The active settings are listed below; every other line of the file, including all commented-out settings, is unchanged.

 cluster.name: idoall_org
 node.name: node-1
 network.host: 192.168.0.25
 discovery.zen.ping.unicast.hosts: ["192.168.0.26", "192.168.0.27"]

　　
On ubuntu2, config/elasticsearch.yml is the file shown in section 3.6 with three changes: node.name, network.host, and the uncommented discovery.zen.ping.unicast.hosts. The active settings are listed below; every other line of the file, including all commented-out settings, is unchanged.

 cluster.name: idoall_org
 node.name: node-2
 network.host: 192.168.0.26
 discovery.zen.ping.unicast.hosts: ["192.168.0.25", "192.168.0.27"]

　　
On ubuntu3, config/elasticsearch.yml is the file shown in section 3.6 with three changes: node.name, network.host, and the uncommented discovery.zen.ping.unicast.hosts. The active settings are listed below; every other line of the file, including all commented-out settings, is unchanged.

 cluster.name: idoall_org
 node.name: node-3
 network.host: 192.168.0.27
 discovery.zen.ping.unicast.hosts: ["192.168.0.25", "192.168.0.26"]
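
The three files differ only in the node name, the bind address, and the list of the other two hosts. As a rough sketch (not something the article itself does), the per-node values can be derived from a single table of machines. The helper names `unicast_hosts`, `quorum`, and `node_settings` are hypothetical; `quorum` applies the "total number of nodes / 2 + 1" formula from the template comment, even though the article leaves discovery.zen.minimum_master_nodes commented out.

```python
import json

# Node names and addresses as used in this article.
NODES = {
    "node-1": "192.168.0.25",
    "node-2": "192.168.0.26",
    "node-3": "192.168.0.27",
}


def unicast_hosts(nodes, name):
    """Return the addresses of every node except `name`, in declaration order."""
    return [ip for other, ip in nodes.items() if other != name]


def quorum(master_eligible):
    """Majority of master-eligible nodes: total / 2 + 1, using integer division."""
    return master_eligible // 2 + 1


def node_settings(nodes, name, cluster="idoall_org"):
    """Collect the settings that differ from node to node (hypothetical helper)."""
    return {
        "cluster.name": cluster,
        "node.name": name,
        "network.host": nodes[name],
        "discovery.zen.ping.unicast.hosts": unicast_hosts(nodes, name),
    }


for node in NODES:
    for key, value in node_settings(NODES, node).items():
        # json.dumps renders the host list with double quotes, as in elasticsearch.yml
        rendered = json.dumps(value) if isinstance(value, list) else value
        print(f"{key}: {rendered}")
    print()

print("majority for", len(NODES), "master-eligible nodes:", quorum(len(NODES)))
```

For three master-eligible nodes the majority is 2, so a cluster like this one would set discovery.zen.minimum_master_nodes to 2 if it wanted split-brain protection.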

4.3 Starting the Elasticsearch cluster

Start Elasticsearch on each of the three machines with the following commands (ubuntu1 shown as an example):

lion@ubuntu1:~/tar/elasticsearch-2.3.3$ pwd
/home/lion/tar/elasticsearch-2.3.3
lion@ubuntu1:~/tar/elasticsearch-2.3.3$ bin/elasticsearch

　　
Once the nodes are up, elasticsearch-head on any of the machines shows the status of the cluster and its nodes.

4.4 Testing writes to the cluster

Write one document to ubuntu1:

lion@ubuntu1:~$ curl -XPUT 'http://ubuntu1:9200/dept/employee/32' -d '{ "empname": "emp32"}'

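Opening elasticsearch-head again shows how the index's shards were allocated: ubuntu1, ubuntu2, and ubuntu3 each hold different primary shards.

The document we just inserted can be read back from any machine. The following command queries ubuntu3 as an example: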

lion@ubuntu1:~$ curl -XGET 'http://ubuntu3:9200/dept/employee/32'
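
The two curl calls above can also be expressed with Python's standard library. This is a minimal sketch, assuming the host names resolve as configured in /etc/hosts; the function names `index_request`, `get_request`, and `send` are made up for illustration. Nothing is sent until `send` is called.

```python
import json
import urllib.request


def index_request(host, index, doc_type, doc_id, doc, port=9200):
    """Build (but do not send) the PUT that stores `doc` under index/type/id."""
    url = f"http://{host}:{port}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


def get_request(host, index, doc_type, doc_id, port=9200):
    """Build (but do not send) the GET that reads the document back by id."""
    url = f"http://{host}:{port}/{index}/{doc_type}/{doc_id}"
    return urllib.request.Request(url, method="GET")


def send(request, timeout=5.0):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same document and hosts as the curl commands in this section.
put = index_request("ubuntu1", "dept", "employee", 32, {"empname": "emp32"})
get = get_request("ubuntu3", "dept", "employee", 32)
print(put.get_method(), put.full_url)
print(get.get_method(), get.full_url)
```

Against a running cluster, `send(put)` followed by `send(get)` would write through ubuntu1 and read through ubuntu3, which is the point of the test: any node can serve any document.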

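4.5 Simulating a node failure and letting the cluster reassign primaries and replicas

The number of replica shards can be adjusted dynamically on a running cluster. First we query the cluster health: there are 10 shards in total, 5 of them primaries, across three nodes.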

lion@ubuntu1:~$ curl -XGET 'http://ubuntu2:9200/_cluster/health?pretty=true'
{
  "cluster_name" : "idoall_org",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
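
The numbers quoted above come straight from this response. As a small sketch, the replica count can be derived by subtracting the primaries from the active shards; the `summarize` helper is hypothetical and the JSON below is trimmed to the fields it uses.

```python
import json

# Fields copied from the health response shown above (other fields omitted).
HEALTH_JSON = """
{
  "cluster_name": "idoall_org",
  "status": "green",
  "number_of_nodes": 3,
  "number_of_data_nodes": 3,
  "active_primary_shards": 5,
  "active_shards": 10,
  "unassigned_shards": 0
}
"""


def summarize(health):
    """Reduce a _cluster/health response to the figures discussed in the text."""
    primaries = health["active_primary_shards"]
    return {
        "status": health["status"],
        "nodes": health["number_of_nodes"],
        "primaries": primaries,
        # active_shards counts primaries and replicas together
        "replicas": health["active_shards"] - primaries,
        "unassigned": health["unassigned_shards"],
    }


summary = summarize(json.loads(HEALTH_JSON))
print(summary)
```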

　　
Next we set the number of replicas for the dept index to 2:

lion@ubuntu1:~$ curl -XPUT 'http://ubuntu1:9200/dept/_settings' -d '{"number_of_replicas" : 2}'

The response is:

{"acknowledged":true}
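
For scripting the same change, here is a minimal Python sketch that builds the equivalent request with the standard library. The function name `replicas_request` is an assumption for illustration; the URL and body mirror the curl command above, and nothing is sent by this code.

```python
import json
import urllib.request


def replicas_request(host, index, replicas, port=9200):
    """Build (but do not send) the PUT that changes an index's replica count."""
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    url = f"http://{host}:{port}/{index}/_settings"
    body = json.dumps({"number_of_replicas": replicas}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


request = replicas_request("ubuntu1", "dept", 2)
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would apply the change on a live cluster.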

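Looking at the cluster again, the shards have been reallocated, and the primary shards now sit on ubuntu1 and ubuntu3.

We then shut down the Elasticsearch node on ubuntu3. The cluster is left with 5 replica shards unassigned.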
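
Why exactly 5? Elasticsearch does not place two copies of the same shard on one node, so with 2 replicas each shard wants 3 copies but only 2 nodes remain. A small sketch of that reasoning follows; the function name is made up, and it assumes every primary is still assigned after the node is lost.

```python
def unassigned_replicas(primaries, replicas, data_nodes):
    """Replica copies that cannot be placed when each node holds
    at most one copy of any given shard."""
    if data_nodes < 1:
        raise ValueError("need at least one data node")
    copies_wanted = 1 + replicas                    # the primary plus its replicas
    copies_placeable = min(copies_wanted, data_nodes)
    return primaries * (copies_wanted - copies_placeable)


# 5 primaries, 2 replicas each, ubuntu3 stopped so 2 data nodes remain.
print(unassigned_replicas(5, 2, 2))  # 5
```

With the replica count back at 1, the same two nodes can hold every copy, so nothing stays unassigned.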
　　
　　
The cluster status has turned yellow. Elasticsearch reports cluster status as one of three colors: green, yellow, or red.

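green: all primary shards and all replica shards are available
yellow: all primary shards are available, but not all replica shards
red: not all primary shards are available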
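
The three rules can be written as a tiny decision function. This is only an illustration of the definitions above, not how Elasticsearch computes the status internally; the function and parameter names are invented.

```python
def cluster_status(primaries_total, primaries_active, replicas_total, replicas_active):
    """Map shard availability to the green / yellow / red status colors."""
    if primaries_active < primaries_total:
        return "red"      # at least one primary is missing
    if replicas_active < replicas_total:
        return "yellow"   # all primaries are up, some replicas are not
    return "green"        # every primary and every replica is available


# After stopping ubuntu3: 5 primaries up, 10 replicas wanted, 5 assigned.
print(cluster_status(5, 5, 10, 5))  # yellow
```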

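In practice, we can also reduce the number of replica shards according to the cluster's status, so that the cluster stays highly available.
Run the command again to set the number of replicas back to 1: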

lion@ubuntu1:~$ curl -XPUT 'http://ubuntu1:9200/dept/_settings' -d '{"number_of_replicas" : 1}'

The response is:

{"acknowledged":true}

　　
Looking once more, the cluster status has recovered and the shards have been dynamically reallocated.
