Most people run into this error at the following line:
Client client = new TransportClient(settings).addTransportAddress(new InetSocketTransportAddress("172.16.2.13", 9300));
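The cause is that a custom cluster name was set when the settings were built earlier, for example: Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", "tonsonmiao").build();
Once cluster.name is set, the client checks every transport address you add against the cluster named tonsonmiao. If the node at that IP and port is not part of a cluster with that name, the client cannot find it, and this error is raised.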
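Before putting a cluster name into the client settings, it can help to confirm what name the node itself reports. The sketch below is only one way to do that, and it rests on assumptions not stated above: that the node's HTTP port is enabled on 9200, and that GET /_cluster/health returns a JSON body containing a cluster_name field. The class and method names are hypothetical, and the regular expression is a shortcut rather than a full JSON parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compares the cluster name the client expects
// with the one the node reports over HTTP.
class ClusterNameCheck {
    private static final Pattern CLUSTER_NAME =
            Pattern.compile("\"cluster_name\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the cluster_name value found in a JSON body, or null if there is none. */
    static String extractClusterName(String json) {
        Matcher m = CLUSTER_NAME.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    /** Fetches /_cluster/health from the node and returns the raw response body. */
    static String fetchHealth(String host, int httpPort) throws IOException {
        URL url = new URL("http://" + host + ":" + httpPort + "/_cluster/health");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line);
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: ClusterNameCheck <host> <expected-cluster-name>");
            return;
        }
        // 9200 is assumed here; adjust if http.port was changed on the node.
        String actual = extractClusterName(fetchHealth(args[0], 9200));
        System.out.println("client expects: " + args[1] + ", node reports: " + actual);
    }
}
```

Running it with 172.16.2.13 and tonsonmiao as arguments prints both names side by side. If they differ, the value passed to cluster.name in the client settings is the thing to fix.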
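Today, however, I hit the same error again even though everything had been working before, so a mistake in the code could not be the cause. The gateway had been changed a few days earlier, and the server itself was still reachable, so I suspected that the network change had upset es. I logged in to the server to check: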
[root@es elasticsearch-1.4.2]# ps -eaf | grep java
root 8782 8752 0 05:53 pts/1 00:00:00 grep java
root 27251 1 1 2015 ? 1-10:15:49 /usr/bin/java -Xms256m -Xmx1g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Delasticsearch -Des.path.home=/opt/elasticsearch-1.4.2 -cp :/opt/elasticsearch-1.4.2/lib/elasticsearch-1.4.2.jar:/opt/elasticsearch-1.4.2/lib/*:/opt/elasticsearch-1.4.2/lib/sigar/* org.elasticsearch.bootstrap.Elasticsearch
[root@es elasticsearch-1.4.2]# netstat -anp | grep 9300
tcp 0 0 10.18.7.97:9300 10.17.3.96:51633 SYN_RECV -
tcp 0 0 10.18.7.97:9300 10.17.3.96:51635 SYN_RECV -
tcp 0 0 :::9300 :::* LISTEN 27251/java
tcp 0 0 ::ffff:10.18.7.97:41035 ::ffff:10.18.7.97:9300 ESTABLISHED 27251/java
tcp 0 0 ::ffff:10.18.7.97:9300 ::ffff:10.18.7.97:41030 ESTABLISHED 27251/java
tcp 0 0 ::ffff:10.18.7.97:41033 ::ffff:10.18.7.97:9300 ESTABLISHED 27251/java
tcp 0 0 ::ffff:10.18.7.97:41037 ::ffff:10.18.7.97:9300 ESTABLISHED 27251/java
tcp 0 0 ::ffff:10.18.7.97:41036 ::ffff:10.18.7.97:9300 ESTABLISHED 27251/java
tcp 0 0 ::ffff:10.18.7.97:41026 ::ffff:10.18.7.97:9300 ESTABLISHED 27251/java
tcp 0 0 ::ffff:10.18.7.97:9300 ::ffff:10.18.7.97:41025 ESTABLISHED 27251/java
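As the output shows, the IP address es listens on has changed. On a host with several network cards, es by default binds to an arbitrary one of them. If es runs into a problem and recovers on its own, it may pick a different card, and the node is then kicked out of the cluster. I once saw exactly this in a bank's system, where a few nodes were dropped from the cluster at irregular intervals.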
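To see which addresses a multi-NIC host actually offers, the following sketch lists every non-loopback IPv4 address per interface using only the JDK. It is an illustration, not part of es, and the class and method names are hypothetical. Run it on the es server itself to see which candidate addresses exist.

```java
import java.io.UncheckedIOException;
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: lists the IPv4 addresses of all interfaces that are up.
class ListLocalAddresses {
    /** Returns "interface -> address" entries for every up, non-loopback interface. */
    static List<String> candidateAddresses() {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // no interfaces reported by the OS
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                if (!nic.isUp() || nic.isLoopback()) {
                    continue;
                }
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    if (addr instanceof Inet4Address) {
                        result.add(nic.getName() + " -> " + addr.getHostAddress());
                    }
                }
            }
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String entry : candidateAddresses()) {
            System.out.println(entry);
        }
    }
}
```

If more than one entry is printed, the host has several addresses es could end up using, which is the situation described above.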
To fix this, change the es configuration:
# Set the bind address specifically (IPv4 or IPv6):
#
network.bind_host: 10.18.7.97
# Set the address other nodes will use to communicate with this node. If not
# set, it is automatically derived. It must point to an actual IP address.
#
network.publish_host: 10.18.7.97
# Set both 'bind_host' and 'publish_host':
#
network.host: 10.18.7.97
Pinning the IP address explicitly is all that is needed.
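After the node has been restarted with the pinned address, a quick check from the client machine is to attempt a plain TCP connection to the transport port. The sketch below uses only the JDK. The class name, method name and the 3-second timeout are illustrative assumptions. A successful connection only shows that the port is reachable, not that the cluster name matches.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: tests whether a TCP handshake to host:port completes.
class TransportPortProbe {
    /** Returns true if a TCP connection to host:port is established within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unreachable hosts alike.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults mirror the address and port used in the configuration above.
        String host = args.length > 0 ? args[0] : "10.18.7.97";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9300;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

If this reports false from the client machine while the netstat output on the server still shows port 9300 in LISTEN state, the problem lies in the network path or the bound address rather than in the client code.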