cloudera-scm-agent dead but pid file exists


Problem 1:

Error description:

/opt/cm-5.7.0/etc/init.d/cloudera-scm-agent status

cloudera-scm-agent dead but pid file exists

Check the log at /opt/cm-5.7.0/log/cloudera-scm-agent/cloudera-scm-agent.log:

No socket could be created on ('testintf.novalocal', 9000) -- [Errno 99] Cannot assign requested address

 

This is mainly a network problem.

1. Run the following command to see which hostname and IP the host resolves for itself from /etc/hosts:

python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'

A well-formed hosts file looks like this:
127.0.0.1 localhost.xxxx localhost
111.222.333.444 aa.aa  aa
555.666.777.888 bb.bb   bb

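With this hosts file, the command prints the hostname followed by the IP, for example: aa.aa 111.222.333.444

The IP must be the same one that ifconfig reports (if the host has both a public and a private address, use the private one).
The hostname must be the same name that the hostname command returns.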


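2. Servers on the same private network can reach each other over their private IPs, but not necessarily over their public IPs. A CDH cluster needs many ports open between its hosts, so set server_host in config.ini to the private IP.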
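
The two checks above can be sketched in Python 3 (the one-liner in step 1 uses Python 2 print syntax). This is only a minimal illustration, not part of Cloudera Manager; resolve_identity and can_bind are hypothetical helper names. The sketch resolves the host's own name the same way the one-liner does, then tries to bind a TCP socket on the resolved address. If the name resolves to an address that no local interface carries, the bind is expected to fail with the same "[Errno 99] Cannot assign requested address" seen in the agent log.

```python
import socket


def resolve_identity():
    """Return (hostname, fqdn, ip) as this host resolves itself.

    ip is None when the FQDN cannot be resolved at all.
    """
    hostname = socket.gethostname()   # same value as the `hostname` command
    fqdn = socket.getfqdn()
    try:
        ip = socket.gethostbyname(fqdn)
    except OSError:                   # covers socket.gaierror / socket.herror
        ip = None
    return hostname, fqdn, ip


def can_bind(ip, port=0):
    """Try to bind a TCP socket on ip; port 0 lets the OS pick any free port.

    Returns (True, None) on success or (False, exception) on failure.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, port))
        return True, None
    except OSError as exc:
        return False, exc
    finally:
        sock.close()
```

On an affected host, calling resolve_identity() and passing the returned ip to can_bind() should show whether the resolved address is usable. If the hostname and fqdn disagree, or the bind fails, /etc/hosts is the first place to look.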




 

 

 

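Problem 2: When installing through the CM web UI, the agent service does not start and its host stays unmanaged, so the agent installation driven from the UI fails. The installation UI shows the following error hints:

 Solution:

1. The host names returned by

python -c 'import socket; print socket.getfqdn(), socket.gethostbyname(socket.getfqdn())'

  and

hostname

  are different, and that mismatch causes the failure.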

 

 

 

2.   Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).

   telnet  112.35.23.45 7182

   ps -ef |grep PID?  

 

 

3.  Ensure that ports 9000 and 9001 are free on the host being added.

  netstat -an |grep 9000

  netstat -an |grep 9001

 

4.  Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details)

This directory only exists once the agent service has started; if the agent failed to start, it will not be there.
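
Hints 2 and 3 above can also be checked without telnet or netstat. The following Python 3 sketch is an illustration only; port_reachable and port_free are hypothetical helper names, and the server address in the usage note is simply the one from the telnet example.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:                   # refused, unreachable, or timed out
        return False


def port_free(port, host=""):
    """Return True if a TCP socket can be bound on host:port.

    host="" means all local IPv4 interfaces. A False result usually
    means another process already holds the port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError:
        return False
    finally:
        sock.close()
```

Run on the host being added, port_reachable("112.35.23.45", 7182) corresponds to the telnet check, while port_free(9000) and port_free(9001) correspond to the two netstat checks.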

 

 

Problem 3:

[22/Oct/2018 18:49:13 +0000] 3131 MainThread agent ERROR Failed! trying again in 1 second(s)
Traceback (most recent call last):
  File "/opt/cm-5.7.1/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.7.1-py2.6.egg/cmf/agent.py", line 2161, in connect_to_new_supervisor
    self.get_supervisor_process_info()
  File "/opt/cm-5.7.1/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.7.1-py2.6.egg/cmf/agent.py", line 2183, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1199, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python2.6/xmlrpclib.py", line 1489, in __request
    verbose=self.__verbose
  File "/opt/cm-5.7.1/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/xmlrpc.py", line 470, in request
    '' )
ProtocolError: <ProtocolError for 127.0.0.1/RPC2: 401 Unauthorized>
[22/Oct/2018 18:49:13 +0000] 3131 MainThread agent ERROR Failed to connect to newly launched supervisor. Agent will exit

Solution:

Kill the supervisord process and restart the agent. It may take a few attempts before it works.
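
Locating the supervisord processes can be scripted. The sketch below is a hedged Python 3 illustration, not a Cloudera tool: find_pids is a hypothetical helper that parses the output of `ps -eo pid,args`, and terminate_supervisord (also hypothetical) sends SIGTERM to each match. It defaults to a dry run so nothing is killed by accident, and restarting the agent afterwards is left to its init script.

```python
import os
import signal
import subprocess


def find_pids(ps_output, needle="supervisord"):
    """Parse `ps -eo pid,args` output; return PIDs whose command line contains needle."""
    pids = []
    for line in ps_output.splitlines()[1:]:      # skip the header row
        parts = line.strip().split(None, 1)      # "PID", "full command line"
        if len(parts) == 2 and parts[0].isdigit() and needle in parts[1]:
            pids.append(int(parts[0]))
    return pids


def terminate_supervisord(dry_run=True):
    """Return the PIDs of supervisord processes; send SIGTERM unless dry_run."""
    out = subprocess.run(["ps", "-eo", "pid,args"], capture_output=True,
                         text=True, check=True).stdout
    pids = [p for p in find_pids(out) if p != os.getpid()]
    if not dry_run:
        for pid in pids:
            os.kill(pid, signal.SIGTERM)
    return pids
```

Calling terminate_supervisord() only lists the matching PIDs; terminate_supervisord(dry_run=False) actually signals them. Sending signals to processes owned by another user requires sufficient privileges.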

 

