Common Cloudera Manager Problems


1. None of the Cloudera Management Service roles will start

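Observed symptoms:

(1) The cm service components fail to start; the start request times out and is terminated.

(2) Host information cannot be retrieved either, and the UI keeps reporting "unable to contact the server".

(3) The cm-server log reports "Authentication failure for user: '__cloudera_internal_user__mgmt-EVENTSERVER-95d257fb4b0322939118ac4012bb8d4e' from 10.21.48.82", meaning that authentication of the component failed.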

Suspected causes:

(1) The connection between scm-agent and scm-server is broken.

(2) The MySQL database connection is broken and user authentication fails.

cloudera-scm-server log output:

2019-01-29 08:44:10,188 INFO 780911426@scm-web-776:com.cloudera.server.web.cmf.AuthenticationFailureEventListener: Authentication failure for user: '__cloudera_internal_user__mgmt-EVENTSERVER-95d257fb4b0322939118ac4012bb8d4e' from 10.21.48.82
2019-01-29 08:44:10,194 INFO 416547936@scm-web-773:com.cloudera.server.web.cmf.AuthenticationFailureEventListener: Authentication failure for user: '__cloudera_internal_user__mgmt-HOSTMONITOR-95d257fb4b0322939118ac4012bb8d4e' from 10.21.48.82
2019-01-29 08:44:11,181 INFO 416547936@scm-web-773:com.cloudera.server.web.cmf.AuthenticationFailureEventListener: Authentication failure for user: '__cloudera_internal_user__mgmt-SERVICEMONITOR-95d257fb4b0322939118ac4012bb8d4e' from 10.21.48.82
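
When the server log contains many of these lines, it can help to summarize which management roles are failing and from which address. The following is a minimal sketch, not a Cloudera tool: the function name failed_roles is made up for illustration, and the regular expression is inferred only from the three sample lines above, so other role names or user-name formats may need a looser pattern.

```python
import re

# Pattern inferred from the sample log lines above (an assumption, not a
# documented format): __cloudera_internal_user__mgmt-<ROLE>-<hex id>
AUTH_FAILURE = re.compile(
    r"Authentication failure for user: "
    r"'__cloudera_internal_user__mgmt-(?P<role>[A-Z]+)-(?P<id>[0-9a-f]+)' "
    r"from (?P<source>\S+)"
)


def failed_roles(log_lines):
    """Return a sorted list of unique (role, source address) pairs."""
    seen = set()
    for line in log_lines:
        match = AUTH_FAILURE.search(line)
        if match:
            seen.add((match.group("role"), match.group("source")))
    return sorted(seen)
```

Passing the lines of the cloudera-scm-server log to failed_roles gives one entry per failing role, which makes it easy to see whether every management role is affected or only some of them.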

cloudera-scm-agent log output:

[02/Jan/2019 16:20:21 +0000] 28617 MainThread agent        ERROR    Heartbeating to 10.21.48.82:7182 failed.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.14.0-py2.6.egg/cmf/agent.py", line 1419, in _send_heartbeat
    self.master_port)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.6/httplib.py", line 742, in connect
    self.timeout)
  File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
    raise error, msg
error: [Errno 111] Connection refused
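
"[Errno 111] Connection refused" means the TCP connection itself was rejected, typically because nothing is listening on that host and port. A quick way to confirm this from the agent host is to attempt the same connection by hand. The sketch below uses only the Python standard library; the helper name can_connect and the 3-second timeout are illustrative choices, not part of Cloudera Manager.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connection; return (True, None) or (False, error)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, timeouts and unreachable hosts.
        return False, exc
```

For the setup in this post, calling can_connect("10.21.48.82", 7182) on the agent host shows whether the heartbeat port is reachable at all, before looking at authentication or the database.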

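The root cause turned out to be that the scm-agent settings for connecting to scm-server had been changed earlier, so scm-agent could never reach scm-server. The fix is to correct the agent's connection settings. Check both server_host and server_port: in this case server_host had already been corrected, yet the agent still could not connect.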

Edit the configuration file on the scm-agent host, /etc/cloudera-scm-agent/config.ini :

[General]
# Hostname of the CM server.
server_host=10.21.48.82

# Port that the CM server is listening on.
server_port=7182
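
To double-check what the agent will actually use, the two values can be read back from the file programmatically. This is a small sketch using Python's standard configparser module; the function name read_server_endpoint is hypothetical, and it assumes the file has a [General] section with server_host and server_port, as in the fragment above.

```python
import configparser


def read_server_endpoint(config_text):
    """Return (server_host, server_port) from the [General] section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    host = parser.get("General", "server_host")
    port = parser.getint("General", "server_port")  # fails if not an integer
    return host, port
```

On an agent host, the text would come from open("/etc/cloudera-scm-agent/config.ini").read(); the returned pair should match the address and port that scm-server is listening on.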

After this change the problem was resolved and the cm service started normally.

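Tips: when tracking down a problem, think at the level of the whole system architecture. Know how the architecture works end to end, work out at which stage the problem could arise, and do not get stuck in a narrow, local view too early. Above all, learn to read the logs.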

 

 

 

 

