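Background:
One controller node and one compute1 node.
The hosts file on both machines already resolves both hostnames.
The two nodes had previously joined the same RabbitMQ cluster.
The controller node then had to be restored to a bare-metal state because of a service problem. After rabbitmq-server was reinstalled on it with yum, compute1 could no longer join the controller's RabbitMQ cluster.
The errors were as follows: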
[root@compute1 ~]# rabbitmqctl join_cluster rabbit@controller
Clustering node rabbit@compute1 with rabbit@controller ...
Error: {cannot_start_mnesia,
{{shutdown,{failed_to_start_child,mnesia_kernel_sup,killed}},
{mnesia_sup,start,[normal,[]]}}}
[root@compute1 ~]# rabbitmqctl start_app
Starting node rabbit@compute1 ...

BOOT FAILED
===========

Error description:
{error,{inconsistent_cluster,"Node rabbit@compute1 thinks it's clustered with node rabbit@controller, but rabbit@controller disagrees"}}

Log files (may contain more information):
/var/log/rabbitmq/rabbit@compute1.log
/var/log/rabbitmq/rabbit@compute1-sasl.log

Stack trace:
[{rabbit_mnesia,check_cluster_consistency,0,
[{file,"src/rabbit_mnesia.erl"},{line,598}]},
{rabbit,'-start/0-fun-0-',0,[{file,"src/rabbit.erl"},{line,260}]},
{rabbit,start_it,1,[{file,"src/rabbit.erl"},{line,296}]},
{rpc,'-handle_call_call/6-fun-0-',5,[{file,"rpc.erl"},{line,206}]}]

Error: {error,{inconsistent_cluster,"Node rabbit@compute1 thinks it's clustered with node rabbit@controller, but rabbit@controller disagrees"}}
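The inconsistent_cluster error says that compute1 believes it is clustered with the controller node, while the reinstalled controller has no record of that cluster.
The join_cluster attempt also fails with the following error: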
[root@compute1 ~]# rabbitmqctl join_cluster rabbit@controller
Clustering node rabbit@compute1 with rabbit@controller ...
Error: {cannot_start_mnesia,
{{shutdown,{failed_to_start_child,mnesia_kernel_sup,killed}},
{mnesia_sup,start,[normal,[]]}}}
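The controller node was freshly installed, its Erlang cookie had already been copied over, and compute1 had already run stop_app. The likely cause was therefore cluster information left over on compute1 from the old cluster, which made the join fail.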
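Before blaming leftover data, it can help to confirm that the cookie really is identical on both nodes. The following is only a minimal sketch: the function name cookies_match is made up for illustration, and the cookie path /var/lib/rabbitmq/.erlang.cookie is assumed to be the package default, since the original note does not give it.

```shell
#!/bin/sh
# cookies_match succeeds (exit status 0) when the two cookie files
# are byte-for-byte identical, and fails otherwise.
cookies_match() {
    cmp -s "$1" "$2"
}

# Hypothetical usage on compute1, after fetching the controller's cookie
# to a scratch path (the /tmp file name is only an example):
#   scp root@controller:/var/lib/rabbitmq/.erlang.cookie /tmp/controller.cookie
#   cookies_match /var/lib/rabbitmq/.erlang.cookie /tmp/controller.cookie \
#       && echo "cookies match" || echo "cookies differ"
```

If the cookies match and the join still fails, the cookie can be ruled out as the cause.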
A search online confirmed that leftover Mnesia data causes this failure.
That data lives in /var/lib/rabbitmq/mnesia, so move the directory out of the way:
mv /var/lib/rabbitmq/mnesia /tmp
Then scp the controller node's Erlang cookie to the compute1 node.
Run the join again: rabbitmqctl join_cluster rabbit@controller
compute1 now joins the cluster.
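The recovery steps above can be collected into one small script. This is a sketch under stated assumptions, not a tested procedure: the cookie path and root SSH access to the controller are assumed, and the final start_app and cluster_status calls are additions that the original note does not mention. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of the recovery steps, meant to be run as root on compute1.
# DRY_RUN defaults to 1, which only prints each command.
DRY_RUN="${DRY_RUN:-1}"

CLUSTER_NODE="rabbit@controller"
MNESIA_DIR="/var/lib/rabbitmq/mnesia"
# Assumption: package-default cookie path and root SSH access to controller.
COOKIE="/var/lib/rabbitmq/.erlang.cookie"
COOKIE_SOURCE="root@controller:$COOKIE"

# run prints the command in dry-run mode, otherwise executes it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rejoin_cluster() {
    # Set the stale cluster state aside, as in the note.
    run mv "$MNESIA_DIR" /tmp || return 1
    # Copy the controller's cookie to this node.
    run scp "$COOKIE_SOURCE" "$COOKIE" || return 1
    # Retry the join.
    run rabbitmqctl join_cluster "$CLUSTER_NODE" || return 1
    # Assumption: the app is still stopped from the earlier stop_app.
    run rabbitmqctl start_app || return 1
    # Assumption: an extra check that both nodes are now listed.
    run rabbitmqctl cluster_status
}

rejoin_cluster
```

Moving the directory rather than deleting it keeps the old state in /tmp in case it needs to be inspected later.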
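This rarely comes up in day-to-day operation but is easy to hit in a lab environment, so I am recording it here for future reference.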