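Fixing errors when creating the initial Ceph monitor(s)
The official documentation does not spell this out, and most Ceph setup articles online leave steps out. As a result, creating the initial monitor(s) tends to fail with one error after another. This article lists the fixes that worked, for reference.
Run ceph-deploy mon create-initial
Part of the error output: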
[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph2][WARNIN] monitor: mon.ceph2, might not be running yet
[ceph2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph2][WARNIN] monitor ceph2 does not exist in monmap
[ceph2][WARNIN] neither public_addr nor public_network keys are defined for monitors
[ceph2][WARNIN] monitors may not be able to form quorum
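Note the mention of public_network in the warnings: the key has not been set in ceph.conf.
Fix:
Edit ceph.conf and add public_network = 192.168.1.0/24 to the [global] section (adjust the subnet to your own network).
After this change, ceph-deploy mon create-initial still fails. Part of the error output: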
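The edit above can also be scripted. The following is a minimal sketch, not part of ceph-deploy: the function name ensure_public_network and its two arguments are hypothetical, and it assumes the file already contains a [global] section, as a ceph.conf generated by ceph-deploy does.

```shell
#!/usr/bin/env bash
# Sketch: add a public_network key under [global] unless one already exists.
# ensure_public_network is a hypothetical helper, not a Ceph command.
ensure_public_network() {
    local conf="$1" subnet="$2"
    # Leave the file alone if the key is already set
    # (Ceph accepts both "public_network" and "public network").
    if grep -Eq '^[[:space:]]*public[_ ]network[[:space:]]*=' "$conf"; then
        return 0
    fi
    # Print every line, and emit the new key right after the [global] header.
    awk -v line="public_network = ${subnet}" \
        '{ print } /^\[global\]/ { print line }' "$conf" > "${conf}.tmp" &&
        mv "${conf}.tmp" "$conf"
}
```

Run it in the deploy directory, for example ensure_public_network ceph.conf 192.168.1.0/24, replacing the subnet with your own.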
[ceph3][WARNIN] provided hostname must match remote hostname
[ceph3][WARNIN] provided hostname: ceph3
[ceph3][WARNIN] remote hostname: localhost
[ceph3][WARNIN] monitors may not reach quorum and create-keys will not complete
[ceph3][WARNIN] ********************************************************************************
[ceph3][DEBUG ] deploying mon to ceph3
[ceph3][DEBUG ] get remote short hostname
[ceph3][DEBUG ] remote hostname: localhost
[ceph3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
[ceph_deploy][ERROR ] GenericError: Failed to create 3 monitors
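The error says that /etc/ceph/ceph.conf on the node has different content from the local copy and suggests using --overwrite-conf to overwrite it.
The command is: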
ceph-deploy --overwrite-conf config push ceph1 ceph2 ceph3
After pushing the config, ceph-deploy mon create-initial still fails. Part of the error output:
[ceph3][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph3.asok mon_status
[ceph3][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph3 monitor is not yet in quorum, tries left: 1
[ceph_deploy.mon][WARNIN] waiting 20 seconds before retrying
[ceph_deploy.mon][ERROR ] Some monitors have still not reached quorum:
[ceph_deploy.mon][ERROR ] ceph1
[ceph_deploy.mon][ERROR ] ceph3
[ceph_deploy.mon][ERROR ] ceph2
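Investigation showed that the nodes' hostnames did not match their entries in /etc/hosts.
Fix: change each node's hostname so that it matches /etc/hosts.
On node 1: hostnamectl set-hostname ceph1
On node 2: hostnamectl set-hostname ceph2
On node 3: hostnamectl set-hostname ceph3
After this change, ceph-deploy mon create-initial fails yet again, this time with a different error. The middle part of the output: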
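To confirm the mismatch is gone, each node's hostname can be checked against a hosts file. This is a rough sketch under assumptions: hostname_in_hosts is a hypothetical helper, and it only checks that the name appears as a host field, not that the address is correct.

```shell
#!/usr/bin/env bash
# Sketch: succeed if the given name appears as a hostname in a hosts file.
# hostname_in_hosts is a hypothetical helper; the file defaults to /etc/hosts.
hostname_in_hosts() {
    local name="$1" hosts_file="${2:-/etc/hosts}"
    awk -v name="$name" '
        { sub(/#.*/, "") }                       # drop trailing comments
        { for (i = 2; i <= NF; i++)              # field 1 is the address
              if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$hosts_file"
}
```

On a node, something like hostname_in_hosts "$(hostname -s)" || echo "hostname not in /etc/hosts" would flag the problem before ceph-deploy does.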
[ceph2][ERROR ] no valid command found; 10 closest matches:
[ceph2][ERROR ] perf dump {<logger>} {<counter>}
[ceph2][ERROR ] log reopen
[ceph2][ERROR ] help
[ceph2][ERROR ] git_version
[ceph2][ERROR ] log flush
[ceph2][ERROR ] log dump
[ceph2][ERROR ] config unset <var>
[ceph2][ERROR ] config show
[ceph2][ERROR ] get_command_descriptions
[ceph2][ERROR ] dump_mempools
[ceph2][ERROR ] admin_socket: invalid command
[ceph_deploy.mon][WARNIN] mon.ceph2 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[ceph2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
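Fix: run sudo pkill ceph on every node, then run ceph-deploy mon create-initial again on the deploy node.
This time the ERROR lines are gone: the initial monitor(s) are created and all keys are gathered. The following keyrings appear in the current directory: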
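Running the same command on every node by hand gets tedious, so the step can be wrapped in a loop. The sketch below assumes the deploy node can ssh to each node with passwordless sudo (which ceph-deploy itself relies on). The function name stop_ceph_on_nodes and the RUNNER variable are hypothetical; RUNNER exists only so the loop can be dry-run with echo.

```shell
#!/usr/bin/env bash
# Sketch: stop leftover Ceph processes on each node given as an argument.
# stop_ceph_on_nodes and RUNNER are hypothetical names; RUNNER defaults to ssh.
stop_ceph_on_nodes() {
    local runner="${RUNNER:-ssh}" node
    for node in "$@"; do
        # pkill exits non-zero when no process matched; "|| true" keeps
        # that case from being reported as a failure.
        "$runner" "$node" "sudo pkill ceph || true"
    done
}
```

A dry run with RUNNER=echo stop_ceph_on_nodes ceph1 ceph2 ceph3 prints the commands instead of executing them; without RUNNER the same call runs them over ssh.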
ceph.bootstrap-mds.keyring
ceph.bootstrap-mgr.keyring
ceph.bootstrap-osd.keyring
ceph.bootstrap-rgw.keyring
ceph.client.admin.keyring
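A quick way to confirm that all five keyrings listed above were gathered is to check for each file. This is only a sketch; missing_keyrings is a hypothetical helper and the file list is taken from the output above.

```shell
#!/usr/bin/env bash
# Sketch: print the name of every expected keyring that is missing from a
# directory (default: current directory). Prints nothing when all are present.
missing_keyrings() {
    local dir="${1:-.}" name
    for name in bootstrap-mds bootstrap-mgr bootstrap-osd bootstrap-rgw client.admin; do
        [ -f "$dir/ceph.$name.keyring" ] || echo "ceph.$name.keyring"
    done
}
```

Calling missing_keyrings in the deploy directory should print nothing after a successful ceph-deploy mon create-initial.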