1. Ceph Service Management
1.1 Starting and stopping daemons
# Start all Ceph services on the current node
[root@ceph01 ~]# systemctl start ceph.target
# Stop all Ceph services on the current node
[root@ceph01 ~]# systemctl stop ceph\*.service ceph\*.target
# Operate on a remote node with -H (hostname or IP)
[root@ceph01 ~]# systemctl -H ceph02 start ceph.target
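If only a single daemon needs to be bounced rather than every service on the node, you can target its systemd instance directly. A minimal sketch (the @-instance names follow the usual ceph-osd@<id> / ceph-mon@<hostname> convention used by these packages):

# Restart a single OSD daemon instead of all Ceph services on the node
[root@ceph01 ~]# systemctl restart ceph-osd@0.service
# Restart the monitor daemon on this node
[root@ceph01 ~]# systemctl restart ceph-mon@ceph01.service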
1.2 Checking the related services
systemctl status ceph-osd.target
systemctl status ceph-osd@1.service
systemctl status ceph-mds.target
systemctl status ceph-mon.target
systemctl status ceph-radosgw.target
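To get an overview of every Ceph unit on a node instead of querying them one by one, something like the following should work:

# List all Ceph-related services and targets with their current state
systemctl list-units 'ceph*' --type=service
systemctl list-units 'ceph*' --type=target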
2. Cluster Expansion
Fundamentally, Ceph is designed to grow from a handful of nodes to hundreds, and it should scale on the fly without any downtime.
2.1 Node information and system initialization (perform the initial configuration on the new node as described in Section 1)
# Set up passwordless SSH from the ceph-deploy node
[cephadmin@ceph01 ~]$ ssh-copy-id cephadmin@ceph04
# The following configuration should already be in place on the new node
[root@ceph04 ~]# cat /etc/yum.repos.d/ceph.repo
[ceph]
name=ceph
baseurl=http://mirrors.aliyun.com/ceph/rpm-mimic/el7/x86_64/
gpgcheck=0
[ceph-noarch]
name=cephnoarch
baseurl=http://mirrors.aliyun.com/ceph/rpm-mimic/el7/noarch/
gpgcheck=0
[root@ceph04 ~]# id cephadmin
uid=1001(cephadmin) gid=1001(cephadmin) groups=1001(cephadmin)
[root@ceph04 ~]# cat /etc/sudoers.d/cephadmin
cephadmin ALL = (root) NOPASSWD:ALL
[root@ceph04 ~]# cat /etc/hosts
192.168.5.91 ceph01
192.168.5.92 ceph02
192.168.5.93 ceph03
192.168.5.94 ceph04
[root@ceph04 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda    253:0    0  50G  0 disk
└─vda1 253:1    0  50G  0 part /
vdb    253:16   0  20G  0 disk
vdc    253:32   0  20G  0 disk
vdd    253:48   0  20G  0 disk
2.2 Add the node and OSDs
We currently have three nodes and nine OSDs; we will now add one node with three OSDs.
[root@ceph04 ~]# yum install ceph ceph-radosgw -y
2.3 Add the new OSDs with ceph-deploy. Once the new OSDs join the cluster, Ceph starts rebalancing data onto them, and after a while the cluster becomes stable again. In production you must not add OSDs this way, or performance will suffer.
[cephadmin@ceph01 my-cluster]$ for dev in /dev/vdb /dev/vdc /dev/vdd; do ceph-deploy disk zap ceph04 $dev; ceph-deploy osd create ceph04 --data $dev; done
[cephadmin@ceph01 my-cluster]$ watch ceph -s
[cephadmin@ceph01 my-cluster]$ rados df
[cephadmin@ceph01 my-cluster]$ ceph df
[cephadmin@ceph01 my-cluster]$ ceph -s
  cluster:
    id:     4d02981a-cd20-4cc9-8390-7013da54b161
    health: HEALTH_WARN
            162/1818 objects misplaced (8.911%)
            Degraded data redundancy: 181/1818 objects degraded (9.956%), 47 pgs degraded, 8 pgs undersized
            application not enabled on 1 pool(s)
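For a production cluster, a safer pattern (anticipating the flags covered in section 5.1) is to suppress data movement while the OSDs are being created and release it during off-peak hours. A sketch, assuming the same /dev/vdb-/dev/vdd layout on ceph04:

# Suppress backfill/rebalance/recovery before creating the new OSDs
[cephadmin@ceph01 my-cluster]$ ceph osd set norebalance
[cephadmin@ceph01 my-cluster]$ ceph osd set nobackfill
[cephadmin@ceph01 my-cluster]$ ceph osd set norecover
[cephadmin@ceph01 my-cluster]$ for dev in /dev/vdb /dev/vdc /dev/vdd; do ceph-deploy disk zap ceph04 $dev; ceph-deploy osd create ceph04 --data $dev; done
# During off-peak hours, unset the flags and let the cluster rebalance at its own pace
[cephadmin@ceph01 my-cluster]$ ceph osd unset norebalance
[cephadmin@ceph01 my-cluster]$ ceph osd unset nobackfill
[cephadmin@ceph01 my-cluster]$ ceph osd unset norecover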
2.4 Modify the ceph.conf configuration file
# Update ceph.conf with ceph04's information and the public network setting
[cephadmin@ceph01 my-cluster]$ cat ceph.conf
[global]
fsid = 4d02981a-cd20-4cc9-8390-7013da54b161
mon_initial_members = ceph01, ceph02, ceph03, ceph04
mon_host = 192.168.5.91,192.168.5.92,192.168.5.93,192.168.5.94
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 192.168.5.0/24        # This line must be added, otherwise adding the MON will fail
[client.rgw.ceph01]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph02]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph03]
rgw_frontends = "civetweb port=80"

# Push the configuration file to all nodes
[cephadmin@ceph01 my-cluster]$ ceph-deploy --overwrite-conf config push ceph01 ceph02 ceph03 ceph04
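Note that pushing the file only updates /etc/ceph/ceph.conf on each node; daemons that are already running keep their old settings until they are restarted (or the option is injected at runtime). A minimal sketch, reusing the remote systemctl form shown in section 1.1:

# Restart the monitors one node at a time so quorum is preserved
[cephadmin@ceph01 my-cluster]$ sudo systemctl -H ceph01 restart ceph-mon.target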
2.5 Add a Ceph MON
In a production setup, you should always have an odd number of monitor nodes in the Ceph cluster to form a quorum:
[cephadmin@ceph01 my-cluster]$ ceph-deploy mon add ceph04
If the public network = 192.168.5.0/24 setting above is not added, the command produces errors like the following:
[cephadmin@ceph01 my-cluster]$ ceph-deploy mon add ceph04
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /bin/ceph-deploy mon add ceph04
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ]  username : None
[ceph_deploy.cli][INFO ]  verbose : False
[ceph_deploy.cli][INFO ]  overwrite_conf : False
[ceph_deploy.cli][INFO ]  subcommand : add
[ceph_deploy.cli][INFO ]  quiet : False
[ceph_deploy.cli][INFO ]  cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fe956d3efc8>
[ceph_deploy.cli][INFO ]  cluster : ceph
[ceph_deploy.cli][INFO ]  mon : ['ceph04']
[ceph_deploy.cli][INFO ]  func : <function mon at 0x7fe956fad398>
[ceph_deploy.cli][INFO ]  address : None
[ceph_deploy.cli][INFO ]  ceph_conf : None
[ceph_deploy.cli][INFO ]  default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph04
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph04
[ceph04][DEBUG ] connection detected need for sudo
[ceph04][DEBUG ] connected to host: ceph04
[ceph04][DEBUG ] detect platform information from remote host
[ceph04][DEBUG ] detect machine type
[ceph04][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph04
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 192.168.5.94
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph04 ...
[ceph04][DEBUG ] connection detected need for sudo
[ceph04][DEBUG ] connected to host: ceph04
[ceph04][DEBUG ] detect platform information from remote host
[ceph04][DEBUG ] detect machine type
[ceph04][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.6.1810 Core
[ceph04][DEBUG ] determining if provided host has same hostname in remote
[ceph04][DEBUG ] get remote short hostname
[ceph04][DEBUG ] adding mon to ceph04
[ceph04][DEBUG ] get remote short hostname
[ceph04][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph04][DEBUG ] create the mon path if it does not exist
[ceph04][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph04/done
[ceph04][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph04][DEBUG ] create the init path if it does not exist
[ceph04][INFO ] Running command: sudo systemctl enable ceph.target
[ceph04][INFO ] Running command: sudo systemctl enable ceph-mon@ceph04
[ceph04][INFO ] Running command: sudo systemctl start ceph-mon@ceph04
[ceph04][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph04.asok mon_status
[ceph04][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph04][WARNIN] monitor ceph04 does not exist in monmap
[ceph04][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[ceph04][WARNIN] monitors may not be able to form quorum
[ceph04][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph04.asok mon_status
[ceph04][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph04][WARNIN] monitor: mon.ceph04, might not be running yet
Check the cluster status:
[cephadmin@ceph01 my-cluster]$ ceph -s
  cluster:
    id:     4d02981a-cd20-4cc9-8390-7013da54b161
    health: HEALTH_WARN
            application not enabled on 1 pool(s)

  services:
    mon: 4 daemons, quorum ceph01,ceph02,ceph03,ceph04
    mgr: ceph01(active), standbys: ceph02, ceph03
    mds: cephfs-1/1/1 up {0=ceph03=up:active}, 2 up:standby
    osd: 12 osds: 10 up, 9 in
    rgw: 3 daemons active

  data:
    pools:   9 pools, 368 pgs
    objects: 606 objects, 1.3 GiB
    usage:   20 GiB used, 170 GiB / 190 GiB avail
    pgs:     368 active+clean

  io:
    client: 2.3 KiB/s rd, 0 B/s wr, 2 op/s rd, 1 op/s wr
3. Shrinking the Cluster
One of the most important properties of a storage system is its flexibility. A good storage solution should be flexible enough to expand and shrink without causing service downtime. Traditional storage systems offer very limited flexibility; expanding or shrinking them is a difficult task.
Ceph is a truly flexible storage system that supports changing capacity on the fly, whether growing or shrinking.
3.1 Removing Ceph OSDs
Before shrinking the cluster or removing an OSD node, make sure the cluster has enough free space to absorb all the data on the node you plan to remove. The cluster should not be close to its full ratio, i.e. the percentage of used disk space on the OSDs. As a best practice, never remove an OSD or OSD node without first considering the impact on the full ratio. ceph-ansible does not support shrinking the Ceph OSD nodes of a cluster; this has to be done manually.
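A quick way to check whether the cluster can absorb the removal is to look at overall utilisation and the configured ratios before marking anything out; a sketch using standard status commands:

# Overall and per-OSD utilisation
[cephadmin@ceph01 my-cluster]$ ceph df
[cephadmin@ceph01 my-cluster]$ ceph osd df tree
# The full / backfillfull / nearfull thresholds the cluster enforces
[cephadmin@ceph01 my-cluster]$ ceph osd dump | grep ratio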
[cephadmin@ceph01 my-cluster]$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.22302 root default
-3 0.05576 host ceph01
0 hdd 0.01859 osd.0 up 1.00000 1.00000
3 hdd 0.01859 osd.3 up 0 1.00000
6 hdd 0.01859 osd.6 up 1.00000 1.00000
-5 0.05576 host ceph02
1 hdd 0.01859 osd.1 up 0 1.00000
4 hdd 0.01859 osd.4 up 1.00000 1.00000
7 hdd 0.01859 osd.7 up 1.00000 1.00000
-7 0.05576 host ceph03
2 hdd 0.01859 osd.2 up 1.00000 1.00000
5 hdd 0.01859 osd.5 up 1.00000 1.00000
8 hdd 0.01859 osd.8 up 0 1.00000
-9 0.05576 host ceph04
9 hdd 0.01859 osd.9 up 1.00000 1.00000
10 hdd 0.01859 osd.10 up 1.00000 1.00000
11 hdd 0.01859 osd.11 up 1.00000 1.00000
[cephadmin@ceph01 my-cluster]$ ceph osd out osd.9
marked out osd.9.
[cephadmin@ceph01 my-cluster]$ ceph osd out osd.10
marked out osd.10.
[cephadmin@ceph01 my-cluster]$ ceph osd out osd.11
marked out osd.11.
At this point, Ceph starts rebalancing the cluster by moving PGs off the out-marked OSDs onto other OSDs. The cluster state will be unhealthy for a while. Depending on how many OSDs were removed, cluster performance may drop somewhat until recovery completes.
[cephadmin@ceph01 my-cluster]$ ceph -s
  cluster:
    id:     4d02981a-cd20-4cc9-8390-7013da54b161
    health: HEALTH_WARN
            159/1818 objects misplaced (8.746%)
            Degraded data redundancy: 302/1818 objects degraded (16.612%), 114 pgs degraded, 26 pgs undersized
            application not enabled on 1 pool(s)

  services:
    mon: 4 daemons, quorum ceph01,ceph02,ceph03,ceph04
    mgr: ceph01(active), standbys: ceph02, ceph03
    mds: cephfs-1/1/1 up {0=ceph03=up:active}, 2 up:standby
    osd: 12 osds: 10 up, 6 in; 32 remapped pgs
    rgw: 3 daemons active

  data:
    pools:   9 pools, 368 pgs
    objects: 606 objects, 1.3 GiB
    usage:   18 GiB used, 153 GiB / 171 GiB avail
    pgs:     0.272% pgs not active
             302/1818 objects degraded (16.612%)
             159/1818 objects misplaced (8.746%)
             246 active+clean
             88 active+recovery_wait+degraded
             26 active+recovery_wait+undersized+degraded+remapped
             6 active+remapped+backfill_wait
             1 active+recovering
             1 activating

  io:
    recovery: 2.6 MiB/s, 1 objects/s
Although osd.9, osd.10 and osd.11 have been marked out of the cluster and no longer take part in storing data, their daemons are still running.
3.2 Stop all OSD daemons on ceph04
[root@ceph04 ~]# systemctl stop ceph-osd.target
[cephadmin@ceph01 my-cluster]$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.22302 root default
-3 0.05576 host ceph01
0 hdd 0.01859 osd.0 up 1.00000 1.00000
3 hdd 0.01859 osd.3 up 0 1.00000
6 hdd 0.01859 osd.6 up 1.00000 1.00000
-5 0.05576 host ceph02
1 hdd 0.01859 osd.1 up 0 1.00000
4 hdd 0.01859 osd.4 up 1.00000 1.00000
7 hdd 0.01859 osd.7 up 1.00000 1.00000
-7 0.05576 host ceph03
2 hdd 0.01859 osd.2 up 1.00000 1.00000
5 hdd 0.01859 osd.5 up 1.00000 1.00000
8 hdd 0.01859 osd.8 up 0 1.00000
-9 0.05576 host ceph04
9 hdd 0.01859 osd.9 down 0 1.00000
10 hdd 0.01859 osd.10 down 0 1.00000
11 hdd 0.01859 osd.11 down 0 1.00000
3.3 Remove the OSDs from the CRUSH map
[cephadmin@ceph01 my-cluster]$ ceph osd crush remove osd.9
removed item id 9 name 'osd.9' from crush map
[cephadmin@ceph01 my-cluster]$ ceph osd crush remove osd.10
removed item id 10 name 'osd.10' from crush map
[cephadmin@ceph01 my-cluster]$ ceph osd crush remove osd.11
removed item id 11 name 'osd.11' from crush map
3.4 Delete the OSD authentication keys
[cephadmin@ceph01 my-cluster]$ ceph auth del osd.9
updated
[cephadmin@ceph01 my-cluster]$ ceph auth del osd.10
updated
[cephadmin@ceph01 my-cluster]$ ceph auth del osd.11
updated
3.5 Remove the OSDs
[cephadmin@ceph01 my-cluster]$ ceph osd rm osd.9
removed osd.9
[cephadmin@ceph01 my-cluster]$ ceph osd rm osd.10
removed osd.10
[cephadmin@ceph01 my-cluster]$ ceph osd rm osd.11
removed osd.11
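As a side note, on Luminous/Mimic and later the three steps in 3.3 - 3.5 can be collapsed into a single command per OSD, which removes the CRUSH entry, the auth key and the OSD itself in one go; a sketch for the same OSDs:

[cephadmin@ceph01 my-cluster]$ ceph osd purge 9 --yes-i-really-mean-it
[cephadmin@ceph01 my-cluster]$ ceph osd purge 10 --yes-i-really-mean-it
[cephadmin@ceph01 my-cluster]$ ceph osd purge 11 --yes-i-really-mean-it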
3.6 Remove all traces of the node from the CRUSH map
[cephadmin@ceph01 my-cluster]$ ceph osd crush remove ceph04
removed item id -9 name 'ceph04' from crush map
3.7 Remove the Ceph MON
[cephadmin@ceph01 my-cluster]$ ceph mon stat
e2: 4 mons at {ceph01=192.168.5.91:6789/0,ceph02=192.168.5.92:6789/0,ceph03=192.168.5.93:6789/0,ceph04=192.168.5.94:6789/0}, election epoch 32, leader 0 ceph01, quorum 0,1,2,3 ceph01,ceph02,ceph03,ceph04

# Stop the mon service on ceph04
[cephadmin@ceph01 my-cluster]$ sudo systemctl -H ceph04 stop ceph-mon.target

# Remove the mon node
[cephadmin@ceph01 my-cluster]$ ceph mon remove ceph04
removing mon.ceph04 at 192.168.5.94:6789/0, there will be 3 monitors

# Check whether the mon has been removed from the quorum
[cephadmin@ceph01 my-cluster]$ ceph quorum_status --format json-pretty
{
    "election_epoch": 42,
    "quorum": [
        0,
        1,
        2
    ],
    "quorum_names": [
        "ceph01",
        "ceph02",
        "ceph03"
    ],
    "quorum_leader_name": "ceph01",
    "monmap": {
        "epoch": 3,
        "fsid": "4d02981a-cd20-4cc9-8390-7013da54b161",
        "modified": "2020-02-17 14:20:57.664427",
        "created": "2020-02-02 21:00:45.936041",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "ceph01",
                "addr": "192.168.5.91:6789/0",
                "public_addr": "192.168.5.91:6789/0"
            },
            {
                "rank": 1,
                "name": "ceph02",
                "addr": "192.168.5.92:6789/0",
                "public_addr": "192.168.5.92:6789/0"
            },
            {
                "rank": 2,
                "name": "ceph03",
                "addr": "192.168.5.93:6789/0",
                "public_addr": "192.168.5.93:6789/0"
            }
        ]
    }
}

# Delete the mon data on ceph04 (back it up first if it is important)
[root@ceph04 ~]# rm -rf /var/lib/ceph/mon/ceph-ceph04

# Update and push the configuration file
[cephadmin@ceph01 my-cluster]$ cat ceph.conf
[global]
fsid = 4d02981a-cd20-4cc9-8390-7013da54b161
mon_initial_members = ceph01, ceph02, ceph03
mon_host = 192.168.5.91,192.168.5.92,192.168.5.93
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 192.168.5.0/24
[client.rgw.ceph01]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph02]
rgw_frontends = "civetweb port=80"
[client.rgw.ceph03]
rgw_frontends = "civetweb port=80"

[cephadmin@ceph01 my-cluster]$ ceph-deploy --overwrite-conf config push ceph01 ceph02 ceph03 ceph04
4. Repairing a Failed Disk
Failed disk information:
[cephadmin@ceph01 ~]$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.16727 root default
-3 0.05576 host ceph01
0 hdd 0.01859 osd.0 up 1.00000 1.00000
3 hdd 0.01859 osd.3 up 0 1.00000
6 hdd 0.01859 osd.6 up 1.00000 1.00000
-5 0.05576 host ceph02
1 hdd 0.01859 osd.1 down 0 1.00000
4 hdd 0.01859 osd.4 up 1.00000 1.00000
7 hdd 0.01859 osd.7 up 1.00000 1.00000
-7 0.05576 host ceph03
2 hdd 0.01859 osd.2 up 1.00000 1.00000
5 hdd 0.01859 osd.5 up 1.00000 1.00000
8 hdd 0.01859 osd.8 down 0 1.00000
4.1 Mark the failed OSDs out
[cephadmin@ceph01 ~]$ ceph osd out osd.1
[cephadmin@ceph01 ~]$ ceph osd out osd.8
4.2 Remove the failed OSDs from the Ceph CRUSH map
[cephadmin@ceph01 ~]$ ceph osd crush rm osd.1
[cephadmin@ceph01 ~]$ ceph osd crush rm osd.8
4.3 Delete the OSDs' Ceph authentication keys
[cephadmin@ceph01 ~]$ ceph auth del osd.1
[cephadmin@ceph01 ~]$ ceph auth del osd.8
4.4 Remove the OSDs from the cluster
[cephadmin@ceph01 ~]$ ceph osd rm osd.1
[cephadmin@ceph01 ~]$ ceph osd rm osd.8
4.5 Unmount the failed OSD's mount point
[root@ceph02 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        50G  2.6G   48G   6% /
devtmpfs        2.0G     0  2.0G   0% /dev
tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs           2.0G  190M  1.8G  10% /run
tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
tmpfs           2.0G   52K  2.0G   1% /var/lib/ceph/osd/ceph-1
tmpfs           2.0G   52K  2.0G   1% /var/lib/ceph/osd/ceph-4
tmpfs           2.0G   52K  2.0G   1% /var/lib/ceph/osd/ceph-7
tmpfs           396M     0  396M   0% /run/user/0
[root@ceph02 ~]# umount /var/lib/ceph/osd/ceph-1
4.6 If the physical disk itself has failed, remove the LVM mappings for ceph-osd.1 and ceph-osd.8 as shown below. If the failed disk has already been replaced, it can be added straight back into the cluster and this step can be skipped.
[root@ceph03 ~]# ll /var/lib/ceph/osd/ceph-8/block
lrwxrwxrwx 1 ceph ceph 93 Feb 14 09:45 /var/lib/ceph/osd/ceph-8/block -> /dev/ceph-f0f390f2-d217-47cf-b882-e212afde9cd7/osd-block-ba913d26-e67f-4bba-8efc-6c351ccaf0f8
[root@ceph03 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 50G 0 disk
└─vda1 253:1 0 50G 0 part /
vdb 253:16 0 20G 0 disk
└─ceph--fcfa2170--d24f--4525--99f9--b88ed12d1de5-osd--block--dab88638--d753--4e01--817b--283ba3f0666b 252:1 0 19G 0 lvm
vdc 253:32 0 20G 0 disk
└─ceph--7e0279e5--47bc--4940--a71c--2fd23f8f046c-osd--block--1a36b1a3--deee--40ab--868c--bd735c9b4e26 252:2 0 19G 0 lvm
vdd 253:48 0 20G 0 disk
└─ceph--f0f390f2--d217--47cf--b882--e212afde9cd7-osd--block--ba913d26--e67f--4bba--8efc--6c351ccaf0f8 252:0 0 19G 0 lvm
[root@ceph03 ~]# dmsetup remove ceph--f0f390f2--d217--47cf--b882--e212afde9cd7-osd--block--ba913d26--e67f--4bba--8efc--6c351ccaf0f8
[root@ceph03 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 253:0 0 50G 0 disk
└─vda1 253:1 0 50G 0 part /
vdb 253:16 0 20G 0 disk
└─ceph--fcfa2170--d24f--4525--99f9--b88ed12d1de5-osd--block--dab88638--d753--4e01--817b--283ba3f0666b 252:1 0 19G 0 lvm
vdc 253:32 0 20G 0 disk
└─ceph--7e0279e5--47bc--4940--a71c--2fd23f8f046c-osd--block--1a36b1a3--deee--40ab--868c--bd735c9b4e26 252:2 0 19G 0 lvm
vdd 253:48 0 20G 0 disk
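Instead of removing the device-mapper entry by hand, ceph-volume can tear down the LVM metadata it created on the device as well; a sketch, run on the node that owns the failed disk:

# --destroy also removes the VG/LV that ceph-volume created on the device
[root@ceph03 ~]# ceph-volume lvm zap /dev/vdd --destroy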
4.7 Wipe the superblock information from the disk
[root@ceph03 ~]# wipefs -af /dev/vdd
/dev/vdd: 8 bytes were erased at offset 0x00000218 (LVM2_member): 4c 56 4d 32 20 30 30 31
4.8 Zap the partitions on the disk
[cephadmin@ceph01 my-cluster]$ ceph-deploy disk zap ceph03 /dev/vdd
4.9 Add the disk back into the cluster
[cephadmin@ceph01 my-cluster]$ ceph-deploy osd create ceph03 --data /dev/vdd
4.10 Check the status
[cephadmin@ceph01 ~]$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.16727 root default
-3 0.05576 host ceph01
0 hdd 0.01859 osd.0 up 1.00000 1.00000
3 hdd 0.01859 osd.3 up 0 1.00000
6 hdd 0.01859 osd.6 up 1.00000 1.00000
-5 0.05576 host ceph02
1 hdd 0.01859 osd.1 up 1.00000 1.00000
4 hdd 0.01859 osd.4 up 1.00000 1.00000
7 hdd 0.01859 osd.7 up 1.00000 1.00000
-7 0.05576 host ceph03
2 hdd 0.01859 osd.2 up 1.00000 1.00000
5 hdd 0.01859 osd.5 up 1.00000 1.00000
8 hdd 0.01859 osd.8 up 1.00000 1.00000
[cephadmin@ceph01 ~]$ ceph -s
  cluster:
    id:     4d02981a-cd20-4cc9-8390-7013da54b161
    health: HEALTH_WARN
            61/1818 objects misplaced (3.355%)
            Degraded data redundancy: 94/1818 objects degraded (5.171%), 37 pgs degraded
            application not enabled on 1 pool(s)

  services:
    mon: 3 daemons, quorum ceph01,ceph02,ceph03
    mgr: ceph01(active), standbys: ceph02, ceph03
    mds: cephfs-1/1/1 up {0=ceph03=up:active}, 2 up:standby
    osd: 9 osds: 9 up, 8 in; 8 remapped pgs
    rgw: 3 daemons active

  data:
    pools:   9 pools, 368 pgs
    objects: 606 objects, 1.3 GiB
    usage:   19 GiB used, 152 GiB / 171 GiB avail
    pgs:     94/1818 objects degraded (5.171%)
             61/1818 objects misplaced (3.355%)
             328 active+clean
             29 active+recovery_wait+degraded
             8 active+recovery_wait+undersized+degraded+remapped
             2 active+remapped+backfill_wait
             1 active+recovering

  io:
    recovery: 16 MiB/s, 6 objects/s
5. Ceph Cluster Maintenance
5.1 Ceph flags
As a Ceph storage administrator, maintaining your Ceph cluster will be one of your primary tasks. Ceph is a distributed system designed to scale from tens of OSDs to thousands, and one of the keys to maintaining it is managing its OSDs. To understand why the commands below are needed, suppose you want to add a new node to a production Ceph cluster. One approach is to simply add the new node with its disks and let the cluster start backfilling and shuffling data onto it. This is fine for a test cluster.
On a production system, however, you should use flags such as noin and nobackfill so the cluster does not start the backfill process the moment the new node comes in. You can then unset those flags during off-peak hours and let the cluster take its time to rebalance:
# Set a flag
ceph osd set <flag_name>
ceph osd set noout
ceph osd set nodown
ceph osd set norecover
# Unset a flag
ceph osd unset <flag_name>
ceph osd unset noout
ceph osd unset nodown
ceph osd unset norecover
Flag descriptions:
noup         # Prevents OSDs from being marked up; the OSD process is treated as not started. Typically used when adding new OSDs.
nodown       # Prevents OSDs from being marked down. Typically used while working on OSD processes, so that a briefly unresponsive OSD is not marked down and no data migration is triggered.
noout        # Prevents OSDs from being marked out. A down OSD is normally marked out after 300s, at which point data migration begins. With noout set, if an OSD goes down its PGs switch to the replica OSDs but no migration starts.
noin         # Prevents new OSDs from being marked in. Used when you have added OSDs but do not want them to join the cluster immediately, to avoid triggering data migration.
nobackfill   # Prevents the cluster from performing backfill; backfill is triggered when the cluster experiences failures.
norebalance  # Prevents data rebalancing; rebalance is triggered when the cluster is expanded. Usually set together with nobackfill and norecover to prevent any data movement.
norecover    # Prevents recovery operations.
noscrub      # Prevents scrubbing. During high load, recovery, backfilling or rebalancing, set it (together with nodeep-scrub) to protect cluster performance.
nodeep-scrub # Prevents deep scrubbing, which blocks reads and writes and hurts performance. Do not leave it set for long; otherwise, once it is unset, a large number of PGs will be deep-scrubbed at the same time.
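To see which flags are currently set on the cluster, the OSD map can be inspected directly; a small sketch:

# Flags currently set on the OSD map
[cephadmin@ceph01 ~]$ ceph osd dump | grep flags
# Set flags also show up in the health output as an OSDMAP_FLAGS warning
[cephadmin@ceph01 ~]$ ceph health detail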
5.2 Throttling backfill and recovery
If you need to add new OSD nodes during peak production hours and want to minimize the impact on client IO, you can throttle backfill and recovery with the following commands.
Set the osd_max_backfills = 1 option to limit the backfill threads. You can add it to the [osd] section of ceph.conf, or set it dynamically with the following command.
ceph tell osd.* injectargs '--osd_max_backfills 1'
Set the osd_recovery_max_active = 1 option to limit the recovery threads. You can add it to the [osd] section of ceph.conf, or set it dynamically:
ceph tell osd.* injectargs '--osd_recovery_max_active 1'
Set the osd_recovery_op_priority = 1 option to lower the recovery priority. You can add it to the [osd] section of ceph.conf, or set it dynamically:
ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
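To make these limits persist across daemon restarts, the same options can go into the [osd] section of ceph.conf and be pushed as in section 2.4; the value an OSD is actually using can be checked through its admin socket. A sketch, assuming the command is run on the node hosting osd.0:

# ceph.conf
[osd]
osd_max_backfills = 1
osd_recovery_max_active = 1
osd_recovery_op_priority = 1

# Verify the running value on a given OSD
[root@ceph01 ~]# ceph daemon osd.0 config get osd_max_backfills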
5.3 OSD and PG repair
ceph osd repair <osd-id>      # Performs a repair on the specified OSD.
ceph pg repair <pg-id>        # Performs a repair on the specified PG. Use this command with caution; depending on the cluster state, it can affect user data if used carelessly.
ceph pg scrub <pg-id>         # Performs a scrub on the specified PG.
ceph pg deep-scrub <pg-id>    # Performs a deep scrub on the specified PG.
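A typical use of pg repair is fixing a scrub error reported in the cluster health; a hedged sketch of the usual sequence (the PG id 2.1f below is only a placeholder):

# Find which PGs are inconsistent
[cephadmin@ceph01 ~]$ ceph health detail | grep inconsistent
# Inspect what deep-scrub found in that PG
[cephadmin@ceph01 ~]$ rados list-inconsistent-obj 2.1f --format=json-pretty
# Ask the primary OSD to repair the PG
[cephadmin@ceph01 ~]$ ceph pg repair 2.1f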