1 Introduction
OSD, short for Object Storage Device, is a role in a Ceph cluster: the daemon that answers client requests and returns the actual data. A Ceph cluster normally runs multiple OSDs.
This article walks through the basic commands for operating OSDs.
2 Common operations
2.1 Check OSD status
$ ceph osd stat
3 osds: 3 up, 3 in
Status legend:
● in the cluster (in)
● out of the cluster (out)
● alive and running (up)
● dead and no longer running (down)
Notes:
An OSD that is alive can be either in or out of the cluster. If it was in but recently went out, Ceph migrates its placement groups to other OSDs.
Once an OSD is out, CRUSH no longer assigns placement groups to it. An OSD that is down should normally also be out.
An OSD that is down yet still in indicates a problem, and the cluster is not healthy.
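As a quick sanity check, the down-but-in combination can be spotted by filtering `ceph osd tree` output. A minimal sketch, using an embedded sample in place of live cluster output (treating a REWEIGHT of 0 as "out" is an assumption based on how the tree output in this article prints evicted OSDs):

```shell
# Sketch: flag OSDs that are down yet still in -- the unhealthy combination
# described above. The sample stands in for live `ceph osd tree` output.
sample='0 hdd 0.01659 osd.0 up 1.00000 1.00000
1 hdd 0.01659 osd.1 down 1.00000 1.00000
2 hdd 0.01659 osd.2 down 0 1.00000'
# Assumption: REWEIGHT (column 6) is printed as 0 when an OSD is out.
echo "$sample" | awk '$5 == "down" && $6 != "0" {print $4 " is down but still in"}'
```

Run against a real cluster, the `sample` variable would be replaced by piping `ceph osd tree` directly into the awk filter.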
2.2 View the OSD map
$ ceph osd dump
epoch 64
fsid 0c042a2d-b040-484f-935b-d4f68428b2d6
created 2021-12-28 18:24:46.492648
modified 2021-12-29 16:43:51.895508
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 19
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client jewel
min_compat_client jewel
require_osd_release luminous
max_osd 10
osd.0 up in weight 1 up_from 37 up_thru 0 down_at 36 last_clean_interval [9,35) 192.168.8.101:9601/24028 192.168.8.101:9602/24028 192.168.8.101:9603/24028 192.168.8.101:9604/24028 exists,up 33afe59f-a8f2-44a0-b479-64837183bbd6
osd.1 up in weight 1 up_from 6 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.8.102:9600/20844 192.168.8.102:9601/20844 192.168.8.102:9602/20844 192.168.8.102:9603/20844 exists,up ecf9d3c8-6f2f-42ac-bf8b-6cefcc44cdd0
osd.2 up in weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.8.103:9601/32806 192.168.8.103:9602/32806 192.168.8.103:9603/32806 192.168.8.103:9604/32806 exists,up dde6a9b4-b247-433a-bb29-8355a45a1fb1
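Each `osd.N` line of the dump packs the state, weight, and the OSD's four addresses into fixed columns. As a sketch, the id, up/in state, and public address can be pulled out with awk (the sample line is copied from the dump above; the column positions are an assumption based on that format):

```shell
# Sketch: extract id, state, and public address from a `ceph osd dump` osd line.
# The sample is copied from the dump output above; field 14 being the public
# address is an assumption based on that line's layout.
line='osd.1 up in weight 1 up_from 6 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.8.102:9600/20844 192.168.8.102:9601/20844 192.168.8.102:9602/20844 192.168.8.102:9603/20844 exists,up ecf9d3c8-6f2f-42ac-bf8b-6cefcc44cdd0'
echo "$line" | awk '{print $1, $2 "/" $3, $14}'
```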
2.3 View the OSD tree
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
2.4 Take an OSD down
# Stop osd.2's daemon so it is marked down; it no longer serves read/write requests, but it remains a member of the cluster
$ systemctl stop ceph-osd@2
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 down 1.00000 1.00000
2.5 Bring an OSD up
# Start osd.2 so it is marked up and serves read/write requests again
$ systemctl start ceph-osd@2
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
2.6 Evict an OSD from the cluster
# Mark osd.2 out of the cluster, i.e. take it out of service so it can be maintained
$ ceph osd out 2
marked out osd.2.
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 0 1.00000
2.7 Add an OSD back into the cluster
# Mark osd.2 back in
$ ceph osd in 2
marked in osd.2.
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
2.8 Add a new OSD to the cluster
2.8.1 Adding a bluestore OSD
# Inspect the disks
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 50G 0 disk
├─sdb2 8:18 0 31G 0 part
├─sdb3 8:19 0 17G 0 part
└─sdb1 8:17 0 2G 0 part
sr0 11:0 1 1024M 0 rom
sdc 8:32 0 50G 0 disk
sda 8:0 0 50G 0 disk
├─sda2 8:2 0 49G 0 part
│ ├─centos-swap 253:1 0 3.9G 0 lvm [SWAP]
│ └─centos-root 253:0 0 45.1G 0 lvm /
└─sda1 8:1 0 1G 0 part /boot
# Partition sdc into 3 partitions (same layout as sdb), used for wal, db, and data respectively
$ parted -s /dev/sdc mklabel gpt
$ parted -s /dev/sdc mkpart primary 2048s 4196351s
$ parted -s /dev/sdc mkpart primary 4196352s 69208063s
$ parted -s /dev/sdc mkpart primary 69208064s 100%
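The sector boundaries above can be checked with a little arithmetic: assuming 512-byte sectors, they reproduce the sdb layout (a 2 GiB wal and a 31 GiB db, with the remainder as data):

```shell
# Sketch: what the parted sector ranges above amount to, assuming 512-byte
# sectors. This should match the sdb layout (2G wal, 31G db, rest as data).
SECTOR=512
GIB=$((1024 * 1024 * 1024))
wal_bytes=$(( (4196352 - 2048) * SECTOR ))      # sdc1 -> block.wal
db_bytes=$(( (69208064 - 4196352) * SECTOR ))   # sdc2 -> block.db
echo "wal: $(( wal_bytes / GIB )) GiB"
echo "db:  $(( db_bytes / GIB )) GiB"
```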
# Use ceph-volume to prepare a bluestore OSD and add it to the cluster.
$ ceph-volume lvm prepare --block.db /dev/sdc2 --block.wal /dev/sdc1 --data /dev/sdc3
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6a81b934-feb1-4fce-8ad5-ed0a5f203f40
Running command: vgcreate --force --yes ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0 /dev/sdc3
stdout: Physical volume "/dev/sdc3" successfully created.
stdout: Volume group "ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0" successfully created
Running command: lvcreate --yes -l 100%FREE -n osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40 ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0
stdout: Logical volume "osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-3
Running command: restorecon /var/lib/ceph/osd/ceph-3
Running command: chown -h ceph:ceph /dev/ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0/osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40
Running command: chown -R ceph:ceph /dev/dm-3
Running command: ln -s /dev/ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0/osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40 /var/lib/ceph/osd/ceph-3/block
Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
stderr: got monmap epoch 1
stderr:
Running command: ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQCHDMxhByjSKRAA8pNSa1wpOEccKHaiilCj+g==
stdout: creating /var/lib/ceph/osd/ceph-3/keyring
stdout: added entity osd.3 auth auth(auid = 18446744073709551615 key=AQCHDMxhByjSKRAA8pNSa1wpOEccKHaiilCj+g== with 0 caps)
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
Running command: chown -R ceph:ceph /dev/sdc1
Running command: chown -R ceph:ceph /dev/sdc2
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --keyfile - --bluestore-block-wal-path /dev/sdc1 --bluestore-block-db-path /dev/sdc2 --osd-data /var/lib/ceph/osd/ceph-3/ --osd-uuid 6a81b934-feb1-4fce-8ad5-ed0a5f203f40 --setuser ceph --setgroup ceph
--> ceph-volume lvm prepare successful for: /dev/sdc3
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
3 0 osd.3 down 0 1.00000
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 4 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.01GiB used, 48.0GiB / 51.0GiB avail
pgs:
$ ceph osd crush add osd.3 1.00000 host=node103
add item id 3 name 'osd.3' weight 1 at location {host=node103} to crush map
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
3 1.00000 osd.3 down 0 1.00000
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ ceph osd in osd.3
marked in osd.3.
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
3 1.00000 osd.3 down 1.00000 1.00000
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
Active: inactive (dead)
$ systemctl start ceph-osd@3
$ systemctl enable ceph-osd@3
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2021-12-29 15:30:04 CST; 15s ago
Main PID: 3368700 (ceph-osd)
CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@3.service
└─3368700 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph
Dec 29 15:30:04 node103 systemd[1]: Starting Ceph object storage daemon osd.3...
Dec 29 15:30:04 node103 systemd[1]: Started Ceph object storage daemon osd.3.
Dec 29 15:30:04 node103 ceph-osd[3368700]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 15:30:05 node103 ceph-osd[3368700]: 2021-12-29 15:30:05.958147 7f6eabf16d80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 15:30:06 node103 ceph-osd[3368700]: 2021-12-29 15:30:06.848849 7f6e922f1700 -1 osd.3 0 waiting for initial osdmap
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
3 hdd 1.00000 osd.3 up 1.00000 1.00000
2.8.2 Adding a filestore OSD
$ parted -s /dev/sdc mklabel gpt
$ parted -s /dev/sdc mkpart primary 1M 10240M
$ parted -s /dev/sdc mkpart primary 10241M 100%
$ ceph-volume lvm prepare --filestore --data /dev/sdc2 --journal /dev/sdc1
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new ca2b837d-901d-4ce2-940b-fc278364b889
Running command: vgcreate --force --yes ceph-848a4efc-ed30-47f3-b64a-6229cfa35060 /dev/sdc2
stdout: Volume group "ceph-848a4efc-ed30-47f3-b64a-6229cfa35060" successfully created
Running command: lvcreate --yes -l 100%FREE -n osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 ceph-848a4efc-ed30-47f3-b64a-6229cfa35060
stdout: Wiping xfs signature on /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889.
stdout: Logical volume "osd-data-ca2b837d-901d-4ce2-940b-fc278364b889" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: mkfs -t xfs -f -i size=2048 /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889
stdout: meta-data=/dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 isize=2048 agcount=4, agsize=2651392 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=10605568, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=5178, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Running command: mount -t xfs -o rw,noatime,inode64 /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 /var/lib/ceph/osd/ceph-3
Running command: restorecon /var/lib/ceph/osd/ceph-3
Running command: chown -R ceph:ceph /dev/sdc1
Running command: ln -s /dev/sdc1 /var/lib/ceph/osd/ceph-3/journal
Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
stderr: got monmap epoch 1
stderr:
Running command: chown -h ceph:ceph /var/lib/ceph/osd/ceph-3/journal
Running command: chown -R ceph:ceph /dev/sdc1
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore filestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --osd-data /var/lib/ceph/osd/ceph-3/ --osd-journal /var/lib/ceph/osd/ceph-3/journal --osd-uuid ca2b837d-901d-4ce2-940b-fc278364b889 --setuser ceph --setgroup ceph
stderr: 2021-12-29 17:28:26.696005 7fdf5cceed80 -1 journal check: ondisk fsid ff68b4ad-06af-4dd2-8369-aac3da4abcc3 doesn't match expected ca2b837d-901d-4ce2-940b-fc278364b889, invalid (someone else's?) journal
stderr: 2021-12-29 17:28:26.902358 7fdf5cceed80 -1 journal do_read_entry(4096): bad header magic
stderr: 2021-12-29 17:28:26.902401 7fdf5cceed80 -1 journal do_read_entry(4096): bad header magic
stderr: 2021-12-29 17:28:26.904367 7fdf5cceed80 -1 read_settings error reading settings: (2) No such file or directory
stderr: 2021-12-29 17:28:27.171320 7fdf5cceed80 -1 created object store /var/lib/ceph/osd/ceph-3/ for osd.3 fsid 0c042a2d-b040-484f-935b-d4f68428b2d6
Running command: ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQAlKsxhXaKLBRAASmGMnMaltRwMIOZ9s/NN6w==
stdout: creating /var/lib/ceph/osd/ceph-3/keyring
added entity osd.3 auth auth(auid = 18446744073709551615 key=AQAlKsxhXaKLBRAASmGMnMaltRwMIOZ9s/NN6w== with 0 caps)
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
--> ceph-volume lvm prepare successful for: /dev/sdc2
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 4 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.03GiB used, 48.0GiB / 51.0GiB avail
pgs:
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
3 0 osd.3 down 0 1.00000
$ ceph osd crush add osd.3 1.00000 host=node103
add item id 3 name 'osd.3' weight 1 at location {host=node103} to crush map
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
3 1.00000 osd.3 down 0 1.00000
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ ceph osd in osd.3
marked in osd.3.
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
3 1.00000 osd.3 down 1.00000 1.00000
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Wed 2021-12-29 17:24:23 CST; 6min ago
Process: 3665352 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
Process: 3665311 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 3665352 (code=exited, status=0/SUCCESS)
Dec 29 17:19:47 node103 ceph-osd[3665352]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.875231 7f957470fd80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.875549 7f957470fd80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.976355 7f957470fd80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 17:19:50 node103 ceph-osd[3665352]: 2021-12-29 17:19:50.246598 7f95562e1700 -1 osd.3 0 waiting for initial osdmap
Dec 29 17:24:20 node103 systemd[1]: Stopping Ceph object storage daemon osd.3...
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.051448 7f954c2cd700 -1 received signal: Terminated from PID: 1 task name: /usr/lib/systemd/systemd --switched-root --system --deserialize 22 UID: 0
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.051526 7f954c2cd700 -1 osd.3 68 *** Got signal Terminated ***
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.849249 7f954c2cd700 -1 osd.3 68 shutdown
Dec 29 17:24:23 node103 systemd[1]: Stopped Ceph object storage daemon osd.3.
$ systemctl start ceph-osd@3
$ systemctl enable ceph-osd@3
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2021-12-29 17:31:07 CST; 9s ago
Main PID: 3696217 (ceph-osd)
CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@3.service
└─3696217 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph
Dec 29 17:31:07 node103 systemd[1]: Starting Ceph object storage daemon osd.3...
Dec 29 17:31:07 node103 systemd[1]: Started Ceph object storage daemon osd.3.
Dec 29 17:31:07 node103 ceph-osd[3696217]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.147135 7f6f8c407d80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.147178 7f6f8c407d80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.185153 7f6f8c407d80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 17:31:09 node103 ceph-osd[3696217]: 2021-12-29 17:31:09.465673 7f6f6dfd9700 -1 osd.3 0 waiting for initial osdmap
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
3 hdd 1.00000 osd.3 up 1.00000 1.00000
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 4 osds: 4 up, 4 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.13GiB used, 88.3GiB / 91.4GiB avail
pgs:
2.9 Remove an OSD from the cluster
2.9.1 Removing a bluestore OSD
$ systemctl stop ceph-osd@3
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Wed 2021-12-29 16:35:12 CST; 3s ago
Process: 3541909 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
Process: 3541893 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 3541909 (code=exited, status=0/SUCCESS)
$ ceph osd crush remove osd.3
removed item id 3 name 'osd.3' from crush map
$ ceph osd out 3
marked out osd.3.
$ ceph osd rm osd.3
removed osd.3
$ ceph auth del osd.3
updated
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 3 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.02GiB used, 48.0GiB / 51.0GiB avail
pgs:
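For repeated use, the removal steps above can be collected into one small script. A dry-run sketch: with `RUN=echo` it only prints the commands, and clearing `RUN` would execute them for real (`ID=3` matches this section's example):

```shell
# Sketch: the removal steps from this section as one script. With RUN=echo it
# is a dry run that only prints the commands; clear RUN to execute for real.
ID=3          # the osd id used in this section's example
RUN=echo
$RUN systemctl stop ceph-osd@$ID
$RUN ceph osd crush remove osd.$ID   # remove from the CRUSH map
$RUN ceph osd out $ID                # mark out
$RUN ceph osd rm osd.$ID             # remove from the OSD map
$RUN ceph auth del osd.$ID           # delete its auth key
```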
2.9.2 Removing a filestore OSD
Same procedure as 2.9.1 (removing a bluestore OSD).
2.10 Check the maximum number of OSDs
# Show the maximum OSD count; in this cluster it is 4
$ ceph osd getmaxosd
max_osd = 4 in epoch 63
2.11 Set the maximum number of OSDs
# Set the maximum OSD count; this value must be raised before scaling out beyond it
$ ceph osd setmaxosd 10
set new max_osd = 10
$ ceph osd getmaxosd
max_osd = 10 in epoch 64
2.12 Set an OSD's CRUSH weight
Syntax: ceph osd crush set {id} {weight} [{loc1} [{loc2} ...]]
$ ceph osd crush set 3 3.0 host=node4
# or
$ ceph osd crush reweight osd.3 1.0
2.13 Set an OSD's weight (reweight)
Syntax: ceph osd reweight {id} {weight}
$ ceph osd reweight 3 0.5
2.14 Pause OSDs
# Once paused, the whole cluster stops accepting I/O
$ ceph osd pause
2.15 Unpause OSDs
# The cluster accepts I/O again
$ ceph osd unpause
2.16 View an OSD's configuration
# Show the runtime configuration of a specific OSD via its admin socket
$ ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config show | less
2.17 Flapping OSDs
# We recommend running both a public (front-side) network and a cluster (back-side) network, which better serves the capacity needs of object replication.
# However, when the cluster network fails or becomes noticeably laggy while the public network keeps working, OSDs handle the situation poorly:
# they report their peers as down to the monitors while reporting themselves as up. We call this flapping.
# If something is causing OSDs to flap (repeatedly being marked down and then up again), you can force the monitors to stop it with the flags below.
$ ceph osd set noup # prevent OSDs from getting marked up
$ ceph osd set nodown # prevent OSDs from getting marked down
# These flags are recorded in the osdmap:
ceph osd dump | grep flags
flags no-up,no-down
# Clear the flags with:
ceph osd unset noup
ceph osd unset nodown
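A monitoring job can warn when these flags are left set after the flapping has been dealt with. A sketch, matching the `no-up,no-down` flag spelling shown in the dump output above (the sample string stands in for live output):

```shell
# Sketch: warn if the flapping flags are still set, e.g. from a cron check.
# The sample line stands in for real `ceph osd dump | grep flags` output.
flags='flags no-up,no-down'
case "$flags" in
  *no-up*|*no-down*) echo "WARNING: flapping flags still set: $flags" ;;
  *)                 echo "no flapping flags set" ;;
esac
```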
2.18 Change OSD parameters at runtime
# Inject a parameter into every OSD. The change does not survive a restart; persist it in the configuration file as well
$ ceph tell osd.* injectargs "--rbd_default_format 2 "
2.19 Check latency
Mainly used to spot a single bad disk; if an OSD is faulty, remove it promptly. The reported values are averages.
fs_commit_latency is the interval from receiving a request until it reaches the commit state;
fs_apply_latency is the interval from receiving a request until it reaches the apply state.
$ ceph osd perf
osd commit_latency(ms) apply_latency(ms)
2 0 0
0 0 0
1 0 0
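To spot the single bad disk mentioned above, the `ceph osd perf` table can be filtered against a latency threshold. A sketch over a made-up sample (the 100 ms threshold is an arbitrary assumption):

```shell
# Sketch: flag OSDs whose commit latency exceeds a threshold, using the
# `ceph osd perf` column layout shown above. The sample values are made up.
THRESHOLD_MS=100
perf_sample='2 5 3
0 240 180
1 12 9'
# columns: osd commit_latency(ms) apply_latency(ms)
echo "$perf_sample" | awk -v t="$THRESHOLD_MS" \
  '$2 > t {print "osd." $1 " commit latency " $2 "ms exceeds " t "ms"}'
```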
2.20 Primary affinity
When Ceph clients read and write data, an OSD may be less suited than its peers to act as the primary (e.g. because of a slow disk or controller). To maximize hardware utilization and avoid performance bottlenecks (especially for reads),
you can lower that OSD's primary affinity so that CRUSH rarely picks it as the primary of an acting set.
Note: in the acting set [0, 1, 2], osd.0 is the primary.
#ceph osd primary-affinity <osd-id> <weight>
$ ceph osd primary-affinity 2 1.0
# Primary affinity defaults to 1 (i.e. the OSD may serve as a primary). The valid range is 0-1, where 0 means the OSD is never used as a primary
# and 1 means it may be; for values below 1, CRUSH is less likely to select the OSD as a primary
2.21 Get the CRUSH map
# Extract the latest CRUSH map
#ceph osd getcrushmap -o {compiled-crushmap-filename}
$ ceph osd getcrushmap -o /root/crush
# Decompile the CRUSH map
# crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}
$ crushtool -d /root/crush -o /root/decompiled_crush
2.22 Inject a CRUSH map
# Compile the CRUSH map
#crushtool -c {decompiled-crush-map-filename} -o {compiled-crush-map-filename}
$ crushtool -c /root/decompiled_crush -o /root/crush_new
# Inject the CRUSH map
# ceph osd setcrushmap -i {compiled-crushmap-filename}
$ ceph osd setcrushmap -i /root/crush_new
2.23 Stop automatic rebalancing
# When doing periodic maintenance on a cluster subsystem, or fixing a failure domain, set noout beforehand so that taking OSDs offline does not trigger CRUSH rebalancing
$ ceph osd set noout
noout is set
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_WARN
noout flag(s) set
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 3 osds: 3 up, 3 in
flags noout
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.03GiB used, 48.0GiB / 51.0GiB avail
pgs:
2.24 Resume automatic rebalancing
$ ceph osd unset noout
noout is unset
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 3 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.03GiB used, 48.0GiB / 51.0GiB avail
pgs:
2.25 View disk partitions
$ ceph-disk list
/dev/dm-0 other, xfs, mounted on /
/dev/dm-1 swap, swap
/dev/dm-2 other, unknown
/dev/dm-3 other, unknown
/dev/sda :
/dev/sda1 other, xfs, mounted on /boot
/dev/sda2 other, LVM2_member
/dev/sdb :
/dev/sdb1 ceph block.wal
/dev/sdb2 ceph block.db
/dev/sdb3 other, LVM2_member