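While checking the Ceph cluster in our test environment, I found the cluster status reporting the warning "3 pgs not deep-scrubbed in time". This post records how I handled it.
- Check the cluster status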
[root@ceph-p-001 ~]# ceph -s
  cluster:
    id:     f0f53ab6-36bf-48f0-98dd-4fad46e31991
    health: HEALTH_WARN
            3 pgs not deep-scrubbed in time
  services:
    mon: 3 daemons, quorum ceph-p-001,ceph-v-003,ceph-p-002 (age 2w)
    mgr: node2(active, since 4M), standbys: node1
    mds: cephfs:1 {0=node2=up:active} 1 up:standby
    osd: 12 osds: 12 up (since 4M), 12 in (since 5M)
    rgw: 3 daemons active (node1, node2, node3)
  data:
    pools:   10 pools, 640 pgs
    objects: 1.17M objects, 617 GiB
    usage:   1.3 TiB used, 21 TiB / 22 TiB avail
    pgs:     640 active+clean
  io:
    client:   2.3 KiB/s rd, 352 KiB/s wr, 2 op/s rd, 37 op/s wr
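The cluster is reporting a warning. It does not affect normal use of the cluster, but as a perfectionist I could not leave it alone. The steps I took are below.
- Check the detailed health message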
[root@ceph-p-001 ~]# ceph health detail
HEALTH_WARN 3 pgs not deep-scrubbed in time
PG_NOT_DEEP_SCRUBBED 3 pgs not deep-scrubbed in time
    pg 12.1b not deep-scrubbed since 2020-10-29 03:52:31.523550
    pg 4.d not deep-scrubbed since 2020-10-29 05:30:15.630028
    pg 9.39 not deep-scrubbed since 2020-10-29 05:01:41.849331
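- The warning means that some PGs have not been deep-scrubbed within the expected time window. Manually triggering a deep scrub on each affected PG is enough to clear it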
[root@ceph-p-001 ~]# ceph pg deep-scrub 12.1b
instructing pg 12.1b on osd.11 to deep-scrub
[root@ceph-p-001 ~]# ceph pg deep-scrub 4.d
instructing pg 4.d on osd.9 to deep-scrub
[root@ceph-p-001 ~]# ceph pg deep-scrub 9.39
instructing pg 9.39 on osd.11 to deep-scrub
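With only three PGs, typing the commands by hand is fine. For a longer list, the same step could be scripted. The following is a minimal sketch, assuming the `ceph health detail` output keeps the line format shown earlier (`pg <id> not deep-scrubbed since ...`); the function names `extract_stale_pgs` and `deep_scrub_stale_pgs` are hypothetical helpers, not part of Ceph.

```shell
#!/usr/bin/env bash

# Read "ceph health detail"-style text on stdin and print the ID of every
# PG reported as not deep-scrubbed, one per line.
# Matches lines such as: "pg 12.1b not deep-scrubbed since 2020-10-29 ..."
extract_stale_pgs() {
    awk '$1 == "pg" && $3 == "not" && $4 == "deep-scrubbed" { print $2 }'
}

# Hypothetical helper: request a deep scrub for every PG reported above.
# It only issues the requests; the scrubs themselves run asynchronously.
deep_scrub_stale_pgs() {
    ceph health detail | extract_stale_pgs | while read -r pg; do
        ceph pg deep-scrub "$pg"
    done
}
```

After sourcing the file on a node with admin access, calling `deep_scrub_stale_pgs` would issue one `ceph pg deep-scrub` per reported PG. Deep scrubs add read I/O on the OSDs involved, so on a busy cluster it may be preferable to trigger them a few at a time.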
- Checking the cluster status again afterwards shows one PG in the active+clean+scrubbing+deep state
[root@ceph-p-001 ~]# ceph -s
  cluster:
    id:     f0f53ab6-36bf-48f0-98dd-4fad46e31991
    health: HEALTH_WARN
            1 pgs not deep-scrubbed in time
  services:
    mon: 3 daemons, quorum ceph-p-001,ceph-v-003,ceph-p-002 (age 2w)
    mgr: node2(active, since 4M), standbys: node1
    mds: cephfs:1 {0=node2=up:active} 1 up:standby
    osd: 12 osds: 12 up (since 4M), 12 in (since 5M)
    rgw: 3 daemons active (node1, node2, node3)
  data:
    pools:   10 pools, 640 pgs
    objects: 1.17M objects, 618 GiB
    usage:   1.3 TiB used, 21 TiB / 22 TiB avail
    pgs:     639 active+clean
             1   active+clean+scrubbing+deep
  io:
    client:   3.2 KiB/s rd, 408 KiB/s wr, 3 op/s rd, 43 op/s wr
- The detailed health output shows that some of the PGs are already back to normal
[root@ceph-p-001 ~]# ceph health detail
HEALTH_WARN 1 pgs not deep-scrubbed in time
PG_NOT_DEEP_SCRUBBED 1 pgs not deep-scrubbed in time
    pg 4.d not deep-scrubbed since 2020-10-29 05:30:15.630028
- After a while, the remaining deep scrub finishes and the cluster is fully back to normal
[root@ceph-p-001 ~]# ceph -s
  cluster:
    id:     f0f53ab6-36bf-48f0-98dd-4fad46e31991
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-p-001,ceph-v-003,ceph-p-002 (age 2w)
    mgr: node2(active, since 4M), standbys: node1
    mds: cephfs:1 {0=node2=up:active} 1 up:standby
    osd: 12 osds: 12 up (since 4M), 12 in (since 5M)
    rgw: 3 daemons active (node1, node2, node3)
  data:
    pools:   10 pools, 640 pgs
    objects: 1.17M objects, 618 GiB
    usage:   1.3 TiB used, 21 TiB / 22 TiB avail
    pgs:     640 active+clean
  io:
    client:   2.4 KiB/s rd, 1.1 MiB/s wr, 3 op/s rd, 62 op/s wr
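Instead of re-running the status commands by hand while the deep scrubs finish, the wait could be automated. This is a rough sketch under the same assumption about the `ceph health detail` line format; `count_stale_pgs` and `wait_for_deep_scrub` are hypothetical names, and the 60-second polling interval is an arbitrary choice.

```shell
#!/usr/bin/env bash

# Read "ceph health detail"-style text on stdin and print how many PGs
# are still listed as not deep-scrubbed.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function's exit status clean while still printing "0".
count_stale_pgs() {
    grep -c 'not deep-scrubbed since' || true
}

# Hypothetical helper: poll until no PG is flagged any more.
# Note: it does not distinguish "no PGs flagged" from "ceph command failed",
# so treat it as an illustration rather than a monitoring tool.
wait_for_deep_scrub() {
    local remaining
    while true; do
        remaining=$(ceph health detail | count_stale_pgs)
        [ "$remaining" -eq 0 ] && break
        echo "$remaining PG(s) still waiting for deep scrub"
        sleep 60
    done
    echo "no PGs flagged as not deep-scrubbed"
}
```

The loop only watches for this particular warning to clear; `ceph -s` is still the way to confirm that the overall health is HEALTH_OK.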