1 Deploying the Dashboard
(1) Overview
Starting with the Luminous (L) release, Ceph provides a native Dashboard that can be used to view the cluster status and perform basic management. Using the Dashboard requires installing the ceph-mgr-dashboard package on the MGR nodes.
(2) Check Ceph status
# ceph -s
  cluster:
    id:     14912382-3d84-4cf2-9fdb-eebab12107d8
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-node01,ceph-node02,ceph-node03 (age 4h)
    mgr: ceph-node01(active, since 4h), standbys: ceph-node02, ceph-node03
    osd: 6 osds: 6 up (since 4h), 6 in (since 12d)
  data:
    pools:   2 pools, 65 pgs
    objects: 4 objects, 19 B
    usage:   6.1 GiB used, 114 GiB / 120 GiB avail
    pgs:     65 active+clean
# ceph -v
ceph version 15.2.15 (2dfb18841cfecc2f7eb7eb2afd65986ca4d95985) octopus (stable)
(3) Install on ceph-node01, ceph-node02, and ceph-node03
1) Configure the Ceph yum repository
The Ceph repository was already configured when the cluster was first deployed, so it is not repeated here.
2) Install
# yum install ceph-mgr-dashboard -y
......(output omitted)
--> Processing Dependency: python3-routes for package: 2:ceph-mgr-dashboard-15.2.15-0.el7.noarch
--> Processing Dependency: python3-cherrypy for package: 2:ceph-mgr-dashboard-15.2.15-0.el7.noarch
--> Processing Dependency: python3-jwt for package: 2:ceph-mgr-dashboard-15.2.15-0.el7.noarch
---> Package ceph-prometheus-alerts.noarch 2:15.2.15-0.el7 will be installed
---> Package python36-werkzeug.noarch 0:1.0.1-1.el7 will be installed
--> Finished Dependency Resolution
Error: Package: 2:ceph-mgr-dashboard-15.2.15-0.el7.noarch (Ceph-noarch)
Requires: python3-jwt
Error: Package: 2:ceph-mgr-dashboard-15.2.15-0.el7.noarch (Ceph-noarch)
Requires: python3-routes
Error: Package: 2:ceph-mgr-dashboard-15.2.15-0.el7.noarch (Ceph-noarch)
Requires: python3-cherrypy
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
(4) Analysis of the installation error
Starting with the Octopus (O) release, the MGR modules are written in Python 3, and the default CentOS 7 repositories do not provide these three module packages; even installing them separately may fail or not take effect. According to the community this is a known issue, and the suggested workarounds are to use CentOS 8, to deploy Ceph in containers with cephadm, or to downgrade Ceph, for example to the Nautilus (N) release, which is still written in Python 2 and does not have the missing-package problem. Here we choose to downgrade to Nautilus and redeploy the Ceph cluster.
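Before downgrading, you can optionally confirm that the three missing Python 3 modules really are unavailable in any enabled repository. A minimal check, assuming the standard yum tooling on CentOS 7:
# yum list available python3-routes python3-cherrypy python3-jwt
# yum provides python3-jwt    # shows which repository, if any, could supply the package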
2 Deploying Ceph Nautilus (N)
Perform the following on the ceph-deploy node.
(1) Clean up the existing Ceph cluster
1) Uninstall the Ceph packages from the remote hosts and clean up their data
# ceph-deploy purge ceph-deploy ceph-node01 ceph-node02 ceph-node03
The command actually executed is:
# yum -y -q remove ceph ceph-release ceph-common ceph-radosgw
"/etc/ceph/"目錄會被移除。
2) 清理ceph所有node節點上的數據
# ceph-deploy purgedata ceph-node01 ceph-node02 ceph-node03
The commands executed are:
Running command: rm -rf --one-file-system -- /var/lib/ceph
Running command: rm -rf --one-file-system -- /etc/ceph/
3) Remove the authentication keys from the local working directory
# cd /root/my-cluster/
# ceph-deploy forgetkeys
# rm -f ./*
4) Completely remove any remaining Ceph packages
Run on the ceph-deploy, ceph-node01, ceph-node02, and ceph-node03 nodes.
# rpm -qa |grep 15.2.15 |xargs -i yum remove {} -y
5) Remove the LVM device-mapper mappings created for the OSD disks
Run on the ceph-node01, ceph-node02, and ceph-node03 nodes.
# dmsetup info -C |awk '/ceph/{print $1}' |xargs -i dmsetup remove {}
6) Wipe the GPT data structures from the OSD disks
Run on the ceph-node01, ceph-node02, and ceph-node03 nodes.
# yum install gdisk -y
# sgdisk --zap-all /dev/sdb # Remove all partition data from the disk
# sgdisk --zap-all /dev/sdc
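As an optional sanity check (not part of the original procedure), verify that the OSD disks no longer carry partitions or Ceph device-mapper mappings before redeploying:
# lsblk /dev/sdb /dev/sdc     # should show bare disks with no child devices
# dmsetup ls | grep ceph      # should return nothing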
(2) Deploy (same procedure as before)
1) Configure the yum repository
Run on the ceph-deploy, ceph-node01, ceph-node02, and ceph-node03 nodes.
# cat > /etc/yum.repos.d/ceph.repo << EOF
[Ceph]
name=Ceph packages for \$basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/\$basearch
gpgcheck=0
[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch
gpgcheck=0
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
gpgcheck=0
EOF
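Optionally, rebuild the yum cache and confirm that the Nautilus repositories are visible before installing (an extra verification step, not in the original procedure):
# yum clean all && yum makecache fast
# yum repolist | grep -i ceph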
2) Create the ceph-deploy working directory
# mkdir -p /root/my-cluster/ && cd /root/my-cluster/
3) Initialize the MONs
# ceph-deploy install --no-adjust-repos ceph-deploy ceph-node01 ceph-node02 ceph-node03
# ceph-deploy new ceph-node01 ceph-node02 ceph-node03
# ceph-deploy mon create-initial
# ceph-deploy admin ceph-deploy ceph-node01 ceph-node02 ceph-node03
4) Initialize the OSDs
# ceph-deploy osd create --data /dev/sdb ceph-node01
# ceph-deploy osd create --data /dev/sdc ceph-node01
# ceph-deploy osd create --data /dev/sdb ceph-node02
# ceph-deploy osd create --data /dev/sdc ceph-node02
# ceph-deploy osd create --data /dev/sdb ceph-node03
# ceph-deploy osd create --data /dev/sdc ceph-node03
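As an optional check, confirm that all six OSDs came up and are mapped to the expected hosts:
# ceph osd tree    # should list 6 OSDs, 2 per node, all up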
5) Initialize the MGRs
# ceph-deploy mgr create ceph-node01 ceph-node02 ceph-node03
6) Check the cluster status (shows a health warning)
# ceph -s
  cluster:
    id:     07862ba6-c411-4cee-a912-ed68850028d5
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
......(output omitted)
How to resolve the warning:
# ceph config set mon auth_allow_insecure_global_id_reclaim false
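To verify that the setting took effect and the warning clears (optional):
# ceph config get mon auth_allow_insecure_global_id_reclaim   # should print "false"
# ceph health                                                 # should return HEALTH_OK shortly afterwards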
7) Check the cluster status (healthy)
# ceph -s
  cluster:
    id:     07862ba6-c411-4cee-a912-ed68850028d5
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-node01,ceph-node02,ceph-node03 (age 13m)
    mgr: ceph-node01(active, since 6m), standbys: ceph-node02, ceph-node03
    osd: 6 osds: 6 up (since 7m), 6 in (since 7m)
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   6.0 GiB used, 114 GiB / 120 GiB avail
    pgs:
8) Check the Ceph version
# ceph -v
ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)
(3) Test by adding an RBD block device and a CephFS file system
1) Add an RBD block device
# ceph osd pool create rbd-pool 64 64
# ceph osd pool application enable rbd-pool rbd
# rbd create --size 10240 rbd-pool/image01
# rbd feature disable rbd-pool/image01 object-map fast-diff deep-flatten
# rbd map rbd-pool/image01
# mkfs.xfs /dev/rbd0
# mount /dev/rbd0 /mnt
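To confirm the mapping and the mount (an optional check, not part of the original steps):
# rbd showmapped    # lists rbd-pool/image01 mapped to /dev/rbd0
# df -hT /mnt       # shows the ~10 GiB xfs filesystem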
2) Add a CephFS file system
# cd /root/my-cluster/
# ceph-deploy mds create ceph-node01 ceph-node02 ceph-node03
# ceph osd pool create cephfs_data 64 64
# ceph osd pool create cephfs_metadata 64 64
# ceph fs new cephfs-pool cephfs_metadata cephfs_data
# ceph fs ls
name: cephfs-pool, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
# ceph osd pool ls
rbd-pool
cephfs_data
cephfs_metadata
# grep "key" /etc/ceph/ceph.client.admin.keyring
key = AQAIKKNhJJWQARAA47VYIHu8MD/j8SQkbqq+Pw==
3) Mount the CephFS file system
Run on the 172.16.1.34 node.
# mount -t ceph 172.16.1.31:6789,172.16.1.32:6789,172.16.1.33:6789:/ /mnt -o \
name=admin,secret=AQAIKKNhJJWQARAA47VYIHu8MD/j8SQkbqq+Pw==
# df -hT
Filesystem Type Size Used Avail Use% Mounted on
......(output omitted)
172.16.1.31:6789,172.16.1.32:6789,172.16.1.33:6789:/ ceph 36G 0 36G 0% /mnt
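Passing the secret on the command line leaves it visible in the shell history and in the mount options; a common alternative, sketched here assuming the key shown above is stored in a file such as /etc/ceph/admin.secret (a path chosen for this example), is the secretfile option:
# echo "AQAIKKNhJJWQARAA47VYIHu8MD/j8SQkbqq+Pw==" > /etc/ceph/admin.secret
# chmod 600 /etc/ceph/admin.secret
# mount -t ceph 172.16.1.31:6789,172.16.1.32:6789,172.16.1.33:6789:/ /mnt -o \
name=admin,secretfile=/etc/ceph/admin.secret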
(4) Check Ceph status
# ceph -s
  cluster:
    id:     0469a267-e475-4934-b0e3-de3cc725707e
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum ceph-node01,ceph-node02,ceph-node03 (age 19m)
    mgr: ceph-node01(active, since 15m), standbys: ceph-node02, ceph-node03
    mds: cephfs-pool:1 {0=ceph-node02=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 16m), 6 in (since 16m)
  data:
    pools:   3 pools, 192 pgs
    objects: 46 objects, 14 MiB
    usage:   6.1 GiB used, 114 GiB / 120 GiB avail
    pgs:     192 active+clean
3 Redeploying the Dashboard
Perform the following on the ceph-deploy node.
(1) Install on every MGR node
Run on the ceph-node01, ceph-node02, and ceph-node03 nodes.
# yum install ceph-mgr-dashboard -y
(2) Enable the dashboard MGR module
# ceph mgr module enable dashboard
# ceph mgr module ls | head -n 20
......(output omitted)
    "enabled_modules": [
        "dashboard",
        "iostat",
        "restful"
    ],
......(output omitted)
(3) Modify the default configuration
# ceph config set mgr mgr/dashboard/server_addr 0.0.0.0
# ceph config set mgr mgr/dashboard/server_port 7000 # Each MGR node will listen on TCP port 7000.
# ceph config set mgr mgr/dashboard/ssl false # Internal network; a self-signed SSL certificate is not needed for web access to the Dashboard.
Note: if the configuration is changed later, restart the module for the change to take effect:
# ceph mgr module disable dashboard
# ceph mgr module enable dashboard
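Optionally, confirm the Dashboard settings and check that the MGR is listening on the new port (run the ss command on an MGR node such as ceph-node01):
# ceph config dump | grep mgr/dashboard    # shows server_addr, server_port and ssl
# ss -lntp | grep 7000                     # ceph-mgr should be listening on port 7000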
(4) Create a Dashboard login user and password
Format: dashboard ac-user-create <username> {<rolename>} {<name>} (the password is supplied with -i <password_file>)
# echo "123456" > password.txt
# ceph dashboard ac-user-create admin administrator -i password.txt
{"username": "admin", "lastUpdate": 1638084810, "name": null, "roles": ["administrator"], "password": "$2b$12$amtj6kibrkqqixSFrriKNOxHXkazdN8MgjirC//Koc.z6x42tGenm", "email": null}
(5) Check how the service is accessed
# ceph mgr services
{
"dashboard": "http://ceph-node01:7000/"
}
UI login: http://172.16.1.31:7000/
4 Monitoring Ceph with Prometheus and Grafana
Perform the following on the 172.16.1.34 node.
(1) Overview
Prometheus is an open-source monitoring and alerting system, widely used for container and infrastructure monitoring; its website is https://prometheus.io.
Grafana is an open-source metrics analytics and visualization system; its website is https://grafana.com/grafana.
(2) Deploy Docker
1) Install dependency packages
# yum install -y yum-utils device-mapper-persistent-data lvm2
2) Switch to the Aliyun mirror repository
# wget -O /etc/yum.repos.d/docker-ce.repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
3) Clean the yum cache
# yum clean all
4) Install Docker CE
# yum install -y docker-ce
5) Start the Docker service and enable it at boot
# systemctl start docker
# systemctl enable docker
6) Check the Docker version
# docker -v
Docker version 19.03.12, build 48a66213fe
7) Add the Aliyun registry mirror
# mkdir -p /etc/docker
# tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://b1cx9cn7.mirror.aliyuncs.com"]
}
EOF
8) Restart Docker
# systemctl daemon-reload
# systemctl restart docker
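To confirm that the registry mirror is active (optional check):
# docker info | grep -A 1 "Registry Mirrors"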
(3) Deploy Prometheus with Docker
Official documentation: https://prometheus.io/docs/prometheus/latest/installation/
1) Copy the prometheus.yml configuration file from the binary package into the /opt/prometheus/config directory
# mkdir -p /opt/prometheus/config
# cp -a /usr/local/prometheus/prometheus.yml /opt/prometheus/config/
2) Create the Prometheus TSDB storage directory and grant ownership to the container user
# mkdir -p /opt/prometheus/data && chown -R 65534.65534 /opt/prometheus/data # the official prom/prometheus image runs as UID/GID 65534 (nobody)
3) Start
# docker run -d \
-p 9090:9090 \
--restart=always \
--name=prometheus \
-v /opt/prometheus/config:/etc/prometheus \
-v /opt/prometheus/data:/prometheus \
prom/prometheus
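A quick health check of the container before moving on (optional; uses the host IP 172.16.1.34 from this guide):
# docker ps | grep prometheus
# curl -s http://172.16.1.34:9090/-/healthy    # should report that Prometheus is healthy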
4) Access the UI
http://172.16.1.34:9090/
(4) Deploy Grafana with Docker
Official documentation: https://grafana.com/docs/grafana/latest/installation/docker/
1) Create the local Grafana storage directory and set its ownership
# mkdir -p /opt/grafana/
# chown -R 472.472 /opt/grafana/
2) Start
# docker run -d \
--name grafana \
-p 3000:3000 \
-v /opt/grafana:/var/lib/grafana \
--restart=always \
grafana/grafana:7.5.9
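Similarly, Grafana exposes a health endpoint that can be used to check the container (optional):
# docker ps | grep grafana
# curl -s http://172.16.1.34:3000/api/health    # returns a small JSON document with "database": "ok"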
3) Access the UI
http://172.16.1.34:3000/
Note: on first login the default username and password are both admin; you are then asked to change the password, which is set to 123456 here.
(5) Enable the MGR Prometheus module
Run on the ceph-deploy node.
# ceph mgr module enable prometheus # Each MGR node will listen on TCP port 9283.
# ceph mgr services # Check how the MGR services are accessed
{
"dashboard": "http://ceph-node01:7000/",
"prometheus": "http://ceph-node01:9283/"
}
# curl 172.16.1.31:9283/metrics # Test the Prometheus metrics endpoint.
(6) Configure Prometheus scraping
# vim /opt/prometheus/config/prometheus.yml
......(output omitted)
  - job_name: 'ceph'
    static_configs:
      - targets: ['172.16.1.31:9283']
# docker restart prometheus
Prometheus-Targets: http://172.16.1.34:9090/targets
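Only the active MGR serves the full metric set, so scraping only 172.16.1.31 can stop working if the active MGR fails over to another node. A common refinement, shown here as a sketch, is to list all three MGR nodes as targets so Prometheus keeps scraping whichever one is currently active:
# vim /opt/prometheus/config/prometheus.yml
  - job_name: 'ceph'
    static_configs:
      - targets: ['172.16.1.31:9283','172.16.1.32:9283','172.16.1.33:9283']
# docker restart prometheus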
(7) Access the Grafana dashboards
1) Access Grafana
http://172.16.1.34:3000/
Username: admin
Password: 123456
2) Add Prometheus as a data source
Configuration -> Data sources -> Add data source -> Prometheus -> enter the URL http://172.16.1.34:9090 -> Save & Test
3) Import the Ceph monitoring dashboards
Dashboards -> Manage -> Import -> enter the dashboard ID -> Load
# Dashboard IDs
Ceph-Cluster ID: 2842
Ceph-OSD ID: 5336
Ceph-Pool ID: 5342