Installing Rook Ceph 1.5 on Kubernetes


1. Rook Overview


Rook is a self-managing distributed storage orchestration system that provides a convenient storage solution for Kubernetes. Rook does not provide storage itself; instead, it acts as an adaptation layer between Kubernetes and the storage system, simplifying deployment and maintenance of the storage backend. Rook currently supports the following storage systems: Ceph, CockroachDB, Cassandra, EdgeFS, Minio, and NFS. Of these, Ceph is Stable and the rest are Alpha. This article covers Ceph only.

Rook consists of two parts, the Operator and the Cluster:

  • Operator: made up of several CRDs and an all-in-one image; it contains all of the functionality for starting and monitoring the storage system.
  • Cluster: responsible for creating the CRD object and specifying the relevant parameters, including the Ceph image, metadata persistence location, disk locations, dashboard, and so on.

The diagram below shows Rook's architecture. After the Operator starts, it first creates the Agent and Discover containers, which watch and manage the storage resources on each node. It then creates the Cluster, a CRD defined when the Operator was created, and starts the relevant Ceph containers based on the Cluster's configuration. Once the storage cluster is up, Kubernetes primitives are used to create PVCs for application containers to consume.

[Figure: Rook architecture diagram]

1.1 System Requirements

Environment used for this installation:

  • kubernetes 1.16
  • centos7.7
  • kernel 5.4.65-200.el7.x86_64
  • flannel v0.12.0
The lvm2 package must be installed:
 yum install -y lvm2

1.2 Kernel Requirements

RBD

Distribution kernels usually ship with the RBD module compiled in, but it is best to confirm:

foxchan@~$ lsmod|grep rbd
rbd                   114688  0 
libceph               368640  1 rbd

You can use the following command to have the module loaded at boot:

cat > /etc/sysconfig/modules/rbd.modules << EOF
modprobe rbd
EOF
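
Files under /etc/sysconfig/modules/ are only executed at boot if they are executable, so remember to mark the script as such. On a systemd-based system such as CentOS 7 you can alternatively rely on systemd-modules-load (a suggested equivalent, not part of the original steps):

chmod +x /etc/sysconfig/modules/rbd.modules
# Alternative: let systemd-modules-load load the module at every boot
echo rbd > /etc/modules-load.d/rbd.conf
# Load the module immediately without rebooting
modprobe rbd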

CephFS

If you want to use CephFS, the minimum required kernel version is 4.17.
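
You can verify the running kernel with uname -r; the kernel listed in the environment above is well past that requirement:

uname -r
# 5.4.65-200.el7.x86_64   (>= 4.17, so CephFS is usable)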

2. Rook Deployment

2.1 Environment

[root@k8s-master ceph]# kubectl get node
NAME         STATUS     ROLES    AGE   VERSION
k8s-master   NotReady   master   92m   v1.16.2
k8s-node1    Ready      <none>   92m   v1.16.2
k8s-node2    Ready      <none>   90m   v1.16.2
[root@k8s-node1 ~]# lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   20G  0 disk
├─sda1            8:1    0  200M  0 part /boot
└─sda2            8:2    0 19.8G  0 part
  ├─centos-root 253:0    0 15.8G  0 lvm  /
  └─centos-swap 253:1    0    4G  0 lvm
sdb               8:16   0   20G  0 disk
sr0              11:0    1 10.3G  0 rom

On each node, sdb is used as the Ceph data disk.

2.2 Clone the Project

git clone --single-branch --branch v1.5.1 https://github.com/rook/rook.git

2.3 Deploy the Rook Operator

Fetch the images

Because the upstream registries may be unreachable from mainland China, it is recommended to pull the following images in advance:

docker pull ceph/ceph:v15.2.5
docker pull rook/ceph:v1.5.1
docker pull registry.aliyuncs.com/it00021hot/cephcsi:v3.1.2
docker pull registry.aliyuncs.com/it00021hot/csi-node-driver-registrar:v2.0.1
docker pull registry.aliyuncs.com/it00021hot/csi-attacher:v3.0.0
docker pull registry.aliyuncs.com/it00021hot/csi-provisioner:v2.0.0
docker pull registry.aliyuncs.com/it00021hot/csi-snapshotter:v3.0.0
docker pull registry.aliyuncs.com/it00021hot/csi-resizer:v1.0.0

docker tag registry.aliyuncs.com/it00021hot/csi-snapshotter:v3.0.0 k8s.gcr.io/sig-storage/csi-snapshotter:v3.0.0
docker tag registry.aliyuncs.com/it00021hot/csi-resizer:v1.0.0 k8s.gcr.io/sig-storage/csi-resizer:v1.0.0
docker tag registry.aliyuncs.com/it00021hot/cephcsi:v3.1.2 quay.io/cephcsi/cephcsi:v3.1.2
docker tag registry.aliyuncs.com/it00021hot/csi-node-driver-registrar:v2.0.1 k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.0.1
docker tag registry.aliyuncs.com/it00021hot/csi-attacher:v3.0.0 k8s.gcr.io/sig-storage/csi-attacher:v3.0.0
docker tag registry.aliyuncs.com/it00021hot/csi-provisioner:v2.0.0 k8s.gcr.io/sig-storage/csi-provisioner:v2.0.0


You can retag these images and push them to a private registry, or retag them to match the names used in the YAML files; using a local private registry is recommended.

Modify the image names in operator.yaml to point to the private registry:

  ROOK_CSI_CEPH_IMAGE: "10.2.55.8:5000/kubernetes/cephcsi:v3.1.2"
  ROOK_CSI_REGISTRAR_IMAGE: "10.2.55.8:5000/kubernetes/csi-node-driver-registrar:v2.0.1"
  ROOK_CSI_RESIZER_IMAGE: "10.2.55.8:5000/kubernetes/csi-resizer:v1.0.0"
  ROOK_CSI_PROVISIONER_IMAGE: "10.2.55.8:5000/kubernetes/csi-provisioner:v2.0.0"
  ROOK_CSI_SNAPSHOTTER_IMAGE: "10.2.55.8:5000/kubernetes/csi-snapshotter:v3.0.0"
  ROOK_CSI_ATTACHER_IMAGE: "10.2.55.8:5000/kubernetes/csi-attacher:v3.0.0"

ROOK_CSI_KUBELET_DIR_PATH: "/data/k8s/kubelet" ### change this if you have customized the kubelet data directory

Apply operator.yaml:

cd rook/cluster/examples/kubernetes/ceph
kubectl create -f crds.yaml -f common.yaml 
kubectl create -f operator.yaml
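
Before moving on, it is worth confirming that the operator started cleanly. A quick check (pod names and counts will vary by environment):

kubectl -n rook-ceph get pods
# the rook-ceph-operator pod (and any rook-discover pods) should reach Running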

2.4 Configure the Cluster

The contents of cluster.yaml must be modified to match your own hardware. Read the comments in the configuration file carefully to avoid the pitfalls I ran into.

Apart from adding or removing OSD devices, any other change to this file only takes effect after reinstalling the Ceph cluster, so plan the cluster ahead of time. If you apply changes without uninstalling Ceph first, a cluster reinstall is triggered and the cluster can crash.

Modify the file as follows:

vi cluster.yaml

  • The changed settings are shown below; see the official documentation for more options.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
# name of the cluster; only one cluster is supported per namespace
  name: rook-ceph
  namespace: rook-ceph
spec:
# Ceph version
# v13 is mimic, v14 is nautilus, and v15 is octopus.
  cephVersion:
# use a private-registry Ceph image to speed up deployment
    image: 10.2.55.8:5000/kubernetes/ceph:v15.2.5
# whether unsupported Ceph versions are allowed
    allowUnsupported: false
# host path where Rook persists its data on each node
  dataDirHostPath: /data/k8s/rook
# whether to continue an upgrade even if the pre-upgrade checks fail
  skipUpgradeChecks: false
# since 1.5, the number of mons must be odd
  mon:
    count: 3
# whether multiple mon pods may run on a single node
    allowMultiplePerNode: false
  mgr:
    modules:
    - name: pg_autoscaler
      enabled: true
# enable the dashboard, disable SSL, and use port 7000; you can keep the default HTTPS
# configuration, I use plain HTTP to keep my ingress setup simple
  dashboard:
    enabled: true
    port: 7000
    ssl: false
# enable PrometheusRule
  monitoring:
    enabled: true
# namespace to deploy the PrometheusRule into; defaults to the namespace of this CR
    rulesNamespace: rook-ceph
# use host networking, which works around a bug where CephFS PVCs cannot be used
  network:
    provider: host
# enable the crash collector; a crash collector pod is created on every node running a Ceph daemon
  crashCollector:
    disable: false
  placement:
    osd:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: ceph-osd
              operator: In
              values:
              - enabled
# storage settings; the defaults are all true, meaning every device on every node would be wiped and initialized
  storage: # cluster level storage configuration and selection
    useAllNodes: false     # do not use all nodes
    useAllDevices: false   # do not use all devices
    nodes:
    - name: "k8s-node1"    # storage node hostname
      devices:
      - name: "sdb"        # use /dev/sdb
    - name: "k8s-node2"
      devices:
      - name: "sdb"

For more cluster CRD configuration options, see the official Rook documentation.

Add a label to the OSD nodes:

[root@k8s-master ceph]# kubectl label nodes k8s-node1 ceph-osd=enabled
node/k8s-node1 labeled
[root@k8s-master ceph]# kubectl label nodes k8s-node2 ceph-osd=enabled
node/k8s-node2 labeled

Run the installation:

kubectl apply -f cluster.yaml
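
The cluster takes several minutes to come up while the mon, mgr, and OSD pods are created. A simple way to watch progress (a sketch; the exact pod list depends on your configuration):

kubectl -n rook-ceph get pods -w        # wait for mon/mgr/osd/csi pods to reach Running
kubectl -n rook-ceph get cephcluster    # check the CephCluster resource status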

2.5 Adding and Removing OSDs

2.5.1 Add the relevant label

kubectl label nodes k8s-master ceph-osd=enabled

2.5.2 Modify cluster.yaml

  storage: # cluster level storage configuration and selection
    useAllNodes: false     # do not use all nodes
    useAllDevices: false   # do not use all devices
    nodes:
    - name: "k8s-node1"    # storage node hostname
      devices:
      - name: "sdb"        # use /dev/sdb
    - name: "k8s-node2"
      devices:
      - name: "sdb"
    - name: "k8s-master"
      devices:
      - name: "sdb"

2.5.3 apply cluster.yaml

kubectl apply -f cluster.yaml

Deleting and then reinstalling rook-ceph can produce an error about a missing keyring; you can run the following command to generate the secret.

kubectl -n rook-ceph create secret generic rook-ceph-crash-collector-keyring

2.6 Install the Toolbox

The Rook toolbox is a container with common tools for debugging and testing Rook. The toolbox is based on CentOS, so additional tools of your choice can easily be installed with yum.

kubectl apply -f toolbox.yaml

Testing Rook

Once the toolbox pod is running, we can use the following command to get a shell inside it:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

For example, we can now check the status of the cluster. The following conditions must all be met for the cluster to be considered healthy:

Check the cluster status: ceph status

    • All mons should have reached quorum
    • The mgr should be active
    • At least one OSD should be active
    • If the status is not HEALTH_OK, check the warnings or errors
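
Besides ceph status, a few other standard Ceph commands are useful from inside the toolbox (plain ceph/rados CLI commands, nothing Rook-specific):

ceph osd status    # per-OSD state, usage, and host placement
ceph osd tree      # CRUSH tree of hosts and OSDs
ceph df            # cluster-wide and per-pool capacity usage
rados df           # per-pool object and space statistics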

2.7 Access the Dashboard

[root@k8s-master ceph]# kubectl get svc -n rook-ceph
NAME                       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
csi-cephfsplugin-metrics   ClusterIP   10.103.103.152   <none>        8080/TCP,8081/TCP   9m39s
csi-rbdplugin-metrics      ClusterIP   10.109.21.95     <none>        8080/TCP,8081/TCP   9m41s
rook-ceph-mgr              ClusterIP   10.103.36.44     <none>        9283/TCP            8m50s
rook-ceph-mgr-dashboard    NodePort    10.104.55.171    <none>        7000:30112/TCP      8m50s
rook-ceph-mon-a            ClusterIP   10.103.40.41     <none>        6789/TCP,3300/TCP   9m36s
rook-ceph-mon-b            ClusterIP   10.96.138.43     <none>        6789/TCP,3300/TCP   9m14s
rook-ceph-mon-c            ClusterIP   10.108.169.68    <none>        6789/TCP,3300/TCP   9m3s

The dashboard login user is admin. The password can be obtained as follows:

[root@k8s-master ceph]# kubectl get secrets -n rook-ceph rook-ceph-dashboard-password -o jsonpath='{.data.password}' | base64 -d
bagfSJpb/3Nj0DN5I#7Z
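
Because the dashboard service is exposed as a NodePort (30112 in the listing above; the port will differ in your cluster), it can be reached at that port on any node's IP, logging in as admin with the password just retrieved. For example (the node IP is a placeholder, substitute one of your own nodes):

curl -I http://<node-ip>:30112    # or open the URL in a browser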

After logging in, you will see the Ceph dashboard.

3. Deploying Block Storage

3.1 Create a Pool and a StorageClass

ceph-storageclass.yaml

kubectl apply -f csi/rbd/storageclass.yaml
# Define a block storage pool
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  # Each data replica must be spread across a different failure domain; with host,
  # each replica is guaranteed to land on a different machine
  failureDomain: host
  # Number of replicas
  replicated:
    size: 3
    # Disallow setting pool with replica 1, this could lead to data loss without recovery.
    # Make sure you're *ABSOLUTELY CERTAIN* that is what you want
    requireSafeReplicaSize: true
    # gives a hint (%) to Ceph in terms of expected consumption of the total cluster capacity of a given pool
    # for more info: https://docs.ceph.com/docs/master/rados/operations/placement-groups/#specifying-expected-pool-size
    #targetSizeRatio: .5
---
# Define a StorageClass
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: rook-ceph-block
# The provisioner for this StorageClass; the rook-ceph prefix is the operator's namespace
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
    # clusterID is the namespace the Ceph cluster runs in
    # If you change this namespace, also change the namespace below where the secret namespaces are defined
    clusterID: rook-ceph

    # If you want to use erasure coded pool with RBD, you need to create
    # two pools. one erasure coded and one replicated.
    # You need to specify the replicated pool here in the `pool` parameter, it is
    # used for the metadata of the images.
    # The erasure coded pool must be set as the `dataPool` parameter below.
    #dataPool: ec-data-pool
    # Pool in which RBD images are created
    pool: replicapool

    # RBD image format. Defaults to "2".
    imageFormat: "2"

    # RBD image features; CSI RBD currently only supports layering
    imageFeatures: layering

    # Ceph admin credentials, generated automatically by the operator
    # in the same namespace as the cluster.
    csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
    csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
    csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
    csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
    csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
    csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
    # Filesystem type of the volume. Defaults to ext4. xfs is not recommended because of a potential
    # deadlock when a volume is mounted on the same node that hosts an OSD (hyperconverged setups)
    csi.storage.k8s.io/fstype: ext4
# uncomment the following to use rbd-nbd as mounter on supported nodes
# **IMPORTANT**: If you are using rbd-nbd as the mounter, during upgrade you will be hit a ceph-csi
# issue that causes the mount to be disconnected. You will need to follow special upgrade steps
# to restart your application pods. Therefore, this option is not recommended.
#mounter: rbd-nbd
allowVolumeExpansion: true
reclaimPolicy: Delete
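
After applying the manifest, confirm that the pool and the StorageClass exist; the pool can also be checked from the toolbox with the standard ceph CLI:

kubectl -n rook-ceph get cephblockpool replicapool
kubectl get storageclass rook-ceph-block
# from inside the toolbox:
ceph osd pool ls    # replicapool should appear in the list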

3.2 Demo Example

It is recommended to put the PVC and the application in the same YAML file.

vim ceph-demo.yaml

# Create a PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-demo-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: csirbd-demo-pod
  labels:
    test-cephrbd: "true"
spec:
  replicas: 1
  selector:
    matchLabels:
      test-cephrbd: "true"
  template:
    metadata:
      labels:
        test-cephrbd: "true"
    spec:
      containers:
       - name: web-server-rbd
         image: 10.2.55.8:5000/library/nginx:1.18.0
         volumeMounts:
           - name: mypvc
             mountPath: /usr/share/nginx/html
      volumes:
       - name: mypvc
         persistentVolumeClaim:
           claimName: rbd-demo-pvc
           readOnly: false
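
Apply the manifest and check that the PVC binds and the pod mounts the RBD volume (a quick smoke test; the names match the YAML above):

kubectl apply -f ceph-demo.yaml
kubectl get pvc rbd-demo-pvc              # STATUS should become Bound
kubectl get pods -l test-cephrbd=true     # the nginx pod should reach Running
kubectl exec -it $(kubectl get pod -l test-cephrbd=true -o jsonpath='{.items[0].metadata.name}') -- df -h /usr/share/nginx/html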

4. Deploying a File System

4.1 Create a CephFS

The CephFS CSI driver uses quotas to enforce the size declared in a PVC; only kernels 4.17+ support CephFS quotas.

If your kernel does not support them and you need quota management, set the Operator environment variable CSI_FORCE_CEPHFS_KERNEL_CLIENT: false to enable the FUSE client.

When the FUSE client is used, upgrading the Ceph cluster disconnects the mounts of application pods, which must be restarted before their PVs can be used again.

cd rook/cluster/examples/kubernetes/ceph
kubectl apply -f filesystem.yaml
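
To provision PVCs on top of this filesystem you also need a CephFS StorageClass. The Rook repository ships an example at csi/cephfs/storageclass.yaml; the sketch below is based on it, assumes the default filesystem name myfs created by filesystem.yaml, and names the StorageClass csi-cephfs to match the name deleted in section 5 (the shipped example calls it rook-cephfs, so adjust as needed):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs
# <operator-namespace>.cephfs.csi.ceph.com
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  # namespace the Ceph cluster runs in
  clusterID: rook-ceph
  # CephFS filesystem name and data pool from filesystem.yaml
  fsName: myfs
  pool: myfs-data0
  # CSI secrets generated by the operator in the cluster namespace
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
allowVolumeExpansion: true

Save this (or edit the shipped example) and apply it with kubectl apply -f.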


5. Deleting the Ceph Cluster

Before deleting the Ceph cluster, first clean up the application pods that use its storage.

Delete the block storage and file storage:

kubectl delete -n rook-ceph cephblockpool replicapool
kubectl delete storageclass rook-ceph-block
kubectl delete -f filesystem.yaml    # run from the same ceph directory used in section 4
kubectl delete storageclass csi-cephfs
kubectl -n rook-ceph delete cephcluster rook-ceph

Delete the operator and the related CRDs:

kubectl delete -f cluster.yaml
kubectl delete -f operator.yaml
kubectl delete -f common.yaml
kubectl delete -f crds.yaml

Clean up the data on the hosts

After the Ceph cluster is deleted, its configuration data is left behind under the dataDirHostPath directory (/data/k8s/rook/ in this setup) on every node where Ceph components were deployed.

rm -rf  /data/k8s/rook/*

If you later deploy a new Ceph cluster, delete this leftover data from the previous cluster first; otherwise the monitors will fail to start.

# cat clean-rook-dir.sh
hosts=(
  192.168.130.130
  192.168.130.131
  192.168.130.132
)
for host in ${hosts[@]} ; do
  ssh $host "rm -rf /data/k8s/rook/*"
done

Wipe the devices

yum install gdisk -y
export DISK="/dev/sdb"
sgdisk --zap-all $DISK
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
blkdiscard $DISK
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
rm -rf /dev/ceph-*
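
If there are several OSD nodes, the same ssh loop pattern used above for the data directory can wipe the data disk on each host. A rough sketch (the host IPs are the hypothetical ones from the script above; double-check DISK first, this is destructive):

for host in 192.168.130.131 192.168.130.132; do
  ssh $host 'DISK=/dev/sdb; sgdisk --zap-all $DISK && dd if=/dev/zero of=$DISK bs=1M count=100 oflag=direct,dsync'
done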

If deleting the Ceph cluster gets stuck for some reason, run the following command first; the deletion will then complete without hanging:

kubectl -n rook-ceph patch cephclusters.ceph.rook.io rook-ceph -p '{"metadata":{"finalizers": []}}' --type=merge

 

Reposted from:
https://my.oschina.net/u/4346166/blog/4752651
https://blog.csdn.net/qq_40592377/article/details/110292089

 

