Velero: A Backup and Migration Assistant for K8S Clusters


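1. What Velero Helps Us With

  • Most people back up and restore a k8s cluster through etcd. But sometimes we accidentally delete a single namespace. If that namespace held hundreds of services and they all vanished at once, what then?

  • Velero helps us in two scenarios:

    • Disaster recovery: it can back up and restore k8s cluster resources.
    • Migration: it can copy cluster resources to another cluster (for example, replicating cluster configuration across development, testing, and production to simplify environment setup).
  • The rest of this article shows how to use Velero for backup and migration.

Velero: https://github.com/vmware-tanzu/velero
ACK plugin: https://github.com/AliyunContainerService/velero-plugin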

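2. Download the Velero Client

  • Velero consists of a client and a server. The server is deployed in the target k8s cluster, while the client is a command-line tool that runs locally.
    • Go to Velero's Releases page on GitHub and download the client
    • Extract the release archive
    • Move the velero binary from the release archive to a directory on your $PATH
    • Run velero -h to verify the installation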
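
The four steps above can be sketched as a small shell script. This is a sketch under assumptions, not an official installer: the version (v1.4.0, the one used later in this article), the linux/amd64 platform, the /tmp working directory, and the /usr/local/bin install directory are all illustrative choices, and the function names are hypothetical.

```shell
#!/bin/sh
# Build the download URL of a Velero release archive on GitHub.
# Arguments: version (e.g. v1.4.0), OS (e.g. linux), architecture (e.g. amd64).
velero_release_url() {
    version="$1"; os="$2"; arch="$3"
    printf 'https://github.com/vmware-tanzu/velero/releases/download/%s/velero-%s-%s-%s.tar.gz\n' \
        "$version" "$version" "$os" "$arch"
}

# Download, extract, and install the client, then run `velero -h` as a check.
# Assumed defaults: v1.4.0 and /usr/local/bin. Call this function manually.
install_velero_client() {
    version="${1:-v1.4.0}"
    install_dir="${2:-/usr/local/bin}"
    url="$(velero_release_url "$version" linux amd64)"
    curl -fsSL -o "/tmp/velero-$version.tar.gz" "$url" &&
    tar -xzf "/tmp/velero-$version.tar.gz" -C /tmp &&
    sudo mv "/tmp/velero-$version-linux-amd64/velero" "$install_dir/velero" &&
    velero -h >/dev/null
}
```

Calling `install_velero_client v1.4.0` performs the download and install; `velero_release_url` on its own only prints the URL, which is handy for checking the link before downloading.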

3. Deploy the velero-plugin Plugin

  • Clone the code
$ git clone https://github.com/AliyunContainerService/velero-plugin
  • Edit the configuration
# Edit `install/credentials-velero` and fill in the `AccessKeyID` and `AccessKeySecret` of the newly created user; the OSS endpoint is the access domain of your OSS bucket

ALIBABA_CLOUD_ACCESS_KEY_ID=<ALIBABA_CLOUD_ACCESS_KEY_ID>
ALIBABA_CLOUD_ACCESS_KEY_SECRET=<ALIBABA_CLOUD_ACCESS_KEY_SECRET>
ALIBABA_CLOUD_OSS_ENDPOINT=<ALIBABA_CLOUD_OSS_ENDPOINT>
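
If you prefer not to edit the credentials file by hand, it can be generated from environment variables. The following is only a sketch: the function name `write_velero_credentials` is hypothetical, and it assumes the three values are already exported in your shell under the same names the file uses.

```shell
#!/bin/sh
# Write the three ALIBABA_CLOUD_* settings to a credentials file.
# Argument: output path (defaults to install/credentials-velero).
# Returns 1 without writing anything if any of the three variables is empty.
write_velero_credentials() {
    out="${1:-install/credentials-velero}"
    if [ -z "${ALIBABA_CLOUD_ACCESS_KEY_ID:-}" ] ||
       [ -z "${ALIBABA_CLOUD_ACCESS_KEY_SECRET:-}" ] ||
       [ -z "${ALIBABA_CLOUD_OSS_ENDPOINT:-}" ]; then
        echo "all three ALIBABA_CLOUD_* variables must be set" >&2
        return 1
    fi
    (
        # Restrictive umask so a newly created file is readable only by its owner.
        umask 077
        cat > "$out" <<EOF
ALIBABA_CLOUD_ACCESS_KEY_ID=$ALIBABA_CLOUD_ACCESS_KEY_ID
ALIBABA_CLOUD_ACCESS_KEY_SECRET=$ALIBABA_CLOUD_ACCESS_KEY_SECRET
ALIBABA_CLOUD_OSS_ENDPOINT=$ALIBABA_CLOUD_OSS_ENDPOINT
EOF
    )
}
```

Run it from the root of the cloned velero-plugin repository so the default path resolves, and keep the resulting file out of version control since it contains the access key secret.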
# Edit `install/01-velero.yaml` and fill in the OSS configuration:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: velero
  name: velero

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    component: velero
  name: velero
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: velero
  namespace: velero

---
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  labels:
    component: velero
  name: default
  namespace: velero
spec:
  config:
    region: cn-beijing
  objectStorage:
    bucket: k8s-backup-test
    prefix: test
  provider: alibabacloud

---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  labels:
    component: velero
  name: default
  namespace: velero
spec:
  config:
    region: cn-beijing
  provider: alibabacloud

---
# extensions/v1beta1 Deployments are no longer served as of Kubernetes 1.16; use apps/v1
apiVersion: apps/v1
kind: Deployment
metadata:
  name: velero
  namespace: velero
spec:
  replicas: 1
  selector:
    matchLabels:
      deploy: velero
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8085"
        prometheus.io/scrape: "true"
      labels:
        component: velero
        deploy: velero
    spec:
      serviceAccountName: velero
      containers:
      - name: velero
        # sync from velero/velero:v1.2.0
        image: registry.cn-hangzhou.aliyuncs.com/acs/velero:v1.2.0
        imagePullPolicy: IfNotPresent
        command:
          - /velero
        args:
          - server
          - --default-volume-snapshot-locations=alibabacloud:default
        env:
          - name: VELERO_SCRATCH_DIR
            value: /scratch
          - name: ALIBABA_CLOUD_CREDENTIALS_FILE
            value: /credentials/cloud
        volumeMounts:
          - mountPath: /plugins
            name: plugins
          - mountPath: /scratch
            name: scratch
          - mountPath: /credentials
            name: cloud-credentials
      initContainers:
      - image: registry.cn-hangzhou.aliyuncs.com/acs/velero-plugin-alibabacloud:v1.2-991b590
        imagePullPolicy: IfNotPresent
        name: velero-plugin-alibabacloud
        volumeMounts:
        - mountPath: /target
          name: plugins
      volumes:
        - emptyDir: {}
          name: plugins
        - emptyDir: {}
          name: scratch
        - name: cloud-credentials
          secret:
            secretName: cloud-credentials
  • Deploy the Velero server to k8s
# Create the namespace
kubectl create namespace velero
# Create the secret from credentials-velero
kubectl create secret generic cloud-credentials --namespace velero --from-file cloud=install/credentials-velero
# Deploy the CRDs
kubectl apply -f install/00-crds.yaml
# Deploy Velero
kubectl apply -f install/01-velero.yaml
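
After applying the manifests it can help to confirm that the Velero pod actually came up before attempting a backup. The helper below is a sketch, not part of Velero or kubectl: `pods_all_ready` is a hypothetical name, and it assumes the default table layout of `kubectl get pods` (a header row, then NAME, READY, STATUS, ... columns).

```shell
#!/bin/sh
# Read `kubectl get pods` table output on stdin.
# Succeeds only if at least one pod is listed and every pod is
# fully ready (READY shows n/n) with STATUS "Running".
pods_all_ready() {
    awk 'NR > 1 {
             seen = 1
             split($2, ready, "/")
             if (ready[1] != ready[2] || $3 != "Running") bad = 1
         }
         END { exit ((seen && !bad) ? 0 : 1) }'
}
```

A possible use is `kubectl get pods -n velero | pods_all_ready && echo "Velero is up"`. The same check works for any namespace, for example to wait until restored pods have left `ContainerCreating`.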

4. Backup Test

  • Here we use Velero to back up the resources in a cluster so that they can be restored quickly after a failure or an operator mistake. First, deploy the following YAML:
---
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
  labels:
    app: nginx

---
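apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 2
  # apps/v1 requires an explicit selector that matches the pod template labels
  selector:
    matchLabels:
      app: nginx
  template: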
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.7.9
        name: nginx
        ports:
        - containerPort: 80

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx
  • You can back up the whole cluster or only the namespaces you need. Here we back up a single namespace, nginx-example:
[rsync@velero-plugin]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          6m31s
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          6m32s

[rsync@velero]$ cd velero-v1.4.0-linux-amd64/
[rsync@velero-v1.4.0-linux-amd64]$ ll
total 56472
drwxrwxr-x 4 rsync rsync     4096 Jun  1 15:02 examples
-rw-r--r-- 1 rsync rsync    10255 Dec 10 01:08 LICENSE
-rwxr-xr-x 1 rsync rsync 57810814 May 27 04:33 velero
[rsync@velero-v1.4.0-linux-amd64]$ ./velero backup create nginx-backup --include-namespaces nginx-example --wait
Backup request "nginx-backup" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
.
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe nginx-backup` and `velero backup logs nginx-backup`.

  • Delete the namespace
[rsync@velero-v1.4.0-linux-amd64]$ kubectl delete namespaces nginx-example
namespace "nginx-example" deleted

[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example  
No resources found.
  • Restore
[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup --wait
Restore request "nginx-backup-20200603180922" submitted successfully.
Waiting for restore to complete. You may safely press ctrl-c to stop waiting - your restore will continue in the background.

Restore completed with status: Completed. You may check for more information using the commands `velero restore describe nginx-backup-20200603180922` and `velero restore logs nginx-backup-20200603180922`.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running             0          5s
nginx-deployment-5c689d88bb-rt2zk   0/1     ContainerCreating   0          5s

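# As shown, the namespace has been restored
  • Migration works the same way as backup and restore. Now a special case: deploy one more project after the backup, then check whether a restore deletes the newly deployed project.
Created a new tomcat container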
[rsync@tomcat-test]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          65m
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          65m
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running   0          7s
  • Run a restore
[rsync@velero-v1.4.0-linux-amd64]$ ./velero  restore create --from-backup nginx-backup        
Restore request "nginx-backup-20200603191726" submitted successfully.
Run `velero restore describe nginx-backup-20200603191726` or `velero restore logs nginx-backup-20200603191726` for more details.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example  
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running   0          68m
nginx-deployment-5c689d88bb-rt2zk   1/1     Running   0          68m
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running   0          2m33s

# As shown, nothing was overwritten
  • Delete the nginx deployment, then restore again
[rsync@velero-v1.4.0-linux-amd64]$ kubectl delete deployment nginx-deployment -n nginx-example
deployment.extensions "nginx-deployment" deleted

[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                              READY   STATUS    RESTARTS   AGE
tomcat-test-sy-677ff78f6b-rc5vq   1/1     Running   0          4m18s

[rsync@velero-v1.4.0-linux-amd64]$ ./velero  restore create --from-backup nginx-backup 
Restore request "nginx-backup-20200603191949" submitted successfully.
Run `velero restore describe nginx-backup-20200603191949` or `velero restore logs nginx-backup-20200603191949` for more details.

[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME                                READY   STATUS              RESTARTS   AGE
nginx-deployment-5c689d88bb-f8vsx   1/1     Running             0          2s
nginx-deployment-5c689d88bb-rt2zk   0/1     ContainerCreating   0          2s
tomcat-test-sy-677ff78f6b-rc5vq     1/1     Running             0          4m49s

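# As shown, our tomcat project is unaffected.
  • Conclusion: a Velero restore does not overwrite. It only recreates resources that do not currently exist in the cluster; resources that already exist are not rolled back to the backed-up version. To roll one back, delete the existing resource before running the restore.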
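
The delete-then-restore rollback described in the conclusion can be sketched as a small wrapper around the two commands already used in this section. The `run` and `rollback_deployment` names and the `DRY_RUN` variable are hypothetical conveniences introduced here, not Velero features; with `DRY_RUN=1` the commands are only printed, which allows a review before anything is deleted.

```shell
#!/bin/sh
# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Roll one Deployment back to the state captured in a backup:
# delete the live object first, then restore from the backup.
# Arguments: deployment name, namespace, backup name.
rollback_deployment() {
    name="$1"; namespace="$2"; backup="$3"
    run kubectl delete deployment "$name" -n "$namespace" &&
    run velero restore create --from-backup "$backup" --wait
}
```

For example, `DRY_RUN=1 rollback_deployment nginx-deployment nginx-example nginx-backup` prints the two commands without touching the cluster. Because the restore skips resources that still exist, only the deleted Deployment is recreated; other workloads in the namespace, such as the tomcat project above, are left alone.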

5. Advanced Usage

  • You can set up periodic scheduled backups
# Back up daily at 01:00
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *"
# Back up daily at 01:00 and keep each backup for 48 hours
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *" --ttl 48h
# Back up every 6 hours
velero create schedule <SCHEDULE NAME> --schedule="@every 6h"
# Back up the web namespace once a day
velero create schedule <SCHEDULE NAME> --schedule="@every 24h" --include-namespaces web
Backups created by a schedule are named `<SCHEDULE NAME>-<TIMESTAMP>`, and the restore command is `velero restore create --from-backup <SCHEDULE NAME>-<TIMESTAMP>`.
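
Because every scheduled backup carries a timestamp suffix, restoring "the latest one" means finding the newest name first. The sketch below picks it from a list of backup names. It assumes the timestamp is a 14-digit `YYYYMMDDhhmmss` string, as in the restore names shown earlier (for example `nginx-backup-20200603180922`); the function name `latest_backup_for_schedule` is hypothetical.

```shell
#!/bin/sh
# Read backup names on stdin (one per line) and print the newest backup
# that belongs to the given schedule, i.e. "<schedule>-<14 digits>".
# Prints nothing if the schedule has no matching backup.
latest_backup_for_schedule() {
    schedule="$1"
    awk -v prefix="$schedule-" '
        index($0, prefix) == 1 {
            ts = substr($0, length(prefix) + 1)
            if (length(ts) == 14 && ts ~ /^[0-9]+$/) print
        }' | sort | tail -n 1
}
```

One way to feed it is from the first column of `velero backup get`, for example `velero backup get | awk 'NR > 1 {print $1}' | latest_backup_for_schedule daily`, where `daily` is a placeholder schedule name. Velero also offers `velero restore create --from-schedule <SCHEDULE NAME>`, which restores from the schedule's most recent backup directly.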
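  • To back up and restore persistent volumes, create the backup as follows:
velero backup create nginx-backup-volume --snapshot-volumes --include-namespaces nginx-example
  • This backup creates snapshots of the cloud disks in the cluster's region (NAS and OSS storage are not yet supported). A disk can only be restored from a snapshot within the same region.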

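  • The restore command is as follows:
velero restore create --from-backup nginx-backup-volume --restore-volumes

  • To delete a backup:

Method 1: delete it directly with a command
velero delete backups default-backup

Method 2: let the backup expire automatically by adding a TTL when creating it
velero backup create <BACKUP-NAME> --ttl <DURATION>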
  • You can also label resources; labeled resources are excluded from backups.
# Add the label
kubectl label -n <ITEM_NAMESPACE> <RESOURCE>/<NAME> velero.io/exclude-from-backup=true
# Label the default namespace
kubectl label -n default namespace/default velero.io/exclude-from-backup=true

