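1. What Velero does for us
- Most people back up and restore a k8s cluster through etcd. But suppose we accidentally delete a namespace that holds hundreds of services, and they are all gone in an instant. What then?
- Velero helps in two scenarios:
  - Disaster recovery: backing up and restoring a k8s cluster
  - Migration: copying cluster resources to another cluster (for example, keeping the cluster configuration of development, test, and production environments in sync, which simplifies environment setup)
- The rest of this post shows how to use Velero for backup and migration.
Velero repository: https://github.com/vmware-tanzu/velero
ACK plugin repository: https://github.com/AliyunContainerService/velero-plugin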
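2. Download the Velero client
- Velero consists of a client and a server. The server is deployed in the target k8s cluster; the client is a command-line tool that runs locally.
- Download the client from the Velero Releases page on GitHub
- Unpack the release archive
- Move the `velero` binary from the archive into a directory on your $PATH
- Run `velero -h` to check that it works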
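The download steps above could be scripted roughly as follows. This is only a sketch: the function names `velero_release_url` and `install_velero_client` are hypothetical, and the tarball naming pattern is assumed from the `velero-v1.4.0-linux-amd64` directory used later in this post. It also assumes `curl` and `tar` are installed.

```shell
# Hypothetical helper: build the release tarball URL for a version/OS/arch.
# The naming pattern is an assumption based on velero-v1.4.0-linux-amd64.
velero_release_url() {
  local version="$1" os="$2" arch="$3"
  printf 'https://github.com/vmware-tanzu/velero/releases/download/%s/velero-%s-%s-%s.tar.gz\n' \
    "$version" "$version" "$os" "$arch"
}

# Hypothetical helper: download, unpack, and install the linux/amd64 client.
# $1: version tag (e.g. v1.4.0), $2: a writable directory that is on $PATH.
install_velero_client() {
  local version="$1" dest="$2" workdir
  workdir="$(mktemp -d)" || return 1
  curl -fsSL "$(velero_release_url "$version" linux amd64)" -o "$workdir/velero.tar.gz" || return 1
  tar -xzf "$workdir/velero.tar.gz" -C "$workdir" || return 1
  mv "$workdir/velero-$version-linux-amd64/velero" "$dest/velero" || return 1
  # Same smoke test as the last step above.
  "$dest/velero" -h
}

# Example (not run automatically):
#   install_velero_client v1.4.0 "$HOME/bin"
```

Keeping the URL construction in its own function makes it easy to check the link in a browser before downloading anything.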
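3. Deploy the velero-plugin
- Clone the repository
$ git clone https://github.com/AliyunContainerService/velero-plugin
- Edit the configuration
# Edit `install/credentials-velero`: fill in the `AccessKeyID` and `AccessKeySecret` obtained from the newly created user. The OSS endpoint is the access domain name of the OSS bucket set up earlier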
ALIBABA_CLOUD_ACCESS_KEY_ID=<ALIBABA_CLOUD_ACCESS_KEY_ID>
ALIBABA_CLOUD_ACCESS_KEY_SECRET=<ALIBABA_CLOUD_ACCESS_KEY_SECRET>
ALIBABA_CLOUD_OSS_ENDPOINT=<ALIBABA_CLOUD_OSS_ENDPOINT>
# Edit `install/01-velero.yaml` and fill in the OSS configuration:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: velero
  name: velero
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    component: velero
  name: velero
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: velero
    namespace: velero
---
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  labels:
    component: velero
  name: default
  namespace: velero
spec:
  config:
    region: cn-beijing
  objectStorage:
    bucket: k8s-backup-test
    prefix: test
  provider: alibabacloud
---
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  labels:
    component: velero
  name: default
  namespace: velero
spec:
  config:
    region: cn-beijing
  provider: alibabacloud
---
# extensions/v1beta1 Deployments are no longer served as of Kubernetes 1.16; apps/v1 is used here
apiVersion: apps/v1
kind: Deployment
metadata:
  name: velero
  namespace: velero
spec:
  replicas: 1
  selector:
    matchLabels:
      deploy: velero
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8085"
        prometheus.io/scrape: "true"
      labels:
        component: velero
        deploy: velero
    spec:
      serviceAccountName: velero
      containers:
        - name: velero
          # sync from velero/velero:v1.2.0
          image: registry.cn-hangzhou.aliyuncs.com/acs/velero:v1.2.0
          imagePullPolicy: IfNotPresent
          command:
            - /velero
          args:
            - server
            - --default-volume-snapshot-locations=alibabacloud:default
          env:
            - name: VELERO_SCRATCH_DIR
              value: /scratch
            - name: ALIBABA_CLOUD_CREDENTIALS_FILE
              value: /credentials/cloud
          volumeMounts:
            - mountPath: /plugins
              name: plugins
            - mountPath: /scratch
              name: scratch
            - mountPath: /credentials
              name: cloud-credentials
      initContainers:
        - image: registry.cn-hangzhou.aliyuncs.com/acs/velero-plugin-alibabacloud:v1.2-991b590
          imagePullPolicy: IfNotPresent
          name: velero-plugin-alibabacloud
          volumeMounts:
            - mountPath: /target
              name: plugins
      volumes:
        - emptyDir: {}
          name: plugins
        - emptyDir: {}
          name: scratch
        - name: cloud-credentials
          secret:
            secretName: cloud-credentials
- Deploy the Velero server to k8s
# Create the namespace
kubectl create namespace velero
# Create the secret from credentials-velero
kubectl create secret generic cloud-credentials --namespace velero --from-file cloud=install/credentials-velero
# Install the CRDs
kubectl apply -f install/00-crds.yaml
# Deploy Velero
kubectl apply -f install/01-velero.yaml
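After applying the manifests it is worth confirming that the server actually came up. The sketch below is one way to do that, not part of the plugin's own instructions: `verify_velero_install` is a hypothetical name, and the 120-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: basic post-install checks for the Velero server.
verify_velero_install() {
  # Wait for the Deployment named "velero" (from 01-velero.yaml) to finish rolling out.
  kubectl -n velero rollout status deployment/velero --timeout=120s || return 1
  # List the pods in the velero namespace; the server pod should be Running.
  kubectl -n velero get pods || return 1
  # The BackupStorageLocation named "default" should show up here.
  velero backup-location get
}

# Example (not run automatically):
#   verify_velero_install
```

If the rollout never completes, `kubectl -n velero describe pod` and the pod logs are the usual next places to look, for instance for a wrong bucket name or bad credentials.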
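4. Backup test
- Here we use Velero to back up the resources of one application in the cluster, so that they can be restored quickly after a failure or an operator mistake. First, deploy the following YAML: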
---
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-example
  labels:
    app: nginx
---
# apps/v1 requires spec.selector; apps/v1beta1 is no longer served as of Kubernetes 1.16
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: nginx-example
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - image: nginx:1.7.9
          name: nginx
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: my-nginx
  namespace: nginx-example
spec:
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: nginx
- We can back up the whole cluster or only the namespaces we need. Here we back up a single namespace: nginx-example
[rsync@velero-plugin]$ kubectl get pods -n nginx-example
NAME READY STATUS RESTARTS AGE
nginx-deployment-5c689d88bb-f8vsx 1/1 Running 0 6m31s
nginx-deployment-5c689d88bb-rt2zk 1/1 Running 0 6m32s
[rsync@velero]$ cd velero-v1.4.0-linux-amd64/
[rsync@velero-v1.4.0-linux-amd64]$ ll
total 56472
drwxrwxr-x 4 rsync rsync 4096 Jun 1 15:02 examples
-rw-r--r-- 1 rsync rsync 10255 Dec 10 01:08 LICENSE
-rwxr-xr-x 1 rsync rsync 57810814 May 27 04:33 velero
[rsync@velero-v1.4.0-linux-amd64]$ ./velero backup create nginx-backup --include-namespaces nginx-example --wait
Backup request "nginx-backup" submitted successfully.
Waiting for backup to complete. You may safely press ctrl-c to stop waiting - your backup will continue in the background.
.
Backup completed with status: Completed. You may check for more information using the commands `velero backup describe nginx-backup` and `velero backup logs nginx-backup`.
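Besides `velero backup describe nginx-backup` and `velero backup logs nginx-backup`, a script can read the backup's phase straight from the Backup object in the velero namespace. A minimal sketch, where `backup_completed` is a hypothetical helper name:

```shell
# Hypothetical helper: print the phase of a Backup object and
# succeed only if that phase is "Completed".
backup_completed() {
  local name="$1" phase
  # Backup objects live in the velero namespace as backups.velero.io resources.
  phase="$(kubectl -n velero get backups.velero.io "$name" -o jsonpath='{.status.phase}')" || return 1
  echo "backup $name phase: $phase"
  [ "$phase" = "Completed" ]
}

# Example (not run automatically):
#   backup_completed nginx-backup && echo "safe to continue"
```

Gating the next step on the phase avoids deleting a namespace whose backup only partially succeeded.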
- Delete the namespace
[rsync@velero-v1.4.0-linux-amd64]$ kubectl delete namespaces nginx-example
namespace "nginx-example" deleted
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
No resources found.
- Restore
[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup --wait
Restore request "nginx-backup-20200603180922" submitted successfully.
Waiting for restore to complete. You may safely press ctrl-c to stop waiting - your restore will continue in the background.
Restore completed with status: Completed. You may check for more information using the commands `velero restore describe nginx-backup-20200603180922` and `velero restore logs nginx-backup-20200603180922`.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME READY STATUS RESTARTS AGE
nginx-deployment-5c689d88bb-f8vsx 1/1 Running 0 5s
nginx-deployment-5c689d88bb-rt2zk 0/1 ContainerCreating 0 5s
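# The pods have been restored
- Migration works the same way as backup and restore. Now a special case: deploy one more project after the backup, then restore, and see whether the restore deletes the newly deployed project.
A new tomcat container has been created: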
[rsync@tomcat-test]$ kubectl get pods -n nginx-example
NAME READY STATUS RESTARTS AGE
nginx-deployment-5c689d88bb-f8vsx 1/1 Running 0 65m
nginx-deployment-5c689d88bb-rt2zk 1/1 Running 0 65m
tomcat-test-sy-677ff78f6b-rc5vq 1/1 Running 0 7s
- Run a restore
[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20200603191726" submitted successfully.
Run `velero restore describe nginx-backup-20200603191726` or `velero restore logs nginx-backup-20200603191726` for more details.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME READY STATUS RESTARTS AGE
nginx-deployment-5c689d88bb-f8vsx 1/1 Running 0 68m
nginx-deployment-5c689d88bb-rt2zk 1/1 Running 0 68m
tomcat-test-sy-677ff78f6b-rc5vq 1/1 Running 0 2m33s
# Nothing was overwritten
- Delete the nginx deployment, then restore again
[rsync@velero-v1.4.0-linux-amd64]$ kubectl delete deployment nginx-deployment -n nginx-example
deployment.extensions "nginx-deployment" deleted
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME READY STATUS RESTARTS AGE
tomcat-test-sy-677ff78f6b-rc5vq 1/1 Running 0 4m18s
[rsync@velero-v1.4.0-linux-amd64]$ ./velero restore create --from-backup nginx-backup
Restore request "nginx-backup-20200603191949" submitted successfully.
Run `velero restore describe nginx-backup-20200603191949` or `velero restore logs nginx-backup-20200603191949` for more details.
[rsync@velero-v1.4.0-linux-amd64]$ kubectl get pods -n nginx-example
NAME READY STATUS RESTARTS AGE
nginx-deployment-5c689d88bb-f8vsx 1/1 Running 0 2s
nginx-deployment-5c689d88bb-rt2zk 0/1 ContainerCreating 0 2s
tomcat-test-sy-677ff78f6b-rc5vq 1/1 Running 0 4m49s
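# As shown, the tomcat project is unaffected.
- Conclusion: a Velero restore does not overwrite. It restores only the resources that do not exist in the current cluster; resources that already exist are not rolled back to the backed-up version. To roll one back, delete the existing resource before running the restore.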
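The "delete first, then restore" rule from the conclusion can be wrapped in a small helper. This is a sketch under assumptions: `rollback_namespace` is a hypothetical name, and it rolls back a whole namespace, which also removes anything created after the backup (the tomcat pod in the example above would be lost).

```shell
# Hypothetical helper: roll one namespace back to the state captured in a backup.
# WARNING: the namespace is deleted first, including resources created after the backup.
# $1: backup name, $2: namespace
rollback_namespace() {
  local backup="$1" ns="$2"
  # Remove the live resources so the restore has nothing to skip.
  kubectl delete namespace "$ns" --ignore-not-found --wait=true || return 1
  # Restore only that namespace from the backup and wait for the restore to finish.
  velero restore create --from-backup "$backup" --include-namespaces "$ns" --wait
}

# Example (not run automatically):
#   rollback_namespace nginx-backup nginx-example
```

For a narrower rollback, delete just the affected resource (as done with `kubectl delete deployment` above) and run the plain restore instead.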
5. Advanced usage
- Set up periodic scheduled backups
# Back up every day at 01:00
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *"
# Back up every day at 01:00 and keep each backup for 48 hours
velero create schedule <SCHEDULE NAME> --schedule="0 1 * * *" --ttl 48h
# Back up every 6 hours
velero create schedule <SCHEDULE NAME> --schedule="@every 6h"
# Back up the web namespace once every 24 hours
velero create schedule <SCHEDULE NAME> --schedule="@every 24h" --include-namespaces web
Backups created by a schedule are named `<SCHEDULE NAME>-<TIMESTAMP>`; restore one with `velero restore create --from-backup <SCHEDULE NAME>-<TIMESTAMP>`.
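Because scheduled backups share the `<SCHEDULE NAME>-<TIMESTAMP>` naming, picking the newest one for a restore is a simple text filter. The sketch below assumes the timestamp is 14 digits (YYYYMMDDhhmmss, matching the restore names shown earlier in this post), so lexical order equals time order; `latest_scheduled_backup` is a hypothetical helper name.

```shell
# Hypothetical helper: read backup names on stdin (one per line) and print the
# newest one belonging to the given schedule.
# Assumes names look like <SCHEDULE NAME>-<14-digit timestamp> and that the
# schedule name contains no regex metacharacters.
latest_scheduled_backup() {
  local schedule="$1"
  grep -E "^${schedule}-[0-9]{14}\$" | sort | tail -n 1
}

# Example (not run automatically): list backup names, pick the newest, restore it.
#   name="$(kubectl -n velero get backups.velero.io \
#            -o custom-columns=NAME:.metadata.name --no-headers \
#          | latest_scheduled_backup daily)"
#   velero restore create --from-backup "$name"
```

Velero can also restore from a schedule's most recent backup directly with `velero restore create --from-schedule <SCHEDULE NAME>`; the filter is mainly useful when a script needs the backup name itself.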
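- To back up and restore persistent volumes, create the backup like this:
velero backup create nginx-backup-volume --snapshot-volumes --include-namespaces nginx-example
- This backup creates snapshots of the cloud disks in the cluster's region (NAS and OSS storage are not supported yet). A cloud disk can only be restored from a snapshot within the same region.
- Restore it with:
velero restore create --from-backup nginx-backup-volume --restore-volumes
- Backups can be deleted in two ways:
Method 1: delete the backup directly with a command
velero delete backups default-backup
Method 2: let the backup expire automatically by passing a TTL when creating it
velero backup create <BACKUP-NAME> --ttl <DURATION>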
- You can also add a specific label to resources; labeled resources are excluded from backups.
# Add the label
kubectl label -n <ITEM_NAMESPACE> <RESOURCE>/<NAME> velero.io/exclude-from-backup=true
# Label the default namespace
kubectl label -n default namespace/default velero.io/exclude-from-backup=true
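Exclusion labels are easy to forget about, so it helps to be able to list and undo them. A minimal sketch using standard `kubectl` label selectors; the helper names `list_excluded` and `include_again` are hypothetical.

```shell
# Hypothetical helper: list resources that carry the exclusion label.
# $1: comma-separated resource kinds to scan, e.g. "deployments,services"
list_excluded() {
  kubectl get "$1" --all-namespaces -l velero.io/exclude-from-backup=true
}

# Hypothetical helper: remove the exclusion label so the resource is backed up again.
# $1: namespace, $2: <RESOURCE>/<NAME>; the trailing "-" tells kubectl to remove the label.
include_again() {
  kubectl label -n "$1" "$2" velero.io/exclude-from-backup-
}

# Example (not run automatically):
#   list_excluded deployments,services
#   include_again default namespace/default
```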