I. What is a stateful workload (StatefulSet)?
A StatefulSet manages stateful applications. The Pods it creates carry persistent identifiers that are kept even when a Pod is rescheduled to a different node in the cluster or is destroyed and restarted. It also supports ordered deployment and deletion of Pod instances. Its main characteristics are:
1. Pod consistency: the PodName, HostName, and the start/stop order of the Pods stay consistent while they run.
2. Stable storage: volumeClaimTemplates creates a PVC and PV for each Pod. Even if a Pod is deleted or the StatefulSet is scaled down, the volume is not removed; when the Pod restarts or the StatefulSet is scaled up again, the previous volume is re-mounted automatically, so each Pod has stable storage.
3. Stable network identity: combined with a headless Service, the StatefulSet gives each Pod it creates a DNS name of the form (podname).(headless service name).namespace.svc.cluster.local, so Pod instances can reach one another by domain name.
4. Stable ordering: Pods are ordered. Deployment and scale-up follow the defined order (from 0 to N-1; before the next Pod starts, all preceding Pods must be Running and Ready), while deletion and scale-down run from N-1 down to 0.
II. StatefulSet use cases
If an application needs any of the characteristics above, a StatefulSet is worth considering. In practice it is commonly used for distributed applications, for example multiple MySQL instances with a defined relationship between them (master/slave or primary/standby), where data must be persisted, instances must start in a fixed order, and instances must be able to reach each other.
III. Creating and using a StatefulSet
The officially recommended order of creation is: create the PV -> create the PVC -> create the headless Service -> create the StatefulSet. You may wonder why the PV, the PVC and the headless Service are needed at all.
1. Why do we need a PV and a PVC?
Creating a PV and PVC and mounting them into the Pod's containers is what makes the data persistent. This article walks through statically created PVCs; a more convenient approach is to provision volumes dynamically with a StorageClass, which removes the step where a cluster administrator has to create PVs in advance. That approach will be covered in detail in a separate article, but a rough sketch of the dynamic variant is shown below.
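For comparison only, here is a minimal sketch of what the claim template looks like when a StorageClass handles provisioning. The class name standard is an assumption; use whatever provisioner-backed class exists in your cluster.

volumeClaimTemplates:
- metadata:
    name: myappdata-pvc
  spec:
    storageClassName: standard          # assumed class name; must exist in the cluster
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 0.05Gi

With this in place, the controller still creates one PVC per Pod, and the StorageClass provisions a matching PV on demand, so no PVs have to be prepared by hand.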
2. Why do we need a headless Service?
My earlier article "K8S-Service: Principles and Practice" describes the creation and characteristics of headless Services in detail. Resolving the name of a headless Service returns the IPs of all of its backend Pods, and each Pod created by a StatefulSet controller is given a DNS name of the form (podname).(headless service name).namespace.svc.cluster.local. That is why the headless Service has to be created first: resolving it returns all backend Pods, and its name becomes part of every Pod's DNS name. Let's test whether the StatefulSet can be created at all when the Service is left out.
[root@k8s-master zhanglei]# cat statesfulset-test.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp-statefulset
spec:
# serviceName: myapp-headless-service
  replicas: 2
  selector:
    matchLabels:
      app: myapp-pod
  template:
    metadata:
      labels:
        app: myapp-pod
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: myappdata-pvc
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: myappdata-pvc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 0.05Gi
[root@k8s-master zhanglei]# kubectl create -f sts-testservice.yaml
error: error validating "sts-testservice.yaml": error validating data: ValidationError(StatefulSet.spec): missing required field "serviceName" in io.k8s.api.apps.v1.StatefulSetSpec; if you choose to ignore these errors, turn validation off with --validate=false
With serviceName commented out in the YAML, the create operation fails with "missing required field serviceName". Without this field the StatefulSet cannot be created.
My earlier article "K8S-PV and PVC: Principles and Practice" covered creating PVs and PVCs, and "K8S-Service: Principles and Practice" covered creating the headless Service, so neither process is repeated here. For context, a sketch of one of the PV manifests follows, and then the PV, PVC and headless Service that have already been created.
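The actual PV manifests live in that earlier article; this is only an illustrative sketch that mirrors the attributes visible in the kubectl output below (0.1Gi capacity, RWO access, Recycle policy, hostPath backend). The host path here is an assumption.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-statefulset-03
  labels:
    release: stable
spec:
  capacity:
    storage: 0.1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: /data/pod/volume5              # assumed host directory; adjust to your environment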
[root@k8s-master zhanglei]# kubectl get pv |grep pv-statefulset
pv-statefulset-03   107374182400m   RWO   Recycle   Bound   default/myappdata-pvc-myapp-statefulset-0   15d
pv-statefulset-04   107374182400m   RWO   Recycle   Bound   default/myappdata-pvc-myapp-statefulset-1   14d
[root@k8s-master zhanglei]# kubectl get pvc |grep myappdata
myappdata-pvc-myapp-statefulset-0   Bound   pv-statefulset-03   107374182400m   RWO   14d
myappdata-pvc-myapp-statefulset-1   Bound   pv-statefulset-04   107374182400m   RWO   14d
[root@k8s-master zhanglei]# cat headless-svc-stu.yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-headless-service
  labels:
    app: statefulset
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: myapp-pod
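A quick, hedged check that the Service was created and really is headless:

kubectl create -f headless-svc-stu.yaml
kubectl get svc myapp-headless-service      # the CLUSTER-IP column should read None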
Create the StatefulSet:
[root@k8s-master zhanglei]# cat statesfulset-test.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp-statefulset
spec:
  serviceName: myapp-headless-service   # the headless Service created earlier
  replicas: 2                           # desired replica count of 2
  selector:
    matchLabels:
      app: myapp-pod
  template:
    metadata:
      labels:
        app: myapp-pod
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: myappdata-pvc
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:                 # persistent storage claim template
  - metadata:
      name: myappdata-pvc
    spec:
      accessModes: [ "ReadWriteOnce" ]  # access mode
      resources:
        requests:
          storage: 0.05Gi               # requested capacity
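A sketch of creating the StatefulSet and watching the Pods start in order; the -w flag streams status changes, and the label selector matches the template labels above:

kubectl create -f statesfulset-test.yaml
kubectl get pods -w -l app=myapp-pod        # myapp-statefulset-0 becomes Ready before -1 is started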
[root@k8s-master zhanglei]# kubectl get sts myapp-statefulset -o wide
NAME                READY   AGE   CONTAINERS   IMAGES
myapp-statefulset   2/2     14d   myapp        ikubernetes/myapp:v1
[root@k8s-master zhanglei]# kubectl describe sts myapp-statefulset
Name:               myapp-statefulset
Namespace:          default
CreationTimestamp:  Sat, 23 May 2020 18:25:02 +0800
Selector:           app=myapp-pod
Labels:             <none>
Annotations:        <none>
Replicas:           2 desired | 2 total
Update Strategy:    RollingUpdate
  Partition:        0
Pods Status:        2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=myapp-pod
  Containers:
   myapp:
    Image:        ikubernetes/myapp:v1
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /usr/share/nginx/html from myappdata-pvc (rw)
  Volumes:  <none>
Volume Claims:
  Name:          myappdata-pvc
  StorageClass:
  Labels:        <none>
  Annotations:   <none>
  Capacity:      53687091200m
  Access Modes:  [ReadWriteOnce]
Events:          <none>
Check the Pods' status; as shown below, both are Running.
[root@k8s-master zhanglei]# kubectl get pod -o wide | grep myapp-statefulset
myapp-statefulset-0   1/1   Running   0   5d21h   10.122.235.239   k8s-master   <none>   <none>
myapp-statefulset-1   1/1   Running   0   112m    10.122.235.253   k8s-master   <none>   <none>
Verifying the DNS names: as mentioned earlier, a StatefulSet combined with a headless Service gives every Pod it creates a DNS name. First, resolve the headless Service's own name.
[root@k8s-master zhanglei]# dig -t A myapp-headless-service.default.svc.cluster.local. @10.10.0.10

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el8 <<>> -t A myapp-headless-service.default.svc.cluster.local. @10.10.0.10
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22543
;; flags: qr aa rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 8e8a5971efec82f4 (echoed)
;; QUESTION SECTION:
;myapp-headless-service.default.svc.cluster.local. IN A

;; ANSWER SECTION:   # all Pods created by the StatefulSet are returned
myapp-headless-service.default.svc.cluster.local. 30 IN A 10.122.235.253
myapp-headless-service.default.svc.cluster.local. 30 IN A 10.122.235.239

;; Query time: 13 msec
;; SERVER: 10.10.0.10#53(10.10.0.10)
;; WHEN: 日 6月 07 18:05:46 CST 2020
;; MSG SIZE  rcvd: 217
Resolving the headless Service's name returns the full list of its Pods. Next, resolve a single Pod's name.
[root@k8s-master zhanglei]# dig -t A myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local. @10.10.0.10   # resolve Pod-0's DNS name

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el8 <<>> -t A myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local. @10.10.0.10
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46972
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: d930083e06cfaca9 (echoed)
;; QUESTION SECTION:
;myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local. IN A

;; ANSWER SECTION:
myapp-statefulset-0.myapp-headless-service.default.svc.cluster.local. 30 IN A 10.122.235.239   # the Pod's IP address is returned

;; Query time: 19 msec
;; SERVER: 10.10.0.10#53(10.10.0.10)
;; WHEN: 日 6月 07 18:09:18 CST 2020
;; MSG SIZE  rcvd: 193
Resolving myapp-statefulset-1 the same way returns that Pod's IP. The two Pod instances can therefore reach each other by DNS name, which suits scenarios such as a database master and slave Pod accessing one another. The same lookup can also be done from inside the cluster, as sketched below.
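The dig queries above go straight to the cluster DNS server from the node. To check resolution from inside the cluster instead, one common approach is a throwaway Pod; this is only a sketch and assumes the busybox:1.28 image can be pulled:

kubectl run -it --rm dns-test --image=busybox:1.28 --restart=Never -- \
  nslookup myapp-statefulset-1.myapp-headless-service.default.svc.cluster.local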
Verifying stability:
[root@k8s-master zhanglei]# kubectl describe pod myapp-statefulset-0 | grep ClaimName
    ClaimName:  myappdata-pvc-myapp-statefulset-0
[root@k8s-master zhanglei]# kubectl delete pod myapp-statefulset-0
pod "myapp-statefulset-0" deleted
[root@k8s-master zhanglei]# kubectl get pod
NAME                  READY   STATUS    RESTARTS   AGE
myapp-statefulset-0   1/1     Running   0          14s
myapp-statefulset-1   1/1     Running   0          129m
After the Pod is deleted, the replacement Pod comes back with the same name and binds the same PVC, so the Pod's identity is preserved; and because the original PVC is reused, no data is lost, which is exactly the persistence we wanted. A simple way to confirm this with real data is sketched below.
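A hedged sketch of the data-level check (the file name is arbitrary; wait until the replacement Pod is Running again before running the final command):

kubectl exec myapp-statefulset-0 -- sh -c 'echo hello > /usr/share/nginx/html/persist.txt'
kubectl delete pod myapp-statefulset-0
kubectl exec myapp-statefulset-0 -- cat /usr/share/nginx/html/persist.txt   # should print: hello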
Verifying the scaling order:
There are currently 2 Pods. First scale down to 1. As shown below, the Pod that gets stopped is myapp-statefulset-1, confirming that deletion starts at ordinal N-1 and works its way down to 0.
[root@k8s-master zhanglei]# kubectl get sts
NAME                READY   AGE
myapp-statefulset   2/2     14d
[root@k8s-master zhanglei]# kubectl scale sts myapp-statefulset --replicas=1
statefulset.apps/myapp-statefulset scaled
[root@k8s-master zhanglei]# kubectl get pod |grep myapp
myapp-statefulset-0   1/1     Running   0          5m43s
[root@k8s-master zhanglei]# kubectl get pvc|grep myappdata-pvc-myapp-statefulset
myappdata-pvc-myapp-statefulset-0   Bound   pv-statefulset-03   107374182400m   RWO   15d
myappdata-pvc-myapp-statefulset-1   Bound   pv-statefulset-04   107374182400m   RWO   15d
Although the StatefulSet was scaled down, the PVC that myapp-statefulset-1 had mounted was not deleted, so its historical data is preserved. Now scale back up to 3 Pods with the command below.
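The scale-up uses the same kubectl scale command as before, just with a higher replica count:

kubectl scale sts myapp-statefulset --replicas=3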
[root@k8s-master zhanglei]# kubectl get pod |grep myapp-statefulset
myapp-statefulset-0   1/1   Running   0   13m
myapp-statefulset-1   1/1   Running   0   52s
myapp-statefulset-2   0/1   Pending   0   49s
The Pods are created in the order 0, 1, 2 as the StatefulSet scales up; myapp-statefulset-2 is still Pending because it is only created once myapp-statefulset-1 has reached the Running state.
Verifying volume sharing:
[root@k8s-master zhanglei]# kubectl describe pv pv-statefulset-testservice
Name:            pv-statefulset-testservice
Labels:          release=stable
Annotations:     pv.kubernetes.io/bound-by-controller: yes
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:
Status:          Bound
Claim:           default/myappdata-pvc-myapp-statefulset-2
Reclaim Policy:  Recycle
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        107374182400m
Node Affinity:   <none>
Message:
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /data/pod/volume7    # directory on the host
    HostPathType:
Events:
  Type    Reason          Age  From                         Message
  ----    ------          ---  ----                         -------
  Normal  RecyclerPod     11m  persistentvolume-controller  Recycler pod: Successfully assigned default/recycler-for-pv-statefulset-testservice to k8s-master
  Normal  RecyclerPod     11m  persistentvolume-controller  Recycler pod: Pulling image "busybox:1.27"
  Normal  RecyclerPod     11m  persistentvolume-controller  Recycler pod: Successfully pulled image "busybox:1.27"
  Normal  RecyclerPod     11m  persistentvolume-controller  Recycler pod: Created container pv-recycler
  Normal  RecyclerPod     11m  persistentvolume-controller  Recycler pod: Started container pv-recycler
  Normal  VolumeRecycled  11m  persistentvolume-controller  Volume recycled
Log in to the container and create a file named sts.txt under the shared directory /usr/share/nginx/html (the mount is visible under Mounts in the Pod's describe output).
[root@k8s-master zhanglei]# kubectl exec -it myapp-statefulset-2 -- sh
/ # ls
bin    dev    etc    home   lib    media  mnt    proc   root   run    sbin   srv    sys    tmp    usr    var
/ # cd /usr/share/nginx/html
/usr/share/nginx/html # touch sts.txt
Back on the host, check whether the file has been synced to /data/pod/volume7. As shown below, it is there, so the verification passes. Conversely, a write made on the host under this directory also shows up in the container's mounted directory; a sketch of that direction follows the listing.
[root@k8s-master volume7]# ls
sts.txt
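A sketch of the reverse direction, assuming the same host directory and Pod (the file name is arbitrary):

echo "from host" > /data/pod/volume7/host.txt
kubectl exec myapp-statefulset-2 -- cat /usr/share/nginx/html/host.txt      # should print: from host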
IV. Summary
A StatefulSet is a good fit for workloads such as database instances that need data persistence, ordered startup, and mutual access between instances. When creating one, keep the order in mind: create the PV -> create the PVC -> create the headless Service -> create the StatefulSet.
About the author: a product manager in the cloud container / Docker / K8S space, learning some of the technology in order to design better products.