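I. Background
Early this year, because of the pandemic, I was working in traditional IT and moonlighting as a travelling "firefighter" at sites around the country when my manager picked me to evaluate private cloud platforms, which paved the way for my later certification. I first evaluated Kubernetes v1.17 and used the bundled kubeadm to build a high-availability cluster, then took charge of containerizing each module with Docker. Later, chatting with peers, I found that everyone was already on the Rancher management platform, so I tore it all down and started over with Rancher 2.x. Quite a lot of churn.
As a container orchestration engine, k8s was born with a silver spoon in its mouth. It is backed by Google and strongly supported by Red Hat, and the project is hosted by the CNCF, which was set up under the Linux Foundation with Kubernetes as its seed project. CKA stands for Certified Kubernetes Administrator; it is the official Kubernetes administrator certification, authorized by the CNCF, the body that governs Kubernetes. Alongside relevant work experience, passing it is one way for a candidate to show they can use and operate k8s. In recent years the Linux Foundation has had an operations team in China (Chinese registration site: LF開源軟件大學-Linux Foundation OSS University) and has also opened offline test centres. After registering you have one year to sit the exam, with one retake during that period.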
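II. CKA syllabus (v1.18; moving to v1.19 in September)
Networking 11%; Storage 7%; Security 12%; Cluster maintenance 11%; Troubleshooting 10%; Core concepts 19%; Installation, configuration and validation 12%; Logging and monitoring 5%; Application lifecycle management 8%; Scheduling 5%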
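III. Preparation
In May this year the Linux Foundation and the CNCF offered 30% off CKA and CKAD registration, cutting the fee from 2088 RMB to 1461.60 RMB. I took the chance and signed up before the end of May. After that, between "firefighting" trips, I read 《每天5分鍾玩轉kubernetes》 on and off and listened to Zhang Lei's Geek Time course 《深入剖析kubernetes》 to deepen my understanding, then practised on real CKA questions found on GitHub.
IV. Booking the exam
Book the exam at https://portal.linuxfoundation.org/portal. Log in, choose when you want to sit the exam, and the site offers the nearest available slots.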
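V. Exam notes
- During the pandemic the offline test centres are temporarily closed. The proctor is a foreigner; ask directly whenever something is unclear, in Chinese or English, and the proctor is very patient
- You need ID that shows your name in English: national ID card plus credit card works, but a passport is best. (I had no passport, so I used my Hong Kong and Macau travel permit and a Visa credit card; after I explained, the proctor let me through)
- According to the official site, the questions change in September: the exam shortens from 3 hours to 2 and the emphasis shifts, so it is best to pass before the end of August
- Book a morning slot if you can, when the network is stable. Only after sitting an exam hosted abroad do you appreciate how much a good VPN matters
- Task Manager may show only Chrome under applications. The VPN can stay running in the background; mine connected directly to a data centre in Los Angeles (here's hoping the "clean network" campaign the US is brewing falls through ^_^)
- Only two tabs may be open during the exam: the PSI exam interface and one kubernetes.io tab. No third tab is allowed, and neither are tabs for third-party links found in the official docs (one candidate opened Windows Notepad, was ruled in violation and barred for half an hour, lost composure and failed)
- When practising real questions, aim to finish within 2 hours, otherwise the exam will be tight: the exam terminal and notes tool are not that pleasant to use and respond slowly
- Be sure to bookmark the common YAML templates from the syllabus (Pod, initContainer, Secret, DaemonSet, Deployment and so on) so that in the exam you can paste them and edit in place
- The questions span several k8s clusters. Most are fixed on one cluster; a few use clusters such as ik8s, vk8s and bk8s, so run the kubectl config use-context command given at the start of each question first
- Watch the details in each question: names, images, directories and so on. I suggest pasting the YAML first, then changing the name, namespace, image, container name, labels and so on by pasting from the question: click a keyword with the mouse, Ctrl+Insert to copy, Shift+Insert to paste
- The exam is nominally 3 hours, but be prepared for 4 hours without a toilet break or water, because you may hit unexplained disconnections and the PSI proctor's very thorough check of your exam environment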
VI. August exam questions
1. Logs: kubectl logs
# Set configuration context $ kubectl config use-context k8s
# Monitor the logs of Pod foobar and
# Extract log lines corresponding to error file-not-found
# Write them to /opt/KULM00201/foobar
kubectl logs foobar | grep file-not-found > /opt/KULM00201/foobar
2. Sorting output: --sort-by=.metadata.name
# List all PVs sorted by name saving the full kubectl output to /opt/KUCC0010/my_volumes .
# Use kubectl's own functionality for sorting the output, and do not manipulate it any further.
kubectl get pv --sort-by=.metadata.name > /opt/KUCC0010/my_volumes
3. DaemonSet deployment
# Ensure a single instance of Pod nginx is running on each node of the kubernetes cluster where nginx also represents
# the image name which has to be used. Do not override any taints currently in place.
# Use Daemonsets to complete this task and use ds.kusc00201 as Daemonset name
# Relevant docs: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
# Delete the tolerations field, copy the example only down to the line image: gcr.io/fluentd-elasticsearch/fluentd:v2.5.1, then adjust the YAML to fit the question.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ds.kusc00201
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      containers:
      - name: fluentd-elasticsearch
        image: nginx
4. initContainers
# Add an init container to lumpy--koala (Which has been defined in spec file /opt/kucc00100/pod-spec-KUCC00100.yaml)
# The init container should create an empty file named /workdir/calm.txt
# If /workdir/calm.txt is not detected, the Pod should exit
# Once the spec file has been updated with the init container definition, the Pod should be created.
The YAML file is already given in the question; you only need to add the initContainers section and emptyDir: {}.
Init container docs: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
apiVersion: v1
kind: Pod
metadata:
  name: lumpy--koala
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-con
    image: nginx
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
    volumeMounts:        # volume mount directory
    - name: data
      mountPath: /workdir
    livenessProbe:       # health check
      exec:
        command:
        - cat
        - /workdir/calm.txt
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', "touch /workdir/calm.txt"]
    volumeMounts:        # volume mount directory
    - name: data
      mountPath: /workdir
  volumes:               # empty volume
  - name: data
    emptyDir: {}
5. Multiple containers
# Create a pod named kucc4 with a single container for each of the following images running inside
# (there may be between 1 and 4 images specified): nginx + redis + memcached + consul
apiVersion: v1
kind: Pod
metadata:
  name: kucc4
spec:
  containers:
  - name: nginx
    image: nginx
  - name: redis
    image: redis
  - name: memcached
    image: memcached
  - name: consul
    image: consul
6. nodeSelector
# Schedule a Pod as follows:
# Name: nginx-kusc00101
# Image: nginx
# Node selector: disk=ssd
apiVersion: v1
kind: Pod
metadata:
  name: nginx-kusc00101
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disk: ssd
7. Deployment upgrade and rollback (set image --record, rollout undo)
# Create a deployment as follows
# Name: nginx-app
# Using container nginx with version 1.10.2-alpine
# The deployment should contain 3 replicas
# Next, deploy the app with new version 1.13.0-alpine by performing a rolling update and record that update.
# Finally, rollback that update to the previous version 1.10.2-alpine
kubectl create deployment nginx-app --image=nginx:1.10.2-alpine
kubectl scale deploy nginx-app --replicas=3
kubectl set image deploy nginx-app nginx=nginx:1.13.0-alpine --record   # record the update
kubectl rollout history deploy nginx-app                                 # view the update history
kubectl rollout undo deploy nginx-app --to-revision=1                    # roll back to the previous version
8. NodePort
# Create and configure the service front-end-service so it’s accessible through NodePort
# and routes to the existing pod named front-end
kubectl expose pod front-end --name=front-end-service --port=80 --type=NodePort
9. Namespace
# Create a Pod as follows:
# Name: jenkins
# Using image: jenkins
# In a new Kubernetes namespace named website-frontend
kubectl create ns website-frontend
kubectl run jenkins --image=jenkins -n website-frontend
Or, as a YAML manifest (names must be lowercase):
apiVersion: v1
kind: Pod
metadata:
  name: jenkins
  namespace: website-frontend
spec:
  containers:
  - name: jenkins
    image: jenkins
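10. Generating a deployment spec with --dry-run
# Create a deployment spec file that will:
# Launch 7 replicas of the redis image with the label: app_env_stage=dev
# Deployment name: kual0020
# Save a copy of this spec file to /opt/KUAL00201/deploy_spec.yaml (or .json)
# When you are done, clean up (delete) any new k8s API objects that you produced during this task
kubectl create deployment kual00201 --image=redis --dry-run=client -o yaml > /opt/KUAL00201/deploy_spec.yaml
From v1.18, kubectl run only creates a Pod, so generate the spec with kubectl create deployment, then edit the file: set replicas: 7 and replace the labels (metadata, selector and pod template) with app_env_stage: dev. With --dry-run nothing is created in the cluster, so there is nothing to clean up unless you applied the file to test it.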
11. Finding pods through a Service's selector
# Create a file /opt/KUCC00302/kucc00302.txt that lists all pods that implement Service foo in Namespace production.
# The format of the file should be one pod name per line
kubectl get svc foo -n production -o wide    # the SELECTOR column shows the label selector, e.g. app=foo
kubectl get pods -n production -l app=foo --no-headers | awk '{print $1}' > /opt/KUCC00302/kucc00302.txt    # replace app=foo with the selector found above
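The two commands above can be chained so the selector never has to be copied by hand. This is a minimal sketch, assuming SELECTOR is the last column of `kubectl get svc -o wide` and that the Service actually has a selector; the helper name pods_for_service is hypothetical.

```shell
#!/bin/sh
# pods_for_service NAMESPACE SERVICE
# Prints one pod name per line for the pods matched by the Service's selector.
# Hypothetical helper; assumes the last column of `-o wide` output is SELECTOR.
pods_for_service() {
  ns=$1
  svc=$2
  # e.g. "foo ClusterIP 10.0.0.1 <none> 80/TCP 5d app=foo" -> "app=foo"
  selector=$(kubectl get svc "$svc" -n "$ns" -o wide --no-headers | awk '{print $NF}')
  if [ -z "$selector" ] || [ "$selector" = "<none>" ]; then
    echo "service $svc has no selector" >&2
    return 1
  fi
  # custom-columns with --no-headers prints bare pod names only
  kubectl get pods -n "$ns" -l "$selector" --no-headers -o custom-columns=NAME:.metadata.name
}

# Usage (not run here):
#   pods_for_service production foo > /opt/KUCC00302/kucc00302.txt
```

Printing names through custom-columns avoids the grep -v NAME step, since no header row is emitted in the first place.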
12. emptyDir
# Create a pod as follows:
# Name: non-persistent-redis
# Container image: redis
# Named-volume with name: cache-control
# Mount path: /data/redis
# It should launch in the pre-prod namespace and the volume MUST NOT be persistent.
The question does not require a specific location on the node, so use emptyDir: {}, whose location is chosen automatically. If it named a specific path on the host, you would use hostPath instead.
1. Create the pre-prod namespace
kubectl create ns pre-prod
2. Create the YAML file as follows:
apiVersion: v1
kind: Pod
metadata:
  name: non-persistent-redis
  namespace: pre-prod
spec:
  containers:
  - image: redis
    name: redis
    volumeMounts:
    - mountPath: /data/redis
      name: cache-control
  volumes:
  - name: cache-control
    emptyDir: {}
13. Scaling a deployment
kubectl scale deployment website --replicas=6
14. Counting schedulable nodes
# Check to see how many nodes are ready (not including nodes tainted NoSchedule) and write the number to /opt/nodenum
1. kubectl get node | grep -w Ready | wc -l    # grep -w matches whole words; call the result N
2. kubectl describe nodes | grep Taints | grep -i noschedule | wc -l    # call the result M
3. The answer is N minus M; write that value to /opt/nodenum
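The N minus M subtraction miscounts when a node that is not Ready also carries a NoSchedule taint, so a per-node check can be safer. A minimal sketch, assuming the second column of `kubectl get nodes` is STATUS and reading taint effects through jsonpath; the helper name count_schedulable_nodes is hypothetical.

```shell
#!/bin/sh
# Counts nodes whose STATUS is exactly "Ready" and that carry no NoSchedule taint.
# Hypothetical helper; assumes column 2 of `kubectl get nodes` is STATUS.
count_schedulable_nodes() {
  count=0
  for node in $(kubectl get nodes --no-headers | awk '$2 == "Ready" {print $1}'); do
    tainted=0
    # Prints the effects of all taints on the node, space separated (empty if none)
    for effect in $(kubectl get node "$node" -o jsonpath='{.spec.taints[*].effect}'); do
      if [ "$effect" = "NoSchedule" ]; then
        tainted=1
      fi
    done
    if [ "$tainted" -eq 0 ]; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Usage (not run here):
#   count_schedulable_nodes > /opt/nodenum
```

Comparing each effect as a whole word means a PreferNoSchedule taint does not exclude a node, which a plain case-insensitive grep for noschedule would do.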
15. kubectl top
# From the Pod label name=cpu-utilizer, find pods running high CPU workloads
# and write the name of the Pod consuming most CPU to the file /opt/cpu.txt (which already exists)
kubectl top pods -l name=cpu-utilizer --sort-by=cpu    # the first pod listed uses the most CPU; write its name to /opt/cpu.txt
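If you would rather not read the name off the screen, the top consumer can be extracted in one pipeline. A minimal sketch, assuming every CPU value is printed in millicores (such as 250m) so that a numeric sort on the second column is meaningful; the helper name top_cpu_pod is hypothetical.

```shell
#!/bin/sh
# top_cpu_pod LABEL
# Prints the name of the pod with the highest CPU usage among pods matching LABEL.
# Hypothetical helper; assumes the CPU column is always in millicores (e.g. "250m").
top_cpu_pod() {
  label=$1
  kubectl top pods -l "$label" --no-headers \
    | sort -k2,2nr \
    | head -n 1 \
    | awk '{print $1}'
}

# Usage (not run here):
#   top_cpu_pod name=cpu-utilizer > /opt/cpu.txt
```

Sorting in the shell keeps the sketch independent of whether the installed kubectl supports --sort-by for the top subcommand.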
16. Node NotReady
# A Kubernetes worker node, labelled with name=wk8s-node-0 is in state NotReady .
# Investigate why this is the case, and perform any appropriate steps to bring the node to a Ready state,
# ensuring that any changes are made permanent.
kubectl get nodes | grep NotReady
ssh node
systemctl status kubelet
systemctl start kubelet
systemctl enable kubelet
17. Creating a PV
# Create a persistent volume with name app-config of capacity 1Gi and access mode ReadWriteOnce.
# The type of volume is hostPath and its location is /srv/app-config
# https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-a-persistentvolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-config
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: slow
  hostPath:
    path: /srv/app-config
18. etcd backup
# Create a snapshot of the etcd instance running at https://127.0.0.1:2379 saving the snapshot to the
# file path /data/backup/etcd-snapshot.db
# The etcd instance is running etcd version 3.1.10
# The following TLS certificates/key are supplied for connecting to the server with etcdctl
# CA certificate: /opt/KUCM00302/ca.crt
# Client certificate: /opt/KUCM00302/etcd-client.crt
# Client key: /opt/KUCM00302/etcd-client.key
General form:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=ca.pem --cert=server.pem --key=server-key.pem snapshot save <the given path>
Backup for this question:
ETCDCTL_API=3 /usr/bin/etcdctl snapshot save /data/backup/etcd-snapshot.db --endpoints=https://127.0.0.1:2379 --cacert=/opt/KUCM00302/ca.crt --cert=/opt/KUCM00302/etcd-client.crt --key=/opt/KUCM00302/etcd-client.key
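It can be reassuring to confirm that the snapshot file is readable before moving on. A minimal sketch that wraps the backup command and then runs etcdctl snapshot status on the result; the helper name etcd_backup and its two parameters are hypothetical, and the certificate file names are assumed to match the ones in the question.

```shell
#!/bin/sh
# etcd_backup SNAPSHOT_PATH CERT_DIR
# Saves a snapshot, then prints its hash, revision, key count and size.
# Hypothetical helper; assumes CERT_DIR holds ca.crt, etcd-client.crt and etcd-client.key.
etcd_backup() {
  snapshot=$1
  cert_dir=$2
  ETCDCTL_API=3 etcdctl snapshot save "$snapshot" \
    --endpoints=https://127.0.0.1:2379 \
    --cacert="$cert_dir/ca.crt" \
    --cert="$cert_dir/etcd-client.crt" \
    --key="$cert_dir/etcd-client.key" || return 1
  # Reads the local file only; no connection to etcd is needed for this step
  ETCDCTL_API=3 etcdctl snapshot status "$snapshot" -w table
}

# Usage (not run here):
#   etcd_backup /data/backup/etcd-snapshot.db /opt/KUCM00302
```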
19. Node maintenance (drain, cordon, uncordon)
# Set the node labelled with name=ek8s-node-1 as unavailable and reschedule all the pods running on it.
Switch to the ek8s cluster first.
kubectl get nodes -l name=ek8s-node-1
kubectl drain ek8s-node-1
# Some people report the command failing unless the following flags are added; I did not hit that myself:
# --ignore-daemonsets=true --delete-local-data=true --force=true
# Normal procedure for taking a node offline:
# 1. cordon: mark the node under maintenance as unschedulable
# 2. drain: evict the pods on the node
# 3. delete node
kubectl cordon k8s-node2
kubectl drain k8s-node2 --ignore-daemonsets --force
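Because the question identifies the node by label rather than by name, resolving the name from the label avoids draining the wrong node. A minimal sketch, assuming the v1.18-era flag --delete-local-data (later kubectl releases renamed it); the helper name drain_labelled_nodes is hypothetical.

```shell
#!/bin/sh
# drain_labelled_nodes LABEL
# Drains every node that matches the label selector.
# Hypothetical helper; --delete-local-data is the flag name assumed for kubectl v1.18.
drain_labelled_nodes() {
  label=$1
  for node in $(kubectl get nodes -l "$label" --no-headers -o custom-columns=NAME:.metadata.name); do
    # drain cordons the node first, then evicts its pods
    kubectl drain "$node" --ignore-daemonsets --delete-local-data --force
  done
}

# Usage (not run here):
#   drain_labelled_nodes name=ek8s-node-1
```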
20. Service DNS
# Create a deployment as follows
# Name: nginx-dns
# Exposed via a service: nginx-dns
# Ensure that the service & pod are accessible via their respective DNS records
# The container(s) within any Pod(s) running as a part of this deployment should use the nginx image
# Next, use the utility nslookup to look up the DNS records of the service & pod and write the output to /opt/service.dns and /opt/pod.dns respectively.
# Ensure you use the busybox:1.28 image (or earlier) for any testing, as the latest release has an upstream bug which impacts the use of nslookup.
Step 1: create the deployment (from v1.18, kubectl run only creates a Pod)
kubectl create deployment nginx-dns --image=nginx
Step 2: expose the service
kubectl expose deployment nginx-dns --name=nginx-dns --port=80 --type=NodePort
Step 3: look up the pod IP
kubectl get pods -o wide    # get the pod's IP, for example 10.244.1.37
Step 4: test with busybox 1.28
kubectl run busybox -it --rm --image=busybox:1.28 sh
/ # nslookup nginx-dns      # look up the nginx-dns record
/ # nslookup 10.244.1.37    # look up the pod record
Step 5: write the results to the files the question asks for, /opt/service.dns and /opt/pod.dns
# This question is ambiguous, so write in everything you found and be thorough; whether it scores is down to luck.
1. For nginx-dns:
echo 'Name: nginx-dns' >> /opt/service.dns
echo 'Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local' >> /opt/service.dns
2. For the pod:
echo 'Name: 10.244.1.37' >> /opt/pod.dns
echo 'Address 1: 10.244.1.37 10-244-1-37.nginx-dns.default.svc.cluster.local' >> /opt/pod.dns
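Besides the reverse lookup by IP, a pod can also be queried by name, since Kubernetes documents pod A records of the form pod-ip-with-dashes.namespace.pod.cluster-domain. A minimal sketch that builds that name from an IP, assuming the default cluster domain cluster.local and a cluster DNS that serves pod records; the helper name pod_dns_name is hypothetical.

```shell
#!/bin/sh
# pod_dns_name IP [NAMESPACE] [CLUSTER_DOMAIN]
# Builds the pod A-record name, e.g. 10.244.1.37 -> 10-244-1-37.default.pod.cluster.local
# Hypothetical helper; namespace defaults to "default", domain to "cluster.local" (assumed).
pod_dns_name() {
  ip=$1
  ns=${2:-default}
  domain=${3:-cluster.local}
  dashed=$(printf '%s' "$ip" | tr '.' '-')
  printf '%s.%s.pod.%s\n' "$dashed" "$ns" "$domain"
}

# Usage (not run here): run nslookup from a throwaway busybox:1.28 pod
# and capture the output on the host instead of retyping it.
#   kubectl run busybox --image=busybox:1.28 --rm -it --restart=Never -- \
#     nslookup "$(pod_dns_name 10.244.1.37)" > /opt/pod.dns
```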
21. Mounting a Secret
# Create a Kubernetes Secret as follows:
# Name: super-secret
# Credential: alice or username:bob
# Create a Pod named pod-secrets-via-file using the redis image which mounts a secret named super-secret at /secrets
# Create a second Pod named pod-secrets-via-env using the redis image, which exports credential as TOPSECRET
https://kubernetes.io/zh/docs/concepts/configuration/secret/#%E8%AF%A6%E7%BB%86
apiVersion: v1
kind: Secret
metadata:
  name: super-secret
type: Opaque
data:
  username: Ym9i    # echo -n "bob" | base64
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-secrets-via-file
spec:
  containers:
  - name: mypod
    image: redis
    volumeMounts:
    - name: foo
      mountPath: "/secrets"
      readOnly: true
  volumes:
  - name: foo
    secret:
      secretName: super-secret
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-secrets-via-env
spec:
  containers:
  - name: mycontainer
    image: redis
    env:
    - name: TOPSECRET
      valueFrom:
        secretKeyRef:
          name: super-secret
          key: username
  restartPolicy: Never
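Values under data: in a Secret manifest must be base64 encoded, and a stray trailing newline changes the encoding. A minimal sketch of encoding a value without the newline; the helper name encode_secret_value is hypothetical.

```shell
#!/bin/sh
# encode_secret_value VALUE
# Base64-encodes VALUE without a trailing newline, ready for a Secret's data: field.
# Hypothetical helper name.
encode_secret_value() {
  printf '%s' "$1" | base64
}

encode_secret_value bob     # prints Ym9i
encode_secret_value alice   # prints YWxpY2U=

# Alternative (not run here): let kubectl do the encoding.
#   kubectl create secret generic super-secret --from-literal=username=bob
```

Using printf '%s' rather than a bare echo matters here: echo appends a newline, which would be encoded into the secret value.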
22. Static pod: --pod-manifest-path
# Configure the kubelet systemd managed service, on the node labelled with name=wk8s-node-1,
# to launch a Pod containing a single container of image nginx named myservice automatically.
# Any spec files required should be placed in the /etc/kubernetes/manifests directory on the node.
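The spec file belongs in /etc/kubernetes/manifests (the path is given in the question).
1. ssh to the node, then sudo -i
2. vi /etc/kubernetes/manifests/static-pod.yaml and define a Pod
3. systemctl status kubelet to find the path of kubelet.service
4. vi /etc/systemd/system/kubelet.service (use the path found in the previous step) and check whether --pod-manifest-path=/etc/kubernetes/manifests is present; add it if it is missing
5. systemctl daemon-reload, then systemctl restart kubelet.service and systemctl enable kubelet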
23. Cluster troubleshooting
# Determine the node, the failing service and take actions to bring up the failed service
# and restore the health of the cluster. Ensure that any changes are made permanently.
# The worker node in this cluster is labelled with name=bk8s-node-0
ps -ef|grep kubelet
Find the --config=/var/lib/kubelet/config.yaml argument and check whether the YAML file it points to sets a static Pod path. If it does not, the kubelet will not create static Pods automatically, and pod-manifest-path has no default value.
cat /var/lib/kubelet/config.yaml
If there is no static Pod path setting, append staticPodPath: /etc/kubernetes/manifests at the end, then run:
systemctl restart kubelet
systemctl enable kubelet
Check the nodes again and everything should be OK.
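The check-then-append step can be made repeatable so that running it twice does not duplicate the setting. A minimal sketch, assuming staticPodPath is a top-level key in the kubelet config file; the helper name ensure_static_pod_path is hypothetical.

```shell
#!/bin/sh
# ensure_static_pod_path CONFIG_FILE MANIFEST_DIR
# Appends "staticPodPath: MANIFEST_DIR" to CONFIG_FILE unless the key is already set.
# Hypothetical helper; assumes staticPodPath is a top-level key in the file.
ensure_static_pod_path() {
  config=$1
  dir=$2
  if grep -q '^staticPodPath:' "$config"; then
    echo "staticPodPath already set in $config"
  else
    # Leading newline guards against a file that lacks a final line break
    printf '\nstaticPodPath: %s\n' "$dir" >> "$config"
    echo "staticPodPath added to $config"
  fi
}

# Usage (not run here), followed by a kubelet restart:
#   ensure_static_pod_path /var/lib/kubelet/config.yaml /etc/kubernetes/manifests
#   systemctl restart kubelet
```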
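24. Deploying a cluster with kubeadm
Requirement:
Two nodes are provided, master1 and node1, along with an admin.conf file; deploy the cluster.
Steps:
1. ssh to the master node
2. Install the required components and enable the kubelet at boot
Official docs (https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#%E5%AE%89%E8%A3%85-kubeadm-kubelet-%E5%92%8C-kubectl)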
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
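3. Initialize the master node
kubeadm init --config /etc/kubeadm.conf --ignore-preflight-errors=all    # both the ignore-errors flag and the config file are provided, and the config file needs no changes. Read the question carefully!
4. Copy the join command for the worker node
5. Switch back to the student host
6. ssh to the node host
7. Install the required components and enable the kubelet at boot
8. Paste the join command
9. Switch back to the student host
10. ssh to the master node
11. Check that the worker node has joined
12. On the master node, run the command that installs the network plugin
Official docs (https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network)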
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
13. Check that all nodes are Ready
14. Switch back to the student host