I. Background

Thanks to the epidemic at the start of the year, I, stuck in the traditional IT industry and moonlighting as a nationwide "firefighter" on business trips, had the luck to be picked by my manager to evaluate private cloud platforms, which laid the groundwork for my later certification journey. I first evaluated Kubernetes v1.17, built a highly available cluster with the bundled kubeadm, and then took charge of dockerizing each of our modules. Later, chatting with peers, I found they had all moved to the Rancher management platform, so I tore everything down and rebuilt on Rancher 2.x. Quite a bit of churn.

As a container orchestration engine, Kubernetes was born with a silver spoon in its mouth: Google stood behind it, Red Hat backed it heavily, and the two hit it off and founded the CNCF to host the project. CKA stands for Certified Kubernetes Administrator; it is issued under the authority of the CNCF, the body that governs Kubernetes, and is the official CNCF certification for Kubernetes administrators. Beyond relevant work experience, passing it is one way for a candidate to demonstrate the ability to use and operate Kubernetes. In recent years The Linux Foundation has set up an operations team in China (registration site for China: LF开源软件大学-Linux Foundation OSS University) and has also opened offline test centers. After registering you have one year in which to take the exam, including one free retake.
II. CKA Syllabus (v1.18; moving to v1.19 in September)

Networking 11%, Storage 7%, Security 12%, Cluster Maintenance 11%, Troubleshooting 10%, Core Concepts 19%, Installation, Configuration & Validation 12%, Logging & Monitoring 5%, Application Lifecycle Management 8%, Scheduling 5%
III. Preparation

In May this year the Linux Foundation and CNCF offered 30% off CKA and CKAD registration, cutting the fee from 2088 RMB to 1461.60 RMB. I grabbed the chance and got on board before the end of May. After that, in between "firefighting" trips around the country, I read 《每天5分钟玩转kubernetes》 on and off and listened to 张磊's Geek Time course 《深入剖析kubernetes》 to deepen my understanding, then practiced on real CKA questions found on GitHub.
IV. Booking the Exam

Book at https://portal.linuxfoundation.org/portal. Log in, choose when you want to take the exam, and the site will match you with the nearest available slots.
V. Exam Notes

- During the epidemic the offline test centers are temporarily closed. The proctor is a foreigner; if anything is unclear just ask, Chinese or English both work, and the proctors are extremely patient.
- You need identification in English: an ID card plus a credit card, though a passport is best. (I had no passport, so I used a Hong Kong/Macau travel permit plus a Visa credit card, explained it to the proctor, and was waved through.)
- Per the official site, the exam is updated in September: the duration drops from 3 hours to 2 and the emphasis shifts, so ideally pass before the end of August.
- Book a morning slot if you can, when the network is stable. Only when sitting an overseas exam do you appreciate how important a good proxy is.
- Chrome must be the only application visible in Task Manager; the proxy can stay running in the background. Mine was a direct line to a Los Angeles data center (here's hoping the 'Clean Network' plan the US is brewing comes to nothing ^_^).
- Only two tabs may be open: the PSI exam interface and the kubernetes.io site. No third tab, and no third-party links from within the docs. (One candidate opened the Windows built-in Notepad, was ruled in violation and suspended for half an hour, lost their nerve, and failed outright.)
- When practicing the past questions, aim to finish within 2 hours, otherwise the real exam will be touch and go; the exam terminal and notes tool are not great and respond slowly.
- Be sure to bookmark the common YAML samples from the syllabus: Pod, initContainer, Secret, DaemonSet, Deployment, and so on. In the exam you paste them in and edit on top; a bare skeleton like the one after this list is enough to start from.
- The questions span several Kubernetes clusters. Most are pinned to one cluster; a handful run on clusters named like ik8s, vk8s, and bk8s.
- Watch the details in every question: names, images, directories. Paste the YAML first, then work through the question adjusting names, namespaces, images, container names, labels, and so on. Click a keyword, Ctrl+Insert to copy, Shift+Insert to paste.
- The exam nominally lasts 3 hours; be ready for 4 hours without a toilet break or water, because you may hit inexplicable network drops, plus the PSI proctor's extremely patient inspection of your exam environment.
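For instance, a Pod skeleton like this is worth a bookmark (a minimal sketch; every name and the image are placeholders to overwrite per question):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod          # placeholder: rename per question
spec:
  containers:
  - name: main         # placeholder container name
    image: nginx       # placeholder image
```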
VI. Real Questions from August

1. Logs (kubectl logs)

```bash
# Set configuration context:
kubectl config use-context k8s
# Monitor the logs of Pod foobar,
# extract log lines corresponding to error file-not-found,
# and write them to /opt/KULM00201/foobar
kubectl logs foobar | grep file-not-found > /opt/KULM00201/foobar
```
2. Sorted output (--sort-by=.metadata.name)

```bash
# List all PVs sorted by name, saving the full kubectl output to /opt/KUCC0010/my_volumes .
# Use kubectl's own functionality for sorting the output, and do not manipulate it any further.
# (PVs are cluster-scoped, so no namespace flag is needed; note the question asks for the output to be saved.)
kubectl get pv --sort-by=.metadata.name > /opt/KUCC0010/my_volumes
```
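The same JSONPath pattern sorts on any field; a common variant of this task asks for PVs sorted by capacity (a sketch, not part of the August paper):

```bash
kubectl get pv --sort-by=.spec.capacity.storage
```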
3. DaemonSet

```bash
# Ensure a single instance of Pod nginx is running on each node of the kubernetes cluster,
# where nginx also represents the image name which has to be used. Do not override any taints
# currently in place. Use DaemonSets to complete this task and use ds.kusc00201 as the DaemonSet name.
```

Reference doc: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
Copy the sample as far as the image: gcr.io/fluentd-elasticsearch/fluentd:v2.5.1 line, delete the tolerations field, then adjust the YAML to the question:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ds.kusc00201
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      containers:
      - name: fluentd-elasticsearch
        image: nginx
```
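A quick check that exactly one pod landed on each node (the pods inherit the DaemonSet's name as a prefix):

```bash
kubectl get ds ds.kusc00201 -n kube-system
kubectl get pods -n kube-system -o wide | grep ds.kusc00201
```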
4. initContainers

```bash
# Add an init container to lumpy--koala (which has been defined in spec file /opt/kucc00100/pod-spec-KUCC00100.yaml).
# The init container should create an empty file named /workdir/calm.txt.
# If /workdir/calm.txt is not detected, the Pod should exit.
# Once the spec file has been updated with the init container definition, the Pod should be created.
```

The YAML file is already provided by the question; you only need to add the initContainers section and the emptyDir: {} volume.
Init containers doc: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lumpy--koala
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-con
    image: nginx
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
    volumeMounts:          # volume mount
    - name: data
      mountPath: /workdir
    livenessProbe:         # health check: the Pod fails if the file is missing
      exec:
        command:
        - cat
        - /workdir/calm.txt
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', 'touch /workdir/calm.txt']
    volumeMounts:          # the same volume as the main container
    - name: data
      mountPath: /workdir
  volumes:                 # empty, non-persistent volume
  - name: data
    emptyDir: {}
```
5. Multi-container Pod

```bash
# Create a pod named kucc4 with a single container for each of the following images running inside
# (there may be between 1 and 4 images specified): nginx + redis + memcached + consul
```

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kucc4
spec:
  containers:
  - name: nginx
    image: nginx
  - name: redis
    image: redis
  - name: memcached
    image: memcached
  - name: consul
    image: consul
```
6. nodeSelector

```bash
# Schedule a Pod as follows:
# Name: nginx-kusc00101
# Image: nginx
# Node selector: disk=ssd
```

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-kusc00101
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disk: ssd     # must match the question: disk=ssd, not disktype
```
7. Deployment upgrade and rollback (set image --record, rollout undo)

```bash
# Create a deployment as follows:
# Name: nginx-app
# Using container nginx with version 1.10.2-alpine
# The deployment should contain 3 replicas
# Next, deploy the app with new version 1.13.0-alpine by performing a rolling update and record that update.
# Finally, rollback that update to the previous version 1.10.2-alpine
```

```bash
kubectl create deployment nginx-app --image=nginx:1.10.2-alpine
kubectl scale deploy nginx-app --replicas=3
kubectl set image deploy nginx-app nginx=nginx:1.13.0-alpine --record   # record the update
kubectl rollout history deploy nginx-app                                # inspect the update history
kubectl rollout undo deploy nginx-app --to-revision=1                   # roll back to the previous revision
```
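It can be worth confirming the rolling update has finished before undoing it (not required by the question, just a sanity check):

```bash
kubectl rollout status deploy nginx-app
```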
8. NodePort

```bash
# Create and configure the service front-end-service so it's accessible through NodePort
# and routes to the existing pod named front-end
kubectl expose pod front-end --name=front-end-service --port=80 --type=NodePort
```
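To see which port the cluster assigned (a quick check):

```bash
kubectl get svc front-end-service
```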
9. Namespace

```bash
# Create a Pod as follows:
# Name: jenkins
# Using image: jenkins
# In a new Kubernetes namespace named website-frontend
```

```bash
kubectl create ns website-frontend
kubectl run jenkins --image=jenkins -n website-frontend
```

Or declaratively (note that object names must be lowercase, so jenkins, not Jenkins):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: jenkins
  namespace: website-frontend
spec:
  containers:
  - name: jenkins
    image: jenkins
```
10. kubectl run

```bash
# Create a deployment spec file that will:
# Launch 7 replicas of the redis image with the label: app_env_stage=dev
# Deployment name: kual0020
# Save a copy of this spec file to /opt/KUAL00201/deploy_spec.yaml (or .json)
# When you are done, clean up (delete) any new k8s API objects that you produced during this task
```

On v1.18 `kubectl run` generates a Pod, not a Deployment, so use `kubectl create deployment` as the generator, then edit in the replicas and the label (note the label in the question is app_env_stage=dev; see the sketch below):

```bash
kubectl create deployment kual0020 --image=redis --dry-run=client -o yaml > /opt/KUAL00201/deploy_spec.yaml
# edit the file: set replicas: 7 and the app_env_stage=dev labels, then create, and finally clean up
kubectl apply -f /opt/KUAL00201/deploy_spec.yaml
kubectl delete -f /opt/KUAL00201/deploy_spec.yaml
```
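The saved spec should end up roughly like this (a sketch; the grader cares about the name, image, replicas, and label):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kual0020
  labels:
    app_env_stage: dev
spec:
  replicas: 7
  selector:
    matchLabels:
      app_env_stage: dev
  template:
    metadata:
      labels:
        app_env_stage: dev
    spec:
      containers:
      - name: redis
        image: redis
```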
11. Finding the Pods behind a Service via its selector

```bash
# Create a file /opt/KUCC00302/kucc00302.txt that lists all pods that implement Service foo in Namespace production.
# The format of the file should be one pod name per line.
```

```bash
# First find the Service's selector (shown in the SELECTOR column):
kubectl get svc foo -n production -o wide
# Then list the matching pods, one name per line (assuming the selector turns out to be app=foo):
kubectl get pods -n production -l app=foo | grep -v NAME | awk '{print $1}' > /opt/KUCC00302/kucc00302.txt
```
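A variant that skips the grep/awk (same assumed selector):

```bash
kubectl get pods -n production -l app=foo -o name | cut -d/ -f2 > /opt/KUCC00302/kucc00302.txt
```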
12. emptyDir

```bash
# Create a pod as follows:
# Name: non-persistent-redis
# Container image: redis
# Named-volume with name: cache-control
# Mount path: /data/redis
# It should launch in the pre-prod namespace and the volume MUST NOT be persistent.
```

No specific location on the host is required, so a random location via emptyDir: {} is correct; if the question named a specific host path, hostPath would be the choice instead.

1. Create the pre-prod namespace:

```bash
kubectl create ns pre-prod
```

2. Create the YAML file as follows:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: non-persistent-redis
  namespace: pre-prod
spec:
  containers:
  - image: redis
    name: redis
    volumeMounts:
    - mountPath: /data/redis
      name: cache-control
  volumes:
  - name: cache-control
    emptyDir: {}
```
13. Deployment scale

```bash
kubectl scale deployment website --replicas=6
```
14. Counting schedulable nodes

```bash
# Check to see how many nodes are ready (not including nodes tainted NoSchedule)
# and write the number to /opt/nodenum
```

1. Get a count N of Ready nodes (grep -w matches the whole word):

```bash
kubectl get nodes | grep -w Ready | wc -l
```

2. Get a count M of nodes tainted NoSchedule:

```bash
kubectl describe nodes | grep Taints | grep -i NoSchedule | wc -l
```

3. Write N minus M into the answer file.
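The three steps collapse into one shot (the same two commands, writing the file the question asks for):

```bash
N=$(kubectl get nodes | grep -w Ready | wc -l)
M=$(kubectl describe nodes | grep Taints | grep -i NoSchedule | wc -l)
echo $((N - M)) > /opt/nodenum
```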
15. kubectl top

```bash
# From the Pod label name=cpu-utilizer, find pods running high CPU workloads
# and write the name of the Pod consuming most CPU to the file /opt/cpu.txt (which already exists)
kubectl top pods -l name=cpu-utilizer --sort-by=cpu   # the selector must match the question's label
# then write the first (highest-CPU) pod's name into /opt/cpu.txt
```
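If you would rather script it than copy the name by hand (a sketch; head -2 | tail -1 skips the header row):

```bash
kubectl top pods -l name=cpu-utilizer --sort-by=cpu | head -2 | tail -1 | awk '{print $1}' > /opt/cpu.txt
```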
16. Node NotReady

```bash
# A Kubernetes worker node, labelled with name=wk8s-node-0, is in state NotReady.
# Investigate why this is the case, and perform any appropriate steps to bring the node to a Ready state,
# ensuring that any changes are made permanent.
```

```bash
kubectl get nodes | grep NotReady   # find the broken node
ssh node                            # log on to it
systemctl status kubelet            # in this question the kubelet is simply down
systemctl start kubelet
systemctl enable kubelet            # make the change permanent
```
17. Creating a PV

```bash
# Create a persistent volume with name app-config of capacity 1Gi and access mode ReadWriteOnce.
# The type of volume is hostPath and its location is /srv/app-config
```

Reference: https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-a-persistentvolume (adjust the doc sample to the names in the question):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-config
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /srv/app-config
```
18. etcd backup

```bash
# Create a snapshot of the etcd instance running at https://127.0.0.1:2379, saving the snapshot to the
# file path /data/backup/etcd-snapshot.db
# The etcd instance is running etcd version 3.1.10
# The following TLS certificates/key are supplied for connecting to the server with etcdctl:
# CA certificate: /opt/KUCM00302/ca.crt
# Client certificate: /opt/KUCM00302/etcd-client.crt
# Client key: /opt/KUCM00302/etcd-client.key
```

General form (substitute the paths the question gives you):

```bash
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=<ca cert> --cert=<client cert> --key=<client key> \
  snapshot save <target path>
```

The backup with this question's values:

```bash
ETCDCTL_API=3 /usr/bin/etcdctl snapshot save /data/backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/opt/KUCM00302/ca.crt \
  --cert=/opt/KUCM00302/etcd-client.crt \
  --key=/opt/KUCM00302/etcd-client.key
```
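To sanity-check the snapshot afterwards (reads the local file, so no TLS flags are needed):

```bash
ETCDCTL_API=3 etcdctl snapshot status /data/backup/etcd-snapshot.db
```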
19. Node maintenance (drain, cordon, uncordon)

```bash
# Set the node labelled with name=ek8s-node-1 as unavailable and reschedule all the pods running on it.
```

Switch to the ek8s cluster first:

```bash
kubectl get nodes -l name=ek8s-node-1
kubectl drain wk8s-node-1
# Some people report the command failing without the flags below; I never hit this myself:
# --ignore-daemonsets=true --delete-local-data=true --force=true
```

The normal procedure for taking a node offline:
1. cordon: mark the node unschedulable
2. drain: evict the pods running on it
3. delete node

```bash
kubectl cordon k8s-node2
kubectl drain k8s-node2 --ignore-daemonsets --force
```
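And to bring the node back once maintenance is done (the uncordon in this question's heading):

```bash
kubectl uncordon k8s-node2
```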
20. Service DNS

```bash
# Create a deployment as follows:
# Name: nginx-dns
# Exposed via a service: nginx-dns
# Ensure that the service & pod are accessible via their respective DNS records
# The container(s) within any Pod(s) running as a part of this deployment should use the nginx image
# Next, use the utility nslookup to look up the DNS records of the service & pod and write the output
# to /opt/service.dns and /opt/pod.dns respectively.
# Ensure you use the busybox:1.28 image (or earlier) for any testing, as the latest release has an
# upstream bug which impacts the use of nslookup.
```

Step 1: create the deployment (on v1.18 `kubectl run` creates a Pod, so use `kubectl create deployment`):

```bash
kubectl create deployment nginx-dns --image=nginx
```

Step 2: expose the service:

```bash
kubectl expose deployment nginx-dns --name=nginx-dns --port=80 --type=NodePort
```

Step 3: find the pod IP:

```bash
kubectl get pods -o wide   # say the IP turns out to be 10.244.1.37
```

Step 4: test with the busybox:1.28 image:

```bash
kubectl run busybox -it --rm --image=busybox:1.28 -- sh
/ # nslookup nginx-dns     # look up the service record
/ # nslookup 10.244.1.37   # look up the pod record
```

Step 5: write what you find to the files the question asks for, /opt/service.dns and /opt/pod.dns. This question is ambiguous, so write in everything you queried, keep it complete, and let fate decide the marks.

1. For nginx-dns:

```bash
echo 'Name: nginx-dns' >> /opt/service.dns
echo 'Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local' >> /opt/service.dns
```

2. For the pod:

```bash
echo 'Name: 10.244.1.37' >> /opt/pod.dns
echo 'Address 1: 10.244.1.37 10-244-1-37.nginx-dns.default.svc.cluster.local' >> /opt/pod.dns
```
21. Mounting a Secret

```bash
# Create a Kubernetes Secret as follows:
# Name: super-secret
# Credential: alice or username:bob
# Create a Pod named pod-secrets-via-file using the redis image which mounts a secret named super-secret at /secrets
# Create a second Pod named pod-secrets-via-env using the redis image, which exports credential as TOPSECRET
```

Reference: https://kubernetes.io/zh/docs/concepts/configuration/secret/#%E8%AF%A6%E7%BB%86

```bash
echo -n "bob" | base64   # Ym9i
```

Note that the pod names, mount path, and env var name must match the question exactly:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: super-secret
type: Opaque
data:
  username: Ym9i   # echo -n "bob" | base64
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-secrets-via-file
spec:
  containers:
  - name: mypod
    image: redis
    volumeMounts:
    - name: foo
      mountPath: "/secrets"
      readOnly: true
  volumes:
  - name: foo
    secret:
      secretName: super-secret
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-secrets-via-env
spec:
  containers:
  - name: mycontainer
    image: redis
    env:
    - name: TOPSECRET
      valueFrom:
        secretKeyRef:
          name: super-secret
          key: username
  restartPolicy: Never
```
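A quick way to verify both pods once they are Running (assumes the default namespace):

```bash
kubectl exec pod-secrets-via-file -- ls /secrets
kubectl exec pod-secrets-via-env -- env | grep TOPSECRET
```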
22. Static pod (--pod-manifest-path)

```bash
# Configure the kubelet systemd-managed service, on the node labelled with name=wk8s-node-1,
# to launch a Pod containing a single container of image nginx named myservice automatically.
# Any spec files required should be placed in the /etc/kubernetes/manifests directory on the node.
```

The spec file goes in /etc/kubernetes/manifests (the question gives the pod path):

1. vi /etc/kubernetes/manifests/static-pod.yaml and define a Pod (see the sketch below)
2. systemctl status kubelet to find the kubelet.service path
3. vi /etc/systemd/system/kubernetes.service and check whether --pod-manifest-path=/etc/kubernetes/manifests is set; add it if not
4. ssh to the node and sudo -i
5. Reload and restart:

```bash
systemctl daemon-reload
systemctl restart kubelet.service
systemctl enable kubelet
```
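A minimal sketch of the static pod spec itself (the name and image come from the question):

```yaml
# /etc/kubernetes/manifests/static-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: myservice
spec:
  containers:
  - name: myservice
    image: nginx
```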
23. Cluster troubleshooting

```bash
# Determine the node, the failing service, and take actions to bring up the failed service
# and restore the health of the cluster. Ensure that any changes are made permanently.
# The worker node in this cluster is labelled with name=bk8s-node-0
```

```bash
ps -ef | grep kubelet
```

Look for the --config=/var/lib/kubelet/config.yaml argument and check whether the YAML it points to sets a static pod path. If it does not, the kubelet will not create static Pods automatically, and pod-manifest-path has no default value.

```bash
cat /var/lib/kubelet/config.yaml
```

If no static-pod path parameter is present, append staticPodPath: /etc/kubernetes/manifests at the end of the file, then run:

```bash
systemctl restart kubelet
systemctl enable kubelet
```

Check the node and the rest again, and everything should be OK.
24. Deploying a cluster with kubeadm

Requirements:

Two nodes, master1 and node1, and an admin.conf file are provided; deploy a cluster.

Steps:

1. ssh to the master host
2. Install the relevant components and enable kubelet to start on boot

Official docs (https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#%E5%AE%89%E8%A3%85-kubeadm-kubelet-%E5%92%8C-kubectl):

```bash
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
```
3. Initialize the master node

```bash
kubeadm init --config /etc/kubeadm.conf --ignore-preflight-errors=all
# both the error-ignoring flag and the config file are provided; don't change the config file. Read the question carefully!
```
4. Copy the join command for worker nodes
5. Switch back to the student host
6. ssh to the node host
7. Install the relevant components and enable kubelet to start on boot
8. Paste the join command
9. Switch back to the student host
10. ssh to the master host
11. Check that the worker node has joined
12. On the master, run the network plugin install

Official docs (https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network):

```bash
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
```

13. Check that all nodes are Ready
14. Switch back to the student host