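Kubernetes resource monitoring
1. Checking cluster resource status
Master nodes in a k8s cluster generally do not run workload containers.
kubectl get cs #check the health of the master components (scheduler, controller-manager, etcd)
kubectl get node #list the nodes
kubectl cluster-info #show cluster status
kubectl describe pods [pod-name] #show the detailed state of a resource (here, a pod)
kubectl get pods -o wide #show additional information (pod IP, node)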
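As a rough sketch of how the node listing above can be turned into a quick health check, the helper below filters kubectl get node output for nodes whose STATUS does not start with Ready. The name not_ready_nodes is made up for illustration, and it assumes the default column layout (NAME in column 1, STATUS in column 2).

```shell
# not_ready_nodes: read `kubectl get node --no-headers` output on stdin and
# print the names of nodes whose STATUS column does not start with "Ready".
# A cordoned node ("Ready,SchedulingDisabled") still counts as ready here.
# (Hypothetical helper; assumes NAME is column 1 and STATUS is column 2.)
not_ready_nodes() {
    awk '$2 !~ /^Ready/ { print $1 }'
}

# Example:
#   kubectl get node --no-headers | not_ready_nodes
```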
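2. Monitoring cluster resource utilization [installing and using metrics-server]
#kubectl can show resource utilization, but the command needs a metrics backend (here it looks for "heapster"); if none is available, it reports the following error: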
[root@k8s-master1 ~]# kubectl top node k8s-node1
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
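#heapster has been deprecated; metrics are now provided by metrics-server (the aggregator) together with cAdvisor.
#cAdvisor is already built into the kubelet, so only metrics-server needs to be installed.
[Figure: metrics-server architecture diagram]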
wget https://www.chenleilei.net/soft/k8s/metrics-server.zip
#Install metrics-server from GitHub, or upload the related package instead
git clone https://github.com/kubernetes-incubator/metrics-server
cd metrics-server/
vim metrics-server-deployment.yaml
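Change lines 31-32 to:
- name: metrics-server
  image: lizhenliang/metrics-server-amd64:v0.3.1
Below line 33, "- --secure-port=4443", add the following two lines [needed when the files come from GitHub; the package downloaded from the blog already contains them]:
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP   #connect to the kubelets by node internal IP
[Note: by default metrics-server identifies nodes by hostname, so hostname resolution must be configured for metrics-server to scrape its targets correctly.]
Edit the API server manifest: vim /etc/kubernetes/manifests/kube-apiserver.yaml
Around line 20, below "- --enable-admission-plugins=NodeRestriction", add one line:
- --enable-aggregator-routing=true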
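Both edits above are easy to mistype, so a quick check before installing can help. This is a minimal sketch that assumes the file paths used in this post; has_flag is a hypothetical helper name, not part of kubectl.

```shell
# has_flag FILE FLAG: succeed if FILE contains FLAG as a literal string.
# "--" stops option parsing so that a FLAG starting with "--" is treated
# as the search pattern. (Hypothetical helper for illustration.)
has_flag() {
    grep -qF -- "$2" "$1"
}

# Examples:
#   has_flag /etc/kubernetes/manifests/kube-apiserver.yaml '--enable-aggregator-routing=true' && echo "apiserver flag present"
#   has_flag metrics-server-deployment.yaml '--kubelet-insecure-tls' && echo "metrics-server flag present"
```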
[root@k8s-master1 metrics-server]# ll
total 28
-rw-r--r-- 1 root root 397 Mar 15 21:04 aggregated-metrics-reader.yaml
-rw-r--r-- 1 root root 303 Mar 15 21:04 auth-delegator.yaml
-rw-r--r-- 1 root root 324 Mar 15 21:04 auth-reader.yaml
-rw-r--r-- 1 root root 298 Mar 15 21:04 metrics-apiservice.yaml #registers metrics-server with the k8s API
-rw-r--r-- 1 root root 1277 Mar 27 15:57 metrics-server-deployment.yaml #deploys metrics-server
-rw-r--r-- 1 root root 297 Mar 15 21:04 metrics-server-service.yaml
-rw-r--r-- 1 root root 532 Mar 15 21:04 resource-reader.yaml
Once the changes are made, install directly:
[root@k8s-master1 metrics-server]# kubectl apply -f .
Output:
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
# Restart kubelet:
systemctl restart kubelet
Check the installation:
[root@k8s-master1 metrics-server]# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7ff77c879f-kvbbh 1/1 Running 0 6d4h
coredns-7ff77c879f-lqk9q 1/1 Running 5 6d4h
etcd-k8s-master1 1/1 Running 2 8d
kube-apiserver-k8s-master1 1/1 Running 2 8d
kube-controller-manager-k8s-master1 1/1 Running 2 8d
kube-flannel-ds-amd64-8gssm 1/1 Running 5 8d
kube-flannel-ds-amd64-gtpwc 1/1 Running 2 8d
kube-flannel-ds-amd64-mx4jx 1/1 Running 4 8d
kube-proxy-kzwft 1/1 Running 5 8d
kube-proxy-rgjmf 1/1 Running 2 8d
kube-proxy-vhdpp 1/1 Running 4 8d
kube-scheduler-k8s-master1 1/1 Running 2 8d
metrics-server-5667498b7d-lmbtr 1/1 Running 0 56s <------ metrics-server is installed and running.
Check the APIService registration:
[root@k8s-master1 metrics-server]# kubectl get apiservice
Look for this line:
v1beta1.metrics.k8s.io kube-system/metrics-server True 17s #started successfully
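Instead of scanning the whole list by eye, the Available condition can be read directly. A sketch, assuming the APIService name v1beta1.metrics.k8s.io shown above; metrics_api_status is a hypothetical helper name.

```shell
# metrics_api_status: print the status ("True" or "False") of the Available
# condition on the metrics APIService. (Hypothetical helper name.)
metrics_api_status() {
    kubectl get apiservice v1beta1.metrics.k8s.io \
        -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Example:
#   [ "$(metrics_api_status)" = "True" ] && echo "metrics API is available"
```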
Verify that kubectl top now works:
[root@k8s-master1 metrics-server]# kubectl top pods
NAME CPU(cores) MEMORY(bytes)
nginx-f89759699-6jfdp 0m 2Mi
#This command failed at the beginning; it now works:
[root@k8s-master1 metrics-server]# kubectl top node k8s-node1
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY% #<---- success
k8s-node1 42m 2% 383Mi 13%
[root@k8s-master1 metrics-server]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master1 94m 4% 809Mi 27% #<---- success
k8s-node1 48m 2% 385Mi 13%
k8s-node2 26m 1% 386Mi 13%
#Sort by resource usage
[root@k8s-master1 metrics-server]# kubectl top pods -l app=nginx --sort-by=memory
NAME CPU(cores) MEMORY(bytes)
nginx-f89759699-6jfdp 0m 2Mi
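Sorting shows the heaviest consumers first; to pick out only the nodes above a limit, the kubectl top node output can be filtered. A rough sketch, assuming the five-column layout shown above (MEMORY% in column 5); high_mem_nodes and the threshold value are hypothetical.

```shell
# high_mem_nodes THRESHOLD: read `kubectl top node` output on stdin and print
# the name and MEMORY% of every node at or above THRESHOLD (integer percent).
# The header line is skipped; non-numeric values count as 0.
# (Hypothetical helper; assumes MEMORY% is column 5.)
high_mem_nodes() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit + 0) print $1, $5
    }'
}

# Example:
#   kubectl top node | high_mem_nodes 20
```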
3. Troubleshooting
[root@k8s-master1 ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
If kubectl get cs fails, the apiserver may have a problem.
#When the apiserver is healthy, its status can be checked with this command:
[root@k8s-master1 ~]# kubectl cluster-info
Kubernetes master is running at https://10.0.0.63:6443
KubeDNS is running at https://10.0.0.63:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
#Use the dump output to investigate cluster problems
#Write the state information to a file:
kubectl cluster-info dump >a.txt
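A single a.txt can get large. As an alternative sketch, kubectl cluster-info dump also accepts --output-directory, which writes the dump as a set of files that can then be searched. The function name and the directory are placeholders chosen for illustration.

```shell
# dump_cluster_state DIR: write the cluster dump into DIR as separate files,
# then list the files that mention "error" (case-insensitive).
# Finding no matches is not treated as a failure.
# (Hypothetical helper; DIR is any writable path you choose.)
dump_cluster_state() {
    mkdir -p "$1" || return 1
    kubectl cluster-info dump --output-directory="$1" >/dev/null || return 1
    grep -rli 'error' "$1" || true
}

# Example:
#   dump_cluster_state /tmp/cluster-dump
```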
#kubectl describe pod [pod-name]
#Watch pods in real time
kubectl get pods -w   #pod deletions and creations are all printed here, showing the whole process