step1:
在官網下載部署文件
https://github.com/kubernetes-retired/heapster/tree/master/deploy/kube-config/influxdb
如果只部署heapster,只需要 kubectl apply -f heapster.yaml
出現如下錯誤:
E0312 09:39:32.275587 1 reflector.go:190] k8s.io/heapster/metrics/util/util.go:30: Failed to list *v1.Node: nodes is forbidden: User "system:serviceaccount:kube-system:heapster" cannot list resource "nodes" in API group "" at the cluster scope E0312 09:39:32.276426 1 reflector.go:190] k8s.io/heapster/metrics/util/util.go:30: Failed to list *v1.Node: nodes is forbidden: User "system:serviceaccount:kube-system:heapster" cannot list resource "nodes" in API group "" at the cluster scope I0312 09:39:32.282892 1 heapster.go:112] Starting heapster on port 8082 E0312 09:39:33.264051 1 reflector.go:190] k8s.io/heapster/metrics/util/util.go:30: Failed to list *v1.Node: nodes is forbidden: User "system:serviceaccount:kube-system:heapster" cannot list resource "nodes" in API group "" at the cluster scope E0312 09:39:33.281099 1 reflector.go:190] k8s.io/heapster/metrics/heapster.go:328: Failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:kube-system:heapster" cannot list resource "pods" in API group "" at the cluster scope E0312 09:39:33.281196 1 reflector.go:190] k8s.io/heapster/metrics/processors/namespace_based_enricher.go:89: Failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:kube-system:heapster" cannot list resource "namespaces" in API group "" at the cluster scope
這里的錯誤顯示是沒有權限。所以在倉庫里面找到文件 https://github.com/kubernetes-retired/heapster/tree/master/deploy/kube-config/rbac
創建 serviceaccount 權限, kubectl apply -f heapster-rbac.yaml
然后發現還是有如上錯誤。所以創建一個最高權限的賬戶,並修改heapster.yaml文件中的serviceaccount
step2:
創建 admin 賬戶
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: admin
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: admin
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
先撤掉原來部署的 kubectl delete -f heapster.yaml
修改:serviceAccountName: heapster -> serviceAccountName: admin
再執行: kubectl appy -f heapster.yaml
使用命令查看狀態,顯示失敗。
kubectl top node error: metrics not available yet
再次查看日志
kubectl -n kube-system logs heapster-xxx -f
E0312 09:51:05.007570 1 manager.go:101] Error in scraping containers from kubelet:10.13.145.21:10255: failed to get all container stats from Kubelet URL "http://10.13.145.21:10255/stats/container/": Post http://10.13.145.21:10255/stats/container/: dial tcp 10.13.145.21:10255: getsockopt: connection refused E0312 09:51:05.015293 1 manager.go:101] Error in scraping containers from kubelet:10.13.89.52:10255: failed to get all container stats from Kubelet URL "http://10.13.89.52:10255/stats/container/": Post http://10.13.89.52:10255/stats/container/: dial tcp 10.13.89.52:10255: getsockopt: connection refused E0312 09:51:05.023599 1 manager.go:101] Error in scraping containers from kubelet:10.13.89.53:10255: failed to get all container stats from Kubelet URL "http://10.13.89.53:10255/stats/container/": Post http://10.13.89.53:10255/stats/container/: dial tcp 10.13.89.53:10255: getsockopt: connection refused E0312 09:51:05.029772 1 manager.go:101] Error in scraping containers from kubelet:10.13.89.51:10255: failed to get all container stats from Kubelet URL "http://10.13.89.51:10255/stats/container/": Post http://10.13.89.51:10255/stats/container/: dial tcp 10.13.89.51:10255: getsockopt: connection refused W0312 09:51:25.001639 1 manager.go:152] Failed to get all responses in time (got 0/4)
還是出現錯誤。在查詢之后發現訪問鏈接有問題,解決方案如下
# heapster.yaml文件中的 - --source=kubernetes:https://kubernetes.default # 修改為 - --source=kubernetes:kubernetes:https://kubernetes.default?useServiceAccount=true&kubeletHttps=true&kubeletPort=10250&insecure=true
還是先 kubectl delete -f heapster.yaml
然后再創建 kubectl apply -f heapster.yaml
接着查看日志:
I0312 09:55:05.272254 1 influxdb.go:274] Created database "k8s" on influxDB server at "monitoring-influxdb.kube-system.svc:8086"
然后使用下面命令查看是否成功獲取,顯示結果表示獲取成功
root@n1:~# kubectl top node NAME CPU(cores) CPU% MEMORY(bytes) MEMORY% master 347m 8% 2253Mi 58% n1 34m 0% 909Mi 5% n2 28m 0% 917Mi 5% n3 28m 0% 870Mi 5%
