Building a Heapster + InfluxDB + Grafana Cluster Performance Monitoring Platform
In a large container cluster, every Node and every container needs to be monitored. Kubernetes recommends a set of tools for collecting, storing, and displaying cluster performance data: Heapster, InfluxDB, and Grafana.
Heapster: collects and aggregates the cAdvisor data from every Node in the cluster. It calls the kubelet API on each Node, and the kubelet in turn calls the cAdvisor API to gather the performance data of all containers on that node. Heapster aggregates this performance data and saves the results to a back-end storage system.
InfluxDB: a distributed time-series database (every record carries a timestamp), used mainly for real-time data collection, event tracking, storage of time-series charts, raw data, and so on. InfluxDB provides a REST API for storing and querying data.
Grafana: renders the time-series data stored in InfluxDB as dashboard charts and curves, making it easy for operators to see the running state of the cluster.
The overall architecture of the Heapster + InfluxDB + Grafana cluster monitoring system is shown in the figure below.
Heapster, InfluxDB, and Grafana all start and run as Pods. The sketch after this paragraph illustrates the data path between them.
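To make the data path concrete, here is a minimal sketch using the ports that appear later in this article (cAdvisor on 4194, InfluxDB's HTTP API on 8086, Grafana on 3000 inside its Pod). Heapster performs the kubelet/cAdvisor scraping itself once it is deployed; the curl call and the <node-ip> placeholder are only illustrative.

# Per Node: cAdvisor (embedded in the kubelet) exposes container stats,
# which Heapster pulls through the kubelet on every Node. Illustration only:
curl -s http://<node-ip>:4194/api/v1.3/containers/ | head

# Heapster -> InfluxDB: aggregated samples are written to InfluxDB's REST API
# (the "api" port 8086 of the monitoring-influxdb Service defined later).
# Grafana -> InfluxDB: dashboards query that same API; Grafana itself listens
# on port 3000 and is exposed later through the monitoring-grafana Service on port 80.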
1. Upload the required images to the private registry
[root@kub_master ~]# docker load -i docker_heapster.tar.gz
c12ecfd4861d: Loading layer [==================================================>] 130.9 MB/130.9 MB
5f70bf18a086: Loading layer [==================================================>] 1.024 kB/1.024 kB
998608e2fcd4: Loading layer [==================================================>] 45.16 MB/45.16 MB
591569fa6c34: Loading layer [==================================================>] 126.5 MB/126.5 MB
0b2fe2c6ef6b: Loading layer [==================================================>] 136.2 MB/136.2 MB
f9f3fb66a490: Loading layer [==================================================>] 322.9 MB/322.9 MB
6e2e798f8998: Loading layer [==================================================>] 2.56 kB/2.56 kB
21ac53bc7cd6: Loading layer [==================================================>] 5.632 kB/5.632 kB
7f96c89af577: Loading layer [==================================================>] 79.98 MB/79.98 MB
4371d588893a: Loading layer [==================================================>] 150.2 MB/150.2 MB
Loaded image: docker.io/kubernetes/heapster:canary
[root@kub_master ~]# docker load -i docker_heapster_influxdb.tar.gz
8ceab61e5aa8: Loading layer [==================================================>] 197.2 MB/197.2 MB
3c84ae1bbde2: Loading layer [==================================================>] 208.9 kB/208.9 kB
e8061ac24ae3: Loading layer [==================================================>] 4.608 kB/4.608 kB
5f70bf18a086: Loading layer [==================================================>] 1.024 kB/1.024 kB
58484cf9c5e7: Loading layer [==================================================>] 63.49 MB/63.49 MB
07d2297acddc: Loading layer [==================================================>] 4.608 kB/4.608 kB
Loaded image: docker.io/kubernetes/heapster_influxdb:v0.5
[root@kub_master ~]# docker images |grep heapster_influxdb
docker.io/kubernetes/heapster_influxdb   v0.5     a47993810aac   5 years ago   251 MB
[root@kub_master ~]# docker load -i docker_heapster_grafana.tar.gz
c69ae1aa4698: Loading layer [==================================================>] 131 MB/131 MB
5f70bf18a086: Loading layer [==================================================>] 1.024 kB/1.024 kB
75a5b97e491c: Loading layer [==================================================>] 127.1 MB/127.1 MB
e188e2340071: Loading layer [==================================================>] 16.92 MB/16.92 MB
5e43af080be6: Loading layer [==================================================>] 55.81 kB/55.81 kB
7ecd917a174c: Loading layer [==================================================>] 4.096 kB/4.096 kB
Loaded image: docker.io/kubernetes/heapster_grafana:v2.6.0
[root@kub_master ~]# docker images |grep heapster_grafana
docker.io/kubernetes/heapster_grafana    v2.6.0   4fe73eb13e50   4 years ago   267 MB
# Push the three images above to the private registry
[root@kub_master ~]# docker tag docker.io/kubernetes/heapster:canary 192.168.0.212:5000/heapster:canary
[root@kub_master ~]# docker push 192.168.0.212:5000/heapster:canary
The push refers to a repository [192.168.0.212:5000/heapster]
5f70bf18a086: Mounted from kubernetes-dashboard-amd64
4371d588893a: Pushed
7f96c89af577: Pushed
21ac53bc7cd6: Pushed
6e2e798f8998: Pushed
f9f3fb66a490: Pushed
0b2fe2c6ef6b: Pushed
591569fa6c34: Pushed
998608e2fcd4: Pushed
c12ecfd4861d: Pushed
canary: digest: sha256:a024ec78a53b4b9b0bd13b12a6aec29913359e17d30c8d0c35d0a510d10a9d75 size: 4276
[root@kub_master ~]# docker tag docker.io/kubernetes/heapster_grafana:v2.6.0 192.168.0.212:5000/heapster_grafana:v2.6.0
[root@kub_master ~]# docker push 192.168.0.212:5000/heapster_grafana:v2.6.0
The push refers to a repository [192.168.0.212:5000/heapster_grafana]
5f70bf18a086: Mounted from heapster
7ecd917a174c: Pushed
5e43af080be6: Pushed
e188e2340071: Pushed
75a5b97e491c: Pushed
c69ae1aa4698: Pushed
v2.6.0: digest: sha256:1299d1ebb518416f90895a29fc58ad87149af610c8d5d9376c3379d65b6d9568 size: 2811
[root@kub_master ~]# docker tag docker.io/kubernetes/heapster_influxdb:v0.5 192.168.0.212:5000/heapster_influxdb:v0.5
[root@kub_master ~]# docker push 192.168.0.212:5000/heapster_influxdb:v0.5
The push refers to a repository [192.168.0.212:5000/heapster_influxdb]
5f70bf18a086: Mounted from heapster_grafana
07d2297acddc: Pushed
58484cf9c5e7: Pushed
e8061ac24ae3: Pushed
3c84ae1bbde2: Pushed
8ceab61e5aa8: Pushed
v0.5: digest: sha256:32cdba763a3fab92deeb8074b47ace45ee4a15d537e318ceae89f1e2dd069b53 size: 3012
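To confirm that all three images landed in the private registry, you can query the registry's catalog API. This is only a quick check, assuming 192.168.0.212:5000 is a standard Docker Registry v2 reachable over plain HTTP (as the insecure-registry address suggests):

# List all repositories in the private registry:
curl -s http://192.168.0.212:5000/v2/_catalog

# List the tags pushed for the heapster repository:
curl -s http://192.168.0.212:5000/v2/heapster/tags/list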
2. Download the Heapster manifest package
[root@kub_master ~]# cd k8s/
[root@kub_master k8s]# mkdir heapster
[root@kub_master k8s]# cd heapster/
[root@kub_master heapster]# wget https://www.qstack.com.cn/heapster-influxdb.zip
--2020-09-28 20:29:42--  https://www.qstack.com.cn/heapster-influxdb.zip
Resolving www.qstack.com.cn (www.qstack.com.cn)... 180.96.32.89
Connecting to www.qstack.com.cn (www.qstack.com.cn)|180.96.32.89|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2636 (2.6K) [application/zip]
Saving to: ‘heapster-influxdb.zip’

100%[===============================================================================================>] 2,636       --.-K/s   in 0s

2020-09-28 20:29:43 (40.6 MB/s) - ‘heapster-influxdb.zip’ saved [2636/2636]

[root@kub_master heapster]# unzip heapster-influxdb.zip
Archive:  heapster-influxdb.zip
   creating: heapster-influxdb/
  inflating: heapster-influxdb/grafana-service.yaml
  inflating: heapster-influxdb/heapster-controller.yaml
  inflating: heapster-influxdb/heapster-service.yaml
  inflating: heapster-influxdb/influxdb-grafana-controller.yaml
  inflating: heapster-influxdb/influxdb-service.yaml
[root@kub_master heapster]# cd heapster-influxdb
[root@kub_master heapster-influxdb]# ll
total 20
-rw-r--r-- 1 root root  414 Sep 14  2016 grafana-service.yaml
-rw-r--r-- 1 root root  682 Jul  1  2019 heapster-controller.yaml
-rw-r--r-- 1 root root  249 Sep 14  2016 heapster-service.yaml
-rw-r--r-- 1 root root 1605 Jul  1  2019 influxdb-grafana-controller.yaml
-rw-r--r-- 1 root root  259 Sep 14  2016 influxdb-service.yaml
3. Deploy the Heapster container
[root@kub_master heapster-influxdb]# vim heapster-controller.yaml
[root@kub_master heapster-influxdb]# cat heapster-controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    k8s-app: heapster
    name: heapster
    version: v6
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  selector:
    k8s-app: heapster
    version: v6
  template:
    metadata:
      labels:
        k8s-app: heapster
        version: v6
    spec:
      nodeSelector:
        kubernetes.io/hostname: 192.168.0.212
      containers:
      - name: heapster
        image: 192.168.0.212:5000/heapster:canary
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:http://192.168.0.212:8080?inClusterConfig=false  # data source: the master (API server) URL
        - --sink=influxdb:http://monitoring-influxdb:8086                      # back-end storage: the InfluxDB database
[root@kub_master heapster-influxdb]# vim heapster-service.yaml
[root@kub_master heapster-influxdb]# cat heapster-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster
[root@kub_master heapster-influxdb]# kubectl create -f heapster-controller.yaml
replicationcontroller "heapster" created
[root@kub_master heapster-influxdb]# kubectl create -f heapster-service.yaml
service "heapster" created
[root@kub_master heapster-influxdb]# kubectl get all --namespace=kube-system
NAME                                  DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/kube-dns                       1         1         1            1           2d
deploy/kubernetes-dashboard-latest    1         1         1            1           1d

NAME           DESIRED   CURRENT   READY   AGE
rc/heapster    1         1         1       26s

NAME                        CLUSTER-IP        EXTERNAL-IP   PORT(S)         AGE
svc/heapster                192.168.211.92    <none>        80/TCP          11s
svc/kube-dns                192.168.230.254   <none>        53/UDP,53/TCP   2d
svc/kubernetes-dashboard    192.168.108.11    <none>        80/TCP          1d

NAME                                         DESIRED   CURRENT   READY   AGE
rs/kube-dns-4072910292                       1         1         1       2d
rs/kubernetes-dashboard-latest-3255858758    1         1         1       1d

NAME                                              READY   STATUS    RESTARTS   AGE
po/heapster-jhr57                                 1/1     Running   0          26s
po/kube-dns-4072910292-4qb6c                      4/4     Running   0          2d
po/kubernetes-dashboard-latest-3255858758-fqsbc   1/1     Running   0          1d
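Before wiring up the storage back end, you can verify that Heapster is already scraping the kubelets. A sketch, assuming Heapster's model API is enabled (it is by default) and that the API server's insecure port 8080 proxies to cluster Services as used throughout this article:

# Ask Heapster, through the API server proxy, which nodes it has discovered:
curl -s http://192.168.0.212:8080/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/nodes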
4. Deploy InfluxDB and Grafana
# RC definition for InfluxDB and Grafana:
[root@kub_master heapster-influxdb]# vim influxdb-grafana-controller.yaml
[root@kub_master heapster-influxdb]# cat influxdb-grafana-controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    name: influxGrafana
  name: influxdb-grafana
  namespace: kube-system
spec:
  replicas: 1
  selector:
    name: influxGrafana
  template:
    metadata:
      labels:
        name: influxGrafana
    spec:
      containers:
      - name: influxdb
        image: 192.168.0.212:5000/heapster_influxdb:v0.5
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
      - name: grafana
        imagePullPolicy: IfNotPresent
        image: 192.168.0.212:5000/heapster_grafana:v2.6.0
        env:
        - name: INFLUXDB_SERVICE_URL
          value: http://monitoring-influxdb:8086
        # The following env variables are required to make Grafana accessible via
        # the kubernetes api-server proxy. On production clusters, we recommend
        # removing these env variables, setup auth for grafana, and expose the grafana
        # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
        volumeMounts:
        - mountPath: /var
          name: grafana-storage
      nodeSelector:
        kubernetes.io/hostname: 192.168.0.212
      volumes:
      - name: influxdb-storage
        emptyDir: {}
      - name: grafana-storage
        emptyDir: {}
# InfluxDB Service definition
[root@kub_master heapster-influxdb]# vim influxdb-service.yaml
[root@kub_master heapster-influxdb]# cat influxdb-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels: null
  name: monitoring-influxdb
  namespace: kube-system
spec:
  ports:
  - name: http
    port: 8083
    targetPort: 8083
  - name: api
    port: 8086
    targetPort: 8086
  selector:
    name: influxGrafana
# Grafana Service definition
[root@kub_master heapster-influxdb]# vim grafana-service.yaml
[root@kub_master heapster-influxdb]# cat grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  ports:
  - port: 80
    targetPort: 3000
  selector:
    name: influxGrafana
[root@kub_master heapster-influxdb]# kubectl create -f influxdb-grafana-controller.yaml
replicationcontroller "influxdb-grafana" created
[root@kub_master heapster-influxdb]# kubectl get all --namespace=kube-system
NAME                                  DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/kube-dns                       1         1         1            1           2d
deploy/kubernetes-dashboard-latest    1         1         1            1           1d

NAME                   DESIRED   CURRENT   READY   AGE
rc/heapster            1         1         1       4m
rc/influxdb-grafana    1         1         1       5s

NAME                        CLUSTER-IP        EXTERNAL-IP   PORT(S)         AGE
svc/heapster                192.168.211.92    <none>        80/TCP          4m
svc/kube-dns                192.168.230.254   <none>        53/UDP,53/TCP   2d
svc/kubernetes-dashboard    192.168.108.11    <none>        80/TCP          1d

NAME                                         DESIRED   CURRENT   READY   AGE
rs/kube-dns-4072910292                       1         1         1       2d
rs/kubernetes-dashboard-latest-3255858758    1         1         1       1d

NAME                                              READY   STATUS    RESTARTS   AGE
po/heapster-jhr57                                 1/1     Running   0          4m
po/influxdb-grafana-thgp6                         2/2     Running   0          5s
po/kube-dns-4072910292-4qb6c                      4/4     Running   0          2d
po/kubernetes-dashboard-latest-3255858758-fqsbc   1/1     Running   0          1d

[root@kub_master heapster-influxdb]# kubectl create -f influxdb-service.yaml
service "monitoring-influxdb" created
[root@kub_master heapster-influxdb]# kubectl create -f grafana-service.yaml
service "monitoring-grafana" created
[root@kub_master heapster-influxdb]# kubectl get all --namespace=kube-system
NAME                                  DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/kube-dns                       1         1         1            1           2d
deploy/kubernetes-dashboard-latest    1         1         1            1           1d

NAME                   DESIRED   CURRENT   READY   AGE
rc/heapster            1         1         1       5m
rc/influxdb-grafana    1         1         1       44s

NAME                        CLUSTER-IP        EXTERNAL-IP   PORT(S)             AGE
svc/heapster                192.168.211.92    <none>        80/TCP              5m
svc/kube-dns                192.168.230.254   <none>        53/UDP,53/TCP       2d
svc/kubernetes-dashboard    192.168.108.11    <none>        80/TCP              1d
svc/monitoring-grafana      192.168.15.46     <none>        80/TCP              3s
svc/monitoring-influxdb     192.168.188.130   <none>        8083/TCP,8086/TCP   10s

NAME                                         DESIRED   CURRENT   READY   AGE
rs/kube-dns-4072910292                       1         1         1       2d
rs/kubernetes-dashboard-latest-3255858758    1         1         1       1d

NAME                                              READY   STATUS    RESTARTS   AGE
po/heapster-jhr57                                 1/1     Running   0          5m
po/influxdb-grafana-thgp6                         2/2     Running   0          44s
po/kube-dns-4072910292-4qb6c                      4/4     Running   0          2d
po/kubernetes-dashboard-latest-3255858758-fqsbc   1/1     Running   0          1d
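As a final check that the two Services actually select the influxdb-grafana Pod (so Heapster can reach monitoring-influxdb by name), list their endpoints:

# Both Services should show the Pod IP of influxdb-grafana:
# monitoring-influxdb with ports 8083/8086 and monitoring-grafana with port 3000.
kubectl get endpoints monitoring-influxdb monitoring-grafana --namespace=kube-system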
5. Access the dashboard
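With GF_SERVER_ROOT_URL set to the API server proxy path in the RC above, Grafana can be opened through the master. A sketch, assuming the insecure API server port 8080 used throughout this article; once Heapster is feeding data, the Kubernetes Dashboard's CPU/memory graphs should also start populating.

# Grafana, proxied through the API server (open this URL in a browser, or probe it with curl -I):
http://192.168.0.212:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/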
6. View container status through the cAdvisor web page
In Kubernetes, cAdvisor is integrated into the kubelet component by default. When the kubelet service starts, it automatically starts cAdvisor, which then collects, in real time, the performance metrics of the node it runs on and of the containers running on that node. The kubelet startup parameter --cadvisor-port customizes the port on which cAdvisor serves; the default is 4194.
cAdvisor provides a web page that can be opened in a browser. For example, to monitor the performance metrics of node2:
[root@kub_node2 ~]# vim /etc/kubernetes/kubelet
[root@kub_node2 ~]# cat /etc/kubernetes/kubelet
###
# kubernetes kubelet (minion) config

# The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=0.0.0.0"

# The port for the info server to serve on
KUBELET_PORT="--port=10250"

# You may leave this blank to use the actual hostname
KUBELET_HOSTNAME="--hostname-override=192.168.0.208"

# location of the api-server
KUBELET_API_SERVER="--api-servers=http://192.168.0.212:8080"

# pod infrastructure container
KUBELET_POD_INFRA_CONTAINER="--pod-infra-container-image=192.168.0.212:5000/pod-infrastructure:latest"

# Add your own!
KUBELET_ARGS="--cluster_dns=192.168.230.254 --cluster_domain=cluster.local --cadvisor-port=4194"
[root@kub_node2 ~]# systemctl restart kubelet
Enter http://192.168.0.208:4194 (node2's IP plus the cAdvisor port) in a browser to open the cAdvisor monitoring page.
cAdvisor's home page shows the host's real-time status, including CPU usage, memory usage, network throughput, and file system usage.
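Besides the web UI, cAdvisor exposes a small REST API on the same port. A quick sketch for pulling the raw metrics from node2, assuming the v1.3 API that ships with this generation of kubelet/cAdvisor:

# Machine-level information (CPU cores, memory, filesystems) for node2:
curl -s http://192.168.0.208:4194/api/v1.3/machine

# Recent resource-usage samples for the root container and its children:
curl -s http://192.168.0.208:4194/api/v1.3/containers/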
Container performance data is very useful for cluster monitoring, and administrators can analyze and alert on the data cAdvisor provides. However, cAdvisor runs on each individual Node and can only collect the performance metrics of that host, which is why Heapster is needed to aggregate the data cluster-wide.