Kubernetes (k8s) monitoring solutions

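1. cAdvisor + Heapster + InfluxDB + Grafana
cAdvisor (built into every k8s node, exposing a port) <--pulls container data-- Heapster --aggregates into--> InfluxDB <--reads data for display-- Grafana
Drawback: it can only monitor container resources, cannot cover business-level monitoring, and is hard to extend
2. cAdvisor/exporter + Prometheus + Grafana
Overall flow: data collection --> aggregation --> processing --> storage --> display
1. Pod monitoring: Prometheus scrapes container metrics from cAdvisor, which is built into the k8s kubelet ---> stored by the Prometheus process ---> displayed with Grafana
2. Node (physical host) monitoring: node_exporter collects the resource metrics of the current host ---> stored by the Prometheus process ---> displayed with Grafana
3. Master node monitoring: the kube-state-metrics add-on obtains the relevant data from the k8s apiserver ---> stored by the Prometheus process ---> displayed with Grafana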
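The three collection paths above could be expressed as Prometheus scrape jobs roughly as follows. This is a sketch, not the configuration used in these notes: it assumes Prometheus runs inside the cluster with a service account that may read nodes and use the apiserver proxy, that node_exporter listens on port 9100 on every node, and that kube-state-metrics is reachable through a Service named kube-state-metrics in kube-system on port 8080. The job names are illustrative as well.

```yaml
# Sketch only: job names, the node_exporter port, and the kube-state-metrics
# Service address are assumptions, not values taken from this cluster.
scrape_configs:
  # 1. Pod/container metrics from cAdvisor (built into the kubelet),
  #    reached through the apiserver proxy.
  - job_name: kubernetes-cadvisor
    scheme: https
    kubernetes_sd_configs:
      - role: node
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor

  # 2. Host metrics from node_exporter on each node (port 9100 assumed).
  - job_name: kubernetes-nodes
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      - source_labels: [__meta_kubernetes_node_address_InternalIP]
        target_label: __address__
        replacement: ${1}:9100

  # 3. Resource object state from kube-state-metrics (Service name assumed).
  - job_name: kube-state-metrics
    static_configs:
      - targets: ['kube-state-metrics.kube-system.svc:8080']
```

Node discovery goes through kubernetes_sd_configs, so new nodes are picked up without editing the config. Only the kube-state-metrics target is listed statically.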
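Kubernetes monitoring metrics
Cluster monitoring: node resource utilization, node count, running Pods
Pod monitoring: Kubernetes metrics, container metrics, applications
1. Monitoring Kubernetes itself
Node resource utilization - CPU, memory, disk, and connections on each node
Node count - the ratio of node count to resource utilization and business load; used to assess cost and capacity expansion
Pod count - at a given load, how many nodes and pods are running; used to estimate which load stage the cluster has reached, roughly how many servers are needed, and how much each pod consumes, for an overall assessment
Resource object status - while running, k8s creates many pods, controllers, and jobs, all of which are maintained as k8s resource objects; these objects need to be monitored to obtain their status
2. Pod monitoring
Pod count per project - the number of healthy pods and the number of problem pods
Container resource utilization - resource usage of each pod and of the containers inside it: CPU, network, memory
Applications - the application's own metrics, such as concurrency, request/response, number of users, number of orders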
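Application-level metrics are the part that cAdvisor cannot provide. One common way to collect them with Prometheus is to have each application expose its own metrics endpoint and discover it through pod annotations. The sketch below follows that pattern. The prometheus.io/scrape, prometheus.io/path, and prometheus.io/port annotations are a convention that only works because of these relabel rules, not something Kubernetes itself defines, and the job name is illustrative.

```yaml
# Sketch only: assumes application pods carry the annotations
#   prometheus.io/scrape: "true"
#   prometheus.io/path:   "/metrics"   (optional)
#   prometheus.io/port:   "8080"       (optional)
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods that opted in via the scrape annotation.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use a custom metrics path if the pod declares one.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        regex: (.+)
        target_label: __metrics_path__
      # Use a custom port if the pod declares one.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      # Carry namespace and pod name over as labels for per-project counts.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```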
Official documentation
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
Heapster+InfluxDB+Grafana

# Before deploying monitoring, the DNS service must be deployed (CoreDNS was deployed last time)
[root@master01 Monitor]# kubectl get all -n kube-system
NAME                                        READY   STATUS    RESTARTS   AGE
pod/coredns-5c5d76fdbb-lnhfq                1/1     Running   0          8d
pod/kubernetes-dashboard-587699746d-4njgl   1/1     Running   0          8d

NAME                           TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/kube-dns               ClusterIP   10.0.0.2     <none>        53/UDP,53/TCP   8d
service/kubernetes-dashboard   NodePort    10.0.0.153   <none>        443:30001/TCP   8d

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns                1/1     1            1           8d
deployment.apps/kubernetes-dashboard   1/1     1            1           8d

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-5c5d76fdbb                1         1         1       8d
replicaset.apps/kubernetes-dashboard-587699746d   1         1         1       8d
[root@master01 Monitor]# kubectl apply -f grafana.yaml influxdb.yaml kubernetes-pod-statistics_rev1.json heapster.yaml kubernetes-node-statistics_rev1.json

# Deploy Heapster
[root@master01 Monitor]# cat heapster.yaml
apiVersion: v1
kind: ServiceAccount   # gives Heapster permission to access the apiserver
metadata:
  name: heapster
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: heapster
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:   # bind the cluster role to the service account
  - kind: ServiceAccount
    name: heapster
    namespace: kube-system
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: 10.192.27.111/library/heapster-amd64:v1.4.2
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default   # kube-apiserver address
        - --sink=influxdb:http://monitoring-influxdb:8086   # InfluxDB address; must be resolvable through DNS
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster
[root@master01 Monitor]#
[root@master01 Monitor]# kubectl apply -f heapster.yaml
serviceaccount/heapster created
clusterrolebinding.rbac.authorization.k8s.io/heapster created
deployment.extensions/heapster created
service/heapster created

# Deploy InfluxDB
[root@master01 Monitor]# cat influxdb.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: 10.192.27.111/library/heapster-influxdb-amd64:v1.1.1
        volumeMounts:
        - mountPath: /data   # data could be stored persistently; here it is only temporary storage
          name: influxdb-storage
      volumes:
      - name: influxdb-storage
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  ports:
  - port: 8086
    targetPort: 8086
  selector:
    k8s-app: influxdb
[root@master01 Monitor]#
[root@master01 Monitor]# kubectl apply -f influxdb.yaml
deployment.extensions/monitoring-influxdb created
service/monitoring-influxdb created
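
The InfluxDB manifest mounts /data from an emptyDir volume, so the stored metrics are lost whenever the pod is removed from its node. If the data should survive, the volume could be backed by a PersistentVolumeClaim instead. The following is a minimal sketch: the claim name influxdb-data, the 10Gi size, and the access mode are assumptions, and it presumes the cluster has a default StorageClass or a matching PersistentVolume.

```yaml
# Hypothetical claim; name, size, and access mode are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: influxdb-data
  namespace: kube-system
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
# In influxdb.yaml, the volume definition would then reference the claim
# in place of emptyDir:
#
#   volumes:
#   - name: influxdb-storage
#     persistentVolumeClaim:
#       claimName: influxdb-data
```

The volumeMounts entry stays unchanged, because it refers to the volume by name (influxdb-storage) and not by its backing type.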

# Deploy Grafana
[root@master01 Monitor]# cat grafana.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: 10.192.27.111/library/heapster-grafana-amd64:v4.4.1
        ports:
        - containerPort: 3000
          protocol: TCP
        volumeMounts:
        - mountPath: /var
          name: grafana-storage
        env:
        - name: INFLUXDB_HOST   # environment variable holding the InfluxDB DNS name
          value: monitoring-influxdb
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          value: /
      volumes:
      - name: grafana-storage
        emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  labels:
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 3000
  selector:
    k8s-app: grafana
[root@master01 Monitor]#
[root@master01 Monitor]# kubectl apply -f grafana.yaml
deployment.extensions/monitoring-grafana created
service/monitoring-grafana created

# Check the result
[root@master01 Monitor]# kubectl get all -n kube-system
NAME                                        READY   STATUS    RESTARTS   AGE
pod/coredns-5c5d76fdbb-lnhfq                1/1     Running   0          8d
pod/heapster-6567dc64f4-44j29               1/1     Running   0          19s
pod/kubernetes-dashboard-587699746d-4njgl   1/1     Running   0          8d
pod/monitoring-grafana-6d7b7f5fd8-9xftz     1/1     Running   0          5s
pod/monitoring-influxdb-7875d7469c-8gxfn    1/1     Running   0          10s

NAME                           TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
service/heapster               ClusterIP   10.0.0.189   <none>        80/TCP          19s
service/kube-dns               ClusterIP   10.0.0.2     <none>        53/UDP,53/TCP   8d
service/kubernetes-dashboard   NodePort    10.0.0.153   <none>        443:30001/TCP   8d
service/monitoring-grafana     NodePort    10.0.0.17    <none>        80:49268/TCP    5s    # access port
service/monitoring-influxdb    ClusterIP   10.0.0.233   <none>        8086/TCP        10s

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns                1/1     1            1           8d
deployment.apps/heapster               1/1     1            1           19s
deployment.apps/kubernetes-dashboard   1/1     1            1           8d
deployment.apps/monitoring-grafana     1/1     1            1           5s
deployment.apps/monitoring-influxdb    1/1     1            1           10s

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-5c5d76fdbb                1         1         1       8d
replicaset.apps/heapster-6567dc64f4               1         1         1       19s
replicaset.apps/kubernetes-dashboard-587699746d   1         1         1       8d
replicaset.apps/monitoring-grafana-6d7b7f5fd8     1         1         1       5s
replicaset.apps/monitoring-influxdb-7875d7469c    1         1         1       10s
[root@master01 Monitor]#
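Access URL: http://10.192.27.115:49268

Import the dashboards prepared in advance

The dashboard above does not seem to show per-pod details. This is caused by its filter rule, which needs to be cleared.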