K8s From Beginner to Giving Up Series - (16) Deploying Prometheus Operator Monitoring on a Kubernetes Cluster


Prometheus Operator is not the same thing as Prometheus itself: it is a controller, open-sourced by CoreOS, for managing Prometheus on Kubernetes clusters. It simplifies deploying, managing, and running Prometheus and Alertmanager clusters on Kubernetes.

The official architecture diagram:

Kubernetes also publishes recommendations on monitoring with Prometheus in its official GitHub repository:

Address: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/prometheus

  

Component overview:

  Operator: The Operator deploys and manages the Prometheus Server based on custom resources (Custom Resource Definitions, CRDs), and watches for change events on those custom resources and reacts to them; it is the control center of the whole system.
  Prometheus: The Prometheus resource declaratively describes the desired state of a Prometheus deployment.
  Prometheus Server: The Prometheus Server cluster that the Operator deploys according to what the Prometheus custom resource defines; these custom resources can be regarded as the StatefulSets that manage the Prometheus Server cluster.
  ServiceMonitor: ServiceMonitor is also a custom resource; it describes the list of targets that Prometheus monitors. It selects the corresponding Service endpoints via labels, so that Prometheus Server scrapes metrics through the selected Services (a minimal example follows this list).
  Service: The Service resource maps to the metrics-serving Pods in the Kubernetes cluster and is what the ServiceMonitor selects so that Prometheus Server can scrape it. Put simply, it is the object Prometheus monitors, e.g. the Node Exporter Service, MySQL Exporter Service, and so on covered earlier.
Alertmanager: Alertmanager is likewise a custom resource type; the Operator deploys an Alertmanager cluster according to the resource definition.
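
As a sketch of how the label selection works, here is a minimal ServiceMonitor that scrapes any Service labeled k8s-app: node-exporter in the monitoring namespace (the label, port name, and interval are illustrative; the manifests shipped with kube-prometheus differ in detail):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter    ## Services carrying this label are scraped
  namespaceSelector:
    matchNames:
    - monitoring                ## only look for Services in this namespace
  endpoints:
  - port: https                 ## the named port on the Service to scrape
    interval: 30s               ## scrape interval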

1. Download the configuration files

Official repository: https://github.com/coreos/kube-prometheus

Since the whole project is not large, I clone the entire repository here. Alternatively, you can download the files individually: fetch everything under https://github.com/coreos/kube-prometheus/tree/master/manifests to your local machine.

[root@k8s-master01 k8s]# git clone https://github.com/coreos/kube-prometheus.git
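
Note that the master branch tracks the newest Kubernetes release; if your cluster runs an older version, consider checking out a matching release branch instead (the branch name below is only an example; check the compatibility matrix in the repository README):

[root@k8s-master01 k8s]# git clone -b release-0.1 https://github.com/coreos/kube-prometheus.git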

2. Deployment

2.1 Organize the manifest YAML files

## The official repo keeps all the resource manifests in a single directory; for convenience, we sort the manifests of each component into its own subdirectory
[root@k8s-master01 ~]# cd /opt/k8s/kube-prometheus/manifests
[root@k8s-master01 manifests]# mkdir serviceMonitor operator grafana kube-state-metrics alertmanager node-exporter adapter prometheus
[root@k8s-master01 manifests]# mv *-serviceMonitor* serviceMonitor/
[root@k8s-master01 manifests]# mv 0prometheus-operator* operator/
[root@k8s-master01 manifests]# mv grafana-* grafana/
[root@k8s-master01 manifests]# mv kube-state-metrics-* kube-state-metrics/
[root@k8s-master01 manifests]# mv alertmanager-* alertmanager/
[root@k8s-master01 manifests]# mv node-exporter-* node-exporter/
[root@k8s-master01 manifests]# mv prometheus-adapter-* adapter/
[root@k8s-master01 manifests]# mv prometheus-* prometheus/

2.2 Deploy the operator

## First, create a dedicated namespace for Prometheus monitoring
[root@k8s-master01 manifests]# kubectl apply -f 00namespace-namespace.yaml
## Deploy the operator
[root@k8s-master01 manifests]# kubectl apply -f operator/
## Check the pod status; the manifests pull images from quay.io, so no extra steps (such as swapping registries) are needed
[root@k8s-master01 manifests]# kubectl get pods -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-69bd579bf9-mjsxz   1/1     Running   0          20s
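
Once the operator pod is Running, you can verify that it registered its custom resource definitions (a quick sanity check; the exact list depends on the operator version):

[root@k8s-master01 manifests]# kubectl get crd -o name | grep monitoring.coreos.com
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com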

2.3 Deploy metrics

Before deploying the metrics components, first confirm that the cluster's kube-apiserver has the aggregation layer enabled (so that third-party APIs can be registered with the cluster) and that the parameters of the other components are correct; otherwise no data can be collected. For details, see the earlier article in this series: K8s From Beginner to Giving Up Series - (13) Deploying metrics-server on a Kubernetes Cluster.
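
For reference, the aggregation layer is enabled by starting kube-apiserver with flags along these lines (a sketch; the certificate paths and file names must match your own PKI layout):

--requestheader-client-ca-file=/etc/kubernetes/cert/ca.pem
--requestheader-allowed-names=aggregator
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--proxy-client-cert-file=/etc/kubernetes/cert/proxy-client.pem
--proxy-client-key-file=/etc/kubernetes/cert/proxy-client-key.pem
--enable-aggregator-routing=true   ## needed when kube-proxy does not run on the master nodes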

## The addon-resizer image is pulled from Google's registry; we switch it to the Aliyun mirror
[root@k8s-master01 manifests]# vim kube-state-metrics/kube-state-metrics-deployment.yaml
    #image: k8s.gcr.io/addon-resizer:1.8.4 ## original setting, commented out and replaced with the address below
       image: registry.aliyuncs.com/google_containers/addon-resizer:1.8.4
[root@k8s-master01 kube-state-metrics]# kubectl apply -f kube-state-metrics/

2.4 Deploy the remaining components

The remaining components are deployed the same way; all of their images can be downloaded normally without a proxy. Download speed depends on your local network conditions.

[root@k8s-master01 kube-state-metrics]# kubectl apply -f adapter/
[root@k8s-master01 kube-state-metrics]# kubectl apply -f alertmanager/
[root@k8s-master01 kube-state-metrics]# kubectl apply -f node-exporter/
[root@k8s-master01 kube-state-metrics]# kubectl apply -f grafana/
[root@k8s-master01 kube-state-metrics]# kubectl apply -f prometheus/
[root@k8s-master01 kube-state-metrics]# kubectl apply -f serviceMonitor/

## After deployment completes, review the status of each resource
[root@k8s-master01 manifests]# kubectl get all -n monitoring
NAME                                       READY   STATUS    RESTARTS   AGE
pod/grafana-558647b59-bhqmq                1/1     Running   0          94m
pod/kube-state-metrics-79d4446fb5-5mj7d    4/4     Running   0          98m
pod/node-exporter-4xq5t                    2/2     Running   0          111m
pod/node-exporter-9b88m                    2/2     Running   0          111m
pod/node-exporter-fdntx                    2/2     Running   0          111m
pod/node-exporter-mwbxj                    2/2     Running   0          111m
pod/node-exporter-tn7tl                    2/2     Running   0          111m
pod/prometheus-adapter-57c497c557-vbgxd    1/1     Running   0          144m
pod/prometheus-k8s-0                       3/3     Running   0          96m
pod/prometheus-k8s-1                       3/3     Running   0          96m
pod/prometheus-operator-69bd579bf9-mjsxz   1/1     Running   0          155m
NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/grafana               ClusterIP   10.254.42.208   <none>        3000/TCP            94m
service/kube-state-metrics    ClusterIP   None            <none>        8443/TCP,9443/TCP   104m
service/node-exporter         ClusterIP   None            <none>        9100/TCP            111m
service/prometheus-adapter    ClusterIP   10.254.107.95   <none>        443/TCP             144m
service/prometheus-k8s        ClusterIP   10.254.82.246   <none>        9090/TCP            96m
service/prometheus-operated   ClusterIP   None            <none>        9090/TCP            96m
service/prometheus-operator   ClusterIP   None            <none>        8080/TCP            156m
NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                 AGE
daemonset.apps/node-exporter   5         5         5       5            5           beta.kubernetes.io/os=linux   111m
NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/grafana               1/1     1            1           94m
deployment.apps/kube-state-metrics    1/1     1            1           98m
deployment.apps/prometheus-adapter    1/1     1            1           144m
deployment.apps/prometheus-operator   1/1     1            1           156m
NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/grafana-558647b59                1         1         1       94m
replicaset.apps/kube-state-metrics-5b86559fd5    0         0         0       98m
replicaset.apps/kube-state-metrics-79d4446fb5    1         1         1       98m
replicaset.apps/prometheus-adapter-57c497c557    1         1         1       144m
replicaset.apps/prometheus-operator-69bd579bf9   1         1         1       156m
NAME                              READY   AGE
statefulset.apps/prometheus-k8s   2/2     96m

3. Create the Ingress services

Instead of exposing the services via NodePort, I use Ingress here. For how to install and deploy Ingress, see the earlier article: K8s From Beginner to Giving Up Series - (15) Deploying Ingress on a Kubernetes Cluster.

3.1 Write the Ingress manifest

### Configure Ingress access to the web UIs of the prometheus, grafana, and alertmanager services
[root@k8s-master01 manifests]# cat ingress-all-svc.yml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-ing
  namespace: monitoring
spec:
  rules:
  - host: prometheus.monitoring.k8s.local
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: grafana-ing
  namespace: monitoring
spec:
  rules:
  - host: grafana.monitoring.k8s.local
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: alertmanager-ing
  namespace: monitoring
spec:
  rules:
  - host: alertmanager.monitoring.k8s.local
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093
[root@k8s-master01 manifests]# kubectl apply -f ingress-all-svc.yml
## You can see that the domains for the three services have been created
[root@k8s-master01 manifests]# kubectl get ingress -n monitoring
NAME               HOSTS                               ADDRESS   PORTS   AGE
alertmanager-ing   alertmanager.monitoring.k8s.local             80      3d21h
grafana-ing        grafana.monitoring.k8s.local                  80      3d21h
prometheus-ing     prometheus.monitoring.k8s.local               80      3d21h
## Check the NodePort exposed by the ingress service; this port must be appended when accessing the domains later
[root@k8s-master01 manifests]# kubectl get svc -n ingress-nginx
NAME            TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx   NodePort   10.254.102.184   <none>        80:33848/TCP,443:45891/TCP   4d3h
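
Before editing the hosts file, you can sanity-check the Ingress routing by sending the Host header directly to a node on the NodePort (a quick test; 172.16.11.123 is one of this cluster's node IPs, as used in the hosts entries below):

[root@k8s-master01 manifests]# curl -s -o /dev/null -w '%{http_code}\n' -H 'Host: prometheus.monitoring.k8s.local' http://172.16.11.123:33848/graph
## expect 200 (or a 30x redirect) if the Ingress routes correctly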

4. Access

Configure local hosts resolution; on Windows the file is at C:\Windows\System32\drivers\etc\hosts

172.16.11.123 prometheus.monitoring.k8s.local
172.16.11.123 grafana.monitoring.k8s.local
172.16.11.123 alertmanager.monitoring.k8s.local

Open http://prometheus.monitoring.k8s.local:33848 in a browser; you can see the hosts and pods that are already being monitored.
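
The Status -> Targets page lists every scrape job; equivalently, the built-in up metric reports each target's last scrape result (1 = healthy, 0 = down). For example, in the Graph page:

up                      ## one series per target, value 1 or 0
sum(up) by (job)        ## healthy target count per scrape job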

  

4.2 Problems

After the deployment, the Prometheus Targets page shows 0 target hosts for both kube-controller-manager and kube-scheduler.

Root cause:

The ServiceMonitors select Services by label, but in the specified namespace (kube-system) there are no Services carrying the expected labels. kube-apiserver works because its Service is the built-in kubernetes Service in the default namespace; the remaining components live in the kube-system namespace and need Services created for them separately.

## Inspect the Service-selection rules of the ServiceMonitors
[root@k8s-master01 manifests]# grep -2 selector serviceMonitor/prometheus-serviceMonitorKube*
serviceMonitor/prometheus-serviceMonitorKubeControllerManager.yaml-    matchNames:
serviceMonitor/prometheus-serviceMonitorKubeControllerManager.yaml-    - kube-system
serviceMonitor/prometheus-serviceMonitorKubeControllerManager.yaml:  selector:
serviceMonitor/prometheus-serviceMonitorKubeControllerManager.yaml-    matchLabels:
serviceMonitor/prometheus-serviceMonitorKubeControllerManager.yaml-      k8s-app: kube-controller-manager
--
serviceMonitor/prometheus-serviceMonitorKubelet.yaml-    matchNames:
serviceMonitor/prometheus-serviceMonitorKubelet.yaml-    - kube-system
serviceMonitor/prometheus-serviceMonitorKubelet.yaml:  selector:
serviceMonitor/prometheus-serviceMonitorKubelet.yaml-    matchLabels:
serviceMonitor/prometheus-serviceMonitorKubelet.yaml-      k8s-app: kubelet
--
serviceMonitor/prometheus-serviceMonitorKubeScheduler.yaml-    matchNames:
serviceMonitor/prometheus-serviceMonitorKubeScheduler.yaml-    - kube-system
serviceMonitor/prometheus-serviceMonitorKubeScheduler.yaml:  selector:
serviceMonitor/prometheus-serviceMonitorKubeScheduler.yaml-    matchLabels:
serviceMonitor/prometheus-serviceMonitorKubeScheduler.yaml-      k8s-app: kube-scheduler
## Listing the Services in the kube-system namespace shows that kube-scheduler and kube-controller-manager are indeed missing
[root@k8s-master01 manifests]# kubectl -n kube-system get svc
NAME                      TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns                  ClusterIP   10.254.0.2   <none>        53/UDP,53/TCP,9153/TCP   49d
kubelet                   ClusterIP   None         <none>        10250/TCP                4d2h

However, the corresponding Endpoints objects do exist (carrying no labels). Note also that if your cluster was deployed with kubeadm there is no kubelet Endpoints object; binary deployments do have one.

[root@k8s-master01 manifests]# kubectl get ep -n kube-system
NAME                      ENDPOINTS                                                                  AGE
kube-controller-manager   <none>                                                                     6m38s
kube-dns                  10.254.88.22:53,10.254.96.207:53,10.254.88.22:53 + 3 more...               49d
kube-scheduler            <none>                                                                     6m39s
kubelet                   172.16.11.120:10255,172.16.11.121:10255,172.16.11.122:10255 + 12 more...   4d3h

Solution:

1) Create cluster Services for the kube-controller-manager and kube-scheduler components, with the matching labels applied so that the ServiceMonitors can select them.

## The Service manifests
[root@k8s-master01 manifests]# cat controller-scheduler-svc.yml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-controller-manager
  labels:
    k8s-app: kube-controller-manager
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10252
    targetPort: 10252
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler
  labels:
    k8s-app: kube-scheduler
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251
    protocol: TCP
#####################NOTE##################################
Note that these Services define no selector to filter Pods by label.
That is because kube-controller-manager and kube-scheduler do not run
as Pods here, so no selector is needed; instead, the Endpoints that
map onto the Services must be created by hand.
Official documentation: https://kubernetes.io/zh/docs/concepts/services-networking/service/
##########################################################
## Endpoints: for a binary deployment we adjust the Endpoints that back the Services ourselves; change the IP addresses to your own cluster's
[root@k8s-master01 manifests]# cat controller-scheduler-ep.yml
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-controller-manager
  name: kube-controller-manager
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.16.11.120
  - ip: 172.16.11.121
  - ip: 172.16.11.122
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  labels:
    k8s-app: kube-scheduler
  name: kube-scheduler
  namespace: kube-system
subsets:
- addresses:
  - ip: 172.16.11.120
  - ip: 172.16.11.121
  - ip: 172.16.11.122
  ports:
  - name: http-metrics
    port: 10251
    protocol: TCP
## Check the created resources
[root@k8s-master01 manifests]# kubectl get ep,svc -n kube-system
NAME                                ENDPOINTS                                                                  AGE
endpoints/kube-controller-manager   172.16.11.120:10252,172.16.11.121:10252,172.16.11.122:10252                18m
endpoints/kube-dns                  10.254.88.22:53,10.254.96.207:53,10.254.88.22:53 + 3 more...               49d
endpoints/kube-scheduler            172.16.11.120:10251,172.16.11.121:10251,172.16.11.122:10251                18m
endpoints/kubelet                   172.16.11.120:10255,172.16.11.121:10255,172.16.11.122:10255 + 12 more...   4d3h
 
         
NAME                              TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
service/kube-controller-manager   ClusterIP   None         <none>        10252/TCP                12m
service/kube-dns                  ClusterIP   10.254.0.2   <none>        53/UDP,53/TCP,9153/TCP   49d
service/kube-scheduler            ClusterIP   None         <none>        10251/TCP                12m
service/kubelet                   ClusterIP   None         <none>        10250/TCP                4d3h
 

2) Change the listen address of kube-controller-manager and kube-scheduler so that their metrics endpoints can be reached.

## set the bind address to 0.0.0.0
--address=0.0.0.0
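
On a binary (systemd) deployment this means editing the flag in each component's unit or options file and restarting; a sketch, assuming the unit files live under /etc/systemd/system and previously bound to 127.0.0.1 (adapt the paths and flag text to your own layout):

[root@k8s-master01 ~]# sed -i 's/--address=127.0.0.1/--address=0.0.0.0/' /etc/systemd/system/kube-controller-manager.service
[root@k8s-master01 ~]# sed -i 's/--address=127.0.0.1/--address=0.0.0.0/' /etc/systemd/system/kube-scheduler.service
[root@k8s-master01 ~]# systemctl daemon-reload
[root@k8s-master01 ~]# systemctl restart kube-controller-manager kube-scheduler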

After the change and restart, check the Prometheus Targets page again; all target hosts should now be monitored normally.
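
You can also probe the metrics endpoints directly from a master node before opening the UI (ports 10252/10251 as defined in the Services above, node IPs as in this cluster):

[root@k8s-master01 ~]# curl -s http://172.16.11.120:10252/metrics | head -n 3
[root@k8s-master01 ~]# curl -s http://172.16.11.120:10251/metrics | head -n 3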

5. Viewing monitoring in Grafana

1) Grafana already comes with multiple dashboards

2) Monitoring displays correctly

