Prometheus operator


I. Introduction

Repository: https://github.com/prometheus-operator/kube-prometheus

https://blog.csdn.net/choerodon/article/details/98587027

[Figure: Prometheus Operator architecture diagram]


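  • Operator: deploys and manages Prometheus Server according to custom resources (defined through Custom Resource Definitions, CRDs), and watches those custom resources for changes so it can react to them. It is the control center of the whole system.
  • Prometheus: a custom resource that declares the desired state of a Prometheus deployment. The Operator ensures that the running deployment always matches this definition.
  • Prometheus Server: the Prometheus Server cluster that the Operator deploys according to the contents of the Prometheus custom resource. These custom resources can be regarded as the StatefulSets resources used to manage the Prometheus Server cluster.
  • ServiceMonitor: declares which services to monitor by describing a list of targets for Prometheus to scrape. It uses labels to select the corresponding Service endpoints, so that Prometheus Server collects metrics through the selected Services.
  • Service: simply put, the object that Prometheus monitors.
  • Alertmanager: a custom resource that declares the desired state of an Alertmanager deployment. The Operator ensures that the running deployment always matches this definition.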
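
The relationship between a Service and a ServiceMonitor described above can be sketched with a pair of manifests. This is only an illustration: the application name example-app, its namespace, its labels, and the port name metrics are all hypothetical and do not come from kube-prometheus. Whether Prometheus actually picks up the ServiceMonitor also depends on the serviceMonitorSelector of the Prometheus resource and on the RBAC permissions granted in the target namespace.

```yaml
# Hypothetical Service that exposes a metrics port (all names are illustrative).
apiVersion: v1
kind: Service
metadata:
  name: example-app
  namespace: default
  labels:
    app: example-app          # the ServiceMonitor below selects on this label
spec:
  selector:
    app: example-app
  ports:
  - name: metrics             # the ServiceMonitor refers to the port by name
    port: 8080
---
# ServiceMonitor that tells Prometheus to scrape the Service above.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames:
    - default                 # namespace in which to look for matching Services
  selector:
    matchLabels:
      app: example-app        # must match the labels on the Service object
  endpoints:
  - port: metrics             # port name on the Service, not a number
    interval: 30s
```

Note that the selector matches labels on the Service object itself, not on the Pods behind it.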

 

II. Deployment

Deploying Prometheus Operator is straightforward:

# Download
# git clone https://github.com/prometheus-operator/kube-prometheus.git

# cd kube-prometheus

# Install the operator
# kubectl create -f manifests/setup

# Install Prometheus
# kubectl create -f manifests/
  • The number of instances to start can be set with the replicas field
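
As a rough illustration of where replicas lives, the excerpt below shows the relevant part of a Prometheus custom resource. It assumes the resource is named k8s in the monitoring namespace, which is consistent with the prometheus-k8s-0 pod in the output that follows. In kube-prometheus this resource is usually defined in manifests/prometheus-prometheus.yaml, but treat that file name as an assumption and check your checkout. This is an excerpt, not a complete manifest; the remaining spec fields should be left as shipped.

```yaml
# Excerpt only: edit the existing manifest rather than applying this fragment.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s                 # assumed name, matching the prometheus-k8s-0 pod
  namespace: monitoring
spec:
  replicas: 1               # number of Prometheus Server pods in the StatefulSet
```

The Alertmanager custom resource has a replicas field of the same kind under its own spec.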

Verify:

# kubectl get pods -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   10         8d
blackbox-86b7486879-w6n22              1/1     Running   0          18h
grafana-5cb8d5c55b-wplg4               1/1     Running   5          8d
kafka-exporter-5cf8fdd8f8-c4j5t        1/1     Running   0          20h
kube-state-metrics-65f69f9759-spcr6    3/3     Running   27         8d
node-exporter-rdjl9                    2/2     Running   2          24h
prometheus-adapter-865cc8dbcd-bc7v6    1/1     Running   34         8d
prometheus-k8s-0                       2/2     Running   3          76m
prometheus-operator-56d44459f7-vt2l9   2/2     Running   15         8d
# kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       ClusterIP   10.99.189.210   <none>        9093/TCP                     8d
alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   8d
blackbox                ClusterIP   10.108.47.141   <none>        9115/TCP                     18h
grafana                 ClusterIP   10.104.30.183   <none>        3000/TCP                     8d
kafka-exporter          ClusterIP   10.98.228.115   <none>        9308/TCP                     20h
kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            8d
node-exporter           ClusterIP   None            <none>        9100/TCP                     8d
prometheus-adapter      ClusterIP   10.108.67.0     <none>        443/TCP                      8d
prometheus-k8s          ClusterIP   10.96.50.138    <none>        9090/TCP                     8d
prometheus-operated     ClusterIP   None            <none>        9090/TCP                     16h
prometheus-operator     ClusterIP   None            <none>        8443/TCP                     8d

 

Define an Ingress to access Alertmanager, Grafana, and Prometheus:

prom-monitor.yaml

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prom-monitor
  namespace: monitoring
spec:
  rules:
  - host: alert.test.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093
        path: /
  - host: grafana.test.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
        path: /
  - host: prom.test.com
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
        path: /
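  • Hostnames used: grafana.test.com, prom.test.com, alert.test.com

Edit the hosts file on your local machine so that these three hostnames resolve to the address of the ingress controller.

Then visit grafana.test.com. The bundled Grafana already ships with many dashboards.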
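
The extensions/v1beta1 Ingress API used above was removed in Kubernetes 1.22. On newer clusters an equivalent manifest would use networking.k8s.io/v1, which requires a pathType and a different backend layout. The following is a sketch of that translation using the same hostnames and Services. The commented-out ingressClassName value is an assumption and depends on which ingress controller is installed.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prom-monitor
  namespace: monitoring
spec:
  # ingressClassName: nginx   # assumption: set this if your controller needs a class
  rules:
  - host: alert.test.com
    http:
      paths:
      - path: /
        pathType: Prefix      # mandatory in networking.k8s.io/v1
        backend:
          service:
            name: alertmanager-main
            port:
              number: 9093
  - host: grafana.test.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000
  - host: prom.test.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090
```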

  

 

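III. Fixing controller-manager not being monitored

  On a Kubernetes cluster installed from binaries, a Prometheus stack deployed with the operator cannot monitor controller-manager and scheduler by default, so both components need extra configuration. The reason is that a ServiceMonitor adds scrape targets by matching labels on Services, but in a binary-installed cluster the kube-system namespace contains no Service for controller-manager or scheduler.

  Check:

# List the ServiceMonitors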
# kubectl get servicemonitor -n monitoring
NAME                      AGE
alertmanager              7d2h
coredns                   7d2h
grafana                   7d2h
kube-apiserver            7d2h
kube-controller-manager   7d2h
kube-scheduler            7d2h
kube-state-metrics        7d2h
kubelet                   7d2h
node-exporter             7d2h
prometheus                7d2h
prometheus-adapter        7d2h
prometheus-operator       7d2h

  Inspect the ServiceMonitor for kube-controller-manager:

# kubectl get servicemonitor kube-controller-manager -n monitoring -o yaml | tail -15
...
    port: http-metrics
    scheme: http
    tlsConfig:
      insecureSkipVerify: false
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-controller-manager
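  • It needs to match a Service in kube-system that carries the label k8s-app=kube-controller-manager
  • Change its scheme to http (the default is https), because port 10252 used below serves plain HTTP

  No Service or Endpoints with the kube-controller-manager label exists in the kube-system namespace, so Prometheus cannot scrape controller-manager. A Service and an Endpoints object for controller-manager therefore have to be created.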

  controller-endpoint.yaml

apiVersion: v1
kind: Endpoints
metadata:
  name: kube-controller-manager-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-controller-manager
subsets:
  - addresses:
    - ip: 192.168.10.240
    - ip: 192.168.10.241
    - ip: 192.168.10.242
    ports:
    - name: http-metrics
      port: 10252
      protocol: TCP

  controller-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: kube-controller-manager-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-controller-manager
spec:
  ports:
  - port: 10252
    name: http-metrics
    protocol: TCP
  type: ClusterIP

Create them:

# kubectl create -f .

Verify:

# kubectl get svc,ep -n kube-system -l k8s-app=kube-controller-manager
NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/kube-controller-manager-monitoring   ClusterIP   10.102.204.13   <none>        10252/TCP   44m

NAME                                           ENDPOINTS                                                        AGE
endpoints/kube-controller-manager-monitoring   192.168.10.240:10252,192.168.10.241:10252,192.168.10.242:10252   44m


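Also edit the controller-manager startup unit file so that the metrics port listens on all interfaces:

/usr/lib/systemd/system/kube-controller-manager.service

# Change the listen address
--address=0.0.0.0 

Then restart controller-manager.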

 

Test:

# curl 127.0.0.1:10252
404 page not found

# curl 10.102.204.13:10252
404 page not found

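Requests to the local port and to the controller-manager Service port return the same result, which shows that the Service is routing to controller-manager.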

 

Check the Prometheus web UI. The kube-controller-manager targets should now be listed.

  

Apply the same changes to the scheduler configuration to monitor the scheduler as well.
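
For the scheduler, the analogous objects might look like the sketch below. It rests on several assumptions that are not stated in the original: that the scheduler runs on the same three hosts as controller-manager, that it still exposes the legacy insecure metrics port 10251, and that the kube-scheduler ServiceMonitor selects the label k8s-app=kube-scheduler with a port named http-metrics. The object name kube-scheduler-monitoring is made up for illustration. Check the actual selector and port name with `kubectl get servicemonitor kube-scheduler -n monitoring -o yaml` before using it.

```yaml
# Endpoints listing the hosts that run kube-scheduler (IPs are an assumption).
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-scheduler-monitoring     # must be identical to the Service name
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
subsets:
  - addresses:
    - ip: 192.168.10.240
    - ip: 192.168.10.241
    - ip: 192.168.10.242
    ports:
    - name: http-metrics
      port: 10251                     # assumed legacy insecure scheduler port
      protocol: TCP
---
# Selector-less Service, so the manually defined Endpoints above are used.
apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler           # label the ServiceMonitor is assumed to match
spec:
  ports:
  - port: 10251
    name: http-metrics
    protocol: TCP
  type: ClusterIP
```

As with controller-manager, the scheduler would also have to listen on an address reachable from the Prometheus pods, and the scheme in the kube-scheduler ServiceMonitor would have to match the port being scraped.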

