I. Introduction
Repository: https://github.com/prometheus-operator/kube-prometheus
https://blog.csdn.net/choerodon/article/details/98587027
Prometheus Operator architecture diagram:

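- Operator: deploys and manages Prometheus Server according to custom resources (Custom Resource Definitions, CRDs), and watches change events on those custom resources to react accordingly. It is the control center of the whole system.
- Prometheus Server: the Prometheus Server cluster that the Operator deploys according to what is defined in the Prometheus custom resource. These custom resources can be regarded as the StatefulSets resources used to manage the Prometheus Server cluster.
- ServiceMonitor: declares the services to monitor and describes a list of targets scraped by Prometheus. The resource uses labels to find the matching Service endpoints, so that Prometheus Server collects metrics through the selected Services.
- Service: simply put, the object that Prometheus monitors.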
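To make the label-based selection concrete, here is a minimal sketch of a ServiceMonitor manifest written from the shell. The names `example-app`, the label `app: example-app`, the `default` namespace, the port name `web`, and the file name are all hypothetical and only illustrate the shape of the resource.

```shell
# Write a hypothetical ServiceMonitor manifest to a file (all names are illustrative).
cat > example-servicemonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app          # hypothetical name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app       # must match the labels on the target Service
  namespaceSelector:
    matchNames:
      - default              # namespace in which the Service lives (assumed)
  endpoints:
    - port: web              # name of the Service port that exposes metrics
      interval: 30s
EOF

# Apply it once the operator CRDs are installed (requires cluster access):
# kubectl apply -f example-servicemonitor.yaml
```

The `selector` and `namespaceSelector` fields are what tie the ServiceMonitor to a Service; the endpoint refers to the Service port by name rather than by number.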
II. Deployment
Deploying Prometheus Operator is straightforward.
# Download
# git clone https://github.com/prometheus-operator/kube-prometheus.git
# cd kube-prometheus
# Install the operator
# kubectl create -f manifests/setup
# Install prometheus
# kubectl create -f manifests/
The number of instances to start can be set with the replicas field.
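As an alternative to editing the manifests before creating them, the replica count can also be changed on a running deployment with a merge patch. The sketch below assumes the Prometheus custom resource is named `k8s` (which the pod name `prometheus-k8s-0` suggests); the helper name `replicas_patch` is made up for this example.

```shell
# Build a JSON merge patch that sets spec.replicas to the given number.
replicas_patch() {
  printf '{"spec":{"replicas":%d}}' "$1"
}

# Apply it to the Prometheus custom resource (assumed to be named "k8s");
# this requires cluster access:
# kubectl -n monitoring patch prometheus k8s --type merge -p "$(replicas_patch 2)"
```

The operator then adjusts the underlying StatefulSet to match the new value.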
Check the result:
# kubectl get pods -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   10         8d
blackbox-86b7486879-w6n22              1/1     Running   0          18h
grafana-5cb8d5c55b-wplg4               1/1     Running   5          8d
kafka-exporter-5cf8fdd8f8-c4j5t        1/1     Running   0          20h
kube-state-metrics-65f69f9759-spcr6    3/3     Running   27         8d
node-exporter-rdjl9                    2/2     Running   2          24h
prometheus-adapter-865cc8dbcd-bc7v6    1/1     Running   34         8d
prometheus-k8s-0                       2/2     Running   3          76m
prometheus-operator-56d44459f7-vt2l9   2/2     Running   15         8d

# kubectl get svc -n monitoring
NAME                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
alertmanager-main       ClusterIP   10.99.189.210   <none>        9093/TCP                     8d
alertmanager-operated   ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   8d
blackbox                ClusterIP   10.108.47.141   <none>        9115/TCP                     18h
grafana                 ClusterIP   10.104.30.183   <none>        3000/TCP                     8d
kafka-exporter          ClusterIP   10.98.228.115   <none>        9308/TCP                     20h
kube-state-metrics      ClusterIP   None            <none>        8443/TCP,9443/TCP            8d
node-exporter           ClusterIP   None            <none>        9100/TCP                     8d
prometheus-adapter      ClusterIP   10.108.67.0     <none>        443/TCP                      8d
prometheus-k8s          ClusterIP   10.96.50.138    <none>        9090/TCP                     8d
prometheus-operated     ClusterIP   None            <none>        9090/TCP                     16h
prometheus-operator     ClusterIP   None            <none>        8443/TCP                     8d
Define an Ingress for accessing alertmanager, grafana, and prometheus:
prom-monitor.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prom-monitor
  namespace: monitoring
spec:
  rules:
  - host: alert.test.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093
        path: /
  - host: grafana.test.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
        path: /
  - host: prom.test.com
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
        path: /
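The `extensions/v1beta1` Ingress API was removed in Kubernetes 1.22. On newer clusters the same rules would need the `networking.k8s.io/v1` schema; the following is a sketch of that translation, not part of the original setup. The file name `prom-monitor-v1.yaml` and the choice of `pathType: Prefix` are assumptions.

```shell
# Write a networking.k8s.io/v1 version of the same Ingress (file name is hypothetical).
cat > prom-monitor-v1.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prom-monitor
  namespace: monitoring
spec:
  rules:
  - host: alert.test.com
    http:
      paths:
      - path: /
        pathType: Prefix          # assumed; required field in the v1 API
        backend:
          service:
            name: alertmanager-main
            port:
              number: 9093
  - host: grafana.test.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana
            port:
              number: 3000
  - host: prom.test.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: prometheus-k8s
            port:
              number: 9090
EOF

# Apply it (requires cluster access and an Ingress controller):
# kubectl apply -f prom-monitor-v1.yaml
```

Depending on the Ingress controller in use, an `ingressClassName` may also be needed; that is left out here because the original does not say which controller is installed.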
Edit the local hosts file so that the following hostnames resolve:
grafana.test.com prom.test.com alert.test.com
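A hosts-file entry for these names could be generated as sketched below. The IP address is an assumption: the original does not state it, so pass whatever address your Ingress controller is reachable on. The helper name `hosts_entry` is made up for this example, and `192.0.2.10` is only a placeholder.

```shell
# Print one hosts-file line mapping the three Ingress hostnames to an address.
# The address is an assumption: pass the IP your Ingress controller listens on.
hosts_entry() {
  printf '%s grafana.test.com prom.test.com alert.test.com\n' "$1"
}

# Example: append the line to /etc/hosts (needs root), using a placeholder IP:
# hosts_entry 192.0.2.10 | sudo tee -a /etc/hosts
```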
Open grafana.test.com; it ships with many dashboards out of the box.

III.
# List the servicemonitors
# kubectl get servicemonitor -n monitoring
NAME                      AGE
alertmanager              7d2h
coredns                   7d2h
grafana                   7d2h
kube-apiserver            7d2h
kube-controller-manager   7d2h
kube-scheduler            7d2h
kube-state-metrics        7d2h
kubelet                   7d2h
node-exporter             7d2h
prometheus                7d2h
prometheus-adapter        7d2h
prometheus-operator       7d2h
Inspect the servicemonitor for kube-controller-manager:
# kubectl get servicemonitor kube-controller-manager -n monitoring -o yaml | tail -15
...
    port: http-metrics
    scheme: http
    tlsConfig:
      insecureSkipVerify: false
  jobLabel: k8s-app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      k8s-app: kube-controller-manager
- It needs to match a Service in the kube-system namespace that carries the label k8s-app=kube-controller-manager
- Change its scheme to http; the default is https
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-controller-manager-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-controller-manager
subsets:
- addresses:
  - ip: 192.168.10.240
  - ip: 192.168.10.241
  - ip: 192.168.10.242
  ports:
  - name: http-metrics
    port: 10252
    protocol: TCP
controller-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kube-controller-manager-monitoring
  namespace: kube-system
  labels:
    k8s-app: kube-controller-manager
spec:
  ports:
  - port: 10252
    name: http-metrics
    protocol: TCP
  type: ClusterIP
Create the resources:
# kubectl create -f .
Check the result:
# kubectl get svc,ep -n kube-system -l k8s-app=kube-controller-manager
NAME                                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)     AGE
service/kube-controller-manager-monitoring   ClusterIP   10.102.204.13   <none>        10252/TCP   44m

NAME                                           ENDPOINTS                                                         AGE
endpoints/kube-controller-manager-monitoring   192.168.10.240:10252,192.168.10.241:10252,192.168.10.242:10252   44m
Also modify the controller-manager startup configuration file:
/usr/lib/systemd/system/kube-controller-manager.service
# Change the listen address
--address=0.0.0.0
Restart controller-manager.
Test:
# curl 127.0.0.1:10252
404 page not found
# curl 10.102.204.13:10252
404 page not found
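Requests to the local port and to the controller-manager Service port return the same result, which shows that the Service reaches the controller-manager.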
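The 404 on the root path is expected, since the metrics themselves are served under /metrics. A quick way to confirm that the endpoint actually returns samples is to count the non-comment lines of the response. This is a small sketch; the helper name `count_samples` is made up for this example.

```shell
# Count sample lines in Prometheus text-format output read from stdin,
# skipping comment lines (# HELP / # TYPE) and blank lines.
count_samples() {
  grep -cv -e '^#' -e '^$'
}

# Usage against the endpoints above (requires a reachable controller-manager):
# curl -s http://127.0.0.1:10252/metrics | count_samples
# curl -s http://10.102.204.13:10252/metrics | count_samples
```

A non-zero count from both the local port and the Service address suggests that Prometheus will be able to scrape the target through the Service.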
Check the result in Prometheus:

