Deploying Prometheus + Grafana outside a k8s cluster: x509 certificate signed by unknown authority

Reposted from the original article, 2021-11-22 23:39

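Background

Prometheus + Grafana is generally regarded as the best practice for monitoring Kubernetes. Most setups run both inside the cluster, where they can use the cluster's own certificates and internal URLs directly. With several clusters, however, that makes it awkward to view everything in one place. This article therefore deploys Prometheus and Grafana outside the cluster, so that data is stored centrally and displayed in a single Grafana instance.

Whether Prometheus runs inside or outside the cluster, the collectors are essentially the same. When it runs outside, however, the monitoring URLs have to be constructed by hand.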

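Collected data	Tool	Monitoring URL	Notes
Node performance	node-exporter	/api/v1/nodes/nodename:9100/proxy/metrics	node-level information
Pod performance	cadvisor	/api/v1/nodes/nodename/proxy/metrics/cadvisor	pod and container information (served by the kubelet, not by node-exporter on port 9100)
k8s resources	kube-state-metrics		deployments, ingresses, daemonsets, etc.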

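Inside the cluster, some of Prometheus's targets are addressed by cluster-private IPs. A Prometheus outside the cluster cannot reach the monitoring URLs that are assembled automatically by default, so apiserver proxy URLs have to be constructed by hand (see the Kubernetes documentation on apiserver proxy URLs). Through a proxy URL, the external Prometheus can reach the monitoring endpoint and pull the metrics.

The Pod performance and Node performance URLs in the table above are both hand-built proxy URLs. If the k8s resources job keeps the default monitoring URL, its endpoints are all private IPs inside the K8S cluster, so every target shows as down. Once the URL is rebuilt as a proxy URL, every target shows as up.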
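As a rough illustration of how such a proxy URL is put together, the following shell sketch assembles one from its parts. The function name node_proxy_url and the node name node-1 are made up for the example, and api-server:6443 stands for your own apiserver address.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the apiserver proxy URL for a port on a node.
# Arguments: <apiserver base URL> <node name> <port> <path on the node>
node_proxy_url() {
  local base="${1%/}"   # drop a trailing slash from the base URL
  local node="$2"
  local port="$3"
  local path="${4#/}"   # drop a leading slash from the path
  printf '%s/api/v1/nodes/%s:%s/proxy/%s\n' "$base" "$node" "$port" "$path"
}

# node-exporter metrics of node-1, reached through the apiserver
node_proxy_url "https://api-server:6443" "node-1" 9100 "/metrics"
```

The same pattern is what the relabel rules later in this article produce for every discovered node.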

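1. In-cluster setup: authorization and collectors

Access to the K8S apiserver has to be authorized first. A Prometheus inside the cluster can use the in-cluster defaults, but one outside the cluster has to authenticate with a service-account token and verify the apiserver against the cluster's CA certificate, so RBAC authorization must be set up first.

So we first create a service account, a cluster role, and a cluster role binding, as follows: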

# Create the namespace
# kubectl create namespace prometheus
# Create a serviceaccount named prometheus in that namespace
# kubectl create sa prometheus -n prometheus
# Every new namespace automatically gets a serviceaccount named default
# kubectl get sa
NAME                         SECRETS   AGE
arms-prom-operator        1         169d
default                            1         169d

Create a clusterrole, or simply use the cluster's built-in cluster-admin

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: prometheus
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/metrics
  - nodes/proxy        # required to reach node endpoints through the apiserver proxy
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get

Bind the sa to a cluster role (cluster-admin is used here)

#kubectl create clusterrolebinding prometheus --clusterrole cluster-admin --serviceaccount=prometheus:prometheus

Look up the token secret that belongs to the sa

# kubectl get sa prometheus -n prometheus -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  creationTimestamp: "2021-11-09T09:36:31Z"
  name: prometheus
  namespace: prometheus
  resourceVersion: "2351125959"
  selfLink: /api/v1/namespaces/prometheus/serviceaccounts/prometheus
  uid: 8f8f8e88-4d2c-40c5-abe7-c61615f7cf27
secrets:
- name: prometheus-token-rt59g

Inspect the secret listed under the sa's secrets and save the token found there; it is needed later when connecting to k8s to fetch data

# kubectl describe secrets prometheus-token-rt59g -n prometheus
Name:         prometheus-token-rt59g
Namespace:    prometheus
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: prometheus
              kubernetes.io/service-account.uid: 8f8f8e88-4d2c-40c5-abe7-c61615f7cf27

Type:  kubernetes.io/service-account-token

Data
====
token:      eyJhbGciOiJSUzI*******************
ca.crt:     1135 bytes
namespace:  10 bytes
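
The token and CA certificate are stored base64-encoded inside the secret, so they have to be decoded before Prometheus can read them from files. A minimal sketch, assuming GNU base64 and the secret name shown above; the function name save_secret_field and the output file names are illustrative choices, not part of the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode one field of a secret into a local file.
# Arguments: <namespace> <secret name> <field, e.g. token or ca.crt> <output file>
save_secret_field() {
  local ns="$1" secret="$2" field="$3" out="$4"
  # go-template's index function copes with field names containing dots (ca.crt)
  kubectl -n "$ns" get secret "$secret" \
    -o go-template="{{index .data \"$field\"}}" | base64 -d > "$out"
}

# Example calls (run on a machine with cluster access):
# save_secret_field prometheus prometheus-token-rt59g token  k8s.token
# save_secret_field prometheus prometheus-token-rt59g ca.crt ca.crt
```

Note that on Kubernetes 1.24 and later, token secrets are no longer created automatically for service accounts, so the secret may have to be created by hand before this works.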

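Now deploy the collectors inside the cluster

node-exporter collects data from the cluster's nodes, such as CPU, memory, disk, and network traffic. Deploying it as a DaemonSet is recommended.
# vim node-exporter.yaml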

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: node-exporter
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
      - image: quay.io/prometheus/node-exporter
        imagePullPolicy: Always
        name: node-exporter
        resources:
          limits:
            cpu: 102m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: node-exporter
      hostNetwork: true
      hostPID: true
      nodeSelector:
        kubernetes.io/os: linux
      restartPolicy: Always
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
#kubectl apply -f node-exporter.yaml
# kubectl get pods
NAME                  READY   STATUS    RESTARTS   AGE
node-exporter-496dj   1/1     Running   0          9d
node-exporter-628wq   1/1     Running   0          9d
node-exporter-v96rt   1/1     Running   0          9d

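Collecting pod information

Pod metrics need no separately deployed collector. cAdvisor is built into the kubelet, so the metrics are available through the node proxy URL under the path /metrics/cadvisor (without the node-exporter port 9100).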

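Deploying kube-state-metrics, the collector for k8s resource objects

Download the manifests from the examples directory of the kube-state-metrics repository (https://github.com/kubernetes/kube-state-metrics/tree/master/examples) and apply them:
# git clone https://github.com/kubernetes/kube-state-metrics.git
# kubectl apply -f kube-state-metrics/examples/standard/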

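When deploying kube-state-metrics, the image download often fails and leaves the pod stuck in the Pending state. It is advisable to pull the image on your own machine, tag it, push it to your own image registry, and then change the image in the deployment so that the cluster can pull it.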
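
The pull, tag, and push steps could look roughly like the sketch below. The function name mirror_image and both image names in the example are placeholders; substitute the image referenced in your deployment.yaml and a registry your cluster can reach.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy an image into a registry the cluster can reach.
# Arguments: <source image> <target image in your own registry>
mirror_image() {
  local src="$1" dst="$2"
  # Stop at the first failing step so a failed pull is never pushed
  docker pull "$src" &&
    docker tag "$src" "$dst" &&
    docker push "$dst"
}

# Example (both names are placeholders):
# mirror_image SOURCE_IMAGE:TAG registry.example.com/kube-state-metrics:TAG
```

After the push, set the image field of the deployment to the target name and apply it again.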

At this point the in-cluster collectors for node, cluster, and pod information are all deployed.

2. Deploying Prometheus outside the cluster with docker-compose (Prometheus version 2.31)

version: '3'
services:
  prometheus:
    image: 'prom/prometheus'
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - /work/prometheus/etc/prometheus.yml:/etc/prometheus/prometheus.yml
      - /work/prometheus/etc:/etc/prometheus
      - /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime       # use China Standard Time inside the container
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.enable-admin-api'
      - '--web.enable-lifecycle'
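# mkdir -p /work/prometheus/etc
# vim /work/prometheus/etc/prometheus.yml    (create it before starting the container; otherwise Docker creates a directory at the bind-mount path)
# docker-compose up -d
Contents of /work/prometheus/etc/prometheus.yml: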
global:
#  scrape_interval: 15s
  scrape_timeout: 30s
#  evaluation_interval: 10s
scrape_configs:
- job_name: "k8s"
  scheme: https
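  tls_config:                         # these 4 lines (tls_config and authorization) must be added too; without them scrapes tend to fail with "x509: certificate signed by unknown authority"
    insecure_skip_verify: true
  authorization:
    credentials_file: /etc/prometheus/k8s.token     # the token from the secret of the service account authorized earlier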
  kubernetes_sd_configs:
  - role: node
    api_server: https://123.56.65.107:6443
    tls_config:
#      insecure_skip_verify: true
      ca_file: /etc/prometheus/ca.crt        # can be obtained by base64-decoding certificate-authority-data from the kubeconfig file
    authorization:
      credentials_file: /etc/prometheus/k8s.token
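
One way to produce the ca.crt referenced above is to decode it from the kubeconfig. This is a sketch under the assumption that the cluster of interest is the first entry in the current kubeconfig and that GNU base64 is available; the function name kubeconfig_ca_to_file is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the CA certificate of the first cluster in the
# current kubeconfig to a file. Adjust the index if the kubeconfig lists several.
kubeconfig_ca_to_file() {
  local out="$1"
  kubectl config view --raw \
    -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > "$out"
}

# Example: kubeconfig_ca_to_file /work/prometheus/etc/ca.crt
```

Because /work/prometheus/etc is mounted at /etc/prometheus in the container, a file written there appears at the path used in ca_file.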

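At this point Prometheus can be seen collecting information from k8s.

Getting node-exporter data

Set the node-exporter URL to https://api-server:6443/api/v1/nodes/NODES-HOSTNAME:9100/proxy/metrics

If you are unsure of the URL, start a local proxy with kubectl proxy --port=8002 and browse http://localhost:8002/ to check that the path and its content are correct.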
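
The URL can also be checked from the Prometheus host itself, using the same token and CA certificate that Prometheus will present. A minimal sketch, assuming curl is installed; fetch_with_token is a made-up name and NODE-NAME is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fetch a URL from the apiserver with the token and CA
# certificate that Prometheus will use.
# Arguments: <URL> <token file> <CA file>
fetch_with_token() {
  local url="$1" token_file="$2" ca_file="$3"
  curl -sS --cacert "$ca_file" \
    -H "Authorization: Bearer $(cat "$token_file")" "$url"
}

# Example (NODE-NAME is a placeholder):
# fetch_with_token "https://api-server:6443/api/v1/nodes/NODE-NAME:9100/proxy/metrics" \
#   /work/prometheus/etc/k8s.token /work/prometheus/etc/ca.crt | head
```

If this prints metrics, the token, the certificate, and the path are all usable; a certificate error here points at ca.crt rather than at the Prometheus configuration.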

- job_name: "dev-bj-node-exporter"
  scheme: https
  tls_config:
    insecure_skip_verify: true
  authorization:
    credentials_file: /etc/prometheus/k8s.token
  kubernetes_sd_configs:
  - role: node
    api_server: https://api-server:6443
    tls_config:
#      insecure_skip_verify: true          # skip certificate verification
      ca_file: /etc/prometheus/ca.crt
    authorization:
      credentials_file: /etc/prometheus/k8s.token
  relabel_configs:
  - target_label: __address__
    replacement: api-server:6443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/$1:9100/proxy/metrics
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    target_label: kubernetes_namespace
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: service_name
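
Since the compose file starts Prometheus with --web.enable-lifecycle, a changed prometheus.yml can be picked up without restarting the container. The sketch below assumes curl is installed; the function names and the example address are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask a running Prometheus to re-read prometheus.yml.
# Works only when Prometheus was started with --web.enable-lifecycle.
# Argument: <Prometheus base URL>, e.g. http://192.0.2.10:9090
reload_prometheus() {
  local base="${1%/}"
  curl -sS -X POST "$base/-/reload"
}

# Hypothetical helper: succeed if Prometheus answers its health endpoint.
prometheus_healthy() {
  local base="${1%/}"
  curl -sS -f -o /dev/null "$base/-/healthy"
}

# Example:
# reload_prometheus http://192.0.2.10:9090 && prometheus_healthy http://192.0.2.10:9090
```

After a reload, the new job should appear on the targets page of the Prometheus web UI.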

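When collecting data from several k8s clusters, it is advisable to run one Prometheus container per cluster, each mapped to a different host port. Grafana can then add each port as a separate data source, which keeps the clusters easy to tell apart.

References: https://blog.csdn.net/yanggd1987/article/details/108807171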

https://www.yxingxing.net/archives/prometheus-20191203-k8s#1-pod%E6%80%A7%E8%83%BD

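Installing Grafana with persistent data storage

# mkdir /work/grafana
# cd /work/grafana
# vim grafana.sh
docker stop -t 10 grafana
docker rm grafana

# Grafana data persists on the host under /work/grafana/data
docker run -d --name grafana \
    -p 3000:3000 \
    -e "GF_SECURITY_ADMIN_PASSWORD=PASSWORD" \
    -v "/work/grafana/data:/var/lib/grafana" \
    --user "root" \
    --log-opt max-size=500m --log-opt max-file=3 \
    grafana/grafana
# bash grafana.sh
The login user is admin. The password is the value given in GF_SECURITY_ADMIN_PASSWORD; if that variable is not set, the default password is also admin.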

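Select Prometheus as the data source

Tips:

After importing a dashboard template into Grafana, some panels may show no data. This happens when the metric or label names in the panel's PromQL do not match what Prometheus actually collects, so the query returns nothing. Check the real names on the exporter's /metrics page (port 9100 for node-exporter) and edit the queries accordingly.

Variables can be added on the Variables page of the dashboard settings. Consult the /metrics page for the names to use there as well.

Note on the data source URL: even if Prometheus runs on the same machine, enter the machine's IP address rather than localhost, because inside the Grafana container localhost refers to the container itself. If a gateway error appears, the port may not be open in the security group of the Prometheus machine.

Different kinds of data call for different templates. Searching the official Grafana site for a suitable dashboard template is recommended: https://grafana.com/grafana/dashboards/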

Recommended template IDs:

Node performance: 8919, 9276

Pod performance: 8588

k8s resources: 3119, 6417
