All components run as containers. Some of the stack files are adapted from prometheus for swarm.

Deploy Prometheus

Write the stack file

    $ mkdir -p /opt/k8s/prometheus/conf
    $ cd /opt/k8s/prometheus/
    $ cat > prome-stack.yml<<EOF
    version: "3"
    services:
      prometheus:
        image: prom/prometheus:v2.16.0
        ports:
          - "9090:9090"
        volumes:
          - ./conf/:/etc/prometheus/
          - prometheus_data:/prometheus
        command:
          - '--config.file=/etc/prometheus/prometheus.yml'
          - '--storage.tsdb.path=/prometheus'
          - '--web.console.libraries=/usr/share/prometheus/console_libraries'
          - '--web.console.templates=/usr/share/prometheus/consoles'
        networks:
          - mcsas-network
        deploy:
          replicas: 1
          restart_policy:
            condition: on-failure
          placement:
            constraints:
              - node.role == manager
    networks:
      mcsas-network:
        external: true
    volumes:
      prometheus_data: {}
    EOF

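The stack file declares `mcsas-network` as an external network, so `docker stack deploy` expects it to exist already. If it has not been created yet, something like the following sketch could be used on a manager node. The function name `ensure_overlay_network` is a hypothetical helper, not part of Docker; it only wraps the standard `docker network inspect` and `docker network create` commands.

```shell
# Hypothetical helper: create a swarm overlay network only if it is missing.
# Usage: ensure_overlay_network <network-name>
ensure_overlay_network() {
  local name="$1"
  if docker network inspect "$name" >/dev/null 2>&1; then
    # The network is already there; nothing to do.
    echo "network $name already exists"
  else
    # Swarm stacks need a swarm-scoped (overlay) network.
    docker network create --driver overlay "$name"
  fi
}
```

Calling `ensure_overlay_network mcsas-network` before the first deploy should be enough; running it again is harmless because the existing network is left untouched.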
Edit the configuration file

    $ cd /opt/k8s/prometheus/conf/
    $ cat > prometheus.yml<<EOF
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']
      - job_name: 'springboot'
        metrics_path: /actuator/prometheus
        file_sd_configs:
          - files:
            - /etc/prometheus/service.yaml
      - job_name: 'node-exporter'
        scrape_interval: 5s
        dns_sd_configs:
          - names:
            - 'tasks.node-exporter'
            type: 'A'
            port: 9100
      - job_name: 'cadvisor'
        scrape_interval: 5s
        dns_sd_configs:
          - names:
            - 'tasks.cadvisor'
            type: 'A'
            port: 8080
    EOF

- node-exporter and cadvisor are discovered through DNS service discovery (dns_sd_configs).
- Our own applications are discovered through file_sd_configs: the targets listed in conf/service.yaml (mounted into the container as /etc/prometheus/service.yaml) are the services Prometheus monitors.
- The Prometheus version used here (v2.16.0) has no service discovery component dedicated to swarm, so new targets have to be appended by hand to the file referenced by file_sd_configs. The official Prometheus site points to one solution; see prometheus-swarm-discovery for details.
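Appending targets by hand can be sketched as a small shell helper. The function `add_sd_target`, the target `my-app:8080`, and the `app` label are all hypothetical and only illustrate the file format that file_sd_configs reads (a YAML list of target groups); run it from /opt/k8s/prometheus so that `./conf` is the directory mounted into the container.

```shell
# Hypothetical helper: append one target group to a file_sd target file.
# Usage: add_sd_target <file> <host:port> [app-label]
add_sd_target() {
  local file="$1"
  local target="$2"
  local app="${3:-springboot}"
  mkdir -p "$(dirname "$file")"
  # Each entry is one item of a YAML list: a target list plus optional labels.
  printf '%s\n' \
    "- targets: ['$target']" \
    "  labels:" \
    "    app: $app" >> "$file"
}

# Example: register a (hypothetical) Spring Boot service reachable as my-app:8080.
add_sd_target ./conf/service.yaml my-app:8080 my-app
```

Prometheus re-reads file_sd files when they change, so a restart should not be needed after appending an entry.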

Start Prometheus

    $ cd /opt/k8s/prometheus
    $ docker stack deploy -c prome-stack.yml prom

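After the deploy it can take a moment before the container answers. A minimal sketch for waiting on Prometheus's `/-/healthy` endpoint follows; the function name `wait_for_prometheus`, the default URL `http://localhost:9090` (taken from the published port above), and the retry count are assumptions for illustration.

```shell
# Hypothetical helper: poll the Prometheus health endpoint until it responds.
# Usage: wait_for_prometheus [base-url] [attempts]
wait_for_prometheus() {
  local base_url="${1:-http://localhost:9090}"
  local attempts="${2:-30}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors, so only a 2xx answer counts as healthy.
    if curl -fsS "$base_url/-/healthy" >/dev/null 2>&1; then
      echo "Prometheus is healthy at $base_url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "Prometheus did not become healthy at $base_url" >&2
  return 1
}
```

Running `wait_for_prometheus` on the manager node, optionally alongside `docker stack services prom`, gives a quick check that the service actually came up.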

Deploy node-exporter

Node-Exporter is not designed for the Mac platform and does not collect system information correctly when run on a Mac. If the platform is a Mac, do not deploy this component.

The heredoc delimiter is quoted ('EOF') so that the shell leaves the `$$` in the regular expression untouched; with an unquoted delimiter the shell would replace it with its own process ID before the file is written.

    $ cd /opt/k8s/prometheus
    $ cat > node-exporter-stack.yml<<'EOF'
    version: "3"
    services:
      node-exporter:
        image: quay.azk8s.cn/prometheus/node-exporter:v0.18.1
        volumes:
          - /proc:/host/proc:ro
          - /sys:/host/sys:ro
          - /:/rootfs:ro
        command:
          - '--path.procfs=/host/proc'
          - '--path.sysfs=/host/sys'
          - --collector.filesystem.ignored-mount-points
          - "^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)"
        #ports:
        #  - 9100:9100
        networks:
          - mcsas-network
        deploy:
          mode: global
          restart_policy:
            condition: on-failure
    networks:
      mcsas-network:
        external: true
    EOF

Start node-exporter

    $ cd /opt/k8s/prometheus
    $ docker stack deploy -c node-exporter-stack.yml node

Deploy cadvisor

    $ cd /opt/k8s/prometheus
    $ cat > cadvisor-stack.yml<<EOF
    version: "3"
    services:
      cadvisor:
        image: gcr.azk8s.cn/google_containers/cadvisor:v0.35.0
        volumes:
          - /:/rootfs:ro
          - /var/run:/var/run:rw
          - /sys:/sys:ro
          - /var/lib/docker/:/var/lib/docker:ro
        #ports:
        #  - 8080:8080
        networks:
          - mcsas-network
        deploy:
          mode: global
          restart_policy:
            condition: on-failure
    networks:
      mcsas-network:
        external: true
    EOF

- About the image: google/cadvisor is deprecated and no longer receives new images, so gcr.io/google-containers/cadvisor should be used instead. Because gcr.io cannot be reached from mainland China, the image is pulled from the gcr.azk8s.cn mirror here.

Start cadvisor

    $ docker stack deploy -c cadvisor-stack.yml cadvisor

Deploy grafana

    $ cd /opt/k8s/prometheus
    $ cat > grafana-stack.yml<<EOF
    version: "3"
    services:
      grafana:
        image: grafana/grafana:6.6.2
        volumes:
          - grafana-data:/var/lib/grafana
        deploy:
          replicas: 1
          restart_policy:
            condition: on-failure
          resources:
            limits:
              cpus: "0.2"
              memory: 200M
        ports:
          - 3000:3000
        networks:
          - mcsas-network
    volumes:
      grafana-data: {}
    networks:
      mcsas-network:
        external: true
    EOF

Start grafana

    $ docker stack deploy -c grafana-stack.yml grafana

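With all four stacks running, one rough way to confirm that the DNS and file based discovery found their targets is to ask Prometheus how many targets it currently considers up. The sketch below uses the `/api/v1/targets` HTTP API; the function name `count_up_targets` is hypothetical, and the `grep` pattern assumes the API returns compact JSON without spaces around the colon (a JSON tool such as jq would be more robust).

```shell
# Hypothetical helper: count scrape targets that Prometheus reports as healthy.
# Usage: count_up_targets [base-url]
count_up_targets() {
  local base_url="${1:-http://localhost:9090}"
  # Every active target carries a "health" field; count the ones that are "up".
  curl -fsS "$base_url/api/v1/targets" \
    | grep -o '"health":"up"' \
    | wc -l \
    | tr -d ' '
}
```

On a healthy setup `count_up_targets` should report at least one target per node for node-exporter and cadvisor, plus Prometheus itself and any services listed in conf/service.yaml.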
All components are now deployed. For the detailed steps to configure dashboards in grafana and monitor the metrics, see [Prometheus grafana installation](Prometheus grafana安裝.md).
