Using Thanos for Prometheus High Availability and Data Persistence (2)


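Prometheus officially offers several high-availability approaches:

    1. HA: two Prometheus instances scrape exactly the same data, with a load balancer in front of them.
    2. HA + remote storage: in addition to the basic multi-replica Prometheus setup, data is written to remote storage through Remote Write, which solves the storage persistence problem.
    3. Federation: targets are partitioned by function, different shards scrape different data, and a Global node stores everything centrally, which solves the problem of monitoring data scale.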

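Thanos can be deployed in two modes: Sidecar (the default) or Receiver. The setup below uses the Sidecar mode and consists of the following components: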

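  • Thanos Query: merges the data gathered from the Prometheus pods and exposes a query interface to clients.
  • Thanos Sidecar: wraps the data of the Prometheus container and exposes it to Thanos Query.
  • Prometheus container: scrapes the data and serves it to the Thanos Sidecar through the Remote Read API.
  • Thanos Store Gateway: exposes the data held in object storage to Thanos Query.
  • Thanos Compact: compacts and downsamples the data in object storage, which speeds up queries over long time ranges.
  • Thanos Ruler: evaluates rules and alerts against the monitoring data. It can also compute new series and make them available to Thanos Query and/or upload them to object storage for long-term retention.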
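As a small illustration of how a client talks to Thanos Query, the sketch below builds an instant-query URL against its Prometheus-compatible HTTP API. The `dedup` parameter asks Thanos Query to merge series that differ only in the replica label. The helper name `thanos_query_url` is hypothetical, and the expression is assumed to be URL-safe (no encoding is done here).

```shell
# Build a Thanos Query instant-query URL.
# $1 = base URL of Thanos Query
# $2 = PromQL expression (assumed to be URL-safe, e.g. "up")
# $3 = dedup flag, "true" or "false" (defaults to "true")
thanos_query_url() {
  printf '%s/api/v1/query?query=%s&dedup=%s\n' "$1" "$2" "${3:-true}"
}

# Usage (needs a running Thanos Query, so it is left commented out):
# curl -s "$(thanos_query_url http://127.0.0.1:8080 up true)"
```

Running the same query with `dedup=false` should return one series per Prometheus replica, which is a quick way to confirm that both replicas are being scraped.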

Thanos primary node:

thanos compact --data-dir ./thanos/comp --http-address 0.0.0.0:19192 --objstore.config-file ./bucket_config.yaml
thanos store --data-dir ./thanos/store --objstore.config-file ./bucket_config.yaml --http-address 0.0.0.0:19191 --grpc-address 0.0.0.0:19090
thanos query --http-address 0.0.0.0:8080 --grpc-address 0.0.0.0:8081 --query.replica-label slave --store 172.16.10.11:10901 --store 172.16.10.10:10901 --store 127.0.0.1:19090
thanos sidecar --tsdb.path /data/ --prometheus.url http://localhost:9090 --objstore.config-file ./bucket_config.yaml --shipper.upload-compacted
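Once the four processes above are running, their readiness can be checked over HTTP. The sketch below only prints the URLs to probe. It assumes the HTTP ports from the commands above (19192 for compact, 19191 for store, 8080 for query) plus 10902 for the sidecar, which is the sidecar's default HTTP port since no `--http-address` is given. The helper name `thanos_ready_urls` is hypothetical.

```shell
# Print the readiness URL of each Thanos component on one host.
# $1 = host name or IP address of the node
thanos_ready_urls() {
  host="$1"
  # compact, store gateway, query, sidecar (default HTTP port)
  for port in 19192 19191 8080 10902; do
    printf 'http://%s:%s/-/ready\n' "$host" "$port"
  done
}

# Usage (needs the components to be running, so it is left commented out):
# for url in $(thanos_ready_urls 127.0.0.1); do curl -fsS "$url" || echo "not ready: $url"; done
```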

Thanos standby node:

thanos sidecar --tsdb.path /data/ --prometheus.url http://localhost:9090 --objstore.config-file ./bucket_config.yaml --shipper.upload-compacted
thanos query --http-address 0.0.0.0:8080 --grpc-address 0.0.0.0:8081 --query.replica-label slave --store 172.16.10.10:10901 --store 172.16.10.11:10901 --store 172.16.10.10:19090

 

Note: bucket_config.yaml holds the cloud object storage configuration.
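The original post does not show the contents of bucket_config.yaml. As a rough sketch, assuming an S3-compatible bucket, the file could be generated like this. The bucket name, endpoint, and credentials are placeholders, not values from this setup, and the helper name `write_bucket_config` is hypothetical.

```shell
# Write a minimal Thanos object storage config for an S3-compatible bucket.
# $1 = path of the file to create
write_bucket_config() {
  cat > "$1" <<'EOF'
type: S3
config:
  bucket: "thanos-metrics"     # placeholder bucket name
  endpoint: "s3.example.com"   # placeholder endpoint, without scheme
  access_key: "REPLACE_ME"
  secret_key: "REPLACE_ME"
EOF
}

# Usage:
# write_bucket_config ./bucket_config.yaml
```

The same file is passed to the sidecar, store, and compact commands through `--objstore.config-file`, so all three must point at the same bucket.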

 

Prometheus configuration:

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
  external_labels:
    slave: "02"  # replica label: matches --query.replica-label and must differ on each Prometheus replica

# Alertmanager configuration
#alerting:
#  alertmanagers:
#  - static_configs:
#     - targets: ['127.0.0.1:9093']
alerting:
  alertmanagers:
    - scheme: http
      static_configs:
        - targets:
            - "172.16.10.10:9093"

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "/data/alert/etc/*.rule"
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'Node_Exporter'
    consul_sd_configs:
      - server: '172.16.10.10:8500'
    relabel_configs:
      - source_labels: ["__meta_consul_service_address"]
        regex: "(.*)"
        replacement: $1
        action: replace
        target_label: "address"
      - source_labels: ["__meta_consul_service"]
        regex: "(.*)"
        replacement: $1
        action: replace
        target_label: "hostname"
      - source_labels: ["__meta_consul_service_address"]
        regex: "10.2.*"
        action: drop
      - source_labels: ["__meta_consul_service_address"]
        regex: "10.3.*"
        action: drop
      - source_labels: ["__meta_consul_service_address"]
        regex: "10.7.*"
        action: drop
      - source_labels: ["__meta_consul_tags"]
        regex: ".*測試環境.*"
        action: drop
      - source_labels: ["__meta_consul_tags"]
        regex: ".*安全組.*"
        action: drop
      - source_labels: ["__meta_consul_tags"]
        regex: ",(.*),(.*),(.*),(.*),(.*),(.*),"
        replacement: $1
        action: replace
        target_label: "department"
      - source_labels: ["__meta_consul_tags"]
        regex: ",(.*),(.*),(.*),(.*),(.*),(.*),"
        replacement: $2
        action: replace
        target_label: "group"
      - source_labels: ["__meta_consul_tags"]
        regex: ",(.*),(prod|dev|pre|test),"
        replacement: $2
        action: replace
        target_label: "env"
      - source_labels: ["__meta_consul_tags"]
        regex: ",(.*),(.*),(.*),(.*),(.*),(.*),"
        replacement: $3
        action: replace
        target_label: "application"
      - source_labels: ["__meta_consul_tags"]
        regex: ",(.*),(.*),(.*),(.*),(.*),(.*),"
        replacement: $4
        action: replace
        target_label: "type"
      - source_labels: ["__meta_consul_tags"]
        regex: ",(.*),(.*),(.*),(.*),(.*),(.*),"
        replacement: $5
        action: replace
        target_label: "dc"
      - source_labels: ["__meta_consul_tags"]
        regex: ",(.*),(.*),(.*),(.*),(.*),(.*),"
        replacement: $6
        action: replace
        target_label: "appCode"
      - source_labels: ["__meta_consul_service_id"]
        regex: "(.*)"
        replacement: $1
        action: replace
        target_label: "id"


  - job_name: kubetake

    kubernetes_sd_configs:
      - role: node
        api_server: 'https://172.16.10.11:6443'
        tls_config:
          ca_file: /data/alert/etc/ca.crt
        bearer_token_file: /data/alert/etc/token
    relabel_configs:
      - action: labelmap
        regex: (.+)
      - target_label: __address__
        source_labels: [__meta_kubernetes_node_address_InternalIP]
        regex: (.+)
        replacement: ${1}:9100

 

