outlier detection
In the field of anomaly detection, one often needs to decide whether a newly observed point belongs to the same distribution as the existing observations (in which case it is called an inlier) or should be considered different (an outlier). An outlier is anomalous data, but not necessarily an erroneous data point.
In Envoy, outlier detection is the process of dynamically determining whether some hosts in an upstream cluster are behaving abnormally and removing them from the healthy load-balancing set. Outlier detection can be enabled together with, or independently of, active health checking, and the two form the basis of a complete upstream health-checking solution.
The concepts are not explained in depth here; refer to the official documentation for details.
Detection types
- Consecutive 5xx responses (consecutive_5xx)
- Consecutive gateway failures (consecutive_gateway_failure)
- Consecutive local origin failures (consecutive_local_origin_failure)
See the official outlier detection documentation for more details.
Testing outlier detection
Note: this test runs on a single machine only; for a fuller picture, verify against a real environment.
Environment setup
Use docker-compose to simulate five backend nodes:
version: '3'
services:
  envoy:
    image: envoyproxy/envoy-alpine:v1.15-latest
    environment:
      - ENVOY_UID=0
    ports:
      - 80:80
      - 443:443
      - 82:9901
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
    networks:
      envoymesh:
        aliases:
          - envoy
    depends_on:
      - webserver1
      - webserver2
  webserver1:
    image: sealloong/envoy-end:latest
    networks:
      envoymesh:
        aliases:
          - myservice
          - webservice
    expose:
      - 90
  webserver2:
    image: sealloong/envoy-end:latest
    networks:
      envoymesh:
        aliases:
          - myservice
          - webservice
    expose:
      - 90
  webserver3:
    image: sealloong/envoy-end:latest
    networks:
      envoymesh:
        aliases:
          - myservice
          - webservice
    expose:
      - 90
  webserver4:
    image: sealloong/envoy-end:latest
    networks:
      envoymesh:
        aliases:
          - myservice
          - webservice
    expose:
      - 90
  webserver5:
    image: sealloong/envoy-end:latest
    networks:
      envoymesh:
        aliases:
          - myservice
          - webservice
    expose:
      - 90
networks:
  envoymesh: {}
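With both files in the same directory, the stack can be brought up and sanity-checked from the shell; a minimal sketch, assuming the port mappings above:

# start Envoy and the five backends in the background
docker-compose up -d
# confirm that all six containers are running
docker-compose ps
# the Envoy admin interface is published on host port 82
curl -s http://localhost:82/clusters | head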
Envoy configuration file
admin:
  access_log_path: /dev/null
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: [ "*" ]
              routes:
              - match: { prefix: "/" }
                route: { cluster: local_service }
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: local_service
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: webservice, port_value: 90 }
    health_checks:
    - timeout: 3s
      interval: 90s
      unhealthy_threshold: 5
      healthy_threshold: 5
      no_traffic_interval: 240s
      http_health_check:
        path: "/ping"
        expected_statuses:
        - start: 200
          end: 201
    outlier_detection:
      consecutive_5xx: 2
      base_ejection_time: 30s
      max_ejection_percent: 40
      interval: 20s
      success_rate_minimum_hosts: 5
      success_rate_request_volume: 10
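Before starting the stack, the configuration can be checked with Envoy's validate mode; a quick sketch, reusing the same image:

# parse and validate envoy.yaml without actually starting the proxy
docker run --rm -v "$(pwd)/envoy.yaml:/etc/envoy/envoy.yaml" \
  envoyproxy/envoy-alpine:v1.15-latest --mode validate -c /etc/envoy/envoy.yaml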
Configuration notes
outlier_detection:
  consecutive_5xx: 2              # number of consecutive 5xx responses before a host is ejected
  base_ejection_time: 30s         # base ejection time; the actual time equals the base time multiplied by the number of times the host has been ejected
  max_ejection_percent: 40        # maximum percentage of the cluster that may be ejected; the default is 10%, raised to 40% here, i.e. 2 of the 5 nodes
  interval: 20s                   # interval between ejection analysis sweeps
  success_rate_minimum_hosts: 5   # minimum number of hosts with enough request volume for success-rate ejection to run
  success_rate_request_volume: 10 # minimum number of requests a host must receive in one interval to be included in success-rate detection
To make the effect easier to observe, the active health-check interval and the host ejection time have been increased here.
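The detector's view of the cluster can be watched through the admin interface while the test runs; a minimal sketch, assuming the 82:9901 admin mapping from the compose file:

# per-host health flags (healthy / failed_outlier_check) and the
# cluster's outlier-detection success-rate statistics
curl -s http://localhost:82/clusters | grep -E 'health_flags|outlier'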
Routes
/502bad
Simulates a 502 error.
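A simple way to generate the mixed traffic used below; the loop is illustrative, and any pattern that produces two consecutive 502s from one host (consecutive_5xx: 2) will trigger an ejection:

# interleave healthy requests with 502s through Envoy on port 80
for i in $(seq 1 10); do
  curl -s -o /dev/null -w '%{http_code}\n' http://localhost/
  curl -s -o /dev/null -w '%{http_code}\n' http://localhost/502bad
done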
Results
Simulate a mix of 5xx and 200 requests:
envoy_1 | [2020-09-13 06:10:01.093][1][warning][main] [source/server/server.cc:537] there is no configured limit to the number of allowed active connections. Set a limit via the runtime key overload.global_downstream_max_connections
webserver2_1 | [GIN] 2020/09/13 - 06:10:08 | 200 | 63.272µs | 172.22.0.7 | GET "/"
webserver5_1 | [GIN] 2020/09/13 - 06:10:10 | 200 | 46.732µs | 172.22.0.7 | GET "/"
webserver1_1 | [GIN] 2020/09/13 - 06:10:11 | 200 | 45.43µs | 172.22.0.7 | GET "/"
webserver3_1 | [GIN] 2020/09/13 - 06:10:13 | 502 | 43.858µs | 172.22.0.7 | GET "/502bad"
webserver4_1 | [GIN] 2020/09/13 - 06:10:14 | 502 | 47.486µs | 172.22.0.7 | GET "/502bad"
webserver2_1 | [GIN] 2020/09/13 - 06:10:15 | 200 | 15.691µs | 172.22.0.7 | GET "/"
webserver5_1 | [GIN] 2020/09/13 - 06:10:16 | 200 | 14.719µs | 172.22.0.7 | GET "/"
webserver1_1 | [GIN] 2020/09/13 - 06:10:16 | 200 | 15.758µs | 172.22.0.7 | GET "/"
webserver3_1 | [GIN] 2020/09/13 - 06:10:17 | 502 | 15.697µs | 172.22.0.7 | GET "/502bad"
webserver2_1 | [GIN] 2020/09/13 - 06:10:17 | 502 | 14.002µs | 172.22.0.7 | GET "/502bad"
webserver5_1 | [GIN] 2020/09/13 - 06:10:17 | 502 | 14.913µs | 172.22.0.7 | GET "/502bad"
webserver1_1 | [GIN] 2020/09/13 - 06:10:18 | 502 | 14.911µs | 172.22.0.7 | GET "/502bad"
webserver4_1 | [GIN] 2020/09/13 - 06:10:18 | 502 | 30.429µs | 172.22.0.7 | GET "/502bad"
webserver5_1 | [GIN] 2020/09/13 - 06:10:19 | 200 | 14.377µs | 172.22.0.7 | GET "/"
webserver1_1 | [GIN] 2020/09/13 - 06:10:19 | 200 | 14.861µs | 172.22.0.7 | GET "/"
webserver2_1 | [GIN] 2020/09/13 - 06:10:19 | 200 | 18.924µs | 172.22.0.7 | GET "/"
webserver5_1 | [GIN] 2020/09/13 - 06:10:19 | 200 | 15.899µs | 172.22.0.7 | GET "/"
webserver1_1 | [GIN] 2020/09/13 - 06:10:19 | 200 | 24.849µs | 172.22.0.7 | GET "/"
The cluster has ejected 40% of its nodes (webserver3 and webserver4, each of which returned two consecutive 502s); their health-check status is failed_outlier_check.
Requests are now distributed across the remaining three nodes.
After 30 seconds, the ejected hosts have returned to normal.
Simulate requests again.
After 30 seconds, if no new requests arrive within the detection interval, the nodes still show failed_outlier_check; they recover once new requests come in.
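One way to observe that recovery from the admin interface, assuming the same setup:

# send a fresh request so the host is re-evaluated, then
# inspect the per-host health flags again
curl -s -o /dev/null http://localhost/
curl -s http://localhost:82/clusters | grep health_flags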