1) Cluster manager and service discovery mechanisms
3) Load balancing policies
(1) Distributed load balancing:
◆ load balancing algorithms: weighted round robin, weighted least request, ring hash, Maglev, random, etc.
◆ zone-aware routing
(2) Global load balancing:
◆ locality priority
◆ locality weights
◆ load balancer subsets
II. Cluster Manager
1) Envoy supports configuring any number of upstream clusters at the same time and manages them through the Cluster Manager.
(1) For each cluster, the Cluster Manager manages the health status of the upstream hosts, the load balancing mechanism, the connection types and the applicable protocols;
(2) cluster configuration can be generated in two ways: statically, or dynamically via CDS (as sketched below).
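As a rough sketch of the dynamic (CDS) path, a bootstrap fragment along the following lines points Envoy at a CDS management server; the cluster name xds_cluster and the server address are placeholders, not part of the original material:

dynamic_resources:
  cds_config:
    resource_api_version: V3
    api_config_source:
      api_type: GRPC
      transport_api_version: V3
      grpc_services:
      - envoy_grpc:
          cluster_name: xds_cluster        # assumed name of a statically defined cluster pointing at the management server
static_resources:
  clusters:
  - name: xds_cluster
    type: STRICT_DNS
    connect_timeout: 0.25s
    http2_protocol_options: {}
    load_assignment:
      cluster_name: xds_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: xds-server, port_value: 18000 }   # placeholder management-server address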
2) Cluster warming
(1) A cluster goes through a warming phase when it is initialized at server start-up or via CDS, which means the cluster:
◆ is unavailable until its initial service discovery load (e.g. DNS resolution, EDS updates) has completed;
◆ when active health checking is configured, is unavailable until Envoy has sent an active health-check probe to every discovered upstream host and the initial active health checks have completed successfully.
(2) Consequently, a newly added cluster is invisible to the other Envoy components until its initialization has finished; a cluster being updated is atomically swapped with the old cluster once its warming completes, so no traffic-interrupting errors occur.
Configuring a single cluster
Overview of the v3 cluster configuration skeleton
{ "transport_socket_matches": [], "name": "...", "alt_stat_name": "...", "type": "...", "cluster_type": "{...}", "eds_cluster_config": "{...}", "connect_timeout": "{...}", "per_connection_buffer_limit_bytes": "{...}", "lb_policy": "...", "load_assignment": "{...}", "health_checks": [], "max_requests_per_connection": "{...}", "circuit_breakers": "{...}", "upstream_http_protocol_options": "{...}", "common_http_protocol_options": "{...}", "http_protocol_options": "{...}", "http2_protocol_options": "{...}", "typed_extension_protocol_options": "{...}", "dns_refresh_rate": "{...}", "dns_failure_refresh_rate": "{...}", "respect_dns_ttl": "...", "dns_lookup_family": "...", "dns_resolvers": [], "use_tcp_for_dns_lookups": "...", "outlier_detection": "{...}", "cleanup_interval": "{...}", "upstream_bind_config": "{...}", "lb_subset_config": "{...}", "ring_hash_lb_config": "{...}", "maglev_lb_config": "{...}", "original_dst_lb_config": "{...}", "least_request_lb_config": "{...}", "common_lb_config": "{...}", "transport_socket": "{...}", "metadata": "{...}", "protocol_selection": "...", "upstream_connection_options": "{...}", "close_connections_on_host_health_failure": "...", "ignore_health_on_host_removal": "...", "filters": [], "track_timeout_budgets": "...", "upstream_config": "{...}", "track_cluster_stats": "{...}", "preconnect_policy": "{...}", "connection_pool_per_downstream_connection": "..." }
1. When configuring upstream clusters, the cluster manager needs to know how to resolve the cluster members; the corresponding resolution mechanism is service discovery.
(1) Each member of a cluster is identified by an endpoint, which can be configured statically by the user or discovered dynamically via EDS or DNS:
◆ Static: the resolved name (IP address/port or unix domain socket path) of each upstream host is specified explicitly;
◆ Strict DNS: Envoy continuously and asynchronously resolves the specified DNS target and treats every IP address returned in the DNS result as an available member of the upstream cluster;
◆ Logical DNS: the cluster only uses the first IP address returned when a new connection needs to be established, rather than strictly taking the full DNS query result and assuming it makes up the entire upstream cluster; suitable for large-scale web service clusters that must be accessed via DNS;
◆ Original destination: usable when incoming connections are redirected to Envoy through an iptables REDIRECT or TPROXY target, or via the Proxy Protocol;
◆ Endpoint discovery service (EDS): a service discovery mechanism that obtains cluster members from an xDS management server over a gRPC or REST-JSON API;
◆ Custom cluster: Envoy also supports specifying a custom cluster discovery mechanism through the cluster_type field of the cluster configuration.
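For a concrete feel of the difference, a minimal sketch (host names and ports are illustrative only) contrasting a STATIC cluster with a STRICT_DNS cluster might look like this:

clusters:
- name: static_cluster            # members listed explicitly as resolved IP addresses
  type: STATIC
  connect_timeout: 0.25s
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: static_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: 127.0.0.1, port_value: 8080 }
- name: dns_cluster               # members resolved from DNS; every returned address becomes an endpoint
  type: STRICT_DNS
  connect_timeout: 0.25s
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: dns_cluster
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: myservice, port_value: 80 }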
2. Envoy's service discovery does not assume a fully consistent view; instead it assumes hosts join and leave the mesh in an eventually consistent way, and it combines discovery with active health checking to determine cluster health.
(1) Health decisions are made in a fully distributed way, so network partitions are handled gracefully.
(2) Once host health checking is enabled for a cluster, Envoy combines a host's discovery status with its health-check status to decide whether requests should be routed to it.
1. Failure handling mechanisms
1) Envoy provides a series of out-of-the-box failure handling mechanisms:
◆ Timeouts
◆ A bounded number of retries, with support for variable retry delays
◆ Active health checking and outlier detection
◆ Connection pools
◆ Circuit breakers
3) Combined with its traffic management mechanisms, users can tailor the desired failure recovery behavior for each service/version (e.g. per-route timeouts and retry policies, sketched below).
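As a rough sketch of how the first two mechanisms surface in a route configuration (the cluster name and the numbers are illustrative, not from the original material), a per-route timeout and retry policy might look like this:

routes:
- match: { prefix: "/" }
  route:
    cluster: webcluster1            # assumed cluster name
    timeout: 2s                     # overall upstream timeout for this route
    retry_policy:
      retry_on: "5xx,connect-failure"   # conditions that trigger a retry
      num_retries: 3                    # bounded number of retries
      per_try_timeout: 0.5s             # timeout applied to each individual attempt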
2. Upstream health checking
1) Health checking ensures that the proxy never forwards downstream client requests to upstream hosts that are not working properly.
2) Envoy supports two types of health checking, both defined per cluster:
(1) Active health checking: Envoy periodically sends probe requests to the upstream hosts and judges their health from the responses; three kinds of active checks are currently supported:
◆ HTTP: send an HTTP request to the upstream host;
◆ L3/L4: send an L3/L4 probe to the upstream host and judge health from the response, or simply from whether the connection succeeds;
◆ Redis: send a Redis PING to the upstream Redis server.
(2) Passive health checking: Envoy performs passive health checking through its outlier detection mechanism.
◆ Currently only the http router, tcp proxy and redis proxy filters support outlier detection.
◆ Envoy supports the following types of outlier detection:
  - Consecutive 5xx: covers all error types; errors generated by filters other than the http router are also internally mapped to 5xx codes;
  - Consecutive gateway failure: a subset of consecutive 5xx, counting only the HTTP 502, 503 and 504 gateway failures;
  - Consecutive local-origin failure: Envoy cannot connect to the upstream host, or communication with it is repeatedly interrupted;
  - Success rate: a threshold on a host's aggregated success-rate statistics.
Host health checking must be defined explicitly for a cluster; otherwise all discovered upstream hosts are treated as available. Configuration syntax:
clusters:
- name: ...
  ...
  load_assignment:
    endpoints:
    - lb_endpoints:
      - endpoint:
          health_check_config:
            port_value: ...          # custom port used for health checks;
        ...
  ...
  health_checks:
  - timeout: ...                     # check timeout
    interval: ...                    # check interval
    initial_jitter: ...              # jitter added to the first check, in milliseconds;
    interval_jitter: ...             # jitter added to each interval, in milliseconds;
    unhealthy_threshold: ...         # number of failed checks required before a host is marked unhealthy (unavailable);
    healthy_threshold: ...           # number of successful checks required before a host is marked healthy; a single success suffices right after startup;
    http_health_check: {...}         # HTTP check; exactly one of the following four check types must be set;
    tcp_health_check: {...}          # TCP check;
    grpc_health_check: {...}         # gRPC-specific check;
    custom_health_check: {...}       # custom check;
    reuse_connection: ...            # boolean; whether to reuse the connection across checks, defaults to true;
    unhealthy_interval: ...          # check interval for endpoints marked "unhealthy"; reverts to the normal interval once the endpoint is marked "healthy" again;
    unhealthy_edge_interval: ...     # check interval used right after an endpoint is marked "unhealthy"; subsequent checks use unhealthy_interval;
    healthy_edge_interval: ...       # check interval used right after an endpoint is marked "healthy"; subsequent checks use interval;
    tls_options: { ... }             # TLS-related options
    transport_socket_match_criteria: {...} # Optional key/value pairs that will be used to match a transport socket from those specified in the cluster's transport socket matches.
TCP health check
clusters:
- name: local_service
  connect_timeout: 0.25s
  lb_policy: ROUND_ROBIN
  type: EDS
  eds_cluster_config:
    eds_config:
      api_config_source:
        api_type: GRPC
        grpc_services:
        - envoy_grpc:
            cluster_name: xds_cluster
  health_checks:
  - timeout: 5s
    interval: 10s
    unhealthy_threshold: 2
    healthy_threshold: 2
    tcp_health_check: {}
A TCP check with a non-empty payload can use send and receive to specify, respectively, the request payload and the byte blocks expected to be fuzzy-matched in the response:
{ "send": "{...}", "receive": [] }
An HTTP check can customize the path, host and expected response codes it uses, and can modify (add/remove) request headers when necessary.
The configuration syntax is as follows:
health_checks: []
- ...
  http_health_check:
    "host": "..."                    # Host header used for the check; defaults to empty, in which case the cluster name is used;
    "path": "..."                    # path used for the check, e.g. /healthz; required;
    "service_name_matcher": "..."    # optional parameter used to validate the service name of the checked cluster;
    "request_headers_to_add": []     # list of custom headers to add to the probe request;
    "request_headers_to_remove": []  # list of headers to remove from the probe request;
    "expected_statuses": []          # list of expected response status codes;
Configuration example
clusters:
- name: local_service
  connect_timeout: 0.25s
  lb_policy: ROUND_ROBIN
  type: EDS
  eds_cluster_config:
    eds_config:
      api_config_source:
        api_type: GRPC
        grpc_services:
        - envoy_grpc:
            cluster_name: xds_cluster
  health_checks:
  - timeout: 5s
    interval: 10s
    unhealthy_threshold: 2
    healthy_threshold: 2
    http_health_check:
      host: ...                # defaults to empty, in which case the cluster name is used;
      path: ...                # the path probed, e.g. /healthz;
      expected_statuses: ...   # expected response codes, defaults to 200;
1) Outlier host ejection mechanism
Determine that a host is an outlier -> if the host has not yet been ejected and the number of ejected hosts is below the allowed threshold, eject it -> the host remains ejected for a period of time -> after that period it automatically returns to service.
2) Outlier detection is defined in the cluster context through the outlier_detection field.
clusters:
- name: ...
  ...
  outlier_detection:
    consecutive_5xx: ...                 # number of consecutive 5xx responses or local-origin errors allowed before ejecting a host for consecutive 5xx; defaults to 5;
    interval: ...                        # time interval between ejection analysis sweeps; defaults to 10000ms (10s);
    base_ejection_time: ...              # base duration a host is ejected for; the actual duration equals the base duration multiplied by the number of times the host has already been ejected; defaults to 30000ms (30s);
    max_ejection_percent: ...            # maximum percentage of hosts in the upstream cluster that may be ejected due to outlier detection; defaults to 10%, but at least one host may always be ejected;
    enforcing_consecutive_5xx: ...       # probability that a host is actually ejected when detected as an outlier via consecutive 5xx; can be used to disable ejection or ramp it up slowly; defaults to 100;
    enforcing_success_rate: ...          # probability that a host is actually ejected when detected via success-rate statistics; can be used to disable ejection or ramp it up slowly; defaults to 100;
    success_rate_minimum_hosts: ...      # minimum number of hosts in the cluster required to run success-rate outlier detection; defaults to 5;
    success_rate_request_volume: ...     # minimum total number of requests that must be collected in one interval; defaults to 100;
    success_rate_stdev_factor: ...       # factor used to determine the ejection threshold for success-rate outlier ejection; ejection threshold = mean - (factor * stddev of success rates); the configured value is divided by 1000 to obtain the factor, e.g. set 1300 for a factor of 1.3;
    consecutive_gateway_failure: ...     # minimum number of consecutive gateway failures before a host is ejected; defaults to 5;
    enforcing_consecutive_gateway_failure: ... # percentage probability of ejecting a host when consecutive gateway failures are detected; defaults to 0;
    split_external_local_origin_errors: ...    # whether to distinguish locally originated failures from external failures; defaults to false; the following three settings only take effect when this is true;
    consecutive_local_origin_failure: ...      # minimum number of locally originated failures before a host is ejected; defaults to 5;
    enforcing_consecutive_local_origin_failure: ... # percentage probability of ejecting a host when consecutive local-origin failures are detected; defaults to 100;
    enforcing_local_origin_success_rate: ...        # probability of ejecting a host when local-origin success-rate statistics detect an outlier; defaults to 100;
    failure_percentage_threshold: {...}        # failure percentage used for failure-percentage-based outlier detection; a host whose failure percentage is greater than or equal to this value is ejected; defaults to 85;
    enforcing_failure_percentage: {...}        # percentage probability of actually ejecting a host when failure-percentage statistics detect an outlier; can be used to disable ejection or ramp it up slowly; defaults to 0;
    enforcing_failure_percentage_local_origin: {...} # percentage probability of actually ejecting a host when local-origin failure-percentage statistics detect an outlier; defaults to 0;
    failure_percentage_minimum_hosts: {...}    # minimum number of hosts in the cluster required to perform failure-percentage-based ejection; if the cluster has fewer hosts, no failure-percentage-based ejection is performed; defaults to 5;
    failure_percentage_request_volume: {...}   # minimum total number of requests that must be collected in one interval (defined by the interval above) to perform failure-percentage-based ejection for a host; below this, no such ejection is performed for the host; defaults to 50;
    max_ejection_time: {...}             # maximum ejection duration for a host; if unspecified, the larger of the default (300000ms or 300s) and base_ejection_time is used;
1) Like active health checking, outlier detection is configured at the cluster level; the following example ejects a host for 30 seconds after it returns 3 consecutive 5xx errors:
consecutive_5xx: "3" base_ejection_time: "30s"
2) When enabling outlier detection on a new service, start with a less strict rule set, so that only hosts with gateway connectivity errors (HTTP 503) are ejected, and only 10% of the time:
consecutive_gateway_failure: "3" base_ejection_time: "30s" enforcing_consecutive_gateway_failure: "10"
3) Meanwhile, high-traffic, stable services can use statistics to eject hosts that are frequently outliers; the configuration below ejects any endpoint whose success rate is more than one standard deviation below the cluster average; statistics are evaluated every 10 seconds, and the algorithm does not run for any host that received fewer than 500 requests in the 10-second window:
interval: "10s" base_ejection_time: "30s" success_rate_minimum_hosts: "10" success_rate_request_volume: "500" success_rate_stdev_factor: "1000“ # divided by 1000 to get a double
1) Envoy provides several different load balancing policies, which broadly fall into two categories: global load balancing and distributed load balancing.
(1) Distributed load balancing: Envoy itself decides how to distribute load across the endpoints based on the location (zone awareness) and health of the upstream hosts:
◆ active health checking
◆ zone-aware routing
◆ load balancing algorithms
(2) Global load balancing: a single component with global authority makes the load distribution decisions; Envoy's control plane is one such component, and it can tune the load applied to each endpoint by specifying:
◆ priorities
◆ locality weights
◆ endpoint weights
◆ endpoint health status
3) Load-balancing-related configuration parameters in a Cluster:
...
load_assignment: {...}
  cluster_name: ...
  endpoints: []                        # list of LocalityLbEndpoints; each entry mainly consists of a locality, an endpoint list, a weight and a priority;
  - locality: {...}                    # locality definition
      region: ...
      zone: ...
      sub_zone: ...
    lb_endpoints: []                   # endpoint list
    - endpoint: {...}                  # endpoint definition
        address: {...}                 # endpoint address
        health_check_config: {...}     # health-check-related settings for this endpoint;
      load_balancing_weight: ...       # load-balancing weight of this endpoint, optional;
      metadata: {...}                  # metadata that supplies extra information to filters based on the matched listener, filter chain, route and endpoint; commonly used to convey service configuration or to assist load balancing;
      health_status: ...               # for endpoints discovered via EDS, administratively sets the endpoint's health state; valid values are UNKNOWN, HEALTHY, UNHEALTHY, DRAINING, TIMEOUT and DEGRADED;
    load_balancing_weight: {...}       # weight of this locality
    priority: ...                      # priority
  policy: {...}                        # load-balancing policy settings
    drop_overloads: []                 # overload protection: a mechanism for dropping overload traffic;
    overprovisioning_factor: ...       # integer defining the overprovisioning factor (as a percentage); defaults to 140, i.e. 1.4;
    endpoint_stale_after: ...          # staleness duration: an endpoint that receives no new traffic assignment before this expires is considered stale and marked unhealthy; the default of 0 means it never goes stale;
lb_subset_config: {...}
ring_hash_lb_config: {...}
original_dst_lb_config: {...}
least_request_lb_config: {...}
common_lb_config: {...}
  healthy_panic_threshold: ...         # panic threshold, defaults to 50%;
  zone_aware_lb_config: {...}          # settings for zone-aware routing;
  locality_weighted_lb_config: {...}   # settings for locality-weighted load balancing;
  ignore_new_hosts_until_first_hc: ... # whether newly added hosts are excluded from load balancing until they pass their first health check;
1) The Cluster Manager uses the configured load balancing policy to schedule downstream requests onto the selected upstream hosts; the following algorithms are supported:
(1) Weighted round robin: policy name ROUND_ROBIN
(2) Weighted least request: policy name LEAST_REQUEST
(3) Ring hash: policy name RING_HASH; works like a consistent hashing algorithm
(4) Maglev: similar to ring hash, but with a fixed table size of 65537, which the nodes mapped from the hosts must completely fill; regardless of the configured host and locality weights, the algorithm tries to ensure every host is mapped at least once; policy name MAGLEV
(5) Random: when no health checking policy is configured, the random load balancer generally performs better than round robin.
2. Load balancing algorithm: weighted least request
1) The weighted least request algorithm uses a different approach depending on whether all host weights are equal:
(1) All hosts have the same weight:
This is an O(1) scheduling algorithm that randomly picks N available hosts (2 by default, configurable) and selects the one with the fewest active requests. Research shows this "power of two choices" (P2C) approach is no worse than a full O(N) scan, and it guarantees that the endpoint with the most outstanding connections in the cluster never receives a new request until its count drops to or below another host's.
(2) Hosts have different weights, i.e. two or more hosts in the cluster carry different weights:
The scheduler uses a weighted round robin mode in which weights are adjusted dynamically according to each host's load at request time, by dividing the weight by the current active request count (for example, a host with weight 2 and 4 active requests has an effective weight of 2/4 = 0.5).
This algorithm provides good balance in steady state but may not adapt quickly to highly imbalanced load; unlike P2C, a host is never fully drained, although it receives fewer requests over time.
2) LEAST_REQUEST configuration parameters
least_request_lb_config:
  choice_count: "{...}"   # how many healthy hosts are randomly sampled for the least-active-requests comparison;
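A minimal usage sketch (the cluster name and the choice_count value are illustrative), sampling 5 hosts per pick instead of the default 2:

clusters:
- name: webcluster1               # assumed cluster name
  type: STRICT_DNS
  connect_timeout: 0.25s
  lb_policy: LEAST_REQUEST
  least_request_lb_config:
    choice_count: 5               # compare active requests across 5 randomly sampled healthy hosts
  load_assignment:
    cluster_name: webcluster1
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: myservice, port_value: 80 }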
1) Envoy applies consistent hashing to the upstream hosts of a cluster using a ring/modulo algorithm, but it only takes effect when a corresponding hash policy is defined on the route.
(1) Each host is mapped onto a ring by hashing its address;
(2) a request is then mapped onto the ring by hashing some of its attributes, and routed to the closest host in clockwise order;
(3) this technique is commonly called "Ketama" hashing, and like all hash-based load balancers it is only effective when protocol routing specifies a value to hash on.
2) To avoid ring skew, each host is hashed and placed on the ring in proportion to its weight.
Best practice is to set the minimum_ring_size and maximum_ring_size parameters explicitly and to monitor the min_hashes_per_host and max_hashes_per_host gauges to make sure requests are well balanced.
3) Configuration parameters
ring_hash_lb_config:
  "minimum_ring_size": "{...}"   # minimum ring size; the larger the ring, the closer the distribution matches the weight ratios; defaults to 1024, maximum 8M;
  "hash_function": "..."         # hash function, XX_HASH or MURMUR_HASH_2; defaults to the former;
  "maximum_ring_size": "{...}"   # maximum ring size, defaults to 8M; larger values consume more compute resources;
1) route.RouteAction.HashPolicy
(1) A list of hash policies used by the consistent hashing algorithms, i.e. which parts of the request are hashed and mapped onto the host ring to complete routing.
(2) Each hash policy in the list is evaluated separately, and the combined result is used to route the request; the combination is deterministic, so the same list of hash policies always produces the same hash.
(3) If the specific part of the request that a hash policy inspects is absent, that policy produces no hash; if (and only if) all configured hash policies fail to produce a hash, no hash is generated for the route, in which case the behavior is the same as if no hash policies had been specified (the ring-hash load balancer picks a random backend).
(4) If a hash policy has its "terminal" attribute set to true and a hash has already been generated, the algorithm returns immediately and the remaining hash policies in the list are ignored.
2) Route hash policies are defined in the route configuration:
route_config:
  ...
  virtual_hosts:
  - ...
    routes:
    - match:
        ...
      route:
        ...
        hash_policy: []              # list of hash policies; each entry may set only one of header, cookie or connection_properties;
        - header: {...}
            header_name: ...         # name of the header to hash
          cookie: {...}
            name: ...                # name of the cookie whose value is used for hashing; required;
            ttl: ...                 # duration; if no cookie with this TTL exists, one is generated automatically; if the TTL is present and zero, the generated cookie is a session cookie
            path: ...                # path of the cookie;
          connection_properties: {...}
            source_ip: ...           # boolean; whether to hash the source IP address;
          terminal: ...              # short-circuit flag: once this policy produces a hash, the remaining hash policies in the list are ignored;
The following example hashes the request's source IP address and its User-Agent header:
static_resources:
  listeners:
  - address: ...
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        ...
            route:
              cluster: webcluster1
              hash_policy:
              - connection_properties:
                  source_ip: true
              - header:
                  header_name: User-Agent
        http_filters:
        - name: envoy.router
  clusters:
  - name: webcluster1
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: RING_HASH
    ring_hash_lb_config:
      maximum_ring_size: 1048576
      minimum_ring_size: 512
    load_assignment: ...
1) Maglev is a special form of the ring hash algorithm that uses a fixed table size of 65537.
(1) The table-building algorithm places each host on the table in proportion to its weight until the table is completely filled; for example, if host A has weight 1 and host B has weight 2, host A will have 21,846 entries and host B 43,691 (65,537 in total).
(2) The algorithm tries to place every host in the table at least once regardless of the configured host and locality weights, so in some extreme cases the actual proportions may differ from the configured weights.
(3) Best practice is to monitor the min_entries_per_host and max_entries_per_host gauges to make sure no host is badly misconfigured.
2) Maglev can replace ring hash anywhere consistent hashing is needed; like ring hash, Maglev is only effective when protocol routing specifies a value to hash on.
(1) Compared with the ring-hash (Ketama) algorithm, Maglev generally has significantly faster table build times and host selection times;
(2) its stability is slightly worse than ring hash.
VII. Locality priority and priority-based scheduling
(1) locality: identified hierarchically, from largest to smallest, by region, zone and sub_zone;
(2) load_balancing_weight: optional; assigns a weight to each priority/region/zone/sub_zone, with valid values in [1, n); normally, a locality's share of traffic is its weight divided by the sum of the weights of all localities at the same priority; this setting only takes effect when locality weighted load balancing is enabled;
(3) priority: the priority of this LocalityLbEndpoints group; defaults to the highest priority, 0.
3) Note that multiple LbEndpoints groups may be configured for the same locality, but this is normally only needed when different groups require different load-balancing weights or priorities.
# endpoint.LocalityLbEndpoints
{
  "locality": "{...}",
  "lb_endpoints": [],
  "load_balancing_weight": "{...}",
  "priority": "..."    # 0 is the highest priority; valid range is [0, N], and priority values must be used in order without gaps; defaults to 0;
}
1) When scheduling, Envoy only routes traffic to the highest-priority group of endpoints (LocalityLbEndpoints):
(1) Only when endpoints in the highest priority become unhealthy does traffic shift proportionally to the next priority; for example, when 20% of the endpoints in a priority are unhealthy, 20% of the traffic shifts to the next priority.
(2) Overprovisioning factor: an overprovisioning factor can also be set for a group of endpoints, so that the group keeps a larger share of traffic when some of its endpoints fail. Formula: shifted traffic = 100% - healthy endpoint ratio × overprovisioning factor. With a factor of 1.4, a 20% failure ratio keeps all traffic in the current group; only when the healthy ratio drops below roughly 72% does part of the traffic shift to the next priority. A priority level's current capacity to carry traffic is also called its health score (healthy host ratio × overprovisioning factor, capped at 100%).
(3) If the sum of the health scores of all priorities (the normalized total health) is below 100, Envoy considers there are not enough healthy endpoints to handle all pending traffic; in that case each level is assigned a share of 100% of the traffic in proportion to its health score. For example, two groups with health scores {20, 30} (normalized total health 50) are normalized to load shares of 40% and 60%.
2) In addition, priority scheduling supports a DEGRADED mechanism for endpoints within the same priority, which works just like the mechanism that splits traffic between two different priorities:
(1) When the healthy ratio of non-degraded endpoints × the overprovisioning factor is at least 100%, degraded endpoints receive no traffic;
(2) when it is below 100%, degraded endpoints receive the remaining share up to 100%.
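A quick worked example of this rule (numbers chosen purely for illustration): with the default overprovisioning factor of 1.4 and 60% of the non-degraded endpoints healthy,
traffic to non-degraded endpoints = min(100%, 60% * 1.4) = 84%
traffic to degraded endpoints = 100% - 84% = 16%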
1) During scheduling, Envoy only considers the available (healthy or degraded) endpoints in the upstream host list; however, when the percentage of available endpoints becomes too low, Envoy ignores the health status of all endpoints and routes traffic to all of them. This percentage is the panic threshold.
(1) The default panic threshold is 50%;
(2) the panic threshold prevents host failures from cascading as traffic grows.
When the number of available endpoints in a given priority drops, Envoy shifts some traffic to lower-priority endpoints:
◆ if endpoints found in the lower priorities can carry all of the traffic, the panic threshold is ignored;
◆ otherwise, Envoy distributes traffic across all priorities, and for any priority whose availability is below the panic threshold, that priority's traffic is spread across all of its hosts.
# Cluster.CommonLbConfig
{
  "healthy_panic_threshold": "{...}",   # percentage value defining the panic threshold; defaults to 50%;
  "zone_aware_lb_config": "{...}",
  "locality_weighted_lb_config": "{...}",
  "update_merge_window": "{...}",
  "ignore_new_hosts_until_first_hc": "..."
}
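A minimal sketch of lowering the panic threshold on a cluster (the cluster name and the 40% value are illustrative):

clusters:
- name: webcluster1
  ...
  common_lb_config:
    healthy_panic_threshold:
      value: 40.0    # enter panic mode only when fewer than 40% of the endpoints are available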
The following example defines two endpoint groups with different priorities, based on different localities:
clusters:
- name: webcluster1
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: webcluster1
    endpoints:
    - locality:
        region: cn-north-1
      priority: 0
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: webservice1
              port_value: 80
    - locality:
        region: cn-north-2
      priority: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: webservice2
              port_value: 80
  health_checks:
  - ...
1. Locality weighted load balancing configuration
1) Locality weighted load balancing explicitly assigns weights to specific localities and their associated LbEndpoints groups, and distributes traffic across the localities according to those weight ratios.
When all endpoints in all localities are available, weighted round robin is performed across the localities according to the locality weights.
For example, if the cn-north-1 and cn-north-2 regions have weights 1 and 2 respectively and all endpoints in both regions are healthy, traffic is split 1:2, i.e. roughly 33% and 67%.
How to enable locality weighted load balancing and define locality weights:
cluster:
- name: ...
  ...
  common_lb_config:
    locality_weighted_lb_config: {}   # enables locality weighted load balancing; it has no sub-parameters;
  ...
  load_assignment:
    endpoints:
      locality: "{...}"
      lb_endpoints: []
      load_balancing_weight: "{}"     # integer; weight of this locality or priority, minimum value 1;
      priority: "..."
2) When some endpoints of a locality become unavailable, Envoy dynamically adjusts that locality's weight in proportion.
Locality weighted load balancing also supports an overprovisioning factor for the LbEndpoints, defaulting to 1.4.
The effective weight of a locality (say X) is therefore computed as follows:
health(L_X) = 140 * healthy_X_backends / total_X_backends
effective_weight(L_X) = locality_weight_X * min(100, health(L_X))
load to L_X = effective_weight(L_X) / Σ_c(effective_weight(L_c))
For example, suppose localities X and Y have weights 1 and 2 respectively; when only 50% of Y's endpoints are healthy, its weight is adjusted to 2 × (1.4 × 0.5) = 1.4, and the traffic split becomes 1:1.4.
The overall selection order during scheduling is: (1) select the priority; (2) select a locality within the chosen priority; (3) select an endpoint within the chosen locality.
clusters:
- name: webcluster1
  connect_timeout: 0.25s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  common_lb_config:
    locality_weighted_lb_config: {}
  load_assignment:
    cluster_name: webcluster1
    policy:
      overprovisioning_factor: 140
    endpoints:
    - locality:
        region: cn-north-1
      priority: 0
      load_balancing_weight: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: colored
              port_value: 80
    - locality:
        region: cn-north-2
      priority: 0
      load_balancing_weight: 2
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: myservice
              port_value: 80
Notes
(1) The cluster defines two localities, cn-north-1 and cn-north-2, with weights 1 and 2 respectively and the same priority 0.
(2) Consequently, while all endpoints are healthy, the cluster's traffic is split 1:2 between cn-north-1 and cn-north-2.
(3) If cn-north-2 has two endpoints and one of them fails its health checks, the split changes to 1:1.4.
1) Envoy also supports finer-grained traffic distribution within a single cluster based on subsets.
(1) First, metadata (key/value labels) is attached to the cluster's upstream hosts, and subset selectors (classifying metadata) divide the upstream hosts into subsets;
(2) the route configuration then specifies the metadata that the selectable upstream hosts must match, so traffic is routed to a specific subset;
(3) load balancing among the hosts within each subset uses the policy defined on the cluster (lb_policy).
2) If subsets are configured but the route specifies no metadata, or no subset matches the specified metadata, the subset load balancer applies a "fallback policy":
(1) NO_FALLBACK: the request fails, as if the cluster contained no hosts; this is the default policy;
(2) ANY_ENDPOINT: balance across all hosts, ignoring host metadata;
(3) DEFAULT_SUBSET: route to a predefined default subset.
1) Subsets must be predefined before the subset load balancer can use them for scheduling.
Defining host metadata: key/value data
(1) A host's subset metadata must be defined under the "envoy.lb" filter key;
(2) host metadata is only supported when hosts are defined via ClusterLoadAssignments:
◆ endpoints discovered through EDS
◆ endpoints defined through the load_assignment field
Configuration example
load_assignment:
  cluster_name: webcluster1
  endpoints:
  - lb_endpoints:
    - endpoint:
        address:
          socket_address:
            protocol: TCP
            address: ep1
            port_value: 80
      metadata:
        filter_metadata:
          envoy.lb:
            version: '1.0'
            stage: 'prod'
2) Subsets must be predefined before the subset load balancer can use them for scheduling; the definition syntax:
clusters:
- name ...
  ...
  lb_subset_config:
    fallback_policy: "..."        # fallback policy, defaults to NO_FALLBACK
    default_subset: "{...}"       # the default subset used by the DEFAULT_SUBSET fallback policy;
    subset_selectors: []          # subset selectors
    - keys: []                    # defines one selector: the list of metadata keys used to classify hosts;
      fallback_policy: ...        # fallback policy specific to this selector;
    locality_weight_aware: "..."  # whether to take endpoint locality and locality weights into account when routing to subsets; has some potential pitfalls;
    scale_locality_weight: "..."  # whether to scale each locality's weight by the ratio of subset hosts to all hosts in that locality;
    panic_mode_any: "..."         # whether to try selecting a host from the whole cluster when a fallback policy is configured but its subset has no hosts;
    list_as_any: "..."
(1) For each selector, Envoy walks the hosts, inspects their "envoy.lb" filter metadata, and creates a subset for each unique combination of key values.
(2) If a host's metadata matches every key specified by a selector, the host is added to that selector's subset; this also means a host may satisfy several subset selectors at once, in which case it belongs to multiple subsets simultaneously.
(3) If no host defines any metadata, no subsets are generated.
3) Route metadata matching (metadata_match)
(1) Traffic can only be routed to a subset of the upstream cluster whose metadata matches what is set in metadata_match;
(2) when weighted_clusters is used to define the route target, each target cluster inside it may also define its own metadata_match.
routes:
- name: ...
  match: {...}
  route: {...}                   # route target; only one of cluster and weighted_clusters may be used;
    cluster:
    metadata_match: {...}        # endpoint metadata match criteria used by the subset load balancer; if weighted_clusters is used and defines
                                 # metadata_match internally, the metadata is merged and the values in weighted_clusters take precedence;
                                 # the filter name should be envoy.lb;
      filter_metadata: {...}     # metadata filter
        envoy.lb: {...}
          key1: value1
          key2: value2
          ...
    weighted_clusters: {...}
      clusters: []
      - name: ...
        weight: ...
        metadata_match: {...}
When no subset matches the route's metadata, the fallback policy is applied; a concrete sketch of metadata_match follows below.
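A minimal sketch of a route pinned to the stage=prod, version='1.0' subset (the route match is illustrative; the cluster is assumed to carry the lb_subset_config shown in the selector example below):

routes:
- match: { prefix: "/" }
  route:
    cluster: webclusters          # assumed cluster configured with lb_subset_config
    metadata_match:
      filter_metadata:
        envoy.lb:                 # subset keys must live under the envoy.lb filter
          stage: prod
          version: '1.0'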
5. Subset selector configuration example
Subset selector definition
clusters:
- name: webclusters
  lb_policy: ROUND_ROBIN
  lb_subset_config:
    fallback_policy: DEFAULT_SUBSET
    default_subset:
      stage: prod
      version: '1.0'
      type: std
    subset_selectors:
    - keys: [stage, type]
    - keys: [stage, version]
    - keys: [version]
    - keys: [xlarge, version]
Cluster metadata
These map out ten subsets:
stage=prod, type=std (e1, e2, e3, e4)
stage=prod, type=bigmem (e5, e6)
stage=dev, type=std (e7)
stage=prod, version=1.0 (e1, e2, e5)
stage=prod, version=1.1 (e3, e4, e6)
stage=dev, version=1.2-pre (e7)
version=1.0 (e1, e2, e5)
version=1.1 (e3, e4, e6)
version=1.2-pre (e7)
version=1.0, xlarge=true (e1)
Additionally, there is one default subset:
stage=prod, type=std, version=1.0 (e1, e2)
1) Generally, Envoy performs zone-aware routing in deployments where the originating cluster and the upstream cluster span different zones.
2) Zone-aware routing sends traffic to the local zone of the upstream cluster as far as possible, while roughly ensuring that traffic stays balanced across all relevant upstream endpoints; it depends on the following preconditions:
(1) neither the originating cluster (client side) nor the upstream cluster (server side) is in panic mode, and zone-aware routing is enabled;
(2) the originating cluster and the upstream cluster contain the same number of zones;
(3) the upstream cluster has enough hosts to carry all of the request traffic.
3) Whether Envoy routes traffic to the local zone or across zones depends on the percentage of healthy hosts in the local zone of the originating cluster and of the upstream cluster:
(1) if the originating cluster's local-zone percentage is greater than the upstream cluster's, Envoy computes the percentage of requests that can be routed directly to the local zone of the upstream cluster and routes the remainder to other zones;
(2) if the originating cluster's local-zone percentage is smaller than the upstream cluster's, all requests can be routed within the local zone, which can additionally absorb a share of cross-zone traffic from other zones.
4) Currently, zone-aware routing only supports priority 0.
common_lb_config:
  zone_aware_lb_config:
    "routing_enabled": "{...}"    # percentage; the fraction of request traffic on which zone-aware routing is enabled; defaults to 100%;
    "min_cluster_size": "{...}"   # minimum upstream cluster size required for zone-aware routing; if the upstream cluster is smaller than this value,
                                  # zone-aware routing is not performed even if configured; defaults to 6; 64-bit integer;
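A minimal usage sketch (the values are illustrative), applying zone-aware routing to all traffic as soon as the upstream cluster has at least 3 hosts:

common_lb_config:
  zone_aware_lb_config:
    routing_enabled:
      value: 100.0        # percentage of requests subject to zone-aware routing
    min_cluster_size: 3   # minimum upstream cluster size before zone-aware routing kicks in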
1) During scheduling, the set of eligible target upstream hosts is determined from metadata on the downstream connection that issued the request, and the request is dispatched to a host within that set.
(1) The connection is directed to the original destination address it had before being redirected to Envoy; in other words, it is forwarded straight to the destination address of the client connection, i.e. without load balancing. The original connection is redirected to Envoy by an iptables REDIRECT or TPROXY rule, which is what changes its destination address.
(2) This load balancer is dedicated to original destination clusters (clusters whose type is ORIGINAL_DST).
(3) New destinations seen in requests are added to the cluster on demand by the load balancer, and the cluster periodically cleans up destination hosts that are no longer in use.
3) Note that original destination clusters are not compatible with other load balancing policies; a configuration sketch follows below.
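A minimal sketch of an original destination setup, assuming the original destination address is recovered with the original_dst listener filter and forwarded by a TCP proxy (names, timeouts and the cleanup interval are illustrative):

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    listener_filters:
    - name: envoy.filters.listener.original_dst   # restores the pre-REDIRECT/TPROXY destination address
    filter_chains:
    - filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: original_dst
          cluster: original_dst_cluster
  clusters:
  - name: original_dst_cluster
    connect_timeout: 0.25s
    type: ORIGINAL_DST
    lb_policy: CLUSTER_PROVIDED     # ORIGINAL_DST clusters use the cluster-provided load balancer
    cleanup_interval: 10s           # how often unused destination hosts are removed from the cluster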
1) In multi-tier service invocation scenarios, when an upstream service cannot respond to requests because of a network failure or overload, large-scale cascading failures can propagate through the tiers of upstream callers and eventually make the whole system unavailable; this is the service avalanche effect.
2) The service avalanche effect is a process in which the unavailability of a "service provider" causes the unavailability of its "service consumers" and progressively amplifies that unavailability.
In microservice applications running on a service mesh, long multi-tier call chains are not uncommon.
3) Circuit breaking: when an upstream service (the callee, i.e. the service provider) becomes too slow or even starts failing under excessive pressure, the downstream service (the consumer) temporarily cuts off its calls to the upstream, sacrificing a local part to protect the upstream, and even the system as a whole.
(1) Open: within a fixed time window, the breaker opens when the observed failure metrics reach the configured threshold; all requests then fail immediately without being sent to the backend endpoints.
(2) Half-open: after the breaker has been open for a while, it automatically switches to half-open and decides the next state from the outcome of the next request: on success it switches to closed; on failure it switches back to open.
(3) Closed: after some time the upstream service may become available again; the downstream then closes the breaker and resumes sending it requests.
In summary, circuit breaking is a traffic management pattern commonly used in distributed applications; it protects an application from upstream failures, latency spikes and other network anomalies.
XII. Envoy circuit breakers
1) Envoy supports several types of fully distributed circuit breaking; when a configured threshold is reached, the corresponding circuit breaker overflows:
(1) Cluster maximum connections: the maximum number of connections Envoy establishes to the upstream cluster; applies only to HTTP/1.1, since HTTP/2 multiplexes requests over a single connection.
(2) Cluster maximum requests: the maximum number of outstanding requests to all hosts in the cluster at any given time; applies only to HTTP/2.
(3) Cluster maximum pending requests: the maximum length of the queue of requests waiting while the connection pool is saturated.
(4) Cluster maximum active retries: the maximum number of concurrent retries to all hosts in the cluster at any given time.
(5) Cluster maximum concurrent connection pools: the maximum number of connection pools that can be instantiated concurrently.
Note: in Istio, circuit breaking is defined through connection pools (connection pool management) and faulty-instance isolation (outlier detection), whereas Envoy's circuit breakers usually correspond only to Istio's connection pool settings:
◆ limiting a client's connections, requests, queue length, retries and so on toward a target service avoids overwhelming that service;
◆ a service instance that frequently times out or errors is temporarily ejected so that it does not affect the whole service.
XIII. Connection pools and circuit breakers
1) Common connection pool settings
(1) Maximum connections: the maximum number of connections Envoy establishes to the upstream cluster at any given time; applies to HTTP/1.1.
(2) Maximum requests per connection: the maximum number of requests that all hosts in the upstream cluster can handle per connection at any given time; setting it to 1 disables keep-alive.
(3) Maximum retries: the maximum number of retries to the target host within the given time.
(4) Connect timeout: the TCP connection timeout; the minimum value must be greater than 1ms. Maximum connections and connect timeout are general connection settings that apply to both TCP and HTTP.
(5) Maximum pending requests: the length of the queue of pending requests; when this breaker overflows, the cluster's upstream_rq_pending_overflow counter is incremented.
Common circuit breaker settings (in the Istio context):
(6) Consecutive errors: the number of consecutive 5xx errors (e.g. 502 or 503 status codes) within one detection interval.
(7) Interval: the time window within which the response codes are evaluated.
(8) Maximum ejection percentage: the maximum fraction of upstream instances allowed to be ejected; rounded up, so with 10 instances a setting of 13% ejects at most 2 instances.
(9) Base ejection time: how long an instance is ejected the first time; each subsequent ejection lasts the number of ejections multiplied by the base ejection time.
3) Two of the connection-pool-related parameters are defined in the cluster context:
---
clusters:
- name: ...
  ...
  connect_timeout: ...                 # TCP connection timeout, i.e. host network connection timeout; a sensible value helps keep one slow called service from slowing down the whole call chain;
  max_requests_per_connection: ...     # maximum requests per connection; both HTTP/1.1 and HTTP/2 connection pools are bound by this setting; unset means unlimited, 1 disables keep-alive
  ...
  circuit_breakers: {...}              # circuit breaking configuration, optional;
    thresholds: []                     # list of metrics and thresholds that apply to a specific route priority;
    - priority: ...                    # route priority this breaker applies to;
      max_connections: ...             # maximum concurrent connections to the upstream cluster, HTTP/1 only, default 1024; connections beyond this are short-circuited;
      max_pending_requests: ...        # maximum pending requests allowed when requesting the service, default 1024; requests beyond this are short-circuited;
      max_requests: ...                # maximum concurrent requests Envoy may dispatch to the upstream cluster, default 1024; HTTP/2 only
      max_retries: ...                 # maximum concurrent retries to the upstream cluster (assuming a retry_policy is configured), default 3;
      track_remaining: ...             # if true, statistics are published showing how much of the resource remains before the breaker opens; default false;
      max_connection_pools: ...        # maximum number of connection pools a cluster may have open concurrently; unlimited by default;
4) Cluster-level circuit breaker configuration example
clusters:
- name: service_httpbin
  connect_timeout: 2s
  type: LOGICAL_DNS
  dns_lookup_family: V4_ONLY
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: service_httpbin
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: httpbin.org
              port_value: 80
  circuit_breakers:
    thresholds:
      max_connections: 1
      max_pending_requests: 1
      max_retries: 3
5) The fortio tool can be used for load testing:
fortio load -c 2 -qps 0 -n 20 -loglevel Warning URL
Project: https://github.com/fortio/fortio
1. health-check
Lab environment
envoy: Front Proxy, address 172.31.18.2
webserver01: the first backend service
webserver01-sidecar: Sidecar Proxy for the first backend service, address 172.31.18.11
webserver02: the second backend service
webserver02-sidecar: Sidecar Proxy for the second backend service, address 172.31.18.12
front-envoy.yaml
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: webservice
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: web_cluster_01 }
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: web_cluster_01
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: web_cluster_01
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: myservice, port_value: 80 }
    health_checks:
    - timeout: 5s
      interval: 10s
      unhealthy_threshold: 2
      healthy_threshold: 2
      http_health_check:
        path: /livez
        expected_statuses:
          start: 200
          end: 399
front-envoy-with-tcp-check.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: { address: 0.0.0.0, port_value: 9901 } static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: webservice domains: ["*"] routes: - match: { prefix: "/" } route: { cluster: web_cluster_01 } http_filters: - name: envoy.filters.http.router clusters: - name: web_cluster_01 connect_timeout: 0.25s type: STRICT_DNS lb_policy: ROUND_ROBIN load_assignment: cluster_name: web_cluster_01 endpoints: - lb_endpoints: - endpoint: address: socket_address: { address: myservice, port_value: 80 } health_checks: - timeout: 5s interval: 10s unhealthy_threshold: 2 healthy_threshold: 2 tcp_health_check: {}
envoy-sidecar-proxy.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: address: 0.0.0.0 port_value: 9901 static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: local_service domains: ["*"] routes: - match: { prefix: "/" } route: { cluster: local_cluster } http_filters: - name: envoy.filters.http.router clusters: - name: local_cluster connect_timeout: 0.25s type: STATIC lb_policy: ROUND_ROBIN load_assignment: cluster_name: local_cluster endpoints: - lb_endpoints: - endpoint: address: socket_address: { address: 127.0.0.1, port_value: 8080 }
docker-compose.yaml
version: '3.3' services: envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy.yaml:/etc/envoy/envoy.yaml # - ./front-envoy-with-tcp-check.yaml:/etc/envoy/envoy.yaml networks: envoymesh: ipv4_address: 172.31.18.2 aliases: - front-proxy depends_on: - webserver01-sidecar - webserver02-sidecar webserver01-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: red networks: envoymesh: ipv4_address: 172.31.18.11 aliases: - myservice webserver01: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver01-sidecar" depends_on: - webserver01-sidecar webserver02-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: blue networks: envoymesh: ipv4_address: 172.31.18.12 aliases: - myservice webserver02: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver02-sidecar" depends_on: - webserver02-sidecar networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.18.0/24
Lab verification
docker-compose up
Test from a cloned terminal
# Continuously request a specific path (/livez) on the service
root@test:~# while true; do curl 172.31.18.2; sleep 1; done
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.18.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.18.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.18.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.18.11!
......
# Once scheduling has settled, open another terminal and set the /livez response of one of the services to a non-"OK" value, e.g. on the first backend endpoint:
root@test:~# curl -X POST -d 'livez=FAIL' http://172.31.18.11/livez
# The scheduling and response behavior can then be observed from the responses to the ongoing requests
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
# no longer scheduled to 172.31.18.11
# The responses show that the first endpoint was automatically removed from the cluster after failing its active health checks, and stays out until it turns healthy again;
# a command like the following restores a normal response:
root@test:~# curl -X POST -d 'livez=OK' http://172.31.18.11/livez
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.18.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.18.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.18.11!
# 172.31.18.11 has recovered and takes part in scheduling again
2. outlier-detection
Lab environment
envoy: Front Proxy, address 172.31.20.2
webserver01: the first backend service
webserver01-sidecar: Sidecar Proxy for the first backend service, address 172.31.20.11
webserver02: the second backend service
webserver02-sidecar: Sidecar Proxy for the second backend service, address 172.31.20.12
webserver03: the third backend service
webserver03-sidecar: Sidecar Proxy for the third backend service, address 172.31.20.13
front-envoy.yaml
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: webservice
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: web_cluster_01 }
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: web_cluster_01
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: web_cluster_01
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: myservice, port_value: 80 }
    outlier_detection:
      consecutive_5xx: 3
      base_ejection_time: 10s
      max_ejection_percent: 10
envoy-sidecar-proxy.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: address: 0.0.0.0 port_value: 9901 static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: local_service domains: ["*"] routes: - match: { prefix: "/" } route: { cluster: local_cluster } http_filters: - name: envoy.filters.http.router clusters: - name: local_cluster connect_timeout: 0.25s type: STATIC lb_policy: ROUND_ROBIN load_assignment: cluster_name: local_cluster endpoints: - lb_endpoints: - endpoint: address: socket_address: { address: 127.0.0.1, port_value: 8080 }
docker-compose.yaml
version: '3.3' services: envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy.yaml:/etc/envoy/envoy.yaml networks: envoymesh: ipv4_address: 172.31.20.2 aliases: - front-proxy depends_on: - webserver01-sidecar - webserver02-sidecar - webserver03-sidecar webserver01-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: red networks: envoymesh: ipv4_address: 172.31.20.11 aliases: - myservice webserver01: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver01-sidecar" depends_on: - webserver01-sidecar webserver02-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: blue networks: envoymesh: ipv4_address: 172.31.20.12 aliases: - myservice webserver02: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver02-sidecar" depends_on: - webserver02-sidecar webserver03-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: green networks: envoymesh: ipv4_address: 172.31.20.13 aliases: - myservice webserver03: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver03-sidecar" depends_on: - webserver03-sidecar networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.20.0/24
Lab verification
docker-compose up
Test from a cloned terminal
# Continuously request a specific path (/livez) on the service
root@test:~# while true; do curl 172.31.20.2/livez && echo; sleep 1; done
OK
OK
OK
OK
OK
......
# Once scheduling has settled, open another terminal and set the /livez response of one of the services to a non-"OK" value, e.g. on the first backend endpoint:
root@test:~# curl -X POST -d 'livez=FAIL' http://172.31.20.11/livez
# Then, back on the docker-compose console, or simply from the responses to the requests, the scheduling and response records can be observed
webserver01_1 | 127.0.0.1 - - [02/Dec/2021 13:43:54] "POST /livez HTTP/1.1" 200 -
webserver02_1 | 127.0.0.1 - - [02/Dec/2021 13:43:55] "GET /livez HTTP/1.1" 200 -
webserver03_1 | 127.0.0.1 - - [02/Dec/2021 13:43:56] "GET /livez HTTP/1.1" 200 -
webserver01_1 | 127.0.0.1 - - [02/Dec/2021 13:43:57] "GET /livez HTTP/1.1" 506 -
webserver02_1 | 127.0.0.1 - - [02/Dec/2021 13:43:58] "GET /livez HTTP/1.1" 200 -
webserver03_1 | 127.0.0.1 - - [02/Dec/2021 13:43:59] "GET /livez HTTP/1.1" 200 -
webserver01_1 | 127.0.0.1 - - [02/Dec/2021 13:44:00] "GET /livez HTTP/1.1" 506 -
webserver02_1 | 127.0.0.1 - - [02/Dec/2021 13:44:01] "GET /livez HTTP/1.1" 200 -
webserver03_1 | 127.0.0.1 - - [02/Dec/2021 13:44:02] "GET /livez HTTP/1.1" 200 -
webserver01_1 | 127.0.0.1 - - [02/Dec/2021 13:44:03] "GET /livez HTTP/1.1" 506 -
webserver02_1 | 127.0.0.1 - - [02/Dec/2021 13:44:04] "GET /livez HTTP/1.1" 200 -
webserver03_1 | 127.0.0.1 - - [02/Dec/2021 13:44:05] "GET /livez HTTP/1.1" 200 -
# The logs show that the first endpoint, which keeps responding with 5xx codes, is ejected again each time it is added back, until a command like the following restores a normal response:
root@test:~# curl -X POST -d 'livez=OK' http://172.31.20.11/livez
webserver03_1 | 127.0.0.1 - - [02/Dec/2021 13:45:32] "GET /livez HTTP/1.1" 200 -
webserver03_1 | 127.0.0.1 - - [02/Dec/2021 13:45:33] "GET /livez HTTP/1.1" 200 -
webserver01_1 | 127.0.0.1 - - [02/Dec/2021 13:45:34] "GET /livez HTTP/1.1" 200 -
webserver02_1 | 127.0.0.1 - - [02/Dec/2021 13:45:35] "GET /livez HTTP/1.1" 200 -
webserver03_1 | 127.0.0.1 - - [02/Dec/2021 13:45:36] "GET /livez HTTP/1.1" 200 -
webserver01_1 | 127.0.0.1 - - [02/Dec/2021 13:45:37] "GET /livez HTTP/1.1" 200 -
webserver02_1 | 127.0.0.1 - - [02/Dec/2021 13:45:38] "GET /livez HTTP/1.1" 200 -
webserver03_1 | 127.0.0.1 - - [02/Dec/2021 13:45:39] "GET /livez HTTP/1.1" 200 -
3. least-requests
Lab environment
envoy: Front Proxy, address 172.31.22.2
webserver01: the first backend service
webserver01-sidecar: Sidecar Proxy for the first backend service, address 172.31.22.11
webserver02: the second backend service
webserver02-sidecar: Sidecar Proxy for the second backend service, address 172.31.22.12
webserver03: the third backend service
webserver03-sidecar: Sidecar Proxy for the third backend service, address 172.31.22.13
front-envoy.yaml
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: webservice
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: web_cluster_01 }
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: web_cluster_01
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: LEAST_REQUEST
    load_assignment:
      cluster_name: web_cluster_01
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: red
                port_value: 80
          load_balancing_weight: 1
        - endpoint:
            address:
              socket_address:
                address: blue
                port_value: 80
          load_balancing_weight: 3
        - endpoint:
            address:
              socket_address:
                address: green
                port_value: 80
          load_balancing_weight: 5
envoy-sidecar-proxy.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: address: 0.0.0.0 port_value: 9901 static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: local_service domains: ["*"] routes: - match: { prefix: "/" } route: { cluster: local_cluster } http_filters: - name: envoy.filters.http.router clusters: - name: local_cluster connect_timeout: 0.25s type: STATIC lb_policy: ROUND_ROBIN load_assignment: cluster_name: local_cluster endpoints: - lb_endpoints: - endpoint: address: socket_address: { address: 127.0.0.1, port_value: 8080 }
send-request.sh
#!/bin/bash declare -i red=0 declare -i blue=0 declare -i green=0 #interval="0.1" counts=300 echo "Send 300 requests, and print the result. This will take a while." echo "" echo "Weight of all endpoints:" echo "Red:Blue:Green = 1:3:5" for ((i=1; i<=${counts}; i++)); do if curl -s http://$1/hostname | grep "red" &> /dev/null; then # $1 is the host address of the front-envoy. red=$[$red+1] elif curl -s http://$1/hostname | grep "blue" &> /dev/null; then blue=$[$blue+1] else green=$[$green+1] fi # sleep $interval done echo "" echo "Response from:" echo "Red:Blue:Green = $red:$blue:$green"
docker-compose.yaml
version: '3.3' services: envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy.yaml:/etc/envoy/envoy.yaml networks: envoymesh: ipv4_address: 172.31.22.2 aliases: - front-proxy depends_on: - webserver01-sidecar - webserver02-sidecar - webserver03-sidecar webserver01-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: red networks: envoymesh: ipv4_address: 172.31.22.11 aliases: - myservice - red webserver01: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver01-sidecar" depends_on: - webserver01-sidecar webserver02-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: blue networks: envoymesh: ipv4_address: 172.31.22.12 aliases: - myservice - blue webserver02: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver02-sidecar" depends_on: - webserver02-sidecar webserver03-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: green networks: envoymesh: ipv4_address: 172.31.22.13 aliases: - myservice - green webserver03: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver03-sidecar" depends_on: - webserver03-sidecar networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.22.0/24
Lab verification
docker-compose up
Test from a cloned terminal
# Run the script below to send requests; from the approximate response ratio across the backend endpoints you can judge whether the behavior roughly matches the weighted least request scheduling policy
./send-request.sh 172.31.22.2
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/least-requests# ./send-request.sh 172.31.22.2
Send 300 requests, and print the result. This will take a while.
Weight of all endpoints:
Red:Blue:Green = 1:3:5
Response from:
Red:Blue:Green = 56:80:164
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/least-requests# ./send-request.sh 172.31.22.2
Send 300 requests, and print the result. This will take a while.
Weight of all endpoints:
Red:Blue:Green = 1:3:5
Response from:
Red:Blue:Green = 59:104:137
4. weighted-rr
envoy: Front Proxy, address 172.31.27.2
webserver01: the first backend service
webserver01-sidecar: Sidecar Proxy for the first backend service, address 172.31.27.11
webserver02: the second backend service
webserver02-sidecar: Sidecar Proxy for the second backend service, address 172.31.27.12
webserver03: the third backend service
webserver03-sidecar: Sidecar Proxy for the third backend service, address 172.31.27.13
front-envoy.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: { address: 0.0.0.0, port_value: 9901 } static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: webservice domains: ["*"] routes: - match: { prefix: "/" } route: { cluster: web_cluster_01 } http_filters: - name: envoy.filters.http.router clusters: - name: web_cluster_01 connect_timeout: 0.25s type: STRICT_DNS lb_policy: ROUND_ROBIN load_assignment: cluster_name: web_cluster_01 endpoints: - lb_endpoints: - endpoint: address: socket_address: address: red port_value: 80 load_balancing_weight: 1 - endpoint: address: socket_address: address: blue port_value: 80 load_balancing_weight: 3 - endpoint: address: socket_address: address: green port_value: 80 load_balancing_weight: 5
envoy-sidecar-proxy.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: address: 0.0.0.0 port_value: 9901 static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: local_service domains: ["*"] routes: - match: { prefix: "/" } route: { cluster: local_cluster } http_filters: - name: envoy.filters.http.router clusters: - name: local_cluster connect_timeout: 0.25s type: STATIC lb_policy: ROUND_ROBIN load_assignment: cluster_name: local_cluster endpoints: - lb_endpoints: - endpoint: address: socket_address: { address: 127.0.0.1, port_value: 8080 }
send-request.sh
#!/bin/bash declare -i red=0 declare -i blue=0 declare -i green=0 #interval="0.1" counts=300 echo "Send 300 requests, and print the result. This will take a while." echo "" echo "Weight of all endpoints:" echo "Red:Blue:Green = 1:3:5" for ((i=1; i<=${counts}; i++)); do if curl -s http://$1/hostname | grep "red" &> /dev/null; then # $1 is the host address of the front-envoy. red=$[$red+1] elif curl -s http://$1/hostname | grep "blue" &> /dev/null; then blue=$[$blue+1] else green=$[$green+1] fi # sleep $interval done echo "" echo "Response from:" echo "Red:Blue:Green = $red:$blue:$green"
docker-compose.yaml
version: '3.3' services: envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy.yaml:/etc/envoy/envoy.yaml networks: envoymesh: ipv4_address: 172.31.27.2 aliases: - front-proxy depends_on: - webserver01-sidecar - webserver02-sidecar - webserver03-sidecar webserver01-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: red networks: envoymesh: ipv4_address: 172.31.27.11 aliases: - myservice - red webserver01: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver01-sidecar" depends_on: - webserver01-sidecar webserver02-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: blue networks: envoymesh: ipv4_address: 172.31.27.12 aliases: - myservice - blue webserver02: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver02-sidecar" depends_on: - webserver02-sidecar webserver03-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: green networks: envoymesh: ipv4_address: 172.31.27.13 aliases: - myservice - green webserver03: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver03-sidecar" depends_on: - webserver03-sidecar networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.27.0/24
Lab verification
docker-compose up
Test from a cloned terminal
# Run the script below to send requests; from the approximate response ratio across the backend endpoints you can judge whether the behavior roughly matches the weighted round robin scheduling policy
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/weighted-rr# ./send-request.sh 172.31.27.2
Send 300 requests, and print the result. This will take a while.
Weight of all endpoints:
Red:Blue:Green = 1:3:5
Response from:
Red:Blue:Green = 55:81:164
Lab environment
envoy: Front Proxy, address 172.31.31.2
webserver01: the first backend service
webserver01-sidecar: Sidecar Proxy for the first backend service, address 172.31.31.11, aliased as red and webservice1
webserver02: the second backend service
webserver02-sidecar: Sidecar Proxy for the second backend service, address 172.31.31.12, aliased as blue and webservice1
webserver03: the third backend service
webserver03-sidecar: Sidecar Proxy for the third backend service, address 172.31.31.13, aliased as green and webservice1
webserver04: the fourth backend service
webserver04-sidecar: Sidecar Proxy for the fourth backend service, address 172.31.31.14, aliased as gray and webservice2
webserver05: the fifth backend service
webserver05-sidecar: Sidecar Proxy for the fifth backend service, address 172.31.31.15, aliased as black and webservice2
front-envoy.yaml
admin: access_log_path: "/dev/null" address: socket_address: { address: 0.0.0.0, port_value: 9901 } static_resources: listeners: - address: socket_address: { address: 0.0.0.0, port_value: 80 } name: listener_http filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager codec_type: auto stat_prefix: ingress_http route_config: name: local_route virtual_hosts: - name: backend domains: - "*" routes: - match: prefix: "/" route: cluster: webcluster1 http_filters: - name: envoy.filters.http.router clusters: - name: webcluster1 connect_timeout: 0.25s type: STRICT_DNS lb_policy: ROUND_ROBIN http2_protocol_options: {} load_assignment: cluster_name: webcluster1 policy: overprovisioning_factor: 140 endpoints: - locality: region: cn-north-1 priority: 0 load_balancing_weight: 10 lb_endpoints: - endpoint: address: socket_address: { address: webservice1, port_value: 80 } - locality: region: cn-north-2 priority: 0 load_balancing_weight: 20 lb_endpoints: - endpoint: address: socket_address: { address: webservice2, port_value: 80 } health_checks: - timeout: 5s interval: 10s unhealthy_threshold: 2 healthy_threshold: 1 http_health_check: path: /livez expected_statuses: start: 200 end: 399
envoy-sidecar-proxy.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: address: 0.0.0.0 port_value: 9901 static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: local_service domains: ["*"] routes: - match: { prefix: "/" } route: { cluster: local_cluster } http_filters: - name: envoy.filters.http.router clusters: - name: local_cluster connect_timeout: 0.25s type: STATIC lb_policy: ROUND_ROBIN load_assignment: cluster_name: local_cluster endpoints: - lb_endpoints: - endpoint: address: socket_address: { address: 127.0.0.1, port_value: 8080 }
send-request.sh
#!/bin/bash declare -i colored=0 declare -i colorless=0 interval="0.1" while true; do if curl -s http://$1/hostname | grep -E "red|blue|green" &> /dev/null; then # $1 is the host address of the front-envoy. colored=$[$colored+1] else colorless=$[$colorless+1] fi echo $colored:$colorless sleep $interval done
docker-compose.yaml
version: '3' services: front-envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy.yaml:/etc/envoy/envoy.yaml networks: - envoymesh expose: # Expose ports 80 (for general traffic) and 9901 (for the admin server) - "80" - "9901" webserver01-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: red networks: envoymesh: ipv4_address: 172.31.31.11 aliases: - webservice1 - red webserver01: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver01-sidecar" depends_on: - webserver01-sidecar webserver02-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: blue networks: envoymesh: ipv4_address: 172.31.31.12 aliases: - webservice1 - blue webserver02: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver02-sidecar" depends_on: - webserver02-sidecar webserver03-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: green networks: envoymesh: ipv4_address: 172.31.31.13 aliases: - webservice1 - green webserver03: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver03-sidecar" depends_on: - webserver03-sidecar webserver04-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: gray networks: envoymesh: ipv4_address: 172.31.31.14 aliases: - webservice2 - gray webserver04: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver04-sidecar" depends_on: - webserver04-sidecar webserver05-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: black networks: envoymesh: ipv4_address: 172.31.31.15 aliases: - webservice2 - black webserver05: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver05-sidecar" depends_on: - webserver05-sidecar networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.31.0/24
Lab verification
docker-compose up
Test from a cloned terminal
# Test with the send-requests.sh script; the requests are distributed across the localities according to their weights, and within each locality the configured load-balancing algorithm is applied
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/locality-weighted# ./send-requests.sh 172.31.31.2
......
283:189
283:190
283:191
284:191
285:191
286:191
286:192
286:193
287:193
......
# You can also try setting the health state of one host in the higher-weighted group to unavailable;
Lab environment
envoy: Front Proxy, address 172.31.25.2
webserver01: the first backend service
webserver01-sidecar: Sidecar Proxy for the first backend service, address 172.31.25.11
webserver02: the second backend service
webserver02-sidecar: Sidecar Proxy for the second backend service, address 172.31.25.12
webserver03: the third backend service
webserver03-sidecar: Sidecar Proxy for the third backend service, address 172.31.25.13
front-envoy.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: { address: 0.0.0.0, port_value: 9901 } static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: webservice domains: ["*"] routes: - match: { prefix: "/" } route: cluster: web_cluster_01 hash_policy: # - connection_properties: # source_ip: true - header: header_name: User-Agent http_filters: - name: envoy.filters.http.router clusters: - name: web_cluster_01 connect_timeout: 0.5s type: STRICT_DNS lb_policy: RING_HASH ring_hash_lb_config: maximum_ring_size: 1048576 minimum_ring_size: 512 load_assignment: cluster_name: web_cluster_01 endpoints: - lb_endpoints: - endpoint: address: socket_address: address: myservice port_value: 80 health_checks: - timeout: 5s interval: 10s unhealthy_threshold: 2 healthy_threshold: 2 http_health_check: path: /livez expected_statuses: start: 200 end: 399
envoy-sidecar-proxy.yaml
admin: profile_path: /tmp/envoy.prof access_log_path: /tmp/admin_access.log address: socket_address: address: 0.0.0.0 port_value: 9901 static_resources: listeners: - name: listener_0 address: socket_address: { address: 0.0.0.0, port_value: 80 } filter_chains: - filters: - name: envoy.filters.network.http_connection_manager typed_config: "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager stat_prefix: ingress_http codec_type: AUTO route_config: name: local_route virtual_hosts: - name: local_service domains: ["*"] routes: - match: { prefix: "/" } route: { cluster: local_cluster } http_filters: - name: envoy.filters.http.router clusters: - name: local_cluster connect_timeout: 0.25s type: STATIC lb_policy: ROUND_ROBIN load_assignment: cluster_name: local_cluster endpoints: - lb_endpoints: - endpoint: address: socket_address: { address: 127.0.0.1, port_value: 8080 }
send-request.sh
#!/bin/bash declare -i red=0 declare -i blue=0 declare -i green=0 interval="0.1" counts=200 echo "Send 300 requests, and print the result. This will take a while." for ((i=1; i<=${counts}; i++)); do if curl -s http://$1/hostname | grep "red" &> /dev/null; then # $1 is the host address of the front-envoy. red=$[$red+1] elif curl -s http://$1/hostname | grep "blue" &> /dev/null; then blue=$[$blue+1] else green=$[$green+1] fi sleep $interval done echo "" echo "Response from:" echo "Red:Blue:Green = $red:$blue:$green"
docker-compose.yaml
version: '3.3' services: envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy.yaml:/etc/envoy/envoy.yaml networks: envoymesh: ipv4_address: 172.31.25.2 aliases: - front-proxy depends_on: - webserver01-sidecar - webserver02-sidecar - webserver03-sidecar webserver01-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: red networks: envoymesh: ipv4_address: 172.31.25.11 aliases: - myservice - red webserver01: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver01-sidecar" depends_on: - webserver01-sidecar webserver02-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: blue networks: envoymesh: ipv4_address: 172.31.25.12 aliases: - myservice - blue webserver02: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver02-sidecar" depends_on: - webserver02-sidecar webserver03-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: green networks: envoymesh: ipv4_address: 172.31.25.13 aliases: - myservice - green webserver03: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver03-sidecar" depends_on: - webserver03-sidecar networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.25.0/24
Lab verification
docker-compose up
Test from a cloned terminal
# The route hash policy hashes the client's browser type (User-Agent), so the requests issued by the loop below keep going to the same backend endpoint, because the browser type never changes.
while true; do curl 172.31.25.2; sleep .3; done
# Simulating a different browser and sending requests again, they may be scheduled to another endpoint or may still land on the same one; this depends on the hash result;
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/ring-hash# while true; do curl 172.31.25.2; sleep .3; done
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: red, ServerIP: 172.31.25.11!
......
# The loop below verifies that requests from the same browser always go to the same backend endpoint, while a different browser may be rescheduled;
root@test:~# while true; do index=$[$RANDOM%10]; curl -H "User-Agent: Browser_${index}" 172.31.25.2/user-agent && curl -H "User-Agent: Browser_${index}" 172.31.25.2/hostname && echo ; sleep .1; done
User-Agent: Browser_0
ServerName: green
User-Agent: Browser_0
ServerName: green
User-Agent: Browser_2
ServerName: red
User-Agent: Browser_2
ServerName: red
User-Agent: Browser_5
ServerName: blue
User-Agent: Browser_9
ServerName: red
# You can also set one backend endpoint's health check result to failure, changing the endpoint set dynamically, and then check whether requests previously scheduled to that endpoint are reassigned to other endpoints;
root@test:~# curl -X POST -d 'livez=FAIL' http://172.31.25.11/livez
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.25.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.25.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.25.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.25.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.25.12!
iKubernetes demoapp v1.0 !! ClientIP: 127.0.0.1, ServerName: blue, ServerIP: 172.31.25.12!
# 172.31.25.11 failed, so the requests are now scheduled to 172.31.25.12
Experiment environment
envoy: Front Proxy, address 172.31.29.2
webserver01: the first backend service
webserver01-sidecar: Sidecar Proxy of the first backend service, address 172.31.29.11, aliases red and webservice1
webserver02: the second backend service
webserver02-sidecar: Sidecar Proxy of the second backend service, address 172.31.29.12, aliases blue and webservice1
webserver03: the third backend service
webserver03-sidecar: Sidecar Proxy of the third backend service, address 172.31.29.13, aliases green and webservice1
webserver04: the fourth backend service
webserver04-sidecar: Sidecar Proxy of the fourth backend service, address 172.31.29.14, aliases gray and webservice2
webserver05: the fifth backend service
webserver05-sidecar: Sidecar Proxy of the fifth backend service, address 172.31.29.15, aliases black and webservice2
front-envoy.yaml
admin:
  access_log_path: "/dev/null"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901

static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    name: listener_http
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: webcluster1
          http_filters:
          - name: envoy.filters.http.router

  clusters:
  - name: webcluster1
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}
    load_assignment:
      cluster_name: webcluster1
      policy:
        overprovisioning_factor: 140
      endpoints:
      - locality:
          region: cn-north-1
        priority: 0
        lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: webservice1
                port_value: 80
      - locality:
          region: cn-north-2
        priority: 1
        lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: webservice2
                port_value: 80
    health_checks:
    - timeout: 5s
      interval: 10s
      unhealthy_threshold: 2
      healthy_threshold: 1
      http_health_check:
        path: /livez
        expected_statuses:
          start: 200
          end: 399
front-envoy-v2.yaml
admin:
  access_log_path: "/dev/null"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901

static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    name: listener_http
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: webcluster1
          http_filters:
          - name: envoy.router

  clusters:
  - name: webcluster1
    connect_timeout: 0.5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}
    load_assignment:
      cluster_name: webcluster1
      policy:
        overprovisioning_factor: 140
      endpoints:
      - locality:
          region: cn-north-1
        priority: 0
        lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: webservice1
                port_value: 80
      - locality:
          region: cn-north-2
        priority: 1
        lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: webservice2
                port_value: 80
    health_checks:
    - timeout: 5s
      interval: 10s
      unhealthy_threshold: 2
      healthy_threshold: 1
      http_health_check:
        path: /livez
envoy-sidecar-proxy.yaml
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: local_cluster }
          http_filters:
          - name: envoy.filters.http.router

  clusters:
  - name: local_cluster
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 8080 }
docker-compose.yaml
version: '3' services: front-envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy-v2.yaml:/etc/envoy/envoy.yaml networks: envoymesh: ipv4_address: 172.31.29.2 aliases: - front-proxy expose: # Expose ports 80 (for general traffic) and 9901 (for the admin server) - "80" - "9901" webserver01-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: red networks: envoymesh: ipv4_address: 172.31.29.11 aliases: - webservice1 - red webserver01: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver01-sidecar" depends_on: - webserver01-sidecar webserver02-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: blue networks: envoymesh: ipv4_address: 172.31.29.12 aliases: - webservice1 - blue webserver02: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver02-sidecar" depends_on: - webserver02-sidecar webserver03-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: green networks: envoymesh: ipv4_address: 172.31.29.13 aliases: - webservice1 - green webserver03: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver03-sidecar" depends_on: - webserver03-sidecar webserver04-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: gray networks: envoymesh: ipv4_address: 172.31.29.14 aliases: - webservice2 - gray webserver04: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver04-sidecar" depends_on: - webserver04-sidecar webserver05-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: black networks: envoymesh: ipv4_address: 172.31.29.15 aliases: - webservice2 - black webserver05: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver05-sidecar" depends_on: - webserver05-sidecar networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.29.0/24
Experiment verification
docker-compose up
Test from a cloned terminal window
# Keep requesting the service; all requests are scheduled to the priority-0 endpoints behind webservice1;
while true; do curl 172.31.29.2; sleep .5; done

# Once the scheduling result is confirmed, open another terminal and change the /livez response of any one webservice1 endpoint to a non-"OK" value, for example the first endpoint;
curl -X POST -d 'livez=FAIL' http://172.31.29.11/livez

# The responses then show that, because the overprovisioning factor is 1.4, client requests still go only to the webservice1 endpoints blue and green;

# Once that result is confirmed in turn, fail the /livez check of another endpoint, for example the second one;
curl -X POST -d 'livez=FAIL' http://172.31.29.12/livez

# While requesting, you can see that the first endpoint, because it answers with a 5xx status code, gets ejected again each time it is added back, unless it is restored to a normal response with a command such as:
curl -X POST -d 'livez=OK' http://172.31.29.11/livez

# The responses now show that, with an overprovisioning factor of 1.4, the priority-0 webservice1 group can no longer hold on to all client requests, so part of the client traffic is forwarded to the webservice2 endpoints;
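The spillover above can be reasoned about with a simple rule of thumb: a priority level receives min(100%, healthy-endpoint percentage x overprovisioning factor) of the traffic, and whatever is left over flows to the next priority. The following is a rough sketch of that arithmetic for this demo (three webservice1 endpoints at priority 0, overprovisioning_factor 140, i.e. a factor of 1.4), not output produced by Envoy itself:

# Rough priority-load arithmetic for this demo:
#   share(P0) = min(100%, healthy%(P0) x 1.4)
#   3/3 healthy : min(100, 100.0 x 1.4) = 100%  -> no traffic reaches priority 1
#   2/3 healthy : min(100,  66.7 x 1.4) ~= 93%  -> nearly everything stays on blue and green
#   1/3 healthy : min(100,  33.3 x 1.4) ~= 47%  -> the remaining ~53% spills over to webservice2
load_assignment:
  policy:
    overprovisioning_factor: 140    # 140 means a factor of 1.4; this is also Envoy's default value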
Experiment environment
envoy: Front Proxy, address 172.31.33.2
e1 through e7: the seven backend services
front-envoy.yaml
admin:
  access_log_path: "/dev/null"
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }

static_resources:
  listeners:
  - address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    name: listener_http
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                  headers:
                  - name: x-custom-version
                    exact_match: pre-release
                route:
                  cluster: webcluster1
                  metadata_match:
                    filter_metadata:
                      envoy.lb:
                        version: "1.2-pre"
                        stage: "dev"
              - match:
                  prefix: "/"
                  headers:
                  - name: x-hardware-test
                    exact_match: memory
                route:
                  cluster: webcluster1
                  metadata_match:
                    filter_metadata:
                      envoy.lb:
                        type: "bigmem"
                        stage: "prod"
              - match:
                  prefix: "/"
                route:
                  weighted_clusters:
                    clusters:
                    - name: webcluster1
                      weight: 90
                      metadata_match:
                        filter_metadata:
                          envoy.lb:
                            version: "1.0"
                    - name: webcluster1
                      weight: 10
                      metadata_match:
                        filter_metadata:
                          envoy.lb:
                            version: "1.1"
                  metadata_match:
                    filter_metadata:
                      envoy.lb:
                        stage: "prod"
          http_filters:
          - name: envoy.filters.http.router

  clusters:
  - name: webcluster1
    connect_timeout: 0.5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: webcluster1
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: e1
                port_value: 80
          metadata:
            filter_metadata:
              envoy.lb:
                stage: "prod"
                version: "1.0"
                type: "std"
                xlarge: true
        - endpoint:
            address:
              socket_address:
                address: e2
                port_value: 80
          metadata:
            filter_metadata:
              envoy.lb:
                stage: "prod"
                version: "1.0"
                type: "std"
        - endpoint:
            address:
              socket_address:
                address: e3
                port_value: 80
          metadata:
            filter_metadata:
              envoy.lb:
                stage: "prod"
                version: "1.1"
                type: "std"
        - endpoint:
            address:
              socket_address:
                address: e4
                port_value: 80
          metadata:
            filter_metadata:
              envoy.lb:
                stage: "prod"
                version: "1.1"
                type: "std"
        - endpoint:
            address:
              socket_address:
                address: e5
                port_value: 80
          metadata:
            filter_metadata:
              envoy.lb:
                stage: "prod"
                version: "1.0"
                type: "bigmem"
        - endpoint:
            address:
              socket_address:
                address: e6
                port_value: 80
          metadata:
            filter_metadata:
              envoy.lb:
                stage: "prod"
                version: "1.1"
                type: "bigmem"
        - endpoint:
            address:
              socket_address:
                address: e7
                port_value: 80
          metadata:
            filter_metadata:
              envoy.lb:
                stage: "dev"
                version: "1.2-pre"
                type: "std"
    lb_subset_config:
      fallback_policy: DEFAULT_SUBSET
      default_subset:
        stage: "prod"
        version: "1.0"
        type: "std"
      subset_selectors:
      - keys: ["stage", "type"]
      - keys: ["stage", "version"]
      - keys: ["version"]
      - keys: ["xlarge", "version"]
    health_checks:
    - timeout: 5s
      interval: 10s
      unhealthy_threshold: 2
      healthy_threshold: 1
      http_health_check:
        path: /livez
        expected_statuses:
          start: 200
          end: 399
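Read together, the subset_selectors and the endpoint metadata above imply the following subset membership (derived directly from this file):

# Subset membership implied by the metadata above:
#   version=1.0                        -> e1, e2, e5
#   version=1.1                        -> e3, e4, e6
#   stage=prod,  type=std              -> e1, e2, e3, e4
#   stage=prod,  type=bigmem           -> e5, e6
#   stage=prod,  version=1.0           -> e1, e2, e5
#   stage=prod,  version=1.1           -> e3, e4, e6
#   stage=dev,   version=1.2-pre       -> e7
#   xlarge=true, version=1.0           -> e1
#   default subset (stage=prod, version=1.0, type=std) -> e1, e2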
test.sh
#!/bin/bash
declare -i v10=0
declare -i v11=0

for ((counts=0; counts<200; counts++)); do
    if curl -s http://$1/hostname | grep -E "e[125]" &> /dev/null; then
        # $1 is the host address of the front-envoy.
        v10=$[$v10+1]
    else
        v11=$[$v11+1]
    fi
done

echo "Requests: v1.0:v1.1 = $v10:$v11"
docker-compose.yaml
version: '3' services: front-envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy.yaml:/etc/envoy/envoy.yaml networks: envoymesh: ipv4_address: 172.31.33.2 expose: # Expose ports 80 (for general traffic) and 9901 (for the admin server) - "80" - "9901" e1: image: ikubernetes/demoapp:v1.0 hostname: e1 networks: envoymesh: ipv4_address: 172.31.33.11 aliases: - e1 expose: - "80" e2: image: ikubernetes/demoapp:v1.0 hostname: e2 networks: envoymesh: ipv4_address: 172.31.33.12 aliases: - e2 expose: - "80" e3: image: ikubernetes/demoapp:v1.0 hostname: e3 networks: envoymesh: ipv4_address: 172.31.33.13 aliases: - e3 expose: - "80" e4: image: ikubernetes/demoapp:v1.0 hostname: e4 networks: envoymesh: ipv4_address: 172.31.33.14 aliases: - e4 expose: - "80" e5: image: ikubernetes/demoapp:v1.0 hostname: e5 networks: envoymesh: ipv4_address: 172.31.33.15 aliases: - e5 expose: - "80" e6: image: ikubernetes/demoapp:v1.0 hostname: e6 networks: envoymesh: ipv4_address: 172.31.33.16 aliases: - e6 expose: - "80" e7: image: ikubernetes/demoapp:v1.0 hostname: e7 networks: envoymesh: ipv4_address: 172.31.33.17 aliases: - e7 expose: - "80" networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.33.0/24
Experiment verification
docker-compose up
Test from a cloned terminal window
# test.sh takes the front-envoy address as its argument, keeps sending requests to it, and finally prints how the traffic was distributed; per the routing rules, requests that carry neither x-hardware-test nor x-custom-version with a matching value fall through to the default weighted route and are split between the two version groups;
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# ./test.sh 172.31.33.2
Requests: v1.0:v1.1 = 183:17

# A request carrying a particular header is dispatched to a specific subset; for example, requests with "x-hardware-test: memory" go to the subset whose label type is bigmem and whose label stage is prod; that subset holds the two endpoints e5 and e6
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# curl -H "x-hardware-test: memory" 172.31.33.2/hostname
ServerName: e6
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# curl -H "x-hardware-test: memory" 172.31.33.2/hostname
ServerName: e5
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# curl -H "x-hardware-test: memory" 172.31.33.2/hostname
ServerName: e6
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# curl -H "x-hardware-test: memory" 172.31.33.2/hostname
ServerName: e5

# Likewise, requests carrying "x-custom-version: pre-release" are dispatched to the subset whose label version is 1.2-pre and whose label stage is dev; that subset has the single endpoint e7;
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# curl -H "x-custom-version: pre-release" 172.31.33.2/hostname
ServerName: e7
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# curl -H "x-custom-version: pre-release" 172.31.33.2/hostname
ServerName: e7
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# curl -H "x-custom-version: pre-release" 172.31.33.2/hostname
ServerName: e7
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/lb-subsets# curl -H "x-custom-version: pre-release" 172.31.33.2/hostname
ServerName: e7
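Note that the ["xlarge", "version"] selector is defined but not used by any of the demo's routes. Purely as an illustration (this route and the x-custom-xlarge header are hypothetical and not part of the demo's front-envoy.yaml), a route like the following could pin matching traffic to the only xlarge endpoint, e1:

# Hypothetical route (not in the demo config): selects the subset {xlarge: true, version: "1.0"},
# which contains only endpoint e1.
- match:
    prefix: "/"
    headers:
    - name: x-custom-xlarge        # assumed header name, for illustration only
      exact_match: "true"
  route:
    cluster: webcluster1
    metadata_match:
      filter_metadata:
        envoy.lb:
          xlarge: true
          version: "1.0"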
9、circuit-breaker
envoy: Front Proxy, address 172.31.35.2
webserver01: the first backend service
webserver01-sidecar: Sidecar Proxy of the first backend service, address 172.31.35.11, aliases red and webservice1
webserver02: the second backend service
webserver02-sidecar: Sidecar Proxy of the second backend service, address 172.31.35.12, aliases blue and webservice1
webserver03: the third backend service
webserver03-sidecar: Sidecar Proxy of the third backend service, address 172.31.35.13, aliases green and webservice1
webserver04: the fourth backend service
webserver04-sidecar: Sidecar Proxy of the fourth backend service, address 172.31.35.14, aliases gray and webservice2
webserver05: the fifth backend service
webserver05-sidecar: Sidecar Proxy of the fifth backend service, address 172.31.35.15, aliases black and webservice2
front-envoy.yaml
admin:
  access_log_path: "/dev/null"
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }

static_resources:
  listeners:
  - address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    name: listener_http
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/livez"
                route:
                  cluster: webcluster2
              - match:
                  prefix: "/"
                route:
                  cluster: webcluster1
          http_filters:
          - name: envoy.filters.http.router

  clusters:
  - name: webcluster1
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: webcluster1
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: webservice1
                port_value: 80
    circuit_breakers:
      thresholds:
      - max_connections: 1
        max_pending_requests: 1
        max_retries: 3

  - name: webcluster2
    connect_timeout: 0.25s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: webcluster2
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: webservice2
                port_value: 80
    outlier_detection:
      interval: "1s"
      consecutive_5xx: "3"
      consecutive_gateway_failure: "3"
      base_ejection_time: "10s"
      enforcing_consecutive_gateway_failure: "100"
      max_ejection_percent: "30"
      success_rate_minimum_hosts: "2"
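For reference, the outlier_detection fields used on webcluster2 have the following meanings; the parenthesised defaults are taken from the Envoy documentation as best recalled here, not values set by this demo:

# Meaning of the outlier_detection settings above (defaults in parentheses):
outlier_detection:
  interval: "1s"                               # how often ejection analysis sweeps run (10s)
  consecutive_5xx: "3"                         # consecutive 5xx responses before ejection (5)
  consecutive_gateway_failure: "3"             # consecutive 502/503/504 responses before ejection (5)
  base_ejection_time: "10s"                    # ejection time, multiplied by the number of times ejected (30s)
  enforcing_consecutive_gateway_failure: "100" # % of such ejections that are actually enforced (0)
  max_ejection_percent: "30"                   # max share of cluster hosts that may be ejected at once (10)
  success_rate_minimum_hosts: "2"              # min hosts required to run success-rate detection (5)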
envoy-sidecar-proxy.yaml
admin:
  profile_path: /tmp/envoy.prof
  access_log_path: /tmp/admin_access.log
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 80 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: local_cluster }
          http_filters:
          - name: envoy.filters.http.router

  clusters:
  - name: local_cluster
    connect_timeout: 0.25s
    type: STATIC
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: local_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 8080 }
    circuit_breakers:
      thresholds:
      - max_connections: 1
        max_pending_requests: 1
        max_retries: 2
send-requests.sh
#!/bin/bash
#
if [ $# -ne 2 ]
then
    echo "USAGE: $0 <URL> <COUNT>"
    exit 1;
fi

URL=$1
COUNT=$2
c=1
#interval="0.2"

while [[ ${c} -le ${COUNT} ]]; do
    #echo "Sending GET request: ${URL}"
    curl -o /dev/null -w '%{http_code}\n' -s ${URL} &
    (( c++ ))
    # sleep $interval
done

wait
docker-compose.yaml
version: '3' services: front-envoy: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./front-envoy.yaml:/etc/envoy/envoy.yaml networks: - envoymesh expose: # Expose ports 80 (for general traffic) and 9901 (for the admin server) - "80" - "9901" webserver01-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: red networks: envoymesh: ipv4_address: 172.31.35.11 aliases: - webservice1 - red webserver01: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver01-sidecar" depends_on: - webserver01-sidecar webserver02-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: blue networks: envoymesh: ipv4_address: 172.31.35.12 aliases: - webservice1 - blue webserver02: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver02-sidecar" depends_on: - webserver02-sidecar webserver03-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: green networks: envoymesh: ipv4_address: 172.31.35.13 aliases: - webservice1 - green webserver03: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver03-sidecar" depends_on: - webserver03-sidecar webserver04-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: gray networks: envoymesh: ipv4_address: 172.31.35.14 aliases: - webservice2 - gray webserver04: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver04-sidecar" depends_on: - webserver04-sidecar webserver05-sidecar: image: envoyproxy/envoy-alpine:v1.20.0 environment: - ENVOY_UID=0 volumes: - ./envoy-sidecar-proxy.yaml:/etc/envoy/envoy.yaml hostname: black networks: envoymesh: ipv4_address: 172.31.35.15 aliases: - webservice2 - black webserver05: image: ikubernetes/demoapp:v1.0 environment: - PORT=8080 - HOST=127.0.0.1 network_mode: "service:webserver05-sidecar" depends_on: - webserver05-sidecar networks: envoymesh: driver: bridge ipam: config: - subnet: 172.31.35.0/24
Experiment verification
docker-compose up
Test from a cloned terminal window
# Use the send-requests.sh script to drive requests at webcluster1; some of the responses carry 5xx status codes, which is exactly the effect of the circuit breaker tripping;
root@test:/apps/servicemesh_in_practise-develop/Cluster-Manager/circuit-breaker# ./send-requests.sh http://172.31.35.2/ 300
200
200
200
503    # circuit-broken request
200
200
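To tell which limit is actually tripping, Envoy's admin interface (port 9901 here) exposes per-cluster counters under /stats, for example cluster.webcluster1.upstream_rq_pending_overflow and cluster.webcluster1.upstream_cx_overflow. The sketch below only restates the thresholds configured on webcluster1 together with what each one limits; the parenthesised values are Envoy's documented defaults, not something set by this demo:

# Thresholds configured on webcluster1 (Envoy defaults in parentheses):
circuit_breakers:
  thresholds:
  - max_connections: 1          # concurrent connections to the upstream cluster (1024)
    max_pending_requests: 1     # requests allowed to queue while waiting for a connection (1024)
    max_retries: 3              # concurrent retries to the cluster (3)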