Nginx Heartbeat Detection

Reposted article, 2019-06-26 11:26

We usually use Nginx's ngx_http_upstream_module to configure a server group, for example:

upstream springboot {
    server 10.3.73.223:8080 max_fails=2 fail_timeout=30s;
    server 10.3.73.223:8090 max_fails=2 fail_timeout=30s;
}
server {
    listen 80;
    server_name localhost;
    location /test {
        proxy_pass http://springboot;
    }
}

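If communication with a server fails 2 times (max_fails; the default is 1, and 0 disables failure accounting so the server is always treated as available) within 30s (fail_timeout; the default is 10s), the server is considered unavailable.

fail_timeout also sets how long the server stays unavailable: it is skipped for 30s, after which Nginx tries it again with a live request, and a successful response marks it as recovered.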
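The max_fails / fail_timeout bookkeeping described above can be sketched as a small state object. This is a simplified model for illustration only, not the Nginx implementation; the class name PassivePeerState and its methods are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```java
// Simplified model of max_fails / fail_timeout (illustrative, not Nginx source code).
class PassivePeerState {
    private final int maxFails;        // max_fails
    private final long failTimeoutMs;  // fail_timeout, in milliseconds
    private int fails = 0;             // failures counted in the current window
    private long windowStartMs = 0;    // time of the first failure in the window
    private long downSinceMs = -1;     // -1 means the peer is not marked down

    PassivePeerState(int maxFails, long failTimeoutMs) {
        this.maxFails = maxFails;
        this.failTimeoutMs = failTimeoutMs;
    }

    /** Record one failed attempt observed at time nowMs. */
    void onFailure(long nowMs) {
        if (maxFails == 0) {
            return; // max_fails=0 disables failure accounting
        }
        if (downSinceMs >= 0) {
            downSinceMs = nowMs; // a failed retry keeps the peer down for another fail_timeout
            return;
        }
        if (fails == 0 || nowMs - windowStartMs > failTimeoutMs) {
            fails = 0;
            windowStartMs = nowMs; // start a new counting window
        }
        fails++;
        if (fails >= maxFails) {
            downSinceMs = nowMs;
        }
    }

    /** Record one successful attempt: the peer is considered healthy again. */
    void onSuccess() {
        fails = 0;
        downSinceMs = -1;
    }

    /** A down peer is skipped until fail_timeout has elapsed, then it may be retried. */
    boolean isAvailable(long nowMs) {
        return downSinceMs < 0 || nowMs - downSinceMs >= failTimeoutMs;
    }
}
```

With new PassivePeerState(2, 30000), two failures at 0 ms and 10000 ms mark the peer down, isAvailable(20000) returns false, and isAvailable(40000) returns true because the 30s have passed and the peer may be tried again.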

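Note in particular that what counts as a failed communication is defined by the module that uses the upstream, through directives such as proxy_next_upstream, fastcgi_next_upstream, and memcached_next_upstream.

Taking proxy_next_upstream as an example:

An error or timeout while establishing a connection with the server, sending it a request, or reading its response header is considered a failed communication

An empty or invalid response from the server is considered a failed communication

http_500, http_502, http_503, http_504, and http_429 are considered failed communications only if they are specified in the directive

http_403 and http_404 are never considered failed communications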
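The four rules above can be written down as a single predicate. This is a minimal sketch under the assumption that the caller already knows whether a transport-level problem occurred; the class NextUpstreamRules and its parameters are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class NextUpstreamRules {
    // Status codes that count as failures only when listed in the directive.
    private static final Set<Integer> OPT_IN =
            new HashSet<>(Arrays.asList(500, 502, 503, 504, 429));

    /**
     * @param configured     values listed in the directive, e.g. "http_500"
     * @param transportError true for a connect/send/header-read error or timeout,
     *                       or for an empty or invalid response
     * @param status         HTTP status returned by the server (ignored on transport errors)
     */
    static boolean isUnsuccessful(Set<String> configured, boolean transportError, int status) {
        if (transportError) {
            return true; // errors, timeouts, and invalid responses always count
        }
        if (status == 403 || status == 404) {
            return false; // never counted as a failed communication
        }
        // 500/502/503/504/429 count only if the directive lists them.
        return OPT_IN.contains(status) && configured.contains("http_" + status);
    }
}
```

For example, a 500 response is not a failed communication with an empty configured set, but it is one once the set contains "http_500".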

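When one server in the upstream fails to respond, Nginx passes the request to the next server, and so on until every server has been tried. If no successful response has been obtained by then, the client receives the result from the last server.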
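The "try the next server, return the last result if all fail" behaviour can be sketched as a loop. This is an illustrative model only: FailoverSketch, Attempt, and the send function are hypothetical, and real Nginx applies further limits (for example on the number of tries) that are not modelled here.

```java
import java.util.List;
import java.util.function.Function;

class FailoverSketch {
    /** Outcome of one attempt; "failed" means a failed communication as defined above. */
    static final class Attempt {
        final String body;
        final boolean failed;

        Attempt(String body, boolean failed) {
            this.body = body;
            this.failed = failed;
        }
    }

    /** Tries each server in order; returns the first success, else the last attempt's result. */
    static Attempt forward(List<String> servers, Function<String, Attempt> send) {
        Attempt last = null;
        for (String server : servers) {
            last = send.apply(server);
            if (!last.failed) {
                return last; // first successful response wins
            }
        }
        // Every server failed (or the list was empty, in which case this is null):
        // the client sees the result of the last attempt.
        return last;
    }
}
```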

Testing with the configuration above:

package com.sean.test;

import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.ResponseStatus;

/**
 * Created by seanzou on 2018/8/20.
 */
@org.springframework.stereotype.Controller
public class Controller {

    @RequestMapping("/test")
    @ResponseBody
    // @ResponseStatus(code = HttpStatus.NOT_FOUND)
    @ResponseStatus(code = HttpStatus.INTERNAL_SERVER_ERROR)
    public String test() {
        System.out.println("test");
        // throw new RuntimeException();
        return "test";
    }
}

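Even when the server responds with a 404 or 500 status code, Nginx still treats the communication as successful under this configuration (no status codes are listed in proxy_next_upstream); only stopping the service itself counts as a failure

upstream has the following problems:

1 It cannot actively detect server state

2 The configuration is inflexible: the conditions for a failed communication cannot be customized (only a few predefined status codes are available)

1. ngx_http_upstream_hc_module

ngx_http_upstream_hc_module allows periodic health checks of the servers in a server group. The precondition is that the server group resides in shared memory (the shared memory keeps the group's configuration and run-time state, which are shared between the Nginx worker processes), for example: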

http {
    upstream dynamic {
        # shared memory zone
        zone upstream_dynamic 64k;
        server backend1.example.com;
        server backend2.example.com;
    }
    server {
        ...
        # Every 5s (the default), Nginx sends a "/" request to each server in dynamic
        # location / {
        #     proxy_pass http://dynamic;
        #     health_check;
        # }
        # health_check can also judge server state with custom match rules
        location / {
            proxy_pass http://dynamic;
            health_check match=welcome;
        }
    }
    # When several conditions are configured, all of them must hold for the server to be considered healthy
    # Here: the status code is 200, Content-Type is text/html, and the body contains "Welcome to nginx!"
    # Nginx examines only the first 256k of the response body
    match welcome {
        status 200;
        header Content-Type = text/html;
        body ~ "Welcome to nginx!";
    }
}

It is very powerful; unfortunately, only the commercial version of Nginx includes this module
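To make the "all conditions must hold" rule of the match welcome block concrete, here is a minimal sketch of the same three checks in Java. It only models the logic and is not how the module is implemented; MatchWelcomeSketch is a hypothetical name, and the header lookup is simplified to an exact-case key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.regex.Pattern;

class MatchWelcomeSketch {
    // Only the first 256k of the response body is examined.
    static final int BODY_LIMIT = 256 * 1024;

    private static final Pattern BODY_PATTERN = Pattern.compile("Welcome to nginx!");

    /** Returns true only if the status, header, and body conditions all hold. */
    static boolean passes(int status, Map<String, String> headers, byte[] body) {
        if (status != 200) {
            return false; // status 200;
        }
        if (!"text/html".equals(headers.get("Content-Type"))) {
            return false; // header Content-Type = text/html;
        }
        int len = Math.min(body.length, BODY_LIMIT);
        String head = new String(body, 0, len, StandardCharsets.ISO_8859_1);
        return BODY_PATTERN.matcher(head).find(); // body ~ "Welcome to nginx!";
    }
}
```

A response whose marker text starts after the first 256k fails the check in this model, because that part of the body is never examined.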

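2. nginx_upstream_check_module

This module was developed by the Taobao team and is fully open source: nginx_upstream_check_module

Taobao's Tengine ships with this module. If you are not using Tengine, you can add it to your own Nginx by applying a patch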

sean@ubuntu:~$ unzip nginx_upstream_check_module-master.zip
sean@ubuntu:~$ cd nginx-1.14.0/
# Apply the patch
sean@ubuntu:~/nginx-1.14.0$ patch -p1 < ../nginx_upstream_check_module-master/check_1.12.1+.patch
# Add the heartbeat check module and recompile
sean@ubuntu:~/nginx-1.14.0$ sudo ./configure --add-module=../nginx_upstream_check_module-master/
sean@ubuntu:~/nginx-1.14.0$ sudo make
# Back up the old binary
sean@ubuntu:~/nginx-1.14.0$ sudo mv /usr/local/nginx/sbin/nginx /usr/local/nginx/sbin/nginx.bak
# Install the new binary
sean@ubuntu:~/nginx-1.14.0$ sudo cp ./objs/nginx /usr/local/nginx/sbin/
sean@ubuntu:~/nginx-1.14.0$ sudo /usr/local/nginx/sbin/nginx -t
nginx: the configuration file /usr/local/nginx/conf/nginx.conf syntax is ok
nginx: configuration file /usr/local/nginx/conf/nginx.conf test is successful

Modify the Nginx configuration; for the official documentation see: ngx_http_upstream_check_module document

http {
    check_shm_size 1M;
    upstream springboot {
        server 10.3.73.223:8080;
        server 10.3.73.223:8090;
        check interval=3000 rise=2 fall=5 timeout=1000 type=http;
        check_keepalive_requests 1;
        check_http_send "GET /test HTTP/1.0\r\n\r\n";
        check_http_expect_alive http_2xx;
    }
    server {
        listen 80;
        server_name localhost;
        location /test {
            proxy_pass http://springboot;
        }
        location /status {
            check_status;
            access_log off;
        }
    }
}

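The health check state of the backend servers is kept in shared memory, and this directive sets the size of that shared memory. With a large number of servers, make sure the setting is large enough

Syntax: check_shm_size size

Default: 1M

Context: http

Health check rule configuration

Syntax: check interval=milliseconds [fall=count] [rise=count] [timeout=milliseconds] [default_down=true | false] [type=tcp | http | ssl_hello | mysql | ajp] [port=check_port]

Default: check interval=30000 fall=5 rise=2 timeout=1000 default_down=true type=tcp

Context: upstream

interval: the interval between health check requests sent to the backend

fall (fall_count): once the number of consecutive failures reaches fall_count, the server is considered down

rise (rise_count): once the number of consecutive successes reaches rise_count, the server is considered up

timeout: the timeout of a health check request

default_down: the initial state of the server. If true, the server starts as down; if false, it starts as up. The default is true, meaning a server is considered unavailable at first and becomes healthy only after the health checks have succeeded the required number of times

type: the protocol of the health check request; tcp, http, ssl_hello, mysql, and ajp are supported

port: the port that health check requests are sent to, which may differ from the port the backend server serves on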
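The interplay of rise, fall, and default_down can be sketched as a small state machine driven by check results. This is an illustrative model of the semantics described above, not the module's code; ActiveCheckState and its methods are hypothetical names.

```java
// Illustrative model of rise / fall / default_down (not the module's implementation).
class ActiveCheckState {
    private final int rise;     // consecutive successes needed to go up
    private final int fall;     // consecutive failures needed to go down
    private boolean up;         // current state of the server
    private int successes = 0;  // consecutive successful checks so far
    private int failures = 0;   // consecutive failed checks so far

    ActiveCheckState(int rise, int fall, boolean defaultDown) {
        this.rise = rise;
        this.fall = fall;
        this.up = !defaultDown; // default_down=true means the server starts as down
    }

    /** Feed the result of one health check, sent once per interval. */
    void onCheckResult(boolean ok) {
        if (ok) {
            failures = 0;
            successes++;
            if (!up && successes >= rise) {
                up = true;
            }
        } else {
            successes = 0;
            failures++;
            if (up && failures >= fall) {
                up = false;
            }
        }
    }

    boolean isUp() {
        return up;
    }
}
```

With rise=2, fall=5, and default_down=true, a server becomes up after two consecutive successful checks and goes down again only after five consecutive failures; a single success in between resets the failure count.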

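The number of requests that can be sent over one connection. The default is 1, meaning the connection is closed after a single request completes

Syntax: check_keepalive_requests request_num

Default: 1

Context: upstream

The request content sent by an HTTP health check. To reduce the amount of data transferred, the HEAD method is recommended

Syntax: check_http_send http_packet

Default: "GET / HTTP/1.0\r\n\r\n"

Context: upstream

The status codes that an HTTP health check treats as success

Syntax: check_http_expect_alive [http_2xx | http_3xx | http_4xx | http_5xx]

Default: http_2xx | http_3xx

Context: upstream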
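The way check_http_expect_alive groups status codes into classes can be illustrated with a short sketch. It assumes the check simply maps the response status to its hundreds class and looks that class up in the configured set; ExpectAliveSketch is a hypothetical name and this is not the module's code.

```java
import java.util.Set;

class ExpectAliveSketch {
    /** Maps a status code to its class name, e.g. 204 becomes "http_2xx". */
    static String statusClass(int status) {
        return "http_" + (status / 100) + "xx";
    }

    /** A check succeeds when the response's status class is in the expected set. */
    static boolean isAlive(Set<String> expected, int status) {
        return expected.contains(statusClass(status));
    }
}
```

Under the default (http_2xx and http_3xx), a 302 redirect counts as alive; with only http_2xx configured, as in the example configuration, it does not.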

The status page for the backend servers, available in three display formats

Syntax: check_status [html | csv | json]

Default: check_status html

Context: location

------------2018-11-29------------

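In production, the worker machines being checked use Tomcat as the container. check_http_send must include a Host header (it only needs to be present; whether the value is correct does not matter). Otherwise the heartbeat checks keep being sent but are always judged as failed, for example: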

upstream bj {
    server 1.2.3.4:80;
    server 5.6.7.8:80;
    check interval=5000 rise=2 fall=2 timeout=1000 type=http;
    check_http_send "GET /admin/health_check.htm HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n";
    check_http_expect_alive http_2xx;
}
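Because the check_http_send value is a raw HTTP request with escaped line endings, it is easy to get the \r\n sequences wrong when adding a Host header by hand. The sketch below builds the request text and its escaped directive form; CheckHttpSendSketch and both method names are hypothetical helpers for illustration and are not part of the module.

```java
class CheckHttpSendSketch {
    /** Builds the raw HTTP/1.0 request text; host may be null to omit the Host header. */
    static String request(String method, String path, String host) {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(path).append(" HTTP/1.0\r\n");
        if (host != null) {
            sb.append("Host: ").append(host).append("\r\n");
        }
        return sb.append("\r\n").toString(); // a blank line ends the request headers
    }

    /** Escapes CR and LF so the text can be pasted into the directive's quoted string. */
    static String toDirective(String raw) {
        String escaped = raw.replace("\r", "\\r").replace("\n", "\\n");
        return "check_http_send \"" + escaped + "\";";
    }
}
```

Calling toDirective(request("GET", "/admin/health_check.htm", "127.0.0.1")) yields the same check_http_send line used in the example above.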
