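The algorithm service is deployed on k8s.
Requests pass through two layers of proxy, so running into problems was not really a surprise: that is a lot of proxy layers.
kong -> nginx-ingress -> svc
When the service is accessed through kong, some requests return 502: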
<html><head><title>502 Bad Gateway</title></head><body><center><h1>502 Bad Gateway</h1></center><hr><center>nginx/1.19.1</center></body></html>
Checking the kong logs shows warnings:
2021/01/25 16:46:05 [warn] 166110#0: *3522559308 a client request body is buffered to a temporary file /usr/local/kong/client_body_temp/0000015840, client: 192.168.11.111, server: kong, request: "POST /slap/algo-api/3d1503b8ee0c4b8082d13a0d8f2f3173/general_sentiment HTTP/1.1", host: "cclient.github.com"
2021/01/25 16:46:06 [warn] 166118#0: *3522560171 a client request body is buffered to a temporary file /usr/local/kong/client_body_temp/0000015841, client: 192.168.11.111, server: kong, request: "POST /slap/algo-api/3d1503b8ee0c4b8082d13a0d8f2f3173/general_sentiment HTTP/1.1", host: "cclient.github.com"
2021/01/25 16:46:06 [warn] 166117#0: *3522561394 a client request body is buffered to a temporary file /usr/local/kong/client_body_temp/0000015842, client: 192.168.11.111, server: kong, request: "POST /slap/algo-api/3d1503b8ee0c4b8082d13a0d8f2f3173/general_sentiment HTTP/1.1", host: "cclient.github.com"
I increased client_body_buffer_size.
After that the warning no longer appeared, but some requests still returned 502.
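The warning itself only says that a request body did not fit in nginx's in-memory buffer and was written to a temporary file, which is why raising the buffer silences it without touching the 502s. A minimal sketch for checking whether a given payload would overflow the buffer. The 8 KiB figure is nginx's documented default on x86-64 and is only an assumption about this Kong instance; `body_size_bytes` and `body_exceeds_buffer` are hypothetical helpers, not part of Kong or nginx.

```python
import json

# Assumption: nginx's documented default client_body_buffer_size on x86-64 (8k).
# The value actually configured in this Kong instance may differ.
DEFAULT_BUFFER_BYTES = 8 * 1024


def body_size_bytes(texts):
    """Size in bytes of the texts serialized as a UTF-8 JSON request body."""
    return len(json.dumps(texts, ensure_ascii=False).encode("utf-8"))


def body_exceeds_buffer(texts, buffer_bytes=DEFAULT_BUFFER_BYTES):
    """True if a body of this size would be spooled to a temporary file."""
    return body_size_bytes(texts) > buffer_bytes
```

Running `body_exceeds_buffer` over a sample of real request payloads gives a rough idea of how large the buffer needs to be.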
Tested: accessing the service directly through the nodeport, no 502
Through nginx-ingress, some 502
Through kong, some 502
So troubleshoot and tune one layer at a time.
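The layer-by-layer comparison can be made repeatable by sending the same request many times to each entry point and counting status codes. This is a rough sketch using only the Python standard library; `post_json` and `tally_statuses` are hypothetical helper names, and the mapping of URLs to layers in the commented example is inferred from the curl commands and kong log lines in this post.

```python
import json
import urllib.error
import urllib.request
from collections import Counter


def post_json(url, payload, timeout=10):
    """POST payload as JSON and return the HTTP status code (including 4xx/5xx)."""
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the 502s are exactly what we want to count.
        return err.code


def tally_statuses(send, attempts):
    """Call send() `attempts` times and count the status codes it returns."""
    return Counter(send() for _ in range(attempts))


# Example (not executed here): same payload against each layer.
# LAYERS = {
#     "nodeport": "http://192.168.100.128:32359/sentiment",
#     "ingress": "http://matrix-paas.mlamp.cn/api_v1/123456/sentiment",
#     "kong": "http://matrix-paas.mlamp.cn/slap/algo-api/"
#             "3d1503b8ee0c4b8082d13a0d8f2f3173/general_sentiment",
# }
# for name, url in LAYERS.items():
#     print(name, tally_statuses(lambda: post_json(url, ["test"]), 200))
```

A layer is suspect as soon as its counter contains 502 while the layer beneath it shows only 200.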
The original Ingress definition:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /$2
  generation: 1
  name: statefulset-123456
  namespace: default
spec:
  rules:
  - host: cclient.github.com
    http:
      paths:
      - backend:
          serviceName: statefulset-123456
          servicePort: 80
        path: /api_v1/123456(/|$)(.*)
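To make the path handling in this Ingress concrete, here is a small sketch of what the `path` regex combined with `rewrite-target: /$2` does to a request path. It mimics the rule with Python's `re` module rather than running nginx, and `rewritten_path` is a hypothetical helper written for illustration.

```python
import re

# Path pattern copied from the Ingress above; anchored at the start of the path.
PATH_PATTERN = re.compile(r"^/api_v1/123456(/|$)(.*)")


def rewritten_path(request_path):
    """Return the path the backend would see, or None if the rule does not match."""
    match = PATH_PATTERN.match(request_path)
    if match is None:
        return None
    # rewrite-target "/$2" keeps only the second capture group.
    return "/" + match.group(2)
```

With this rule a request for `/api_v1/123456/sentiment` reaches the backend as `/sentiment`, which is the same path used in the direct nodeport test.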
The service itself, accessed through the nodeport, returns these headers:
curl -D header_nodeport -H "Content-Type: application/json" -X POST -d '["我感覺就是翻新機。","蘇寧的電腦就是比便宜好幾百"]' "http://192.168.100.128:32359/sentiment"
HTTP/1.1 200 OK
Content-Length: 519
Content-Type: application/json
Connection: keep-alive
Keep-Alive: 5
The service accessed through nginx-ingress returns these headers:
curl -D header_nodeport -H "Content-Type: application/json" -X POST -d '["我感覺就是翻新機。","蘇寧的電腦就是比便宜好幾百"]' "http://matrix-paas.mlamp.cn/api_v1/123456/sentiment"
curl -D header_nodeport -H "Content-Type: application/json" -X POST -d '["我感覺就是翻新機。","蘇寧的電腦就是比便宜好幾百"]' "http://matrix-paas.mlamp.cn/slap/algo-api/3d1503b8ee0c4b8082d13a0d8f2f3173/general_sentiment"
HTTP/1.1 200 OK
Server: nginx/1.19.1
Date: Wed, 27 Jan 2021 10:36:37 GMT
Content-Type: application/json
Content-Length: 519
Connection: keep-alive
Vary: Accept-Encoding
The service accessed through the outer kong layer returns these headers:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 519
Connection: keep-alive
Server: nginx/1.19.1
Date: Wed, 27 Jan 2021 10:36:01 GMT
Vary: Accept-Encoding
X-Kong-Upstream-Latency: 92
X-Kong-Proxy-Latency: 16
Via: kong/2.0.4
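The three header dumps are easier to compare mechanically. The sketch below parses `curl -D` style dumps and lists which headers each proxy layer adds; the header text is copied from the dumps in this post, while `parse_headers` and `added_headers` are hypothetical helpers.

```python
NODEPORT_HEADERS = """HTTP/1.1 200 OK
Content-Length: 519
Content-Type: application/json
Connection: keep-alive
Keep-Alive: 5
"""

INGRESS_HEADERS = """HTTP/1.1 200 OK
Server: nginx/1.19.1
Date: Wed, 27 Jan 2021 10:36:37 GMT
Content-Type: application/json
Content-Length: 519
Connection: keep-alive
Vary: Accept-Encoding
"""

KONG_HEADERS = """HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 519
Connection: keep-alive
Server: nginx/1.19.1
Date: Wed, 27 Jan 2021 10:36:01 GMT
Vary: Accept-Encoding
X-Kong-Upstream-Latency: 92
X-Kong-Proxy-Latency: 16
Via: kong/2.0.4
"""


def parse_headers(raw):
    """Parse a `curl -D` style header dump into {lowercased name: value}."""
    headers = {}
    for line in raw.strip().splitlines()[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def added_headers(inner, outer):
    """Header names present in the outer layer's response but not the inner one."""
    return sorted(set(outer) - set(inner))
```

Swapping the arguments shows what a layer drops: the backend's `Keep-Alive: 5` header is the only one that does not survive nginx-ingress.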
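The first goal was to stop nginx-ingress itself from returning 502. I added a number of annotations that map to nginx's timeout and buffer parameters, but none of them had any effect and there were still lots of 502s: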
nginx.ingress.kubernetes.io/client-body-buffer-size: 100m
nginx.ingress.kubernetes.io/proxy-body-size: 100m
nginx.ingress.kubernetes.io/proxy-connect-timeout: "600"
nginx.ingress.kubernetes.io/proxy-next-upstream-timeout: "600"
nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "8"
nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
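I once wrote up a similar HTTP/1.0 versus HTTP/1.1 problem, so going on that experience I tried changing the HTTP version.
"nginx 配合jersey+netty的奇怪問題" (A strange problem with nginx in front of jersey+netty) - 資本主義接班人 - 博客園 (cnblogs.com)
As an experiment I added one more annotation: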
nginx.ingress.kubernetes.io/proxy-http-version: "1.0"
The 502s stopped appearing; problem solved.
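For reference, a sketch of applying that annotation to the existing Ingress without editing the whole manifest, by building a JSON merge patch for `kubectl patch`. The helper names are hypothetical; the Ingress name and namespace come from the manifest earlier in this post. One unverified guess at why the setting helps: with HTTP/1.0 nginx does not keep upstream connections open for reuse, so it cannot send a request down a connection the backend has already closed after its `Keep-Alive: 5` idle period. That explanation was not confirmed on this cluster.

```python
import json

ANNOTATION = "nginx.ingress.kubernetes.io/proxy-http-version"


def http_version_patch(version="1.0"):
    """JSON merge patch that sets the proxy-http-version annotation."""
    return json.dumps({"metadata": {"annotations": {ANNOTATION: version}}})


def kubectl_patch_command(name, namespace, version="1.0"):
    """Argument list for `kubectl patch`; pass it to subprocess.run to apply."""
    return [
        "kubectl", "patch", "ingress", name,
        "--namespace", namespace,
        "--type", "merge",
        "--patch", http_version_patch(version),
    ]
```

For example, `kubectl_patch_command("statefulset-123456", "default")` yields the command for the Ingress shown above; because the annotation is per Ingress, other services behind the same controller are not affected.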
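Some doubt remains, though, because the backend service really does speak HTTP/1.1. It may be a bug in nginx-ingress; for now the priority is that the service works, so I am not digging further (this is an old cluster, and nginx-ingress has long since stopped being the recommended choice).
The outer kong layer is too tightly integrated to set HTTP/1.0 for specific requests, and switching to 1.0 globally would affect other services, so that part is on hold for now.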