Pods in a k8s cluster cannot reach the ClusterIP or the Service


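Background: During a production deployment, a configuration file used an in-cluster Service as its access address. Once the configuration was applied, the service could not be reached, so I started a busybox pod to test. From busybox, CoreDNS resolved the Service name to an IP correctly, but pinging the Service failed, and pinging the ClusterIP directly failed as well.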

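Troubleshooting: I first checked whether kube-proxy was healthy. It had started normally, and restarting it changed nothing: ping still failed. I then checked the network plugin and restarted flannel, again with no effect. Then I remembered that another k8s environment of mine could ping its Services, so I compared the configuration of the two environments. The only difference was in the kube-proxy configuration: the environment where ping worked ran kube-proxy with --proxy-mode=ipvs, while the one where ping failed used the default mode (iptables).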

 

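In iptables mode no actual device responds: the ClusterIP is a virtual IP that exists only in the NAT rules written by kube-proxy and is not assigned to any network interface, so nothing answers ICMP echo requests. In ipvs mode, kube-proxy binds every ClusterIP to the kube-ipvs0 dummy interface on the node, so the ping is answered.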
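One way to see this difference on a node is to check whether a given ClusterIP is actually assigned to a local interface. The sketch below is only an illustration: the function name has_local_addr is hypothetical, and 10.96.0.1 is just an example address. It reads the output of ip -o -4 addr show on stdin and prints the interface that holds the address, if any.

```shell
# has_local_addr IP
# Reads `ip -o -4 addr show` output on stdin. Prints the name of every
# interface that holds IP and returns 0, or returns 1 if none does.
has_local_addr() {
  awk -v ip="$1" '
    $3 == "inet" {
      split($4, a, "/")            # drop the prefix length, e.g. 10.96.0.1/32
      if (a[1] == ip) { print $2; found = 1 }
    }
    END { exit (found ? 0 : 1) }
  '
}

# Usage on a node (10.96.0.1 is only an example ClusterIP):
#   ip -o -4 addr show | has_local_addr 10.96.0.1
```

On a node running kube-proxy in ipvs mode this would be expected to print kube-ipvs0; in iptables mode it should find nothing, because the address lives only in the NAT rules.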

After several rounds of testing, the fix was to add --proxy-mode=ipvs, flush the firewall rules on the node, and restart kube-proxy. After that, the ping succeeded.

 

For a kubeadm deployment, switch kube-proxy to ipvs mode as follows.

By default, the logs of the deployed kube-proxy contain the following message: Flag proxy-mode="" unknown, assuming iptables proxy

# kubectl logs -n kube-system kube-proxy-xxxx
W1013 06:55:35.773739       1 proxier.go:513] Failed to load kernel module ip_vs with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1013 06:55:35.868822       1 proxier.go:513] Failed to load kernel module ip_vs_rr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1013 06:55:35.869786       1 proxier.go:513] Failed to load kernel module ip_vs_wrr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1013 06:55:35.870800       1 proxier.go:513] Failed to load kernel module ip_vs_sh with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1013 06:55:35.876832       1 server_others.go:249] Flag proxy-mode="" unknown, assuming iptables proxy
I1013 06:55:35.890892       1 server_others.go:143] Using iptables Proxier.
I1013 06:55:35.892136       1 server.go:534] Version: v1.15.0
I1013 06:55:35.909025       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I1013 06:55:35.909053       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I1013 06:55:35.919298       1 conntrack.go:83] Setting conntrack hashsize to 32768
I1013 06:55:35.945969       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I1013 06:55:35.946044       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I1013 06:55:35.946623       1 config.go:96] Starting endpoints config controller
I1013 06:55:35.946660       1 controller_utils.go:1029] Waiting for caches to sync for endpoints config controller
I1013 06:55:35.946695       1 config.go:187] Starting service config controller
I1013 06:55:35.946713       1 controller_utils.go:1029] Waiting for caches to sync for service config controller
I1013 06:55:36.047121       1 controller_utils.go:1036] Caches are synced for endpoints config controller
I1013 06:55:36.047195       1 controller_utils.go:1036] Caches are synced for service config controller
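When there are several nodes, reading each log by eye gets tedious. The following is a minimal sketch that reports the proxier a kube-proxy log mentions. The function name proxy_mode_from_log is hypothetical, and the matched strings are taken from the v1.15.0 logs in this article; other kube-proxy versions may word these lines differently.

```shell
# proxy_mode_from_log
# Reads a kube-proxy log on stdin and prints "ipvs" or "iptables" according
# to the last "Using ... Proxier" line. Prints "unknown" and returns 1 if no
# such line is present.
proxy_mode_from_log() {
  awk '
    /Using ipvs Proxier/     { mode = "ipvs" }
    /Using iptables Proxier/ { mode = "iptables" }
    END {
      if (mode == "") { print "unknown"; exit 1 }
      print mode
    }
  '
}

# Usage (kube-proxy-xxxx is a placeholder pod name, as above):
#   kubectl logs -n kube-system kube-proxy-xxxx | proxy_mode_from_log
```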

Edit the kube-proxy configuration and set mode to ipvs.
The value is empty by default.
# kubectl edit cm kube-proxy -n kube-system
...
    ipvs:
      excludeCIDRs: null
      minSyncPeriod: 0s
      scheduler: ""
      strictARP: false
      syncPeriod: 30s
    kind: KubeProxyConfiguration
    metricsBindAddress: 127.0.0.1:10249
    mode: "ipvs"
...
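The same change can be made without an interactive editor. This is a hedged sketch, not the method used in the original troubleshooting: the helper name set_proxy_mode is hypothetical, and it assumes that the only line in the ConfigMap whose key is exactly mode: is the kube-proxy one, so inspect the output before applying it.

```shell
# set_proxy_mode MODE
# Filters YAML on stdin, rewriting the value of any line whose key is
# exactly `mode:` (at any indentation) to "MODE". Other lines pass through.
set_proxy_mode() {
  sed -E "s/^([[:space:]]*mode:).*/\1 \"$1\"/"
}

# Usage (review the result before piping it into kubectl apply):
#   kubectl get cm kube-proxy -n kube-system -o yaml \
#     | set_proxy_mode ipvs \
#     | kubectl apply -f -
```

Either way, the running kube-proxy pods only pick up the new mode after they are restarted.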
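Note that ipvs mode requires the ip_vs kernel modules to be loaded on every node. (On kernels 4.19 and later, nf_conntrack_ipv4 has been merged into nf_conntrack, so load nf_conntrack there instead.)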
cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
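To confirm that nothing was skipped, the lsmod output can be checked against the list of required modules. The sketch below is an illustration with a hypothetical function name, missing_ipvs_modules. It checks only the four ip_vs modules from the script above and leaves out the conntrack module, whose name depends on the kernel version.

```shell
# missing_ipvs_modules
# Reads `lsmod` output on stdin and prints every required ip_vs module that
# is not loaded. Returns 0 when all are loaded and 1 when any is missing.
missing_ipvs_modules() {
  awk '
    BEGIN { n = split("ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh", want, " ") }
    { loaded[$1] = 1 }             # first column of lsmod is the module name
    END {
      for (i = 1; i <= n; i++)
        if (!(want[i] in loaded)) { print want[i]; missing = 1 }
      exit (missing ? 1 : 0)
    }
  '
}

# Usage on a node:
#   lsmod | missing_ipvs_modules && echo "all ip_vs modules loaded"
```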
Restart the kube-proxy pods:
# kubectl get pod -n kube-system | grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}'
pod "kube-proxy-62gvr" deleted
pod "kube-proxy-n2rml" deleted
pod "kube-proxy-ppdb6" deleted
pod "kube-proxy-rr9cg" deleted
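The grep and awk system() pipeline works, but it matches kube-proxy anywhere on the line and spawns one shell per pod. A slightly stricter variant, shown here only as a sketch with a hypothetical helper name, matches on the pod name column and hands the names to xargs.

```shell
# kube_proxy_pod_names
# Reads `kubectl get pod` output on stdin and prints only the names in the
# first column that start with "kube-proxy-".
kube_proxy_pod_names() {
  awk '$1 ~ /^kube-proxy-/ { print $1 }'
}

# Usage (-r is the GNU xargs flag for "do nothing on empty input"):
#   kubectl get pod -n kube-system --no-headers \
#     | kube_proxy_pod_names \
#     | xargs -r kubectl delete pod -n kube-system
```

The DaemonSet recreates each deleted pod, and the new pods start with the updated configuration.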
After the pods have restarted, check the logs again: the mode is now ipvs.
# kubectl get pod -n kube-system |grep kube-proxy
kube-proxy-cbm8p 1/1 Running 0 85s
kube-proxy-d97pn 1/1 Running 0 83s
kube-proxy-gmq6s 1/1 Running 0 76s
kube-proxy-x6tcg 1/1 Running 0 81s
# kubectl logs -n kube-system kube-proxy-cbm8p
I1013 07:34:38.685794 1 server_others.go:170] Using ipvs Proxier.
W1013 07:34:38.686066 1 proxier.go:401] IPVS scheduler not specified, use rr by default
I1013 07:34:38.687224 1 server.go:534] Version: v1.15.0
I1013 07:34:38.692777 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I1013 07:34:38.693378 1 config.go:187] Starting service config controller
I1013 07:34:38.693391 1 controller_utils.go:1029] Waiting for caches to sync for service config controller
I1013 07:34:38.693406 1 config.go:96] Starting endpoints config controller
I1013 07:34:38.693411 1 controller_utils.go:1029] Waiting for caches to sync for endpoints config controller
I1013 07:34:38.793684 1 controller_utils.go:1036] Caches are synced for endpoints config controller
I1013 07:34:38.793688 1 controller_utils.go:1036] Caches are synced for service config controller
Pinging the Service again now works.
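Ping only exercises ICMP, so it can be worth confirming as well that the Service port itself accepts connections. The sketch below is an assumption-laden illustration: tcp_probe is a hypothetical helper, it needs bash (for /dev/tcp) and the coreutils timeout command, so it will not run in a plain busybox shell, and the address and port in the usage line are examples only.

```shell
# tcp_probe HOST PORT
# Returns 0 if a TCP connection to HOST:PORT can be opened within 3 seconds,
# non-zero otherwise. Uses the /dev/tcp redirection built into bash.
tcp_probe() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage (10.96.0.1 and 443 are only example values):
#   tcp_probe 10.96.0.1 443 && echo "service port reachable"
```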

 

