Pods in a k8s cluster cannot reach the ClusterIP or the Service

Reposted article. 2020-12-15 22:23. Category: Troubleshooting

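Background: During a production deployment, the access address in a configuration file was set to a Service in the cluster. Once configured, the service could not be reached, so I started a busybox pod to test. From busybox, CoreDNS resolved the Service name to its IP correctly, but pinging the Service failed, and pinging the ClusterIP directly failed as well.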

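Troubleshooting: I first checked whether kube-proxy was healthy. It had started normally everywhere; I restarted it anyway, and the ping still failed. I then checked the network plugin and restarted flannel, with no effect. Then I remembered that in another k8s environment of mine the Service could be pinged, so I compared the configuration of the two environments. The only difference was in kube-proxy: the environment where ping worked ran kube-proxy with --proxy-mode=ipvs, while the one where it failed used the default mode (iptables).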
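The comparison between the two environments can be scripted. The sketch below is one possible way to do it, not the procedure used in this article: proxy_mode_from_config is a hypothetical helper that reads a KubeProxyConfiguration from stdin and prints the configured mode, treating an empty value as iptables (the fallback reported by kube-proxy on this v1.15 cluster). It assumes a kubeadm-style ConfigMap named kube-proxy with a config.conf key; if kube-proxy is configured through the --proxy-mode command-line flag instead, this helper will not see it.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl): print the proxy mode found in a
# KubeProxyConfiguration read from stdin. An empty or missing mode is reported
# as "iptables", the fallback kube-proxy logs on this cluster.
proxy_mode_from_config() {
    mode=$(awk '$1 == "mode:" { gsub(/"/, "", $2); print $2; exit }')
    echo "${mode:-iptables}"
}

# Usage on a kubeadm cluster (assumes the kubeadm ConfigMap layout):
#   kubectl get cm kube-proxy -n kube-system \
#     -o jsonpath='{.data.config\.conf}' | proxy_mode_from_config
```

Running the usage line against each cluster and comparing the two results reproduces the check described above.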

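In iptables mode the ClusterIP is a virtual address that is not bound to any network device, so nothing answers the ping.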

After several rounds of testing, the fix was: add --proxy-mode=ipvs, flush the firewall rules on the node, and restart kube-proxy. After that the ping succeeded.
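One way to see why ping starts working: in ipvs mode kube-proxy binds each ClusterIP to a dummy interface named kube-ipvs0 on every node, so the address now belongs to a local device. The sketch below checks for that binding. clusterip_bound is a hypothetical helper written for this illustration, and 10.96.0.1 is only an example address.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given ClusterIP appears in the output of
# `ip -o -4 addr show dev kube-ipvs0`, which is read from stdin.
clusterip_bound() {
    awk -v ip="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), a, "/")   # strip the /32 prefix length
                    if (a[1] == ip) found = 1
                }
        }
        END { exit found ? 0 : 1 }'
}

# Usage on a node running kube-proxy in ipvs mode (example ClusterIP):
#   ip -o -4 addr show dev kube-ipvs0 | clusterip_bound 10.96.0.1 && echo bound
#   ipvsadm -Ln    # lists virtual servers and backends, if ipvsadm is installed
```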

Switching kube-proxy to ipvs mode on a kubeadm deployment

By default, the logs of the kube-proxy we deployed contain the following message: Flag proxy-mode="" unknown, assuming iptables proxy

# kubectl logs -n kube-system kube-proxy-xxxx
W1013 06:55:35.773739       1 proxier.go:513] Failed to load kernel module ip_vs with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1013 06:55:35.868822       1 proxier.go:513] Failed to load kernel module ip_vs_rr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1013 06:55:35.869786       1 proxier.go:513] Failed to load kernel module ip_vs_wrr with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1013 06:55:35.870800       1 proxier.go:513] Failed to load kernel module ip_vs_sh with modprobe. You can ignore this message when kube-proxy is running inside container without mounting /lib/modules
W1013 06:55:35.876832       1 server_others.go:249] Flag proxy-mode="" unknown, assuming iptables proxy
I1013 06:55:35.890892       1 server_others.go:143] Using iptables Proxier.
I1013 06:55:35.892136       1 server.go:534] Version: v1.15.0
I1013 06:55:35.909025       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I1013 06:55:35.909053       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I1013 06:55:35.919298       1 conntrack.go:83] Setting conntrack hashsize to 32768
I1013 06:55:35.945969       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I1013 06:55:35.946044       1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I1013 06:55:35.946623       1 config.go:96] Starting endpoints config controller
I1013 06:55:35.946660       1 controller_utils.go:1029] Waiting for caches to sync for endpoints config controller
I1013 06:55:35.946695       1 config.go:187] Starting service config controller
I1013 06:55:35.946713       1 controller_utils.go:1029] Waiting for caches to sync for service config controller
I1013 06:55:36.047121       1 controller_utils.go:1036] Caches are synced for endpoints config controller
I1013 06:55:36.047195       1 controller_utils.go:1036] Caches are synced for service config controller
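The two things worth reading out of this log are the proxier in use and the kernel modules that failed to load. A minimal sketch of extracting both follows. The helper names are hypothetical, and the patterns are taken from the v1.15.0 messages shown above, so other kube-proxy versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helpers that read kube-proxy log text from stdin.

# Print the proxier kube-proxy reports it is using ("iptables" or "ipvs").
proxy_mode_from_log() {
    grep -o -E 'Using (ipvs|iptables) Proxier' | awk '{ print $2 }' | head -n 1
}

# Print the kernel modules kube-proxy failed to load, one per line.
modules_failed_in_log() {
    sed -n 's/.*Failed to load kernel module \([^ ]*\) with modprobe.*/\1/p'
}

# Usage (kube-proxy-xxxx is the placeholder pod name used above):
#   kubectl logs -n kube-system kube-proxy-xxxx | proxy_mode_from_log
#   kubectl logs -n kube-system kube-proxy-xxxx | modules_failed_in_log
```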

Edit the kube-proxy ConfigMap and set mode to ipvs. It is empty by default.

# kubectl edit cm kube-proxy -n kube-system
...
ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: ""
  strictARP: false
  syncPeriod: 30s
kind: KubeProxyConfiguration
metricsBindAddress: 127.0.0.1:10249
mode: "ipvs"
...
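If an interactive edit is inconvenient, the same change could be made with a small text filter. This is only a sketch under assumptions: set_ipvs_mode is a hypothetical helper, it only handles the case where mode is currently the empty string, and it assumes kubectl prints config.conf as a multi-line block rather than as a single escaped string.

```shell
#!/bin/sh
# Hypothetical filter: rewrite an empty `mode: ""` line to `mode: "ipvs"`,
# keeping the original indentation and leaving every other line untouched.
set_ipvs_mode() {
    sed 's/^\([[:space:]]*mode:\)[[:space:]]*""[[:space:]]*$/\1 "ipvs"/'
}

# Usage (review the filtered output before replacing anything on a real cluster):
#   kubectl get cm kube-proxy -n kube-system -o yaml | set_ipvs_mode | kubectl replace -f -
```

Keys such as scheduler: "" are not affected because the pattern is anchored on the mode key.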

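Note that ipvs mode requires the ip_vs kernel modules to be loaded (on kernels 4.19 and later, nf_conntrack_ipv4 has been merged into nf_conntrack, so load nf_conntrack there instead):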

cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF

chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
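The lsmod output can also be checked mechanically on each node. The sketch below uses a hypothetical helper, missing_ipvs_modules, that reads lsmod output from stdin and prints whichever of the four ip_vs modules is absent. The conntrack module is left out on purpose because its name depends on the kernel version.

```shell
#!/bin/sh
# Hypothetical helper: read `lsmod` output from stdin and print each ip_vs
# module from the list above that is not loaded. Prints nothing if all are present.
missing_ipvs_modules() {
    loaded=$(awk 'NR > 1 { print $1 }')   # skip the "Module Size Used by" header
    for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh; do
        echo "$loaded" | grep -qx "$m" || echo "$m"
    done
}

# Usage on a node:
#   lsmod | missing_ipvs_modules
```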

Restart the kube-proxy pods:

# kubectl get pod -n kube-system | grep kube-proxy |awk '{system("kubectl delete pod "$1" -n kube-system")}'
pod "kube-proxy-62gvr" deleted
pod "kube-proxy-n2rml" deleted
pod "kube-proxy-ppdb6" deleted
pod "kube-proxy-rr9cg" deleted

After the pods restart, check the logs again; the mode has changed to ipvs.

# kubectl get pod -n kube-system |grep kube-proxy
kube-proxy-cbm8p   1/1   Running   0   85s
kube-proxy-d97pn   1/1   Running   0   83s
kube-proxy-gmq6s   1/1   Running   0   76s
kube-proxy-x6tcg   1/1   Running   0   81s
# kubectl logs -n kube-system kube-proxy-cbm8p
I1013 07:34:38.685794       1 server_others.go:170] Using ipvs Proxier.
W1013 07:34:38.686066       1 proxier.go:401] IPVS scheduler not specified, use rr by default
I1013 07:34:38.687224       1 server.go:534] Version: v1.15.0
I1013 07:34:38.692777       1 conntrack.go:52] Setting nf_conntrack_max to 131072
I1013 07:34:38.693378       1 config.go:187] Starting service config controller
I1013 07:34:38.693391       1 controller_utils.go:1029] Waiting for caches to sync for service config controller
I1013 07:34:38.693406       1 config.go:96] Starting endpoints config controller
I1013 07:34:38.693411       1 controller_utils.go:1029] Waiting for caches to sync for endpoints config controller
I1013 07:34:38.793684       1 controller_utils.go:1036] Caches are synced for endpoints config controller
I1013 07:34:38.793688       1 controller_utils.go:1036] Caches are synced for service config controller

Pinging the Service again now works.
