PodCIDRs problem
A problem hit while installing Kubernetes v1.18.5 + Cilium 1.9.4
- Error message
```
E0827 21:08:22.925379       1 controller_utils.go:245] Error while processing Node Add: failed to allocate cidr from cluster cidr at idx:0: CIDR allocation failed; there are no remaining CIDRs left to allocate in the accepted range
I0827 21:08:22.925407       1 event.go:278] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"prod-be-k8s-wn6", UID:"8a72a498-c29a-4fb9-a798-7773f5a4f538", APIVersion:"v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'CIDRNotAvailable' Node prod-be-k8s-wn6 status is now: CIDRNotAvailable
I0827 21:08:22.925420       1 shared_informer.go:230] Caches are synced for service account
W0827 21:08:22.925499       1 actual_state_of_world.go:506] Failed to update statusUpdateNeeded field in actual state of world: Failed to set statusUpdateNeeded to needed true, because nodeName="prod-fe-k8s-wn1" does not exist
```
Suspected cause (unconfirmed): the Pod CIDR range that kube-controller-manager reserves per node via --allocate-node-cidrs overlaps with the pool the cilium-agent manages under --set ipam.mode=cluster-pool.
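For context, the two competing allocators look roughly like this; the CIDR values are illustrative, since the cluster's actual ranges were not recorded in these notes:

```
# kube-controller-manager carves per-node PodCIDRs out of its cluster CIDR
# (illustrative values)
kube-controller-manager \
  --allocate-node-cidrs=true \
  --cluster-cidr=172.21.0.0/16 \
  --node-cidr-mask-size=24

# cilium-operator in cluster-pool mode carves node CIDRs out of its own,
# independently configured pool (illustrative values)
helm install cilium cilium/cilium --version 1.9.4 --namespace kube-system \
  --set ipam.mode=cluster-pool \
  --set ipam.operator.clusterPoolIPv4PodCIDR=172.21.0.0/16 \
  --set ipam.operator.clusterPoolIPv4MaskSize=24
```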
- Fix: switch the cilium-agent's PodCIDR IPAM scheme to Kubernetes host-scope with --set ipam.mode=kubernetes
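A sketch of applying the fix to an existing Helm-managed install (release name and namespace are assumptions):

```
helm upgrade cilium cilium/cilium --version 1.9.4 --namespace kube-system \
  --reuse-values \
  --set ipam.mode=kubernetes   # hand PodCIDR allocation back to the Kubernetes node allocator
```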
Kubernetes svc externalIP
Background: the worker nodes (Alibaba Cloud ECS instances) have both a public and a private address.
Case
A Service was created with the ECS public address in externalIPs, and it could not be reached: accessing, for example, Prometheus via address+port, the page never rendered, even though the port itself was open.
Moreover, that public address belonged to a control-plane (CP) node, and the Service used the default service.spec.externalTrafficPolicy: Cluster, which load-balances traffic across the whole cluster but does not preserve the request's source IP and can add extra hops inside the cluster.
service.spec.externalTrafficPolicy: Local // preserves the request's source IP, but carries a potential risk of uneven traffic distribution across nodes
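Switching an existing Service to Local is a one-line patch (the Service name here is hypothetical):

```
kubectl patch svc prometheus -p '{"spec":{"externalTrafficPolicy":"Local"}}'
```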
Solution
Expose the Service as type NodePort instead. Reason: the Alibaba Cloud public address is not part of the Kubernetes node's address set at cluster initialization; it is registered neither as an InternalIP nor as an ExternalIP on the Node object.
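A minimal sketch of the working setup for the Prometheus case above (name, selector, and ports are illustrative):

```
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort             # reachable via <any-node-IP>:nodePort, including the ECS public address
  selector:
    app: prometheus
  ports:
    - port: 9090             # ClusterIP port inside the cluster
      targetPort: 9090       # container port
      nodePort: 30090        # must fall within the node-port range (default 30000-32767)
EOF
```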
Cilium masquerading
Background: after the cluster was deployed and the routes were configured, nodes outside the Kubernetes cluster could not connect to the service ports of Pods inside the cluster, although ICMP echo requests did get replies.
Initial suspicion was Cilium network policy; CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy were both modified, for ingress as well as egress, with no improvement.
Packet captures then showed the following.
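For reference, the permissive policies tried were along these lines (a sketch; the actual manifests were not kept):

```
kubectl apply -f - <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-all-debug
  namespace: default
spec:
  endpointSelector: {}   # every endpoint in the namespace
  ingress:
    - fromEntities:
        - all            # allow ingress from anywhere, in-cluster or external
  egress:
    - toEntities:
        - all            # allow egress to anywhere
EOF
```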
cloud route mode
- Cilium network policy left at the defaults
```
root@PROD-BE-K8S-WN7:/home/cilium# cilium endpoint list
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                       IPv6   IPv4           STATUS
           ENFORCEMENT        ENFORCEMENT
1384       Disabled           Disabled          1          k8s:node-role.kubernetes.io/worker=worker                               ready
                                                           reserved:host
3402       Disabled           Disabled          1571       k8s:app=tomcat                                           172.21.12.32   ready
                                                           k8s:io.cilium.k8s.policy.cluster=default
                                                           k8s:io.cilium.k8s.policy.serviceaccount=default
                                                           k8s:io.kubernetes.pod.namespace=default
3569       Disabled           Disabled          4          reserved:health                                          172.21.12.20   ready

root@PROD-BE-K8S-WN7:/home/cilium# cilium bpf policy get --all
/sys/fs/bpf/tc/globals/cilium_policy_01384:
POLICY   DIRECTION   LABELS (source:key[=value])   PORT/PROTO   PROXY PORT   BYTES      PACKETS
Allow    Ingress     reserved:unknown              ANY          NONE         0          0
Allow    Egress      reserved:unknown              ANY          NONE         0          0

/sys/fs/bpf/tc/globals/cilium_policy_03402:
POLICY   DIRECTION   LABELS (source:key[=value])   PORT/PROTO   PROXY PORT   BYTES      PACKETS
Allow    Ingress     reserved:unknown              ANY          NONE         7596       118
Allow    Ingress     reserved:host                 ANY          NONE         0          0
Allow    Egress      reserved:unknown              ANY          NONE         4444       60

/sys/fs/bpf/tc/globals/cilium_policy_03569:
POLICY   DIRECTION   LABELS (source:key[=value])   PORT/PROTO   PROXY PORT   BYTES      PACKETS
Allow    Ingress     reserved:unknown              ANY          NONE         12049100   153645
Allow    Ingress     reserved:host                 ANY          NONE         906423     10407
Allow    Egress      reserved:unknown              ANY          NONE         10679305   139853
```
- cilium bpf nat list | grep "10.1.16.186"
```
root@PROD-BE-K8S-WN7:/home/cilium# cilium bpf nat list | grep "10.1.16.186"
Unable to open /sys/fs/bpf/tc/globals/cilium_snat_v6_external: Unable to get object /sys/fs/bpf/tc/globals/cilium_snat_v6_external: no such file or directory. Skipping.
TCP IN  10.1.16.186:20940 -> 10.1.17.237:49667 XLATE_DST 172.21.12.32:8080 Created=69667sec HostLocal=0
TCP OUT 172.21.12.32:8080 -> 10.1.16.186:20940 XLATE_SRC 10.1.17.237:49667 Created=69667sec HostLocal=0
```
- cilium monitor | grep "10.1.16.186"
```
root@PROD-BE-K8S-WN7:/home/cilium# cilium monitor | grep "10.1.16.186"
level=info msg="Initializing dissection cache..." subsys=monitor
-> endpoint 3402 flow 0x0 identity 2->1571 state new ifindex lxc22edfa994e70 orig-ip 10.1.16.186: 10.1.16.186:20982 -> 172.21.12.32:8080 tcp SYN
-> endpoint 3402 flow 0x0 identity 2->1571 state established ifindex lxc22edfa994e70 orig-ip 10.1.16.186: 10.1.16.186:20982 -> 172.21.12.32:8080 tcp RST
-> endpoint 3402 flow 0x0 identity 2->1571 state established ifindex lxc22edfa994e70 orig-ip 10.1.16.186: 10.1.16.186:20982 -> 172.21.12.32:8080 tcp RST
-> endpoint 3402 flow 0x0 identity 2->1571 state established ifindex lxc22edfa994e70 orig-ip 10.1.16.186: 10.1.16.186:20982 -> 172.21.12.32:8080 tcp RST
```
The output above shows the requests being reset. The resets line up with the NAT entries: the Pod's replies to the external client are masqueraded to the node address 10.1.17.237, the client cannot match them to its connection toward 172.21.12.32, and it answers with an RST.
- tcpdump capture
cloud route - ipMasqAgent.enabled=true
- Cilium network policy left at the default
- With ipMasqAgent.enabled=true, traffic from cluster Pods to addresses in the nonMasqueradeCIDRs list is not masqueraded, i.e. no SNAT is applied (a configuration sketch follows the nonMasqueradeCIDRs listing below)
```
root@HK-K8S-WN2:/home/cilium# cilium status --verbose
KVStore:                Ok   Disabled
Kubernetes:             Ok   1.18 (v1.18.5) [linux/amd64]
Kubernetes APIs:        ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1beta1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"]
KubeProxyReplacement:   Strict   [eth0 (Direct Routing)]
Cilium:                 Ok   1.9.10 (v1.9.10-4e26039)
NodeMonitor:            Listening for events on 2 CPUs with 64x4096 of shared memory
Cilium health daemon:   Ok
IPAM:                   IPv4: 8/255 allocated from 172.20.2.0/24,
Allocated addresses:
  172.20.2.130 (default/prometheus-alertmanager-58dc496b97-b82ml [restored])
  172.20.2.167 (default/rabbitmq-0 [restored])
  172.20.2.207 (default/nginx-55d4fb7c6f-n9rxb [restored])
  172.20.2.215 (router)
  172.20.2.222 (default/tomcat-74b4555889-z5pxt [restored])
  172.20.2.253 (default/zk-0 [restored])
  172.20.2.27 (default/redis-56bdbddbbb-r9fsx [restored])
  172.20.2.86 (health)
BandwidthManager:       Disabled
Host Routing:           BPF
Masquerading:           BPF (ip-masq-agent)   [eth0]   172.20.0.0/20
```
- Check the nonMasqueradeCIDRs address ranges
```
root@HK-K8S-WN2:/home/cilium# cilium bpf ipmasq list
IP PREFIX/ADDRESS
100.64.0.0/10
169.254.0.0/16
172.16.0.0/12
192.0.2.0/24
192.168.0.0/16
198.18.0.0/15
10.0.0.0/8
192.0.0.0/24
192.88.99.0/24
198.51.100.0/24
203.0.113.0/24
240.0.0.0/4
```
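The list above matches the ip-masq-agent defaults, which apply when no custom configuration is supplied. To pin the ranges explicitly, Cilium's agent reads a ConfigMap in the upstream ip-masq-agent format; a sketch, assuming a Helm-managed install (CIDR values are illustrative):

```
# enable the BPF ip-masq-agent (illustrative helm invocation)
helm upgrade cilium cilium/cilium --namespace kube-system --reuse-values \
  --set ipMasqAgent.enabled=true

# addresses in nonMasqueradeCIDRs are reached without SNAT;
# 10.0.0.0/8 and 172.16.0.0/12 here cover the external client networks
cat <<'EOF' > config
nonMasqueradeCIDRs:
- 10.0.0.0/8
- 172.16.0.0/12
masqLinkLocal: false
EOF
kubectl create configmap ip-masq-agent -n kube-system --from-file=config
```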
- Test requesting the Pod's port from an address outside the cluster
```
<root@proxy ~># curl -i 172.20.2.222:8080
HTTP/1.1 200
Content-Type: text/html;charset=UTF-8
Transfer-Encoding: chunked
Date: Mon, 13 Sep 2021 11:27:07 GMT

# The cilium-agent BPF NAT table on the Pod's node has no entry for the client
# address, confirming the traffic reaches the Pod without address translation (no DNAT)
root@HK-K8S-WN2:/home/cilium# cilium bpf nat list | grep "172.19.0.195"
Unable to open /sys/fs/bpf/tc/globals/cilium_snat_v6_external: Unable to get object /sys/fs/bpf/tc/globals/cilium_snat_v6_external: no such file or directory. Skipping.
```
- Check the cilium monitor event messages
```
root@HK-K8S-WN2:/home/cilium# cilium monitor | grep "172.19.0.195"
level=info msg="Initializing dissection cache..." subsys=monitor
-> endpoint 777 flow 0x0 identity 2->16701 state new ifindex lxce0b12341417a orig-ip 172.19.0.195: 172.19.0.195:56966 -> 172.20.2.222:8080 tcp SYN
-> endpoint 777 flow 0x0 identity 2->16701 state established ifindex lxce0b12341417a orig-ip 172.19.0.195: 172.19.0.195:56966 -> 172.20.2.222:8080 tcp ACK
-> endpoint 777 flow 0x0 identity 2->16701 state established ifindex lxce0b12341417a orig-ip 172.19.0.195: 172.19.0.195:56966 -> 172.20.2.222:8080 tcp ACK
-> endpoint 777 flow 0x0 identity 2->16701 state established ifindex lxce0b12341417a orig-ip 172.19.0.195: 172.19.0.195:56966 -> 172.20.2.222:8080 tcp ACK, FIN
-> endpoint 777 flow 0x0 identity 2->16701 state established ifindex lxce0b12341417a orig-ip 172.19.0.195: 172.19.0.195:56966 -> 172.20.2.222:8080 tcp ACK
```