1. Summary
=======================================================================================
Three topics are covered:
Calico concepts
Integrating Calico with Docker
Integrating Calico with Kubernetes
Not covered:
Calico cluster peering (route reflectors)
2. Concepts
=======================================================================================
Calico is a pure layer-3 solution that provides multi-host communication for OpenStack VMs and Docker containers. It does not use an overlay network such as flannel or libnetwork's overlay driver; instead it takes a pure layer-3 approach, using virtual routing rather than virtual switching, and every virtual router propagates reachability information (routes) to the rest of the data center over BGP.
By scaling the IP networking principles of the whole Internet down to the data-center level, Calico runs an efficient vRouter in the Linux kernel of every compute node to forward traffic, and each vRouter advertises routes for the workloads running on it to the rest of the Calico network via BGP. Small deployments can peer directly in a full mesh; at larger scale, designated BGP route reflectors can be used (dropping the all-nodes mesh in favor of centralized route distribution through one or more Route Reflectors).
BGP exchanges routing information between autonomous systems (ASes). When two ASes need to exchange routes, each designates a node running BGP to represent it in the exchange. This node can be a host, but it is usually a router. The routers that exchange BGP information between two ASes are also called border gateways or border routers.
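These BGP sessions can be observed on any running Calico node; the status command used later in this document lists each peer and the state of its session:
sudo calicoctl node status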
3. Version requirements
=======================================================================================
Docker: >= 1.9
calicoctl: >= 1.0 (older versions also work, but the API differs)
Kubernetes: >= 1.1
The kube-proxy must be started in iptables proxy mode. This is the default as of Kubernetes v1.2.0.
For network policy, at least 1.3 (the Kubernetes NetworkPolicy API requires at least Kubernetes v1.3.0).
For the full feature set, at least 1.5.
4. Resource types
=======================================================================================
Host Endpoint Resource (hostEndpoint)
A Host Endpoint resource (hostEndpoint) represents an interface attached to a host that is running Calico.
Each host endpoint may include a set of labels and list of profiles that Calico will use to apply policy to the interface. If no profiles or labels are applied, Calico will not apply any policy.
In other words: a host endpoint is a network interface on a node running Calico. Host endpoints may carry labels and a list of profiles that determine the policy applied to the interface; if none are given, no policy is applied by default.
Here is an example of creating the resource:
Sample YAML
apiVersion: v1
kind: hostEndpoint
metadata:
  name: eth0
  node: myhost
  labels:
    type: production
spec:
  interfaceName: eth0
  expectedIPs:
  - 192.168.0.1
  - 192.168.0.2
  profiles:
  - profile1
  - profile2
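Like any other resource, such a definition can be saved to a file and created and listed with calicoctl (a minimal sketch; hostendpoint.yaml is a placeholder filename):
calicoctl create -f hostendpoint.yaml
calicoctl get hostEndpoint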
IP Pool Resource (ipPool)
An IP pool resource (ipPool) represents a collection of IP addresses from which Calico expects endpoint IPs to be assigned.
In other words: an IP pool is a range of addresses from which Calico allocates endpoint IPs.
Example:
Sample YAML
apiVersion: v1
kind: ipPool
metadata:
  cidr: 10.1.0.0/16
spec:
  ipip:
    enabled: true
    mode: cross-subnet
  nat-outgoing: true
  disabled: false
Node Resource (node)
A Node resource (node) represents a node running Calico. When adding a host to a Calico cluster, a Node resource needs to be created which contains the configuration for the Calico Node instance running on the host.
When starting a Calico node instance, the name supplied to the instance should match the name configured in the Node resource.
By default, starting a calico/node instance will automatically create a node resource using the hostname of the compute host.
In other words: any machine that starts Calico is a node.
See the example:
Sample YAML
apiVersion: v1
kind: node
metadata:
  name: node-hostname
spec:
  bgp:
    asNumber: 64512
    ipv4Address: 10.244.0.1/24
    ipv6Address: 2001:db8:85a3::8a2e:370:7334/120
Policy Resource (policy)
A Policy resource (policy) represents an ordered set of rules which are applied to a collection of endpoints which match a label selector.
Policy resources can be used to define network connectivity rules between groups of Calico endpoints and host endpoints, and take precedence over Profile resources if any are defined.
In other words: a policy applies different rules to endpoints carrying different labels, and takes precedence over profiles.
See the example:
Sample YAML
This sample policy allows TCP traffic from frontend endpoints to port 6379 on database endpoints.
apiVersion: v1
kind: policy
metadata:
  name: allow-tcp-6379
spec:
  selector: role == 'database'
  ingress:
  - action: allow
    protocol: tcp
    source:
      selector: role == 'frontend'
    destination:
      ports:
      - 6379
  egress:
  - action: allow
Profile Resource (profile)
A Profile resource (profile) represents a set of rules which are applied to the individual endpoints to which this profile has been assigned.
Each Calico endpoint or host endpoint can be assigned to zero or more profiles.
Also see the Policy resource which provides an alternate way to select what policy is applied to an endpoint.
In other words: a profile is much like a policy, except that it applies to the individual endpoints it has been assigned to; an endpoint can be assigned multiple profiles, and a policy can select endpoints by their profile labels.
See the example:
Sample YAML
The following sample profile allows all traffic from endpoints that have the profile label set to profile1 (i.e. endpoints that reference this profile), except that all traffic from 10.0.20.0/24 is denied.
apiVersion: v1
kind: profile
metadata:
  name: profile1
  labels:
    profile: profile1
spec:
  ingress:
  - action: deny
    source:
      net: 10.0.20.0/24
  - action: allow
    source:
      selector: profile == 'profile1'
  egress:
  - action: allow
5. Integrating Calico with Docker (security using Calico profiles)
=======================================================================================
Install etcd (omitted)
Deploy Calico
To build container networks with Calico, the Docker daemon must be configured with a cluster store. Since Calico uses etcd, add the following to OPTIONS in Docker's /etc/sysconfig/docker:
# Replace <etcd_ip> and <etcd_port> to match your environment
--cluster-store=etcd://10.1.8.9:2379
Using calicoctl
calicoctl's commands changed in version 1.0: from 1.0 on, everything calicoctl manages is a resource, and the IP pools, profiles, policies, and so on of earlier versions are all resources. Resources are defined in YAML or JSON format, created or applied with calicoctl create or calicoctl apply, and inspected with calicoctl get. The YAML/JSON resource format is:
apiVersion: v1
kind: <type of resource>
metadata:
  # Identifying information
  name: <name of resource>
  ...
spec:
  # Specification of the resource
  ...
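The typical workflow then looks like this (a sketch; resource.yaml is a placeholder filename):
# Create the resource; this fails if it already exists
calicoctl create -f resource.yaml
# Create or update in one step
calicoctl apply -f resource.yaml
# List resources of a given type, e.g. IP pools
calicoctl get ipPool
# Delete the resources defined in a file
calicoctl delete -f resource.yaml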
a) Download the calicoctl command-line management tool
Download calicoctl as follows:
sudo wget -O /usr/local/bin/calicoctl https://github.com/projectcalico/calicoctl/releases/download/v1.0.2/calicoctl
sudo chmod +x /usr/local/bin/calicoctl
Since /usr/local/bin is on the system PATH, the calicoctl command can be used directly from here on.
b) Configure calicoctl's datastore
By default calicoctl reads its configuration from /etc/calico/calicoctl.cfg (a different file can be specified with the --config option). The configuration points at the etcd cluster and looks like this:
apiVersion: v1
kind: calicoApiConfig
metadata:
spec:
  datastoreType: "etcdv2"
  etcdEndpoints: "http://10.1.8.9:2379"
c) Run calico/node
Start it with:
# Running on node1, for example
sudo calicoctl node run --ip=10.1.4.57
This actually starts a Calico container as a plugin. The effective command (modified here to pull from a private registry) is:
docker run --net=host --privileged --name=calico-node -d --restart=always \
-e NODENAME=sdw1 \
-e NO_DEFAULT_POOLS= \
-e CALICO_LIBNETWORK_IFPREFIX=cali \
-e IP6_AUTODETECTION_METHOD=first-found \
-e CALICO_LIBNETWORK_CREATE_PROFILES=true \
-e CALICO_LIBNETWORK_LABEL_ENDPOINTS=false \
-e IP=10.1.4.57 -e ETCD_ENDPOINTS=http://10.1.8.9:2379 \
-e CALICO_NETWORKING_BACKEND=bird \
-e CALICO_LIBNETWORK_ENABLED=true \
-e IP_AUTODETECTION_METHOD=first-found \
-v /var/log/calico:/var/log/calico \
-v /var/run/calico:/var/run/calico \
-v /lib/modules:/lib/modules \
-v /run/docker/plugins:/run/docker/plugins \
-v /var/run/docker.sock:/var/run/docker.sock 10.1.8.9:5000/calico
If the calico/node image is not yet present on the server, it is downloaded first. Once it is running, a container named calico-node can be seen. Check the node's status with:
sudo calicoctl node status
d) Create an ipPool with calicoctl
To view IP pools:
calicoctl get ipPool
To create an IP pool, first define a resource file ipPool.yaml, for example:
- apiVersion: v1
  kind: ipPool
  metadata:
    cidr: 10.20.0.0/24
  spec:
    ipip:
      enabled: true
    nat-outgoing: true
Then create it:
calicoctl create -f ipPool.yaml
Running calicoctl get ipPool again now shows the new 10.20.0.0/24 pool.
Verifying connectivity
Create networks inside the IP pool created above (10.20.0.0/24), for example:
docker network create --driver calico --ipam-driver calico-ipam --subnet 10.20.0.0/24 net1
docker network create --driver calico --ipam-driver calico-ipam --subnet 10.20.0.0/24 net2
docker network create --driver calico --ipam-driver calico-ipam --subnet 10.20.0.0/24 net3
This creates three networks: net1, net2, and net3. The commands can be run on any single node; because node1 and node2 share the same etcd, the resulting networks are visible via docker network ls on both nodes.
Following an example from the official docs, create a few containers on node1 and node2 to test container network connectivity.
# node1
docker run --net net1 --name workload-A -tid busybox
docker run --net net2 --name workload-B -tid busybox
docker run --net net1 --name workload-C -tid busybox
# node2
docker run --net net3 --name workload-D -tid busybox
docker run --net net1 --name workload-E -tid busybox
Connectivity can then be tested from node1:
# Containers on the same network can reach each other by name, even across hosts
docker exec workload-A ping -c 4 workload-C.net1
docker exec workload-A ping -c 4 workload-E.net1
# Containers on different networks must be reached by IP (using the name fails with: bad address)
docker exec workload-A ping -c 2 `docker inspect --format "{{ .NetworkSettings.Networks.net2.IPAddress }}" workload-B`
Containers on the same network can talk to each other, even when they sit on different nodes, which gives us cross-host container connectivity; containers on different networks cannot.
Beyond this, the official docs provide two examples of establishing rules with profiles and policies, reproduced here.
The first: create the networks first.
On any host in your Calico / Docker network, run the following commands:
docker network create --driver calico --ipam-driver calico-ipam database
docker network create --driver calico --ipam-driver calico-ipam frontend
Create the profiles:
cat << EOF | calicoctl apply -f -
- apiVersion: v1
  kind: profile
  metadata:
    name: database
    labels:
      role: database
- apiVersion: v1
  kind: profile
  metadata:
    name: frontend
    labels:
      role: frontend
EOF
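The new profiles can be inspected afterwards, for example:
calicoctl get profile -o yaml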
Create the policies; each policy selects (by profile label) which endpoints it applies to:
cat << EOF | calicoctl create -f -
- apiVersion: v1
  kind: policy
  metadata:
    name: database
  spec:
    order: 0
    selector: role == 'database'
    ingress:
    - action: allow
      protocol: tcp
      source:
        selector: role == 'frontend'
      destination:
        ports:
        - 3306
    - action: allow
      source:
        selector: role == 'database'
    egress:
    - action: allow
      destination:
        selector: role == 'database'
- apiVersion: v1
  kind: policy
  metadata:
    name: frontend
  spec:
    order: 0
    selector: role == 'frontend'
    egress:
    - action: allow
      protocol: tcp
      destination:
        selector: role == 'database'
        ports:
        - 3306
EOF
The second (everything combined in one file):
cat << EOF | calicoctl create -f -
- apiVersion: v1
  kind: policy
  metadata:
    name: database
  spec:
    order: 0
    selector: role == 'database'
    ingress:
    - action: allow
      protocol: tcp
      source:
        selector: role == 'frontend'
      destination:
        ports:
        - 3306
    - action: allow
      source:
        selector: role == 'database'
    egress:
    - action: allow
      destination:
        selector: role == 'database'
- apiVersion: v1
  kind: policy
  metadata:
    name: frontend
  spec:
    order: 0
    selector: role == 'frontend'
    egress:
    - action: allow
      protocol: tcp
      destination:
        selector: role == 'database'
        ports:
        - 3306
EOF
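To exercise these rules, here is a minimal sketch (it assumes the database and frontend networks above; db and fe are hypothetical container names). Nothing in busybox listens on 3306, so a quick "Connection refused" means the policy allowed the TCP attempt, while a hang until the timeout means it was dropped:
docker run --net database --name db -tid busybox
docker run --net frontend --name fe -tid busybox
# Name lookup across networks fails, so resolve db's IP explicitly
DB_IP=`docker inspect --format "{{ .NetworkSettings.Networks.database.IPAddress }}" db`
docker exec fe nc -w 5 $DB_IP 3306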
6. Integrating Calico with Kubernetes
=======================================================================================
Calico requires Kubernetes >= 1.1; the NetworkPolicy feature requires Kubernetes >= 1.3.0.
One-step install:
1. Download calico.yaml from http://docs.projectcalico.org/v2.0/getting-started/kubernetes/installation/hosted/calico.yaml
2. In calico.yaml, set the etcd address:
etcd_endpoints: "http://10.1.8.9:2379"
3. Deploy Calico with:
kubectl apply -f calico.yaml
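The rollout can then be checked with (assuming the manifest's default kube-system namespace):
kubectl get pods --namespace=kube-system -o wide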
Manual install:
If the one-step install does not come up correctly, use the manual steps below to check that each required file is in place; skip any step that is already correct.
kubelet needs the calico and calico-ipam CNI plugins:
wget -N -P /opt/cni/bin https://github.com/projectcalico/calico-cni/releases/download/v1.4.3/calico
wget -N -P /opt/cni/bin https://github.com/projectcalico/calico-cni/releases/download/v1.4.3/calico-ipam
chmod +x /opt/cni/bin/calico /opt/cni/bin/calico-ipam
The Calico CNI plugin needs a standard CNI configuration file, as shown below. The policy section is only required when deploying calico/kube-policy-controller.
mkdir -p /etc/cni/net.d
cat >/etc/cni/net.d/10-calico.conf <<EOF
{
    "name": "calico-k8s-network",
    "type": "calico",
    "etcd_endpoints": "http://10.1.8.9:2379",
    "log_level": "info",
    "ipam": {
        "type": "calico-ipam"
    },
    "policy": {
        "type": "k8s"
    }
}
EOF
Install the standard CNI loopback plugin:
wget https://github.com/containernetworking/cni/releases/download/v0.3.0/cni-v0.3.0.tgz
tar -zxvf cni-v0.3.0.tgz
sudo cp loopback /opt/cni/bin/
Configure kubelet: start kubelet with the following flags so that it uses Calico:
--network-plugin=cni --network-plugin-dir=/etc/cni/net.d
Notes:
1. Add --allow_privileged=true to the startup scripts of kube-apiserver and kubelet; otherwise deploying Calico fails with the following error:
The DaemonSet "calico-node" is invalid: spec.template.spec.containers[0].securityContext.privileged: Forbidden: disallowed by policy
2. Download https://github.com/containernetworking/cni/releases/download/v0.4.0/cni-v0.4.0.tgz, extract it, and copy loopback into /opt/cni/bin; without this step, creating a pod fails with an error saying loopback cannot be found.
3. Error: kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy. Fix it by configuring cluster DNS for kubelet:
KUBE_ARGS="--cluster-dns=10.10.0.10 --cluster-domain=cluster.local"
systemctl daemon-reload; systemctl restart kubelet
Verification
Deploy redis. redis-rc.yaml is as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: redis
spec:
  replicas: 2
  selector:
    name: redis
  template:
    metadata:
      labels:
        name: redis
    spec:
      containers:
      - name: redis
        image: redis
        ports:
        - containerPort: 6379
redis-svc.yaml is as follows:
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  selector:
    name: redis
  clusterIP: 10.1.66.66
  ports:
  - name: "1"
    port: 6379
    protocol: TCP
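Create both with kubectl (assuming the two files above are in the current directory):
kubectl create -f redis-rc.yaml
kubectl create -f redis-svc.yaml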
The deployment then looks like this:
Run kubectl get pods -o wide
Run kubectl get svc -o wide
Routes on the master host:
[root@master redis]# route -n
Routes on slave1:
[root@slave1 bin]# route -n
Routes on slave2:
[root@slave2 bin]# route -n
Verify network connectivity
1. From the master host, ping the pod IPs (e.g. redis):
[root@master redis]# ping 10.20.0.4
2. You can also telnet to the service, or exec into the containers for further checks.
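For example, a minimal check using the clusterIP and port defined in redis-svc.yaml above (redis responds to any input, so any reply proves connectivity):
telnet 10.1.66.66 6379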
Using Calico with Kubernetes
The official docs give a simple example of applying network policy in Kubernetes.
Set up the namespace:
kubectl create ns policy-demo
Create the demo pods
1) Create some nginx pods in the policy-demo namespace, and expose them through a Service.
# Run the pods
kubectl run --namespace=policy-demo nginx --replicas=2 --image=nginx
# Create the service
kubectl expose --namespace=policy-demo deployment nginx --port=80
2) Make sure nginx is reachable.
# Run a busybox pod from which to access the service.
$ kubectl run --namespace=policy-demo access --rm -ti --image busybox /bin/sh
Waiting for pod policy-demo/access-472357175-y0m47 to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # wget -q nginx -O -
You should see a response from nginx. Great! Our Service is accessible. You can exit the Pod now.
Enable isolation
Turn on the isolation option, which by default denies all traffic to pods in the namespace:
kubectl annotate ns policy-demo "net.beta.kubernetes.io/network-policy={\"ingress\":{\"isolation\":\"DefaultDeny\"}}"
Test:
# Run a Pod and try to access the `nginx` Service.
$ kubectl run --namespace=policy-demo access --rm -ti --image busybox /bin/sh
Waiting for pod policy-demo/access-472357175-y0m47 to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # wget -q --timeout=5 nginx -O -
wget: download timed out
/ #
Allow network connections with a network policy
Create the network policy access-nginx:
kubectl create -f - <<EOF
kind: NetworkPolicy
apiVersion: extensions/v1beta1
metadata:
  name: access-nginx
  namespace: policy-demo
spec:
  podSelector:
    matchLabels:
      run: nginx
  ingress:
  - from:
    - podSelector:
        matchLabels:
          run: access
EOF
This allows traffic from pods labeled run: access to pods labeled run: nginx.
Check that a pod with the right label can access nginx:
# Run a Pod and try to access the `nginx` Service.
$ kubectl run --namespace=policy-demo access --rm -ti --image busybox /bin/sh
Waiting for pod policy-demo/access-472357175-y0m47 to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # wget -q --timeout=5 nginx -O -
A pod without the run: access label cannot:
# Run a Pod and try to access the `nginx` Service.
$ kubectl run --namespace=policy-demo cant-access --rm -ti --image busybox /bin/sh
Waiting for pod policy-demo/cant-access-472357175-y0m47 to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
/ # wget -q --timeout=5 nginx -O -
wget: download timed out
/ #
You can clean up the demo by deleting the demo Namespace:
kubectl delete ns policy-demo
7. Cluster peering with route reflectors (unverified)
=======================================================================================
First, the relevant notes from the official docs:
Example topology / multiple cluster IDs
When the topology includes a cluster of Route Reflectors, BGP uses the concept of a cluster ID to ensure there are no routing loops when distributing routes.
The Route Reflector image provided assumes that it has a fixed cluster ID for each Route Reflector rather than being configurable on a per peer basis. This simplifies the overall configuration of the network, but does place some limitations on the topology as described here.
The topology is based on the Top of Rack model where you would have a set of redundant route reflectors peering with all of the servers in the rack.
- Each rack is assigned its own cluster ID (a unique number in IPv4 address format).
- Each node (server in the rack) peers with a redundant set of route reflectors specific to that set rack.
- All of the Route Reflectors across all racks form a full BGP mesh (this is handled automatically by the Calico BIRD Route Reflector image and does not require additional configuration).
For example, to set up the topology described above, you would:
- Spin up nodes N1 - N9
- Spin up Route Reflectors RR1 - RR6
- Add node specific peers, peering:
- N1, N2 and N3 with RR1 and RR2
- N4, N5 and N6 with RR3 and RR4
- N7, N8 and N9 with RR5 and RR6
- Add etcd config for the Route Reflectors:
- RR1 and RR2 both using the cluster ID 1.0.0.1
- RR3 and RR4 both using the cluster ID 1.0.0.2
- RR5 and RR6 both using the cluster ID 1.0.0.3
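Concretely, each etcd entry takes the form of the curl command used later in this document (substitute your etcd address and each reflector's IP and cluster ID):
curl -L http://<etcd_ip>:2379/v2/keys/calico/bgp/v1/rr_v4/<rr_ip> -XPUT -d value="{\"ip\":\"<rr_ip>\",\"cluster_id\":\"1.0.0.1\"}"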
An example with a route reflector (for reference):
http://www.tuicool.com/articles/yMbmY3v
By default, Calico enable full node-to-node mesh, and each Calico node automatically sets up a BGP peering with every other Calico node in the network.
However, the full node-to-node mesh is only useful for small scale deployments and where all Calico nodes are on the same L2 network.
We can disable full node-to-node mesh by setup Route Reflector (or set of Route Reflectors), and each Calico node only peer with Route Reflector.
Environment
172.17.42.30 kube-master
172.17.42.31 kube-node1
172.17.42.32 kube-node2
172.17.42.40 node1
[root@kube-node1 ~]# calicoctl bgp node-mesh
on
[root@kube-node1 ~]# ip route show
default via 172.17.42.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.42.31
blackhole 192.168.0.0/26 proto bird
192.168.0.2 dev cali1f3c9fa633a scope link
192.168.0.64/26 via 172.17.42.32 dev eth0 proto bird
[root@kube-node2 ~]# ip route show
default via 172.17.42.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.42.32
192.168.0.0/26 via 172.17.42.31 dev eth0 proto bird
192.168.0.64 dev cali03adc9f233a scope link
blackhole 192.168.0.64/26 proto bird
192.168.0.65 dev calicb1a3b2633b scope link
Setup Route Reflector
[root@kube-node1 ~]# calicoctl bgp node-mesh off
[root@kube-node1 ~]# ip route show
default via 172.17.42.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.42.31
blackhole 192.168.0.0/26 proto bird
192.168.0.2 dev cali1f3c9fa633a scope link
[root@kube-node2 ~]# ip route show
default via 172.17.42.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.42.32
192.168.0.64 dev cali03adc9f233a scope link
blackhole 192.168.0.64/26 proto bird
192.168.0.65 dev calicb1a3b2633b scope link
The route entry 192.168.0.64/26 on kube-node1 is removed after disabling the full node-to-node BGP mesh.
- Run the BIRD Route Reflector on node1
- Add the Route Reflector into etcd
- Configure every node to peer with each of the Route Reflectors
# docker run --privileged --net=host -d -e IP=172.17.42.40 -e ETCD_AUTHORITY=172.17.42.30:2379 -v /var/log/:/var/log/ calico/routereflector:latest
# curl -L http://172.17.42.30:2379/v2/keys/calico/bgp/v1/rr_v4/172.17.42.40 -XPUT -d value="{\"ip\":\"172.17.42.40\",\"cluster_id\":\"1.0.0.1\"}"
[root@kube-node1 ~]# calicoctl bgp peer add 172.17.42.40 as 65100
[root@kube-node1 ~]# calicoctl bgp peer show
+----------------------+--------+
| Global IPv4 BGP Peer | AS Num |
+----------------------+--------+
| 172.17.42.40 | 65100 |
+----------------------+--------+
No global IPv6 BGP Peers defined.
Bird of Route Reflector will connect to every Calico node, and route entries will be automatically recreated.
[root@node1 ~]# netstat -tnp|grep 179
tcp 0 0 172.17.42.40:54395 172.17.42.31:179 ESTABLISHED 27782/bird
tcp 0 0 172.17.42.40:56733 172.17.42.30:179 ESTABLISHED 27782/bird
tcp 0 0 172.17.42.40:58889 172.17.42.32:179 ESTABLISHED 27782/bird
[root@kube-node1 ~]# ip route show
default via 172.17.42.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.42.31
blackhole 192.168.0.0/26 proto bird
192.168.0.2 dev cali1f3c9fa633a scope link
192.168.0.64/26 via 172.17.42.32 dev eth0 proto bird
[root@kube-master ~]# ip route show
default via 172.17.42.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.42.30
192.168.0.0/26 via 172.17.42.31 dev eth0 proto bird
192.168.0.64/26 via 172.17.42.32 dev eth0 proto bird
[root@kube-node2 ~]# ip route show
default via 172.17.42.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.42.32
192.168.0.0/26 via 172.17.42.31 dev eth0 proto bird
192.168.0.64 dev cali03adc9f233a scope link
blackhole 192.168.0.64/26 proto bird
192.168.0.65 dev calicb1a3b2633b scope link
For redundancy, multiple BGP route reflectors can be deployed seamlessly. The route reflectors are purely involved in the control of the network: no endpoint data passes through them.
- Bird config of Route Reflector
- Bird config of Calico node
[root@node1 ~]# docker exec 56854e7cb79a cat /config/bird.cfg
# Generated by confd
router id 172.17.42.40;
# Watch interface up/down events.
protocol device {
  scan time 2;       # Scan interfaces every 2 seconds
}
# Template for all BGP clients
template bgp bgp_template {
  debug all;
  description "Connection to BGP peer";
  multihop;
  import all;        # Import all routes, since we don't know what the upstream
                     # topology is and therefore have to trust the ToR/RR.
  export all;        # Export all.
  source address 172.17.42.40;  # The local address we use for the TCP connection
  graceful restart;  # See comment in kernel section about graceful restart.
}
# ------------- RR-to-RR full mesh -------------
# For RR 172.17.42.40
# Skipping ourselves
# ------------- RR as a global peer -------------
# This RR is a global peer with *all* calico nodes.
# Peering with Calico node kube-master
protocol bgp Global_172_17_42_30 from bgp_template {
  local as 65100;
  neighbor 172.17.42.30 as 65100;
  rr client;
  rr cluster id 1.0.0.1;
}
# Peering with Calico node kube-node1
protocol bgp Global_172_17_42_31 from bgp_template {
  local as 65100;
  neighbor 172.17.42.31 as 65100;
  rr client;
  rr cluster id 1.0.0.1;
}
# Peering with Calico node kube-node2
protocol bgp Global_172_17_42_32 from bgp_template {
  local as 65100;
  neighbor 172.17.42.32 as 65100;
  rr client;
  rr cluster id 1.0.0.1;
}
# ------------- RR as a node-specific peer -------------
[root@kube-node1 ~]# docker exec e234b4e9dce7 cat /etc/calico/confd/config/bird.cfg
# Generated by confd
include "bird_aggr.cfg";
include "bird_ipam.cfg";
router id 172.17.42.31;
# Configure synchronization between routing tables and kernel.
protocol kernel {
  learn;             # Learn all alien routes from the kernel
  persist;           # Don't remove routes on bird shutdown
  scan time 2;       # Scan kernel routing table every 2 seconds
  import all;
  export filter calico_ipip;  # Default is export none
  graceful restart;  # Turn on graceful restart to reduce potential flaps in
                     # routes when reloading BIRD configuration. With a full
                     # automatic mesh, there is no way to prevent BGP from
                     # flapping since multiple nodes update their BGP
                     # configuration at the same time, GR is not guaranteed to
                     # work correctly in this scenario.
}
# Watch interface up/down events.
protocol device {
  debug { states };
  scan time 2;       # Scan interfaces every 2 seconds
}
protocol direct {
  debug { states };
  interface -"cali*", "*";  # Exclude cali* but include everything else.
}
# Template for all BGP clients
template bgp bgp_template {
  debug { states };
  description "Connection to BGP peer";
  local as 65100;
  multihop;
  gateway recursive; # This should be the default, but just in case.
  import all;        # Import all routes, since we don't know what the upstream
                     # topology is and therefore have to trust the ToR/RR.
  export filter calico_pools;  # Only want to export routes for workloads.
  next hop self;     # Disable next hop processing and always advertise our
                     # local address as nexthop
  source address 172.17.42.31;  # The local address we use for the TCP connection
  add paths on;
  graceful restart;  # See comment in kernel section about graceful restart.
}
# ------------- Global peers -------------
# For peer /global/peer_v4/172.17.42.40
protocol bgp Global_172_17_42_40 from bgp_template {
  neighbor 172.17.42.40 as 65100;
}