Testing Calico BGP Mode with FRR

Calico can use BGP to provide Layer 3 connectivity between nodes without any encapsulation. Since packets are no longer encapsulated and decapsulated, this mode offers the best network performance, but it requires the underlying physical network to support BGP.

If the underlying physical network does not support BGP, you can use FRR to emulate a BGP router so that BGP mode can still be used.

For a detailed explanation of Calico, see: The Three Working Modes of Calico

Installing and Configuring FRR

Prepare a VM and install FRR:

apt -y install frr

Enable bgpd:

sed -i s#bgpd=no#bgpd=yes#g /etc/frr/daemons
systemctl restart frr

Check whether bgpd started successfully:

# BGP is not configured yet, so this shows "not found"
root@test-0:~# vtysh -c "show ip bgp summary"
% BGP instance not found

After installation, configure BGP:

# Enter the FRR console
root@test-0:~# vtysh

Hello, this is FRRouting (version 8.1).
Copyright 1996-2005 Kunihiro Ishiguro, et al.

test-0# configure terminal # enter configuration mode
test-0(config)# router bgp 64512 # set the BGP AS number of this BGP router
test-0(config-router)# bgp router-id 172.16.16.140 # the interface IP of the BGP router
test-0(config-router)# no bgp ebgp-requires-policy # disable the eBGP route-policy requirement (RFC 8212)
test-0(config-router)# neighbor 172.16.16.142 remote-as 64512 # configure a neighbor: the first node in the cluster
test-0(config-router)# neighbor 172.16.16.142 description controlplane1 # set a description
test-0(config-router)# neighbor 172.16.16.143 remote-as 64512 # likewise for the second node in the cluster
test-0(config-router)# neighbor 172.16.16.143 description woker1
test-0(config-router)# exit
test-0(config)# exit
test-0# exit
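The interactive session above only changes the running configuration; a sketch of the equivalent stanza that could be persisted in /etc/frr/frr.conf so it survives restarts (matching the commands above):

```
router bgp 64512
 bgp router-id 172.16.16.140
 no bgp ebgp-requires-policy
 neighbor 172.16.16.142 remote-as 64512
 neighbor 172.16.16.142 description controlplane1
 neighbor 172.16.16.143 remote-as 64512
 neighbor 172.16.16.143 description woker1
!
```

Alternatively, run `write memory` in vtysh to save the running configuration.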

Check the configuration. The State/PfxRcd column shows Active, meaning the router is ready to establish BGP sessions:

root@test-0:~# vtysh -c "show ip bgp summary"

IPv4 Unicast Summary (VRF default):
BGP router identifier 172.16.16.140, local AS number 64512 vrf-id 0
BGP table version 0
RIB entries 0, using 0 bytes of memory
Peers 2, using 1446 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
172.16.16.142   4      64512         0         1        0    0    0    never       Active        0 controlplane1
172.16.16.143   4      64512         0         1        0    0    0    never       Active        0 woker1

Total number of neighbors 2

Configuring Calico BGP Mode

This cluster runs RKE2. Since RKE2 defaults to VXLAN when it builds a cluster with Calico, the mode must first be switched to BGP:

For a brand-new cluster, simply apply the same configuration below.

For an existing self-managed RKE2 cluster, the switch can be made directly by creating a HelmChartConfig:

cat <<EOF | kubectl apply -f -
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-calico
  namespace: kube-system
spec:
  valuesContent: |-
    installation:
      calicoNetwork:
        bgp: Enabled
        ipPools:
        - cidr: 10.42.0.0/16
          encapsulation: None
        nodeAddressAutodetectionV4:
          firstFound: false
          interface: ens34
EOF

Delete the IPPool; the operator will recreate it, and the recreated IPPool no longer carries any tunnel-related configuration:

This step is not needed for a brand-new cluster.

root@docker-test-0:~# kubectl delete ippools.crd.projectcalico.org default-ipv4-ippool
ippool.crd.projectcalico.org "default-ipv4-ippool" deleted

root@docker-test-0:~# kubectl get ippools.crd.projectcalico.org default-ipv4-ippool -oyaml
apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  creationTimestamp: "2025-09-04T08:33:00Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: tigera-operator
  name: default-ipv4-ippool
  resourceVersion: "35419"
  uid: 18011d8b-6d3f-4857-ba05-4439c716e594
spec:
  allowedUses:
  - Workload
  - Tunnel
  blockSize: 26
  cidr: 10.42.0.0/16
  natOutgoing: true
  nodeSelector: all()
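The blockSize: 26 in the pool spec is why the BGP routes seen later are /26s: Calico carves the /16 pool into per-node /26 blocks of 64 addresses each and advertises those blocks. A quick sanity check with Python's standard ipaddress module (the pod IP used here is the one from the ping tests later in this post):

```python
import ipaddress

# Pool CIDR and block size from the IPPool spec above.
pool = ipaddress.ip_network("10.42.0.0/16")
blocks = list(pool.subnets(new_prefix=26))

print(len(blocks))              # 1024 /26 blocks in the /16 pool
print(blocks[0].num_addresses)  # 64 addresses per block

# A pod IP falls inside the /26 block advertised for its node:
pod = ipaddress.ip_address("10.42.5.213")
node_block = ipaddress.ip_network("10.42.5.192/26")
print(pod in node_block)        # True
```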

Manually delete all calico-node Pods; once they have been recreated, the tunnel interfaces are gone:

root@docker-test-0:~# ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: ens34: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    altname enp2s2
    inet 172.16.16.142/24 brd 172.16.16.255 scope global ens34
       valid_lft forever preferred_lft forever

root@docker-test-0:~# ip a | grep -E "tunl0|vxlan"

With BGP mode configured, create a BGPConfiguration and a BGPPeer:

# In newer versions the apiVersion is crd.projectcalico.org/v1
cat <<EOF | kubectl apply -f -
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: true
  # the BGP AS number used by Calico
  asNumber: 64512

---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-to-frr
spec:
  # matches the FRR configuration on the BGP router
  asNumber: 64512
  peerIP: 172.16.16.140
EOF

After these are created, download the calicoctl tool from GitHub and check whether the BGP sessions have been established; a global peer in the Established state means success:
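One way to fetch calicoctl is from the Calico GitHub releases page; a sketch (the version below is an assumption, pick one matching your cluster's Calico version):

```shell
# Sketch: download calicoctl from the Calico GitHub releases.
# VERSION is an assumption; match it to your cluster's Calico version.
VERSION="v3.28.0"
ARCH="amd64"
URL="https://github.com/projectcalico/calico/releases/download/${VERSION}/calicoctl-linux-${ARCH}"
curl -L -o /usr/local/bin/calicoctl "${URL}"
chmod +x /usr/local/bin/calicoctl
```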

root@docker-test-0:~# calicoctl node status
Calico process is running.

IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+-------------+
| 172.16.16.143 | node-to-node mesh | up | 08:34:36 | Established |
| 172.16.16.140 | global | up | 02:36:02 | Established |
+---------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

root@docker-test-1:~# calicoctl node status
Calico process is running.

IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+-------------+
| 172.16.16.142 | node-to-node mesh | up | 08:34:36 | Established |
| 172.16.16.140 | global | up | 02:36:03 | Established |
+---------------+-------------------+-------+----------+-------------+

IPv6 BGP status
No IPv6 peers found.

Check the session state on the BGP router. The State/PfxRcd column now shows 1, indicating that the sessions are established and one prefix has been received from each node:

root@test-0:~# vtysh -c "show ip bgp summary"

IPv4 Unicast Summary (VRF default):
BGP router identifier 172.16.16.140, local AS number 64512 vrf-id 0
BGP table version 2
RIB entries 3, using 552 bytes of memory
Peers 2, using 1446 KiB of memory

Neighbor        V         AS   MsgRcvd   MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd   PfxSnt Desc
172.16.16.142   4      64512        10        10        0    0    0 00:05:50            1        0 controlplane1
172.16.16.143   4      64512        10        10        0    0    0 00:05:50            1        0 woker1

Total number of neighbors 2

Check the FRR routing table on the BGP router. There are two routes marked B (BGP), corresponding to the pod subnets of the two nodes in the cluster:

root@test-0:~# vtysh -c "show ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure

K>* 0.0.0.0/0 [0/0] via 172.16.16.1, ens34, 00:10:16
B>* 10.42.5.192/26 [200/0] via 172.16.16.143, ens34, weight 1, 00:06:09
B>* 10.42.22.64/26 [200/0] via 172.16.16.142, ens34, weight 1, 00:06:09
C>* 172.16.16.0/24 is directly connected, ens34, 00:10:16

Check the kernel routing table on the BGP router. The same two pod-subnet routes are present, pointing directly at the physical IP of the corresponding node:

root@test-0:~# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.16.16.1     0.0.0.0         UG    0      0        0 ens34
10.42.5.192     172.16.16.143   255.255.255.192 UG    20     0        0 ens34
10.42.22.64     172.16.16.142   255.255.255.192 UG    20     0        0 ens34
172.16.16.0     0.0.0.0         255.255.255.0   U     0      0        0 ens34

The BGP router can now reach the pod network directly:

# Pod IP information
root@docker-test-0:~# kubectl get pod -owide
NAME                      READY   STATUS    RESTARTS   AGE   IP             NODE            NOMINATED NODE   READINESS GATES
busybox-f96f87d5d-bhd5b   1/1     Running   0          28m   10.42.5.213    docker-test-1   <none>           <none>
busybox-f96f87d5d-k5d22   1/1     Running   0          28m   10.42.22.100   docker-test-0   <none>           <none>

# Connectivity test from the BGP router
root@test-0:~# ping -c 3 10.42.5.213
PING 10.42.5.213 (10.42.5.213) 56(84) bytes of data.
64 bytes from 10.42.5.213: icmp_seq=1 ttl=63 time=0.305 ms
64 bytes from 10.42.5.213: icmp_seq=2 ttl=63 time=0.252 ms
64 bytes from 10.42.5.213: icmp_seq=3 ttl=63 time=0.226 ms

--- 10.42.5.213 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2055ms
rtt min/avg/max/mdev = 0.226/0.261/0.305/0.032 ms

root@test-0:~# ping -c 3 10.42.22.100
PING 10.42.22.100 (10.42.22.100) 56(84) bytes of data.
64 bytes from 10.42.22.100: icmp_seq=1 ttl=63 time=0.200 ms
64 bytes from 10.42.22.100: icmp_seq=2 ttl=63 time=0.222 ms
64 bytes from 10.42.22.100: icmp_seq=3 ttl=63 time=0.185 ms

--- 10.42.22.100 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2027ms
rtt min/avg/max/mdev = 0.185/0.202/0.222/0.015 ms

Test cross-node communication within the cluster:

root@docker-test-0:~# kubectl exec -it busybox-f96f87d5d-k5d22 -- ping -c 3 10.42.5.213
PING 10.42.5.213 (10.42.5.213): 56 data bytes
64 bytes from 10.42.5.213: seq=0 ttl=62 time=1.368 ms
64 bytes from 10.42.5.213: seq=1 ttl=62 time=0.352 ms
64 bytes from 10.42.5.213: seq=2 ttl=62 time=0.313 ms

--- 10.42.5.213 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.313/0.677/1.368 ms

root@docker-test-0:~# kubectl exec -it busybox-f96f87d5d-bhd5b -- ping -c 3 10.42.22.100
PING 10.42.22.100 (10.42.22.100): 56 data bytes
64 bytes from 10.42.22.100: seq=0 ttl=62 time=0.721 ms
64 bytes from 10.42.22.100: seq=1 ttl=62 time=0.286 ms
64 bytes from 10.42.22.100: seq=2 ttl=62 time=0.307 ms

--- 10.42.22.100 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.286/0.438/0.721 ms
Author: Warner Chen
Posted on: 2025-09-05
Updated on: 2025-12-03
