RKE2 Cilium 配置 Cluster Mesh 实现跨集群通信

通过 Cilium 的能力实现 Kubernetes 集群连接在一起来构建一个网状集群,在所有集群之间启用 pod-to-pod 连接,定义全局服务来平衡集群之间的负载,并执行安全策略来限制访问。

官方文档:https://docs.cilium.io/en/stable/network/clustermesh/clustermesh/#enable-clustermesh

首先需要创建两个 Cilium 集群,两个集群的 Cluster CIDR 和 Service CIDR 不能冲突:

在附加配置中,设置两个集群的 Cilium Cluster ID 和 Name,也不能够冲突:

集群创建好后,在两个集群创建一个 SVC,用于将该集群的 API server 对外暴露:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
name: apiserver
namespace: kube-system
spec:
ports:
- name: port-6443
port: 6443
protocol: TCP
targetPort: 6443
selector:
component: kube-apiserver
type: NodePort
EOF

分别在两个集群的 Control Plane 节点配置 kubeconfig,添加对端集群的配置,server 地址可以配置刚刚创建好的 Service NodePort,例如:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: xxx
server: https://172.16.16.140:30645
name: rke2-cilium-1
- cluster:
certificate-authority-data: xxx
server: https://172.16.16.141:32460
name: rke2-cilium-2
contexts:
- context:
cluster: rke2-cilium-1
user: rke2-cilium-1
name: rke2-cilium-1
- context:
cluster: rke2-cilium-2
user: rke2-cilium-2
name: rke2-cilium-2
current-context: rke2-cilium-1
kind: Config
preferences: {}
users:
- name: rke2-cilium-1
user:
client-certificate-data: xxx
client-key-data: xxx
- name: rke2-cilium-2
user:
client-certificate-data: xxx
client-key-data: xxx

安装 Cilium CLI:

1
2
3
4
5
6
7
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}

分别在两个集群开启 Cluster Mesh:

1
2
3
4
5
# Cluster 1
cilium clustermesh enable --context rke2-cilium-1 --service-type NodePort --helm-release-name rke2-cilium

# Cluster 2
cilium clustermesh enable --context rke2-cilium-2 --service-type NodePort --helm-release-name rke2-cilium

执行该命令后在 kube-system 下会生成 Deployment clustermesh-apiserver,其状态正常则代表开启成功,也可以通过下面的命令检查:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
root@test-0:~# cilium status --context rke2-cilium-1
/¯¯\
/¯¯\__/¯¯\ Cilium: OK
\__/¯¯\__/ Operator: OK
/¯¯\__/¯¯\ Envoy DaemonSet: disabled (using embedded mode)
\__/¯¯\__/ Hubble Relay: disabled
\__/ ClusterMesh: OK

DaemonSet cilium Desired: 1, Ready: 1/1, Available: 1/1
Deployment cilium-operator Desired: 1, Ready: 1/1, Available: 1/1
Deployment clustermesh-apiserver Desired: 1, Ready: 1/1, Available: 1/1
Containers: cilium Running: 1
cilium-operator Running: 1
clustermesh-apiserver Running: 1
Cluster Pods: 15/15 managed by Cilium
Helm chart version:
Image versions cilium harbor.warnerchen.com/rancher/mirrored-cilium-cilium:v1.16.2: 1
cilium-operator harbor.warnerchen.com/rancher/mirrored-cilium-operator-generic:v1.16.2: 1
clustermesh-apiserver harbor.warnerchen.com/rancher/mirrored-cilium-clustermesh-apiserver:v1.16.2: 3

建立集群连接:

1
2
3
4
5
# Cluster 1
cilium clustermesh connect --context rke2-cilium-1 --destination-context rke2-cilium-2 --helm-release-name rke2-cilium

# Cluster 2
cilium clustermesh connect --context rke2-cilium-2 --destination-context rke2-cilium-1 --helm-release-name rke2-cilium

日志如下则代表开启成功:

也可以通过下面的命令检查:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
root@test-0:~# cilium clustermesh status rke2-cilium-1
⚠️ Service type NodePort detected! Service may fail when nodes are removed from the cluster!
✅ Service "clustermesh-apiserver" of type "NodePort" found
✅ Cluster access information is available:
- 172.16.16.140:32379
✅ Deployment clustermesh-apiserver is ready
ℹ️ KVStoreMesh is enabled

✅ All 1 nodes are connected to all clusters [min:1 / avg:1.0 / max:1]
✅ All 1 KVStoreMesh replicas are connected to all clusters [min:1 / avg:1.0 / max:1]

🔌 Cluster Connections:
- rke2-cilium-2: 1/1 configured, 1/1 connected - KVStoreMesh: 1/1 configured, 1/1 connected

🔀 Global services: [ min:1 / avg:1.0 / max:1 ]

如果连接 Cilium Pod 失败可以尝试重启 Cilium 解决:

1
kubectl -n kube-system rollout restart ds cilium

部署 Demo 进行测试:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
annotations:
service.cilium.io/global: 'true'
name: global-service-video
spec:
ports:
- name: port-80
port: 80
protocol: TCP
targetPort: 80
selector:
name: rebel-base

---
apiVersion: apps/v1
kind: Deployment
metadata:
name: rebel-base
spec:
selector:
matchLabels:
name: rebel-base
replicas: 2
template:
metadata:
labels:
name: rebel-base
spec:
containers:
- name: rebel-base
image: harbor.warnerchen.com/library/nginx:mainline
volumeMounts:
- name: html
mountPath: /usr/share/nginx/html/
livenessProbe:
httpGet:
path: /
port: 80
periodSeconds: 1
readinessProbe:
httpGet:
path: /
port: 80
volumes:
- name: html
configMap:
name: rebel-base-response
items:
- key: message
path: index.html

---
apiVersion: v1
kind: ConfigMap
metadata:
name: rebel-base-response
data:
# 需要替换此处的这里的 Cluster 为对应的集群名称
message: "{\"Name\": \"Warner\", \"Cluster\": \"rke2-cilium-1\"}\n"

---
apiVersion: apps/v1
kind: Deployment
metadata:
name: x-wing
spec:
selector:
matchLabels:
name: x-wing
replicas: 2
template:
metadata:
labels:
name: x-wing
spec:
containers:
- name: x-wing-container
image: quay.io/cilium/json-mock:v1.3.3@sha256:f26044a2b8085fcaa8146b6b8bb73556134d7ec3d5782c6a04a058c945924ca0
livenessProbe:
exec:
command:
- curl
- -sS
- -o
- /dev/null
- localhost
readinessProbe:
exec:
command:
- curl
- -sS
- -o
- /dev/null
- localhost
EOF

可以直接通过 Pod IP/Global Service 进行通信:

Author

Warner Chen

Posted on

2025-02-13

Updated on

2025-02-27

Licensed under

You need to set install_url to use ShareThis. Please set it in _config.yml.
You forgot to set the business or currency_code for Paypal. Please set it in _config.yml.

Comments

You forgot to set the shortname for Disqus. Please set it in _config.yml.