Configuring a Storage Network for Longhorn

By default, Longhorn uses the Kubernetes cluster's default CNI network, which is shared with every other workload in the cluster and usually involves only a single network interface.

If you need to isolate Longhorn's in-cluster data traffic (for security or performance reasons), Longhorn supports this through its Storage Network setting.

The Storage Network setting depends on Multus, so Multus must be installed in the cluster first. Multus can be selected as a CNI when an RKE2 cluster is created. This article tests two approaches: letting RKE2 deploy Multus automatically and deploying Multus manually.

Prerequisites

Every node in the cluster needs an additional NIC dedicated to the Storage Network:
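
For example, you can confirm the extra interface is present on a node before continuing (ens35 here, matching the NAD created later):

ip link show ens35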

Manually Deploying Multus

Environment:

  • RKE2: v1.32.8+rke2r1
  • CNI: Cilium
  • Longhorn: v1.8.2
  • Multus: v4.2.3

Mounts before the Storage Network is configured:

[root@rhel-1 ~]# mount | grep pvc
# RWX
# The NFS client here uses the host IP
# The NFS server here is the ClusterIP of the Service for the share-manager-pvc Pod
10.43.217.7:/pvc-01f47f0a-12d2-4d1c-8dd6-8343e9e534cc on /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/447ee02d670adfe4a48ae856b8eb99bd43bd6fa936594b10cdecc04fb0addbb3/globalmount type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,softerr,softreval,noresvport,proto=tcp,timeo=600,retrans=5,sec=sys,clientaddr=172.16.16.151,local_lock=none,addr=10.43.217.7)
10.43.217.7:/pvc-01f47f0a-12d2-4d1c-8dd6-8343e9e534cc on /var/lib/kubelet/pods/37e1e1e7-ead3-41ba-b6cc-1f548186a05b/volumes/kubernetes.io~csi/pvc-01f47f0a-12d2-4d1c-8dd6-8343e9e534cc/mount type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,softerr,softreval,noresvport,proto=tcp,timeo=600,retrans=5,sec=sys,clientaddr=172.16.16.151,local_lock=none,addr=10.43.217.7)
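
To confirm the server side, you can look up the ClusterIP of the share-manager Service for this volume (the Service name is assumed to include share-manager plus the PVC name):

kubectl -n longhorn-system get svc | grep share-manager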

Configuring Cilium

Reference: https://docs.rke2.io/networking/multus_sriov#using-multus-with-cilium

Before installing Multus, Cilium needs cni.exclusive set to false (required for the RKE2 version used here, v1.32.8+rke2r1):

cat <<EOF | kubectl apply -f -
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    cni:
      exclusive: false
EOF

After applying the change, restart Cilium manually:

kubectl -n kube-system rollout restart ds cilium

Without this restart, the NAD configured later will not take effect, because Cilium renames non-Cilium CNI configuration files, including the one Multus writes:

time=2025-11-28T02:13:54.197067708Z level=info msg="Renaming non-Cilium CNI configuration file" module=agent.infra.cni-config source=00-multus.conf destination=00-multus.conf.cilium_bak
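
To check whether this happened on a node, list the CNI configuration directory; a renamed file carries the .cilium_bak suffix:

ls /etc/cni/net.d/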

Installing Multus

Reference: https://github.com/k8snetworkplumbingwg/multus-cni/blob/master/docs/how-to-use.md#install-multus

The thick (client/server) mode is installed here:

kubectl apply -f https://raw.githubusercontent.com/k8snetworkplumbingwg/multus-cni/master/deployments/multus-daemonset-thick.yml # thick (client/server) deployment
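
Before continuing, verify that the Multus DaemonSet Pods are running (app=multus is the label used in the upstream manifest; adjust if yours differs):

kubectl -n kube-system get pods -l app=multus -o wide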

Installing the CNI Plugins

This step must be performed on every worker node.

Reference: https://github.com/containernetworking/plugins

wget "https://github.com/containernetworking/plugins/releases/download/v1.8.0/cni-plugins-linux-amd64-v1.8.0.tgz"
tar -xzf cni-plugins-linux-amd64-v1.8.0.tgz \
  -C /opt/cni/bin \
  --exclude='LICENSE' \
  --exclude='README.md'
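
A quick check that the plugins this setup relies on are now in place:

ls /opt/cni/bin/ | grep -E '^(macvlan|dhcp)$'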

Creating the NAD

DHCP is used here to assign IPs to the Pods. Reference: https://www.cni.dev/plugins/current/ipam/dhcp/#network-configuration-reference

cat <<EOF | kubectl apply -f -
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: longhorn-macvlan-config
  namespace: longhorn-system
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "ens35",
      "mode": "bridge",
      "ipam": {
        "type": "dhcp",
        "request": [
          {
            "skipDefault": true,
            "option": "subnet-mask"
          }
        ]
      }
    }
EOF
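
Verify that the NAD was created (net-attach-def is the short name of the NetworkAttachmentDefinition CRD):

kubectl -n longhorn-system get net-attach-def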

Creating the DHCP DaemonSet

In DHCP mode, a DHCP daemon must run on every worker node; a DaemonSet is used here to bring it up quickly:

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-multus-dhcp
  namespace: kube-system
  labels:
    app: kube-multus-dhcp
spec:
  selector:
    matchLabels:
      app: kube-multus-dhcp
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: kube-multus-dhcp
    spec:
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/worker: "true"
      initContainers:
      # Remove a stale socket left over from a previous run; the host's
      # /run/cni is mounted at /host/run/cni, so the path needs that prefix
      - name: kube-multus-cleanup-dhcp-socket
        image: harbor.warnerchen.com/library/busybox:latest
        command: ["rm", "-f", "/host/run/cni/dhcp.sock"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: socketpath
          mountPath: /host/run/cni
      containers:
      - name: kube-multus-dhcp
        image: harbor.warnerchen.com/library/busybox:latest
        command: ["/opt/cni/bin/dhcp", "daemon"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: binpath
          mountPath: /opt/cni/bin
        - name: socketpath
          mountPath: /run/cni
        - name: netnspath
          mountPath: /var/run/netns
          mountPropagation: HostToContainer
      volumes:
      - name: binpath
        hostPath:
          path: /opt/cni/bin
      - name: socketpath
        hostPath:
          path: /run/cni
      - name: netnspath
        hostPath:
          path: /run/netns
EOF
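
Check that one DHCP Pod is running on each worker node:

kubectl -n kube-system get pods -l app=kube-multus-dhcp -o wide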

Once the Pods are running, check that dhcp.sock exists on each node:

[root@rhel-0 ~]# ls /run/cni/
dhcp.sock

Testing the NAD

Create a DaemonSet to verify that a Pod on every node can obtain an IP and that the Pods can reach each other:

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: multus-nad-test
  namespace: default
spec:
  selector:
    matchLabels:
      app: multus-nad-test
  template:
    metadata:
      labels:
        app: multus-nad-test
      annotations:
        k8s.v1.cni.cncf.io/networks: longhorn-system/longhorn-macvlan-config
    spec:
      containers:
      - name: multus-nad-test
        image: harbor.warnerchen.com/library/busybox:latest
        command: ["sh", "-c", "sleep 360000"]
EOF
[root@rhel-0 ~]# kubectl get pod
NAME                    READY   STATUS    RESTARTS   AGE
multus-nad-test-jbxcz   1/1     Running   0          69s
multus-nad-test-s8zqx   1/1     Running   0          69s

[root@rhel-0 ~]# kubectl exec -it multus-nad-test-jbxcz -- ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: net1@if90: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue qlen 1000
    inet 172.16.16.201/24 brd 172.16.16.255 scope global net1
       valid_lft forever preferred_lft forever
91: eth0@if92: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    inet 10.42.0.71/32 scope global eth0
       valid_lft forever preferred_lft forever

[root@rhel-0 ~]# kubectl exec -it multus-nad-test-s8zqx -- ip -4 a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: net1@if68: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue qlen 1000
    inet 172.16.16.206/24 brd 172.16.16.255 scope global net1
       valid_lft forever preferred_lft forever
69: eth0@if70: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue
    inet 10.42.1.31/32 scope global eth0
       valid_lft forever preferred_lft forever

[root@rhel-0 ~]# kubectl exec -it multus-nad-test-jbxcz -- ip r
default via 10.42.0.163 dev eth0
10.42.0.163 dev eth0 scope link
172.16.16.0/24 dev net1 scope link src 172.16.16.201

[root@rhel-0 ~]# kubectl exec -it multus-nad-test-jbxcz -- ping -c 3 172.16.16.206
PING 172.16.16.206 (172.16.16.206): 56 data bytes
64 bytes from 172.16.16.206: seq=0 ttl=64 time=0.435 ms
64 bytes from 172.16.16.206: seq=1 ttl=64 time=0.211 ms
64 bytes from 172.16.16.206: seq=2 ttl=64 time=0.181 ms

--- 172.16.16.206 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.181/0.275/0.435 ms

If everything works, the test DaemonSet can be deleted:

kubectl delete ds multus-nad-test

Configuring the Storage Network in Longhorn

Reference: https://longhorn.io/docs/1.8.2/advanced-resources/deploy/storage-network/

Before configuring, stop every workload that uses a Longhorn PV so that the old connections are released.

In the Longhorn UI, go to Setting -> General, find Storage Network, enter the NAD created earlier, and check Storage Network for RWX Volume Enabled.
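
If you prefer the CLI, the same change can be made through the Longhorn Setting resources. A minimal sketch, assuming the setting names storage-network and storage-network-for-rwx-volume-enabled from the Longhorn documentation:

kubectl -n longhorn-system patch settings.longhorn.io storage-network \
  --type=merge -p '{"value":"longhorn-system/longhorn-macvlan-config"}'
kubectl -n longhorn-system patch settings.longhorn.io storage-network-for-rwx-volume-enabled \
  --type=merge -p '{"value":"true"}'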

After saving, Pods such as instance-manager and longhorn-csi-plugin are recreated and receive Macvlan IPs:
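
The assigned addresses can be read from the Multus network-status annotation, for example on the instance-manager Pods (the longhorn.io/component=instance-manager label is assumed here):

kubectl -n longhorn-system get pods -l longhorn.io/component=instance-manager \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}{"\n"}{end}'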

Verifying That Workloads Use the Storage Network

Start the workloads again and, once they are up, check the mounts:

[root@rhel-1 ~]# mount | grep pvc
# RWX
# The NFS client is the Macvlan IP of the longhorn-csi-plugin Pod on the node, i.e. 172.16.16.215
# The NFS server is the Macvlan IP of the share-manager-pvc Pod, i.e. 172.16.16.223
pvc-01f47f0a-12d2-4d1c-8dd6-8343e9e534cc.longhorn-system.svc.cluster.local:/pvc-01f47f0a-12d2-4d1c-8dd6-8343e9e534cc on /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/447ee02d670adfe4a48ae856b8eb99bd43bd6fa936594b10cdecc04fb0addbb3/globalmount type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,softerr,softreval,noresvport,proto=tcp,timeo=600,retrans=5,sec=sys,clientaddr=172.16.16.215,local_lock=none,addr=172.16.16.223)
pvc-01f47f0a-12d2-4d1c-8dd6-8343e9e534cc.longhorn-system.svc.cluster.local:/pvc-01f47f0a-12d2-4d1c-8dd6-8343e9e534cc on /var/lib/kubelet/pods/2948c4ba-b41e-4f19-b5b6-628a78abdd02/volumes/kubernetes.io~csi/pvc-01f47f0a-12d2-4d1c-8dd6-8343e9e534cc/mount type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,softerr,softreval,noresvport,proto=tcp,timeo=600,retrans=5,sec=sys,clientaddr=172.16.16.215,local_lock=none,addr=172.16.16.223)

The Service's Endpoints also show that the Macvlan IP is used:
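
For example, the Endpoints object (named after the PVC) should point at the Macvlan address rather than a cluster-network backend:

kubectl -n longhorn-system get endpoints | grep pvc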

Deploying Multus Automatically via RKE2

Environment:

  • RKE2: v1.32.8+rke2r1
  • CNI: Cilium
  • Longhorn: v1.8.2
  • Multus: v4.2.202

Creating the RKE2 Cluster

When creating the cluster, select multus,cilium as the CNI.
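
On a standalone RKE2 installation, the equivalent is the cni entry in the server configuration (per the RKE2 docs, multus must be listed first):

# /etc/rancher/rke2/config.yaml
cni:
  - multus
  - cilium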

Configuring Cilium

As noted earlier, whether this configuration is needed depends on the RKE2 version in use.

cat <<EOF | kubectl apply -f -
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    cni:
      exclusive: false
EOF

After applying the change, restart Cilium manually:

kubectl -n kube-system rollout restart ds cilium

Check whether the Multus CNI configuration file was renamed, and restore it manually if so:

mv /etc/cni/net.d/00-multus.conf.cilium_bak /etc/cni/net.d/00-multus.conf

Installing the CNI Plugins

The Multus deployed by RKE2 installs the required CNI plugins on each node by default, so manual installation is not needed:

root@docker-test-1:~# ls -l /opt/cni/bin/
total 208808
-rwxr-xr-x 1 root root 5204704 Nov 28 15:51 bandwidth
-rwxr-xr-x 1 root root 5275360 Nov 28 16:00 bond
-rwxr-xr-x 1 root root 5649088 Nov 28 15:51 bridge
-rwxr-xr-x 1 root root 69579264 Nov 28 16:00 cilium-cni
-rwxr-xr-x 1 root root 11497152 Nov 28 15:51 dhcp
-rwxr-xr-x 1 root root 5337120 Nov 28 16:00 dummy
-rwxr-xr-x 1 root root 5681088 Nov 28 15:51 firewall
-rwxr-xr-x 1 root root 5271680 Nov 28 15:51 host-device
-rwxr-xr-x 1 root root 4720960 Nov 28 15:51 host-local
-rwxr-xr-x 1 root root 5353504 Nov 28 15:51 ipvlan
-rwxr-xr-x 1 root root 2828960 Nov 28 15:51 loopback
-rwxr-xr-x 1 root root 5378112 Nov 28 15:51 macvlan
-rwxr-xr-x 1 root root 49046752 Nov 28 15:52 multus
-rwxr-xr-x 1 root root 5246944 Nov 28 16:00 portmap
-rwxr-xr-x 1 root root 5497568 Nov 28 15:51 ptp
-rwxr-xr-x 1 root root 2989024 Nov 28 15:51 sbr
-rwxr-xr-x 1 root root 2477984 Nov 28 15:51 static
-rwxr-xr-x 1 root root 5400192 Nov 28 16:00 tap
-rwxr-xr-x 1 root root 2891328 Nov 28 15:51 tuning
-rwxr-xr-x 1 root root 5349408 Nov 28 15:51 vlan
-rwxr-xr-x 1 root root 3099520 Nov 28 15:51 vrf

Enabling the DHCP DaemonSet

rke2-multus supports enabling the DHCP DaemonSet through a Helm chart value:

cat <<EOF | kubectl apply -f -
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-multus
  namespace: kube-system
spec:
  valuesContent: |-
    manifests:
      dhcpDaemonSet: true
EOF

Once the Pods are running, check that dhcp.sock exists on the node:

root@docker-test-0:~# ls /var/run/cni/
dhcp.sock

Testing and Verification

Follow the earlier steps to create the NAD and configure the Longhorn Storage Network.

Author: Warner Chen
Posted on: 2025-11-28
Updated on: 2025-12-08
