Using HAMi on RKE2 with fake-gpu-operator

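fake-gpu-operator simulates the nvidia.com/gpu resource on CPU-only nodes. Without any real GPU, it can be used to explore the components that make up HAMi and to verify the scheduling flow of GPU Pods.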

Reference: https://project-hami.io/zh/tutorials/labs/local-fake-gpu#%E6%AD%A5%E9%AA%A4-5-%E8%BF%90%E8%A1%8C%E6%A8%A1%E6%8B%9F-gpu-%E5%B7%A5%E4%BD%9C%E8%B4%9F%E8%BD%BD


Prerequisites

  • RKE2 version >= 1.18
  • Prometheus is installed in the cluster
root@test-0:~# kubectl get nodes
NAME     STATUS   ROLES                       AGE   VERSION
test-0   Ready    control-plane,etcd,worker   55m   v1.34.7+rke2r1

root@test-0:~# kubectl -n cattle-monitoring-system get pod | grep prometheus
prometheus-rancher-monitoring-prometheus-0 3/3 Running 0 46m
...
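
The version prerequisite can also be checked with a few lines of shell. This is a minimal sketch rather than part of the original walkthrough: version_at_least is a hypothetical helper name, and it compares only the major and minor components of a version string such as the v1.34.7+rke2r1 shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when VERSION is at least MIN_MAJOR.MIN_MINOR.
# Usage: version_at_least VERSION MIN_MAJOR MIN_MINOR
version_at_least() {
  local v="${1#v}"            # drop a leading "v": v1.34.7+rke2r1 -> 1.34.7+rke2r1
  local major="${v%%.*}"      # text before the first dot
  local rest="${v#*.}"        # text after the first dot
  local minor="${rest%%.*}"   # text before the next dot
  minor="${minor%%[!0-9]*}"   # keep only the leading digits
  if [ "$major" -gt "$2" ]; then
    return 0
  fi
  [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]
}

# Example against the kubelet version from the output above.
if version_at_least "v1.34.7+rke2r1" 1 18; then
  echo "version prerequisite met"
else
  echo "version too old"
fi
```

On a live cluster the version string could be read with kubectl get nodes -o jsonpath='{.items[0].status.nodeInfo.kubeletVersion}' and passed to the helper.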

Install fake-gpu-operator

kubectl create namespace gpu-operator
kubectl label namespace gpu-operator pod-security.kubernetes.io/enforce=privileged
kubectl label node test-0 run.ai/simulated-gpu-node-pool=default

export FAKE_GPU_OPERATOR_VERSION=0.0.82
helm -n gpu-operator upgrade --install gpu-operator \
  oci://ghcr.io/run-ai/fake-gpu-operator/fake-gpu-operator \
  --set runtimeClass.enabled=false \
  --create-namespace \
  --version "${FAKE_GPU_OPERATOR_VERSION}"

RKE2 ships with NVIDIA-related RuntimeClass objects by default, so --set runtimeClass.enabled=false is needed here to stop fake-gpu-operator from creating its own RuntimeClass.
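
Whether the cluster already has such a RuntimeClass can be confirmed before installing. The sketch below is illustrative only: has_nvidia_runtimeclass is a hypothetical helper name, and matching on the substring "nvidia" is an assumption about how the RKE2-provided objects are named.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when any RuntimeClass name contains "nvidia".
# `kubectl get runtimeclass -o name` prints one "runtimeclass.node.k8s.io/<name>" per line.
has_nvidia_runtimeclass() {
  kubectl get runtimeclass -o name | grep -q 'nvidia'
}

# Usage (requires cluster access):
#   has_nvidia_runtimeclass && echo "keep runtimeClass.enabled=false"
```

If the helper fails on a given cluster, the runtimeClass.enabled=false override may not be required there.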

Check the Pod status:

root@test-0:~# kubectl -n gpu-operator get pod
NAME                                         READY   STATUS    RESTARTS   AGE
device-plugin-wn65m                          1/1     Running   0          52m
kwok-gpu-device-plugin-57bc9bf948-bsk7j      1/1     Running   0          53m
nvidia-dcgm-exporter-9ct5k                   1/1     Running   0          52m
nvidia-dcgm-exporter-kwok-6768c648d7-7z7hj   1/1     Running   0          53m
status-updater-675ffc785b-dr2n2              1/1     Running   0          53m
topology-server-5cf5b5c9f6-gzxml             1/1     Running   0          53m

Create a ServiceMonitor to scrape GPU metrics

cat <<EOF | kubectl apply -f -
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: nvidia-dcgm-exporter
  namespace: gpu-operator
  labels:
    release: prometheus
spec:
  selector:
    matchLabels:
      app: nvidia-dcgm-exporter
  namespaceSelector:
    matchNames:
      - gpu-operator
  endpoints:
    - port: gpu-metrics
      path: /metrics
      interval: 15s
EOF

Verify that the metrics are being scraped:

root@test-0:~# kubectl exec -n cattle-monitoring-system prometheus-rancher-monitoring-prometheus-0 -- promtool query instant http://localhost:9090 'DCGM_FI_DEV_GPU_UTIL'
DCGM_FI_DEV_GPU_UTIL{Hostname="nvidia-dcgm-exporter-b99360", UUID="GPU-2a850fce-4a64-5e0b-8f6e-d3747c3e7c62", container="nvidia-dcgm-exporter", device="nvidia1", endpoint="gpu-metrics", gpu="1", instance="10.42.144.158:9400", job="nvidia-dcgm-exporter", modelName="Tesla-K80", namespace="gpu-operator", pod="nvidia-dcgm-exporter-9ct5k", service="nvidia-dcgm-exporter"} => 0 @[1781149563.009]
DCGM_FI_DEV_GPU_UTIL{Hostname="nvidia-dcgm-exporter-b99360", UUID="GPU-45f17b1d-8533-5ec8-aa93-6138df68b8b4", container="nvidia-dcgm-exporter", device="nvidia0", endpoint="gpu-metrics", exported_container="app", exported_namespace="default", exported_pod="fake-gpu-pod", gpu="0", instance="10.42.144.158:9400", job="nvidia-dcgm-exporter", modelName="Tesla-K80", namespace="gpu-operator", pod="nvidia-dcgm-exporter-9ct5k", service="nvidia-dcgm-exporter"} => 28 @[1781149563.009]
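
The node annotation written later in this walkthrough needs the simulated GPU UUIDs, and they can be pulled out of this promtool output instead of being copied by hand. A minimal sketch, assuming the UUID label keeps the UUID="..." shape shown above; extract_gpu_uuids is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read promtool output on stdin and print each distinct GPU UUID.
extract_gpu_uuids() {
  grep -o 'UUID="[^"]*"' | sed -e 's/^UUID="//' -e 's/"$//' | sort -u
}

# Example with an abbreviated metric line in the same shape as the output above.
printf '%s\n' 'DCGM_FI_DEV_GPU_UTIL{Hostname="h", UUID="GPU-2a850fce-4a64-5e0b-8f6e-d3747c3e7c62", gpu="1"} => 0' \
  | extract_gpu_uuids
```

In practice the kubectl exec ... promtool query instant command above would be piped into extract_gpu_uuids.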

Install HAMi and HAMi WebUI

The HAMi version supported by Rancher Prime GC does not yet support the devicePlugin.enabled parameter, so the official Helm chart is used here instead:

helm repo add hami-charts https://project-hami.github.io/HAMi/
helm repo update

helm -n hami upgrade --install hami hami-charts/hami \
  --set devicePlugin.enabled=false \
  --create-namespace

fake-gpu-operator has already installed a device plugin, so HAMi's devicePlugin must be disabled here.

Install HAMi WebUI:

helm repo add hami-webui https://Project-HAMi.github.io/HAMi-WebUI/
helm repo update

helm -n hami upgrade --install hami-webui hami-webui/hami-webui \
  --set externalPrometheus.enabled=true \
  --set externalPrometheus.address="http://rancher-monitoring-prometheus.cattle-monitoring-system.svc.cluster.local:9090" \
  --set dcgm-exporter.enabled=false \
  --set service.type=NodePort

Check the installation status:

root@test-0:~# helm -n hami ls
NAME         NAMESPACE   REVISION   UPDATED                                   STATUS     CHART              APP VERSION
hami         hami        1          2026-06-11 10:45:04.046737106 +0800 CST   deployed   hami-2.9.0         2.9.0
hami-webui   hami        2          2026-06-11 10:51:59.786933364 +0800 CST   deployed   hami-webui-1.2.0   1.2.0

root@test-0:~# kubectl -n hami get pod
NAME                              READY   STATUS    RESTARTS      AGE
hami-scheduler-799c59bc8d-pmswt   2/2     Running   2 (58m ago)   58m
hami-webui-7b57fc4cbf-hdvqd       2/2     Running   0             41m

Configure node labels and GPU registration info

HAMi WebUI discovers GPU nodes through the node label gpu=on, so the label has to be added to the node manually:

kubectl label node test-0 gpu=on
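
To confirm the label took effect, the nodes carrying it can be listed with a label selector. A small sketch; gpu_labeled_nodes is a hypothetical helper name introduced for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of all nodes labeled gpu=on, one per line.
# `kubectl get nodes -o name` prints "node/<name>", so the prefix is stripped.
gpu_labeled_nodes() {
  kubectl get nodes -l gpu=on -o name | sed 's|^node/||'
}

# Usage (requires cluster access):
#   gpu_labeled_nodes
```

With the single-node cluster used here, the expected result is one line containing test-0.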

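In a real environment, the HAMi device-plugin component automatically writes the hami.io/node-nvidia-register annotation on the node, which contains the GPU UUID, model, memory size, and other details.

Because the HAMi device-plugin is disabled in this test (to avoid conflicts with fake-gpu-operator), the annotation has to be added manually: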

# One JSON object per GPU, in the same format that the HAMi v2.9.0 device plugin writes on a real GPU node
# id is the device UUID
# count is the number of vGPU slices per card (HAMi defaults to 10)
# devmem is the GPU memory (MiB)
# devcore is the compute capacity (%)
# mode is hami-core, meaning software-level partitioning
# The UUIDs and memory values here come from fake-gpu-operator's dcgm-exporter metrics;
# replace them with the values reported in your own environment
kubectl annotate node test-0 \
  hami.io/node-nvidia-register='[{"id":"GPU-3cef3724-8228-5a66-b391-b0901788f5d0","count":10,"devmem":11441,"devcore":100,"type":"NVIDIA-Tesla-K80","mode":"hami-core","health":true},{"id":"GPU-5127182e-f297-5a25-bb44-0444c3be540c","index":1,"count":10,"devmem":11441,"devcore":100,"type":"NVIDIA-Tesla-K80","mode":"hami-core","health":true}]' \
  hami.io/node-handshake="Requesting_$(date '+%Y.%m.%d %H:%M:%S')"
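
Writing this JSON by hand is error-prone, so it may help to generate it from a list of UUIDs. The sketch below mirrors the shape of the annotation above, including leaving out "index" on the first device; build_register_json is a hypothetical helper name, and the fixed count of 10, devcore of 100, and hami-core mode are simply the values used in this lab.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a hami.io/node-nvidia-register value for the given GPUs.
# Usage: build_register_json DEVMEM_MIB TYPE UUID [UUID...]
build_register_json() {
  local devmem="$1" type="$2"
  shift 2
  local out="[" i=0 uuid
  for uuid in "$@"; do
    if [ "$i" -gt 0 ]; then
      out="$out,"                      # separator between JSON objects
    fi
    out="$out{\"id\":\"$uuid\","
    if [ "$i" -gt 0 ]; then
      out="$out\"index\":$i,"          # the example above omits index for device 0
    fi
    out="$out\"count\":10,\"devmem\":$devmem,\"devcore\":100,\"type\":\"$type\",\"mode\":\"hami-core\",\"health\":true}"
    i=$((i + 1))
  done
  printf '%s]\n' "$out"
}

# Usage (requires cluster access):
#   kubectl annotate node test-0 --overwrite \
#     hami.io/node-nvidia-register="$(build_register_json 11441 NVIDIA-Tesla-K80 GPU-... GPU-...)"
```

The UUID arguments in the usage comment are left as placeholders on purpose; they should come from the dcgm-exporter metrics of the cluster being configured.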

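Run a simulated GPU workload

Use the following Pod to verify that Kubernetes can schedule a workload requesting nvidia.com/gpu onto the fake GPU node.

fake-gpu-operator injects a simulated nvidia-smi tool into GPU Pods, which makes it easy to observe GPU visibility.

Because the HAMi device-plugin is not enabled in this lab, HAMi does not write the hami.io/node-nvidia-register node registration info the way it would in a real environment. The test Pod therefore explicitly bypasses the HAMi webhook and relies on the default Kubernetes scheduler and the simulated GPU resources provided by fake-gpu-operator.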

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: fake-gpu-pod
  labels:
    hami.io/webhook: ignore
  annotations:
    run.ai/simulated-gpu-utilization: "10-30"
spec:
  restartPolicy: Never
  containers:
    - name: app
      image: ubuntu:22.04
      command: [ "bash", "-lc", "sleep 3600" ]
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
        limits:
          cpu: "500m"
          memory: "512Mi"
          nvidia.com/gpu: 1
      env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
EOF

Check the status of the simulated GPU Pod:

root@test-0:~# kubectl get pod -o wide
NAME           READY   STATUS    RESTARTS   AGE   IP              NODE     NOMINATED NODE   READINESS GATES
fake-gpu-pod   1/1     Running   0          37s   10.42.144.184   test-0   <none>           <none>
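
If the Pod stays Pending instead, a quick check is whether the node advertises any nvidia.com/gpu capacity at all. A minimal sketch; gpu_allocatable and node_has_gpus are hypothetical helper names, and the JSONPath expression escapes the dot in the resource name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the allocatable nvidia.com/gpu count of a node
# (prints nothing when the resource is not advertised).
gpu_allocatable() {
  kubectl get node "$1" -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
}

# Hypothetical helper: succeeds when the node advertises at least one GPU.
node_has_gpus() {
  local count
  count="$(gpu_allocatable "$1")" || return 1
  [ -n "$count" ] && [ "$count" -ge 1 ]
}

# Usage (requires cluster access):
#   node_has_gpus test-0 && echo "fake GPUs are advertised"
```

An empty result usually points at the fake device plugin rather than at HAMi, since this Pod bypasses the HAMi webhook.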

Run nvidia-smi to view the simulated GPU information:

root@test-0:~# kubectl exec fake-gpu-pod -- nvidia-smi
Thu Jun 11 03:49:45 2026
+------------------------------------------------------------------------------+
| NVIDIA-SMI 470.129.06 Driver Version: 470.129.06 CUDA Version: 11.4 |
+--------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
+--------------------------------+----------------------+----------------------+
| 0 Tesla-K80 Off | 00000001:00:00.0 Off | Off |
| N/A 33C P8 11W / 70W | 11441MiB / 11441MiB | 29% Default |
| | | N/A |
+--------------------------------+----------------------+----------------------+

+------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
+------------------------------------------------------------------------------+
| 0 N/A N/A 31 G sleep3600 11441MiB |
+------------------------------------------------------------------------------+
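
The 29% GPU-Util reported here can be compared with the run.ai/simulated-gpu-utilization: "10-30" annotation on the test Pod. The sketch below assumes the annotation denotes an inclusive percentage range, which is an interpretation rather than something stated in this walkthrough; util_in_range is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when VALUE (optionally suffixed with "%")
# lies inside the inclusive range "LOW-HIGH".
# Usage: util_in_range VALUE RANGE
util_in_range() {
  local value="${1%\%}" range="$2"   # strip a trailing percent sign
  local lo="${range%-*}"             # part before the dash
  local hi="${range#*-}"             # part after the dash
  [ "$value" -ge "$lo" ] && [ "$value" -le "$hi" ]
}

# Example with the utilization shown in the nvidia-smi output above.
if util_in_range "29%" "10-30"; then
  echo "utilization is within the simulated range"
fi
```

The same check applies to the DCGM_FI_DEV_GPU_UTIL value of 28 seen in the Prometheus query earlier.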

Access HAMi WebUI

Access HAMi WebUI through its NodePort to view the node's GPU information.
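
The NodePort assigned to the WebUI can be looked up from its Service. This is a sketch under assumptions: the Service name hami-webui and the use of the first port entry are guesses based on the Helm release name used above, and webui_url is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the WebUI URL for a given node IP.
# Usage: webui_url NODE_IP [SERVICE] [NAMESPACE]
# Assumes the Service is called "hami-webui" in namespace "hami" unless overridden.
webui_url() {
  local node_ip="$1" svc="${2:-hami-webui}" ns="${3:-hami}"
  local port
  port="$(kubectl -n "$ns" get svc "$svc" -o jsonpath='{.spec.ports[0].nodePort}')" || return 1
  printf 'http://%s:%s\n' "$node_ip" "$port"
}

# Usage (requires cluster access; replace the placeholder with a real node IP):
#   webui_url <node-ip>
```

If the Service has a different name, kubectl -n hami get svc lists the candidates.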


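Limitations

The following capabilities require a real NVIDIA GPU environment:

  • HAMi device-plugin automatically registering real GPUs and writing hami.io/node-nvidia-register
  • nvidia.com/gpumem GPU memory partitioning
  • nvidia.com/gpucores compute share limiting
  • Actually running CUDA programs
  • GPU memory oversubscription
  • GPU memory analysis
  • GPU memory override
  • Real GPU metrics from DCGM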

Author: Warner Chen
Posted on: 2026-06-11
Updated on: 2026-06-11
