How to enable debug-level logging for Rancher, Rancher System Agent, Cattle Cluster Agent, and other components.
RKE cluster Pods stuck in the Terminating state
When a Pod is deleted in an RKE cluster, it stays in the Terminating state, and kubelet logs the following error:
2025-10-23T14:34:08.462684510Z E1023 14:34:08.462648 3018 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/configmap/4dc46d4c-aa98-40b2-b941-9393978e4648-aaa-bbb-ccc podName:4dc46d4c-aa98-40b2-b941-9393978e4648 nodeName:}" failed. No retries permitted until 2025-10-23 14:36:10.462606964 +0000 UTC m=+68505754.237199481 (durationBeforeRetry 2m2s). Error: "error cleaning subPath mounts for volume \"aaa-bbb-ccc\" (UniqueName: \"kubernetes.io/configmap/4dc46d4c-aa98-40b2-b941-9393978e4648-aaa-bbb-ccc\") pod \"4dc46d4c-aa98-40b2-b941-9393978e4648\" (UID: \"4dc46d4c-aa98-40b2-b941-9393978e4648\") : error processing /var/lib/kubelet/pods/4dc46d4c-aa98-40b2-b941-9393978e4648/volume-subpaths/aaa-bbb-ccc/ddd-eee: error cleaning subpath mount /var/lib/kubelet/pods/4dc46d4c-aa98-40b2-b941-9393978e4648/volume-subpaths/aaa-bbb-ccc/ddd-eee/3: remove /var/lib/kubelet/pods/4dc46d4c-aa98-40b2-b941-9393978e4648/volume-subpaths/aaa-bbb-ccc/ddd-eee/3: device or resource busy"
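This issue does not block the creation of new Pods (for example, when a Deployment or similar resource is updated), but a large number of Pods in the Terminating state accumulate in the cluster.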
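To gauge how many Pods are affected and whether their subPath mounts are still present on a node, a minimal sketch follows. The function names `filter_terminating` and `subpath_mounts` are hypothetical helpers, not part of RKE or kubectl, and the sketch assumes the host's mount table reflects the kubelet's mounts, which may not hold when kubelet runs in a container with its own mount namespace.

```shell
#!/usr/bin/env bash

# Print "namespace/name" for every Pod whose STATUS column is Terminating.
# Expects the output of `kubectl get pods -A --no-headers` on stdin
# (columns: NAMESPACE NAME READY STATUS RESTARTS AGE).
filter_terminating() {
  awk '$4 == "Terminating" { print $1 "/" $2 }'
}

# Print the mount points that still exist under a Pod's volume-subpaths directory.
# $1: Pod UID (as shown in the kubelet error)
# $2: mounts file to inspect (assumed default: /proc/mounts on the node)
subpath_mounts() {
  local uid="$1" mounts="${2:-/proc/mounts}"
  awk -v p="/var/lib/kubelet/pods/${uid}/volume-subpaths/" \
    'index($2, p) == 1 { print $2 }' "$mounts"
}
```

Typical use would be `kubectl get pods -A --no-headers | filter_terminating` from a workstation, and `subpath_mounts 4dc46d4c-aa98-40b2-b941-9393978e4648` on the node named in the kubelet log. A non-empty result from the second command is consistent with the "device or resource busy" error, since a directory that is still a mount point cannot be removed.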
RKE Pod creation fails with no space left on device
When a Pod is created in an RKE cluster, the events report insufficient disk space:
2025-10-23T14:33:28.367344576Z E1023 14:33:28.367304 3018 pod_workers.go:191] Error syncing pod 39982f3f-4435-47f0-bd9a-401eac35d8e5 ("logistics-api-678f476dc5-rw89k_prod-feiyuntms(39982f3f-4435-47f0-bd9a-401eac35d8e5)"), skipping: failed to "CreatePodSandbox" for "logistics-api-678f476dc5-rw89k_prod-feiyuntms(39982f3f-4435-47f0-bd9a-401eac35d8e5)" with CreatePodSandboxError: "CreatePodSandbox for pod \"logistics-api-678f476dc5-rw89k_prod-feiyuntms(39982f3f-4435-47f0-bd9a-401eac35d8e5)\" failed: rpc error: code = Unknown desc = failed to create a sandbox for pod \"logistics-api-678f476dc5-rw89k\": Error response from daemon: error creating overlay mount to /u/var/lib/docker/overlay2/62fae66c0cd56dd2fdd458c0d454ee14f1622da5231fcf361f21fa76b167e9bb-init/merged: no space left on device"
Docker reports the following error:
Oct 23 14:33:28 oser504254 dockerd[2462]: time="2025-10-23T14:33:28.363765012Z" level=error msg="error unmounting /u/var/lib/docker/overlay2/62fae66c0cd56dd2fdd458c0d454ee14f1622da5231fcf361f21fa76b167e9bb-init/merged: invalid argument" storage-driver=overlay2
However, checking the container data directories on the host shows that every disk has ample free space.
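This excerpt does not state the root cause. One thing worth ruling out, offered here as an assumption rather than a diagnosis, is the kernel's per-namespace mount limit: `mount()` can fail with "no space left on device" when the number of mounts in a namespace reaches `fs.mount-max`, even though the disks themselves are not full. The helper name `mount_usage` below is hypothetical.

```shell
#!/usr/bin/env bash

# Print "<mount count> <limit>" so the two numbers can be compared.
# $1: mounts file (assumed default: /proc/self/mounts, the caller's namespace)
# $2: file holding the limit (assumed default: /proc/sys/fs/mount-max)
mount_usage() {
  local mounts="${1:-/proc/self/mounts}" limit="${2:-/proc/sys/fs/mount-max}"
  local count
  count="$(wc -l < "$mounts")"
  # $((count)) normalises any padding that wc adds on some platforms.
  echo "$((count)) $(cat "$limit")"
}
```

Because dockerd may live in a different mount namespace than an interactive shell, it can be more informative to pass its own mount table, for example `mount_usage "/proc/$(pidof dockerd)/mounts"`. If the first number is close to the second, leaked mounts are a plausible suspect; if not, this hypothesis can be discarded.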
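When Nginx Ingress is used with a mounted certificate, a curl request sent to the Ingress Controller with a Host request header receives the Kubernetes Ingress Controller Fake Certificate instead of the certificate that was actually mounted.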
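A possible explanation, which this excerpt does not confirm: curl takes the TLS SNI name from the URL, not from a `Host` header, so a request addressed to an IP carries no matching SNI and the controller may fall back to its default certificate. The sketch below checks which certificate is served for a given SNI name. The function names `is_fake_cert` and `served_subject`, the address `203.0.113.10`, and the host `app.example.com` are placeholders for illustration.

```shell
#!/usr/bin/env bash

# Succeed if the certificate subject read from stdin belongs to the
# controller's default ("fake") certificate.
is_fake_cert() {
  grep -q 'Kubernetes Ingress Controller Fake Certificate'
}

# Print the subject of the certificate served for a given SNI name.
# $1: address of the Ingress Controller, $2: host name to send as SNI
served_subject() {
  openssl s_client -connect "$1:443" -servername "$2" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject
}
```

For example, `served_subject 203.0.113.10 app.example.com | is_fake_cert && echo "default certificate served"`. With curl, `curl --resolve app.example.com:443:203.0.113.10 https://app.example.com/` sends both the SNI name and the Host header for `app.example.com` while still connecting to the chosen address.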

Cilium offers several networking modes. By default it uses VXLAN mode, which implements cross-node Pod communication through an overlay network.
root@rke2-cilium-01:~# kubectl -n kube-system get cm cilium-config -oyaml | grep tunnel-protocol