Node Planning
| Role | Hostname | IP | OS | Kernel |
| --- | --- | --- | --- | --- |
| Control Plane + etcd | test-0 | 172.16.16.140 | Ubuntu 22.04.5 LTS | 5.15.0-125-generic |
| Control Plane + etcd | test-1 | 172.16.16.141 | Ubuntu 22.04.5 LTS | 5.15.0-125-generic |
| Control Plane + etcd | test-2 | 172.16.16.142 | Ubuntu 22.04.5 LTS | 5.15.0-125-generic |
| Worker | test-3 | 172.16.16.143 | Ubuntu 22.04.5 LTS | 5.15.0-125-generic |
Environment Planning
| Component | Version |
| --- | --- |
| Kubernetes | v1.32.13 |
| Containerd | v2.1.7 |
| Runc | v1.4.2 |
| Crictl | v1.32.0 |
| CNI Plugins | v1.9.1 |
| Nerdctl | v2.2.2 |
| Calico | v3.27.0 |

| Service CIDR | Pod CIDR |
| --- | --- |
| 10.43.0.0/16 | 10.42.0.0/16 |
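The Pod and Service ranges must not overlap each other or the network the nodes live on. The following is a minimal sketch of an overlap check in plain bash; the helper names (`ip_to_int`, `cidr_range`, `cidrs_overlap`) are illustrative, and the node network `172.16.16.0/24` used in the usage note is an assumption, since the source only lists the node addresses and not their netmask.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for checking that two IPv4 CIDR blocks do not overlap.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<<"$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print the first and last address (as integers) covered by a CIDR block.
cidr_range() {
  local ip=${1%/*} prefix=${1#*/}
  local base mask
  base=$(ip_to_int "$ip")
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  echo "$(( base & mask )) $(( (base & mask) | (~mask & 0xFFFFFFFF) ))"
}

# Return 0 (true) if the two CIDR blocks share at least one address.
cidrs_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<<"$(cidr_range "$1")"
  read -r s2 e2 <<<"$(cidr_range "$2")"
  (( s1 <= e2 && s2 <= e1 ))
}
```

For example, `cidrs_overlap 10.42.0.0/16 10.43.0.0/16 || echo "no overlap"` compares the two planned ranges, and the same call can be repeated against the assumed node network `172.16.16.0/24`.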
Node Initialization
Set the Time Zone
```bash
timedatectl set-timezone Asia/Shanghai
```
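Configure Time Synchronization
```bash
apt -y install chrony

# Add the following line to /etc/chrony/chrony.conf:
#   server ntp.aliyun.com minpoll 4 maxpoll 10 iburst
vim /etc/chrony/chrony.conf

# On Ubuntu the unit is chrony.service (chronyd.service is an alias).
# Restart it so that the edited configuration takes effect.
systemctl enable chrony.service
systemctl restart chrony.service
```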
Verify that the clock is synchronized:
```console
root@test-0:~# chronyc tracking
Reference ID    : CB6B0658 (203.107.6.88)
Stratum         : 3
Ref time (UTC)  : Wed Apr 15 03:41:43 2026
System time     : 0.000027167 seconds fast of NTP time
Last offset     : +0.000345476 seconds
RMS offset      : 0.001053186 seconds
Frequency       : 13.992 ppm fast
Residual freq   : +0.245 ppm
Skew            : 35.316 ppm
Root delay      : 0.053283855 seconds
Root dispersion : 0.002438692 seconds
Update interval : 16.4 seconds
Leap status     : Normal

root@test-0:~# chronyc sources -v

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current best, '+' = combined, '-' = not combined,
| /             'x' = may be in error, '~' = too variable, '?' = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^* 203.107.6.88                  2   4   377     1   +359us[ +412us] +/-   34ms
```
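When the same check has to be repeated on every node, reading the `chronyc tracking` output by eye gets tedious. The sketch below shows one way to automate it, assuming the output format shown in this guide; the function name `check_chrony_tracking` and the default threshold of 0.1 seconds are hypothetical choices, not chrony defaults.

```bash
#!/usr/bin/env bash
# Read `chronyc tracking` output on stdin and succeed only if the leap
# status is "Normal" and the reported system-time offset is below a
# threshold. max_offset (seconds) is an illustrative value, default 0.1.
check_chrony_tracking() {
  local max_offset=${1:-0.1}
  awk -v max="$max_offset" '
    /^Leap status/ {
      leap = $0
      sub(/^[^:]*:[ \t]*/, "", leap)   # drop the "Leap status :" label
      sub(/[ \t]+$/, "", leap)         # drop trailing whitespace
    }
    /^System time/ {
      split($0, parts, ":")            # parts[2] = " 0.000027167 seconds fast ..."
      split(parts[2], f, " ")
      offset = f[1] + 0
      seen = 1
    }
    END {
      if (leap == "Normal" && seen && offset < max) exit 0
      exit 1
    }
  '
}
```

On a node it could be used as `chronyc tracking | check_chrony_tracking 0.05 && echo "clock OK"`.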
Disable Swap
```bash
swapoff -a
sed -i '/swap/s/^/#/' /etc/fstab
```
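The kubelet refuses to start by default when swap is active, so it is worth confirming both that swap is off now and that it will stay off after a reboot. A minimal sketch, assuming the standard `/proc/swaps` and `/etc/fstab` layouts; the function names are hypothetical, and the file paths are parameters so the checks can be tried against sample files first.

```bash
#!/usr/bin/env bash
# Succeed if no swap area is active. Reads a /proc/swaps-style file:
# one header line followed by one line per active swap area.
swap_is_off() {
  local swaps_file=${1:-/proc/swaps}
  [ "$(tail -n +2 "$swaps_file" | grep -c .)" -eq 0 ]
}

# Succeed if an fstab-style file has no uncommented swap entry left.
fstab_has_no_swap() {
  local fstab=${1:-/etc/fstab}
  ! grep -Eq '^[^#].*[[:space:]]swap[[:space:]]' "$fstab"
}
```

Running `swap_is_off && fstab_has_no_swap && echo "swap disabled"` on a node checks both conditions with the default paths.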
Set Kernel Parameters
```bash
# Load the required kernel modules at boot
cat <<EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

# Load them now without rebooting
modprobe overlay
modprobe br_netfilter

# Let iptables see bridged traffic and enable IPv4 forwarding
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
EOF

sysctl --system
```
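To confirm that the three settings are actually in effect, the output of `sysctl` can be checked mechanically. This is a small sketch that assumes the usual `key = value` output format; `check_k8s_sysctls` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Read `sysctl` output ("key = value" lines) on stdin and succeed only if
# all three settings used in this guide are present and equal to 1.
check_k8s_sysctls() {
  awk -F'[ \t]*=[ \t]*' '
    $1 == "net.bridge.bridge-nf-call-iptables"  && $2 == 1 { iptables = 1 }
    $1 == "net.bridge.bridge-nf-call-ip6tables" && $2 == 1 { ip6tables = 1 }
    $1 == "net.ipv4.ip_forward"                 && $2 == 1 { forward = 1 }
    END { if (iptables && ip6tables && forward) exit 0; exit 1 }
  '
}
```

A possible invocation on a node: `sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward | check_k8s_sysctls`.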
Container Runtime Setup
Install Containerd
Following the Support Matrix, install a compatible Containerd version: https://containerd.io/releases/#kubernetes-support
```bash
export CONTAINERD_VERSION="2.1.7"

# Download and unpack the release tarball into /usr/local
wget "https://github.com/containerd/containerd/releases/download/v$CONTAINERD_VERSION/containerd-$CONTAINERD_VERSION-linux-amd64.tar.gz"
tar Czvxf /usr/local containerd-$CONTAINERD_VERSION-linux-amd64.tar.gz

# Install the systemd unit shipped with the same release
mkdir -pv /usr/local/lib/systemd/system
wget -P /usr/local/lib/systemd/system "https://raw.githubusercontent.com/containerd/containerd/v$CONTAINERD_VERSION/containerd.service"

systemctl daemon-reload
systemctl enable containerd --now
```
Install Runc
```bash
export RUNC_VERSION="v1.4.2"

wget -O /usr/local/bin/runc "https://github.com/opencontainers/runc/releases/download/$RUNC_VERSION/runc.amd64"
chmod 755 /usr/local/bin/runc
```
Install the CNI Plugins
```bash
export CNI_PLUGIN_VERSION="v1.9.1"

wget "https://github.com/containernetworking/plugins/releases/download/$CNI_PLUGIN_VERSION/cni-plugins-linux-amd64-$CNI_PLUGIN_VERSION.tgz"

mkdir -pv /opt/cni/bin
tar Czvxf /opt/cni/bin cni-plugins-linux-amd64-$CNI_PLUGIN_VERSION.tgz
```
Configure Containerd
```bash
# Generate the default configuration
mkdir -pv /etc/containerd
containerd config default > /etc/containerd/config.toml

# Use the systemd cgroup driver, matching the kubelet
sed -i 's/^\s*SystemdCgroup\s*=\s*false/SystemdCgroup = true/' /etc/containerd/config.toml

systemctl daemon-reload
systemctl restart containerd
```
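The `sed` substitution silently does nothing if the default config does not contain a `SystemdCgroup = false` line, so it helps to verify the result before moving on. A minimal sketch; `systemd_cgroup_enabled` is a hypothetical helper, and the config path is a parameter so it can be tried on a copy.

```bash
#!/usr/bin/env bash
# Succeed if the given containerd config enables the systemd cgroup driver,
# i.e. contains a "SystemdCgroup = true" line (leading whitespace allowed).
systemd_cgroup_enabled() {
  local config=${1:-/etc/containerd/config.toml}
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$config"
}
```

For example: `systemd_cgroup_enabled || echo "SystemdCgroup is not enabled in config.toml"`.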
Configure a Proxy for Containerd
If the nodes can only pull images through a proxy, configure the proxy for Containerd:
```bash
mkdir -pv /etc/systemd/system/containerd.service.d

cat <<EOF | tee /etc/systemd/system/containerd.service.d/proxy.conf
[Service]
Environment="HTTP_PROXY=http://172.16.16.12:10808"
Environment="HTTPS_PROXY=http://172.16.16.12:10808"
Environment="NO_PROXY=localhost,127.0.0.1,0.0.0.0,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,10.42.0.0/16,10.43.0.0/16,.svc,.cluster.local,.cn,warnerchen.com"
EOF

systemctl daemon-reload
systemctl restart containerd
```
Install Crictl
```bash
export CRICTL_VERSION="v1.32.0"

wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$CRICTL_VERSION/crictl-$CRICTL_VERSION-linux-amd64.tar.gz
tar Czxvf /usr/local/bin crictl-$CRICTL_VERSION-linux-amd64.tar.gz
```
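crictl reads its settings from `/etc/crictl.yaml`, and pointing it at the Containerd socket explicitly avoids warnings about probing default endpoints. The sketch below writes such a file; it assumes Containerd listens on its default socket `/run/containerd/containerd.sock`, and both the `write_crictl_config` helper and the 10 second timeout are illustrative.

```bash
#!/usr/bin/env bash
# Write a crictl configuration that targets the containerd CRI socket.
# The target path is a parameter so the file can be generated elsewhere
# for review before being placed at /etc/crictl.yaml.
write_crictl_config() {
  local target=${1:-/etc/crictl.yaml}
  cat > "$target" <<'EOF'
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
EOF
}
```

After `write_crictl_config` has run as root, `crictl info` should answer without extra flags.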
Install Nerdctl (Optional)
```bash
export NERDCTL_VERSION="2.2.2"

wget "https://github.com/containerd/nerdctl/releases/download/v$NERDCTL_VERSION/nerdctl-$NERDCTL_VERSION-linux-amd64.tar.gz"
tar Czvxf /usr/local/bin nerdctl-$NERDCTL_VERSION-linux-amd64.tar.gz
```
Install Kubernetes
Install kubeadm / kubelet / kubectl on Every Node (kubectl Is Only Needed on Control Plane Nodes)
```bash
apt update
apt install -y apt-transport-https ca-certificates curl gpg

# Add the signing key and the v1.32 package repository
mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /' | tee /etc/apt/sources.list.d/kubernetes.list

# List the available versions
apt update
apt-cache madison kubelet kubeadm kubectl

# Install, then hold the packages so routine upgrades do not change them
apt install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
```
Initialize the First Control Plane Node
```bash
kubeadm init \
  --control-plane-endpoint "172.16.16.140:6443" \
  --upload-certs \
  --pod-network-cidr=10.42.0.0/16 \
  --service-cidr=10.43.0.0/16
```
Set up the kubeconfig:
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Enable kubectl shell completion
echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc
```
Install the CNI
Calico is used here. Pick a version compatible with the cluster according to the Support Matrix: https://docs.tigera.io/calico/latest/getting-started/kubernetes/requirements#supported-versions
```bash
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
```
Join the Second and Third Control Plane Nodes
```bash
kubeadm join 172.16.16.140:6443 --token aaa \
  --discovery-token-ca-cert-hash sha256:bbb \
  --control-plane --certificate-key ccc
```
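The token, CA hash, and certificate key (`aaa`, `bbb`, `ccc` above) are printed by `kubeadm init`, but by default the token expires after 24 hours and the uploaded certificates after 2 hours. If they have expired, new values can be generated on an existing Control Plane node. A sketch using `kubeadm token create --print-join-command` and `kubeadm init phase upload-certs --upload-certs`; the wrapper function `control_plane_join_command` is hypothetical and relies on the certificate key being the last line of the `upload-certs` output.

```bash
#!/usr/bin/env bash
# Print a fresh join command for an additional control-plane node.
# Run as root on a node that is already part of the control plane.
control_plane_join_command() {
  local join cert_key
  # Creates a new bootstrap token and prints the matching worker join command
  join=$(kubeadm token create --print-join-command) || return 1
  # Re-uploads the control-plane certificates; the key is on the last line
  cert_key=$(kubeadm init phase upload-certs --upload-certs | tail -n 1) || return 1
  [ -n "$cert_key" ] || return 1
  # Trim any trailing whitespace from the join command before appending flags
  echo "${join%"${join##*[![:space:]]}"} --control-plane --certificate-key $cert_key"
}
```

Without the two extra flags, the output of `kubeadm token create --print-join-command` alone is the command for joining a Worker node.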
Join the Worker Nodes
```bash
kubeadm join 172.16.16.140:6443 --token aaa \
  --discovery-token-ca-cert-hash sha256:bbb
```
Once all Worker nodes have joined, label them with the worker role:
```bash
kubectl label node test-3 node-role.kubernetes.io/worker=''

kubectl get nodes
# NAME     STATUS   ROLES           AGE     VERSION
# test-0   Ready    control-plane   23m     v1.32.13
# test-1   Ready    control-plane   9m6s    v1.32.13
# test-2   Ready    control-plane   4m10s   v1.32.13
# test-3   Ready    worker          3m8s    v1.32.13
```
etcd Snapshot Backup and Restore
Set Up the etcd Commands
etcd operations need etcdctl / etcdutl. On a Control Plane node, the binaries can be obtained with the following commands:
```bash
cp "$(find / -name etcdctl 2>/dev/null | grep -v snapshot | head -n1)" /usr/local/bin/
cp "$(find / -name etcdutl 2>/dev/null | grep -v snapshot | head -n1)" /usr/local/bin/
```
etcdctl needs the certificate paths and the endpoints passed as parameters:
```bash
ETCDCTL_API=3 etcdctl member list -w table \
  --endpoints=xxx \
  --cacert=xxx \
  --cert=xxx \
  --key=xxx
```
These values can be obtained with kubectl and from /etc/kubernetes/manifests/etcd.yaml:
```bash
grep 'file' /etc/kubernetes/manifests/etcd.yaml | grep -v 'peer'
#     - --cert-file=/etc/kubernetes/pki/etcd/server.crt
#     - --key-file=/etc/kubernetes/pki/etcd/server.key
#     - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt

export ETCDCTL_API=3
# Build a comma-separated endpoint list from the INTERNAL-IP column of the control-plane nodes
export ETCDCTL_ENDPOINTS=$(kubectl get nodes -o wide \
  | awk '/etcd|control-plane/ {ips[$6]=1} END {for (ip in ips) printf "https://%s:2379,", ip}' \
  | sed 's/,$//')
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
```
Take an etcd Snapshot
```bash
mkdir -pv ~/etcd-backup

export ETCDCTL_API=3
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# A snapshot must be taken from a single endpoint
etcdctl snapshot save ~/etcd-backup/etcd-backup-0.db \
  --endpoints=https://172.16.16.140:2379

ls -l ~/etcd-backup
# total 5968
# -rw------- 1 root root 6107168 Apr 15 16:48 etcd-backup-0.db
```
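For recurring backups, a fixed file name such as `etcd-backup-0.db` gets overwritten, and an unbounded directory eventually fills the disk. The following sketch shows one possible naming and retention scheme; the helper names, the timestamp format, and the default of keeping 7 snapshots are all assumptions rather than anything etcd prescribes.

```bash
#!/usr/bin/env bash
# Print a timestamped snapshot file name, e.g. etcd-backup-20260415-164800.db.
snapshot_name() {
  printf 'etcd-backup-%s.db\n' "$(date +%Y%m%d-%H%M%S)"
}

# Keep only the newest $keep snapshot files (*.db) in $dir and delete the
# rest. Ordering is by modification time; file names are assumed not to
# contain newlines.
prune_etcd_backups() {
  local dir=$1 keep=${2:-7}
  ls -1t "$dir"/*.db 2>/dev/null | tail -n +"$((keep + 1))" | while IFS= read -r f; do
    rm -f -- "$f"
  done
}
```

A backup job could then run `etcdctl snapshot save ~/etcd-backup/"$(snapshot_name)" --endpoints=https://172.16.16.140:2379` followed by `prune_etcd_backups ~/etcd-backup 7`.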
Check the Snapshot Status
```bash
export ETCDCTL_API=3
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

etcdutl snapshot status /root/etcd-backup/etcd-backup-0.db -w table
# +----------+----------+------------+------------+
# |   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
# +----------+----------+------------+------------+
# | 348ec281 |    13267 |       2285 |     6.1 MB |
# +----------+----------+------------+------------+
```
Restore an etcd Snapshot
```bash
kubectl -n kube-system delete pod etcd-test-0

# Stop kubelet while the data directory is being replaced
systemctl stop kubelet

export ETCDCTL_API=3
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# Restore the snapshot into a new data directory
mkdir -pv /var/lib/etcd-restore
etcdutl --data-dir="/var/lib/etcd-restore" snapshot restore ~/etcd-backup/etcd-backup-0.db

# Point the etcd data hostPath volume at the restored directory (/var/lib/etcd-restore)
vim /etc/kubernetes/manifests/etcd.yaml

systemctl start kubelet
```
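The manual edit of `etcd.yaml` is easy to get wrong, because `/var/lib/etcd` appears several times in the manifest: as `--data-dir`, as the container `mountPath`, and as the `hostPath` of the data volume. Only the `hostPath` needs to change when the restored data lives in a new host directory. A sketch of that edit with `sed`, assuming the manifest layout that kubeadm generates; `point_etcd_hostpath` is a hypothetical helper, and it is worth backing up the manifest before running it.

```bash
#!/usr/bin/env bash
# Rewrite only the hostPath of the etcd data volume in a kubeadm-style
# etcd.yaml. Lines of the form "path: /var/lib/etcd" are changed; the
# --data-dir flag and the mountPath inside the container stay as they are.
point_etcd_hostpath() {
  local manifest=$1 new_dir=$2
  sed -i "s#^\([[:space:]]*path:[[:space:]]*\)/var/lib/etcd[[:space:]]*\$#\1${new_dir}#" "$manifest"
}
```

For example: `cp /etc/kubernetes/manifests/etcd.yaml /root/etcd.yaml.bak && point_etcd_hostpath /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd-restore`.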
Upgrade Kubernetes
To upgrade to v1.33, run the following on all nodes:
```bash
# Switch the package repository from v1.32 to v1.33
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.33/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes.gpg] https://pkgs.k8s.io/core:/stable:/v1.33/deb/ /' | tee /etc/apt/sources.list.d/kubernetes.list

# List the available versions
apt update
apt-cache madison kubeadm kubelet kubectl
```
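kubeadm does not support skipping minor versions, so a cluster on v1.32 has to go to v1.33 before any later release. The sketch below checks a planned upgrade against that rule; `upgrade_path_ok` is a hypothetical helper that only compares version strings and does not contact the cluster.

```bash
#!/usr/bin/env bash
# Succeed if upgrading from $1 to $2 (e.g. v1.32.13 -> v1.33.10) stays on
# the same major version and moves forward by at most one minor version.
upgrade_path_ok() {
  local cur=${1#v} tgt=${2#v}
  local cur_major cur_minor tgt_major tgt_minor rest
  IFS=. read -r cur_major cur_minor rest <<<"$cur"
  IFS=. read -r tgt_major tgt_minor rest <<<"$tgt"
  (( cur_major == tgt_major
     && tgt_minor - cur_minor >= 0
     && tgt_minor - cur_minor <= 1 ))
}
```

For example: `upgrade_path_ok v1.32.13 v1.33.10 && echo "upgrade path is supported"`.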
Upgrade the First Control Plane Node
```bash
# Upgrade kubeadm first
apt-mark unhold kubeadm
apt-get update && sudo apt-get install -y kubeadm=1.33.10-1.1
apt-mark hold kubeadm

# Review the upgrade plan, then apply it
kubeadm upgrade plan
kubeadm upgrade apply v1.33.10

# Upgrade kubelet and kubectl
apt-mark unhold kubelet kubectl
apt-get install -y kubelet=1.33.10-1.1 kubectl=1.33.10-1.1
apt-mark hold kubelet kubectl

systemctl daemon-reload
systemctl restart kubelet
```
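Upgrade the CNI
The Calico version installed earlier is compatible with Kubernetes v1.33, so it is not upgraded here. If it were not compatible, it would have to be upgraded manually.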
Upgrade the Second and Third Control Plane Nodes
```bash
# Upgrade kubeadm first
apt-mark unhold kubeadm
apt-get update && sudo apt-get install -y kubeadm=1.33.10-1.1
apt-mark hold kubeadm

# On additional control-plane nodes use "upgrade node" instead of "upgrade apply"
kubeadm upgrade node

# Upgrade kubelet and kubectl
apt-mark unhold kubelet kubectl
apt-get install -y kubelet=1.33.10-1.1 kubectl=1.33.10-1.1
apt-mark hold kubelet kubectl

systemctl daemon-reload
systemctl restart kubelet
```
Upgrade the Worker Nodes
Drain the Pods from the node:
```bash
kubectl drain test-3 --ignore-daemonsets --delete-emptydir-data
```
Upgrade the Worker node:
```bash
apt-mark unhold kubeadm kubelet
apt-get update
apt-get install -y kubeadm=1.33.10-1.1 kubelet=1.33.10-1.1
apt-mark hold kubeadm kubelet

kubeadm upgrade node

systemctl daemon-reload
systemctl restart kubelet
```
Resume scheduling:
```bash
kubectl uncordon test-3
```
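After the last node is uncordoned, every node should be `Ready`, schedulable, and reporting the target version. This can be checked from the `kubectl get nodes` output. A minimal sketch, assuming the default column layout with STATUS in the second column and VERSION in the last; `nodes_on_version` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Read `kubectl get nodes` output on stdin. Print every node that is not
# exactly "Ready" (a cordoned node shows "Ready,SchedulingDisabled") or
# that reports a kubelet version other than the wanted one, and fail if
# any such node exists or if no node is listed at all.
nodes_on_version() {
  local want=$1
  awk -v want="$want" '
    NR > 1 && ($2 != "Ready" || $NF != want) { print $1, $2, $NF; bad = 1 }
    END { if (bad || NR < 2) exit 1 }
  '
}
```

For example: `kubectl get nodes | nodes_on_version v1.33.10 && echo "upgrade complete"`.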