Version Upgrade

Overview: upgrade the cluster from the current version 1.33.3 to 1.35.2.

Environment

Item                     Description
Current cluster version  1.33.3
Target version           1.35.2
Master nodes             1
Worker nodes             2
CNI                      Cilium 1.17.6
etcd                     External nodes, accessed over HTTP
containerd version       2.1.3
kube-proxy               Not deployed; replaced by Cilium
kubelet                  Binary deployment, managed as a systemd service
Host OS                  Ubuntu 24.04
Registry                 All images must be served from the internal registry

Upgrade Strategy

Plan summary:

A rolling upgrade, one minor version per stage: 1.33.3 -> 1.33.9 -> 1.34.5 -> 1.35.2

The officially recommended upgrade plan

  1. Upgrade to the latest patch release of the current minor version
  2. Upgrade to the latest patch release of the next minor version
  3. Repeat until the target minor version is reached

Notes:

  1. Minor versions cannot be skipped; upgrade them one at a time
  2. Patch versions can be skipped, e.g. 1.33.1 -> 1.33.8

The kubeadm binary can manage clusters at most one minor version behind itself; for example, a 1.33 cluster can be upgraded with a kubeadm no newer than 1.34.

Other possible upgrade approaches

  • You can skip the latest patch of the current minor version and go straight to the next minor version. When it has internet access, however, kubeadm checks the published releases and the upgrade plan will fail, insisting on the latest patch of the current minor version first. To force the jump, run the upgrade offline or disable the check via flags (see the hedged example below).

  • The Kubernetes version-skew policy allows worker nodes to lag the control plane by up to three minor versions, so in principle all worker nodes could be upgraded once at the very end. kubeadm does not support this, though; consider it only for purely binary deployments, and only after thorough testing.
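For reference, a hedged sketch of forcing past such checks: kubeadm upgrade apply accepts --force (it also appears in the rollback plan below), which bypasses some of the safety checks. Whether it covers the latest-patch check in your exact version is an assumption, so treat this as a last resort and rehearse offline first.

# Hypothetical: jump a minor version despite a failed latest-patch check
kubeadm upgrade apply v1.34.5 --config kubeadm-upgrade.yaml --force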

Upgrade Precautions

Additional item - internal registry address change

  • Old address: registry.services.wait/k8s
  • New address: harbor.services.wait/registry.k8s.io

Preparation

Checklist

  1. Version compatibility check, especially against other add-ons
  2. Backup of critical data
  3. New binaries staged
  4. Required images staged

Download the binaries

You need the latest patch release of every minor version between the current version and the target.

# Official repository
https://github.com/kubernetes/kubernetes

# Release notes for the three minor versions
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.33.md
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.34.md
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.35.md

# Download the server tarballs
https://dl.k8s.io/v1.33.9/kubernetes-server-linux-amd64.tar.gz
https://dl.k8s.io/v1.34.5/kubernetes-server-linux-amd64.tar.gz
https://dl.k8s.io/v1.35.2/kubernetes-server-linux-amd64.tar.gz
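A hedged download-and-verify sketch; dl.k8s.io publishes a checksum file next to each artifact, but the .sha256 companion naming used below is an assumption, so confirm it before scripting against it:

# Download each tarball, verify its checksum, and unpack into <version>/
for v in v1.33.9 v1.34.5 v1.35.2; do
    curl -LO "https://dl.k8s.io/$v/kubernetes-server-linux-amd64.tar.gz"
    curl -LO "https://dl.k8s.io/$v/kubernetes-server-linux-amd64.tar.gz.sha256"
    echo "$(cat kubernetes-server-linux-amd64.tar.gz.sha256)  kubernetes-server-linux-amd64.tar.gz" | sha256sum --check
    mkdir -p "${v#v}" && tar -xzf kubernetes-server-linux-amd64.tar.gz -C "${v#v}"
done

This produces the 1.33.9/kubernetes/server/bin directory layout that the rest of this document assumes.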

Image preparation

  1. List the required images
# Preview the required upstream images
root@h22:~# kubeadm config images list --kubernetes-version=v1.33.9
registry.k8s.io/kube-apiserver:v1.33.9
registry.k8s.io/kube-controller-manager:v1.33.9
registry.k8s.io/kube-scheduler:v1.33.9
registry.k8s.io/kube-proxy:v1.33.9
registry.k8s.io/coredns/coredns:v1.12.0
registry.k8s.io/pause:3.10
registry.k8s.io/etcd:3.5.24-0

# The same list with the internal registry specified
root@h22:~# kubeadm config images list --kubernetes-version=v1.33.9 --image-repository=harbor.services.wait/registry.k8s.io
harbor.services.wait/registry.k8s.io/kube-apiserver:v1.33.9
harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.33.9
harbor.services.wait/registry.k8s.io/kube-scheduler:v1.33.9
harbor.services.wait/registry.k8s.io/kube-proxy:v1.33.9
harbor.services.wait/registry.k8s.io/coredns:v1.12.0    # note: one path level is dropped when a repository override is set
harbor.services.wait/registry.k8s.io/pause:3.10
harbor.services.wait/registry.k8s.io/etcd:3.5.24-0
  2. Pre-pull the images

The images are large; without an internal registry it is best to pull them on each host ahead of time.

# Pre-pull on each host - from the upstream registry
kubeadm config images pull --kubernetes-version=v1.33.9

# Pull from the internal registry
kubeadm config images pull --kubernetes-version=v1.33.9 --image-repository=harbor.services.wait/registry.k8s.io

# Pulling with crictl
crictl pull harbor.services.wait/registry.k8s.io/kube-apiserver:v1.33.9
crictl pull harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.33.9
crictl pull harbor.services.wait/registry.k8s.io/kube-scheduler:v1.33.9
  3. Stage the images in the internal registry

The kube-apiserver, kube-controller-manager, and kube-scheduler images ship inside the official server tarball:

docker image load < 1.33.9/kubernetes/server/bin/kube-apiserver.tar
docker image load < 1.33.9/kubernetes/server/bin/kube-controller-manager.tar
docker image load < 1.33.9/kubernetes/server/bin/kube-scheduler.tar

docker image load < 1.34.5/kubernetes/server/bin/kube-apiserver.tar
docker image load < 1.34.5/kubernetes/server/bin/kube-controller-manager.tar
docker image load < 1.34.5/kubernetes/server/bin/kube-scheduler.tar

docker image load < 1.35.2/kubernetes/server/bin/kube-apiserver.tar
docker image load < 1.35.2/kubernetes/server/bin/kube-controller-manager.tar
docker image load < 1.35.2/kubernetes/server/bin/kube-scheduler.tar

# When retagging, drop the trailing amd64 suffix
docker tag registry.k8s.io/kube-apiserver-amd64:v1.33.9 harbor.services.wait/registry.k8s.io/kube-apiserver:v1.33.9
docker tag registry.k8s.io/kube-controller-manager-amd64:v1.33.9 harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.33.9
docker tag registry.k8s.io/kube-scheduler-amd64:v1.33.9 harbor.services.wait/registry.k8s.io/kube-scheduler:v1.33.9

docker tag registry.k8s.io/kube-apiserver-amd64:v1.34.5 harbor.services.wait/registry.k8s.io/kube-apiserver:v1.34.5
docker tag registry.k8s.io/kube-controller-manager-amd64:v1.34.5 harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.34.5
docker tag registry.k8s.io/kube-scheduler-amd64:v1.34.5 harbor.services.wait/registry.k8s.io/kube-scheduler:v1.34.5

docker tag registry.k8s.io/kube-apiserver-amd64:v1.35.2 harbor.services.wait/registry.k8s.io/kube-apiserver:v1.35.2
docker tag registry.k8s.io/kube-controller-manager-amd64:v1.35.2 harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.35.2
docker tag registry.k8s.io/kube-scheduler-amd64:v1.35.2 harbor.services.wait/registry.k8s.io/kube-scheduler:v1.35.2

# Push to the internal registry
docker push harbor.services.wait/registry.k8s.io/kube-apiserver:v1.33.9
docker push harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.33.9
docker push harbor.services.wait/registry.k8s.io/kube-scheduler:v1.33.9

docker push harbor.services.wait/registry.k8s.io/kube-apiserver:v1.34.5
docker push harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.34.5
docker push harbor.services.wait/registry.k8s.io/kube-scheduler:v1.34.5

docker push harbor.services.wait/registry.k8s.io/kube-apiserver:v1.35.2
docker push harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.35.2
docker push harbor.services.wait/registry.k8s.io/kube-scheduler:v1.35.2
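The load/tag/push sequence above is mechanical, so it can be collapsed into a loop. A hedged sketch, assuming the same directory layout and that docker image load prints "Loaded image: <name>" as its last output line (worth confirming on your Docker version):

REPO="harbor.services.wait/registry.k8s.io"
for v in 1.33.9 1.34.5 1.35.2; do
    for comp in kube-apiserver kube-controller-manager kube-scheduler; do
        # take the loaded image reference from docker's final output line
        src=$(docker image load < "$v/kubernetes/server/bin/$comp.tar" | awk 'END {print $NF}')
        docker tag "$src" "$REPO/$comp:v$v"
        docker push "$REPO/$comp:v$v"
    done
done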


# Other required images
harbor.services.wait/registry.k8s.io/coredns:v1.12.0
harbor.services.wait/registry.k8s.io/coredns:v1.13.1
harbor.services.wait/registry.k8s.io/pause:3.10


skopeo copy --override-arch amd64 --override-os linux \
    docker://registry.k8s.io/pause:3.10 \
    docker://harbor.services.wait/registry.k8s.io/pause:3.10

# Note the changed path
skopeo copy --override-arch amd64 --override-os linux \
    docker://registry.k8s.io/coredns/coredns:v1.12.0 \
    docker://harbor.services.wait/registry.k8s.io/coredns:v1.12.0

# Needed when upgrading to 1.35
skopeo copy --override-arch amd64 --override-os linux \
    docker://registry.k8s.io/coredns/coredns:v1.13.1 \
    docker://harbor.services.wait/registry.k8s.io/coredns:v1.13.1
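After copying, it is worth spot-checking that the images actually resolve from the internal registry; a quick hedged check with skopeo inspect:

# Spot-check one copied image (repeat for any image you are unsure about)
skopeo inspect docker://harbor.services.wait/registry.k8s.io/coredns:v1.13.1 \
    | grep -E '"Name"|"Digest"'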

Upload the binaries

Strictly speaking, the worker nodes do not need kubeadm or kubectl uploaded.

# Upload to a single host
ssh root@h22 'mkdir /root/tmp'
scp 1.33.9/kubernetes/server/bin/kubeadm root@h22:/root/tmp/kubeadm-1.33.9
scp 1.33.9/kubernetes/server/bin/kubectl root@h22:/root/tmp/kubectl-1.33.9
scp 1.33.9/kubernetes/server/bin/kubelet root@h22:/root/tmp/kubelet-1.33.9

scp 1.34.5/kubernetes/server/bin/kubeadm root@h22:/root/tmp/kubeadm-1.34.5
scp 1.34.5/kubernetes/server/bin/kubectl root@h22:/root/tmp/kubectl-1.34.5
scp 1.34.5/kubernetes/server/bin/kubelet root@h22:/root/tmp/kubelet-1.34.5

scp 1.35.2/kubernetes/server/bin/kubeadm root@h22:/root/tmp/kubeadm-1.35.2
scp 1.35.2/kubernetes/server/bin/kubectl root@h22:/root/tmp/kubectl-1.35.2
scp 1.35.2/kubernetes/server/bin/kubelet root@h22:/root/tmp/kubelet-1.35.2

A distribution script

#!/bin/bash

NODES=("h21" "h22" "h23")
VERSIONS=("1.33.9" "1.34.5" "1.35.2")   # versions to distribute
COMPONENTS=("kubeadm" "kubectl" "kubelet")
REMOTE_TMP="/root/tmp"                  # remote staging directory

# 1. Create the staging directory on every node
echo ">>> Step 1: Creating remote directories..."
for node in "${NODES[@]}"; do
    ssh "root@$node" "mkdir -p $REMOTE_TMP"
done

# 2. Distribute the files
echo ">>> Step 2: Distributing binaries..."
for version in "${VERSIONS[@]}"; do
    echo "Processing version: $version"
    for comp in "${COMPONENTS[@]}"; do

        SRC_PATH="$version/kubernetes/server/bin/$comp"

        # Skip if the local file is missing
        if [ ! -f "$SRC_PATH" ]; then
            echo "Error: Local file $SRC_PATH not found, skipping."
            continue
        fi

        for node in "${NODES[@]}"; do
            DEST_FILE="$REMOTE_TMP/${comp}-${version}"
            echo "Sending $comp ($version) to $node..."
            scp "$SRC_PATH" "root@$node:$DEST_FILE"
            ssh "root@$node" "chmod +x $DEST_FILE"
        done
    done
done

echo ">>> All binaries have been distributed successfully!"

Prepare the upgrade configuration

  1. The cluster runs without kube-proxy, so that phase must be skipped
  2. etcd is external, so kubeadm's etcd handling must be disabled

kubeadm-upgrade.yaml

apiVersion: kubeadm.k8s.io/v1beta4
kind: UpgradeConfiguration
# --- upgrade behavior ---
apply:
  # Key: skip the kube-proxy phase so kubeadm does not look for a ConfigMap that does not exist
  skipPhases:
    - addon/kube-proxy
  # External etcd: kubeadm must not touch it
  etcdUpgrade: false
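Before relying on this file, it can help to confirm what the cluster itself has recorded; a quick check of the etcd and repository settings stored in the kubeadm-config ConfigMap:

# Inspect the ClusterConfiguration stored in the cluster
kubectl -n kube-system get cm kubeadm-config \
    -o jsonpath='{.data.ClusterConfiguration}' | grep -B1 -A3 -E 'etcd|imageRepository'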

Back up the cluster

  1. Back up key cluster ConfigMaps

# The in-cluster kubeadm-config
kubectl -n kube-system get configmap kubeadm-config -o yaml > kubeadm-config.backup.yaml
  2. Back up the key control-plane directories
sudo cp -r /etc/kubernetes /etc/kubernetes.bak.$(date +%Y%m%d)

# Note: /var/lib/kubelet/pods holds per-Pod data and can be skipped
sudo cp -r /var/lib/kubelet /var/lib/kubelet.bak.$(date +%Y%m%d)
  3. Back up etcd - external etcd (a snapshot-verification sketch follows this list)
ETCDCTL_API=3 etcdctl --endpoints="http://192.168.5.9:2379" \
  snapshot save etcd-snapshot-$(date +%Y%m%d).db
  4. Record the current cluster state - for comparison after a rollback
kubeadm version
kubectl version
cilium version
kubectl get nodes -o wide > pre-upgrade-nodes.txt
kubectl get pods --all-namespaces > pre-upgrade-pods.txt
kubectl get all --all-namespaces > pre-upgrade-all-resources.txt
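A backup is only useful if it restores, so verify the snapshot file before moving on. A hedged check (etcdctl snapshot status is deprecated in newer etcd releases in favor of etcdutl; use whichever binary your etcd version ships):

# Sanity-check the snapshot metadata (hash, revision, total keys, size)
ETCDCTL_API=3 etcdctl snapshot status etcd-snapshot-$(date +%Y%m%d).db -w table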

Additional task: switch the internal registry address

kubeadm upgrade apply does not support an --image-repository flag. Even if you pass --config with a ClusterConfiguration that sets the new repository, apply still pulls images from the repository recorded in the cluster's ConfigMap.

So the repository address that kubeadm stores in the cluster must be changed beforehand.

Edit the repository setting in the kubeadm-config ConfigMap

kubectl edit cm kubeadm-config -n kube-system

# Change this field to the new repository address
imageRepository: harbor.services.wait/registry.k8s.io
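If an interactive edit is undesirable, a hedged non-interactive variant; the old address registry.services.wait/k8s comes from the section above, so verify the exact string in your ConfigMap before piping to apply:

# Rewrite imageRepository in place (inspect the sed output before applying)
kubectl -n kube-system get cm kubeadm-config -o yaml \
    | sed 's#imageRepository: registry.services.wait/k8s#imageRepository: harbor.services.wait/registry.k8s.io#' \
    | kubectl apply -f -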

Stage 1: upgrade to v1.33.9

The target is 1.33.9.

Overview

  1. Upgrade kubeadm
  2. Preview the upgrade plan
  3. Apply the upgrade
  4. Upgrade kubelet and kubectl on the master node
  5. Upgrade kubelet and kubectl on the worker nodes

Control-plane node upgrade

  1. Upgrade kubeadm
cp /usr/local/bin/kubeadm /usr/local/bin/kubeadm-1.33.3
mv /root/tmp/kubeadm-1.33.9 /usr/local/bin/kubeadm
kubeadm version
  2. Preview the upgrade plan

First confirm that the preflight checks pass:


kubeadm upgrade apply v1.33.9 --config kubeadm-upgrade.yaml --dry-run
  3. Apply the cluster upgrade

kubeadm upgrade apply is idempotent: if it fails, fix the problem and simply run it again.


# kubeadm upgrade apply v1.33.9 --config kubeadm-upgrade.yaml

root@h22:~# kubeadm upgrade apply v1.33.9 --config kubeadm-upgrade.yaml
W0312 12:26:17.570002 3308720 upgradeconfiguration.go:43] [config] WARNING: Ignored configuration document with GroupVersionKind kubeadm.k8s.io/v1beta4, Kind=ClusterConfiguration
[upgrade] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[upgrade] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0312 12:26:17.929493 3308720 configset.go:78] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" not found
[upgrade/preflight] Running preflight checks
[upgrade] Running cluster health checks
[upgrade/preflight] You have chosen to upgrade the cluster version to "v1.33.9"
[upgrade/versions] Cluster version: v1.33.3
[upgrade/versions] kubeadm version: v1.33.9
[upgrade] Are you sure you want to proceed? [y/N]: y
[upgrade/preflight] Pulling images required for setting up a Kubernetes cluster
[upgrade/preflight] This might take a minute or two, depending on the speed of your internet connection
[upgrade/preflight] You can also perform this action beforehand using 'kubeadm config images pull'
[upgrade/control-plane] Upgrading your static Pod-hosted control plane to version "v1.33.9" (timeout: 5m0s)...
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests2037152466"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Moving new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backing up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2026-03-12-12-26-28/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This can take up to 5m0s
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moving new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backing up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2026-03-12-12-26-28/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This can take up to 5m0s
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moving new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backing up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2026-03-12-12-26-28/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This can take up to 5m0s
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upgrade/control-plane] The control plane instance for this node was successfully upgraded!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upgrade/kubeconfig] The kubeconfig files for this node were successfully upgraded!
W0312 12:27:44.867273 3308720 postupgrade.go:117] Using temporary directory /etc/kubernetes/tmp/kubeadm-kubelet-config2442112002 for kubelet config. To override it set the environment variable KUBEADM_UPGRADE_DRYRUN_DIR
[upgrade] Backing up kubelet config file to /etc/kubernetes/tmp/kubeadm-kubelet-config2442112002/config.yaml
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade/kubelet-config] The kubelet configuration for this node was successfully upgraded!
[upgrade/bootstrap-token] Configuring bootstrap token and cluster-info RBAC rules
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS

[upgrade] SUCCESS! A control plane node of your cluster was upgraded to "v1.33.9".

[upgrade] Now please proceed with upgrading the rest of the nodes by following the right order.
  4. Check the version
root@h22:~# kubectl  version
Client Version: v1.33.3
Kustomize Version: v5.6.0
Server Version: v1.33.9
  5. Upgrade kubelet

kubelet upgrade flow: "drain -> replace -> restart -> uncordon"

# Drain the node first (this disrupts the Pods on it; plan ahead)
kubectl drain h22 --ignore-daemonsets --delete-emptydir-data

systemctl stop kubelet

cp /usr/local/bin/kubelet /usr/local/bin/kubelet.v1.33.3
mv /root/tmp/kubelet-1.33.9 /usr/local/bin/kubelet
chmod +x /usr/local/bin/kubelet

# Restart kubelet
systemctl daemon-reload
systemctl restart kubelet

# Check the service status
systemctl status kubelet

# Watch the logs for errors caused by deprecated flags
journalctl -u kubelet -f

# Make the node schedulable again
kubectl uncordon h22

# Verify node status and confirm the version
kubectl get nodes -o wide
  6. Verify node status and confirm the version

The control-plane node has been upgraded:

root@h22:~# kubectl get nodes -o wide
NAME   STATUS   ROLES           AGE    VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
h21    Ready    <none>          207d   v1.33.3   192.168.5.21   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h22    Ready    control-plane   229d   v1.33.9   192.168.5.22   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h23    Ready    <none>          229d   v1.33.3   192.168.5.23   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
  7. Upgrade kubectl

cp /usr/local/bin/kubectl /usr/local/bin/kubectl.v1.33.3
mv /root/tmp/kubectl-1.33.9 /usr/local/bin/kubectl

Worker node upgrade

Flow

  1. kubectl drain to evict the node's Pods
  2. kubeadm upgrade node
  3. Upgrade the binaries (apt, yum, or replace the binary directly)
  4. Restart the kubelet service
  5. kubectl uncordon to make the node schedulable again

Replace each binary the same way as in the control-plane steps above; a condensed per-worker sketch follows the command below.

kubeadm upgrade node
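A hedged end-to-end sketch for one worker (h21 here; repeat for h23). It assumes the new kubeadm and kubelet binaries were staged on the worker as in the upload step; drain and uncordon run from a host with kubectl access, the rest on the worker itself:

# From the control plane: drain the worker
kubectl drain h21 --ignore-daemonsets --delete-emptydir-data

# On the worker: refresh the local kubelet config, then swap the binary
kubeadm upgrade node
systemctl stop kubelet
cp /usr/local/bin/kubelet /usr/local/bin/kubelet.v1.33.3
mv /root/tmp/kubelet-1.33.9 /usr/local/bin/kubelet
chmod +x /usr/local/bin/kubelet
systemctl daemon-reload
systemctl start kubelet

# From the control plane: make the worker schedulable again
kubectl uncordon h21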

Verification

# Confirm every node reports v1.33.9
kubectl get nodes -o wide

# Check the core components
kubectl get pods -n kube-system

# Check that workload Pods are healthy
kubectl get pods --all-namespaces

# Verify Cilium status
cilium status

All nodes are now on v1.33.9:

wait@ub05:~/local_tmp$ kubectl get nodes -o wide
NAME   STATUS   ROLES           AGE    VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
h21    Ready    <none>          207d   v1.33.9   192.168.5.21   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h22    Ready    control-plane   229d   v1.33.9   192.168.5.22   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h23    Ready    <none>          229d   v1.33.9   192.168.5.23   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3

Stage 2: upgrade to v1.34.5

The target is 1.34.5.

Preparation

  1. Reuse the same kubeadm-upgrade.yaml file

Use the images staged earlier:

root@h22:~# kubeadm config images list --kubernetes-version=v1.34.5
registry.k8s.io/kube-apiserver:v1.34.5
registry.k8s.io/kube-controller-manager:v1.34.5
registry.k8s.io/kube-scheduler:v1.34.5
registry.k8s.io/kube-proxy:v1.34.5
registry.k8s.io/coredns/coredns:v1.12.1
registry.k8s.io/pause:3.10.1
registry.k8s.io/etcd:3.6.5-0


root@h22:~# kubeadm config images list --kubernetes-version=v1.34.5 --image-repository=harbor.services.wait/registry.k8s.io
harbor.services.wait/registry.k8s.io/kube-apiserver:v1.34.5
harbor.services.wait/registry.k8s.io/kube-controller-manager:v1.34.5
harbor.services.wait/registry.k8s.io/kube-scheduler:v1.34.5
harbor.services.wait/registry.k8s.io/kube-proxy:v1.34.5
harbor.services.wait/registry.k8s.io/coredns:v1.12.1
harbor.services.wait/registry.k8s.io/pause:3.10.1
harbor.services.wait/registry.k8s.io/etcd:3.6.5-0
  2. Upgrade kubeadm
cp /usr/local/bin/kubeadm /usr/local/bin/kubeadm.v1.33.9
mv /root/tmp/kubeadm-1.34.5 /usr/local/bin/kubeadm

kubeadm version

Upgrade the cluster

  1. Preview the upgrade
kubeadm upgrade apply v1.34.5 --config kubeadm-upgrade.yaml --dry-run

# The final line should report success
[upgrade/successful] Finished dryrunning successfully!
  2. Apply the upgrade
kubeadm upgrade apply v1.34.5 --config kubeadm-upgrade.yaml
  3. Check the version

root@h22:~# kubectl version
Client Version: v1.33.9
Kustomize Version: v5.6.0
Server Version: v1.34.5

Upgrade kubelet

Follow the same kubelet upgrade procedure as in stage 1.

The flow is unchanged (a condensed sketch follows the list):

  1. Drain the node
  2. Stop kubelet
  3. Replace and start kubelet
  4. Uncordon the node
  5. Upgrade kubectl
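Condensed from the stage-1 procedure, for the control-plane node h22:

kubectl drain h22 --ignore-daemonsets --delete-emptydir-data
systemctl stop kubelet
cp /usr/local/bin/kubelet /usr/local/bin/kubelet.v1.33.9
mv /root/tmp/kubelet-1.34.5 /usr/local/bin/kubelet
chmod +x /usr/local/bin/kubelet
systemctl daemon-reload && systemctl start kubelet
kubectl uncordon h22

cp /usr/local/bin/kubectl /usr/local/bin/kubectl.v1.33.9
mv /root/tmp/kubectl-1.34.5 /usr/local/bin/kubectl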

Confirm the version state (one control-plane node is done):

root@h22:~# kubectl get nodes -o wide
NAME   STATUS   ROLES           AGE    VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
h21    Ready    <none>          207d   v1.33.9   192.168.5.21   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h22    Ready    control-plane   230d   v1.34.5   192.168.5.22   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h23    Ready    <none>          229d   v1.33.9   192.168.5.23   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3

Upgrade the remaining worker nodes

Final state:

root@h22:~# kubectl get nodes -o wide
NAME   STATUS   ROLES           AGE    VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
h21    Ready    <none>          207d   v1.34.5   192.168.5.21   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h22    Ready    control-plane   230d   v1.34.5   192.168.5.22   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h23    Ready    <none>          229d   v1.34.5   192.168.5.23   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3

Stage 3: upgrade to v1.35.2

The target is 1.35.2.

Overview

  1. Upgrade kubeadm
  2. Pull the required images
  3. Preview the upgrade
  4. Apply the cluster upgrade
  5. Upgrade kubelet on the control plane
  6. Upgrade kubelet on the worker nodes

Cluster upgrade


root@h22:~# kubeadm upgrade apply v1.35.2 --config kubeadm-upgrade.yaml
[upgrade] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[upgrade] Use 'kubeadm init phase upload-config kubeadm --config your-config-file' to re-upload it.
W0312 17:31:25.632690 3322048 configset.go:77] Warning: No kubeproxy.config.k8s.io/v1alpha1 config is loaded. Continuing without it: configmaps "kube-proxy" not found
[upgrade/preflight] Running preflight checks
[upgrade] Running cluster health checks
[upgrade/preflight] You have chosen to upgrade the cluster version to "v1.35.2"
[upgrade/versions] Cluster version: v1.34.5
[upgrade/versions] kubeadm version: v1.35.2
[upgrade] Are you sure you want to proceed? [y/N]: y
[upgrade/preflight] Pulling images required for setting up a Kubernetes cluster
[upgrade/preflight] This might take a minute or two, depending on the speed of your internet connection
[upgrade/preflight] You can also perform this action beforehand using 'kubeadm config images pull'
[upgrade/control-plane] Upgrading your static Pod-hosted control plane to version "v1.35.2" (timeout: 5m0s)...
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests3574325544"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Moving new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backing up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2026-03-12-17-31-34/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This can take up to 5m0s
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moving new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backing up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2026-03-12-17-31-34/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This can take up to 5m0s
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moving new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backing up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2026-03-12-17-31-34/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This can take up to 5m0s
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upgrade/control-plane] The control plane instance for this node was successfully upgraded!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upgrade/kubeconfig] The kubeconfig files for this node were successfully upgraded!
W0312 17:33:08.236323 3322048 postupgrade.go:105] Using temporary directory /etc/kubernetes/tmp/kubeadm-kubelet-config-2026-03-12-17-33-08 for kubelet config. To override it set the environment variable KUBEADM_UPGRADE_DRYRUN_DIR
[upgrade] Backing up kubelet config file to /etc/kubernetes/tmp/kubeadm-kubelet-config-2026-03-12-17-33-08/config.yaml
[patches] Applied patch of type "application/strategic-merge-patch+json" to target "kubeletconfiguration"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade/kubelet-config] The kubelet configuration for this node was successfully upgraded!
[upgrade/bootstrap-token] Configuring bootstrap token and cluster-info RBAC rules
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
W0312 17:33:11.435047 3322048 postupgrade.go:203] Using temporary directory /etc/kubernetes/tmp/kubeadm-kubelet-env2676355038 for kubelet env file. To override it set the environment variable KUBEADM_UPGRADE_DRYRUN_DIR
[upgrade] Backing up kubelet env file to /etc/kubernetes/tmp/kubeadm-kubelet-env2676355038/kubeadm-flags.env
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"

[upgrade] SUCCESS! A control plane node of your cluster was upgraded to "v1.35.2".

[upgrade] Now please proceed with upgrading the rest of the nodes by following the right order.

Final state after the upgrade:

root@h22:~# kubectl get nodes -o wide
NAME   STATUS   ROLES           AGE    VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
h21    Ready    <none>          207d   v1.35.2   192.168.5.21   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h22    Ready    control-plane   230d   v1.35.2   192.168.5.22   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3
h23    Ready    <none>          229d   v1.35.2   192.168.5.23   <none>        Ubuntu 24.04.1 LTS   6.8.0-63-generic   containerd://2.1.3

Checklist

What the upgrade changed in the cluster

  1. Certificates auto-renewed, each extended by one year
  2. /var/lib/kubelet/config.yaml updated
  3. Bootstrap tokens and RBAC rules refreshed
  4. Add-ons upgraded, e.g. CoreDNS
  5. Control-plane container versions, e.g. kube-apiserver

# Verify versions
kubeadm version
kubectl version
kubectl get nodes -o wide  # confirm every node reports v1.35.2


# Certificate expiry - extended by one year
kubeadm certs check-expiration

# To skip certificate renewal (e.g. when certificates are managed manually)
# kubeadm upgrade apply v1.34.x --certificate-renewal=false


# Verify Cilium status - consider upgrading Cilium as well
cilium status


# Component health
kubectl get componentstatuses
kubectl get pods -n kube-system

Functional verification

# Network check
# Create test Pods to verify cross-node communication
kubectl run test-nginx --image=nginx
kubectl expose pod test-nginx --port=80
kubectl run test-curl --image=curlimages/curl --rm -it --restart=Never -- curl test-nginx

# Existing workloads

# Look for any Pod that is not Running
kubectl get pods --all-namespaces | grep -v Running

DNS verification

wait@ub05:~/local_tmp$ kubectl run test-dns --image=harbor.services.wait/docker.io/busybox:1.37.0 --rm -it --restart=Never -- nslookup kubernetes.default.svc.cluster.local
Server:         10.11.0.10
Address:        10.11.0.10:53


Name:   kubernetes.default.svc.cluster.local
Address: 10.11.0.1

pod "test-dns" deleted from cwx namespace

Cilium status


wait@ub05:~/local_tmp$ cilium status --wait
    /¯¯\
 /¯¯\__/¯¯\    Cilium:             OK
 \__/¯¯\__/    Operator:           OK
 /¯¯\__/¯¯\    Envoy DaemonSet:    OK
 \__/¯¯\__/    Hubble Relay:       OK
    \__/       ClusterMesh:        disabled

DaemonSet              cilium                   Desired: 3, Ready: 3/3, Available: 3/3
DaemonSet              cilium-envoy             Desired: 3, Ready: 3/3, Available: 3/3
Deployment             cilium-operator          Desired: 2, Ready: 2/2, Available: 2/2
Deployment             hubble-relay             Desired: 1, Ready: 1/1, Available: 1/1
Deployment             hubble-ui                Desired: 1, Ready: 1/1, Available: 1/1
Containers:            cilium                   Running: 3
                       cilium-envoy             Running: 3
                       cilium-operator          Running: 2
                       clustermesh-apiserver
                       hubble-relay             Running: 1
                       hubble-ui                Running: 1
Cluster Pods:          23/23 managed by Cilium
Helm chart version:    1.17.6
Image versions         cilium             registry.services.wait/cilium/cilium:v1.17.6: 3
                       cilium-envoy       registry.services.wait/cilium/cilium-envoy:v1.33.4-1752151664-7c2edb0b44cf95f326d628b837fcdd845102ba68: 3
                       cilium-operator    registry.services.wait/cilium/operator-generic:v1.17.6: 2
                       hubble-relay       registry.services.wait/cilium/hubble-relay:v1.17.6: 1
                       hubble-ui          registry.services.wait/cilium/hubble-ui-backend:v0.13.2: 1
                       hubble-ui          registry.services.wait/cilium/hubble-ui:v0.13.2: 1

Rollback Plan

  1. Control-plane rollback

# Roll back to a specific version
kubeadm upgrade apply v1.33.3 --force

# If the failure is severe, restore from the backups
# Restore the static Pod manifests and configs
sudo cp -r /etc/kubernetes.bak.<date>/* /etc/kubernetes/

# Restart kubelet
sudo systemctl restart kubelet
  2. etcd data restore (a post-restore sketch follows this block)
# External etcd
etcdctl snapshot restore /backup/etcd-snapshot-<date>.db \
  --data-dir /var/lib/etcd-restore \
  --name <etcd-member-name> \
  --initial-cluster <cluster-config> \
  --initial-cluster-token etcd-cluster
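After the restore, etcd must be switched over to the restored data directory. A hedged sketch for the external etcd node; the systemd service name etcd and the /var/lib/etcd data directory are assumptions, so match them to your actual deployment:

# On the external etcd node (service name and data dir are assumptions)
sudo systemctl stop etcd
sudo mv /var/lib/etcd /var/lib/etcd.broken.$(date +%Y%m%d)
sudo mv /var/lib/etcd-restore /var/lib/etcd
sudo systemctl start etcd

# Confirm the member is healthy again
ETCDCTL_API=3 etcdctl --endpoints="http://192.168.5.9:2379" endpoint health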

What the kubeadm upgrade commands do

kubeadm upgrade apply - cluster upgrade

  1. Checks the cluster state
  2. Enforces the version-skew policy
  3. Pulls the required images
  4. Upgrades (or rolls back) the control-plane components
  5. Applies the new CoreDNS and kube-proxy manifests and ensures all required RBAC rules exist
  6. Creates new certificate and key files for the API server, backing up the old ones if they would expire within 180 days

kubeadm upgrade node - node upgrade

On control-plane nodes

  1. Fetches the kubeadm ClusterConfiguration from the cluster
  2. Backs up and renews the kube-apiserver certificates
  3. Upgrades the static Pod manifests for the control-plane components
  4. Upgrades the kubelet configuration for this node

On worker nodes

  1. Fetches the kubeadm ClusterConfiguration from the cluster and updates the local /var/lib/kubelet/config.yaml and /var/lib/kubelet/kubeadm-flags.env
  2. Upgrades the kubelet configuration for this node