Upgrade a Kubernetes cluster - no shortcuts, please

Amit Cohen
4 min read · Jan 3, 2023


One of your developers found an issue that is fixed in a newer version of Kubernetes, so you need to upgrade the cluster, and downtime has to be minimal. Does that sound familiar? Let's dive in and see how to handle such a critical task.

How does cluster upgrade work?

There are two main parts to a cluster upgrade: upgrading the master node and upgrading the worker nodes. While the master node is being upgraded there is no application downtime, but management functionality is unavailable: if a pod dies, it won't be restarted, because the controller manager and scheduler are down. If you want management functionality to remain available during a master node upgrade, design your cluster with more than one master node and upgrade them one by one.

Control Plane

Kubernetes is not one binary but multiple components - kube-apiserver, kube-scheduler, kube-controller-manager, and so on - with different components on the control plane and on the worker nodes. So before an upgrade, it's essential to know which components are upgraded and to which version. The core control plane components share the Kubernetes version and are upgraded with every release. In addition, the master node runs etcd and CoreDNS, plus the network plugin we are using. etcd and CoreDNS have their own version numbers, but they depend on the Kubernetes version, and since kubeadm installed them, kubeadm will upgrade them as well. The network plugin, however, was installed independently of the other components and must be upgraded separately.

Which components can be upgraded with each Kubernetes version is another question: do all these components need the same version, do we need to upgrade them all at once, and what happens if we upgrade them separately? There are restrictions. kube-apiserver must always run the newest version in the cluster, while kube-controller-manager and kube-scheduler can be up to one minor version older. kubelet and kube-proxy can be up to two minor versions older. kubectl, as the client, can be one minor version higher or lower than kube-apiserver.

However, it is recommended to run the same version on all components.
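Before planning an upgrade, it helps to see which versions are actually running. A quick check, assuming kubectl is already configured against the cluster:

```shell
# Show the client (kubectl) and server (kube-apiserver) versions
kubectl version

# Show the kubelet version running on every node; per the skew
# policy, these may lag the control plane by up to two minor versions
kubectl get nodes -o wide
```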

How to upgrade?

As you probably already know as a Kubernetes admin, cluster installation is done with kubeadm, and the same tool handles upgrades. kubeadm follows the same versioning as the Kubernetes components, so to upgrade the master we first upgrade kubeadm itself to the Kubernetes version we wish to move to. Once this is done, we can run kubeadm upgrade apply [version], which upgrades all the control plane components and renews the cluster certificates. That does not complete the task, though: kubeadm does not install or manage kubelet or kubectl, so we must upgrade them separately. Since kubelet is the component responsible for running pods on the node, we need to clear the node of pods while kubelet is being upgraded; this is called maintenance mode. kubectl drain master evicts all pods from the node and marks it unschedulable, meaning no new pods can be scheduled on it. Once that is done, we upgrade kubectl and kubelet, and right after that kubectl uncordon master returns the node to a Ready, schedulable state.
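On a Debian/Ubuntu master installed with kubeadm, the sequence looks roughly like this. The node name master, the target version v1.26.0, and the package version 1.26.0-00 are placeholders - substitute your own node name and target release:

```shell
# 1. Upgrade kubeadm itself to the target version
apt-get update && apt-get install -y kubeadm=1.26.0-00

# 2. Review the upgrade plan, then apply it: this upgrades the
#    control plane components and renews the cluster certificates
kubeadm upgrade plan
kubeadm upgrade apply v1.26.0

# 3. Put the master into maintenance mode
kubectl drain master --ignore-daemonsets

# 4. kubeadm does not manage kubelet or kubectl - upgrade them separately
apt-get install -y kubelet=1.26.0-00 kubectl=1.26.0-00
systemctl daemon-reload && systemctl restart kubelet

# 5. Make the node schedulable again
kubectl uncordon master
```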

Upgrading the worker

Once the master is upgraded, we move to the worker nodes and upgrade them one by one, again upgrading the kubeadm tool on each of them first. Since workers run no control plane components, we jump straight to draining the node, which evicts all its pods; those pods are restarted on the other worker nodes. As with the master, once a node is drained, no pods will be scheduled on it. When one worker node has been upgraded to the new version, we move to the next until all are done. So you will not face downtime as long as you have at least two worker nodes and at least two replicas of each pod.
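A worker upgrade is similar, except that kubeadm upgrade node replaces kubeadm upgrade apply, and the drain/uncordon commands run from a machine with cluster access. Again, node01 and the 1.26.0 versions are placeholders:

```shell
# On the worker node:
apt-get update && apt-get install -y kubeadm=1.26.0-00
kubeadm upgrade node   # upgrades the local kubelet configuration

# From the admin machine - evict the worker's pods:
kubectl drain node01 --ignore-daemonsets

# Back on the worker - upgrade and restart kubelet:
apt-get install -y kubelet=1.26.0-00
systemctl daemon-reload && systemctl restart kubelet

# From the admin machine - allow scheduling again:
kubectl uncordon node01
```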

Draining Nodes

Let us go deeper into the draining process. Once a node is drained, it becomes unschedulable, meaning the scheduler knows not to deploy pods on it; its running pods are evicted and scheduled somewhere else. This is done with kubectl drain [node name]. Another command leaves the running pods on the node but prevents new pods from being scheduled on it: kubectl cordon [node name]. This is useful when the upgrade does not impact running pods. kubectl uncordon [node name] undoes both and allows scheduling pods on the node again; it is also how we exit maintenance mode.
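The three commands side by side, with node01 as a placeholder node name:

```shell
# Evict all pods from the node and mark it unschedulable
kubectl drain node01 --ignore-daemonsets

# Mark the node unschedulable but leave its running pods in place
kubectl cordon node01

# Make the node schedulable again (also exits maintenance mode)
kubectl uncordon node01
```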

When to upgrade?

The question every admin asks is when to upgrade, why, and how often. Officially, Kubernetes supports only the three most recent minor releases, so if you want to stay on a supported version, keep that in mind. A bug fix can also be a reason to upgrade. The strongest reason not to fall behind, however, is that you cannot skip minor versions: a cluster can only be upgraded one minor version at a time, so catching up from far behind means running several consecutive upgrades, which is a hell of a task.


Written by Amit Cohen

A product leader with exceptional skills and strategic acumen, possessing vast expertise in cloud orchestration, cloud security, and networking.
