Upgrade Kubernetes version on Bare Metal Bottlerocket EKS-A cluster fails: kube-vip can't bind IP #6535
Labels
area/providers/tinkerbell
Tinkerbell provider related tasks and issues
area/upgrades
external
An issue, bug or feature request filed from outside the AWS org
What happened: Upgrading a newly created standalone Bare Metal EKS-A cluster with Bottlerocket nodes to a new Kubernetes version (e.g. 1.26 -> 1.27) fails. With other node operating systems, e.g. Ubuntu, the upgrade works. The kube-vip pod in the kube-system namespace on the new control plane node reports "listen:listen tcp :2112: bind: address already in use\n". The cluster endpoint IP can then no longer be reached by any node ("connect: connection refused") and etcd cannot be reached, which makes the cluster inaccessible and causes the upgrade to fail.
What you expected to happen: Upgrade to succeed.
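For readers unfamiliar with the message: "bind: address already in use" is the operating system's EADDRINUSE error, raised when a process tries to listen on a TCP port that another socket already holds. The following is a minimal sketch in Python of that failure mode only. It is not kube-vip's code, the function names are hypothetical, and it uses an ephemeral loopback port rather than 2112 so that it runs anywhere.

```python
import errno
import socket
from typing import Optional


def try_bind(port: int, host: str = "127.0.0.1") -> socket.socket:
    """Bind and listen on a TCP port; return the listening socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen()
    except OSError:
        # Do not leak the socket if bind or listen fails.
        sock.close()
        raise
    return sock


def second_bind_errno() -> Optional[int]:
    """Hold a port with one listener, then try to bind it a second time.

    Returns the errno of the second attempt, or None if it unexpectedly
    succeeded.
    """
    first = try_bind(0)  # port 0: let the OS pick a free port
    port = first.getsockname()[1]
    try:
        second = try_bind(port)
    except OSError as exc:
        return exc.errno
    else:
        second.close()
        return None
    finally:
        first.close()
```

Calling second_bind_errno() and comparing the result with errno.EADDRINUSE shows the same condition the kube-vip container hits: while the first listener holds the port, the second one cannot start.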
How to reproduce it (as minimally and precisely as possible): A minimal cluster config with 1 control plane node and 1 worker node, kubernetesVersion 1.26, the Tinkerbell provider, and Bottlerocket is sufficient. Then invoke:
eksctl anywhere create cluster ...
After this has succeeded, update the kubernetesVersion to 1.27 and update the osImageURL in the config file. Then invoke:
eksctl anywhere upgrade cluster ...
This results in an inaccessible cluster and a failed upgrade.
Anything else we need to know?: In my test, the new Bottlerocket control plane node showed the following kube-vip container logs. The second kube-vip container did not start correctly and did not advertise the cluster endpoint IP, which in my case was 10.20.73.115:
Environment: