HA clusters don't reboot properly #1689

BenTheElder · 2020-06-25T08:03:19Z

first reported in #1685
tracking in an updated bug.

reproduce with:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane

+ restart docker.

BenTheElder · 2020-06-25T08:07:29Z

Still needs root causing, but multiple user reports. We should fix this.

ozbillwang · 2020-07-03T01:20:55Z

Yes, I hit the same issue today.

Is there any workaround I can use to fix it manually? I spent quite a long time setting up this test KIND environment and have used it for a while, so I don't want to recreate it.

Is there any way to restore it?

Another thing, which I'm not sure is related to this problem:

Yesterday I upgraded KIND from 0.7 to 0.8.1. My old nodes used to be 1.17.0, but today, after I restarted the Docker service, they became kindest/node:v1.18.2.

BenTheElder · 2020-07-03T04:05:02Z

I haven't looked into this issue yet.

Regarding the node versions, please read the release notes about the changes, and see the usage and user guide for how to change it.

BenTheElder · 2020-07-06T14:37:46Z

This has never worked. 0.7 and down did not survive reboots for *any* configuration. 0.8+ apparently doesn't survive reboots for "HA" clusters.

BenTheElder · 2020-08-27T07:36:01Z

cc @aojea you were recently looking at the loadbalancer networking

aojea · 2020-08-27T07:37:02Z

/assign

aojea · 2020-08-27T08:01:42Z

It is more complicated than the load balancer: the control plane nodes have different IPs and the cluster does not come up.

2020-08-27 07:58:58.054356 E | etcdserver: publish error: etcdserver: request timed out
2020-08-27 07:58:58.061687 W | rafthttp: health check for peer 6dd029603bf5e797 could not connect: x509: certificate is valid for 172.18.0.7, 127.0.0.1, ::1, not 172.18.0.5
2020-08-27 07:58:58.061717 W | rafthttp: health check for peer 6dd029603bf5e797 could not connect: x509: certificate is valid for 172.18.0.7, 127.0.0.1, ::1, not 172.18.0.5
2020-08-27 07:58:58.063416 W | rafthttp: health check for peer 2b4992c658e42934 could not connect: dial tcp 172.18.0.7:2380: connect: no route to host
2020-08-27 07:58:58.063454 W | rafthttp: health check for peer 2b4992c658e42934 could not connect: dial tcp 172.18.0.7:2380: connect: no route to host

seems we should use hostnames on the certificates to avoid this
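
For anyone triaging similar logs, here is a minimal sketch (not part of kind or etcd) that pulls the certificate/IP mismatch out of rafthttp warnings like the ones quoted above. The regular expression is an assumption based only on the message layout in those log lines, and `find_cert_ip_mismatches` is a hypothetical helper name.

```python
import re

# Pattern for the rafthttp warning quoted above; the message layout is taken
# from those log lines, not from any etcd specification.
X509_MISMATCH = re.compile(
    r"peer (?P<peer>[0-9a-f]+) could not connect: "
    r"x509: certificate is valid for (?P<valid>.+), not (?P<actual>\S+)"
)


def find_cert_ip_mismatches(log_text):
    """Return {peer_id: (sans_listed_in_cert, address_that_was_dialed)}."""
    mismatches = {}
    for line in log_text.splitlines():
        match = X509_MISMATCH.search(line)
        if match:
            valid = [san.strip() for san in match.group("valid").split(",")]
            mismatches[match.group("peer")] = (valid, match.group("actual"))
    return mismatches


sample = (
    "2020-08-27 07:58:58.061687 W | rafthttp: health check for peer "
    "6dd029603bf5e797 could not connect: x509: certificate is valid for "
    "172.18.0.7, 127.0.0.1, ::1, not 172.18.0.5\n"
    "2020-08-27 07:58:58.063416 W | rafthttp: health check for peer "
    "2b4992c658e42934 could not connect: dial tcp 172.18.0.7:2380: "
    "connect: no route to host\n"
)

for peer, (valid, actual) in find_cert_ip_mismatches(sample).items():
    print(f"peer {peer}: cert covers {valid}, but the peer was dialed at {actual}")
```

A mismatch between the addresses in the certificate and the dialed address is what you would expect if the node containers came back with different IPs after the restart.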

BenTheElder · 2020-08-27T16:25:37Z

We do where we can already. IIRC etcd won't use hostnames.

shlomibendavid · 2020-08-30T18:27:06Z

Same issue here.

my machine environment:

macOS High Sierra v10.13.6
docker: 2.3.0.4
engine: 19.03.12
kubernetes: v1.16.5

kind environment:

kind-control-plane
kind-control-plane2
kind-control-plane3
kind-worker
kind-worker2
kind-external-load-balancer (haproxy)

after docker reboot:
kubectl get pods

output:
Unable to connect to the server: EOF

Is there any workaround for this issue?

BTW - when running with only one kind-control-plane the reboot passed successfully.

BenTheElder · 2020-08-30T19:54:34Z

There's no workaround; rebooting HA (multiple control plane) clusters has never been supported and does not appear to be trivial to fix. #1689 (comment)

RolandMa1986 · 2020-10-30T09:49:48Z

Hi @BenTheElder, I guess this issue is caused by the nodes' IPs changing when Docker restarts. One possible solution is to assign a fixed IP to each node. That requires 2 steps:

Create a network with a subnet: docker network create --subnet=172.18.0.0/24 kind
Start the node with a fixed IP: docker run --network kind --ip 172.18.0.6 -d nginx

aojea · 2020-10-30T10:46:10Z

@RolandMa1986 thanks for the suggestion, but we discarded that idea before because we'd need to implement an IPAM in KIND.

Also, we'd need to keep the state of all the KIND clusters to handle reboots and avoid conflicts with new clusters or new containers that can be created in the bridge.
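
To make the bookkeeping concern concrete, here is a toy sketch of what fixed-IP assignment would involve. It is not anything KIND implements; `allocate_fixed_ips` and its `in_use` argument are hypothetical, and treating the first host address as the gateway is an assumption.

```python
import ipaddress


def allocate_fixed_ips(subnet, node_names, in_use):
    """Assign each node a free host address from subnet, skipping in_use.

    in_use stands for the state that would have to be tracked: addresses
    held by other clusters or unrelated containers on the same bridge.
    """
    taken = {ipaddress.ip_address(ip) for ip in in_use}
    hosts = iter(ipaddress.ip_network(subnet).hosts())
    next(hosts)  # assumption: the first host address belongs to the gateway
    assignments = {}
    for name in node_names:
        # The iterator is shared, so each node continues where the last stopped.
        for candidate in hosts:
            if candidate not in taken:
                assignments[name] = str(candidate)
                taken.add(candidate)
                break
        else:
            raise RuntimeError(f"no free address left for {name}")
    return assignments


nodes = ["kind-control-plane", "kind-control-plane2", "kind-control-plane3"]
print(allocate_fixed_ips("172.18.0.0/24", nodes, in_use=["172.18.0.2", "172.18.0.4"]))
```

The allocation itself is trivial; the hard part is keeping `in_use` accurate across reboots, concurrent cluster creation, and containers created outside of KIND.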

RolandMa1986 · 2020-11-02T10:36:45Z

Thanks, @aojea
I want to know more details about the IPAM approach. Will it be a CNM plugin or CNI plugin?

aojea · 2020-11-02T10:45:47Z

> Thanks, @aojea
> I want to know more details about the IPAM approach. Will it be a CNM plugin or CNI plugin?

It depends on the provider. Currently KIND uses docker by default, which means CNM ...
but podman, which uses CNI, is on the roadmap (#1778).

as you can see this is an area that will require a lot of effort to support, honestly, I don't see that we want to invest much on this ... Ben can correct me if I'm wrong

BenTheElder · 2020-11-02T17:49:39Z

I don't think that's a good approach. If we create non-standard IPAM this will create a headache for users vs their existing ability to configure docker today.

Additionally, this approach still does not guarantee an address, and you have concurrency issues with clusters using a remote docker (where will you store and lock the IPAM data?), which otherwise works fine for users today.
EDIT: please search for past discussion, I'd rather not re-hash that entire discussion here.
EDIT2: thanks for the suggestion though 🙃

We can probably instead re-roll the etcd peer configuration and necessary certs on restart, but this is very low priority.

The main reason to support clusters through reboot is long lived development clusters for users building applications, which should not be using "HA" clusters. Otherwise for testing / disposable clusters, this is a non-issue.

BenTheElder · 2021-02-06T02:03:59Z

see more here on why the "ipam" approach is not super tenable: #2045 (comment)

velcrine · 2021-07-09T11:31:14Z

I faced a rather different issue after I restarted the machine: ReplicaSets were not creating pods when a pod was deleted, and Deployments were not creating ReplicaSets.

BenTheElder · 2021-07-09T18:30:31Z

@velcrine that's actually a variation on the issues in #2045

HA has a different additional problem in that the loadbalancer causes issues with the API being reachable after restart, in which case you wouldn't even be able to query for those problems.

FWIW, regarding HA: nobody is working on or using this feature much, and it's simplistic / not fully designed. This issue is unlikely to see work anytime soon. (priority/backlog)

The other issue (#2045) is one I'm sure someone would work on except nobody has posited a good solution we can agree on yet or root caused the issues.

seguidor777 · 2021-07-29T05:23:38Z

Hi @BenTheElder I know that using DNS names is the cleanest solution for issue #2045

However I am using this script as a workaround to use static IPs for the nodes communication

I have restarted my cluster several times and it has worked fine so far

aojea · 2021-07-29T05:39:43Z

> I have restarted my cluster several times and it has worked fine so far

Users may have multiple clusters and that is hard to support, however, your script is great, I think that it also can solve the problems of snapshotting HA clusters.

BenTheElder · 2021-07-29T17:46:18Z

That's a neat script!

It's unfortunately not super workable as an approach to a built-in solution though. Users creating clusters concurrently in CI (and potentially with a "remote" daemon due to containerized CI) are very important to us and this approach is not safe there.

velcrine · 2021-07-29T17:59:24Z

Can't we extend the kind config to take an IP-to-node mapping as an optional parameter, intended only for those who know what they are doing?
It could then replace every other solution and fit all needs.

BenTheElder · 2021-07-29T18:27:59Z

> Can't we extend the kind config to take an IP-to-node mapping as an optional parameter, intended only for those who know what they are doing?
> It could then replace every other solution and fit all needs.

This is not without its own drawbacks.

- Not necessarily portable across node backends (e.g. out of our current options podman cannot do at least ipv6 this way).
- Does not solve the need to set a reserved IP range in the kind network (so you will still need to do hacks outside of the kind tool...)
- Adds infrequently used and untested codepath(s). (We are not going to add yet another CI job to exercise this, we have too many as-is and we have no need for this upstream https://kind.sigs.k8s.io/docs/contributing/project-scope/).

Multi-node clusters are a necessity for testing Kubernetes itself (where we expect clusters to be disposable over the course of developing some change to Kubernetes).
For development of applications, we expect single node clusters to be most reasonable (and this is the case where it may make sense to persist them, though we'd still encourage regularly testing from a clean state).

The case of:

- Requirement for multi-node
- Requirement for persistence
- Frequent reboots

Seems rather rare and I'm not sure it outweighs adding a broken partial solution that people will then depend on in the future even if we find some better design.

I'm not saying we definitely couldn't do this, but I wouldn't jump to doing it today.

BenTheElder · 2021-07-29T18:41:06Z

k3d seems to have done something about this here: k3d-io/k3d#550 (comment) which links back to this issue in our repo.

With --subnet auto, k3d will create a fake docker network to get a subnet auto-assigned by docker that it can use.

This looks to me like a broken approach to identifying an available subnet (there is at minimum a race between acquiring / deleting the "fake" network and creating the real one with two clusters), but I'm also unclear as of yet if the IP range is used on a per-cluster network or IPs outside of another network's range are used on that network.

It may be worth digging into the approach there more.

velcrine · 2021-07-30T03:14:04Z

This is where plugins help!! Anyway, it should be mentioned in the quick start or other docs that multi-node clusters will not survive a restart. When I first faced the issue, I had a hard time debugging it.

BenTheElder · 2021-07-30T03:48:40Z

We document this sort of thing at https://kind.sigs.k8s.io/docs/user/known-issues/ which the quick start links to prominently, but it seems this issue hasn't made it there yet. Earlier versions did not support host restart at all, it wasn't in scope early in the project.

MarkLFT · 2022-04-26T02:24:38Z

> This is where plugins help!! Anyway, it should be mentioned in the quick start or other docs that multi-node clusters will not survive a restart. When I first faced the issue, I had a hard time debugging it.

Can I second this, I just spent several days building a multi-node cluster, then on the first reboot, effectively lost the lot. Not best pleased, especially when after researching my problem, finding this is a known issue. For the sake of the sanity of others, can someone please put a simple warning about this in the known issues section of the Kind documentation.

BenTheElder · 2022-05-03T19:29:30Z

FWIW:

1. most use cases should be using single node clusters, there is no resource isolation and unless you have very particular needs involving multi-node behavior you will be better served with single node clusters
2. most use cases shouldn't be using permanent long lived clusters
3. no use cases should only have state in the kind cluster (!), this is not a durable approach
4. because (1) (2) (3) originally we didn't test or support reboot at all, these are meant to be local ephemeral test clusters
5. the multi-node issue is "multi-node: Kubernetes cluster does not start after Docker re-assigns node's IP addresses after (Docker) restart" (#2045), this one is meant for discussion specific to problems with multiple control-plane nodes
6. recent discussion in "Fix multi-node cluster not working after restarting docker" (#2671) on how we might fix it

> For the sake of the sanity of others, can someone please put a simple warning about this in the known issues section of the Kind documentation.

We have a detailed contributing guide including how to contribute to the docs, the known issues page is written in markdown in this repo. No tools other than git / github / markdown text are required.

victor-sudakov · 2022-05-04T16:28:13Z

> most use cases should be using single node clusters

@BenTheElder what do you mean by single vs. multi-node clusters in the context of Kind? Do you mean multiple control-plane nodes or multiple worker nodes? I'd never need multiple control-plane nodes in Kind, but I sometimes do need multiple worker nodes to test tolerations, affinities etc. And I would like the clusters with multiple worker nodes to survive a reboot.

BenTheElder · 2022-05-04T16:42:59Z

Yes, affinities and tolerations are a case for using multi-node and that issue is #2405

victor-sudakov · 2022-05-04T18:37:25Z

Please for the most stupid of us, tell what you understand by multi-node. Is a cluster with multiple worker nodes and one control-plane node still a multi-node?

> Yes, affinities and tolerances are a case for using multi-node and that issue is #2405

Sorry did not see anything relevant there. Maybe it's just me.

BenTheElder · 2022-05-04T19:06:07Z

> Please for the most stupid of us, tell what you understand by multi-node. Is a cluster with multiple worker nodes and one control-plane node still a multi-node?

Multi-node has multiple nodes.

This issue is specifically about problems with clusters that have multiple control-plane nodes ("HA").

Solving issues related to multi-node reboots in general will likely leave multi-control-plane specific issues (specifically, around the loadbalancer).

> Sorry did not see anything relevant there. Maybe it's just me.

I typoed #2045 on mobile, but it is already linked and discussed in the previous comment #1689 (comment)

BenTheElder · 2022-05-26T00:02:27Z

#2045 (multi-node restart, not multi-control-plane-node) should be fixed at HEAD thanks to the patient work of @tnqn, have not dug into multi-control-plane node yet.
Probably won't be able to immediately, perhaps next week.

BenTheElder · 2023-05-02T17:44:55Z

I think #2775 has the general right idea of "just fixup the IP(s) on startup" but would need a bit more thought for HA clusters. In general "HA" / multi-control-plane-node kind clusters could use more thought. We've had little demand for them so far, Kubernetes has surprisingly little CI for this at the moment.

ak2766 · 2024-01-18T02:39:57Z

Here's what I've been doing to get my HA kind cluster back up after reading this post.

After rebooting the host, the cluster did not come up as expected. Interestingly, the external load balancer is not restarted automatically on reboot and requires a manual start - hmm! I then took note of the load balancer IP from the kube config file, and of the control-plane IPs from what etcd was advertising on each node. I then followed these steps to bring it all up again:

1. Disconnect the network on all nodes
   - docker network disconnect kind <node-name>
2. Reconnect the network on all nodes
   - docker network connect --ip <previously noted above> kind <node-name>
   - NOTE: Assign whatever IP you want for the worker nodes as long as they don't conflict.
3. Restart all nodes
   - docker restart <node-name>

NOTE: Rebooting host or restarting docker after this manual IP assignment results in the nodes now remembering their IP addresses. Interestingly though, I occasionally need to restart some nodes to resolve crash loops.
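
The steps above can be sketched as a small dry-run helper that only prints the docker commands instead of running them. This is an illustration of the manual procedure, not a supported tool; `reconnect_commands` is a hypothetical name and the node-to-IP mapping below is made-up example data that you would replace with the addresses you noted.

```python
import shlex


def reconnect_commands(node_ips, network="kind"):
    """Return the docker commands for the three steps, in order."""
    commands = []
    # Step 1: disconnect every node from the network.
    for node in node_ips:
        commands.append(["docker", "network", "disconnect", network, node])
    # Step 2: reconnect each node with the IP it had before the restart.
    for node, ip in node_ips.items():
        commands.append(["docker", "network", "connect", "--ip", ip, network, node])
    # Step 3: restart every node.
    for node in node_ips:
        commands.append(["docker", "restart", node])
    return commands


# Hypothetical addresses; use the ones noted before the reboot.
noted = {"kind-control-plane": "172.18.0.5", "kind-control-plane2": "172.18.0.7"}
for command in reconnect_commands(noted):
    print(shlex.join(command))
```

Printing rather than executing makes it easy to review the commands before pasting them into a shell.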

EDIT: Adding my kind config

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
- role: control-plane
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
- role: control-plane
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
- role: worker
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
- role: worker
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
- role: worker
  image: kindest/node:v1.27.3@sha256:3966ac761ae0136263ffdb6cfd4db23ef8a83cba8a463690e98317add2c9ba72
networking:
  kubeProxyMode: "ipvs"

ak2766 · 2024-01-18T03:07:22Z

Replying to myself:

I believe the first time I did this I saved the docker network state - docker network inspect kind > /tmp/b4-docker-reboot - and did not get the IPs from the etcd advertisement as stated above. I just tried to validate my statements above and noticed that the etcd advertisement also changes to the new IP after docker reboot. Or maybe this has changed in recent kind versions - hmm.
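
A saved `docker network inspect kind` dump can be turned into a name-to-IP table with a few lines of Python. This is only a sketch: `node_ips_from_inspect` is a hypothetical helper, and the JSON below is a trimmed, made-up sample that keeps only the fields the sketch reads.

```python
import json


def node_ips_from_inspect(inspect_json):
    """Map container name -> IPv4 address from `docker network inspect` output."""
    ips = {}
    for network in json.loads(inspect_json):
        for container in (network.get("Containers") or {}).values():
            # IPv4Address is reported in CIDR form, e.g. "172.18.0.5/16".
            address = container.get("IPv4Address", "")
            if address:
                ips[container["Name"]] = address.split("/")[0]
    return ips


# Trimmed, made-up sample of a saved inspect dump.
saved = """
[
  {
    "Name": "kind",
    "Containers": {
      "aaa111": {"Name": "kind-control-plane", "IPv4Address": "172.18.0.5/16"},
      "bbb222": {"Name": "kind-control-plane2", "IPv4Address": "172.18.0.7/16"}
    }
  }
]
"""

print(node_ips_from_inspect(saved))
```

Note that the dump has to be taken before the restart; afterwards it reflects the newly assigned addresses.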

BenTheElder · 2024-01-18T05:55:59Z

#2775 as mentioned in #1689 (comment) above is pretty recent and does involve patching the local component manifests' IP for the node's last => current IP.
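
As a rough illustration of the "last => current IP" idea (this is not the code from #2775, just a sketch under my own assumptions), replacing an address in manifest text needs a little care so that a longer address sharing the same prefix is left alone. `replace_ip` is a hypothetical helper and the sample text is made up.

```python
import re


def replace_ip(manifest_text, old_ip, new_ip):
    """Replace whole occurrences of old_ip with new_ip.

    A match must not be preceded by a digit or dot, nor followed by a digit,
    so replacing 172.18.0.5 does not touch 172.18.0.50.
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?!\d)")
    return pattern.sub(new_ip, manifest_text)


# Made-up fragment in the style of an etcd static pod manifest.
sample = (
    "- --advertise-client-urls=https://172.18.0.5:2379\n"
    "- --listen-peer-urls=https://172.18.0.5:2380\n"
    "# unrelated address: 172.18.0.50\n"
)

print(replace_ip(sample, "172.18.0.5", "172.18.0.7"))
```

For HA clusters this would presumably not be enough on its own, since the peer addresses and certificates of the other control-plane nodes change as well.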

ak2766 · 2024-01-18T08:22:28Z

Thanks @BenTheElder - As I trawled more regarding this issue, I came across this post in the other closed issue that states the same thing.

Apologies for flooding...

BenTheElder added the kind/bug Categorizes issue or PR as related to a bug. label Jun 25, 2020

BenTheElder mentioned this issue Jun 25, 2020

When after restart docker, kind cluster could't connect #1685

Closed

BenTheElder added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jun 25, 2020

BenTheElder added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Aug 20, 2020

k8s-ci-robot assigned aojea Aug 27, 2020

aojea removed their assignment Sep 9, 2020

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 31, 2021

BenTheElder added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 6, 2021

kubernetes-sigs deleted a comment from fejta-bot Feb 6, 2021

iwilltry42 mentioned this issue Apr 7, 2021

[FEATURE] k3d IPAM (to prevent etcd failures) k3d-io/k3d#550

Closed

BenTheElder mentioned this issue Jun 26, 2021

unable to retrieve multi node cluster if my pc restated. #2332

Closed

This was referenced Sep 8, 2021

Containers don't restart after CAPD install vmware-tanzu/community-edition#832

Closed

Improvements for etcd liveness probes kubernetes/kubeadm#2567

Closed

BenTheElder mentioned this issue Feb 18, 2022

Cluster doesn't restart when docker restarts #148

Closed

BenTheElder mentioned this issue Apr 21, 2022

Add start and stop command to kind. #2715

Open

BenTheElder mentioned this issue May 26, 2022

multi-node: Kubernetes cluster does not start after Docker re-assigns node's IP addresses after (Docker) restart #2045

Closed

LukeShortCloud mentioned this issue Jun 24, 2022

[virtualization][kubernetes_administration] kind 0.15.0 supports reboots of multi-node clusters LukeShortCloud/rootpages#777

Open

BenTheElder mentioned this issue May 2, 2023

Fix multi-node cluster not working after restarting docker #2671

Closed

stmcginnis mentioned this issue Dec 20, 2023

command add start cluster #3458

Closed
