Make snap service start more robust. #630
Conversation
Force-pushed from 69410a1 to bb9c7b5.
"kube-controller-manager", | ||
"kube-proxy", | ||
"kube-scheduler", | ||
"kubelet", | ||
"kube-apiserver", |
Is kubelet not dependent on the api-server starting first? It needs to register nodes with the api-server.
Should we rather wait to start kubelet only after the api-server is up, to avoid the race condition?
kubelet will wait until the apiserver is up - this is not a problem.
Even if we waited until the apiserver is up, we would not avoid the problem, as the node controller could still randomly kick in and kill our snap startup.
The only preferred way to handle this, IMHO, is to make sure the controllers do not run before all snap services are settled.
I'm open to better ideas, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, waiting for snap services introduces a whole lot of implicit dependencies which we really don't want in the code (see the updated description). Let's just retry the snap start in the hook instead and keep it simple.
Force-pushed from f03b706 to 136336e.
LGTM!
Summary
This PR fixes a race condition that could pop up when bootstrapping or joining a control-plane node.
Details
On startup, the controllers (node controller etc.) are prevented from running by the WaitGroup `a.readyWG`. Its count is set to 1 and only decremented in `markNodeReady`. This function runs in a separate goroutine, waits until the apiserver is reachable, and then decrements the WaitGroup.

Also on startup, the bootstrap or join hook is executed, which at its end starts all snap services sequentially. The services are defined here. The `kube-apiserver` is started second in the list; `kubelet` is started last.

So the race condition happens if kube-apiserver is ready and `markNodeReady` sees this (remember that it runs in a separate goroutine) and marks the node as ready. This starts the controllers, including the node-configuration controller, which in turn restarts the kubelet in its reconciliation loop. It may then happen that the hook and the node-configuration controller try to start/restart the kubelet at the same time, which fails.

This PR adds a couple of improvements to prevent this:
- in `markNodeReady`, wait until all expected snap services are up

Note: Initially I wanted to monitor in the `markNodeReady` function whether all snap services are up. However, this introduces a lot of implicit dependencies, e.g. you need to wait for different services on control-plane and worker nodes. For that, we would need to check the worker marker file, which is also set on bootstrap and may cause subtle bugs if the order is wrong.