✨ Add support for controller-manager webhook shutdown delay #2601
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dippynark

The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
Welcome @dippynark!
Hi @dippynark. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label.

I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Signed-off-by: Luke Addison <lukeaddison785@gmail.com>

dippynark force-pushed the branch from 121fbc7 to bf963c8
pkg/manager/signals/signal.go (Outdated)

```go
// of this manager before starting shutdown of any webhook servers to avoid
// receiving connection attempts after closing webhook listeners. If a second
// signal is caught, the program is terminated with exit code 1.
func SetupSignalHandlerWithDelay(delay time.Duration) context.Context {
```
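The diff excerpt ends at the function signature. For readers without the full diff, here is a minimal sketch of what such a handler could look like, modeled on controller-runtime's existing SetupSignalHandler; the body below is an illustration, not the PR's actual implementation:

```go
package signals

import (
	"context"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// SetupSignalHandlerWithDelay registers for SIGTERM and SIGINT. On the first
// signal it waits for the given delay before cancelling the returned context,
// so webhook listeners keep accepting connections while routes are updated.
// A second signal terminates the program immediately with exit code 1.
func SetupSignalHandlerWithDelay(delay time.Duration) context.Context {
	ctx, cancel := context.WithCancel(context.Background())

	c := make(chan os.Signal, 2)
	signal.Notify(c, os.Interrupt, syscall.SIGTERM)
	go func() {
		<-c
		time.Sleep(delay) // delay shutdown of everything driven by ctx
		cancel()
		<-c
		os.Exit(1) // second signal: exit directly
	}()

	return ctx
}
```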
Do we really need this? IMHO this doesn't make sense, because whether this is needed is not a property of the binary but of the environment. Kubernetes allows using a preStop hook for this, and recently even merged native sleep support for this precise use-case: kubernetes/kubernetes#119026
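For reference, the native sleep preStop hook mentioned above can be declared through the k8s.io/api types roughly as below. This is a minimal sketch: the corev1 field names are real, but the cluster must be new enough to support the PodLifecycleSleepAction feature, and the helper function is hypothetical.

```go
package webhookutil

import (
	corev1 "k8s.io/api/core/v1"
)

// webhookContainerLifecycle returns a lifecycle that sleeps before SIGTERM is
// sent, giving kube-proxy time to stop routing traffic to the terminating Pod.
func webhookContainerLifecycle(seconds int64) *corev1.Lifecycle {
	return &corev1.Lifecycle{
		PreStop: &corev1.LifecycleHandler{
			Sleep: &corev1.SleepAction{Seconds: seconds},
		},
	}
}
```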
Ah nice, I wasn't aware of this sleep lifecycle hook feature! I agree with this, but until Kubernetes v1.28 (I think the first version where the sleep lifecycle hook is available) the application otherwise needs to ship with a sleep binary, so I think it'd still be worth having this to deal with the long tail of Kubernetes versions.

However, I'm happy for this to be closed to wait for the sleep hook as a native solution!
@dippynark Should we not create a webhook-specific solution? I do remember reading a great article on how to properly shut down a webserver on Kubernetes; unfortunately, I cannot find it anymore. I think in Go-world this translates to the following (see the sketch below):
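(The code block from this comment was lost in extraction; the sketch below reconstructs the commonly cited pattern the comment points at — keep serving during a drain window after SIGTERM, then shut down gracefully — and should not be read as the comment's exact code. The delay and timeout values are illustrative.)

```go
package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":9443"}

	go func() {
		stop := make(chan os.Signal, 1)
		signal.Notify(stop, syscall.SIGTERM, os.Interrupt)
		<-stop

		// Keep serving while endpoints and iptables rules converge.
		time.Sleep(15 * time.Second)

		// Stop accepting new connections and drain in-flight requests.
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		defer cancel()
		_ = srv.Shutdown(ctx)
	}()

	_ = srv.ListenAndServe() // returns http.ErrServerClosed after Shutdown
}
```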
@inteon Nice! When I looked at this I couldn't see how to add some time between triggering shutdown and closing listeners: https://github.com/golang/go/blob/de5b418bea70aaf27de1f47e9b5813940d1e15a4/src/net/http/server.go#L2986-L2989. But I didn't know about that approach!
@dippynark I created a POC here: #2607. Feel free to copy that code or leave feedback on that PR.
@inteon Ah, just saw your POC; I have also updated this PR. I think we want the delay to be configurable though, because the amount of time we want to wait depends on the kube-proxy configuration of the cluster (e.g. its `--iptables-min-sync-period`).

I have also not disabled keep-alives before waiting for the shutdown delay, because if clients are still trying to open new connections and we disable keep-alives, then those connections will only last for a single HTTP request (and there is nothing stopping the client from opening a new connection to the same server that is about to be shut down). Instead, I think we want to keep the server running as normal until clients stop trying to open new connections, and only then close the listeners.

Also, do we definitely want to stop the readiness probe? If users are using the same endpoint for liveness then the server might be killed early. IMO we should leave the health check alone and rely on the termination of the Pod to signal Kubernetes to stop sending traffic (but I'm not too sure what situations you were thinking of where Kubernetes doesn't know about it).

I can copy code from your POC PR depending on what you think of the above (I haven't done a controller-runtime PR before, so I would prefer to merge this one if it's all the same to you 😄)
I do agree with not making the readiness/liveness probe fail. I think most controllers will only shut down because Kubernetes told them to; the only situation in which this might be useful is some kind of critical failure that forces the pod to restart. Currently c/r does not provide a clear readiness <-> liveness split for this probe, so it would likely be added to the liveness probe too, which is undesirable.

I do think, however, that disabling keep-alives is worth keeping.

Please do copy. I just wrote it down as an experiment. I officially give you all the copyrights for that code 😄.
@inteon Sounds good! I have changed the PR to disable keep-alives, so I think it's in good shape, as long as you're happy with defaulting `ShutdownDelay` to 0.
Another thing that I do not understand about this: the kube-apiserver doesn't actually use kube-proxy to reach webhooks and extension apiservers, it directly resolves the endpoints internally, which means the described problem shouldn't apply...?
```diff
@@ -101,6 +101,10 @@ type Options struct {

 	// WebhookMux is the multiplexer that handles different webhooks.
 	WebhookMux *http.ServeMux

+	// ShutdownDelay delays server shutdown to wait for clients to stop opening
+	// new connections before closing server listeners. Defaults to 0.
```
Open question:
Would it make sense to set a > 0 default, e.g. 15s, which is less than the default 30-second graceful termination period?
It's a tricky one for sure, but because it's so specific to the Kubernetes cluster you're running on, it doesn't feel right to align with any particular cluster or cloud provider (e.g. I'm not sure what values EKS and AKS use).

IMO we should keep it at 0 for backwards compatibility and to ensure that connection draining 'doesn't work' consistently across clusters by default. I think this is also more likely to encourage applications that care a lot about this (e.g. Gatekeeper) to expose a flag, making it easier to reconfigure for the cluster you're running on (i.e. push the decision onto users of controller-runtime, since an optimal default isn't clear).

No problem with changing to 15 seconds by default though, if you were leaning towards that (that would be better for us anyway because we're using GKE 😄). We could alternatively align with the kube-proxy default of 1 second? But on GKE at least this won't help much.
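For illustration, configuring the option under discussion would look something like the sketch below. `ShutdownDelay` is the field proposed in this PR (which was ultimately closed, so treat it as hypothetical); `webhook.NewServer` and the `Port` field exist in current controller-runtime:

```go
package main

import (
	"time"

	"sigs.k8s.io/controller-runtime/pkg/webhook"
)

func newWebhookServer() webhook.Server {
	return webhook.NewServer(webhook.Options{
		Port: 9443,
		// Proposed field: keep listeners open for a while after shutdown is
		// triggered, e.g. slightly longer than kube-proxy's sync period.
		ShutdownDelay: 15 * time.Second,
	})
}
```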
@alvaroaleman There seems to be a …
Q: Did I understand this correctly, that kube-proxy reacts to the deletionTimestamp on the Pod, and not only to readiness, when dropping a Pod from a Service?
@sbueringer Yeah, my understanding is that the following actions are triggered simultaneously when a Pod's deletion timestamp is set:

- The kubelet runs the Pod's preStop hook (if any) and then sends SIGTERM to its containers.
- The Pod is removed from the ready endpoints of its Services, so kube-proxy instances start removing it from their local rules.

There are more details here: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination

Note that there is currently no way to guarantee that all kube-proxy instances have reconciled EndpointSlice updates before shutdown begins (leading to issues like the one I described in this PR). The best mechanism I am aware of is to delay shutdown for slightly longer than the kube-proxy sync period.
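To make the timing relationship concrete, here is a small sketch; the numbers are illustrative (only the GKE 10-second value comes from this PR's description). The shutdown delay should exceed kube-proxy's minimum sync period, and the Pod's termination grace period must cover the delay plus the drain budget:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	const (
		iptablesMinSyncPeriod = 10 * time.Second // kube-proxy --iptables-min-sync-period on GKE, per this PR
		shutdownDelay         = 15 * time.Second // chosen to exceed iptablesMinSyncPeriod
		drainTimeout          = 10 * time.Second // budget for draining in-flight requests
	)

	// terminationGracePeriodSeconds must cover delay + drain, otherwise the
	// kubelet SIGKILLs the container mid-drain (the default is 30 seconds).
	fmt.Println("required grace period >=", shutdownDelay+drainTimeout) // 25s
}
```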
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

- Mark this PR as fresh with /remove-lifecycle stale
- Close this PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the rules above.

You can mark this PR as fresh with /remove-lifecycle rotten or close it with /close.

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the rules above.

You can reopen this PR with /reopen or mark it as fresh with /remove-lifecycle rotten.

Please send feedback to sig-contributor-experience at kubernetes/community.

/close
@k8s-triage-robot: Closed this PR in response to the /close above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/remove-lifecycle rotten
/reopen |
@inteon: Reopened this PR in response to the /reopen above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the rules above.

You can mark this PR as fresh with /remove-lifecycle stale or close it with /close.

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. This bot triages PRs according to the rules above.

You can mark this PR as fresh with /remove-lifecycle rotten or close it with /close.

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
```go
	// Disable HTTP keep-alives to close persistent connections after the next
	// HTTP request. Clients may reconnect until routes are updated and server
	// listeners are closed; however, this should start gradually migrating
	// clients to server instances that are not about to shut down.
	srv.SetKeepAlivesEnabled(false)
```
This seems like a good addition.
```go
	// Wait before shutting down the webhook server.
	time.Sleep(s.Options.ShutdownDelay)
```
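Putting the two hunks above together, the shutdown sequence the thread converges on looks roughly like this. This is a sketch assembled from the diff excerpts, not the PR's verbatim code; the 10-second drain timeout is an assumed value:

```go
package webhookserver

import (
	"context"
	"net/http"
	"time"
)

// shutdownWebhookServer drains a webhook server gracefully: stop persistent
// connections, wait out the routing-convergence window, then close listeners.
func shutdownWebhookServer(srv *http.Server, shutdownDelay time.Duration) error {
	// Persistent connections are closed after their next request; clients
	// gradually migrate to replicas that are not shutting down.
	srv.SetKeepAlivesEnabled(false)

	// Keep accepting connections until kube-proxy stops routing traffic here.
	time.Sleep(shutdownDelay)

	// Close listeners and drain any in-flight requests.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	return srv.Shutdown(ctx)
}
```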
Why is this needed? Reading the intent of the PR, there is a context with a timeout that will be given to `Shutdown`, which allows those connections to be closed. Was that not enough?
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages PRs according to the rules above.

You can reopen this PR with /reopen or mark it as fresh with /remove-lifecycle rotten.

Please send feedback to sig-contributor-experience at kubernetes/community.

/close
@k8s-triage-robot: Closed this PR in response to the /close above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Currently the controller-runtime signal handler cancels the context as soon as a termination signal is received:

controller-runtime/pkg/manager/signals/signal.go, lines 27 to 30 in 1657cf6
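For readers without the permalink, the referenced handler is, in essence, the following (paraphrased from pkg/manager/signals; see the link above for the exact lines):

```go
package signals

import (
	"context"
	"os"
	"os/signal"
	"syscall"
)

// SetupSignalHandler returns a context that is cancelled as soon as the first
// SIGTERM or SIGINT arrives; a second signal exits the program directly.
func SetupSignalHandler() context.Context {
	ctx, cancel := context.WithCancel(context.Background())

	c := make(chan os.Signal, 2)
	signal.Notify(c, os.Interrupt, syscall.SIGTERM)
	go func() {
		<-c
		cancel() // shutdown begins immediately, with no delay
		<-c
		os.Exit(1)
	}()

	return ctx
}
```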
This is problematic for controller-managers serving webhooks, since graceful webhook server shutdown may begin before kube-proxy has observed the termination (and reconciled its local iptables rules), leading to connection attempts after the webhook server's listeners have already been closed.
This PR adds support for a `ShutdownDelay` webhook server option, allowing users to delay webhook server shutdown and give time for routes to be updated, so that clients connect to other server replicas before the server's listeners are closed. In particular, on GKE the kube-proxy `--iptables-min-sync-period` flag is currently set to 10 seconds, so a delay duration of 15 seconds might make sense on such clusters.