fix(kuma-cp): graceful components #4277
Merged
Summary
To implement graceful shutdown, I had to disable signal propagation from Kuma DP to Envoy, which means we have to send the signal ourselves. The disadvantage is that if we fail to do so, Envoy (and other subprocesses) will keep running. This is not a problem on Kubernetes, since the whole Pod goes down, but it is a problem on VMs.
There was a race: we were not waiting until all components had stopped. This could result in the Kuma DP process stopping without cancelling the context that is passed to the Envoy command, in which case SIGTERM is never sent to Envoy.
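The race can be illustrated with a small, self-contained sketch. This is not the code from this PR: the `component` and `gracefulComponent` interfaces, `fakeProcess`, and `runAll` are hypothetical stand-ins, and only the method name `WaitForDone` is borrowed from this description (its signature here is an assumption). The point is that the runner must block on components that own external resources before it returns; otherwise the process can exit before the cleanup step (here a simulated signal delivery) has run.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// component is a hypothetical stand-in for a long-running unit of work
// that runs until the stop channel is closed.
type component interface {
	Start(stop <-chan struct{}) error
}

// gracefulComponent is an assumed optional extension: WaitForDone blocks
// until the component has finished its own cleanup.
type gracefulComponent interface {
	component
	WaitForDone()
}

// fakeProcess simulates a component that owns a subprocess and needs a
// moment after stop to deliver a termination signal to it.
type fakeProcess struct {
	done      chan struct{}
	mu        sync.Mutex
	signalled bool
}

func newFakeProcess() *fakeProcess {
	return &fakeProcess{done: make(chan struct{})}
}

func (p *fakeProcess) Start(stop <-chan struct{}) error {
	defer close(p.done) // closed only after cleanup below has completed
	<-stop
	time.Sleep(10 * time.Millisecond) // simulated signal delivery
	p.mu.Lock()
	p.signalled = true
	p.mu.Unlock()
	return nil
}

func (p *fakeProcess) WaitForDone() { <-p.done }

// Signalled reports whether the simulated cleanup has run.
func (p *fakeProcess) Signalled() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.signalled
}

// runAll starts every component, blocks until stop is closed, and then
// waits for each component that implements gracefulComponent. Without
// the second loop, runAll could return before cleanup has happened.
func runAll(stop <-chan struct{}, components ...component) {
	for _, c := range components {
		go func(c component) { _ = c.Start(stop) }(c)
	}
	<-stop
	for _, c := range components {
		if g, ok := c.(gracefulComponent); ok {
			g.WaitForDone()
		}
	}
}

func main() {
	stop := make(chan struct{})
	p := newFakeProcess()
	go func() {
		time.Sleep(5 * time.Millisecond)
		close(stop)
	}()
	runAll(stop, p)
	fmt.Println("signalled before exit:", p.Signalled())
}
```

Using an optional interface with a type assertion means components that do not manage external resources need no extra method, at the cost of a runtime check in the runner.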
I implemented `GracefulComponent`, which adds optional logic for waiting until the component has finished. The logic is specific to each component. In the case of Envoy and CoreDNS, the main process waits until the command finishes.
I considered adding `WaitForDone` to `Component`, but this would result in empty implementations for tens of components, and it is not that relevant for components that do not manage external resources. I expect that moving this to `Component` will be done in #1001.
Issues resolved
Not reported, I noticed it while testing.
Documentation
No extra docs.
Testing
Backwards compatibility
- [ ] Update `UPGRADE.md` with any steps users will need to take when upgrading.