-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 prevent leader election when shutting down a non-elected manager #2724
🐛 prevent leader election when shutting down a non-elected manager #2724
Conversation
|
Welcome @alexandremahdhaoui! |
Hi @alexandremahdhaoui. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
4ef9863
to
49681ff
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/ok-to-test
@@ -518,6 +518,8 @@ func (cm *controllerManager) engageStopProcedure(stopComplete <-chan struct{}) e | |||
|
|||
// Stop all the leader election runnables, which includes reconcilers. | |||
cm.logger.Info("Stopping and waiting for leader election runnables") | |||
// Prevent leader election when shutting down a non-elected manager | |||
cm.runnables.LeaderElection.startOnce.Do(func() {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a test, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the test in 06d575d.
49681ff
to
06d575d
Compare
|
||
Expect(m1.Add(RunnableFunc(func(ctx context.Context) error { | ||
defer GinkgoRecover() | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you simulate a Runnable that blocks until the context is canceled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
& make sure it is stopped before manager.Start returns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Edit] Done
aa2bbaa
to
66e510a
Compare
When leader election is enabled, a non-leader manager would never start the LeaderElection runnable group. Thus, as the shutdown process calls the sync.Once Start func of the runnableGroup; it will start a new election. This change ensures `Start` is ineffective during shutdown. The test ensures the LeaderElection runnableGroup is not started during shutdown. Signed-off-by: Alexandre Mahdhaoui <alexandre.mahdhaoui@gmail.com>
66e510a
to
d0c20b4
Compare
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cherrypick release-0.17
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alexandremahdhaoui, alvaroaleman The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherrypick release-0.17 |
@alvaroaleman: new pull request created: #2752 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Context
This PR should address a bug where the runnable group for Leader Election would start while shutting down the manager: #2719
What's happening?
controller-runtime/pkg/manager/internal.go
Line 344 in 395cfc7
controller-runtime/pkg/manager/internal.go
Line 521 in 395cfc7
controller-runtime/pkg/manager/internal.go
Line 521 in 395cfc7
controller-runtime/pkg/manager/runnable_group.go
Line 277 in 395cfc7
However when leader election is enabled, a non-leader manager will never start the LeaderElection runnable group. Thus, the sync.Once allow starting a new election during shutdown. The change in this PR ensures
Start
is ineffective during shutdown.manager.LeaderElection
'srunnableGroup.Start()
controller-runtime/pkg/manager/internal.go
Line 557 in 395cfc7
controller-runtime/pkg/manager/internal.go
Line 568 in 395cfc7
controller-runtime/pkg/manager/internal.go
Line 429 in 395cfc7
Changes
sync.Once
func associated with the LeaderElectionrunnableGroup
to prevent it from starting a new leader election.