Don't allow downscales if some shards are unassigned #3883
Merged
Conversation
In some conditions, for example when a Pod gets killed/restarted right before a downscale happens, the shards on that Pod are reported as UNASSIGNED. At this point it is dangerous to definitively remove the Pod (downscale the cluster), since we can't know for sure that the Pod to remove isn't supposed to hold any of the unassigned shards.

To avoid that situation, this commit disallows any downscale if some of the shards don't have a node assigned to them (regardless of their status, unassigned or not). The logic to allow a node to be downscaled is rather simple:

- all shards must have a node assigned to them
- the Pod to remove must not have a shard assigned to it

This is a rather conservative/safe approach that could be optimized in the future.

I tried implementing an e2e test for this, but it's tricky to set it up so that it consistently fails without this commit. I consider the unit tests good enough.
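The two rules above can be sketched in Go. This is a minimal illustration, not the operator's actual code: the `Shard` struct and `canDownscale` function are hypothetical names, assuming shards are retrieved from the Elasticsearch shards API with an empty node name meaning unassigned.

```go
package main

import "fmt"

// Shard is a simplified, hypothetical view of an Elasticsearch shard:
// NodeName is empty when the shard is unassigned.
type Shard struct {
	Index    string
	NodeName string
}

// canDownscale reports whether the node behind podName can safely be
// removed, following the two conservative rules from the PR description:
//  1. every shard must have a node assigned to it, and
//  2. the Pod to remove must not hold any shard.
func canDownscale(shards []Shard, podName string) bool {
	for _, s := range shards {
		if s.NodeName == "" {
			// An unassigned shard might belong to the node we want to
			// remove: refuse any downscale until it is assigned again.
			return false
		}
		if s.NodeName == podName {
			// The node still holds data: shards must be migrated away
			// before it can be removed.
			return false
		}
	}
	return true
}

func main() {
	shards := []Shard{
		{Index: "idx-1", NodeName: "es-data-0"},
		{Index: "idx-2", NodeName: ""}, // unassigned shard
	}
	// Any unassigned shard blocks every downscale, even of another node.
	fmt.Println(canDownscale(shards, "es-data-1"))
	// With all shards assigned elsewhere, the node may be removed.
	fmt.Println(canDownscale([]Shard{{Index: "idx-1", NodeName: "es-data-0"}}, "es-data-1"))
}
```

Rejecting on any unassigned shard (rather than only shards that might belong to the removed node) is what makes the approach conservative but safe.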
anyasabo reviewed Oct 28, 2020
anyasabo approved these changes Oct 28, 2020
Co-authored-by: Anya Sabo <anya@sabolicio.us>
run full pr build
sebgl added a commit to sebgl/cloud-on-k8s that referenced this pull request Oct 30, 2020
* Don't allow downscales if some shards are unassigned

* Improve comments

Co-authored-by: Anya Sabo <anya@sabolicio.us>
sebgl added a commit that referenced this pull request Oct 30, 2020
* Don't allow downscales if some shards are unassigned

* Improve comments

Co-authored-by: Anya Sabo <anya@sabolicio.us>
Fixes #3867.