-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Controlling discovery for decommissioned nodes #4590
Merged
shwetathareja
merged 46 commits into
opensearch-project:main
from
imRishN:decommission/controlled-discovery
Oct 7, 2022
Merged
Changes from 5 commits
Commits
Show all changes
46 commits
Select commit
Hold shift + click to select a range
a015a81
Controlling discovery for decommissioned nodes
imRishN 0a3f262
Fix spotless check
imRishN 24fcaaf
Add changelog
imRishN c9e0f5b
Empty-Commit
imRishN bb0547d
Add basic UT
imRishN 2ea4dc9
Add Consumer instead of listener
imRishN 2582c4b
Remove UT
imRishN 70b7d48
Refactor
imRishN a30b83d
Merge remote-tracking branch 'upstream/main' into decommission/contro…
imRishN da2fa40
Fix spotless check
imRishN 61bb893
Improve logging msg
imRishN d872065
Fix spotless check
imRishN bee48b4
Add log msg in join helper
imRishN d02f9ae
Update peer finder interval to 2 min during decommission
imRishN 6c0ce6c
Move flag to coordinator
imRishN 01f2d70
Merge remote-tracking branch 'upstream/main' into decommission/contro…
imRishN 8f82360
Merge remote-tracking branch 'upstream/main' into decommission/contro…
imRishN 1c497b7
Add UT
imRishN e4d1354
Fix spotless check
imRishN 1ac690d
Prevent join at join execute task and at coordinator. Add UTs
imRishN f448d78
Merge remote-tracking branch 'upstream/main' into decommission/contro…
imRishN adc4a8c
Move validator appropriately
imRishN 69f2784
Merge remote-tracking branch 'upstream/main' into decommission/contro…
imRishN c42586b
Fix spotless check
imRishN a53e578
Make method pkg private
imRishN edff382
Ensure decommissioned node don't become leader
imRishN 46c49b0
Add static helper for decommission flow
imRishN 63f6c00
Updates in pre voting round
imRishN b8cf3fe
Move commission check
imRishN 6a4c16f
Move commission check
imRishN 50a6e67
Move helpers to Service
imRishN c9db6b1
Fix executor
imRishN bb6f573
Remove UT
imRishN 8dd4457
Fix spotless check
imRishN 19c524d
Minor
imRishN c5b1c0d
Merge remote-tracking branch 'upstream/main' into decommission/contro…
imRishN 41cc099
Add built in join validator
imRishN 260641a
Fix
imRishN 8e161a6
Add UT for join
imRishN 4a932b7
Merge remote-tracking branch 'upstream/main' into decommission/contro…
imRishN 05d073e
Fix spotless check
imRishN 3b6c1b7
Changes in coordinator
imRishN 8b94b8d
Add UT for coordinator
imRishN 165b961
Fix spotless check
imRishN e4e1b5c
Add test for execute method of task
imRishN 0b3f106
Empty-Commit
imRishN File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current flow is:
The above logic will not work if the current master eligible node is not active leader and is a candidate accumulating joins Code Ref
Suggestion:
Ideally the decommissioning check should be executed at
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, can you please help me understand that when current leader eligible node is not active leader and is just accumulating joins, we just skip the validator code flow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out @shwetathareja. As per your suggestions, implementing the changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made these changes.
STATE_NOT_RECOVERED_BLOCK
)@shwetathareja , lmk if this looks ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, I explored more on join validators -
I think we can keep it as built in join validators, and Coordinator#handleJoinRequest validates all the checks here. So handle join request would be covered here. Another case I see where keeping it at BuiltinValidator would help is during handlePublishRequest. Ideally we would never hit this case as it runs only on Mode.Leader but I think for any such cases in future, it would be good to have all the validators at one place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Built validator delays the check which can be checked early on during handleJoinRequest itself. Can you help me understand the handlePublishRequest scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the snippet from handleJoinRequest of Coordinator. If you see, this method is also running the same built in validators
onJoinValidators.forEach(a -> a.accept(joinRequest.getSourceNode(), stateForJoinValidation));
. And hence the concern with early checking is actually already resolved hereThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might never actually hit the
handlePublishRequest
scenario as this would be run only forLeader
mode. But what I was trying to say is plugins put their join validators inonJoinValidators
and then Coordinator adds all other validators to thisonJoinValidators
. And then in handlePublishRequest and handleJoinRequest we validate with whatever validatorsonJoinValidators
has. Hence, keeping all the validators centrally inonJoinValidators
might be goodThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
handlePublishRequest is executed on the node joining the master. But with the check inside
handleJoinRequest
master will reject the joins upfront.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I meant the built in validators are also executed in handleJoinRequest