Ensure consistency of system flag on IndexMetadata after diff is applied #16644

Merged on Dec 4, 2024 (7 commits)

@cwperks cwperks commented Nov 14, 2024

Description

This PR ensures the consistency of the isSystem flag between leader and follower after the IndexMetadataDiff is applied when computing new cluster state.

This inconsistency can appear during rolling upgrades if nodes of a new version know of an index as a system index, but nodes of a previous version treat the same index as a regular index. For example, if a plugin retroactively declares an index to be a system index through SystemIndexPlugin.getSystemIndexDescriptors (Example PR), there can be a discrepancy in the IndexMetadata (for the same version) between the cluster manager node and the other nodes in the cluster.

This happens because, when the IndexMetadataDiff is applied, it takes the isSystem value from the previous metadata on the node instead of from the new metadata received in the incoming publishRequest. As a result, the diff here is computed incorrectly. The value should come from the diff rather than from the previous index metadata.
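
To illustrate the mechanism described above, here is a minimal sketch in Java. It is a simplified, hypothetical model and not the actual OpenSearch code: the classes IndexMeta and IndexMetaDiff and the two apply methods are invented stand-ins for IndexMetadata and IndexMetadataDiff, reduced to a version and a system flag. It assumes a follower whose local copy marks the index as regular while the cluster manager publishes it as a system index.

```java
// Simplified, hypothetical model of the inconsistency; NOT the real OpenSearch classes.
final class SystemFlagDiffSketch {

    /** Stand-in for IndexMetadata, reduced to the fields needed here. */
    static final class IndexMeta {
        final String name;
        final long version;
        final boolean isSystem;

        IndexMeta(String name, long version, boolean isSystem) {
            this.name = name;
            this.version = version;
            this.isSystem = isSystem;
        }
    }

    /** Stand-in for IndexMetadataDiff: carries the values of the newly published metadata. */
    static final class IndexMetaDiff {
        final long version;
        final boolean isSystem;

        IndexMetaDiff(IndexMeta after) {
            this.version = after.version;
            this.isSystem = after.isSystem;
        }

        /** Problematic behaviour: the flag is copied from the node's previous metadata. */
        IndexMeta applyTakingFlagFromPrevious(IndexMeta previous) {
            return new IndexMeta(previous.name, version, previous.isSystem);
        }

        /** Intended behaviour: the flag comes from the diff, like the version does. */
        IndexMeta applyTakingFlagFromDiff(IndexMeta previous) {
            return new IndexMeta(previous.name, version, isSystem);
        }
    }

    public static void main(String[] args) {
        // Cluster manager (new version) publishes the index as a system index.
        IndexMeta managerNew = new IndexMeta("my-index", 5, true);
        // Follower (older view) still holds the index as a regular index.
        IndexMeta followerPrevious = new IndexMeta("my-index", 4, false);

        IndexMetaDiff diff = new IndexMetaDiff(managerNew);
        IndexMeta stale = diff.applyTakingFlagFromPrevious(followerPrevious);
        IndexMeta consistent = diff.applyTakingFlagFromDiff(followerPrevious);

        // Same version on both nodes, but the flag only matches in the second case.
        System.out.println("flag from previous: version=" + stale.version + " isSystem=" + stale.isSystem);
        System.out.println("flag from diff:     version=" + consistent.version + " isSystem=" + consistent.isSystem);
    }
}
```

In this model, both results carry version 5, but only the second one agrees with the cluster manager on the system flag, which is the "same version, different IndexMetadata" discrepancy the description refers to.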

Related Issues

Resolves #16643

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Craig Perkins <cwperx@amazon.com>

@Bukhtawar Bukhtawar left a comment

Thanks for the quick fix. Ship it!

✅ Gradle check result for b526a45: SUCCESS

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.07%. Comparing base (b1bf72f) to head (b04bd2b).
Report is 1 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16644      +/-   ##
============================================
+ Coverage     72.04%   72.07%   +0.02%     
- Complexity    65157    65201      +44     
============================================
  Files          5318     5318              
  Lines        303993   303993              
  Branches      43990    43990              
============================================
+ Hits         219026   219107      +81     
+ Misses        66962    66919      -43     
+ Partials      18005    17967      -38     

@owaiskazi19

@cwperks this change needs a changelog entry

Signed-off-by: Craig Perkins <cwperx@amazon.com>

cwperks commented Nov 14, 2024

@cwperks this change needs a changelog entry

Added

@owaiskazi19 added the "backport 2.x" (Backport to 2.x branch) label Nov 14, 2024

✅ Gradle check result for 7ff0651: SUCCESS

cwperks commented Nov 19, 2024

Resolved conflicts in CHANGELOG

✅ Gradle check result for 76bde0e: SUCCESS

cwperks commented Nov 25, 2024

Resolved conflicts on CHANGELOG again. Can a maintainer help to merge this PR?

✅ Gradle check result for 8a8ee60: SUCCESS

Signed-off-by: Craig Perkins <cwperx@amazon.com>

github-actions bot commented Dec 2, 2024

❕ Gradle check result for 31f7b12: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Signed-off-by: Craig Perkins <cwperx@amazon.com>

cwperks commented Dec 3, 2024

Resolved conflicts in CHANGELOG

github-actions bot commented Dec 3, 2024

❕ Gradle check result for b04bd2b: UNSTABLE

  • TEST FAILURES:
      2 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock
      1 org.opensearch.search.SearchTimeoutIT.testSimpleTimeout {p0={"search.concurrent_segment_search.enabled":"false"}}

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@Bukhtawar Bukhtawar merged commit d199096 into opensearch-project:main Dec 4, 2024
38 checks passed
@opensearch-trigger-bot

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-16644-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 d1990962f37e65c4645a171f60867d0b971b83c6
# Push it to GitHub
git push --set-upstream origin backport/backport-16644-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-16644-to-2.x.

cwperks commented Dec 4, 2024

Creating a manual backport now

cwperks commented Dec 4, 2024

Created manual backport to resolve conflict in CHANGELOG: #16777

Bukhtawar pushed a commit that referenced this pull request Dec 10, 2024
…ied (#16644) (#16777)

* Ensure consistency of system flag on IndexMetadata after diff is applied

Signed-off-by: Craig Perkins <cwperx@amazon.com>
Labels
backport 2.x (Backport to 2.x branch), backport-failed, bug (Something isn't working), Plugins
Development

Successfully merging this pull request may close these issues.

[BUG] System Indices breaking cluster state invariance after upgrade from 2.15
5 participants