[Remote Store] Update index settings on shard movement during remote store migration #13316

Merged (26 commits) on Apr 30, 2024

shourya035
Member

@shourya035 shourya035 commented Apr 20, 2024

Description

Reopening this since #13253 got closed during a rebase from main.

The IndexMetadataUpdater logic runs on the cluster manager and applies shard state changes triggered by AllocationService. Depending on the state of a shard after the AllocationService decisions, the IndexMetadataUpdater currently updates in-sync allocation IDs or bumps primary terms in the IndexMetadata, which the cluster manager then publishes to the data nodes as a state update. The IndicesClusterStateService responds to these changes and performs the relevant actions, such as creating or updating the existing IndexShard instance. This PR reuses that code path and adds logic within the IndexMetadataUpdater to:

  • Append remote store path based index metadata when the first primary shard copy moves over to remote.
  • Apply remote store based index settings once all shard copies of an index move over to remote store enabled nodes.

Currently, this logic executes only when the cluster is in mixed mode and the migration direction is remote_store.
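The conditions described above can be sketched as a few predicates. This is a minimal, self-contained illustration and not the code from this PR: `RemoteMigrationSketch`, `ShardCopy`, and every method name are hypothetical, and the model assumes that "the first primary shard copy moves over to remote" means a primary copy is on a remote store enabled node while the index has no path metadata yet.

```java
import java.util.List;

/** Hypothetical, simplified model; none of these types are OpenSearch classes. */
final class RemoteMigrationSketch {

    enum CompatibilityMode { STRICT, MIXED }

    enum Direction { NONE, DOCREP, REMOTE_STORE }

    /** One shard copy and whether its node is remote store enabled (assumed model). */
    static final class ShardCopy {
        final boolean primary;
        final boolean onRemoteNode;

        ShardCopy(boolean primary, boolean onRemoteNode) {
            this.primary = primary;
            this.onRemoteNode = onRemoteNode;
        }
    }

    /** The new logic only runs in mixed mode while migrating towards remote store. */
    static boolean migrationActive(CompatibilityMode mode, Direction direction) {
        return mode == CompatibilityMode.MIXED && direction == Direction.REMOTE_STORE;
    }

    /** Path metadata is appended once: when a primary copy is first seen on a remote node. */
    static boolean shouldAppendRemotePathMetadata(List<ShardCopy> copies, boolean pathAlreadyPresent) {
        if (pathAlreadyPresent) {
            return false;
        }
        return copies.stream().anyMatch(c -> c.primary && c.onRemoteNode);
    }

    /** Remote store index settings are applied only when every copy is on a remote node. */
    static boolean shouldApplyRemoteStoreSettings(List<ShardCopy> copies) {
        return !copies.isEmpty() && copies.stream().allMatch(c -> c.onRemoteNode);
    }
}
```

In this sketch, an index with its primary on a remote node and a replica still on a non-remote node would get path metadata but not the remote store settings; the settings follow only after the replica has moved as well.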

Since the updated metadata would be handed over from the cluster manager to the data nodes during a state publication, this helps us in:

and

  • Remote Store features like shallow snapshots can be enabled as soon as all available shard copies of an index move over to the remote store enabled nodes.

Apart from this:

  • Updated the existing ITs under RemoteDualReplicationIT to accommodate the changes being introduced through this PR
  • Added validations, alongside those from "Add validation while updating CompatibilityMode setting" (#13080), that prevent switching the cluster from mixed to strict mode unless all indices in the cluster have the remote store based settings.
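
The strict mode validation in the last bullet could look roughly like the following. It is only a sketch under stated assumptions: `StrictModeValidationSketch` and its methods are hypothetical, index settings are modelled as plain string maps, and `index.remote_store.enabled` is assumed to be the setting that marks an index as remote store backed.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical sketch; not the validation code added by this PR. */
final class StrictModeValidationSketch {

    // Assumed setting key that marks an index as remote store backed.
    static final String REMOTE_STORE_ENABLED = "index.remote_store.enabled";

    /** Returns the sorted names of indices that do not carry the remote store setting. */
    static List<String> indicesWithoutRemoteSettings(Map<String, Map<String, String>> indexSettings) {
        return indexSettings.entrySet().stream()
            // A missing key yields null, which parseBoolean treats as false.
            .filter(e -> !Boolean.parseBoolean(e.getValue().get(REMOTE_STORE_ENABLED)))
            .map(Map.Entry::getKey)
            .sorted()
            .collect(Collectors.toList());
    }

    /** Rejects the switch from mixed to strict while any index lacks the setting. */
    static void validateSwitchToStrict(Map<String, Map<String, String>> indexSettings) {
        List<String> offending = indicesWithoutRemoteSettings(indexSettings);
        if (!offending.isEmpty()) {
            throw new IllegalArgumentException(
                "cannot switch to strict compatibility mode; indices without remote store settings: " + offending);
        }
    }
}
```

Collecting the offending index names before throwing lets the error message point at exactly which indices still need to finish migrating.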

Related Issues

Resolves #13252

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Failing checks are inspected and point to the corresponding known issue(s) (See: Troubleshooting Failing Builds)
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)
  • Public documentation issue/PR created

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Contributor

❌ Gradle check result for f14e231: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

❌ Gradle check result for ca4b1d9: FAILURE

Contributor

❌ Gradle check result for 667cffe: FAILURE

@shourya035 shourya035 self-assigned this Apr 20, 2024
@shourya035 shourya035 force-pushed the index-metadata-mutate branch from cae1886 to aa83dc3 Compare April 20, 2024 16:56
Contributor

❌ Gradle check result for cae1886: FAILURE

Contributor

❌ Gradle check result for 6f9f0c3: FAILURE

Contributor

✅ Gradle check result for 5c8df2c: SUCCESS

Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
Contributor

❌ Gradle check result for 98004f8: FAILURE

Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
@shourya035 shourya035 force-pushed the index-metadata-mutate branch from bc6d493 to e4ce087 Compare April 30, 2024 05:14
Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
Contributor

❌ Gradle check result for e4ce087: FAILURE

Contributor

❌ Gradle check result for bc6d493: FAILURE

Contributor

❕ Gradle check result for 455bc59: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.index.IndexServiceTests.testAsyncTranslogTrimTaskOnClosedIndex
      1 org.opensearch.cluster.allocation.ClusterRerouteIT.testDelayWithALargeAmountOfShards

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Contributor

❌ Gradle check result for 1179072: FAILURE

Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
Contributor

❌ Gradle check result for 4d37794: FAILURE

Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
Contributor

❌ Gradle check result for a19c5cd: FAILURE

@shourya035
Member Author

shourya035 commented Apr 30, 2024

❌ Gradle check result for a19c5cd: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

❌ Gradle check result for 4d37794: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Flaky test being fixed through: #13457

Contributor

✅ Gradle check result for a19c5cd: SUCCESS

@gbbafna gbbafna merged commit 1f406db into opensearch-project:main Apr 30, 2024
28 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Apr 30, 2024
…store migration (#13316)

Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
(cherry picked from commit 1f406db)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gbbafna pushed a commit that referenced this pull request Apr 30, 2024
…store migration (#13316) (#13466)

(cherry picked from commit 1f406db)

Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
finnegancarroll pushed a commit to finnegancarroll/OpenSearch that referenced this pull request May 10, 2024
…store migration (opensearch-project#13316)

Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
deshsidd pushed a commit to deshsidd/OpenSearch that referenced this pull request May 17, 2024
…store migration (opensearch-project#13316)

Signed-off-by: Shourya Dutta Biswas <114977491+shourya035@users.noreply.github.com>
Labels
backport 2.x, enhancement, skip-changelog, Storage:Remote, v2.14.0
Projects
Status: ✅ Done