[Remote Store] Integrate remote segment store in peer recovery flow #6664

Merged

sachinpkale
Member

@sachinpkale sachinpkale commented Mar 14, 2023

Description

  • Currently, the peer recovery flow copies segment files from the primary to the new replica.
  • With the remote segment store, the segment files are already present in the remote store.
  • In this change, we download segment files from the remote segment store, freeing the primary from the file transfer task.
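
As an illustration of the description above, here is a minimal sketch of the decision a recovery flow might make. All names in it (`SegmentSource`, `RecoveryPlanner`, `chooseSource`, `filesPrimaryMustSend`) are hypothetical and are not OpenSearch APIs; it assumes a simple boolean that says whether the remote segment store is enabled for the index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these types exist in OpenSearch.
enum SegmentSource { PRIMARY, REMOTE_STORE }

final class RecoveryPlanner {
    /** Picks where a recovering replica fetches its segment files from. */
    static SegmentSource chooseSource(boolean remoteStoreEnabled) {
        return remoteStoreEnabled ? SegmentSource.REMOTE_STORE : SegmentSource.PRIMARY;
    }

    /**
     * Lists the files the primary itself has to transfer.
     * Assumption: when the remote store serves the files, the primary sends none.
     */
    static List<String> filesPrimaryMustSend(boolean remoteStoreEnabled, List<String> segmentFiles) {
        if (chooseSource(remoteStoreEnabled) == SegmentSource.REMOTE_STORE) {
            return new ArrayList<>();
        }
        return new ArrayList<>(segmentFiles);
    }
}
```

With the flag set, the primary's transfer list is empty and the replica is expected to download from the remote store; without it, the list is the full set of segment files, as in the existing flow.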

Issues Resolved

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@sachinpkale
Member Author

The build is failing due to a flaky test: #6665

Sachin Kale added 2 commits March 16, 2023 14:52
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
@sachinpkale sachinpkale force-pushed the peer-recovery-integration branch from dd1f138 to 0889239 Compare March 16, 2023 09:22
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testReadRangeBlobWithRetries

Signed-off-by: Sachin Kale <kalsac@amazon.com>
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.indices.replication.SegmentReplicationRelocationIT.testFlushAfterRelocation

Comment on lines +4380 to +4389
    syncSegmentsFromRemoteSegmentStore(overrideLocal, true);
}

/**
 * Downloads segments from remote segment store.
 * @param overrideLocal flag to override local segment files with those in remote store
 * @param refreshLevelSegmentSync last refresh checkpoint is used if true, commit checkpoint otherwise
 * @throws IOException if exception occurs while reading segments from remote store
 */
public void syncSegmentsFromRemoteSegmentStore(boolean overrideLocal, boolean refreshLevelSegmentSync) throws IOException {
Collaborator

Can segment sync be agnostic of the source, such as a peer node or the remote store?

Collaborator

This looks quite challenging to me, as the code paths are completely different. I would like to know @sachinpkale's view as well.

The paths also differ in other parts, such as restore, so this is consistent across all of them.

Member Author

As @gbbafna mentioned, this method is not specific to peer recovery. We use the same method in the restore and failover flows.

Member Author

In this change, we make sure that segments are downloaded from the remote store instead of the source node. The remote store is not replacing the source node. I have not given this much thought yet, but given the peer recovery flow, it would be a much bigger change.

Collaborator

The method name itself doesn't provide an abstraction. Let's open a follow-up issue to clean this up?

Member Author

> The method name itself doesn't provide an abstraction. Let's open a follow-up issue to clean this up?

Okay, will create a tracking issue.

Member Author

Created issue: #6831
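
For readers following this thread, the source-agnostic sync that was asked about could look roughly like the sketch below. It is only an illustration of the idea, not the design tracked in the follow-up issue: `SegmentFileSource`, `InMemorySource`, and `SegmentSyncer` are hypothetical names, files are modelled as a map from file name to checksum, and the reading of `overrideLocal` as "copy every file regardless of the local copy" is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: these types are illustrative and not part of OpenSearch.
interface SegmentFileSource {
    /** Returns file name -> checksum for the segment files this source can provide. */
    Map<String, String> listFiles();
}

/** Stand-in for any source (peer node, remote store), backed by an in-memory map. */
final class InMemorySource implements SegmentFileSource {
    private final Map<String, String> files;

    InMemorySource(Map<String, String> files) {
        this.files = new LinkedHashMap<>(files);
    }

    @Override
    public Map<String, String> listFiles() {
        return new LinkedHashMap<>(files);
    }
}

final class SegmentSyncer {
    /**
     * Copies into local every file that is missing or has a different checksum.
     * Assumption: with overrideLocal set, every file from the source is copied.
     * Returns the number of files copied.
     */
    static int sync(SegmentFileSource source, Map<String, String> local, boolean overrideLocal) {
        int copied = 0;
        for (Map.Entry<String, String> entry : source.listFiles().entrySet()) {
            boolean sameAsLocal = entry.getValue().equals(local.get(entry.getKey()));
            if (overrideLocal || !sameAsLocal) {
                local.put(entry.getKey(), entry.getValue());
                copied++;
            }
        }
        return copied;
    }
}
```

The caller only sees `SegmentFileSource`, so a peer-backed and a remote-store-backed implementation could be swapped without changing the sync logic. As noted above, the real code paths differ far more than this sketch suggests.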

Collaborator

@gbbafna gbbafna left a comment

Changes look good to me. We can merge once @Bukhtawar's concerns are addressed.

@codecov-commenter

Codecov Report

Merging #6664 (0746e3c) into main (e4d9fb5) will increase coverage by 0.16%.
The diff coverage is 30.00%.

@@             Coverage Diff              @@
##               main    #6664      +/-   ##
============================================
+ Coverage     70.60%   70.76%   +0.16%     
- Complexity    59151    59254     +103     
============================================
  Files          4810     4810              
  Lines        283502   283510       +8     
  Branches      40884    40885       +1     
============================================
+ Hits         200153   200626     +473     
+ Misses        66869    66436     -433     
+ Partials      16480    16448      -32     
Impacted Files Coverage Δ
...a/org/opensearch/test/OpenSearchIntegTestCase.java 56.66% <0.00%> (+1.32%) ⬆️
...ch/indices/recovery/PeerRecoveryTargetService.java 53.28% <25.00%> (+0.79%) ⬆️
...in/java/org/opensearch/index/shard/IndexShard.java 69.57% <66.66%> (-0.49%) ⬇️

... and 512 files with indirect coverage changes

@gbbafna gbbafna merged commit 07565ad into opensearch-project:main Mar 25, 2023
@gbbafna gbbafna added the backport 2.x Backport to 2.x branch label Mar 25, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 25, 2023
…6664)

Signed-off-by: Sachin Kale <kalsac@amazon.com>
(cherry picked from commit 07565ad)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gbbafna pushed a commit that referenced this pull request Mar 27, 2023
…6664) (#6833)

(cherry picked from commit 07565ad)

Signed-off-by: Sachin Kale <kalsac@amazon.com>
mitrofmep pushed a commit to mitrofmep/OpenSearch that referenced this pull request Apr 5, 2023
…pensearch-project#6664)

Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Valentin Mitrofanov <mitrofmep@gmail.com>