[Remote Store] Add remote store restore API implementation #3642

Merged

sachinpkale
Member

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Description

  • This change adds the implementation of the remote store restore API.
  • Currently, only segment restore is supported. Once remote translog support is added, the implementation of this API will need to be modified to restore data from the remote translog as well.
  • The API endpoint is added in a separate PR: [Remote Store] Add rest endpoint for remote store restore #3576
  • An integration-test PR will follow, covering the end-to-end flow of segment upload and restore.

Issues Resolved

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -0,0 +1,211 @@
/*
Member Author

This file is added so that the corresponding code in RestoreService compiles without errors. It is the same file as the one defined in #3576.

@@ -0,0 +1,103 @@
/*
Member Author

The RestoreRemoteStoreRequest file is added so that the corresponding code in RestoreService compiles without errors. It is the same file as the one defined in #3576.

@opensearch-ci-bot
Collaborator

✅   Gradle Check success f51871fa10b061f84cf18518d97ec7fb8b2eb134
Log 6189

Reports 6189

@sachinpkale sachinpkale marked this pull request as ready for review June 22, 2022 03:35
@sachinpkale sachinpkale requested review from a team and reta as code owners June 22, 2022 03:35
@sachinpkale sachinpkale force-pushed the feature/restore-api-impl branch from 57b52b6 to 405689c Compare June 26, 2022 09:51
@opensearch-ci-bot
Collaborator

✅   Gradle Check success 405689c53207cd7777a41b93d45923ba4c63df84
Log 6344

Reports 6344

Comment on lines +120 to +133
void recoverFromRemoteStore(final IndexShard indexShard, ActionListener<Boolean> listener) {
    if (canRecover(indexShard)) {
        RecoverySource.Type recoveryType = indexShard.recoveryState().getRecoverySource().getType();
        assert recoveryType == RecoverySource.Type.REMOTE_STORE : "expected remote store recovery type but was: " + recoveryType;
        ActionListener.completeWith(recoveryListener(indexShard, listener), () -> {
            logger.debug("starting recovery from remote store ...");
            recoverFromRemoteStore(indexShard);
            return true;
        });
    } else {
        listener.onResponse(false);
    }
}

Collaborator

Can we extend StoreRecovery to RemoteStoreRecovery and override recoverFromStore?

Member Author

StoreRecovery currently contains recoverFromStore, recoverFromRepository, and recoverFromLocalShards, and the source of recovery is different for each of these methods. That is why I added one more method whose recovery source is the remote store.
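The control flow of the new method can be sketched without any OpenSearch classes. The snippet below is a minimal illustration, not the code from this PR: StoreRecoverySketch, SourceType, and Listener are hypothetical stand-ins, and a wrong source type is reported through onFailure here, whereas the diff above uses an assert.

```java
/** Hypothetical stand-in for the recovery flow discussed above; not OpenSearch code. */
final class StoreRecoverySketch {
    /** Simplified recovery source types, one per recovery method (assumed names). */
    enum SourceType { EXISTING_STORE, SNAPSHOT, LOCAL_SHARDS, REMOTE_STORE }

    /** Receives true when recovery ran, false when it was skipped. */
    interface Listener {
        void onResponse(boolean recovered);
        void onFailure(Exception e);
    }

    static void recoverFromRemoteStore(SourceType type, boolean canRecover, Runnable restoreSegments, Listener listener) {
        if (!canRecover) {
            // Mirrors the else branch in the diff: nothing to do, report "not recovered".
            listener.onResponse(false);
            return;
        }
        if (type != SourceType.REMOTE_STORE) {
            // Sketch-only choice: surface the mismatch to the caller instead of asserting.
            listener.onFailure(new IllegalStateException("expected remote store recovery type but was: " + type));
            return;
        }
        try {
            // Stand-in for copying segment files from the remote store.
            restoreSegments.run();
        } catch (RuntimeException e) {
            listener.onFailure(e);
            return;
        }
        listener.onResponse(true);
    }
}
```

Keeping one method per recovery source, as the author describes, means each entry point can check its own source type before doing any work.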

* @param request restore request
* @param listener restore listener
*/
public void restoreFromRemoteStore(RestoreRemoteStoreRequest request, final ActionListener<RestoreCompletionResponse> listener) {
Collaborator

The method exists in the snapshot package; we should extend it and abstract it out.

Member Author

Yes, the restoreSnapshot method is in the same class. I initially thought of extending it, but RestoreSnapshotRequest is different from RestoreRemoteStoreRequest, and even if we create a single method, it will still need a conditional check to direct the flow to snapshot restore or remote store restore.

Comment on lines +214 to +221
public ClusterState execute(ClusterState currentState) {
    // Updating cluster state
    ClusterState.Builder builder = ClusterState.builder(currentState);
    Metadata.Builder mdBuilder = Metadata.builder(currentState.metadata());
    ClusterBlocks.Builder blocks = ClusterBlocks.builder().blocks(currentState.blocks());
    RoutingTable.Builder rtBuilder = RoutingTable.builder(currentState.routingTable());

    List<String> indicesToBeRestored = new ArrayList<>();
Collaborator

How do we ensure that a concurrent snapshot recovery isn't happening? We need to sync on the shard state.

Member Author

I missed adding a check in the IndexRoutingTable.initializeAsRemoteStoreRestore() method. Will add.

Member Author

@sachinpkale sachinpkale Jun 30, 2022

Thanks for pointing it out. Made change to handle this. #3642 (comment)

Member Author

Wait! Forgot to commit index re-opening logic. Adding now.

Member Author

Added code to open the closed index as part of restore flow.

@sachinpkale sachinpkale force-pushed the feature/restore-api-impl branch from 405689c to a9efcc8 Compare June 30, 2022 07:02
Sachin Kale added 2 commits June 30, 2022 12:33
Signed-off-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>
@sachinpkale sachinpkale force-pushed the feature/restore-api-impl branch from a9efcc8 to 374471d Compare June 30, 2022 07:04
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@sachinpkale sachinpkale force-pushed the feature/restore-api-impl branch from 374471d to 527a653 Compare June 30, 2022 10:58
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

@sachinpkale
Member Author

Test failing: DiskThresholdDeciderIT.testHighWatermarkNotExceeded. This is a flaky test as mentioned here: https://build.ci.opensearch.org/job/gradle-check/99/console

Signed-off-by: Sachin Kale <kalsac@amazon.com>
@sachinpkale sachinpkale force-pushed the feature/restore-api-impl branch from 527a653 to e5809d0 Compare June 30, 2022 12:44
@github-actions
Contributor

Gradle Check (Jenkins) Run Completed with:

logger.warn("Remote store restore is not supported for non-existent index. Skipping: {}", index);
continue;
}
if (currentIndexMetadata.getState() != IndexMetadata.State.CLOSE) {
Member Author

We allow restore only for closed indices (similar to what snapshot restore does). Once an index is set for recovery, its state is changed to open. This check prevents two restores (remote/remote or remote/snapshot) from running on the same index at the same time.
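The eligibility rule described in this comment can be illustrated with a small, self-contained sketch. RestoreEligibilitySketch and its State enum are hypothetical stand-ins for the cluster-state types, and the sketch assumes that an open index is rejected with an exception; the diff excerpt above does not show the body of that branch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the closed-index check; not the code from this PR. */
final class RestoreEligibilitySketch {
    /** Simplified index states (stand-in for IndexMetadata.State). */
    enum State { OPEN, CLOSE }

    /**
     * Returns the requested indices that may be restored.
     * Non-existent indices are skipped; open indices are rejected (assumed behavior).
     */
    static List<String> indicesToBeRestored(List<String> requested, Map<String, State> currentStates) {
        List<String> restorable = new ArrayList<>();
        for (String index : requested) {
            State state = currentStates.get(index);
            if (state == null) {
                // Mirrors the "non-existent index. Skipping" warning in the diff.
                continue;
            }
            if (state != State.CLOSE) {
                // An open index may already be recovering, so a second restore is refused.
                throw new IllegalStateException("cannot restore index [" + index + "] because it is not closed");
            }
            restorable.add(index);
        }
        return restorable;
    }
}
```

Because a restore flips the index to open, a second restore request for the same index fails this check until the index is closed again.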

@sachinpkale sachinpkale requested a review from Bukhtawar June 30, 2022 13:17
@github-actions
Contributor

github-actions bot commented Jul 1, 2022

Gradle Check (Jenkins) Run Completed with:

@sachinpkale sachinpkale force-pushed the feature/restore-api-impl branch from b0fb31e to e1bfcba Compare July 1, 2022 13:17
@github-actions
Contributor

github-actions bot commented Jul 1, 2022

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Sachin Kale <kalsac@amazon.com>
@sachinpkale sachinpkale force-pushed the feature/restore-api-impl branch from e1bfcba to 5f7baf9 Compare July 1, 2022 14:02
@github-actions
Contributor

github-actions bot commented Jul 1, 2022

Gradle Check (Jenkins) Run Completed with:

private IndicesOptions indicesOptions = IndicesOptions.strictExpandOpen();
private boolean waitForCompletion;

public RestoreRemoteStoreRequest() {}
Collaborator

Shouldn't indices be a mandatory constructor argument?

Member Author

This is handled by overriding ActionRequest's validate() method.

Collaborator

I meant just from a standalone class perspective.

Member Author

Got it. It is a bit tricky as it depends on how RestRestoreRemoteStoreAction initializes the RestoreRemoteStoreRequest.

https://github.com/opensearch-project/OpenSearch/pull/3576/files#diff-03a57182daff0959e24e2fccf7fe28e7dedcd711ef76758e24df834dc0e884b7R41

Because indices is part of the POST body, fetching it from RestRequest requires parsing the body content. That parsing code is added in the source method of the same class.

I just followed the conventions used by other Request classes but open to suggestions.
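The trade-off discussed in this thread (a no-argument constructor plus validate(), rather than a mandatory constructor argument) can be sketched as follows. RestoreRequestSketch is a hypothetical class; in particular, validate() here returns a list of error messages, which is a simplification chosen for this sketch and not the return type used by ActionRequest.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical request class showing late validation; not RestoreRemoteStoreRequest itself. */
final class RestoreRequestSketch {
    private String[] indices = new String[0];

    /** No-argument constructor: the REST layer fills in fields after parsing the body. */
    RestoreRequestSketch() {}

    /** Fluent setter, called once the body has been parsed. */
    RestoreRequestSketch indices(String... indices) {
        this.indices = indices;
        return this;
    }

    String[] indices() {
        return indices;
    }

    /** Returns validation errors; an empty list means the request is valid. */
    List<String> validate() {
        List<String> errors = new ArrayList<>();
        if (indices == null || indices.length == 0) {
            errors.add("indices are missing");
        }
        return errors;
    }
}
```

With this shape an invalid request can be constructed but not executed, which is the standalone-class concern the reviewer raises.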

super.writeTo(out);
out.writeStringArray(indices);
indicesOptions.writeIndicesOptions(out);
out.writeBoolean(waitForCompletion);
Collaborator

writeOptionalBoolean?

Member Author

Changed
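For readers unfamiliar with the idea behind an optional boolean on the wire, here is a minimal sketch using only java.io. It is not the OpenSearch stream implementation: OptionalBooleanSketch is hypothetical, and the one-byte encoding (0 for false, 1 for true, 2 for absent) is an assumption made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical three-state boolean codec; the byte values are this sketch's own choice. */
final class OptionalBooleanSketch {
    static void writeOptionalBoolean(DataOutput out, Boolean value) throws IOException {
        // One byte distinguishes false, true, and "not set".
        out.writeByte(value == null ? 2 : (value ? 1 : 0));
    }

    static Boolean readOptionalBoolean(DataInput in) throws IOException {
        byte b = in.readByte();
        switch (b) {
            case 0: return Boolean.FALSE;
            case 1: return Boolean.TRUE;
            case 2: return null;
            default: throw new IOException("unexpected byte for optional boolean: " + b);
        }
    }

    /** Writes the value to an in-memory buffer and reads it back. */
    static Boolean roundTrip(Boolean value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            writeOptionalBoolean(new DataOutputStream(buffer), value);
            return readOptionalBoolean(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whatever encoding is used, the read side has to mirror the write side exactly, so a change like the one requested here touches both the writeTo method and the stream-input constructor.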

if (name.equals("indices")) {
if (entry.getValue() instanceof String) {
indices(Strings.splitStringByCommaToArray((String) entry.getValue()));
} else if (entry.getValue() instanceof ArrayList) {
Collaborator

Can it be ArrayList?

Member Author

Because indices is part of the POST body, it is better to read the body as a map from the XContentParser. This makes it easy to add new fields to the POST body later.
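The map-based parsing described here can be sketched in plain Java. RestoreSourceParserSketch is hypothetical, the body is assumed to be already parsed into a Map, and the comma splitting (with trimming of whitespace) is a simplification rather than the behavior of Strings.splitStringByCommaToArray.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for the "indices" field of an already-parsed request body. */
final class RestoreSourceParserSketch {
    static String[] parseIndices(Map<String, Object> source) {
        Object value = source.get("indices");
        List<String> names = new ArrayList<>();
        if (value instanceof String) {
            // Accept "a,b,c" as a convenience form.
            for (String part : ((String) value).split(",")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed);
                }
            }
        } else if (value instanceof List) {
            // Checking for List (not ArrayList) accepts any list implementation.
            for (Object element : (List<?>) value) {
                names.add(String.valueOf(element));
            }
        } else {
            throw new IllegalArgumentException("malformed indices section, should be a string or an array of strings");
        }
        return names.toArray(new String[0]);
    }
}
```

For example, both `Map.of("indices", "logs,metrics")` and `Map.of("indices", List.of("logs", "metrics"))` yield the same two-element array.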

Signed-off-by: Sachin Kale <kalsac@amazon.com>
@sachinpkale sachinpkale force-pushed the feature/restore-api-impl branch from fad667a to e0d88ee Compare July 5, 2022 04:34
@github-actions
Contributor

github-actions bot commented Jul 5, 2022

Gradle Check (Jenkins) Run Completed with:

@github-actions
Contributor

github-actions bot commented Jul 5, 2022

Gradle Check (Jenkins) Run Completed with:

@sachinpkale sachinpkale requested a review from Bukhtawar July 5, 2022 06:32
Collaborator

@Bukhtawar Bukhtawar left a comment

Thanks @sachinpkale
Is it possible to restore just one shard rather than the complete index from remote store?

@sachinpkale
Member Author

Thanks @sachinpkale Is it possible to restore just one shard rather than the complete index from remote store?

@Bukhtawar With the current flow, it makes an entry to the IndexRoutingTable for the given index. Need to explore if we can change IndexShardRoutingTable instead and enable restore for a particular shard.

Created a tracking issue: #3768

@Bukhtawar Bukhtawar merged commit 27b58ab into opensearch-project:main Jul 5, 2022
sachinpkale added a commit to sachinpkale/OpenSearch that referenced this pull request Sep 1, 2022
…h-project#3642)

* Add remote restore API implementation

Signed-off-by: Sachin Kale <kalsac@amazon.com>
sachinpkale added a commit to sachinpkale/OpenSearch that referenced this pull request Sep 2, 2022
…h-project#3642)

* Add remote restore API implementation

Signed-off-by: Sachin Kale <kalsac@amazon.com>
Bukhtawar pushed a commit that referenced this pull request Sep 2, 2022
…4380)

* [Remote Store] Upload segments to remote store post refresh (#3460)

* Add RemoteDirectory interface to copy segment files to/from remote store

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>

* Add index level setting for remote store

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>

* Add RemoteDirectoryFactory and use RemoteDirectory instance in RefreshListener

Co-authored-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>

* Upload segment to remote store post refresh

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>
Signed-off-by: Sachin Kale <kalsac@amazon.com>

* [Remote Store] Inject remote store in IndexShard instead of RemoteStoreRefreshListener (#3703)

* Inject remote store in IndexShard instead of RemoteStoreRefreshListener

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* Pass supplier of RepositoriesService to RemoteDirectoryFactory

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* Create isRemoteStoreEnabled function for IndexShard

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* Explicitly close remoteStore on indexShard close

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* Change RemoteDirectory.close to a no-op

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Co-authored-by: Sachin Kale <kalsac@amazon.com>

* [Remote Store] Add remote store restore API implementation (#3642)

* Add remote restore API implementation

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* [Remote Store] Add support to add nested settings for remote store (#4060)

* Add support to add nested settings for remote store

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* [Remote Store] Add rest endpoint for remote store restore (#3576)

* Add rest endpoint for remote store restore

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* [Remote Store] Add validator that forces segment replication type before enabling remote store (#4175)

* Add validator that forces segment replication type before enabling remote store

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* [Remote Store] Change remote_store setting validation message to make it more clear (#4199)

* Change remote_store setting validation message to make it more clear

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* [Remote Store] Add RemoteSegmentStoreDirectory to interact with remote segment store (#4020)

* Add RemoteSegmentStoreDirectory to interact with remote segment store

Signed-off-by: Sachin Kale <kalsac@amazon.com>

* Use RemoteSegmentStoreDirectory instead of RemoteDirectory (#4240)

* Use RemoteSegmentStoreDirectory instead of RemoteDirectory

Signed-off-by: Sachin Kale <kalsac@amazon.com>

Signed-off-by: Sachin Kale <kalsac@amazon.com>
Co-authored-by: Sachin Kale <kalsac@amazon.com>