-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ILM: add searchable snapshot action #52585
ILM: add searchable snapshot action #52585
Conversation
Pinging @elastic/es-core-features (:Core/Features/ILM+SLM) |
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @andreidan, I know it's still a WIP, but I left a bunch of comments!
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/CleanupSnapshotStep.java
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/CleanupSnapshotStep.java
Show resolved
Hide resolved
Objects.nonNull(indexPrefix); | ||
Objects.nonNull(settingsKeys); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These need to be Objects.requireNonNull
, currently it just returns true or false but doesn't throw an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch (I hadn't even acknowledged the existence of an API that just does obj != null
- TIL :) )
settings.put(key, value); | ||
} | ||
|
||
UpdateSettingsRequest updateSettingsRequest = new UpdateSettingsRequest(indexName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we do this as an AsyncActionStep
, but should we rather do this as a ClusterStateActionStep
to avoid having to actually issue the request and wait for it? I believe it would be a bit better then, and it would also allow us to copy settings that may not be settable (if we need that functionality in the future)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting. I agree it'd be a better avenue
"the ilm policy but has " + lifecycleState.getSnapshotName(); | ||
LifecycleExecutionState.Builder newCustomData = LifecycleExecutionState.builder(lifecycleState); | ||
String policy = indexMetaData.getSettings().get(LifecycleSettings.LIFECYCLE_NAME); | ||
String snapshotName = generateSnapshotName(generateSnapshotName("<{now/M}-" + index.getName() + "-" + policy + ">")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we generating the snapshot name twice here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, do we think {now/M}
is enough granularity? We could probably go with /d
to be a bit more granular?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this should use the same validation that SnapshotLifecyclePolicy.validate()
does for validating things (for example, no #
in the name). We should factor that validation out into a method and use it here.
getClient().admin().indices().aliases(aliasesRequest, new ActionListener<>() { | ||
|
||
@Override | ||
public void onResponse(AcknowledgedResponse response) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should check that the request was actually acknowledged and throw an exception if it wasn't so that it can be retried
|
||
import static org.elasticsearch.xpack.core.ilm.LifecycleExecutionState.fromIndexMetadata; | ||
|
||
public class TakeSnapshotStep extends AsyncActionStep { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better name might be CreateSnapshotStep
// TODO should we expose this? | ||
request.includeGlobalState(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should expose this (meaning to hardcode it as false like you have), these are specific snapshots used only for searchable snapshots
public void onResponse(CreateSnapshotResponse createSnapshotResponse) { | ||
listener.onResponse(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably check the status()
of the response, it's possible it could be a failure in which case I think we could retry again immediately?
/** | ||
* Optionally derives the index name using the provided prefix (if any) and waits for the status of the index to be GREEN. | ||
*/ | ||
public class WaitForGreenIndexHealthStep extends ClusterStateWaitStep { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could refactor the users of this class to use WaitForIndexColorStep
and then we wouldn't need to add this class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally, thanks for the tip. I didn't know of its existence
Pinging @elastic/es-ui (:ES-UI) |
@andreidan Would you mind adding some details to the PR description? The ES UI team would be particularly interested in changes to the API and an explanation of the use cases for new features. |
- renamed TakeSnapshotStep to CreateSnapshotStep - made the Create and RestoreSnapshotStep retry-able using the AsyncRetryDuringSnapshotActionStep infrastructure - CreateSnapshotStep checks the response status for internal server error and fails the step in this case so we retry before we go to the next steps that’ll discover the snapshot was not created successfully - CleanupSnapshotStep reports missing repository in a more detailed way - CopySettingsStep copies the settings using the cluster state directly - GenerateSnapshotNameStep validates the snapshot name
This redesign is meant to allow the client to specify that it wants to stop waiting and move to the “condition unfulfilled” side of the branch.
@elasticmachine update branch |
This speeds up the test a bit and also avoids a race condition where the explain API will not get a response as the ILM is busy executing cluster action steps for the restored index.
@elasticmachine update branch |
@elasticmachine update branch |
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay whew, sorry about the delay on this Andrei, I left a few more really minor comments, I think the only thing that's major is the null
check boolean flipping in WaitForSnapshotInProgressStep.java
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/CleanupSnapshotStep.java
Show resolved
Hide resolved
if (settingsKeys == null || settingsKeys.length == 0) { | ||
return clusterState; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super minor, but we can probably stick this before the targetIndexMetadata == null
check to avoid an error where no settings are copied?
public DeleteAction() { | ||
this(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth having a separate discussion (I can't remember if we discussed it) with David & Co about whether we want the default for this to be true or false (can be had after this PR)
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/GenerateSnapshotNameStep.java
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/MountSnapshotStep.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/OnAsyncWaitBranchingStep.java
Outdated
Show resolved
Hide resolved
return new Result(false, new Info(String.format(Locale.ROOT, | ||
"snapshot [%s] generated by policy [%s] for index [%s] is still in progress", snapshotName, policyName, indexName))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think we need to return true
here, it's possible that the snapshot was taken and completed (prior to the step being updated), and then the cluster was restarted, so the SnapshotsInProgress
is null and won't ever exist because the snapshot was successful, best to err on the side of moving forward and doing a check than be wedged I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleted the WaitForSnapshotInProgressStep altogether as we'll design a step that waits for a generation change in the repo metadata in a subsequent PR.
@elasticmachine update branch |
Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>
Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>
We'll add a new way to save snapshot status api calls based on the repository data generation in a future PR.
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I left two really minor typo change comments, thanks for iterating so much on this @andreidan!
* newly created searchable snapshot backed index. | ||
*/ | ||
public class SearchableSnapshotAction implements LifecycleAction { | ||
public static final String NAME = "searchable_snapshot"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this one may have been lost in some of the other reviews, but I still think it should be changed (to searchable-snapshot
)
import static org.elasticsearch.xpack.core.ilm.SearchableSnapshotAction.getCheckSnapshotStatusAsyncAction; | ||
import static org.hamcrest.Matchers.is; | ||
|
||
public class SearchableSnaposhotActionTests extends AbstractActionTestCase<SearchableSnapshotAction> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super minor, but there's a typo here (and in the name of the file):
SearchableSnaposhotActionTests
vs
SearchableSnapshotActionTests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it was hard to spot the typo, great eyes, thanks @dakrone !
Regarding the action name, I renamed it to "searchable_snapshot" (lowercase) on purpose when we renamed "snapshot-repository" to "snapshot_repository" as I thought it was confusing to have a configuration that mixes scores and underscores eg.:
"searchable-snapshot" : {
"snapshot_repository" : "snapshotRepositoryName"
}
I using underscores everywhere employs less cognitive load:
"searchable_snapshot" : {
"snapshot_repository" : "snapshotRepositoryName"
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other "composed" action we have is set_priority
so it would be inconsistent to use scores. If you don't have a strong opinion on this, I think I'd prefer we use underscores everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the other one as the step name rather than the action name, I agree that the action name should use underscores (like our other ones), sorry about that!
Thanks for reviewing this rather huge chunk of functionality Lee |
Add ILM support for searchable snapshots in the cold phase. Searchable snapshots are part of the lazy snapshot restores effort that's being tracked here
The API for configuring creating a searchable snapshot is: