-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support multiple repositories in get snapshot request #41799
Support multiple repositories in get snapshot request #41799
Conversation
Pinging @elastic/es-distributed |
@@ -80,7 +85,11 @@ public GetSnapshotsRequest(StreamInput in) throws IOException { | |||
@Override | |||
public void writeTo(StreamOutput out) throws IOException { | |||
super.writeTo(out); | |||
out.writeString(repository); | |||
if (out.getVersion().onOrAfter(Version.V_7_1_0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should fail BwC tests shouldn't it? I think we gotta first set it to V8_0_0
and then back port and adjust the constant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@original-brownbear thanks, I've fixed it in 5691315.
The sequence of BwC story would be the following:
- Merge this PR.
- Create separate PR to disable snapshot related BwC tests in master and change version to 7.2.0.
- Backport the change to 7.2.0.
- Enable tests.
🎉 Thank you for making this change! I assume the repository name will always be an array of strings, even when the snapshot only belongs to a single repository, correct?
From the UI standpoint, I don't see this being problematic, because AFAIK we're only requesting either all snapshots in all repositories or a single snapshot within a single repository. @jen-huang Can you think of any issues with this?
By not a problem, you mean that snapshot info is returned if the snapshots exist in any of the specified repositories, correct? |
Confirming that this is not a problem for the UI since the only time we request a particular snapshot is when we already know which repository it belongs to. |
@cjcenizal I've extended PR description with Example section, please let me know if I should clarify something else. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't make that API change to finalize. Each repository must know its own name (its in the metadata
field in the blob store repo for example).
server/src/main/java/org/elasticsearch/repositories/FilterRepository.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/repositories/Repository.java
Outdated
Show resolved
Hide resolved
@andrershov seems BwC tests are still failing. |
run elasticsearch-ci/1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment about the change to SnapshotInfo
, I think what we actually should adjust here is GetSnapshotsResponse
to only serialize what we return over the wire, not what is stored in the repository.
SnapshotId snapshotId = snapshotInfo.snapshotId(); | ||
allSnapshotIds.put(snapshotId.getName(), snapshotId); | ||
currentSnapshots.add(snapshotInfo); | ||
List<SnapshotInfo> snapshotInfos = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: We could use ActionListener#completeWith
here :)
@@ -54,6 +54,7 @@ | |||
private static final DateFormatter DATE_TIME_FORMATTER = DateFormatter.forPattern("strictDateOptionalTime"); | |||
private static final String SNAPSHOT = "snapshot"; | |||
private static final String UUID = "uuid"; | |||
private static final String REPOSITORY = "repository"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should add this field to this class:
We're now serializing the repository name and writing it to the repository itself (SnapshotInfo
is also written to the repository).
It seems to me we're only adding it here since we're also serializing SnapshotInfo
in the new transport response => we should create some new class that has all the info SnapshotInfo
contains + the repository name and serialize that over the wire?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@original-brownbear I don't see how adding repository field to SnapshotInfo
is harmful.
Yes, I see that BlobStoreRepository
writes SnapshotInfo
to disk and reads it, but it does not seem to be a problem if the repository name is immutable.
I find it natural that SnapshotInfo
, which has the following JavaDoc Information about a snapshot
contains information about the repository name for which this snapshot was taken.
At the same time, I'm not a proponent of adding unnecessary classes unless we can come up with a good argument why current implementation won't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It simply seems wrong and needless to go through all the steps of adding BwC, changing the format of what we serialize in the snapshot repository (which always introduces risk) etc. when we can resolve this trivially by simply changing the response serialization.
It would even be a smaller change.
=> I really don't see an argument for changing the serialization in the repo or over the wire when we already can get all the data we need inside the transport action. All we do with this change is store redundant data in the repo, add risk and BwC issues when we can have the same change without those downsides by simply fixing the response serialization.
Not adding a new class here to keep complexity down isn't really an argument either because the whole logic to figure out the state of things is so complex precisely because we are sharing SnapshotInfo
across functionalities here (sorta related to #40943 (comment)).
@cjcenizal I'm a bit worried about the following requirement in the original issue "In addition to providing information about which repository a snapshot belongs to, the endpoint should also return a collection of any errors that repositories respond with, associated with the name of the repository." |
@cjcenizal I think I need to extend response format like this:
And semantics would be changed in such a way, that if some exception happens when retrieving the snapshots from one repository this repository will appear in "errors" section and snapshots from well-functioning repositories will appear in snapshots section. There might be at most one error for one repository.
This format is also carefully chosen not to break BwC. If I were to design it from scratch I would probably do it differently:
But this clearly breaks BwC and I don't think we should create new endpoint. Let me know what do you think. |
@andrershov Thank you for catching that requirement! I'd forgotten about that. For context, the errors I've seen are generally those returned by misconfigured repositories when retrieving their snapshots, and so their shape and status code varies by repository type. Here's how we're currently collecting these errors. We request all snapshots of each repository, and sometimes this request results in an error. I've seen 400, 404, and 500 errors in these cases. In the UI we aren't actually surfacing these errors yet, but it's possible we will at some point. I think the response format you shared will work well for us. In the case there's a single top-level error, I assume that will return a 4xx status? @jen-huang Thoughts? |
With this newly proposed format, I see the following issues:
@original-brownbear what do you think? |
@andrershov @cjcenizal I'm wondering if we actually need:
Can't we simply throw the first error we encounter and let it bubble up to the top level like we did before? (or an aggregate error for all the repos). It seems kind of weird to me to collect the errors separately when you could argue that the request simply failed if any single repository failed since one wanted to search through all of them. Otherwise this situation that @andrershov points out gets really weird:
I'd go even further than this. If we run this request (say for a single snaptshot) for two repositories, and we don't find the snapshot in repo A and get some random error from repo B. Wouldn't we just want to return the error from repo B since it supersedes the To me it seems that since (I agree with this statement 100%):
it might not make too much sense to be able to just keep going in the UI when you should be fixing your repo settings in the first place? That said, if we think there's actually a use case for this. Why not just retain the current behavior for single snapshot + repo requests and list failures only when the request targets multiple repositories as well as multiple snapshots? That seems somewhat consistent to me. |
The way the UI will use this request is to display a list of all available snapshots, so in this case the important part of the response is the available snapshots, even if the request is partially failed.
The UI has two separate lists, one for repositories and one for snapshots. Currently, we're only surfacing errors in the list of repositories, which uses a different endpoint than the one changed in this PR. The list of snapshots will use the endpoint changed in this PR and, ideally, it will show the user all of their available snapshots so they can delete them, inspect them, restore from them, etc, even if some repositories are misconfigured.
|
@original-brownbear this a bulk request and bulk requests could partially fail and it's perfectly normal to return errors for those parts of the request that have failed and results for the successful parts. For example, we're doing the same for multi-get request.
So we separately report errors for each repository. And we don't need to be BwC here, because our previous API only supported one repo and we need to be BwC for one repo case.
And with the old response format we should return:
That's why I say that if there is a single repo in the request and it has failed, we return errors in the old format. But I'm not satisfied with it, because now new clients will contain special case handling logic for one repository case. The way people do BwC usually is to mix formats:
So that old clients can read "error" field as usual and new clients can read "failures" field. |
+1 to the first part, it seems tricky to find a nice BwC compatible solution here (though I don't know enough about our compatibility story to be able to tell how BwC we need to be here, I guess we can "team-discuss" it later :)). |
@cjcenizal we discussed it on the team meeting and we see 3 options:
We agreed that (3) is the simplest option and we will go with it. So this feature will be included in Elasticsearch release 8.0 and you should be able to use it starting with Kibana 8.0. For 7.x series, we propose you to continue sending requests to the repositories one-by-one. |
@andrershov That sounds great, thank you! The options you considered sound comprehensive, and I think the decision to make a clean break at a major release is the right one. Our current solution on the Kibana side should be fine until we hit 8.0. |
@cjcenizal @original-brownbear closing this in favour of #42090 |
This commit adds multiple repositories support to get snapshots request. If some repository throws an exception this method does not fail fast instead, it returns results for all repositories. This PR is opened in favour of elastic#41799, because we decided to change the response format in a non-BwC manner. It makes sense to read a discussion of the aforementioned PR. This is the continuation of work done here elastic#15151.
Description
This commit adds multiple repositories support to get snapshots request.
This is the continuation of work done here #15151.
_snapshot API
Now the following requests are supported:
This commit adds repository name to each
SnapshotInfo
returned.Previously, snapshots in the response were sorted by start time and then by snapshot id.
This commit sorts the snapshots first by the repository name, then start time and then by snapshot id.
When particular snapshot names are specified instead of
*
or_all
andignoreUnavailable=false
, then all these snapshots should exist in all repositories mentioned. This might be undesirable, probably reviewers can propose something better. Currently, you can specifyignoreUnvailable=true
, in this case, it's not a problem._cat/snapshots API
Cat snapshot API also supports multiple repositories now.
I'm not sure if it's needed and this change can be reverted.
is also supported and will return all snapshots for all repositories.
Interestingly enough, the previous implementation also was registering
/_cat/snapshots
filter but a request to this endpoint always resulted inImplementation
Technically,
TransportGetSnapshotsAction
callsTransportGetRepositoriesAction
which resolves repo names and wildcards in request to concrete repository names. After that, we just iterate over all returned repositories and are using the already implemented algorithm to get matching snapshots in the repositories.In terms of the transport level protocol, 2 classes are changed:
GetSnapshotsRequest
now accepts the list of repository names (including wildcards) instead of single repo name.SnapshotInfo
is extended to include repo name.In terms of REST protocol, 2 classes are changed:
SnapshotRequestsConverters.getSnapshots
now can understand the list of repository names inGetSnapshotRequest
and adds it as comma separated list to the request path.RestGetSnapshotsAction
can now parse the comma-separated list of the repo names.SnapshotInfo
XContent serialization addsrepository
field.RestSnapshotAction
which is responsible for handling cat API for the snapshots is extended with repository field as well.Testing
SharedClusterSnapshotRestoreIT.testGetSnapshotsMultipleRepos
tests that multiple repositories and wildcard work fine on transport level.SnapshotRequestConvertersTests.getSnapshots
tests that high-level REST client correctly sends the request when multiple repositories are used in the request.SnapshotIT.testGetSnapshots
tests getting 2 snapshots from 2 different repositories using high-level REST client.Documentation
Seems that it makes sense to make adjustments to the following documentation pages:
Both of them should talk about the ability to specify multiple repositories when querying the snapshots (TBD).
Examples
_snapshot API
Consider you have 2 repositories:
And you have one snapshot in each repository:
Now the following commands
will give you the same output
Note the repository field in each serialized
SnapshotInfo
object.The following request
results in
because snap1 should exist in all repositories for this request to be succesful.
However, you can specify
ignore_unavailable
flag:And get the following response:
This works, because
repo2
is still checked forsnap1
presence, but it's absense does not make the whole request failure.Of course, you still can retrieve a particular snapshot from a particular repository as usual:
And get the following response back (same as before this commit, but enriched with repository field):
_cat/snapshot API
Consider you have 2 repositories:
And you have one snapshot in each repository:
Now you can get all the snapshots using any of this API calls:
The response will contain the repository name:
Closes #41210