Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect generational files in recoveryDiff #55239

Closed

Conversation

DaveCTurner
Copy link
Contributor

Today MetadataSnapshot#recoveryDiff considers the .liv file as per-commit
rather than per-segment and often transfers them during peer recoveries and
snapshot restores. It also considers differences in .fnm, .dvd and .dvm
files as indicating a difference in the whole segment, even though these files
may be adjusted without changing the segment itself.

This commit adjusts this logic to attach these generational files to the
segments themselves, allowing Elasticsearch only to transfer them if they are
genuinely needed.

Closes #55142
Resolves an outstanding //NORELEASE action related to #50999.

Today `MetadataSnapshot#recoveryDiff` considers the `.liv` file as per-commit
rather than per-segment and often transfers them during peer recoveries and
snapshot restores. It also considers differences in `.fnm`, `.dvd` and `.dvm`
files as indicating a difference in the whole segment, even though these files
may be adjusted without changing the segment itself.

This commit adjusts this logic to attach these generational files to the
segments themselves, allowing Elasticsearch only to transfer them if they are
genuinely needed.

Closes elastic#55142
Resolves an outstanding `//NORELEASE` action related to elastic#50999.
@DaveCTurner DaveCTurner added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs :Distributed Indexing/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. v8.0.0 v7.8.0 labels Apr 15, 2020
@DaveCTurner DaveCTurner requested review from tlrx, s1monw and ywelsch April 15, 2020 14:31
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Store)

final List<StoreFileMetadata> perCommitStoreFiles = new ArrayList<>();

for (StoreFileMetadata meta : this) {
if (IndexFileNames.OLD_SEGMENTS_GEN.equals(meta.name())) { // legacy
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is no longer relevant but looking for confirmation from someone who knows the history here.

final Field.Store textFieldStored = random().nextBoolean() ? Field.Store.YES : Field.Store.NO;
doc.add(new TextField("body", textFieldContent, textFieldStored));
final String docValueFieldContent = TestUtil.randomRealisticUnicodeString(random());
doc.add(new BinaryDocValuesField("dv", new BytesRef(docValueFieldContent)));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of changes is mostly for extra logging, the one semantic change here is SortedDocValuesField becoming BinaryDocValuesField. I struggled to update the original SortedDocValuesField for some reason, I'm not too familiar with this API so maybe I missed something?

equalTo(newCommitMetadata.size() - 4)); // segments_N, cfs, cfe, si for the new segment
assertThat(newCommitDiff.toString(), newCommitDiff.different.size(), equalTo(0)); // the del file must be different
assertThat(newCommitDiff.toString(), newCommitDiff.missing.size(), equalTo(4)); // segments_N,cfs, cfe, si for the new segment
assertTrue(newCommitDiff.toString(), newCommitDiff.identical.stream().anyMatch(m -> m.name().endsWith(".liv")));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the .liv file is considered identical between the commits rather than different.

Copy link
Member

@tlrx tlrx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks David, I appreciated the doc and comments. This looks good to me but a second review from someone who better knows this part of the code is necessary.

@@ -76,8 +76,7 @@ public void testCreateAndRestoreSearchableSnapshot() throws Exception {
for (int i = between(10, 10_000); i >= 0; i--) {
indexRequestBuilders.add(client().prepareIndex(indexName).setSource("foo", randomBoolean() ? "bar" : "baz"));
}
// TODO NORELEASE no dummy docs since that includes deletes, yet we always copy the .liv file in peer recovery
indexRandom(true, false, indexRequestBuilders);
indexRandom(true, true, indexRequestBuilders);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@s1monw
Copy link
Contributor

s1monw commented Apr 18, 2020

FYI I merged a lucene change that will make this easier down the road. It will be in Lucene 8.6 apache/lucene-solr#1434

@ywelsch ywelsch removed :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs :Distributed Indexing/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. labels Apr 20, 2020
@DaveCTurner
Copy link
Contributor Author

@s1monw do you think we should wait for Lucene 8.6 here? I think that'd be ok, and it'd make this more robust.

@s1monw
Copy link
Contributor

s1monw commented Apr 21, 2020

@DaveCTurner the change in 8.6 will only work for segments that are changed by a Lucene 8.6 writer. I wonder if we need to do this for other segments too?

@DaveCTurner
Copy link
Contributor Author

Yes, we do need to handle older segments too, and I think what I've written here will be robust enough for them since we won't be writing any further segment generations without IDs.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rjernst rjernst added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 4, 2020
@fcofdez
Copy link
Contributor

fcofdez commented Sep 8, 2021

@DaveCTurner I'm wondering if we should update this PR and merge it? I can take care of it.

@DaveCTurner
Copy link
Contributor Author

Thanks @fcofdez please go ahead. Would be great to get this in.

@fcofdez
Copy link
Contributor

fcofdez commented Oct 5, 2021

Closing this via #77695

@fcofdez fcofdez closed this Oct 5, 2021
@DaveCTurner DaveCTurner deleted the 2020-04-15-dont-copy-liv-file branch July 23, 2022 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.16.0 v8.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incremental restores treat segments containing soft-deleted docs as "different"