[Remote Store] Add header(codec) and footer(checksum) to remote store segment metadata #5917
~15 UTs related to remote store are failing. I've fixed 11, and 4 are still failing. I will push a new commit after fixing those; that should also cover UTs for my changes specifically.
Codecov Report
@@            Coverage Diff             @@
##             main    #5917      +/-   ##
============================================
+ Coverage   70.67%   70.80%   +0.12%
- Complexity  58958    59029      +71
============================================
  Files        4799     4802       +3
  Lines      282432   282468      +36
  Branches    40716    40716
============================================
+ Hits       199622   200007     +385
+ Misses      66428    66012     -416
- Partials    16382    16449      +67
Thanks for the change @linuxpi
nit: Typo in word checksum
CodecUtil.checkHeader(
    in,
    UploadedSegmentMetadata.METADATA_CODEC,
    UploadedSegmentMetadata.CURRENT_VERSION,
What happens once we change the metadata file version from `1` to `2` and then read a metadata file of version `1`?
We can support a minimum metadata file version for each OS version and deprecate very old versions as we go along. `checkHeader` returns the version of the metadata file we read, which we can use to decide how to read the file.
But I feel it won't be this simple in the long run. Currently, migrating indexes to a new OS version is restricted by how Lucene restricts upgrades; adding more blockers/conditions to this might not be ideal. Ideally we should be able to automatically upgrade the metadata files to the new version when they are read. But we need to define, for each OS version, the minimum segment version we would support.
This could be part of the recovery-from-remote-store flow? I was thinking we could automatically upgrade the metadata files to new versions whenever possible during the recovery flow, and try to avoid changes to the metadata file where such an automatic upgrade is not possible.
I think we can spend some time thinking about it and use a uniform version evolution strategy for both translog and segment metadata files. Do you think we should create a separate issue for this, or include it in this PR?
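To make the "use the returned version to decide how to read" idea concrete, here is a minimal sketch using only the JDK. It is not the PR's code and does not use Lucene's on-disk format: the class name, the two layouts (version 1 stores only file names, version 2 also stores a length), and the constants `MIN_SUPPORTED_VERSION` and `UNKNOWN_LENGTH` are all assumptions made for illustration. `checkVersion` only mimics the min/max contract of `checkHeader`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reader that accepts a range of metadata versions. */
class VersionedMetadataReaderSketch {
    static final int MIN_SUPPORTED_VERSION = 1; // assumed oldest readable version
    static final int CURRENT_VERSION = 2;       // assumed newest version
    static final long UNKNOWN_LENGTH = -1L;     // placeholder for data version 1 lacks

    /** Mimics the checkHeader contract: reject out-of-range versions, return the one found. */
    static int checkVersion(int actual, int min, int max) {
        if (actual < min || actual > max) {
            throw new IllegalStateException(
                "metadata version " + actual + " outside supported range [" + min + ", " + max + "]");
        }
        return actual;
    }

    /** Test helper: writes the assumed layout for the requested version. */
    static byte[] write(int version, Map<String, Long> files) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(version);
            out.writeInt(files.size());
            for (Map.Entry<String, Long> e : files.entrySet()) {
                out.writeUTF(e.getKey());
                if (version >= 2) {
                    out.writeLong(e.getValue()); // field assumed to be added in version 2
                }
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads any supported version, filling in defaults for fields older versions lack. */
    static Map<String, Long> read(byte[] file) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
            int version = checkVersion(in.readInt(), MIN_SUPPORTED_VERSION, CURRENT_VERSION);
            int count = in.readInt();
            Map<String, Long> files = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                long length = version >= 2 ? in.readLong() : UNKNOWN_LENGTH;
                files.put(name, length);
            }
            return files;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Raising `MIN_SUPPORTED_VERSION` in a later release is how very old versions would be deprecated under this scheme; an automatic upgrade on read would amount to calling `write(CURRENT_VERSION, read(file))`.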
Created a separate issue for handling backward compatibility #6123
server/src/main/java/org/opensearch/index/store/RemoteSegmentStoreDirectory.java
Adding |
    in,
    UploadedSegmentMetadata.METADATA_CODEC,
    UploadedSegmentMetadata.CURRENT_VERSION,
    UploadedSegmentMetadata.CURRENT_VERSION
We could try keeping a mapping from OS version to this metadata version, like this: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/Version.java#L79-L93
Thanks @ashking94. Let me check on this more and see if we can follow a similar approach with metadata versions. We would definitely have to maintain some version mapping between metadata versions and OS versions. But I was thinking about how we can provide maximum compatibility across versions and better backward compatibility for data migration. I know Lucene already adds some restrictions there, but for metadata files maybe we can be a little more flexible.
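One possible shape for such a mapping, sketched with the JDK only. Everything here is hypothetical: the class name, the release numbers, and the numeric id scheme are placeholders chosen for illustration, not values from OpenSearch's `Version.java` or from this PR.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical table: OS release that introduced each metadata version. */
class MetadataVersionMappingSketch {
    // Key: release id, value: metadata version first written by that release.
    // The releases listed are made up for the example.
    private static final TreeMap<Integer, Integer> INTRODUCED_IN = new TreeMap<>();
    static {
        INTRODUCED_IN.put(id(2, 6, 0), 1);
        INTRODUCED_IN.put(id(2, 9, 0), 2);
    }

    /** Packs major.minor.patch into a sortable int (assumed scheme). */
    static int id(int major, int minor, int patch) {
        return major * 10_000 + minor * 100 + patch;
    }

    /** Metadata version that a given OS release writes. */
    static int metadataVersionFor(int osVersionId) {
        Map.Entry<Integer, Integer> entry = INTRODUCED_IN.floorEntry(osVersionId);
        if (entry == null) {
            throw new IllegalArgumentException("release " + osVersionId + " predates versioned metadata");
        }
        return entry.getValue();
    }

    /** A reader can handle any file version up to the one its own release writes. */
    static boolean canRead(int readerOsVersionId, int fileMetadataVersion) {
        return fileMetadataVersion >= 1 && fileMetadataVersion <= metadataVersionFor(readerOsVersionId);
    }
}
```

`floorEntry` picks the newest metadata version introduced at or before the given release, so patch releases need no entries of their own. A stricter backward-compatibility policy would replace the lower bound `1` in `canRead` with a per-release minimum.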
Overall LGTM. Security scan needs to be addressed maybe?
…nt metadata Signed-off-by: Varun Bansal <bansvaru@amazon.com>
… to MetadataParser 2. Add TODOs and Deprecate non-generic methods Signed-off-by: Varun Bansal <bansvaru@amazon.com>
…eading contract Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Looks like a known issue - #5576 (comment) |
We can go ahead with the merge: no new dependencies were introduced in this PR, but the Security Check is still failing. I saw a merged PR that had a similar issue (#6351). The security failure is a known issue: #5576 (comment)
…te store segment metadata (opensearch-project#5917)" This reverts commit 31434a3.
Signed-off-by: Varun Bansal bansvaru@amazon.com
Description
This PR adds a header (metadata codec and version) and a footer (metadata checksum) to remote segment metadata files.
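As a rough illustration of the header-plus-footer idea, the sketch below frames a small map with a codec name, a version, and a trailing CRC32, using only the JDK. It is a simplified stand-in, not the PR's implementation and not the byte layout Lucene's `CodecUtil` produces; the class name, the codec string, and `CorruptMetadataException` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.CRC32;

/** Hypothetical metadata file framed by a codec/version header and a checksum footer. */
class FramedMetadataSketch {
    static final String CODEC = "segment_md_sketch"; // assumed name, not the PR's constant
    static final int CURRENT_VERSION = 1;

    /** Unchecked stand-in for a corruption error. */
    static class CorruptMetadataException extends RuntimeException {
        CorruptMetadataException(String message) { super(message); }
    }

    /** Layout: codec name, version, entry count, entries, then CRC32 of all preceding bytes. */
    static byte[] write(Map<String, String> entries) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(CODEC);            // header: codec
            out.writeInt(CURRENT_VERSION);  // header: version
            out.writeInt(entries.size());
            for (Map.Entry<String, String> e : entries.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
            out.flush();
            out.writeLong(crc32(bytes.toByteArray(), bytes.size())); // footer: checksum
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("in-memory write failed", e);
        }
    }

    /** Verifies the footer first, then the header, then parses the entries. */
    static Map<String, String> read(byte[] file) {
        int bodyLength = file.length - Long.BYTES;
        if (bodyLength < 0) {
            throw new CorruptMetadataException("file too short to hold a footer");
        }
        long stored = ByteBuffer.wrap(file, bodyLength, Long.BYTES).getLong();
        if (stored != crc32(file, bodyLength)) {
            throw new CorruptMetadataException("checksum mismatch");
        }
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(file, 0, bodyLength));
            if (!CODEC.equals(in.readUTF())) {
                throw new CorruptMetadataException("unexpected codec");
            }
            if (in.readInt() != CURRENT_VERSION) {
                throw new CorruptMetadataException("unsupported version");
            }
            int count = in.readInt();
            Map<String, String> entries = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                entries.put(key, in.readUTF());
            }
            return entries;
        } catch (IOException e) {
            throw new CorruptMetadataException("truncated or malformed body: " + e);
        }
    }

    /** CRC32 over the first {@code length} bytes of {@code data}. */
    static long crc32(byte[] data, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, length);
        return crc.getValue();
    }
}
```

With this framing, a single flipped bit anywhere in the file, or a file written without the header and footer, is rejected at read time instead of being parsed into a wrong file map.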
Metadata version evolution/Backward Compatibility
As we update the metadata file contents in the future, we will increment `CURRENT_VERSION`. We need to think about how to handle segment metadata files of older versions in order to provide backward compatibility. Created a new issue to explore the possibilities. Similar handling is needed for translog metadata files as well.
Metadata files generated before this change are incompatible: a `CorruptIndexException` is thrown when they are read with this change in place. Since Remote Store was an experimental feature, we have decided not to support older metadata files.
Issues Resolved
#4605
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.