Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Remote Store] Add header(codec) and footer(checksum) to remote store segment metadata #5917

Merged
merged 13 commits into from
Feb 21, 2023

Conversation

linuxpi
Copy link
Collaborator

@linuxpi linuxpi commented Jan 18, 2023

Signed-off-by: Varun Bansal bansvaru@amazon.com

Description

This PR adds header(metadata codec and version) and footer(metadata checksum) to remote segment metadata files.

  • Adding header helps us to evolve the metadata file with new attributes gracefully in future
  • Adding footer helps us to check against data corruption during upload/download.

Metadata version evolution/Backward Compatibility

As we update the metadata file contents in future, we would increment the CURRENT_VERSION. We need to think about how we are going to handle segment metadata files for older versions to provide support for backward compatibility. Created an new issue to explore the possibilities. Similar handling is needed for translog metadata files as well.

For the metadata files generated before this change, they would be incompatible and CorruptIndexException would be thrown when read with this change in. Since Remote store was an experimental feature we have decided to not support older metadata files.

Issues Resolved

#4605

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@linuxpi linuxpi changed the title Add header(codec) and footer(cehcksum) to remote store segment metadata [Remote Segment Store] Add header(codec) and footer(cehcksum) to remote store segment metadata Jan 18, 2023
@linuxpi linuxpi changed the title [Remote Segment Store] Add header(codec) and footer(cehcksum) to remote store segment metadata [Remote Store] Add header(codec) and footer(cehcksum) to remote store segment metadata Jan 18, 2023
@linuxpi linuxpi force-pushed the segment-md-header-footer branch from bdc70a5 to ccad01b Compare January 18, 2023 09:02
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@linuxpi linuxpi force-pushed the segment-md-header-footer branch from ccad01b to 0053780 Compare January 18, 2023 10:31
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@linuxpi
Copy link
Collaborator Author

linuxpi commented Jan 19, 2023

~15 UTs related to remote are failing. i've fixed 11 and 4 are still failing. will push new commit after fixing those. That should also cover UTs for my changes specifically.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@linuxpi linuxpi force-pushed the segment-md-header-footer branch from 1d68264 to 6da3934 Compare January 22, 2023 04:38
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@linuxpi linuxpi force-pushed the segment-md-header-footer branch from 6da3934 to 6bea374 Compare January 22, 2023 05:24
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.search.SearchWeightedRoutingIT.testSearchAggregationWithNetworkDisruption_FailOpenEnabled

@codecov-commenter
Copy link

codecov-commenter commented Jan 22, 2023

Codecov Report

Merging #5917 (a204f98) into main (5b3b557) will increase coverage by 0.12%.
The diff coverage is 100.00%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@             Coverage Diff              @@
##               main    #5917      +/-   ##
============================================
+ Coverage     70.67%   70.80%   +0.12%     
- Complexity    58958    59029      +71     
============================================
  Files          4799     4802       +3     
  Lines        282432   282468      +36     
  Branches      40716    40716              
============================================
+ Hits         199622   200007     +385     
+ Misses        66428    66012     -416     
- Partials      16382    16449      +67     
Impacted Files Coverage Δ
...nsearch/common/io/VersionedCodecStreamWrapper.java 100.00% <100.00%> (ø)
...earch/index/store/RemoteSegmentStoreDirectory.java 99.30% <100.00%> (-0.01%) ⬇️
...x/store/remote/metadata/RemoteSegmentMetadata.java 100.00% <100.00%> (ø)
.../remote/metadata/RemoteSegmentMetadataHandler.java 100.00% <100.00%> (ø)
...ava/org/opensearch/action/NoSuchNodeException.java 0.00% <0.00%> (-50.00%) ⬇️
...opensearch/persistent/PersistentTasksExecutor.java 22.22% <0.00%> (-44.45%) ⬇️
...java/org/opensearch/threadpool/ThreadPoolInfo.java 56.25% <0.00%> (-37.50%) ⬇️
...pensearch/action/ingest/DeletePipelineRequest.java 31.25% <0.00%> (-37.50%) ⬇️
...main/java/org/opensearch/ingest/ProcessorInfo.java 21.05% <0.00%> (-26.32%) ⬇️
.../opensearch/client/indices/GetMappingsRequest.java 47.05% <0.00%> (-23.53%) ⬇️
... and 488 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@sachinpkale
Copy link
Member

Thanks for the change @linuxpi

@sachinpkale
Copy link
Member

nit: Typo in word checksum Add header(codec) and footer(cehcksum) to remote store segment metadata

CodecUtil.checkHeader(
in,
UploadedSegmentMetadata.METADATA_CODEC,
UploadedSegmentMetadata.CURRENT_VERSION,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens once we change the version of metadata file from 1 to 2 and reading a metadata file of version 1?

Copy link
Collaborator Author

@linuxpi linuxpi Jan 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can support a min version of metadata file for each OS version and deprecate very old versions as we go along. checkHeader returns the version of the metadata file we read. Which we can use to decide how to read the metadata file.

But I feel it won't this simple in the long run. Currently, migrating indexes to new OS version is restricted due to how lucene restricts upgrades, adding more blockers/conditions to this might not be ideal. Ideally we should be able to automatically upgrade the metadata files to new version when read. But we need to define a limit for each OS version what min version of segment we would support.

This could be part of recovery from remote store flow? I was thinking if we could automatically upgrade the metadata files to new versions whenever possible during recovery flow and try to avoid changes to metadata file where such automatic upgrade is not possible.

I think we can spend some time to think about it and use a uniform version evolution strategy for both translog and segment metadata files. do you think we should create a separate issue for this? or include in this PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a separate issue for handling backward compatibility #6123

@sachinpkale
Copy link
Member

Adding skip-changelog label as this could be part of bigger checksum validation story.

in,
UploadedSegmentMetadata.METADATA_CODEC,
UploadedSegmentMetadata.CURRENT_VERSION,
UploadedSegmentMetadata.CURRENT_VERSION
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ashking94. Let me check on this more and see if we can follow a similar approach with metadata versions. We would definitely have to maintain some version mapping b/w metadata versions and OS versions. But i was thinking about how can we provide maximum compatibility across versions and better backward compatibility for data migration. I know lucene already adds some restriction to that, but for metadata files maybe we can be a little flexible.

@linuxpi linuxpi changed the title [Remote Store] Add header(codec) and footer(cehcksum) to remote store segment metadata [Remote Store] Add header(codec) and footer(checksum) to remote store segment metadata Jan 27, 2023
@linuxpi linuxpi force-pushed the segment-md-header-footer branch from 6bea374 to c94e8b8 Compare January 31, 2023 19:44
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@linuxpi
Copy link
Collaborator Author

linuxpi commented Feb 1, 2023

Flaky Test failures in gradle check:

[org.opensearch.search.aggregations.metrics.MedianAbsoluteDeviationAggregatorTests.testQueryFiltering](https://build.ci.opensearch.org/job/gradle-check/10342/testReport/junit/org.opensearch.search.aggregations.metrics/MedianAbsoluteDeviationAggregatorTests/testQueryFiltering/)
[org.opensearch.search.aggregations.metrics.MedianAbsoluteDeviationAggregatorTests.testQueryFiltering](https://build.ci.opensearch.org/job/gradle-check/10342/testReport/junit/org.opensearch.search.aggregations.metrics/MedianAbsoluteDeviationAggregatorTests/testQueryFiltering_2/)
[org.opensearch.search.aggregations.metrics.MedianAbsoluteDeviationAggregatorTests.testQueryFiltering](https://build.ci.opensearch.org/job/gradle-check/10342/testReport/junit/org.opensearch.search.aggregations.metrics/MedianAbsoluteDeviationAggregatorTests/testQueryFiltering_3/)
[org.opensearch.search.aggregations.metrics.MedianAbsoluteDeviationAggregatorTests.testQueryFiltering](https://build.ci.opensearch.org/job/gradle-check/10342/testReport/junit/org.opensearch.search.aggregations.metrics/MedianAbsoluteDeviationAggregatorTests/testQueryFiltering_4/)`

@linuxpi linuxpi marked this pull request as ready for review February 1, 2023 04:43
@linuxpi linuxpi force-pushed the segment-md-header-footer branch from 97c2ff5 to a204f98 Compare February 21, 2023 05:25
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Security scan needs to be addressed maybe?

…nt metadata

Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
… to MetadataParser

2. Add TODOs and Deprecate non-generic methods

Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
…eading contract

Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
Signed-off-by: Varun Bansal <bansvaru@amazon.com>
@linuxpi linuxpi force-pushed the segment-md-header-footer branch from 56b19aa to 5967db4 Compare February 21, 2023 09:00
@linuxpi
Copy link
Collaborator Author

linuxpi commented Feb 21, 2023

Overall LGTM. Security scan needs to be addressed maybe?

Looks like a known issue - #5576 (comment)

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@linuxpi
Copy link
Collaborator Author

linuxpi commented Feb 21, 2023

We can go ahead with merge as no new dependencies were introduced in this PR but still Security Check is failing. Saw a PR merged which had similar issue - #6351

Security failure is a know issue - #5576 (comment)

@gbbafna gbbafna merged commit 31434a3 into opensearch-project:main Feb 21, 2023
@gbbafna gbbafna added the backport 2.x Backport to 2.x branch label Feb 21, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Feb 21, 2023
… segment metadata (#5917)

Signed-off-by: Varun Bansal <bansvaru@amazon.com>
(cherry picked from commit 31434a3)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gbbafna pushed a commit that referenced this pull request Feb 21, 2023
… segment metadata (#5917) (#6406)

(cherry picked from commit 31434a3)

Signed-off-by: Varun Bansal <bansvaru@amazon.com>
linuxpi added a commit to linuxpi/OpenSearch that referenced this pull request Mar 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants