Segment Replication - Fix NoSuchFileException errors caused when computing metadata snapshot on primary shards. (opensearch-project#4366)

* Segment Replication - Fix NoSuchFileException errors caused when computing metadata snapshot on primary shards.

This change fixes errors that occur when computing metadata snapshots on primary shards from the latest in-memory SegmentInfos. The error occurs when a segments_N file referenced by the in-memory infos is deleted as part of a concurrent commit. The segment files themselves are incref'd by IndexWriter.incRefDeleter, but the commit file (segments_N) is not. This change resolves the problem by ignoring the segments_N file when computing metadata for CopyState and sending only incref'd segment files to replicas.
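
As a rough illustration of the idea (not the code from this commit), the sketch below drops the commit file from a list of file names before metadata would be computed. The class `CopyStateFiles` and the method `withoutCommitFile` are hypothetical names, and the sketch assumes only that commit files are named with the `segments_` prefix.

```java
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of OpenSearch.
final class CopyStateFiles {
    // Assumption: commit files are named "segments_<generation>".
    private static final String COMMIT_FILE_PREFIX = "segments_";

    // Returns the file names, sorted, with any commit file removed,
    // so that only incref'd segment files remain.
    static List<String> withoutCommitFile(Collection<String> fileNames) {
        return fileNames.stream()
            .filter(name -> !name.startsWith(COMMIT_FILE_PREFIX))
            .sorted()
            .collect(Collectors.toList());
    }
}
```

Because the commit file is never listed, a concurrent commit that deletes it can no longer cause a failure while the remaining files are read.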

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix spotless.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Update StoreTests.testCleanupAndPreserveLatestCommitPoint to assert additional segments are deleted.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Rename snapshot to metadataMap in CheckpointInfoResponse.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Refactor segmentReplicationDiff method to compute off two maps instead of MetadataSnapshots.

Signed-off-by: Marc Handalian <handalm@amazon.com>
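
A minimal sketch of what a diff computed from two maps might look like follows. It is not the actual segmentReplicationDiff implementation: the class `ReplicationDiffSketch`, its `Diff` holder, and the use of a plain checksum string as the per-file metadata are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; names and types are illustrative, not OpenSearch APIs.
final class ReplicationDiffSketch {
    // Simple holder for the three buckets of file names.
    static final class Diff {
        final List<String> identical = new ArrayList<>();
        final List<String> different = new ArrayList<>();
        final List<String> missing = new ArrayList<>();
    }

    // Both maps go from file name to a checksum string (an assumption).
    // Files are classified from the source (primary) side's point of view.
    static Diff diff(Map<String, String> source, Map<String, String> target) {
        Diff result = new Diff();
        // TreeMap gives a deterministic, name-sorted iteration order.
        for (Map.Entry<String, String> entry : new TreeMap<>(source).entrySet()) {
            String targetChecksum = target.get(entry.getKey());
            if (targetChecksum == null) {
                result.missing.add(entry.getKey());      // replica lacks the file
            } else if (targetChecksum.equals(entry.getValue())) {
                result.identical.add(entry.getKey());    // same file on both sides
            } else {
                result.different.add(entry.getKey());    // same name, other content
            }
        }
        return result;
    }
}
```

Working from plain maps means the diff only needs file names and per-file metadata, not a full snapshot object from either side.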

* Fix spotless.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Revert catchall in SegmentReplicationSourceService.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Revert log lvl change.

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Fix SegmentReplicationTargetTests

Signed-off-by: Marc Handalian <handalm@amazon.com>

* Cleanup unused logger.

Signed-off-by: Marc Handalian <handalm@amazon.com>

Signed-off-by: Marc Handalian <handalm@amazon.com>
Co-authored-by: Suraj Singh <surajrider@gmail.com>
mch2 and dreamer-89 committed Sep 7, 2022
1 parent ceb0e17 commit 0b85fd7
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -55,6 +55,7 @@ Inspired from [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
- [Segment Replication] Extend FileChunkWriter to allow cancel on transport client ([#4386](https://github.com/opensearch-project/OpenSearch/pull/4386))
- [Segment Replication] Fix NoSuchFileExceptions with segment replication when computing primary metadata snapshots ([#4366](https://github.com/opensearch-project/OpenSearch/pull/4366))
- [Segment Replication] Fix timeout issue by calculating time needed to process getSegmentFiles ([#4434](https://github.com/opensearch-project/OpenSearch/pull/4434))
- [Segment Replication] Update replicas to commit SegmentInfos instead of relying on segments_N from primary shards.

### Security
