Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add transport action for primary term validation for remote-backed indices #5616

Merged
merged 16 commits into from
Dec 26, 2022

Conversation

ashking94
Copy link
Member

@ashking94 ashking94 commented Dec 22, 2022

Signed-off-by: Ashish Singh ssashish@amazon.com

Description

Solves #5464

Issues Resolved

Solves #5464

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ashish Singh <ssashish@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

codecov-commenter commented Dec 22, 2022

Codecov Report

Merging #5616 (5d546b2) into main (306b199) will increase coverage by 0.09%.
The diff coverage is 70.43%.

@@             Coverage Diff              @@
##               main    #5616      +/-   ##
============================================
+ Coverage     70.84%   70.93%   +0.09%     
- Complexity    58296    58339      +43     
============================================
  Files          4741     4740       -1     
  Lines        278541   278635      +94     
  Branches      40268    40273       +5     
============================================
+ Hits         197322   197654     +332     
+ Misses        65068    64839     -229     
+ Partials      16151    16142       -9     
Impacted Files Coverage Δ
...eplication/checkpoint/PublishCheckpointAction.java 27.58% <ø> (+5.17%) ⬆️
...ation/OpenSearchIndexLevelReplicationTestCase.java 89.81% <ø> (ø)
...ensearch/action/bulk/TransportShardBulkAction.java 76.71% <63.01%> (-3.14%) ⬇️
...upport/replication/TransportReplicationAction.java 76.56% <66.66%> (+0.39%) ⬆️
...support/replication/ReplicationModeAwareProxy.java 81.25% <70.00%> (-8.75%) ⬇️
...on/support/replication/FanoutReplicationProxy.java 85.71% <83.33%> (-14.29%) ⬇️
...tion/support/replication/ReplicationOperation.java 73.29% <100.00%> (-0.33%) ⬇️
...h/action/support/replication/ReplicationProxy.java 100.00% <100.00%> (+20.00%) ⬆️
...n/support/replication/ReplicationProxyRequest.java 100.00% <100.00%> (ø)
...tion/support/replication/TransportWriteAction.java 81.34% <100.00%> (ø)
... and 487 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Signed-off-by: Ashish Singh <ssashish@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Ashish Singh <ssashish@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@sachinpkale
Copy link
Member

Don't we need a changelog entry for this change?

No, while working for a bigger feature, we dont require a changelog. Check this - https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#which-changes-require-a-changelog-entry

But no-op replication can be considered as a big feature, right?

@ashking94
Copy link
Member Author

Don't we need a changelog entry for this change?

No, while working for a bigger feature, we dont require a changelog. Check this - https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#which-changes-require-a-changelog-entry

But no-op replication can be considered as a big feature, right?

No-op itself is not a feature, but a part of "Request level durability for remote backed indices". Refer this comment - #4954 (comment)

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we need to validate if clusterService.state().getNodes().getMinNodeVersion() of all nodes is greater than this version to ensure remote translog works. Since this can break during mixed versions rolling upgrades

@ashking94
Copy link
Member Author

Should we need to validate if clusterService.state().getNodes().getMinNodeVersion() of all nodes is greater than this version to ensure remote translog works. Since this can break during mixed versions rolling upgrades

Had discussed this offline with @gbbafna and @sachinpkale. Major point around this was - If there is a mixed cluster with let's say older version and 2.5, then a primary on 2.5 will send requests to replicas on older version which can fail to understand this. However, since this is an experimental feature, We can handle this when removing the experimental feature flag. This also seems to be the status quo as the segment replication feature which is also behind the experimental feature flag has PublishCheckpointAction which older version replicas would not understand when a index is created in a mixed cluster setup.

@Bukhtawar
Copy link
Collaborator

Should we need to validate if clusterService.state().getNodes().getMinNodeVersion() of all nodes is greater than this version to ensure remote translog works. Since this can break during mixed versions rolling upgrades

Had discussed this offline with @gbbafna and @sachinpkale. Major point around this was - If there is a mixed cluster with let's say older version and 2.5, then a primary on 2.5 will send requests to replicas on older version which can fail to understand this. However, since this is an experimental feature, We can handle this when removing the experimental feature flag. This also seems to be the status quo as the segment replication feature which is also behind the experimental feature flag has PublishCheckpointAction which older version replicas would not understand when a index is created in a mixed cluster setup.

Can you please open up a tracking issue for the same?

@ashking94
Copy link
Member Author

Can you please open up a tracking issue for the same?

Created #5635 for the same.

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add tests to ensure that primary term and no replication modes are used correctly across TransportActions

@ashking94 ashking94 changed the title Add new transport action for primary term validation Add transport action for primary term validation for remote-backed indices Dec 26, 2022
Signed-off-by: Ashish Singh <ssashish@amazon.com>
@ashking94
Copy link
Member Author

Can we add tests to ensure that primary term and no replication modes are used correctly across TransportActions

Added

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Ashish Singh <ssashish@amazon.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the UTs can we open a follow up for an Integ test for the transport calls and validating primary term failure cases during isolated writes

@Bukhtawar Bukhtawar merged commit b8c74bd into opensearch-project:main Dec 26, 2022
@ashking94
Copy link
Member Author

Tracking issue - #5636

ashking94 added a commit to ashking94/OpenSearch that referenced this pull request Jan 6, 2023
…dices (opensearch-project#5616)

Add transport action for primary term validation for remote-backed indices

Signed-off-by: Ashish Singh <ssashish@amazon.com>
ashking94 added a commit to ashking94/OpenSearch that referenced this pull request Jan 6, 2023
…dices (opensearch-project#5616)

Add transport action for primary term validation for remote-backed indices

Signed-off-by: Ashish Singh <ssashish@amazon.com>
ashking94 added a commit to ashking94/OpenSearch that referenced this pull request Jan 9, 2023
…dices (opensearch-project#5616)

Add transport action for primary term validation for remote-backed indices

Signed-off-by: Ashish Singh <ssashish@amazon.com>
gbbafna pushed a commit that referenced this pull request Jan 9, 2023
…5731)

* Remove PRRL creation/deletion in peer recovery of remote store enabled replica (#4954)

* Add RecoverySourceHandlerFactory for extensibility

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* recoverToTarget made extensible to allow multiple implementations

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Remove PRRL after SendFileStep in Peer Recovery

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Remove CHANGELOG entry as this is incremental PR

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Enhance CheckpointState to support no-op replication (#5282)

* CheckpointState enhanced to support no-op replication

Signed-off-by: Ashish Singh <ssashish@amazon.com>
Co-authored-by: Bukhtawar Khan<bukhtawa@amazon.com>
Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Add transport action for primary term validation for remote-backed indices (#5616)

Add transport action for primary term validation for remote-backed indices

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>
sachinpkale pushed a commit to sachinpkale/OpenSearch that referenced this pull request Jan 10, 2023
…pensearch-project#5731)

* Remove PRRL creation/deletion in peer recovery of remote store enabled replica (opensearch-project#4954)

* Add RecoverySourceHandlerFactory for extensibility

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* recoverToTarget made extensible to allow multiple implementations

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Remove PRRL after SendFileStep in Peer Recovery

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Remove CHANGELOG entry as this is incremental PR

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Enhance CheckpointState to support no-op replication (opensearch-project#5282)

* CheckpointState enhanced to support no-op replication

Signed-off-by: Ashish Singh <ssashish@amazon.com>
Co-authored-by: Bukhtawar Khan<bukhtawa@amazon.com>
Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Add transport action for primary term validation for remote-backed indices (opensearch-project#5616)

Add transport action for primary term validation for remote-backed indices

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>
gbbafna pushed a commit that referenced this pull request Jan 10, 2023
* Remove PRRL creation/deletion in peer recovery of remote store enabled replica (#4954)

* Add RecoverySourceHandlerFactory for extensibility

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* recoverToTarget made extensible to allow multiple implementations

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Remove PRRL after SendFileStep in Peer Recovery

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Remove CHANGELOG entry as this is incremental PR

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Enhance CheckpointState to support no-op replication (#5282)

* CheckpointState enhanced to support no-op replication

Signed-off-by: Ashish Singh <ssashish@amazon.com>
Co-authored-by: Bukhtawar Khan<bukhtawa@amazon.com>
Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Add transport action for primary term validation for remote-backed indices (#5616)

Add transport action for primary term validation for remote-backed indices

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>
Co-authored-by: Ashish <ssashish@amazon.com>
kotwanikunal pushed a commit that referenced this pull request Jan 25, 2023
…5731)

* Remove PRRL creation/deletion in peer recovery of remote store enabled replica (#4954)

* Add RecoverySourceHandlerFactory for extensibility

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* recoverToTarget made extensible to allow multiple implementations

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Remove PRRL after SendFileStep in Peer Recovery

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Empty-Commit

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Remove CHANGELOG entry as this is incremental PR

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Incorporate PR review feedback

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Enhance CheckpointState to support no-op replication (#5282)

* CheckpointState enhanced to support no-op replication

Signed-off-by: Ashish Singh <ssashish@amazon.com>
Co-authored-by: Bukhtawar Khan<bukhtawa@amazon.com>
Signed-off-by: Ashish Singh <ssashish@amazon.com>

* Add transport action for primary term validation for remote-backed indices (#5616)

Add transport action for primary term validation for remote-backed indices

Signed-off-by: Ashish Singh <ssashish@amazon.com>

Signed-off-by: Ashish Singh <ssashish@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
distributed framework skip-changelog Storage:Durability Issues and PRs related to the durability framework v2.5.0 'Issues and PRs related to version v2.5.0'
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants