
Remove PRRLs before performing file-based recovery #43928

Conversation

DaveCTurner
Contributor

If the primary performs a file-based recovery to a node that has (or recently
had) a copy of the shard then it is possible that the persisted global
checkpoint of the new copy is behind that of the old copy since file-based
recoveries are somewhat destructive operations.

Today we leave that node's PRRL in place during the recovery with the
expectation that it can be used by the new copy. However, this isn't the case if
the new copy needs more history to be retained, because retention leases may
only advance and never retreat.

This commit addresses this by removing any existing PRRL during a file-based
recovery: since we are performing a file-based recovery we have already
determined that there isn't enough history for an ops-based recovery, so there
is little point in keeping the old lease in place.

Caught by [a failure of `RecoveryWhileUnderLoadIT.testRecoverWhileRelocating`](https://scans.gradle.com/s/wxccfrtfgjj3g/console-log?task=:server:integTest#L14)

Relates #41536
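The advance-only invariant that motivates this change can be illustrated with a small standalone sketch. The class and method names here are hypothetical and model the behaviour only; the real logic lives in Elasticsearch's `ReplicationTracker`:

```java
// Minimal model of the advance-only invariant for peer recovery retention
// leases (PRRLs). Names are illustrative, not the actual Elasticsearch API.
import java.util.HashMap;
import java.util.Map;

public class PrrlModel {
    private final Map<String, Long> leases = new HashMap<>(); // lease id -> retained seq no

    // Renewing a lease may only advance (or keep) its retained sequence number.
    boolean renew(String leaseId, long retainedSeqNo) {
        Long current = leases.get(leaseId);
        if (current != null && retainedSeqNo < current) {
            return false; // leases never retreat
        }
        leases.put(leaseId, retainedSeqNo);
        return true;
    }

    // Before a file-based recovery, discard any stale lease so a fresh one can
    // be created at the new copy's (possibly lower) global checkpoint + 1.
    void removeBeforeFileBasedRecovery(String leaseId) {
        leases.remove(leaseId);
    }

    Long retainedSeqNo(String leaseId) {
        return leases.get(leaseId);
    }

    public static void main(String[] args) {
        PrrlModel tracker = new PrrlModel();
        tracker.renew("copy-1", 100L);                    // old copy retained up to 100
        // the new copy's persisted global checkpoint is behind the old copy's,
        // so it needs history retained from 50 onwards
        System.out.println(tracker.renew("copy-1", 50L)); // false: cannot retreat
        tracker.removeBeforeFileBasedRecovery("copy-1");  // this PR's fix
        System.out.println(tracker.renew("copy-1", 50L)); // true: fresh lease
    }
}
```

This shows why leaving the old lease in place is useless to a new copy that needs more history: the only way to retain from an earlier sequence number is to drop the lease and create a new one.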

@DaveCTurner DaveCTurner added >test Issues or PRs that are addressing/adding tests :Distributed Indexing/Recovery Anything around constructing a new shard, either from a local or a remote source. labels Jul 3, 2019
@DaveCTurner DaveCTurner requested review from ywelsch and dnhatn July 3, 2019 15:06
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

…e way back to SequenceNumbers.UNASSIGNED_SEQ_NO
Contributor

@ywelsch ywelsch left a comment


I've left one comment; looking good otherwise.

```java
                Math.max(0L, checkpointState.globalCheckpoint + 1L),
                PEER_RECOVERY_RETENTION_LEASE_SOURCE);
            final long newRetainedSequenceNumber = Math.max(0L, checkpointState.globalCheckpoint + 1L);
            if (retentionLease.retainingSequenceNumber() <= newRetainedSequenceNumber) {
```
Contributor


What if we want to renew just because of the timestamp (but the copy is not currently tracked)? In that case this condition will be false and we won't renew.

Contributor Author


I don't follow. If this condition is false then we can't renew because the lease can only advance. Or, put differently, why would we want to renew a lease for a copy that we're not tracking? Leases for inactive copies should be allowed to expire if the copy doesn't become active within the timeout.

Contributor


Should we not then explicitly make sure not to renew leases for shards that are not tracked?

Given that newRetainedSequenceNumber is always >= 0, could there be a case where we continuously extend a lease for a shard copy that is not tracked but had an existing retainingSeqNumber of 0?

Contributor Author


Only if there's a shard that stays in the routing table but which never becomes tracked, which I don't think can happen (assuming that recoveries eventually terminate at least, thanks to the MaxRetryAllocationDecider).

Note that we don't expire leases for assigned shards anyway, tracked or not, because we create the lease in peer recovery before initiating tracking, so if we skipped untracked shards here then the lease would continue to exist until it had expired and the shard was no longer in the routing table.

Also, if we do skip untracked shards here then you'd potentially have a shard on which tracking was just initiated with a really old lease; if that shard failed before the next retention lease sync then its lease could expire, triggering another file-based recovery when in fact an ops-based recovery would have been worth attempting.
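The renewal condition under discussion can be modeled in a standalone sketch. This is a simplified, hypothetical extraction, not the actual Elasticsearch code, which operates on `RetentionLease` objects inside `ReplicationTracker`:

```java
// Simplified standalone model of the renewal condition quoted above.
// Method names and the free-standing structure are illustrative only.
public class RenewalCheck {
    // The new retained sequence number is the global checkpoint + 1, clamped
    // at 0 (the global checkpoint may be UNASSIGNED_SEQ_NO, i.e. negative).
    static long newRetainedSequenceNumber(long globalCheckpoint) {
        return Math.max(0L, globalCheckpoint + 1L);
    }

    // Renew only when the lease would advance or stay put: retention leases
    // may only advance and never retreat.
    static boolean shouldRenew(long currentRetainingSeqNo, long globalCheckpoint) {
        return currentRetainingSeqNo <= newRetainedSequenceNumber(globalCheckpoint);
    }

    public static void main(String[] args) {
        // Unassigned checkpoint, lease at 0: renewed (timestamp refresh),
        // which is the untracked-copy case raised in the review.
        System.out.println(shouldRenew(0L, -2L));  // true
        // Equal retained seq no: renewed, refreshing only the timestamp.
        System.out.println(shouldRenew(42L, 41L)); // true
        // Renewal would retreat from 42 to 11: skipped.
        System.out.println(shouldRenew(42L, 10L)); // false
    }
}
```

The first case makes concrete the scenario debated above: a lease retaining from sequence number 0 always satisfies the condition, so it keeps being renewed for as long as its copy stays in the routing table.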

Contributor


head spin

@DaveCTurner DaveCTurner requested a review from ywelsch July 3, 2019 19:15
Contributor

@ywelsch ywelsch left a comment


LGTM

@DaveCTurner DaveCTurner merged commit 389c625 into elastic:peer-recovery-retention-leases Jul 4, 2019
@DaveCTurner DaveCTurner deleted the 2019-07-03-delete-prrl-in-file-based-recovery branch July 4, 2019 14:32
DaveCTurner added a commit that referenced this pull request Jul 4, 2019