-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove PRRLs before performing file-based recovery #43928
Remove PRRLs before performing file-based recovery #43928
Conversation
If the primary performs a file-based recovery to a node that has (or recently had) a copy of the shard then it is possible that the persisted global checkpoint of the new copy is behind that of the old copy since file-based recoveries are somewhat destructive operations. Today we leave that node's PRRL in place during the recovery with the expectation that it can be used by the new copy. However this isn't the case if the new copy needs more history to be retained, because retention leases may only advance and never retreat. This commit addresses this by removing any existing PRRL during a file-based recovery: since we are performing a file-based recovery we have already determined that there isn't enough history for an ops-based recovery, so there is little point in keeping the old lease in place. Caught by [a failure of `RecoveryWhileUnderLoadIT.testRecoverWhileRelocating`](https://scans.gradle.com/s/wxccfrtfgjj3g/console-log?task=:server:integTest#L14) Relates elastic#41536
Pinging @elastic/es-distributed |
…e way back to SequenceNumbers.UNASSIGNED_SEQ_NO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left 1 comment, looking good o.w.
Math.max(0L, checkpointState.globalCheckpoint + 1L), | ||
PEER_RECOVERY_RETENTION_LEASE_SOURCE); | ||
final long newRetainedSequenceNumber = Math.max(0L, checkpointState.globalCheckpoint + 1L); | ||
if (retentionLease.retainingSequenceNumber() <= newRetainedSequenceNumber) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we want to renew just because of timestamp (but copy is not currently tracked)? In that case, this here will be false and we won't renew.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow. If this condition is false
then we can't renew because the lease can only advance. Or, put differently, why would we want to renew a lease for a copy that we're not tracking? Leases for inactive copies should be allowed to expire if the copy doesn't become active within the timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we not explicitly make sure not to renew those shards then that are not tracked?
Given that newRetainedSequenceNumber is always >= 0, could there be a case where we continuously extend a lease for a shard copy that is not tracked but had an existing retainingSeqNumber of 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only if there's a shard that stays in the routing table but which never becomes tracked, which I don't think can happen (assuming that recoveries eventually terminate at least, thanks to the MaxRetryAllocationDecider
).
Note that we don't expire leases for assigned shards anyway, tracked or not, because we create the lease in peer recovery before initiating tracking, so if we skipped untracked shards here then the lease would continue to exist until it had expired and the shard was no longer in the routing table.
Also, if we do skip untracked shards here then you'd potentially have a shard on which tracking was just initiated with a really old lease; if that shard failed before the next retention lease sync then its lease could expire, triggering another file-based recovery when in fact an ops-based recovery would have been worth attempting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
head spin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
If the primary performs a file-based recovery to a node that has (or recently had) a copy of the shard then it is possible that the persisted global checkpoint of the new copy is behind that of the old copy since file-based recoveries are somewhat destructive operations. Today we leave that node's PRRL in place during the recovery with the expectation that it can be used by the new copy. However this isn't the case if the new copy needs more history to be retained, because retention leases may only advance and never retreat. This commit addresses this by removing any existing PRRL during a file-based recovery: since we are performing a file-based recovery we have already determined that there isn't enough history for an ops-based recovery, so there is little point in keeping the old lease in place. Caught by [a failure of `RecoveryWhileUnderLoadIT.testRecoverWhileRelocating`](https://scans.gradle.com/s/wxccfrtfgjj3g/console-log?task=:server:integTest#L14) Relates #41536
If the primary performs a file-based recovery to a node that has (or recently
had) a copy of the shard then it is possible that the persisted global
checkpoint of the new copy is behind that of the old copy since file-based
recoveries are somewhat destructive operations.
Today we leave that node's PRRL in place during the recovery with the
expectation that it can be used by the new copy. However this isn't the case if
the new copy needs more history to be retained, because retention leases may
only advance and never retreat.
This commit addresses this by removing any existing PRRL during a file-based
recovery: since we are performing a file-based recovery we have already
determined that there isn't enough history for an ops-based recovery, so there
is little point in keeping the old lease in place.
Caught by a failure of
RecoveryWhileUnderLoadIT.testRecoverWhileRelocating
Relates #41536