-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
release-23.1: kv: prevent lease interval regression during expiration-to-epoch promotion #130124
Merged
nvanbenschoten
merged 6 commits into
cockroachdb:release-23.1
from
nvanbenschoten:backport23.1-123442
Sep 5, 2024
Merged
release-23.1: kv: prevent lease interval regression during expiration-to-epoch promotion #130124
nvanbenschoten
merged 6 commits into
cockroachdb:release-23.1
from
nvanbenschoten:backport23.1-123442
Sep 5, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This can be triggered rapidly because each replica might call this as it tries and fails to acquire a lease.
This commit adds a check that a replica does not perform a lease transfer if it does not own the previous lease. This allows us to make a stronger assumption a layer down. Epic: None Release note: None
…otion Fixes cockroachdb#121480. Fixes cockroachdb#122016. This commit resolves a bug in the expiration-based to epoch-based lease promotion transition, where the lease's effective expiration could be allowed to regress. To prevent this, we detect when such cases are about to occur and synchronously heartbeat the leaseholder's liveness record. This works because the liveness record interval and the expiration-based lease interval are the same, so a synchronous heartbeat ensures that the liveness record has a later expiration than the prior lease by the time the lease promotion goes into effect. The code structure here leaves a lot to be desired, but since we're going to be cleaning up and/or removing a lot of this code soon anyway, I'm prioritizing backportability. This is therefore more targeted and less general than it could be. The resolution here also leaves something to be desired. A nicer fix would be to introduce a minimum_lease_expiration field on epoch-based leases so that we can locally ensure that the expiration does not regress. This is what we plan to do for leader leases in the upcoming release. We don't make this change because it would be require a version gate to avoid replica divergence, so it would not be backportable. Release note (bug fix): Fixed a rare bug where a lease transfer could lead to a `side-transport update saw closed timestamp regression` panic. The bug could occur when a node was overloaded and failing to heartbeat its node liveness record.
This commit adds a check that `args.PrevLease` is equivalent to `cArgs.EvalCtx.GetLease()` to RequestLease. This ensures that the validation here is consistent with the validation that was performed when the lease request was constructed. Release note: None Epic: None
This commit deflakes the test by waiting for N1's view of N2's lease expiration to match N2's view. This is important in the rare case where N1 tries to increase N2's epoch, but it has a stale view of the lease expiration time. Epic: None Release note: None
A race could occur when a replica queue and post lease application both attempted to switch the lease type. This race would cause the queue to not process the replica because the lease type had already changed. As a result, lease preference violations might not have been quickly resolved by the lease queue. Read the lease under the same mutex used for requesting the lease, when possibly switching the lease type. Resolves: cockroachdb#123998 Release note: None
Thanks for opening a backport. Please check the backport criteria before merging:
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
Also, please add a brief release justification to the body of your PR to justify this |
blathers-crl
bot
added
the
backport
Label PR's that are backports to older release branches
label
Sep 4, 2024
nicktrav
approved these changes
Sep 5, 2024
arulajmani
approved these changes
Sep 5, 2024
blathers backport staging-v23.1.27 |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport:
Please see individual PRs for details.
/cc @cockroachdb/release
Release note (bug fix): Fixed a rare bug where a lease transfer could lead to a
side-transport update saw closed timestamp regression
panic. The bug could occur when a node was overloaded and failing to heartbeat its node liveness record.Release justification: fixes rare, but serious panic.