release-23.1: kv: prevent lease interval regression during expiration-to-epoch promotion #130124

Merged

Commits on Sep 4, 2024

  1. kvserver: throttle logging on failed self-heartbeat

    The failure can be logged in rapid succession because each replica might
    trigger the self-heartbeat as it tries and fails to acquire a lease.
    tbg authored and nvanbenschoten committed Sep 4, 2024
    f97e02c
  2. kv: assert against remote lease transfers

    This commit adds a check that a replica does not perform a lease transfer if it
    does not own the previous lease. This allows us to make a stronger assumption a
    layer down.
    
    Epic: None
    Release note: None
    nvanbenschoten committed Sep 4, 2024
    0df0361
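The check described in commit 2 can be sketched roughly as follows. The `transferLease` type and `checkCanTransferLease` function are hypothetical simplifications; the real check operates on CockroachDB's own lease and replica types.

```go
package main

import "fmt"

// transferLease is a simplified, hypothetical stand-in for a range lease.
type transferLease struct {
	HolderStoreID int
	Sequence      int
}

// checkCanTransferLease returns an error if the local store does not own the
// previous lease, so that lower layers may assume that every lease transfer
// is initiated by the current leaseholder.
func checkCanTransferLease(localStoreID int, prev transferLease) error {
	if prev.HolderStoreID != localStoreID {
		return fmt.Errorf(
			"store s%d cannot transfer lease owned by s%d",
			localStoreID, prev.HolderStoreID,
		)
	}
	return nil
}
```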
  3. kv: prevent lease interval regression during expiration-to-epoch promotion
    
    Fixes cockroachdb#121480.
    Fixes cockroachdb#122016.
    
    This commit resolves a bug in the expiration-based to epoch-based lease
    promotion transition, where the lease's effective expiration could be
    allowed to regress. To prevent this, we detect when such cases are about
    to occur and synchronously heartbeat the leaseholder's liveness record.
    This works because the liveness record interval and the expiration-based
    lease interval are the same, so a synchronous heartbeat ensures that the
    liveness record has a later expiration than the prior lease by the time
    the lease promotion goes into effect.
    
    The code structure here leaves a lot to be desired, but since we're
    going to be cleaning up and/or removing a lot of this code soon anyway,
    I'm prioritizing backportability. This is therefore more targeted and
    less general than it could be.
    
    The resolution here also leaves something to be desired. A nicer fix
    would be to introduce a minimum_lease_expiration field on epoch-based
    leases so that we can locally ensure that the expiration does not
    regress. This is what we plan to do for leader leases in the upcoming
    release. We don't make this change because it would require a version
    gate to avoid replica divergence, so it would not be backportable.
    
    Release note (bug fix): Fixed a rare bug where a lease transfer could
    lead to a `side-transport update saw closed timestamp regression` panic.
    The bug could occur when a node was overloaded and failing to heartbeat
    its node liveness record.
    nvanbenschoten committed Sep 4, 2024
    b5f6cbc
  4. kv: add PrevLease check in RequestLease

    This commit adds a check that `args.PrevLease` is equivalent to
    `cArgs.EvalCtx.GetLease()` to RequestLease. This ensures that the
    validation here is consistent with the validation that was performed
    when the lease request was constructed.
    
    Release note: None
    Epic: None
    nvanbenschoten committed Sep 4, 2024
    3167af5
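A rough sketch of the validation in commit 4, assuming that "equivalent" can be modeled as field-by-field equality on a simplified lease. The `leaseRecord` type and `checkPrevLease` function are hypothetical; only `args.PrevLease` and `cArgs.EvalCtx.GetLease()` come from the commit message, and they are represented here by plain parameters.

```go
package main

import "fmt"

// leaseRecord is a hypothetical, simplified lease. Epoch is zero for an
// expiration-based lease.
type leaseRecord struct {
	HolderStoreID int
	Epoch         int64
	Sequence      int64
}

// checkPrevLease rejects a lease request whose PrevLease (the lease observed
// when the request was constructed) no longer matches the lease seen at
// evaluation time. Equality of all fields is an assumption made for this
// sketch and may be stricter or looser than the real equivalence check.
func checkPrevLease(argsPrev, evalCurrent leaseRecord) error {
	if argsPrev != evalCurrent {
		return fmt.Errorf(
			"lease changed since request was constructed: prev=%+v current=%+v",
			argsPrev, evalCurrent,
		)
	}
	return nil
}
```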
  5. kvserver: deflake TestRangefeedCheckpointsRecoverFromLeaseExpiration

    This commit deflakes the test by waiting for N1's view of N2's lease
    expiration to match N2's view. This is important in the rare case
    where N1 tries to increase N2's epoch, but it has a stale view of
    the lease expiration time.
    
    Epic: None
    
    Release note: None
    iskettaneh authored and nvanbenschoten committed Sep 4, 2024
    b198da9
  6. kvserver: read lease under mutex when switching lease type

    A race could occur when a replica queue and post lease application both
    attempted to switch the lease type. This race would cause the queue to
    not process the replica because the lease type had already changed. As a
    result, lease preference violations might not have been quickly
    resolved by the lease queue.
    
    Read the lease under the same mutex used for requesting the lease, when
    possibly switching the lease type.
    
    Resolves: cockroachdb#123998
    Release note: None
    kvoli authored and nvanbenschoten committed Sep 4, 2024
    ea16f5a