-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: check PROSCRIBED lease status over UNUSABLE #102407
Conversation
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
It isn't really considered valid as such, I think the crucial point is that it's treated as ours. A This probably also deserves a test case for the precedence. |
Yeah, adding test in the meantime. Wanted to put the fix for review ASAP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo my comment above about valid/owned and a test.
How does this interact with clock skew, btw? The stasis period exists because it's possible for someone else to believe their lease has already started while we also believe that ours hasn't ended because of clock skew. Can this also cause violations, such that we should exclude the |
@erikgrinaker Is your question basically: why this comment? I don't know, it would be nice to explain in this comment why it's safe to close timestamps in the stasis. I think the reason it's ok is that the closed timestamps are "logical". They fall between the lease start and expiration timestamp. They aren't concerned with clock skew? Stasis is there to make sure there aren't requests served around the lease transfer times, i.e. it makes sure the physical time ranges when two leases are used are non-intersecting too? Upd: there might be a problem though. How about the following scenario?
OTOH, I think it's always true that:
So, two closed timestamps coming from 2 leaseholders never form a CT regression? Even if one of them is in stasis. Lease timestamps, and closed timestamps are "logical". They don't have any conflicts etc. Clock skew and stasis period come into play when we need to map physical onto this logical time, i.e. when this layer interacts with the "requests" layer above. |
5a1693d
to
993e321
Compare
Yeah, I think it's the LAI that saves us, because presumably an unusable lease can't increment the LAI. |
Does it though? In the scenario we've seen, the old leaseholder must have been able to increment its LAI, despite the lease being in I think the coordination here happens either by a lease transfer revoking the current lease (which will/should immediately halt the closed ts transport from sending future updates), or by the next lease necessarily having a greater start time than our expiration time, where the LAI is communicated via Raft, and the side transport is prevented from sending further updates beyond the expiration time. So the problem is simply that we keep sending updates after we transferred the lease, which we fix here. So I don't think There's also this: cockroach/pkg/kv/kvserver/kvserverpb/lease_status.go Lines 51 to 53 in bfc5603
|
The old leaseholder was sending the old LAI. But the timestamp was high, getting into the territory of the next lease. |
Are you sure that's true? I don't think closed timestamps are constrained to the by the lease start, as the lease plays no role here: cockroach/pkg/kv/kvserver/closedts/sidetransport/sender.go Lines 331 to 338 in fad7ac8
We do make sure that the resulting timestamp isn't past the lease expiration, cockroach/pkg/kv/kvserver/replica_closedts.go Lines 68 to 71 in 793b4c8
but I don't see a corresponding check for the start time. Assuming a fixed closed timestamp lag time L, I agree that continuing to close during stasis doesn't seem to be an issue. The largest timestamp that can be closed by the old leaseholder is But also, closed timestamp lag time is a cluster setting and propagates async, so can't the old leaseholder be closing out |
I'm probably missing something in that last bit because shouldn't changing the cluster setting trigger the assertion even if no lease changes hands then? Since the leaseholder's closed timestamp will regress against itself. |
@erikgrinaker Added a unit-test, and updated the PR desc. |
The PROSCRIBED lease status, just like EXPIRED, puts a lease to a definitely invalid state. The UNUSABLE state (when request timestamp is in stasis period) is less of a clear cut: we still own the lease but callers may use or not use it depending on context. For example, the closed timestamp side-transport ignores the UNUSABLE state (because we still own the lease), and takes it as usable for its purposes. Because of the order in which the checks were made, this has lead to a bug: a PROSCRIBED lease is reported as UNUSABLE during stasis periods, the closed timestamp side-transport then considers it usable, and updates closed timestamps when it shouldn't. This commit fixes the bug by swapping the order of checks in the leaseStatus method. The order now goes from "hard" checks like EXPIRED and PROSCRIBED, to "softer" UNUSABLE, and (when the softness is put to the limit) VALID. Fixes cockroachdb#98698 Fixes cockroachdb#99931 Fixes cockroachdb#100101 Epic: none Release note (bug fix): a bug is fixed in closed timestamp updates within its side-transport. Previously, during asymmetric partitions, a node that transfers a lease away, and misses a liveness heartbeat, could then erroneously update the closed timestamp during the stasis period of its liveness. This could lead to closed timestamp invariant violation, and node crashes; in extreme cases this could lead to inconsistencies in read-only queries.
Epic: none Release note: none
574cead
to
add56b4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. A couple of test nits, but we can address those in a follow-up PR. Thanks for adding this test!
ts := []hlc.ClockTimestamp{ | ||
{WallTime: 500, Logical: 0}, // before lease start | ||
{WallTime: 1000, Logical: 0}, // lease start | ||
{WallTime: 2000, Logical: 0}, // within lease | ||
{WallTime: 2950, Logical: 0}, // within stasis | ||
{WallTime: 3000, Logical: 0}, // lease expiration | ||
{WallTime: 3500, Logical: 0}, // expired | ||
{WallTime: 5000, Logical: 0}, // next expiration | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Would be good to name these. It's hard to read the integer indexes below.
{lease: epoLease, now: ts[2], liveness: oldLiveness, want: kvserverpb.LeaseState_ERROR, | ||
wantErr: "node liveness info for n1 is stale"}, | ||
} { | ||
t.Run("", func(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Instead of comments above the test cases, consider adding a desc
field and use that as the test name too. It's annoying to debug failures when you have to count to find the correct case.
TFTRs! bors r=erikgrinaker,tbg |
Build succeeded: |
The PROSCRIBED lease status, just like EXPIRED, puts a lease to a definitely invalid state. The UNUSABLE state (when request timestamp is in stasis period) is less of a clear cut: we still own the lease but callers may use or not use it depending on context.
For example, the closed timestamp side-transport ignores the UNUSABLE state (because we still own the lease), and takes it as usable for its purposes. Because of the order in which the checks were made, this has lead to a bug: a PROSCRIBED lease is reported as UNUSABLE during stasis periods, the closed timestamp side-transport then considers it usable, and updates closed timestamps when it shouldn't.
This commit fixes the bug by swapping the order of checks in the leaseStatus method. The order now goes from "hard" checks like EXPIRED and PROSCRIBED, to "softer" UNUSABLE, and (when the softness is put to the limit) VALID.
Fixes #98698
Fixes #99931
Fixes #100101
Epic: none
Release note (bug fix): a bug is fixed in closed timestamp updates within its side-transport. Previously, during asymmetric partitions, a node that transfers a lease away, and misses a liveness heartbeat, could then erroneously update the closed timestamp during the stasis period of its liveness. This could lead to closed timestamp invariant violation, and node crashes; in extreme cases this could lead to inconsistencies in read-only queries.