avoid dataIsNotReady error while retrying stale read on the leader #765
Conversation
internal/locate/region_request.go
Outdated
```go
if retryTimes > 0 && s.replicaSelector != nil && s.replicaSelector.regionStore != nil &&
	s.replicaSelector.targetIdx == s.replicaSelector.regionStore.workTiKVIdx {
	// retry on the leader should not use stale read to avoid possible DataIsNotReady error as it always can serve any read
	req.StaleRead = false
```
The region state may not be up-to-date in the client; maybe set req.ReplicaRead to false to avoid the request being served by a follower replica.
@you06 could you please explain what req.ReplicaRead=false is supposed to do? My understanding is that even if the request lands on a replica, it will do a ReadIndex and return data without danger of a repeated DataIsNotReady.
Sorry, req.ReplicaRead is set to false for stale reads, so that comment is not correct. My idea is to avoid follower read when retrying, and I'm inclined to read from the leader after DataIsNotReady is returned once.
In your implementation the request may get multiple DataIsNotReady errors before it chooses the leader replica and switches off stale read. If the resolved ts is blocked by write locks (e.g. by large transactions), another replica node will very likely return DataIsNotReady as well.
Maybe you can check out this commit to see my suggestion.
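To make the suggested policy concrete, here is a minimal, self-contained sketch of the idea; the `replica` type and `pickTarget` function are hypothetical stand-ins, not the real client-go selector. After the first DataIsNotReady the request is pinned to the leader and the stale flag is dropped, so a request can observe at most one DataIsNotReady.

```go
package main

import "fmt"

// replica is a hypothetical stand-in for a client-go replica; isLeader marks
// the region's leader peer.
type replica struct {
	id       int
	isLeader bool
}

// pickTarget models the suggested policy: the first attempt may be a stale
// read on any replica, but once DataIsNotReady has been seen, the request is
// routed to the leader with the stale flag cleared.
func pickTarget(replicas []replica, preferred int, sawDataIsNotReady bool) (replica, bool) {
	if sawDataIsNotReady {
		for _, r := range replicas {
			if r.isLeader {
				// A non-stale leader read cannot return DataIsNotReady again.
				return r, false
			}
		}
	}
	// First attempt: stale read on the preferred (e.g. local) replica.
	return replicas[preferred], true
}

func main() {
	replicas := []replica{{id: 1, isLeader: true}, {id: 2}, {id: 3}}
	target, stale := pickTarget(replicas, 2, false)
	fmt.Printf("attempt 1: replica %d, staleRead=%v\n", target.id, stale)
	target, stale = pickTarget(replicas, 2, true)
	fmt.Printf("attempt 2: replica %d, staleRead=%v\n", target.id, stale)
}
```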
> I'm inclined to read from the leader after DataIsNotReady is returned once.
> The request may get multiple DataIsNotReady errors before it chooses the leader replica

Thanks @you06 for the explanation. I believe this should choose the leader after the first DataIsNotReady.
Your commit does it at a different level. The big difference is that my commit applies this logic only to global stale reads, while your commit does it for all stale reads. I'm OK to go ahead with your commit, but then we will need to remove this code along with it as it becomes irrelevant.
@you06 I moved my code to the place you recommended, but kept the approach of relying on storeSelectorOp.leaderOnly instead of resetting the selector state completely, as the latter might have larger consequences than expected (e.g. result in more retries).
I've also run the integration tests with --with-tikv and they all passed locally:

```
PASS
ok      integration_tests       220.231s
```
This implementation is correct, but I can see some region misses with it: from the metrics, 1/3 of the DataIsNotReady errors also suffer a region miss.
I've carefully debugged it and the problem may happen here: when the stale read is served by the leader on the first attempt, it may receive a DataIsNotReady error and leader.attempts is increased, so the leader.isExhausted(1) check below fails. The response then carries a region error and is retried later, which introduces unexpected latency.
client-go/internal/locate/region_request.go, lines 595 to 599 in 92db9f7:

```go
if leader.isEpochStale() || leader.isExhausted(1) {
	metrics.TiKVReplicaSelectorFailureCounter.WithLabelValues("exhausted").Inc()
	selector.invalidateRegion()
	return nil, nil
}
```
I think we have 2 options (see the sketch below):
- Reset leader.attempts in the path with state.option.leaderOnly.
- Disable stale read if the leader replica is chosen to serve the read on the first try.
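A minimal, runnable model of the exhaustion problem and of option 1. The leaderReplica type and nextLeaderOnly function are hypothetical stand-ins that mirror the leader.attempts / isExhausted(1) names from the snippet above; they are not the real selector code.

```go
package main

import "fmt"

// leaderReplica is a hypothetical stand-in for the leader's accessed-replica
// bookkeeping in the selector; only the attempt counter matters here.
type leaderReplica struct{ attempts int }

func (l *leaderReplica) isExhausted(n int) bool { return l.attempts >= n }

// nextLeaderOnly models option 1: when the retry is pinned to the leader,
// forget the attempt that was spent on the failed stale read so that the
// isExhausted(1) check does not invalidate the region and cause a region miss.
func nextLeaderOnly(leader *leaderReplica, resetAttempts bool) bool {
	if resetAttempts {
		leader.attempts = 0
	}
	if leader.isExhausted(1) {
		return false // region invalidated, request answered with a region error
	}
	leader.attempts++
	return true // leader selected for the retry
}

func main() {
	// The first stale read was served by the leader and failed with
	// DataIsNotReady, so one attempt has already been recorded.
	fmt.Println("without reset:", nextLeaderOnly(&leaderReplica{attempts: 1}, false)) // false: region miss
	fmt.Println("with reset:   ", nextLeaderOnly(&leaderReplica{attempts: 1}, true))  // true: retried on leader
}
```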
The leaderOnly flag in the previous implementation of accessFollower does not function as intended, which can be considered a bug.
To implement the "retry leader peer without stale read flag" strategy, an alternative approach is to modify accessFollower.next to return the corresponding leader peer's RPCContext. Then, in SendReqCtx, we can remove the StaleRead flag from the request based on information provided by the returned RPCContext. This ensures that the peer selection strategy remains unified and centralized within the selector.
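A hedged sketch of this alternative, using hypothetical stand-ins for RPCContext, the request, and SendReqCtx rather than the real client-go types; it only illustrates how the sender could clear the flag based on what the selector returned.

```go
package main

import "fmt"

// rpcContext is a hypothetical stand-in for locate.RPCContext, reduced to the
// single piece of information this sketch needs.
type rpcContext struct{ peerIsLeader bool }

// request mirrors only the StaleRead flag of a tikvrpc request.
type request struct{ StaleRead bool }

// sendReqCtx models the suggested flow: the selector alone picks the peer
// (possibly falling back to the leader), and the sender merely adjusts the
// request flags based on the returned context.
func sendReqCtx(req *request, next func() *rpcContext) *rpcContext {
	rpcCtx := next() // peer selection stays centralized in the selector
	if req.StaleRead && rpcCtx.peerIsLeader {
		// The leader can serve any read, so drop the stale flag and avoid
		// another DataIsNotReady round trip.
		req.StaleRead = false
	}
	return rpcCtx
}

func main() {
	req := &request{StaleRead: true}
	sendReqCtx(req, func() *rpcContext { return &rpcContext{peerIsLeader: true} })
	fmt.Println("StaleRead after leader fallback:", req.StaleRead) // false
}
```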
The branch was force-pushed from b6dc236 to df222dc.
Signed-off-by: artem_danilov <artem_danilov@airbnb.com>
Great improvement by removing the backoffer!
@cfzjywxk PTAL
LGTM. Great improvement~
…ikv#765)

* avoid dataIsNotReady error while retrying stale read on the leader
* move StaleRead flag reset to retry section
* move all logic to #next and allow retry on the leader

Signed-off-by: artem_danilov <artem_danilov@airbnb.com>
Co-authored-by: artem_danilov <artem_danilov@airbnb.com>
…) (#789)

* avoid dataIsNotReady error while retrying stale read on the leader (#765)
* add context patcher for 65
* fmt

Signed-off-by: artem_danilov <artem_danilov@airbnb.com>
Signed-off-by: you06 <you1474600@gmail.com>
Co-authored-by: Artem Danilov <329970+Tema@users.noreply.github.com>
Co-authored-by: artem_danilov <artem_danilov@airbnb.com>
When a Stale Read fails with DataIsNotReady, it retries on the leader as a Stale Read again and can still fail with DataIsNotReady if safe_ts is not advanced on the leader either. We can easily avoid this by sending the retry to the leader without the stale flag: that way it always succeeds if there is no active conflict. This should considerably help in cases when safe_ts is not advancing due to long commit transactions.
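As a rough illustration of the behavioral change described above (serve is a hypothetical stand-in for a TiKV replica, not a real API): the old path keeps retrying the stale read and keeps hitting DataIsNotReady while safe_ts is stuck, whereas the new path drops the stale flag on the retry and succeeds immediately.

```go
package main

import (
	"errors"
	"fmt"
)

// errDataIsNotReady stands in for the DataIsNotReady region error.
var errDataIsNotReady = errors.New("DataIsNotReady")

// serve is a hypothetical stand-in for a replica: a stale read fails while
// safe_ts is stuck; a non-stale leader read succeeds (absent real conflicts).
func serve(staleRead, safeTsStuck bool) error {
	if staleRead && safeTsStuck {
		return errDataIsNotReady
	}
	return nil
}

func main() {
	const safeTsStuck = true // e.g. blocked by a transaction with a long commit

	// Old behavior: the retry on the leader is still a stale read and keeps failing.
	for attempt := 1; attempt <= 3; attempt++ {
		fmt.Printf("old, attempt %d: %v\n", attempt, serve(true, safeTsStuck))
	}

	// New behavior: after the first DataIsNotReady the retry goes to the leader
	// without the stale flag and succeeds immediately.
	fmt.Println("new, attempt 1:", serve(true, safeTsStuck))
	fmt.Println("new, attempt 2:", serve(false, safeTsStuck))
}
```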