clientv3/balancer: handle network partition in health check #8669
Conversation
(force-pushed from bd030f2 to 1286a4e)
clientv3/balancer.go
Outdated
@@ -44,6 +44,8 @@ type balancer interface {
 	endpoints() []string
 	// pinned returns the current pinned endpoint.
 	pinned() string
+	// errored handles error from server-side.
+	errored(addr string, err error)
Shouldn't we call this through the down func returned from up?
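For context, the down func referred to here comes from gRPC's (since-deprecated) Balancer interface, where Up(addr) returns a down func(error) through which connection failures are reported back to the balancer. A minimal sketch of that pattern, with hypothetical names:

package main

import (
	"errors"
	"fmt"
	"sync"
)

// sketchBalancer illustrates the Up/down pattern: up registers an
// address and returns a `down` closure; the connection layer calls
// that closure with an error to report failures back to the balancer.
type sketchBalancer struct {
	mu        sync.Mutex
	unhealthy map[string]error
}

func (b *sketchBalancer) up(addr string) (down func(error)) {
	b.mu.Lock()
	delete(b.unhealthy, addr) // address is live again
	b.mu.Unlock()
	return func(err error) {
		b.mu.Lock()
		b.unhealthy[addr] = err // gray-list the endpoint on failure
		b.mu.Unlock()
	}
}

func main() {
	b := &sketchBalancer{unhealthy: make(map[string]error)}
	down := b.up("127.0.0.1:2379")
	down(errors.New("connection lost")) // failure reported through down
	fmt.Println(b.unhealthy)
}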
(force-pushed from 5e0f81d to f6a3809)
clientv3/retry.go
Outdated
@@ -87,14 +88,10 @@ func (c *Client) newRetryWrapper(isStop retryStopErrFunc) retryRpcFunc {
func (c *Client) newAuthRetryWrapper() retryRpcFunc {
	return func(rpcCtx context.Context, f rpcFunc) error {
		for {
			pinned := c.balancer.pinned()
I guess this is still useful. Let's keep it for now?
clientv3/retry.go
Outdated
@@ -65,6 +65,7 @@ func (c *Client) newRetryWrapper(isStop retryStopErrFunc) retryRpcFunc {
 	}
 	if logger.V(4) {
 		logger.Infof("clientv3/retry: error %v on pinned endpoint %s", err, pinned)
+	c.balancer.down(pinned, err)
Is a timeout error the only error that can reach here?
func TestNetworkPartitionBalancerWithGet(t *testing.T) {
	testNetworkPartitionBalancer(t, func(cli *clientv3.Client, ctx context.Context) error {
		_, err := cli.Get(ctx, "foo")
For Get, we should see no error; the client should auto-retry.
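A sketch of what that Get-side check could look like (the helper name and timeout are assumptions, as are the usual context/testing/time and clientv3 imports):

// Hypothetical shape of the Get-side assertion (not the PR's final
// code): Get should succeed because the client retries internally
// against the remaining healthy endpoints.
func testGetWithPartition(t *testing.T, cli *clientv3.Client) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if _, err := cli.Get(ctx, "foo"); err != nil {
		t.Errorf("want err = nil, got %v", err)
	}
}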
func TestNetworkPartitionBalancerWithPut(t *testing.T) {
	testNetworkPartitionBalancer(t, func(cli *clientv3.Client, ctx context.Context) error {
		_, err := cli.Put(ctx, "foo", "bar")
For Put, we should expect a deadline-exceeded error.
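And a corresponding Put-side sketch (same assumptions; the later test snippet compares against context.DeadlineExceeded directly, as done here):

// Hypothetical shape of the Put-side assertion (not the PR's final
// code): a write through the isolated member cannot reach quorum, so
// it should fail with context.DeadlineExceeded until the balancer
// switches to a healthy endpoint.
func testPutWithPartition(t *testing.T, cli *clientv3.Client) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	if _, err := cli.Put(ctx, "foo", "bar"); err != context.DeadlineExceeded {
		t.Errorf("want %v, got %v", context.DeadlineExceeded, err)
	}
}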
clus.Members[0].InjectPartition(t, clus.Members[1:])
defer clus.Members[0].RecoverPartition(t, clus.Members[1:])

for i := 0; i < 5; i++ {
Get and Put expect different behavior in testing; let's not mix them together. We probably only need to reuse the code for setting up the client and partitioning, not the actual test/check part. A sketch of that split follows.
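One possible shape for the shared helper (names and signatures are assumptions, not the PR's final code; integration.NewClusterV3 and the partition calls are from etcd's integration package):

// Shared setup only; each test supplies its own check callback.
func testNetworkPartitionBalancer(t *testing.T, check func(*testing.T, *clientv3.Client)) {
	clus := integration.NewClusterV3(t, &integration.ClusterConfig{Size: 3})
	defer clus.Terminate(t)

	// shared part: isolate member 0 from the rest of the cluster
	clus.Members[0].InjectPartition(t, clus.Members[1:])
	defer clus.Members[0].RecoverPartition(t, clus.Members[1:])

	// op-specific expectations (Get: no error; Put: deadline exceeded)
	// live entirely in the callback
	check(t, clus.Client(0))
}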
clientv3/balancer.go
Outdated
@@ -44,6 +44,8 @@ type balancer interface {
 	endpoints() []string
 	// pinned returns the current pinned endpoint.
 	pinned() string
+	// errored handles error from server-side.
+	down(addr string, err error)
ok... endpointError is more appropriate actually.
sorry for the back and forth on this one.
hb.mu.RUnlock()
if skip || !bad {
	return true
}
// prevent isolated member's endpoint from being infinitely retried, as follows:
how does this change affect what we are fixing in this PR?
Often, the timed-out endpoint gets retried, passes this check, and gets pinned again. This is problematic when the member is the isolated one in a network partition but still serves its health server as SERVING. In such a case, we keep it in the unhealthy list so that other endpoints can be prioritized within the next timeout. Basically, we need this as well to fix the network partition issue.
is this testable?
Hard to test, since we can't control the gRPC side. But with debugging enabled, I can manually confirm this happens.
integration/cluster.go
Outdated
-	client, err := NewClientV3(m)
-	if err != nil {
-		t.Fatalf("cannot create client: %v", err)
+	if !cfg.NoClient {
do we really need this for this PR?
Let me remove it in the next push.
thank you.
(force-pushed from 08a9f4f to a043fbc)
_, err := cli.Get(ctx, "a")
cancel()
if err != nil {
	t.Fatal(err)
Use t.Error here. Also we need a better failure description: want err = nil, got %v.
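That is, something along the lines of:

if err != nil {
	t.Errorf("want err = nil, got %v", err)
}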
func TestNetworkPartitionBalancerWithPut(t *testing.T) {
	testNetworkPartitionBalancer(t, func(cli *clientv3.Client) {
		var err error
		for i := 0; i < 5; i++ {
Why 5 times? Wouldn't a simple second retry go through?
Slow CIs take several seconds to switch endpoints. Instead, I increased the wait time and use only 2 tries.
	t.Fatalf("#%d: expected %v, got %v", i, context.DeadlineExceeded, err)
}
if i == 0 { // give enough time for endpoint switch
	time.Sleep(5 * time.Second)
We need to get rid of this eventually; we probably need a wait-for-endpoint-switch func.
We can expose the balancer interface with ConnectNotify. Should we?
For this PR, simply adding a TODO is good enough.
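A possible shape for such a helper, purely hypothetical since the balancer's notification channel is not exposed as of this PR (assumes errors and time imports):

// waitEndpointSwitch blocks until the balancer signals a new pinned
// connection on `connected`, or gives up after `timeout`. The channel
// argument stands in for an exposed ConnectNotify-style channel.
func waitEndpointSwitch(connected <-chan struct{}, timeout time.Duration) error {
	select {
	case <-connected: // balancer pinned a new endpoint
		return nil
	case <-time.After(timeout):
		return errors.New("timed out waiting for endpoint switch")
	}
}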
clientv3/health_balancer.go
Outdated
func (hb *healthBalancer) mayPin(addr grpc.Address) bool {
	hb.mu.RLock()
	skip := len(hb.addrs) == 1 || len(hb.unhealthy) == 0
-	_, bad := hb.unhealthy[addr.Addr]
+	failed, bad := hb.unhealthy[addr.Addr]
failedTime?
clientv3/health_balancer.go
Outdated
// instead, return before grpc-healthcheck if failed within healthcheck timeout
if elapsed := time.Since(failed); elapsed < dur {
	if logger.V(4) {
		logger.Infof("clientv3/health-balancer: %s is up but not pinned (elapsed %v < timeout %v)", addr.Addr, elapsed, dur)
probably also log explicitly that we are not going to pin this since xxx.
LGTM
Only health-check when timeout elapses since last failure.
Otherwise network-partitioned member with active health-check
server would not be gray-listed, making health-balancer stuck
with isolated endpoint. Also clarifies some log messages.
Mark partitioned member as unhealthy in retry code paths.
Previous behavior is when server returns errors, retry
wrapper does not do anything, while passively expecting
balancer to gray-list the isolated endpoint. This is
problematic when multiple endpoints are passed, and
network partition happens.
This patch adds 'endpointError' method to 'balancer' interface
to actively (possibly even before health-check API gets called)
handle RPC errors and gray-list endpoints for the time being,
thus speeding up the endpoint switch.
This is safe in a single-endpoint case, because balancer will
retry no matter what in such case.
Added integration tests fail without this patch.
Also manually tested to confirm this fixes #8660.
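The retry-path change the message describes amounts to roughly the following (a simplified sketch; the interface, types, and signatures are illustrative, not the exact patch; assumes a context import):

// On an RPC error against the pinned endpoint, actively report it to
// the balancer (gray-listing it) instead of waiting for the periodic
// health check to notice.
type balancer interface {
	pinned() string
	endpointError(addr string, err error)
}

func newRetryWrapper(b balancer, isStop func(error) bool) func(ctx context.Context, f func(context.Context) error) error {
	return func(ctx context.Context, f func(context.Context) error) error {
		for {
			pinned := b.pinned()
			err := f(ctx)
			if err == nil {
				return nil
			}
			// gray-list the endpoint that just failed, possibly before
			// the health-check API is ever called
			b.endpointError(pinned, err)
			if isStop(err) || ctx.Err() != nil {
				return err
			}
		}
	}
}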