-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
alts: Add retry loop when making RPC in ALTS's TestFullHandshake. #6183
Conversation
@easwars This should be ready for review when you have a chance. :) The failing test seems to be an unrelated XDS flake. |
credentials/alts/alts_test.go
Outdated
@@ -337,8 +339,15 @@ func TestFullHandshake(t *testing.T) { | |||
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) | |||
defer cancel() | |||
c := testgrpc.NewTestServiceClient(conn) | |||
if _, err = c.UnaryCall(ctx, &testpb.SimpleRequest{}, grpc.WaitForReady(true)); err != nil { | |||
t.Errorf("c.UnaryCall() failed: %v", err) | |||
for ; ctx.Err() == nil; <-time.After(5 * time.Second) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be more like this:
- You have an overall deadline of 10s
- You retry every 10ms
- You make the RPCs without the
WaitForReady
call option - I'm not convinced that
DeadlineExceeded
is something where you retry. It should ideally be something else. You probably will see the correct code now if you make the RPC without theWaitForReady
call option.
const defaultTestTimeout = 10 * time.Second
const defaultTestShortTimeout = 10 * time.Millisecond
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
c := testgrpc.NewTestServiceClient(conn)
// TODO: A comment explaining why we need to retry the RPC until it succeeds.
for ; ctx.Err() == nil; <-time.After(defaultTestShortTimeout) {
_, err = c.UnaryCall(ctx, &testpb.SimpleRequest{})
if err == nil {
break
}
if code := status.Code(err); code == <expected code when handshake fails> {
continue
}
t.Fatalf("c.UnaryCall() failed with unexpected error: %v", err)
}
if ctx.Err() != nil {
t.Fatalf("Timeout when waiting for UnaryCall RPC to succeed")
}
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, the WaitForReady
option was masking an Unavailable
error code, which makes more sense. And thanks for the tips! Implemented as you suggested. :)
This fixes the flaky test introduced in #6177 and flagged in #6182. Before this PR, I could reproduce the flake ~1.7% of time on local runs. With this PR, I cannot reproduce the flake locally.
Fixes #6182
RELEASE NOTES: none