Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xds/internal/server: stop using a fake xDS client in rds handler tests #6689

Merged
merged 2 commits into from
Oct 6, 2023

Conversation

easwars
Copy link
Contributor

@easwars easwars commented Oct 4, 2023

This is in prep for xDS generic API changes for the server side.

#resource-agnostic-xdsclient-api

RELEASE NOTES: none

@easwars easwars changed the title xds/internal/server: stop using a fake client in rds handler tests xds/internal/server: stop using a fake xDS client in rds handler tests Oct 4, 2023
@easwars easwars requested a review from zasweq October 4, 2023 23:03
@easwars easwars added this to the 1.59 Release milestone Oct 4, 2023
@codecov
Copy link

codecov bot commented Oct 4, 2023

Codecov Report

❗ No coverage uploaded for pull request base (master@93dbc05). Click here to learn what that means.
The diff coverage is n/a.

Additional details and impacted files

@ginayeh ginayeh modified the milestones: 1.59 Release, 1.60 Release Oct 5, 2023
Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

return fmt.Errorf("unexpected resources %v of type %q requested", req.GetResourceNames(), req.GetTypeUrl())
}
select {
case <-namesCh:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We ok with losing the old update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do create the channel with a buffer of 1. So, we are not concerned about losing an update completely. And when the tests do read the update when they expect one, we are good. Most of our tests which verify request names sent to the management server do this and run without flakes.

The issue with blocking here (instead of losing the update) is that unpredictability around go-control-plane server's behavior. For example, if you have resources of two types and you update one of them, the server will send all resources back to the client because we use a snapshot cache, and once the snapshot changes, the server will send the whole thing to the client.

func verifyUpdateFromChannel(ctx context.Context, t *testing.T, updateCh chan rdsHandlerUpdate, wantUpdate rdsHandlerUpdate) {
t.Helper()

opts := []cmp.Option{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmm nice I always do this inline. This is nice though with this many options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. I use this approach only when I have a lot of options. And here it makes total sense since some of those options are taking a function.

Comment on lines 118 to 119
cmpopts.SortSlices(func(s1, s2 string) bool { return s1 < s2 }),
cmpopts.SortMaps(func(s1, s2 string) bool { return s1 < s2 }),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you briefly comment what non determinism this takes away and perhaps why the non determinism arises? I had a lot of non determinism in my OpenCensus tests since rows were emitted non deterministically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I remember there was some non-determinism which is why I added these options, but now I don't see them anymore. I ran the tests a thousand times locally and couldn't repro any flakes. So, I'm getting rid of these now.

Comment on lines 126 to 129
t.Fatalf("Got unexpected route update, diff (-got, +want): %v", diff)
}
case <-ctx.Done():
t.Fatal("Timed out waiting for update from update channel.")
t.Fatal("Timed out waiting for a route config update")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit route update or route config update? Feels like everywhere it's referred to as route config update, I think I'd prefer that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched the one above this to route config update

rh, fakeClient, ch := setupTests()
// Tests the case where the rds handler receives an update with two routes, then
// receives an update with only one route. The rds handler is expected to cancel
// the watch for the route no longer present, and push a corresponding update.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a corresponding update with only one route.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 306 to 310
// Tests the case where the rds
// handler receives an update with two routes, and then receives an update with
// two routes, one previously there and one added (i.e. 12 -> 23). This should
// cause the route that is no longer there to be deleted and cancelled, and the
// route that was added should have a watch started for it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap this comment to 80 characters please. Also optionally can you talk about the verifications after 23 is updated such as it doesn't send an update until 3 is received on management server etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines +314 to +322
// Create an rds handler and give it two routes to watch.
updateCh := make(chan rdsHandlerUpdate, 1)
rh := newRDSHandler(xdsC, updateCh)
rh.updateRouteNamesToWatch(map[string]bool{route1: true, route2: true})

// Verify that the given routes are requested.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := waitForFuncWithNames(ctx, fakeClient.WaitForWatchRouteConfig, route1, route2); err != nil {
t.Fatalf("Error while waiting for names: %v", err)
}
waitForResourceNames(ctx, t, resourceNamesCh, []string{route1, route2})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: this seems pretty common to many tests, can maybe pull this out to helper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what is common that can be pulled into a helper. Not all tests start with two routes. Some start with one. And I think this is a core portion of the test: creating the handler, giving it one or more names to watch and verifying that it sends RDS requests for the same. Pulling that out into a helper isn't very helpful in my opinion since the reader of the test will not be able to glance over that detail.

}
if routeNameDeleted != route1 {
t.Fatalf("xdsClient.CancelRDS called for route %v, want %v", routeNameDeleted, route1)
// Verify that the update is pushed to the handler's update channel.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rds handler's (for consistency)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as mentioned in the other comment.

Comment on lines +374 to +376
// Close the rds handler and verify that the watch is canceled.
rh.close()
waitForResourceNames(ctx, t, resourceNamesCh, []string{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done at the end of every test. I guess I'm fine with it because it closes from different state spaces. Also isn't it more than one watch that gets cancelled haha?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also isn't it more than one watch that gets cancelled haha?

In some tests, it is more than one watch and in some it it one. I think the comment conveys the message enough.

t.Fatalf("Did not receive the expected error, instead received: %v", rhu.err.Error())
case update := <-updateCh:
if !strings.Contains(update.err.Error(), wantErr) {
t.Fatalf("Update received with error %v, want error containing %q", update.err, wantErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the second formatting directive %q vs %v?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo.

@zasweq zasweq assigned easwars and unassigned zasweq Oct 6, 2023
Copy link
Contributor Author

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

return fmt.Errorf("unexpected resources %v of type %q requested", req.GetResourceNames(), req.GetTypeUrl())
}
select {
case <-namesCh:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do create the channel with a buffer of 1. So, we are not concerned about losing an update completely. And when the tests do read the update when they expect one, we are good. Most of our tests which verify request names sent to the management server do this and run without flakes.

The issue with blocking here (instead of losing the update) is that unpredictability around go-control-plane server's behavior. For example, if you have resources of two types and you update one of them, the server will send all resources back to the client because we use a snapshot cache, and once the snapshot changes, the server will send the whole thing to the client.

func verifyUpdateFromChannel(ctx context.Context, t *testing.T, updateCh chan rdsHandlerUpdate, wantUpdate rdsHandlerUpdate) {
t.Helper()

opts := []cmp.Option{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same. I use this approach only when I have a lot of options. And here it makes total sense since some of those options are taking a function.

Comment on lines 118 to 119
cmpopts.SortSlices(func(s1, s2 string) bool { return s1 < s2 }),
cmpopts.SortMaps(func(s1, s2 string) bool { return s1 < s2 }),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I remember there was some non-determinism which is why I added these options, but now I don't see them anymore. I ran the tests a thousand times locally and couldn't repro any flakes. So, I'm getting rid of these now.

Comment on lines 126 to 129
t.Fatalf("Got unexpected route update, diff (-got, +want): %v", diff)
}
case <-ctx.Done():
t.Fatal("Timed out waiting for update from update channel.")
t.Fatal("Timed out waiting for a route config update")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched the one above this to route config update

rh, fakeClient, ch := setupTests()
// Tests the case where the rds handler receives an update with two routes, then
// receives an update with only one route. The rds handler is expected to cancel
// the watch for the route no longer present, and push a corresponding update.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 306 to 310
// Tests the case where the rds
// handler receives an update with two routes, and then receives an update with
// two routes, one previously there and one added (i.e. 12 -> 23). This should
// cause the route that is no longer there to be deleted and cancelled, and the
// route that was added should have a watch started for it.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

t.Fatalf("Did not receive the expected error, instead received: %v", rhu.err.Error())
case update := <-updateCh:
if !strings.Contains(update.err.Error(), wantErr) {
t.Fatalf("Update received with error %v, want error containing %q", update.err, wantErr)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo.

Comment on lines +374 to +376
// Close the rds handler and verify that the watch is canceled.
rh.close()
waitForResourceNames(ctx, t, resourceNamesCh, []string{})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also isn't it more than one watch that gets cancelled haha?

In some tests, it is more than one watch and in some it it one. I think the comment conveys the message enough.

}
if routeNameDeleted != route1 {
t.Fatalf("xdsClient.CancelRDS called for route %v, want %v", routeNameDeleted, route1)
// Verify that the update is pushed to the handler's update channel.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as mentioned in the other comment.

Comment on lines +314 to +322
// Create an rds handler and give it two routes to watch.
updateCh := make(chan rdsHandlerUpdate, 1)
rh := newRDSHandler(xdsC, updateCh)
rh.updateRouteNamesToWatch(map[string]bool{route1: true, route2: true})

// Verify that the given routes are requested.
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()
if err := waitForFuncWithNames(ctx, fakeClient.WaitForWatchRouteConfig, route1, route2); err != nil {
t.Fatalf("Error while waiting for names: %v", err)
}
waitForResourceNames(ctx, t, resourceNamesCh, []string{route1, route2})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see what is common that can be pulled into a helper. Not all tests start with two routes. Some start with one. And I think this is a core portion of the test: creating the handler, giving it one or more names to watch and verifying that it sends RDS requests for the same. Pulling that out into a helper isn't very helpful in my opinion since the reader of the test will not be able to glance over that detail.

@easwars easwars merged commit eb33677 into grpc:master Oct 6, 2023
13 checks passed
@easwars easwars deleted the xds_server_cleanup_1 branch October 6, 2023 18:38
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 3, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants