cdsbalancer: test cleanup part 3/N #6564
Conversation
Some comments, particularly about lost assertions.
@@ -18,1078 +18,650 @@ package cdsbalancer
Overall comment on this file: I like how you rewrote the tests to expect a child config to the cluster_resolver corresponding to the leaf clusters. However, I feel like by switching the tests over we drop a lot of the coverage w.r.t. the cds balancer not sending an update because the aggregate cluster graph hasn't been filled out yet (i.e. all the roots). You have a case for this with A->A not sending, but many tests drop the verification that no update is sent. Maybe add them back by only adding management server resources for a given cluster when needed, and asserting that no update is sent before those resources are added (instead of filling out the full tree at the beginning of the test); see the sketch below.
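A minimal sketch of the suggested test shape, assuming a harness where pushed child configs arrive on a channel. The names `cfgCh` and `addMissingCluster` are hypothetical stand-ins for the test's actual plumbing, not grpc-go APIs:

```go
package cdsbalancer_test

import (
	"context"
	"testing"
	"time"
)

// verifyNoConfigThenAdd captures the suggested pattern: assert that the
// cds balancer pushes no child config while part of the aggregate
// cluster graph is still missing, then add the missing management
// server resource and expect a config to follow.
func verifyNoConfigThenAdd(t *testing.T, cfgCh <-chan string, addMissingCluster func()) {
	t.Helper()

	// No config should be pushed while the graph is incomplete. Absence
	// of an event can only be checked for a bounded window, so use a
	// short timeout here.
	sCtx, sCancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer sCancel()
	select {
	case cfg := <-cfgCh:
		t.Fatalf("child config pushed before cluster graph was complete: %v", cfg)
	case <-sCtx.Done():
		// Expected: nothing was pushed within the window.
	}

	// Only now add the management server resource for the missing
	// cluster; a config should arrive shortly after.
	addMissingCluster()
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	select {
	case <-cfgCh:
	case <-ctx.Done():
		t.Fatal("timed out waiting for child config after adding the missing cluster resource")
	}
}
```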
A specific example of a dropped assertion: https://github.com/grpc/grpc-go/pull/6564/files#diff-6eccd90237cb4345aac6d80e2e740c5d686f42cab9799b3eaf261e392785af54L288. I think these checks are important, but luckily we can plumb them into the e2e style as well :).
I've added this check to the following tests:
- TestAggregateClusterSuccess_ThenUpdateChildClusters
- TestAggregatedClusterSuccess_DiamondDependency
- TestAggregatedClusterSuccess_IgnoreDups

The first one checks the base case for aggregate clusters, while the next two test some other interesting cases.
// This shouldn't cause an update to be written to the update buffer,
// as cluster C has not received a cluster update yet.
Another example of a scenario that is lost. This is a corner case, but it's important: cluster C gets a callback while it is still unknown, before it specifies the same child D, and it exercises the flip from "this aggregate cluster is not yet ready" to "it is ready to send an update, since the whole tree (including both B's and C's children) has received updates".
Specifically, tests this complicated algorithm/data structure of clusters seen: https://github.com/grpc/grpc-go/blob/master/xds/internal/balancer/cdsbalancer/cluster_handler.go#L215.
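A simplified re-sketch of the idea behind that code (an illustrative model, not the actual cluster_handler.go implementation): the handler may only write to the update buffer once every cluster reachable from the root has received an update, and a seen set keeps diamond dependencies from being revisited:

```go
package main

import "fmt"

// clusterNode is a simplified stand-in for the cluster handler's state:
// each node tracks whether it has received an xDS update and, for
// aggregate clusters, which children that update named.
type clusterNode struct {
	name           string
	receivedUpdate bool
	children       []*clusterNode
}

// treeReady reports whether every cluster reachable from root has
// received an update; only then may an update be written to the buffer.
func treeReady(root *clusterNode, seen map[string]bool) bool {
	if seen[root.name] {
		return true // already visited (duplicate child or diamond)
	}
	seen[root.name] = true
	if !root.receivedUpdate {
		return false
	}
	for _, c := range root.children {
		if !treeReady(c, seen) {
			return false
		}
	}
	return true
}

func main() {
	// A -> {B, C}, where C later points to D: until both C's update
	// (naming D) and D's own update arrive, the tree is not ready.
	d := &clusterNode{name: "D"}
	b := &clusterNode{name: "B", receivedUpdate: true}
	c := &clusterNode{name: "C", children: []*clusterNode{d}}
	a := &clusterNode{name: "A", receivedUpdate: true, children: []*clusterNode{b, c}}

	fmt.Println(treeReady(a, map[string]bool{})) // false: C and D still pending
	c.receivedUpdate = true
	d.receivedUpdate = true
	fmt.Println(treeReady(a, map[string]bool{})) // true: whole tree has updates
}
```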
Done.
Force-pushed from eea79f9 to 9b1a660.
@zasweq: Had to force push since I had to rebase onto master.
Thanks so much for adding back assertions. E2E tests look solid here. I remember writing the left red side two months into my tenure :). LGTM.
mgmtServer, nodeID, _, _, _, _, _ := setupWithManagementServer(t)

// Configure the management server with the aggregate cluster resource
// pointing to two child clusters. But don't include resource corresponding
Specify the child clusters, and mention that you do include the EDS cluster resource. This applies to all the aggregate cluster comments added in commit 3 explaining why no update is sent yet.
Done.
// Tests the case where the cluster resource requested by the cds LB policy is
// an aggregate cluster root pointing to two child clusters, one of type EDS and
// the other of type LogicalDNS. Verifies that the load balancing configuration
// pushed to the cluster_resolver LB policy is as expected. The test then
// updates the aggregate cluster to point to two child clusters, the same leaf
// cluster of type EDS and a different leaf cluster of type LogicalDNS and
// verifies that the load balancing configuration pushed to the cluster_resolver
// LB policy contains the expected discovery mechanisms corresponding to the
// child clusters.
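As an illustration of how an aggregate tree maps to discovery mechanisms in the cluster_resolver config, here is a self-contained model. The `cluster` type and `leafClusters` helper are invented for this sketch (grpc-go's real types live in its xds internal packages): a depth-first walk collects the leaf clusters in order while skipping duplicates, which is also what makes the diamond-dependency and ignore-dups cases interesting:

```go
package main

import "fmt"

// cluster is a hypothetical, simplified model: aggregate clusters have
// children; EDS and LogicalDNS clusters are leaves.
type cluster struct {
	name     string
	typ      string // "AGGREGATE", "EDS", or "LOGICAL_DNS"
	children []*cluster
}

// leafClusters walks the aggregate tree depth-first and returns the
// leaf clusters in order, skipping any cluster already seen. Each leaf
// corresponds to one discovery mechanism in the child config.
func leafClusters(root *cluster, seen map[string]bool) []*cluster {
	if seen[root.name] {
		return nil // duplicate child or diamond dependency
	}
	seen[root.name] = true
	if root.typ != "AGGREGATE" {
		return []*cluster{root}
	}
	var leaves []*cluster
	for _, c := range root.children {
		leaves = append(leaves, leafClusters(c, seen)...)
	}
	return leaves
}

func main() {
	eds := &cluster{name: "eds-cluster", typ: "EDS"}
	dns := &cluster{name: "dns-cluster", typ: "LOGICAL_DNS"}
	// The duplicate eds child is ignored, mirroring the IgnoreDups case.
	root := &cluster{name: "root", typ: "AGGREGATE", children: []*cluster{eds, dns, eds}}

	for _, l := range leafClusters(root, map[string]bool{}) {
		fmt.Printf("discovery mechanism: cluster=%s type=%s\n", l.name, l.typ)
	}
	// Prints:
	//   discovery mechanism: cluster=eds-cluster type=EDS
	//   discovery mechanism: cluster=dns-cluster type=LOGICAL_DNS
}
```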
For cases where you added verification that no cluster update is sent, please add that to the top-level test comments.
Done.
This PR cleans up aggregate cluster tests in the cds LB policy. It makes the tests verify behavior rather than internal state. Cleaning up these tests helped identify the issue described in #6562.
#resource-agnostic-xdsclient-api
RELEASE NOTES: none