Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clusterimpl: update picker synchronously upon receipt of configuration update #7210

Open
easwars opened this issue May 7, 2024 · 3 comments
Assignees
Labels
Area: xDS Includes everything xDS related, including LB policies used with xDS. P2 Type: Bug

Comments

@easwars
Copy link
Contributor

easwars commented May 7, 2024

#5469 recommends an audit of existing LB policies to ensure that they update their pickers synchronously upon receipt of a configuration update.

xds_cluster_impl_experimental does not update its picker synchronously.

This needs to be changed to return a picker synchronously upon receipt of a configuration update.

@aranjans aranjans self-assigned this May 8, 2024
@aranjans
Copy link
Contributor

aranjans commented May 9, 2024

@easwars I'd be working on this issue.

@easwars
Copy link
Contributor Author

easwars commented Aug 12, 2024

After some thought, I've come to the understanding that I don't necessarily like the approach taken in the PR linked above because it still causes multiple picker updates and testing is very tied to the implementation rather than behavior. Instead I suggest we take the following approach. We can do this step-by-step:

  • Stop using deprecated APIs in the gracefulswitch.Balancer
    • Currently clusterimpl has a single child and uses a gracefulswitch.Balancer to support the case where the child policy name changes. And this is currently handled by calling the SwitchTo method on the gracefulswitch.Balancer. Instead we should have clusterimpl build JSON config in the format expected by the gracefulswitch.Balancer and call UpdateClientConnState instead.
    • Add an e2e style test for this as part of this change
      • As part of looking into these changes, I realized that we have no test for this scenario.
        Currently clusterimpl
  • Use a grpcsync.CallbackSerializer instead of a run goroutine to synchronize updates from the channel and updates from the child policy.
    • This is greatly simplify the implementation since we will not longer require a mutex, or a closed event, or a done event etc.
  • I feel that testing coverage is quite thin for the clusterimpl LB policy as a whole.
    • We should audit C-core and/or Java and see what tests they have and add them here. Mostly e2e style tests.
  • Once we have the above changes, updating the picker synchronously would be simple:
    • As part of handling a config update from the channel, the clusterimpl would do the following:
      • inhibit picker updates from the child policy
      • process the config update from the channel
      • resume picker updates, and as part of this send a new picker to the channel
      • At the end of handing a config update, whether or not an update from a child policy was received, the clusterimpl LB policy should send a single picker update to the channel.

@arjan-bal
Copy link
Contributor

@aranjans can this issue be closed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: xDS Includes everything xDS related, including LB policies used with xDS. P2 Type: Bug
Projects
None yet
5 participants