default client balancer only returns one address #1694
Comments
Did you look at the new balancer package? Or the v1 balancer in grpc package? Can you try to use the new balancer and resolver:
    rr := balancer.Get("round_robin")
    grpc.Dial("dns:///your.target.name", // "dns:///" specifies the resolver to use
        grpc.WithCredentials(...),
        grpc.WithBalancerBuilder(rr), // use round_robin balancer
    )
Not that |
Thanks for that hint @menghanl! This seems to work now, but unfortunately it doesn't seem to query the DNS server that often. I saw the requery frequency in another package set to 30 minutes. Is this configurable? Edit: Perhaps a WithResolveNowInterval(time.Duration) option and an independent goroutine in the ClientConn which calls ResolveNow() in a loop when this option is set? I could write that if it would help. I think it would be a good, small task to get started ;) Please let me know if I can help @menghanl |
The resolve interval is decided by each resolver implementation. There are resolvers that do pushing instead of polling. From your comment in #1388, you mentioned dead connections will still be retried. This can be solved by #1679. The resolver will re-resolve whenever a connection is down. If the dead server was removed in DNS, the re-resolve will notice that and will remove it from ClientConn. MAX_CONNECTION_AGE plus #1679 would also cause the resolver to re-resolve and discover new servers. Let me know what you think about this solution. |
I tried the MAX_CONNECTION_AGE plus #1679 approach, but it doesn't trigger the re-resolving. I don't know whether the MAX_CONNECTION_AGE parameter in the server keepalive parameters of the workers is not respected, or whether the resolver is not invoked when the connection closes normally (without error). I could imagine that this happens. I do not even see SubConn state changes in the logs. When killing one of the worker pods, everything works fine and the resolver returns the new address set. |
Have you turned on info logging by importing the If killing the server manually works, however, my guess is MAX_CONNECTION_AGE isn't configured correctly or is not working correctly -- it should kill the connection and appear the same as an error to the client. |
Does the max age problem still exist? Did you get more logs for this issue? |
I tried today, but can not reproduce the issue anymore! |
What version of gRPC are you using?
glide version: ^1.7.2
ref: 5a9f7b4
What version of Go are you using (go version)?
1.9.1
What operating system (Linux, Windows, …) and version?
Linux
What did you do?
I created a service in docker swarm which serves gRPC requests with endpoint mode dnsrr (so the DNS returns multiple A records for that service).
Another service inside swarm calls this.
Dialing looks like this:
This client is then reused to serve rpc invocations.
What did you expect to see?
The calls should be dispatched round-robin to all available replicas of the target service out of the box, as documented in the go-docs (round-robin does not need to be registered, because it's the default).
What did you see instead?
Only the first replica is used to serve the requests.
Additional Notes
With WithBalancer(balancer.RoundRobin(resolver.NewDNSResolver())), it gives me an error that no addresses are available.
Do I need to set up load balancing manually for the moment?