Provide custom keepalive ping handler #2166
@gyuho, no, there are no current plans to do this. Is there a reason you want to perform this check when you get keepalive pings as opposed to alternatives (e.g. polling periodically)? It may be possible to do something similar using a custom handshaker on the server that kills the connection when certain conditions arise, or by making an RPC initiated by the client that takes the place of the keepalive pings. I'm not sure I fully understand your design, however, so this may or may not help.
@dfawley I wasn't clear enough. Our main use case for keepalive pings is to detect an unresponsive "server" from the "client" side for long-running stream RPCs. Keepalive detects disconnects at the transport layer, but there is no way to extend it to cover application-layer conditions (as mentioned above, a partitioned node's keepalive handler will still answer pings, so the node is marked as active and the client stream stays stuck with the partitioned node).
We want to use keepalive since it's lightweight (an 8-byte HTTP/2 ping) and built into the gRPC server. We have a similar use case in the gRPC server interceptor layer, where every RPC goes through a leader check to see whether there is an active leader. I would imagine we could do the same thing in a keepalive handler. We haven't tried https://godoc.org/google.golang.org/grpc/credentials#TransportCredentials, but it seems that is for the connection handshake? Thanks!
If you were to kill the connection, how would the client know which node to reconnect to? Could you perform this check in a server interceptor and kill the watch stream (instead of the connection itself) when the partitioning happens? Note that your streaming interceptor has full control of the stream and can terminate it early with an error at any time -- not just at the start of the RPC or when traffic happens. E.g. something like this might work:

func interceptor(srv interface{}, ss grpc.ServerStream, info *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
	errChan := make(chan error, 1)
	go func() {
		// Run the wrapped handler; its result is delivered on errChan.
		errChan <- handler(srv, ss)
		close(errChan)
	}()
	select {
	case <-partitioned: // written to by something monitoring for partitioning
		return partitionedErr
	case err := <-errChan:
		return err
	}
}
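For completeness, a minimal sketch (not from the thread; the partition signal and error are hypothetical placeholders for whatever monitors cluster health) of how such an interceptor could be wired into a server using the standard grpc.StreamInterceptor option:

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Hypothetical partition signal: written to (or closed) by whatever
// monitors for partitioning.
var partitioned = make(chan struct{})

// Hypothetical error returned to streams when the node is partitioned.
var partitionedErr = status.Error(codes.Unavailable, "node partitioned from cluster")

// Register the streaming interceptor from the snippet above.
srv := grpc.NewServer(grpc.StreamInterceptor(interceptor))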
Once disconnected, the etcd clientv3 balancer's picker policy will either round-robin across or switch to other nodes, based on the latest node health status, to retry the requests. We just need some indication that this node is not responding.
Yes, and we support that as an option (the user can specify require-leader metadata in their contexts).
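For context, etcd clientv3 exposes this option as clientv3.WithRequireLeader; a minimal usage sketch (assuming an existing *clientv3.Client named cli; the key is hypothetical, and the import path depends on the etcd version):

import (
	"context"

	"github.com/coreos/etcd/clientv3"
)

// Attach require-leader metadata so the server-side interceptor can
// fail the stream when its node loses the leader.
ctx := clientv3.WithRequireLeader(context.Background())
wch := cli.Watch(ctx, "my-key")
for resp := range wch {
	if err := resp.Err(); err != nil {
		// e.g. the node lost its leader; re-watch via another endpoint.
		break
	}
}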
I see. We will look into this. I was just curious whether the gRPC team has any plan to support a custom keepalive handler. If you recommend the server interceptor pattern, we are also happy with that for now. Thanks for the response! Please feel free to close this if there's no plan to support a custom keepalive handler in the near future; we will revisit when we find other use cases.
Yes, I think the interceptor pattern would be recommended for this. Keepalives should only be necessary for our internal implementation, so I would like to avoid exporting a hook for them unless there is a strong enough need to justify it (i.e. there's no other/better way to accomplish the same thing). Thanks for the request.
What version of gRPC are you using?
v1.12.2
What version of Go are you using (go version)?
Go 1.10.3
What operating system (Linux, Windows, …) and version?
Linux
What did you do?
Use keepalive.
What did you expect to see?
Customizable keepalive ping handler.
What did you see instead?
The keepalive ping handler is internal to gRPC.
Currently, the server can only configure keepalive durations:
grpc-go/keepalive/keepalive.go
Lines 41 to 45 in 2ab07fd
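For reference, this is roughly what the server can configure today: only ping timing, via keepalive.ServerParameters and the grpc.KeepaliveParams server option (the values below are illustrative):

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

srv := grpc.NewServer(grpc.KeepaliveParams(keepalive.ServerParameters{
	Time:    10 * time.Second, // ping the client if the connection is idle this long
	Timeout: 3 * time.Second,  // close the connection if the ping is not acked in time
}))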
And its keepalive handler is internal to gRPC:
grpc-go/transport/http2_server.go
Lines 854 to 857 in b28608a
etcd uses this keepalive ping to detect an unavailable server or client for stream RPCs (e.g. the Watch API).
etcd is typically deployed as a cluster of 3 to 5 nodes. A client watch stream may fail over to other nodes when a node becomes unavailable. However, once a stream is established to a node that then becomes network-partitioned, the watch client gets stuck with the partitioned node.
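For context, the client-side keepalive that drives this detection is configured with keepalive.ClientParameters via grpc.WithKeepaliveParams; a minimal sketch (the endpoint and durations are illustrative, and WithInsecure is used only to keep the example short):

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/keepalive"
)

conn, err := grpc.Dial("etcd-node-1:2379", // hypothetical endpoint
	grpc.WithInsecure(),
	grpc.WithKeepaliveParams(keepalive.ClientParameters{
		Time:                10 * time.Second, // send a ping when idle this long
		Timeout:             3 * time.Second,  // close the connection if no ack in time
		PermitWithoutStream: true,             // ping even without active RPCs
	}))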
etcd's current workaround is to specify WithRequireLeader in the client context to pass metadata to the server interceptor, but this is optional in etcd. Another workaround is to add a stream progress request (etcd-io/etcd#9869) to check whether the node is up to date. It would be very useful if etcd could configure the server-side keepalive ping handler to run an application-layer health check (e.g., whether the node still has a leader) and fail the ping when the check fails, as sketched below.
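Purely as an illustration of the request (this hook does not exist in grpc-go; the field and helper below are hypothetical), the handler could be an extra callback on keepalive.ServerParameters:

// Hypothetical sketch -- not a real grpc-go API.
type ServerParameters struct {
	Time    time.Duration // existing field: ping interval
	Timeout time.Duration // existing field: ping ack timeout
	// OnPing would be invoked for each incoming HTTP/2 keepalive ping.
	// A non-nil error would make the server stop acking (or close the
	// connection), so a partitioned node looks dead to its clients.
	OnPing func() error
}

// etcd could then plug in its leader check (hasLeader is hypothetical):
// OnPing: func() error {
// 	if !hasLeader() {
// 		return errors.New("no leader")
// 	}
// 	return nil
// },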
Does the gRPC team have any plan to expose the keepalive ping handler?
/cc @jpbetz @xiang90