transport: Pass Header metadata to tap handle. #6652
Conversation
May I ask your use case for the tap handle? The other gRPC languages don't have this construct (and seem to be doing just fine without it), and we recently noticed that it's excluded from metrics & tracing, which is not very desirable from an observability perspective. We may not want to introduce metrics callouts specifically for it either, given its sensitive location. So I was actually considering just calling it deprecated and not recommending anyone use it.
Hi. Our use case is to drop requests early, before spending too many resources on them. We use this when the server is already unable to accept more messages (e.g. too many in-flight requests). We have tried to use interceptors before, but that did not provide adequate protection. From what I understand, gRPC-go processes a request in the following steps (discussing the unary case):

1. The server transport reads the request headers.
2. The tap handle (if configured) is called.
3. A new goroutine is spawned to handle the stream.
4. The message payload is read into a bytes buffer.
5. The payload is unmarshaled into the request message.
6. Interceptors and the method handler are called.
As can be seen in this description, by the time interceptors run, quite a lot of resources have already been used: possibly a new goroutine spawned, and the request held in memory twice (once as a bytes buffer, once as the unmarshaled message). We want to avoid all of that when we know the server can't possibly handle any more requests at the moment. When we tried to use interceptors, we couldn't implement the protection effectively and could still crash the server under load, for example by sending enough big messages, in which case the server simply OOMs.
We've also noticed the exclusion from metrics, because our metrics get updated in the stats handler and in interceptors. However, that's easy to fix: we can increment the relevant metric when rejecting a request in the tap handle. As for tracing, in our setup the client would log the returned gRPC error to its trace span. (But we use gRPC only for internal communication.) I think that even with these limitations, the tap handle is a useful feature and gives us much-needed protection against overloading the server with too many messages.
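For readers unfamiliar with the feature under discussion, here is a minimal sketch of the kind of load-shedding tap handle described above, built on the existing grpc.InTapHandle option and tap.Info.FullMethodName. The limit value, the health-check exemption, and the metric placement are illustrative assumptions, not code from this PR.

```go
package main

import (
	"context"
	"sync/atomic"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
	"google.golang.org/grpc/tap"
)

const maxInflight = 1000 // illustrative capacity limit

// Incremented here; the decrement would live elsewhere, e.g. in a stats
// handler when the RPC ends, since the tap handle has no completion callback.
var inflight atomic.Int64

// loadSheddingTap rejects new RPCs before the payload is read or a handler
// goroutine is spawned, once the server is at capacity.
func loadSheddingTap(ctx context.Context, info *tap.Info) (context.Context, error) {
	// Never shed health checks (per-method decisions are exactly why the
	// tap handle needs more request details).
	if info.FullMethodName == "/grpc.health.v1.Health/Check" {
		return ctx, nil
	}
	if inflight.Load() >= maxInflight {
		// The "increment the relevant metric on rejection" mentioned above
		// would go here.
		return nil, status.Error(codes.ResourceExhausted, "server overloaded, try again later")
	}
	inflight.Add(1)
	return ctx, nil
}

func main() {
	s := grpc.NewServer(grpc.InTapHandle(loadSheddingTap))
	_ = s // register services and call Serve as usual
}
```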
Thanks for all the detailed info - it was really helpful! Do you happen to work with gRPC in other languages besides Go (on the server side)?

I'm not sure, but I believe the recommended way of dealing with these kinds of issues would be to use a LimitingListener and MaxConcurrentStreams instead. Have you tried these, by any chance?

It's also possible the other languages tend to provide interceptors at a lower level. Especially if you're looking at the message payload size in unary RPCs, I believe the other languages provide interceptors that are more like our streaming interceptors, but for all RPCs, that do not require the payload to be received and decoded before calling them, as our unary interceptors do. Unary interceptors can be nice for usability, but otherwise have some undesirable properties (e.g. they require implementing things twice if you have both unary and streaming RPCs that want the same interceptor behavior). To that end, we do have #1805 open to provide a streaming-style interceptor even for unary RPCs. It hasn't been prioritized, but might allow for sufficiently early termination of RPCs - it would cut out steps 4 and 5, saving 2 copies of the message payload. Could something like that potentially work for you in place of the tap handle feature?
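For concreteness, a minimal sketch of the two knobs mentioned here, assuming "LimitingListener" refers to golang.org/x/net/netutil.LimitListener; the numeric limits are placeholders.

```go
package main

import (
	"log"
	"net"

	"golang.org/x/net/netutil"
	"google.golang.org/grpc"
)

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	// Cap the number of simultaneously accepted connections.
	lis = netutil.LimitListener(lis, 100)

	// Cap the number of concurrent streams (RPCs) per connection.
	srv := grpc.NewServer(grpc.MaxConcurrentStreams(10))

	// Register services here, then serve.
	log.Fatal(srv.Serve(lis))
}
```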
Thank you for the reply!
Not at the moment.
We have experimented with MaxConcurrentStreams. One problem with this approach is how to find the correct values. For example, let's say we know that we can handle 1000 concurrent requests before OOMing. Our number of clients sending requests is dynamic, and we autoscale them with the load, so we would need to keep adjusting the per-connection limit as the number of clients changes. Furthermore, changing MaxConcurrentStreams requires recreating the server, since it is a server option set at construction time.

Another problem is that this approach doesn't allow us to distinguish between various methods. E.g. we never want to limit health checks. This is also the reason for this PR: we want more details to distinguish requests 😄
I think it would be a step in the right direction, although I'm worried that step 3 -- creation of a goroutine -- may still be a problem (if many clients send requests all at once and create so many goroutines that the server crashes with OOM). To give you a definitive "Yes", we would need to load-test it. The goroutines problem could perhaps be solved by implementing a hard limit on the number of workers, although once again it would probably be global to the gRPC server, and not per-method.
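As a side note on the goroutine concern: grpc-go does expose an experimental grpc.NumStreamWorkers server option that reuses a fixed pool of goroutines for incoming streams. As far as I understand it is not a strict cap (the server can still fall back to spawning a goroutine per stream when all workers are busy), so it would only mitigate, not prevent, the scenario described above. A minimal sketch:

```go
package main

import (
	"runtime"

	"google.golang.org/grpc"
)

func main() {
	// Experimental option: handle incoming streams on a fixed pool of worker
	// goroutines instead of one new goroutine per stream. The setting is
	// global to the server (not per-method), and the server may still spawn
	// extra goroutines when all workers are busy.
	srv := grpc.NewServer(grpc.NumStreamWorkers(uint32(runtime.NumCPU())))
	_ = srv // register services and call Serve as usual
}
```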
I think we can take this PR, but be aware that use of the tap handler feature is not recommended.
Also, can you please add a test for this? There is an existing TestTap in test/end2end_test.go that you can probably pretty easily extend (func testTap(t *testing.T, e env), line 2111 at fd9ef72).
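Below is a standalone sketch (not the end2end harness used by testTap) of the kind of assertion the extended test needs to make. It assumes the field added by this PR is named Header, per the PR title, and uses the bundled health-check proto only as a convenient request type; the health service is deliberately not registered.

```go
package tap_test

import (
	"context"
	"net"
	"testing"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/credentials/insecure"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
	"google.golang.org/grpc/metadata"
	"google.golang.org/grpc/status"
	"google.golang.org/grpc/tap"
)

func TestTapSeesHeaderMetadata(t *testing.T) {
	gotHeader := make(chan metadata.MD, 1)
	tapHandle := func(ctx context.Context, info *tap.Info) (context.Context, error) {
		select {
		case gotHeader <- info.Header: // Header: the field added by this PR (assumed name)
		default:
		}
		return ctx, nil
	}

	lis, err := net.Listen("tcp", "localhost:0")
	if err != nil {
		t.Fatalf("failed to listen: %v", err)
	}
	srv := grpc.NewServer(grpc.InTapHandle(tapHandle))
	defer srv.Stop()
	go srv.Serve(lis)

	conn, err := grpc.Dial(lis.Addr().String(), grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		t.Fatalf("failed to dial: %v", err)
	}
	defer conn.Close()

	// The health service is not registered, so the RPC fails with
	// Unimplemented -- but the tap handle still runs before that.
	ctx := metadata.AppendToOutgoingContext(context.Background(), "test-key", "test-value")
	if _, err := healthpb.NewHealthClient(conn).Check(ctx, &healthpb.HealthCheckRequest{}); status.Code(err) != codes.Unimplemented {
		t.Fatalf("unexpected RPC error: %v", err)
	}

	md := <-gotHeader
	if got := md.Get("test-key"); len(got) != 1 || got[0] != "test-value" {
		t.Fatalf("tap handle saw header %v, want test-key=test-value", md)
	}
}
```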
👍 Understood. The best I can hope for is that it won't be removed anytime soon (before a usable alternative exists). Thank you!
Thank you for the pointer. I was wondering how to add a test for this change and had missed the end2end test!
Force-pushed from 35fa329 to 9f727c4.
I've extended the existing test to cover this change. Please take a look again.
Remove unwanted changes.
+zasweq for a second look
LGTM.
Thank you!
Apologies, this was a fantasy that entered my mind during intensive debugging.
This PR extends the tap.Info struct with metadata, and modifies http2_server.go to pass the available metadata to the ServerInHandle function. This gives the tap handle more details that it can use to reject requests early.
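To illustrate what this enables, here is a hedged sketch of a tap handle using the newly passed metadata; the field name Header (per the PR title) and the "x-tenant" key are assumptions for illustration only.

```go
package main

import (
	"context"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
	"google.golang.org/grpc/tap"
)

// tenantAwareTap rejects requests early, before the payload is read, based on
// the request metadata now available in tap.Info (assumed field name: Header).
func tenantAwareTap(ctx context.Context, info *tap.Info) (context.Context, error) {
	if len(info.Header.Get("x-tenant")) == 0 {
		return nil, status.Error(codes.PermissionDenied, "missing x-tenant header")
	}
	// Per-tenant rate or concurrency checks could go here.
	return ctx, nil
}

func main() {
	srv := grpc.NewServer(grpc.InTapHandle(tenantAwareTap))
	_ = srv // register services and call Serve as usual
}
```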
RELEASE NOTES: