Envoy to pass hits_addend to RateLimitService #12969
Comments
Hey @mattklein123, I've created this issue to follow up on envoyproxy/ratelimit#167. Please let me know if you think any more details would be helpful. Thanks! |
Yeah this makes sense to me. Marking help wanted. |
any news regarding this? |
Pinterest is interested in this also. cc @fishcakez @JuniorHsu |
Hi, any news regarding this? We would like to use it to limit the token/minutes for the LLM use case, as they are usually limited by |
Related work. |
@lizzzcai In case you are using the OpenAI API, I think they limit on request token + response token. So further work would be required either in the ratelimit filter or another new filter so the response token can be sent to the ratelimit sidecar on the response flow. |
Hi @PeterL328, thanks for your update. I will follow your other PR for progress.
For our case, we are using Azure OpenAI. However, I think the limit is not on the Reference: Azure OpenAI
|
Hi @lizzzcai, yes, you can use the max token value on the response, but it will not be accurate, if accuracy is what you need when tracking usage. |
I have opened #34184 as a potential solution to setting |
After #34184 is merged, can we close this? |
Hi @EItanya, I'm trying to use the hits addend with istio. Can you please provide me an example of how to configure this as an EnvoyFilter? I was trying to use the set filter state filter to set the I get the following error. Error adding/updating listener(s) virtualInbound: 'envoy.ratelimit.hits_addend' does not have an object factory. |
please use master branch |
Hi @zirain, I managed to build and use the pilot and proxyv2 images of Istio from the master branch. I am trying to create the EnvoyFilter objects. Is this the correct way to set the envoy.ratelimit.hits_addend filter state from a request header called hits, before the rate limit filter? |
be careful of inserting a filter based on something that is created by another envoyfilter. |
Seeing the same problem as the one described by @OS-ramamurtisubramanian on the latest
Error log is:
Is there another configuration to add? |
I cannot recall, but can you give it a try with the main branch? |
@zirain I just tried using a freshly built envoy binary from the
Configuration tested:
same error:
|
I'm not sure how you built it; I cannot reproduce it on my machine.
|
@zirain, sorry, I must have missed something in my first build (I was using the docker script provided). Trying with your command indeed works. Thank you! |
Description
The RLS v3 API describes the RateLimitService as able to ingest a hits_addend field that determines the number of tokens to use for the rate limiting request.
Envoy should provide a way to extract a value from a request header (or some other source) to populate this field on a per-request basis. If hits_addend is only static, then it is effectively the same as modifying the rate limit itself.
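To illustrate the point about a static hits_addend, here is a rough sketch in Python. It models a simplified fixed-window counter, which is an assumption made for illustration and not how Envoy or the ratelimit service is actually implemented. The function name `allowed_requests` is hypothetical.

```python
def allowed_requests(limit: int, hits_addend: int) -> int:
    """Count requests admitted in one window by a simplified fixed-window counter.

    Assumption: a request is admitted only if adding its hits_addend keeps the
    counter at or below the limit. A real rate limit service may differ in detail.
    """
    if hits_addend <= 0:
        raise ValueError("hits_addend must be positive in this sketch")
    used = 0
    admitted = 0
    while used + hits_addend <= limit:
        used += hits_addend
        admitted += 1
    return admitted


# A static addend of 5 against a limit of 100 admits as many requests
# as an addend of 1 against a limit of 20.
static_addend = allowed_requests(limit=100, hits_addend=5)
scaled_limit = allowed_requests(limit=20, hits_addend=1)
print(static_addend, scaled_limit)
```

Under this model a fixed addend only rescales the limit, so the field becomes useful once its value can vary from request to request.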
Use case
Allow the HTTP Rate Limit Filter to be configured with a request header that contains an integer hits_addend value to send with the rate limit request. This would make rate limiting considerably more configurable.
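As a rough sketch of the requested behavior, the Python below reads an integer from a request header and turns it into a hits_addend value. This is not Envoy code. The function name, the fallback to 1 for missing or invalid values, and the assumption that header names arrive lowercased are all illustrative. The default header name `hits` is borrowed from the comment thread, and the upper bound reflects hits_addend being a uint32 in the RLS v3 API.

```python
from typing import Mapping

DEFAULT_HITS_ADDEND = 1   # assumption: fall back to one hit per request
UINT32_MAX = 2**32 - 1    # hits_addend is a uint32 field in the RLS v3 API


def hits_addend_from_headers(
    headers: Mapping[str, str], header_name: str = "hits"
) -> int:
    """Return the hits_addend for a request, read from a header (hypothetical helper).

    Missing, non-numeric, negative, or out-of-range values fall back to the default.
    Header names are assumed to be lowercased already.
    """
    raw = headers.get(header_name)
    if raw is None:
        return DEFAULT_HITS_ADDEND
    raw = raw.strip()
    # Accept plain ASCII digits only, so signs and decimals are rejected.
    if not (raw.isascii() and raw.isdigit()):
        return DEFAULT_HITS_ADDEND
    value = int(raw)
    if value > UINT32_MAX:
        return DEFAULT_HITS_ADDEND
    return value


# Example: a request that should consume 42 tokens from the limit.
print(hits_addend_from_headers({"hits": "42"}))
```

Falling back to the default rather than rejecting the request is one possible design choice; a stricter filter could instead fail requests that carry a malformed header.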