[Fix] Enforce rate limits on 429s #2642
Merged
Prior to this change, webhook requests were not being properly rate limited. In RequestQueueBucket.cs, when a 429 occurred, the UpdateRateLimit function was only setting a delay timer for the rate limit, but was otherwise not blocking requests until that timer expired.
See #2592, which may be resolved by this PR.
Problem 1: 429s not being enforced
In that class, the integer _semaphore contains the number of requests remaining before the rate limit will be activated. If that value is non-positive, then according to the API we are being rate limited.
The problem was that _semaphore was not being set to zero when a 429 occurred, so requests weren't actually being limited.
Problem 2: Rate limits being updated twice, with different values
It was also theoretically possible for the Remaining value coming back from the response to be zero, indicating that we are out of remaining requests, and in that case _semaphore was likewise not being set to zero. To avoid any problems that could have stemmed from that, I added code to update _semaphore to the Remaining value on every request (prior to checking for a 429).
That change uncovered another lurking problem: on 429s the UpdateRateLimit function was being called twice, first for the 429 and then a second time with values made to look as if a 429 had not occurred, which undid my changes to _semaphore.
Problem 3: Using WindowCount == 0 to mean "unlimited"
I didn't dig too much into the cause of this problem, but somewhere in the midst of fixing the other two I began encountering situations where the API was returning a Limit value of zero. I think this happened because the incorrect rate limiting resulted in the API banning requests for a short time.
The problem there was that in RequestQueueBucket.cs, a limit of zero meant "no limit", and requests stopped being throttled. See this line.
I fixed this by changing the "no limit" value to -1, and therefore treating zero to mean what the API intends, namely that there is a limit of zero requests.
Test code
Any code that sends a bunch of requests on a webhook will do fine. Note that webhooks have an undocumented limit of 30/min/channel, which kicks in after 40 or so requests. This bug only gets triggered after you hit that limit and you start to receive 429s.
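As a rough illustration of "any code that sends a bunch of requests", a burst sender could look like the sketch below. The class name WebhookFlood and the send delegate are made up for this example and are not part of the PR; the delegate stands in for whatever call actually posts to the webhook.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for reproducing the bug; not part of the PR.
public static class WebhookFlood
{
    // Sends `count` numbered messages back to back and returns how many completed.
    // `send` is a placeholder for the real webhook call.
    public static async Task<int> SendBurstAsync(Func<string, Task> send, int count)
    {
        int sent = 0;
        for (int i = 1; i <= count; i++)
        {
            // No delay between sends: the point is to run into the rate limit.
            await send($"rate limit test {i}/{count}");
            sent++;
        }
        return sent;
    }
}
```

Assuming Discord.Net.Webhook's DiscordWebhookClient is used, the delegate might be text => client.SendMessageAsync(text), with a count comfortably above 40 (say 60) so that 429s start coming back.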
Bad example
Prior to the fix, the output on a failed request looks like this:
Notice that the request immediately retries, eating up all of the remaining requests until the hard stop is hit (_semaphore == 0).
Good example
With the fix, the request is only tried once, then sleeps the required amount of time before trying again (and succeeding).
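The three fixes can be summarized in a simplified model of the bucket state. This is only a sketch under stated assumptions, not the code in RequestQueueBucket.cs: the class BucketSketch, its members, and the Update signature are invented for illustration, and window resets are left out.

```csharp
using System;

// Hypothetical, simplified stand-in for the bucket state; not the real class.
public sealed class BucketSketch
{
    // Problem 3: -1 is the "no limit" sentinel, so 0 can mean "zero requests allowed".
    public const int NoLimit = -1;

    public int WindowCount { get; private set; } = NoLimit;
    public int Remaining { get; private set; }          // plays the role of _semaphore
    public DateTimeOffset? ResetsAt { get; private set; }

    // Problem 2: call this exactly once per response, so a second call cannot
    // overwrite the state that the 429 handling just set.
    public void Update(int statusCode, int? limit, int? remaining,
                       TimeSpan? retryAfter, DateTimeOffset now)
    {
        if (limit.HasValue) WindowCount = limit.Value;
        // Sync the counter with the response before checking for a 429.
        if (remaining.HasValue) Remaining = remaining.Value;

        if (statusCode == 429)
        {
            // Problem 1: force the counter to zero so later requests are held back.
            Remaining = 0;
            ResetsAt = now + (retryAfter ?? TimeSpan.Zero);
        }
    }

    // How long a caller should wait before sending the next request.
    public TimeSpan DelayBeforeNextRequest(DateTimeOffset now)
    {
        if (WindowCount == NoLimit || Remaining > 0) return TimeSpan.Zero;
        if (ResetsAt is DateTimeOffset reset && reset > now) return reset - now;
        return TimeSpan.Zero; // window already expired (resetting the count is omitted here)
    }
}
```

In this model a 429 with a two-second retry delay leaves Remaining at zero and makes DelayBeforeNextRequest return two seconds, which corresponds to the "tried once, then sleeps" behavior in the good example.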