Excessive API usage for Mozilla Firefox Add-ons (and deprecated API usage) #7584

Closed
diox opened this issue Feb 9, 2022 · 7 comments · Fixed by #7586
Labels
service-badge Accepted and actionable changes, features, and bugs

Comments

@diox
diox commented Feb 9, 2022

Are you experiencing an issue with...

shields.io

🐞 Description

Hi,

The addons.mozilla.org team is looking at removing its v3 API, as planned per the deprecation announcement. (We're in fact a bit late: we had announced we would shut it down on December 31st, 2021.)

When looking at this, we noticed a few problems with Shields.io:

  • It is using the deprecated v3 version instead of the stable v4.
  • It's making requests without the expected trailing slash, causing each request to follow a redirect to the URL with the trailing slash (doubling requests)
  • The usage seems excessive, totaling over 2 million requests per month.

Since we are going to shut down v3, you should start using v4 instead, perhaps adjusting your code depending on what this would break. Furthermore, we'd strongly urge you to find ways to avoid calling our API so much. Fixing that trailing slash issue will halve the number of requests, but please consider caching more aggressively. We can and will tweak rate limiting on our end, but not having to deal with the requests in the first place would be preferable.

🔗 Link to the badge

No response

💡 Possible Solution

No response

@diox diox added the question Support questions, usage questions, unconfirmed bugs, discussions, ideas label Feb 9, 2022
@chris48s
Member

chris48s commented Feb 9, 2022

In terms of the move from v3 to v4, thanks for flagging this to us.


When it comes to the API usage:

You've mentioned the trailing slash to cut down the number of requests that are just following a redirect.

We haven't applied any custom cache durations for Mozilla Addons so we'll be using our standard cache durations, which are:

  • Downloads, Users: 900 seconds
  • Rating, Stars: 120 seconds
  • Version: 300 seconds

These are tuned to try and strike a reasonable balance between performance and up-to-date-ness. We aim to cache things that update frequently/unpredictably or show an exact value for shorter times and things that we expect to change less frequently or show a rounded/indicative value for longer. This helps our performance as well as reducing traffic on upstream APIs.

Using your knowledge of the Mozilla Addons platform/API, is there anything there that strikes you as a mismatch? For example, if you say "we only update our download counts once per day" or something similar, that would be a good reason for us to set a much higher duration on the downloads/users badges for Mozilla Addons specifically.

@chris48s chris48s added service-badge Accepted and actionable changes, features, and bugs and removed question Support questions, usage questions, unconfirmed bugs, discussions, ideas labels Feb 9, 2022
@calebcartwright
Member

Thanks for reaching out! We definitely understand the load Shields can swing to upstream services due to the popularity/utilization of our own platform, and strive to collaborate with those upstream providers whenever possible.

It sounds like there's a couple easy things we can do on our end, but a few things worth sharing/discussing in more detail too.

It's making requests without the expected trailing slash, causing each request to follow a redirect to the URL with the trailing slash (doubling requests)

Can't say I've ever heard of such a case before, but it's also an easy change on our end that we're happy to make in the short term as part of trying to tactically reduce the load.

For anyone interested in working on this:

url: `https://addons.mozilla.org/api/v3/addons/addon/${addonId}`,
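A minimal sketch of what the adjusted URL could look like, assuming the v4 endpoint keeps the same path layout as v3 so that only the version segment and the trailing slash change. `buildAddonUrl` is a hypothetical helper for illustration, not a function from the Shields codebase, and any response-shape differences between v3 and v4 would still need checking.

```javascript
// Hypothetical helper: builds the add-on detail URL for a given API version.
// Assumption: v4 uses the same path layout as the v3 URL quoted above.
function buildAddonUrl(addonId, apiVersion = 'v4') {
  // The trailing slash avoids the extra redirect described in the issue.
  return `https://addons.mozilla.org/api/${apiVersion}/addons/addon/${addonId}/`
}

console.log(buildAddonUrl('ublock-origin'))
// https://addons.mozilla.org/api/v4/addons/addon/ublock-origin/
```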

It is using the deprecated v3 version instead of the stable v4.

Thanks for letting us know. We're a very small, volunteer-driven project that integrates with hundreds of upstream APIs/services, most of which we maintainers don't actively use ourselves. As such, we're not really tied into the latest happenings with all those services and generally rely on the community and/or those providers to let us know when something has changed or will change.

The usage seems excessive, totaling over 2 million requests per month.
Furthermore, we'd strongly urge you to find ways to avoid calling our API so much. Fixing that trailing slash issue will halve the number of requests, but please consider caching more aggressively. We can and will tweak rate limiting on our end, but not having to deal with the requests in the first place would be preferable.

Lots to drill into here. Few questions for you:

  • What time period was this 2 million/month number derived from, just the most recent month or is that what you're seeing over the last several months? I ask because we recently (past week) ran some experiments in our hosting environment and one unanticipated side effect was a mass invalidation of our caching which in turn triggered an unusual spike in the traffic we were throwing at upstream providers (@chris48s can correct me if I'm wrong)
  • Does your documentation include any info on throttling/rate limits? I tried to search but was unable to find anything
  • What would you consider to be a reasonable/nonexcessive load? Even assuming the high end of 1 million (half of the reported 2 million with the / change), that translates to well below 0.5 requests/second by my quick math. That's definitely on the lower end compared to what we do for other services, but some more info would be helpful here so that we have a more quantified picture of your perspective (as Chris has spelled out above since he managed to respond more quickly than I 😄)
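The "quick math" above can be reproduced with a short sketch. It assumes a 30-day month and an evenly spread load, neither of which is stated in the thread; `requestsPerSecond` is a hypothetical helper name.

```javascript
// Average request rate implied by a monthly total.
// Assumption: a 30-day month with traffic spread evenly over it.
function requestsPerSecond(requestsPerMonth, daysInMonth = 30) {
  const secondsInMonth = daysInMonth * 24 * 60 * 60
  return requestsPerMonth / secondsInMonth
}

console.log(requestsPerSecond(2000000).toFixed(2)) // 0.77 (reported total)
console.log(requestsPerSecond(1000000).toFixed(2)) // 0.39 (without the redirect)
```

Real traffic is unlikely to be this uniform, so peak rates may be noticeably higher than these averages.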

@chris48s
Member

chris48s commented Feb 9, 2022

I ask because we recently (past week) ran some experiments in our hosting environment and one unanticipated side effect was a mass invalidation of our caching which in turn triggered an unusual spike in the traffic we were throwing at upstream providers (@chris48s can correct me if I'm wrong)

Yes there will have been a couple of spikes over the last week where every request was a cache miss for a bit but given the longest we cache any Mozilla Addons badge for is 15 mins that probably didn't make a huge difference. If they're logging 2 requests on their end for each API call from us (1 for the redirect, 1 for the 200) that would mean ~33,000 API calls from us per day. Just having a look at https://metrics.shields.io/ we've served ~22,000 Mozilla Addons badges in the last 24 hours excluding those served from cache. Not sure how representative that is (I guess a bit low), but it seems in the right sort of order of magnitude.
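As a rough check of the estimate above, the sketch below converts the reported monthly total into daily API calls. It assumes a 30-day month and exactly two logged requests per call (redirect plus 200), as described in the comment; `estimatedDailyCalls` is a hypothetical name.

```javascript
// Estimated daily API calls from Shields, given the upstream monthly log total.
// Assumptions: 30-day month, each call logged twice upstream (redirect + 200).
function estimatedDailyCalls(loggedPerMonth, logsPerCall = 2, daysInMonth = 30) {
  return loggedPerMonth / daysInMonth / logsPerCall
}

const dailyEstimate = Math.round(estimatedDailyCalls(2000000))
console.log(dailyEstimate) // 33333

// Ratio of the ~22,000 uncached badges observed in 24 hours to that estimate.
console.log((22000 / dailyEstimate).toFixed(2)) // 0.66
```

The observed figure is about two thirds of the estimate, which is consistent with the "right sort of order of magnitude" remark.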

@diox
Author

diox commented Feb 9, 2022

What time period was this 2 million/month number derived from, just the most recent month or is that what you're seeing over the last several months? I ask because we recently (past week) ran some experiments in our hosting environment and one unanticipated side effect was a mass invalidation of our caching which in turn triggered an unusual spike in the traffic we were throwing at upstream providers

That was for the most recent month. We haven't drilled down into the specifics at this point.

Does your documentation include any info on throttling/rate limits? I tried to search but was unable to find anything

We currently don't. We used to have an issue forcing us to be extremely permissive with rate limits of GET requests and as a result, clients are unlikely to be rate limited right now unless they're actively spamming. However that is going to have to change, and we'll add documentation around it.

What would you consider to be a reasonable/nonexcessive load?

We are able to handle the load, but from what we can see Shields.io is by far the biggest single consumer of our API outside of Mozilla, refreshing data that changes very little over and over quite often.

Rather than giving you a target number of requests, I can answer the caching duration question from the other reply (assuming you fix the trailing slash issue):

Downloads, Users: 900 seconds

This only changes once per day on our end, so I recommend following that and increasing to 86400 seconds.

Rating, Stars: 120 seconds

This is updated asynchronously after each rating, but it's unlikely the averages would move that much over such a short period. I recommend increasing to 7200 seconds at least - possibly even 86400 to match the number of downloads/users.

Version: 300 seconds

This seems ok for now.
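To get a feel for what these recommendations could mean, here is a small sketch comparing the current and suggested durations. It assumes a single cached copy per badge that is re-fetched as soon as it expires, which gives an upper bound only; the real reduction depends on traffic patterns and on how many cache layers sit in front of the API. The object and function names are hypothetical.

```javascript
const SECONDS_PER_DAY = 86400

// Cache durations in seconds, taken from the thread.
const currentTtl = { downloads: 900, rating: 120, version: 300 }
const suggestedTtl = { downloads: 86400, rating: 7200, version: 300 }

// Upper bound on upstream refreshes per day for one continuously requested badge.
function maxRefreshesPerDay(ttlSeconds) {
  return Math.ceil(SECONDS_PER_DAY / ttlSeconds)
}

for (const category of Object.keys(currentTtl)) {
  const before = maxRefreshesPerDay(currentTtl[category])
  const after = maxRefreshesPerDay(suggestedTtl[category])
  console.log(`${category}: ${before} -> ${after} refreshes/day at most`)
}
// downloads: 96 -> 1 refreshes/day at most
// rating: 720 -> 12 refreshes/day at most
// version: 288 -> 288 refreshes/day at most
```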

FWIW, I've looked briefly at ublock origin's badge on their GitHub page and the HTTP request for the image of its ratings has a caching duration of only 120 seconds - but I don't know if you're also caching things on the backend.
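For anyone wanting to repeat that check, the caching duration of a badge response is normally carried in the `max-age` directive of its `Cache-Control` header. The sketch below parses that directive from a header string; `parseMaxAge` is a hypothetical helper and the header values shown are illustrative, not captured from a real Shields response.

```javascript
// Extract the max-age directive (in seconds) from a Cache-Control header value.
// Returns null when the directive is absent.
function parseMaxAge(cacheControl) {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl)
  return match ? Number(match[1]) : null
}

console.log(parseMaxAge('max-age=120, s-maxage=120')) // 120
console.log(parseMaxAge('no-cache')) // null
```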

@calebcartwright
Member

Yes there will have been a couple of spikes over the last week where every request was a cache miss for a bit but given the longest we cache any Mozilla Addons badge for is 15 mins that probably didn't make a huge difference

Gotcha @chris48s. I thought it was happening intermittently throughout, but looking back at the update you shared, I see I either remembered incorrectly or misunderstood in the first place.

@calebcartwright
Member

That's excellent info @diox, thank you.

We still have to consider the balance Chris mentioned between staleness/freshness and load, so I'm not sure if we'll necessarily max out all of these durations, but absolutely a lot of room to increase these and reduce the amount of traffic we send your way.

FWIW, I've looked briefly at ublock origin's badge on their GitHub page and the HTTP request for the image of its ratings has a caching duration of only 120 seconds

Correct, ratings/stars are cached for 120 seconds, so I'd fully expect a ratings badge instance to have 120 seconds.

but from what we can see Shields.io is by far the biggest single consumer of our API outside of Mozilla, refreshing data that changes very little over and over quite often.

One thing that's not clear to me so I'll ask explicitly: are you suggesting Shields being your second biggest consumer is a problem in-and-of-itself, or is it simply a matter of volume of requests given the opportunities to reduce that volume?

@diox
Author

diox commented Feb 10, 2022

One thing that's not clear to me so I'll ask explicitly: are you suggesting Shields being your second biggest consumer is a problem in-and-of-itself, or is it simply a matter of volume of requests given the opportunities to reduce that volume?

I'm hoping that reducing the volume of requests enough will make Shields less of an outlier so that we don't have to find out how much of a problem it actually is.

Edit:

Correct, ratings/stars are cached for 120 seconds, so I'd fully expect a ratings badge instance to have 120 seconds.

Given the popularity of ublock origin, I think bumping that much higher as I suggested above would greatly help with volume of requests. Like I said, it's unlikely for a rating to change dramatically in less than a few hours or even a day.


3 participants