mozdownload sometimes fails on Taskcluster and Travis #13274
Comments
The reason I noticed is that in https://wpt.fyi/test-runs?label=taskcluster, the row for 2df7f9f has only stable results, but I expected stable + experimental. @Hexcles FYI.
@Hexcles could you please add a priority label to this? I think it's your call because you are managing priority/urgency of shipping TaskCluster runs on wpt.fyi.
300f8e2 failed with this on https://tools.taskcluster.net/groups/TKPIsqzARXyy88H0_n8bLw
@jgraham, can you take a look? Does this need more retries?
I fairly strongly suspect that this is happening when a new nightly is being released (maybe some platforms are available and some are not?). But we are already handling that badly; it's possible to end up with some tests run in the previous nightly and some in the new one. Really we need a single decision task that picks a binary URL and makes it available to the subsequent tasks to ensure that they all run against the exact same version. Note that Chrome could have the same issue, but it's less likely since its releases are less frequent. But it's harder to solve in that case; we probably actually need to download the .deb and make it available as an artifact since there isn't a long-lived URL AFAIK.
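To make the "single decision task" idea concrete, here is a minimal sketch of pinning one binary URL and handing it to later tasks through a small JSON file. The function names, the file name, and the JSON keys are all hypothetical; this is not how the wpt Taskcluster configuration does it, only an illustration of resolving the URL once and reading it everywhere else.

```python
import json
from pathlib import Path
from typing import Dict


def write_pinned_build(path: Path, browser: str, channel: str, url: str) -> None:
    """Decision step: record the one binary URL every later task must use."""
    pinned = {"browser": browser, "channel": channel, "url": url}
    path.write_text(json.dumps(pinned, indent=2, sort_keys=True))


def read_pinned_build(path: Path) -> Dict[str, str]:
    """Test-run step: load the pinned URL instead of resolving 'latest' again."""
    pinned = json.loads(path.read_text())
    missing = {"browser", "channel", "url"} - set(pinned)
    if missing:
        raise ValueError(f"pinned build file is missing keys: {sorted(missing)}")
    return pinned
```

Because every test task reads the same file, none of them can pick up a newer nightly that appeared after the decision step ran.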
Enter, stage left, @jugglinmike to say something about how this is done in https://github.com/web-platform-tests/results-collection
I happened to look at recent commits, and https://github.com/web-platform-tests/wpt/commits/75b92bf3d1791dc0e47cd8a716a135e98d2d2937 has a similar failure (https://tools.taskcluster.net/groups/ZklAzb_fTueVrBAWR0kGBA): "requests.exceptions.ConnectionError: HTTPSConnectionPool(host='hg.mozilla.org', port=443): Max retries exceeded with url: /mozilla-central/archive/tip.zip/testing/profiles/ (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc6f85775d0>: Failed to establish a new connection: [Errno 110] Connection timed out',))" @jgraham is there anything we can do to mitigate this?
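For transient connection timeouts like the one quoted above, one generic mitigation is to retry with exponential backoff. The sketch below is an assumption-laden illustration, not the retry logic that wpt or mozdownload actually uses: `retry_with_backoff` and its parameters are hypothetical names, and the caller is expected to pass in the exception types raised by whatever HTTP client is in use.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on the given exceptions with doubling delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Retrying only helps when the failure is transient; it does nothing for a download that fails because the requested build does not exist yet.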
#14450 (comment) shows this happening when running unit tests in Travis too. @jgraham, is there nothing we can do to make this more reliable?
This happened in #15012 now: The failure was in test_install_firefox because of a network error in mozdownload: @jgraham should we mark the test as xfail?
@foolip that test shouldn't be any more flaky than any of the test run jobs where we start off by downloading the nightly browser
Perhaps it is one of few pytest tests that do this, because I've seen it fail a few times and I don't think I've seen it for other tests. But it happens for @jgraham is there an infra bug filed in some Mozilla repo for improving the reliability of this setup? Or could it be a mozdownload bug?
It's a mozdownload bug. The way it works is that it looks for a directory containing builds, stores that, and then later uses the stored directory to look for the actual build. But the operation of creating a directory full of builds isn't (even nearly) atomic; the directory is created when the first artifacts are available, not when the last build is complete. So the "solution" here is either a) stop using mozdownload and roll our own thing, b) rearchitect mozdownload to do the build and directory lookup in a single operation, or c) catch the failure and try again with the previous build. c) is probably the most practical option, but the tool really doesn't seem to be designed in a way that makes a fix here easy.
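Option c) can be sketched generically as "try the newest build, fall back to the previous one if it is not there yet". Everything in this sketch is hypothetical: `BuildNotFoundError` is a local stand-in rather than mozdownload's own exception, and `fetch` is whatever callable the caller uses to download a build by identifier. It does not show how mozdownload could be patched, only the shape of the fallback.

```python
from typing import Callable, Iterable, Optional, Tuple


class BuildNotFoundError(Exception):
    """Hypothetical stand-in for a 'folder for builds has not been found' failure."""


def fetch_first_available(
    candidates: Iterable[str],
    fetch: Callable[[str], bytes],
) -> Tuple[str, bytes]:
    """Try build identifiers newest-first and return the first that downloads."""
    last_error: Optional[BuildNotFoundError] = None
    for build_id in candidates:
        try:
            return build_id, fetch(build_id)
        except BuildNotFoundError as exc:
            # This build is not (fully) published yet; fall back to an older one.
            last_error = exc
    raise BuildNotFoundError("no candidate build could be fetched") from last_error
```

Returning the identifier alongside the data lets the caller record which build was actually used, which matters if the fallback silently picked the previous nightly.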
Is there a stable URL that can be used to download the latest build for a given platform, or does mozdownload exist precisely because downloading Firefox isn't that easy? If it were easy then just skipping mozdownload would be a decent option. @jgraham do you know if there's a bug filed for mozdownload about this?
mozilla/mozdownload#524 is the mozdownload bug, as appears above.
There is a stable URL for "latest build" but that's what we're using and what's causing the problem. There isn't a stable URL per build type/platform.
This happened on #15280. @jgraham have you seen many Gecko exports blocked because of this? It seems from the stack that there's already retry involved, so I guess a fix for mozilla/mozdownload#524 is the only hope?
The latest stable nightly build could be found like this for linux64: Which other kind of builds are required?
Those are the ones that have been failing, but we also download stable and beta for other runs. Not sure if those use mozdownload, @jgraham would know though.
We can definitely experiment with not using mozdownload.
Fixed via #15329
Thanks James!
At least twice some of the Firefox tasks for pushes to master have failed in mozdownload:
https://tools.taskcluster.net/groups/AGupAeh6TrSdyNlDxcqZXw (for commit 2df7f9f, now only discoverable via API)
https://tools.taskcluster.net/groups/eRtqwYxjTfeob0g4hHcOlw (for commit 91491de)
The most recent failure was:
The previous was very similar except for the date: "mozdownload.errors.NotFoundError: Folder for builds on 2018-09-24-10-03-54 has not been found: https://archive.mozilla.org/pub/firefox/nightly/2018/09/"
That it has happened twice, 4 days apart, suggests that it wasn't just a transient problem with archive.mozilla.org.
Here's where the error is thrown:
https://github.com/mozilla/mozdownload/blob/866cfebe9b8137bfe7ba8411efbe9d0e9d24093a/mozdownload/scraper.py
@jgraham, can you take a look?