Commits touching many files fail to run in Taskcluster #18608
@Hexcles since there are statuses here, and in the future checks I assume, would these failed runs show up in the life cycle UI? |

The same failure affected https://tools.taskcluster.net/groups/Pwp5UGFuSWG_qJD4q1iR0A now, which was the build for 0ccc44b, which Edge and Safari ran. That failure means no aligned run, delaying results. The payload seems unusually big for this commit, as many files were changed and the Taskcluster payload includes all added/modified files. If correct, this means that large changes won't get tested properly. That's a real shame, because those will keep happening, and those changes are disproportionately important to test. @jgraham this seems like a Taskcluster issue, and I think aws/aws-sam-cli#188 might be the underlying issue. There are some suggested workarounds there, could you file a Taskcluster issue about this? |
git occasionally prints warning messages to stderr (e.g. when too many files are modified and rename detection is disabled), which would mess with the output parsing if they are redirected to stdout. Fixes #18608.
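For context, here is a minimal sketch of the pattern that fix describes, assuming the tooling shells out to git via Python's subprocess; the function name and exact git invocation are illustrative, not the repository's actual code:

```python
import subprocess

def changed_files(base: str, head: str) -> list[str]:
    # Capture stdout and stderr separately so that git warnings (e.g.
    # "inexact rename detection was skipped due to too many files")
    # can never end up in the stream we parse.
    result = subprocess.run(
        ["git", "diff", "--name-only", "-z", f"{base}..{head}"],
        capture_output=True,
        check=True,
    )
    if result.stderr:
        # Surface warnings for debugging without polluting the parsed output.
        print(result.stderr.decode(errors="replace"), end="")
    return [path for path in result.stdout.decode().split("\0") if path]
```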
Reopening because I expect the original problem remains. |
I don't think there's an AWS problem here as such. I'm pretty sure the issue is that the |
Alright, but I guess this is still something we can't fix in this repo but would need to fix in Taskcluster? |
The master run did break after I merged #18856: 196a44e#commitcomment-34982419 |
That's yet another failure mode, haven't seen that before. Is it persistent, is a revert needed? |
It's not persistent. The next commit is running just fine at the moment. |
So, having one bug cover three separate issues is confusing :) Can we use this one to track the case where we get an exception in the task setup? In that case the failing code is in [1]. TaskCluster is calling docker via the API, so if we pass in environment variables they always go via an HTTP message; there's no way to do anything like pass a file in (and, given the above, I think that'd still fail). So the fundamental problem here is at the linux/libc layer. There are a couple of options we could try to solve this:
[1] https://searchfox.org/mozilla-central/source/testing/web-platform/tests/tools/ci/run_tc.py#173 |
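To make the linux/libc limit concrete: on Linux a single argv or environ string is capped at MAX_ARG_STRLEN (32 pages, i.e. 128 KiB with 4 KiB pages), so a sufficiently large event payload delivered as one environment variable will make exec fail regardless of how it reached the container. A small illustrative sketch, where the variable name EVENT_PAYLOAD is hypothetical and not what Taskcluster actually uses:

```python
import os
import subprocess

# A single env string larger than Linux's MAX_ARG_STRLEN
# (32 * page size, typically 131072 bytes) makes execve fail with E2BIG.
huge_value = "x" * (200 * 1024)  # ~200 KiB, well past the per-string limit

try:
    subprocess.run(
        ["/bin/true"],
        env={**os.environ, "EVENT_PAYLOAD": huge_value},  # hypothetical name
        check=True,
    )
except OSError as exc:
    # Expected: [Errno 7] Argument list too long
    print(f"exec failed: {exc}")
```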
FWIW I lean toward option 2. |
@jgraham I think it's a safe bet that the rest of us don't have a strong opinion, so if option 2 is something you can do, that'd be great! |
I went to check if this week's beta runs were also affected. Indeed they're missing in https://wpt.fyi/runs but they also don't show up as triggered at all in https://api.github.com/repos/web-platform-tests/wpt/statuses/820f0f86047e6e26401e028fb6d0da43c84e6aab. @jgraham any idea what might cause Taskcluster to not even start on a push to epochs/weekly? To get the results and see if they'd work if triggered, I'll push to the triggers/* branches now. |
Nope, still not triggering... There aren't any webhooks registered for Taskcluster and I presume that's because it uses the GitHub Apps integration. But that also means I don't know what logs to check to see if GitHub even notified Taskcluster. @jgraham can you help take a look? |
https://tools.taskcluster.net/groups/ckWIiah0SiqFdoiqisNIWg looks like a taskgroup that ran against the head of |
It was the same commit, but it looks like the triggering branch was epochs/daily, as I'd expect for stable runs. But those comments at 820f0f8#commitcomment-35000972 definitely look like they'd be triggered by my attempt with the triggers/* branches. Can you report those issues to the Taskcluster team? |
This isn't urgent, so I'm decreasing the priority to make that clear, but I'm waiting for @Hexcles to confirm that this is totally fixed. I suspect it is not, because I didn't see beta runs last week. |
@foolip Sounds like this may be fixed - can you confirm? |
I don't know if it's been fixed, but I haven't seen it in a while. Closing; if it happens again, I suspect it will be a new problem, since the infra is constantly evolving.
https://wpt.fyi/runs?max-count=100&label=beta shows these runs so far this year:
Beta runs are triggered weekly, so many dates are missing here, like all of July, Aug 5 and Aug 19.
To figure out what went wrong, one has to know what the weekly SHA was, and https://wpt.fyi/api/revisions/list?epochs=weekly&num_revisions=100 lists them going back in time.
Then a URL like https://api.github.com/repos/web-platform-tests/wpt/statuses/8561d630fb3c4ede85b33df61f91847a21c1989e will lead to the task group:
https://tools.taskcluster.net/groups/WfP5NQ-ISzahOnNbiJugEQ
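A sketch of that lookup, assuming only GitHub's documented commit-status API (its `context` and `target_url` fields); the helper name is made up, and the SHA is just the weekly one mentioned above:

```python
import json
from urllib.request import urlopen

def taskcluster_groups_for(sha: str) -> list[str]:
    """Return the Taskcluster target URLs attached to a commit's statuses."""
    url = f"https://api.github.com/repos/web-platform-tests/wpt/statuses/{sha}"
    with urlopen(url) as resp:
        statuses = json.load(resp)
    return [
        status["target_url"]
        for status in statuses
        if "Taskcluster" in status.get("context", "") and status.get("target_url")
    ]

print(taskcluster_groups_for("8561d630fb3c4ede85b33df61f91847a21c1989e"))
```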
This last time, it looks like all tasks failed like this:
For Aug 5 it was the same.
@jgraham any idea why this happens, and if it's disproportionately affecting Beta runs?
Related: #14210