Taskcluster master runs fail very often #10653
@jgraham, in addition to fixing this, what monitoring do you think we should put in place? If this is on track to become critical infrastructure, then I'd like something in https://foolip.github.io/ecosystem-infra-rotation/ to go red when Taskcluster is consistently failing. What API should I look at?
Actually, failure rate is not 100%, yay! 5 have succeeded since runs began:
@Hexcles @lukebjerring, even for the failing ones there will be some results. I guess we will have to decide whether we require all tasks to have succeeded, or whether we collect and submit partial results to wpt.fyi and treat which runs to show as a processing or frontend problem. WDYT?
Re: the API, you want the GitHub (combined) status API.
@jgraham you mean just to know that Taskcluster has finished?
Yes. If you want to know specifics about the task statuses, you use e.g. https://queue.taskcluster.net/v1/task-group/Jvlwi0jnR-68F5eUlfcfgg/list where the random string is the task group ID that you can get from the URL in the status message.
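To make the two API calls above concrete, here is a rough Python sketch of how a monitor might combine them. It is not what any existing dashboard does. Several details are assumptions: that the Taskcluster entry in the combined status has a `context` containing "taskcluster", that its `target_url` contains a `/groups/<id>` segment (as in the tools.taskcluster.net URL quoted elsewhere in this thread), and that the task-group listing has a `tasks` array whose entries carry `status.state`. The helper names are hypothetical, and pagination of the listing is ignored.

```python
import json
import re
from collections import Counter
from urllib.request import urlopen

# GitHub combined status endpoint for a commit or branch.
GITHUB_STATUS_URL = "https://api.github.com/repos/{repo}/commits/{ref}/status"
# Taskcluster queue endpoint mentioned in the comment above.
QUEUE_LIST_URL = "https://queue.taskcluster.net/v1/task-group/{group_id}/list"


def fetch_json(url):
    """Fetch and decode a JSON document (performs a network request)."""
    with urlopen(url) as response:
        return json.load(response)


def taskcluster_status(combined_status):
    """Return the first status whose context mentions Taskcluster, or None.

    Assumption: the context string contains "taskcluster".
    """
    for status in combined_status.get("statuses", []):
        if "taskcluster" in status.get("context", "").lower():
            return status
    return None


def task_group_id(target_url):
    """Extract a 22-character task group ID from a '/groups/<id>' URL.

    Assumption: the status URL uses this path layout.
    """
    match = re.search(r"/groups/([A-Za-z0-9_-]{22})", target_url or "")
    return match.group(1) if match else None


def count_task_states(task_group_listing):
    """Tally the state of each task in a task-group listing."""
    tasks = task_group_listing.get("tasks", [])
    return Counter(task["status"]["state"] for task in tasks)


def summarize(repo, ref):
    """Return (overall GitHub state, per-state task counts or None)."""
    combined = fetch_json(GITHUB_STATUS_URL.format(repo=repo, ref=ref))
    status = taskcluster_status(combined)
    group_id = task_group_id(status["target_url"]) if status else None
    if group_id is None:
        return combined.get("state"), None
    listing = fetch_json(QUEUE_LIST_URL.format(group_id=group_id))
    return combined.get("state"), count_task_states(listing)
```

Calling `summarize("w3c/web-platform-tests", "master")` would then give the overall state plus a per-state tally of tasks, which is enough to tell "everything failed" apart from "one task timed out". Whether a partially failed group should count as red is the policy question raised earlier in the thread, so the sketch leaves that decision to the caller.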
I've intermittently come across this issue when messing around with
I think the biggest issues here are now fixed. I still see occasional timeouts, which warrant investigation because I'm not sure it should take so long to run tests, and there seems to be a race condition when merging PRs, where we sometimes get "Reference is not a tree".
@jgraham it looks like all recent runs are still failing?
After #10762, based on visual inspection of https://github.com/w3c/web-platform-tests/commits/master, Taskcluster has succeeded more often, maybe 80% of the time, but it's not rock solid. @jgraham, do you think this tracking issue is still useful, or does #10842 account for all of the remaining issues?
#10842 accounts for everything that I have specifically noticed. I'll file more issues as I figure out more things.
#9226 landed yesterday so we have some finished runs now. The most recent from https://github.com/w3c/web-platform-tests/commits/master:
In the last, https://tools.taskcluster.net/groups/E6sRtJcWRvSeXqYxh3vnAA/tasks/Dbj4toXRQyq2Yfr2TRk1lA/runs/0/logs/public%2Flogs%2Flive.log has this log:
Looks like a job for @jgraham :)