GitHub Actions - sometimes... our 'host is unreachable' #2890
Comments
Hi @trel,
I have reproduced the failure with fewer moving parts - just a curl call. https://github.com/trel/irods/pull/9/checks?check_run_id=2079836230 I'll look into trying a similar curl in vanilla Azure.
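For reference, the stripped-down reproduction described above is roughly of this shape (a sketch; the exact flags and URL path are assumptions, the real step is in the linked check run):

```bash
# A minimal sketch of the "just a curl call" reproduction: a single request from the
# runner to the repository host, which intermittently cannot connect.
curl --verbose --fail --connect-timeout 10 https://unstable.irods.org/ --output /dev/null
echo "curl exit code: $?"
```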
After some further testing: could this be an "apt and https" issue? As in, is the proxy not configured correctly for apt over https within some of the Azure containers?
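If it were an apt-over-https proxy problem inside the container, it could be inspected with something along these lines (a sketch; these are standard apt and environment locations, not commands taken from the workflow):

```bash
# Sketch: show how apt and the environment are configured to reach https origins.
apt-config dump | grep -i 'acquire::http\|proxy'       # proxy settings apt will actually use
env | grep -i 'http_proxy\|https_proxy\|no_proxy'      # proxy settings inherited from the environment
grep -ri 'proxy' /etc/apt/apt.conf.d/ 2>/dev/null      # any drop-in proxy configuration
```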
#2919 possibly related?
Hello @trel,
All the information I was able to find suggests a temporary error with the external apt mirror. So it appears to be an error with the https://unstable.irods.org/apt/ repository, and I believe it makes sense to check this problem from their side as well.
That side is my side :) We've checked and have no connectivity issues from anywhere else in the world that we've checked. It's also happening on https://packages.irods.org, a different VM with a similar setup. We're still investigating, but at this time it does feel like a firewall/proxy issue in the container itself (most of the time it works cleanly, and it worked for some time via Travis with the same commands before moving to GitHub Actions).
Hello @trel,
Just saw the same issue via
In addition... I am noticing more failures in the middle of my day (UTC-0500).
Hello @trel,
There are no changes on my side; I just followed the instructions for Ubuntu from the official site: https://unstable.irods.org. Could you please check them? I need to reproduce the issue one more time to check for possible network problems on our side.
Hi, we don't have an Ubuntu20 release out yet - please try a bionic (ubuntu:18.04) VM/container.
@trel Thank you, I assumed as much, but Ubuntu20 was selected in the initial request, which is why I asked.
I will uncheck that box - I had added that check mark when it was 'just a curl call', and then forgot to uncheck it when I learned it was the apt/yum calls instead. Thanks.
@trel I've reproduced the issue several times, but could not find any correlation with the region or a specific machine configuration. We've opened an internal issue with the Azure network engineering team for further investigation.
Excellent - thank you.
I've seen something similar, i.e. connection problems when connecting outbound to the public internet from within GitHub Actions. The problem seems to have escalated over the past month(s). Like @trel, my suspicion is the same: there seems to be a "first touch penalty" for creating outbound connections (perhaps the penalty is paid on a per-destination basis, I don't know). Therefore, one piece of advice is to check your connect timeout. Our case was simple HTTP downloads. We were using a 5-second connect timeout; after increasing it to 30 seconds the problem went away, or at least could no longer be reproduced. However, with yum, the default is already 30 seconds as far as I can tell. I thought I would share these findings anyway, namely that it feels as if the runner needs some kind of warmup, network-wise, before outbound connections are stable.
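For reference, the timeout adjustments mentioned above look roughly like this (a sketch; the option names are standard curl and yum settings, and the 30-second values are the ones from the comment):

```bash
# curl: raise the connection-establishment timeout (separate from the overall transfer timeout).
curl --connect-timeout 30 --max-time 300 -fSL https://unstable.irods.org/ --output /dev/null

# yum: the equivalent knobs live in /etc/yum.conf (timeout already defaults to 30 seconds).
#   timeout=30
#   retries=10
```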
Not sure if this is related, but just today we started getting "This check failed" errors on our two Linux checks without any guidance as to what the issue might be. There are no logs and nothing is running.
Same here, this started around 12 hours ago. I'm now randomly getting the following, and our environment tag is gone in the repo settings.
@lukepighetti @OmgImAlexis, could you please log a separate issue for this problem, since it is not related to the initial issue? For investigation, we need links to the pipelines (links will be useful even if the repo is private).
Ours was a billing issue, but the big red X didn't inform us of this. There are logs available if you click on the Actions tab, which are not available if you view the action status from the PR. I'm considering my particular issue resolved, but I do think my feedback should be considered. Apologies for the noise in this PR.
We've seen increased success lately. Not sure that's actionable here, but we are seeing fewer timeout failures. Not yet zero, though.
Hello @trel,
Okay. Thanks for the update - we still see this timeout more than once per week.
Follow-up: It's been more than a month since we upgraded the host itself that was sometimes unreachable. It had been an Ubuntu14 VM and is now CentOS7. We have seen no errors since this upgrade. Current speculation is that the aging/EOL SSL libraries on Ubuntu14 could have been related to the intermittent errors.
Description
We are using GitHub Actions to install packages from our own apt/yum repository, hosted on a public VM as raw directories served by Apache. We are not seeing any problems from anywhere else... however...
Sometimes - roughly 25% of the time? - the GitHub Action cannot reach https://unstable.irods.org.
It resolves correctly; the IP address is right.
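For example, resolution can be confirmed from inside the runner with something like this (a sketch; these exact commands are not from the workflow):

```bash
# Sketch: confirm the hostname resolves to the expected address from inside the runner.
getent hosts unstable.irods.org    # what the libc resolver returns
dig +short unstable.irods.org      # what DNS returns directly (dig is in dnsutils/bind-utils)
```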
Area for Triage:
Containers
Question, Bug, or Feature?:
Bug
Virtual environments affected
Image version
Version: 20210302.0
Expected behavior
Expected to be able to see/use our server.
Actual behavior
From:
https://github.com/irods/irods/blob/master/.github/workflows/build-irods.yml#L31
We see this in the Action logs:
Repro steps
Commits to https://github.com/irods/irods trigger builds - sometimes they fail. Manual retries eventually connect and complete their work.
This has the feel of a firewall somewhere between GitHub and our VM that is throttling connections, perhaps per IP address. Is there any way to detect or determine this?
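One way to probe for this from a runner is to hit the host in a loop and log per-attempt timings, to see whether failures cluster (a sketch, not a command from the thread; it assumes the failure happens at connection establishment rather than DNS):

```bash
# Sketch: probe the repository host repeatedly and record connect/total times per attempt.
for i in $(seq 1 50); do
  curl --silent --output /dev/null --connect-timeout 10 \
       --write-out "attempt=$i http_code=%{http_code} connect=%{time_connect}s total=%{time_total}s\n" \
       https://unstable.irods.org/ \
    || echo "attempt=$i failed (curl exit $?)"
  sleep 2
done
```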