
Action timing out when new node versions are released #132

Closed
ebwinters opened this issue Apr 6, 2020 · 41 comments
Labels
enhancement New feature or request

Comments

@ebwinters

I am using this step in a job:

```yaml
- name: Use Node.js ${{ matrix.node-version }}
  uses: actions/setup-node@v1
  with:
    node-version: ${{ matrix.node-version }}
```

with node-version 13.x, but my workflow takes 10+ minutes just to complete the setup-node step. Any idea why, or is anyone else having this problem?

@Xetera

Xetera commented Apr 6, 2020

Same here. It looks like I'm sometimes timing out and getting

```
Premature close
Waiting 16 seconds before trying again
Premature close
Waiting 18 seconds before trying again
##[error]Premature close
```

and sometimes it works fine but still takes 7-11 minutes per install for some reason (12.x).

@ebwinters
Author

Good to know other people are having issues - this is seriously stalling my pipeline 😞

@camelmasa

Same here. We are using Node 12.x.

@fishcharlie
Contributor

Same. You can see the workflow of an open source project experiencing this here. From my experience, this just started happening recently.

As you can see, the setup-node step took 7 minutes and 34 seconds 😱


@fishcharlie
Contributor

One of the major issues with this action is that next to no logs get printed. I brought this up in #90 regarding the version it's using, but this is another situation where we need logs!

@jiyuhan

jiyuhan commented Apr 6, 2020

This is a serious issue if a pipeline doesn't have a default timeout set up: it will cause a lot of people's CI to hang for up to 6 hours. That isn't just wasting MS' resources, it's wasting resources for other paying customers. Please fix this!

No code has changed to match the timeline, so it seems like some critical dependency is having trouble serving resources properly? But the issue can be fixed by re-running the build, which suggests potentially some bad hosts in the fleet, or throttling?

@boredland

This has already cost us a lot of money. Please have a look!

@matiaschileno

We have the same problem. Today every pipeline is failing because of this. We are using Node.js 12.x.

@ssannidhi-palo

Experiencing similar issues with Node 12.x. The step is either timing out or taking far too long (~25 minutes).


@robbruce

robbruce commented Apr 6, 2020

Cloned the repo and ran the unit tests; it would appear that the issue is with nodejs.org:

```
::debug::Failed to download from "https://nodejs.org/dist/v10.16.0/node-v10.16.0-darwin-x64.tar.gz". Code(500) Message(Internal Server Error)
```

It's not consistent though; a re-run of the tests can get past this issue.

@emalihin

emalihin commented Apr 6, 2020

Tried this action for the first time last night and thought a 5-10+ minute setup was normal until I saw this issue.
As @robbruce points out, it seems like an intermittent issue with nodejs.org:

```
curl -O ./dmp https://nodejs.org/dist/v10.16.0/node-v10.16.0-darwin-x64.tar.gz

<html>
<head><title>500 Internal Server Error</title></head>
<body bgcolor="white">
<center><h1>500 Internal Server Error</h1></center>
<hr><center>nginx</center>
</body>
</html>
```

@WesleyKlop

Just browsing random pages on https://nodejs.org sometimes gives a 500 error as well, so it looks like this issue is indeed on their end

@juliedavila

It's being tracked in the node upstream repo nodejs/node#32683

@iliyanyotov

@defionscode Thanks for linking to the node repo issue.

😢

@soanaphikeste

Depending on what you need, you might skip this action completely:

All runners have Node 12.16.1 preinstalled (e.g. https://github.com/actions/virtual-environments/blob/master/images/linux/Ubuntu1804-README.md; search for "Node.js").
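
For example, a minimal sketch of a job that skips setup-node and relies on the preinstalled Node (the npm commands are placeholders for whatever your project runs):

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # No setup-node step: use the Node.js already on the runner image
      - run: node --version
      - run: npm ci
      - run: npm test
```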

@Raul6469

Raul6469 commented Apr 6, 2020

Another alternative is to run your job in a container:

```yaml
your_job:
  runs-on: ubuntu-latest
  container: node:10.9
```
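
A slightly fuller sketch of that approach, with placeholder checkout/npm steps, to show that the job's steps then run inside that image (the node:10.9 tag is just the one from the snippet above):

```yaml
jobs:
  your_job:
    runs-on: ubuntu-latest
    container: node:10.9   # steps run inside this image, so Node is already present
    steps:
      - uses: actions/checkout@v2
      - run: npm ci
      - run: npm test
```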

zkanda added a commit to zkanda/zkanda.github.io that referenced this issue Apr 6, 2020
and setup/actions are having issues because we can't reach nodejs servers: actions/setup-node#132

And it looks like the default Ubuntu image already has Node.js:
https://github.com/actions/virtual-environments/blob/efe1d6eb1a6e5820fbd3d1036f9b0a56beb53a09/images/linux/Ubuntu1804-README.md
@Benwick91

I have to set the registry-url and the scope in my workflow, and I have had the same issues since today. Last week it ran in 1-2 minutes, but today the action either fails or takes up to 20 minutes. :(

@19h

19h commented Apr 6, 2020

@Raul6469 That won't work if you depend on the tools preinstalled on the GitHub runners, like the AWS CLI. Although I think that's something one should plan half a man-day for...

@culshaw

culshaw commented Apr 6, 2020

This definitely warrants a feature request to support mirrors

BradenM added a commit to CrisisCleanup/crisiscleanup-3-web that referenced this issue Apr 6, 2020
@konradpabjan
Contributor

cc @damccorm @ericsciple @bryanmacfarlane

I'm seeing this issue as well (on Mac and Linux it's taking me 5+ minutes for this action to run, Windows for some reason seems fine)

@damccorm
Contributor

damccorm commented Apr 6, 2020

@konradpabjan Node itself is having an outage right now that is causing this - nodejs/node#32683

@swanson

swanson commented Apr 6, 2020

If you're finding this thread and blocked by timeout issues, two things you may want to try until this is resolved:

ThibaultVlacich added a commit to ThibaultVlacich/csv-to-strings that referenced this issue Apr 6, 2020
This avoids it hanging indefinitely when nodejs.org is unresponsive.
Cf. actions/setup-node#132
@mieszko4

mieszko4 commented Apr 7, 2020

For reference, the software installed by default on the runs-on images: https://help.github.com/en/actions/reference/software-installed-on-github-hosted-runners

@bryanmacfarlane
Member

bryanmacfarlane commented Apr 7, 2020

@mieszko4 - yes, if you don't need or want to validate on a specific version or version spec (range) of node, then you can rely on the one that's globally installed on the machine image.

Note that the global version on the image will slide (13, then 14, etc.).

The core value of setup-node is using a specific version: I want to test my code against LTS 8, 10, 12, and latest, in a matrix, in parallel.

That said, we are fixing this issue now so that you can have the best of both worlds: you can lock to an LTS line (12.x), do matrix testing, and never do a download, which even cuts out the typical tens of seconds.

If the spec doesn't match a major LTS, then it will download.

Part 2 of the work (also starting now) is to build a cache (backed by a CDN) of versions, so when we do read through, it hits our cache.

I might add a flag in the action to force reading through and getting the absolute latest if you need that, since there will be some latency on both the images and the cache.

@swanson Thanks for the note about setting a timeout; 1.5x your typical time is a good suggestion. Very early on we had a much lower default timeout but had the opposite problem.
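
To make the matrix scenario and the timeout suggestion concrete, a workflow sketch might look like the following (the version list and npm commands are placeholders; the 15-minute value stands in for roughly 1.5x a typical run, per the suggestion above):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    # Guard against hung downloads: roughly 1.5x the typical job duration
    timeout-minutes: 15
    strategy:
      matrix:
        node-version: [8.x, 10.x, 12.x]
    steps:
      - uses: actions/checkout@v2
      - name: Use Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v1
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - run: npm test
```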

@dentuzhik

> Note that the global version on the image will slide (13, then 14, etc.).

A bit off-topic, but I'm wondering what the motivation is for sliding Node in non-LTS increments in the global images. Since Node 8, I have had very few cases in production where the latest-greatest Node was desirable (especially without a significant amount of testing first).

> The core value of setup-node is using a specific version: I want to test my code against LTS 8, 10, 12, and latest, in a matrix, in parallel.

Very nice clarification, thank you. I believe it could get some better emphasis in the readme (there's a good usage example, but not the why/motivation).

> Thanks for the note about setting a timeout; 1.5x your typical time is a good suggestion.

We have also been hit quite hard by this issue, and rigorous timeouts on every job indeed saved our a$$es in this case; otherwise we would certainly have hit our minutes limit quite fast.

/cc @Kerizer

@bryanmacfarlane
Member

bryanmacfarlane commented Apr 7, 2020

@dentuzhik - I should have clarified that hosted images should be installing LTS globally, so the real scenario would be sliding from 10 to 12 and then 12 to 14.

Also, one more note: this setup task sets up problem matchers and auth, and that's easy to get even if you use setup-node without a node-version property. That combo will set up problem matchers and proxies (if that's relevant for the toolset) but won't download Node; it will use the one globally on the PATH.

I'll update the readme with all of this, but note that one more variable is self-hosted runners.
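
A minimal sketch of that combination, assuming the globally installed Node on the image is acceptable (the npm commands are placeholders):

```yaml
steps:
  - uses: actions/checkout@v2
  # No node-version: sets up problem matchers/auth and uses the Node already on the PATH
  - uses: actions/setup-node@v1
  - run: npm ci
  - run: npm test
```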

@dentuzhik

If there's a PR, I would happily have a look at it 🙇

@amio

amio commented Apr 8, 2020

Seems we're already using @actions/tool-cache in the master branch 🤔

downloadPath = await tc.downloadTool(downloadUrl);

@bryanmacfarlane
Member

@amio - yes, this action downloads from the Node distribution to the tool-cache folder. It only does so if it can't match the semver version spec to what's already in the tool-cache folder. On self-hosted runners, that cache fills up with use and gets faster over time. On hosted runners, it depends on whether that cache is prepped with likely hits (e.g. the latest of every LTS major version line); remember that hosted VMs are a fresh machine every time. Hence the mention above of prepping that cache as part of our image-generation process. Doing that will mean virtually zero-second resolution of 8.x, 10.x, and 12.x specs and no reliability issues.

Part 2 of the work refers to a different cache: an off-VM cache (a CDN) where we cache all the Node distributions, so everyone's workflows aren't hitting the Node distribution directly (which is not a CDN). We can also get better locality and more reliability from a CDN. This is essentially what was suggested in the Node issue. It's not a trivial task, but we're working on it now. You can see the toolkit start here. The VM generation to populate that cache isn't OSS yet; it's early and we're sketching it out.

So essentially, it will be: (1) check the local VM cache (great for self-hosted, prepped on hosted), then (2) if no match locally, download from the CDN, then (3) if still no match, fall back to node dist.

Hope that helps clarify.
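
To illustrate the order described above, here is a rough TypeScript sketch. Only tc.find and tc.downloadTool are real @actions/tool-cache calls; the function itself, the URLs, and the omission of extraction/caching are assumptions for illustration, not the action's actual implementation:

```typescript
import * as tc from '@actions/tool-cache';

// Rough sketch of the three-step lookup described above. Extraction and
// caching of the downloaded archive are omitted; URLs are illustrative only.
async function acquireNode(versionSpec: string, resolvedVersion: string): Promise<string> {
  // (1) Local tool cache on the VM (prepped on hosted images, fills up on self-hosted)
  const cachedPath = tc.find('node', versionSpec);
  if (cachedPath) {
    return cachedPath;
  }

  const fileName = `node-v${resolvedVersion}-linux-x64.tar.gz`;
  try {
    // (2) CDN-backed releases cache (hypothetical URL)
    return await tc.downloadTool(`https://example-node-cdn.invalid/dist/v${resolvedVersion}/${fileName}`);
  } catch {
    // (3) Fall back to the official Node distribution
    return await tc.downloadTool(`https://nodejs.org/dist/v${resolvedVersion}/${fileName}`);
  }
}
```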

@19h

19h commented Apr 11, 2020

I think it would be beneficial if GitHub opted to mirror and effectively host the Node binaries, if it wants GitHub Actions to be taken even remotely seriously for production use.

@mataide

mataide commented Apr 24, 2020

Is it possible to set registry-url: https://registry.npmjs.org/ without uses: actions/setup-node@v1?

@bryanmacfarlane
Member

Update: v2-beta is available, which downloads from a releases cache for additional reliability. It falls back to node dist on a miss or failure. We also added the latest of each LTS directly to the image, so if you reference a major version binding ('10', '12', etc.) it should be virtually instant, with no reliability issues from downloading.

More here: https://github.com/actions/setup-node#v2-beta

Note that this is the planned solution to this problem.
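
A sketch of using the beta, based on the linked readme section (the LTS line and the npm commands are placeholders):

```yaml
steps:
  - uses: actions/checkout@v2
  - uses: actions/setup-node@v2-beta
    with:
      node-version: '12'   # major version binding, resolved from the image/releases cache
  - run: npm ci
  - run: npm test
```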

@bryanmacfarlane bryanmacfarlane added the enhancement New feature or request label Jun 26, 2020
@bryanmacfarlane bryanmacfarlane changed the title from "Action timing out" to "Action timing out when new node versions are released" Jun 26, 2020
@fishcharlie
Contributor

@bryanmacfarlane It looks like v2 has been released (not in beta). What is the status on this issue? Are there still more planned changes to improve the reliability here or does v2 resolve everything you have planned?

@patrickk

patrickk commented Mar 1, 2022

Also ran into this issue yesterday. Does v2 support the fallback out of the box, or does it need to be configured?

@dmitry-shibanov
Contributor

Hello @patrickk. Thank you for your report. First, the action will try to use Node.js (LTS) from the tool cache preinstalled on the hosted images. If the action cannot match the version, it will try to download Node.js (LTS) from node-versions. If it does not find the version in node-versions, it'll try to download it from https://nodejs.org/dist.

Could you please create an issue with a public repository or repro steps?

@patrickk

patrickk commented Mar 1, 2022

> Hello @patrickk. Thank you for your report. First, the action will try to use Node.js (LTS) from the tool cache preinstalled on the hosted images. If the action cannot match the version, it will try to download Node.js (LTS) from node-versions. If it does not find the version in node-versions, it'll try to download it from https://nodejs.org/dist.
>
> Could you please create an issue with a public repository or repro steps?

Sure, @dmitry-shibanov! Thanks for your reply.

Unfortunately I can't create a public repo since a lot of it is private, but I can detail the steps:

  1. Workflow checks out code
  2. setup-node@v1 tries downloading 16.11.11
  3. 500s

(screenshots from 2022-03-01 and 2022-02-28 showing the 500 errors)

(There is a version discrepancy between the two, but I'd intermittently get 500s from nodejs.org's dist for various versions.)

I read somewhere that this usually happens with new Node releases, but the latest release at the dist URL was from Feb 22, while the error above occurred Feb 28.

We were able to get past this by letting the action fail a bunch of times / run for 5+ minutes, until one run finally succeeded.

I haven't been able to reproduce it since (also, I updated our workflow files to use v2 as of this morning).

@dmitry-shibanov
Contributor

Hello @patrickk. Thank you for your response. Could you please switch to the v2 tag? With the v1 tag, the action tries to download Node.js from dist.
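
In practice that just means bumping the tag on the step, e.g. (a sketch; the version mirrors the report above and is only illustrative):

```yaml
- uses: actions/setup-node@v2
  with:
    node-version: '16'
```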

@dmitry-shibanov
Contributor

Hello everyone. For now I'm going to close the issue, because support for the node-versions repository has been added. Besides, LTS Node versions are pre-cached on the images.

deining pushed a commit to deining/setup-node that referenced this issue Nov 9, 2023
Bumps [@vercel/ncc](https://github.com/vercel/ncc) from 0.26.1 to 0.26.2.
- [Release notes](https://github.com/vercel/ncc/releases)
- [Commits](vercel/ncc@0.26.1...0.26.2)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>