Move release docker host #3839

Open
ryanaslett opened this issue Jul 17, 2024 · 13 comments

Comments

@ryanaslett
Contributor

Sub issue of #3597

I've rebuilt the ubuntu1804_docker-x64-1 host with ubuntu2404, and its docker containers are running and connected to ci-release.nodejs.org.

It appears as though there are two containers, but only one is being used right now (the iojs+release job shows the cross-compiler-ubuntu1804-armv7-gcc-[6,8] jobs as disabled).

I have marked both
https://ci-release.nodejs.org/computer/release%2Dmnx%2Dubuntu1804%5Farm%5Fcross%5Fcontainer%2Dx64%2D2/
and
https://ci-release.nodejs.org/computer/release%2Dmnx%2Dubuntu1804%5Farm%5Fcross%5Fcontainer%2Dx64%2D2/
as offline until we're ready to flip them on and test.

I'm unsure of the standard procedure here: do we enable the new ones, disable the old ones, and wait for the daily build to run, or can we rebuild a previous build to test the new containers' validity?

@richardlau
Member

> It appears as though there are two containers, but only one is being used right now (the iojs+release job shows the cross-compiler-ubuntu1804-armv7-gcc-[6,8] jobs as disabled).

The ubuntu1804 container is redundant now as we're building Node.js 18, 20, 22 and later with the rhel8 container.

@richardlau
Member

Running a test build (re-run of today's V8 canary): https://ci-release.nodejs.org/job/iojs+release/10355/nodes=cross-compiler-rhel8-armv7-gcc-10-glibc-2.28/

This ran out of space. Jenkins has automatically taken the agent offline with

Caution

Disk space is below threshold of 1.00 GiB. Only 361.32 MiB out of 8.73 GiB left on /home/iojs/build.
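
The check behind that warning can be approximated outside Jenkins. The following is a minimal sketch, assuming a plain comparison of free bytes against the 1.00 GiB threshold quoted above; the helper names are hypothetical and this is not how Jenkins implements its disk space monitor.

```python
import shutil

GIB = 1024 ** 3
MIB = 1024 ** 2


def is_below_threshold(free_bytes: int, threshold_bytes: int = GIB) -> bool:
    """Return True when free space is under the threshold (1 GiB by default)."""
    return free_bytes < threshold_bytes


def describe_free_space(path: str, threshold_bytes: int = GIB) -> str:
    """Summarize free space for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    state = "BELOW" if is_below_threshold(usage.free, threshold_bytes) else "above"
    return (
        f"{path}: {usage.free / MIB:.2f} MiB free of "
        f"{usage.total / GIB:.2f} GiB ({state} threshold)"
    )


if __name__ == "__main__":
    # "." is a placeholder; on the agent the path of interest is /home/iojs/build.
    print(describe_free_space("."))
```

With the figures from the warning, 361.32 MiB free is well under 1 GiB, so `is_below_threshold` returns True for that value.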

@ryanaslett
Contributor Author

Attempted to re-run a daily job and it failed because the /home dir on the new mnx machines didn't have any space (all the disk was mounted on /data). That mount has been moved to /home.

The next issue is that the Docker data-root on the old machines had been manually moved to /home/docker-lib.

I added an /etc/docker/daemon.json file with:

{
  "data-root": "/home/docker-lib"
}
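
If a host already has a daemon.json with other settings, the same change could be made by merging the key rather than overwriting the file. This is a rough sketch under that assumption; `merge_data_root` is a hypothetical helper and not part of the Ansible changes discussed here.

```python
import json


def merge_data_root(existing_json: str, data_root: str) -> str:
    """Return daemon.json text with "data-root" set, keeping any other keys."""
    config = json.loads(existing_json) if existing_json.strip() else {}
    if not isinstance(config, dict):
        raise ValueError("daemon.json must contain a JSON object")
    config["data-root"] = data_root
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    # Starting from an empty file reproduces the config shown above.
    print(merge_data_root("", "/home/docker-lib"), end="")
```

Docker reads this file when the daemon starts, so the daemon would need a restart before the new data-root takes effect.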

https://ci-release.nodejs.org/job/iojs+release/10358/nodes=cross-compiler-rhel8-armv7-gcc-10-glibc-2.28/console is still running, which I think is due to not having anything in the ccache on the first run, but it has space now.

@ryanaslett
Contributor Author

I really hope this is just a ccache issue, but it's up to 4.5 hours and counting. Both CPUs are pegged at 100%. I'm wondering if we need a bigger box.

@richardlau
Member

It took over 8 hours but it did complete. I've started a rebuild which should utilize ccache so we can compare build times. https://ci-release.nodejs.org/job/iojs+release/10359/nodes=cross-compiler-rhel8-armv7-gcc-10-glibc-2.28/

@richardlau
Member

> It took over 8 hours but it did complete. I've started a rebuild which should utilize ccache so we can compare build times. https://ci-release.nodejs.org/job/iojs+release/10359/nodes=cross-compiler-rhel8-armv7-gcc-10-glibc-2.28/

This built in less than ten minutes (🎉) but failed to upload to node-www because we'll need to add its IP address to the ufw2 firewall there.

@ryanaslett
Contributor Author

I've added its IP to the firewall there, so we should be good to test again.

I'll be on PTO next week, so up to you whether to wait till I get back or if you want to switch over to using this before then.

@richardlau
Member

@ryanaslett Would it be possible to open PRs for the new machine (inventory in this repo and secrets) before you go?

@ryanaslett
Contributor Author

> @ryanaslett Would it be possible to open PRs for the new machine (inventory in this repo and secrets) before you go?

I realize I mistakenly pushed a few commits directly that were meant to be a PR: ed9abaf and f563e77.

So the IP address and the supporting Ansible changes are already in the repo.

I created a PR for the docker host secrets: https://github.com/nodejs-private/secrets/pull/339

@targos
Member

targos commented Jul 20, 2024

No worries. I added a branch ruleset to avoid future direct pushes to main.

@richardlau
Member

> I've added its IP to the firewall there, so we should be good to test again.

I've taken the joyent container offline in ci-release and put the mnx one back online and will see how tomorrow's nightly/v8-canary build(s) go.

@richardlau
Member

richardlau commented Aug 5, 2024

FWIW, builds have been successful on the new container since the switch. Build times without a ccache, or where much of V8 needs to be recompiled (e.g. V8 canary), are now ~9-10 hours (!). With a populated ccache, we're at a reasonable ~9 minutes.
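
To put those numbers side by side, here is a quick back-of-the-envelope calculation. It assumes the approximate figures above (9 to 10 hours cold, about 9 minutes warm) and is only illustrative; the function name is made up for this sketch.

```python
def speedup(cold_minutes: float, warm_minutes: float) -> float:
    """Ratio of a cold (empty ccache) build time to a warm (populated ccache) one."""
    return cold_minutes / warm_minutes


cold_low, cold_high = 9 * 60, 10 * 60  # ~9-10 hours, in minutes
warm = 9                               # ~9 minutes

# prints "60x to 67x faster with a populated ccache"
print(
    f"{speedup(cold_low, warm):.0f}x to {speedup(cold_high, warm):.0f}x "
    "faster with a populated ccache"
)
```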

