Node 20.3 Crashes all the time when executed inside docker #48444
Another possible ref electron/rebuild#1085 |
@nodejs/libuv |
"Text file busy" means trying to write a shared object or binary that's already in use. My hunch is that node-gyp has some race condition in reading/writing files that wasn't manifesting (much) when everything still went through the much slower thread pool, whereas io_uring is fast enough to make it much more visible. |
Is there a way to disable io_uring using an env variable as a temporary workaround when running node-gyp? |
Yes, set |
This is actually worse than I thought. Node doesn't run at all with 20.3 |
|
I can reproduce this, and this is quite critical. |
cc @nodejs/tsc for visibility |
FWIW on two systems I have access to (a Red Hat owned RHEL 8 machine and test-digitalocean-ubuntu1804-docker-x64-1 from the Build infra)
root@test-digitalocean-ubuntu1804-docker-x64-1:~# docker run -it node:20.3.0 node
Unable to find image 'node:20.3.0' locally
20.3.0: Pulling from library/node
bba7bb10d5ba: Pull complete
ec2b820b8e87: Pull complete
284f2345db05: Pull complete
fea23129f080: Pull complete
9063cd8e3106: Pull complete
4b4424ee38d8: Pull complete
0b4eb4cbb822: Pull complete
43443b026dcf: Pull complete
Digest: sha256:fc738db1cbb81214be1719436605e9d7d84746e5eaf0629762aeba114aa0c28d
Status: Downloaded newer image for node:20.3.0
Welcome to Node.js v20.3.0.
Type ".help" for more information.
>
I can reproduce the assertion failure on an Ubuntu 16.04 host with
root@infra-digitalocean-ubuntu1604-x64-1:~# docker run -it node:20.3.0 node
Unable to find image 'node:20.3.0' locally
20.3.0: Pulling from library/node
bba7bb10d5ba: Pull complete
ec2b820b8e87: Pull complete
284f2345db05: Pull complete
fea23129f080: Pull complete
9063cd8e3106: Pull complete
4b4424ee38d8: Pull complete
0b4eb4cbb822: Pull complete
43443b026dcf: Pull complete
Digest: sha256:fc738db1cbb81214be1719436605e9d7d84746e5eaf0629762aeba114aa0c28d
Status: Downloaded newer image for node:20.3.0
node[1]: ../src/node_platform.cc:68:std::unique_ptr<long unsigned int> node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Start(): Assertion `(0) == (uv_thread_create(t.get(), start_thread, this))' failed.
1: 0xc8e4a0 node::Abort() [node]
2: 0xc8e51e [node]
3: 0xd0a059 node::WorkerThreadsTaskRunner::WorkerThreadsTaskRunner(int) [node]
4: 0xd0a17c node::NodePlatform::NodePlatform(int, v8::TracingController*, v8::PageAllocator*) [node]
5: 0xc4bbc4 node::V8Platform::Initialize(int) [node]
6: 0xc49408 [node]
7: 0xc497db node::Start(int, char**) [node]
8: 0x7f6e8486218a [/lib/x86_64-linux-gnu/libc.so.6]
9: 0x7f6e84862245 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
10: 0xba9ade _start [node]
root@infra-digitalocean-ubuntu1604-x64-1:~# docker run -it node:20.3.0-bullseye node
Unable to find image 'node:20.3.0-bullseye' locally
20.3.0-bullseye: Pulling from library/node
93c2d578e421: Already exists
c87e6f3487e1: Already exists
65b4d59f9aba: Already exists
d7edca23d42b: Already exists
25c206b29ffe: Already exists
599134452287: Pull complete
bd8a83c4c2aa: Pull complete
d11f4613ae42: Pull complete
Digest: sha256:ceb28814a32b676bf4f6607e036944adbdb6ba7005214134deb657500b26f0d0
Status: Downloaded newer image for node:20.3.0-bullseye
Welcome to Node.js v20.3.0.
Type ".help" for more information.
>
Our website build is actually broken running |
FWIW I opened an issue about this in the docker-node repo: nodejs/docker-node#1918 TLDR: this is not a problem with Node.js itself, but with the default base OS used by the Docker image, which was upgraded for v20.3.0. |
bullseye works for me as well |
Now I also get the file busy error:
EDIT: Works with |
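For readers unfamiliar with the "Text file busy" error discussed in this thread: it corresponds to the errno name `ETXTBSY`. The sketch below shows one way a caller could retry a step that fails with that code. It is only an illustration, not what node-gyp does: the helper name `retryOnTextFileBusy` and its parameters are hypothetical, and retrying hides a race rather than fixing it.

```javascript
// Hypothetical helper (not part of node-gyp or Node.js): run a synchronous
// action and retry it only when it fails with ETXTBSY ("Text file busy").
function retryOnTextFileBusy(action, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // The attempt number is passed along so callers can log it if they want.
      return action(attempt);
    } catch (error) {
      // Any other failure is rethrown immediately.
      if (error.code !== "ETXTBSY") throw error;
      lastError = error;
    }
  }
  // Every attempt hit ETXTBSY: surface the last error to the caller.
  throw lastError;
}
```

A caller might wrap a `child_process.execFileSync` call in `action`; whether a retry is acceptable depends on the build step being safe to repeat.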
So to summarize:
|
Should I split the uring problem into a separate issue? |
Can someone post the result of |
On the Ubuntu 16.04 infra machine I cannot run apt in the bookworm based
Another datapoint, adding
root@infra-digitalocean-ubuntu1604-x64-1:~# docker run --security-opt=seccomp:unconfined -it node:20.3.0 node
Welcome to Node.js v20.3.0.
Type ".help" for more information.
> |
Right, then I can predict with near 100% certainty what the problem is: docker doesn't know about the newish clone3 system call. Its seccomp filter rejects it with some bogus error and node consequently fails when it tries to start a new thread. This docker seccomp thing is like clockwork, it always pops up when new system calls are starting to see broader use. It's quite possibly fixed in newer versions. |
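A rough way to check this hypothesis on a host is to run the same image with Docker's default seccomp profile and again with the `--security-opt=seccomp:unconfined` option used earlier in this thread, then compare exit statuses. The sketch below assumes the `docker` CLI is on the PATH; the function names and result labels are hypothetical, and a "seccomp-suspected" result is only a hint, not proof that clone3 is the rejected call.

```javascript
const { spawnSync } = require("node:child_process");

// Pure decision logic, kept separate so it can be checked without Docker.
function interpretStatuses(defaultStatus, unconfinedStatus) {
  if (defaultStatus === 0) return "default-profile-ok";
  if (unconfinedStatus === 0) return "seccomp-suspected";
  return "other-failure";
}

// Start `node -e 0` in the given image and return the exit status.
// The status is null if docker could not be started or was killed by a signal.
function runNodeInDocker(image, extraDockerArgs = []) {
  const result = spawnSync(
    "docker",
    ["run", "--rm", ...extraDockerArgs, image, "node", "-e", "0"],
    { stdio: "ignore" }
  );
  return result.status;
}

// Compare the default seccomp profile against an unconfined run.
function diagnose(image = "node:20.3.0") {
  const defaultStatus = runNodeInDocker(image);
  const unconfinedStatus = runNodeInDocker(image, [
    "--security-opt=seccomp:unconfined",
  ]);
  return interpretStatuses(defaultStatus, unconfinedStatus);
}
```

Calling `diagnose()` on an affected host would be expected to return "seccomp-suspected"; running unconfined is meant as a diagnostic only, since it disables the seccomp filter entirely.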
Updating docker to the latest version fixed it (v24.0.2) for me. A few notes:
Here is what I think we should do:
This seems a future-proof solution while keeping the current functionality available. |
UV_USE_IO_URING is (intentionally) undocumented and going away again so don't do that. |
@bnoordhuis Would you just document this as "if you are hit by this bug, update docker"? |
I think there are two different things here. I'm not sure updating docker will help with the uring problem. Or does it? Please confirm. |
If I'm reading this correctly there are 2 separate issues here.
I think we should try to understand better the 2nd issue before disabling it. |
I have the same problem. What is the correct solution? |
Try upgrading your container runtime (docker, containerd, etc.) to the latest version. If nothing newer is picked up from your package manager, consider upgrading manually. |
Experiencing this with |
archlinux 6.9.1, nodejs 22.0.0-1, same error |
node-gyp or node have a bug that prevents building with "text file busy" if the kernel is too fast, so we have to disable IO_URING support. This is clearly a hack and needs to be removed as soon as possible. nodejs/node#48444 is the necro bumped thread originally from docker
Same problem
|
Btw we still use bullseye, works pretty good |
Not to bring up an old issue again, but this appears to be a recurring bug. Reproducing it consistently with kernel it is fixed with the given that that does fix it, it may also potentially be an issue with |
cc @santigimeno can you take another look? |
This is supposed to be the default for Node.js (since the February security releases). https://nodejs.org/docs/latest-v22.x/api/cli.html#uv_use_io_uringvalue
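As an illustration of the `UV_USE_IO_URING` variable documented at the link above, the sketch below starts a child process with the variable set to `0`. It assumes that `0` disables io_uring, as the linked CLI documentation describes; the helper names are hypothetical. Note that an earlier comment in this thread cautions against relying on the variable long term.

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical helper: copy an environment and force UV_USE_IO_URING to "0".
// The input object is not modified.
function withIoUringDisabled(env = process.env) {
  return { ...env, UV_USE_IO_URING: "0" };
}

// Hypothetical helper: run a command (for example a native-addon build)
// with io_uring disabled for that child process only.
function runWithoutIoUring(command, args = []) {
  return spawnSync(command, args, {
    env: withIoUringDisabled(),
    stdio: "inherit",
  });
}
```

For example, `runWithoutIoUring("npx", ["node-gyp", "rebuild"])` would apply the setting to that one build without changing the parent process environment.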
|
A patch has just been sent to the kernel fixing this: It should land in stable shortly: |
I just tested with 22.2.0 installed from nvm and, as documented, io_uring is disabled there. Maybe there is a problem in the Arch Linux package? |
If you run into this on Ubuntu 16 or 18, my fix is to use Ubuntu >= 20.04. The issue actually comes from docker for me. |
How is arch supposed to be disabling io_uring? It configures nodejs to use the system libuv, and builds its libuv with the default options. |
That's likely the problem. Due to the security reasons mentioned above node.js patched libuv to disable io_uring in the following commits: 42e659c and 6d14352. Maybe the arch packaging hasn't taken that into account? |
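To see whether a given Node.js binary was built against a system libuv rather than the bundled, patched copy, one can inspect the build configuration at runtime. This is a minimal sketch: `process.versions.uv` is a standard field, while `node_shared_libuv` is assumed to be present in `process.config.variables` for the build in question and may be undefined on some builds. The function name is hypothetical.

```javascript
// Hypothetical helper: collect the details relevant to this thread.
function describeLibuv() {
  const variables = (process.config && process.config.variables) || {};
  return {
    nodeVersion: process.version,
    // Version of the libuv that this process is actually using.
    libuvVersion: process.versions.uv,
    // Assumed build variable; truthy when Node.js was configured with a
    // shared (system) libuv instead of the bundled one.
    sharedLibuv: variables.node_shared_libuv,
    // Whatever the environment requests, if anything.
    ioUringEnv: process.env.UV_USE_IO_URING,
  };
}

console.log(describeLibuv());
```

If `sharedLibuv` is truthy, the io_uring default comes from the distribution's libuv package, not from the patches Node.js applies to its bundled copy.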
Thanks @santigimeno for the help debugging this. |
I am the Arch packager and indeed I have missed this change. Opened libuv/libuv#4416 to see if there is a better way forward. |
https://aur.archlinux.org/cgit/aur.git/commit/?h=thelounge&id=fd50c63 node-gyp or node have a bug that prevents building with "text file busy" if the kernel is too fast, so we have to disable IO_URING support. This is clearly a hack and needs to be removed as soon as possible. nodejs/node#48444 is the necro bumped thread originally from docker
For more info see nodejs/node#48444 (comment).
yarn add bufferutil) fails with Text file busy