
Inside a docker container, tsx is no longer working after the container restarted #457

Closed
4 of 6 tasks
louislam opened this issue Jan 17, 2024 · 22 comments · Fixed by #459

Comments

@louislam
Contributor

louislam commented Jan 17, 2024

Acknowledgements

  • I searched existing issues before opening this one to avoid duplicates
  • I understand this is not a place to seek help, but to report a bug
  • I understand that the bug must be proven first with a minimal reproduction
  • I will be polite, respectful, and considerate of people's time and effort

Minimal reproduction URL

https://github.com/louislam/tsx-issue-reproduce

Version

Tested 4.7.0 and 4.6.2

Node.js version

Tested 20.11.0 and 18.19.0

Package manager

npm

Operating system

Linux

Problem & Expected behavior

Original issue from my project: louislam/dockge#353. For some users, my application Dockge is unable to start after their server was restarted.

Reproduce steps

  1. The issue can only be reproduced inside a Docker container; on a bare Ubuntu machine it works fine. So you need Docker and the docker-compose plugin to test it.
  2. Checkout: https://github.com/louislam/tsx-issue-reproduce
  3. Run docker compose build
  4. Run docker compose up
  5. You should see it repeatedly print Forever Loop
  6. Ctrl + C to stop the stack
  7. Run docker compose up again
  8. An error like Error: listen EADDRINUSE: address already in use /tmp/tsx-0/26.pipe will be shown and exited with code 1
  9. If you don't see it, run docker compose up several more times; there is a very high chance of hitting the error.

My observation

  • /tmp/ is persistent inside a Docker container; all files there remain after the container restarts
  • The PID is usually the same after a restart (in my case, it is 26)
  • /tmp/tsx-0/26.pipe is not removed after the container is stopped

Caused by

server.listen(pipePath, resolve);

which I believe belongs with #413.
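
This can be demonstrated outside of tsx with a minimal sketch (the socket path below is a hypothetical stand-in for /tmp/tsx-0/26.pipe):

import net from 'node:net';
import fs from 'node:fs';

const pipePath = '/tmp/demo.pipe'; // hypothetical stand-in for /tmp/tsx-0/26.pipe

// First run: binds the Unix socket and creates the file on disk.
// If the process dies without unlinking the file, the next run's
// listen() fails with EADDRINUSE, exactly as in the log below.
const server = net.createServer();
server.listen(pipePath, () => {
  console.log('listening on', pipePath);
});

// A graceful shutdown unlinks the file, so a restart succeeds.
process.on('SIGINT', () => {
  server.close();
  fs.rmSync(pipePath, { force: true });
  process.exit(0);
});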

Error Log

test-tsx-test-tsx-1  | > ls -l /tmp/tsx-0 ; tsx index.ts
test-tsx-test-tsx-1  |
test-tsx-test-tsx-1  | total 4
test-tsx-test-tsx-1  | -rw-r--r-- 1 root root 1172 Jan 17 21:07 17055-91de2c17650b57ed05bae0f65c605f825076ee72
test-tsx-test-tsx-1  | srwxr-xr-x 1 root root    0 Jan 17 21:07 26.pipe
test-tsx-test-tsx-1  | node:internal/errors:563
test-tsx-test-tsx-1  |     ErrorCaptureStackTrace(err);
test-tsx-test-tsx-1  |     ^
test-tsx-test-tsx-1  |
test-tsx-test-tsx-1  | Error: listen EADDRINUSE: address already in use /tmp/tsx-0/26.pipe
test-tsx-test-tsx-1  |     at Server.setupListenHandle [as _listen2] (node:net:1855:21)
test-tsx-test-tsx-1  |     at listenInCluster (node:net:1920:12)
test-tsx-test-tsx-1  |     at Server.listen (node:net:2025:5)
test-tsx-test-tsx-1  |     at file:///test-tsx/node_modules/tsx/dist/cli.mjs:53:31317
test-tsx-test-tsx-1  |     at new Promise (<anonymous>)
test-tsx-test-tsx-1  |     at yn (file:///test-tsx/node_modules/tsx/dist/cli.mjs:53:31295)
test-tsx-test-tsx-1  |     at async file:///test-tsx/node_modules/tsx/dist/cli.mjs:55:459 {
test-tsx-test-tsx-1  |   code: 'EADDRINUSE',
test-tsx-test-tsx-1  |   errno: -98,
test-tsx-test-tsx-1  |   syscall: 'listen',
test-tsx-test-tsx-1  |   address: '/tmp/tsx-0/26.pipe',
test-tsx-test-tsx-1  |   port: -1
test-tsx-test-tsx-1  | }
test-tsx-test-tsx-1  |
test-tsx-test-tsx-1  | Node.js v20.11.0

Contributions

  • I plan to open a pull request for this issue
  • I plan to make a financial contribution to this project
louislam added the bug (Something isn't working) and pending triage labels on Jan 17, 2024
@louislam
Contributor Author

I think it can be resolved by removing the .pipe file before listening, but I don't fully understand what this IPC server does.
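
Something along these lines might work (just a sketch, not tsx's actual code; probing with a connection to tell a stale file from a live one is my assumption):

import fs from 'node:fs';
import net from 'node:net';

// Probe the leftover socket: if nothing answers, the file is stale
// and can be unlinked so that listen() can bind again.
function unlinkIfStale(pipePath: string): Promise<boolean> {
  return new Promise((resolve) => {
    const probe = net.connect(pipePath);
    probe.once('connect', () => {
      probe.destroy();
      resolve(false); // a live server still owns this pipe
    });
    probe.once('error', () => {
      fs.rmSync(pipePath, { force: true }); // stale leftover; remove it
      resolve(true);
    });
  });
}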

@privatenumber
Owner

The IPC server is for the tsx parent (e.g. the watcher) and child (the actual Node.js process) to communicate things like exit signals or watched dependencies.
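
Conceptually it's roughly this shape (a simplified sketch of the mechanism, not the actual implementation; the two sides would normally be separate processes):

import net from 'node:net';

const pipePath = '/tmp/tsx-0/26.pipe'; // the per-PID pipe discussed above

// Parent (watcher) side: listen on the Unix socket for child messages.
const server = net.createServer((connection) => {
  connection.on('data', (data) => {
    // e.g. { type: 'dependency', path: '/app/foo.ts' } or an exit signal
    console.log('from child:', JSON.parse(data.toString()));
  });
});

server.listen(pipePath, () => {
  // Child (actual Node.js process) side: connect back and report events.
  const toParent = net.connect(pipePath, () => {
    toParent.write(JSON.stringify({ type: 'signal', signal: 'SIGTERM' }));
  });
});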

Is the process exiting abruptly? It sounds like the pipe isn't getting cleaned up, e.g. this cleanup call never runs:

fs.rmSync(pipePath);

I think it should be fine for the Node process to re-use the existing pipe file, though, because only one process can have a given PID at a time.

Do you mind adding a Docker run via GitHub Actions to your repo: https://github.com/louislam/tsx-issue-reproduce? I don't have a Docker env to test it in.

@louislam
Contributor Author

louislam commented Jan 18, 2024

Do you mind adding a Docker run via GitHub Actions to your repo: https://github.com/louislam/tsx-issue-reproduce? I don't have a Docker env to test it in.

Just added the github actions. You can check the log here:
https://github.com/louislam/tsx-issue-reproduce/actions/runs/7570154825/job/20614947580

@louislam
Contributor Author

Is the process exiting abruptly?

According to my users, they just reboot/shut down their machines normally; I assume Docker shuts the container down gracefully.

In my testing, docker compose stop or Ctrl + C also doesn't trigger the cleanup event.

Based on the current info, maybe the real issue is that when the process is killed by a signal, the process exit event is not triggered, so the cleanup code never runs.
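
In other words, something like this sketch seems to be needed (Node.js only reaches the exit event after a signal if a handler calls process.exit()):

import fs from 'node:fs';

const pipePath = '/tmp/tsx-0/26.pipe'; // hypothetical, as above

// 'exit' alone is not enough: an unhandled SIGTERM/SIGINT kills the
// process without firing it. Installing signal handlers that call
// process.exit() makes the cleanup run on docker stop / Ctrl + C too.
process.on('exit', () => fs.rmSync(pipePath, { force: true }));
for (const signal of ['SIGINT', 'SIGTERM'] as const) {
  process.on(signal, () => process.exit(0));
}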

I just checked on my bare machine without Docker; those files were actually not cleaned up either.
[screenshot: leftover pipe files in /tmp/tsx-0]

On a bare machine, the PID is fairly random each time, so collisions are less likely.

Inside a container, the PID is almost always the same (I would say 90% of the time).

@horvbalint

horvbalint commented Jan 19, 2024

We are also experiencing this issue of the pipe file not being cleaned up by tsx when the container shuts down.
As a workaround, one can use the node --import tsx index.ts command and use tsx only as a loader. This works, but you lose tsx's very useful watch mode, so a proper solution would be great :)

@horvbalint

I think it should be fine for the Node process to re-use the existing pipe file, though, because only one process can have a given PID at a time.

@privatenumber This is not always true. In our case, we have multiple projects running with tsx inside Docker containers, and for other reasons we have to mount the host's /tmp folder as a volume at the containers' /tmp. The PID of the tsx process is the same in both Docker containers, which means only one of the containers can start: the other one already finds a pipe file with the same PID. That file belongs to the other tsx process, though, so it is not fine to reuse it.

Maybe a random name (a UUID?) could be used as the pipe name, and in the unlikely scenario of it already existing, a new one could be generated? Something like the sketch below.
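
For example (a sketch of the idea; the directory argument is whatever tsx already uses):

import crypto from 'node:crypto';
import fs from 'node:fs';
import path from 'node:path';

// Pick a random pipe name, retrying on the (unlikely) collision
// with a file that already exists in the directory.
function randomPipePath(dir: string): string {
  for (;;) {
    const candidate = path.join(dir, `${crypto.randomUUID()}.pipe`);
    if (!fs.existsSync(candidate)) {
      return candidate;
    }
  }
}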

@privatenumber
Owner

Gotcha. I appreciate all the context and use-cases where the socket path could collide.

We could randomize the socket path, but the tricky part is communicating the random pipe path from the parent to the child.
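
One conceivable route (a sketch only; TSX_IPC_PATH is a made-up variable name, not an existing tsx option) is to pass the chosen path to the child through its environment:

import { fork } from 'node:child_process';
import crypto from 'node:crypto';

// Parent picks the random pipe path and hands it to the child via env,
// since the child has no other way to guess a randomized name.
const pipePath = `/tmp/tsx-0/${crypto.randomUUID()}.pipe`;
const child = fork('./child.js', {
  env: { ...process.env, TSX_IPC_PATH: pipePath }, // TSX_IPC_PATH is hypothetical
});

// In child.js: net.connect(process.env.TSX_IPC_PATH) to reach the parent.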

Will think about this some more, but I welcome any ideas or PRs!

@privatenumber
Owner

@horvbalint

Curious if you can find a way to uniquely identify the containers from within Node.js?

Wondering if values in os or process.env (e.g. os.hostname() or the username) could help namespace the pipe file PID.
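
e.g. roughly (a sketch of that namespacing, not a committed design):

import os from 'node:os';
import path from 'node:path';

// Inside Docker, os.hostname() defaults to the container ID, so two
// containers sharing /tmp would no longer collide on PID alone.
const pipePath = path.join(
  os.tmpdir(),
  'tsx-0',
  `${os.hostname()}-${process.pid}.pipe`,
);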

Would you mind sharing the output of npx systeminformation from the two containers?

@horvbalint

@privatenumber This seems promising: os.hostname() gives back the Docker container ID! I also saw that this hostname can be overridden with a launch argument when starting the container, so theoretically someone could give the same hostname to different containers, but that seems like a stretch :D

@horvbalint

npx systeminformation's output (truncated):

{
  "system": {
    "manufacturer": "System manufacturer",
    "model": "Docker Container",
    "version": "System Version",
    "serial": "System Serial Number",
    "uuid": "a0ae0013-a611-97c8-c2d3-04d9f5cb8d88",
    "sku": "-",
    "virtual": false
  },
  "os": {
    "platform": "linux",
    "distro": "Debian GNU/Linux",
    "release": "12",
    "codename": "bookworm",
    "kernel": "6.6.11-200.fc39.x86_64",
    "arch": "x64",
    "hostname": "552f4e808dba",
    "fqdn": "552f4e808dba",
    "codepage": "UTF-8",
    "logofile": "debian",
    "serial": "",
    "build": "",
    "servicepack": "",
    "uefi": false
  }
}

@privatenumber
Owner

privatenumber commented Jan 21, 2024

Yeah, I agree that anything user-configurable won't work as a UUID.

The system.uuid looks straightforward and promising. Is it different across the two containers? It might be good, but it could be expensive to retrieve: https://github.com/sebhildebrandt/systeminformation/blob/9ffc74a0b2c1f31d1ab1ee28fa77e9602f163732/lib/osinfo.js#L1100-L1169

@horvbalint

system.uuid is sadly the same in both my Docker containers :/

@privatenumber
Owner

That's too bad :(

Anyway, I don't have a reproduction of this problem to debug, and I'm not sure if this back-and-forth of asking for more info is productive. I'm going to leave it to you to open a PR with a suggested solution.

@louislam
Contributor Author

Oh, I missed the point where multiple containers could use the same volume.

In my opinion, other applications don't handle this either; I've seen a lot of applications put their pipe files in /var/run.

If you shared /var/run across multiple containers, I believe they would not work either.

Also for reference, pm2 is a global npm package used to manage Node.js processes. It puts pipe files at /home/<username>/.pm2/pids. Maybe that is a better location than /tmp?

@horvbalint

It's also possible that sharing /tmp with multiple containers is not a best practice. I could share only a subdirectory of /tmp, so a solution to the original problem would be enough.

@physihan

I think just solving the original issue would be sufficient. Typically, people don't mount this directory outside the container using Docker's mounting mechanism, and it shouldn't be mounted externally.

@marcoreni

marcoreni commented Feb 1, 2024

Docker environments have the HOSTNAME env variable, which is unique per container.
I see two ways of resolving this:

  • Making the pipePath a concatenation of hostname and PID (the hostname may not exist in non-Docker scenarios, but that's not a problem, since the PID alone is fine in that case); see the sketch after this list.
  • Accepting a "pipe filename" input parameter, so that advanced scenarios like this one can be handled more easily (e.g. --pipe-filename=$HOSTNAME-$PID.pipe or --pipe-filename=$UUID.pipe).
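
A sketch of the first option (the HOSTNAME fallback behavior here is my assumption):

// Prefix the pipe name with the container's HOSTNAME when present;
// plain PID naming is kept for non-Docker environments.
const prefix = process.env.HOSTNAME ? `${process.env.HOSTNAME}-` : '';
const pipePath = `/tmp/tsx-0/${prefix}${process.pid}.pipe`;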

@shennan

shennan commented Feb 5, 2024

Getting the same problem here running multiple Docker containers in an IoT environment. Is there a workaround while we search for a more permanent fix?

@louislam
Contributor Author

louislam commented Feb 5, 2024

@shennan Delete /tmp/tsx-0 before starting.

Reference:
https://github.com/louislam/dockge/pull/380/files

@AmIBeingObtuse

Getting the same problem here running multiple Docker containers in an IoT environment. Is there a workaround while we search for a more permanent fix?

Delete the container and rerun the run command, but with :nightly instead of :1.

@shennan

shennan commented Feb 5, 2024

Thanks; deleting everything (rm -rf /tmp/tsx-0) when the container starts up, before calling tsx, seems to fix it.

I'll keep an eye on the progress here and update the package when something has been merged in.

@privatenumber
Owner

🎉 This issue has been resolved in v4.7.1

If you appreciate this project, please consider supporting it by sponsoring ❤️ 🙏

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 12, 2024