Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeout abort can leave process(es) running in the background #3077

Closed
6 tasks done
NuroDev opened this issue Mar 24, 2023 · 16 comments · Fixed by #5047
Closed
6 tasks done

Timeout abort can leave process(es) running in the background #3077

NuroDev opened this issue Mar 24, 2023 · 16 comments · Fixed by #5047
Labels
p2-edge-case Bug, but has workaround or limited in scope (priority) upstream

Comments

@NuroDev
Copy link

NuroDev commented Mar 24, 2023

Describe the bug

Originally I was running 0.29.2 & ran into this frequently but since upgrading to 0.29.7 I have only ran into it a few times but in summary I have a monorepo with lots of packages & lots of tests. On a rare occasion when I run it it would get hung / stuck & then prompt a close timed out after 10000ms, followed by a Failed to terminate worker while running [PATH_TO_TEST].

When this happens I, like many, would try to abort the test using ^C. However, when doing so Vitest seems to leave the hung Node.js process running with the process peaked at 100% CPU usage. I mainly noticed this issue after my Macbook battery life dropped from 100% to 9% in 2 hours & realised a handful of Node.js processes were left running from the night before.

I assume this is because either Vitest is not correctly cleaning up processes when the CLI gets hung OR because the process is hung Vitest is not able to kill the process.

Reproduction

Reproduction has been relatively inconsistant so far but I was able to record the issue in a few steps:

  1. Run my tests
  2. Usually a single test gets hung & blocks finishing running the tests
  3. Logging gets staggered
  4. close timed out after 10000ms gets printed to the console
  5. Abort the test using ^C
  6. CLI will exit, Node.js process will remain running at 100% CPU usage until the system is restarted

If you want to clone the source to run this exact how I am, I just open sourced the project: nurodev/untypeable

Kapture.2023-03-24.at.21.41.43.mp4

System Info

System:
    OS: macOS 13.0.1
    CPU: (10) arm64 Apple M1 Pro
    Memory: 1.41 GB / 16.00 GB
    Shell: 5.8.1 - /bin/zsh
  Binaries:
    Node: 18.12.1 - ~/Library/Caches/fnm_multishells/4286_1679694469648/bin/node
    Yarn: 1.22.19 - ~/Library/Caches/fnm_multishells/4286_1679694469648/bin/yarn
    npm: 9.6.0 - ~/Library/Caches/fnm_multishells/4286_1679694469648/bin/npm
  Browsers:
    Safari: 16.1

Used Package Manager

yarn

Validations

@sheremet-va sheremet-va added bug help wanted Extra attention is needed labels Mar 25, 2023
@sheremet-va
Copy link
Member

I also noticed something similar. I think there was a comment in another issue that 100 CPU leads to process.exit override by execa 🤔

I am currently on my phone, so I cannot provide more information right now, but we should definitely look into it.

@AriPerkkio
Copy link
Member

AriPerkkio commented Mar 25, 2023

@NuroDev as work-around for now you can use poolMatchGlobs here. I quickly tested it in your reproduction case and it seems to work.

export default defineConfig({
  test: {
    poolMatchGlobs: [
      // This test prevents worker_thread from terminating
      ["**/spacex/tests/starlink.test.ts", "child_process"],
    ],
  },
});

This issue is kind of duplicate of #2008, but the reproduction case here is best one so far. I can reproduce the process hang here on every run and there are not that many test cases in total. Earlier I used trpc's repository which run into this issue on ~70% of the runs.

I can look into this more later. But likely something in the starlink.test.ts is preventing the worker_thread from terminating when worker.terminate() is called. First thing to check is whether there are any native node addons in dependencies using NAPI directly.

@AriPerkkio
Copy link
Member

I was able to reproduce this issue using just Node API's and undici: nodejs/undici#2026

I didn't look into their codebase that much but there seems to be some native code: https://github.com/nodejs/undici/tree/main/deps/llhttp/src

@sannajammeh
Copy link

Encountered the same issue on Node 20. Seems undici is the cause. By stubbing the global fetch with node-fetch the issue was resolved.

@akshaybabloo
Copy link

I seem to have the same issue on GitHub Actions. So far no problem on Windows.

@sethreidnz
Copy link

sethreidnz commented Jul 18, 2023

Has there been any further investigation to the root cause of this? I am running into it running in Azure Devops as well but do not use the above unidici library mentioned anywhere in the project so I think it's more generic. Trying the above workaround to see if it helps but curious if there has been any further insight into what might cause this?

@AriPerkkio
Copy link
Member

Has there been any further investigation to the root cause of this? I am running into it running in Azure Devops as well but do not use the above unidici library mentioned anywhere in the project so I think it's more generic.

I've only reported the issue to NodeJS issue board but that's it. Note that this also happens with native fetch in NodeJS - you don't need to install undici explicitly.

I've only seen reproduction setups where fetch caused the worker_threads to get stuck. Other cases have not been reported as far as I know.

@lannuttia
Copy link

I think that I am running into this issue as well. Unfortunately for my project, I am stuck between a rock and a hard place. I will either have to live with the hanging issue until it gets fixed upstream or downgrade to MSW 1.x. It appears that MSW 2.0 doesn't support polyfilling fetch and requires the native node fetch API so at least in my scenario, fixing it isn't as easy as replacing node's native fetch API with node-fetch.

Where MSW says that they will not support polyfilling fetch.

@AriPerkkio
Copy link
Member

Node's fetch doesn't get stuck when run inside node:child_processs. It's only the node:worker_threads which has these issues.

On Vitest you can use --pool=forks to run all tests inside node:child_process. Or if you are using vitest@0.3x.0 releases, use --no-threads.

@raegen
Copy link

raegen commented Dec 2, 2023

In my case, node-fetch is already stubbed - as is for every project using happy-dom with vitest, since happy-dom uses node-fetch under the hood. The issue however is present, so node-fetch is not the solution.

@AriPerkkio
Copy link
Member

// @vitest-environment happy-dom
import { test } from "vitest";
import nodeFetch from "node-fetch";

test("fetch", () => {
  console.log(fetch === nodeFetch);
  // stdout | tests/happy-dom.test.ts > fetch
  // false
});

@raegen I'm happy to look into reproduction cases where using node-fetch with --pool=threads doesn't work.

@AriPerkkio AriPerkkio added has workaround and removed bug help wanted Extra attention is needed labels Dec 21, 2023
@InfiniteXyy
Copy link
Contributor

I also created a simple repo to try to reproduce this issue.
https://github.com/InfiniteXyy/vitest-hang-issue

In my experiment, I used the library dompurify and encountered the hanging issue.

What I'm sure that without dompurify, tests will never hang in my case
There are also some other scenarios that I have tested, but I'm not sure if they have a direct connection to the issue:

  1. It is more likely to hang when test cases failed.
  2. Node 18/19 will hang, while it seems that there are no issues with version 20+.
  3. Setting globals: true and using the global vitest function increases the possibility of hanging.
  4. The higher count of tests, the higher the chance of hanging.

@AriPerkkio
Copy link
Member

@InfiniteXyy I don't think that is related to this issue as it's not using fetch. You should file separate bug with that minimal reproduction case instead.

@lovetingyuan
Copy link

lovetingyuan commented Jan 10, 2024

+1 , I am using nodejs fetch api. Most cases are working well, but one of them faced the issue problem. Not 100% happens.

@ModPhoenix
Copy link

ModPhoenix commented Jan 22, 2024

Node's fetch doesn't get stuck when run inside node:child_processs. It's only the node:worker_threads which has these issues.

On Vitest you can use --pool=forks to run all tests inside node:child_process. Or if you are using vitest@0.3x.0 releases, use --no-threads.

I ran into this issue today, and using --pool=forks helped me resolve it. I set it in the vite.config.ts file; this might be helpful for someone.

"vitest": "1.2.1"
vite.config.ts:

    test: {
      globals: true,
      environment: 'jsdom',
      setupFiles: './src/test/setup.tsx',
      // Setting pool='forks' is preventing this issue https://github.com/vitest-dev/vitest/issues/3077
      pool: 'forks',
    },

@sheremet-va sheremet-va moved this to P2 - 4 in Team Board Feb 12, 2024
@sheremet-va sheremet-va added p2-edge-case Bug, but has workaround or limited in scope (priority) and removed has workaround labels Feb 16, 2024
sgotti added a commit to sgotti/agola-web that referenced this issue Feb 27, 2024
* update vitest and vue test-utils
* fix vue-tsc errors.
* try to prevent issue vitest-dev/vitest#3077
by using vitest config "pool" set to "forks" instead of default
"threads"
@silverwind
Copy link
Contributor

Some seemingly good news from upstream: nodejs/undici#2026 (comment)

@sheremet-va sheremet-va moved this from P2 - 4 to Has plan in Team Board Mar 28, 2024
thisconnect added a commit to thisconnect/bitbox-wallet-app that referenced this issue Apr 24, 2024
CI sometimes still hangs, trying to run without threads

- vitest-dev/vitest#2008
- vitest-dev/vitest#3077

In earlier vitest versions the optinon was --no-threads, but this
changed to --pool=forks in:
- https://github.com/vitest-dev/vitest/releases/tag/v1.0.0-beta.0
- https://github.com/vitest-dev/vitest/blob/main/docs/guide/migration.md#pools-are-standardized-4172

i# with '#' will be ignored, and an empty message aborts the commit.
thisconnect added a commit to thisconnect/bitbox-wallet-app that referenced this issue Apr 29, 2024
CI sometimes still hangs, trying to run without threads

- vitest-dev/vitest#2008
- vitest-dev/vitest#3077

In earlier vitest versions the optinon was --no-threads, but this
changed to --pool=forks in:
- https://github.com/vitest-dev/vitest/releases/tag/v1.0.0-beta.0
- https://github.com/vitest-dev/vitest/blob/main/docs/guide/migration.md#pools-are-standardized-4172

Changed vitest config, so that CI and make webtest are both using
a single thread.
@github-actions github-actions bot locked and limited conversation to collaborators May 24, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
p2-edge-case Bug, but has workaround or limited in scope (priority) upstream
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.