Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new thread is created for each test even if --test-threads=1 #104053

Open
Taaitaaiger opened this issue Nov 6, 2022 · 17 comments
Open

A new thread is created for each test even if --test-threads=1 #104053

Taaitaaiger opened this issue Nov 6, 2022 · 17 comments
Assignees
Labels
A-libtest Area: #[test] related C-bug Category: This is a bug. needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. T-testing-devex Relevant to the testing devex team (testing DX), which will review and decide on the PR/issue.

Comments

@Taaitaaiger
Copy link

Taaitaaiger commented Nov 6, 2022

With nightly Rust, tests are run on separate threads even if --test-threads=1. It seems that a new thread is created for each test now, while they were previously run on the main thread if --test-threads=1. This breaks testing for jlrs, which depends on all tests running on the same thread. The underlying reason for this is that Julia can only be initialized once during a process's lifetime, and access is limited to the thread that initialized it.

My expectation is that --test-threads runs all tests on the same thread, or that I set use some other flag to achieve this.

Rust version: 1286ee2 2022-11-05

@Taaitaaiger Taaitaaiger added the C-bug Category: This is a bug. label Nov 6, 2022
@Taaitaaiger
Copy link
Author

This is probably due to #103681

@thomcc
Copy link
Member

thomcc commented Nov 7, 2022

We don't really guarantee this -- we certainly used to behave the way further in the past (that PR was a revert).

That said, it wasn't really a change intended to break anyone, and it makes sense that it might (in the case of single-threaded C libraries and the like). A PR restoring the previous behavior of --test-threads=1 would be fine, although it's a bummer that will probably bring back a lot of that cruftiness that was removed.

@thomcc
Copy link
Member

thomcc commented Nov 7, 2022

It might be cleaner to keep a thread around to run them, rather than coordinating use of the main thread for this case. I'd prefer that since previously it had to be duplicated, and it'd be a bummer to keep that forever.

IIUC from your description, it's most important that the tests all happen on some thread, and it's not important that it always be the main thread? (Hmmm... I suppose that's not enough for some cases though 😿)

@Taaitaaiger
Copy link
Author

Yeah, it seems like one of those things that's only going to affect a small subset of users. And you understood correctly, rewriting the tests to only use one test per file is a valid approach for me.

The PR does seem like a good change to me, would it be impossible to optionally allow tests to run on the main thread without reintroducing the cruftiness?

@ianks
Copy link

ianks commented Jan 27, 2023

This PR can have negative side effects for libs that rely on thread-local state being set before a test run (i.e. using ctor to setup something before the suite). I think it makes sense to keep the old behavior when --test-threads=1. I will take a look when I get some time!

@ianks
Copy link

ianks commented Jan 28, 2023

@rustbot claim

@uazu
Copy link

uazu commented Mar 27, 2023

I have the same issue. (The one that ehuss closed above.) The change to test on multiple threads one by one instead of just one thread has broken my tests in https://crates.io/crates/stakker.

Stakker can run multi-threaded, but as a single-threaded optimisation I allow a single thread to effectively become "special". This is useful when the app is structured as a main thread plus auxiliary threads. On that "special" main thread, Stakker can safely use global variables internally as an optimisation. However if Stakker is started on any other thread, it panics, because that would make the global variable use unsound, and the crate user needs to select the multi-threaded feature. However now that tests are spread across multiple threads, it immediately panics, and I cannot test the single-threaded functionality using cargo test.

For now I've hacked up a script to use the output of --list to run the tests one by one, but this is quite possibly flakey in case of changes in the output format of --list, so is not a good solution.

@ChrisDenton ChrisDenton added the needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. label Jul 16, 2023
@RalfJung
Copy link
Member

Funny, when it became a single thread that caused problems (#70492) and now the revert is also causing problems...

FWIW I would say each test getting fresh thread-local state is a feature. Having tests depend on each other is an anti-pattern. The test harness should clean up between tests as much as possible.

@thomcc
Copy link
Member

thomcc commented Jul 27, 2023

I think it's reasonable to want things to run on the main thread, as there are platforms which do care about this for substantial portions of their APIs. This requires reusing thread local state.

There might be other ways to support that though, it seems like a question that T-testing could eventually tackle.

@knopp
Copy link

knopp commented Jul 27, 2023

Rust default test harness doesn't really work for anything on macOS that requires execution on main thread. Unfortunately main thread is special (so that on some places there are actually checks for pthread_main_np, i.e. in libdispatch. It would be immensely helpful if there was way to run tests with the default harness directly on main thread.

@kpreid
Copy link
Contributor

kpreid commented Jul 27, 2023

In a threaded application, it might be the case that the special single thread (whether that is for platform APIs, a non-thread-safe foreign library, or anything else) is running a message receiving loop (actor) that accepts messages from all the other threads and performs the requested activities. (GUI frameworks that have a main-thread requirement, whether from the platform or on their own initiative, have just such a "run this function on the UI/event loop thread" facility.) But in the current test harness environment, there is no such access to the main thread.

If there were to be such a feature in the test harness, then for usability it should also include panic-catching and output-capturing just like the thread running the normal test. (I'm not necessarily suggesting that such a feature be added; just commenting on how applications in fact solve this problem without sticking to no threads at all.)

@thomcc
Copy link
Member

thomcc commented Jul 27, 2023

have just such a "run this function on the UI/event loop thread" facility

While this is true, we never start the main loop in the case of macOS, so if you call this function your thunk will just never get run.

@knopp
Copy link

knopp commented Jul 27, 2023

have just such a "run this function on the UI/event loop thread" facility

While this is true, we never start the main loop in the case of macOS, so if you call this function your thunk will just never get run.

Just running the test harness code on second thread as it is while running CFRunLoopRun() on main thread would probably pretty much solve all of our darwin issues. You can then easily jump to main thread simply by dispatch_async_f or by registering custom source on the run loop.

@uazu
Copy link

uazu commented Jul 28, 2023

FWIW I would say each test getting fresh thread-local state is a feature. Having tests depend on each other is an anti-pattern. The test harness should clean up between tests as much as possible.

In my case the tests don't depend on each other. In the tested configuration, the runtime is using global variables as an optimisation, and for that to be sound and also efficient, only one thread gets blessed as the thread which can access those global variables. For efficiency that blessing is permanent to the death of the process. (I stop and start a fresh runtime each test, but it inherits the blessing from the previous invocation, because it is permanent.)

Anyway I have this code, and I need to test it. So if cargo test isn't going to officially support proper single-threaded, perhaps I should put a bit more effort into a workaround and add a worker thread and some means to send work to it from the individual tests. I will need some way to catch and pass panics back, so they're reported from the test itself. Maybe I can make it work.

@the8472
Copy link
Member

the8472 commented Aug 1, 2023

perhaps I should put a bit more effort into a workaround and add a worker thread and some means to send work to it from the individual tests.

Some crates may have a need to run specific tests on specific threads or sequentially but other tests could run in parallel. Imo supporting arbitrarily complex execution constraints shouldn't be the job of the testing framework.

Many UI frameworks do have a runtime.execute_on_ui_thread(|| { ... }) kind of construct for this purpose.

@uazu
Copy link

uazu commented Aug 3, 2023

Imo supporting arbitrarily complex execution constraints shouldn't be the job of the testing framework.

The thing is that RUST_TEST_THREADS=1 suggests that Rust will test using just one test thread, and previously that's what it was doing, i.e. this constraint appeared to be supported. But never mind. Perhaps the option should be renamed to RUST_MAX_CONCURRENT_TESTS

@uazu
Copy link

uazu commented Aug 6, 2023

Okay, if anyone wants to copy the code, I have a worker thread implementation working nicely for plain cargo test (to get all my tests running on a single thread, where that is needed). However that broke my MIRI tests because MIRI wants me to wait for my worker thread to finish, but there is no way to do that because if I use a test to wait for the worker thread to finish, MIRI seems to block on that test and the other tests don't run. I've tried all kinds of different ways to get this to work without success. So for now I'm using a script to run MIRI tests one by one, using --list and --exact. This is slow but MIRI testing is already slow so it doesn't matter.

General things that might be helpful one day:

  • A cargo test option to run tests, but each one in a separate process, in the main thread. This might solve various people's problems all at once.
  • A #[test_cleanup] test annotation to let me run some code after all tests have finished to clean up the worker thread for MIRI.

@RalfJung RalfJung added the T-testing-devex Relevant to the testing devex team (testing DX), which will review and decide on the PR/issue. label Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-libtest Area: #[test] related C-bug Category: This is a bug. needs-triage-legacy Old issue that were never triaged. Remove this label once the issue has been sufficiently triaged. T-testing-devex Relevant to the testing devex team (testing DX), which will review and decide on the PR/issue.
Projects
Status: No status
Development

No branches or pull requests

9 participants