
Make cargo test usable for nearcore developers #4490

Closed
matklad opened this issue Jul 9, 2021 · 15 comments
Labels
  • A-testing (Area: Unit testing / integration testing)
  • C-housekeeping (Category: Refactoring, cleanups, code quality)
  • Node (Node team)
  • P-low (Priority: low)
  • T-node (Team: issues relevant to the node experience team)

Comments


matklad commented Jul 9, 2021

I've noticed that I never run cargo test --workspace locally. I do want to be able to quickly check whether I broke something somewhere, but the usability of our current cargo test is abysmal.

Specifically:

  • Tests output a lot to the terminal, making the scrollback buffer useless.
  • The test suite is prohibitively slow: it takes 15 wall-clock minutes / 12 CPU minutes on a 24-core machine.

I suggest the following immediate action items:

  • fixing the tests so that the output of cargo test is never polluted unless a test actually fails. I think using TestWriter is one possible way to do this.
  • moving all obviously slow tests to expensive_tests.
  • revisiting this issue after these easy wins are achieved.

Longer term, I think it might make sense to do the following:

  • replace expensive_tests with a run-time check (see the example at the end of the "Make Tests Fast" section of this post; a minimal sketch follows below).
  • add two tiers of expensive_tests -- nightly_only_tests (> 5 min) and ci_only_tests (> 10 s, < 5 min). Local cargo t skips ci_only_tests by default, but CI runs them for every commit that gets into master.
  • document the expectations about test run-time and actively fix regressions.
  • aim for a really short test turn-around time. To give an aspirational goal: in rust-analyzer, cargo t takes 5 s wall-clock / 35 s CPU by default and 30 s / 100 s when including CI-only tests.
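(For illustration, a minimal sketch of the run-time check approach, assuming a hypothetical NEAR_EXPENSIVE_TESTS opt-in variable; this is not existing nearcore code:)

    #[test]
    fn test_expensive_sync() {
        // Hypothetical opt-in: bail out unless the variable is set.
        if std::env::var("NEAR_EXPENSIVE_TESTS").is_err() {
            // Note: the test still reports "ok" when skipped this way --
            // a pitfall discussed later in this thread.
            return;
        }
        // ... actual expensive test body ...
    }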
@matklad added the C-housekeeping, A-testing, and T-node labels on Jul 9, 2021
@janewang added the P-low label on Aug 11, 2021
stale bot commented Nov 9, 2021

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
It will be closed in 7 days if no further activity occurs.
Thank you for your contributions.

stale bot added the S-stale label on Nov 9, 2021
bowenwang1996 commented Nov 9, 2021

@matklad this is fixed, right?

stale bot removed the S-stale label on Nov 9, 2021

matklad commented Nov 10, 2021

Not at all:

  • running cargo test still outputs megabytes of DEBUG-level logging to the terminal,
  • it still takes 15 minutes to complete on a 12-core machine.


pmnoxx commented Jan 3, 2022

@matklad The solution to the excessive output is to run the tests with
RUST_LOG=warn cargo test --workspace.

Setting the proper environment variable eliminates the issue.
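(For context: RUST_LOG here is read by tracing_subscriber's EnvFilter, which the test logger uses, as the diff later in this thread shows; setting it to warn filters out the INFO- and DEBUG-level messages that flood the terminal.)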


matklad commented Jan 5, 2022

Seems reasonable to flip the default? Run with RUST_LOG=INFO if you need the logs, but get no unnecessary verbose output by default. Generally you only want to see logs for failing tests.


pmnoxx commented Jan 5, 2022

Seems reasonable to flip the default? Run with RUST_LOG=INFO if you need the logs, but get no unnecessary verbose output by default. Generally you only want to see logs for failing tests.

I agree, it's a good idea to switch the default for tests. However, we should then update the CI to run with RUST_LOG=INFO. Otherwise, you won't see errors on test failure.

Also, changing the default would change the default settings for the neard binary.
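(A minimal sketch of what flipping the default could look like, assuming the test logger builds its EnvFilter in one helper; illustrative, not the actual nearcore code:)

    use tracing_subscriber::EnvFilter;

    fn default_test_filter() -> EnvFilter {
        // Honour RUST_LOG when it is set, but fall back to warnings only
        // instead of DEBUG-level output when it is not.
        EnvFilter::try_from_default_env().unwrap_or_else(|_| EnvFilter::new("warn"))
    }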


mina86 commented Jan 5, 2022

Generally you only want to see logs for failing tests.

Yes, this is the issue. When I run tests, all I want to see is test foo::bar::baz ... ok a bunch of times. Logs should only be printed on failure. And to be honest, I don’t understand why this isn’t happening; isn’t the test runner supposed to capture the output of each test? Why are logs still visible?


mina86 commented Jan 7, 2022

Turns out libtest is broken: rust-lang/rust#90785


matklad commented Jan 7, 2022

diff --git a/test-utils/logger/src/lib.rs b/test-utils/logger/src/lib.rs
index 7372b5308..6be107180 100644
--- a/test-utils/logger/src/lib.rs
+++ b/test-utils/logger/src/lib.rs
@@ -20,7 +20,7 @@ fn setup_subscriber_from_filter(mut env_filter: EnvFilter) {
     let _ = tracing_subscriber::fmt::Subscriber::builder()
         .with_span_events(tracing_subscriber::fmt::format::FmtSpan::CLOSE)
         .with_env_filter(env_filter)
-        .with_writer(std::io::stderr)
+        .with_writer(tracing_subscriber::fmt::TestWriter::new())
         .try_init();
 }
 

fixes the issue (a bit swamped rn to submit an actual PR)
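(Context: `TestWriter` writes through the standard `print!` macro, which libtest's per-test output capture does intercept, unlike direct writes to `std::io::stderr`; see rust-lang/rust#90785.)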


pmnoxx commented Jan 7, 2022

fixes the issue (a bit swamped rn to submit an actual PR)

Thanks, I tested it. This works fine: #6023

mina86 added a commit to mina86/nearcore that referenced this issue Jan 7, 2022
Rust’s `libtest` has a… limitation where it doesn’t capture data
written directly to `std::io::stdout` or `std::io::stderr` (see
rust-lang/rust#90785).  Since we set up the test logger to use the
latter, none of the log messages are captured by the test harness.

The solution is to use `TestWriter` as the writer instead, which
cooperates with `libtest` better.

Issue: near#4490

mina86 commented Jan 7, 2022

fixes the issue (a bit swamped rn to submit an actual PR)

Indeed, #6025

replace expensive_tests with run-time check (see the example at the end of "Make Tests Fast" section of this post).

The issue with that approach is that if you forget to set the environment variable, you end up with a passing test that doesn’t test anything. This is another limitation of libtest, which does not allow returning an ignore state from a test; see rust-lang/rust#68007. The issue mentions the test-with crate [update: test-with is not fit for purpose] and an RFC, so I’m looking at those now.

near-bulldozer bot pushed a commit that referenced this issue Jan 7, 2022
Change logger to print output of `cargo test` correctly.

#4490 (comment)
mina86 added a commit that referenced this issue Jan 9, 2022
Firstly, the proper way to denote a test as expensive is to use a
`#[cfg]` directive before its definition rather than adding a return
inside the function.  The latter causes the test to show up in the
list of tests and always pass, giving people a false sense that
running the test without the `expensive_tests` feature actually does
something.

After fixing `sync_state_nodes_multishard`, also add it to the
nightly run.

Secondly, the `check_nightly.py` script is rather crude, and the
order in which the `#[cfg]` and `#[test]` attributes are applied does
matter.  With `#[test]` first, the script failed to recognise
`test_catchup` as an expensive test.

Thirdly, the `check_nightly.py` script also doesn’t understand a
`#[cfg]` matching multiple features.  Because of that it does not
detect `test_highload` as an expensive test either.  Since we’re not
using the `regression_tests` feature for anything, simply remove it.

Issue: #4490
mina86 added a commit that referenced this issue Jan 9, 2022
Introduce a `#[nightly_test]` attribute to be used to mark tests as
expensive (i.e. to be run nightly on NayDuck but omitted from casual
`cargo test` invocations or CI runs).

It replaces the use of the `#[cfg(feature = "expensive_tests")]`
directive and under the hood marks tests as ignored rather than
skipping their compilation.

This means that while previously nightly tests would be built only if
the `expensive_tests` feature was set, they are now always compiled.
This further means that changes that break such tests will be caught
by CI (previously they would break only once NayDuck decided to run
them), and there is no longer a need to conditionally compile helper
functions which are used only by such tests.

Issue: #4490
near-bulldozer bot pushed a commit that referenced this issue Jan 10, 2022
mina86 added a commit that referenced this issue Jan 10, 2022
Rather than using `cfg` to not compile expensive tests, use `cfg_attr`
to conditionally mark such tests as ignored.  In other words, instead
of writing:

    #[cfg(feature = "expensive_tests")]
    #[test]
    fn test_clear_old_data_too_many_heights() {
        // ...
    }

write:

    #[test]
    #[cfg_attr(not(feature = "expensive_tests"), ignore)]
    fn test_clear_old_data_too_many_heights() {
        // ...
    }

With this change, expensive tests will always be built, which means
that i) any code changes breaking them will be caught by CI (rather
than having to wait for a nightly run) and ii) code used only by
expensive tests is no longer unused when the `expensive_tests`
feature is not enabled (which means fewer
`#[cfg(feature = "expensive_tests")]` directives sprinkled throughout
the code).

Since we no longer mark whole modules as compiled only if the feature
is enabled, this change also means that each individual test needs to
be marked (rather than being able to mark a whole module).  This
makes it more obvious which tests are expensive and which aren’t
(since the marking is right at the test definition site) and
simplifies the `check_nightly.py` script.

Issue: #4490
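(With this scheme the expensive tests can still be run locally, either by enabling the feature, cargo test --features expensive_tests, or through libtest's standard flag for ignored tests, cargo test -- --ignored.)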
mina86 added a commit that referenced this issue Jan 10, 2022

matklad commented Jan 13, 2022

#5817 is related -- tests not being flaky is part of cargo test being usable out of the box.

near-bulldozer bot pushed a commit that referenced this issue Jan 13, 2022
near-bulldozer bot pushed a commit that referenced this issue Jan 14, 2022

matklad commented Jan 25, 2022

FWIW, the situation is much improved: cargo test now takes only 4 minutes locally! Thanks @mina86 and @pmnoxx!

@matklad closed this as completed on Jan 25, 2022

matklad commented Jan 25, 2022

For the future: running cargo test -- -Z unstable-options --report-time lets us find the slowest remaining tests.
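(`--report-time` is gated behind libtest's `-Z unstable-options`; it prints each test's execution time next to its result, making the slowest tests easy to spot.)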


pmnoxx commented Jan 25, 2022

@matklad Thanks, that's really useful. I'll add it to my scripts.

@gmilescu added the Node label on Oct 19, 2023