Rustdoc should run all doctests in one binary #75341

jyn514 · 2020-08-09T17:29:18Z

Raised by @Lokathor on discord:

Doctests are slower to run than integration tests. Each individual doctest block runs as a unique binary/process when testing, but each entire integration test file runs as a single process with integration tests. So the test themselves aren't different speeds, but because you're building less bins and spawning less child processes it's faster to go through the integration tests.

I don't think there's any intrinsic reason the tests have to be their own process? The only issue is that many of them have fn main, which would have to be rewritten somehow. This could speed up CI times across the Rust ecosystem, including for the rust compiler itself.

Somewhat related to #51228.

The text was updated successfully, but these errors were encountered:

Nemo157 · 2020-08-09T20:40:15Z

fn main() doesn't seem to hard to rewrite, but a lot of nightly-only crates have #![feature(...)] and other crate-level attributes which will be more difficult.

cuviper · 2020-08-12T00:17:55Z

There are a lot of things a test may do with the assumption that it's an isolated process, like configuring global resources.

jyn514 · 2020-08-12T00:25:54Z

Hmm so this has backwards-compatibility issues, I see. Maybe we could do this in a new edition?

cuviper · 2020-08-12T00:34:32Z

I could see this in an edition, but I think you'd need an RFC. I think there should be an opt-out too -- maybe tests that contain fn main would remain in their own process, side-stepping the rewrite questions.

Mark-Simulacrum · 2020-08-12T00:36:35Z

FWIW I am somewhat doubtful that this is a good idea. We likely would have a hard time managing the distinct complexity merging these into a single compilation unit without causing a long stream of issues, not only at run-time but also at compile time (e.g., remapping warning spans and such).

I also don't know that this is truly that high overhead, at least on linux, spawning processes is not too fast but is also not all that slow. Before we invest too much into figuring out how to do this I would like to see a hacked together prototype and results on alloc, core, and std's tests as to how much of a performance win this is. If it's a few percent it seems not worth it.

Lokathor · 2020-08-12T01:04:02Z

Just anecdotally using two crates I had at hand:

809 doctests take 30.3 seconds to run (26.6 / second)
67 integration tests take 0.6 seconds to run (111.66 / second)

The timings were just taken by hand with a stopwatch, so they're only approximate, but both test suites were doing similar work per test, just math functions that setup some inputs, run the op, and check the outputs.

It probably varies by test suite, but I expect the difference to come out to a lot more than "a few percent".

Mark-Simulacrum · 2020-08-12T01:57:58Z

That seems like not quite an apples-to-apples comparison; for one thing the doctests today are all compiled out of process. Plus, is that including the rustc time for compiling those integration tests?

Lokathor · 2020-08-12T02:03:51Z

that's a cargo test with the tests having already been built. Building the integration tests adds about 2s to that case, putting them about even in terms of tests per second.

I don't recall the extra time taken to build the tests in the doctest case.

Are you saying that the doctests get built, run, and then immediately discarded as part of the cargo test command?

Mark-Simulacrum · 2020-08-12T02:06:44Z

That's the case today -- currently doctesting involves:

rustdoc collection of doctests (amortized across all of them), generating the altered source code (e.g., adding fn main)
spawn a rustc process to compile the doctest
execute the doctest itself

I have a long-term outstanding TODO for myself to verify that the rustc spawned there receives appropriate access to the jobserver, so I am uncertain whether we're managing parallelism correctly today, which might be another source of wins.

Nemo157 · 2020-08-12T09:19:55Z

On a very small crate (bs58) I copied all my doc-tests into a docs integration test file then tested how long they take to run including build time each way:

> touch src/lib.rs && time cargo test --all-features --test docs
test result: ok. 29 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
cargo test --all-features --test docs  0.46s user 0.19s system 103% cpu 0.635 total

> touch src/lib.rs && time cargo test --all-features --doc
test result: ok. 29 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
cargo test --all-features --doc  13.20s user 2.85s system 277% cpu 5.777 total

so writing doctests seems to give an ~9x walltime or ~28x cputime slowdown compared to identical integration tests.

(EDIT: splitting into one test per integration test file gives ~the same time as --doc, so it seems like there isn't much to be gained from optimizing how rustdoc runs the tests if they're still individual binaries).

Lokathor · 2020-08-14T07:13:27Z

If doctests were to be merged, i'd expect doctests without a main to just be merged into a single file and run as a batch.

ssokolow · 2021-11-01T07:28:11Z

Regardless of the solution, this performance issue is the number-one reason that I opt my examples out of being doctests and leave a note asking people to report a bug if they drift out of sync with the integration test they mirror.

It's simply too much of a drag on my code-compile-test iteration cycle.

On my CPU, I'm taking 3 seconds to rebuild a set of 13 tests as some mix of unit and integration tests and 37 seconds to rebuild those same tests as doctests.

camelid · 2022-04-23T20:39:46Z

Perhaps an easier solution (which would only solve part of the problem) would be to invoke rustc's API rather than invoking it as the rustc binary. This would also probably help with #81070 (see #81070 (comment)).

ppershing · 2024-02-18T10:06:26Z

In my experience doctests are performance drag especially in huge workspace which require nontrivial linking time. This lead us to abandon doctests in many places in our project

As for running concurrently - I am not sure how many problems would be caused by running tests concurrently/sequentially in a single process but if we think the main problem is linking performance, a workaround would be to just compile a single binary which would be invoked multiple times with different commandline argument indicating which test to run.

Also, given the previous comment on line mapping and multiple main collisions - a solution might be to create one sub-module per doctest and then collect this to a single fn main() runner? This way, people could keep using fn main() in their doctests, the top-level main would just delegate to these invididual modules. A downside would be that people could start exploiting e.g. super:: or crate:: inside doctests in accordance with Hyrum's law.

…2, r=<try> Greatly speed up doctests by compiling compatible doctests in one file Fixes rust-lang#75341. Take 2 at rust-lang#123974. It should now be much simpler to review since it's based on rust-lang#125798. I split the changes as much as possible to make it as easy as possible to review (even though it's a huge lot of code changes...). The following tests are not included into the combined doctests: * `compile_fail` * If there are crate attributes (`deny`/`allow`/`warn` are ok) * have invalid AST * `test_harness` * no capture * `--show-output` (because the output is nicer without the extra code surrounding it) Everything else is merged into one file. If this one file fails to compile, we go back to the current strategy: compile each doctest separately. Because of the `edition` codeblock attribute, I couldn't make them all into one file directly so I grouped them by edition, but it should be pretty anecdotic. In case the users want a specific doctest to be opted-out from this doctest merge, I added the `standalone` codeblock attribute: ```rust /// ```rust,standalone /// // This doctest will be run in its own process! /// ``` ``` Now the interesting part, I ran it on a few crates and here are the results (with `cargo test --doc` to only include doctests): | crate | nb doctests | before this PR | with this PR | | - | - | - | - | | sysinfo | 227 | 4.6s | 1.11s | | geos | 157 | 3.95s | 0.45s | | core | 4604 | 54.08s | 13.5s (merged: 0.9s, standalone: 12.6s) | | std | 1147 | 12s | 3.56s (merged: 2.1s, standalone: 1.46s) | r? `@camelid` try-job: x86_64-msvc try-job: aarch64-apple

…-v2, r=t-rustdoc Greatly speed up doctests by compiling compatible doctests in one file Fixes rust-lang#75341. Take 2 at rust-lang#123974. It should now be much simpler to review since it's based on rust-lang#125798. I split the changes as much as possible to make it as easy as possible to review (even though it's a huge lot of code changes...). The following tests are not included into the combined doctests: * `compile_fail` * If there are crate attributes (`deny`/`allow`/`warn` are ok) * have invalid AST * `test_harness` * no capture * `--show-output` (because the output is nicer without the extra code surrounding it) Everything else is merged into one file. If this one file fails to compile, we go back to the current strategy: compile each doctest separately. Because of the `edition` codeblock attribute, I couldn't make them all into one file directly so I grouped them by edition, but it should be pretty anecdotic. In case the users want a specific doctest to be opted-out from this doctest merge, I added the `standalone` codeblock attribute: ```rust /// ```rust,standalone /// // This doctest will be run in its own process! /// ``` ``` Now the interesting part, I ran it on a few crates and here are the results (with `cargo test --doc` to only include doctests): | crate | nb doctests | before this PR | with this PR | | - | - | - | - | | sysinfo | 227 | 4.6s | 1.11s | | geos | 157 | 3.95s | 0.45s | | core | 4604 | 54.08s | 13.5s (merged: 0.9s, standalone: 12.6s) | | std | 1147 | 12s | 3.56s (merged: 2.1s, standalone: 1.46s) | r? `@camelid` try-job: x86_64-msvc try-job: aarch64-apple

…2, r=t-rustdoc Greatly speed up doctests by compiling compatible doctests in one file Fixes rust-lang#75341. Take 2 at rust-lang#123974. It should now be much simpler to review since it's based on rust-lang#125798. I split the changes as much as possible to make it as easy as possible to review (even though it's a huge lot of code changes...). The following tests are not included into the combined doctests: * `compile_fail` * If there are crate attributes (`deny`/`allow`/`warn` are ok) * have invalid AST * `test_harness` * no capture * `--show-output` (because the output is nicer without the extra code surrounding it) Everything else is merged into one file. If this one file fails to compile, we go back to the current strategy: compile each doctest separately. Because of the `edition` codeblock attribute, I couldn't make them all into one file directly so I grouped them by edition, but it should be pretty anecdotic. In case the users want a specific doctest to be opted-out from this doctest merge, I added the `standalone` codeblock attribute: ```rust /// ```rust,standalone /// // This doctest will be run in its own process! /// ``` ``` Now the interesting part, I ran it on a few crates and here are the results (with `cargo test --doc` to only include doctests): | crate | nb doctests | before this PR | with this PR | | - | - | - | - | | sysinfo | 227 | 4.6s | 1.11s | | geos | 157 | 3.95s | 0.45s | | core | 4604 | 54.08s | 13.5s (merged: 0.9s, standalone: 12.6s) | | std | 1147 | 12s | 3.56s (merged: 2.1s, standalone: 1.46s) | r? `@camelid` try-job: x86_64-msvc try-job: aarch64-apple

jyn514 added T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. C-enhancement Category: An issue proposing an enhancement or a PR with one. A-doctests Area: Documentation tests, run by rustdoc labels Aug 9, 2020

jyn514 added the I-compiletime Issue: Problems and improvements with respect to compile times. label Aug 12, 2020

Manishearth added A-rustdoc and removed A-rustdoc labels Apr 23, 2021

ehuss mentioned this issue Jul 12, 2021

Cargo check does not check doctests rust-lang/cargo#6424

Open

ehuss mentioned this issue Aug 13, 2021

Doctests don't work in bin targets, non-public items #50784

Open

matklad mentioned this issue Nov 18, 2021

utils: add a few more examples and tests to MaybeValidated near/nearcore#5346

Merged

roypat mentioned this issue Mar 1, 2023

Run rustdoc tests firecracker-microvm/firecracker#3206

Closed

Jefffrey mentioned this issue Mar 4, 2023

Dataframe doctests in the main branch are taking very long to run (over 60 seconds) apache/datafusion#5347

Closed

epage added this to testing-devex backlog Oct 11, 2023

epage mentioned this issue Feb 8, 2024

doctests rust-lang/testing-devex-team#5

Open

GuillaumeGomez mentioned this issue Apr 15, 2024

Greatly speed up doctests by compiling compatible doctests in one file #123974

Closed

Shoeboxam mentioned this issue May 3, 2024

speed up smoke-test opendp/opendp#1579

Merged

traviscross mentioned this issue May 7, 2024

Tracking Issue for Rust 2024: Rustdoc tests in one binary #124853

Closed

3 tasks

alice-i-cecile mentioned this issue May 28, 2024

rustdoc ignores cargo and linker settings when compiling doc tests #125657

Open

GuillaumeGomez mentioned this issue Jun 10, 2024

Greatly speed up doctests by compiling compatible doctests in one file #126245

Merged

kylebarron mentioned this issue Aug 13, 2024

Refactor Rust CI a bit geoarrow/geoarrow-rs#697

Closed

bors closed this as completed in e5b3e68 Aug 14, 2024

github-project-automation bot moved this to Done in testing-devex backlog Aug 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rustdoc should run all doctests in one binary #75341

Rustdoc should run all doctests in one binary #75341

jyn514 commented Aug 9, 2020

Nemo157 commented Aug 9, 2020

cuviper commented Aug 12, 2020

jyn514 commented Aug 12, 2020

cuviper commented Aug 12, 2020

Mark-Simulacrum commented Aug 12, 2020

Lokathor commented Aug 12, 2020 •

edited

Loading

Mark-Simulacrum commented Aug 12, 2020

Lokathor commented Aug 12, 2020

Mark-Simulacrum commented Aug 12, 2020

Nemo157 commented Aug 12, 2020 •

edited

Loading

Lokathor commented Aug 14, 2020

ssokolow commented Nov 1, 2021 •

edited

Loading

camelid commented Apr 23, 2022

ppershing commented Feb 18, 2024

Rustdoc should run all doctests in one binary #75341

Rustdoc should run all doctests in one binary #75341

Comments

jyn514 commented Aug 9, 2020

Nemo157 commented Aug 9, 2020

cuviper commented Aug 12, 2020

jyn514 commented Aug 12, 2020

cuviper commented Aug 12, 2020

Mark-Simulacrum commented Aug 12, 2020

Lokathor commented Aug 12, 2020 • edited Loading

Mark-Simulacrum commented Aug 12, 2020

Lokathor commented Aug 12, 2020

Mark-Simulacrum commented Aug 12, 2020

Nemo157 commented Aug 12, 2020 • edited Loading

Lokathor commented Aug 14, 2020

ssokolow commented Nov 1, 2021 • edited Loading

camelid commented Apr 23, 2022

ppershing commented Feb 18, 2024

Lokathor commented Aug 12, 2020 •

edited

Loading

Nemo157 commented Aug 12, 2020 •

edited

Loading

ssokolow commented Nov 1, 2021 •

edited

Loading