Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fallible units for scrape-examples feature #10596

Closed
wants to merge 1 commit into from

Conversation

willcrichton
Copy link
Contributor

@willcrichton willcrichton commented Apr 25, 2022

What does this PR try to resolve?

As established in rust-lang/rust#73566 and rust-lang/rust#43348, the minimum viable input to Rustdoc only needs to have valid function headers, and not valid function bodies. This is not super common, but must be a valid fallback for some crates.

The scrape examples feature requires checking the bodies of functions in order to find locations of examples. This includes the main library's functions, and functions in any other desired target. Therefore some libraries that are otherwise documentable will fail while scraping.

How should we test and review this PR?

To address this issue, this PR implements a new concept: fallible units. Units can be marked as can_fail = true, and then their failure does not terminate the Cargo session. A new data structure Context::units_completed tracks which units have completed and whether they succeeded or failed. If a unit fails, then all of its transitive reverse dependencies that also have can_fail = true will immediately fail as well.

This PR uses this feature to implement the desired behavior for scraping examples. When the user passes -Z ignore-scrape-failures, then all CompileMode::Docscrape units will have can_fail = true, and the cargo doc implementation will dynamically check which scraped examples have succeeded to determine which *.examples files should be passed to Rustdoc. The idea of the added flag is that it will be used on docs.rs where limiting failure is desirable, while users individually calling cargo doc will be presented terminal compilation failures by default since it's more likely a bug than a feature if a crate's examples don't compile.

The added tests scrape_examples_no_fail_bad_example and scrape_examples_no_fail_bad_dependency show how this behavior works in practice.

Additional information

Some questions / todos:

  • Right now the user will still see the compiler error when they arise. But they don't see any information about skipped fallible units. How should that be communicated through the CLI?
  • When fallible units are skipped, I ensure that queue.finish is called to register that the unit is done. Are there other pieces of state that need to get updated in this special case?

@rust-highfive
Copy link

r? @ehuss

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 25, 2022
@bors
Copy link
Contributor

bors commented May 20, 2022

☔ The latest upstream changes (presumably #10539) made this pull request unmergeable. Please resolve the merge conflicts.

@willcrichton willcrichton force-pushed the fallible-units branch 2 times, most recently from 20551a8 to ae509e8 Compare July 17, 2022 17:42
specific units fail. All Docscrape units now have can_fail = true to
avoid stopping Rustdoc if an example fails to scrape.
@ehuss
Copy link
Contributor

ehuss commented Jul 18, 2022

I've been thinking about this quite a bit and I'm just not convinced that this is something that can unconditionally be enabled by default in the short term. As you identified in #10343 (comment), there are several reasons that a package will fail to build examples, and I don't think this will be sufficient to address those.

I'm also concerned about this approach of having fallible units. In my experience, some people get confused when there are error messages displayed which are then later ignored. I'm concerned that this could be a difficult user experience. This also doesn't address the concern that dev-dependencies don't build, which I think is one of the greater issues.

I'm also concerned with the depth of changes being made to Cargo to accommodate this.

I'd like to consider a different approach, perhaps something along the lines of:

  • Add the doc-scrape-examples target flag. It is an Option<bool> with three states (true, false, unset). The default is "unset".
  • Docscraping is only enabled if either:
    • Examples are included with --example or --examples flag, and there is at least one example with doc-scrape-examples being true or unset.
      • The user is explicitly asking for an example, so it should be safe to scrape them.
    • There is at least one [[example]] with doc-scrape-examples=true
      • The user explicitly opted-in to doc-scraping.
    • There are no dev-dependencies, and there is at least one example with doc-scrape-examples being true or unset.
      • This should be safe to implicitly try (see below).
    • There are other workspace members being documented at the same time (with --workspace for example). In this scenario, it should be safe to scrape those libraries.
  • If doc-scrape-examples is unset for a Unit, and the unit mode is CompileMode::Docscrape, then errors for that unit will be ignored.
    • The user did not explicitly opt-in for doc-scraping. The example is only being opportunistically scanned.

    • I'd like to avoid adding something like can_fail to the Unit right now. I think this check can be done from the existing information in Unit. We can add something more complex like fallible Units in the future if it looks necessary and we have more experience with doc scraping.

    • The output in this scenario is also not displayed unless --verbose is used. If the unit fails and --verbose is not passed, cargo should print something like:

      warning: failed to scan example "some_example" in package `foo` for example code usage
           Try running with `--verbose` to see the error message.
           If this example should not be scanned, consider adding `doc-scrape-examples = false` to the `[[example]]` definition in `relative/path/to/Cargo.toml`.
      

I realize that not enabling by default makes it very unlikely that users will turn it on, making this feature less useful. However, I think this approach gives a short-term stepping stone. For projects without dev-dependencies, this should enable it by default, which should cover about 40% of crates with examples (only about 12.8% of crates on crates.io have any examples). It also enables it when documenting a workspace for other libraries.

After that has been used on docs.rs for a while (or on stable), and we have more experience with it, we can revisit what more we can do to make it more accessible. Some rough thoughts on what that could look like:

  • Perhaps there are other conditions where it could be enabled?
  • Perhaps add some more complex "fallible dev-dependencies"?
  • Swap the default to "on" in the next edition (perhaps if we can auto-detect if it is safe?).
  • Perhaps docs.rs could enable it, and if it fails, run again with it off?
  • Are a large number of packages excluding examples when published? Can we steer users to include them more often?
  • Perhaps target-specific dependencies would allow users to reduce the scope of which dependencies are needed?
  • Perhaps shared target settings would make it easier to opt-in? For example, package.examples.doc-scrape-examples = true could be added in one-line to affect all examples.

How do you feel about this different approach?

@willcrichton
Copy link
Contributor Author

Broadly this policy seems good as a first pass that avoids unexpected breakage, thanks for the careful thought you put into it.

My biggest concern is just how complex it is, with the accompanying UX issues in predictability and cliffs. For instance, it feels very weird as a user to change something seemingly unrelated to docs, like adding a dev-dependency, and all of a sudden your scraped examples vanish from documentation. But I suppose this is an unavoidable consequence of the choice to make Rustdoc not rely on dev-dependencies.

I think it's reasonable if Cargo has a conservative policy, so all cargo doc commands that worked yesterday will work today. Then docs.rs can (eventually) adopt the progressive policy of e.g. passing --examples by default, which users can opt-out via configuration if that breaks their niche build issue.

@willcrichton
Copy link
Contributor Author

Also: maybe it makes sense then to merge these changes back into #10343? Since "fallibility" is not being introduced as a first-class concept, I don't think it needs its own PR. If you agree, let's close this PR for now and move back to #10343.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants