Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustdoc's resugaring of projection predicates to type bindings is flawed when supertraits are involved #113015

Open
1 of 3 tasks
fmease opened this issue Jun 25, 2023 · 7 comments
Assignees
Labels
A-cross-crate-reexports Area: Documentation that has been re-exported from a different crate C-bug Category: This is a bug. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@fmease
Copy link
Member

fmease commented Jun 25, 2023

First discovered thanks to bevyengine/bevy#8898: There, the Output type of the cross-crate opaque impl FnMut(…) -> bool gets dropped by rustdoc resulting in impl FnMut(…). This specific bug is caused by rustdoc's opaque bound cleaner not realizing that it can & should utilize the Output assoc ty defined on FnMut's supertrait FnOnce. During further investigations I've discovered more bugs in this area.

In rustdoc there are two three major and distinct places where we resugar bounds. Both All three are flawed but each one exhibits different bugs.

  • (1) clean_middle_opaque_bounds for bounds of opaques (existential impl-Trait)
    • we only check for TraitRef equality while we should also check the super predicates (incl. their substs tho!)
      • (once we do, ensure that we don't intro name clashes into the list of type bindings (this can be triggered by relatively simple user code involving supertraits))
  • (2) clean_ty_generics for bounds in where-clauses and APIT (universal impl-Trait)
    • we do check super predicates but we only ever check DefIds (of traits) and never the relevant substs which is super incorrect leading rustdoc to move some projection predicates to unrelated trait bounds
    • the process mutates state by yanking those predicates rendering it order dependent (consider permutations of super and subtraits each one with own type bindings)
  • (3) param_env_to_generics in auto_trait.rs for where-clauses of synthetic impls
    • update: replaced by simplify::{where_clauses,sized_bounds} in #123340
    • doesn't use simplify::merge_bounds at all unlike the other two places
    • doesn't merge two partially resugared bounds like Tr<A = …> and Tr<B = …>
    • doesn't render bound vars / higher-ranked parameters at all
    • (I haven't thoroughly checked yet how supertraits are handled but it seems like Fn-family traits are special-cased)

Probably there are some more but less important bugs to be found (e.g. related to bound vars I can imagine).

I'm already deep in the weeds working on a correct & most general fix for this.
Opening this issue to keep the description of the future PR more slender.
@rustbot claim

@rustbot label C-bug T-rustdoc A-cross-crate-reexports

@rustbot rustbot added A-cross-crate-reexports Area: Documentation that has been re-exported from a different crate C-bug Category: This is a bug. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Jun 25, 2023
@fmease fmease changed the title rustdoc's resugaring of projections predicates to type bindings is flawed when supertraits are involved rustdoc's resugaring of projection predicates to type bindings is flawed when supertraits are involved Jun 25, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Oct 4, 2023
…ate-higher-ranked-params, r=notriddle

rustdoc: fix & clean up handling of cross-crate higher-ranked parameters

Preparatory work for the refactoring planned in rust-lang#113015 (for correctness & maintainability).

---

1. Render the higher-ranked parameters of cross-crate function pointer types **(*)**.
2. Replace occurrences of `collect_referenced_late_bound_regions()` (CRLBR) with `bound_vars()`.
  The former is quite problematic and the use of the latter allows us to yank a lot of hacky code **(†)**
  as you can tell from the diff! :)
3. Add support for cross-crate higher-ranked types (`#![feature(non_lifetime_binders)]`).
  We were previously ICE'ing on them (see `inline_cross/non_lifetime_binders.rs`).

---

**(*)**: Extracted from test `inline_cross/fn-type.rs`:

```diff
- fn(_: &'z fn(_: &'b str), _: &'a ()) -> &'a ()
+ for<'z, 'a, '_unused> fn(_: &'z for<'b> fn(_: &'b str), _: &'a ()) -> &'a ()
```

**(†)**: It returns an `FxHashSet` which isn't *predictable* or *stable* wrt. source code (`.rmeta`) changes. To elaborate, the ordering of late-bound regions doesn't necessarily reflect the ordering found in the source code. It does seem to be stable across compilations but modifying the source code of the to-be-documented crates (like adding or renaming items) may result in a different order:

<details><summary>Example</summary>

Let's assume that we're documenting the cross-crate re-export of `produce` from the code below. On `master`, rustdoc would render the list of binders as `for<'x, 'y, 'z>`. However, once you add back the functions `a`–`l`, it would be rendered as `for<'z, 'y, 'x>` (reverse order)! Results may vary. `bound_vars()` fixes this as it returns them in source order.

```rs
// pub fn a() {}
// pub fn b() {}
// pub fn c() {}
// pub fn d() {}
// pub fn e() {}
// pub fn f() {}
// pub fn g() {}
// pub fn h() {}
// pub fn i() {}
// pub fn j() {}
// pub fn k() {}
// pub fn l() {}

pub fn produce() -> impl for<'x, 'y, 'z> Trait<'z, 'y, 'x> {}

pub trait Trait<'a, 'b, 'c> {}

impl Trait<'_, '_, '_> for () {}
```

</details>

Further, as the name suggests, CRLBR only collects *referenced* regions and thus we drop unused binders. `bound_vars()` contains unused binders on the other hand. Let's stay closer to the source where possible and keep unused binders.

Lastly, using `bound_vars()` allows us to get rid of

* the deduplication and alphabetical sorting hack in `simplify.rs`
* the weird field `bound_params` on `EqPredicate`

both of which were introduced by me in rust-lang#102707 back when I didn't know better.

To illustrate, let's look at the cross-crate bound `T: for<'a, 'b> Trait<A<'a> = (), B<'b> = ()>`.

* With CRLBR + `EqPredicate.bound_params`, *before* bounds simplification we would have the bounds `T: Trait`, `for<'a> <T as Trait>::A<'a> == ()` and `for<'b> <T as Trait>::B<'b> == ()` which required us to merge `for<>`, `for<'a>` and `for<'b>` into `for<'a, 'b>` in a deterministic manner and without introducing duplicate binders.
* With `bound_vars()`, we now have the bounds `for<'a, b> T: Trait`, `<T as Trait>::A<'a> == ()` and `<T as Trait>::B<'b> == ()` before bound simplification similar to rustc itself. This obviously no longer requires any funny merging of `for<>`s. On top of that `for<'a, 'b>` is guaranteed to be in source order.
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Oct 4, 2023
Rollup merge of rust-lang#116388 - fmease:rustdoc-fix-n-clean-up-x-crate-higher-ranked-params, r=notriddle

rustdoc: fix & clean up handling of cross-crate higher-ranked parameters

Preparatory work for the refactoring planned in rust-lang#113015 (for correctness & maintainability).

---

1. Render the higher-ranked parameters of cross-crate function pointer types **(*)**.
2. Replace occurrences of `collect_referenced_late_bound_regions()` (CRLBR) with `bound_vars()`.
  The former is quite problematic and the use of the latter allows us to yank a lot of hacky code **(†)**
  as you can tell from the diff! :)
3. Add support for cross-crate higher-ranked types (`#![feature(non_lifetime_binders)]`).
  We were previously ICE'ing on them (see `inline_cross/non_lifetime_binders.rs`).

---

**(*)**: Extracted from test `inline_cross/fn-type.rs`:

```diff
- fn(_: &'z fn(_: &'b str), _: &'a ()) -> &'a ()
+ for<'z, 'a, '_unused> fn(_: &'z for<'b> fn(_: &'b str), _: &'a ()) -> &'a ()
```

**(†)**: It returns an `FxHashSet` which isn't *predictable* or *stable* wrt. source code (`.rmeta`) changes. To elaborate, the ordering of late-bound regions doesn't necessarily reflect the ordering found in the source code. It does seem to be stable across compilations but modifying the source code of the to-be-documented crates (like adding or renaming items) may result in a different order:

<details><summary>Example</summary>

Let's assume that we're documenting the cross-crate re-export of `produce` from the code below. On `master`, rustdoc would render the list of binders as `for<'x, 'y, 'z>`. However, once you add back the functions `a`–`l`, it would be rendered as `for<'z, 'y, 'x>` (reverse order)! Results may vary. `bound_vars()` fixes this as it returns them in source order.

```rs
// pub fn a() {}
// pub fn b() {}
// pub fn c() {}
// pub fn d() {}
// pub fn e() {}
// pub fn f() {}
// pub fn g() {}
// pub fn h() {}
// pub fn i() {}
// pub fn j() {}
// pub fn k() {}
// pub fn l() {}

pub fn produce() -> impl for<'x, 'y, 'z> Trait<'z, 'y, 'x> {}

pub trait Trait<'a, 'b, 'c> {}

impl Trait<'_, '_, '_> for () {}
```

</details>

Further, as the name suggests, CRLBR only collects *referenced* regions and thus we drop unused binders. `bound_vars()` contains unused binders on the other hand. Let's stay closer to the source where possible and keep unused binders.

Lastly, using `bound_vars()` allows us to get rid of

* the deduplication and alphabetical sorting hack in `simplify.rs`
* the weird field `bound_params` on `EqPredicate`

both of which were introduced by me in rust-lang#102707 back when I didn't know better.

To illustrate, let's look at the cross-crate bound `T: for<'a, 'b> Trait<A<'a> = (), B<'b> = ()>`.

* With CRLBR + `EqPredicate.bound_params`, *before* bounds simplification we would have the bounds `T: Trait`, `for<'a> <T as Trait>::A<'a> == ()` and `for<'b> <T as Trait>::B<'b> == ()` which required us to merge `for<>`, `for<'a>` and `for<'b>` into `for<'a, 'b>` in a deterministic manner and without introducing duplicate binders.
* With `bound_vars()`, we now have the bounds `for<'a, b> T: Trait`, `<T as Trait>::A<'a> == ()` and `<T as Trait>::B<'b> == ()` before bound simplification similar to rustc itself. This obviously no longer requires any funny merging of `for<>`s. On top of that `for<'a, 'b>` is guaranteed to be in source order.
@RalfJung
Copy link
Member

The duplicate Target = [u8] here is another instance of this bug.

bors added a commit to rust-lang-ci/rust that referenced this issue Mar 24, 2024
…s, r=<try>

[perf-only] rustdoc: synthetic impls: auto traits: Fx{Hash↦Index}{Map,Set}

Part of rust-lang#119597 and rust-lang#113015.
r? ghost
bors added a commit to rust-lang-ci/rust that referenced this issue Apr 1, 2024
…mpl-synth, r=<try>

rustdoc: heavily simplify the synthesis of auto trait impls

`gd --numstat HEAD~2 HEAD src/librustdoc/clean/auto_trait.rs`
**+315 -705** 🟩🟥🟥🟥⬛

---

As outlined in issue rust-lang#113015, there are currently 3[^1] large separate routines that “clean” `rustc_middle::ty` data types related to generics & predicates to rustdoc data types. Every single one has their own kinds of bugs. While I've patched a lot of bugs in each of the routines in the past, it's about time to unify them. This PR is only the first in a series. It completely *yanks* the custom “bounds cleaning” of mod `auto_trait` and reuses the routines found in mod `simplify`. As mentioned, `simplify` is also flawed but still it's more complete than `auto_trait`'s routines. See also my review comment over at `tests/rustdoc/synthetic_auto/bounds.rs`.

This is preparatory work for rewriting “bounds cleaning” in follow-up PRs in order to finally [fix] rust-lang#113015.

Apart from that, I've eliminated all potential sources of *unstable rendering output*.
See also rust-lang#119597. I'm pretty sure this fixes rust-lang#119597.

This PR does not attempt to fix [any other issues related to synthetic auto trait impls](https://github.com/rust-lang/rust/issues?q=is%3Aissue+is%3Aopen+label%3AA-synthetic-impls%20label%3AA-auto-traits).
However, it's definitely meant to be a *stepping stone* by making `auto_trait` more contributor friendly.

---

* Replace `FxHash{Map,Set}` with `FxIndex{Map,Set}` to guarantee a stable / more predictable iteration order
  * Or as a perf opt, `UnordSet` (a thin wrapper around `FxHashSet`) in cases where we never iterate over the set.
* Remove the types `RegionTarget` and `RegionDeps` from `librustdoc`. They were duplicates of the identical types found in `rustc`. Just import them from `rustc`. For some reason, they were duplicated when splitting `auto_trait` in two in rust-lang#49711.
* Get rid of the useless “type namespace” `AutoTraitFinder` in `librustdoc`
  * The struct only held a `DocContext`, it was over-engineered
  * Turn the associated functions into free ones
    * Eliminates rightward drift; increases legibility
  * `rustc` also contains a useless `AutoTraitFinder` struct but I plan on removing that in a follow-up PR
* Rename a bunch of methods to be way more descriptive
* Eliminate `use super::*;`
  * Lead to `clean/mod.rs` accumulating a lot of unnecessary imports and therefore
  * made `auto_traits` less modular

I plan on adding extensive documentation to `librustdoc`'s `auto_trait` in follow-up PRs.
I don't want to do that in this PR because further refactoring & bug fix PRs may alter the overall structure of `librustdoc`'s & `rustc`'s `auto_trait` modules to a great degree. I'm slowly digging into the dark details of `rustc`'s `auto_trait` module again and once I have the full picture I will be able to provide proper docs.

---

While this PR does indeed touch `rustc`'s `auto_trait` — mostly tiny refactorings — I argue this PR doesn't need any compiler reviewers next to rustdoc ones since that module falls under the purview of rustdoc — it used to be part of `librustdoc` after all (rust-lang#49711).

Sorry for not having split this into more commits. If you'd like me to I can try to retroactively separate it into more atomic commits. However, I don't know if that would actually make reviewing this PR easier.

r? `@GuillaumeGomez`

[^1]: Or even 4 depending on the way you're counting.
bors added a commit to rust-lang-ci/rust that referenced this issue Apr 2, 2024
…mpl-synth, r=GuillaumeGomez

rustdoc: heavily simplify the synthesis of auto trait impls

`gd --numstat HEAD~2 HEAD src/librustdoc/clean/auto_trait.rs`
**+315 -705** 🟩🟥🟥🟥⬛

---

As outlined in issue rust-lang#113015, there are currently 3[^1] large separate routines that “clean” `rustc_middle::ty` data types related to generics & predicates to rustdoc data types. Every single one has their own kinds of bugs. While I've patched a lot of bugs in each of the routines in the past, it's about time to unify them. This PR is only the first in a series. It completely **yanks** the custom “bounds cleaning” of mod `auto_trait` and reuses the routines found in mod `simplify`. As alluded to, `simplify` is also flawed but it's still more complete than `auto_trait`'s routines. [See also my review comment over at `tests/rustdoc/synthetic_auto/bounds.rs`](rust-lang#123340 (comment)).

This is preparatory work for rewriting “bounds cleaning” from scratch in follow-up PRs in order to finally [fix] rust-lang#113015.

Apart from that, I've eliminated all potential sources of *instability* in the rendered output.
See also rust-lang#119597. I'm pretty sure this fixes rust-lang#119597.

This PR does not attempt to fix [any other issues related to synthetic auto trait impls](https://github.com/rust-lang/rust/issues?q=is%3Aissue+is%3Aopen+label%3AA-synthetic-impls%20label%3AA-auto-traits).
However, it's definitely meant to be a *stepping stone* by making `auto_trait` more contributor-friendly.

---

* Replace `FxHash{Map,Set}` with `FxIndex{Map,Set}` to guarantee a stable iteration order
  * Or as a perf opt, `UnordSet` (a thin wrapper around `FxHashSet`) in cases where we never iterate over the set.
  * Yes, we do make use of `swap_remove` but that shouldn't matter since all the callers are deterministic. It does make the output less “predictable” but it's still better than before. Ofc, I rely on `rustc_infer` being deterministic. I hope that holds.
* Utilizing `clean::simplify` over the custom “bounds cleaning” routines wipes out the last reference to `collect_referenced_late_bound_regions` in rustdoc (`simplify` uses `bound_vars`) which was a source of instability / unpredictability (cc rust-lang#116388)
* Remove the types `RegionTarget` and `RegionDeps` from `librustdoc`. They were duplicates of the identical types found in `rustc`. Just import them from `rustc`. For some reason, they were duplicated when splitting `auto_trait` in two in rust-lang#49711.
* Get rid of the useless “type namespace” `AutoTraitFinder` in `librustdoc`
  * The struct only held a `DocContext`, it was over-engineered
  * Turn the associated functions into free ones
    * Eliminates rightward drift; increases legibility
  * `rustc` also contains a useless `AutoTraitFinder` struct but I plan on removing that in a follow-up PR
* Rename a bunch of methods to be way more descriptive
* Eliminate `use super::*;`
  * Lead to `clean/mod.rs` accumulating a lot of unnecessary imports
  * Made `auto_traits` less modular
* Eliminate a custom `TypeFolder`: We can just use the rustc helper `fold_regions` which does that for us

I plan on adding extensive documentation to `librustdoc`'s `auto_trait` in follow-up PRs.
I don't want to do that in this PR because further refactoring & bug fix PRs may alter the overall structure of `librustdoc`'s & `rustc`'s `auto_trait` modules to a great degree. I'm slowly digging into the dark details of `rustc`'s `auto_trait` module again and once I have the full picture I will be able to provide proper docs.

---

While this PR does indeed touch `rustc`'s `auto_trait` — mostly tiny refactorings — I argue this PR doesn't need any compiler reviewers next to rustdoc ones since that module falls under the purview of rustdoc — it used to be part of `librustdoc` after all (rust-lang#49711).

Sorry for not having split this into more commits. If you'd like me to I can try to split it into more atomic commits retroactively. However, I don't know if that would actually make reviewing easier. I think the best way to review this might just be to place the master version of `auto_trait` on the left of your screen and the patched one on the right, not joking.

r? `@GuillaumeGomez`

[^1]: Or even 4 depending on the way you're counting.
@fmease
Copy link
Member Author

fmease commented Apr 16, 2024

Major: OPAQ: We don't take supertrait bounds into account when reconstructing assoc item constraints (“type bindings”) from projections (#116735 would've fixed that but I've closed it in favor of a more general PR).

Source

pub fn rpit_fn() -> impl Fn() -> bool {
    || false
}

Rendered inlined cross-crate re-export

pub fn rpit_fn() -> impl Fn()

Source

pub fn f() -> impl TrA<X = ()> {}

pub trait TrA: TrB {}
pub trait TrB { type X; }

impl TrA for () {}
impl TrB for () { type X = (); }

Rendered inlined cross-crate re-export

pub fn f() -> impl TrA

@fmease
Copy link
Member Author

fmease commented Apr 16, 2024

Major: WHERE & APIT: We don't take generic arguments of super trait bounds into account when reconstructing assoc item constraints (“type bindings”) from projections, like at all!

Source

pub struct Ty<T>(T)
where
    T: TrA + TrB<i32, X = ()>;

pub trait TrA: TrB<()> {}
pub trait TrB<D> { type X; }

Rendered inlined cross-crate re-export

pub struct Ty<T>(/* private fields */)
where
    T: TrA<X = ()> + TrB<i32>;

Source

pub struct Ty<T>(T)
where
    T: TrA<X = bool> + TrB<i32, X = ()>;

pub trait TrA: TrB<()> {}
pub trait TrB<D> { type X; }

Rendered inline cross-crate re-export

pub struct Ty<T>(/* private fields */)
where
    T: TrA<X = bool, X = ()> + TrB<i32>;

@fmease
Copy link
Member Author

fmease commented Apr 16, 2024

Minor: WHERE: We don't try to reconstruct overarching / outer binders of clauses.

Source

pub struct Ty<T>(T)
where
    for<'o> T: TrA<'o> + TrB<'o>;

pub trait TrA<'a> {}
pub trait TrB<'a> {}

Rendered inlined cross-crate re-export

pub struct Ty<T>(/* private fields */)
where
    T: for<'o> TrA<'o> + for<'o> TrB<'o>;

Edit: Medium: WHERE: We drop the "outer" binder of type-outlives predicates:

Source

pub fn demo<T>() where for<'a> T: 'a {}

Rendered inlined cross-crate re-export

pub fn demo<T>()
where
    T: 'a,

@fmease
Copy link
Member Author

fmease commented Apr 16, 2024

Minor: WHERE: We don't necessarily preserve the order bounds that differ in LHS (bounded ty, lt). However, the HIR cleaner doesn't seem to do that either (requires a different reproducer though).

Source

pub fn f<'a, T>() where T: 'static, 'a: 'static {}

Rendered inlined cross-crate re-export

pub fn f<'a, T>()
where
    'a: 'static,
    T: 'static,

Order is lifetime bounds, then type bounds(, then equality bounds). This happens in simplify::where_clauses.

@fmease
Copy link
Member Author

fmease commented Apr 16, 2024

Minor: OPAQ: We don't necessarily preserve the order of bounds, we move traits in front of lifetime bounds.

@fmease
Copy link
Member Author

fmease commented Apr 16, 2024

Medium: WHERE & APIT: The cleaning process is order-dependent.

pub trait TrA: TrB {}
pub trait TrB { type X; }

pub fn f<T: TrA + TrB<X = ()>>() {} // rendered IXCRX of the bound: `T: TrA<X = ()> + TrB`
pub fn g<T: TrB<X = ()> + TrA>() {} // rendered IXCRX of the bound: `T: TrB<X = ()> + TrA`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cross-crate-reexports Area: Documentation that has been re-exported from a different crate C-bug Category: This is a bug. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants