diff --git a/text/0000-associated-const-underscore.md b/text/0000-associated-const-underscore.md
new file mode 100644
index 00000000000..14b41214bab
--- /dev/null
+++ b/text/0000-associated-const-underscore.md
@@ -0,0 +1,649 @@
+- Feature Name: `associated_const_underscore`
+- Start Date: 2023-11-12
+- RFC PR: [rust-lang/rfcs#3527](https://github.com/rust-lang/rfcs/pull/3527)
+- Rust Issue:
+
+# Summary
+[summary]: #summary
+
+Allow `_` for the name of associated constants. This RFC builds on [RFC 2526]
+which added support for free `const` items with the name `_`, but not associated
+consts.
+
+```rust
+// RFC 2526 (stable in Rust 1.37)
+const _: () = { /* ... */ };
+
+impl Thing {
+    // this RFC
+    const _: () = { /* ... */ };
+}
+```
+
+Constants named `_` are not nameable by other code and do not appear in
+documentation, but are useful when macro-generated code must typecheck some
+expression in the context of a specific choice of `Self`.
+
+[RFC 2526]: https://github.com/rust-lang/rfcs/pull/2526
+
+# Motivation
+[motivation]: #motivation
+
+The motivation is long, because understanding why this feature is worth having
+requires understanding a fair bit of context about procedural macro techniques
+and limitations. I have opted to provide this context in substantial depth.
+
+Consider the standard library's `derive(Eq)` macro. The `core::cmp::Eq` trait
+notionally contains no functions, but the following simple expansion would be
+_wrong_ for its derive macro:
+
+```rust
+// input:
+#[derive(Eq)]
+pub struct Thing {
+    field: Field,
+}
+
+// an incorrect expansion:
+impl ::core::cmp::Eq for Thing {}
+```
+
+This expansion is incorrect because we want `derive(Eq)` to be responsible for
+enforcing that all fields of the type have an `Eq` impl. If the type `Field`
+above happens to be `f32` (which implements `PartialEq` but not `Eq`), spitting
+out a compilable `Eq` impl for `Thing` would be incorrect.
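To make the `f32` point concrete, here is a minimal sketch of why the `Eq` requirement has to be checked for each field. The names `AssertIsEq` and `is_reflexive` are hypothetical, introduced only for this illustration; they are not part of the standard library or of this RFC.

```rust
use core::marker::PhantomData;

/// Hypothetical stand-in for a compile-time "is `T: Eq`?" check.
/// Merely naming this type with a non-`Eq` parameter fails to compile.
#[allow(dead_code)]
pub struct AssertIsEq<T: Eq + ?Sized>(PhantomData<T>);

/// `Eq` promises that `x == x` holds for every value of the type.
pub fn is_reflexive<T: PartialEq>(x: &T) -> bool {
    x == x
}

// Compiles: `u32` implements `Eq`.
const _: () = {
    let _: AssertIsEq<u32>;
};

// Would be rejected with E0277, because `f32` implements only `PartialEq`:
// const _: () = {
//     let _: AssertIsEq<f32>;
// };
```

The runtime half shows what the compile-time check protects against: `is_reflexive(&f32::NAN)` is `false`, so a type containing an `f32` field cannot honestly claim `Eq`.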
+
+Here is what `derive(Eq)` expands to today, as of Rust 1.74:
+
+```rust
+impl ::core::cmp::Eq for Thing {
+    #[inline]
+    #[doc(hidden)]
+    #[coverage(off)]
+    fn assert_receiver_is_total_eq(&self) -> () {
+        let _: ::core::cmp::AssertParamIsEq<Field>; // AssertParamIsEq
+    }
+}
+```
+
+The `Eq` trait has secretly come with a `doc(hidden)` associated function for
+the sole purpose that `derive(Eq)` can stick code in there to typecheck it.
+
+This RFC proposes that `derive(Eq)` should generate its output as follows
+instead, and the nonpublic `assert_receiver_is_total_eq` can be removed from the
+trait.
+
+```rust
+impl ::core::cmp::Eq for Thing {}
+
+impl Thing {
+    const _: () = {
+        let _: ::core::cmp::AssertParamIsEq<Field>;
+    };
+}
+```
+
+A number of alternative expansions come to mind using only existing syntax, none
+of which are adequate to this use case.
+
+1. **Just keeping the hidden function doesn't seem so bad.**
+
+   From the perspective of the standard library's own `derive(Eq)`, sure. The
+   trait and the derive macro are both defined by the same library. It's fair
+   for the macro to be written against nonpublic internals of the trait. This
+   is standard practice.
+
+   But in a situation where the trait and macro are defined in independent
+   crates, a nonpublic function for dumping typechecking code into is not a
+   workable solution. This even affects `Eq`, because crates other than the
+   standard library want to be able to provide custom derive macros for it.
+
+   Consider what the [derive\_more] crate would need to do to support its own
+   `derive(derive_more::Eq)`.
+
+   [derive\_more]: https://github.com/JelteF/derive_more/issues/311
+
+   ```rust
+   #[derive(derive_more::Eq)]
+   struct Thing {
+       foo: Foo,
+       #[derive_more(skip)]
+       bar: Bar,
+   }
+   ```
+
+   Code needs to go somewhere to check the `Foo: Eq` requirement. Reaching into
+   private standard library internals is definitely not an intended way to
+   accomplish this.
+
+2. **So just make the dummy function public and stable?**
+
+   My personal guess is that doing this to work around a language limitation
+   would not be appealing to the standard library API team.
+
+   Beyond aesthetic sensibility, here are some downsides to the dummy function
+   approach.
+
+   While `Eq` is not an auto-trait, the function approach is impossible to
+   apply to auto-traits. Auto-traits (formerly known as opt-in builtin traits)
+   are not allowed to contain trait functions. If we want derive macros such as
+   those in derive\_more to be able to produce implementations of `Unpin` or
+   `UnwindSafe`, a different approach is required.
+
+   Trait functions also have implications for dyn-safety. `Eq` is not dyn-safe
+   already, but other marker traits are. In order to keep dummy functions from
+   adding bloat to vtables, we'd want them bounded with `where Self: Sized`.
+   This poses a footgun for the macro implementation, which would need to know
+   to _omit_ `where Self: Sized` on dummy functions within generated trait
+   impls ([overconstraining]/[refining]) or risk getting false negatives.
+
+   [overconstraining]: https://rust-lang.github.io/rfcs/2316-safe-unsafe-trait-methods.html
+   [refining]: https://rust-lang.github.io/rfcs/3245-refined-impls.html
+
+   ```rust
+   trait DynSafeTrait {
+       fn dummy_function_for_typechecking() where Self: Sized {}
+   }
+
+   // macro-generated impl
+   impl DynSafeTrait for Thing {
+       fn dummy_function_for_typechecking() {
+           // We want to check this in a context where Thing is not
+           // necessarily Sized.
+           let _: WhateverCheck<Field<Self>>;
+       }
+   }
+   ```
+
+   Finally, while the dummy function workaround has been discussed as applying
+   to the case of marker traits like `Eq` which otherwise contain no functions
+   that a macro could stick typechecking code into, consider that this RFC can
+   be valuable more generally than that. In traits that contain a large,
+   consistent set of signatures that a macro might want to implement all using
+   the same codepath (think of [syn::visit::Visit] with a macro that forwards
+   every visit function to a nested visitor), singling out one of those for
+   the macro to stick its extra typechecking code into can be awkward. Would
+   such traits also be expected to supply a
+   `fn dummy_function_for_typechecking`?
+
+   [syn::visit::Visit]: https://docs.rs/syn/latest/syn/visit/trait.Visit.html
+
+3. **Just do everything through where-clauses.**
+
+   This is a surprisingly feasible outside-the-box alternative.
+
+   A suggestion frequently made is that macros like `derive(Eq)` on a struct
+   like the following:
+
+   ```rust
+   pub struct Thing {
+       field: Field,
+   }
+   ```
+
+   should not expand to this kind of thing:
+
+   ```rust
+   impl ::core::cmp::Eq for Thing {
+       #[doc(hidden)]
+       #[coverage(off)]
+       fn assert_receiver_is_total_eq(&self) {
+           let _: ::core::cmp::AssertParamIsEq<Field>; // AssertParamIsEq
+       }
+   }
+   ```
+
+   but rather to this:
+
+   ```rust
+   impl ::core::cmp::Eq for Thing
+   where
+       Field: Eq,
+   {}
+   ```
+
+   In both cases, those generated trait impls compile successfully if `Field`
+   implements `Eq`, and fail to compile if `Field` does not implement `Eq`.
+
+   In the past this has been more problematic than today. Namely, until Rust
+   1.59, this was liable to fail with _"private type in public interface"_
+   errors.
+
+   Remaining reasons this approach is not generally applicable are: _"overflow
+   evaluating the requirement"_ errors in the case of co-recursive data
+   structures, and _"type annotation needed"_ errors in certain cases involving
+   lifetimes due to a longstanding compiler bug. See [dtolnay/syn#370].
+
+   [dtolnay/syn#370]: https://github.com/dtolnay/syn/issues/370
+
+4. **Is free const underscore not sufficient?**
+
+   Let's go through a series of decreasingly naïve ways that one might try to
+   implement a correct `derive(Eq)` using free const underscore, without
+   associated const underscore. If "implied bounds" are already on your mind at
+   this point, you have predicted where this is heading.
+
+   With this as the macro input:
+
+   ```rust
+   pub struct Thing {
+       field: Field,
+   }
+   ```
+
+   One might expect that we can emit:
+
+   ```rust
+   impl ::core::cmp::Eq for Thing {}
+
+   const _: () = {
+       let _: ::core::cmp::AssertParamIsEq<Field>;
+   };
+   ```
+
+   and indeed this works. But only because generic parameters are not involved.
+   Let's try it with generics:
+
+   ```rust
+   pub struct Thing<T> {
+       field: Field<T>,
+   }
+   ```
+
+   Today in stable Rust, `const` cannot be generic (there is an experimental
+   implementation in the compiler, but no RFC yet; see [rust#113521]). Instead
+   we'll use a function to introduce appropriately bounded generic parameters.
+   But we also keep a surrounding underscore constant to avoid needing to pick
+   a unique function name that won't conflict with other uses of `derive(Eq)`
+   in the same scope.
+
+   [rust#113521]: https://github.com/rust-lang/rust/issues/113521
+
+   ```rust
+   const _: () = {
+       fn assert_fields_are_total_eq<T: ::core::cmp::Eq>() {
+           let _: ::core::cmp::AssertParamIsEq<Field<T>>;
+       }
+   };
+   ```
+
+   So far so good, but let's try the same thing with lifetimes in the picture.
+
+   ```rust
+   type Field<'a, T> = &'a mut T;
+
+   // #[derive(Eq)]
+   pub struct Thing<'a, T> {
+       field: Field<'a, T>,
+   }
+
+   const _: () = {
+       fn assert_fields_are_total_eq<'a, T: ::core::cmp::Eq>() {
+           let _: ::core::cmp::AssertParamIsEq<Field<'a, T>>;
+       }
+   };
+   ```
+
+   This fails to compile because of a missing `T: 'a` implied bound. The
+   implied bound originates from code that is not visible to the macro
+   implementation, so it is hopeless for the macro to produce a correct
+   explicit bound in this situation.
+
+   ```console
+   error[E0309]: the parameter type `T` may not live long enough
+    --> src/lib.rs:9:16
+     |
+   8 |     fn assert_fields_are_total_eq<'a, T: ::core::cmp::Eq>() {
+     |                                   -- the parameter type `T` must be valid for the lifetime `'a` as defined here...
+   9 |         let _: ::core::cmp::AssertParamIsEq<Field<'a, T>>;
+     |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ...so that the type `T` will meet its required lifetime bounds
+     |
+   help: consider adding an explicit lifetime bound
+     |
+   8 |     fn assert_fields_are_total_eq<'a, T: ::core::cmp::Eq + 'a>() {
+     |                                                          ++++
+   ```
+
+   Instead of an explicit bound, we can try to arrange for a suitable implied
+   bound to get put in, by making an unused argument of type `Self` appear in
+   scope.
+
+   ```rust
+   const _: () = {
+       fn assert_fields_are_total_eq<'a, T: ::core::cmp::Eq>(_: &Thing<'a, T>) {
+           let _: ::core::cmp::AssertParamIsEq<Field<'a, T>>;
+       }
+   };
+   ```
+
+   This works. Though notice we can't exactly use `Self`; the type needs to be
+   spelled out. Also if `Self` appears in the type of one of the fields, that
+   would also need to be substituted with the right spelled-out type name.
+
+   ```rust
+   pub struct Thing<T> {
+       buf: <Self as Buffered<T>>::Buf,
+   }
+
+   const _: () = {
+       fn assert_fields_are_total_eq<T: ::core::cmp::Eq>(_: &Thing<T>) {
+           let _: ::core::cmp::AssertParamIsEq<<Thing<T> as Buffered<T>>::Buf>;
+       }                                        ^^^^^^^^
+   };
+   ```
+
+   "Replacing `Self`" like this looks simple but is fiendish to handle
+   correctly. It cannot be done correctly on the token level because different
+   appearances of `Self` in a type can refer to different types. In the
+   following example, `Self` is used twice within the definition of `Struct`
+   and substituting both with `Struct` would break the meaning of the program.
+
+   ```rust
+   pub struct Struct {
+       pub header: [u8; {
+           struct Nested(Option<Box<Self>>);
+           Self::K + mem::size_of::<Nested>()
+       }],
+       pub rest: [u8],
+   }
+
+   impl Struct {
+       const K: usize = 1;
+   }
+
+   fn main() {
+       let _: fn(&Struct) -> &[u8; 9] = |s| &s.header;
+   }
+   ```
+
+   The `async-trait` crate has [172 lines of logic][async-trait] dedicated to
+   "replacing Self". The `serde_derive` crate has [292 lines][serde_derive].
+   Async-trait has had at least 13 bugs involving the replacement of `Self`,
+   affecting real-world non-contrived code. This is not a thing that typical
+   procedural macros should be expected to implement.
+
+
+
+   [async-trait]: https://github.com/dtolnay/async-trait/blob/0.1.74/src/receiver.rs
+   [serde_derive]: https://github.com/serde-rs/serde/blob/v1.0.192/serde_derive/src/internals/receiver.rs
+
+   Let's try avoiding needing to handle `Self` replacement by moving the
+   typechecking code into an `impl` block.
+
+   ```rust
+   // #[derive(Eq)]
+   pub struct Thing {
+       field: Field,
+   }
+
+   impl Thing {
+       #[doc(hidden)]
+       #[allow(dead_code)]
+       #[coverage(off)]
+       fn __assert_fields_are_total_eq() {
+           let _: ::core::cmp::AssertParamIsEq<Field>;
+       }
+   }
+   ```
+
+   For the library ecosystem, this isn't terrible, though needing to pick a
+   name for the hidden function that won't conflict with other macro-generated
+   code is annoying. Consider the case where a macro might be applied multiple
+   times to the same data structure, such as to generate two `AsRef` impls
+   with different target types.
+
+   For the standard library's derive macros I think this expansion is not
+   viable. The reason is we'd have no way to mark that generated associated
+   function as being a standard library implementation detail (`#[unstable]`)
+   as we would ordinarily want to do.
+
+   Here is a way to work around both issues: eliminating conflicts between
+   different expansions, and avoiding inserting junk APIs into the caller's
+   code.
+
+   ```rust
+   impl ::core::cmp::Eq for Thing {}
+
+   const _: () = {
+       trait __AssertFieldsAreTotalEq {
+           fn assert_fields_are_total_eq();
+       }
+       impl __AssertFieldsAreTotalEq for Thing {
+           fn assert_fields_are_total_eq() {
+               let _: ::core::cmp::AssertParamIsEq<Field>;
+           }
+       }
+   };
+   ```
+
+   As far as I know, this expansion is able to accomplish all technical
+   objectives. I considered making a PR to make `derive(Eq)` take this
+   approach, but if possible, going straight to the associated const underscore
+   proposed by this RFC would be preferable.
+
+   ```rust
+   impl ::core::cmp::Eq for Thing {}
+
+   impl Thing {
+       const _: () = {
+           let _: ::core::cmp::AssertParamIsEq<Field>;
+       };
+   }
+   ```
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+https://doc.rust-lang.org/1.73.0/reference/items/constant-items.html#unnamed-constant
+
+```diff
+- Unlike an associated constant, a free constant may be unnamed by using an
++ A free constant or associated constant may be unnamed by using an
+  underscore instead of the name. For example:
+```
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+The implementation pretty much follows the implementation of free const
+underscore, which has been working well.
+
+The following details are called out as being worth testing:
+
+1. Unlike ordinary associated constants, multiple associated const underscore
+   items are permitted to co-exist on the same `Self` type.
+
+   ```rust
+   struct Struct<T>(T);
+
+   impl<T> Struct<T> {
+       const _: () = ();
+
+       const _: i16 = 0; // not a conflict
+   }
+
+   impl<T> Struct<T> {
+       const _: () = (); // not a conflict
+   }
+   ```
+
+2. Although associated const underscore does not add any externally accessible
+   API to a type, a visibility specification is still allowed on it. As with
+   any other associated constant, of the 3 visibilities {receiver's visibility,
+   constant's visibility, constant's type's visibility}, you get a warning if
+   the constant's type's visibility is the strictly lowest one.
+
+   ```rust
+   pub struct Public;
+
+   struct Private;
+
+   impl Public {
+       pub const _: Private = Private; // warn(private_interfaces)
+   }
+
+   impl Public {
+       const _: Private = Private; // no warning
+   }
+
+   impl Private {
+       pub const _: Private = Private; // no warning
+   }
+   ```
+
+3. The `Self` type of the impl must be local to the crate containing the impl.
+
+   ```rust
+   impl std::thread::Thread {
+       const _: () = {}; // not allowed
+   }
+
+   struct Local;
+   impl &Local {
+       const _: () = {}; // although &T is #[fundamental], this is not allowed
+   }
+   ```
+
+4. This RFC does not propose const underscore for inclusion as a trait item.
+
+   ```rust
+   trait Trait {
+       const _: (); // not allowed
+   }
+   ```
+
+5. This RFC does not propose const underscore inside trait impls.
+
+   ```rust
+   trait Trait {}
+
+   impl Trait for Type {
+       const _: () = {}; // not allowed
+   }
+   ```
+
+6. The underscore const's value is evaluated in exactly the situations that an
+   ordinary named associated constant would be evaluated. Named associated
+   constants are evaluated when accessed. Underscore associated constants
+   cannot be accessed, so they are never evaluated, only typechecked.
+
+   ```rust
+   pub struct Unit;
+
+   impl Unit {
+       const K: () = assert!(false); // no error
+       const _: () = assert!(false); // no error
+   }
+
+   pub struct Generic<T>(T);
+
+   impl<T> Generic<T> {
+       const K: () = assert!(mem::size_of::<T>() % 2 == 0); // no error
+       const _: () = assert!(mem::size_of::<T>() % 2 == 0); // no error
+   }
+
+   fn main() {
+       let _ = Unit; // no error
+       let _ = Generic([0u8; 3]); // no error
+
+       let _ = Unit::K; // error
+       let _ = Generic::<[u8; 3]>::K; // error
+   }
+   ```
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+None identified. This is a logical combination of 2 language features that the
+Rust Reference needs to go out of its way to identify as being disallowed.
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+The do-nothing alternative is worth examining for the following reason: **unlike
+RFC 2526 (free const underscore), this RFC does not add expressiveness.**
+
+That previous RFC was exceedingly well motivated by use cases that were
+genuinely impossible to solve prior to the language change. Some examples
+include [inventory#8] and [static\_assertions].
+
+[inventory#8]: https://github.com/dtolnay/inventory/issues/8
+[static\_assertions]: https://github.com/rust-lang/rust/issues/54912#issuecomment-480594120
+
+Meanwhile this RFC only makes a use case easier to express than it was before,
+by removing a spurious limitation of 2 language features not working together
+(associated constants and const underscore). As demonstrated near the bottom of
+the Motivation, the following proposed use of associated const underscore:
+
+```rust
+impl SelfType {
+    const _: Something = {/* ... */};
+}
+```
+
+is substantially equivalent to the following already legal syntax:
+
+```rust
+const _: () = {
+    trait __SomeUniqueEnoughName {
+        const K: Something;
+    }
+    impl __SomeUniqueEnoughName for SelfType {
+        const K: Something = {/* ... */};
+    }
+};
+```
+
+The former is something that I think would be great to convert the standard
+library's `derive(Eq)` to as soon as available. The latter is something that
+would be a hard sell despite advantages over the current less-verbose expansion
+of `derive(Eq)`.
+
+# Prior art
+[prior-art]: #prior-art
+
+None identified.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+- [ ] When does associated const code get run? Eagerly at type definition? When
+  substituting concrete types into generic arguments? Never?
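As a point of comparison for the question of when associated const code runs, here is a minimal sketch of how *named* associated constants in a generic impl behave on stable Rust today: the assertion is evaluated only for the concrete types that some code actually names. The identifiers `Generic`, `SIZE_IS_EVEN`, `force_even`, and `make_odd` are illustrative and not taken from any library.

```rust
use core::mem;

pub struct Generic<T>(pub T);

impl<T> Generic<T> {
    /// Evaluated only when some code names it for a concrete `T`.
    pub const SIZE_IS_EVEN: () = assert!(mem::size_of::<T>() % 2 == 0);
}

/// Names the constant for `[u8; 4]`, whose size is even, so this compiles.
pub fn force_even() {
    let () = Generic::<[u8; 4]>::SIZE_IS_EVEN;
}

/// Constructing a value never touches the constant, so an odd-sized
/// payload is accepted here.
pub fn make_odd() -> Generic<[u8; 3]> {
    Generic([0u8; 3])
}

// Uncommenting this would fail the build with a const-evaluation error,
// because it names the constant for an odd-sized `T`:
// pub fn force_odd() {
//     let () = Generic::<[u8; 3]>::SIZE_IS_EVEN;
// }
```

Since an underscore constant can never be named, this access-driven rule would leave its body typechecked but never evaluated, which is the behaviour the reference-level explanation describes.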
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+- Consider lifting the restriction that the `Self` type of the impl must be
+  local.
+
+  Associated const underscore does not add any externally accessible API to a
+  type, so I wonder whether there is a strong rationale for limiting it to
+  local types. I believe I have had cases that would have benefited from
+  having associated const underscore on an arbitrary type, but I have not
+  aggregated the justification for supporting this. I will consider RFC-ing
+  this separately with a strong justification.
+
+Separately, refer to the "Possible future work" section of the stabilization
+proposal for the original const underscore, of which this RFC is one part.
+https://github.com/rust-lang/rust/pull/61347#issuecomment-497533585
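On the first future possibility (non-local `Self` types), it may help to see that the trait-based workaround from the Rationale section is not subject to the locality restriction today, because a crate-local trait may be implemented for a foreign type. The following is only a sketch; the trait name `__TypecheckWithSelf` is hypothetical.

```rust
use std::thread::Thread;

// Already legal: the trait is local, so implementing it for a foreign type
// passes the orphan rules, and the body is typechecked with `Self = Thread`.
const _: () = {
    #[allow(dead_code)]
    trait __TypecheckWithSelf {
        const K: usize;
    }
    impl __TypecheckWithSelf for Thread {
        const K: usize = core::mem::size_of::<Self>();
    }
};

// By contrast, an inherent impl on a foreign type is rejected today (E0116),
// and this RFC proposes keeping that restriction for `const _`:
// impl Thread {
//     const _: () = {};
// }
```

This suggests that lifting the restriction would mostly be a convenience, in the same way that the main proposal is a convenience over the trait-based expansion.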