Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weโ€™ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Supertrait item shadowing #2845

Merged
merged 1 commit into from
Sep 21, 2021
Merged

RFC: Supertrait item shadowing #2845

merged 1 commit into from
Sep 21, 2021

Conversation

lcdr
Copy link
Contributor

@lcdr lcdr commented Jan 6, 2020

๐Ÿ–ผ๏ธ Rendered

Pre-RFC thread

๐Ÿ“ Summary

Change item resolution for generics and trait objects so that a trait bound does not bring its supertraits' items into scope if the subtrait defines an item with this name itself.

This makes it possible to write the following, which is currently rejected by the compiler due to ambiguity:

trait Super {
    fn foo(&self);
}

trait Sub: Super {
    fn foo(&self);
}

fn generic_fn<S: Sub>(x: S) {
    x.foo();
}

@mcarton
Copy link
Member

mcarton commented Jan 6, 2020

I might be surprising that the following functions

fn generic_fn<S: Super>(x: S) {
    x.foo();
}

fn use_trait_obj(x: Box<dyn Super>) {
    x.foo();
}

would call Super::foo rather than Sub::foo, especially for people coming from OO languages expecting some kind of overriding to happen here.

I also always have hated method hiding in C#, which I've only ever seen to cause problems.

@lcdr
Copy link
Contributor Author

lcdr commented Jan 6, 2020

The code you posted is actually not affected by this RFC.

The generic function already compiles today, and resolves foo to Super::foo even when something implementing Sub is passed. As an example:

impl Super for i32 {
    fn foo(&self) { println!("super"); }
}
impl Sub for i32 {
    fn foo(&self) { println!("sub"); }
}

fn generic_fn<S: Super>(x: S) {
    x.foo();
}
fn sub_generic_fn<S: Sub>(x: S) {
    generic_fn(x);
}

fn main() {
    let x: i32 = 42;
    sub_generic_fn(x);   // prints "super"
}

The trait object case is a bit different. It fails to compile, but not because of ambiguity, but because Rust doesn't have trait object upcasting.

fn use_trait_obj(x: Box<dyn Super>) {
    x.foo();
}
fn sub_use_trait_obj(x: Box<dyn Sub>) {
    use_trait_obj(x); // error: expected trait `Super`, found trait `Sub`
}

The behavior in these cases is not changed by this RFC.

I'm afraid that in this sense, the decision of shadowing vs overriding has already been made by current Rust. Changing it would be backwards-incompatible, as it would change the existing behavior of generic functions.

I've noted the potential for confusion for OOP users in the drawbacks section of the RFC. Unfortunately, this potential for confusion already exists in Rust today.

@burdges
Copy link

burdges commented Jan 6, 2020

In the next edition, we should make a subtrait shadowing any name from a super trait become a hard error, so this code should not compile:

trait Super {
    fn foo(&self);
}

trait Sub: Super {
    fn foo(&self);
}

@lcdr
Copy link
Contributor Author

lcdr commented Jan 7, 2020

Note that a name conflict situation doesn't always arise from a subtrait adding a method with the same name as in the supertrait, but can arise when a supertrait adds a method that already happens to be in a subtrait. This fragile base class problem is the main motivation for this RFC, which aims to avoid breakage in this case.

Making name conflicts a hard error would instead make the breakage worse than it is currently. Today, a supertrait adding a conflicting method is a breaking but minor change, since users can disambiguate using UFCS. With a hard error, this would not be possible, and the subtrait would be forced to change its method's name, which is a major breaking change.

@burdges
Copy link

burdges commented Jan 8, 2020

I see, fair enough. It should then warn whenever you call the method without using UFCS with the subtrait in scope.

@lcdr
Copy link
Contributor Author

lcdr commented Jan 8, 2020

Right now not using UFCS will raise a hard error. From context, I understand you're not proposing to replace the error with a warning, but are more interested in avoiding code that is not explicitly disambiguated. I'd be interested to hear your rationale for this.

Consider the following situations, as laid out in the RFC:

fn generic_fn<S: Sub>(x: S) {
    x.foo();
}

In this case it seems quite intuitive that x.foo() would be resolved to Sub::foo rather than Super::foo, as Sub is the trait mentioned in the bound. After all, this is the way it already works for other items of Sub that are not in conflict with Super.

Perhaps you're more concerned about a more ambiguous situation, such as both Sub and Super explicitly being mentioned in the bound:

fn generic_fn<S: Sub+Super>(x: S) {
    x.foo();
}

Under this RFC, shadowing would not apply in this case, as Super is explicitly brought into scope. x.foo() does not resolve and results in a compile error asking you to use UFCS to disambiguate.

This is just a brief summary, the RFC text discusses a few more cases, if I didn't list the one you're concerned about.

@burdges
Copy link

burdges commented Jan 8, 2020

If I understand the current situation, if you bring the subtrait into scope then calling foo without UFCS is a hard error, but this gets relaxed when using generic functions, which act like only the named trait is in scope.

At first blush, I'd suggest making the generic function situation stricter so that it matches the non-generic function situation. It's tricky however what if I write:

fn generic_fn<S: ::bar::Super>(x: S) {
    x.foo();
}
fn sub_generic_fn<S: ::baz::Sub>(x: S) {
    generic_fn(x);
}

I'd say this should hard error without UFCS, even without the call to generic_fn from sub_generic_fn. Yet, returning this error requires considering all traits brought into scope anywhere within the module.

As an aside, we could error if sub_generic_fn called x.foo() if Super adding foo transforms

trait Sub: Super { fn foo(&self); }

into

trait Sub: Super { fn foo(this: &Self); }

but that's insufficient for your example. I'm unsure how aggressive rustc should fight this, but maybe clippy could go further?

I'd be interested to hear your rationale for this.

Ambiguity makes reading code almost impossible, which makes correct code almost impossible.

I'd maybe phrase the compromise goal like: If a human reads the code without compiler assistance, and finds one valid resolution, then either (a) they found the only valid resolution, or (b) the resolution they found explicitly indicates where one searches for other valid resolutions.

We push (b) fairly heavily already:

  • Add<LHS>-like traits get confusing, but the LHS acts like a warning satisfying (b).
  • Inherent methods overriding trait methods gets confusing, but falls under (b) if we accept that readers should check the inherent blocks first.
  • default methods in traits fall under (b) in that you must check the actual impl
  • specialization expands (b) but only with the default keyword.

All these (b) cases become quite painful when identifying the applicable impl requires careful type checking.

I think you're proposal fails (b), and so does the current behavior maybe, because the incoherence lacks any warnings. I do think your note about favoring the supertrait improves this considerably, except many traits come with numerous supertraits, and devs from OO languages expect the dangerous opposite, like @mcarton notes.

@lcdr
Copy link
Contributor Author

lcdr commented Jan 9, 2020

If I understand the current situation, if you bring the subtrait into scope then calling foo without UFCS is a hard error, but this gets relaxed when using generic functions, which act like only the named trait is in scope.

My reply to mcarton may have confused you here. mcarton's example is not actually affected by this RFC. To summarize current behavior:

  • For both generic functions and trait objects, calling x.foo() with a bound on Sub is an error and requires UFCS.
  • For both generic functions and trait objects, calling x.foo() with a bound on Super will resolve to Super::foo.
  • The only difference between them is that trait objects don't support upcasting, which is not the subject of this RFC.

I'd say this should hard error without UFCS, even without the call to generic_fn from sub_generic_fn. Yet, returning this error requires considering all traits brought into scope anywhere within the module.

Again, this is unaffected by and unrelated to the RFC. This RFC only discusses what should happen for item resolution when the subtrait is in scope.

I'd maybe phrase the compromise goal like: If a human reads the code without compiler assistance, and finds one valid resolution, then either (a) they found the only valid resolution, or (b) the resolution they found explicitly indicates where one searches for other valid resolutions.

I agree with you, a human should be able to find the resolution that will be chosen by the compiler without compiler assistance. To do this, a human will need to look at a list of places that might contain an item with the name they are looking for. The important part is that this RFC should not make this lookup any more complex than it already is.

Let's consider the lookup order for code that compiles and resolves to a clear implementation:

To find the resolution for an item, a user needs to:

  • Go through each trait mentioned in the trait bound. For each trait:
    • If the trait contains a matching item, this will be the one chosen by the compiler. Done.
    • If not: recursively consider each of the trait's supertraits.

That's all the user needs to do today. Note that inherent impls aren't even necessary to consider, because generics/trait objects already abstract over the specific type.

This RFC actually doesn't change this 'algorithm' at all, it just makes more situations compile than before. Specifically, it will allow situations where sub- and supertraits have conflicting item names. Since subtraits shadow supertraits, you still only need to go up the hierarchy until you find a matching name. This name will always be the one that will be chosen.

Therefore the user doesn't need to change their way of reasoning, and the newly allowed situations will intuitively make sense to a user who is used to this lookup order.
Note that this only works because subtraits shadow supertraits, if it were the other way around, it would add complexity.

@burdges
Copy link

burdges commented Jan 9, 2020

I understand now. I like the prejudice towards the explicit bounds, meaning T: Sub+Super hard errors. I also like that T: Sub always resolves to a method in Sub unless no such method exists.

In this vein, I'm now curious about this case where trait Sub : Super1 + Super2 { } but both Super2 and Super3 introduce a name collision, but not with Sub. We cannot choose one as both are named, not even if Super1 : Super2, so we should make that name require UFCS, right?

After this change, Sub could fix that name collision by adding the name itself, but doing so makes future name collisions more likely if anyone does T: Sub + Super1, but maybe not such a concern.

@lcdr
Copy link
Contributor Author

lcdr commented Jan 9, 2020

Glad I was able to clear up the confusion.

You're completely right, if multiple supertraits are in conflict with each other, it will still be ambiguous. In this case this RFC will require UFCS. See here for more.

You're also right that this RFC makes it possible for the subtrait to manually fix the collision, as detailed here.

It's true that this would in turn make collisions more likely for Sub + Super1, but this isn't much of an issue. Since Sub implies Super1 anyways, there isn't much reason to include both in the bound, and if both happen to be included, Super1's mention can be removed without problems.

@Nemo157
Copy link
Member

Nemo157 commented Jan 9, 2020

One situation I havenโ€™t seen mentioned is when a method already exists on the super trait, then a method shadowing it is added to the sub trait.

Given code exists like this (probably with each item spread across a set of three crates):

trait Super {
    fn foo(&self);
}

trait Sub: Super {
}

fn generic_fn<S: Sub>(x: S) {
    x.foo();
}

It would currently compile. Currently if Sub were to add a foo method this would cause an ambiguity error in generic_fn, causing the maintainer to fix that via UFCS. With the change proposed in this RFC it would instead change to calling the shadowed method, which may result in a confusing compile error if the signatures donโ€™t match, or a silent change in behaviour if they are close enough.

@Pauan
Copy link

Pauan commented Jan 10, 2020

@Nemo157 That is true, but adding a new method onto a trait is already a semver breaking change, so it would require a major version bump to the crate which defines Sub. So there's no problem.

But the situation is more complicated once you add in default methods and sealed traits. So if this RFC is accepted then the semver requirements would probably have to be changed: adding a method which overrides a super method is now always a major breaking change.

@lcdr
Copy link
Contributor Author

lcdr commented Jan 10, 2020

Thanks for pointing out this situation, I hadn't considered the implications for semver in this case.

I agree with Pauan, this should be fine for non-defaulted items due to semver, and would require an amendment to the semver rules for defaulted items.

If I understand the semver rules correctly, it would currently be classified as minor, since you could have disambiguated using UFCS, but this doesn't seem like a good practice when it can result in silent changes in behavior. So this should probably be amended to be a major change.

However, the amendment should likely be more specific than "added shadowing of existing methods is a major change": As mentioned above, there's a situation in which a subtrait can fix an introduced collision between supertraits by shadowing and manually redirecting to the desired supertrait. This kind of fixing should be allowed as a minor change, since it is intended to keep existing code compatible.

@Centril Centril added A-trait-object Proposals relating to trait objects. T-lang Relevant to the language team, which will review and decide on the RFC. A-traits Trait system related proposals & ideas labels Mar 1, 2020
@Centril
Copy link
Contributor

Centril commented Mar 2, 2020

Some notes:

  • I'm skeptical that the problem outlined in this RFC is substantial enough that it needs to be solved in practice. I think some evidence is required to substantiate that it is in practice.

    • In particular, a convincing case is not made that this is a problem that should be prioritized.
  • The RFC assumes that the notion of "super trait" is commonly known, and does not define it. This leaves me wondering how e.g., trait Sub where Self: Super { ... } would be affected.

  • Trait aliases are not discussed at all. It would be good to elaborate on how e.g. fn foo<T: Alias>(...) { ... } would work.

@lcdr
Copy link
Contributor Author

lcdr commented Mar 2, 2020

I'm skeptical that the problem outlined in this RFC is substantial enough that it needs to be solved in practice. I think some evidence is required to substantiate that it is in practice.
In particular, a convincing case is not made that this is a problem that should be prioritized.

I agree that some evidence would make the case more compelling. However, due to the nature of the situation, it does not appear like there is a simple way of assessing the impact of the problem.

Quoting my post on the Pre-RFC thread, in response to a possible crater run:

Any crate that has actually run into the outlined situation either doesn't compile and wouldn't be uploaded to crates.io, or has worked around the situation by renaming a method or removed a supertrait (accepting a breaking change), which can't be detected because this also could have happened for other reasons. It also doesn't consider the possibility that some crates could have intentionally avoided introducing a supertrait to avoid running into the situation.

It's also not an option to scan for crates that are "at risk" of being affected by the outlined situation, because it could happen anywhere a public trait uses a supertrait from another crate (even including the standard library). Using a supertrait from std or a different crate is probably quite common (and would be completely fine unless/until the supertrait adds a conflicting item), so a scan wouldn't really help much.

Therefore, I'm not aware of a way of gathering statistical evidence for determining this problem's impact. Without this option, the only evidence I can think of would be authors of high-profile crates finding this RFC and mentioning that this problem has affected them before. This is however quite chance-based and I'm not sure whether this would be counted as evidence ("anecdotes are not data").
If someone knows of a good way to gather evidence in this case, I'd be happy to hear it.


The RFC assumes that the notion of "super trait" is commonly known, and does not define it.

The RFC is using the normal notion of "supertrait" as specified in the reference and the book. I thought those sources could be assumed as commonly known, but if you think referencing them explicitly would improve clarity, I'd be happy to add links.

This leaves me wondering how e.g., trait Sub where Self: Super { ... } would be affected.

The reference specifies this form to be the same as the supertrait shorthand:

Supertraits are declared by trait bounds on the Self type of a trait and transitively the supertraits of the traits declared in those trait bounds.

and then demonstrates the equivalence using examples.

This RFC does not change these semantics, a supertrait is still declared as a trait bound on the Self type of a trait, and the behavior would be the same for both forms. Intuitively speaking, since the supertrait is explicitly named in the bound, manual disambiguation between subtrait and supertrait is necessary (this reasoning is used throughout the RFC).


Trait aliases are not discussed at all. It would be good to elaborate on how e.g. fn foo<T: Alias>(...) { ... } would work.

I see, I hadn't considered unstable features such as trait aliases.

However, I do not believe trait aliases would pose a problem, since they don't define new items themselves. Therefore trait aliases would be substituted before this RFC's semantics would apply. E.g:

trait Super      { foo(&self) {} }
trait Sub: Super { foo(&self) {} }

trait SubAlias = Sub;
trait SubSuperAlias = Sub + Super;

fn foo1<T: SubAlias>     (t: T) { t.foo() } // ok, resolves to Sub::foo
fn foo2<T: SubSuperAlias>(t: T) { t.foo() } // error, disambiguation needed

(Note: I'm not sure if you wanted this elaboration as a comment or added to the RFC, I can add it as well.)

@Centril
Copy link
Contributor

Centril commented Mar 6, 2020

I agree that some evidence would make the case more compelling. However, due to the nature of the situation, it does not appear like there is a simple way of assessing the impact of the problem.

Yeah I agree it would be difficult, but I think, and I believe you agree, that the onus is on the RFC to demonstrate that the status quo is sufficiently problematic to warrant language changes and the additional complexity that follows.

To elaborate on my skepticism around "sufficiently problematic", I believe that:

  • Name collisions across sub/super traits are improbable to start with. The cases where I've seen them is e.g. Iterator vs. Itertools, and in those cases each party is in touch with the other side and can coordinate.
  • Collisions can be avoided by the supertrait's owner by skimming reverse dependencies on crates.io, and tooling could be implemented to automate this check (e.g. in the style of https://github.com/rust-dev-tools/rust-semverver).
  • Worst case if there is a collision and the ecosystem breakage is notable then the change to the supertrait can be yanked.

The RFC is using the normal notion of "supertrait" as specified in the reference and the book. I thought those sources could be assumed as commonly known, but if you think referencing them explicitly would improve clarity, I'd be happy to add links.

It's better to disambiguate and specify the vocabulary used. A reasonable person could infer that where Self: Super is not a supertrait constraint. To be more explicit: I believe this RFC proposes that the constraints of a trait should have lower priority wrt. resolving associated items than the trait's own associated items.

(Note: I'm not sure if you wanted this elaboration as a comment or added to the RFC, I can add it as well.)

I agree with your elaboration that there shouldn't be a problem. In general, prefer adding things to the text of an RFC as well.

@lcdr
Copy link
Contributor Author

lcdr commented Mar 8, 2020

I think, and I believe you agree, that the onus is on the RFC to demonstrate that the status quo is sufficiently problematic to warrant language changes and the additional complexity that follows.

I agree. I'll try to make my case for this change in the following.

Name collisions across sub/super traits are improbable to start with.

This may not necessarily be the case. While name collisions across two unrelated traits are very rare, sub/supertraits are typically related in functionality. This makes a collision much more likely.

The cases where I've seen them is e.g. Iterator vs. Itertools, and in those cases each party is in touch with the other side and can coordinate.

Actually, Itertools is a perfect example for this, thanks for bringing it up. I'll write the following in RFC style so that it's already close to any future RFC amendment (I don't want to lecture you about something you were part of yourself, this is for the general reader who doesn't have any prior knowledge):

Subtraits often narrow the scope of the supertrait and offer additional functionality specific to this narrower scope. In some cases, this additional functionality may later be found to be more general and uplifted to the supertrait. Ideally the method could be copied as is, but the current name collision behavior prevents that.

A similar situation may also arise when the supertrait is in a different crate that can't be modified directly (e.g. the bureaucracy of an upstream change would be too much, the maintainer is against the change, the stabilization process would take a long time but the additional functionality is needed sooner, etc). In this case it would be possible to work around the situation by defining an extension trait in the local crate which is a subtrait of the supertrait, and blanket-implementing it for all supertrait implementors. This way, the functionality can be used by adjusting signatures to use the extension trait.

Later on, the functionality may be found to be useful in general (or the upstream crate's stabilization process may have completed) and the additional functionality is added to the upstream crate. Similarly to the above case, this breaks the downstream crate because of the name collision.

As an example:

Let's assume you have a vector of vectors containing items:

let data = vec![vec![1, 2, 3, 4], vec![5, 6]];

You'd like to iterate over the elements directly, without having to consider the inner vectors. That is, you'd like the iteration sequence to be [1, 2, 3, 4, 5, 6].

Now, some smart Rust developer notices that this functionality would be possible to add to Iterator via a flatten method. Since they can't modify Iterator directly, they create their own trait:

trait IteratorExt : Iterator {
	fn flatten(self) -> Flatten<Self>
	where Self: Sized, Self::Item: IntoIterator {
		Flatten::new(self)
	}
}

// definition of Flatten elided here for brevity

impl<T: ?Sized> IteratorExt for T where T: Iterator { }

and then blanket-implement IteratorExt for all iterators. The developer publishes it as a library on crates.io, so that others can use the added functionality as well.

The crate becomes quite popular and a lot of people come to use it. At some point, the Rust maintainers notice the usefulness of a flatten method, and that it can be added to the standard library backwards-compatibly. The method gets added to std's Iterator.

However, IteratorExt uses Iterator as a supertrait, and as a result, IteratorExt breaks on the new Rust versions with native support for flatten. Since a lot of crates have the IteratorExt crate as a dependency, the result is widespread ecosystem breakage. The breakage is especially unfortunate in this case since the upstream is Rust itself, and therefore users of IteratorExt will be affected if they simply update their Rust toolchain. This also includes cases where the IteratorExt crate is only a transitive dependency.

The above example is not hypothetical, this seems to be what happened with the itertools crate, with other collisions only avoided by careful name choices. Luckily, in this case the breakage was minor, but the general scenario remains a problem.

Note that while a collision can be avoided through collaboration, this is a burden on the maintainers of both upstream and downstream. In some cases a good alternative name might not exist, which means that breakage would be inevitable. Collaboration may also not always be possible, e.g. if the downstream crate is unmaintained.

While it's possible in theory to scan crates.io for reverse dependencies with potential conflicts, this is not a practical solution. Scanning would require downloading each crate, finding subtraits, and checking their items for collisions. This is cumbersome enough that most maintainers would not find it worth the effort to do this check. If the upstream is Rust itself, the "reverse dependencies" would in effect be every crate on crates.io, making such a manual check virtually impossible.

While a tool could be created to automate this check, it would still require that maintainers actively use it to check for breakage before publishing their changes. This cannot be relied upon.

If a breaking change has already been published, it would also not help to yank the crate from crates.io. Since yanking does not prohibit downloading crates that have already been added to a lockfile, this would not fix the issue for people who have already experienced breakage. It would be necessary to publish a new crate version with the change reverted, or the item renamed to avoid the collision. However, according to semver policies, this would be a major breaking change (renaming or removing an existing item). In contrast, the change that added the item is classified as minor (adding a defaulted item). Crate maintainers may not see this breakage to be reason enough to warrant a major change. They may choose to accept the breakage instead, which is not ideal. And even if they choose to rename the item, the new name could itself collide with other crates again.

It's also important to keep in mind that crates.io is not everything. The issues described above may just as well occur with crates that haven't been published to crates.io. It is not possible to scan for these cases, since they aren't public. If automated scanning were to be implemented, I feel like it would easily be taken as an absolute verdict to go ahead with a breaking change. Care should be taken not to delegate private crates to a "second-class citizen" status simply because they haven't been uploaded to crates.io.

Lastly, it also appears to me like the current behavior in Rust was not explicitly intended, but rather was an unfortunate byproduct of the way supertraits are implemented. The current bound implementation flattens the nested hierarchical structure of traits, and as a result loses information that could be useful. It's quite surprising that referring to an item on a trait would fail, simply because a trait higher up somewhere in the hierarchy happens to have an item with the same name.

This RFC would allow fixing this behavior, and eliminate the potential for name collisions across crates as well.

cc @bluss @jswrenn - As maintainers of itertools, it would be interesting to hear your thoughts.

@carbotaniuman
Copy link

I experienced this myself on an application I was developing. A library I used added a defaulted method in a minor update, which completely broke all of my trait calls due to ambiguity.

In my case, the application was large enough that bulk renaming all uses was nontrivial, but small enough that I don't think they would've catered to me. (The update was also already pushed, so ...).

Regarding name collisions, there are many names that just seem like obvious choices, such as invert() or decompose(), and it highly likely that someone else would also come up with your obvious name.

@burdges
Copy link

burdges commented Jun 7, 2020

We cannot enforce semver, except occasionally through build breakage, so semver cannot address the reverse issue raised by @Nemo157 especially if you're avoiding build breakage. Inherent methods also shadow trait methods though, which creates the same issue.

We could address all ergonomics issue with use statements that "edit" or "rename" their trait imports like

pub mod prelude {
    pub use some_crate::{
        TraitX::{*, x as _},  // Import TraitX except hiding TraitX::x
        TraitY::{y},  // Import only TraitY::y not TraitY's other items
    }
}

We might rename traits during subtrait declarations too, aa pub trait TraitZ : TraitX { pub use TraitZ::{z}; .. }, but not sure this helps anything.

We'd be safer with all shadowing forbidden outright, but adding inherent trait implementations and maybe this trait editing.

I'll caution that reexports plus trait editing enables yet another issue: If crate Bar reexports a trait from crate Foo, but then later edits the trait, and provides an inherent method by the same name, then behavior changes. I'm less worried about this because (a) you need not trust reexports if you configure cargo carefully, and (b) you've implicitly expressed above average trust when using reexports. We should also migrate from rustc reexports to safer cargo level reexports, meaning crate foo copies its crate caz dependency from its crate bar dependency, perhaps while adapting namespaces, but that's way off topics.

From context, I understand you're .. interested in avoiding code that is not explicitly disambiguated. I'd be interested to hear your rationale for this.

We've should expect dependencies to increasingly be malware vectors, although so far I only know about NPM cases. We've ample avenues for bugdoors of course, but shadowing provides among the most attacker friendly options.

There is a push to forbid #[no_mangle] under #![forbid(unsafe_code)] and some push to forbid unsafe in dependencies. It'd similarly be natural to forbid shadowing throughout your dependencies.

@lcdr
Copy link
Contributor Author

lcdr commented Jun 8, 2020

If I understand your comment correctly, your primary concern is shadowing introducing a potential vector for malicious code, by allowing dependencies to alter downstream behavior when a method is called. While this is technically possible, I don't see how shadowing substantially changes this problem compared to today: dependencies being able to alter behavior is already possible with every dependency method call, since you are, well, calling dependency code. If we were to consider that a serious malware issue, then every crate using dependencies would be affected today. If you're using dependencies, you generally trust the dependency not to act maliciously.

@burdges
Copy link

burdges commented Jun 8, 2020

Yes, any dependency can alter behavior anywhere in the program, especially using unsafe code. It's also true subtle bugdoor opportunities abound regardless, but an innocent looking bugdoor that passes code review become considerably easier with shadowing though. I now think trait import editing cannot address this problem either though.

I'd kinda suggest that traits "more" in scope should shadow traits "less" in scope, with errors whenever collisions occur, but also permit reimporting to force shadowing.

pub trait A { 
    fn foo() {} 
    fn bar() {
        Self::foo() // Invokes A::foo() since only A in scope
    }
}
pub trait B : A {
    fn foo() {
        use A;
        Self::foo();  // Invokes A::foo() since shadowing permitted by reimportting A over B
    }
    fn baz() {
        Self::foo() // Error, both A and B are fully in scope
    }
}

pub struct SomeB;
impl A for Some B { }
impl B for Some B { }

fn bar() {
    SomeB::foo();  // Error, both A and B are fully in scope, but bar and baz allowed here
}

fn bar() {
    use A;
    SomeB::foo();  // Invokes A::foo() since shadowing permitted by reimportting A over B, bar and baz allowed here
}

We'd handle inherent item shadowing by warning about inherent items calls that shadow traits in scope and adding inherent trait implementations and/or negative import use !A;.

It's possible that'd prove too confusing however, not sure. In any case, we should've lints that forbid shadowing not just within the current crate, but anywhere within its dependency graph. As crates mature they add more "quality" lints, including this one.

@Ixrec
Copy link
Contributor

Ixrec commented Jun 8, 2020

If we want to consider proposals where additional kinds of shadowing are "allowed" but only work when call sites explicitly specify which impl, I don't think giving use these semantics makes any sense. The language already has <SomeB as A>::foo(); syntax for exactly this sort of disambiguation, and that's far clearer.

Regardless of whether we accept this particular RFC, I do think it'd be a good idea to have a lint (presumably starting out as a "pedantic" clippy lint) that detects when there are multiple methods in scope for a method call and suggests using the <Type as Trait> syntax, even if today's rules would've unambiguously picked one of those methods and allowed it to compile. (I did check the clippy lint list and didn't see anything like this already there)

@lcdr
Copy link
Contributor Author

lcdr commented Jun 8, 2020

I'd kinda suggest that traits "more" in scope should shadow traits "less" in scope, with errors whenever collisions occur, but also permit reimporting to force shadowing.

This is actually exactly what the RFC proposes. As lxrec mentions, disambiguation is done using <SomeB as A>::foo() instead of use notation (this is already how it works today, and this RFC doesn't change that). Also, this RFC is solely about generic bounds, and use scoping is completely unchanged by this RFC.
Other than that, your examples basically all work under this RFC:

// Sub is "more" in scope than Super, since Super is not mentioned in the trait bound:
fn generic_fn<S: Sub>(x: S) {
    // This:
    x.foo();
    // is the same as:
    Sub::foo(&x);
    // also still possible:
    Super::foo(&x);
}
// Both Sub and Super are in scope, disambiguation needed
fn generic_fn<S: Sub+Super>(x: S) {
    // Error: both Sub::foo and Super::foo are in scope
    x.foo();
}

Regarding your example with scoping inside the trait definition, the RFC is actually more strict here and always requires explicit disambiguation with no shadowing possible.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Nov 2, 2020
@joshtriplett
Copy link
Member

Overall we felt that the motivation of avoiding breaking changes in super-traits was a strong one

Complete agreement here; this was very much the point that convinced me as well.

@rfcbot reviewed

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Nov 9, 2020

I've read the RFC and I'm in favor of 99% of it.

I was somewhat surprised about the behavior within a trait definition:

trait Super { fn foo(&self) { } }

trait Sub: Super {
    fn foo(&self) { }
    fn bar(&self) { self.foo(); } // ambiguous
}

I think I would've expected that to invoke Sub::foo. I'm curious if others had a similar reaction. Making this ambiguous is backwards compatible though.

It will be a bit difficult to implement -- the compiler thinks of a trait definition like trait Trait { } as having one where clause in scope (where Self: Sub, essentially), from which the superclasses etc flow naturally. But we can no doubt adjust this if this is indeed the behavior that folks would expect.

@jeekobu
Copy link

jeekobu commented Mar 3, 2021

I can imagine accidentally recursing by writing

trait Super { fn foo(&self) { } }

trait Sub: Super {
    fn foo(&self) {
        do_extra_stuff(self);
        self.foo();
    }
}

@lcdr
Copy link
Contributor Author

lcdr commented Mar 6, 2021

When writing the RFC, I specified the behavior inside trait definitions to be the same as in other contexts for consistency, and to avoid implicitly resolving ambiguities when more than one trait is directly mentioned. I can however see how one could expect a call to self.foo() to be resolved to the trait definition's method. Personally, I'm more in favor of keeping the consistency, and if it is found that resolving it is the better solution, it can still be changed later, as the current behavior is forwards-compatible.


The RFC has been in three-reviews-outstanding-limbo since November, if there's anything I can do to help move it along, please let me know. I'm happy to see that the lang team seems to be generally in favor of my suggestion, and if anyone has any concerns left I'd love to discuss them further. I see there's an RFC backlog recap meeting scheduled soon, maybe this could be brought up there.

@joshtriplett
Copy link
Member

Yeah, I think treating such method calls as ambiguous seems like the forward-compatible solution, and we can always reconsider that later. It doesn't seem like that needs to be a blocker here.

@nikomatsakis Do you consider this a blocking issue?

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Apr 6, 2021

@rfcbot concern supertrait

I'm registering a concern until I have time to re-read my comment and remember what I thought.

@lcdr
Copy link
Contributor Author

lcdr commented May 17, 2021

Has there been any progress on this?
If any input is needed from me as the RFC's author, I'd be happy to assist.

@nikomatsakis
Copy link
Contributor

@lcdr I'm supposed to review and I've been constantly busy -- I'm sorry about that.

@nikomatsakis
Copy link
Contributor

@rfcbot resolve supertrait

Based on our @rust-lang/lang meeting today, I am feeling good about this.

@rfcbot rfcbot added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. label Jun 1, 2021
@rfcbot
Copy link
Collaborator

rfcbot commented Jun 1, 2021

๐Ÿ”” This is now entering its final comment period, as per the review above. ๐Ÿ””

@rfcbot rfcbot removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label Jun 1, 2021
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Jun 11, 2021
@rfcbot
Copy link
Collaborator

rfcbot commented Jun 11, 2021

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@lcdr
Copy link
Contributor Author

lcdr commented Jul 15, 2021

Now that this RFC is ready to be merged, what would be necessary to move ahead with the next steps?

@danielhenrymantilla
Copy link

danielhenrymantilla commented Aug 24, 2021

I believe the case of breakage because of name collision is good enough of a motivation to avoid the hard error on that very case ๐Ÿ‘; but there are a bunch of legitimate drawbacks with the shadowing as well:

  • T : Sub and T : Sub + Super behaving different does not feel right, at all. In that regard, saying that "the user of Sub may not be aware of Super" is actually factually wrong, or else we could have things like pub trait Sub : PrivateSuper. That is, a super trait is very much part of the API of the sub-trait. T : Sub thus means exactly the same as T : Sub + Super. So, if anything, the Sub + Super behavior ought not to fail either.
  • @Nemo157's concern still stands, as well as any other comments here where it wasn't clear which method was gonna be used.
    This makes me think that whilst removing the hard error is necessary to avoid "silly breakage", I believe that instances of such method shadowing ought to trigger a warn-by-default lint about it. This would solve the silly breakage, while still disincentivizing any form of shadowing, as per the many pitfalls and surprising behaviors it entails.

On the other hand, there is a super silly case of actually-no-shadowing which I would love to see resolved with this attempt:

trait Trait {
    fn method(&self) where Self : Sized {}
}

impl dyn Trait + '_ {
    fn method(&self) {}
}

fn f(obj: &(dyn Trait + '_)) {
    obj.method(); // multiple applicable items in scope???
}

@lcdr
Copy link
Contributor Author

lcdr commented Sep 1, 2021

I believe the case of breakage because of name collision is good enough of a motivation to avoid the hard error on that very case ๐Ÿ‘; but there are a bunch of legitimate drawbacks with the shadowing as well:

  • T : Sub and T : Sub + Super behaving different does not feel right, at all. In that regard, saying that "the user of Sub may not be aware of Super" is actually factually wrong, or else we could have things like pub trait Sub : PrivateSuper. That is, a super trait is very much part of the API of the sub-trait. T : Sub thus means exactly the same as T : Sub + Super. So, if anything, the Sub + Super behavior ought not to fail either.

There are unfortunately drawbacks with any decision on this. Making the Sub + Super case use shadowing as well may make things more consistent in the sense that Sub: Super today means the same thing as Sub + Super, but it makes things more inconsistent for the case of unrelated traits with conflicting items. A trait bound of Foo + Bar , where Foo and Bar are completely unrelated, but happen to have methods with the same name, would (correctly) raise an error, despite having the same syntax as the Sub + Super case. Moreover, it would require the user to check the trait hierarchy to reason about this behavior, which is a significant additional mental burden, especially since this behavior is transitive and may only happen because of a difference in a far-removed supertrait. The RFC is designed to give preference to the syntax at the user site, so the user can reason based on the trait bound they're writing themselves, instead of a distant trait hierarchy leading to confusing behavior.

In addition, I'd like to note that this point is entirely about behavior that exists in Rust today, and which is not affected by this RFC. The RFC is thus backwards-compatible in this aspect, and any suggested changes to the Sub + Super case would be a better fit for a new RFC.

  • @Nemo157's concern still stands, as well as any other comments here where it wasn't clear which method was gonna be used.

@Nemo157's scenario is already classified as a semver-breaking change, which makes it highly disincentivized today. There is one exception to this for the case of defaulted methods, where the current semver rules may currently permit a change that, while non-breaking in compilation terms, could silently change method behavior. However, the semver RFC (#1105) does not seem to consider behavior changes to be major enough to warrant addressing them:

For the case of inherent implementations, which have very similar shadowing concerns:

Inherent implementations
[...]
It's worth noting, however, that if the signatures did happen to match then the change would no longer cause a compilation error, but might silently change runtime behavior. The case where the same method for the same type has meaningfully different behavior is considered unlikely enough that the RFC is willing to permit it to be labeled as a minor change -- and otherwise, inherent methods could never be added after the fact.

And generally:

Behavioral changes

This RFC is largely focused on API changes which may, in particular, cause downstream code to stop compiling. But in some sense it is even more pernicious to make a change that allows downstream code to continue compiling, but causes its runtime behavior to break.

This RFC does not attempt to provide a comprehensive policy on behavioral changes, which would be extremely difficult. In general, APIs are expected to provide explicit contracts for their behavior via documentation, and behavior that is not part of this contract is permitted to change in minor revisions. (Remember: this RFC is about setting a minimum bar for when major version bumps are required.)

Thus I believe this RFC's interaction with semantic versioning is minor enough that the current semver rules cover all cases.

Going back to your comment:

This makes me think that whilst removing the hard error is necessary to avoid "silly breakage", I believe that instances of such method shadowing ought to trigger a warn-by-default lint about it. This would solve the silly breakage, while still disincentivizing any form of shadowing, as per the many pitfalls and surprising behaviors it entails.

Unfortunately, such a lint would not be able to solve the main scenario this RFC addresses: that of a supertrait adding items after the fact, with subtraits already depending on it. In the crate of the supertrait, the lint would not be able to detect any shadowing, since it is not aware which crates reference the trait as a supertrait. In the crate of the subtrait, it would be able to detect the shadowing, but the subtrait crate owner would not be able to do anything about it, since they can't change the supertrait crate, and can't change the subtrait's items without introducing a breaking change, which this RFC is designed to avoid in the first place. Thus, such a lint would be unable to disincentivize the introduction of shadowing in this case, at least.

However, there is a reasonable argument to be made for a lint detecting shadowing at the user site, where a UFCS error would previously be raised. Shadowing may indeed introduce behavior that may be surprising to the user. However, Rust already features shadowing for local variables as well as the very similar case of inherent implementations shadowing trait implementations. Neither case currently has a lint in the Rust compiler, neither warn-by-default or at all. Even Clippy only seems to only have a couple lints for variable shadowing, and none for inherent impl shadowing. Therefore I would argue that such a lint would be a better fit for Clippy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-trait-object Proposals relating to trait objects. A-traits Trait system related proposals & ideas disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-lang Relevant to the language team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.