RFC: Supertrait item shadowing #2845
It might be surprising that the following functions fn generic_fn<S: Super>(x: S) {
x.foo();
}
fn use_trait_obj(x: Box<dyn Super>) {
x.foo();
} would call I also always have hated method hiding in C#, which I've only ever seen to cause problems. |
The code you posted is actually not affected by this RFC. The generic function already compiles today, and resolves impl Super for i32 {
fn foo(&self) { println!("super"); }
}
impl Sub for i32 {
fn foo(&self) { println!("sub"); }
}
fn generic_fn<S: Super>(x: S) {
x.foo();
}
fn sub_generic_fn<S: Sub>(x: S) {
generic_fn(x);
}
fn main() {
let x: i32 = 42;
sub_generic_fn(x); // prints "super"
} The trait object case is a bit different. It fails to compile, not because of ambiguity, but because Rust doesn't have trait object upcasting. fn use_trait_obj(x: Box<dyn Super>) {
x.foo();
}
fn sub_use_trait_obj(x: Box<dyn Sub>) {
use_trait_obj(x); // error: expected trait `Super`, found trait `Sub`
} The behavior in these cases is not changed by this RFC. I'm afraid that in this sense, the decision of shadowing vs overriding has already been made by current Rust. Changing it would be backwards-incompatible, as it would change the existing behavior of generic functions. I've noted the potential for confusion for OOP users in the drawbacks section of the RFC. Unfortunately, this potential for confusion already exists in Rust today. |
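For readers who want to try the generic case themselves, here is a self-contained sketch of it. The trait definitions are not shown in the comment above, so the ones below are an assumption, and the methods return a string instead of printing so the result can be checked. It only illustrates the existing behavior described above: a function bounded on `Super` alone resolves `foo` to `Super::foo`, even when it is reached through a `Sub`-bounded caller.

```rust
// Assumed trait definitions; the thread only shows the impls.
trait Super {
    fn foo(&self) -> &'static str;
}

trait Sub: Super {
    fn foo(&self) -> &'static str;
}

impl Super for i32 {
    fn foo(&self) -> &'static str {
        "super"
    }
}

impl Sub for i32 {
    fn foo(&self) -> &'static str {
        "sub"
    }
}

// Only `Super` is in the bound, so `foo` can only mean `Super::foo`.
fn generic_fn<S: Super>(x: S) -> &'static str {
    x.foo()
}

// Passing a `Sub` implementor along does not change what `generic_fn` calls.
fn sub_generic_fn<S: Sub>(x: S) -> &'static str {
    generic_fn(x)
}

fn main() {
    println!("{}", sub_generic_fn(42_i32));
    // The subtrait method is still reachable when named explicitly.
    println!("{}", Sub::foo(&42_i32));
}
```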
In the next edition, we should make a subtrait shadowing any name from a super trait become a hard error, so this code should not compile:
|
Note that a name conflict situation doesn't always arise from a subtrait adding a method with the same name as in the supertrait, but can arise when a supertrait adds a method that already happens to be in a subtrait. This fragile base class problem is the main motivation for this RFC, which aims to avoid breakage in this case. Making name conflicts a hard error would instead make the breakage worse than it is currently. Today, a supertrait adding a conflicting method is a breaking but minor change, since users can disambiguate using UFCS. With a hard error, this would not be possible, and the subtrait would be forced to change its method's name, which is a major breaking change. |
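As a rough illustration of the UFCS escape hatch mentioned above, the sketch below names the intended trait explicitly at each call site. The `Item` type and the helper function names are hypothetical, and the defaulted method bodies are only there to keep the example short.

```rust
trait Super {
    fn foo(&self) -> &'static str {
        "super"
    }
}

trait Sub: Super {
    fn foo(&self) -> &'static str {
        "sub"
    }
}

// Hypothetical implementor used only for this sketch.
struct Item;
impl Super for Item {}
impl Sub for Item {}

// With `S: Sub`, both `Sub::foo` and `Super::foo` are candidates for
// `x.foo()`, which is the ambiguity this thread is about. Naming the
// trait explicitly sidesteps it.
fn call_sub<S: Sub>(x: &S) -> &'static str {
    <S as Sub>::foo(x)
}

fn call_super<S: Sub>(x: &S) -> &'static str {
    <S as Super>::foo(x)
}

fn main() {
    println!("{} {}", call_sub(&Item), call_super(&Item));
}
```

The cost is that every affected call site has to be rewritten, which is what makes the collision a breaking, if minor, change for downstream users.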
I see, fair enough. It should then warn whenever you call the method without using UFCS with the subtrait in scope. |
Right now not using UFCS will raise a hard error. From context, I understand you're not proposing to replace the error with a warning, but are more interested in avoiding code that is not explicitly disambiguated. I'd be interested to hear your rationale for this. Consider the following situations, as laid out in the RFC: fn generic_fn<S: Sub>(x: S) {
x.foo();
} In this case it seems quite intuitive that Perhaps you're more concerned about a more ambiguous situation, such as both fn generic_fn<S: Sub+Super>(x: S) {
x.foo();
} Under this RFC, shadowing would not apply in this case, as This is just a brief summary, the RFC text discusses a few more cases, if I didn't list the one you're concerned about. |
If I understand the current situation, if you bring the subtrait into scope then calling At first blush, I'd suggest making the generic function situation stricter so that it matches the non-generic function situation. It's tricky however what if I write:
I'd say this should hard error without UFCS, even without the call to As an aside, we could error if
into
but that's insufficient for your example. I'm unsure how aggressive rustc should fight this, but maybe clippy could go further?
Ambiguity makes reading code almost impossible, which makes correct code almost impossible. I'd maybe phrase the compromise goal like: If a human reads the code without compiler assistance, and finds one valid resolution, then either (a) they found the only valid resolution, or (b) the resolution they found explicitly indicates where one searches for other valid resolutions. We push (b) fairly heavily already:
All these (b) cases become quite painful when identifying the applicable impl requires careful type checking. I think your proposal fails (b), and so does the current behavior maybe, because the incoherence lacks any warnings. I do think your note about favoring the supertrait improves this considerably, except many traits come with numerous supertraits, and devs from OO languages expect the dangerous opposite, as @mcarton notes. |
My reply to mcarton may have confused you here. mcarton's example is not actually affected by this RFC. To summarize current behavior:
Again, this is unaffected by and unrelated to the RFC. This RFC only discusses what should happen for item resolution when the subtrait is in scope.
I agree with you, a human should be able to find the resolution that will be chosen by the compiler without compiler assistance. To do this, a human will need to look at a list of places that might contain an item with the name they are looking for. The important part is that this RFC should not make this lookup any more complex than it already is. Let's consider the lookup order for code that compiles and resolves to a clear implementation: To find the resolution for an item, a user needs to:
That's all the user needs to do today. Note that inherent impls aren't even necessary to consider, because generics/trait objects already abstract over the specific type. This RFC actually doesn't change this 'algorithm' at all, it just makes more situations compile than before. Specifically, it will allow situations where sub- and supertraits have conflicting item names. Since subtraits shadow supertraits, you still only need to go up the hierarchy until you find a matching name. This name will always be the one that will be chosen. Therefore the user doesn't need to change their way of reasoning, and the newly allowed situations will intuitively make sense to a user who is used to this lookup order. |
I understand now. I like the prejudice towards the explicit bounds, meaning In this vein, I'm now curious about this case where After this change, |
Glad I was able to clear up the confusion. You're completely right, if multiple supertraits are in conflict with each other, it will still be ambiguous. In this case this RFC will require UFCS. See here for more. You're also right that this RFC makes it possible for the subtrait to manually fix the collision, as detailed here. It's true that this would in turn make collisions more likely for |
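To make the multiple-supertrait case concrete, here is a small sketch. The trait names `Left`, `Right`, and `Both` and the `Unit` type are made up for illustration. Two supertraits both define `foo`, and the subtrait defines its own `foo` that forwards to the one it prefers. All calls use explicit trait paths, so the sketch does not depend on the shadowing rules proposed in the RFC.

```rust
trait Left {
    fn foo(&self) -> &'static str {
        "left"
    }
}

trait Right {
    fn foo(&self) -> &'static str {
        "right"
    }
}

// The subtrait redirects `foo` to the supertrait it wants callers to get.
trait Both: Left + Right {
    fn foo(&self) -> &'static str {
        Left::foo(self)
    }
}

// Hypothetical implementor.
struct Unit;
impl Left for Unit {}
impl Right for Unit {}
impl Both for Unit {}

// Explicit paths pick each candidate without ambiguity.
fn pick<S: Both>(x: &S) -> (&'static str, &'static str, &'static str) {
    (Both::foo(x), Left::foo(x), Right::foo(x))
}

fn main() {
    println!("{:?}", pick(&Unit));
}
```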
One situation I haven't seen mentioned is when a method already exists on the super trait, then a method shadowing it is added to the sub trait. Given code exists like this (probably with each item spread across a set of three crates): trait Super {
fn foo(&self);
}
trait Sub: Super {
}
fn generic_fn<S: Sub>(x: S) {
x.foo();
} It would currently compile. Currently if |
@Nemo157 That is true, but adding a new method onto a trait is already a semver breaking change, so it would require a major version bump to the crate which defines But the situation is more complicated once you add in default methods and sealed traits. So if this RFC is accepted then the semver requirements would probably have to be changed: adding a method which overrides a super method is now always a major breaking change. |
Thanks for pointing out this situation, I hadn't considered the implications for semver in this case. I agree with Pauan, this should be fine for non-defaulted items due to semver, and would require an amendment to the semver rules for defaulted items. If I understand the semver rules correctly, it would currently be classified as minor, since you could have disambiguated using UFCS, but this doesn't seem like a good practice when it can result in silent changes in behavior. So this should probably be amended to be a major change. However, the amendment should likely be more specific than "added shadowing of existing methods is a major change": As mentioned above, there's a situation in which a subtrait can fix an introduced collision between supertraits by shadowing and manually redirecting to the desired supertrait. This kind of fixing should be allowed as a minor change, since it is intended to keep existing code compatible. |
Some notes:
|
I agree that some evidence would make the case more compelling. However, due to the nature of the situation, it does not appear like there is a simple way of assessing the impact of the problem. Quoting my post on the Pre-RFC thread, in response to a possible crater run:
Therefore, I'm not aware of a way of gathering statistical evidence for determining this problem's impact. Without this option, the only evidence I can think of would be authors of high-profile crates finding this RFC and mentioning that this problem has affected them before. This is however quite chance-based and I'm not sure whether this would be counted as evidence ("anecdotes are not data").
The RFC is using the normal notion of "supertrait" as specified in the reference and the book. I thought those sources could be assumed as commonly known, but if you think referencing them explicitly would improve clarity, I'd be happy to add links.
The reference specifies this form to be the same as the supertrait shorthand:
and then demonstrates the equivalence using examples. This RFC does not change these semantics, a supertrait is still declared as a trait bound on the
I see, I hadn't considered unstable features such as trait aliases. However, I do not believe trait aliases would pose a problem, since they don't define new items themselves. Therefore trait aliases would be substituted before this RFC's semantics would apply. E.g.: trait Super { fn foo(&self) {} }
trait Sub: Super { fn foo(&self) {} }
trait SubAlias = Sub;
trait SubSuperAlias = Sub + Super;
fn foo1<T: SubAlias> (t: T) { t.foo() } // ok, resolves to Sub::foo
fn foo2<T: SubSuperAlias>(t: T) { t.foo() } // error, disambiguation needed (Note: I'm not sure if you wanted this elaboration as a comment or added to the RFC, I can add it as well.) |
Yeah I agree it would be difficult, but I think, and I believe you agree, that the onus is on the RFC to demonstrate that the status quo is sufficiently problematic to warrant language changes and the additional complexity that follows. To elaborate on my skepticism around "sufficiently problematic", I believe that:
It's better to disambiguate and specify the vocabulary used. A reasonable person could infer that
I agree with your elaboration that there shouldn't be a problem. In general, prefer adding things to the text of an RFC as well. |
I agree. I'll try to make my case for this change in the following.
This may not necessarily be the case. While name collisions across two unrelated traits are very rare, sub/supertraits are typically related in functionality. This makes a collision much more likely.
Actually, Itertools is a perfect example for this, thanks for bringing it up. I'll write the following in RFC style so that it's already close to any future RFC amendment (I don't want to lecture you about something you were part of yourself, this is for the general reader who doesn't have any prior knowledge): Subtraits often narrow the scope of the supertrait and offer additional functionality specific to this narrower scope. In some cases, this additional functionality may later be found to be more general and uplifted to the supertrait. Ideally the method could be copied as is, but the current name collision behavior prevents that. A similar situation may also arise when the supertrait is in a different crate that can't be modified directly (e.g. the bureaucracy of an upstream change would be too much, the maintainer is against the change, the stabilization process would take a long time but the additional functionality is needed sooner, etc). In this case it would be possible to work around the situation by defining an extension trait in the local crate which is a subtrait of the supertrait, and blanket-implementing it for all supertrait implementors. This way, the functionality can be used by adjusting signatures to use the extension trait. Later on, the functionality may be found to be useful in general (or the upstream crate's stabilization process may have completed) and the additional functionality is added to the upstream crate. Similarly to the above case, this breaks the downstream crate because of the name collision. As an example: Let's assume you have a vector of vectors containing items: let data = vec![vec![1, 2, 3, 4], vec![5, 6]]; You'd like to iterate over the elements directly, without having to consider the inner vectors. That is, you'd like the iteration sequence to be Now, some smart Rust developer notices that this functionality would be possible to add to trait IteratorExt : Iterator {
fn flatten(self) -> Flatten<Self>
where Self: Sized, Self::Item: IntoIterator {
Flatten::new(self)
}
}
// definition of Flatten elided here for brevity
impl<T: ?Sized> IteratorExt for T where T: Iterator { } and then blanket-implement The crate becomes quite popular and a lot of people come to use it. At some point, the Rust maintainers notice the usefulness of a However, The above example is not hypothetical, this seems to be what happened with the Note that while a collision can be avoided through collaboration, this is a burden on the maintainers of both upstream and downstream. In some cases a good alternative name might not exist, which means that breakage would be inevitable. Collaboration may also not always be possible, e.g. if the downstream crate is unmaintained. While it's possible in theory to scan crates.io for reverse dependencies with potential conflicts, this is not a practical solution. Scanning would require downloading each crate, finding subtraits, and checking their items for collisions. This is cumbersome enough that most maintainers would not find it worth the effort to do this check. If the upstream is Rust itself, the "reverse dependencies" would in effect be every crate on crates.io, making such a manual check virtually impossible. While a tool could be created to automate this check, it would still require that maintainers actively use it to check for breakage before publishing their changes. This cannot be relied upon. If a breaking change has already been published, it would also not help to yank the crate from crates.io. Since yanking does not prohibit downloading crates that have already been added to a lockfile, this would not fix the issue for people who have already experienced breakage. It would be necessary to publish a new crate version with the change reverted, or the item renamed to avoid the collision. However, according to semver policies, this would be a major breaking change (renaming or removing an existing item). In contrast, the change that added the item is classified as minor (adding a defaulted item). 
Crate maintainers may not consider this breakage reason enough to warrant a major change. They may choose to accept the breakage instead, which is not ideal. And even if they choose to rename the item, the new name could itself collide with other crates again. It's also important to keep in mind that crates.io is not everything. The issues described above may just as well occur with crates that haven't been published to crates.io. It is not possible to scan for these cases, since they aren't public. If automated scanning were to be implemented, I feel like it would easily be taken as an absolute verdict to go ahead with a breaking change. Care should be taken not to relegate private crates to a "second-class citizen" status simply because they haven't been uploaded to crates.io. Lastly, it also appears to me like the current behavior in Rust was not explicitly intended, but rather was an unfortunate byproduct of the way supertraits are implemented. The current bound implementation flattens the nested hierarchical structure of traits, and as a result loses information that could be useful. It's quite surprising that referring to an item on a trait would fail, simply because a trait higher up somewhere in the hierarchy happens to have an item with the same name. This RFC would allow fixing this behavior, and eliminate the potential for name collisions across crates as well. cc @bluss @jswrenn - As maintainers of itertools, it would be interesting to hear your thoughts. |
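The extension-trait scenario above can be approximated with the sketch below. It is not the itertools implementation: the thread elides the definition of `Flatten`, so this version simply forwards to the standard library's `Iterator::flatten` and reuses `std::iter::Flatten` as the return type, and `flatten_all` is a hypothetical helper. It shows an extension trait whose method name collides with a supertrait method, and the explicit call form that stays valid regardless of how the collision is resolved.

```rust
use std::iter::Flatten;

// Extension trait whose method name collides with `Iterator::flatten`.
trait IteratorExt: Iterator {
    fn flatten(self) -> Flatten<Self>
    where
        Self: Sized,
        Self::Item: IntoIterator,
    {
        // Assumption for this sketch: delegate to the std method
        // instead of defining a separate adapter type.
        Iterator::flatten(self)
    }
}

// Blanket implementation for every iterator.
impl<T: Iterator + ?Sized> IteratorExt for T {}

// Hypothetical helper; the call names the extension trait explicitly
// because method-call syntax is the form affected by the collision.
fn flatten_all(data: Vec<Vec<i32>>) -> Vec<i32> {
    IteratorExt::flatten(data.into_iter()).collect()
}

fn main() {
    let data = vec![vec![1, 2, 3, 4], vec![5, 6]];
    println!("{:?}", flatten_all(data));
}
```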
I experienced this myself on an application I was developing. A library I used added a defaulted method in a minor update, which completely broke all of my trait calls due to ambiguity. In my case, the application was large enough that bulk renaming all uses was nontrivial, but small enough that I don't think they would've catered to me. (The update was also already pushed, so ...). Regarding name collisions, there are many names that just seem like obvious choices, such as |
We cannot enforce semver, except occasionally through build breakage, so semver cannot address the reverse issue raised by @Nemo157 especially if you're avoiding build breakage. Inherent methods also shadow trait methods though, which creates the same issue. We could address all ergonomics issue with
We might rename traits during subtrait declarations too, aa We'd be safer with all shadowing forbidden outright, but adding inherent trait implementations and maybe this trait editing. I'll caution that reexports plus trait editing enable yet another issue: If crate Bar reexports a trait from crate Foo, but then later edits the trait, and provides an inherent method by the same name, then behavior changes. I'm less worried about this because (a) you need not trust reexports if you configure cargo carefully, and (b) you've implicitly expressed above-average trust when using reexports. We should also migrate from rustc reexports to safer cargo-level reexports, meaning crate foo copies its crate caz dependency from its crate bar dependency, perhaps while adapting namespaces, but that's way off topic.
We should expect dependencies to increasingly be malware vectors, although so far I only know about NPM cases. We have ample avenues for bugdoors of course, but shadowing provides some of the most attacker-friendly options. There is a push to forbid |
If I understand your comment correctly, your primary concern is shadowing introducing a potential vector for malicious code, by allowing dependencies to alter downstream behavior when a method is called. While this is technically possible, I don't see how shadowing substantially changes this problem compared to today: dependencies being able to alter behavior is already possible with every dependency method call, since you are, well, calling dependency code. If we were to consider that a serious malware issue, then every crate using dependencies would be affected today. If you're using dependencies, you generally trust the dependency not to act maliciously. |
Yes, any dependency can alter behavior anywhere in the program, especially using unsafe code. It's also true that subtle bugdoor opportunities abound regardless, but an innocent-looking bugdoor that passes code review becomes considerably easier with shadowing. I now think trait import editing cannot address this problem either, though. I'd kinda suggest that traits "more" in scope should shadow traits "less" in scope, with errors whenever collisions occur, but also permit reimporting to force shadowing.
We'd handle inherent item shadowing by warning about inherent item calls that shadow traits in scope and adding inherent trait implementations and/or negative import It's possible that'd prove too confusing however, not sure. In any case, we should have lints that forbid shadowing not just within the current crate, but anywhere within its dependency graph. As crates mature they add more "quality" lints, including this one. |
If we want to consider proposals where additional kinds of shadowing are "allowed" but only work when call sites explicitly specify which impl, I don't think giving Regardless of whether we accept this particular RFC, I do think it'd be a good idea to have a lint (presumably starting out as a "pedantic" clippy lint) that detects when there are multiple methods in scope for a method call and suggests using the |
This is actually exactly what the RFC proposes. As lxrec mentions, disambiguation is done using // Sub is "more" in scope than Super, since Super is not mentioned in the trait bound:
fn generic_fn<S: Sub>(x: S) {
// This:
x.foo();
// is the same as:
Sub::foo(&x);
// also still possible:
Super::foo(&x);
} // Both Sub and Super are in scope, disambiguation needed
fn generic_fn<S: Sub+Super>(x: S) {
// Error: both Sub::foo and Super::foo are in scope
x.foo();
} Regarding your example with scoping inside the trait definition, the RFC is actually more strict here and always requires explicit disambiguation with no shadowing possible. |
Complete agreement here; this was very much the point that convinced me as well. @rfcbot reviewed |
I've read the RFC and I'm in favor of 99% of it. I was somewhat surprised about the behavior within a trait definition: trait Super { fn foo(&self) { } }
trait Sub: Super {
fn foo(&self) { }
fn bar(&self) { self.foo(); } // ambiguous
} I think I would've expected that to invoke It will be a bit difficult to implement -- the compiler thinks of a trait definition like |
I can imagine accidentally recursing by writing trait Super { fn foo(&self) { } }
trait Sub: Super {
fn foo(&self) {
do_extra_stuff(self);
self.foo();
}
} |
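One way to avoid both the ambiguity and the accidental recursion shown above is to name the supertrait explicitly inside the subtrait's method body. The sketch below assumes a hypothetical `Widget` implementor and records the call order in a `Vec` so the result can be inspected. It does not rely on any shadowing rule.

```rust
trait Super {
    fn foo(&self) -> Vec<&'static str> {
        vec!["super"]
    }
}

trait Sub: Super {
    fn foo(&self) -> Vec<&'static str> {
        let mut calls = vec!["sub"];
        // Naming `Super` explicitly: a plain `self.foo()` here is the call
        // whose meaning is being debated in this thread.
        calls.extend(Super::foo(self));
        calls
    }
}

// Hypothetical implementor.
struct Widget;
impl Super for Widget {}
impl Sub for Widget {}

fn trace(w: &Widget) -> Vec<&'static str> {
    Sub::foo(w)
}

fn main() {
    println!("{:?}", trace(&Widget));
}
```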
When writing the RFC, I specified the behavior inside trait definitions to be the same as in other contexts for consistency, and to avoid implicitly resolving ambiguities when more than one trait is directly mentioned. I can however see how one could expect a call to The RFC has been in three-reviews-outstanding-limbo since November, if there's anything I can do to help move it along, please let me know. I'm happy to see that the lang team seems to be generally in favor of my suggestion, and if anyone has any concerns left I'd love to discuss them further. I see there's an RFC backlog recap meeting scheduled soon, maybe this could be brought up there. |
Yeah, I think treating such method calls as ambiguous seems like the forward-compatible solution, and we can always reconsider that later. It doesn't seem like that needs to be a blocker here. @nikomatsakis Do you consider this a blocking issue? |
Has there been any progress on this? |
@lcdr I'm supposed to review and I've been constantly busy -- I'm sorry about that. |
@rfcbot resolve supertrait Based on our @rust-lang/lang meeting today, I am feeling good about this. |
This is now entering its final comment period, as per the review above. |
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. The RFC will be merged soon. |
Now that this RFC is ready to be merged, what would be necessary to move ahead with the next steps? |
I believe the case of breakage because of name collision is good enough of a motivation to avoid the hard error on that very case; but there are a bunch of legitimate drawbacks with the shadowing as well:
On the other hand, there is a super silly case of actually-no-shadowing which I would love to see resolved with this attempt: trait Trait {
fn method(&self) where Self : Sized {}
}
impl dyn Trait + '_ {
fn method(&self) {}
}
fn f(obj: &(dyn Trait + '_)) {
obj.method(); // multiple applicable items in scope???
} |
There are unfortunately drawbacks with any decision on this. Making the In addition, I'd like to note that this point is entirely about behavior that exists in Rust today, and which is not affected by this RFC. The RFC is thus backwards-compatible in this aspect, and any suggested changes to the @Nemo157's scenario is already classified as a semver-breaking change, which makes it highly disincentivized today. There is one exception to this for the case of defaulted methods, where the current semver rules may currently permit a change that, while non-breaking in compilation terms, could silently change method behavior. However, the semver RFC (#1105) does not seem to consider behavior changes to be major enough to warrant addressing them: For the case of inherent implementations, which have very similar shadowing concerns:
And generally:
Thus I believe this RFC's interaction with semantic versioning is minor enough that the current semver rules cover all cases. Going back to your comment:
Unfortunately, such a lint would not be able to solve the main scenario this RFC addresses: that of a supertrait adding items after the fact, with subtraits already depending on it. In the crate of the supertrait, the lint would not be able to detect any shadowing, since it is not aware which crates reference the trait as a supertrait. In the crate of the subtrait, it would be able to detect the shadowing, but the subtrait crate owner would not be able to do anything about it, since they can't change the supertrait crate, and can't change the subtrait's items without introducing a breaking change, which this RFC is designed to avoid in the first place. Thus, such a lint would be unable to disincentivize the introduction of shadowing in this case, at least. However, there is a reasonable argument to be made for a lint detecting shadowing at the user site, where a UFCS error would previously be raised. Shadowing may indeed introduce behavior that is surprising to the user. However, Rust already features shadowing for local variables as well as the very similar case of inherent implementations shadowing trait implementations. Neither case currently has a lint in the Rust compiler, whether warn-by-default or otherwise. Even Clippy seems to have only a couple of lints for variable shadowing, and none for inherent impl shadowing. Therefore I would argue that such a lint would be a better fit for Clippy. |
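For comparison, the existing precedent of inherent methods shadowing trait methods can be seen in a few lines. The `Greet` trait, the `Person` type, and the helper functions are hypothetical names chosen for this sketch.

```rust
trait Greet {
    fn hello(&self) -> &'static str {
        "trait"
    }
}

struct Person;

impl Person {
    // Inherent method with the same name as the trait method.
    fn hello(&self) -> &'static str {
        "inherent"
    }
}

impl Greet for Person {}

// Method-call syntax picks the inherent method, without any warning.
fn via_method(p: &Person) -> &'static str {
    p.hello()
}

// The trait method remains reachable through an explicit path.
fn via_trait(p: &Person) -> &'static str {
    Greet::hello(p)
}

fn main() {
    println!("{} {}", via_method(&Person), via_trait(&Person));
}
```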
Rendered
Pre-RFC thread
Summary
Change item resolution for generics and trait objects so that a trait bound does not bring its supertraits' items into scope if the subtrait defines an item with this name itself.
This makes it possible to write the following, which is currently rejected by the compiler due to ambiguity: