Better strict version hash (SVH) computation #14145
if i.vis == ast::Public {
    SawItem.hash(self.st); walk_item(self, i, e);
}
}
This should use the results of the reachability pass instead of ast::Public
due to things like reexports and generic functions and such.
Will do
I might just do the hash unconditionally in `visit_item`; e.g. a non-pub function may still get inlined in a downstream crate, and thus we should include it in the SVH, right? (I suppose the answer to that question depends on what I can learn via the reachability pass; I still need to look at that.)
The reachability pass will return all possibly user-facing functions, which should be what you need here. For example, we use the results of reachability to determine what symbols should be visible in the object file (others can link to), which is, in essence, the public ABI.
I'm a little worried about this approach because it seems like it's very easy to sweep ABI-breaking changes under the rug by accident. For example, if a new method is added to the I do think that it's the best approach, however, as hashing the AST just doesn't cut it. I think we'll just have to let time tell how this approach works. Could you add some tests along the same lines of
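The reachability argument above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Item`, `exported_body`, `sample_crate`, and `reachable` are hypothetical names invented here, not rustc's reachability pass. The point it shows is that a private function becomes part of the effective ABI once a public generic or inlinable function calls it.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical summary of one crate item (not a rustc data structure).
struct Item {
    /// Declared `pub` at the source level.
    is_pub: bool,
    /// True if the body may be instantiated downstream (generic or inlinable).
    exported_body: bool,
    /// Names of the functions this item calls.
    calls: Vec<&'static str>,
}

/// Toy crate: `api` is public and generic, and calls the private `helper`.
fn sample_crate() -> HashMap<&'static str, Item> {
    let mut items = HashMap::new();
    items.insert("api", Item { is_pub: true, exported_body: true, calls: vec!["helper"] });
    items.insert("helper", Item { is_pub: false, exported_body: false, calls: vec!["deep"] });
    items.insert("deep", Item { is_pub: false, exported_body: false, calls: vec![] });
    items.insert("unused", Item { is_pub: false, exported_body: false, calls: vec![] });
    items
}

/// Worklist traversal: start from public items and follow calls only out of
/// bodies that downstream crates may instantiate themselves.
fn reachable(items: &HashMap<&'static str, Item>) -> BTreeSet<&'static str> {
    let mut worklist: Vec<&'static str> = items
        .iter()
        .filter(|(_, item)| item.is_pub)
        .map(|(name, _)| *name)
        .collect();
    let mut seen = BTreeSet::new();
    while let Some(name) = worklist.pop() {
        if !seen.insert(name) {
            continue; // already processed
        }
        if let Some(item) = items.get(name) {
            if item.exported_body {
                for callee in &item.calls {
                    worklist.push(*callee);
                }
            }
        }
    }
    seen
}

fn main() {
    let set = reachable(&sample_crate());
    // `helper` is private but reachable through the generic `api`.
    assert!(set.contains("helper"));
    assert!(!set.contains("deep"));
    assert!(!set.contains("unused"));
}
```

In this model, filtering on `is_pub` alone would drop `helper` from the hash even though downstream code can end up calling it.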
}

fn visit_ty(&mut self, t: &Ty, e: E) {
    SawTy.hash(self.st); walk_ty(self, t, e);
I'm worried that this doesn't actually hash the type itself; what if the type changes from `u16` to `u8`?
Hmm, true. Let me experiment and see if I can fix that.
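The concern raised here can be reproduced in a small standalone sketch. It assumes a hypothetical `TyNode` type and uses the standard library's `DefaultHasher` in place of the hasher state used in the PR; it is not the code under review. Hashing only a marker such as `SawTy` gives `u16` and `u8` the same hash, while also feeding in the identifier text separates them.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for an AST type node; not rustc's real `Ty`.
struct TyNode {
    name: &'static str,
}

/// Marker recording only that *some* type node was visited.
#[derive(Hash)]
struct SawTy;

/// Hashes only the marker, mirroring the shape of the snippet under review.
fn hash_marker_only(_ty: &TyNode) -> u64 {
    let mut st = DefaultHasher::new();
    SawTy.hash(&mut st);
    st.finish()
}

/// Hashes the marker and then the text of the type's identifier.
fn hash_with_content(ty: &TyNode) -> u64 {
    let mut st = DefaultHasher::new();
    SawTy.hash(&mut st);
    ty.name.hash(&mut st);
    st.finish()
}

fn main() {
    let a = TyNode { name: "u16" };
    let b = TyNode { name: "u8" };
    // The marker alone cannot tell the two types apart.
    assert_eq!(hash_marker_only(&a), hash_marker_only(&b));
    // Including the identifier text distinguishes them.
    assert_ne!(hash_with_content(&a), hash_with_content(&b));
}
```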
I've pointed out a few places where I think the hashing isn't necessarily sufficient. I'm worried about missing cases (I'm certainly no exhaustive check). Maybe we want to be conservative on this visitor approach...
I just realized, some of your comments seem to indicate that you are worried about certain substructure being skipped (e.g. local definition, the patterns on match arms, etc). But I think all of those cases are covered by the recursive call to In any case, I will make sure to add tests for those examples.
(the previous version was pretty buggy in various nasty ways and probably collapsed too many distinct texts to the same hash, largely due to the mistaken way I incorporated visibility that alex pointed out. I will have a new PR up tomorrow that does a better job.)
(added plus: writing tests forced me to discover that the PR as written here was still hashing on spans in the
I will reopen this when it is more fully baked.
more fully baked now. :)
r? @alexcrichton or @huonw (do note if you grab it so that we don't have needless duplicate review effort)
I'll glance over it quickly now, but I've only got 15 minutes or so.
    fn got_path(&mut self, path: &Path, id: ast::NodeId, e: &E) -> After;
}

pub struct ForcedVisitor<HookState /*:ForcedVisitorHooks*/>{
Should this have a fixme? or is it just for documentation purposes?
The commented-out bound? It's for documentation for now.
I guess with planned future changes to the language we could uncomment it. I guess that's your point: that I should add a FIXME comment with a pointer to the RFC and/or tracking issue for the change to add trait bounds to a struct's type params?
Yeah, it just seems a little weird to have a commented-out piece of code with no explanation.
I do not see a tracking issue in the Rust repo for RFC 0011. I would consider adding a FIXME pointing to such a tracking issue, but without the tracking issue I'm not sure if I'd bother.
From my quick glance it seems fine to me, but I won't be able to read over it properly for 18+ hours (or more).
@@ -61,7 +61,37 @@ pub fn generics_of_fn(fk: &FnKind) -> Generics {
    }
}

/// Each method of the Visitor trait is a hook to be potentially
/// overriden. Each method's default implementation recursvely visits
recursively
I'm curious what you think about the introduction of a whole new visitor, but other than that this looks fantastic to me!
@alexcrichton Yeah I have gone back and forth about whether the new visitor is worth it. (The third option, assuming the first is Visitor and the second is ForcedVisitor, would be to not use a visitor at all, and instead write the code as a set of recursive functions that match the various enums directly. I briefly started on a rewrite to that pattern this morning, and quickly abandoned it: avoiding the boilerplate for the recursive traversals really is worth it.) Anyway, as for this PR: I am pretty happy with how the SVH client code looks in the end with the new I guess the right thing for me to do now would be to survey the existing
@nikomatsakis You may have an opinion on the discussion regarding the Visitor API; we discussed it briefly during our meeting yesterday.
That sounds good to me.
r=me when you've completed your survey
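For readers unfamiliar with the pattern being weighed here, the following is a minimal sketch of a visitor whose default methods perform the recursive traversal, so that client code overrides only the hooks it cares about. The `Expr` type, `walk_expr`, and `LitSum` are hypothetical and far smaller than libsyntax's real `Visitor`; the sketch also does not model `ForcedVisitor`.

```rust
/// Toy expression tree; a hypothetical stand-in for a real AST.
enum Expr {
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
    Neg(Box<Expr>),
}

/// Each method is a hook. The default `visit_expr` recurses via `walk_expr`,
/// so an implementor that overrides nothing still traverses the whole tree.
trait Visitor: Sized {
    fn visit_expr(&mut self, e: &Expr) {
        walk_expr(self, e)
    }
    fn visit_lit(&mut self, _value: i64) {}
}

/// The shared recursive traversal that the default hooks delegate to.
fn walk_expr<V: Visitor>(v: &mut V, e: &Expr) {
    match e {
        Expr::Lit(n) => v.visit_lit(*n),
        Expr::Add(l, r) => {
            v.visit_expr(l);
            v.visit_expr(r);
        }
        Expr::Neg(inner) => v.visit_expr(inner),
    }
}

/// Client code: overrides a single hook and inherits the traversal.
struct LitSum(i64);

impl Visitor for LitSum {
    fn visit_lit(&mut self, value: i64) {
        self.0 += value;
    }
}

fn main() {
    let e = Expr::Add(
        Box::new(Expr::Lit(1)),
        Box::new(Expr::Neg(Box::new(Expr::Lit(2)))),
    );
    let mut sum = LitSum(0);
    sum.visit_expr(&e);
    // Sums the literals themselves (1 and 2); negation is not evaluated.
    assert_eq!(sum.0, 3);
}
```

The alternative mentioned in the comment, a set of hand-written recursive functions, would repeat the `match` in `walk_expr` in every client.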
The summary is: I don't think I can easily apply According to my survey:
So, I will revise the PR to take out
Closes rust-lang#14210 (Make Vec.truncate() resilient against failure in Drop)
Closes rust-lang#14206 (Register new snapshots)
Closes rust-lang#14205 (use sched_yield on linux and freebsd)
Closes rust-lang#14204 (Add a crate for missing stubs from libcore)
Closes rust-lang#14201 (Render not_found with an absolute path to the rust stylesheet)
Closes rust-lang#14198 (update valgrind headers)
Closes rust-lang#14174 (Optimize common path of Once::doit)
Closes rust-lang#14162 (Print 'rustc' and 'rustdoc' as the command name for --version)
Closes rust-lang#14145 (Better strict version hash (SVH) computation)
(rebasing)
…lts do. drive-by: added some doc.
In particular, this version of strict version hash (SVH) works much like the deriving(Hash)-based implementation did, except that it uses a content-based hash that filters rustc implementation artifacts and surface syntax artifacts. Fix rust-lang#14132.
Namely: non-pub `use` declarations *are* significant to the SVH computation, since they can change which traits are part of the method resolution step, and thus affect which methods get called from the (potentially inlined) code.
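A small sketch of why a private `use` matters, written in current Rust syntax rather than the 2014 dialect of the PR. All module and trait names (`a`, `b`, `Describe`, `with_a`, `with_b`) are hypothetical. The two `call` functions have identical bodies and differ only in a non-pub import, yet resolve `describe` to different trait methods.

```rust
mod a {
    pub trait Describe {
        fn describe(&self) -> &'static str;
    }
    impl Describe for u8 {
        fn describe(&self) -> &'static str {
            "a::Describe"
        }
    }
}

mod b {
    pub trait Describe {
        fn describe(&self) -> &'static str;
    }
    impl Describe for u8 {
        fn describe(&self) -> &'static str {
            "b::Describe"
        }
    }
}

mod with_a {
    use super::a::Describe; // private import: selects a's method

    #[inline]
    pub fn call(x: u8) -> &'static str {
        x.describe()
    }
}

mod with_b {
    use super::b::Describe; // private import: selects b's method

    #[inline]
    pub fn call(x: u8) -> &'static str {
        x.describe()
    }
}

fn main() {
    // Same function body, different behavior, purely due to the `use` line.
    assert_eq!(with_a::call(7), "a::Describe");
    assert_eq!(with_b::call(7), "b::Describe");
}
```

If the hash skipped non-pub `use` items, swapping one import for the other would leave the hash unchanged even though the (potentially inlined) code calls a different method.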
(Only after adding the tests did I realize that this is not really a special case at the AST level; as far as the visitor is concerned, `int` and `i32` and `i64` are just idents.)
…excrichton Teach SVH computation to ignore more implementation artifacts. In particular, this version of strict version hash (SVH) works much like the deriving(Hash)-based implementation did, except that it deliberately: 1. skips over content known not to affect the generated crates, and, 2. uses a content-based hash for names instead of using the value of the `Name` index itself, which can differ depending on the order in which strings are interned (which in turn is affected by e.g. the presence of `--cfg` options on the command line). Fix #14132.
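The second point can be sketched with a toy string interner. This assumes a hypothetical `Interner` type and the standard library's `DefaultHasher`; it is not rustc's interner or the PR's hashing code. Interning an extra string first (as a `--cfg` option might cause) shifts the index assigned to `foo`, so an index-based hash changes while a content-based hash does not.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy string interner (hypothetical): indices are assigned in first-seen order.
#[derive(Default)]
struct Interner {
    indices: HashMap<String, u32>,
    strings: Vec<String>,
}

impl Interner {
    /// Returns the existing index for `s`, or assigns the next free one.
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&idx) = self.indices.get(s) {
            return idx;
        }
        let idx = self.strings.len() as u32;
        self.indices.insert(s.to_string(), idx);
        self.strings.push(s.to_string());
        idx
    }

    /// Looks up the text behind an index.
    fn get(&self, idx: u32) -> &str {
        &self.strings[idx as usize]
    }
}

/// Hashes the interner index itself: sensitive to interning order.
fn hash_by_index(name: u32) -> u64 {
    let mut st = DefaultHasher::new();
    name.hash(&mut st);
    st.finish()
}

/// Hashes the string the index refers to: independent of interning order.
fn hash_by_content(interner: &Interner, name: u32) -> u64 {
    let mut st = DefaultHasher::new();
    interner.get(name).hash(&mut st);
    st.finish()
}

fn main() {
    let mut plain = Interner::default();
    let foo_plain = plain.intern("foo");

    // Simulate an extra string being interned first, e.g. from a cfg option.
    let mut with_cfg = Interner::default();
    with_cfg.intern("debug");
    let foo_cfg = with_cfg.intern("foo");

    assert_ne!(hash_by_index(foo_plain), hash_by_index(foo_cfg));
    assert_eq!(
        hash_by_content(&plain, foo_plain),
        hash_by_content(&with_cfg, foo_cfg)
    );
}
```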
@pnkfelix @alexcrichton In particular, non-crate attributes seem to be intentionally not hashed at all, but they can easily affect ABI (
It's probably "good enough" today to get the job done, but I also wouldn't be confident in saying that the SVH reflects the "true ABI" of a crate either, so in that sense I'm sure there's a few bugs here and there :)
Teach SVH computation to ignore more implementation artifacts.
In particular, this version of strict version hash (SVH) works much like the deriving(Hash)-based implementation did, except that it deliberately:
1. skips over content known not to affect the generated crates, and,
2. uses a content-based hash for names instead of using the value of the `Name` index itself, which can differ depending on the order in which strings are interned (which in turn is affected by e.g. the presence of `--cfg` options on the command line).
Fix #14132.