Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incr.comp.: Create Test Case for Incr. Comp. Hash for function interfaces #36812

Closed
michaelwoerister opened this issue Sep 28, 2016 · 15 comments
Closed
Labels
A-incr-comp Area: Incremental compilation E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion.

Comments

@michaelwoerister
Copy link
Member

michaelwoerister commented Sep 28, 2016

This issue is part of a broader effort to implement comprehensive regression testing for incremental compilation. For the tracking issue go to #36350.

Background

For incremental compilation we need to determine if a given HIR node has changed in between two versions of a program. This is implemented in the calculate_svh module. We compute a hash value for each HIR node that corresponds to a node in the dependency graph and then compare those hash values. We call this hash value the Incremental Compilation Hash (ICH) of the HIR node. It is supposed to be sensitive to any change that might make a semantic difference to the thing being hashed.

Testing Methodology

The auto-tests in the src/test/incremental directory all follow the same pattern:

  • Each source file defines one test case
  • The source file is compiled multiple times with incremental compilation turned on, each time with a different --cfg flag, allowing each version to differ via conditional compilation. Each of these versions we call a revision.
  • During each revision, the test runner will make sure that some assertions about the dependency graph of the program hold.
  • These assertions are specified in the source code of the test file as attributes attached to the items being tested (e.g. #[rustc_clean]/#[rustc_dirty]).

The ICH-specific tests use this framework by adhering to the following pattern:

  • There are two versions of the definition under test, one marked with #[cfg(cfail1)] and the second marked with #[cfg(not(cfail1))]. As a consequence, the first revision will be compiled with the first version of the definition, and all other revisions will be compiled with the second version. The two versions are supposed to differ in exactly one aspect (e.g. the visibility of a field is different, or the return of a function has changed).
  • The second definition has a #[rustc_dirty(label="Hir", cfg="cfail2")] attribute attached. This attribute makes the test runner check that a change of the Hir dependency node of the definition has been detected between revisions cfail1 and cfail2. This will effectively test that a different ICH value has been computed for the two versions of the definition.
  • The second definition also has a #[rustc_clean(label="Hir", cfg="cfail3")] attribute. This makes sure that the Hir dependency node (and thus the ICH) of the definition has not changed between revisions cfail2 and cfail3. That's what we expect, because both revisions use the same version of the definition.
  • For definitions that are exported from the crate, we also want to check the ICH of the corresponding metadata. This is tested using the #[rustc_metadata_clean]/#[rustc_metadata_dirty] attributes and works analogous to the Hir case: We add #[rustc_metadata_dirty(cfg="cfail2")] to the second definition to make sure that the ICH of the exported metadata is not the same for the different versions of the definition, and we add #[rustc_metadata_dirty(cfg="cfail3")] to make sure that the ICH is the same for the two identical versions of the definition.

Why are the revisions called "cfail"? That's because of reasons internal to how
the test-runner works. Prefixing a revision with "cfail" tells the test runner to treat the test like a "compile-file" test, that is: compile the test case but don't actually run it (which would be the case for an "rpass" revision). For the ICH tests we need to compile "rlibs", so that we can test metadata ICHs. As a consequence we cannot declare them "rpass". In an additional directive (// must-compile-successfully), we tell the test runner that we actually expect the file to not produce any errors.

Each test case must contain the following test-runner directives and crate attributes at the top:

// must-compile-successfully
// revisions: cfail1 cfail2 cfail3
// compile-flags: -Z query-dep-graph

#![feature(rustc_attrs)]  // <-- let's us use #[rustc_dirty], etc.
#![crate_type="rlib"]     // <-- makes sure that we export definitions

See src/test/incremental/hashes/struct_defs.rs for a full example of such a ICH regression test.

Function Interface Specific ICH Testing

Each of the following things should be tested with its own definition pair:

  • Add a parameter to the function
  • Add a return type to the function
  • Change type of function parameter (i32 => i64)
  • Change type of function parameter (&i32 => &mut i32)
  • Change order of function parameters
  • Make the function unsafe
  • Add extern modifier to function
  • Change extern "C" to extern "rust-intrinsic"
  • Add type parameter to function
  • Add lifetime parameter to function
  • Add trait bound to function type parameter
  • Add builtin bound to function type parameter
  • Add lifetime bound to function type parameter
  • Add second trait bound to function type parameter
  • Add second builtin bound to function type parameter
  • Add second lifetime bound to function type parameter
  • Add an #[inline] attribute to the function
  • Change #[inline] to #[inline(never)]
  • Add a #[no_mangle] attribute to the function
  • Add a #[linkage] attribute to the function
  • Change from regular return type to -> impl Trait (if that works already)

EDIT: Some more cases

  • Change return type of function indirectly by modifying a use statement
  • Change type of function parameter indirectly by modifying a use statement
  • Change trait bound of type parameter indirectly by modifying a use statement
  • Change trait bound of type parameter in where clause indirectly by modifying a use statement
@michaelwoerister michaelwoerister added E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. A-incr-comp Area: Incremental compilation labels Sep 28, 2016
@MathieuBordere
Copy link

@michaelwoerister I would like to take this one up, looks like a good way to start contributing to Rust.

@michaelwoerister
Copy link
Member Author

@MathieuBordere Excellent! I'd start by copying src/test/incremental/hashes/struct_defs.rs and adapting it. If you have any questions, just post them here.

To execute the incremental compilation test suite, run make check-stage1-incremental -j4 in your Rust directory.

@MathieuBordere
Copy link

@michaelwoerister I've got a couple of questions

  1. What to do with failing tests? e.g. Marking the function as unsafe doesn't include the function in the dirty set
  2. A couple of features are experimental and fail to compile -> e.g. the #[linkage]attribute and returning -> impl Trait Do I leave these out?

@jonas-schievink
Copy link
Contributor

Marking the function as unsafe doesn't include the function in the dirty set

Congratulations, your tests have discovered a hole in the incr. comp. logic! You can try to fix it (closing #36914 in the process) or comment the particular part of the test out and put a // FIXME(#36914) there.

A couple of features are experimental and fail to compile

You can add #[feature] directives for them. Tests are allowed to use unstable features, they're supposed to test them.

@michaelwoerister
Copy link
Member Author

What to do with failing tests? e.g. Marking the function as unsafe doesn't include the function in the dirty set

That means that you've discovered a bug there. I've opened #36914 with regard to this. I'd suggest that you comment out that particular case and caption it with something like: // FIXME(#36914): This should work but doesn't yet. I'll fix it soon then. If you are interested, you can also try to fix it yourself (in which case I'd provide you with more information on the actual implementation of the ICH).

A couple of features are experimental and fail to compile -> e.g. the #[linkage]attribute and returning -> impl Trait Do I leave these out?

You can activate those features with a #![feature(...)] crate attribute and should do so. The error messages should tell you exactly which features you need to activate.

@michaelwoerister
Copy link
Member Author

@jonas-schievink :)

@MathieuBordere
Copy link

@michaelwoerister @jonas-schievink

There are actually 4 failing cases:

  • Marking the function unsafe #36914
  • Add extern modifier to function
  • Change extern "C" to extern "rust-intrinsic"
  • Change type of function parameter (&i32 => &mut i32)

@michaelwoerister I would be interested in fixing the bugs myself

@michaelwoerister
Copy link
Member Author

OK, so the reason these tests fail is that the StrictVersionHashVisitor does not incorporate these things into the hash.

The StrictVersionHashVisitor (which is called that for historical reasons) is a HIR visitor, which means that it will walk the compiler's internal representation of the AST. It will always start at an hir::Item (that is, a function, a struct, an impl, etc) and walk all the components of that item, but without recursing into sub-items. So it will walk the expressions and signature of a function, but if there is a nested module or function within that function, that will be ignored. So StrictVersionHashVisitor::visit_item() is our entry point for the algorithm.

The implementation of the hashing visitor follows a specific pattern the workings of which might not be immediately obvious: Instead of hashing things directly, it copies the data that should be hashed into an aggregate value (the SawXXX types) and then hashes that. For example, it is:

fn visit_name(&mut self, span: Span, name: Name) {
        debug!("visit_name: st={:?}", self.st);
        SawIdent(name.as_str()).hash(self.st);
        hash_span!(self, span);
}

instead of something like:

fn visit_name(&mut self, span: Span, name: Name) {
        debug!("visit_name: st={:?}", self.st);
        "ident".hash(self.st)
        name.as_str().hash(self.st);
        hash_span!(self, span);
}

Whether using this pattern is strictly better than not using it, I'm not sure, but it's been there for some time and I'd say we just keep following it.

In general HIR visitors work like this:

  • There is the Visitor trait which provides callbacks for each component of the HIR it visits.
  • The default implementation of these callbacks does nothing except for recursing further down the tree. If you want to have something special happen in one of the visit_xxx methods, you override that method in the impl of your visitor.
  • If you want your visitor to keep recursing after your special code, you have to call the corresponding walk_xxx method to make the visitor do so.

Here's an example:

// This gets called when the visitor encounters a struct field
fn visit_struct_field(&mut self, s: &'tcx StructField) {
    // We record that we saw a struct field
    SawStructField.hash(self.st);
    // We hash the span of the field
    hash_span!(self, s.span);
    // We hash the attributes of the field
    hash_attrs!(self, &s.attrs);
    // Now we recurse further into the StructField structure
    // where the visitor will encounter the name of the field
    // (and hash it via visit_name()) and its type (and hash it
    // via visit_ty())
    visit::walk_struct_field(self, s)
}

Now for the bugs we are trying to fix: The problem is that hashing simply skips those things. The information about extern is in the ABI field of the function or method and it's never hashed anywhere. unsafe can be found in the same places.

I would suggest creating a new SawXXX type for items to solve this issue:

#[derive(Hash)]
enum SawItemComponent {
    SawItemExternCrate,
    SawItemUse,
    SawItemStatic(Mutability),
    SawItemConst,
    SawItemFn(Unsafety, Constness, Abi),
    ...
}

// See saw_expr() for reference
fn saw_item(node: &Item_) -> SawItemComponent {
    match *node {
        ItemExternCrate(..) => SawItemExternCrate,
        ItemUse(..) => SawItemUse,
        ItemStatic(_, mutability, _) => SawItemStatic,
        ItemConst(..) =>SawItemConst,
        ItemFn(_, unsafety, constness, abi, _, _) => SawItemFn(unsafety, constness, abi),
        ...
    }
}

fn visit_item(&mut self, i: &'tcx Item) {
    debug!("visit_item: {:?} st={:?}", i, self.st);
    SawItem(saw_item(&i.node)).hash(self.st);
    hash_span!(self, i.span);
    hash_attrs!(self, &i.attrs);
    visit::walk_item(self, i)
}

That would take care of the extern and unsafe cases for regular functions (which are hir::Items. For methods within traits and impls, you'd need an analogous SawTraitOrImplItemComponent. The same goes for the & versus &mut case. You'd need a SawPatComponent there. Again, the SawPatComponent enum variants only need to contain things that are not visited by the visitor anyway.

@MathieuBordere
Copy link

Thanks for the lengthy write-up, I'm looking into it, but will probably need some time to digest it :-)

@michaelwoerister
Copy link
Member Author

michaelwoerister commented Oct 3, 2016

No problem, it is a lot to digest and read up on in the code.

@MathieuBordere
Copy link

Thanks, I'm almost there, tests are passing now. Last little question, I'm a bit confused by the difference between Builtin and Trait. At the moment I use Send and Sized as Builtin and Eq and Clone as Trait . But I'm not entirely sure what I'm doing ...

@michaelwoerister
Copy link
Member Author

@MathieuBordere The traits/builtins you have chosen are OK. It's actually not that important since the compiler doesn't distinguish the two at the HIR level yet, but better to be on the safe side.

@MathieuBordere
Copy link

MathieuBordere commented Oct 4, 2016

make check is failing with
test [debuginfo-lldb] debuginfo-lldb/union-smoke.rs ... FAILED
will try to figure it out before I create a PR.

@michaelwoerister
Copy link
Member Author

It's unlikely that your changes have something to do with that. I wouldn't put too much time into debugging it.

sophiajt pushed a commit to sophiajt/rust that referenced this issue Oct 6, 2016
…Interfaces, r=michaelwoerister

Mb/36812 ich function interfaces

r? @michaelwoerister

This PR contains fixes for rust-lang#36812 and rust-lang#36914
@michaelwoerister
Copy link
Member Author

Fixed by #36974

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-incr-comp Area: Incremental compilation E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion.
Projects
None yet
Development

No branches or pull requests

3 participants