Compile-Time Performance Regression #37864
Comments
Hmm, yeah, not really expected.
This is the diff that affected https://gist.github.com/nikomatsakis/ca47ebbcd264452539074899b6d09355 Not seeing yet what might have caused such a perturbation.
@eddyb: You also had a look at that PR, any ideas?
Is it possible the PR increased the amount of code in librustc and hit a pathological case in some way? This shows that most crates had little-to-no difference except for librustc itself, which jumped up by ~30 seconds.
The syntex-syntax test case also shows the regression, so there's definitely something to it.
It looks like I was wrong with my initial assessment about rustc being the only one to show the increase, but this graph shows that it has the largest increase by far out of most rustc crates in that pass.
I'd suggest comparing with callgrind: if the number of calls related to inference, for example, changes, well... My suspicion is basically "impl children get checked twice", but then tests couldn't pass because errors would also be doubled? I'm not sure.
The 142M instructions executed is up from 124M. I've not seen If the problem can't be found soon I suggest reverting #37660.
So I did some experimentation with a small test file: it's certainly not as simple as something in typeck happening twice, or at least if it is I didn't figure out what yet. @nnethercote's samples suggest something about region inference, but we are not running regionck more often, as far as I can tell (nor typeck). Or at least we don't do so on a very simple test case. I will try experimenting with some bigger ones.
I'm not ready to revert yet. Please consult with me before considering such a thing.
OK, I may have found the culprit.
Fix in #37920
I wouldn't presume to revert, but my statement was intended to mark the beginning of such a consultation :)
winapi was totally hit by this.
And I confirmed that #37920 fixes winapi build times back to ~35s, the same that it takes with stable.
in region, treat current (and future) item-likes alike

The `visit_fn` code mutates its surrounding context. Between *items*, this was saved/restored, but between impl items it was not. This meant that we wound up with `CallSiteScope` entries with two parents (or more!). As far as I can tell, this is harmless in actual type-checking, since the regions you interact with are always from at most one of those branches. But it can slow things down.

Before, the effect was limited, since it only applied to impl items within an impl. After #37660, impl items are visited all together at the end, and hence this could create a very messed up hierarchy. Isolating impl items properly solves both issues.

I cannot come up with a way to unit-test this; for posterity, however, you can observe the messed up hierarchies with a test as simple as the following, which would create a callsite scope with two parents both before and after:

```
struct Foo { }

impl Foo {
    fn bar(&self) -> usize { 22 }
    fn baz(&self) -> usize { 22 }
}

fn main() { }
```

Fixes #37864.

r? @michaelwoerister

cc @pnkfelix -- can you think of a way to make a regr test?
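For readers unfamiliar with the pattern, the save/restore idea described above can be sketched with a toy visitor. This is only an illustration under stated assumptions: `Context`, `Visitor`, `visit_isolated`, and the helper functions are hypothetical names, not rustc's actual types or the code in #37920, and the toy shows a leaked context producing a needlessly deep scope chain rather than the exact two-parent `CallSiteScope` situation.

```rust
// Toy model only: every type and function here is hypothetical and is
// not rustc's real region-resolution code.

/// Mutable state a visitor carries while walking items.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Context {
    /// Scope that becomes the parent of the next scope created.
    parent: Option<usize>,
}

struct Visitor {
    cx: Context,
    /// parents[i] is the recorded parent of scope i.
    parents: Vec<Option<usize>>,
}

impl Visitor {
    fn new() -> Self {
        Visitor { cx: Context { parent: None }, parents: Vec::new() }
    }

    /// Creates a scope for a function and leaves `cx` pointing at it,
    /// mimicking a visit that mutates its surrounding context.
    fn visit_fn(&mut self) -> usize {
        let id = self.parents.len();
        self.parents.push(self.cx.parent);
        self.cx.parent = Some(id);
        id
    }

    /// Visits one item-like in isolation: start from a fresh context,
    /// then restore whatever context was active before the visit.
    fn visit_isolated(&mut self) -> usize {
        let saved = self.cx;
        self.cx = Context { parent: None };
        let id = self.visit_fn();
        self.cx = saved;
        id
    }

    /// Number of ancestors of `scope` in the recorded hierarchy.
    fn depth(&self, mut scope: usize) -> usize {
        let mut d = 0;
        while let Some(p) = self.parents[scope] {
            d += 1;
            scope = p;
        }
        d
    }
}

/// Visits `n` functions without isolating them and returns each scope's depth.
fn leaky_depths(n: usize) -> Vec<usize> {
    let mut v = Visitor::new();
    let ids: Vec<usize> = (0..n).map(|_| v.visit_fn()).collect();
    ids.iter().map(|&id| v.depth(id)).collect()
}

/// Visits `n` functions in isolation and returns each scope's depth.
fn isolated_depths(n: usize) -> Vec<usize> {
    let mut v = Visitor::new();
    let ids: Vec<usize> = (0..n).map(|_| v.visit_isolated()).collect();
    ids.iter().map(|&id| v.depth(id)).collect()
}

fn main() {
    println!("leaky:    {:?}", leaky_depths(3));
    println!("isolated: {:?}", isolated_depths(3));
}
```

In this toy, each un-isolated visit nests under the previous one, so any walk up the parent chain grows with the number of items visited, while isolated visits all stay at the root. That is at least consistent with the observation that the leak "can slow things down" without changing results.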
I remeasured and I can confirm the regression is fixed. Performance on rustc-benchmarks is basically equivalent to what it was on Nov 14, my last measurement prior to the regression.
🎉
#37660 appears to have regressed performance by ~6% on bootstrap, due to a near tripling in time for item-bodies checking (23s to 62s). I'm not sure if that was expected or not, but someone should probably investigate. Let me know if I should open a new issue about that.
See here for a comparison across all crates.
cc @nikomatsakis