Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Avoid re-interning types in outlives checking #67899

Closed
wants to merge 8 commits into from

Conversation

Marwes
Copy link
Contributor

@Marwes Marwes commented Jan 5, 2020

In profiling intern_ty is a very hot function (9% in the test I used). While there does not seem to be a way to reduce the cost of calling we can avoid the call in some cases.

In outlives checking ParamTy and ProjectionTy are extracted from the Ty value that contains them only to later be passed as an argument to intern_ty again later. This seems to be happening a lot in my test with intern_ty called from outlives is at ~6%.

Since all ParamTy and ProjectionTy are already stored in a Ty I had an idea to pass around a View type which provides direct access to the specific, inner type without losing the original Ty pointer. While the current implementation does so with some unsafe to let the branch be elided on Deref, it could be done entirely in safe code as well, either by accepting the (predictable) branch in Deref or by storing the inner type in View as well as the Ty. But considering that the unsafe is trivial to prove and the call sites seem quite hot I opted to show the unsafe approach first.

Based on #67840 (since it touches the same file/lines)

Commits without #67840 https://github.com/rust-lang/rust/pull/67899/files/77ddc3540e52be4b5bd75cf082c621392acaf81b..b55bab206096c27533120921f6b0c273f115e34a

@rust-highfive
Copy link
Collaborator

r? @varkor

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 5, 2020
@jonas-schievink
Copy link
Contributor

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Jan 5, 2020

⌛ Trying commit b55bab2 with merge 4218803...

bors added a commit that referenced this pull request Jan 5, 2020
perf: Avoid re-interning types in outlives checking

In profiling `intern_ty` is a very hot function (9% in the test I used). While there does not seem to be a way to reduce the cost of calling we can avoid the call in some cases.

In outlives checking `ParamTy` and `ProjectionTy` are extracted from the `Ty` value that contains them only to later be passed as an argument to `intern_ty` again later. This seems to be happening a lot in my test with `intern_ty` called from outlives is at ~6%.

Since all `ParamTy` and `ProjectionTy` are already stored in a `Ty` I had an idea to pass around a `View` type which provides direct access to the specific, inner type without losing the original `Ty` pointer. While the current implementation does so with some unsafe to let the branch be elided on `Deref`, it could be done entirely in safe code as well, either by accepting the (predictable) branch in `Deref` or by storing the inner type in `View` as well as the `Ty`. But considering that the unsafe is trivial to prove and the call sites seem quite hot I opted to show the unsafe approach first.

Based on #67840 (since it touches the same file/lines)

Commits without #67840 https://github.com/rust-lang/rust/pull/67899/files/77ddc3540e52be4b5bd75cf082c621392acaf81b..b55bab206096c27533120921f6b0c273f115e34a
@Zoxc
Copy link
Contributor

Zoxc commented Jan 5, 2020

Did you try just passing along the Ty<'tcx>?

@Marwes
Copy link
Contributor Author

Marwes commented Jan 5, 2020

Did you try just passing along the Ty<'tcx>?

No, my roundtrip for profiling are >1.5 h with the computers I have access to atm so trying variations isn't very feasible. I would be surprised if it gave a measurable effect, as I don't believe the values are passed around enough to cause the extra memcpy/register copies to matter. It is definitely slower though not matter how minuscule (hence why I submitted it as is for feedback, also I only thought of that variant while writing the PR message).

@bors
Copy link
Contributor

bors commented Jan 5, 2020

☀️ Try build successful - checks-azure
Build commit: 4218803 (4218803e644995081642c893d4efd9b8dad60ee5)

@rust-timer
Copy link
Collaborator

Queued 4218803 with parent 7785834, future comparison URL.

@varkor
Copy link
Member

varkor commented Jan 6, 2020

Perf shows some regressions and no improvements. If you've made some changes, I can trigger another perf run.

@bors
Copy link
Contributor

bors commented Jan 6, 2020

☔ The latest upstream changes (presumably #67886) made this pull request unmergeable. Please resolve the merge conflicts.

@Zoxc
Copy link
Contributor

Zoxc commented Jan 6, 2020

The perf run should be done without #67840. It looks like an improvement diffing the diffs =P

@Marwes
Copy link
Contributor Author

Marwes commented Jan 7, 2020

Separated this out from #67840 since that one looks problematic.

@varkor
Copy link
Member

varkor commented Jan 7, 2020

@bors try @rust-timer queue

@bors
Copy link
Contributor

bors commented Jan 7, 2020

⌛ Trying commit 4e77efb41bec21ddef3223e484eaa21f531e6942 with merge bd311dcdd7b31daa87c22040bf0895a276ceffda...

@bors
Copy link
Contributor

bors commented Jan 8, 2020

☔ The latest upstream changes (presumably #67970) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Jan 8, 2020

☀️ Try build successful - checks-azure
Build commit: bd311dcdd7b31daa87c22040bf0895a276ceffda (bd311dcdd7b31daa87c22040bf0895a276ceffda)

@lqd
Copy link
Member

lqd commented Jan 8, 2020

@rust-timer build bd311dcdd7b31daa87c22040bf0895a276ceffda

@rust-timer
Copy link
Collaborator

Queued bd311dcdd7b31daa87c22040bf0895a276ceffda with parent 8597644, future comparison URL.

@Marwes
Copy link
Contributor Author

Marwes commented Jan 8, 2020

Regression still persists, so I guess one of the refactorings actually caused it instead of the main change from #67840 . Will try to isolate it, it might just be that regions get added in a slightly different way which happens to further regress the slow loop in lexical_region_resole. #68001 should fix that to some extent at least.

@bors
Copy link
Contributor

bors commented Jan 10, 2020

☔ The latest upstream changes (presumably #68078) made this pull request unmergeable. Please resolve the merge conflicts.

@Marwes
Copy link
Contributor Author

Marwes commented Jan 10, 2020

Rebased and removed the commits which causes the regression

@rust-highfive
Copy link
Collaborator

The job mingw-check of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
2020-01-10T10:53:34.9640843Z ##[command]git remote add origin https://github.com/rust-lang/rust
2020-01-10T10:53:34.9721076Z ##[command]git config gc.auto 0
2020-01-10T10:53:34.9796477Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2020-01-10T10:53:34.9867175Z ##[command]git config --get-all http.proxy
2020-01-10T10:53:35.0044695Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/67899/merge:refs/remotes/pull/67899/merge
---
2020-01-10T11:00:11.7114932Z     Checking rustc_builtin_macros v0.0.0 (/checkout/src/librustc_builtin_macros)
2020-01-10T11:00:15.5858140Z error: unused import: `Hasher`
2020-01-10T11:00:15.5859684Z  --> src/librustc/ty/view.rs:3:18
2020-01-10T11:00:15.5861016Z   |
2020-01-10T11:00:15.5862311Z 3 |     hash::{Hash, Hasher},
2020-01-10T11:00:15.5864743Z   |
2020-01-10T11:00:15.5865480Z   = note: `-D unused-imports` implied by `-D warnings`
2020-01-10T11:00:15.5865810Z 
2020-01-10T11:00:34.9664102Z error: aborting due to previous error
2020-01-10T11:00:34.9664102Z error: aborting due to previous error
2020-01-10T11:00:34.9664762Z 
2020-01-10T11:00:34.9992402Z error: could not compile `rustc`.
2020-01-10T11:00:34.9992967Z 
2020-01-10T11:00:34.9993494Z To learn more, run the command again with --verbose.
2020-01-10T11:00:35.0018997Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "check" "-Zconfig-profile" "--target" "x86_64-unknown-linux-gnu" "-Zbinary-dep-depinfo" "-j" "2" "--release" "--color" "always" "--features" " llvm" "--manifest-path" "/checkout/src/rustc/Cargo.toml" "--message-format" "json-render-diagnostics"
2020-01-10T11:00:35.0025319Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap check
2020-01-10T11:00:35.0025608Z Build completed unsuccessfully in 0:04:39
2020-01-10T11:00:35.0075393Z == clock drift check ==
2020-01-10T11:00:35.0083107Z   local time: Fri Jan 10 11:00:35 UTC 2020
2020-01-10T11:00:35.0083107Z   local time: Fri Jan 10 11:00:35 UTC 2020
2020-01-10T11:00:35.2845038Z   network time: Fri, 10 Jan 2020 11:00:35 GMT
2020-01-10T11:00:35.7372568Z == end clock drift check ==
2020-01-10T11:00:35.7372797Z 
2020-01-10T11:00:35.7434141Z ##[error]Bash exited with code '1'.
2020-01-10T11:00:35.7465263Z ##[section]Starting: Checkout
2020-01-10T11:00:35.7467690Z ==============================================================================
2020-01-10T11:00:35.7467771Z Task         : Get sources
2020-01-10T11:00:35.7467824Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@bors
Copy link
Contributor

bors commented Jan 10, 2020

⌛ Trying commit c243723 with merge 51adcde...

bors added a commit that referenced this pull request Jan 10, 2020
perf: Avoid re-interning types in outlives checking

In profiling `intern_ty` is a very hot function (9% in the test I used). While there does not seem to be a way to reduce the cost of calling we can avoid the call in some cases.

In outlives checking `ParamTy` and `ProjectionTy` are extracted from the `Ty` value that contains them only to later be passed as an argument to `intern_ty` again later. This seems to be happening a lot in my test with `intern_ty` called from outlives is at ~6%.

Since all `ParamTy` and `ProjectionTy` are already stored in a `Ty` I had an idea to pass around a `View` type which provides direct access to the specific, inner type without losing the original `Ty` pointer. While the current implementation does so with some unsafe to let the branch be elided on `Deref`, it could be done entirely in safe code as well, either by accepting the (predictable) branch in `Deref` or by storing the inner type in `View` as well as the `Ty`. But considering that the unsafe is trivial to prove and the call sites seem quite hot I opted to show the unsafe approach first.

Based on #67840 (since it touches the same file/lines)

Commits without #67840 https://github.com/rust-lang/rust/pull/67899/files/77ddc3540e52be4b5bd75cf082c621392acaf81b..b55bab206096c27533120921f6b0c273f115e34a
@bors
Copy link
Contributor

bors commented Jan 10, 2020

☀️ Try build successful - checks-azure
Build commit: 51adcde (51adcdefa464cabb373a958cc3289a624f1cc53f)

@rust-timer
Copy link
Collaborator

Queued 51adcde with parent 2d8d559, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 51adcde, comparison URL.

Marwes and others added 8 commits January 17, 2020 23:22
`ty::View<'tcx, T>` acts like a `T` but internally stores a `Ty<'tcx>`
pointer. Thanks to this, it is possible to retrieve the original `Ty`
value without needing to ask the interner for it.

Looking up the already created type (`T`) takes a good chunk of the
time in `infer/outlives/verify.rs` so this should be a good speedup.
It may be applicable in other places as well, but those are far lower
when profiling.
But it instead provies "view" types when possible
@Marwes
Copy link
Contributor Author

Marwes commented Jan 18, 2020

Added some inline annotations and specialized TypeFoldable and Lift. Can I get a perf run to see if it fixes the regression?

@Zoxc
Copy link
Contributor

Zoxc commented Jan 18, 2020

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Jan 18, 2020

⌛ Trying commit 1214625 with merge 170c7fc350184210f80c8bba1590e650f02fd5c5...

@bors
Copy link
Contributor

bors commented Jan 18, 2020

☀️ Try build successful - checks-azure
Build commit: 170c7fc350184210f80c8bba1590e650f02fd5c5 (170c7fc350184210f80c8bba1590e650f02fd5c5)

@rust-timer
Copy link
Collaborator

Queued 170c7fc350184210f80c8bba1590e650f02fd5c5 with parent 1b117d7, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit 170c7fc350184210f80c8bba1590e650f02fd5c5, comparison URL.

@Marwes
Copy link
Contributor Author

Marwes commented Jan 20, 2020

Regression is gone, but performance is the same or slightly improved on trait heavy crates (https://github.com/Marwes/combine is the crate mentioned in the PR, which is basically entirely trait implementations). I can get some numbers for that perhaps.

I have seen some other spots which could potentially benefit from this optimization (both the part that avoids interning and the space reduction part) but I wanted to see that this works out first.

@varkor
Copy link
Member

varkor commented Jan 20, 2020

The numbers from the perf run look like noise. If there are some crates that do are optimised with this change, we should get numbers on them and also add them to perf so we don't overlook these sorts of changes.

@varkor varkor added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 22, 2020
@joelpalmer
Copy link

Ping from Triage: any updates @Marwes?

@JohnCSimon JohnCSimon added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Feb 10, 2020
@JohnCSimon
Copy link
Member

Ping from triage: @Marwes
Thanks for your contribution but it seems this PR has sat idle for the last three weeks, so I'm going to close this as inactive. Please feel free to open this up again if you're willing to continue work on this.
Note: don't push to branch with a closed PR as it prevents it from being opened again.

@JohnCSimon JohnCSimon closed this Feb 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants