Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proc_macro/bridge: stop using a remote object handle for proc_macro Ident and Literal #98189

Merged
merged 5 commits into from
Jul 19, 2022

Conversation

mystor
Copy link
Contributor

@mystor mystor commented Jun 17, 2022

This is the fourth part of #86822, split off as requested in #86822 (review). This patch transforms the Ident and Group types into structs serialized over IPC rather than handles.

Symbol values are interned on both the client and server when deserializing, to avoid unnecessary string copies and keep the size of TokenTree down. To do the interning efficiently on the client, the proc-macro crate is given a vendored version of the fxhash hasher, as SipHash appeared to cause performance issues. This was done rather than depending on rustc_hash as it is unfortunately difficult to depend on crates from within proc_macro due to it being built at the same time as std.

In addition, a custom arena allocator and symbol store was also added, inspired by those in rustc_arena and rustc_span. To prevent symbol re-use across multiple invocations of a macro on the same thread, a new range of Symbol names are used for each invocation of the macro, and symbols from previous invocations are cleaned-up.

In order to keep Ident creation efficient, a special ASCII-only case was added to perform ident validation without using RPC for simple identifiers. Full identifier validation couldn't be easily added, as it would require depending on the rustc_lexer and unicode-normalization crates from within proc_macro. Unicode identifiers are validated and normalized using RPC.

See the individual commit messages for more details on trade-offs and design decisions behind these patches.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 17, 2022
@rust-highfive

This comment was marked as off-topic.

@rust-highfive
Copy link
Collaborator

r? @petrochenkov

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 17, 2022
@mystor
Copy link
Contributor Author

mystor commented Jun 17, 2022

r? @eddyb

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Jun 18, 2022

☔ The latest upstream changes (presumably #98186) made this pull request unmergeable. Please resolve the merge conflicts.

@mystor mystor changed the title Stop using a remote object handle for proc_macro Ident and Literal proc_macro/bridge: stop using a remote object handle for proc_macro Ident and Literal Jun 18, 2022
@mystor mystor force-pushed the fast_ident_literal branch 2 times, most recently from a4d0244 to 266d51b Compare June 26, 2022 17:20
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 26, 2022
proc_macro/bridge: cache static spans in proc_macro's client thread-local state

This is the second part of rust-lang#86822, split off as requested in rust-lang#86822 (review). This patch removes the RPC calls required for the very common operations of `Span::call_site()`, `Span::def_site()` and `Span::mixed_site()`.

Some notes:

This part is one of the ones I don't love as a final solution from a design standpoint, because I don't like how the spans are serialized immediately at macro invocation. I think a more elegant solution might've been to reserve special IDs for `call_site`, `def_site`, and `mixed_site` at compile time (either starting at 1 or from `u32::MAX`) and making reading a Span handle automatically map these IDs to the relevant values, rather than doing extra serialization.

This would also have an advantage for potential future work to allow `proc_macro` to operate more independently from the compiler (e.g. to reduce the necessity of `proc-macro2`), as methods like `Span::call_site()` could be made to function without access to the compiler backend.

That was unfortunately tricky to do at the time, as this was the first part I wrote of the patches. After the later part (rust-lang#98188, rust-lang#98189), the other uses of `InternedStore` are removed meaning that a custom serialization strategy for `Span` is easier to implement.

If we want to go that path, we'll still need the majority of the work to split the bridge object and introduce the `Context` trait for free methods, and it will be easier to do after `Span` is the only user of `InternedStore` (after rust-lang#98189).
@bors
Copy link
Contributor

bors commented Jun 28, 2022

☔ The latest upstream changes (presumably #98188) made this pull request unmergeable. Please resolve the merge conflicts.

@rustbot

This comment was marked as off-topic.

@mystor mystor marked this pull request as ready for review June 29, 2022 17:49
@eddyb
Copy link
Member

eddyb commented Jun 29, 2022

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 29, 2022
@bors
Copy link
Contributor

bors commented Jun 29, 2022

⌛ Trying commit eed310647a8cb28a4ddabf5d43033cb1618f61a0 with merge 63a4e3d54a9ba6a95c75507b75a2b28fbfdbfa5c...

@bors
Copy link
Contributor

bors commented Jun 29, 2022

☀️ Try build successful - checks-actions
Build commit: 63a4e3d54a9ba6a95c75507b75a2b28fbfdbfa5c (63a4e3d54a9ba6a95c75507b75a2b28fbfdbfa5c)

Copy link
Member

@eddyb eddyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a couple nits (sorry for the delay!)

library/proc_macro/src/lib.rs Outdated Show resolved Hide resolved
library/proc_macro/src/lib.rs Outdated Show resolved Hide resolved
Unfortunately, as it is difficult to depend on crates from within proc_macro,
this is done by vendoring a copy of the hasher as a module rather than
depending on the rustc_hash crate.

This probably doesn't have a substantial impact up-front, however will be more
relevant once symbols are interned within the proc_macro client.
This was removed in a previous part, however it should be specialized for
to_string performance and consistency.
Doing this for all unicode identifiers would require a dependency on
`unicode-normalization` and `rustc_lexer`, which is currently not
possible for `proc_macro` due to it being built concurrently with `std`
and `core`. Instead, ASCII identifiers are validated locally, and an RPC
message is used to validate unicode identifiers when needed.

String values are interned on the both the server and client when
deserializing, to avoid unnecessary copies and keep Ident cheap to copy and
move. This appears to be important for performance.

The client-side interner is based roughly on the one from rustc_span, and uses
an arena inspired by rustc_arena.

RPC messages passing symbols always include the full value. This could
potentially be optimized in the future if it is revealed to be a
performance bottleneck.

Despite now having a relevant implementaion of Display for Ident, ToString is
still specialized, as it is a hot-path for this object.

The symbol infrastructure will also be used for literals in the next
part.
This builds on the symbol infrastructure built for `Ident` to replicate
the `LitKind` and `Lit` structures in rustc within the `proc_macro`
client, allowing literals to be fully created and interacted with from
the client thread. Only parsing and subspan operations still require
sync RPC.
This method is still only used for Literal::subspan, however the
implementation only depends on the Span component, so it is simpler and
more efficient for now to pass down only the information that is needed.
In the future, if more information about the Literal is required in the
implementation (e.g. to validate that spans line up as expected with
source text), that extra information can be added back with extra
arguments.
@rustbot
Copy link
Collaborator

rustbot commented Jul 18, 2022

Some changes occurred in library/proc_macro/src/bridge

cc @rust-lang/wg-rls-2

@eddyb
Copy link
Member

eddyb commented Jul 18, 2022

@bors r+

@bors
Copy link
Contributor

bors commented Jul 18, 2022

📌 Commit c4acac6 has been approved by eddyb

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 18, 2022
@bors
Copy link
Contributor

bors commented Jul 19, 2022

⌛ Testing commit c4acac6 with merge c3f3550...

@bors
Copy link
Contributor

bors commented Jul 19, 2022

☀️ Test successful - checks-actions
Approved by: eddyb
Pushing c3f3550 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 19, 2022
@bors bors merged commit c3f3550 into rust-lang:master Jul 19, 2022
@rustbot rustbot added this to the 1.64.0 milestone Jul 19, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (c3f3550): comparison url.

Instruction count

  • Primary benchmarks: mixed results
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
1.0% 2.2% 4
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
-0.7% -1.7% 19
Improvements 🎉
(secondary)
-3.0% -3.1% 3
All 😿🎉 (primary) -0.4% 2.2% 23

Max RSS (memory usage)

Results
  • Primary benchmarks: 😿 relevant regressions found
  • Secondary benchmarks: mixed results
mean1 max count2
Regressions 😿
(primary)
4.9% 6.9% 2
Regressions 😿
(secondary)
4.6% 4.6% 1
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
-2.4% -2.4% 1
All 😿🎉 (primary) 4.9% 6.9% 2

Cycles

Results
  • Primary benchmarks: mixed results
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
2.7% 2.7% 1
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
-2.2% -2.2% 1
Improvements 🎉
(secondary)
-3.3% -3.4% 2
All 😿🎉 (primary) 0.2% 2.7% 2

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged label to this PR.

@rustbot label: +perf-regression

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@rustbot rustbot added the perf-regression Performance regression. label Jul 19, 2022
@mystor
Copy link
Contributor Author

mystor commented Jul 19, 2022

Odd, I wonder what changed since #98189 (comment) to make opt builds of serde-derive slightly slower.

GIven that we basically will always build macros as debug builds (IIRC) I think that this regression likely isn't super relevant, but perhaps the Literal::to_string() impl was made more inlineable or something like that... We still see cargo overall build improvements.

@rylev
Copy link
Member

rylev commented Jul 19, 2022

It seems like there isn't any actionable items to address here, and the improvements outweigh the regressions. I'll mark this as triaged.

@rustbot label: +perf-regression-triaged

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Jul 19, 2022
mystor added a commit to mystor/rust that referenced this pull request Jul 29, 2022
…idge

This is done by having the crossbeam dependency inserted into the
proc_macro server code from the server side, to avoid adding a
dependency to proc_macro.

In addition, this introduces a -Z command-line option which will switch
rustc to run proc-macros using this cross-thread executor. With the
changes to the bridge in rust-lang#98186, rust-lang#98187, rust-lang#98188 and rust-lang#98189, the
performance of the executor should be much closer to same-thread
execution.

In local testing, the crossbeam executor was substantially more
performant than either of the two existing CrossThread strategies, so
they have been removed to keep things simple.
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 30, 2022
proc_macro: use crossbeam channels for the proc_macro cross-thread bridge

This is done by having the crossbeam dependency inserted into the `proc_macro` server code from the server side, to avoid adding a dependency to `proc_macro`.

In addition, this introduces a -Z command-line option which will switch rustc to run proc-macros using this cross-thread executor. With the changes to the bridge in rust-lang#98186, rust-lang#98187, rust-lang#98188 and rust-lang#98189, the performance of the executor should be much closer to same-thread execution.

In local testing, the crossbeam executor was substantially more performant than either of the two existing `CrossThread` strategies, so they have been removed to keep things simple.

r? `@eddyb`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.