-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A new matcher representation for use in parse_tt
#95555
A new matcher representation for use in parse_tt
#95555
Conversation
`parse_tt` currently traverses a `&[TokenTree]` to do matching. But this is a bad representation for the traversal. - `TokenTree` is nested, and there's a bunch of expensive and fiddly state required to handle entering and exiting nested submatchers. - There are three positions (sequence separators, sequence Kleene ops, and end of the matcher) that are represented by an index that exceeds the end of the `&[TokenTree]`, which is clumsy and error-prone. This commit introduces a new representation called `MatcherLoc` that is designed specifically for matching. It fixes all the above problems, making the code much easier to read. A `&[TokenTree]` is converted to a `&[MatcherLoc]` before matching begins. Despite the cost of the conversion, it's still a net performance win, because various pieces of traversal state are computed once up-front, rather than having to be recomputed repeatedly during the macro matching. Some improvements worth noting. - `parse_tt_inner` is *much* easier to read. No more having to compare `idx` against `len` and read comments to understand what the result means. - The handling of `Delimited` in `parse_tt_inner` is now trivial. - The three end-of-sequence cases in `parse_tt_inner` are now handled in three separate match arms, and the control flow is much simpler. - `nameize` is no longer recursive. - There were two places that issued "missing fragment specifier" errors: one in `parse_tt_inner()`, and one in `nameize()`. Presumably the latter was never executed. There's now a single place issuing these errors, in `compute_locs()`. - The number of heap allocations done for a `check full` build of `async-std-1.10.0` (an extreme example of heavy macro use) drops from 11.8M to 2.6M, and most of these occur outside of macro matching. - The size of `MatcherPos` drops from 64 bytes to 16 bytes. Small enough that it no longer needs boxing, which partly accounts for the reduction in allocations. - The rest of the drop in allocations is due to the removal of `MatcherKind`, because we no longer need to record anything for the parent matcher when entering a submatcher. - Overall it reduces code size by 45 lines.
To match the order the variants are declared in.
04103c6
to
0bd47e8
Compare
Best reviewed one commit at a time. Apologies for the first commit being quite large; the change is big and invasive enough that there wasn't really a good way to break it into smaller pieces. |
Here are some instruction count results for
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 0bd47e8 with merge 6ac9d4cf75f3aa6be784c21636d0f83c9f8d7302... |
☀️ Try build successful - checks-actions |
Queued 6ac9d4cf75f3aa6be784c21636d0f83c9f8d7302 with parent ec667fb, future comparison URL. |
Finished benchmarking commit (6ac9d4cf75f3aa6be784c21636d0f83c9f8d7302): comparison url. Summary:
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Footnotes |
@bors r+ |
📌 Commit 0bd47e8 has been approved by |
☀️ Test successful - checks-actions |
Finished benchmarking commit (6a9080b): comparison url. Summary:
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression Footnotes |
I looked at the regressions for Likewise for Overall I'm not worried because the speed benefits outweigh the regressions, but I'll still see if I can improve things some more, either by doing what the comment above suggests, or something else. |
In rust-lang#95555 this was moved out of `parse_tt_inner` and `nameize` into `compute_locs`. But the next commit will be moving `compute_locs` outwards to a place that isn't suitable for the missing fragment identifier checking. So this reinstates the old checking.
…r-rule, r=petrochenkov Call `compute_locs` once per rule This fixes the small regressions on `wg-grammar` and `hyper-0.14.18` seen in rust-lang#95555. r? `@petrochenkov`
This is a very nice code improvement! Thanks @nnethercote ! |
PR rust-lang#95159 unintentionally changed the behaviour of declarative macro matching for `NoDelim` delimited sequences. - Before rust-lang#95159, delimiters were implicit in `mbe::Delimited`. When doing macro matching, normal delimiters were generated out of thin air as necessary, but `NoDelim` delimiters were not. This was done within `TokenTree::{get_tt,len}`. - After rust-lang#95159, the delimiters were explicit. There was an unintentional change whereby `NoDelim` delimiters were represented explicitly just like normal delimeters. - rust-lang#95555 introduced a new matcher representation (`MatcherLoc`) and the `NoDelim` delimiters were made explicit within it, just like `mbe::Delimited`. - rust-lang#95797 then changed `mbe::Delimited` back to having implicit delimiters, but because matching is now being done on `MatcherLoc`, the behavioural change persisted. The fix is simple: remove the explicit `NoDelim` delimiters in the `MatcherLoc` representation. This gives back the original behaviour. As for why this took so long to be found: it seems that `NoDelim` sequences are unusual. It took a macro that generates another macro (from the `yarte_lexer` crate, found via a crater run) to uncover this. Fixes rust-lang#96305.
Remove box syntax from rustc_mir_dataflow and rustc_mir_transform Continuation of rust-lang#87781, inspired by rust-lang#97239. The usages that this PR removes have not appeared from nothing, instead the usage in `rustc_mir_dataflow` and `rustc_mir_transform` was from rust-lang#80522 which split up `rustc_mir`, and which was filed before I filed rust-lang#87781, so it was using the state from before my PR. But it was merged after my PR was merged, so the `box_syntax` uses were able to survive here. Outside of this introduction due to the code being outside of the master branch at the point of merging of my PR, there was only one other introduction of box syntax, in rust-lang#95159. That box syntax was removed again though in rust-lang#95555. Outside of that, `box_syntax` has not made its reoccurrance in compiler crates.
By transforming the matcher into a different form,
parse_tt
can run faster and be easier to understand.r? @petrochenkov