Remove `Nonterminal` and `TokenKind::Interpolated` #763
Labels: major-change, major-change-accepted, T-compiler
Proposal
Declarative macro expansion involves parsing token sequences into AST nodes, which are then pasted back into the token stream as `TokenKind::Interpolated` tokens. Each such token contains a `Nonterminal`, an enum that can hold an AST expr, stmt, item, block, etc.

This MCP proposes to instead convert the AST node back into tokens and insert those tokens into the token stream, with invisible (a.k.a. "none") delimiters around the token sequence to protect precedence.
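The precedence protection at stake here is observable today with any declarative macro: a captured `$e:expr` behaves as a single grouped unit when pasted back into the macro body. A minimal runnable illustration:

```rust
macro_rules! double {
    ($e:expr) => { $e * 2 };
}

fn main() {
    // The captured expression is treated as one grouped unit, so this
    // expands as (1 + 1) * 2 = 4, not as 1 + 1 * 2 = 3.
    assert_eq!(double!(1 + 1), 4);
}
```

Without some grouping mechanism around the pasted tokens, the expansion would parse as `1 + 1 * 2` and yield 3; invisible delimiters are what preserve this grouping once `Interpolated` tokens are gone.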
One reason for the change is that it's really weird to have AST pieces interpolated through a token stream, because tokens and AST nodes are two different levels of representation. It's a bit like having a sequence of words and punctuation in natural language where some of the "words" are themselves phrases, sentences, or paragraphs. When I first encountered `Interpolated` tokens it took me some time to understand them.

Another reason is that it makes the implementations of declarative macros and proc macros more similar. Currently, if you pass an `Interpolated` token to a proc macro, the proc macro bridge converts it into a `Group` delimited by invisible delimiters. If that group is then returned from the proc macro, the invisible delimiters remain. (Proc macros can also create invisible-delimited sequences from scratch.) In other words, proc macros work entirely with token streams, so it will be nice for declarative macros (and the parser) to do the same.

Also, the parser currently just ignores all invisible delimiters! This leads to occasional precedence issues like rust-lang/rust#67062. Fixing this bug was one of my original motivations for this work. It turned out to be more complicated than I originally expected, and this MCP alone won't be enough to fix the bug, but it's definitely a step in the right direction: it will give us a single mechanism for grouping tokens instead of two, and the parser will no longer eliminate all invisible delimiters.
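One user-visible consequence of `Interpolated` tokens is that a fragment captured by one declarative macro and forwarded to another matches as a single opaque token tree. A runnable sketch of that behavior:

```rust
// Counts the token trees in its input.
macro_rules! count_tts {
    () => { 0 };
    ($head:tt $($rest:tt)*) => { 1 + count_tts!($($rest)*) };
}

// Captures an expression, then forwards it to count_tts!.
macro_rules! forward {
    ($e:expr) => { count_tts!($e) };
}

fn main() {
    // Passed directly, `1 + 2` is three token trees: `1`, `+`, `2`.
    assert_eq!(count_tts!(1 + 2), 3);
    // Forwarded through an expr fragment, it is one opaque token tree.
    assert_eq!(forward!(1 + 2), 1);
}
```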
This change will also completely eliminate the "forwarding a match fragment" limitation of declarative macros. (Update: this is closer to being eliminated, but the limitation is still necessary.)

There will be some minor perf effects, because tokenizing and reparsing AST fragments has non-zero cost. Most rustc-perf benchmarks aren't affected. `deep-vector` is the big exception: it's an artificial stress test containing a single `vec!` call with 100,000+ zeroes in it, which is a pathological case for this change. Currently the biggest regression is 90% for an incr-unchanged check build, but there's an easy change that will reduce that to 25-30%. `hyper` and `libc` also see some moderate regressions, up to 7% in the worst case. I think these regressions can be reduced further, but probably not fully eliminated. Some other benchmarks see slight improvements of up to 1.5%, probably because `TokenKind` can now be made `Copy`, and tokens get copied around a lot.

Mentors or Reviewers
@petrochenkov will review, and has helped a lot along the way.
rust-lang/rust#124141 has a draft implementation, which is very close to completely working. This is my third attempt at this change in three years (rust-lang/rust#96724 and rust-lang/rust#114647 were my previous attempts) and I'm confident it will succeed this time. Other than the lifting of the "forwarding a match fragment" limitation, there shouldn't be any user-visible changes.
Process
The main points of the Major Change Process are as follows:
- File an issue describing the proposal.
- A compiler team member or contributor who is knowledgeable in the area can second by writing `@rustbot second`.
  - Finding a "second" suffices for internal changes. If however, you are proposing a new public-facing feature, such as a `-C flag`, then full team check-off is required.
  - Compiler team members can initiate a check-off via `@rfcbot fcp merge` on either the MCP or the PR.

You can read more about Major Change Proposals on forge.
Comments
This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.