Skip to content

Commit

Permalink
Auto merge of rust-lang#119673 - petrochenkov:dialoc5, r=compiler-err…
Browse files Browse the repository at this point in the history
…ors,cjgillot

macro_rules: Preserve all metavariable spans in a global side table

This PR preserves spans of `tt` metavariables used to pass tokens to declarative macros.
Such metavariable spans can then be used in span combination operations like `Span::to` to improve all kinds of diagnostics.

Spans of non-`tt` metavariables are currently kept in nonterminal tokens, but the long term plan is remove all nonterminal tokens from rustc parser and rely on the proc macro model with invisible delimiters (rust-lang#114647, rust-lang#67062).
In particular, `NtIdent` nonterminal (corresponding to `ident` metavariables) becomes easy to remove when this PR lands (rust-lang#119412 does it).

The metavariable spans are kept in a global side table keyed by `Span`s of original tokens.
The alternative to the side table is keeping them in `SpanData` instead, but the performance regressions would be large because any spans from tokens passed to declarative macros would stop being inline and would work through span interner instead, and the penalty would be paid even if we never use the metavar span for the given original span.
(But also see the comment on `fn maybe_use_metavar_location` describing the map collision issues with the side table approach.)

There are also other alternatives - keeping the metavar span in `Token` or `TokenTree`, but associating it with `Span` itsel is the most natural choice because metavar spans are used in span combining operations, and those operations are not necessarily tied to tokens.
  • Loading branch information
bors committed Feb 18, 2024
2 parents 8b21296 + 9f8d05f commit 4c0ba1e
Show file tree
Hide file tree
Showing 29 changed files with 232 additions and 125 deletions.
1 change: 1 addition & 0 deletions compiler/rustc_expand/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
#![feature(if_let_guard)]
#![feature(let_chains)]
#![feature(macro_metavar_expr)]
#![feature(map_try_insert)]
#![feature(proc_macro_diagnostic)]
#![feature(proc_macro_internals)]
#![feature(proc_macro_span)]
Expand Down
55 changes: 45 additions & 10 deletions compiler/rustc_expand/src/mbe/transcribe.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ use rustc_errors::DiagnosticBuilder;
use rustc_errors::{pluralize, PResult};
use rustc_span::hygiene::{LocalExpnId, Transparency};
use rustc_span::symbol::{sym, Ident, MacroRulesNormalizedIdent};
use rustc_span::{Span, SyntaxContext};
use rustc_span::{with_metavar_spans, Span, SyntaxContext};

use smallvec::{smallvec, SmallVec};
use std::mem;
Expand Down Expand Up @@ -254,7 +254,8 @@ pub(super) fn transcribe<'a>(
MatchedTokenTree(tt) => {
// `tt`s are emitted into the output stream directly as "raw tokens",
// without wrapping them into groups.
result.push(maybe_use_metavar_location(cx, &stack, sp, tt));
let tt = maybe_use_metavar_location(cx, &stack, sp, tt, &mut marker);
result.push(tt);
}
MatchedNonterminal(nt) => {
// Other variables are emitted into the output stream as groups with
Expand Down Expand Up @@ -319,6 +320,17 @@ pub(super) fn transcribe<'a>(
}
}

/// Store the metavariable span for this original span into a side table.
/// FIXME: Try to put the metavariable span into `SpanData` instead of a side table (#118517).
/// An optimal encoding for inlined spans will need to be selected to minimize regressions.
/// The side table approach is relatively good, but not perfect due to collisions.
/// In particular, collisions happen when token is passed as an argument through several macro
/// calls, like in recursive macros.
/// The old heuristic below is used to improve spans in case of collisions, but diagnostics are
/// still degraded sometimes in those cases.
///
/// The old heuristic:
///
/// Usually metavariables `$var` produce interpolated tokens, which have an additional place for
/// keeping both the original span and the metavariable span. For `tt` metavariables that's not the
/// case however, and there's no place for keeping a second span. So we try to give the single
Expand All @@ -338,15 +350,12 @@ pub(super) fn transcribe<'a>(
/// These are typically used for passing larger amounts of code, and tokens in that code usually
/// combine with each other and not with tokens outside of the sequence.
/// - The metavariable span comes from a different crate, then we prefer the more local span.
///
/// FIXME: Find a way to keep both original and metavariable spans for all tokens without
/// regressing compilation time too much. Several experiments for adding such spans were made in
/// the past (PR #95580, #118517, #118671) and all showed some regressions.
fn maybe_use_metavar_location(
cx: &ExtCtxt<'_>,
stack: &[Frame<'_>],
metavar_span: Span,
mut metavar_span: Span,
orig_tt: &TokenTree,
marker: &mut Marker,
) -> TokenTree {
let undelimited_seq = matches!(
stack.last(),
Expand All @@ -357,18 +366,44 @@ fn maybe_use_metavar_location(
..
})
);
if undelimited_seq || cx.source_map().is_imported(metavar_span) {
if undelimited_seq {
// Do not record metavar spans for tokens from undelimited sequences, for perf reasons.
return orig_tt.clone();
}

let insert = |mspans: &mut FxHashMap<_, _>, s, ms| match mspans.try_insert(s, ms) {
Ok(_) => true,
Err(err) => *err.entry.get() == ms, // Tried to insert the same span, still success
};
marker.visit_span(&mut metavar_span);
let no_collision = match orig_tt {
TokenTree::Token(token, ..) => {
with_metavar_spans(|mspans| insert(mspans, token.span, metavar_span))
}
TokenTree::Delimited(dspan, ..) => with_metavar_spans(|mspans| {
insert(mspans, dspan.open, metavar_span)
&& insert(mspans, dspan.close, metavar_span)
&& insert(mspans, dspan.entire(), metavar_span)
}),
};
if no_collision || cx.source_map().is_imported(metavar_span) {
return orig_tt.clone();
}

// Setting metavar spans for the heuristic spans gives better opportunities for combining them
// with neighboring spans even despite their different syntactic contexts.
match orig_tt {
TokenTree::Token(Token { kind, span }, spacing) => {
let span = metavar_span.with_ctxt(span.ctxt());
with_metavar_spans(|mspans| insert(mspans, span, metavar_span));
TokenTree::Token(Token { kind: kind.clone(), span }, *spacing)
}
TokenTree::Delimited(dspan, dspacing, delimiter, tts) => {
let open = metavar_span.shrink_to_lo().with_ctxt(dspan.open.ctxt());
let close = metavar_span.shrink_to_hi().with_ctxt(dspan.close.ctxt());
let open = metavar_span.with_ctxt(dspan.open.ctxt());
let close = metavar_span.with_ctxt(dspan.close.ctxt());
with_metavar_spans(|mspans| {
insert(mspans, open, metavar_span) && insert(mspans, close, metavar_span)
});
let dspan = DelimSpan::from_pair(open, close);
TokenTree::Delimited(dspan, *dspacing, *delimiter, tts.clone())
}
Expand Down
73 changes: 59 additions & 14 deletions compiler/rustc_span/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,7 @@ pub mod fatal_error;

pub mod profiling;

use rustc_data_structures::fx::FxHashMap;
use rustc_data_structures::stable_hasher::{Hash128, Hash64, HashStable, StableHasher};
use rustc_data_structures::sync::{FreezeLock, FreezeWriteGuard, Lock, Lrc};

Expand All @@ -98,6 +99,9 @@ mod tests;
pub struct SessionGlobals {
symbol_interner: symbol::Interner,
span_interner: Lock<span_encoding::SpanInterner>,
/// Maps a macro argument token into use of the corresponding metavariable in the macro body.
/// Collisions are possible and processed in `maybe_use_metavar_location` on best effort basis.
metavar_spans: Lock<FxHashMap<Span, Span>>,
hygiene_data: Lock<hygiene::HygieneData>,

/// A reference to the source map in the `Session`. It's an `Option`
Expand All @@ -115,6 +119,7 @@ impl SessionGlobals {
SessionGlobals {
symbol_interner: symbol::Interner::fresh(),
span_interner: Lock::new(span_encoding::SpanInterner::default()),
metavar_spans: Default::default(),
hygiene_data: Lock::new(hygiene::HygieneData::new(edition)),
source_map: Lock::new(None),
}
Expand Down Expand Up @@ -168,6 +173,11 @@ pub fn create_default_session_globals_then<R>(f: impl FnOnce() -> R) -> R {
// deserialization.
scoped_tls::scoped_thread_local!(static SESSION_GLOBALS: SessionGlobals);

#[inline]
pub fn with_metavar_spans<R>(f: impl FnOnce(&mut FxHashMap<Span, Span>) -> R) -> R {
with_session_globals(|session_globals| f(&mut session_globals.metavar_spans.lock()))
}

// FIXME: We should use this enum or something like it to get rid of the
// use of magic `/rust/1.x/...` paths across the board.
#[derive(Debug, Eq, PartialEq, Clone, Ord, PartialOrd, Decodable)]
Expand Down Expand Up @@ -824,29 +834,64 @@ impl Span {
)
}

/// Check if you can select metavar spans for the given spans to get matching contexts.
fn try_metavars(a: SpanData, b: SpanData, a_orig: Span, b_orig: Span) -> (SpanData, SpanData) {
let get = |mspans: &FxHashMap<_, _>, s| mspans.get(&s).copied();
match with_metavar_spans(|mspans| (get(mspans, a_orig), get(mspans, b_orig))) {
(None, None) => {}
(Some(meta_a), None) => {
let meta_a = meta_a.data();
if meta_a.ctxt == b.ctxt {
return (meta_a, b);
}
}
(None, Some(meta_b)) => {
let meta_b = meta_b.data();
if a.ctxt == meta_b.ctxt {
return (a, meta_b);
}
}
(Some(meta_a), Some(meta_b)) => {
let meta_b = meta_b.data();
if a.ctxt == meta_b.ctxt {
return (a, meta_b);
}
let meta_a = meta_a.data();
if meta_a.ctxt == b.ctxt {
return (meta_a, b);
} else if meta_a.ctxt == meta_b.ctxt {
return (meta_a, meta_b);
}
}
}

(a, b)
}

/// Prepare two spans to a combine operation like `to` or `between`.
/// FIXME: consider using declarative macro metavariable spans for the given spans if they are
/// better suitable for combining (#119412).
fn prepare_to_combine(
a_orig: Span,
b_orig: Span,
) -> Result<(SpanData, SpanData, Option<LocalDefId>), Span> {
let (a, b) = (a_orig.data(), b_orig.data());
if a.ctxt == b.ctxt {
return Ok((a, b, if a.parent == b.parent { a.parent } else { None }));
}

if a.ctxt != b.ctxt {
// Context mismatches usually happen when procedural macros combine spans copied from
// the macro input with spans produced by the macro (`Span::*_site`).
// In that case we consider the combined span to be produced by the macro and return
// the original macro-produced span as the result.
// Otherwise we just fall back to returning the first span.
// Combining locations typically doesn't make sense in case of context mismatches.
// `is_root` here is a fast path optimization.
let a_is_callsite = a.ctxt.is_root() || a.ctxt == b.span().source_callsite().ctxt();
return Err(if a_is_callsite { b_orig } else { a_orig });
let (a, b) = Span::try_metavars(a, b, a_orig, b_orig);
if a.ctxt == b.ctxt {
return Ok((a, b, if a.parent == b.parent { a.parent } else { None }));
}

let parent = if a.parent == b.parent { a.parent } else { None };
Ok((a, b, parent))
// Context mismatches usually happen when procedural macros combine spans copied from
// the macro input with spans produced by the macro (`Span::*_site`).
// In that case we consider the combined span to be produced by the macro and return
// the original macro-produced span as the result.
// Otherwise we just fall back to returning the first span.
// Combining locations typically doesn't make sense in case of context mismatches.
// `is_root` here is a fast path optimization.
let a_is_callsite = a.ctxt.is_root() || a.ctxt == b.span().source_callsite().ctxt();
Err(if a_is_callsite { b_orig } else { a_orig })
}

/// This span, but in a larger context, may switch to the metavariable span if suitable.
Expand Down
8 changes: 8 additions & 0 deletions tests/coverage/no_spans.cov-map
Original file line number Diff line number Diff line change
@@ -1,3 +1,11 @@
Function name: no_spans::affected_function
Raw bytes (9): 0x[01, 01, 00, 01, 01, 1a, 1c, 00, 1d]
Number of files: 1
- file 0 => global file 1
Number of expressions: 0
Number of file 0 mappings: 1
- Code(Counter(0)) at (prev + 26, 28) to (start + 0, 29)

Function name: no_spans::affected_function::{closure#0}
Raw bytes (9): 0x[01, 01, 00, 01, 01, 1b, 0c, 00, 0e]
Number of files: 1
Expand Down
12 changes: 12 additions & 0 deletions tests/coverage/no_spans_if_not.cov-map
Original file line number Diff line number Diff line change
@@ -1,3 +1,15 @@
Function name: no_spans_if_not::affected_function
Raw bytes (21): 0x[01, 01, 01, 01, 00, 03, 01, 16, 1c, 01, 12, 02, 02, 0d, 00, 0f, 00, 02, 0d, 00, 0f]
Number of files: 1
- file 0 => global file 1
Number of expressions: 1
- expression 0 operands: lhs = Counter(0), rhs = Zero
Number of file 0 mappings: 3
- Code(Counter(0)) at (prev + 22, 28) to (start + 1, 18)
- Code(Expression(0, Sub)) at (prev + 2, 13) to (start + 0, 15)
= (c0 - Zero)
- Code(Zero) at (prev + 2, 13) to (start + 0, 15)

Function name: no_spans_if_not::main
Raw bytes (9): 0x[01, 01, 00, 01, 01, 0b, 01, 02, 02]
Number of files: 1
Expand Down
4 changes: 2 additions & 2 deletions tests/ui/anon-params/anon-params-edition-hygiene.rs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
//@ check-pass
//@ edition:2018
//@ aux-build:anon-params-edition-hygiene.rs

Expand All @@ -8,7 +9,6 @@
extern crate anon_params_edition_hygiene;

generate_trait_2015_ident!(u8);
// FIXME: Edition hygiene doesn't work correctly with `tt`s in this case.
generate_trait_2015_tt!(u8); //~ ERROR expected one of `:`, `@`, or `|`, found `)`
generate_trait_2015_tt!(u8);

fn main() {}
23 changes: 0 additions & 23 deletions tests/ui/anon-params/anon-params-edition-hygiene.stderr

This file was deleted.

4 changes: 2 additions & 2 deletions tests/ui/editions/edition-keywords-2018-2018-parsing.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ macro_rules! local_passes_ident {
($i: ident) => ($i) //~ ERROR macro expansion ends with an incomplete expression
}
macro_rules! local_passes_tt {
($i: tt) => ($i) //~ ERROR macro expansion ends with an incomplete expression
($i: tt) => ($i)
}

pub fn check_async() {
Expand All @@ -34,7 +34,7 @@ pub fn check_async() {
if passes_tt!(r#async) == 1 {} // OK
if local_passes_ident!(async) == 1 {} // Error reported above in the macro
if local_passes_ident!(r#async) == 1 {} // OK
if local_passes_tt!(async) == 1 {} // Error reported above in the macro
if local_passes_tt!(async) == 1 {} //~ ERROR macro expansion ends with an incomplete expression
if local_passes_tt!(r#async) == 1 {} // OK
module::async(); //~ ERROR expected identifier, found keyword `async`
module::r#async(); // OK
Expand Down
6 changes: 3 additions & 3 deletions tests/ui/editions/edition-keywords-2018-2018-parsing.stderr
Original file line number Diff line number Diff line change
Expand Up @@ -68,10 +68,10 @@ LL | ($i: ident) => ($i)
| ^ expected one of `move`, `|`, or `||`

error: macro expansion ends with an incomplete expression: expected one of `move`, `|`, or `||`
--> $DIR/edition-keywords-2018-2018-parsing.rs:19:20
--> $DIR/edition-keywords-2018-2018-parsing.rs:37:30
|
LL | ($i: tt) => ($i)
| ^ expected one of `move`, `|`, or `||`
LL | if local_passes_tt!(async) == 1 {}
| ^ expected one of `move`, `|`, or `||`

error[E0308]: mismatched types
--> $DIR/edition-keywords-2018-2018-parsing.rs:42:33
Expand Down
2 changes: 1 addition & 1 deletion tests/ui/fmt/format-args-capture-first-literal-is-macro.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@ extern crate format_string_proc_macro;
macro_rules! identity_mbe {
($tt:tt) => {
$tt
//~^ ERROR there is no argument named `a`
};
}

Expand All @@ -16,6 +15,7 @@ fn main() {
format!(identity_pm!("{a}"));
//~^ ERROR there is no argument named `a`
format!(identity_mbe!("{a}"));
//~^ ERROR there is no argument named `a`
format!(concat!("{a}"));
//~^ ERROR there is no argument named `a`
}
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
error: there is no argument named `a`
--> $DIR/format-args-capture-first-literal-is-macro.rs:16:26
--> $DIR/format-args-capture-first-literal-is-macro.rs:15:26
|
LL | format!(identity_pm!("{a}"));
| ^^^^^
Expand All @@ -8,10 +8,10 @@ LL | format!(identity_pm!("{a}"));
= note: to avoid ambiguity, `format_args!` cannot capture variables when the format string is expanded from a macro

error: there is no argument named `a`
--> $DIR/format-args-capture-first-literal-is-macro.rs:8:9
--> $DIR/format-args-capture-first-literal-is-macro.rs:17:27
|
LL | $tt
| ^^^
LL | format!(identity_mbe!("{a}"));
| ^^^^^
|
= note: did you intend to capture a variable `a` from the surrounding scope?
= note: to avoid ambiguity, `format_args!` cannot capture variables when the format string is expanded from a macro
Expand Down
4 changes: 3 additions & 1 deletion tests/ui/lint/wide_pointer_comparisons.rs
Original file line number Diff line number Diff line change
Expand Up @@ -110,10 +110,12 @@ fn main() {
{
macro_rules! cmp {
($a:tt, $b:tt) => { $a == $b }
//~^ WARN ambiguous wide pointer comparison
}

// FIXME: This lint uses some custom span combination logic.
// Rewrite it to adapt to the new metavariable span rules.
cmp!(a, b);
//~^ WARN ambiguous wide pointer comparison
}

{
Expand Down
Loading

0 comments on commit 4c0ba1e

Please sign in to comment.