From 6f47be4611f5a412a9b2ddf6ae4eeecc388a4693 Mon Sep 17 00:00:00 2001
From: Matthew Woodcraft
Date: Mon, 3 Jan 2022 21:05:27 +0000
Subject: [PATCH 1/5] Add a 'Reserved prefixes' section to the Tokens page

---
 src/tokens.md | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/src/tokens.md b/src/tokens.md
index 1c14a1819..682ef1902 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -659,3 +659,34 @@ them are referred to as "token trees" in [macros]. The three types of brackets
 [use declarations]: items/use-declarations.md
 [use wildcards]: items/use-declarations.md
 [while let]: expressions/loop-expr.md#predicate-pattern-loops
+
+## Reserved prefixes
+
+Some lexical forms known as _reserved prefixes_ are reserved for future use.
+
+Source input which would otherwise be lexically interpreted as a non-raw identifier (or a keyword) which is immediately followed by a `#`, `'`, or `"` character (without intervening whitespace) is identified as a reserved prefix.
+
+Note that raw identifiers, raw string literals, and raw byte string literals may contain a `#` character but are not interpreted as containing a reserved prefix.
+
+Similarly the `r`, `b`, and `br` prefixes used in raw string literals, byte literals, byte string literals, and raw byte string literals are not interpreted as reserved prefixes.
+
+> **Edition Differences**: Starting with the 2021 edition, reserved prefixes are reported as an error by the lexer (in particular, they cannot be passed to macros).
+>
+> Before the 2021 edition, reserved prefixes are accepted by the lexer and interpreted as multiple tokens (for example, one token for the identifier or keyword, followed by a `#` token).
+>
+> Examples accepted in all editions:
+> ```rust
+> macro_rules! lexes {($($_:tt)*) => {}}
+> lexes!{a #foo}
+> lexes!{continue 'foo}
+> lexes!{match "..." {}}
+> lexes!{r#let#foo} // three tokens: r#let # foo
+> ```
+>
+> Examples accepted before the 2021 edition but rejected later:
+> ```rust,edition2018
+> macro_rules! lexes {($($_:tt)*) => {}}
+> lexes!{a#foo}
+> lexes!{continue'foo}
+> lexes!{match"..." {}}
+> ```

From 13ef57cffc07444bcc4f0fa47630a63787cdd914 Mon Sep 17 00:00:00 2001
From: Matthew Woodcraft
Date: Mon, 3 Jan 2022 21:07:21 +0000
Subject: [PATCH 2/5] Reword the first sentence describing literal suffixes.

Make it say that a suffix of the same form as a keyword is permitted.
Make it clearer that the "Reserved prefixes" rule doesn't apply to suffixes.
---
 src/tokens.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/tokens.md b/src/tokens.md
index 682ef1902..b0a606afc 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -88,8 +88,7 @@ evaluated (primarily) at compile time.

 #### Suffixes

-A suffix is a non-raw identifier immediately (without whitespace)
-following the primary part of a literal.
+A suffix is a sequence of characters following the primary part of a literal (without intervening whitespace), of the same form as a non-raw identifier or keyword.

 Any kind of literal (string, integer, etc) with any suffix is valid as a token,
 and can be passed to a macro without producing an error.

From 9f8acfc976c75e29989121b0a64feb1176f9edb6 Mon Sep 17 00:00:00 2001
From: Matthew Woodcraft
Date: Tue, 4 Jan 2022 21:56:28 +0000
Subject: [PATCH 3/5] Say that `_#`, `_'`, and `_"` are reserved prefixes

---
 src/tokens.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/tokens.md b/src/tokens.md
index b0a606afc..f42f64894 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -663,7 +663,7 @@ them are referred to as "token trees" in [macros]. The three types of brackets

 Some lexical forms known as _reserved prefixes_ are reserved for future use.

-Source input which would otherwise be lexically interpreted as a non-raw identifier (or a keyword) which is immediately followed by a `#`, `'`, or `"` character (without intervening whitespace) is identified as a reserved prefix.
+Source input which would otherwise be lexically interpreted as a non-raw identifier (or a keyword or `_`) which is immediately followed by a `#`, `'`, or `"` character (without intervening whitespace) is identified as a reserved prefix.

 Note that raw identifiers, raw string literals, and raw byte string literals may contain a `#` character but are not interpreted as containing a reserved prefix.

From c934ea954692675e9975de4d69eb6f4e4cd86809 Mon Sep 17 00:00:00 2001
From: Matthew Woodcraft
Date: Wed, 5 Jan 2022 21:06:44 +0000
Subject: [PATCH 4/5] Add a Lexer rules block for reserved prefixes

---
 src/tokens.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/tokens.md b/src/tokens.md
index f42f64894..eb07806f8 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -661,6 +661,11 @@ them are referred to as "token trees" in [macros]. The three types of brackets

 ## Reserved prefixes

+> **Lexer 2018+**\
+> RESERVED_TOKEN_DOUBLE_QUOTE : ( IDENTIFIER_OR_KEYWORD _Except `b` or `r` or `br`_ | `_` ) `"`\
+> RESERVED_TOKEN_SINGLE_QUOTE : ( IDENTIFIER_OR_KEYWORD _Except `b`_ | `_` ) `'`\
+> RESERVED_TOKEN_POUND : ( IDENTIFIER_OR_KEYWORD _Except `r` or `br`_ | `_` ) `#`
+
 Some lexical forms known as _reserved prefixes_ are reserved for future use.

 Source input which would otherwise be lexically interpreted as a non-raw identifier (or a keyword or `_`) which is immediately followed by a `#`, `'`, or `"` character (without intervening whitespace) is identified as a reserved prefix.

From be704f3a7e690b10d9ee66d5b29d455df82384a6 Mon Sep 17 00:00:00 2001
From: Eric Huss
Date: Wed, 5 Jan 2022 20:17:20 -0800
Subject: [PATCH 5/5] Fix lexer edition.

---
 src/tokens.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/tokens.md b/src/tokens.md
index eb07806f8..88a6d7252 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -661,7 +661,7 @@ them are referred to as "token trees" in [macros]. The three types of brackets

 ## Reserved prefixes

-> **Lexer 2018+**\
+> **Lexer 2021+**\
 > RESERVED_TOKEN_DOUBLE_QUOTE : ( IDENTIFIER_OR_KEYWORD _Except `b` or `r` or `br`_ | `_` ) `"`\
 > RESERVED_TOKEN_SINGLE_QUOTE : ( IDENTIFIER_OR_KEYWORD _Except `b`_ | `_` ) `'`\
 > RESERVED_TOKEN_POUND : ( IDENTIFIER_OR_KEYWORD _Except `r` or `br`_ | `_` ) `#`
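
Illustrative note (not part of the patches): the three RESERVED_TOKEN rules in the final "Lexer 2021+" block can be read as a small decision table. The sketch below is one way to express that reading, assuming the caller has already lexed a non-raw identifier, keyword, or `_` and is looking at the character immediately after it. The `Reserved` enum and the `classify` function are hypothetical names invented for this illustration; they do not come from rustc or from the reference, and the exception lists are copied from the grammar in this series only.

```rust
/// Which reserved-prefix rule a lexeme start would match, following the
/// "Lexer 2021+" grammar in this patch series. Hypothetical type.
#[derive(Debug, PartialEq, Eq)]
enum Reserved {
    DoubleQuote, // RESERVED_TOKEN_DOUBLE_QUOTE
    SingleQuote, // RESERVED_TOKEN_SINGLE_QUOTE
    Pound,       // RESERVED_TOKEN_POUND
}

/// Hypothetical helper. `prefix` is assumed to be text already lexed as a
/// non-raw identifier, a keyword, or `_`; `next` is the character that
/// immediately follows it, with no intervening whitespace.
fn classify(prefix: &str, next: char) -> Option<Reserved> {
    match next {
        // `b"`, `r"` and `br"` start (raw) (byte) string literals.
        '"' if !matches!(prefix, "b" | "r" | "br") => Some(Reserved::DoubleQuote),
        // `b'` starts a byte literal.
        '\'' if prefix != "b" => Some(Reserved::SingleQuote),
        // `r#` and `br#` start raw identifiers or raw (byte) string literals.
        '#' if !matches!(prefix, "r" | "br") => Some(Reserved::Pound),
        // Any other following character is outside these rules.
        _ => None,
    }
}

fn main() {
    // The three forms from the "rejected later" example in patch 1.
    assert_eq!(classify("a", '#'), Some(Reserved::Pound));
    assert_eq!(classify("continue", '\''), Some(Reserved::SingleQuote));
    assert_eq!(classify("match", '"'), Some(Reserved::DoubleQuote));
    // Patch 3 adds `_` as a possible prefix.
    assert_eq!(classify("_", '"'), Some(Reserved::DoubleQuote));
    // Literal prefixes are excluded by the grammar.
    assert_eq!(classify("br", '"'), None);
    assert_eq!(classify("r", '#'), None);
    println!("all reserved-prefix checks passed");
}
```

The sketch does not handle `r#let#foo`: there the `#` follows a raw identifier, which the prose says is not interpreted as containing a reserved prefix, so a real lexer would never reach this check for that input.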