From ce0b7cc529b2bba4207bcc2d93af4c7bb3640e64 Mon Sep 17 00:00:00 2001
From: bstrie
Date: Wed, 16 May 2018 00:56:56 +0000
Subject: [PATCH] Fix grammar documentation wrt Unicode identifiers

The grammar defines identifiers in terms of XID_start and XID_continue,
but this is referring to the unstable non_ascii_idents feature.
The documentation implies that non_ascii_idents is forthcoming, but this
is left over from pre-1.0 documentation; in reality, non_ascii_idents has
been without even an RFC for several years now, and will not be stabilized
anytime soon. Furthermore, according to the tracking issue at
https://github.com/rust-lang/rust/issues/28979 , it's highly
questionable whether or not this feature will use XID_start or
XID_continue even when or if non_ascii_idents is stabilized.
This commit fixes this by respecifying identifiers as the usual
[a-zA-Z_][a-zA-Z0-9_]*
---
 src/doc/grammar.md | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/src/doc/grammar.md b/src/doc/grammar.md
index 78432b6a96593..4017c84c9b4b8 100644
--- a/src/doc/grammar.md
+++ b/src/doc/grammar.md
@@ -101,20 +101,15 @@ properties: `ident`, `non_null`, `non_eol`, `non_single_quote` and
 
 ### Identifiers
 
-The `ident` production is any nonempty Unicode[^non_ascii_idents] string of
+The `ident` production is any nonempty Unicode string of
 the following form:
 
-[^non_ascii_idents]: Non-ASCII characters in identifiers are currently feature
-  gated. This is expected to improve soon.
+- The first character is in one of the following ranges `U+0041` to `U+005A`
+("A" to "Z"), `U+0061` to `U+007A` ("a" to "z"), or `U+005F` ("\_").
+- The remaining characters are in the range `U+0030` to `U+0039` ("0" to "9"),
+or any of the prior valid initial characters.
 
-- The first character has property `XID_start`
-- The remaining characters have property `XID_continue`
-
-that does _not_ occur in the set of [keywords](#keywords).
-
-> **Note**: `XID_start` and `XID_continue` as character properties cover the
-> character ranges used to form the more familiar C and Java language-family
-> identifiers.
+as long as the identifier does _not_ occur in the set of [keywords](#keywords).
 
 ### Delimiter-restricted productions
 