diff --git a/src/identifiers.md b/src/identifiers.md
index 88d3f2071..e57be8407 100644
--- a/src/identifiers.md
+++ b/src/identifiers.md
@@ -2,8 +2,8 @@
 
 > **Lexer:**\
 > IDENTIFIER_OR_KEYWORD :\
->       \[`a`-`z` `A`-`Z`] \[`a`-`z` `A`-`Z` `0`-`9` `_`]\*\
->    | `_` \[`a`-`z` `A`-`Z` `0`-`9` `_`]+
+>       XID_start XID_continue\*\
+>    | `_` XID_continue+
 >
 > RAW_IDENTIFIER : `r#` IDENTIFIER_OR_KEYWORD *Except `crate`, `self`, `super`, `Self`*
 >
@@ -12,18 +12,22 @@
 > IDENTIFIER :\
 > NON_KEYWORD_IDENTIFIER | RAW_IDENTIFIER
 
-An identifier is any nonempty ASCII string of the following form:
+An identifier is any nonempty Unicode string of the following form:
 
 Either
 
-* The first character is a letter.
-* The remaining characters are alphanumeric or `_`.
+* The first character has property [`XID_start`].
+* The remaining characters have property [`XID_continue`].
 
 Or
 
 * The first character is `_`.
 * The identifier is more than one character. `_` alone is not an identifier.
-* The remaining characters are alphanumeric or `_`.
+* The remaining characters have property [`XID_continue`].
+
+> **Note**: [`XID_start`] and [`XID_continue`] as character properties cover the
+> character ranges used to form the more familiar C and Java language-family
+> identifiers.
 
 A raw identifier is like a normal identifier, but prefixed by `r#`. (Note that
 the `r#` prefix is not included as part of the actual identifier.)
@@ -32,3 +36,5 @@ keyword except the ones listed above for `RAW_IDENTIFIER`.
 
 [strict]: keywords.md#strict-keywords
 [reserved]: keywords.md#reserved-keywords
+[`XID_start`]: http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3AXID_Start%3A%5D&abb=on&g=&i=
+[`XID_continue`]: http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5B%3AXID_Continue%3A%5D&abb=on&g=&i=
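
Reviewer's sketch, not part of the patch: the new `IDENTIFIER_OR_KEYWORD` rule can be sanity-checked outside the compiler. The Python below is only an illustration of the two alternatives in the grammar. It assumes that Python's `str.isidentifier()` (which accepts `XID_Start` or `_` first, then `XID_Continue`) is a fair stand-in for the Unicode properties, and the Unicode version bundled with Python may differ from the one the Rust lexer uses. The helper names are hypothetical, and keywords and the `r#` prefix are deliberately not handled.

```python
def has_xid_start(ch: str) -> bool:
    """Approximate test for the XID_Start property of one character."""
    # str.isidentifier() also accepts '_' in first position, so exclude it.
    return ch != "_" and ch.isidentifier()


def has_xid_continue(ch: str) -> bool:
    """Approximate test for the XID_Continue property of one character."""
    # Prefixing a known-valid start character moves `ch` into a
    # "continue" position, so the result depends only on `ch`.
    return ("a" + ch).isidentifier()


def is_identifier_or_keyword(s: str) -> bool:
    """Check `s` against: XID_start XID_continue* | `_` XID_continue+."""
    if not s:
        return False
    first, rest = s[0], s[1:]
    if has_xid_start(first):
        # First alternative: zero or more continue characters may follow.
        return all(has_xid_continue(c) for c in rest)
    if first == "_":
        # Second alternative: at least one continue character is required,
        # so a lone `_` is rejected.
        return len(rest) > 0 and all(has_xid_continue(c) for c in rest)
    return False
```

Under these assumptions, `is_identifier_or_keyword("café")` and `is_identifier_or_keyword("_x")` are accepted, while `"_"` and `"1abc"` are rejected, which matches the prose rules in the patch.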