From 1cc5037ad21eb8f8bc181eaf7651ab9e3b9ca078 Mon Sep 17 00:00:00 2001
From: Michael Dyck
Date: Fri, 16 Aug 2019 23:15:10 -0400
Subject: [PATCH] Editorial: Rearrange "Syntax for Patterns" into 4 subsections

... namely:
- Patterns
- Group Specifiers
- Character Classes
- Escapes

(This moves productions around, but doesn't alter them at all.)

Also, rearrange Early Errors rules and runtime semantics rules to reflect the same order of productions.
---
 spec.html | 491 +++++++++++++++++++++++++++---------------------------
 1 file changed, 249 insertions(+), 242 deletions(-)

diff --git a/spec.html b/spec.html
index 53a33929442..3cdfad91a3d 100644
--- a/spec.html
+++ b/spec.html
@@ -33897,7 +33897,7 @@

RegExp (Regular Expression) Objects

Syntax for Patterns

The `RegExp` constructor applies the following grammar to the input pattern String. An error occurs if the grammar cannot interpret the String as an expansion of |Pattern|.
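As a rough illustration of the paragraph above (not part of the patch): a pattern String that cannot be read as an expansion of |Pattern| makes the constructor throw. The helper name `isValidPattern` is invented for this sketch.

```javascript
// Hypothetical helper: reports whether `source` parses as a Pattern.
function isValidPattern(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything other than a SyntaxError is unexpected here
  }
}

console.log(isValidPattern("(a|b)*")); // true
console.log(isValidPattern("(a|b"));   // false: the group is never closed
```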

-

Syntax

+

Patterns

Pattern[UnicodeMode, N] :: Disjunction[?UnicodeMode, ?N] @@ -33945,33 +33945,15 @@

Syntax

 `(` GroupSpecifier[?UnicodeMode] Disjunction[?UnicodeMode, ?N] `)`
 `(` `?` `:` Disjunction[?UnicodeMode, ?N] `)`

-SyntaxCharacter :: one of
-  `^` `$` `\` `.` `*` `+` `?` `(` `)` `[` `]` `{` `}` `|`
-
 PatternCharacter ::
   SourceCharacter but not SyntaxCharacter

-AtomEscape[UnicodeMode, N] ::
-  DecimalEscape
-  CharacterClassEscape[?UnicodeMode]
-  CharacterEscape[?UnicodeMode]
-  [+N] `k` GroupName[?UnicodeMode]
-
-CharacterEscape[UnicodeMode] ::
-  ControlEscape
-  `c` ControlLetter
-  `0` [lookahead ∉ DecimalDigit]
-  HexEscapeSequence
-  RegExpUnicodeEscapeSequence[?UnicodeMode]
-  IdentityEscape[?UnicodeMode]
-
-ControlEscape :: one of
-  `f` `n` `r` `t` `v`
-
-ControlLetter :: one of
-  `a` `b` `c` `d` `e` `f` `g` `h` `i` `j` `k` `l` `m` `n` `o` `p` `q` `r` `s` `t` `u` `v` `w` `x` `y` `z`
-  `A` `B` `C` `D` `E` `F` `G` `H` `I` `J` `K` `L` `M` `N` `O` `P` `Q` `R` `S` `T` `U` `V` `W` `X` `Y` `Z`
+SyntaxCharacter :: one of
+  `^` `$` `\` `.` `*` `+` `?` `(` `)` `[` `]` `{` `}` `|`
+
+

Group Specifiers

+ GroupSpecifier[UnicodeMode] :: [empty] `?` GroupName[?UnicodeMode] @@ -33998,35 +33980,55 @@

Syntax

<ZWNJ> <ZWJ> - RegExpUnicodeEscapeSequence[UnicodeMode] :: - [+UnicodeMode] `u` HexLeadSurrogate `\u` HexTrailSurrogate - [+UnicodeMode] `u` HexLeadSurrogate - [+UnicodeMode] `u` HexTrailSurrogate - [+UnicodeMode] `u` HexNonSurrogate - [~UnicodeMode] `u` Hex4Digits - [+UnicodeMode] `u{` CodePoint `}` - UnicodeLeadSurrogate :: > any Unicode code point in the inclusive range 0xD800 to 0xDBFF UnicodeTrailSurrogate :: > any Unicode code point in the inclusive range 0xDC00 to 0xDFFF
-

Each `\\u` |HexTrailSurrogate| for which the choice of associated `u` |HexLeadSurrogate| is ambiguous shall be associated with the nearest possible `u` |HexLeadSurrogate| that would otherwise have no corresponding `\\u` |HexTrailSurrogate|.

+ +

Character Classes

-HexLeadSurrogate ::
-  Hex4Digits [> but only if the MV of |Hex4Digits| is in the inclusive range 0xD800 to 0xDBFF]
+CharacterClass[UnicodeMode] ::
+  `[` [lookahead != `^`] ClassRanges[?UnicodeMode] `]`
+  `[` `^` ClassRanges[?UnicodeMode] `]`

-HexTrailSurrogate ::
-  Hex4Digits [> but only if the MV of |Hex4Digits| is in the inclusive range 0xDC00 to 0xDFFF]
+ClassRanges[UnicodeMode] ::
+  [empty]
+  NonemptyClassRanges[?UnicodeMode]

-HexNonSurrogate ::
-  Hex4Digits [> but only if the MV of |Hex4Digits| is not in the inclusive range 0xD800 to 0xDFFF]
+NonemptyClassRanges[UnicodeMode] ::
+  ClassAtom[?UnicodeMode]
+  ClassAtom[?UnicodeMode] NonemptyClassRangesNoDash[?UnicodeMode]
+  ClassAtom[?UnicodeMode] `-` ClassAtom[?UnicodeMode] ClassRanges[?UnicodeMode]

-IdentityEscape[UnicodeMode] ::
-  [+UnicodeMode] SyntaxCharacter
-  [+UnicodeMode] `/`
-  [~UnicodeMode] SourceCharacter but not UnicodeIDContinue
+NonemptyClassRangesNoDash[UnicodeMode] ::
+  ClassAtom[?UnicodeMode]
+  ClassAtomNoDash[?UnicodeMode] NonemptyClassRangesNoDash[?UnicodeMode]
+  ClassAtomNoDash[?UnicodeMode] `-` ClassAtom[?UnicodeMode] ClassRanges[?UnicodeMode]
+
+ClassAtom[UnicodeMode] ::
+  `-`
+  ClassAtomNoDash[?UnicodeMode]
+
+ClassAtomNoDash[UnicodeMode] ::
+  SourceCharacter but not one of `\` or `]` or `-`
+  `\` ClassEscape[?UnicodeMode]
+
+

Escapes

+
+ClassEscape[UnicodeMode] ::
+  `b`
+  [+UnicodeMode] `-`
+  CharacterClassEscape[?UnicodeMode]
+  CharacterEscape[?UnicodeMode]
+
+AtomEscape[UnicodeMode, N] ::
+  DecimalEscape
+  CharacterClassEscape[?UnicodeMode]
+  CharacterEscape[?UnicodeMode]
+  [+N] `k` GroupName[?UnicodeMode]

 DecimalEscape ::
   NonZeroDigit DecimalDigits[~Sep]? [lookahead ∉ DecimalDigit]
@@ -34068,37 +34070,44 @@

Syntax

 ControlLetter `_`

-CharacterClass[UnicodeMode] ::
-  `[` [lookahead != `^`] ClassRanges[?UnicodeMode] `]`
-  `[` `^` ClassRanges[?UnicodeMode] `]`
+CharacterEscape[UnicodeMode] ::
+  ControlEscape
+  `c` ControlLetter
+  `0` [lookahead ∉ DecimalDigit]
+  HexEscapeSequence
+  RegExpUnicodeEscapeSequence[?UnicodeMode]
+  IdentityEscape[?UnicodeMode]

-ClassRanges[UnicodeMode] ::
-  [empty]
-  NonemptyClassRanges[?UnicodeMode]
+ControlEscape :: one of
+  `f` `n` `r` `t` `v`

-NonemptyClassRanges[UnicodeMode] ::
-  ClassAtom[?UnicodeMode]
-  ClassAtom[?UnicodeMode] NonemptyClassRangesNoDash[?UnicodeMode]
-  ClassAtom[?UnicodeMode] `-` ClassAtom[?UnicodeMode] ClassRanges[?UnicodeMode]
+ControlLetter :: one of
+  `a` `b` `c` `d` `e` `f` `g` `h` `i` `j` `k` `l` `m` `n` `o` `p` `q` `r` `s` `t` `u` `v` `w` `x` `y` `z`
+  `A` `B` `C` `D` `E` `F` `G` `H` `I` `J` `K` `L` `M` `N` `O` `P` `Q` `R` `S` `T` `U` `V` `W` `X` `Y` `Z`

-NonemptyClassRangesNoDash[UnicodeMode] ::
-  ClassAtom[?UnicodeMode]
-  ClassAtomNoDash[?UnicodeMode] NonemptyClassRangesNoDash[?UnicodeMode]
-  ClassAtomNoDash[?UnicodeMode] `-` ClassAtom[?UnicodeMode] ClassRanges[?UnicodeMode]
+RegExpUnicodeEscapeSequence[UnicodeMode] ::
+  [+UnicodeMode] `u` HexLeadSurrogate `\u` HexTrailSurrogate
+  [+UnicodeMode] `u` HexLeadSurrogate
+  [+UnicodeMode] `u` HexTrailSurrogate
+  [+UnicodeMode] `u` HexNonSurrogate
+  [~UnicodeMode] `u` Hex4Digits
+  [+UnicodeMode] `u{` CodePoint `}`
+
+

Each `\\u` |HexTrailSurrogate| for which the choice of associated `u` |HexLeadSurrogate| is ambiguous shall be associated with the nearest possible `u` |HexLeadSurrogate| that would otherwise have no corresponding `\\u` |HexTrailSurrogate|.
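A small sketch of the lead/trail pairing this note relies on (it does not exercise the ambiguous case itself). It assumes an engine that supports the `u` flag; the variable names are made up for the example.

```javascript
const grin = "\u{1F600}"; // U+1F600, stored as the code units D83D DE00

// With `u`, the lead and trail escapes are read together as one code point.
const unicodeClass = /^[\uD83D\uDE00]$/u;
// Without `u`, each \uXXXX escape is its own code unit, so the class has two members.
const legacyClass = /^[\uD83D\uDE00]$/;

console.log(unicodeClass.test(grin));    // true
console.log(legacyClass.test(grin));     // false: the input is two code units long
console.log(legacyClass.test("\uD83D")); // true: a lone lead surrogate is one member
```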

+
+HexLeadSurrogate ::
+  Hex4Digits [> but only if the MV of |Hex4Digits| is in the inclusive range 0xD800 to 0xDBFF]

-ClassAtom[UnicodeMode] ::
-  `-`
-  ClassAtomNoDash[?UnicodeMode]
+HexTrailSurrogate ::
+  Hex4Digits [> but only if the MV of |Hex4Digits| is in the inclusive range 0xDC00 to 0xDFFF]

-ClassAtomNoDash[UnicodeMode] ::
-  SourceCharacter but not one of `\` or `]` or `-`
-  `\` ClassEscape[?UnicodeMode]
+HexNonSurrogate ::
+  Hex4Digits [> but only if the MV of |Hex4Digits| is not in the inclusive range 0xD800 to 0xDFFF]

-ClassEscape[UnicodeMode] ::
-  `b`
-  [+UnicodeMode] `-`
-  CharacterClassEscape[?UnicodeMode]
-  CharacterEscape[?UnicodeMode]
+IdentityEscape[UnicodeMode] ::
+  [+UnicodeMode] SyntaxCharacter
+  [+UnicodeMode] `/`
+  [~UnicodeMode] SourceCharacter but not UnicodeIDContinue
@@ -34129,16 +34138,16 @@

Static Semantics: Early Errors

It is a Syntax Error if the MV of the first |DecimalDigits| is larger than the MV of the second |DecimalDigits|. - AtomEscape :: `k` GroupName + RegExpIdentifierStart :: `\` RegExpUnicodeEscapeSequence - AtomEscape :: DecimalEscape + RegExpIdentifierPart :: `\` RegExpUnicodeEscapeSequence NonemptyClassRanges :: ClassAtom `-` ClassAtom ClassRanges @@ -34159,22 +34168,22 @@

Static Semantics: Early Errors

It is a Syntax Error if IsCharacterClass of |ClassAtomNoDash| is *false* and IsCharacterClass of |ClassAtom| is *false* and the CharacterValue of |ClassAtomNoDash| is larger than the CharacterValue of |ClassAtom|. - RegExpIdentifierStart :: `\` RegExpUnicodeEscapeSequence + AtomEscape :: DecimalEscape - RegExpIdentifierStart :: UnicodeLeadSurrogate UnicodeTrailSurrogate + AtomEscape :: `k` GroupName - RegExpIdentifierPart :: `\` RegExpUnicodeEscapeSequence + RegExpIdentifierStart :: UnicodeLeadSurrogate UnicodeTrailSurrogate RegExpIdentifierPart :: UnicodeLeadSurrogate UnicodeTrailSurrogate @@ -34426,7 +34435,7 @@

Static Semantics: CharacterValue

1. Return the MV of |HexDigits|. - CharacterEscape :: IdentityEscape + CharacterEscape ::! IdentityEscape 1. Let _ch_ be the code point matched by |IdentityEscape|. 1. Return the code point value of _ch_. @@ -35065,150 +35074,6 @@

- -

AtomEscape

-

With parameter _direction_.

-

The production AtomEscape :: DecimalEscape evaluates as follows:

- - 1. Evaluate |DecimalEscape| to obtain an integer _n_. - 1. Assert: _n_ ≤ _NcapturingParens_. - 1. Return ! BackreferenceMatcher(_n_, _direction_). - -

The production AtomEscape :: CharacterEscape evaluates as follows:

- - 1. Evaluate |CharacterEscape| to obtain a character _ch_. - 1. Let _A_ be a one-element CharSet containing the character _ch_. - 1. Return ! CharacterSetMatcher(_A_, *false*, _direction_). - -

The production AtomEscape :: CharacterClassEscape evaluates as follows:

- - 1. Evaluate |CharacterClassEscape| to obtain a CharSet _A_. - 1. Return ! CharacterSetMatcher(_A_, *false*, _direction_). - - -

An escape sequence of the form `\\` followed by a non-zero decimal number _n_ matches the result of the _n_th set of capturing parentheses (). It is an error if the regular expression has fewer than _n_ capturing parentheses. If the regular expression has _n_ or more capturing parentheses but the _n_th one is *undefined* because it has not captured anything, then the backreference always succeeds.

-
-

The production AtomEscape :: `k` GroupName evaluates as follows:

-
-  1. Search the enclosing |Pattern| for an instance of a |GroupSpecifier| containing a |RegExpIdentifierName| which has a CapturingGroupName equal to the CapturingGroupName of the |RegExpIdentifierName| contained in |GroupName|.
-  1. Assert: A unique such |GroupSpecifier| is found.
-  1. Let _parenIndex_ be the number of left-capturing parentheses in the entire regular expression that occur to the left of the located |GroupSpecifier|. This is the total number of Atom :: `(` GroupSpecifier Disjunction `)` Parse Nodes prior to or enclosing the located |GroupSpecifier|, including its immediately enclosing |Atom|.
-  1. Return ! BackreferenceMatcher(_parenIndex_, _direction_).
-
-
-

- BackreferenceMatcher ( - _n_: a positive integer, - _direction_: 1 or -1, - ) -

-
-
-
-  1. Assert: _n_ ≥ 1.
-  1. Return a new Matcher with parameters (_x_, _c_) that captures _n_ and _direction_ and performs the following steps when called:
-    1. Assert: _x_ is a State.
-    1. Assert: _c_ is a Continuation.
-    1. Let _cap_ be _x_'s _captures_ List.
-    1. Let _s_ be _cap_[_n_].
-    1. If _s_ is *undefined*, return _c_(_x_).
-    1. Let _e_ be _x_'s _endIndex_.
-    1. Let _len_ be the number of elements in _s_.
-    1. Let _f_ be _e_ + _direction_ × _len_.
-    1. If _f_ < 0 or _f_ > _InputLength_, return ~failure~.
-    1. Let _g_ be min(_e_, _f_).
-    1. If there exists an integer _i_ between 0 (inclusive) and _len_ (exclusive) such that Canonicalize(_s_[_i_]) is not the same character value as Canonicalize(_Input_[_g_ + _i_]), return ~failure~.
-    1. Let _y_ be the State (_f_, _cap_).
-    1. Return _c_(_y_).
-
-
- - -

CharacterEscape

-

The |CharacterEscape| productions evaluate as follows:

- - CharacterEscape :: - ControlEscape - `c` ControlLetter - `0` [lookahead ∉ DecimalDigit] - HexEscapeSequence - RegExpUnicodeEscapeSequence - IdentityEscape - - - 1. Let _cv_ be the CharacterValue of this |CharacterEscape|. - 1. Return the character whose character value is _cv_. - -
- - -

DecimalEscape

-

The |DecimalEscape| productions evaluate as follows:

- DecimalEscape :: NonZeroDigit DecimalDigits? - - 1. Return the CapturingGroupNumber of this |DecimalEscape|. - - -

If `\\` is followed by a decimal number _n_ whose first digit is not `0`, then the escape sequence is considered to be a backreference. It is an error if _n_ is greater than the total number of left-capturing parentheses in the entire regular expression.

-
-
- - -

CharacterClassEscape

-

The production CharacterClassEscape :: `d` evaluates as follows:

- - 1. Return the ten-element CharSet containing the characters `0` through `9` inclusive. - -

The production CharacterClassEscape :: `D` evaluates as follows:

- - 1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: `d` . - -

The production CharacterClassEscape :: `s` evaluates as follows:

- - 1. Return the CharSet containing all characters corresponding to a code point on the right-hand side of the |WhiteSpace| or |LineTerminator| productions. - -

The production CharacterClassEscape :: `S` evaluates as follows:

- - 1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: `s` . - -

The production CharacterClassEscape :: `w` evaluates as follows:

- - 1. Return _WordCharacters_. - -

The production CharacterClassEscape :: `W` evaluates as follows:

- - 1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: `w` . - -

The production CharacterClassEscape :: `p{` UnicodePropertyValueExpression `}` evaluates as follows:

- - 1. Return the CharSet containing all Unicode code points included in the CharSet returned by |UnicodePropertyValueExpression|. - -

The production CharacterClassEscape :: `P{` UnicodePropertyValueExpression `}` evaluates as follows:

- - 1. Return the CharSet containing all Unicode code points not included in the CharSet returned by |UnicodePropertyValueExpression|. - -

The production UnicodePropertyValueExpression :: UnicodePropertyName `=` UnicodePropertyValue evaluates as follows:

- - 1. Let _ps_ be SourceText of |UnicodePropertyName|. - 1. Let _p_ be ! UnicodeMatchProperty(_ps_). - 1. Assert: _p_ is a Unicode property name or property alias listed in the “Property name and aliases” column of . - 1. Let _vs_ be SourceText of |UnicodePropertyValue|. - 1. Let _v_ be ! UnicodeMatchPropertyValue(_p_, _vs_). - 1. Return the CharSet containing all Unicode code points whose character database definition includes the property _p_ with value _v_. - -

The production UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue evaluates as follows:

- - 1. Let _s_ be SourceText of |LoneUnicodePropertyNameOrValue|. - 1. If ! UnicodeMatchPropertyValue(`General_Category`, _s_) is identical to a List of Unicode code points that is the name of a Unicode general category or general category alias listed in the “Property value and aliases” column of , then - 1. Return the CharSet containing all Unicode code points whose character database definition includes the property “General_Category” with value _s_. - 1. Let _p_ be ! UnicodeMatchProperty(_s_). - 1. Assert: _p_ is a binary Unicode property or binary property alias listed in the “Property name and aliases” column of . - 1. Return the CharSet containing all Unicode code points whose character database definition includes the property _p_ with value “True”. - -
-

CharacterClass

The production CharacterClass :: `[` ClassRanges `]` evaluates as follows:

@@ -35357,6 +35222,150 @@

ClassEscape

A |ClassAtom| can use any of the escape sequences that are allowed in the rest of the regular expression except for `\\b`, `\\B`, and backreferences. Inside a |CharacterClass|, `\\b` means the backspace character, while `\\B` and backreferences raise errors. Using a backreference inside a |ClassAtom| causes an error.
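A brief sketch of the behaviour this note describes. The `u` flag is used for the error cases because engines that also implement the Annex B legacy grammar may accept `[\B]` and `[\1]` without it. `compilesInUnicodeMode` is a name invented for the example.

```javascript
// Inside a class, \b is the backspace character U+0008, not a word boundary.
const backspaceClass = /[\b]/;
console.log(backspaceClass.test("\u0008")); // true
console.log(backspaceClass.test("a b"));    // false: no backspace in the input

// Hypothetical helper: does `source` compile under the `u` flag?
function compilesInUnicodeMode(source) {
  try {
    new RegExp(source, "u");
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(compilesInUnicodeMode("[\\b]")); // true
console.log(compilesInUnicodeMode("[\\B]")); // false: \B is not a ClassEscape
console.log(compilesInUnicodeMode("[\\1]")); // false: no backreferences in a class
```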

+ + +

AtomEscape

+

With parameter _direction_.

+

The production AtomEscape :: DecimalEscape evaluates as follows:

+ + 1. Evaluate |DecimalEscape| to obtain an integer _n_. + 1. Assert: _n_ ≤ _NcapturingParens_. + 1. Return ! BackreferenceMatcher(_n_, _direction_). + +

The production AtomEscape :: CharacterClassEscape evaluates as follows:

+ + 1. Evaluate |CharacterClassEscape| to obtain a CharSet _A_. + 1. Return ! CharacterSetMatcher(_A_, *false*, _direction_). + + +

An escape sequence of the form `\\` followed by a non-zero decimal number _n_ matches the result of the _n_th set of capturing parentheses (). It is an error if the regular expression has fewer than _n_ capturing parentheses. If the regular expression has _n_ or more capturing parentheses but the _n_th one is *undefined* because it has not captured anything, then the backreference always succeeds.
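The last sentence of this note can be observed directly. In the sketch below, group 1 does not participate when the second alternative is taken, so `\1` refers to an undefined capture and succeeds without consuming input.

```javascript
const re = /^(?:(a)|b)\1$/;

const both = re.exec("aa");
console.log(both[0], both[1]); // aa a

const unset = re.exec("b");
console.log(unset[0], unset[1]); // b undefined

console.log(re.test("ba")); // false: the empty backreference cannot absorb the "a"
```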

+
+

The production AtomEscape :: CharacterEscape evaluates as follows:

+ + 1. Evaluate |CharacterEscape| to obtain a character _ch_. + 1. Let _A_ be a one-element CharSet containing the character _ch_. + 1. Return ! CharacterSetMatcher(_A_, *false*, _direction_). + +

The production AtomEscape :: `k` GroupName evaluates as follows:

+
+  1. Search the enclosing |Pattern| for an instance of a |GroupSpecifier| containing a |RegExpIdentifierName| which has a CapturingGroupName equal to the CapturingGroupName of the |RegExpIdentifierName| contained in |GroupName|.
+  1. Assert: A unique such |GroupSpecifier| is found.
+  1. Let _parenIndex_ be the number of left-capturing parentheses in the entire regular expression that occur to the left of the located |GroupSpecifier|. This is the total number of Atom :: `(` GroupSpecifier Disjunction `)` Parse Nodes prior to or enclosing the located |GroupSpecifier|, including its immediately enclosing |Atom|.
+  1. Return ! BackreferenceMatcher(_parenIndex_, _direction_).
+
+
+
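To make the _parenIndex_ computation concrete, here is an illustrative pattern (invented for this note) in which the named group is the second left-capturing parenthesis. The named backreference therefore behaves like a reference to capture 2.

```javascript
// \k<quote> resolves to the group named "quote", whose parenIndex is 2.
const re = /(\w+)=(?<quote>["'])(.*?)\k<quote>/;

const m = re.exec(`name="O'Brien"`);
console.log(m[2] === m.groups.quote); // true: same capture, by index and by name
console.log(m[3]);                    // O'Brien
```

The apostrophe inside the value does not end the match, because the backreference only matches what group 2 actually captured (a double quote here).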

+ BackreferenceMatcher ( + _n_: a positive integer, + _direction_: 1 or -1, + ) +

+
+
+
+  1. Assert: _n_ ≥ 1.
+  1. Return a new Matcher with parameters (_x_, _c_) that captures _n_ and _direction_ and performs the following steps when called:
+    1. Assert: _x_ is a State.
+    1. Assert: _c_ is a Continuation.
+    1. Let _cap_ be _x_'s _captures_ List.
+    1. Let _s_ be _cap_[_n_].
+    1. If _s_ is *undefined*, return _c_(_x_).
+    1. Let _e_ be _x_'s _endIndex_.
+    1. Let _len_ be the number of elements in _s_.
+    1. Let _f_ be _e_ + _direction_ × _len_.
+    1. If _f_ < 0 or _f_ > _InputLength_, return ~failure~.
+    1. Let _g_ be min(_e_, _f_).
+    1. If there exists an integer _i_ between 0 (inclusive) and _len_ (exclusive) such that Canonicalize(_s_[_i_]) is not the same character value as Canonicalize(_Input_[_g_ + _i_]), return ~failure~.
+    1. Let _y_ be the State (_f_, _cap_).
+    1. Return _c_(_y_).
+
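The steps above can be sketched in plain JavaScript. This is only a reading aid under stated assumptions: the State is modelled as an object `{ endIndex, captures }`, the input is passed explicitly as a string, `canonicalize` defaults to the identity function, and `FAILURE` is a made-up sentinel. None of these names come from the specification.

```javascript
const FAILURE = Symbol("failure"); // stand-in for the spec's ~failure~

// Assumed shapes: x = { endIndex: number, captures: Array<string | undefined> }.
function backreferenceMatcher(n, direction, input, canonicalize = (ch) => ch) {
  if (!Number.isInteger(n) || n < 1) throw new RangeError("n must be a positive integer");
  return function matcher(x, c) {
    const cap = x.captures;
    const s = cap[n];
    if (s === undefined) return c(x); // unset capture: succeed without moving
    const e = x.endIndex;
    const len = s.length;
    const f = e + direction * len;    // forward (1) or backward (-1)
    if (f < 0 || f > input.length) return FAILURE;
    const g = Math.min(e, f);         // leftmost index of the compared span
    for (let i = 0; i < len; i++) {
      if (canonicalize(s[i]) !== canonicalize(input[g + i])) return FAILURE;
    }
    return c({ endIndex: f, captures: cap });
  };
}

// Usage: capture 1 holds "ab"; try to match it again starting at index 2.
const identity = (state) => state;
const forward = backreferenceMatcher(1, 1, "abab");
console.log(forward({ endIndex: 2, captures: [undefined, "ab"] }, identity).endIndex); // 4
```

Taking the minimum of _e_ and _f_ lets one comparison loop serve both directions, which is how the same matcher can be used inside lookbehind.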
+
+ + +

DecimalEscape

+

The |DecimalEscape| productions evaluate as follows:

+ DecimalEscape :: NonZeroDigit DecimalDigits? + + 1. Return the CapturingGroupNumber of this |DecimalEscape|. + + +

If `\\` is followed by a decimal number _n_ whose first digit is not `0`, then the escape sequence is considered to be a backreference. It is an error if _n_ is greater than the total number of left-capturing parentheses in the entire regular expression.
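A minimal check of the error condition in this note. The `u` flag is used so that the grammar in this section applies; engines that implement the Annex B legacy grammar may treat an out-of-range `\2` differently without it. `compileError` is a hypothetical helper.

```javascript
// Hypothetical helper: returns the name of the error thrown while compiling, or null.
function compileError(source, flags) {
  try {
    new RegExp(source, flags);
    return null;
  } catch (err) {
    return err.name;
  }
}

console.log(compileError("(a)\\1", "u")); // null: one group, \1 is in range
console.log(compileError("(a)\\2", "u")); // SyntaxError: only one group exists
console.log(compileError("\\1(a)", "u")); // null: the count covers the entire pattern
```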

+
+
+ + +

CharacterClassEscape

+

The production CharacterClassEscape :: `d` evaluates as follows:

+ + 1. Return the ten-element CharSet containing the characters `0` through `9` inclusive. + +

The production CharacterClassEscape :: `D` evaluates as follows:

+ + 1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: `d` . + +

The production CharacterClassEscape :: `s` evaluates as follows:

+ + 1. Return the CharSet containing all characters corresponding to a code point on the right-hand side of the |WhiteSpace| or |LineTerminator| productions. + +

The production CharacterClassEscape :: `S` evaluates as follows:

+ + 1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: `s` . + +

The production CharacterClassEscape :: `w` evaluates as follows:

+ + 1. Return _WordCharacters_. + +

The production CharacterClassEscape :: `W` evaluates as follows:

+ + 1. Return the CharSet containing all characters not in the CharSet returned by CharacterClassEscape :: `w` . + +
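A few spot checks of the sets described above, written without the `u` or `i` flags. Treating _WordCharacters_ as the 63 characters `a`-`z`, `A`-`Z`, `0`-`9` and `_` is an assumption taken from the rest of the specification, not from this hunk.

```javascript
const checks = {
  asciiDigit: /^\d$/.test("7"),           // true
  arabicDigit: /^\d$/.test("\u0663"),     // false: \d is the ten-element set only
  noBreakSpace: /^\s$/.test("\u00A0"),    // true: a WhiteSpace code point
  lineSeparator: /^\s$/.test("\u2028"),   // true: a LineTerminator code point
  underscore: /^\w$/.test("_"),           // true
  accentedLetter: /^\W$/.test("\u00E9"),  // true: not in WordCharacters
};
console.log(checks);
```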

The production CharacterClassEscape :: `p{` UnicodePropertyValueExpression `}` evaluates as follows:

+ + 1. Return the CharSet containing all Unicode code points included in the CharSet returned by |UnicodePropertyValueExpression|. + +

The production CharacterClassEscape :: `P{` UnicodePropertyValueExpression `}` evaluates as follows:

+ + 1. Return the CharSet containing all Unicode code points not included in the CharSet returned by |UnicodePropertyValueExpression|. + +

The production UnicodePropertyValueExpression :: UnicodePropertyName `=` UnicodePropertyValue evaluates as follows:

+ + 1. Let _ps_ be SourceText of |UnicodePropertyName|. + 1. Let _p_ be ! UnicodeMatchProperty(_ps_). + 1. Assert: _p_ is a Unicode property name or property alias listed in the “Property name and aliases” column of . + 1. Let _vs_ be SourceText of |UnicodePropertyValue|. + 1. Let _v_ be ! UnicodeMatchPropertyValue(_p_, _vs_). + 1. Return the CharSet containing all Unicode code points whose character database definition includes the property _p_ with value _v_. + +

The production UnicodePropertyValueExpression :: LoneUnicodePropertyNameOrValue evaluates as follows:

+ + 1. Let _s_ be SourceText of |LoneUnicodePropertyNameOrValue|. + 1. If ! UnicodeMatchPropertyValue(`General_Category`, _s_) is identical to a List of Unicode code points that is the name of a Unicode general category or general category alias listed in the “Property value and aliases” column of , then + 1. Return the CharSet containing all Unicode code points whose character database definition includes the property “General_Category” with value _s_. + 1. Let _p_ be ! UnicodeMatchProperty(_s_). + 1. Assert: _p_ is a binary Unicode property or binary property alias listed in the “Property name and aliases” column of . + 1. Return the CharSet containing all Unicode code points whose character database definition includes the property _p_ with value “True”. + +
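The two |UnicodePropertyValueExpression| forms can be tried out as follows. This is a sketch that assumes an engine with Unicode property escape support; the property names used are standard Unicode ones chosen for illustration.

```javascript
const results = {
  // UnicodePropertyName `=` UnicodePropertyValue
  greekScript: /^\p{Script=Greek}$/u.test("\u03B1"),    // true: GREEK SMALL LETTER ALPHA
  explicitCategory: /^\p{General_Category=Lu}$/u.test("A"), // true
  // LoneUnicodePropertyNameOrValue, resolved as a General_Category value
  loneCategory: /^\p{Lu}$/u.test("A"),                  // true
  loneCategoryMiss: /^\p{Uppercase_Letter}$/u.test("a"), // false
  // LoneUnicodePropertyNameOrValue, resolved as a binary property
  binaryProperty: /^\p{ASCII}$/u.test("A"),             // true
  // \P{...} is the complement
  complement: /^\P{ASCII}$/u.test("\u03B1"),            // true
};
console.log(results);
```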
+ + +

CharacterEscape

+

The |CharacterEscape| productions evaluate as follows:

+ + CharacterEscape :: + ControlEscape + `c` ControlLetter + `0` [lookahead ∉ DecimalDigit] + HexEscapeSequence + RegExpUnicodeEscapeSequence + IdentityEscape + + + 1. Let _cv_ be the CharacterValue of this |CharacterEscape|. + 1. Return the character whose character value is _cv_. + +
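One sample per |CharacterEscape| alternative, in the order the production lists them. The CharacterValue rules themselves live in a different hunk, so the comment on `\cJ` (code point value modulo 32) is stated from the wider specification rather than from the lines shown here.

```javascript
const samples = [
  /^\t$/.test("\u0009"),                // ControlEscape
  /^\cJ$/.test("\u000A"),               // `c` ControlLetter: J is 0x4A, and 0x4A % 32 is 10
  /^\0$/.test("\u0000"),                // `0` not followed by a decimal digit
  /^\x41$/.test("A"),                   // HexEscapeSequence
  /^\u{1F600}$/u.test("\u{1F600}"),     // RegExpUnicodeEscapeSequence (this form needs `u`)
  /^\$$/.test("$"),                     // IdentityEscape of a SyntaxCharacter
];
console.log(samples.every(Boolean)); // true
```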
@@ -46127,26 +46136,23 @@

Regular Expressions

- - - - - + - -

Each `\\u` |HexTrailSurrogate| for which the choice of associated `u` |HexLeadSurrogate| is ambiguous shall be associated with the nearest possible `u` |HexLeadSurrogate| that would otherwise have no corresponding `\\u` |HexTrailSurrogate|.

-

 

- - - - + + + + + + + + @@ -46157,13 +46163,14 @@

Regular Expressions

- - - - - - - + + + + + + + +