From 0d3d71b12ebb27f7c16a2d409da960890d24d2a2 Mon Sep 17 00:00:00 2001
From: Michael Saboff
Date: Wed, 6 Apr 2016 15:08:26 -0700
Subject: [PATCH 1/2] Proposed RegExp CharacterClassEscape changes for \w

---
 spec.html | 476 +++++++++++++++++++++++++++--------------------------
 1 file changed, 242 insertions(+), 234 deletions(-)

diff --git a/spec.html b/spec.html
index 394e3b0c10..6e62089e0f 100644
--- a/spec.html
+++ b/spec.html
@@ -28390,240 +28390,248 @@

CharacterClassEscape

The production CharacterClassEscape :: `d` evaluates by returning the ten-element set of characters containing the characters `0` through `9` inclusive.

The production CharacterClassEscape :: `D` evaluates by returning the set of all characters not included in the set returned by CharacterClassEscape :: `d`.

The production CharacterClassEscape :: `s` evaluates by returning the set of characters containing the characters that are on the right-hand side of the |WhiteSpace| or |LineTerminator| productions.

-

The production CharacterClassEscape :: `S` evaluates by returning the set of all characters not included in the set returned by CharacterClassEscape :: `s`.

-

The production CharacterClassEscape :: `w` evaluates by returning the set of characters containing the sixty-three characters:

- `a` `b` `c` `d` `e` `f` `g` `h` `i` `j` `k` `l` `m` `n` `o` `p` `q` `r` `s` `t` `u` `v` `w` `x` `y` `z`
- `A` `B` `C` `D` `E` `F` `G` `H` `I` `J` `K` `L` `M` `N` `O` `P` `Q` `R` `S` `T` `U` `V` `W` `X` `Y` `Z`
- `0` `1` `2` `3` `4` `5` `6` `7` `8` `9` `_`

The production CharacterClassEscape :: `W` evaluates by returning the set of all characters not included in the set returned by CharacterClassEscape :: `w`.

+

The production CharacterClassEscape :: `S` evaluates by returning the set of all characters not included in the set returned by CharacterClassEscape :: `s` .

+

The production CharacterClassEscape :: `w` evaluates as follows:

+ 1. Create a set _A_ of characters containing the sixty-three characters:
+   `a` `b` `c` `d` `e` `f` `g` `h` `i` `j` `k` `l` `m` `n` `o` `p` `q` `r` `s` `t` `u` `v` `w` `x` `y` `z`
+   `A` `B` `C` `D` `E` `F` `G` `H` `I` `J` `K` `L` `M` `N` `O` `P` `Q` `R` `S` `T` `U` `V` `W` `X` `Y` `Z`
+   `0` `1` `2` `3` `4` `5` `6` `7` `8` `9` `_`
+ 1. If _Unicode_ is *true*, then
+   1. Create an empty set _U_.
+   1. For every character _c_ not in set _A_ where Canonicalize(_c_) is in _A_, add _c_ to _U_.
+   1. Add the characters in set _U_ to set _A_.
+ 1. Return _A_.

The production CharacterClassEscape :: `W` evaluates by returning the set of all characters not included in the set returned by CharacterClassEscape :: `w` .

From ab3380df9d7171ff6b29a0513c4e3924db0195bf Mon Sep 17 00:00:00 2001
From: Michael Saboff
Date: Tue, 7 Jun 2016 17:16:19 -0700
Subject: [PATCH 2/2] Updated per what was agreed at the May 2016 meeting.

Created a new abstract operation "WordCharacters()" that is used by both
IsWordChar() for word assertions and \w/\W CharacterClassEscapes.

---
 spec.html | 283 ++++++------------------------------------------------
 1 file changed, 29 insertions(+), 254 deletions(-)

diff --git a/spec.html b/spec.html
index 6e62089e0f..d60d0f2b7f 100644
--- a/spec.html
+++ b/spec.html
@@ -27756,16 +27756,14 @@

Assertion

- -

Runtime Semantics: IsWordChar Abstract Operation

-

The abstract operation IsWordChar takes an integer parameter _e_ and performs the following steps:

+ +

Runtime Semantics: WordCharacters Abstract Operation

+

The abstract operation WordCharacters performs the following steps:

- 1. If _e_ is -1 or _e_ is _InputLength_, return *false*.
- 1. Let _c_ be the character _Input_[_e_].
- 1. If _c_ is one of the sixty-three characters below, return *true*.
+ 1. Create a set _A_ of characters containing the sixty-three characters:
  `a`
@@ -27991,14 +27989,30 @@

Runtime Semantics: IsWordChar Abstract Operation

-
- 1. Return *false*.
+ 1. Create an empty set _U_.
+ 1. For every character _c_ not in set _A_ where Canonicalize(_c_) is in _A_, add _c_ to _U_.
+ 1. Assert: Unless _Unicode_ and _IgnoreCase_ are both true, _U_ is empty.
+ 1. Add the characters in set _U_ to set _A_.
+ 1. Return _A_.
+ + +

Runtime Semantics: IsWordChar Abstract Operation

+

The abstract operation IsWordChar takes an integer parameter _e_ and performs the following steps:

+ 1. If _e_ is -1 or _e_ is _InputLength_, return *false*.
+ 1. Let _c_ be the character _Input_[_e_].
+ 1. Let _WordChars_ be _WordCharacters_().
+ 1. If _c_ is in _WordChars_, return *true*.
+ 1. Return *false*.
+
-

Quantifier

@@ -28391,246 +28405,7 @@

CharacterClassEscape

The production CharacterClassEscape :: `D` evaluates by returning the set of all characters not included in the set returned by CharacterClassEscape :: `d`.

The production CharacterClassEscape :: `s` evaluates by returning the set of characters containing the characters that are on the right-hand side of the |WhiteSpace| or |LineTerminator| productions.

The production CharacterClassEscape :: `S` evaluates by returning the set of all characters not included in the set returned by CharacterClassEscape :: `s` .

-

The production CharacterClassEscape :: `w` evaluates as follows:

- 1. Create a set _A_ of characters containing the sixty-three characters:
-   `a` `b` `c` `d` `e` `f` `g` `h` `i` `j` `k` `l` `m` `n` `o` `p` `q` `r` `s` `t` `u` `v` `w` `x` `y` `z`
-   `A` `B` `C` `D` `E` `F` `G` `H` `I` `J` `K` `L` `M` `N` `O` `P` `Q` `R` `S` `T` `U` `V` `W` `X` `Y` `Z`
-   `0` `1` `2` `3` `4` `5` `6` `7` `8` `9` `_`
- 1. If _Unicode_ is *true*, then
-   1. Create an empty set _U_.
-   1. For every character _c_ not in set _A_ where Canonicalize(_c_) is in _A_, add _c_ to _U_.
-   1. Add the characters in set _U_ to set _A_.
- 1. Return _A_.

The production CharacterClassEscape :: `w` evaluates by returning the set of all characters returned by _WordCharacters_().

The production CharacterClassEscape :: `W` evaluates by returning the set of all characters not included in the set returned by CharacterClassEscape :: `w` .
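
For review purposes, the WordCharacters and IsWordChar steps from PATCH 2/2 can be sketched outside the spec as follows. This is an illustrative approximation, not spec text: the names `canonicalize`, `word_characters`, `is_word_char`, and `BASIC_WORD_CHARS` are hypothetical, the _Unicode_ and _IgnoreCase_ flags are passed as explicit parameters rather than read from pattern state, and Canonicalize is only approximated (Python's `str.casefold()` performs full case folding, so the sketch keeps single-character results as a stand-in for simple case folding).

```python
import sys

# The sixty-three basic word characters: a-z, A-Z, 0-9 and underscore.
BASIC_WORD_CHARS = frozenset(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789_"
)


def canonicalize(ch: str, unicode: bool, ignore_case: bool) -> str:
    """Approximation of the spec's Canonicalize (hypothetical helper)."""
    if not ignore_case:
        return ch
    if unicode:
        # Assumption: single-character results of full case folding are
        # used here in place of simple case folding.
        folded = ch.casefold()
        return folded if len(folded) == 1 else ch
    upper = ch.upper()
    if len(upper) != 1:
        return ch
    # Without the Unicode flag, a non-ASCII character must not map to ASCII.
    if ord(ch) >= 128 and ord(upper) < 128:
        return ch
    return upper


def word_characters(unicode: bool, ignore_case: bool) -> frozenset:
    """Sketch of WordCharacters: set A plus every c whose canonical form is in A."""
    a = set(BASIC_WORD_CHARS)
    u = set()
    for code_point in range(sys.maxunicode + 1):
        c = chr(code_point)
        if c not in a and canonicalize(c, unicode, ignore_case) in a:
            u.add(c)
    # Mirrors the Assert step: U is empty unless both flags are true.
    assert (unicode and ignore_case) or not u
    return frozenset(a | u)


def is_word_char(text: str, e: int, word_chars: frozenset) -> bool:
    """Sketch of IsWordChar for index e into text (text stands in for Input)."""
    if e == -1 or e == len(text):
        return False
    return text[e] in word_chars
```

Under these assumptions, `word_characters(True, True)` adds U+017F (LATIN SMALL LETTER LONG S) and U+212A (KELVIN SIGN) to the sixty-three basic characters, while any other flag combination returns the basic set unchanged. The `\W` set is then simply the complement of whichever set `\w` yields.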