Inconsistencies to RegExp.prototype[@@match] #28

schuay · 2017-10-11T09:32:01Z

I'm a bit unclear on how closely matchAll should match the behavior of match. There's two (somewhat related) inconsistencies I noticed:

Non-global regexp.

Given a non-global regexp, @@match will simply return Return ? RegExpExec(rx, S)., while matchAll returns a single-element iterator.

Termination / zero-width matches.

matchAll handles termination through lastIndex comparisons and thus terminates on zero-width matches. @@match terminates on a failed match and advances lastIndex through AdvanceStringIndex on zero-width matches.

I suppose the termination method for matchAll was chosen to terminate correctly when passed a non-global regexp(?)

It feels strange to have these inconsistencies between such closely-related methods.

The text was updated successfully, but these errors were encountered:

ljharb · 2017-10-11T19:53:44Z

Symbol.match is special by necessity, since it's the sole symbol that makes IsRegExp pass. It wouldn't make sense to follow every spec step in it identically.

I'm not sure I understand your second point; can you provide me with a more concrete example? Indeed, the goal when passed a non-global regex was to provide an iterator for a single match, and RegExpExec used naively would provide (and did, in my initial polyfill implementation, oops) an infinite iterator for the same match object.

schuay · 2017-10-12T07:03:36Z

Symbol.match is special by necessity, since it's the sole symbol that makes IsRegExp pass. It wouldn't make sense to follow every spec step in it identically.

We may be talking past each other, I don't understand the connection to my point 1. AFAIK IsRegExp only needs @@match to be a truish value (and not even that for builtin RegExp objects, which have the [[RegExpMatcher]] internal slot). The semantics of @@match itself don't seem to have any relation to IsRegExp. Did I miss something?

I'm not sure I understand your second point; can you provide me with a more concrete example?

Sure. "abcde".match(/\B/g) ~~> ["", "", "", ""].

Now consider "abcde".matchAll(/\B/g). Each matched string has a length of 0, which means lastIndex remains unmodified after the first successful match. That would trigger termination (i.e. creating an IterResultObject with done = true) in next.

So, matchAll would return a single-element iterator while @@match would return a 4-element array.

ljharb · 2017-10-12T07:15:43Z

You're totally right that IsRegExp only needs a non-truthy value; the other part tho is that all the built-in regex symbols are codifying legacy behavior - matchAll isn't necessarily constrained by those limitations.

With your specific example, I think you've definitely located an interesting case. An exec loop with that regex provides only a single match; and matchAll as currently specced actually uses exec, not match, under the hood - so the current algorithm intentionally provides an iterator for a single match object. (That said, .replace does invoke the callback function 4 times)

It's important here that the iterator is for match objects - arrays with an input and lastIndex property - so I'm not sure how to proceed here.

Unrelated to the spec steps, and generally speaking, would it match your intuition if [...s.matchAll(r)].length === s.match(r).length for global regexes? (so that your example would be an iterator for 4 match objects)

schuay · 2017-10-12T07:33:51Z

Unrelated to the spec steps, and generally speaking, would it match your intuition if [...s.matchAll(r)].length === s.match(r).length for global regexes?

Yeah, personally I think staying consistent with @@match and @@replace behavior, both of which use AdvanceStringIndex, makes sense.

ljharb · 2017-10-12T07:47:21Z

I think that is both an improvement (smoothing over edge cases I hadn't previously thought of) and also more closely matches the goals of the proposal - and I think it can be achieved with a different spec algorithm.

Thanks for the help; I'll keep this open until I've finished that change.

Fixes #28.

ljharb added a commit that referenced this issue Nov 21, 2017

Add special handling for “global” regexes, just like @@match has.

fdca2a2

Fixes #28.

ljharb mentioned this issue Nov 21, 2017

Add special handling for “global” regexes, just like @@match has. #29

Merged

ljharb added the committee feedback label Nov 22, 2017

ljharb added a commit that referenced this issue Nov 22, 2017

Add special handling for “global” regexes, just like @@match has.

d5d14ca

Fixes #28.

ljharb added a commit that referenced this issue Dec 14, 2017

Add special handling for “global” regexes, just like @@match has.

f46da33

Fixes #28.

ljharb added a commit that referenced this issue Dec 14, 2017

Add special handling for “global” regexes, just like @@match has.

08f6369

Fixes #28.

ljharb closed this as completed in #29 Jan 8, 2018

ljharb mentioned this issue Apr 16, 2018

Re-evaluate IsRegExp call in MatchAllIterator? #34

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Inconsistencies to RegExp.prototype[@@match] #28

Inconsistencies to RegExp.prototype[@@match] #28

schuay commented Oct 11, 2017

ljharb commented Oct 11, 2017

schuay commented Oct 12, 2017

ljharb commented Oct 12, 2017

schuay commented Oct 12, 2017

ljharb commented Oct 12, 2017

Inconsistencies to RegExp.prototype[@@match] #28

Inconsistencies to RegExp.prototype[@@match] #28

Comments

schuay commented Oct 11, 2017

ljharb commented Oct 11, 2017

schuay commented Oct 12, 2017

ljharb commented Oct 12, 2017

schuay commented Oct 12, 2017

ljharb commented Oct 12, 2017