Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Editorial: Fix wording bug by tweaking CapturingGroupName #2722

Merged
merged 3 commits into from
Apr 28, 2022

Conversation

jmdyck
Copy link
Collaborator

@jmdyck jmdyck commented Apr 4, 2022

While reviewing #2721, I noticed a wording bug in the current spec:

Because the nonterminal RegExpIdentifierName is defined recursively, a GroupSpecifier usually contains/encloses multiple RegExpIdentifierNames.

So in a phrasing such as:
a |GroupSpecifier| containing a |RegExpIdentifierName| which has ...
the spec only intends the 'topmost' RegExpIdentifierName, but any of the 'deeper' RegExpIdentifierNames could satisfy the phrase.

E.g., consider /(?<NotTheOne>foo)\k<TheOne>/. This should be a syntax error, because the AtomEscape \k<TheOne> doesn't have a matching GroupSpecifier. But the GroupSpecifier ?<NotTheOne> does contain/enclose a RegExpIdentifierName TheOne that matches the GroupName of the AtomEscape, and so there isn't a syntax error.

There are 3 spots in the spec that are subject to this bug. This PR fixes them.


It's possible to select the 'topmost' RegExpIdentifierName just by tweaking the wording at the problem spots. However, this would lengthen some already-lengthy sentences (because you have to involve the GroupName parent of the RegExpIdentifierName).

But I noticed that the only reason we're accessing the RegExpIdentifierName is to invoke CapturingGroupName on it. So if we change CapturingGroupName to instead operate on the GroupName, then accessing its RegExpIdentifierName child (and not any deeper RegExpIdentifierName) becomes the easiest thing to do, and everything (including the definition of CapturingGroupName itself) gets slightly simpler.

That's the first commit.

(There are no references to CapturingGroupName in downstream specs.)


The second commit is a bonus. It factors out the idea of finding the GroupSpecifier(s) that match a given GroupName, and presents it as the abstract operation GroupSpecifiersThatMatch. (This should make the changes of #2721 a bit easier too.)

@bakkot
Copy link
Contributor

bakkot commented Apr 10, 2022

First commit LGTM. Second doesn't really seem like an improvement to me, but we can see what @michaelficarra thinks.

Copy link
Contributor

@gibson042 gibson042 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like both commits.

@bakkot
Copy link
Contributor

bakkot commented Apr 13, 2022

Other editors were in favor of the second commit, so let's keep it.

spec.html Outdated Show resolved Hide resolved
spec.html Outdated Show resolved Hide resolved
@jmdyck jmdyck force-pushed the CapturingGroupName branch 2 times, most recently from b6ac2bd to e50cac4 Compare April 21, 2022 03:05
@jmdyck jmdyck force-pushed the CapturingGroupName branch from e50cac4 to dbb1b41 Compare April 27, 2022 19:19
Copy link
Member

@michaelficarra michaelficarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! LGTM.

@jmdyck jmdyck force-pushed the CapturingGroupName branch from dbb1b41 to 46a06cf Compare April 28, 2022 01:10
@jmdyck
Copy link
Collaborator Author

jmdyck commented Apr 28, 2022

(force-pushed to squash today's 3 commits into one)

@bakkot bakkot added the ready to merge Editors believe this PR needs no further reviews, and is ready to land. label Apr 28, 2022
jmdyck added 3 commits April 27, 2022 21:42
A GroupSpecifier can enclose/contain *many* RegExpIdentifierNames,
which causes some wording to have an unintended meaning.

Fix this by changing the CapturingGroupName SDO
to operate on the GroupName rather than the RegExpIdentifierName.

(See PR for a more detailed explanation.)
That is, change:
    Atom :: `(` GroupSpecifier Disjunction `)`
    GroupSpecifier[UnicodeMode] :: [empty] | `?` GroupName
to:
    Atom :: `(` GroupSpecifier? Disjunction `)`
    GroupSpecifier[UnicodeMode] :: `?` GroupName
(and similarly for ExtendedAtom).

This doesn't change the language, it just means that
when we have a GroupSpecifier "in hand", we know that it has a GroupName child.

(As for the case where a GroupSpecifier is empty/omitted, the spec
doesn't specifically talk about that, so no changes are necessary.)

---

Extend CapturingGroupName to GroupSpecifier

Now that GroupSpecifier is defined by a chain production,
we can apply CapturingGroupName to a GroupSpecifier,
and let the chain rule handle it.

---

Reword the early error rule involving CapturingGroupName

Currently, the spec talks about
    "multiple [things that] have the same CapturingGroupName"
which is unusual wording by which to invoke an SDO.

Change it to:
    "two or more [things] for which CapturingGroupName of [thing] is the same"
which is still unusual, but maybe less so.
@ljharb ljharb force-pushed the CapturingGroupName branch from 46a06cf to 4052033 Compare April 28, 2022 04:42
@ljharb ljharb merged commit 4052033 into tc39:main Apr 28, 2022
@jmdyck jmdyck deleted the CapturingGroupName branch April 28, 2022 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
editorial change ready to merge Editors believe this PR needs no further reviews, and is ready to land. spec bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants