Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[spec] Clarify that strings can separate tokens #1490

Closed
wants to merge 1 commit into from
Closed

Conversation

tlively
Copy link
Member

@tlively tlively commented Jun 7, 2022

Since " is not included in idchar, it can mark the end of a token. For
example, 0"foo"$x would parse as 0, "foo", $x.

Since `"` is not included in `idchar`, it can mark the end of a token. For
example, `0"foo"$x` would parse as `0`, `"foo"`, `$x`.
@tlively tlively requested a review from rossberg June 7, 2022 22:20
Copy link
Member

@rossberg rossberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, interesting. That is kind of unintentional and inconsistent (e.g., I remember that we explicitly decided to make 0$a a single token and tweaked the lexical grammar for that purpose at some point in the past).

I wonder if we can afford fixing it? That is, add " to the characters that can occur in reserved? Technically, that would be a breaking change to the text format, but it's probably unlikely to affect real-world code. (Edit: as a quick check I just tried, at least the test suite does not notice the difference.)

@tlively
Copy link
Member Author

tlively commented Jun 8, 2022

Yeah, I can see that it might be surprising that " starts a new token but $ does not. I wonder how many parsers in the wild actually implement this lexing correctly anyway. It's probably worth asking more broadly whether anyone cares before making a breaking change, but I also suspect this wouldn't matter in practice.

@rossberg
Copy link
Member

rossberg commented Jun 8, 2022

Yes, absolutely. Shall we put it on the next meeting's agenda?

How did you run into this in the first place?

@tlively
Copy link
Member Author

tlively commented Jun 8, 2022

Sure, I’ll make an agenda PR. I found this because I’m working on a new spec-compliant text parser for Binaryen, so I’m taking a closer look at the text spec for the first time.

@tlively
Copy link
Member Author

tlively commented Jun 21, 2022

We had unanimous consent in the CG meeting this morning to make a change to the grammar to disallow strings from separating tokens, although we didn't get into the details of how such a change might be accomplished.

@rossberg
Copy link
Member

See #1499.

@rossberg
Copy link
Member

rossberg commented Aug 4, 2022

Closing this, since it has been superseded by a spec change.

@rossberg rossberg closed this Aug 4, 2022
@rossberg rossberg deleted the tlively-patch-2 branch February 28, 2024 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants