Parser re-escapes `\` in text for raw strings. #41

dmfxyz · 2022-05-22T16:28:22Z

Take this simple grammar:

const grammar = `
str          ::= '"' (unsafe | SAFE)* '"'
SAFE         ::= #x21 | [#x24-#x5A] | [#x5E-#x7A] | #x7C | #x7E
unsafe       ::= ESCAPE #x22
ESCAPE       ::= #x5C
`

If we define a raw string as follows:

const str = String.raw`"stringwith\"escapes"`
console.log(str)

We get the representation:

"stringwith\"escapes"

Now if we define rules and a parser for this grammar and run it on that raw string:

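const ebnf = require('ebnf')
// Build the rules from the W3C-style grammar above, then parse the raw string
const rules = ebnf.Grammars.W3C.getRules(grammar)
const parser = new ebnf.Parser(rules)
const ast = parser.getAST(str)
console.log(ast)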

We see the AST:

<ref *1> {
  type: 'str',
  text: '"stringwith\\"escapes"',
  children: [
    {
      type: 'unsafe',
      text: '\\"',
      children: [],
      end: 13,
      errors: [],
      fullText: '',
      parent: [Circular *1],
      start: 11,
      rest: ''
    }
  ],
  end: 21,
  errors: [],
  fullText: '',
  parent: null,
  start: 0,
  rest: ''
}

The `text` property of both the parent `str` node and the child `unsafe` node shows the `\` re-escaped as `\\`. I don't think this re-escaping should happen for raw strings.
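
One thing that may be worth ruling out is how Node prints strings nested inside objects. `console.log` formats objects with `util.inspect`, which displays a backslash inside a quoted string as `\\`, while a bare string is printed as-is. The offsets in the output above (`start: 11` and `end: 13` for `unsafe`, `end: 21` for `str`) would be consistent with a single backslash in the underlying text. The sketch below does not call ebnf at all; `node` is a hypothetical stand-in for an AST node, used only to compare what is printed with what the string actually contains.

```javascript
const util = require('node:util');

// Same raw string as in the report: the backslash is one character.
const str = String.raw`"stringwith\"escapes"`;

// Hypothetical stand-in for an AST node (not produced by ebnf).
const node = { text: str };

// A bare string is printed as-is by console.log.
const printedDirectly = str;
// An object is printed through util.inspect, which escapes backslashes
// inside quoted strings for display only.
const printedInObject = util.inspect(node);

console.log(printedDirectly);
console.log(printedInObject);

// Count the backslashes that are really in the string.
const backslashCount = [...str].filter((ch) => ch === '\\').length;
console.log(str.length, backslashCount); // 21 1
```

If the same check on `ast.text` (its `length`, or `console.log(ast.text)` on its own) gives 21 characters and a single backslash, the doubling is only in the printed form; if it gives 22, the parser really is changing the text.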

You can see a basic example here: https://github.com/dmfxyz/node-ebnf-issue-example
And the repo in which we originally discovered this behavior: nmushegian/jams#23


dmfxyz mentioned this issue May 22, 2022

Support for escaped quotes nmushegian/jams#23

Closed
