Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Error handling #116

Merged
merged 10 commits into from
May 17, 2021
Merged

Update Error handling #116

merged 10 commits into from
May 17, 2021

Conversation

hildjj
Copy link
Contributor

@hildjj hildjj commented Apr 27, 2021

Fixes #102

I'd like a couple of people to review this please. I'll put some notes inline.

Copy link
Contributor Author

@hildjj hildjj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may break a few people who are relying on error formatting. I want to make sure we've discussed that adequately. Do we need to take a major version number bump for this?

" Error.captureStackTrace(this, peg$SyntaxError);",
" }",
" const self = Error.call(this, message);",
" Object.setPrototypeOf(self, peg$SyntaxError.prototype);",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://webreflection.co.uk/blog/2016/08/30/js-super-problem
What I want is for peg$syntaxError to be a real subclass of Error, with the constructor having been called correctly. Below this you see a call to peg$subclass which only does part of that.

@@ -39,7 +39,7 @@ function reportInfiniteRecursion(ast) {
throw new GrammarError(
"Possible infinite loop when parsing (left recursion: "
+ visitedRules.join(" -> ")
+ ").",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also changed the error texts to be consistent in punctuation. I flipped a coin and got "no period at the end"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. That points looks alien

@@ -1,15 +1,11 @@
"use strict";

// Thrown when the grammar contains an error.
class GrammarError {
class GrammarError extends Error {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modern version of the change to peg$syntaxError

Comment on lines -39 to -41
"Label \"" + node.label + "\" is already defined "
+ "at line " + env[label].start.line + ", "
+ "column " + env[label].start.column + ".",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a useful information that should be kept. Some other messages just put there their own location, that is useless.

Ideally, this information also should be presented in the error object as a separate field, but right now I can't imagine which format would be better

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this should be reintroduced. However, I also prefer the basic sentiment that the underlying message should be short.

I propose:

  1. A "shortMessage" field be introduced
  2. The full name be constructed by a function from the various fields
  3. The following arguments be added to the constructor, then promoted as fields in the downstream product:
    1. start (object)
      1. line (number)
      2. column (number)
      3. offset (number)
    2. end (object)
      1. line (number)
      2. column (number)
      3. offset (number)
    3. length (number)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would imply that the string being passed to GrammarError is actually the shortMessage

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be happy to do the work if we land this without this piece

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a later commit, I added a toString(), the codeLocation bits that @Mingun wanted, and added a referenceLocation field for the errors that say "you have a problem here that references a thing there", e.g. duplicate labels. I'm open to length, but not sure what value it adds. From pegjs-util, I really like found, and wouldn't mind the prolog, token, and epilog fields if they end up being helpful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following arguments be added to the constructor, then promoted as fields in the downstream product:

GrammarError already have a location that have more full information.

From pegjs-util, I really like found, and wouldn't mind the prolog, token, and epilog fields if they end up being helpful.

found is a part of SyntaxError already and the mentioned method processes that kind of errors. I don't think that prolog, token, and epilog fields will be useful, but we can provide method in the SyntaxError that extracts any token from input stream and error position just by executing specified "errorTokenRule" rule:

class SyntaxError extends Error {
  /// @param errorTokenRule?
  ///        Rule to extract logical token with error.
  ///        If not specified, grammar's default will be used
  ///        That default can be specified or in options,
  ///        or via annotations when annotations landed
  errorToken(input, errorTokenRule) {
    // Something like that
    return parser.parse(
      input.substr(this.location.start.offset),
      { startRule: errorTokenRule }
    );
  }
}

Comment on lines -14 to -16
"Rule \"" + node.name + "\" is already defined "
+ "at line " + rules[node.name].start.line + ", "
+ "column " + rules[node.name].start.column + ".",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Comment on lines -21 to -23
"\"@\" cannot be used with an action block "
+ "at line " + node.location.start.line + ", "
+ "column " + node.location.start.column + ".",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should replace self location with action node location, but there is a subtle aspect. action node location points to the expression and associated code block and it's start position visually not related to the code block at all. We have two options here, how to solve that:

  • change location of the action node. I think that is bad, because:
    • that is breaking change
    • that breaks assumption that locations of all children nodes lives in the location of their parent node
  • introduce a new codeLocation property, which should hold the CodeBlock location, i.e. space from opening { till closing }. That properties should be added to all nodes which have a CodeBlock. For me, that is the best choice

After introducing that property just use action.codeLocation.start there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, codeLocation would used for source map support

Copy link
Contributor Author

@hildjj hildjj Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why doesn't the code block have its own AST node?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it have -- that is initializer, action, semantic_and, and semantic_not

Copy link
Contributor Author

@hildjj hildjj Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm suggesting that the ast should be:

{
  type: 'semantic_and',
  expression: {
    type: 'code',
    code: ' return true; ',
    location: ...
  },
  location: ...
}

instead of:

{
  type: 'semantic_and',
  code: ' return true; ',
  location: ...
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, it would be hard to process them if use the visitor pattern as with other kinds of nodes. You always need information from their parent node to process it. So you have to:

  • or give each code node different name -- but then this situation wouldn't differ from present
  • or not use visitor pattern when process such nodes -- then an extra entity will give nothing useful
  • or give a reference to the owner node down to the visitor so code nodes known which data their represents

I think, that all alternatives not gives us so much and just a new property with an extra location will be enough:

{
  type: 'semantic_and',
  code: ' return true; ',
  codeLocation: ..., // for initializer just the same reference, as location
  location: ...
}

Comment on lines 791 to 792
" const self = Error.call(this, message);",
" Object.setPrototypeOf(self, peg$SyntaxError.prototype);",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now generated parser is still a ES5 compatible code. As I remember, const available only from ES6, the same for Object.setPrototypeOf.

So this is a breaking change. I do not mind, because I myself plan to do some PRs that are likely to be breaking.

Just do not forget to update also a list of supported targets in the readme.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const->var is an easy fix that was just carelessness on my part. Object.setPrototypeOf is supported in IE11+, and I think we're still trying to support IE9 for the moment. We can add a polyfill, or just accept slightly-incorrect behavior if Object.setPrototypeOf doesn't exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"slightly-incorrect" here looks like this:

Uncaught [Error [SyntaxError]: ... ]

instead of this:

[peg$SyntaxError: ... ]

which i think i can live with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this polyfill isn't actually possible

I've made React 15 work in IE7

This is a much uglier thing than you might expect

As a point, does it really matter? If they weren't getting a nice class inheritance before, there's no sense establishing a backwards compatibility they never had in the first place

As long as it doesn't catch fire in new and unique ways, "compatability" can in their case be relegated to "eh, good enough" IMO. If it was already broken, we don't have to fix it

@@ -39,7 +39,7 @@ function reportInfiniteRecursion(ast) {
throw new GrammarError(
"Possible infinite loop when parsing (left recursion: "
+ visitedRules.join(" -> ")
+ ").",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. That points looks alien

Copy link
Member

@Mingun Mingun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things can be done better. I've leave inline comments

@hildjj
Copy link
Contributor Author

hildjj commented Apr 27, 2021

@Mingun look again please. If this works for you, I'll rebase into better chunks.

Copy link
Member

@Mingun Mingun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, but you can make it even better. See the inline comments. Not strictly required although

@@ -2,10 +2,29 @@

// Thrown when the grammar contains an error.
class GrammarError extends Error {
constructor(message, location) {
constructor(message, location, referenceLocation) {
Copy link
Member

@Mingun Mingun Apr 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better to pass an array with message and location for additional notes to the main error, like in the Rust Diagnostic API. This opens door for the great errors reports, for example, the report-left-recursion pass would be to point to all locations involved in the left recursion. The error reporting code would look like:

throw new GrammarError(
  "Possible infinite loop when parsing",// message
  error_location,                       // location
  visitedRules.map(rule => {            // notes[]
    return {
      message: `left recursion via rule "${rule.name}"`,
      location: rule.location
    };
  })
);

This error could be rendered, for example, like a Rust error:

error: Possible infinite loop when parsing
 --> grammar.pegjs:2:1
  |
2 | left_recursion = ...;
  | ^^^^^^^^^^^^^^
  |
note: left recursion via rule "one_rule"
 --> grammar.pegjs:4:1
  |
4 | one_rule = ...;
  | ^^^^^^^^
note: left recursion via rule "another_rule"
 --> grammar.pegjs:6:1
  |
6 | another_rule = ...;
  | ^^^^^^^^^^^^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this, and am going to make the change. What do you think of ...locations as the second parameter?

Copy link
Member

@Mingun Mingun Apr 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean

constructor(message: string, locations: LocationRange[], notes: DiagnosticNote[])

type DiagnosticNote{
  message: string;
  location: LocationRange;
}

?
No, I'm thinking about

constructor(message: string, location?: LocationRange, notes?: DiagnosticNote[])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of just constructor(message, ...locations), and locations could have diagnostic notes if you want.

Copy link
Member

@Mingun Mingun Apr 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will not cover all possible variants.

Main message should be always present. location could be missing, for example, check, that allowedStartRules contains only declared rules (pegjs/pegjs#532, I should port that stuff). notes available not for all diagnostics.

If you thinking about adding message field into LocationRange itself, I think this wrong intention

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about adding an optional message to LocationRange, yes. But I agree, that's a separate change. Let me finish off what I have, then let's take a look at it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spreading a container requires that to be varargs forever. if you decide you want to add something else afterwards, you can't.

better to just pass locations as a container imo

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, you remove the GrammarError.location property, this is breaking change.

I thinking about this a bit. We can unify my and you approaches -- make the GrammarError accept just an array of DiagnosticNote objects. In order to keep backward compatibility I'm suggest the following:

  • Currently implement GrammarError as I suggested:
    class GrammarError {
      constructor(message: string, location?: LocationRange, diagnostics?: ...DiagnosticNote) {}
    }
    // usage:
    throw new GrammarError("main message");
    throw new GrammarError("main message", main_location);
    throw new GrammarError(
      "main message", main_location,
      { message: "additional note message", location: note_location }
    );
    throw new GrammarError(
      "main message", null,
      { message: "additional note message 1", location: note_location },
      { message: "additional note message 2" }
    );
    That keeps the backward compatibility
  • In version 2.0 replace this to
    class GrammarError {
      constructor(diagnostics?: ...DiagnosticNote) {}
    }
    // usage:
    throw new GrammarError({ message: "main message" });
    throw new GrammarError({ message: "main message", location: main_location });
    throw new GrammarError(
      { message: "main message", location: main_location },
      { message: "additional note message", location: note_location }
    );
    throw new GrammarError(
      { message: "main message" },
      { message: "additional note message 1", location: note_location },
      { message: "additional note message 2" }
    );
  • Release 1.2.0 in the coming month, then release 2.0 soon

Maybe even keep both variants, because passing main message and their location are common case.

Anyway, the end-user is unlikely will deal with the GrammarError constructor. I'll make a new PR in couple of days where I introduce the new way of reporting errors, that do not require you to throw exceptions. That will allow to emit several errors, warnings and information messages during compilation run.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, i just found the breaking change when I went to look at the .d.ts file. I've added a getter that should paper over the issue, just returning the first location. I'm willing to try implementing your API above and see how I feel about it. Let's see what I come up with.

If you're going to not be throwing errors, is this PR all moot, or are you planning to reuse any of it in your approach? I'd rather have multiple errors if possible.

Copy link
Member

@Mingun Mingun Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think it still useful, because in the end I'll throw the GrammarError. Actual support of multiply errors in one compilation run will be possible only in 2.0, unfortunately, because that introduces a breaking change which I'm not see how to avoid :( .

const filename = (typeof this.location.source === "string")
? `"${this.location.source}" `
: "";
str += `\nat ${filename}line ${this.location.start.line}, column ${this.location.start.column}`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to use <filename>:<line>:<column> syntax -- VSCode understand it and can create hyperlinks to corresponding locations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer that too. Willfix.

Comment on lines 20 to 135
if (this.referenceLocation) {
const filename = (typeof this.referenceLocation.source === "string")
? `"${this.referenceLocation.source}" `
: "";
str += `\nfrom ${filename}line ${this.referenceLocation.start.line}, column ${this.referenceLocation.start.column}`;
}

return str;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary to follow my general suggestion above, because it will require access to the original string. Just enumerate errors with "note: <message><location>" will be enough. But if you wish, you can implement the Rust format if source grammar string was given 😃

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make that a separate method.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 28, 2021

The biggest issue remaining is that the peg$syntaxError formats are now their old ugly selves. I can translate all of this code to es5 I guess. Opinions?

@Mingun
Copy link
Member

Mingun commented Apr 29, 2021

Because in several issues we already faced with necessity to introduce a small breaking changes, I think, this can be done in the 2.0 release and we can get it relatively quickly. Release 1.2.0 within a month (I plan to make a several compatible PRs for it), and then change that in the 2.0.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 30, 2021

I'm fixing a few coverage misses and a bug that they pointed out.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 30, 2021

I think I've got the API the way we've discussed above. It's backward-compatible. I've added a format function both to GrammarError and peg$SyntaxError, used the format method in bin/peggy, and written tests for everything. There's one line I haven't figured out how to reach with a test, which happens when a syntax error spans multiple lines. I'm sure someone can think of a grammar that does that.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 30, 2021

Sample output:

$ echo "foo = 'bar'\nfoo= ." | bin/peggy -
Error: Rule "foo" is already defined
 --> stdin:2:1
   |
2  | foo= .
   | ^^^^^^
 --> stdin:1:1
   |
1  | foo = 'bar'
   | ^^^^^^^^^^^
note: Original rule location

Copy link
Member

@Mingun Mingun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! Just couple of small comments.

Also, I think it is better to place the note: ... text before their location:

$ echo "foo = 'bar'\nfoo= ." | bin/peggy -
Error: Rule "foo" is already defined
 --> stdin:2:1
   |
2  | foo= .
   | ^^^^^^
note: Original rule location
 --> stdin:1:1
   |
1  | foo = 'bar'
   | ^^^^^^^^^^^

Comment on lines 19 to 39
if (typeof this.location.source === "string") {
str += `${this.location.source}:`;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, we could allow any defined object here. No need to forbid this feature for objects that implements toString protocol

Suggested change
if (typeof this.location.source === "string") {
str += `${this.location.source}:`;
}
if (this.location.source !== undefined && this.location.source !== null) {
str += `${this.location.source}:`;
}

Comment on lines 26 to 47
if (typeof diag.location.source === "string") {
str += `${diag.location.source}:`;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paired with the previous

Suggested change
if (typeof diag.location.source === "string") {
str += `${diag.location.source}:`;
}
if (diag.location.source !== undefined && diag.location.source !== null) {
str += `${diag.location.source}:`;
}

null,
this.locations.map(loc => loc.start.line)
).toString().length;
const srcLines = {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this correctly handle toString, etc. keys? Maybe replace with Map or Object.create(null)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this needs to change now. How about we just make it entry pairs. [[source, text], [source, text]]. That way we won't need a Map shim.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably more clear and easier to document to do [{source: "stdin", text: "input"}]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. There is not so much different sources, so linear search in the array won't a problem. Both, array of arrays or array of structures are acceptable

lib/peg.d.ts Outdated
found?: any;
expected?: ExpectedItem[];
stack?: any;
toString(): string;
format(sources:Sources): string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
format(sources:Sources): string;
format(sources: Sources): string;

@@ -25,7 +25,7 @@ describe("compiler pass |reportInfiniteRecursion|", function() {
"start = stop",
"stop = start"
].join("\n"), {
message: "Possible infinite loop when parsing (left recursion: start -> stop -> start).",
message: "Possible infinite loop when parsing (left recursion: start -> stop -> start)",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we can put each rule to the diagnostics array (stop in that case)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave that for a follow-up, but I agree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

src/parser.pegjs Outdated
= "{" @BareCodeBlock "}"

BareCodeBlock
= expression:Code { return { type: "code", expression, location: location() }; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, type is not used

Suggested change
= expression:Code { return { type: "code", expression, location: location() }; }
= expression:Code { return { expression, location: location() }; }

@hildjj hildjj force-pushed the error-consistency branch from a69730a to 635c939 Compare April 30, 2021 19:33
@hildjj
Copy link
Contributor Author

hildjj commented Apr 30, 2021

Fixed above nits, updated docs + CHANGELOG, squashed. Ready to land?

@hildjj hildjj force-pushed the error-consistency branch 3 times, most recently from 5a8521e to 3e87945 Compare May 2, 2021 21:30
@Mingun
Copy link
Member

Mingun commented May 3, 2021

There's one line I haven't figured out how to reach with a test, which happens when a syntax error spans multiple lines. I'm sure someone can think of a grammar that does that.

That is easy, just generate that error yourself:

// Grammar
start = '' {
  let end = { offset: input.length, line: 1, column: 1 };

  let p = 0;
  while (p < input.length) {
    if (input.charCodeAt(p) === 10) {
      end.line++;
      end.column = 1;
    } else {
      end.column++;
    }

    p++;
  }

  error("error", {
    source: "test", 
    start: { offset: 0, line: 1, column: 1 },
    end
  });
};

@hildjj
Copy link
Contributor Author

hildjj commented May 4, 2021

That is easy, just generate that error yourself:

I wasn't specific enough. It needs to be a peg$SyntaxError from parsing a grammar, which limits the approach a lot.

lib/parser.js Outdated
str += peg$padEnd("", this.location.end.column - this.location.start.column, "^");
}
else {
str += peg$padEnd("", line.length - this.location.start.column + 1, "^");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the line I can't reach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start = @&{
multiline
};

triggers that. If my fork would be used, I've changed that so it points only to @, therefore it will not work.

@Mingun
Copy link
Member

Mingun commented May 4, 2021

I wasn't specific enough. It needs to be a peg$SyntaxError from parsing a grammar, which limits the approach a lot.

error() generates syntax error in the same way as parser does internally, so why not use it?


I believe, that squashing everything into one big commit is not a good practice. There could be difficulties in the incidents investigation, if their happens in the future.

I've split your commit into small fine-grained commits and also implement a better messages for left-recursion detector. You can found that in my fork.

If you wish, I can polish and finish you work.

Also I slightly extend idea about codeLocations and introduce a nameLocation and a labelLocation, so the errors now points to the visually correct places.

Some examples

Left recursion
start = proxy;
proxy = a:end { return a; };
end = !start
PS D:\Projects\parsers\pegjs> node .\bin\peggy .\recursion.pegjs
Error: Possible infinite loop when parsing (left recursion: start -> proxy -> end -> start)
 --> .\recursion.pegjs:1:1
    |
1   | start = proxy;
    | ^^^^^
note: Step 1: call of the rule "proxy" without input consumption
 --> .\recursion.pegjs:1:9
    |
1   | start = proxy;
    |         ^^^^^
note: Step 2: call of the rule "end" without input consumption
 --> .\recursion.pegjs:2:11
    |
2   | proxy = a:end { return a; };
    |           ^^^
note: Step 3: call itself without input consumption - left recursion
 --> .\recursion.pegjs:3:8
    |
3   | end = !start
    |        ^^^^^
Duplicated rules
start = "a";
start = "b";
PS D:\Projects\parsers\pegjs> node .\bin\peggy .\recursion.pegjs
Error: Rule "start" is already defined
 --> .\recursion.pegjs:2:1
   |
2  | start = "b";
   | ^^^^^
note: Original rule location
 --> .\recursion.pegjs:1:1
   |
1  | start = "a";
   | ^^^^^
Duplicated labels
start = a:"a" a:"b";
PS D:\Projects\parsers\pegjs> node .\bin\peggy .\recursion.pegjs
Error: Label "a" is already defined
 --> .\recursion.pegjs:1:15
  |
1 | start = a:"a" a:"b";
  |               ^
note: Original label location
 --> .\recursion.pegjs:1:9
  |
1 | start = a:"a" a:"b";
  |         ^
Plucking together with an action
start = @"a" { code };
PS D:\Projects\parsers\pegjs> node .\bin\peggy .\recursion.pegjs
Error: "@" cannot be used with an action block
 --> .\recursion.pegjs:1:9
  |
1 | start = @"a" { code };
  |         ^
note: Action block location
 --> .\recursion.pegjs:1:15
  |
1 | start = @"a" { code };
  |               ^^^^^^
Plucking of a semantic predicate
start = @&{ code };
PS D:\Projects\parsers\pegjs> node .\bin\peggy .\recursion.pegjs
Error: "@" cannot be used on a semantic predicate
 --> .\recursion.pegjs:1:9
  |
1 | start = @&{ code };
  |         ^

lib/parser.js Outdated
peg$SyntaxError.prototype.format = function(sources) {
var str = "Error: " + this.message;
if (this.location) {
var srcLines = [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable actually not used. I fix that in my fork

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that is leftover from my port. good catch.

@hildjj
Copy link
Contributor Author

hildjj commented May 4, 2021

I think I'd like to take a crack at pulling my single commit into pieces, then we can see what is left from your fork that still needs to be included, if that works for you?

@Mingun
Copy link
Member

Mingun commented May 4, 2021

Yes

@hildjj
Copy link
Contributor Author

hildjj commented May 4, 2021

@Mingun after looking through your patch set in more detail, let's go ahead and take yours.

@Mingun
Copy link
Member

Mingun commented May 4, 2021

You can just reset you branch to my and push it. Also you can replace commit author - still, this is your code, I just organized it.

@Mingun
Copy link
Member

Mingun commented May 5, 2021

OK, I finish in my fork. You can just reset you branch to my, push it and then merge that PR.

Although I want that someone native speaker review my comments in the README.md (docs/documentation.html just copies they).

I have a strong feeling that the docs/documentation.html file was automatically generated from the README.md in the past, but now they are not in sync.

@hildjj hildjj force-pushed the error-consistency branch from 3e87945 to 4155870 Compare May 17, 2021 19:19
@hildjj
Copy link
Contributor Author

hildjj commented May 17, 2021

I finally got around to the merge, @Mingun. I think this is ready to merge.

@Mingun
Copy link
Member

Mingun commented May 17, 2021

Yes, it is ready. Do it.

@hildjj hildjj merged commit f07c500 into peggyjs:main May 17, 2021
@hildjj hildjj deleted the error-consistency branch May 17, 2021 20:17
physikerwelt referenced this pull request in physikerwelt/texvcjs May 21, 2022
Change-Id: I1e164456794980c0e2d95de8c8b8e8ec3ea7021e
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Never set line/col information in GrammarError message
3 participants