Errors once #152

ehuelsmann · 2023-08-17T15:23:27Z

🤔 What's changed?

The Perl token matcher gained the ability to accept a token and at the same time indicate a token parser error. Before - due to the use of exceptions to indicate parser errors - it would only be possible to achieve one or the other.

⚡️ What's your motivation?

By making this change, the Perl implementation is able to report the parsing error exactly once, instead of - as it does now - requiring to de-duplicate errors which might be reported multiple times (this can happen due to the fact that the lookahead function of the parser triggers a parser error.

🏷️ What kind of change is this?

🏦 Refactoring/debt/DX (improvement to code design, tooling, documentation etc. without changing behaviour)

Since the error of the duplicates has already been fixed (the fast way), PR is more of a maintenance type of PR.

♻️ Anything particular you want feedback on?

During the weekly meeting, we assessed that exceptions are not the way to return data from a function.

📋 Checklist:

I agree to respect and uphold the Cucumber Community Code of Conduct
I haven't changed the behaviour of the code
My change doesn't require a change to the documentation.
Users should not have to know about my change (as it's fully internal behaviour to the parser

… the same time The current implemnetation of the parser calls the token matcher multiple times to identify the same input line (e.g. in a 'lookahead' or as part of the state machine ('match_token_at_*'). When an invocation causes an error (e.g. because there are tags with embedded spaces, or the selected language isn't available), the parser always stashes the error in the list of errors. Since the matcher may be invoked multiple times, the error may get stashed multiple times. To fix the user experience, Ruby, Perl, Python and others de-duplicate the reported errors by error message. This change moves stashing of errors to the point where the token is accepted into the AST, which means that multiples of the same error are discarded when that makes sense. E.g. the lookahead discards the errors and *only* looks at whether there is a match with the desired token type. Note: The structure of returning a boolean accepting the line and a second element indicating an error, adds the currently-missing capability of saying 'yes, this is what you're looking for, but there are problems with it'.

Move the responsibility to check that there even *is* an input line to match, to the token matcher. Now that the parser largely isn't responsible for it anymore, we can do away with a whole slew of equally-named wrapper functions in the parser that already exist in the token matcher. Not having two sets of functions by the same name helps to navigate the code.

ehuelsmann added 2 commits August 16, 2023 21:14

ehuelsmann mentioned this pull request Aug 17, 2023

TokenMatcher parse errors are not exceptions #153

Open

6 tasks

ehuelsmann merged commit 59bc8e8 into main Aug 17, 2023
27 checks passed

ehuelsmann deleted the errors-once branch August 17, 2023 18:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Errors once #152

Errors once #152

ehuelsmann commented Aug 17, 2023

Errors once #152

Errors once #152

Conversation

ehuelsmann commented Aug 17, 2023

🤔 What's changed?

⚡️ What's your motivation?

🏷️ What kind of change is this?

♻️ Anything particular you want feedback on?

📋 Checklist: