Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors once #152

Merged
merged 2 commits into from
Aug 17, 2023
Merged

Errors once #152

merged 2 commits into from
Aug 17, 2023

Conversation

ehuelsmann
Copy link
Contributor

🤔 What's changed?

The Perl token matcher gained the ability to accept a token and at the same time indicate a token parser error. Before - due to the use of exceptions to indicate parser errors - it would only be possible to achieve one or the other.

⚡️ What's your motivation?

By making this change, the Perl implementation is able to report the parsing error exactly once, instead of - as it does now - requiring to de-duplicate errors which might be reported multiple times (this can happen due to the fact that the lookahead function of the parser triggers a parser error.

🏷️ What kind of change is this?

🏦 Refactoring/debt/DX (improvement to code design, tooling, documentation etc. without changing behaviour)

Since the error of the duplicates has already been fixed (the fast way), PR is more of a maintenance type of PR.

♻️ Anything particular you want feedback on?

During the weekly meeting, we assessed that exceptions are not the way to return data from a function.

📋 Checklist:

  • I agree to respect and uphold the Cucumber Community Code of Conduct
  • I haven't changed the behaviour of the code
  • My change doesn't require a change to the documentation.
  • Users should not have to know about my change (as it's fully internal behaviour to the parser

… the same time

The current implemnetation of the parser calls the token matcher multiple times
to identify the same input line (e.g. in a 'lookahead' or as part of the state
machine ('match_token_at_*').  When an invocation causes an error (e.g. because
there are tags with embedded spaces, or the selected language isn't available),
the parser always stashes the error in the list of errors. Since the matcher
may be invoked multiple times, the error may get stashed multiple times. To
fix the user experience, Ruby, Perl, Python and others de-duplicate the reported
errors by error message.

This change moves stashing of errors to the point where the token is accepted into
the AST, which means that multiples of the same error are discarded when that makes
sense. E.g. the lookahead discards the errors and *only* looks at whether there
is a match with the desired token type.

Note: The structure of returning a boolean accepting the line and a second element
indicating an error, adds the currently-missing capability of saying 'yes, this is
what you're looking for, but there are problems with it'.
Move the responsibility to check that there even *is* an input line to match,
to the token matcher. Now that the parser largely isn't responsible for it
anymore, we can do away with a whole slew of equally-named wrapper functions
in the parser that already exist in the token matcher. Not having two sets
of functions by the same name helps to navigate the code.
@ehuelsmann ehuelsmann merged commit 59bc8e8 into main Aug 17, 2023
27 checks passed
@ehuelsmann ehuelsmann deleted the errors-once branch August 17, 2023 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant