Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misplaced parenthesis in object destructuring argument with trailing argument comma #513

Open
tlrobinson opened this issue Jun 28, 2018 · 2 comments

Comments

@tlrobinson
Copy link

I haven't been able to pin down exactly why this is happening, but here's my smallest reproduction so far. I'm using flow the parser.

class Foo {
  method = item => {
    return (
      <div className=""></div>
    );
  };
}

function x({ className, }) {
}

gets printed as

class Foo {
  method = item => {
    return (
      <div className=""></div>
    );
  };
}

function x({ (className, })) {
}

Note the extra (and mis-nested) parenthesis in the last function.

Interestingly, changing just about anything about the code above causes the problem to go away.

@benjamn
Copy link
Owner

benjamn commented Jun 28, 2018

This is just a guess, but in the past, bugs like this have sometimes been due to the parser providing slightly incorrect .loc.{start,end}.{line,column} information. Like, perhaps the parser gets confused by the trailing comma and decides that the className identifier location extends all the way past the end of the parameter list… or something like that.

benjamn added a commit that referenced this issue Sep 10, 2018
Recast has suffered for a long time because it did not have reliable
access to the lexical analysis of source tokens during reprinting.

Most importantly, accurate token information could be used to detect
whether a node was originally wrapped with parentheses, even if the
parentheses are separated from the node by comments or other incidental
non-whitespace text, such as trailing commas. Here are just some of the
issues that have resulted from the lack of reliable token information:
- #533
- #528
- #513
- #512
- #366
- #327
- #286

With this change, every node in the AST returned by recast.parse will now
have a node.loc.tokens array representing the entire sequence of original
source tokens, as well as node.loc.{start,end}.token indexes into this
array of tokens, such that

  node.loc.tokens.slice(
    node.loc.start.token,
    node.loc.end.token
  )

returns a complete list of all source tokens contained by the node. Note
that some nodes (such as comments) may contain no source tokens, in which
case node.loc.start.token === node.loc.end.token, which will be the index
of the first token *after* the position where the node appeared.

Most parsers can expose token information for free / very cheaply, as a
byproduct of the parsing process. In case a custom parser is provided that
does not expose token information, we fall back to Esprima's tokenizer.
While there is considerable variation between different parsers in terms
of AST format, there is much less variation in tokenization, so the
Esprima tokenizer should be adequate in most cases (even for JS dialects
like TypeScript). If it is not adequate, the caller should simply ensure
that the custom parser exposes an ast.tokens array containing token
objects with token.loc.{start,end}.{line,column} information.
@benjamn
Copy link
Owner

benjamn commented Sep 11, 2018

Can you still reproduce this? I've been trying to write a regression test, and it seems to work now. If my previous theory is correct, the Flow parser might have fixed their .loc tracking in the meantime.

benjamn added a commit that referenced this issue Sep 11, 2018
Recast has suffered for a long time because it did not have reliable
access to the lexical analysis of source tokens during reprinting.

Most importantly, accurate token information could be used to detect
whether a node was originally wrapped with parentheses, even if the
parentheses are separated from the node by comments or other incidental
non-whitespace text, such as trailing commas. Here are just some of the
issues that have resulted from the lack of reliable token information:
- #533
- #528
- #513
- #512
- #366
- #327
- #286

With this change, every node in the AST returned by recast.parse will now
have a node.loc.tokens array representing the entire sequence of original
source tokens, as well as node.loc.{start,end}.token indexes into this
array of tokens, such that

  node.loc.tokens.slice(
    node.loc.start.token,
    node.loc.end.token
  )

returns a complete list of all source tokens contained by the node. Note
that some nodes (such as comments) may contain no source tokens, in which
case node.loc.start.token === node.loc.end.token, which will be the index
of the first token *after* the position where the node appeared.

Most parsers can expose token information for free / very cheaply, as a
byproduct of the parsing process. In case a custom parser is provided that
does not expose token information, we fall back to Esprima's tokenizer.
While there is considerable variation between different parsers in terms
of AST format, there is much less variation in tokenization, so the
Esprima tokenizer should be adequate in most cases (even for JS dialects
like TypeScript). If it is not adequate, the caller should simply ensure
that the custom parser exposes an ast.tokens array containing token
objects with token.loc.{start,end}.{line,column} information.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants