Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ssue #2896 - trim extra whitespaces with comment at end of line #4391

Closed
wants to merge 2 commits into from
Closed

ssue #2896 - trim extra whitespaces with comment at end of line #4391

wants to merge 2 commits into from

Conversation

davidBar-On
Copy link
Contributor

@davidBar-On davidBar-On commented Aug 19, 2020

Should resolve issue #2896 when trailing whitespaces remain because a comment at the end of the line was moved to a new line.
Example resolved:

fn main() {
    if 0 == 1 
    /* x */ as i32 {} }
fn main() {
    if 0 == ' ' /* x */ 
    as i32 {} }

Copy link
Contributor

@ayazhafiz ayazhafiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is worth adding test cases for the issue as well.

@@ -1604,7 +1604,15 @@ pub(crate) fn recover_comment_removed(
let snippet = context.snippet(span);
let includes_comment = contains_comment(snippet);
if snippet != new && includes_comment && changed_comment_content(snippet, &new) {
Some(snippet.to_owned())
/* Trim white spaces at end of lines */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* Trim white spaces at end of lines */
/* Trim whitespace at end of lines */

/* Trim white spaces at end of lines */
let mut c = String::from("");
for line in snippet.to_owned().split('\n') {
if c != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if c != "" {
if !c.is_empty() {

};
c.push_str(line.trim_end());
}
Some(c.to_owned())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c is already a String, is to_owned needed?

@davidBar-On
Copy link
Contributor Author

Relevant test cases were added to file ".../tests/source/issue-2896.rs"

@calebcartwright
Copy link
Member

Thank you for the PR @davidBar-On, though I'm not yet convinced this is the best approach. In cases like these, rustfmt is failing to format and simply bailing since it's detected there's a comment that was lost, and emitting the original formatting. Since the original formatting had the trailing space, that is then triggering the warning/error.

Instead of trying to address a symptom of the failed formatting as proposed with the changes in this PR, we'd ideally be able to find the root cause and address there.

Specifically, the problem here is the formatting of the cast expression where comments between the subexpression and the type are getting dropped. You can see this reproduced with a simpler example like the below:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=305fdba4837383e339c36b4be8e41b05

fn main() {
    let x = 0f64            /* x */ as i32;
}

There's a very similar issue with formatting of binary expressions that have comments between the operator and operands:

fn main() {
    let x = 2        /* x */      *       /* y */     4;
}

The formatting logic for both of these expression kinds leverage rewrite_pair, so I suspect we'd either want to try to do some work there to better handle comments and/or perhaps in the infix derivation for those within format_expr though I've not had a chance to look at this too closely.

pub(crate) fn rewrite_pair<LHS, RHS>(
lhs: &LHS,
rhs: &RHS,
pp: PairParts<'_>,
context: &RewriteContext<'_>,
shape: Shape,
separator_place: SeparatorPlace,
) -> Option<String>
where

pub(crate) fn format_expr(
expr: &ast::Expr,
expr_type: ExprType,
context: &RewriteContext<'_>,
shape: Shape,
) -> Option<String> {

@davidBar-On
Copy link
Contributor Author

@calebcartwright, the issue handled by the PR is when a comment is moved to a new line and whitespace is left at the end of first line, not to a missing comment. Can't this case happen even if all other comments related issues are solved, e.g. when the comment is to long to remain in the line? If it can, then this PR may be of value.

Regarding the missing comment examples you gave. For the first I tried adding to rewrite_pair() a Span parameter of the code between the two pair items and append comments in this are to the rhs value. That seem to work well.
The second example is handled by rewrite_pairs_one_line() and requires more sophisticated handling for the comments before and after the operator. I see that lists.rs include fictions with similar functionality that may be reused, e.g. extract_pre_comment() and extract_post_comment().
I assume that related changes are also required to rewrite_pairs_multiline().

Is the above approach for changes to rewrite_pair() seem to be correct? If yes I can try to come up with a pull request for changes to the three functions.

@calebcartwright
Copy link
Member

@davidBar-On

@calebcartwright, the issue handled by the PR is when a comment is moved to a new line and whitespace is left at the end of first line, not to a missing comment.

Sorry, but that's not what is happening with the snippets in the PR description nor the referenced issue. As I detailed above and in Discord, the problem is the failed formatting of the cast expression.

rustfmt tries really hard to not be destructive and remove code, so when it detects that the updated formatting would be destructive, it bails and revers to the original snippet. rustfmt failed to format the cast expression in this case because rustfmt detected that it was going to lose the inner comments, so reverts back to the original snippet for the cast expr. Because the original snippets all had trailing spaces (if you look closely you'll see them), that then triggers the format failed warnings.

Can't this case happen even if all other comments related issues are solved, e.g. when the comment is to long to remain in the line? If it can, then this PR may be of value

Nope. Trailing whitespace will be trimmed even on long comments in cases where rustfmt formats successfully. The proposed changes here are would only obfuscate cases where rustfmt failed to format. We do not want to have rustfmt fail to format an expression, have to revert to the original formatting, and then try to retroactively process and tweak that formatting.

Is the above approach for changes to rewrite_pair() seem to be correct?

Sorry but it's not entirely clear to me what the approach/thinking is. We definitely do not want to try to calculate spans within rewrite_pair. rewrite_pair and friends are very generalized by design to support formatting a variety of use cases, and we don't want to try to force additional responsibilities in there that belong elsewhere.

Please also note that, at least in the case of a cast expression, there are two distinct "between" spans; not just one between the pair. There's the span between the subexpression and as and another between as and the type, both of which could potentially have comments, and rustfmt has to be able to handle all of them:

let x = 1       /* foo */   as      i32;
let x = 1       /* foo */   as        /* bar */  i32;
let x = 1       /* foo */as/* bar */  i32;

I see two potential approaches to address this:

  1. Handle exclusively within the formatting logic for cast expressions. I suppose one way to do this could be to derive the before/after spans described above, grab the comments and incorporate them somehow in the infix.
  2. Enhance PairParts to take some additional inputs that represent (or are used to derive) a lhs suffix (aka infix prefix) and a rhs prefix (aka infix suffix), and utilize that in rewrite_pair

@davidBar-On
Copy link
Contributor Author

@calebcartwright thanks for the detailed response. I hope it will prove useful and not a waste of your time.

Regarding the potential approaches I would like to continue evaluating the second (which also seem easier with my current understanding of the code and is also somewhat is the direction I thought to take)):

Enhance PairParts to take some additional inputs that represent (or are used to derive) a lhs suffix (aka infix prefix) and a rhs prefix (aka infix suffix), and utilize that in rewrite_pair

For cast the there is an issue that the span for the "as" does not exist. Therefore, to separate between the infix prefix and suffix, the string "as" should be searched for in the code snippet between the pair items. However, this is not straight forward because "as" can be part of a comment. Therefore the search should ignore comments text and look for the first "as". It may be better if a "as" span will be part of the cast expression, similar to the op node in binary expressions. Is is possible to do that (I couldn't figure that out)? Is it better than searching the code?

Another similar issue is that binary pair is handled by rewrite_pairs_one_line() and rewrite_pairs_multi_line() which do not get the op span. As above the operator (separator) can be searched in the code snippet, but a better way may be that FlattenPair for ast::Expr will include the separator span in its output to make it accessible to rewrite_pairs_one_line(). Again, the question is whether this better than searching the code.

@calebcartwright
Copy link
Member

For cast the there is an issue that the span for the "as" does not exist

That is correct. There's a lot of utility functions within rustfmt for dealing with spans, comments, and more, and you'd want to utilize those here. I believe something like the below would do the trick:

let before_span = mk_sp(
    subexpr.span.hi(),
    context.snippet_provider.span_before(expr.span, "as"),
);
let after_span = mk_sp(
    context.snippet_provider.span_after(expr.span, "as"),
    ty.span.lo(),
);

However, this is not straight forward because "as" can be part of a comment. Therefore the search should ignore comments text and look for the first "as".

That's why we have the util functions 😄 This is already handled correctly with the util functions I showed above.

It may be better if a "as" span will be part of the cast expression, similar to the op node in binary expressions

The AST is defined by rustc, not rustfmt, and keywords like as aren't explicitly represented within the AST nodes because they don't need to be.

Another similar issue is that binary pair is handled by rewrite_pairs_one_line() and rewrite_pairs_multi_line() which do not get the op span. As above the operator (separator) can be searched in the code snippet, but a better way may be that FlattenPair for ast::Expr will include the separator span in its output to make it accessible to rewrite_pairs_one_line(). Again, the question is whether this better than searching the code.

I want to reiterate my suggestion to focus on one issue at a time. PRs should be as small and targeted as possible. This makes them easier to review and increases the odds of being merged. Please don't try to address multiple, separate things within a single PR.

I'd really encourage you to focus on the cast expressions, and only the cast expressions, and get that working first before moving on to others.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants