Indent statements in suppressed ranges #6507

Merged
merged 2 commits into from
Aug 15, 2023

MichaReiser
Member

@MichaReiser MichaReiser commented Aug 11, 2023

Summary

This PR implements logic to fix up the indentation of suppressed statement ranges so that formatting the following file no longer results in a syntax error:

if True:
  pass
  # fmt: off
  print("test")

# Before
if True:
    pass
  # fmt: off
  print("test")

# After this PR
if True:
    pass
    # fmt: off
    print("test")
  • The relative indent of nested statements remains unchanged.
  • Indentations of parenthesized expressions remain unchanged. Users have to fix these indents up manually. This aligns the fmt: off behavior with fmt: skip, which re-indents the statement but not its content.
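
The re-indentation described above can be sketched in Python. This is a minimal illustration and not Ruff's implementation, which is written in Rust. The helper names `reindent_suppressed` and `parses` are hypothetical, the sketch only understands space indentation, and it shifts every line by the same amount, so unlike the PR it would also move continuation lines of parenthesized expressions.

```python
import ast


def reindent_suppressed(lines, target_indent):
    """Shift a suppressed range so its first line sits at `target_indent`.

    Hypothetical helper for illustration only. Every non-blank line is
    shifted by the same amount, so nested statements keep their relative
    indent.
    """
    non_blank = [line for line in lines if line.strip()]
    if not non_blank:
        return list(lines)
    first = non_blank[0]
    base = len(first) - len(first.lstrip(" "))
    shifted = []
    for line in lines:
        if not line.strip():
            shifted.append("")
            continue
        indent = len(line) - len(line.lstrip(" "))
        new_indent = max(target_indent + indent - base, 0)
        shifted.append(" " * new_indent + line.lstrip(" "))
    return shifted


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False
    return True


# The formatted part of the file uses 4 spaces; the suppressed part still uses 2.
formatted_prefix = ["if True:", "    pass"]
suppressed = ["  # fmt: off", '  print("test")']

before = "\n".join(formatted_prefix + suppressed) + "\n"
after = "\n".join(formatted_prefix + reindent_suppressed(suppressed, 4)) + "\n"

print(parses(before))  # False: the mismatched indentation is rejected
print(parses(after))   # True
```

Shifting all lines by one offset is the simplest way to keep the relative indent of nested statements; a closer model of the PR would leave lines inside parentheses untouched.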

It may be a bit surprising that this PR also changes the indentation of comments. This is necessary to prevent a comment from being associated with a different node after formatting once if the indentation changed.

For example, if we format the following with an indent of 4 spaces:

# fmt: off
def test():
  pass

  # It is necessary to indent comments because the following fmt: on comment because it otherwise becomes a trailing comment
  # of the `test` function if the "proper" indentation is larger than 2 spaces.
  # fmt: on

disabled +  formatting;

The output without re-indenting comments would be:

# fmt: off
def test():
    pass

  # It is necessary to indent comments because the following fmt: on comment because it otherwise becomes a trailing comment
  # of the `test` function if the "proper" indentation is larger than 2 spaces.
  # fmt: on

disabled +  formatting;

This results in an instability because the end-of-body comments of test now have a smaller indentation than pass, so the formatter associates them as trailing comments of the function test rather than of the pass statement. Now, fmt: on suddenly re-enables formatting of disabled + formatting;
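
The instability can be illustrated with a deliberately simplified association rule. The function `owner_of_own_line_comment` below is hypothetical and only compares indentation widths; Ruff's actual comment placement logic is more involved.

```python
def owner_of_own_line_comment(comment_indent, body_indent):
    """Decide which node an own-line comment after a function body attaches to.

    Hypothetical, simplified rule: a comment indented at least as far as the
    body belongs to the last body statement, otherwise to the enclosing
    function.
    """
    if comment_indent >= body_indent:
        return "body statement"
    return "enclosing function"


# Source file: `pass` and the comments are both indented by 2.
print(owner_of_own_line_comment(2, 2))  # body statement

# `pass` re-indented to 4, comments left at 2: the owner changes.
print(owner_of_own_line_comment(2, 4))  # enclosing function

# Comments re-indented together with the statements: the owner is stable.
print(owner_of_own_line_comment(4, 4))  # body statement
```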

Re-formatting comments and statements inside a fmt: off..fmt: on region may be controversial, but the benefit of being able to format files with suppressed ranges, rather than failing when the indentations end up mismatching, outweighs the possibly surprising behavior, mainly because it results in no change if the statements are already correctly indented.

Test Plan

I added new test cases, including test cases with mixed indentation.

@MichaReiser
Member Author

MichaReiser commented Aug 11, 2023

@MichaReiser MichaReiser added the formatter Related to the formatter label Aug 11, 2023
@github-actions
Contributor

github-actions bot commented Aug 11, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.00      4.5±0.06ms     8.9 MB/sec    1.00      4.6±0.06ms     8.9 MB/sec
formatter/numpy/ctypeslib.py               1.00    889.4±6.43µs    18.7 MB/sec    1.01    898.5±4.05µs    18.5 MB/sec
formatter/numpy/globals.py                 1.00     90.2±0.64µs    32.7 MB/sec    1.06     95.6±5.45µs    30.9 MB/sec
formatter/pydantic/types.py                1.00  1781.8±28.44µs    14.3 MB/sec    1.02   1809.1±6.54µs    14.1 MB/sec
linter/all-rules/large/dataset.py          1.02     13.0±0.45ms     3.1 MB/sec    1.00     12.6±0.09ms     3.2 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      3.3±0.04ms     5.0 MB/sec    1.00      3.3±0.03ms     5.0 MB/sec
linter/all-rules/numpy/globals.py          1.01    473.1±8.17µs     6.2 MB/sec    1.00    470.2±0.72µs     6.3 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.5±0.17ms     3.9 MB/sec    1.02      6.6±0.52ms     3.8 MB/sec
linter/default-rules/large/dataset.py      1.00      6.6±0.09ms     6.2 MB/sec    1.02      6.7±0.04ms     6.1 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1430.4±13.88µs    11.6 MB/sec    1.02   1455.6±4.67µs    11.4 MB/sec
linter/default-rules/numpy/globals.py      1.00    168.6±1.16µs    17.5 MB/sec    1.00    169.1±0.44µs    17.4 MB/sec
linter/default-rules/pydantic/types.py     1.00      2.9±0.04ms     8.7 MB/sec    1.01      3.0±0.02ms     8.6 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.01      4.2±0.05ms     9.6 MB/sec    1.00      4.2±0.05ms     9.7 MB/sec
formatter/numpy/ctypeslib.py               1.00   796.0±10.80µs    20.9 MB/sec    1.00   793.3±14.85µs    21.0 MB/sec
formatter/numpy/globals.py                 1.00     82.0±2.03µs    36.0 MB/sec    1.00     81.8±1.96µs    36.1 MB/sec
formatter/pydantic/types.py                1.01  1678.5±31.20µs    15.2 MB/sec    1.00  1664.2±27.01µs    15.3 MB/sec
linter/all-rules/large/dataset.py          1.02     13.0±0.17ms     3.1 MB/sec    1.00     12.7±0.12ms     3.2 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.01      3.6±0.03ms     4.6 MB/sec    1.00      3.5±0.07ms     4.7 MB/sec
linter/all-rules/numpy/globals.py          1.02    372.1±9.55µs     7.9 MB/sec    1.00    364.5±7.95µs     8.1 MB/sec
linter/all-rules/pydantic/types.py         1.01      6.6±0.07ms     3.8 MB/sec    1.00      6.5±0.06ms     3.9 MB/sec
linter/default-rules/large/dataset.py      1.00      7.0±0.06ms     5.8 MB/sec    1.00      7.0±0.11ms     5.8 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.01  1468.5±37.67µs    11.3 MB/sec    1.00  1448.9±20.34µs    11.5 MB/sec
linter/default-rules/numpy/globals.py      1.04    151.4±3.68µs    19.5 MB/sec    1.00    145.3±3.18µs    20.3 MB/sec
linter/default-rules/pydantic/types.py     1.01      3.1±0.04ms     8.1 MB/sec    1.00      3.1±0.02ms     8.2 MB/sec

@MichaReiser MichaReiser changed the base branch from fmt-on-off to main August 14, 2023 09:49
@MichaReiser MichaReiser force-pushed the fix-suppressed-indentation branch from ce43cf6 to 09bc0d9 Compare August 14, 2023 09:49
@MichaReiser MichaReiser changed the base branch from main to fmt-on-off August 14, 2023 10:42
@MichaReiser MichaReiser force-pushed the fix-suppressed-indentation branch 2 times, most recently from 33e5b05 to 0883745 Compare August 14, 2023 11:48
@MichaReiser MichaReiser marked this pull request as ready for review August 14, 2023 11:49
@MichaReiser MichaReiser mentioned this pull request Aug 10, 2023
6 tasks
@MichaReiser MichaReiser force-pushed the fix-suppressed-indentation branch from 0883745 to e4b26f2 Compare August 14, 2023 12:02
@MichaReiser MichaReiser force-pushed the fix-suppressed-indentation branch from e4b26f2 to 12f9bbc Compare August 14, 2023 12:31
crates/ruff_formatter/src/printer/mod.rs (outdated, resolved)
/// This implementation makes use of this fact and assumes a tab size of 1.
/// * The source document is correctly indented because it is valid Python code (or the formatter would have failed parsing the code).
#[derive(Copy, Clone)]
struct Indentation(u32);
Member

There's already an Indentation in ruff_python_parser. Do we want to give this one a different name?

Member Author

Do you have a recommendation? I wasn't able to come up with a good name.

I'm not too concerned about the naming conflict, considering that this is a private type.
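
The doc comment quoted in this thread says the implementation assumes a tab size of 1 and relies on the source being valid Python. A minimal Python sketch of that idea follows. The helper `indent_width` is a hypothetical name and is not taken from the PR; it counts each leading space or tab as one unit.

```python
def indent_width(line):
    """Count leading spaces and tabs, treating each as width 1.

    Hypothetical helper mirroring the "tab size of 1" assumption.
    """
    width = 0
    for char in line:
        if char not in (" ", "\t"):
            break
        width += 1
    return width


print(indent_width("\tpass"))    # 1
print(indent_width("    pass"))  # 4
print(indent_width("\t  pass"))  # 3
```

Python rejects source whose meaning would depend on how many spaces a tab is worth, so for valid input these widths should still order the indentation levels of a file consistently even though they are not the visual widths.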

crates/ruff_python_formatter/src/verbatim.rs (two outdated, resolved threads)
Comment on lines +666 to +677
/// Notice how the `not_formatted + b` expression statement gets the same indentation as the `print` statement above,
/// but the indentation of the expression remains unchanged. It changes the indentation to:
/// * Prevent syntax errors because of different indentation levels between formatted and suppressed statements.
/// * Align with the `fmt: skip` where statements are indented as well, but inner expressions are formatted as is.
Member

I like the docstring solution of only stripping the shared leading indentation. Through docstrings we would also already have the utilities for that, except for own-line comments.

Member Author

I would like that too. But we would need to track two different indentations:

  • One to strip before statements
  • The shared leading indentation, for comments

It may otherwise result in invalid code if a comment has less indentation than the statements. I think we can consider introducing this complexity if we get reports that our fmt: off breaks the intended formatting (or, e.g., test whether the entire suite is disabled and, if so, don't fix up the indent).
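
The concern about a single shared indentation can be reproduced with Python's standard library. The sketch below uses `textwrap.dedent` as a stand-in for the docstring-style approach; `strip_shared_indent` and `parses` are hypothetical helper names, and this is an illustration rather than what Ruff does.

```python
import ast
import textwrap


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False
    return True


def strip_shared_indent(block, target_indent):
    """Drop the common leading whitespace, then indent by `target_indent` spaces."""
    return textwrap.indent(textwrap.dedent(block), " " * target_indent)


# Already formatted part of the file, indented with 4 spaces.
prefix = "if True:\n    pass\n"

# Comment and statement share the same indentation: works.
aligned = "  # fmt: off\n  print('a')\n"

# The comment has less indentation than the statement, so the shared
# indentation is taken from the comment and the statement keeps one extra space.
comment_less_indented = " # fmt: off\n  print('a')\n"

print(parses(prefix + strip_shared_indent(aligned, 4)))                # True
print(parses(prefix + strip_shared_indent(comment_less_indented, 4)))  # False
```

In the second case `print('a')` ends up at 5 spaces after a `pass` at 4, which Python rejects as an unexpected indent. Tracking one indentation for statements and another for comments would avoid this.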

Comment on lines 778 to 794
// Is there any remaining content after the last token? If so, return it as the last
// logical line.
return if self.last_line_end < self.content_end {
    let content_start = self.last_line_end;
    self.last_line_end = self.content_end;
    Some(Ok(LogicalLine {
        content_range: TextRange::new(content_start, self.content_end),
        contains_newlines,
        has_trailing_newline: false,
Member

What would that content be, and how do we know whether it contains newlines?

Member Author

I extended the comment to mention what content it is and changed contains_newlines to always be No because the lexer would otherwise return a Newline token.

@MichaReiser MichaReiser force-pushed the fix-suppressed-indentation branch from 12f9bbc to bdab1c2 Compare August 14, 2023 15:47
Base automatically changed from fmt-on-off to main August 14, 2023 15:57
@MichaReiser MichaReiser force-pushed the fix-suppressed-indentation branch from bdab1c2 to df98068 Compare August 14, 2023 16:09
@MichaReiser MichaReiser merged commit 232b44a into main Aug 15, 2023
@MichaReiser MichaReiser deleted the fix-suppressed-indentation branch August 15, 2023 06:00