Add range formatting support #92
I don't mind this, honestly, because if an argument breaks, then the entire function call should break. I imagine you could get yourself into weird scenarios where an argument is formatted correctly but the function call as a whole isn't.
I think everything looks good except for the `find_deepest_enclosing_logical_lines` algorithm, see comments below.
e619633 to c7741dc
I was envisioning that only expanded calls / pipelines would work this way. Flattened calls and pipelines would be formatted as a whole. I think the idea for range formatting is not to leave the whole document in a consistently formatted state but just the minimal range that the user is currently working with. Changing code outside of this scope is likely to be distracting. Then the user can save or format the whole doc if needed. That said, I don't think this would be worth the extra complexity, so I'm happy with what we have here.
crates/lsp/src/handlers_format.rs
let logical_lines: Vec<RSyntaxNode> = iter
    .map(|expr| expr.into_syntax())
    .skip_while(|node| !node.text_range().contains(range.start()))
    .take_while(|node| node.text_range().start() <= range.end())
    .collect();
In this case, were you expecting that it formats both `2 + 2` and THE ENTIRE `for` loop? i.e. note how `1 : 10` is formatted even though it is outside the selection.
It seems to be due to this bit here.
Screen.Recording.2024-12-12.at.5.16.07.PM.mov
As I mentioned in #92 (comment), I was expecting that this just formats `2 + 2` because that is the only node inside the common logical line "root" fully covered by the user's selection.
It seems like the most conservative approach that avoids going outside of the user's selection?
Should it be this?
- let logical_lines: Vec<RSyntaxNode> = iter
-     .map(|expr| expr.into_syntax())
-     .skip_while(|node| !node.text_range().contains(range.start()))
-     .take_while(|node| node.text_range().start() <= range.end())
-     .collect();
+ let logical_lines: Vec<RSyntaxNode> = iter
+     .map(|expr| expr.into_syntax())
+     .filter(|node| range.contains_range(node.text_trimmed_range()))
+     .collect();
Careful to use `text_trimmed_range()` here, because we'd want this (where `<` and `>` represent the selection) to still format the `1+1` even though it doesn't span the full `text_range()` when you add in the comment
# hi there
<1+1>
This breaks one of your tests related to partial selection (reproduced in the image below) where I'd argue that only `1+1` should be reformatted
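To make the difference between the two selection strategies concrete, here is a minimal, self-contained sketch. `Span`, `Line`, and the offsets in `example_lines()` are hypothetical stand-ins invented for illustration (they are not the Biome or Air types); `full` plays the role of `text_range()` and `trimmed` the role of `text_trimmed_range()`.

```rust
/// Half-open byte range `[start, end)`; a hypothetical stand-in for a text range.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    /// True if `offset` falls inside the range.
    fn contains(self, offset: usize) -> bool {
        self.start <= offset && offset < self.end
    }

    /// True if `other` lies entirely inside the range.
    fn contains_range(self, other: Span) -> bool {
        self.start <= other.start && other.end <= self.end
    }
}

/// Hypothetical logical line: `full` includes leading comments and
/// whitespace, `trimmed` covers only the code itself.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Line {
    label: &'static str,
    full: Span,
    trimmed: Span,
}

/// Widening strategy: start at the line containing the selection start and
/// keep every following line that begins at or before the selection end.
fn select_widening(lines: &[Line], sel: Span) -> Vec<&'static str> {
    lines
        .iter()
        .skip_while(|line| !line.full.contains(sel.start))
        .take_while(|line| line.full.start <= sel.end)
        .map(|line| line.label)
        .collect()
}

/// Conservative strategy: keep only lines whose trimmed range is fully
/// covered by the selection.
fn select_conservative(lines: &[Line], sel: Span) -> Vec<&'static str> {
    lines
        .iter()
        .filter(|line| sel.contains_range(line.trimmed))
        .map(|line| line.label)
        .collect()
}

/// Made-up offsets: a comment followed by `1+1`, then a `for` loop, then `3+3`.
fn example_lines() -> Vec<Line> {
    vec![
        // The comment `# hi there` is leading trivia of the first line.
        Line { label: "1+1", full: Span { start: 0, end: 14 }, trimmed: Span { start: 11, end: 14 } },
        Line { label: "for", full: Span { start: 14, end: 40 }, trimmed: Span { start: 15, end: 40 } },
        Line { label: "3+3", full: Span { start: 40, end: 44 }, trimmed: Span { start: 41, end: 44 } },
    ]
}
```

With a selection of `11..20` (all of `1+1` plus the start of the `for` loop), the widening version picks both `1+1` and the whole `for` loop, while the conservative version picks only `1+1`. With a selection of exactly `11..14`, the trimmed range of the first line is covered but its full range is not, which is why the trimmed range matters for the conservative filter.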
Yes, this was on purpose. We always format the selected part of user code. Sometimes that means widening the range to the highest level of logical line selected by the user. In other words, we never narrow the requested range; we only widen it.
I think what I don't love about this is that it means we can format lines arbitrarily far away from the user's actual selection.
Like in this case, I accidentally selected a little too far up and it means that the whole function is formatted, even if that's 20 lines further up.
Screen.Recording.2024-12-13.at.10.02.53.AM.mov
We talked and decided to keep the current behavior, but with a comment around these lines noting that they allow the current implementation to expand the user's range. We'll also add my suggested code above as a commented-out alternative that is more conservative, in case we ever want to switch to that.
Happens in edge cases when biome returns a `Printed::new_empty()`
f90e017 to 5e8dfd8
Here is an example of the behaviour difference I'm worried about between
Screen.Recording.2024-12-16.at.12.10.44.mov
The formatting of other arguments in argument lists feels inconsistent with how we don't format expressions in braced lists. This is why I wondered if we should extend our notion of logical lines to include lines in expanded lists (and possibly lines in pipelines). But let's try out the current approach to start with.
Closes #63.
Range formatting is trickier than whole-document formatting because we need to find the complete entity that makes the most sense to format around the provided range. To that end, we're allowed to format a larger range than provided (possibly a smaller one as well, but we don't do that here and it would likely feel inconsistent).
Once we have found the piece of code to format, anything outside it is left alone. In order to match the surrounding indentation, Biome (in this case `biome_formatter::format_sub_tree()`) does some work to figure out the initial indentation, and the resulting output is printed with this indentation.
Biome provides `format_range()` but I haven't ended up using it because of the way it selects a safe node to format. There are two issues with it:
First, it tends to select too high in the tree. It uses this algorithm (https://github.com/biomejs/biome/blob/2a098f94/crates/biome_formatter/src/lib.rs#L1590-L1607) from prettier (https://github.com/prettier/prettier/blob/cae19518/src/main/range-util.js#L36).
If you try to format the second line in:
It ends up selecting the whole expression list instead of the binary expression.
Second it only supports selecting a single node. If you wanted to format the first two lines in
its only possible choice would be to extend the range to the enclosing expression list and format the third line as well.
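The single-node limitation can be illustrated with a toy tree. This is only a sketch under stated assumptions: `Node`, `smallest_covering_node()`, and the offsets are made up for illustration, and this is not Biome's or prettier's actual selection algorithm. It only shows that when exactly one node must cover the selection, two sibling lines can only be covered by their parent list.

```rust
/// Hypothetical syntax node with a label, a byte range, and children.
#[derive(Debug)]
struct Node {
    label: &'static str,
    start: usize,
    end: usize,
    children: Vec<Node>,
}

/// True if `node` spans the whole selection `[start, end]`.
fn covers(node: &Node, start: usize, end: usize) -> bool {
    node.start <= start && end <= node.end
}

/// Return the deepest single node that covers the selection.
/// Assumes the caller passes a `node` that already covers it.
fn smallest_covering_node(node: &Node, start: usize, end: usize) -> &Node {
    for child in &node.children {
        if covers(child, start, end) {
            return smallest_covering_node(child, start, end);
        }
    }
    node
}

/// An expression list holding three one-line expressions (made-up offsets).
fn three_line_list() -> Node {
    let leaf = |label, start, end| Node { label, start, end, children: Vec::new() };
    Node {
        label: "expression list",
        start: 0,
        end: 17,
        children: vec![
            leaf("line 1", 0, 5),
            leaf("line 2", 6, 11),
            leaf("line 3", 12, 17),
        ],
    }
}
```

Selecting only the second line (`6..11`) resolves to `line 2`, but selecting the first two lines (`0..11`) resolves to the whole `expression list`, so `line 3` would be reformatted as well.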
For comparison, here is what Ruff does with an algorithm that works in the opposite direction, starting from the root node: https://github.com/astral-sh/ruff/blob/0e9427255fe3f9f42e1947a6f35af4483c101e95/crates/ruff_python_formatter/src/range.rs#L149. The idea is to find the deepest "logical line" starting from the root. I used this as an inspiration for the `find_deepest_enclosing_logical_lines()` routine implemented in this PR. The main difference is that it finds all consecutive logical lines within the range so that the user can select multiple lines of code to format.
Currently I define a logical line as any child of the top-level program or of a `{` expression. In the future, I think it'd be helpful to also treat function arguments and binary operands as logical lines. That would be helpful when format-on-paste is enabled and the user is pasting inside an expanded argument list or an expanded pipeline. Air will be able to only format what's been pasted. Currently we'll extend the range to format the whole thing.
There's a bit of trickery involving rewrapping the nodes in an expression list, itself wrapped in an artificial `RRoot` node. The first wrapping is necessary to handle multiple logical lines. The second is necessary to resolve an issue in Biome's comment mapper. If the node to format is not at least two levels deep, leading comments end up attached to nodes deeper in the tree (see https://github.com/biomejs/biome/blob/2a098f94aa58cedfd2691de4f3fd3ef65c05c5d6/crates/biome_formatter/src/comments/builder.rs#L104-L112). This introduces hard line breaks where they are not expected. One effect of wrapping nodes in this way is that a hard line is introduced at the end of the expression list. We just remove it at the end.
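As a rough illustration of the "deepest enclosing logical lines" idea described above, here is a toy sketch. It is not the routine from this PR: `Node`, `find_logical_lines()`, and the offsets are hypothetical, and the real implementation works on the R syntax tree. The sketch descends from the root into a `{` body as long as that body encloses the whole selection, then returns all consecutive logical lines touched by the selection at that level.

```rust
/// Hypothetical logical line. `body` holds the logical lines of a `{`
/// expression nested inside this line, if there is one.
struct Node {
    label: &'static str,
    start: usize,
    end: usize,
    body: Vec<Node>,
}

/// Find the deepest list of logical lines enclosing `[start, end)` and
/// return the consecutive lines in that list touched by the selection.
fn find_logical_lines(lines: &[Node], start: usize, end: usize) -> Vec<&Node> {
    // Go deeper first: if a nested body encloses the whole selection,
    // its lines are a tighter fit than the current level.
    for line in lines {
        if let (Some(first), Some(last)) = (line.body.first(), line.body.last()) {
            if first.start <= start && end <= last.end {
                let deeper = find_logical_lines(&line.body, start, end);
                if !deeper.is_empty() {
                    return deeper;
                }
            }
        }
    }
    // Otherwise collect the consecutive lines overlapping the selection.
    // The range is never narrowed, only widened to whole logical lines.
    lines
        .iter()
        .skip_while(|line| line.end <= start)
        .take_while(|line| line.start < end)
        .collect()
}

/// Labels of the selected lines, for easy comparison.
fn labels(nodes: &[&Node]) -> Vec<&'static str> {
    nodes.iter().map(|node| node.label).collect()
}

/// Made-up program: `1+1`, a function whose body holds `2+2` and `3+3`,
/// then `4+4`.
fn example_program() -> Vec<Node> {
    let leaf = |label, start, end| Node { label, start, end, body: Vec::new() };
    vec![
        leaf("1+1", 0, 3),
        Node {
            label: "function",
            start: 4,
            end: 40,
            body: vec![leaf("2+2", 22, 25), leaf("3+3", 28, 31)],
        },
        leaf("4+4", 41, 44),
    ]
}
```

In this toy program, selecting both inner lines (`22..31`) yields `2+2` and `3+3` without touching the enclosing function. A selection that starts in `1+1` and ends inside the function body (`2..23`) is widened to `1+1` plus the whole function, which is the widening behavior discussed in the review comments.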