-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Autofix for RET504 UnnecessaryAssign #2599
Conversation
For more context, currently my fix only changes the return, and doesn't remove the variable assignment (and thus I haven't changed the snap tests). E.x: def f():
var = 2
return var becomes def f():
var = 2
return 2 |
src/rules/flake8_return/rules.rs
Outdated
)); | ||
} | ||
} | ||
checker.diagnostics.push(diagnostic); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the framing here should be that we want to replace the Assign
statement and the Return
statement with the return ${value}
.
Something like:
diagnostic.amend(Fix::replacement(
unparse_expr(assign_expr, checker.stylist),
assign_stmt.location,
return_stmt.end_location.unwrap(),
));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only issue with this is that it won't work for something that is multi-line.
For example:
def f(x):
val = 2
print(x)
return val
Would fail and produce:
def f(x):
return 2
Unless we are tracking everything between those two with source_slicer
or source_locator
(something along those lines, I've seen it in other checkers). Additionally, do we want to leave behind empty space? The start of the expression is after the indentation, according to the AST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing I'm slightly confused on is this code I'm trying:
if let Some(assign_loc) = stack.assigns.get(id.as_str()){
println!("return {:?}", assign_loc.last().unwrap());
diagnostic.amend(Fix::replacement(
unparse_expr(assign_expr, checker.stylist),
*assign_loc.last().unwrap(),
expr.end_location.unwrap(),
));
}
This code causes ruff to hang, and the prints show why:
return Location { row: 5, column: 4 }
return Location { row: 12, column: 4 }
return Location { row: 20, column: 4 }
return Location { row: 19, column: 4 }
return Location { row: 21, column: 4 }
return Location { row: 25, column: 4 }
return Location { row: 33, column: 4 }
... (this pattern repeats for a LOT of lines)
any clue why this could be happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now we limit this to the version in which the assignment and the return are consecutive statements, if we can.
If we want to support fixes in which there are interleaving statements, we have to do something clever like structure the fix as a replacement with "all the content before the statement, followed by the new return" (rather than two separate fixes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I can help debug - what file / command are you running to get that output?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think you're right -- I should have this finished by tomorrow. My only other thought is how to work with the Assign
statement being on another line, how would you ensure that there is just the assignment + whitespace/comment on the next few lines, would you have loop through the source code directly and check if each line is whitespace/comment or is there a better way?
What I'm talking about is something like this:
def safe_average(lst):
"""There is a extra newline and extra indentation after the assignment"""
return_value = 2.4
# This is also a comment explaining why we need to check for comments, and there is indentation on the next line too!
"""
Also, the next comment has *critical* information, but would the autofix remove it?
I presume we would have to keep track of this, or just ignore this for now and
put it in an issue somewhere.
"""
# We return this function in this manner because if the length was zero we would raise a ZeroDivisionError
return sum(lst) / len(lst) if len(lst) > 0 else 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here's what I'd try to do (though it'll require more bookkeeping than we do now)...
When we track the assignment, track the start and end locations of the entire assignment statement.
Then, when we create the fix, structure it as: "Replace all the content from the start of the assignment with: (1) the content from the end of the assignment to the start of the return + (2) the modified return statement."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have the ability to skip the fix if there's a comment between the assign and the return -- we have a has_comments_in
utility that takes a range as input, and returns true if there are any comments. So we could do that instead.
Or we could even really restrictive and say that we're only going to do this fix if it's purely whitespace between the assignment and the return. That's actually pretty reasonable IMO!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try for the first option, if it seems a bit too difficult (shouldn't be, but will require a bit more restructuring and there's also a whitespace issue, but Black would fix that... not sure what the policy is for that) it's a bit late for me right now so I'll get to it by tomorrow. Sorry if I've asked too many questions or been a hassle, but thanks for helping me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No apology needed at all. I wish I could be even more helpful! Just trying to juggle a few things at once. Appreciate your effort on the issue :)
src/rules/flake8_return/visitor.rs
Outdated
@@ -15,6 +15,7 @@ pub struct Stack<'a> { | |||
pub assigns: FxHashMap<&'a str, Vec<Location>>, | |||
pub loops: Vec<(Location, Location)>, | |||
pub tries: Vec<(Location, Location)>, | |||
pub assign_values: FxHashMap<&'a str, rustpython_ast::Located<ExprKind>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a tip, we never need to use Located<T>
. Instead, you can do Expr
(which is equivalent to Located<ExprKind>
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tip!
This is nearly finished - all it needs is to dedent the output if it isn't already. @charliermarsh How would you recommend going about this? |
The code is working but it does have a false positive that changes the function of the code. I'm unsure how you would work around that -- the false positive originates from the check itself, and the autofix runs under the assumption that the check isn't a false positive. We could patch the check, as it appears the problem in question isn't really refactor-able (thus why it's a false-positive in my opinion) unless you use a ternary statement, which is not the focus of this check. On the linked issue, it appears this was actually patched (no longer raises |
Where can I find the linked issue? |
@charliermarsh It’s linked in the test # Can be refactored false positives
# https://github.com/afonasev/flake8-return/issues/47#issuecomment-1122571066
def get_bar_if_exists(obj):
result = ""
if hasattr(obj, "bar"):
result = str(obj.bar)
return result |
Hmm, I think those issues still exist upstream -- I see them when I run |
Spent a little time trying to fix that issue but it's really tough without more rigorous branch analysis. |
I think we should try to be really conservative here, and only fix if they're consecutive statements -- otherwise, we risk breaking code, like above. What do you think? |
@charliermarsh I think there are pretty much three valid options here:
I can probably implement any of these. Personally, I think the strictly comments (1) is probably the best in-between that will not cause breaking fixes and still fix most of the cases that |
I think (1) makes sense. Do you wanna give that a try? |
Yep. I’ll do it |
This is very close to done. All I need to do is add one more check before going through the autofix, it will also probably fail CI because it's in the middle of being completed. |
@charliermarsh This is finished. You can take a look - the implementation isn't pretty but if you have any better ideas they would be well appreciated. |
@Sawbez - Awesome, thank you for all the work! I likely won't get to this tonight, but will review as soon as I can. |
Whoops- ignore that |
{ | ||
let in_between_source = checker | ||
.locator | ||
.slice_source_code_range(&Range::new( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason I'm having trouble pushing edits but these need to change to .slice(...)
now instead of . slice_source_code_range(...)
.
expr.location, | ||
)) | ||
.to_owned(); | ||
// Ensure that all indentation is identical (no if or while statement). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a more rigorous way to detect this, if I'm interpreting the comment correctly. Indentation isn't a totally "safe" way to detect whether we have control flow like if
or while
etc., since users could do if x: y
on a single line.
What's this logic guarding against? Maybe we can track this condition in some other way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I see what you mean with the one-liner. The main problem is that there's no Dedent
token during the code that ensures it's comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have to be checking the line itself too (e.g. slicing the line and checking if there is any tokens there)
if let Some(assign_expr) = stack.assign_values.get(id.as_str()) { | ||
if let Some(assign_locations) = stack.assigns.get(id.as_str()) { | ||
let expr_loc = expr.end_location.unwrap(); | ||
if let Some(last_assign_loc) = assign_locations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would this be None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you return an undefined variable (a problem in itself, but shouldn't cause a panic). The second if let
can be removed however the third one is also important because it would fail if the variable was assigned to, but only after the return statement (similar problem). In my earlier testing it appeared that every assignment would be in the vector even if they were after the return statement (which was against an assumption I had made and was causing that infinite loop error because it was slicing the wrong way)
line.chars() | ||
.take_while(|c| stylist.indentation().contains(*c)) | ||
.collect() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we perhaps use the indentation
helper in crates/ruff/src/ast/whitespace.rs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe? But I would guess it would be slower because I would have to create a bunch of useless intermediate objects because of the way that the function takes in input as a Located<T>
, and I mainly just have a &str
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also be kind of inconvenient. I might have to iterate through it in a sliding window fashion manually (it would also be more performant).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, the function you are referring to would be inconvenient (and quite possibly slower) either way, and rewriting without TupleWindows
would be faster either way.
|
||
if checker.patch(diagnostic.kind.rule()) { | ||
if let Some(assign_expr) = stack.assign_values.get(id.as_str()) { | ||
if let Some(assign_locations) = stack.assigns.get(id.as_str()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add some comments about what assign_expr
, assign_locations
, etc. represent here? I wrote some of this originally and even I can't remember what they mean and how they apply to this fix heh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
pub loops: Vec<(Location, Location)>, | ||
pub tries: Vec<(Location, Location)>, | ||
pub assign_values: FxHashMap<&'a str, Expr>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be FxHashMap<&'a str, &'a Expr>
? Would that let you avoid the clone?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try this.
.or_insert_with(Vec::new) | ||
.push(expr.location); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the same as the previous version? I know we have the logic in // This is hacky ...
, but what if we visit a tuple here? Then we'd iterate over the names as part of this method via the recursive call above. In the current version, the self.visit_assign_target
in the Tuple
case above would have no effect, I think? Given (a, b, c)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused on what you mean here
I will start working on this later today. |
Thanks for the tips! |
@Sawbez - I'm sorry for the slow review here. The autofix stuff has to be very carefully done, and the |
@charliermarsh No problem. As long as you get to it eventually, then I'm good. Take as long (or as little) time as you would like. |
I take responsibility for not getting this merged, but I suspect the code has changed enough that we'd want to revisit from scratch. Having seen more bugs and issues related to the |
Works on #2263
The only problem that I'm not too sure how to work around is how to add multiple fixes to a single violation. Additionally, I would like to remove the usage of
.clone
on theLocated<ExprKind>
, but I cannot seem to figure out how.