Skip to content

Commit

Permalink
Add branch detection to the semantic model (#6694)
Browse files Browse the repository at this point in the history
## Summary

We have a few rules that rely on detecting whether two statements are in
different branches -- for example, different arms of an `if`-`else`.
Historically, the way this was implemented is that, given two statement
IDs, we'd find the common parent (by traversing upwards via our
`Statements` abstraction); then identify branches "manually" by matching
the parents against `try`, `if`, and `match`, and returning iterators
over the arms; then check if there's an arm for which one of the
statements is a child, and the other is not.

This has a few drawbacks:

1. First, the code is generally a bit hard to follow (Konsti mentioned
this too when working on the `ElifElseClause` refactor).

2. Second, this is the only place in the codebase where we need to go
from `&Stmt` to `StatementID` -- _everywhere_ else, we only need to go
in the _other_ direction. Supporting these lookups means we need to
maintain a mapping from `&Stmt` to `StatementID` that includes every
`&Stmt` in the program. (We _also_ end up maintaining a `depth` level
for every statement.) I'd like to get rid of these requirements to
improve efficiency, reduce complexity, and enable us to treat AST modes
more generically in the future. (When I looked at adding the `&Expr` to
our existing statement-tracking infrastructure, maintaining a hash map
with all the statements noticeably hurt performance.)

The solution implemented here instead makes branches a first-class
concept in the semantic model. Like with `Statements`, we now have a
`Branches` abstraction, where each branch points to its optional parent.
When we store statements, we store the `BranchID` alongside each
statement. When we need to detect whether two statements are in the same
branch, we just realize each statement's branch path and compare the
two. (Assuming that the two statements are in the same scope, then
they're on the same branch IFF one branch path is a subset of the other,
starting from the top.) We then add some calls to the visitor to push
and pop branches in the appropriate places, for `if`, `try`, and `match`
statements.

Note that a branch is not 1:1 with a statement; instead, each branch is
closer to a suite, but not _every_ suite is a branch. For example, each
arm in an `if`-`elif`-`else` is a branch, but the `else` in a `for` loop
is not considered a branch.

In addition to being much simpler, this should also be more efficient,
since we've shed the entire `&Stmt` hash map, plus the `depth` that we
track on `StatementWithParent` in favor of a single `Option<BranchID>`
on `StatementWithParent` plus a single vector for all branches. The
lookups should be faster too, since instead of doing a bunch of jumps
around with the hash map + repeated recursive calls to find the common
parents, we instead just do a few simple lookups in the `Branches`
vector to realize and compare the branch paths.

## Test Plan

`cargo test` -- we have a lot of coverage for this, which we inherited
from PyFlakes
  • Loading branch information
charliermarsh authored Aug 19, 2023
1 parent 648333b commit 17af12e
Show file tree
Hide file tree
Showing 24 changed files with 283 additions and 298 deletions.
14 changes: 3 additions & 11 deletions crates/ruff/src/checkers/ast/analyze/deferred_scopes.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
use ruff_diagnostics::Diagnostic;
use ruff_python_ast::Ranged;
use ruff_python_semantic::analyze::{branch_detection, visibility};
use ruff_python_semantic::analyze::visibility;
use ruff_python_semantic::{Binding, BindingKind, ScopeKind};

use crate::checkers::ast::Checker;
Expand Down Expand Up @@ -112,11 +112,7 @@ pub(crate) fn deferred_scopes(checker: &mut Checker) {
// If the bindings are in different forks, abort.
if shadowed.source.map_or(true, |left| {
binding.source.map_or(true, |right| {
branch_detection::different_forks(
left,
right,
checker.semantic.statements(),
)
checker.semantic.different_branches(left, right)
})
}) {
continue;
Expand Down Expand Up @@ -208,11 +204,7 @@ pub(crate) fn deferred_scopes(checker: &mut Checker) {
// If the bindings are in different forks, abort.
if shadowed.source.map_or(true, |left| {
binding.source.map_or(true, |right| {
branch_detection::different_forks(
left,
right,
checker.semantic.statements(),
)
checker.semantic.different_branches(left, right)
})
}) {
continue;
Expand Down
6 changes: 3 additions & 3 deletions crates/ruff/src/checkers/ast/analyze/statement.rs
Original file line number Diff line number Diff line change
Expand Up @@ -464,17 +464,17 @@ pub(crate) fn statement(stmt: &Stmt, checker: &mut Checker) {
flake8_pyi::rules::pass_statement_stub_body(checker, body);
}
if checker.enabled(Rule::PassInClassBody) {
flake8_pyi::rules::pass_in_class_body(checker, stmt, body);
flake8_pyi::rules::pass_in_class_body(checker, class_def);
}
}
if checker.enabled(Rule::EllipsisInNonEmptyClassBody) {
flake8_pyi::rules::ellipsis_in_non_empty_class_body(checker, stmt, body);
flake8_pyi::rules::ellipsis_in_non_empty_class_body(checker, body);
}
if checker.enabled(Rule::PytestIncorrectMarkParenthesesStyle) {
flake8_pytest_style::rules::marks(checker, decorator_list);
}
if checker.enabled(Rule::DuplicateClassFieldDefinition) {
flake8_pie::rules::duplicate_class_field_definition(checker, stmt, body);
flake8_pie::rules::duplicate_class_field_definition(checker, body);
}
if checker.enabled(Rule::NonUniqueEnums) {
flake8_pie::rules::non_unique_enums(checker, stmt, body);
Expand Down
49 changes: 40 additions & 9 deletions crates/ruff/src/checkers/ast/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -32,8 +32,8 @@ use itertools::Itertools;
use log::error;
use ruff_python_ast::{
self as ast, Arguments, Comprehension, Constant, ElifElseClause, ExceptHandler, Expr,
ExprContext, Keyword, Parameter, ParameterWithDefault, Parameters, Pattern, Ranged, Stmt,
Suite, UnaryOp,
ExprContext, Keyword, MatchCase, Parameter, ParameterWithDefault, Parameters, Pattern, Ranged,
Stmt, Suite, UnaryOp,
};
use ruff_text_size::{TextRange, TextSize};

Expand Down Expand Up @@ -193,18 +193,22 @@ impl<'a> Checker<'a> {
}
}

/// Returns the [`IsolationLevel`] for fixes in the current context.
/// Returns the [`IsolationLevel`] to isolate fixes for the current statement.
///
/// The primary use-case for fix isolation is to ensure that we don't delete all statements
/// in a given indented block, which would cause a syntax error. We therefore need to ensure
/// that we delete at most one statement per indented block per fixer pass. Fix isolation should
/// thus be applied whenever we delete a statement, but can otherwise be omitted.
pub(crate) fn isolation(&self, parent: Option<&Stmt>) -> IsolationLevel {
parent
.and_then(|stmt| self.semantic.statement_id(stmt))
.map_or(IsolationLevel::default(), |node_id| {
IsolationLevel::Group(node_id.into())
})
pub(crate) fn statement_isolation(&self) -> IsolationLevel {
IsolationLevel::Group(self.semantic.current_statement_id().into())
}

/// Returns the [`IsolationLevel`] to isolate fixes in the current statement's parent.
pub(crate) fn parent_isolation(&self) -> IsolationLevel {
self.semantic
.current_statement_parent_id()
.map(|node_id| IsolationLevel::Group(node_id.into()))
.unwrap_or_default()
}

/// The [`Locator`] for the current file, which enables extraction of source code from byte
Expand Down Expand Up @@ -619,16 +623,28 @@ where
}
}

// Iterate over the `body`, then the `handlers`, then the `orelse`, then the
// `finalbody`, but treat the body and the `orelse` as a single branch for
// flow analysis purposes.
let branch = self.semantic.push_branch();
self.semantic.handled_exceptions.push(handled_exceptions);
self.visit_body(body);
self.semantic.handled_exceptions.pop();
self.semantic.pop_branch();

for except_handler in handlers {
self.semantic.push_branch();
self.visit_except_handler(except_handler);
self.semantic.pop_branch();
}

self.semantic.set_branch(branch);
self.visit_body(orelse);
self.semantic.pop_branch();

self.semantic.push_branch();
self.visit_body(finalbody);
self.semantic.pop_branch();
}
Stmt::AnnAssign(ast::StmtAnnAssign {
target,
Expand Down Expand Up @@ -708,6 +724,7 @@ where
) => {
self.visit_boolean_test(test);

self.semantic.push_branch();
if typing::is_type_checking_block(stmt_if, &self.semantic) {
if self.semantic.at_top_level() {
self.importer.visit_type_checking_block(stmt);
Expand All @@ -716,9 +733,12 @@ where
} else {
self.visit_body(body);
}
self.semantic.pop_branch();

for clause in elif_else_clauses {
self.semantic.push_branch();
self.visit_elif_else_clause(clause);
self.semantic.pop_branch();
}
}
_ => visitor::walk_stmt(self, stmt),
Expand Down Expand Up @@ -1353,6 +1373,17 @@ where
}
}

fn visit_match_case(&mut self, match_case: &'b MatchCase) {
self.visit_pattern(&match_case.pattern);
if let Some(expr) = &match_case.guard {
self.visit_expr(expr);
}

self.semantic.push_branch();
self.visit_body(&match_case.body);
self.semantic.pop_branch();
}

fn visit_type_param(&mut self, type_param: &'b ast::TypeParam) {
// Step 1: Binding
match type_param {
Expand Down
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
use ruff_python_ast::{self as ast, Expr, Ranged, Stmt};
use rustc_hash::FxHashSet;

use ruff_diagnostics::Diagnostic;
use ruff_diagnostics::{AlwaysAutofixableViolation, Fix};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{self as ast, Expr, Ranged, Stmt};

use crate::autofix;
use crate::checkers::ast::Checker;
Expand Down Expand Up @@ -49,11 +49,7 @@ impl AlwaysAutofixableViolation for DuplicateClassFieldDefinition {
}

/// PIE794
pub(crate) fn duplicate_class_field_definition(
checker: &mut Checker,
parent: &Stmt,
body: &[Stmt],
) {
pub(crate) fn duplicate_class_field_definition(checker: &mut Checker, body: &[Stmt]) {
let mut seen_targets: FxHashSet<&str> = FxHashSet::default();
for stmt in body {
// Extract the property name from the assignment statement.
Expand Down Expand Up @@ -85,11 +81,11 @@ pub(crate) fn duplicate_class_field_definition(
if checker.patch(diagnostic.kind.rule()) {
let edit = autofix::edits::delete_stmt(
stmt,
Some(parent),
Some(stmt),
checker.locator(),
checker.indexer(),
);
diagnostic.set_fix(Fix::suggested(edit).isolate(checker.isolation(Some(parent))));
diagnostic.set_fix(Fix::suggested(edit).isolate(checker.statement_isolation()));
}
checker.diagnostics.push(diagnostic);
}
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
use ruff_python_ast::{Expr, ExprConstant, Ranged, Stmt, StmtExpr};

use ruff_diagnostics::{AutofixKind, Diagnostic, Fix, Violation};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{Constant, Expr, ExprConstant, Ranged, Stmt, StmtExpr};

use crate::autofix;
use crate::checkers::ast::Checker;
Expand Down Expand Up @@ -44,11 +43,7 @@ impl Violation for EllipsisInNonEmptyClassBody {
}

/// PYI013
pub(crate) fn ellipsis_in_non_empty_class_body(
checker: &mut Checker,
parent: &Stmt,
body: &[Stmt],
) {
pub(crate) fn ellipsis_in_non_empty_class_body(checker: &mut Checker, body: &[Stmt]) {
// If the class body contains a single statement, then it's fine for it to be an ellipsis.
if body.len() == 1 {
return;
Expand All @@ -59,24 +54,24 @@ pub(crate) fn ellipsis_in_non_empty_class_body(
continue;
};

let Expr::Constant(ExprConstant { value, .. }) = value.as_ref() else {
continue;
};

if !value.is_ellipsis() {
continue;
}

let mut diagnostic = Diagnostic::new(EllipsisInNonEmptyClassBody, stmt.range());
if checker.patch(diagnostic.kind.rule()) {
let edit = autofix::edits::delete_stmt(
stmt,
Some(parent),
checker.locator(),
checker.indexer(),
);
diagnostic.set_fix(Fix::automatic(edit).isolate(checker.isolation(Some(parent))));
if matches!(
value.as_ref(),
Expr::Constant(ExprConstant {
value: Constant::Ellipsis,
..
})
) {
let mut diagnostic = Diagnostic::new(EllipsisInNonEmptyClassBody, stmt.range());
if checker.patch(diagnostic.kind.rule()) {
let edit = autofix::edits::delete_stmt(
stmt,
Some(stmt),
checker.locator(),
checker.indexer(),
);
diagnostic.set_fix(Fix::automatic(edit).isolate(checker.statement_isolation()));
}
checker.diagnostics.push(diagnostic);
}
checker.diagnostics.push(diagnostic);
}
}
19 changes: 7 additions & 12 deletions crates/ruff/src/rules/flake8_pyi/rules/pass_in_class_body.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
use ruff_python_ast::{Ranged, Stmt};

use ruff_diagnostics::{AlwaysAutofixableViolation, Diagnostic, Fix};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{self as ast, Ranged};

use crate::autofix;
use crate::checkers::ast::Checker;
Expand All @@ -22,26 +21,22 @@ impl AlwaysAutofixableViolation for PassInClassBody {
}

/// PYI012
pub(crate) fn pass_in_class_body(checker: &mut Checker, parent: &Stmt, body: &[Stmt]) {
pub(crate) fn pass_in_class_body(checker: &mut Checker, class_def: &ast::StmtClassDef) {
// `pass` is required in these situations (or handled by `pass_statement_stub_body`).
if body.len() < 2 {
if class_def.body.len() < 2 {
return;
}

for stmt in body {
for stmt in &class_def.body {
if !stmt.is_pass_stmt() {
continue;
}

let mut diagnostic = Diagnostic::new(PassInClassBody, stmt.range());
if checker.patch(diagnostic.kind.rule()) {
let edit = autofix::edits::delete_stmt(
stmt,
Some(parent),
checker.locator(),
checker.indexer(),
);
diagnostic.set_fix(Fix::automatic(edit).isolate(checker.isolation(Some(parent))));
let edit =
autofix::edits::delete_stmt(stmt, Some(stmt), checker.locator(), checker.indexer());
diagnostic.set_fix(Fix::automatic(edit).isolate(checker.statement_isolation()));
}
checker.diagnostics.push(diagnostic);
}
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
use ruff_python_ast::{Ranged, Stmt};

use ruff_diagnostics::{AlwaysAutofixableViolation, Diagnostic, Edit, Fix};
use ruff_macros::{derive_message_formats, violation};
use ruff_python_ast::{Ranged, Stmt};

use crate::checkers::ast::Checker;
use crate::registry::Rule;
Expand All @@ -22,15 +21,15 @@ impl AlwaysAutofixableViolation for PassStatementStubBody {

/// PYI009
pub(crate) fn pass_statement_stub_body(checker: &mut Checker, body: &[Stmt]) {
if body.len() != 1 {
let [stmt] = body else {
return;
}
if body[0].is_pass_stmt() {
let mut diagnostic = Diagnostic::new(PassStatementStubBody, body[0].range());
};
if stmt.is_pass_stmt() {
let mut diagnostic = Diagnostic::new(PassStatementStubBody, stmt.range());
if checker.patch(Rule::PassStatementStubBody) {
diagnostic.set_fix(Fix::automatic(Edit::range_replacement(
format!("..."),
body[0].range(),
stmt.range(),
)));
};
checker.diagnostics.push(diagnostic);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -99,10 +99,7 @@ pub(crate) fn str_or_repr_defined_in_stub(checker: &mut Checker, stmt: &Stmt) {
let stmt = checker.semantic().current_statement();
let parent = checker.semantic().current_statement_parent();
let edit = delete_stmt(stmt, parent, checker.locator(), checker.indexer());
diagnostic.set_fix(
Fix::automatic(edit)
.isolate(checker.isolation(checker.semantic().current_statement_parent())),
);
diagnostic.set_fix(Fix::automatic(edit).isolate(checker.parent_isolation()));
}
checker.diagnostics.push(diagnostic);
}
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ pub(crate) fn empty_type_checking_block(checker: &mut Checker, stmt: &ast::StmtI
let stmt = checker.semantic().current_statement();
let parent = checker.semantic().current_statement_parent();
let edit = autofix::edits::delete_stmt(stmt, parent, checker.locator(), checker.indexer());
diagnostic.set_fix(Fix::automatic(edit).isolate(checker.isolation(parent)));
diagnostic.set_fix(Fix::automatic(edit).isolate(checker.parent_isolation()));
}
checker.diagnostics.push(diagnostic);
}
Original file line number Diff line number Diff line change
Expand Up @@ -244,6 +244,6 @@ fn fix_imports(

Ok(
Fix::suggested_edits(remove_import_edit, add_import_edit.into_edits())
.isolate(checker.isolation(parent)),
.isolate(checker.parent_isolation()),
)
}
Original file line number Diff line number Diff line change
Expand Up @@ -491,6 +491,6 @@ fn fix_imports(

Ok(
Fix::suggested_edits(remove_import_edit, add_import_edit.into_edits())
.isolate(checker.isolation(parent)),
.isolate(checker.parent_isolation()),
)
}
2 changes: 1 addition & 1 deletion crates/ruff/src/rules/pyflakes/rules/unused_import.rs
Original file line number Diff line number Diff line change
Expand Up @@ -256,5 +256,5 @@ fn fix_imports(
checker.stylist(),
checker.indexer(),
)?;
Ok(Fix::automatic(edit).isolate(checker.isolation(parent)))
Ok(Fix::automatic(edit).isolate(checker.parent_isolation()))
}
Loading

0 comments on commit 17af12e

Please sign in to comment.