Enable token-based rules on source with syntax errors #11950
Conversation
Force-pushed from b7134c9 to 7db979b
```diff
- for token in tokens.up_to_first_unknown() {
+ for token in tokens {
      pylint::rules::invalid_string_characters(
```
This is looking at string tokens, and the lexer doesn't emit them if the string is unterminated. So, we might get away with not doing anything in this case for now.
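As a hedged illustration of why skipping special handling is safe (the dispatch below is a sketch assuming ruff's `TokenKind` naming, not the checker's exact code): the rule only fires on string-bearing tokens, and an unterminated literal never produces one, so iterating past the first `Unknown` token can't feed it bad input.

```rust
// Sketch: `invalid_string_characters` is only invoked for string-bearing
// tokens. An unterminated literal lexes to `Unknown` instead, so it falls
// through the match and the rule never sees it.
for token in tokens {
    match token.kind() {
        TokenKind::String | TokenKind::FStringMiddle => {
            // pylint::rules::invalid_string_characters(...)
        }
        _ => {}
    }
}
```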
```diff
  impl<'a> DocLines<'a> {
      fn new(tokens: &'a Tokens) -> Self {
          Self {
-             inner: tokens.up_to_first_unknown().iter(),
+             inner: tokens.iter(),
              prev: TextSize::default(),
          }
```
This extracts a specific set of comments, so it doesn't require any specific update.
```diff
-             _ => {
+             kind => {
+                 if matches!(kind, TokenKind::Newline if fstrings > 0) {
+                     // The parser recovered from an unterminated f-string.
+                     fstrings = 0;
+                 }
```
I think this should work, as the newline tokens within f-strings are actually `NonLogicalNewline`. I'll move this into `TokenIterWithContext`. I'll test this a lot because f-strings are complex.
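A hedged illustration of the case this guards against (my reading of the diff's comment, not verified parser output):

```rust
// Physical newlines inside an f-string surface as `NonLogicalNewline`, so a
// logical `Newline` while `fstrings > 0` signals that the parser recovered
// from an unterminated f-string, e.g.:
let _source = "x = f\"unterminated {\ny = 1\n";
// Resetting `fstrings` to 0 at that `Newline` keeps `y = 1` from being
// treated as if it were still inside the f-string.
```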
Force-pushed from 7db979b to 1961406
```rust
for token in tokens {
    match token.kind() {
        TokenKind::EndOfFile => {
            break;
        }
```
The token stream doesn't contain the `EndOfFile` token.
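A minimal sketch of the suggested simplification (assuming ruff's `Tokens` iterator and `TokenKind`): since the stream never yields `EndOfFile`, the loop can simply run until the iterator is exhausted.

```rust
// No `EndOfFile` arm is needed: the token stream ends when the iterator
// does, so the `for` loop terminates on its own.
for token in tokens {
    match token.kind() {
        // ... handle the token kinds this rule cares about ...
        _ => {}
    }
}
```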
Force-pushed from 6e96839 to f3bbacd
Force-pushed from f3bbacd to 7b42997
This PR reverts #12016 with a small change: the error location now points to the continuation character only. Earlier, it would also highlight the whitespace that came before it.

The motivation for this change is to avoid a panic in #11950. For example:

```py
\)
```

Playground: https://play.ruff.rs/87711071-1b54-45a3-b45a-81a336a1ea61

Here, the range of the `Unknown` token and the `Rpar` token is the same. Once #11950 is enabled, the indexer would panic. It won't panic in the stable version because we stop at the first `Unknown` token.
Force-pushed from 7b42997 to eeb24b1
Force-pushed from eeb24b1 to 4019ca4
CodSpeed Performance Report: Merging #11950 will not alter performance.
Force-pushed from 4019ca4 to 27f494e
Force-pushed from 3390bf0 to b58e87b
Force-pushed from 6bb916f to 85baab7
This is nice!
```rust
if !line_is_comment_only {
    self.max_preceding_blank_lines = BlankLines::Zero;
}
if kind.is_any_newline() && !self.tokens.in_parenthesized_context() {
```
This is an improvement even without the error recoverability :)
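A hedged example of why the parenthesized check helps the blank-lines rules: inside brackets, line breaks are `NonLogicalNewline` and shouldn't advance the blank-line counters (illustrative source only; the E30x bookkeeping is not shown).

```rust
// The empty line between the arguments sits inside parentheses, so
// `in_parenthesized_context()` keeps it from counting as a blank line
// between logical lines.
let _source = "result = call(\n    a,\n\n    b,\n)\n";
```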
```rust
TokenKind::Newline if self.nesting > 0 => {
    self.nesting = 0;
}
```
That's simpler than I expected. Nice.
## Summary

This PR updates Ruff to **not** generate auto-fixes if the source code contains syntax errors as determined by the parser. The main motivation is to avoid an infinite autofix loop when the token-based rules are run over source with syntax errors in #11950. Even after this, it's not certain that there won't be an infinite autofix loop, because the fix logic itself might be incorrect; see #12094 and #12136 for examples.

This requires updating the test infrastructure to not validate the fix availability status when the source contains syntax errors. Otherwise the fuzzer might fail, as it uses the test function to run the linter and validate the source code.

resolves: #11455

## Test Plan

`cargo insta test`
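A minimal sketch of the gating described above (the types and names here are assumptions for illustration, not ruff's actual API):

```rust
struct Fix;

struct Diagnostic {
    fix: Option<Fix>,
}

/// Strip fixes whenever the parser reported syntax errors, so the autofixer
/// never rewrites a file it could not fully parse (and cannot loop on it).
fn strip_fixes_on_syntax_errors(diagnostics: &mut [Diagnostic], has_syntax_errors: bool) {
    if has_syntax_errors {
        for diagnostic in diagnostics {
            diagnostic.fix = None;
        }
    }
}
```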
Force-pushed from 85baab7 to 2e932e3
Summary

This PR updates the linter, specifically the token-based rules, to work on the tokens that come after a syntax error.

For context, the token-based rules previously only diagnosed the tokens up to the first lexical error. This PR builds in error resilience by introducing a `TokenIterWithContext`, which updates the `nesting` level and tries to keep it in sync with what the lexer is seeing. This isn't 100% accurate: if the parser recovered from an unclosed parenthesis in the middle of a line, the context won't reduce the nesting level until it sees the newline token at the end of the line.

resolves: #11915
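A condensed sketch of the idea (assuming `Token`/`TokenKind` from `ruff_python_parser`; the names follow the PR's description, but the body is an approximation, not the exact implementation):

```rust
use ruff_python_parser::{Token, TokenKind};

struct TokenIterWithContext<'a> {
    inner: std::slice::Iter<'a, Token>,
    nesting: u32,
}

impl<'a> TokenIterWithContext<'a> {
    /// Whether the iterator is currently inside `(...)`, `[...]`, or `{...}`.
    fn in_parenthesized_context(&self) -> bool {
        self.nesting > 0
    }
}

impl<'a> Iterator for TokenIterWithContext<'a> {
    type Item = &'a Token;

    fn next(&mut self) -> Option<Self::Item> {
        let token = self.inner.next()?;
        match token.kind() {
            TokenKind::Lpar | TokenKind::Lsqb | TokenKind::Lbrace => self.nesting += 1,
            TokenKind::Rpar | TokenKind::Rsqb | TokenKind::Rbrace => {
                self.nesting = self.nesting.saturating_sub(1);
            }
            // A logical newline only occurs at nesting level 0, so seeing one
            // while `nesting > 0` means the parser recovered from an unclosed
            // bracket; reset to resynchronize with the lexer.
            TokenKind::Newline if self.nesting > 0 => self.nesting = 0,
            _ => {}
        }
        Some(token)
    }
}
```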
Test Plan