[red-knot] Add support for string annotations (#14151)
## Summary

This PR adds support for parsing and inferring types within string annotations.

### Implementation (attempt 1)

This is preserved in 6217f48.

The implementation here would separate the inference of string annotations into the deferred query. This requires the following:

* Two ways of evaluating the deferred definitions: lazily and eagerly.
* An eager evaluation occurs right outside the definition query, which in this case would be in `binding_ty` and `declaration_ty`.
* A lazy evaluation occurs on demand, for example when using `definition_expression_ty` to determine the function return type and class bases.
* The above point means that when trying to get the binding type for a variable in an annotated assignment, the definition query won't include the type. So, it'll require going through the deferred query to get the type.

This has the following limitations:

* Nested string annotations, although not necessarily a useful feature, are difficult to implement unless we convert the implementation into an infinite loop.
* Partial string annotations require a complex layout because the types for the stringified and non-stringified parts of the annotation are inferred in separate queries. This means we need to maintain additional information.

### Implementation (attempt 2)

This is the final diff in this PR.

The implementation here does the complete inference of a string annotation in the same definition query by maintaining certain state while inferring different parts of an expression and making decisions accordingly. These are:

* Allow names that are part of a string annotation to not exist in the symbol table. For example, in `x: "Foo"`, if the `Foo` symbol is not defined then it won't exist in the symbol table even though it's being used. This relaxes the usual invariant and is allowed only for symbols in a string annotation.
* Similarly, name lookup is updated to do the same: if the symbol doesn't exist, then it's unbound.
* Store the final type of a string annotation on the string expression itself and not on any of the sub-expressions that are created after parsing. This is because those sub-expressions won't exist in the semantic index.

Design document: https://www.notion.so/astral-sh/String-Annotations-12148797e1ca801197a9f146641e5b71?pvs=4

Closes: #13796

## Test Plan

* Add various test cases in our markdown framework
* Run `red_knot` on LibCST (contains a lot of string annotations, specifically https://github.com/Instagram/LibCST/blob/main/libcst/matchers/_matcher_base.py) and FastAPI (a good amount of annotated code including `typing.Literal`), and compare against the `main` branch output
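As a rough illustration of the single-query idea in attempt 2, here is a minimal Python sketch. It is not the Rust implementation from this PR. It uses the standard `ast` module to re-parse the contents of a string annotation and recurse, so nested and partial string annotations are handled by the same traversal. The names `infer_annotation` and `infer_source` and the `known` mapping are hypothetical stand-ins for red-knot's inference and symbol lookup, and only names, `|` unions, and string literals are covered.

```python
import ast


def infer_annotation(node: ast.expr, known: dict) -> str:
    """Return a display string for the type named by `node` (toy subset)."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        # String annotation: parse its contents and keep inferring in the
        # same traversal, which also handles nesting such as "'int'".
        inner = ast.parse(node.value, mode="eval").body
        return infer_annotation(inner, known)
    if isinstance(node, ast.Name):
        # A name that is missing from the (hypothetical) symbol table is
        # treated as unresolved rather than as an internal error.
        return known.get(node.id, "Unknown")
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
        left = infer_annotation(node.left, known)
        right = infer_annotation(node.right, known)
        return f"{left} | {right}"
    return "Unknown"


def infer_source(annotation_source: str, known: dict) -> str:
    """Parse the annotation as written in source code and infer it."""
    return infer_annotation(ast.parse(annotation_source, mode="eval").body, known)


known = {"int": "int", "Foo": "Foo"}
simple = infer_source('"int"', known)
nested = infer_source("\"'int'\"", known)
partial = infer_source('int | "Foo"', known)
undefined = infer_source('"Bar"', known)
```

Because the recursion happens inside one function, the stringified and non-stringified parts of `int | "Foo"` are combined without any extra bookkeeping, which is the limitation the first attempt ran into.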
1 parent a48d779, commit 9ec690b
Showing 6 changed files with 569 additions and 87 deletions.
186 changes: 184 additions & 2 deletions
crates/red_knot_python_semantic/resources/mdtest/annotations/string.md
@@ -1,9 +1,191 @@
# String annotations

## Simple

```py
def f() -> "int":
    return 1

# TODO: We do not support string annotations, but we should not panic if we encounter them
reveal_type(f())  # revealed: @Todo
reveal_type(f())  # revealed: int
```

## Nested

```py
def f() -> "'int'":
    return 1

reveal_type(f())  # revealed: int
```

## Type expression

```py
def f1() -> "int | str":
    return 1

def f2() -> "tuple[int, str]":
    return 1

reveal_type(f1())  # revealed: int | str
reveal_type(f2())  # revealed: tuple[int, str]
```

## Partial

```py
def f() -> tuple[int, "str"]:
    return 1

reveal_type(f())  # revealed: tuple[int, str]
```

## Deferred

```py
def f() -> "Foo":
    return Foo()

class Foo:
    pass

reveal_type(f())  # revealed: Foo
```

## Deferred (undefined)

```py
# error: [unresolved-reference]
def f() -> "Foo":
    pass

reveal_type(f())  # revealed: Unknown
```

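For comparison, Python's own runtime resolves string annotations lazily in a similar spirit. The sketch below uses the standard `typing.get_type_hints`; it shows runtime behavior only and is not how red-knot infers types. The function `g` and the name `Missing` are made up for illustration, and `localns` is passed explicitly so the example does not depend on where it is executed.

```python
import typing


def f() -> "Foo":
    return Foo()


class Foo:
    pass


def g() -> "Missing":
    pass


# The string "Foo" is only evaluated here, after the class exists.
resolved = typing.get_type_hints(f, localns={"Foo": Foo})

# A name that is never defined cannot be resolved; the runtime raises
# NameError, loosely analogous to the unresolved-reference diagnostic.
try:
    typing.get_type_hints(g)
    unresolved_error = None
except NameError as exc:
    unresolved_error = exc
```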
## Partial deferred

```py
def f() -> int | "Foo":
    return 1

class Foo:
    pass

reveal_type(f())  # revealed: int | Foo
```

## `typing.Literal`

```py
from typing import Literal

def f1() -> Literal["Foo", "Bar"]:
    return "Foo"

def f2() -> 'Literal["Foo", "Bar"]':
    return "Foo"

class Foo:
    pass

reveal_type(f1())  # revealed: Literal["Foo", "Bar"]
reveal_type(f2())  # revealed: Literal["Foo", "Bar"]
```

## Various string kinds

```py
# error: [annotation-raw-string] "Type expressions cannot use raw string literal"
def f1() -> r"int":
    return 1

# error: [annotation-f-string] "Type expressions cannot use f-strings"
def f2() -> f"int":
    return 1

# error: [annotation-byte-string] "Type expressions cannot use bytes literal"
def f3() -> b"int":
    return 1

def f4() -> "int":
    return 1

# error: [annotation-implicit-concat] "Type expressions cannot span multiple string literals"
def f5() -> "in" "t":
    return 1

# error: [annotation-escape-character] "Type expressions cannot contain escape characters"
def f6() -> "\N{LATIN SMALL LETTER I}nt":
    return 1

# error: [annotation-escape-character] "Type expressions cannot contain escape characters"
def f7() -> "\x69nt":
    return 1

def f8() -> """int""":
    return 1

# error: [annotation-byte-string] "Type expressions cannot use bytes literal"
def f9() -> "b'int'":
    return 1

reveal_type(f1())  # revealed: Unknown
reveal_type(f2())  # revealed: Unknown
reveal_type(f3())  # revealed: Unknown
reveal_type(f4())  # revealed: int
reveal_type(f5())  # revealed: Unknown
reveal_type(f6())  # revealed: Unknown
reveal_type(f7())  # revealed: Unknown
reveal_type(f8())  # revealed: int
reveal_type(f9())  # revealed: Unknown
```

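The checks exercised above can be approximated at the token level. The following is a hedged Python sketch built on the standard `tokenize` module, not red-knot's implementation. The function name `annotation_string_error` is hypothetical, the returned strings simply reuse the diagnostic codes from the test above, and the order in which the checks run is an assumption. It only looks at the outer literal, so a case like `"b'int'"` (a bytes literal inside the string) is out of scope.

```python
import io
import token
import tokenize
from typing import Optional


def annotation_string_error(literal_source: str) -> Optional[str]:
    """Return a diagnostic code for a literal used as an annotation, or None."""
    ignored = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER, tokenize.COMMENT}
    tokens = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(literal_source).readline)
        if tok.type not in ignored
    ]
    # Python 3.12+ splits f-strings into FSTRING_START/MIDDLE/END tokens.
    if any(token.tok_name[tok.type].startswith("FSTRING") for tok in tokens):
        return "annotation-f-string"
    strings = [tok.string for tok in tokens if tok.type == tokenize.STRING]
    if len(strings) > 1:
        # Adjacent literals such as "in" "t" are implicitly concatenated.
        return "annotation-implicit-concat"
    if not strings:
        return None
    text = strings[0]
    # Everything before the first quote character is the string prefix.
    quote_at = min(i for i in (text.find('"'), text.find("'")) if i != -1)
    prefix = text[:quote_at].lower()
    if "f" in prefix:  # f-strings are a single STRING token before 3.12
        return "annotation-f-string"
    if "b" in prefix:
        return "annotation-byte-string"
    if "r" in prefix:
        return "annotation-raw-string"
    if "\\" in text[quote_at:]:
        return "annotation-escape-character"
    return None
```

Plain and triple-quoted literals such as `"int"` and `"""int"""` pass all checks and return `None`, matching the two cases in the test that reveal `int`.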
## Various string kinds in `typing.Literal`

```py
from typing import Literal

def f() -> Literal["a", r"b", b"c", "d" "e", "\N{LATIN SMALL LETTER F}", "\x67", """h"""]:
    return "normal"

reveal_type(f())  # revealed: Literal["a", "b", "de", "f", "g", "h"] | Literal[b"c"]
```

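Inside `Literal[...]` the strings are values rather than type expressions, so every string kind is accepted and reduced to its value. A small runtime check with the standard `typing.get_args` shows the same normalization as the revealed type above; this is Python's runtime behavior, offered as an analogy only, and the variable names are illustrative.

```python
from typing import Literal, get_args

annotation = Literal["a", r"b", b"c", "d" "e", "\N{LATIN SMALL LETTER F}", "\x67", """h"""]

# get_args returns the literal values after the parser has already handled
# prefixes, implicit concatenation, and escape sequences.
values = get_args(annotation)

# Grouping by type mirrors the str/bytes split in the revealed union.
str_values = tuple(v for v in values if isinstance(v, str))
bytes_values = tuple(v for v in values if isinstance(v, bytes))
```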
## Class variables

```py
MyType = int

class Aliases:
    MyType = str

    forward: "MyType"
    not_forward: MyType

reveal_type(Aliases.forward)  # revealed: str
reveal_type(Aliases.not_forward)  # revealed: str
```

## Annotated assignment

```py
a: "int" = 1
b: "'int'" = 1
c: "Foo"
# error: [invalid-assignment] "Object of type `Literal[1]` is not assignable to `Foo`"
d: "Foo" = 1

class Foo:
    pass

c = Foo()

reveal_type(a)  # revealed: Literal[1]
reveal_type(b)  # revealed: Literal[1]
reveal_type(c)  # revealed: Foo
reveal_type(d)  # revealed: Foo
```

## Parameter

TODO: Add tests once parameter inference is supported