-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flesh out more of jsonschema stubs #7950
Conversation
This comment has been minimized.
This comment has been minimized.
1 similar comment
This comment has been minimized.
This comment has been minimized.
Apply more detailed annotations to the format module and most of the exceptions module.
9a3a65f
to
931af6d
Compare
This comment has been minimized.
This comment has been minimized.
2 similar comments
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
I'm not sure what the mypy_primer error means. I tried running
Neither of which are, as far as I know, related to this change. So possibly this schemathesis line is a false negative? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I'm not too familiar with jsonschema
, but here's a few thoughts from looking at the source code.
def is_email(instance) -> bool: ... | ||
def is_ipv4(instance) -> bool: ... | ||
def is_ipv6(instance) -> bool: ... | ||
def is_email(instance: Any) -> bool: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading the source code correctly, I think this (and several other of the is_foo
functions here) can accept literally any type. In general, for functions like these, we prefer using object
instead of Any
for parameter annotations. Any
is sort of an "escape hatch" to be used as a last resort, or for when the true type of a parameter isn't really expressible in the current type system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll switch this.
Aside/tangent: using object
in similar functions (e.g. ensure_is_foo(x: object) -> Foo
) can be less nice than Any
because of how it responds to type narrowing. After you write enough "assert-is-X" utilities, it becomes a habit to use Any
in basically any case in which one could use object
. This is all off-topic, but I wonder if changing the way that object
narrows would make it more comfortable/normal for typing users like myself to use it. When I first ran into this, I found it very surprising that isinstance(x, str)
does not narrow x: object
to str
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I first ran into this, I found it very surprising that
isinstance(x, str)
does not narrowx: object
tostr
.
That's not true:
https://mypy-play.net/?mypy=latest&python=3.10&gist=5104b5bedec0b6dd05bd8b6cf23fc1ef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. Hmm... I know I've run into unexpected issues in which object
doesn't narrow but Any
does. But clearly it wasn't such a simple case. If I see this again, I'll make sure to pay closer attention.
Sorry for the inaccuracy and distraction.
def is_ipv4(instance) -> bool: ... | ||
def is_ipv6(instance) -> bool: ... | ||
def is_email(instance: Any) -> bool: ... | ||
def is_ipv4(instance: Any) -> bool: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this function often returns an IPv4Address
instance. From the look of the function, though, I guess it only really matters whether it's truthy or falsey (it doesn't actually have to be True
or False
). So I guess I'm okay pretending that this returns a boolean :)
Looks like the same goes for several other of the is_foo
functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wow, I completely missed that! I'd like to fix the runtime logic to bool(...)
so that these all have the same object -> bool
signature. If you're okay with this "white lie" in the types, I'd like to keep this as the annotation and then go fix the library later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're okay with this "white lie" in the types, I'd like to keep this as the annotation and then go fix the library later.
I'm okay with that 😄 I guessed that that was probably the case!
@property | ||
def absolute_path(self): ... | ||
def absolute_path(self) -> Sequence[str]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally prefer to be as precise as possible with return types in typeshed, so that would imply using deque[str]
here instead of Sequence[str]
. If you think it's an implementation detail which precise type is returned here, however, then you may want to ignore me here :)
The same point applies to absolute_schema_path
below.
def absolute_path(self) -> Sequence[str]: ... | |
def absolute_path(self) -> deque[str]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think it's an implementation detail which precise type is returned here, however, then you may want to ignore me here :)
Yeah, I considered annotating it as deque
-- that was in my first draft -- but then decided exactly this. Callers should not be relying on that value being a deque
IMO.
In a way, your comment helps validate my decision. It means I was thinking the right kinds of things!
@property | ||
def json_path(self): ... | ||
def json_path(self) -> str: ... | ||
def _contents(self) -> dict[str, Any]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be a bit involved, but you could consider returning a TypedDict
here for more precise typing (each key will have a specific type associated with it for the corresponding value, if I understand the source code correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to punt on this for now, and just add a TODO comment to make the type more precise with TypedDict
, if that's okay. Right now, I want to get this changeset "over the finish line" and then come back when I have some more time to work on jsonschema
and on the typing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
STRONG_MATCHES: Any | ||
from jsonschema import _utils, protocols | ||
|
||
_RelevanceFuncType = Callable[[ValidationError], Any] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functions passed to the key
argument of builtins.max
must return objects that satisfy the _typeshed.SupportsRichComparison
interface:
Line 1295 in 2d2b34c
def max(__arg1: _T, __arg2: _T, *_args: _T, key: Callable[[_T], SupportsRichComparison]) -> _T: ... |
_RelevanceFuncType = Callable[[ValidationError], Any] | |
_RelevanceFuncType = Callable[[ValidationError], SupportsRichComparison] |
(You'll have to import it from _typeshed
at the top of the file.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that's why Ctrl+F: "Comparable"
didn't get me anything on the typing docs!
I wanted this but decided against adding
class _Comparable(Protocol):
def __lt__(self, other: Any) -> bool: ...
to the stubs.
I'll make the change, with great enthusiasm.
(Is SupportsRichComparison
on a path towards inclusion in typing-extensions
and typing
?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is
SupportsRichComparison
on a path towards inclusion intyping-extensions
andtyping
?
I'm rooting for it 🤞 but I think it's still too new to add it to typing_extensions
just yet — we only added it a few months ago in #6583, so I don't think it can really be considered stable yet
|
- is_X funcs are now `object -> bool`, not `Any -> bool` - _Error.contents has a TODO comment to use TypedDict - _RelevanceFuncType uses _typeshed.SupportsRichComparison
Diff from mypy_primer, showing the effect of this PR on open source code: schemathesis (https://github.com/schemathesis/schemathesis)
- src/schemathesis/specs/openapi/negative/__init__.py:6: note: ... from here:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I'll wait a bit to see if any other maintainers have any thoughts before merging
I haven't done a full review but nothing stands out to me on skimming the stubs. If you're OK with the change, feel free to merge. |
Merged ;) |
The output you get from the command is the effect of diffing mypy master vs mypy 0.960 (while using a custom typeshed repo). I'm not sure what the schemathesis diff is about, but we do some output munging / mypy is sensitive to various things that cause it to change file processing order, and sometimes we get weird diffs. |
@property | ||
def absolute_path(self): ... | ||
def absolute_path(self) -> Sequence[str]: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is incorrect (at least with jsonschema 4.4.0) because path entries may be indexes into a JSON array. For example:
import jsonschema
SCHEMA = {
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "array",
"items": {"type": "string"},
}
document = ["a", 12345, "c", "d"]
try:
jsonschema.validate(document, SCHEMA)
except jsonschema.ValidationError as e:
print(repr(e))
for path_entry in e.absolute_path:
print(repr(path_entry), type(path_entry))
$ pip show jsonschema | \grep Version
Version: 4.4.0
$ python temp.py
<ValidationError: "12345 is not of type 'string'">
1 <class 'int'>
I think Sequence[str | int]
is the correct annotation? I'll make a PR with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! Apologies for the oversight. These can be ints in precisely the case you mention. I'll double-check for other cases of this. (I'm not a typeshed maintainer, but I'll also go 👍 your PR.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to worry, glad it was straightforward to sort out!
Apply more detailed annotations to the format module and most of the exceptions module.
Some background:
jsonschema
is starting to add internal annotations and in the last year addedmypy
to its own linting/testing. Hopefully at some point we can publish the package with annotations for general use, but we're not there today.So there's going to be an ongoing project of adding internal annotations, and I'm going to try to port that stuff back into typeshed where appropriate.
There are also some updates here which aren't part of the
jsonschema
annotations, and I would like to at some point port them back in the other direction -- intojsonschema
.If anyone has ideas about how to get annotations gradually added and eventually published as part of a widely used package which is already in typeshed, I'm open to learning a better way of doing this.