Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add signature for dataclasses.replace #14849

Merged
merged 49 commits into from
Jun 17, 2023

Conversation

ikonst
Copy link
Contributor

@ikonst ikonst commented Mar 7, 2023

Validate dataclassses.replace actual arguments to match the fields:

  • Unlike __init__, the arguments are always named.
  • All arguments are optional except for InitVars without a default value.

The tricks:

  • We're looking up type of the first positional argument ("obj") through private API. See Adjusting the signature of a function on the basis of caller types in plugin #10216, Let function signature hook have context of caller #14845.
  • We're preparing the signature of "replace" (for that specific dataclass) during the dataclass transformation and storing it in a "private" class attribute __mypy-replace (obviously not part of PEP-557 but contains a hyphen so should not conflict with any future valid identifier). Stashing the signature into the symbol table allows it to be passed across phases and cached across invocations. The stashed signature lacks the first argument, which we prepend at function signature hook time, since it depends on the type that replace is called on.

Based on #14526 but actually simpler.
Partially addresses #5152.

Remaining tasks

  • handle generic dataclasses
  • avoid data class transforms
  • fine-grained mode tests

@github-actions

This comment has been minimized.

arg_names = [None]
arg_kinds = [ARG_POS]
arg_types = [obj_type]
for attr in dataclass["attributes"]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish I could've deserialized metadata, or even accessed a list of DataclassAttributes that's already been deserialized...

Or maybe the right way to go is to replace dataclasses.replace with an OverloadedFuncDef that the transform (which has full context) will keep adding to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OverloadedFuncDef appears to be a non-starter since by the time I can do anything to replace dataclasses.replace with a synthesized OverloadedFuncDef, the call expression has been resolved with the pointer to the generic FunctionDef.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, solved it by creating a "secret" symbol with a signature at semantic analysis time, then using this signature in the function sig callback.

@ikonst ikonst force-pushed the 2023-03-06-dataclasses-replace branch from f6c3a8f to 1f08816 Compare March 8, 2023 05:09
@github-actions

This comment has been minimized.

@ikonst
Copy link
Contributor Author

ikonst commented Mar 8, 2023

@ilevkivskyi care to take a look?

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@ikonst
Copy link
Contributor Author

ikonst commented Mar 14, 2023

👋 @AlexWaygood

Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just a triager over at mypy, so I'm afraid I don't have merge privileges! Here's a quick comment though:

@@ -32,3 +32,5 @@ def field(*,


class Field(Generic[_T]): pass

def replace(obj: _T, **changes: Any) -> _T: ...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty different to typeshed's stub for replace. Would it be worth adding a pythoneval test as well, to check that this machinery all works with the full typeshed stubs for the stdlib, outside the context of mypy's test suite?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stub has:

def replace(__obj: _DataclassT, **changes: Any) -> _DataclassT: ...

I can rename obj to __obj in this fixture. We can also change the stub to better reflect reality:

if sys.version_info >= (3, 9):
    def replace(obj: _DataclassT, /, **changes: Any) -> _DataclassT: ...
else:
   def replace(obj: _DataclassT, **changes: Any) -> _DataclassT: ...

Can we just use the Python 3.9 syntax even for Python 3.8 targets? Passing obj as kw is deprecated in Python 3.8 so being a checker/linter we might as well just disallow it, no?

Do you think it matters much? For the plugin to function, replace has to:

  1. exist
  2. have a first argument

The **changes are optional, the retval is overwritten, and FWIW it doesn't look for a keyword arg obj since it's deprecated since Python 3.8 and was never a good idea to begin with, so we might as well just not support it, right?

Copy link
Member

@AlexWaygood AlexWaygood Mar 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also change the stub to better reflect reality:

No need: prepending the parameter name with __ is the py37-compatible way of declaring that a parameter is positional-only, as is described in PEP 484: https://peps.python.org/pep-0484/#positional-only-arguments. So the obj parameter is already marked as positional-only on all Python versions in typeshed's stubs.

I don't think you need to change anything in your existing fixtures/tests for this PR, I'm just suggesting adding a single additional test in the pythoneval.test file. Unlike mypy's other tests, the tests in that file don't use fixtures — they run with typeshed's full stdlib stubs. So it's a good sanity check to make sure that the machinery you've added works with typeshed's stubs for this function as well as with mypy's fixtures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@github-actions

This comment has been minimized.

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
@github-actions

This comment has been minimized.

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still looks like a hack, but OTOH it should not make things worse, so let's give it a try. But please be sure to at least fix the fine-grained test before merging.

if replace_sym is None:
_fail_not_dataclass(ctx, display_typ, parent_typ)
return None
replace_sig = get_proper_type(replace_sym.type)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to call get+proper_type() here, since we know we added it, you can just add assert isinstance(..., ProperType) instead.

Copy link
Contributor Author

@ikonst ikonst May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It ends up looking awkward like this:

-replace_sig = get_proper_type(replace_sym.type)
+replace_sig = replace_sym.type
+assert isinstance(replace_sig, ProperType)
 assert isinstance(replace_sig, CallableType)

Isn't it simpler to do what I did before, even if taking a few more cycles?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it actually looks fine to have two asserts. It has two additional benefits:

  • It clearly indicates the intent
  • It reduces number of get_proper_type() in the code, so people we hopefully cargo cult them less (we had a bunch of weird crashes caused by blindly copying some get_proper_type() calls).

return None


def _meet_replace_sigs(sigs: list[CallableType]) -> CallableType:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like hack. Is this because the plugin systems expects a CallableType returned from the plugin? Maybe we can instead relax this expectation and allow returning say a union of callable types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, that's an interesting idea. It didn't cross my mind I could return a union of callables and let mypy handle it in check_union_call.

Copy link
Contributor Author

@ikonst ikonst May 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s. I'll be happy to simplify this code, but do you reckon we should first change the plugin API and then rework this PR, or merge as-is, change plugin API and then refactor?

(A bit worried the devil's in the details and that maybe it's not as easy as it sounds, or not equivalent.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to do it in a follow up PR, but leave a TODO or open an issue (or both).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


assert isinstance(ctx.api, TypeChecker)
obj_type = ctx.api.expr_checker.accept(obj_arg)
# </hack>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This proposal still stands. Do we have an issue for this?

test-data/unit/check-dataclasses.test Outdated Show resolved Hide resolved
test-data/unit/check-dataclasses.test Show resolved Hide resolved
test-data/unit/fine-grained-dataclass.test Outdated Show resolved Hide resolved
@@ -0,0 +1,25 @@
[case replace]
[file model.py]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this case may be not correct, there should be a top-level section like import replace. I am not sure what the test runner does when it is empty. (see testMetaclassOperators as a totally random example)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's meaningful to import it? I would think mypy goes over all Python files, rather than follow the import chain.

e.g.

echo > foo.py
echo "a: str = 42" > bar.py
mypy .

results in

bar.py:1: error: Incompatible types in assignment (expression has type "int", variable has type "str")  [assignment]
F

I don't mind adding the import but it doesn't make the test any more/less meaningful (notice that it does produce the expected error).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried fine-grained test runner doesn't simply check all files (like some other test runners), but I checked the code, and it actually looks OK.

@github-actions
Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

vision (https://github.com/pytorch/vision) got 1.42x slower (29.2s -> 41.3s)

@ikonst
Copy link
Contributor Author

ikonst commented Jun 2, 2023

@ilevkivskyi can take another look?

@github-actions
Copy link
Contributor

github-actions bot commented Jun 2, 2023

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@github-actions
Copy link
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@ikonst
Copy link
Contributor Author

ikonst commented Jun 13, 2023

@ilevkivskyi can take another look?

@ilevkivskyi
Copy link
Member

@ikonst Please note there is now a merge conflict. You can also address couple remaining comments before I merge this.

@github-actions
Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

graphql-core (https://github.com/graphql-python/graphql-core): typechecking got 1.06x slower (281.6s -> 297.4s)
(Performance measurements are based on a single noisy sample)

@ilevkivskyi ilevkivskyi merged commit 6f2bfff into python:master Jun 17, 2023
@ikonst ikonst deleted the 2023-03-06-dataclasses-replace branch June 17, 2023 13:16
sobolevn added a commit that referenced this pull request Jun 26, 2023
…5503)

Now we use a similar approach to
#14849
First, we generate a private name to store in a metadata (with `-`, so -
no conflicts, ever).
Next, we check override to be compatible: we take the currect signature
and compare it to the ideal one we have.

Simple and it works :)

Closes #15498
Closes #9254

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ivan Levkivskyi <levkivskyi@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants