QuerySet.annotate improvements #398

syastrov · 2020-06-12T21:51:41Z

QuerySet.annotate now returns correct self-type rather than QuerySet[Any]. Attribute access falls back to Any.
QuerySets that have an annotated model do not report errors during .filter() when called with invalid fields.
QuerySets that have an annotated model return ordinary dict rather than TypedDict for .values()
QuerySets that have an annotated model return Any rather than typed Tuple for .values_list()

Calling QuerySet.annotate introduces an "annotated" model type which is a subtype of the original model dynamically (named "MyModel (annotated)"), but which allows arbitrary getting/setting of attributes.

This is originally to fix this issue:

a = MyModel.objects.filter()
a = a.annotate()

error: Incompatible types in assignment (expression has type “QuerySet[Any]“, variable has type “Manager[Any]“)

The problem is because of not using self-type

sobolevn

Great work!

The only question I have is: how one can represent annotated User statically? For example, we have this code:

def allowed(param: User):
     return param.foo  # E: "User" has no attribute "foo"

But, when one will pass there an instance with .annotate(foo=...) it will work in runtime, but fail in typechecking. Do you have any solution in mind?

syastrov · 2020-06-13T12:40:49Z

Thanks.

At the moment, I don't have a solution for representing annotated User as a type annotation in the actual code. I wanted to address the issue with calling .annotate turning the Manager[SomeModel] into QuerySet[Any], which is annoying and loses information.
So currently, this PR fixes that and also allows access to any attr on the model (within the same function). As you noted, there is no way of statically annotating an annotated model as the type is generated dynamically.

One idea is to make all Models generic to mark them as annotated, like:

_Annotations = TypeVar('_Annotations')
class Model(Generic[_Annotations]):
  pass

class MyModel(Model): ...

class MyAnnotations(TypedDict):
  foo: str
  bar: str

def func(param: MyModel[MyAnnotations]): pass

And handle this in the plugin.

But this might be too intrusive? We also have to see if we can default the generic param in a way so that by default the model is not annotated.

Another idea would be to make use of Annotated introduced in https://www.python.org/dev/peps/pep-0593/ (ironically named very similarly to what we might want to call it)
This way, the type would be the same as the original type at runtime, but we should be able to introduce behavior in the plugin to make it behave differently.
It could look like this:

from typing import Type
from typing_extensions import Annotated
from django.db.models.base import Model

class Annotations:
    def __init__(self, **kwargs: Type): ...

# User code:
# If no arguments, assume any
def func(user: Annotated[Model, Annotations()]) -> str:
  return user.foo + user.bar

# If we later want to support specifying which fields are annotated, it could look like this:
def func2(user: Annotated[Model, Annotations(foo=str, bar=str)]) -> str:
  return user.foo + user.bar

At first, I thought it's only available in Python 3.9 (unreleased), but it looks like it's in typing_extensions so we actually could try using it in older versions.
(Downside is that it seems like PyCharm doesn't support it yet though -- it thinks it's just Any, so might be annoying for those trying to autocomplete).

Another downside of using Annotated is that the type-checker won't check for type compatibility between different annotated models, e.g. assuming that one is annotated with a TypedDict that is a subset of another TypedDict.

syastrov · 2020-06-13T22:06:05Z

Looks like it will be some work to fix django typechecking

sobolevn · 2020-06-14T08:16:29Z

@syastrov I really like your Annotated[] idea. Let's merge this PR as is and open a new issue to track Annotated[] there!

Thanks a lot for the great work! 👍

syastrov · 2020-06-14T19:32:45Z

The tests are passing now, but I have some reservations about this, and would like to do some additional testing. So please, hold off on merging.

syastrov · 2020-06-14T21:45:17Z

By the way, looks like PyCharm is working on supporting Annotated: https://youtrack.jetbrains.com/issue/PY-41847
And that version is planned for release in July, so maybe there's a chance it's coming soon.

On second thought, I don't have any reservations as long as this PR is just fixing the problem within functions and not ACROSS functions (due to not being able to add type annotations using an "annotated" model or QuerySet).

I still need some more tests of .values() / .values_list() with no params and I think it should be good.

sobolevn · 2020-06-15T08:58:52Z

I still need some more tests of .values() / .values_list() with no params and I think it should be good.

Ok, let's wait for it and then merge!

@syastrov I have opened a new issue to track Annotated feature: #399

syastrov · 2020-08-01T20:50:01Z

@sobolevn I finally found some time to finish this :)

kszmigiel

Looks totally good for me
@syastrov thank you for you work!
@sobolevn How should we deal with all the changes in mypy plugin created since the refactor branch was created? Should I apply all these changes to the refactor branch by hand? What's your opinion?

sobolevn · 2020-08-05T11:09:28Z

Let's merge our own refactored code first. Then we can rebase these changed and apply them as well.
In case nothing is broken.

- QuerySets that have an annotated model do not report errors during .filter() when called with invalid fields. - QuerySets that have an annotated model return ordinary dict rather than TypedDict for .values() - QuerySets that have an annotated model return Any rather than typed Tuple for .values_list()

…ypechecking Django testsuite.

…ming QuerySet.first() won't return None) Fix mypy self-check.

…s_list. Cleanup tests.

…params.

syastrov · 2021-07-06T22:52:35Z

Hmm, this is getting tricky. I believe the type ignores in _ValuesQuerySet are causing certain things to not work.

I had an issue with the test code:

qs1 = Blog.objects.values('text').annotate(foo=F('id'))
reveal_type(qs1) # N: Revealed type is "django.db.models.query._ValuesQuerySet[myapp.models.Blog (annotated), builtins.dict[builtins.str, Any]]"

The actual value is django.db.models.query._ValuesQuerySet[myapp.models.Blog, TypedDict({'text': builtins.str})]", which is not expected. I believe the problem is that annotate lacks the self-type _QS. I tried adding that, but mypy says Invalid self argument "_ValuesQuerySet[Blog, TypedDict({'text': str})]" to attribute function "annotate" with type "Callable[[_QS, VarArg(Any), KwArg(Any)], _QS]".
I believe this is due to the type ignore on the class _ValuesQuerySet(...) line, which causes mypy to seemingly not apply the inheritance and thus messes up the self-typing logic.

To try to fix the inheritance by removing that type ignore, I tried changing the class definition, but this causes a failure in values_list_flat_true_with_ids.

The definition I am trying:

class _ValuesQuerySet(QuerySet[_T], Collection[_Row], Reversible[_Row], Sized):
    def __iter__(self) -> Iterator[_Row]: ... # type: ignore 
    @overload
    def __getitem__(self, i: int) -> _Row: ...
    @overload
    def __getitem__(self: _QS, s: slice) -> _QS: ...
    def iterator(self, chunk_size: int = ...) -> Iterator[_Row]: ...  # type: ignore
    def get(self, *args: Any, **kwargs: Any) -> _Row: ...  # type: ignore
    def earliest(self, *fields: Any, field_name: Optional[Any] = ...) -> _Row: ...  # type: ignore
    def latest(self, *fields: Any, field_name: Optional[Any] = ...) -> _Row: ...  # type: ignore
    def first(self) -> Optional[_Row]: ...  # type: ignore
    def last(self) -> Optional[_Row]: ...  # type: ignore

I believe the problem is the type ignore on __iter__. It means that mypy doesn't really apply this function annotation, but uses the super-class's (QuerySet's) definition.

So there is not really any way to resolve this, I think, except re-instating _BaseQuerySet, I believe.

… PEP 583 Annotated type.

syastrov · 2021-07-07T04:04:24Z

I managed to make it possible to annotate functions as taking "annotated models" or QuerySets using PEP 593 Annotated.

The "repr" also looks good due to some hacks :)

I quite like this way of doing it, as it's more self-explainable when you see the error messages, and it makes sense that users can also annotate their own functions using these types.

So, finally, if the issues above regarding inheritance of _ValuesQuerySet are fixed, then this PR should be ready (hopefully) :)

sobolevn

Great work. Like really great work! 👍

sobolevn · 2021-07-07T08:14:59Z

README.md

+    username = models.CharField(max_length=100)
+
+
+def func(m: WithAnnotations[MyModel]) -> str:


Will it be possible for us to later add something like WithAnnotations[Model, TypedDict] and not break things for users? Double checking 🙂

mypy_django_plugin/django/context.py

mypy_django_plugin/lib/helpers.py

sobolevn · 2021-07-07T08:18:18Z

mypy_django_plugin/transformers/models.py

+
+
+def get_or_create_annotated_type(
+    api: Union[SemanticAnalyzer, CheckerPluginInterface], model_type: Instance


api: SemanticAnalyzer?

Or even TypeAnalyzer

It's unfortunately used both in MethodContext (CheckerPluginInterface) and AnalyzeTypeContext (where I can grab SemanticAnalyzer from the api.api attribute).

sobolevn · 2021-07-07T08:25:03Z

Looks like we still have a broken test:

=================================== FAILURES ===================================
________________________ values_list_flat_true_with_ids ________________________
/home/runner/work/django-stubs/django-stubs/tests/typecheck/managers/querysets/test_values_list.yml:220: 
E   pytest_mypy_plugins.utils.TypecheckAssertionError: Invalid output: 
E   Expected:
E     main:2: note: Revealed type is "django.db.models.query._ValuesQuerySet[myapp.models.Blog, builtins.i...
E     main:3: note: Revealed type is "django.db.models.query._ValuesQuerySet[myapp.models.Blog, builtins.i...
E     main:5: note: Revealed type is "builtins.list[builtins.int*]" (diff)
E   Actual:
E     main:2: note: Revealed type is "django.db.models.query._ValuesQuerySet[myapp.models.Blog, builtins.i...
E     main:3: note: Revealed type is "django.db.models.query._ValuesQuerySet[myapp.models.Blog, builtins.i...
E     main:5: note: Revealed type is "builtins.list[myapp.models.Blog*]" (diff)
E   
E   Alignment of first line difference:
E     E: ...ed type is "builtins.list[builtins.int*]"
E     A: ...ed type is "builtins.list[myapp.models.Blog*]"
E                                     ^
=========================== short test summary info ============================
FAILED tests/typecheck/managers/querysets/test_values_list.yml::values_list_flat_true_with_ids

Is it the one you have mnetioned in #398 (comment)
Is it related to _BaseQuerySet?

@sobolevn

This currently has the drawback that error messages display the internal type _QuerySet, with both type arguments. See also discussion on typeddjango#661 and typeddjango#608. Fixes typeddjango#635: QuerySet methods on Managers (like .all()) now return QuerySets rather than Managers. Address code review by @sobolevn.

sobolevn

This is pure gold 🏅 Thank you!

Merge when you fill confident!

syastrov · 2021-07-11T13:02:24Z

Thanks! I tried to explore this route with annotating using TypedDicts. I think it's a good idea, and motivated me to change the implementation from using Annotated(...) to Annotated[...], since you cannot use "class construction" in function argument type annotations, at least at runtime.

I updated the README as well, so you can see the current limitations.

Let's see if I didn't break a lot of things :)

README.md

…dDict.

sobolevn

💯

syastrov · 2021-07-19T19:31:31Z

I would like to test it against an internal code-base before merging.

However, there is still one problem which I don't think we can solve at the moment, at least with the current implementation approach, but which I think would be useful to solve.

I would like to define a function which can add annotations to an existing QuerySet.
It might look something like this (although WithAnnotations[_QS, FooDict] would probably be cleaner):

class FooDict(TypedDict):
  foo: int

_Model = TypeVar("_Model", bound=MyModel)
def add_annotation_with_typevar(qs: QuerySet[_Model]) -> QuerySet[WithAnnotations[_Model, FooDict]]:
    return qs.annotate(foo=F('id'))

I'd like to be able to have multiple functions like these and chain them together so I can build up a QuerySet with multiple annotations.

This won't work correctly in the mypy plugin since get_type_analyze_hook receives the _Model arg for WithAnnotations as a TypeVarType, which doesn't really allow us to generate this "intersection type" -- the concrete subclass of _Model and a protocol containing the fields in the TypedDict. As a hack, I think we'd have to return some intermediate class that then somehow allows attribute access to the given fields from the TypedDict (using get_attribute_hook). But I'm not sure how that would play in terms of subtyping (could we really get it to be considered a subtype of _Model then?). I think this is just limited by the fact that mypy doesn't have an intersection type.

Anyway, I don't think this problem should prevent this PR from being merged.

syastrov · 2021-07-19T20:12:34Z

By the way, I found this issue python/mypy#6501 regarding getting a better repr for types.

…or example). Fix some edge case with from_queryset after QuerySet changed to be an alias to _QuerySet. Can't make a minimal test case as this only occurred on a large internal codebase.

syastrov · 2021-07-21T22:46:48Z

Alright, I got this to work against the internal codebase after a small fix.

I also noticed another bug, which I was able to workaround, but I was not able to make a minimal test case of.
Most likely, it was somehow possible to trigger it before these changes, but somehow it occurs for me now.

It involves having a custom manager which has a method with an argument annotated with User (from contrib.auth).
The error is KeyError: "models" in semantic analyzer, which is while it's trying to lookup myapp.modules.User (which is really just the imported contrib.auth.User). This doesn't happen if I change the annotation to be some other model than User, though.

Perhaps there is some defer call missing in the code that generates the manager class based on the custom manager class. But I am not sure.

If you agree, let's merge this!

syastrov · 2021-07-21T23:06:17Z

Hmm, I did notice that when using from_queryset and when doing MyModel.objects.filter(...).my_method() mypy says that my_method doesn't exist on _QuerySet[MyModel, MyModel] . I think this is after my change where BaseManager no longer derives from QuerySet. There isn't any existing test for this it seems.

sobolevn · 2021-07-22T07:54:13Z

@syastrov let's open two separate issues for things you described and merge this! 👍

…ent with a type annotation on the QuerySet. The mypy docstring on anal_type says not to call defer() after it.

syastrov · 2021-07-23T12:40:52Z

I fixed the first issue (KeyError: "models") while investigating the other issue. I unfortunately can't make a reproducible test case. But, from the mypy docstring on anal_type, it says not to call defer after it and the plugin is doing that. So I've removed that, and removed allow_placeholders=True as well, since that seemed redundant.

syastrov · 2021-07-23T12:51:26Z

@sobolevn You are welcome to merge if the tests pass :)

Then I'll open an issue on the second thing

sobolevn · 2021-07-23T13:15:32Z

@syastrov thanks a lot for your amazing work! 🎉

noelleleigh · 2021-07-23T13:50:54Z

Wow, this has been a long time coming! When can we expect a release with these changes?

syastrov · 2021-07-23T14:10:15Z

You are welcome! :) Glad it's merged. I opened an issue for the other thing

syastrov force-pushed the queryset-annotate branch 3 times, most recently from e7f0544 to 41de338 Compare June 12, 2020 21:55

sobolevn previously approved these changes Jun 12, 2020

View reviewed changes

sobolevn assigned sobolevn and syastrov Jun 15, 2020

sobolevn mentioned this pull request Jun 15, 2020

Add Annotated type to represent the result of .annotate() call #399

Closed

sobolevn requested review from mkurnikov and kszmigiel August 1, 2020 21:10

kszmigiel approved these changes Aug 5, 2020

View reviewed changes

sidmitra mentioned this pull request Apr 25, 2021

QuerySet's .annotate() returns an incorrect type when chained with .values() #602

Open

syastrov added 10 commits May 8, 2021 09:08

Fix .annotate so it reuses existing annotated types. Fixes error in t…

d4278f7

…ypechecking Django testsuite.

Fix self-typecheck error

922ed03

Fix flake8

c31a0a6

Fix case of .values/.values_list before .annotate.

dad53c2

Extra ignores for Django 2.2 tests (false positives due to tests assu…

63fcc8a

…ming QuerySet.first() won't return None) Fix mypy self-check.

More tests + more precise typing in case annotate called before value…

2a319a8

…s_list. Cleanup tests.

Test and fix annotate in combination with values/values_list with no …

33372f6

…params.

Remove line that does nothing :)

db604d6

Formatting fixes

ebf9500

syastrov force-pushed the queryset-annotate branch from 935590f to ebf9500 Compare May 10, 2021 07:14

syastrov added 2 commits July 7, 2021 01:39

Make it possible to annotate user code with "annotated models", using…

b8debd2

… PEP 583 Annotated type.

Add docs

751ae7f

sobolevn reviewed Jul 7, 2021

View reviewed changes

sobolevn approved these changes Jul 7, 2021

View reviewed changes

Support passing TypedDicts to WithAnnotations

51f0448

syastrov force-pushed the queryset-annotate branch from 2106a39 to 51f0448 Compare July 11, 2021 13:02

sobolevn reviewed Jul 11, 2021

View reviewed changes

README.md Show resolved Hide resolved

Add an example of an error to README regarding WithAnnotations + Type…

72c1dfb

…dDict.

sobolevn approved these changes Jul 17, 2021

View reviewed changes

Fix runtime behavior of ValuesQuerySet alias (you can't extend Any, f…

a6a47da

…or example). Fix some edge case with from_queryset after QuerySet changed to be an alias to _QuerySet. Can't make a minimal test case as this only occurred on a large internal codebase.

Fix issue when using from_queryset in some cases when having an argum…

d3ea945

…ent with a type annotation on the QuerySet. The mypy docstring on anal_type says not to call defer() after it.

syastrov force-pushed the queryset-annotate branch from 7fd6691 to d3ea945 Compare July 23, 2021 12:46

sobolevn merged commit cfd69c0 into typeddjango:master Jul 23, 2021

syastrov mentioned this pull request Jul 23, 2021

Error calling custom queryset method on from_queryset generated manager after calling other QuerySet method (e.g. filter) #673

Closed

flaeppe mentioned this pull request Jan 24, 2023

WithAnnotations Crash in 1.9.0 #760

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

QuerySet.annotate improvements #398

QuerySet.annotate improvements #398

syastrov commented Jun 12, 2020 •

edited

Loading

sobolevn left a comment

syastrov commented Jun 13, 2020

syastrov commented Jun 13, 2020

sobolevn commented Jun 14, 2020

syastrov commented Jun 14, 2020

syastrov commented Jun 14, 2020

sobolevn commented Jun 15, 2020

syastrov commented Aug 1, 2020

kszmigiel left a comment

sobolevn commented Aug 5, 2020 •

edited

Loading

syastrov commented Jul 6, 2021 •

edited

Loading

syastrov commented Jul 7, 2021

sobolevn left a comment

sobolevn Jul 7, 2021

sobolevn Jul 7, 2021

sobolevn Jul 7, 2021

syastrov Jul 7, 2021

sobolevn commented Jul 7, 2021

sobolevn left a comment

syastrov commented Jul 11, 2021

sobolevn left a comment

syastrov commented Jul 19, 2021

syastrov commented Jul 19, 2021

syastrov commented Jul 21, 2021

syastrov commented Jul 21, 2021

sobolevn commented Jul 22, 2021 •

edited

Loading

syastrov commented Jul 23, 2021

syastrov commented Jul 23, 2021

sobolevn commented Jul 23, 2021

noelleleigh commented Jul 23, 2021

syastrov commented Jul 23, 2021

		username = models.CharField(max_length=100)


		def func(m: WithAnnotations[MyModel]) -> str:



		def get_or_create_annotated_type(
		api: Union[SemanticAnalyzer, CheckerPluginInterface], model_type: Instance

QuerySet.annotate improvements #398

QuerySet.annotate improvements #398

Conversation

syastrov commented Jun 12, 2020 • edited Loading

sobolevn left a comment

Choose a reason for hiding this comment

syastrov commented Jun 13, 2020

syastrov commented Jun 13, 2020

sobolevn commented Jun 14, 2020

syastrov commented Jun 14, 2020

syastrov commented Jun 14, 2020

sobolevn commented Jun 15, 2020

syastrov commented Aug 1, 2020

kszmigiel left a comment

Choose a reason for hiding this comment

sobolevn commented Aug 5, 2020 • edited Loading

syastrov commented Jul 6, 2021 • edited Loading

syastrov commented Jul 7, 2021

sobolevn left a comment

Choose a reason for hiding this comment

sobolevn Jul 7, 2021

Choose a reason for hiding this comment

sobolevn Jul 7, 2021

Choose a reason for hiding this comment

sobolevn Jul 7, 2021

Choose a reason for hiding this comment

syastrov Jul 7, 2021

Choose a reason for hiding this comment

sobolevn commented Jul 7, 2021

sobolevn left a comment

Choose a reason for hiding this comment

syastrov commented Jul 11, 2021

sobolevn left a comment

Choose a reason for hiding this comment

syastrov commented Jul 19, 2021

syastrov commented Jul 19, 2021

syastrov commented Jul 21, 2021

syastrov commented Jul 21, 2021

sobolevn commented Jul 22, 2021 • edited Loading

syastrov commented Jul 23, 2021

syastrov commented Jul 23, 2021

sobolevn commented Jul 23, 2021

noelleleigh commented Jul 23, 2021

syastrov commented Jul 23, 2021

syastrov commented Jun 12, 2020 •

edited

Loading

sobolevn commented Aug 5, 2020 •

edited

Loading

syastrov commented Jul 6, 2021 •

edited

Loading

sobolevn commented Jul 22, 2021 •

edited

Loading