TYPING: Added types for some tests #29205

Closed
wants to merge 2 commits into from

Conversation

TomAugspurger
Contributor

Working around a strange typing issue. See
#28394 (comment)
for more, but the types on these were being inferred incorrectly by
mypy with just the addition of the allows_duplicate_labels kwarg.

cc @simonjayhawkins & @WillAyd. Hopefully I got things correct. I copied the use of TypeVar for FrameOrSeries with IndexOrSeries.

@TomAugspurger TomAugspurger added the Typing type annotations, mypy/pyright type checking label Oct 24, 2019
@TomAugspurger TomAugspurger added this to the 1.0 milestone Oct 24, 2019
@@ -32,6 +33,7 @@
FilePathOrBuffer = Union[str, Path, IO[AnyStr]]

FrameOrSeries = TypeVar("FrameOrSeries", bound="NDFrame")
IndexOrSeries = TypeVar("IndexOrSeries", bound="IndexOpsMixin")
Contributor

@jreback jreback Oct 24, 2019

Is this meant to include array-likes as well, e.g. EAs?

Contributor Author

Nope, just Index and Series.

pd.Index,
] # type: List[Union[Type[pd.Index], Type[pd.RangeIndex], Type[pd.Series]]]
left = [pd.RangeIndex(10, 40, 10)] # type: List[Union[Index, Series]]
for cls in index_or_series_params:
Contributor

isn’t this any_numeric_dtype ?

Contributor Author

This seems to include float16. Aside from that, they look the same at a glance.

Contributor

We barely support float16; I would just remove it and use the fixture.

@WillAyd
Member

WillAyd commented Oct 24, 2019

Any chance you can post the failures this was solving? Would be helpful as background

@simonjayhawkins
Member

Any chance you can post the failures this was solving? Would be helpful as background

pandas\tests\test_strings.py:206: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:240: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:379: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:391: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:431: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:444: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:498: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:600: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:660: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_strings.py:661: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_base.py:519: error: List item 0 has incompatible type "Type[Index]"; expected "Type[PandasObject]"
pandas\tests\test_base.py:550: error: List item 0 has incompatible type "Type[Index]"; expected "Type[PandasObject]"
pandas\tests\test_base.py:615: error: List item 0 has incompatible type "Type[Index]"; expected "Type[PandasObject]"
pandas\tests\test_base.py:1093: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_base.py:1123: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_base.py:1147: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_base.py:1335: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\test_base.py:1399: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\io\json\test_json_table_schema.py:434: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\io\json\test_json_table_schema.py:441: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\indexing\test_coercion.py:518: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\indexing\test_coercion.py:544: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\indexing\test_coercion.py:566: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\indexing\test_coercion.py:786: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\indexing\test_coercion.py:798: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\indexing\test_coercion.py:836: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\indexing\test_coercion.py:868: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\dtypes\test_concat.py:43: error: List item 0 has incompatible type "Type[Index]"; expected "Type[PandasObject]"
pandas\tests\arithmetic\test_numeric.py:87: error: List comprehension has incompatible type List[PandasObject]; expected List[RangeIndex]
pandas\tests\arithmetic\test_numeric.py:101: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\arithmetic\test_numeric.py:126: error: List comprehension has incompatible type List[PandasObject]; expected List[RangeIndex]
pandas\tests\arithmetic\test_numeric.py:140: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"
pandas\tests\arrays\test_array.py:275: error: List item 0 has incompatible type "Type[Series]"; expected "Type[PandasObject]"

Member

@WillAyd WillAyd left a comment

Looks pretty good. Some minor things I think can help the typing and legibility.

Nice to parametrize here as well

index_or_series_params = [
pd.Series,
pd.Index,
] # type: List[Union[Type[pd.Index], Type[pd.RangeIndex], Type[pd.Series]]]
Member

Suggested change
- ] # type: List[Union[Type[pd.Index], Type[pd.RangeIndex], Type[pd.Series]]]
+ ] # type: Sequence[Type[Union[pd.Index, pd.Series]]]

This can be shortened quite a bit if you put the Union inside of the Type. Also, if you use Sequence instead of List, it is covariant, so it can implicitly handle RangeIndex being a subclass of Index.
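
As a rough illustration of the covariance point, here is a minimal sketch. The classes are hypothetical stand-ins rather than the pandas classes, and the alias IndexOrSeriesType and the helper make_all are names invented for this example.

```python
from typing import List, Sequence, Type, Union


class Index:
    """Hypothetical stand-in for pd.Index."""


class RangeIndex(Index):
    """Hypothetical stand-in for pd.RangeIndex."""


class Series:
    """Hypothetical stand-in for pd.Series."""


# Union inside Type: "the class object of either an Index or a Series".
IndexOrSeriesType = Type[Union[Index, Series]]


def make_all(constructors: Sequence[IndexOrSeriesType]) -> List[Union[Index, Series]]:
    # Sequence is read-only, so a type checker can treat it covariantly:
    # a List[Type[RangeIndex]] is accepted where this Sequence is expected.
    return [cls() for cls in constructors]


range_only: List[Type[RangeIndex]] = [RangeIndex]
# Accepted with a Sequence parameter. If the parameter were declared as
# List[IndexOrSeriesType], mypy would reject this call because List is invariant.
objs = make_all(range_only)
```

With Sequence, RangeIndex does not need to be listed separately in the annotation, which is what makes the shorter form possible.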

from pandas.core.indexes.datetimelike import DatetimeIndexOpsMixin
import pandas.util.testing as tm

index_or_series = [Index, Series] # type: List[Type[IndexOpsMixin]]
Member

Are you not reusing the definition in pandas._typing here for a particular reason?

Contributor Author

Missed this one.

@@ -790,6 +791,17 @@ def tick_classes(request):
return request.param


index_or_series_params = [pd.Index, pd.Series] # type: IndexOrSeries
Member

# type: List[IndexOrSeries]

Member

Should this be List[Type[IndexOrSeries]]? Surprised this passed checks as is; seems like something wonky going on

Member

Good catch.

Agreed, it is strange that it doesn't fail.
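
The distinction behind this suggestion can be sketched as follows. This is a minimal example with hypothetical stand-in classes (Base, Index, Series), not the pandas ones: a list of classes has element type Type[...], while a list of instances has the plain class as its element type.

```python
from typing import List, Type


class Base:
    """Hypothetical common base class."""


class Index(Base):
    pass


class Series(Base):
    pass


# The list holds the classes themselves, so the element type is Type[Base].
constructors: List[Type[Base]] = [Index, Series]

# Calling each class produces instances, so the element type is plain Base.
instances: List[Base] = [cls() for cls in constructors]
```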

index_or_series_params = [
pd.Series,
pd.Index,
] # type: Sequence[Type[Union[pd.Index, pd.Series]]]
Member

@simonjayhawkins simonjayhawkins Oct 24, 2019

List[Type[IndexOrSeries]]

import pandas.core.strings as strings
import pandas.util.testing as tm
from pandas.util.testing import assert_index_equal, assert_series_equal

index_or_series_params = [Index, Series] # type: List[Type[PandasObject]]
Member

@simonjayhawkins simonjayhawkins Oct 24, 2019

List[Type[IndexOrSeries]]

Contributor Author

With this, I get

pandas/tests/test_strings.py:18: error: Type variable "pandas._typing.IndexOrSeries" is unbound

Any suggestions?

Member

We probably don't need to create IndexOrSeries in _typing, so the union or the shared class is fine, but we should probably be consistent.
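
For background on the "unbound" error: a TypeVar only has meaning inside a generic function or class, where each call or instantiation binds it. A module-level variable has nothing to bind it, which is why the union or the shared class is used there instead. The sketch below uses hypothetical stand-in classes, and the helper identity is invented for illustration.

```python
from typing import List, Type, TypeVar, Union


class IndexOpsMixin:
    """Hypothetical stand-in for the shared base class."""


class Index(IndexOpsMixin):
    pass


class Series(IndexOpsMixin):
    pass


IndexOrSeries = TypeVar("IndexOrSeries", bound=IndexOpsMixin)


def identity(obj: IndexOrSeries) -> IndexOrSeries:
    # Inside a generic function the TypeVar is bound per call:
    # passing a Series means the checker sees a Series coming back.
    return obj


# At module level there is no call to bind the TypeVar, so annotate with
# the shared class or a Union rather than List[Type[IndexOrSeries]].
by_base: List[Type[IndexOpsMixin]] = [Index, Series]
by_union: List[Type[Union[Index, Series]]] = [Index, Series]
```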

Member

@simonjayhawkins do you know generally how mypy is inferring these? I guess from the original list of errors I'm not sure where Type[PandasObject] even comes into play. I wonder if there is a core issue we are missing here, but I may also just be overlooking something.

Member

my first reaction is that it is a mypy error. Revealed type is 'builtins.list[def (data: Any =, Any =, Any =, name: Any =, Any =, Any =, **Any) -> pandas.core.base.PandasObject]' seems iffy.

but agree, that it may be better to try and determine the cause rather than making these changes.

Member

I wonder if types are getting mangled from pytest

Member

@simonjayhawkins simonjayhawkins Oct 24, 2019

i don't think pytest is relevant...

from typing import List, Type, cast

import pandas as pd

reveal_type([pd.Index, pd.Series])

[pd.Index, pd.Series]

cast(List[Type[pd.core.base.PandasObject]], [pd.Index, pd.Series])

$ mypy test.py
test.py:5: note: Revealed type is 'builtins.list[def (data: Any =, Any =, Any =, name: Any =, Any =, Any =, Any =) -> pandas.core.base.PandasObject]'
test.py:5: error: List item 0 has incompatible type "Type[Index]"; expected "Type[PandasObject]"
test.py:7: error: List item 0 has incompatible type "Type[Index]"; expected "Type[PandasObject]"

Member

i don't think the reveal_type([pd.Index, pd.Series]) should be giving an error which is why i suspect a mypy issue.

Member

Hmmm.....so when I run the above this is all I get:

$ mypy -V
mypy 0.720
$ mypy test.py
test.py:5: note: Revealed type is 'builtins.list[builtins.type*]'

Wonder why we get different output.

Member

You need to clone #28394.

@TomAugspurger
Contributor Author

TomAugspurger commented Oct 24, 2019

Here's a somewhat reduced example

(edit, simplified some more)

# ok.py
from typing import List, Type, cast


class Base:
    pass


class Series(Base):
    def __init__(self, a=None):
        pass


class Index(Base):
    def __new__(cls, a=None, **kwargs) -> "Index":
        result = object.__new__(cls)
        return result


reveal_type([Index, Series])
[Index, Series]
cast(List[Type[Base]], [Index, Series])

# bug.py
from typing import List, Type, cast


class Base:
    pass


class Series(Base):
    def __init__(self, a=None, z=None):
        pass


class Index(Base):
    def __new__(cls, a=None, **kwargs) -> "Index":
        result = object.__new__(cls)
        return result


reveal_type([Index, Series])
[Index, Series]
cast(List[Type[Base]], [Index, Series])

So the only difference is the new keyword argument on Series.__init__ (z=None).

The outputs are

bug.py:20: note: Revealed type is 'builtins.list[def (a: Any =, Any =) -> bug.Base]'
bug.py:20: error: List item 0 has incompatible type "Type[Index]"; expected "Type[Base]"
bug.py:21: error: List item 0 has incompatible type "Type[Index]"; expected "Type[Base]"
ok.py:20: note: Revealed type is 'builtins.list[builtins.type*]'

@TomAugspurger
Contributor Author

A couple observations:

  1. This goes away when there are the same number of kwargs (adding an unused argument to Index solves things).
  2. Removing **kwargs from Index.__new__ solves things.

@TomAugspurger
Contributor Author

TomAugspurger commented Oct 24, 2019

May be onto something... This passes mypy --strict without any errors. I made two changes:

  • added a type to kwargs
  • added a type: Index to the result of Index.__new__

# bug.py
from typing import List, Type, cast, Any, Mapping, Hashable


class Base:
    pass


class Series(Base):
    def __init__(self, a: Any = None, z: bool=False) -> None:
        pass


class Index(Base):
    def __new__(cls, a: Any = None, **kwargs: Mapping[Hashable, Any]) -> "Index":
        result = object.__new__(cls)  # type: Index
        return result


reveal_type([Index, Series])
[Index, Series]
cast(List[Type[Base]], [Index, Series])

@WillAyd
Member

WillAyd commented Oct 24, 2019

I asked about this on the typing Gitter and one of the devs shared this as where the "best" super type determination gets made:

https://github.com/python/mypy/blob/09c1fc7a4c19fe76c2ca9f0c58eaeccfe1ef9ed5/mypy/join.py#L340

Obviously here we have PandasObject and IndexOpsMixin as shared super types, and in this case it appears mypy infers PandasObject

I haven't stepped through the code, but it may provide some insight. In any case, from the comments it seems like this behavior might be tenuous at best, so it is something to consider.
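
To make the ambiguity concrete, the sketch below lists the base classes two classes have in common. It is explicitly not mypy's join algorithm, only an illustration that more than one shared supertype can exist, so a checker has to pick one. The class hierarchy is a simplified, hypothetical stand-in and shared_bases is a helper invented for this example.

```python
from typing import List


class PandasObject:
    """Hypothetical stand-in."""


class IndexOpsMixin:
    """Hypothetical stand-in."""


class Index(IndexOpsMixin, PandasObject):
    pass


class Series(IndexOpsMixin, PandasObject):
    pass


def shared_bases(left: type, right: type) -> List[type]:
    """Return classes present in both MROs, in left's MRO order, excluding object."""
    right_mro = set(right.__mro__)
    return [cls for cls in left.__mro__ if cls in right_mro and cls is not object]


# Both IndexOpsMixin and PandasObject qualify as a common supertype.
candidates = shared_bases(Index, Series)
```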

@simonjayhawkins
Member

might be clearer to use return cast("Index", result) instead of # type: Index
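
A minimal sketch of the cast form, using hypothetical stand-in classes. The **kwargs: Any annotation is an assumption made for this example, not the annotation used earlier in the thread.

```python
from typing import Any, cast


class Base:
    pass


class Index(Base):
    def __new__(cls, a: Any = None, **kwargs: Any) -> "Index":
        # The extra arguments are deliberately not forwarded to object.__new__.
        result = object.__new__(cls)
        # cast() does nothing at runtime; it only tells the checker the type.
        return cast("Index", result)


idx = Index(a=1, name="x")
```

Unlike a type comment on the assignment, the cast sits on the returned expression, so the intended type is visible at the point where it matters.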

@TomAugspurger
Contributor Author

TomAugspurger commented Oct 24, 2019

Unfortunately, I can't get the suggestion in #29205 (comment) to type **kwargs and Index.__new__ working on my branch.

@TomAugspurger
Contributor Author

Given that this seems like a mypy bug, any objection to just adding # type: ignore to the lines causing issues in #28394? Hopefully that's OK since they're just test files...

@simonjayhawkins
Member

My preference would be a # type: ignore over changing things if it may be a mypy issue.

--warn-unused-ignores can be used to find and remove the ignores once the issue is fixed in a mypy release, so I don't think it will be difficult to track.

@WillAyd
Member

WillAyd commented Oct 24, 2019

Given that this seems like a mypy bug,

I don't think this is a bug. It's just ambiguous as to what our intent is here.

Does making the TypeVar point to PandasObject work? I guess having a TypeVar use a Mixin might not make a ton of sense anyway

@simonjayhawkins
Member

another alternative to # type: ignore could be to simply replace [Index, Series] with cast(Any, [Index, Series]) as an interim measure.

@simonjayhawkins
Member

I don't think this is a bug.

from typing import List, Type, cast


class Base:
    pass


class Series(Base):
    def __init__(self, a=None, z=None):
        pass


class Index(Base):
    def __new__(cls, a=None, **kwargs) -> "Index":
        result = object.__new__(cls)
        return result


reveal_type([Index, Series])

cast(List[Type[Base]], [Index, Series])

b: List[Type[Base]] = [Index, Series]
reveal_type(b)

$ mypy bug.py
bug.py:19: note: Revealed type is 'builtins.list[def (a: Any =, Any =) -> bug.Base]'
bug.py:19: error: List item 0 has incompatible type "Type[Index]"; expected "Type[Base]"
bug.py:24: note: Revealed type is 'builtins.list[Type[bug.Base]]'

The assignment to b does not fail. Is this not the type that mypy should be inferring?
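
For comparison, the three interim options raised in this thread can be put side by side. This is a sketch on the reduced classes from above. The variable names are invented for illustration, and only the annotated assignment is shown as passing in this thread; the other two are the suggestions as stated, not verified here.

```python
from typing import Any, List, Type, cast


class Base:
    pass


class Series(Base):
    def __init__(self, a=None, z=None):
        pass


class Index(Base):
    def __new__(cls, a=None, **kwargs) -> "Index":
        return object.__new__(cls)


# Option 1: silence the checker on the offending line.
ignored = [Index, Series]  # type: ignore

# Option 2: erase the type of the list expression.
erased = cast(Any, [Index, Series])

# Option 3: state the intended element type on assignment.
annotated: List[Type[Base]] = [Index, Series]
```

All three produce the same list at runtime; they differ only in what the type checker is told.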

@WillAyd
Member

WillAyd commented Oct 24, 2019

That's a great example. I think the expression will always be ambiguous as to the intent whereas in the assignment we can clearly state what is expected. Can't we do that here in this PR but change List[Type[IndexOrSeries]] to just be List[Type[PandasObject]] then? I don't think it makes sense to use a Mixin here anyway, especially since these are really being used as constructors

A Literal Type[Series] or Type[Index] would probably be best, but we don't have that feature yet (though maybe we should). If Type[PandasObject] doesn't work then sure, we can ignore; I guess I'm just hesitant to do that because it's not clear what would need to happen to ever un-ignore.

@simonjayhawkins
Member

I guess I'm just hesitant to do that because it's not clear what would need to happen to ever un-ignore

To be clear: mypy is inferring a type for a list of two items, and then immediately producing an error with the inferred type.

i.e. reveal_type([Index, Series]) produces an error

The message in the error includes expected "Type[Base]", and yet explicitly specifying the type as this type does not produce an error, i.e. b: List[Type[Base]] = [Index, Series]

So I'm pretty sure this is a bug, and what would need to happen to un-ignore is for the bug to be fixed.

@TomAugspurger
Contributor Author

I think I'll close this. Will ignore as needed on my other PR.

Labels
Typing type annotations, mypy/pyright type checking
4 participants