Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update signature for _arrayfunction.__array__ #9237

Merged
merged 19 commits into from
Jul 21, 2024

Conversation

Illviljan
Copy link
Contributor

@Illviljan Illviljan commented Jul 11, 2024

I believe numpy updated the signature of __array__ in numpy2.
It currently looks like this now: https://github.com/numpy/numpy/blob/de1ca2676e0f1db5a1b95e1d05e86ca87321cac2/numpy/__init__.pyi#L1471-L1478

xref #9231, #9252

@Illviljan Illviljan changed the title Update test_namedarray.py Fix typing in test_namedarray.py Jul 11, 2024
@Illviljan
Copy link
Contributor Author

Attempting to fix these:

xarray/tests/test_namedarray.py: note: In function "check_duck_array_typevar":
xarray/tests/test_namedarray.py:82: note: Revealed type is "Union[xarray.namedarray._typing._arrayfunction[Any, _DType`-1], xarray.namedarray._typing._arrayapi[Any, _DType`-1]]"
xarray/tests/test_namedarray.py: note: In class "TestNamedArray":
xarray/tests/test_namedarray.py:185: error: Argument 2 to "NamedArray" has incompatible type "ndarray[Any, dtype[Any]]"; expected "_arrayfunction[Any, Never] | _arrayapi[Any, Never]"  [arg-type]
xarray/tests/test_namedarray.py: note: In member "test_duck_array_class" of class "TestNamedArray":
xarray/tests/test_namedarray.py:346: error: Argument 1 to "check_duck_array_typevar" has incompatible type "ndarray[Any, dtype[signedinteger[_64Bit]]]"; expected "_arrayfunction[Any, Never] | _arrayapi[Any, Never]"  [arg-type]
xarray/tests/test_namedarray.py: note: In member "test_warn_on_repeated_dimension_names" of class "TestNamedArray":
xarray/tests/test_namedarray.py:565: error: Argument 2 to "NamedArray" has incompatible type "ndarray[Any, dtype[signedinteger[Any]]]"; expected "_arrayfunction[Any, Never] | _arrayapi[Any, Never]"  [arg-type]

@Illviljan
Copy link
Contributor Author

Illviljan commented Jul 16, 2024

I'm struggling to understand what mypy complains about here.

  • mypy seems unable to narrow to the correct dtype for some reason. What are typical reasons to get Never ?
  • mypy passes if the typing is explictly defined, fails if it's only implicit. What are typical reasons to require annotating variables ?
  • pyright seems to handle this example fine though.

Here's a simplified example:

def test_mypy_typing_fail() -> None:
    from typing import Generic, TypeVar
    from xarray.namedarray._typing import _Dims

    _ST = TypeVar("_ST", bound=Any)
    _DT = TypeVar("_DT", bound=Any)

    class TestArray(Generic[_ST, _DT]):
        __slots__ = ("_data", "_dims", "_attrs")

        _data: duckarray[_ST, _DT]
        _dims: _Dims
        _attrs: dict[Any, Any] | None

        def __init__(
            self,
            dims: _Dims,
            data: duckarray[_ST, _DT],
            attrs: dict[Any, Any] = {},
        ):
            self._data = data
            self._dims = dims
            self._attrs = attrs

    data: np.ndarray[Any, np.dtype[np.int64]] = np.array([1, 2, 3], dtype=np.int64)
    # ta: TestArray[Any, np.dtype[np.int64]] # mypy fails if commented out, why?
    ta = TestArray(("time",), data)

    # mypy:
    # error: Need type annotation for "ta"  [var-annotated]
    # error: Argument 2 to "TestArray" has incompatible type "ndarray[Any, dtype[signedinteger[_64Bit]]]"; expected "_arrayfunction[Never, Never] | _arrayapi[Never, Never]"  [arg-type]

    # pyright - pass:
    # reveal_type(ta)  # information: Type of "ta" is "TestArray[_ShapeType_co@_arrayapi, dtype[signedinteger[_64Bit]]]"

@headtr1ck
Copy link
Collaborator

headtr1ck commented Jul 16, 2024

Thats weird... I cannot reproduce a minimal example that does not use duckarray...

Also reveal_type(duckarray)

Revealed type is "typing._SpecialForm"

is not very helpful...

@Illviljan
Copy link
Contributor Author

Illviljan commented Jul 18, 2024

Also reveal_type(duckarray)

Revealed type is "typing._SpecialForm"

is not very helpful...

Interesting, maybe due to the Union? This simpler example does the same at least:

int_and_float = Union[int, float]
reveal_type(int_and_float)  #  note: Revealed type is "typing._SpecialForm"

Compare pyright:

int_and_float = Union[int, float]
reveal_type(int_and_float)  #  information: Type of "int_and_float" is "type[int] | type[float]"

@Illviljan
Copy link
Contributor Author

Illviljan commented Jul 19, 2024

An interesting note. The duckarray protocol has to use all the typevars to be able to properly inherit values without explicit definition. Since numpy doesn't do that yet with _ShapeType I haven't pushed it either.

This example works but commenting out shape from the Protocol will start forcing explicit definitions:

from typing import Any, Generic, Protocol, TypeVar


_ST = TypeVar("_ST", bound=Any)
_ST_co = TypeVar("_ST_co", bound=Any, covariant=True)

_DT = TypeVar("_DT", bound=Any)
_DT_co = TypeVar("_DT_co", bound=Any, covariant=True)


class ArrayProtocol(Protocol[_ST_co, _DT_co]):
    @property
    def dtype(self) -> _DT_co: ...

    @property
    def shape(self) -> _ST_co: ...  # Commenting out shape will start to require explicit definition


class Array(Generic[_ST, _DT]):
    def __init__(self, shape: _ST, dtype: _DT) -> None:
        self._shape = shape
        self._dtype = dtype

    @property
    def shape(self) -> _ST:
        return self._shape

    @property
    def dtype(self) -> _DT:
        return self._dtype

    def count(self, value: Any, /) -> int:
        return 2


def test_pass_typevars(a: ArrayProtocol[_ST, _DT]) -> ArrayProtocol[_ST, _DT]:
    return a


arr = Array(shape=(3,), dtype=int)
reveal_type(arr)  # information: Type of "arr" is "Array[tuple[Literal[3]], type[int]]"

array = test_pass_typevars(a=arr)
reveal_type(array)  #  information: Type of "array" is "ArrayProtocol[tuple[Literal[3]], type[int]]"

@headtr1ck
Copy link
Collaborator

This example works but commenting out shape from the Protocol will start forcing explicit definitions:

That is expected, because the shape TypeVar is unbound then.

Maybe you could add a definition/overload of __new__ that always returns it with shape=Any?
Not sure how numpy handles this problem currently...

@Illviljan
Copy link
Contributor Author

Illviljan commented Jul 19, 2024

Wow, what a journey. Turns out it was _arrayfunction.__array__ that didn't have the correct signature, that caused the typevars to somehow stay unbounded (Never/Any).

Debug method was to turn off methods in _arrayfunction until the dtype typevars worked again. The clue was I managed to get it to work as intended by replacing with very basic Protocols.

@Illviljan Illviljan marked this pull request as ready for review July 19, 2024 21:59
@Illviljan Illviljan changed the title Fix typing in test_namedarray.py Update signature for _arrayfunction.__array__ Jul 19, 2024
@Illviljan Illviljan mentioned this pull request Jul 19, 2024
4 tasks
Copy link
Collaborator

@headtr1ck headtr1ck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for figuring this out!

Some verbosity of mypy on why it decides stuff would be useful ...

@headtr1ck headtr1ck added the plan to merge Final call for comments label Jul 20, 2024
@Illviljan Illviljan merged commit 10bb94c into pydata:main Jul 21, 2024
34 checks passed
dcherian added a commit that referenced this pull request Jul 22, 2024
* main:
  add backend intro and how-to diagram (#9175)
  Fix copybutton for multi line examples in double digit ipython cells (#9264)
  Update signature for _arrayfunction.__array__ (#9237)
  Add encode_cf_datetime benchmark (#9262)
  groupby, resample: Deprecate some positional args (#9236)
  Delete ``base`` and ``loffset`` parameters to resample (#9233)
  Update dropna docstring (#9257)
  Grouper, Resampler as public api (#8840)
  Fix mypy on main (#9252)
  fix fallback isdtype method (#9250)
  Enable pandas type checking (#9213)
  Per-variable specification of boolean parameters in open_dataset (#9218)
  test push
  Added a space to the documentation (#9247)
  Fix typing for test_plot.py (#9234)
dcherian added a commit that referenced this pull request Jul 24, 2024
* main: (54 commits)
  Adding `open_datatree` backend-specific keyword arguments (#9199)
  [pre-commit.ci] pre-commit autoupdate (#9202)
  Restore ability to specify _FillValue as Python native integers (#9258)
  add backend intro and how-to diagram (#9175)
  Fix copybutton for multi line examples in double digit ipython cells (#9264)
  Update signature for _arrayfunction.__array__ (#9237)
  Add encode_cf_datetime benchmark (#9262)
  groupby, resample: Deprecate some positional args (#9236)
  Delete ``base`` and ``loffset`` parameters to resample (#9233)
  Update dropna docstring (#9257)
  Grouper, Resampler as public api (#8840)
  Fix mypy on main (#9252)
  fix fallback isdtype method (#9250)
  Enable pandas type checking (#9213)
  Per-variable specification of boolean parameters in open_dataset (#9218)
  test push
  Added a space to the documentation (#9247)
  Fix typing for test_plot.py (#9234)
  Allow mypy to run in vscode (#9239)
  Revert "Test main push"
  ...
dcherian added a commit to JoelJaeschke/xarray that referenced this pull request Jul 25, 2024
…monotonic-variable

* main: (995 commits)
  Adding `open_datatree` backend-specific keyword arguments (pydata#9199)
  [pre-commit.ci] pre-commit autoupdate (pydata#9202)
  Restore ability to specify _FillValue as Python native integers (pydata#9258)
  add backend intro and how-to diagram (pydata#9175)
  Fix copybutton for multi line examples in double digit ipython cells (pydata#9264)
  Update signature for _arrayfunction.__array__ (pydata#9237)
  Add encode_cf_datetime benchmark (pydata#9262)
  groupby, resample: Deprecate some positional args (pydata#9236)
  Delete ``base`` and ``loffset`` parameters to resample (pydata#9233)
  Update dropna docstring (pydata#9257)
  Grouper, Resampler as public api (pydata#8840)
  Fix mypy on main (pydata#9252)
  fix fallback isdtype method (pydata#9250)
  Enable pandas type checking (pydata#9213)
  Per-variable specification of boolean parameters in open_dataset (pydata#9218)
  test push
  Added a space to the documentation (pydata#9247)
  Fix typing for test_plot.py (pydata#9234)
  Allow mypy to run in vscode (pydata#9239)
  Revert "Test main push"
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments topic-typing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants