Add high level from_array function in namedarray #8281

Closed
wants to merge 62 commits into from

@Illviljan Illviljan commented Oct 7, 2023

@Illviljan Illviljan changed the title Add from_array function in namedarray Add high level from_array function in namedarray Oct 7, 2023
Illviljan and others added 24 commits October 7, 2023 16:36
Illviljan and others added 24 commits October 8, 2023 04:34

Illviljan commented Oct 10, 2023

Getting a little discouraged now.

Since we use T_DuckArray, that type is locked to whatever the NamedArray was initialized with for the lifetime of the class instance. Array operations change dtypes quite frequently within methods, which makes it hard to type their return values correctly.
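To illustrate the locking problem in isolation, here is a minimal sketch. `Wrapper` and `T` are hypothetical stand-ins for NamedArray and T_DuckArray, not code from this PR.

```python
from __future__ import annotations

from typing import Generic, TypeVar

import numpy as np

T = TypeVar("T")  # hypothetical stand-in for T_DuckArray


class Wrapper(Generic[T]):
    """Hypothetical wrapper that is generic over the whole array type."""

    def __init__(self, data: T) -> None:
        self._data = data

    def unchanged(self) -> Wrapper[T]:
        # Easy to annotate: the array type is the one the instance was created with.
        return Wrapper(self._data)


w = Wrapper(np.array([1.5, 2.5], dtype=np.float64))

# A dtype-changing method would have to return "the same array class as T,
# but with an int8 dtype". That cannot be spelled using T alone, so here the
# conversion is done outside the class, on the raw array.
converted = Wrapper(w._data.astype(np.int8))
```

At runtime this works as expected. The difficulty is purely static: there is no way to derive the `int8` variant of `T` inside a method signature.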

I don't think this T_DuckArray strategy is possible until python/typing#548 (higher-kinded TypeVars) is resolved.
What I would need is a TypeVar within a TypeVar, something like T_DuckArray[Any, np.dtype[T_ScalarType]].
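For comparison, one possible way around the missing feature is to make the wrapper generic over the dtype instead of over the array type. The sketch below is an assumption for illustration only, not what this PR implements: `DTypeNamed`, `_DT_co` and `_ST` are hypothetical names, and the array class is hard-coded to `np.ndarray`, which gives up the duck-array generality that T_DuckArray was meant to provide.

```python
from __future__ import annotations

from typing import Any, Generic, TypeVar

import numpy as np

# Hypothetical type variables, not taken from the PR.
_DT_co = TypeVar("_DT_co", bound=np.dtype[Any], covariant=True)
_ST = TypeVar("_ST", bound=np.generic)


class DTypeNamed(Generic[_DT_co]):
    """Hypothetical wrapper that is generic over the dtype only."""

    def __init__(self, data: np.ndarray[Any, _DT_co]) -> None:
        self._data = data

    @property
    def dtype(self) -> _DT_co:
        return self._data.dtype

    def astype(self, dtype: type[_ST]) -> DTypeNamed[np.dtype[_ST]]:
        # The return type names the new dtype directly, so no
        # higher-kinded TypeVar is needed.
        return DTypeNamed(self._data.astype(dtype))


narr = DTypeNamed(np.array([2, 3, 5], dtype=np.float64))
as_int16 = narr.astype(np.int16)
```

The trade-off is that the concrete array class no longer appears in the type, so a type checker cannot tell whether the wrapped data is a NumPy array or some other duck array.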

I think I'll restart in another branch and keep this one for historical purposes, so I don't fall into the same traps too many more times.

Failing proof of concept code:

from __future__ import annotations

# from collections.abc import Hashable, Iterable, Mapping, Sequence
from typing import Any, Protocol, TypeVar, runtime_checkable, overload, Union, Generic

from typing_extensions import Self

import numpy as np

from numpy.typing import DTypeLike

# https://stackoverflow.com/questions/74633074/how-to-type-hint-a-generic-numpy-array
_T = TypeVar("_T")
_T_co = TypeVar("_T_co", covariant=True)

_DType = TypeVar("_DType", bound=np.dtype[Any])
_DType_co = TypeVar("_DType_co", covariant=True, bound=np.dtype[Any])
_ScalarType = TypeVar("_ScalarType", bound=np.generic)
_ScalarType_co = TypeVar("_ScalarType_co", bound=np.generic, covariant=True)

_IntOrUnknown = int
_Shape = tuple[_IntOrUnknown, ...]
_ShapeType = TypeVar("_ShapeType", bound=Any)
_ShapeType_co = TypeVar("_ShapeType_co", bound=Any, covariant=True)

# A protocol for anything with the dtype attribute
@runtime_checkable
class _SupportsDType(Protocol[_DType_co]):
    @property
    def dtype(self) -> _DType_co:
        ...


_DTypeLike = Union[
    np.dtype[_ScalarType], type[_ScalarType], _SupportsDType[np.dtype[_ScalarType]]
]


@runtime_checkable
class _array(Protocol[_ShapeType_co, _DType_co]):
    @property
    def dtype(self) -> _DType_co:
        ...

    # TODO: Should be -> T_DuckArray[_ScalarType]:
    @overload
    def astype(self, dtype: _DTypeLike[_ScalarType]) -> _Array[_ScalarType]:
        ...

    # TODO: Should be -> T_DuckArray[Any]:
    @overload
    def astype(self, dtype: DTypeLike) -> _Array[Any]:
        ...


_Array = _array[Any, np.dtype[_ScalarType_co]]
T_DuckArray = TypeVar("T_DuckArray", bound=_Array[np.generic], covariant=True)


class Named(Generic[T_DuckArray]):
    _data: T_DuckArray

    def __init__(self, data: T_DuckArray) -> None:
        self._data = data

    @property
    def dtype(self) -> np.dtype[Any]:
        return self._data.dtype

    @overload
    def astype(self, dtype: _DTypeLike[_ScalarType]) -> _Named[_Array[_ScalarType]]:
        ...

    @overload
    def astype(self, dtype: DTypeLike) -> _Named[_Array[Any]]:
        ...

    def astype(
        self, dtype: _DTypeLike[_ScalarType] | DTypeLike
    ) -> _Named[_Array[_ScalarType]] | _Named[_Array[Any]]:
        # Mypy keeps expecting the T_DuckArray, pyright thinks it's ok.
        return type(self)(self._data.astype(dtype))  # type: ignore[arg-type]


_Named = Named

a = np.array([2, 3, 5], dtype=np.float64)
reveal_type(
    a
)  # note: Revealed type is "numpy.ndarray[Any, numpy.dtype[numpy.floating[numpy._typing._64Bit]]]"
narr = Named(a)
reveal_type(
    narr
)  # note: Revealed type is "Named[numpy.ndarray[Any, numpy.dtype[numpy.floating[numpy._typing._64Bit]]]]"
reveal_type(
    narr.astype(np.dtype(np.int8))
)  # note: Revealed type is "Named[_array[Any, numpy.dtype[numpy.signedinteger[numpy._typing._8Bit]]]]"
reveal_type(
    narr.astype(np.int16)
)  # note: Revealed type is "Named[_array[Any, numpy.dtype[numpy.signedinteger[numpy._typing._16Bit]]]]"
