initial refactor for NamedArray #8075

Merged: 83 commits, Sep 27, 2023
Changes from 12 commits
Commits
81098cf
initial prototype for NamedArray
andersy005 Aug 16, 2023
27910dc
move NDArrayMixin and NdimSizeLenMixin inside named_array
andersy005 Aug 16, 2023
1a02dac
vendor is_duck_dask_array
andersy005 Aug 16, 2023
636b156
vendor Frozen object
andersy005 Aug 16, 2023
9ba6c84
update import
andersy005 Aug 17, 2023
b1a1de0
move _default sentinel value
andersy005 Aug 17, 2023
1e11e87
rename subpackage to namedarray per @TomNicholas suggestion
andersy005 Aug 17, 2023
ad364f0
Remove NdimSizeLenMixin
andersy005 Aug 17, 2023
d1e8d2a
fix typing
andersy005 Aug 17, 2023
5654063
Merge branch 'main' into named-array
andersy005 Aug 17, 2023
098eb0c
add annotations
andersy005 Aug 17, 2023
38c105a
Remove NDArrayMixin
andersy005 Aug 17, 2023
1fdd281
Apply suggestions from code review
andersy005 Aug 18, 2023
7060268
Merge branch 'main' into named-array
andersy005 Aug 18, 2023
33c2216
fix typing
andersy005 Aug 21, 2023
2c9223d
fix return type
andersy005 Aug 21, 2023
0e4afe0
revert NDArrayMixin
andersy005 Aug 22, 2023
ab79fb1
[WIP] as_compatible_data refactor
dcherian Aug 22, 2023
e70e98a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 22, 2023
a393d7f
duplicate sentinel value and leave the original sentinel object alone
andersy005 Aug 23, 2023
7b8316e
Apply suggestions from code review
andersy005 Aug 23, 2023
d74b802
use DuckArray
andersy005 Aug 23, 2023
acfdb90
Apply suggestions from code review
andersy005 Aug 23, 2023
d8b79eb
Merge branch 'main' into named-array
andersy005 Aug 23, 2023
2ece3c0
use sentinel value from xarray
andersy005 Aug 23, 2023
6fb79e6
remove unused code
andersy005 Aug 23, 2023
9545ca2
fix variable constructor
andersy005 Aug 23, 2023
e41a27c
fix as_compatible_data utility function
andersy005 Aug 23, 2023
259e0bd
move _to_dense and _non_zero to NamedArray
andersy005 Aug 23, 2023
a7ec770
more typing
andersy005 Aug 24, 2023
c55f35a
add initial tests
andersy005 Aug 24, 2023
2335bba
Merge branch 'main' into named-array
andersy005 Aug 30, 2023
34a262a
Apply suggestions from code review
andersy005 Aug 31, 2023
4b22b29
Merge branch 'main' into named-array
andersy005 Aug 31, 2023
790bfc2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 31, 2023
b909c87
Merge branch 'main' into pr/8075
Illviljan Sep 11, 2023
a31da00
attempt to fix some mypy errors
Illviljan Sep 11, 2023
b6c0af5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 11, 2023
b1e42aa
Update core.py
Illviljan Sep 11, 2023
45f9d99
Merge branch 'named-array' of https://github.com/andersy005/xarray in…
Illviljan Sep 11, 2023
2661001
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 11, 2023
b2a1cda
Update core.py
Illviljan Sep 11, 2023
d2971cc
Merge branch 'named-array' of https://github.com/andersy005/xarray in…
Illviljan Sep 11, 2023
b25a8ff
All input data can be arraylike
Illviljan Sep 11, 2023
06d77ad
Update core.py
Illviljan Sep 11, 2023
96ac4ec
Update core.py
Illviljan Sep 11, 2023
760cb48
get and set attrs at the same level.
Illviljan Sep 11, 2023
15c7300
data doesn't have to be ndarray
Illviljan Sep 11, 2023
bbe3db4
avoid redefining typing use new variable names instead
Illviljan Sep 11, 2023
2233662
import on runtime as well to be able to cast
Illviljan Sep 11, 2023
fb2ca4d
requires ufunc and function to be a valid duck array
Illviljan Sep 11, 2023
cf91823
Add array_namespace
Illviljan Sep 15, 2023
f21297b
Update test_dataset.py
Illviljan Sep 15, 2023
4fafb02
Update test_dataset.py
Illviljan Sep 15, 2023
c07fa0d
Merge branch 'main' into named-array
andersy005 Sep 15, 2023
c5fb91d
remove Frozen
andersy005 Sep 15, 2023
f2d3c95
Merge branch 'main' into named-array
andersy005 Sep 19, 2023
abc02c5
Merge branch 'main' into named-array
andersy005 Sep 19, 2023
4708ca2
update tests
andersy005 Sep 19, 2023
ff1b4de
update tests
andersy005 Sep 20, 2023
5455a44
Merge branch 'main' into named-array
andersy005 Sep 20, 2023
2162063
switch to functional API
andersy005 Sep 20, 2023
e530dd1
add fastpath
andersy005 Sep 20, 2023
9b3590c
Merge branch 'main' into named-array
andersy005 Sep 20, 2023
0f42857
Test making sizes dict[Hashable, int]
Illviljan Sep 20, 2023
afc7228
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 20, 2023
32ec4ea
A lot of errors... Try Mapping instead
Illviljan Sep 20, 2023
76bb881
Update groupby.py
Illviljan Sep 20, 2023
2d59cf5
Merge branch 'main' into named-array
andersy005 Sep 21, 2023
df77741
Update types.py
Illviljan Sep 21, 2023
8bf13b5
Apply suggestions from code review
andersy005 Sep 25, 2023
89a0010
Merge branch 'main' into named-array
andersy005 Sep 25, 2023
2f0192f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 25, 2023
3f22902
update docstrings
andersy005 Sep 25, 2023
f618625
update error messages
andersy005 Sep 25, 2023
94bf6c4
update tests
andersy005 Sep 25, 2023
0ec7876
test explicitly index array
andersy005 Sep 25, 2023
fb4ed12
update tests
andersy005 Sep 25, 2023
f0cfc11
remove unused types
andersy005 Sep 25, 2023
48fcf9b
Update xarray/tests/test_namedarray.py
andersy005 Sep 26, 2023
5f4e127
Merge branch 'main' into named-array
andersy005 Sep 26, 2023
2d9d7ff
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 26, 2023
2ef5064
use Self
andersy005 Sep 26, 2023
35 changes: 33 additions & 2 deletions xarray/backends/common.py
@@ -1,6 +1,7 @@
 from __future__ import annotations

 import logging
+import math
 import os
 import time
 import traceback
@@ -14,7 +15,7 @@
 from xarray.core import indexing
 from xarray.core.parallelcompat import get_chunked_array_type
 from xarray.core.pycompat import is_chunked_array
-from xarray.core.utils import FrozenDict, NdimSizeLenMixin, is_remote_uri
+from xarray.core.utils import FrozenDict, is_remote_uri

 if TYPE_CHECKING:
     from io import BufferedIOBase
@@ -162,9 +163,39 @@ def robust_getitem(array, key, catch=Exception, max_retries=6, initial_delay=500
             time.sleep(1e-3 * next_delay)


-class BackendArray(NdimSizeLenMixin, indexing.ExplicitlyIndexed):
+class BackendArray(indexing.ExplicitlyIndexed):
     __slots__ = ()

+    @property
+    def ndim(self: Any) -> int:
+        """
+        Number of array dimensions.
+
+        See Also
+        --------
+        numpy.ndarray.ndim
+        """
+        return len(self.shape)
+
+    @property
+    def size(self: Any) -> int:
+        """
+        Number of elements in the array.
+
+        Equal to ``np.prod(a.shape)``, i.e., the product of the array’s dimensions.
+
+        See Also
+        --------
+        numpy.ndarray.size
+        """
+        return math.prod(self.shape)
+
+    def __len__(self: Any) -> int:
+        try:
+            return self.shape[0]
+        except IndexError:
+            raise TypeError("len() of unsized object")
+
     def get_duck_array(self, dtype: np.typing.DTypeLike = None):
         key = indexing.BasicIndexer((slice(None),) * self.ndim)
         return self[key]  # type: ignore [index]
2 changes: 1 addition & 1 deletion xarray/core/alignment.py
@@ -839,7 +839,7 @@ def is_alignable(obj):
         elif raise_on_invalid:
             raise ValueError(
                 "object to align is neither an xarray.Dataset, "
-                "an xarray.DataArray nor a dictionary: {!r}".format(variables)
+                f"an xarray.DataArray nor a dictionary: {variables!r}"
             )
         else:
             out.append(variables)
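The change above swaps `str.format` with a `{!r}` field for an f-string with the `!r` conversion. A small check that the two spellings produce the same message, using a made-up `variables` value:

```python
# `variables` here is an arbitrary illustrative value, not data from the PR.
variables = [1, "a"]

old_style = "an xarray.DataArray nor a dictionary: {!r}".format(variables)
new_style = f"an xarray.DataArray nor a dictionary: {variables!r}"

# Both apply repr() to the value before interpolating it.
print(old_style == new_style)
```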
3 changes: 1 addition & 2 deletions xarray/core/dataarray.py
@@ -42,10 +42,8 @@
 from xarray.core.merge import PANDAS_TYPES, MergeError
 from xarray.core.options import OPTIONS, _get_keep_attrs
 from xarray.core.utils import (
-    Default,
     HybridMappingProxy,
     ReprObject,
-    _default,
     either_dict_or_kwargs,
     emit_user_level_warning,
 )
@@ -55,6 +53,7 @@
     as_compatible_data,
     as_variable,
 )
+from xarray.namedarray.utils import Default, _default
 from xarray.plot.accessor import DataArrayPlotAccessor
 from xarray.plot.utils import _get_units_from_attrs

3 changes: 1 addition & 2 deletions xarray/core/dataset.py
@@ -91,11 +91,9 @@
 )
 from xarray.core.types import QuantileMethods, T_Dataset
 from xarray.core.utils import (
-    Default,
     Frozen,
     HybridMappingProxy,
     OrderedSet,
-    _default,
     decode_numpy_dict_values,
     drop_dims_from_indexers,
     either_dict_or_kwargs,
@@ -111,6 +109,7 @@
     broadcast_variables,
     calculate_dimensions,
 )
+from xarray.namedarray.utils import Default, _default
 from xarray.plot.accessor import DatasetPlotAccessor

 if TYPE_CHECKING:
93 changes: 89 additions & 4 deletions xarray/core/indexing.py
@@ -2,6 +2,7 @@

 import enum
 import functools
+import math
 import operator
 from collections import Counter, defaultdict
 from collections.abc import Hashable, Iterable, Mapping
@@ -26,7 +27,6 @@
 )
 from xarray.core.types import T_Xarray
 from xarray.core.utils import (
-    NDArrayMixin,
     either_dict_or_kwargs,
     get_valid_numpy_dtype,
     is_scalar,
@@ -458,9 +458,53 @@ def get_duck_array(self):
         return self.array


-class ExplicitlyIndexedNDArrayMixin(NDArrayMixin, ExplicitlyIndexed):
+class ExplicitlyIndexedNDArrayMixin(ExplicitlyIndexed):
     __slots__ = ()

+    @property
+    def ndim(self: Any) -> int:
+        """
+        Number of array dimensions.
+
+        See Also
+        --------
+        numpy.ndarray.ndim
+        """
+        return len(self.shape)
+
+    @property
+    def size(self: Any) -> int:
+        """
+        Number of elements in the array.
+
+        Equal to ``np.prod(a.shape)``, i.e., the product of the array’s dimensions.
+
+        See Also
+        --------
+        numpy.ndarray.size
+        """
+        return math.prod(self.shape)
+
+    def __len__(self: Any) -> int:
+        try:
+            return self.shape[0]
+        except IndexError:
+            raise TypeError("len() of unsized object")
+
+    @property
+    def dtype(self: Any) -> np.dtype:
+        return self.array.dtype
+
+    @property
+    def shape(self: Any) -> tuple[int, ...]:
+        return self.array.shape
+
+    def __getitem__(self: Any, key):
+        return self.array[key]
+
+    def __repr__(self: Any) -> str:
+        return f"{type(self).__name__}(array={self.array!r})"
+
     def get_duck_array(self):
         key = BasicIndexer((slice(None),) * self.ndim)
         return self[key]
@@ -471,7 +515,7 @@ def __array__(self, dtype: np.typing.DTypeLike = None) -> np.ndarray:
         return np.asarray(self.get_duck_array(), dtype=dtype)


-class ImplicitToExplicitIndexingAdapter(NDArrayMixin):
+class ImplicitToExplicitIndexingAdapter:
     """Wrap an array, converting tuples into the indicated explicit indexer."""

     __slots__ = ("array", "indexer_cls")
@@ -483,6 +527,47 @@ def __init__(self, array, indexer_cls=BasicIndexer):
     def __array__(self, dtype: np.typing.DTypeLike = None) -> np.ndarray:
         return np.asarray(self.get_duck_array(), dtype=dtype)

+    @property
+    def ndim(self: Any) -> int:
+        """
+        Number of array dimensions.
+
+        See Also
+        --------
+        numpy.ndarray.ndim
+        """
+        return len(self.shape)
+
+    @property
+    def size(self: Any) -> int:
+        """
+        Number of elements in the array.
+
+        Equal to ``np.prod(a.shape)``, i.e., the product of the array’s dimensions.
+
+        See Also
+        --------
+        numpy.ndarray.size
+        """
+        return math.prod(self.shape)
+
+    def __len__(self: Any) -> int:
+        try:
+            return self.shape[0]
+        except IndexError:
+            raise TypeError("len() of unsized object")
+
+    @property
+    def dtype(self: Any) -> np.dtype:
+        return self.array.dtype
+
+    @property
+    def shape(self: Any) -> tuple[int, ...]:
+        return self.array.shape
+
+    def __repr__(self: Any) -> str:
+        return f"{type(self).__name__}(array={self.array!r})"
+
     def get_duck_array(self):
         return self.array.get_duck_array()

@@ -1303,7 +1388,7 @@ def __init__(self, array):
         if not isinstance(array, np.ndarray):
             raise TypeError(
                 "NumpyIndexingAdapter only wraps np.ndarray. "
-                "Trying to wrap {}".format(type(array))
+                f"Trying to wrap {type(array)}"
             )
         self.array = array

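The indexing classes above now spell out the forwarding to `self.array` themselves instead of inheriting it from `NDArrayMixin`. A rough sketch of that delegation pattern, assuming a toy inner array; `ToyArray` and `ToyWrapper` are hypothetical names and neither exists in xarray.

```python
from dataclasses import dataclass


@dataclass
class ToyArray:
    """Hypothetical stand-in for a real array type (not part of xarray)."""

    values: list
    dtype: str

    @property
    def shape(self) -> tuple:
        return (len(self.values),)

    def __getitem__(self, key):
        return self.values[key]


class ToyWrapper:
    """Hypothetical wrapper that forwards to ``self.array`` without a mixin base."""

    __slots__ = ("array",)

    def __init__(self, array):
        self.array = array

    @property
    def dtype(self):
        return self.array.dtype

    @property
    def shape(self):
        return self.array.shape

    def __getitem__(self, key):
        return self.array[key]

    def __repr__(self) -> str:
        return f"{type(self).__name__}(array={self.array!r})"


wrapped = ToyWrapper(ToyArray([10, 20, 30], dtype="int64"))
print(wrapped.shape, wrapped.dtype, wrapped[1])
print(wrapped)
```

The trade-off visible in the diff is duplication: each class repeats the same handful of members, in exchange for not depending on a shared mixin in `xarray.core.utils`.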
73 changes: 0 additions & 73 deletions xarray/core/utils.py
@@ -41,7 +41,6 @@
 import inspect
 import io
 import itertools
-import math
 import os
 import re
 import sys
@@ -57,7 +56,6 @@
     MutableSet,
     Sequence,
 )
-from enum import Enum
 from typing import (
     TYPE_CHECKING,
     Any,
@@ -542,69 +540,6 @@ def __repr__(self) -> str:
         return f"{type(self).__name__}({list(self)!r})"


-class NdimSizeLenMixin:
-    """Mixin class that extends a class that defines a ``shape`` property to
-    one that also defines ``ndim``, ``size`` and ``__len__``.
-    """
-
-    __slots__ = ()
-
-    @property
-    def ndim(self: Any) -> int:
-        """
-        Number of array dimensions.
-
-        See Also
-        --------
-        numpy.ndarray.ndim
-        """
-        return len(self.shape)
-
-    @property
-    def size(self: Any) -> int:
-        """
-        Number of elements in the array.
-
-        Equal to ``np.prod(a.shape)``, i.e., the product of the array’s dimensions.
-
-        See Also
-        --------
-        numpy.ndarray.size
-        """
-        return math.prod(self.shape)
-
-    def __len__(self: Any) -> int:
-        try:
-            return self.shape[0]
-        except IndexError:
-            raise TypeError("len() of unsized object")
-
-
-class NDArrayMixin(NdimSizeLenMixin):
-    """Mixin class for making wrappers of N-dimensional arrays that conform to
-    the ndarray interface required for the data argument to Variable objects.
-
-    A subclass should set the `array` property and override one or more of
-    `dtype`, `shape` and `__getitem__`.
-    """
-
-    __slots__ = ()
-
-    @property
-    def dtype(self: Any) -> np.dtype:
-        return self.array.dtype
-
-    @property
-    def shape(self: Any) -> tuple[int, ...]:
-        return self.array.shape
-
-    def __getitem__(self: Any, key):
-        return self.array[key]
-
-    def __repr__(self: Any) -> str:
-        return f"{type(self).__name__}(array={self.array!r})"
-
-
 class ReprObject:
     """Object that prints as the given value, for use with sentinel values."""

@@ -1108,14 +1043,6 @@ def __get__(self, obj: None | object, cls) -> type[_Accessor] | _Accessor:
         return self._accessor(obj)  # type: ignore # assume it is a valid accessor!


-# Singleton type, as per https://github.com/python/typing/pull/240
-class Default(Enum):
-    token = 0
-
-
-_default = Default.token
-
-
 def iterate_nested(nested_list):
     for item in nested_list:
         if isinstance(item, list):