Automatic duck array testing - reductions #4972

Draft: wants to merge 140 commits into base: main

Commits
21696bf
add a initial, tiny draft of the automatic duckarray test machinery
keewis Feb 27, 2021
f14ba29
add missing comma
keewis Feb 27, 2021
90f9c41
fix the global marks
keewis Feb 27, 2021
aa4a457
don't try to apply marks if marks is None
keewis Feb 27, 2021
9fa2eca
only set pytestmark if the value is not None
keewis Feb 28, 2021
7994bad
skip the module if pint is not installed
keewis Feb 28, 2021
c4a35f0
filter UnitStrippedWarnings
keewis Feb 28, 2021
0efbbbb
also test sparse
keewis Feb 28, 2021
73499b5
add a test for the test extractor
keewis Mar 1, 2021
532f213
move the selector parsing code to a new function
keewis Mar 1, 2021
f44aafa
also skip the sparse tests
keewis Mar 1, 2021
d651438
move the utils tests into a different file
keewis Mar 1, 2021
f84894a
don't keep the utils tests in a test group
keewis Mar 1, 2021
0090db5
split apply_marks into two separate functions
keewis Mar 1, 2021
ef05c7d
add a mark which attaches marks to test variants
keewis Mar 6, 2021
20334d9
move the duckarray testing module to tests
keewis Mar 6, 2021
f7acc0f
move the utils to a separate module
keewis Mar 6, 2021
e41a15b
fix the existing tests
keewis Mar 6, 2021
1f095a1
completely isolate the apply_marks tests
keewis Mar 6, 2021
2503af7
add a test for applying marks to test variants
keewis Mar 6, 2021
b229645
skip failing test variants
keewis Mar 6, 2021
0723418
fix the import path
keewis Mar 6, 2021
6c4ccb0
rename the duckarray testing module
keewis Mar 9, 2021
c4aa05a
use Variable as example
keewis Mar 9, 2021
fc97e90
fix the skips
keewis Mar 9, 2021
31e577a
only use dimensionless for cumprod
keewis Mar 9, 2021
8d80212
also test dask wrapped by pint
keewis Mar 9, 2021
7c43e91
add a function to concatenate mappings
keewis Mar 9, 2021
b6a90df
add tests for preprocess_marks
keewis Mar 9, 2021
a95b5c4
fix the tests
keewis Mar 9, 2021
aa7caaa
show the duplicates in the error message
keewis Mar 9, 2021
6415be8
add back support for test marks
keewis Mar 9, 2021
de25594
allow passing a list of addition assert functions
keewis Mar 9, 2021
1b0f372
add some notes about the test suite
keewis Mar 9, 2021
706ee54
simplify the extra_assert function
keewis Mar 9, 2021
caf6308
Merge branch 'master' into duckarray-tests
keewis Mar 22, 2021
cd5aa70
convert to hypothesis
keewis Mar 24, 2021
08d72ed
add a marker to convert label-space parameters
keewis Mar 27, 2021
0649d59
add a dummy expect_error function
keewis Mar 27, 2021
c32cb5a
compute actual before expected
keewis Mar 27, 2021
440e0bd
pass a strategy instead of a single dtype
keewis Mar 27, 2021
f74a29c
set a default for expect_error
keewis Mar 27, 2021
d9346f8
add a test for clip
keewis Mar 27, 2021
75f584a
allow passing a separate "create_label" function
keewis Mar 31, 2021
b94c84d
draft the base class hierarchy tailored after pandas' extension array…
keewis Apr 10, 2021
7a150f8
make sure multiple dims are passed as a list
keewis Apr 12, 2021
a6eecb8
sort the dtypes differently
keewis Apr 12, 2021
dcb9fc0
add a strategy to generate a single axis
keewis Apr 12, 2021
0ee096d
add a function to compute the axes from the dims
keewis Apr 12, 2021
53debb2
move the call of the operation to a hook
keewis Apr 12, 2021
9b2c0a3
remove the arg* methods since they are not reducing anything
keewis Apr 12, 2021
a6efbe1
add a context manager to suppress specific warnings
keewis Apr 12, 2021
5d679bf
don't try to reduce along multiple dimensions
keewis Apr 12, 2021
50db3c3
demonstrate the new pattern using pint
keewis Apr 12, 2021
8114d2a
fix the sparse tests
keewis Apr 12, 2021
6afc7c3
Merge branch 'duckarray-testing-baseclasses' into duckarray-tests
keewis Apr 20, 2021
2f084d0
Merge branch 'master' into duckarray-tests
keewis Apr 20, 2021
14349f2
back to only raising for UnitStrippedWarning
keewis Apr 20, 2021
6a658ef
remove the old duckarray testing module
keewis Apr 22, 2021
d595cd6
rename the tests
keewis Apr 22, 2021
2b0dcba
add a mark to skip individual test nodes
keewis Apr 22, 2021
6b61900
skip the prod and std tests
keewis Apr 22, 2021
b9535a1
skip all sparse tests for now
keewis Apr 22, 2021
c675f8d
also skip var
keewis Apr 22, 2021
6e7c538
add a duckarray base class
keewis Apr 22, 2021
e0ee7a6
move the strategies to a separate file and add a variable strategy
keewis Apr 22, 2021
3feef1c
add a simple DataArray strategy and use it in the DataArray tests
keewis Apr 22, 2021
2a70c38
use the DataArray reduce tests with pint
keewis Apr 22, 2021
6a18acf
add a simple strategy to create Dataset objects
keewis Apr 22, 2021
835930c
fix the variable strategy
keewis Apr 23, 2021
0a5c487
adjust the dataset strategy
keewis Apr 23, 2021
d1184a4
parametrize the dataset strategy
keewis Apr 23, 2021
12b5527
fix some of the pint testing utils
keewis Apr 23, 2021
1f95318
use flatmap to define the data_vars strategy
keewis Apr 23, 2021
9800db5
add tests for dataset reduce
keewis Apr 23, 2021
c43f35e
demonstrate the use of the dataset reduce tests using pint
keewis Apr 23, 2021
d1b541e
simplify check_reduce
keewis Apr 23, 2021
19d9d96
remove apparently unnecessary skips
keewis Apr 23, 2021
69e0624
skip the tests if hypothesis is missing
keewis Apr 23, 2021
c7f6677
update the sparse tests
keewis Apr 23, 2021
396c2ba
add DataArray and Dataset tests for sparse
keewis Apr 23, 2021
ead706e
fix attach_units
keewis Apr 23, 2021
3cf9523
rename the test classes
keewis Apr 23, 2021
cd132c6
update a few strategies
keewis Apr 23, 2021
1c310b0
fix the strategies and utils
keewis Apr 23, 2021
7f879b0
use allclose instead of identical to compare
keewis Apr 23, 2021
ff91be8
don't provide a default for shape
keewis Apr 25, 2021
cb286ef
remove the function to generate dimension names
keewis Apr 25, 2021
438f8a5
simplify the generation of the dimension sizes
keewis Apr 25, 2021
01814ff
immediately draw the computed dimension sizes
keewis Apr 26, 2021
0f1222e
convert the sizes to a dict when making sure data vars are not dims
keewis Apr 26, 2021
a38a307
align the default maximum number of dimensions
keewis Apr 26, 2021
ea3d015
draw the data before passing it to DataArray
keewis Apr 26, 2021
afa33ac
directly generate the reduce dimensions
keewis Apr 28, 2021
566627a
Merge branch 'master' into duckarray-tests
keewis May 11, 2021
2e0c6bf
disable dim=[] / axis=() because that's not supported by all duckarrays
keewis May 11, 2021
01599b7
skip the sparse tests
keewis May 11, 2021
259e1d5
typo
keewis May 11, 2021
527b17c
use a single dtype for all variables of a dataset
keewis Jun 30, 2021
3437c3d
specify tolerances per dtype
keewis Jun 30, 2021
4866801
abandon the notion of single_dtype=True
keewis Jun 30, 2021
8019a20
limit the values and add dtype specific tolerances
keewis Jun 30, 2021
cc75b46
Merge branch 'main' into duckarray-tests
keewis Jul 1, 2021
566470b
Merge branch 'main' into duckarray-tests
keewis Aug 12, 2021
b0e94f1
disable bottleneck
keewis Aug 13, 2021
e57cd7b
Merge branch 'main' into duckarray-tests
keewis Aug 15, 2021
33f63a7
reduce the maximum number of dims, dim sizes, and variables
keewis Aug 15, 2021
11d41e3
disable bottleneck for the sparse tests
keewis Aug 15, 2021
71a37ba
try activating the sparse tests
keewis Aug 15, 2021
1d98fec
propagate the dtypes
keewis Aug 15, 2021
a826249
Merge remote-tracking branch 'upstream/main' into duckarray-tests
dcherian Nov 24, 2021
f2cd8a1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 24, 2021
21a5838
Merge branch 'main' into duckarray-tests
dcherian Jul 22, 2022
c747733
Turn off deadlines
dcherian Jul 22, 2022
3f81995
Disable float16 tests.
dcherian Jul 22, 2022
a282686
Use as_numpy in as_dense
dcherian Jul 22, 2022
ede0045
Merge branch 'main' into duckarray-tests
dcherian Jul 22, 2022
cbf408c
Merge branch 'main' into duckarray-tests
keewis Aug 3, 2022
f5b9bdc
move the hypothesis importorskip to before the strategy definitions
keewis Aug 3, 2022
ed68dc2
properly filter out float16 dtypes for sparse
keewis Aug 3, 2022
1e4f18e
also filter out complex64 because there seems to be a bug in sparse
keewis Aug 8, 2022
da2225f
Merge branch 'main' into duckarray-tests
dcherian Aug 8, 2022
37622c5
Merge branch 'main' into duckarray-tests
dcherian Aug 8, 2022
86377e6
Update xarray/tests/duckarrays/test_sparse.py
dcherian Aug 8, 2022
5af49d8
use the proper base to check the dtypes
keewis Aug 8, 2022
8d0a8c3
Merge branch 'main' into duckarray-tests
TomNicholas Aug 8, 2022
50151a4
make sure the importorskip call is before any hypothesis imports
keewis Aug 8, 2022
707aecb
remove the op parameter to create
keewis Aug 9, 2022
23e4e70
Merge branch 'main' into duckarray-tests
TomNicholas Aug 9, 2022
cc7af83
add special test job for duck arrays
keewis Aug 11, 2022
644bf69
limit the github annotations to just the standard env
keewis Aug 11, 2022
cae9287
don't run duckarray tests unless explicitly requested
keewis Aug 11, 2022
240cbcf
fix the workflow
keewis Aug 11, 2022
aa7d893
one more fix
keewis Aug 11, 2022
fa34023
try not pinning the os for duckarray tests
keewis Aug 11, 2022
b92f272
explicitly write out the new entries
keewis Aug 11, 2022
4b5f63b
try removing the variable evaluation
keewis Aug 11, 2022
909583c
set the environment file by default
keewis Aug 11, 2022
2bcdc83
don't overwrite the default
keewis Aug 11, 2022
9161a7d
Merge branch 'main' into duckarray-tests
TomNicholas Aug 11, 2022
24 changes: 18 additions & 6 deletions .github/workflows/ci.yaml
@@ -59,6 +59,15 @@ jobs:
- env: "flaky"
python-version: "3.10"
os: ubuntu-latest
- env: "duckarrays"
python-version: "3.10"
os: "ubuntu-latest"
- env: "duckarrays"
python-version: "3.10"
os: "windows-latest"
- env: "duckarrays"
python-version: "3.10"
os: "macos-latest"
steps:
- uses: actions/checkout@v3
with:
@@ -70,17 +79,20 @@ jobs:
if [[ ${{ matrix.os }} == windows* ]] ;
then
echo "CONDA_ENV_FILE=ci/requirements/environment-windows.yml" >> $GITHUB_ENV
elif [[ "${{ matrix.env }}" != "" ]] ;
else
echo "CONDA_ENV_FILE=ci/requirements/environment.yml" >> $GITHUB_ENV
fi

if [[ "${{ matrix.env }}" != "" ]] ;
then
if [[ "${{ matrix.env }}" == "flaky" ]] ;
then
if [[ "${{ matrix.env }}" == "flaky" ]] ; then
echo "CONDA_ENV_FILE=ci/requirements/environment.yml" >> $GITHUB_ENV
echo "PYTEST_EXTRA_FLAGS=--run-flaky --run-network-tests" >> $GITHUB_ENV
elif [[ "${{ matrix.env }}" == "duckarrays" ]] ; then
echo "PYTEST_EXTRA_FLAGS=--run-duckarray-tests xarray/tests/duckarrays/" >> $GITHUB_ENV
else
echo "CONDA_ENV_FILE=ci/requirements/${{ matrix.env }}.yml" >> $GITHUB_ENV
fi
else
echo "CONDA_ENV_FILE=ci/requirements/environment.yml" >> $GITHUB_ENV
fi

echo "PYTHON_VERSION=${{ matrix.python-version }}" >> $GITHUB_ENV
@@ -96,7 +108,7 @@ jobs:
# We only want to install this on one run, because otherwise we'll have
# duplicate annotations.
- name: Install error reporter
if: ${{ matrix.os }} == 'ubuntu-latest' and ${{ matrix.python-version }} == '3.10'
if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.10' && matrix.env == ''
run: |
python -m pip install pytest-github-actions-annotate-failures
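
The environment selection in the step above can be sketched in Python. This is only a reading of the new version of the shell logic, assuming the OS default is set first and the "flaky" branch no longer resets CONDA_ENV_FILE; the function name `select_ci_settings` is hypothetical and not part of the workflow.

```python
def select_ci_settings(os_name: str, env: str) -> dict:
    """Mirror the shell step: choose the conda env file and extra pytest flags."""
    settings = {}

    # The default environment file depends only on the operating system.
    if os_name.startswith("windows"):
        settings["CONDA_ENV_FILE"] = "ci/requirements/environment-windows.yml"
    else:
        settings["CONDA_ENV_FILE"] = "ci/requirements/environment.yml"

    # A non-empty matrix.env either adds pytest flags or swaps the env file.
    if env != "":
        if env == "flaky":
            settings["PYTEST_EXTRA_FLAGS"] = "--run-flaky --run-network-tests"
        elif env == "duckarrays":
            settings["PYTEST_EXTRA_FLAGS"] = (
                "--run-duckarray-tests xarray/tests/duckarrays/"
            )
        else:
            settings["CONDA_ENV_FILE"] = f"ci/requirements/{env}.yml"

    return settings
```

With this reading, the "duckarrays" jobs keep the per-OS default environment file and only add pytest flags, which is what lets the same matrix entry run on all three operating systems.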

9 changes: 9 additions & 0 deletions conftest.py
@@ -11,6 +11,11 @@ def pytest_addoption(parser):
        action="store_true",
        help="runs tests requiring a network connection",
    )
    parser.addoption(
        "--run-duckarray-tests",
        action="store_true",
        help="runs the duckarray hypothesis tests",
    )


def pytest_runtest_setup(item):
@@ -21,6 +26,10 @@ def pytest_runtest_setup(item):
        pytest.skip(
            "set --run-network-tests to run test requiring an internet connection"
        )
    if "duckarrays" in item.keywords and not item.config.getoption(
        "--run-duckarray-tests"
    ):
        pytest.skip("set --run-duckarray-tests option to run duckarray tests")


@pytest.fixture(autouse=True)
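
The opt-in check added to pytest_runtest_setup can be exercised without pytest. The sketch below uses SimpleNamespace objects as stand-ins for a test item and its config; `duckarray_skip_reason` and `make_item` are hypothetical helpers written for illustration and are not part of conftest.py.

```python
from types import SimpleNamespace


def duckarray_skip_reason(item):
    """Return the skip message the hook would use, or None if the test runs."""
    if "duckarrays" in item.keywords and not item.config.getoption(
        "--run-duckarray-tests"
    ):
        return "set --run-duckarray-tests option to run duckarray tests"
    return None


def make_item(keywords, enabled):
    """Build a stand-in for a pytest item (an assumption, not the real class)."""
    config = SimpleNamespace(
        getoption=lambda name: enabled if name == "--run-duckarray-tests" else False
    )
    return SimpleNamespace(keywords=set(keywords), config=config)
```

A test carrying the `duckarrays` keyword is skipped unless the flag is given; tests without the keyword are unaffected either way.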
1 change: 1 addition & 0 deletions setup.cfg
@@ -144,6 +144,7 @@ markers =
    flaky: flaky tests
    network: tests requiring a network connection
    slow: slow tests
    duckarrays: duckarray tests

[flake8]
ignore =
42 changes: 42 additions & 0 deletions xarray/tests/conftest.py
@@ -12,6 +12,48 @@ def backend(request):
    return request.param


def pytest_configure(config):
    config.addinivalue_line(
        "markers",
        "apply_marks(marks): function to attach marks to tests and test variants",
    )


def always_sequence(obj):
    if not isinstance(obj, (list, tuple)):
        obj = [obj]

    return obj


def pytest_collection_modifyitems(session, config, items):
    for item in items:
        mark = item.get_closest_marker("apply_marks")
        if mark is None:
            continue

        marks = mark.args[0]
        if not isinstance(marks, dict):
            continue

        possible_marks = marks.get(item.originalname)
        if possible_marks is None:
            continue

        if not isinstance(possible_marks, dict):
            for mark in always_sequence(possible_marks):
                item.add_marker(mark)
            continue

        variant = item.name[len(item.originalname) :]
        to_attach = possible_marks.get(variant)
        if to_attach is None:
            continue

        for mark in always_sequence(to_attach):
            item.add_marker(mark)


@pytest.fixture(params=[1])
def ds(request, backend):
    if request.param == 1:
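
The lookup that pytest_collection_modifyitems performs on the apply_marks mapping can be sketched as a pure function. Here plain strings stand in for pytest mark objects, and `select_marks` and the example mapping are hypothetical; only the lookup rules are taken from the hook above.

```python
def always_sequence(obj):
    if not isinstance(obj, (list, tuple)):
        obj = [obj]
    return obj


def select_marks(marks, originalname, name):
    """Return the marks the hook would attach to the test called ``name``."""
    if not isinstance(marks, dict):
        return []

    possible_marks = marks.get(originalname)
    if possible_marks is None:
        return []

    # A non-dict entry applies to every variant of the test function.
    if not isinstance(possible_marks, dict):
        return list(always_sequence(possible_marks))

    # Otherwise the key is the parametrization suffix, e.g. "[prod]".
    variant = name[len(originalname):]
    to_attach = possible_marks.get(variant)
    if to_attach is None:
        return []
    return list(always_sequence(to_attach))


# Example mapping; the strings are placeholders for pytest.mark objects.
example_marks = {
    "test_reduce": {"[prod]": "skip-prod", "[std]": ["skip-std", "xfail-std"]},
    "test_clip": "skip-clip",
}
```

Note that the variant key includes the square brackets, because it is whatever remains of item.name after stripping item.originalname.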
Empty file.
7 changes: 7 additions & 0 deletions xarray/tests/duckarrays/base/__init__.py
@@ -0,0 +1,7 @@
from .reduce import DataArrayReduceTests, DatasetReduceTests, VariableReduceTests

__all__ = [
    "VariableReduceTests",
    "DataArrayReduceTests",
    "DatasetReduceTests",
]
144 changes: 144 additions & 0 deletions xarray/tests/duckarrays/base/reduce.py
@@ -0,0 +1,144 @@
import hypothesis.strategies as st
import numpy as np
import pytest
from hypothesis import given, note, settings

from ... import assert_identical
from . import strategies


class VariableReduceTests:
    def check_reduce(self, obj, op, *args, **kwargs):
        actual = getattr(obj, op)(*args, **kwargs)

        data = np.asarray(obj.data)
Member

Could we rename instances of arrays to be called arr instead of data? It took me a pretty long time to realise that this had nothing to do with hypothesis' strategies.data.

Collaborator Author

sure, although it would probably be better to call it numpy_arr or move it into the .copy call

        expected = getattr(obj.copy(data=data), op)(*args, **kwargs)

        note(f"actual:\n{actual}")
        note(f"expected:\n{expected}")

        assert_identical(actual, expected)

    @staticmethod
    def create(shape, dtypes):
        # forward dtypes too, matching the DataArray and Dataset variants
        return strategies.numpy_array(shape, dtypes)

    @pytest.mark.parametrize(
        "method",
        (
            "all",
            "any",
            "cumprod",
            "cumsum",
            "max",
            "mean",
            "median",
            "min",
            "prod",
            "std",
            "sum",
            "var",
        ),
    )
    @given(st.data())
    @settings(deadline=None)
    def test_reduce(self, method, data):
        var = data.draw(
            strategies.variable(lambda shape, dtypes: self.create(shape, dtypes))
        )

        reduce_dims = data.draw(strategies.valid_dims(var.dims))

        self.check_reduce(var, method, dim=reduce_dims)


class DataArrayReduceTests:
    def check_reduce(self, obj, op, *args, **kwargs):
        actual = getattr(obj, op)(*args, **kwargs)

        data = np.asarray(obj.data)
        expected = getattr(obj.copy(data=data), op)(*args, **kwargs)

        note(f"actual:\n{actual}")
        note(f"expected:\n{expected}")

        assert_identical(actual, expected)

    @staticmethod
    def create(shape, dtypes):
        # no op parameter: test_reduce calls self.create(shape, dtypes)
        return strategies.numpy_array(shape, dtypes)

    @pytest.mark.parametrize(
        "method",
        (
            "all",
            "any",
            "cumprod",
            "cumsum",
            "max",
            "mean",
            "median",
            "min",
            "prod",
            "std",
            "sum",
            "var",
        ),
    )
    @given(st.data())
    @settings(deadline=None)
    def test_reduce(self, method, data):
        arr = data.draw(
            strategies.data_array(lambda shape, dtypes: self.create(shape, dtypes))
        )

        reduce_dims = data.draw(strategies.valid_dims(arr.dims))

        self.check_reduce(arr, method, dim=reduce_dims)


class DatasetReduceTests:
    def check_reduce(self, obj, op, *args, **kwargs):
        actual = getattr(obj, op)(*args, **kwargs)

        # use a distinct loop variable so that obj is not shadowed
        data = {name: np.asarray(var.data) for name, var in obj.variables.items()}
        expected = getattr(obj.copy(data=data), op)(*args, **kwargs)

        note(f"actual:\n{actual}")
        note(f"expected:\n{expected}")

        assert_identical(actual, expected)

    @staticmethod
    def create(shape, dtypes):
        return strategies.numpy_array(shape, dtypes)

    @pytest.mark.parametrize(
        "method",
        (
            "all",
            "any",
            "cumprod",
            "cumsum",
            "max",
            "mean",
            "median",
            "min",
            "prod",
            "std",
            "sum",
            "var",
        ),
    )
    @given(st.data())
    @settings(deadline=None)
    def test_reduce(self, method, data):
        ds = data.draw(
            strategies.dataset(
                lambda shape, dtypes: self.create(shape, dtypes), max_size=5
            )
        )

        reduce_dims = data.draw(strategies.valid_dims(ds.dims))

        self.check_reduce(ds, method, dim=reduce_dims)
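
The three base classes share one extension point: test_reduce only ever reaches the array backend through self.create(shape, dtypes), so a downstream test class overrides that static method and inherits the rest. The sketch below shows just that hook pattern with plain dictionaries in place of hypothesis strategies; every class and method name in it is hypothetical, and it does not reproduce the real pint or sparse test modules.

```python
class ReduceTestsSketch:
    """Stand-in for a base class: the test body only calls ``self.create``."""

    @staticmethod
    def create(shape, dtypes):
        # default hook: describe a plain numpy-backed array
        return {"backend": "numpy", "shape": shape, "dtypes": dtypes}

    def draw_array(self, shape, dtypes):
        # mirrors ``lambda shape, dtypes: self.create(shape, dtypes)``
        factory = lambda shape, dtypes: self.create(shape, dtypes)
        return factory(shape, dtypes)


class UnitReduceTestsSketch(ReduceTestsSketch):
    """Hypothetical subclass for a unit-aware duck array."""

    @staticmethod
    def create(shape, dtypes):
        # an override must keep the two-argument signature used by the caller
        return {"backend": "units", "shape": shape, "dtypes": dtypes}
```

Because the caller passes exactly two positional arguments, any override that adds a required parameter would fail with a TypeError when the test is collected and run.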