[pull] main from xarray-contrib:main #48

Merged: 50 commits, Oct 18, 2024

@pull pull bot commented Mar 17, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

dcherian and others added 4 commits March 13, 2024 14:33
* Fix direct reductions of Xarray objects

Closes pydata/xarray#8819

* Fix doctest
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 4.0.0 to 4.1.0.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v4.0.0...v4.1.0)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix upstream-dev CI

Closes #337

* silence warnings

* fix mypy

* Trigger upstream workflow
@pull pull bot added the ⤵️ pull label Mar 17, 2024
dcherian and others added 23 commits March 19, 2024 15:37
* Fix nanlen with strings

Closes pydata/xarray#8853

* fix windows

* Silence warnings
* Another `method` detection optimization

* fix

* silence warnings

* silence one more warning

* Even better shortcut

* Update docs
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 4.1.0 to 4.1.1.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v4.1.0...v4.1.1)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Initial minimal working Cubed example for "map-reduce"

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix misspelled `aggegrate_func`

* Update flox/core.py

Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>

* Expand to ALL_FUNCS

* Use `_finalize_results` directly

* Add test for nan values

* Removed unused dtype from test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Move example notebook to a gist https://gist.github.com/tomwhite/2d637d2581b44468da5b7e29c30c0c49

* Add CubedArray type

* Add Cubed to CI

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Make mypy happy

* Make mypy happy (again)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
* [pre-commit.ci] pre-commit autoupdate

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.9 → v0.3.5](astral-sh/ruff-pre-commit@v0.1.9...v0.3.5)
- [github.com/psf/black-pre-commit-mirror: 23.12.1 → 24.3.0](psf/black-pre-commit-mirror@23.12.1...24.3.0)
- [github.com/nbQA-dev/nbQA: 1.7.1 → 1.8.5](nbQA-dev/nbQA@1.7.1...1.8.5)
- [github.com/kynan/nbstripout: 0.6.1 → 0.7.1](kynan/nbstripout@0.6.1...0.7.1)
- [github.com/abravalheri/validate-pyproject: v0.15 → v0.16](abravalheri/validate-pyproject@v0.15...v0.16)
- [github.com/rhysd/actionlint: v1.6.26 → v1.6.27](rhysd/actionlint@v1.6.26...v1.6.27)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Optimize bitmask finding for chunk size 1.

* Fix benchmark.

* bugfix

* Add single chunk benchmark

* Optimize single chunk case.

* Add test
…method (#356)

* Add cubed notebook for hourly climatology example using "map-reduce" method

* Add cubed dependencies to docs build

* Use Cubed version with HTML repr fix
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 4.1.1 to 4.3.1.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v4.1.1...v4.3.1)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ohorts. (#300)

* Manually fuse reindexing intermediates with blockwise reduction for cohorts.

```
| Change   | Before [627bf2b] <main>   | After [9d710529] <optimize-cohorts-graph>   |   Ratio | Benchmark (Parameter)                           |
|----------|----------------------------|---------------------------------------------|---------|-------------------------------------------------|
| -        | 3.39±0.02ms                | 2.98±0.01ms                                 |    0.88 | cohorts.PerfectMonthly.time_graph_construct     |
| -        | 20                         | 17                                          |    0.85 | cohorts.PerfectMonthly.track_num_layers         |
| -        | 23.0±0.07ms                | 19.0±0.1ms                                  |    0.83 | cohorts.ERA5Google.time_graph_construct         |
| -        | 4878                       | 3978                                        |    0.82 | cohorts.ERA5Google.track_num_tasks              |
| -        | 179±0.8ms                  | 147±0.5ms                                   |    0.82 | cohorts.OISST.time_graph_construct              |
| -        | 159                        | 128                                         |    0.81 | cohorts.ERA5Google.track_num_layers             |
| -        | 936                        | 762                                         |    0.81 | cohorts.PerfectMonthly.track_num_tasks          |
| -        | 1221                       | 978                                         |    0.8  | cohorts.OISST.track_num_layers                  |
| -        | 4929                       | 3834                                        |    0.78 | cohorts.ERA5DayOfYear.track_num_tasks           |
| -        | 351                        | 274                                         |    0.78 | cohorts.NWMMidwest.track_num_layers             |
| -        | 4562                       | 3468                                        |    0.76 | cohorts.ERA5DayOfYear.track_num_tasks_optimized |
| -        | 164±1ms                    | 118±0.4ms                                   |    0.72 | cohorts.ERA5DayOfYear.time_graph_construct      |
| -        | 1100                       | 735                                         |    0.67 | cohorts.ERA5DayOfYear.track_num_layers          |
| -        | 3930                       | 2605                                        |    0.66 | cohorts.NWMMidwest.track_num_tasks              |
| -        | 3715                       | 2409                                        |    0.65 | cohorts.NWMMidwest.track_num_tasks_optimized    |
| -        | 28952                      | 18798                                       |    0.65 | cohorts.OISST.track_num_tasks                   |
| -        | 27010                      | 16858                                       |    0.62 | cohorts.OISST.track_num_tasks_optimized         |
```

* fix typing
* Use threadpool for finding labels in chunk

This helps when there are many reasonably sized chunks, particularly for
the NWM county groupby: 600ms -> 400ms.

```
| Before [0cccb90] <optimize-again>   | After [38fe8a6c] <threadpool>   |   Ratio | Benchmark (Parameter)                       |
|--------------------------------------|---------------------------------|---------|---------------------------------------------|
| 3.50±0.2ms                           | 2.93±0.07ms                     |    0.84 | cohorts.PerfectMonthly.time_graph_construct |
| 20.0±1ms                             | 9.66±1ms                        |    0.48 | cohorts.NWMMidwest.time_find_group_cohorts  |
```

* Add threshold

* Fix + comment

* Fix benchmark.

* Tweak threshold

* Small cleanup

* Comment

* Try single allocation

* Revert "Try single allocation"

This reverts commit c6b93367e2024e60d77af24a69d177670a040dfc.

* cleanup
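
The thread-pool idea from the "Use threadpool for finding labels in chunk" commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names `labels_in_chunk` and `find_labels_per_chunk`, the 1-D chunking scheme, and the threshold value of 4 are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def labels_in_chunk(chunk: np.ndarray) -> np.ndarray:
    """Return the sorted unique group labels present in one chunk."""
    return np.unique(chunk)


def find_labels_per_chunk(labels: np.ndarray, chunk_size: int, threshold: int = 4):
    """Find the labels in each chunk, using threads only when there are enough chunks.

    `threshold` is a hypothetical cutoff: below it, the overhead of starting
    a pool is assumed to outweigh any gain, so the work is done serially.
    """
    chunks = [labels[i : i + chunk_size] for i in range(0, len(labels), chunk_size)]
    if len(chunks) < threshold:
        return [labels_in_chunk(c) for c in chunks]
    # pool.map preserves input order, so results line up with the chunks.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(labels_in_chunk, chunks))


labels = np.array([0, 0, 1, 1, 2, 2, 0, 1])
per_chunk = find_labels_per_chunk(labels, chunk_size=2)
```

Threads can help here because NumPy may release the GIL inside routines such as sorting; whether that yields a speed-up depends on chunk size, which is presumably why the commit adds and tweaks a threshold.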
* Optimize min_count for all numpy

For pure numpy arrays, min_count=1 (xarray default) is the same
as min_count=None, with the right fill_value. This avoids
one useless pass over the data, and one useless copy.

We need to always accumulate count with dask, to make sure we
get the right values at the end.

* Better?
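
The equivalence claimed in the "Optimize min_count for all numpy" commit above can be checked with a small NumPy-only sketch. The two helper functions are hypothetical and exist only for this illustration; they show that the results agree, not how the library implements either path or where the saved pass comes from.

```python
import numpy as np


def grouped_nansum_min_count(values, codes, ngroups, min_count=1):
    """Grouped nansum that masks groups with fewer than `min_count` valid values."""
    valid = ~np.isnan(values)
    sums = np.bincount(codes[valid], weights=values[valid], minlength=ngroups)
    # Extra accumulation: count the valid values per group.
    counts = np.bincount(codes[valid], minlength=ngroups)
    return np.where(counts >= min_count, sums, np.nan)


def grouped_nansum_fill_value(values, codes, ngroups, fill_value=np.nan):
    """Grouped nansum where groups with no valid values receive `fill_value`."""
    valid = ~np.isnan(values)
    out = np.full(ngroups, fill_value, dtype=float)
    present = np.unique(codes[valid])
    sums = np.bincount(codes[valid], weights=values[valid], minlength=ngroups)
    out[present] = sums[present]
    return out


values = np.array([1.0, np.nan, 2.0, np.nan])
codes = np.array([0, 0, 1, 2])  # group 2 holds only NaN, group 3 is empty

with_count = grouped_nansum_min_count(values, codes, ngroups=4)
with_fill = grouped_nansum_fill_value(values, codes, ngroups=4)
same = np.array_equal(with_count, with_fill, equal_nan=True)
```

With chunked (dask) inputs the same shortcut is not safe, as the commit message notes: a group may be empty in one chunk and populated in another, so the counts have to be accumulated across chunks before the final masking.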
* import `normalize_axis_index` from `numpy.lib` on `numpy>=2`

* import the right thing
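
A version-tolerant import of `normalize_axis_index` might look like the sketch below. It uses a `try`/`except` fallback, which is an assumption for illustration; the commit itself may use an explicit NumPy version check instead.

```python
try:
    # Public location on numpy >= 2.
    from numpy.lib.array_utils import normalize_axis_index
except ImportError:
    # Fallback for numpy < 2, where the function lives in numpy.core.
    from numpy.core.multiarray import normalize_axis_index

# Convert a possibly negative axis into a non-negative index for a 3-D array.
last_axis = normalize_axis_index(-1, 3)
```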
* Initial minimal working Cubed example for "blockwise"

* Update minimum cubed version that includes cubed-dev/cubed#448

* Fix mypy errors

* Update documentation with a 'blockwise' example for Cubed
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 4.3.1 to 4.4.1.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v4.3.1...v4.4.1)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Property tests with hypothesis

* skip on minimal env

* fix typing

* fix test

* fix mypy

* remove docstring

* try again

* fix again

* more fix

* fix tests

* Try fix

* some debug logging instead of info

* try `int8`

* Update casting behaviour

* More dtypes

* Complex fixes

* Revert "try `int8`"

This reverts commit a9097c2.

* fix dtype

* skip complex var, std

* Start fixing timedelta64

* fix casting

* exclude timedelta64, datetime64

* tweak

* filter out too_slow

* update hypothesis cache

* fix

* fix more.

* update caching strategy

* WIP

* Skip float16

* Attempt to increase numerical stability of var, std

* update tolerances

* fix

* update action

* fixes

* Trim CI

* Cast to int64 instead of intp

* revert?

* [revert]

* try again

* debug logging

* Revert "try again"

This reverts commit a02d947.

* adapt

* Revert "Revert "try again""

This reverts commit 35ff742.

* Fix cast

* remove prints

* Revert "[revert]"

This reverts commit d143a98.

* info -> debug

* Fix quantiles

* bring back notes

* Small opt

* Just skip var, std

* Fix mypy

* no-redef

* try again
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.3.5 → v0.5.0](astral-sh/ruff-pre-commit@v0.3.5...v0.5.0)
- [github.com/pre-commit/pre-commit-hooks: v4.5.0 → v4.6.0](pre-commit/pre-commit-hooks@v4.5.0...v4.6.0)
- [github.com/psf/black-pre-commit-mirror: 24.3.0 → 24.4.2](psf/black-pre-commit-mirror@24.3.0...24.4.2)
- [github.com/codespell-project/codespell: v2.2.6 → v2.3.0](codespell-project/codespell@v2.2.6...v2.3.0)
- [github.com/abravalheri/validate-pyproject: v0.16 → v0.18](abravalheri/validate-pyproject@v0.16...v0.18)
- [github.com/rhysd/actionlint: v1.6.27 → v1.7.1](rhysd/actionlint@v1.6.27...v1.7.1)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 4.4.1 to 4.5.0.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v4.4.1...v4.5.0)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Deepak Cherian <dcherian@users.noreply.github.com>
* Add scans

* grouped reduce

* Some fixes.

* Updates for ffill

* Better ffill

* Support numpy

* cleanup

* more tests

* Fix ffill

* [WIP] expand tests

* Fixes. we need two versions of binary_op

* Fix ffill again

* Disable cumsum for now.

* Fixes.

* Fix tests: Remove overflowing test cases, proper fill_value

* typing

* Fix tests

* Try and avoid some roundoff error

* Skip float32 for cumsum

* fix min deps test

* Another fix

* Silence warnings

* Cleanup

* Add docs

* fix

* bfill

* Fix test

* hypothesis: Better CI profile

* Small change.

* Add hypothesis to all envs

* Generate chunking along all dimensions

* lint

* more guards

* more guards

* fix

* Fix typing

* cleanup

* fix mypy

* Add comments
dcherian and others added 22 commits July 28, 2024 14:15
Remove from ffill, bfill

Fix fill_value for datetime64

isnan to isnull
1. Record duration
2. Speed up some tests
3. Silence logging
* Stricter tolerance in property tests

now that ml31415/numpy-groupies#90 was merged

* skip float32 with cumsum
* Add cohorts snapshot tests with syrupy

* Fix.

* fix again

* Rework CI

* [revery]

* improve

* fix mypy?

* Revert "[revery]"

This reverts commit 7664e5e.

* Try again

* fix mypy
* Optimize for-loop merging of cohorts.

Do this by skipping perfect cohorts that we already know about.

* Add new benchmark

* Fix

* Cleanup print statements

* minimize diff

* cleanup

* Update snapshot
* [skip-ci] Bump furo

* [skip-ci] bump rtd config

* fix docs
* Fix first, last again

Add more first, last tests

* Fix

* fix type ignores

* Add one more property test

* Support cohorts and grouped_combine

* fix docs

* fix profile
* Handle dtypes.NA properly for datetime/timedelta

* Add Aggregation.preserves_dtype

* Fix ffill, bfill
* Expand groupby_reduce property tests

* Add back var, std

* cast quantile result

* Revert "Add back var, std"

This reverts commit 805b8d3.

* pin numpy in benchmark env

* Add benchmarks as test
* Drop python 3.9, use ruff

* switch to Ruff

* fix mypy

* remove toctrees

* fix
* Preserve dtype better when specified.

* Add one more test

* tweak test

* more test

* [revert] test with Xarray PR branch

* tweak

* show versions

* Drop python 3.9, use ruff

* switch to Ruff

* fix mypy

* remove toctrees

* fix

* one more
* Avoid rechunking when preferred_method="blockwise"

* Add test

* fix
* Avoid rechunking when preferred_method="blockwise"

* Add new numpy1 environment

* try int_ instead of intp

* Use uintp instead

* Use np.uint instead

* more fixes

* Add test

* fix

* fix again
* Faster subsetting for cohorts

Closes #396

* typing
@pull pull bot merged commit 07a15c4 into Illviljan:main Oct 18, 2024

3 participants