Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow DataArray.to_series() without invoking sparse.COO.todense() #4007

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

khaeru
Copy link

@khaeru khaeru commented Apr 25, 2020

This adds some code (from iiasa/ixmp#317) that allows DataArray.to_series() to be called without invoking sparse.COO.todense() when that is the backing data type.

I'm aware this needs some improvement to meet the standard of the existing codebase, so I hope I could ask for some guidance on how to address the following points (including whom to ask about them):

  • Make the same improvement in {DataArray,Dataset}.to_dataframe().
  • Possibly move the code out of dataarray.py to a more appropriate location (where?).
  • Possibly check for sparse.COO explicitly instead of xarray.core.pycompat.sparse_array_type. Other SparseArray subclasses, e.g. DOK, may not have the same attributes.

Standard items:

  • Tests added.
  • Passes isort -rc . && black . && mypy . && flake8
    (Sort of: these wanted to modify 7 files beyond the one I touched; didn't commit these changes.)
  • Fully documented, including whats-new.rst for all changes and api.rst for new API.

@khaeru khaeru marked this pull request as draft April 25, 2020 20:15
@khaeru khaeru force-pushed the feature/sparse-to-series branch from 2b549a3 to b5581fc Compare April 25, 2020 20:19
@khaeru khaeru marked this pull request as ready for review May 7, 2020 12:54
@dcherian
Copy link
Contributor

Thanks @khaeru

Make the same improvement in {DataArray,Dataset}.to_dataframe().

I think only Dataset._to_dataframe is needed since DataArray.to_dataframe() calls that.

Possibly move the code out of dataarray.py to a more appropriate location (where?).

Perhaps to_series could be refactored to use _to_dataframe too?

Possibly check for sparse.COO explicitly instead of xarray.core.pycompat.sparse_array_type. Other SparseArray subclasses, e.g. DOK, may not have the same attributes.

👍 Should add something to the docstring once you figure this out.

This PR also needs tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants