Implement sparse.COO.pad #438

Closed
khaeru opened this issue Mar 7, 2021 · 1 comment · Fixed by #474
Labels: enhancement (indicates new feature requests)

khaeru commented Mar 7, 2021

Is your feature request related to a problem? Please describe.
I would like to be able to use the xarray.DataArray API when the data is sparse.COO.

Instead, as mentioned at #1 (comment), it appears I currently need to write a custom subclass of DataArray that handles cases where sparse doesn't implement a method expected by xarray.

Describe the solution you'd like
For sparse.COO.pad and the other methods mentioned in the linked comment to be implemented so that, in general, it is possible to use the xarray.DataArray API, or at least a certain advertised set of basic functions, with sparse data.

Describe alternatives you've considered
As mentioned, I am currently handling this with wrapper code that converts an xr.DataArray to dense data, performs operations, then re-converts the data to sparse.

Another mitigation would be for the sparse documentation to describe which DataArray operations are not supported.

Some example code:

import pandas as pd
import xarray as xr
from numpy import nan

idx = dict(
    columns=pd.Index(["x1", "x2", "x3", "x4"], name="x"),
    index=pd.Index(["y1", "y2", "y3", "y4"], name="y"),
)

# Simplified example data
s1 = pd.DataFrame(
    [
        [1.0, nan, nan, nan],
        [nan, nan, 2.0, nan],
        [nan, nan, nan, 3.0],
        [nan, 4.0, nan, nan],
    ],
    **idx
).stack()

print(s1)

da1 = xr.DataArray.from_series(s1, sparse=True)

print(da1)

# I would like to produce data like the following by using DataArray.shift()
s2 = pd.DataFrame(
    [
        [nan, 1.0, nan, nan],
        [nan, nan, nan, 2.0],
        [nan, nan, nan, nan],
        [nan, nan, 4.0, nan],
    ],
    **idx
).stack()

print(s2)

da2 = xr.DataArray.from_series(s2)

print(da2)

# This line raises an exception, below
da1.shift(x=1)

The exception raised:

Traceback (most recent call last):
  File "test.py", line 42, in <module>
    da1.shift(x=1)
  File "/home/khaeru/.local/lib/python3.8/site-packages/xarray/core/dataarray.py", line 3077, in shift
    variable = self.variable.shift(
  File "/home/khaeru/.local/lib/python3.8/site-packages/xarray/core/variable.py", line 1205, in shift
    result = result._shift_one_dim(dim, count, fill_value=fill_value)
  File "/home/khaeru/.local/lib/python3.8/site-packages/xarray/core/variable.py", line 1166, in _shift_one_dim
    data = duck_array_ops.pad(
  File "/home/khaeru/.local/lib/python3.8/site-packages/xarray/core/duck_array_ops.py", line 56, in f
    return wrapped(*args, **kwargs)
  File "<__array_function__ internals>", line 5, in pad
TypeError: no implementation found for 'numpy.pad' on types that implement __array_function__: [<class 'sparse._coo.core.COO'>]
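
The traceback shows numpy.pad being dispatched through __array_function__ to sparse.COO, which has no implementation for it. For the constant-padding case, when the pad value equals the array's fill value, padding a COO-style array should only need the coordinates offset and the shape enlarged. A rough sketch on plain NumPy arrays follows; the name `pad_coo_constant` and its signature are hypothetical, and this is neither the sparse API nor necessarily how a fix would be implemented:

```python
import numpy as np


def pad_coo_constant(coords, data, shape, pad_width):
    """Pad a COO-style array with its own fill value.

    Hypothetical helper for illustration only.

    coords: (ndim, nnz) integer array of stored-element indices
    data: (nnz,) array of stored values
    shape: tuple of ints, the shape before padding
    pad_width: sequence of (before, after) pairs, one per dimension
    """
    coords = np.asarray(coords)
    pad_width = np.asarray(pad_width)
    if pad_width.shape != (len(shape), 2):
        raise ValueError(
            "pad_width must have one (before, after) pair per dimension"
        )
    # Stored elements move by the 'before' padding along each axis
    new_coords = coords + pad_width[:, :1]
    # Each dimension grows by before + after
    new_shape = tuple(int(n + b + a) for n, (b, a) in zip(shape, pad_width))
    # New cells hold the fill value, so no additional data is stored
    return new_coords, np.asarray(data), new_shape
```

Padding with any other constant, or with modes such as "edge" or "reflect", would create new stored elements and is not covered by this sketch.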

khaeru added the enhancement label on Mar 7, 2021

khaeru commented May 11, 2021

Thanks @H4R5H1T-007 !
