How should xarray use/support sparse arrays? #3213

shoyer · 2019-08-13T03:29:42Z

I'm looking forward to being easily able to create sparse xarray objects from pandas: #3206

Are there other xarray APIs that could make good use of sparse arrays, or could make sparse arrays easier to use?

Some ideas:

to_sparse()/to_dense() methods for converting to/from sparse without requiring using .data
to_dataframe()/to_series() could grow options for skipping the fill-value in sparse arrays, so they can round-trip MultiIndex data back to pandas
Serialization to/from netCDF files, using some custom convention (see Sparse arrays #1375 (comment))

The text was updated successfully, but these errors were encountered:

khaeru · 2019-08-13T08:31:30Z

This is very exciting! In energy-economic research (unlike, e.g., earth systems research), data are almost always sparse, so first-class sparse support will be broadly useful.

I'm leaving a comment here (since this seems to be a meta-issue; please link from wherever else, if needed) with two example use-cases. For the moment, #3206 seems to cover them, so I can't name any specific additional features.

MESSAGEix is an energy systems optimization model framework, formulated as a linear program.
- Some variables have many dimensions, for instance, the input coefficient for a technology has the dimensions (node_loc, technology, year_vintage, year_active, mode, node_origin, commodity, level, time, time_origin).
  - In the global version of our model, the technology dimension has over 400 labels.
  - Often two or more dimensions are tied, eg technology='coal power plant' will only take input from (commodity='coal', level='primary energy'); all other combinations of (commodity, level) are empty for this technology.
  - So, this data is inherently sparse.
- For modeling research, specifying quantities in this way is a good design because (a) it is intuitive to researchers in this domain, and (b) the optimization model is solved using various LP solvers via GAMS, which automatically prune zero rows in the resulting matrices.
- When we were developing a dask/DAG-based system for model results post-processing, we wanted to use xarray, but had some quantities with tens of millions of elements that were less than 1% full. Here is some test code that triggered MemoryErrors using xarray. We chose to fall back on using a pd.Series subclass that mocks xarray methods.
In transportation research, stock models of vehicle fleets are often used.
- These models always have at least two time dimensions: cohort (the time period in which a vehicle was sold) and period(s) in which it is used (and thus consumes fuel, etc.).
- Since a vehicle sold in 2020 can't be used in 2015, these data are always triangular w.r.t. these two dimensions. (The dimensions year_vintage and year_active in example Added setup.py which runs unit tests as necessary. #1 above have the same relationship.)
- Once multiplied by other dimensions (technology; fuel; size or shape or market segment; embodied materials; different variables; model runs across various scenarios or input assumptions) the overhead of dense arrays can become problematic.

crusaderky · 2019-08-14T12:15:39Z

+1 for the introduction of to_sparse() / to_dense(), but let's please avoid the mistakes that were done with chunk(). DataArray.chunk() is extremely frustrating when you have non-index coords and, 9 times out of 10, you only want to chunk the data and you have to go through the horrid

a = DataArray(a.data.chunk(), dims=a.dims, coords=a.coords, attrs=a.attrs, name=a.name)

Exactly the same issue would apply to to_sparse().

Possibly we could define them as

class DataArray:
    def to_sparse(
        self,
        data: bool = True,
        coords: Union[Iterable[Hashable], bool] = False
    )

class Dataset:
    def to_sparse(
        self,
        data_vars: Union[Iterable[Hashable], bool] = True,
        coords: Union[Iterable[Hashable], bool] = False
    )

same for to_dense() and chunk() (the latter would require a DeprecationWarning for a few release before switching the default for coords from True to False - only to be triggered in presence of dask-backed coords).

crusaderky · 2019-08-14T12:22:37Z

As already mentioned in #3206, unstack(sparse=True) would be extremely useful.

crusaderky · 2019-08-14T12:25:39Z

As for NetCDF, instead of a bespoke xarray-only convention, wouldn't it be much better to push a spec extension upstream?

shoyer · 2019-08-14T15:42:58Z

netCDF has a pretty low-level base spec, with conventions left to higher level docs like CF conventions. Fortunately, there does seems to be a CF convention that would be a good fit for for sparse data in COO format, namely the indexed ragged array representation (example, note the instance_dimension attribute). That's probably the right thing to use for sparse arrays in xarray.

ivirshup · 2019-08-15T06:28:24Z

Would it be feasible to use the contiguous ragged array spec or the gathering based compression when the COO coordinates are sorted? I think this could be very helpful for read efficiency, though I'm not sure if random writes were a requirement here.

shoyer · 2019-08-15T06:42:44Z

I like the indexed ragged array representation because it maps directly into sparse’s COO format. I’m sure other formats would be possible, but they would also likely be harder to implement.

ivirshup · 2019-08-15T10:34:30Z

That's fair. I just think it would be useful to have an assurance that indices are sorted you read them. I don't see how to express this within the CF specs while still looking like a COO array though.

shoyer · 2019-08-15T15:50:42Z

Yes, it would be useful (eventually) to have lazy loading of sparse arrays from disk, like we want we currently do for dense arrays. This would indeed require knowing that the indices are sorted.

normanb · 2019-08-22T21:40:13Z

I would be interested in trying to add PDAL (https://pdal.io/) support, is this something of interest in a similar way to the support for rasterio for dense arrays?

darothen · 2019-08-22T22:39:21Z

Tagging @jeliashi for visibility/collaboration

fjanoos · 2019-08-30T18:53:44Z

Would it be possible that pd.{Series, DataFrame}.to_xarray() automatically creates a sparse dataarray - or we have a flag in to_xarray which allows controlling for this. I have a very sparse dataframe and everytime I try to convert it to xarray I blow out my memory. Keeping it sparse but logically as a DataArray would be fantastic.

shoyer · 2019-08-30T19:19:13Z

We have a new "sparse=True" option in xarray.Dataset.from_dataframe for exactly this use case. Pandas's to_xarray() method just calls this method, so it would make sense to forward keyword arguments, too.

…

On Fri, Aug 30, 2019 at 11:53 AM firdaus janoos ***@***.***> wrote: Would it be possible that pd.{Series, DataFrame}.to_xarray() automatically creates a sparse dataarray - or we have a flag in to_xarray which allows controlling for this. I have a very sparse dataframe and everytime I try to convert it to xarray I blow out my memory. Keeping it sparse but logically as a DataArray would be fantastic — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#3213?email_source=notifications&email_token=AAJJFVTD2IWWPE6RTSWPVLDQHFUDTA5CNFSM4ILGYGP2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5SPPNI#issuecomment-526710709>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAJJFVWEOGEBGXV62QFYU6DQHFUDTANCNFSM4ILGYGPQ> .

fjanoos · 2019-08-30T20:10:43Z

I cloned the master branch and installed it using 'python setup.py develop'.

When I try to use the sparse data loading functionality as per

oo = xa.Dataset.from_dataframe( my_df, sparse=True )

I get the following error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-9-fce0ca6bc4c2> in <module>
----> 1 oo = xa.Dataset.from_dataframe( poly_df.iloc[:10000], sparse=True )

/mnt/local/xarray/xarray/core/dataset.py in from_dataframe(cls, dataframe, sparse)
   4040 
   4041         if sparse:
-> 4042             obj._set_sparse_data_from_dataframe(dataframe, dims, shape)
   4043         else:
   4044             obj._set_numpy_data_from_dataframe(dataframe, dims, shape)

/mnt/local/xarray/xarray/core/dataset.py in _set_sparse_data_from_dataframe(self, dataframe, dims, shape)
   3936         self, dataframe: pd.DataFrame, dims: tuple, shape: Tuple[int, ...]
   3937     ) -> None:
-> 3938         from sparse import COO
   3939 
   3940         idx = dataframe.index

ModuleNotFoundError: No module named 'sparse'

Any suggestions on what I need to do ?

dcherian · 2019-08-30T20:21:28Z

conda install -c conda-forge sparse

Basically you need to install https://sparse.pydata.org/en/latest/ using either pip or conda.

fjanoos · 2019-08-30T20:57:54Z

Thanks.

That solved that error but introduced another one.

Specifically - this is my dataframe

and this is the error that I get with sparse=True

My numpy version is definitely about 1.16

I also set this
os.environ["NUMPY_EXPERIMENTAL_ARRAY_FUNCTION"]='1'
just in case

Furthermore, I don't get this error when I don't set sparse=True ( I just get OOM errors but that's another matter) ...

shoyer · 2019-08-30T21:01:55Z

You will need to install NumPy 1.17 or set the env variable before importing NumPy.

…

On Fri, Aug 30, 2019 at 1:57 PM firdaus janoos ***@***.***> wrote: Thanks. That solved that error but introduced another one. Specifically - this is my dataframe [image: image] <https://user-images.githubusercontent.com/923438/64050831-2d061280-cb47-11e9-915b-01fe42eadefe.png> and this is the error that I get with sparse=True [image: image] <https://user-images.githubusercontent.com/923438/64049668-91bf6e00-cb43-11e9-921f-1a044f3446a9.png> [image: image] <https://user-images.githubusercontent.com/923438/64050631-a94c2600-cb46-11e9-8653-9820b445bc86.png> My numpy version is definitely about 1.16 [image: image] <https://user-images.githubusercontent.com/923438/64050648-b701ab80-cb46-11e9-8dac-aaf2bf9e260d.png> I also set this os.environ["NUMPY_EXPERIMENTAL_ARRAY_FUNCTION"]='1' just in case Furthermore, I don't get this error when I don't set sparse=True ( I just get OOM errors but that's another matter) ... — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#3213?email_source=notifications&email_token=AAJJFVTN37AMEA6ROS7YT2LQHGCVHA5CNFSM4ILGYGP2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5SYQ6Q#issuecomment-526747770>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAJJFVWM2V4H3V6BJMJ7IQDQHGCVHANCNFSM4ILGYGPQ> .

p-d-moore · 2019-09-04T06:32:21Z

I would like to add a request for sparse xarrays: Support ffill and bfill operations along ordered dimensions (such as datetime coordinates) while maintaining the sparse level of data density.

The challenge to overcome is that performing ffill operations on sparse data quickly creates data that is no longer "sparse" in practice and makes dealing with the data challenging.

My suggested implementation (and the way I have previously done this in another programming environment) is to represent the data as rows of contiguous regions with a single (non-sparse) value rather than rows of single points. The contiguous dimensions could be defined as any dimensions that are "ordered" such as datetime coordinates. That is, the data then is represented as a list of values + coordinate ranges rather than a list of values + coordinates.

The idea is that you can easily compute operations like ffill without changing the sparsity of the matrix, and thus support typical aggregating functions you might like to apply to the data before you collapse the data and convert to a non-sparse form (e.g. perform a lag difference of the most recent value with the most recent value 20 days ago, or do a cross-sectional mean on the data along a certain dimension, using the most recent data at each given point in time). These types of operations can be more useful when the data is "fuller" such as after a forward fill, but often not useful when the data is very sparsely populated (as the cross-sectional operations are unlikely to hit the sparse data among the different dimensions).

Care must be taken to avoid "collisions" between sparse blocks of data, that is, avoiding that the list of sparse blocks accidentally overlap. The implementation can get tricky but I believe the goal to be worthwhile.

I am happy to expand on the request if the idea is not well expressed.

crusaderky · 2019-09-04T06:46:08Z

@p-d-moore what you say makes sense but it is well outside of the domain of xarray. What you're describing is basically a new sparse class, substantially more sophisticated than COO, and should be proposed in the sparse board, not here. After it's implemented in sparse, xarray will be able to wrap around it.

p-d-moore · 2019-09-04T07:05:37Z

Thanks @crusaderky, appreciated. Might as as well suggest it there.

k-a-mendoza · 2019-10-24T19:05:23Z

So how would one change an existing dataset or dataarray to using a sparse representation?
something like
xr.Dataset.from_dataframe(data_set.to_dataframe(),sparse=True)?
seems a little silly to have to convert it to a pandas dataframe and back

oliverhiggs · 2019-11-07T07:57:17Z

Thanks for rolling out support for sparse arrays!

I think it would be great to have a sparse argument in the concat function. This will be useful with join='outer', as the resultant DataArray/Dataset can become quite sparse.

As an example, I have a use case where I want to concatenate (across a new dimension) a number of DataArrays with date indexes covering different date ranges. When I use concat the memory used by the resultant array is much greater than that used by the original list of DataArrays due to the large number of NA values that appear in it.

k-a-mendoza · 2019-11-07T13:57:46Z

@oliverhiggs Ive also noticed a huge computational overhead when joining xarray datasets where the result would be sparse. Something like a minute of computation time to join two 10GB datasets, even when there are no overlapping indices. I'm not sure if a sparse representation would help but its possible we'd get a reduced memory footprint and a faster merge/concat time with this kind of support.

dcherian · 2019-11-07T15:23:56Z

@El-minadero a lot of that overhead may be fixed on master and more recent xarray versions. https://xarray.pydata.org/en/stable/io.html#reading-multi-file-datasets has some tips on quickly concatenating / merging datasets. It depends on the datasets you are joining...

k-a-mendoza · 2019-11-07T15:37:21Z

@dcherian These examples seem focused on merging from disk, whereas the use-case I'm running into is joining data produced by computation in ram. I'll try updating my xarray installation and see where that gets me.

dcherian · 2019-11-07T15:40:12Z

the coords, data_vars, join ,compat kwargs in that example are passed down to concat and merge, as appropriate. We do need more documentation on that ....

SimonHeybrock · 2020-05-22T07:20:35Z

I am not familiar with the details of the various applications people in this discussion have, but here is an approach we are taking, trying to solve variations of the problem "data scattered in multi-dimensional space" or irregular time-series data. See https://scipp.github.io/user-guide/binned-data/binned-data.html for an illustrated description.

The basic idea is to keep data in a linear representation and wrap it in a "realigned" wrapper. One reason for this development was to provide a pathway to use dask with our type of data (independent time series at a large number of points in space, with chunking along the "time-series", which is not a dimension since every time series has a different length). With the linked approach we could use dask to distribute the linear underlying representation, keeping the lightweight realigned wrapper on all workers. We are still in early experimentation with this (the dask part is not actually in development yet). It probably has performance issues if more than "millions" of points are realigned --- our case is millions of time series with thousands/millions of time points in each, but the two do not mix (not both are realigned, and if they are it is independently), so we do not run into the performance issue in most cases.

In principle I could imagine this non-destructive realignment approach could be mapped to xarray, so it may be of interest to people here.

pnsaevik · 2020-05-27T09:44:55Z

Thanks for looking into sparse arrays for xarray. I have a use case I believe would be common:

Load a netCDF file written using ragged array representation
Extract a slice in either coordinate direction
Store back into netCDF

At least I would love such a functionality...

SimonHeybrock · 2020-05-27T10:00:25Z

@pnsaevik If the approach we adopt in scipp could be ported to xarray you would be able to to something like (assuming that the ragged array representation you have in mind is "list of lists"):

data = my_load_netcdf(...) # list of lists
# assume 'x' is the dimension of the nested lists
bin_edges = sc.Variable(dims=['x'], values=[0.1,0.3,0.5,0.7,0.9])
realigned = sc.realign(data, {'x':bin_edges})
filtered = realigned['x', 1:3].copy()
my_store_netcdf(filtered.unaligned, ...)

Basically, we have slicing for the "realigned" wrapper. It performs a filter operation when copied.

Edit 2021: Above example is very outdated, we have cleaned up the mechanism, see https://scipp.github.io/user-guide/binned-data/binned-data.html.

scottgigante-immunai · 2021-10-14T16:10:10Z

According to test_sparse.py it looks like XArray already supports sparse, even though the XArray docs doesn't mention this support. Can we expect sparse to gain first-class support in future? Or is the absence of mention in the docs a hint we shouldn't rely on this functionality?

keewis · 2021-10-14T16:25:04Z

that's mostly an oversight, I think. However, to be really useful we'd need to get a sparse-xarray library which makes working with sparse and xarray easier (going from dense to sparse or the reverse still requires something like da.copy(data=sparse.COO.from_numpy(da.data)), which is not user-friendly).

Anyways, the docs you're looking for is working with numpy-like arrays, even though there's no explicit mention of sparse there, either.

scottgigante-immunai · 2021-10-14T16:26:21Z

Thanks so much! Appreciate it.

Material-Scientist · 2022-01-16T14:35:29Z

I would prefer to retain the dense representation, but with tricks to keep the data of sparse type in memory.

Look at the following example with pandas multiindex & sparse dtype:

The dense data uses ~40 MB of memory, while the dense representation with sparse dtypes uses only ~0.5 kB of memory!

And while you can import dataframes with the sparse=True keyword, the size seems to be displayed inaccurately (both are the same size?), and we cannot examine the data like we can with pandas multiindex + sparse dtype:

Besides, a lot of operations are not available on sparse xarray data variables (i.e. if I wanted to group by price level for ffill & downsampling):

So, it would be nice if xarray adopted pandas’ approach of unstacking sparse data.

In the end, you could extract all the non-NaN values and write them to a sparse storage format, such as TileDB sparse arrays.
cc: @stavrospapadopoulos

hameerabbasi · 2022-01-17T10:48:48Z

For ffill specifically, you would get a dense array out anyway, so there's no point to keeping it sparse, unless one did something like run-length-encoding or similar.

As for the size issue, PyData/Sparse provides the nbytes attribute which could be helpful in determining size.

Material-Scientist · 2022-01-17T12:20:18Z

I know. But having sparse data I can treat as if it were dense allows me to unstack without running out of memory, and then ffill & downsample the data in chunks:

It would be nice if xarray automatically converted the data from sparse back to dense for doing operations on the chunks just like pandas does.

The picture shows that I'm already using nbytes to determine the size.

jbbutler · 2023-05-03T22:40:32Z

Hi all! As part of a research project, I'm looking to contribute to xArray's sparse capabilities, with an emphasis on sparse support for use-cases in the geosciences. I'm wondering if anyone in the geosciences (or adjacent disciplines!) has encountered problems with xArray's current level of sparse support, and what kinds of improvements they'd like to see to address those issues. From playing around, it seems the current strategy of backing DataArrays with COO sparse arrays takes care of a lot of use cases, but I have the following ideas that may (or may not) be useful to implement further:

Functions/methods to convert from geopandas GeoDataframes of vector data to rasterized, potentially sparse ndarrays in an xArray Dataset/DataArray (reverse direction too); this is related to the issue of converting from sparse arrays back to multi-indexed pandas objects at the top of this issue (which I believe has yet to be solved)
Loading sparse data from a netcdf() file directly into a Dataset/Array backed by sparse ndarray(s) (seems like the only way to get sparse backings is to either unstack or call '.from_dataframe()/series()' with the sparse flag set to True?)
Support for other sparse array conventions (for ex, GCXS in the sparse package for better memory efficiency; I can't find any improvements to make on the current COO backing in terms of supported arithmetic operations, merges/joins, etc.)

I'd appreciate any feedback on these ideas, as well as any other things that would be nice to have implemented!

rabernat · 2023-05-04T02:36:57Z

Hi @JDButler and welcome! We would welcome this sort of contribution eagerly.

I would characterize our current support of sparse arrays as really just a proof of concept. When to use sparse and how to do it effectively is not well documented. Simply adding more documentation around the already-supported use cases would be a great place to start IMO.

My own exploration of this are described in this Pangeo post. The use case is regridding. It touches on quite a few of the points you're interested in, in particular the integration with geodataframe. Along similar lines, @dcherian has been working on using opt_einsum together with sparse in pangeo-data/xESMF#222 (comment) and #7764.

I'd also suggest catching up on what @martinfleis is doing with vector data cubes in xvec. (See also Pangeo post on this topic.)

Of the three topics you enumerated, I'm most interested in the serialization one. However, I'd rather see serialization of sparse arrays prototyped in Zarr, as its much more conducive to experimentation than NetCDF (which requires writing C to do anything custom). I would recommend exploring serialization from a sparse array in memory to a sparse format on disk via a custom codec. Zarr recently added support for a meta_array parameter that determines what array type is materialized by the codec pipeline (see zarr-developers/zarr-python#1131). The use case there was loading data direct to GPU. In a way sparse is similar--it's an array container that is not numpy or dask.

khaeru · 2023-05-04T07:40:26Z

@jbbutler please also see this comment et seq. pydata/sparse#1 (comment) and related pydata/sparse#438.

To add to @rabernat's point about sparse support being "not well documented", I suspect (but don't know, as I'm just a user of xarray, not a developer) that it's also not thoroughly tested. I expected to be able to use e.g. DataArray.cumprod when the underlying data was sparse, but could not.

IMHO, I/O to/from sparse-backed objects is less valuable if only a small subset of xarray functionality is available on those objects. Perhaps explicitly testing/confirming which parts of the API do/do not currently work with sparse would support the improvements to the docs that Ryan mentioned, and reveal the work remaining to provide full(er) support.

hameerabbasi · 2023-05-04T07:47:04Z

Speaking a bit to things like cumprod, it's hard to support those natively with sparse data structures in many cases (at least as things stand in the current Numba framework).

While that doesn't apply in the case of cumprod, PyData/Sparse also has a policy that if the best algorithm available is a dense one, we simply raise an error, and the user should densify explicitly to avoid filling all available RAM or getting obscure MemoryErrors.

khaeru · 2023-05-04T12:31:22Z

That's a totally valid scope limitation for the sparse package, and I understand the motivation.

I'm just saying that the principle of least astonishment is not being followed: the user cannot at the moment read either the xarray or sparse docs and know which portions of the xarray API will work when giving …, sparse=True, and which instead require a deliberate choice to densify, or see examples of how best to mix the two. It would be helpful to clarify—that's all.

rabernat · 2023-05-04T12:51:59Z

I suspect (but don't know, as I'm just a user of xarray, not a developer) that it's also not thoroughly tested.

Existing sparse testing is here: https://github.com/pydata/xarray/blob/main/xarray/tests/test_sparse.py

We would welcome enhancements to this!

jbbutler · 2023-05-12T01:01:21Z

Thank you all so much for the feedback and resources! I agree (1) testing the limits of xArray's API compatibility with sparse and (2) developing some documentation for what is/isn't supported are great places to start, so I'll get on that while I think about the other I/O issues (serialization, etc.)

dcherian · 2023-06-07T15:43:55Z

A use case example: https://ncar.github.io/esds/posts/2022/sparse-PFT-gridding/

shoyer mentioned this issue Aug 13, 2019

Sparse arrays #1375

Closed

p-d-moore mentioned this issue Sep 4, 2019

Support ffill and bfill operations while remaining sparse pydata/sparse#285

Closed

shoyer mentioned this issue Oct 20, 2019

Saving sparse Dataset using ds.to_netcdf() #3415

Closed

dcherian mentioned this issue Nov 22, 2019

Explicit indexes in xarray's data-model (Future of MultiIndex) #1603

Closed

dcherian mentioned this issue Jun 17, 2020

MultiIndex serialization to NetCDF #1077

Closed

trhallam mentioned this issue Aug 17, 2020

Investigate whether segyio Group class can be used to speed up reading (non-3d) data. trhallam/segysak#57

Closed

jni mentioned this issue Feb 26, 2021

Sparse Labels Layer napari/napari#2281

Open

dcherian added the topic-arrays related to flexible array support label Jul 4, 2021

leewujung mentioned this issue Nov 12, 2021

Large memory expansion when data are converted and organized into groups/EchoData object OSOceanAcoustics/echopype#489

Closed

fmfreeze mentioned this issue Jan 17, 2022

Support for jagged array #1482

Closed

ivirshup mentioned this issue May 8, 2023

Sparse Arrays scientific-python/summit-2023#2

Open

arogozhnikov mentioned this issue Jul 17, 2023

gzip? alugowski/sparse-matrix-io-comparison#1

Open

dcherian mentioned this issue Jan 8, 2024

writing sparse to netCDF #4156

Open

michaelweinold mentioned this issue Jan 20, 2024

Investigate (Sparse) XArray Sizes depending on Dimensions sustainableaviation/EcoPyLot#8

Closed

staadecker mentioned this issue Mar 8, 2024

Support for sparse xarrays PyPSA/linopy#248

Open

irm-codebase mentioned this issue Jul 6, 2024

Use of native sparse array support in xarray / pandas / netCDF calliope-project/calliope#634

Closed

Mr-Milk mentioned this issue Aug 27, 2024

Sparse Array in Labels? scverse/spatialdata#691

Open

How should xarray use/support sparse arrays? #3213

How should xarray use/support sparse arrays? #3213

Comments

shoyer commented Aug 13, 2019

khaeru commented Aug 13, 2019

crusaderky commented Aug 14, 2019 • edited Loading

crusaderky commented Aug 14, 2019

crusaderky commented Aug 14, 2019

shoyer commented Aug 14, 2019

ivirshup commented Aug 15, 2019 • edited Loading

shoyer commented Aug 15, 2019

ivirshup commented Aug 15, 2019

shoyer commented Aug 15, 2019

normanb commented Aug 22, 2019

darothen commented Aug 22, 2019

fjanoos commented Aug 30, 2019

shoyer commented Aug 30, 2019 via email

fjanoos commented Aug 30, 2019

dcherian commented Aug 30, 2019

fjanoos commented Aug 30, 2019

shoyer commented Aug 30, 2019 via email

p-d-moore commented Sep 4, 2019

crusaderky commented Sep 4, 2019

p-d-moore commented Sep 4, 2019

k-a-mendoza commented Oct 24, 2019

oliverhiggs commented Nov 7, 2019

k-a-mendoza commented Nov 7, 2019

dcherian commented Nov 7, 2019

k-a-mendoza commented Nov 7, 2019

dcherian commented Nov 7, 2019

SimonHeybrock commented May 22, 2020 • edited Loading

pnsaevik commented May 27, 2020

SimonHeybrock commented May 27, 2020 • edited Loading

scottgigante-immunai commented Oct 14, 2021

keewis commented Oct 14, 2021

scottgigante-immunai commented Oct 14, 2021

Material-Scientist commented Jan 16, 2022 • edited Loading

hameerabbasi commented Jan 17, 2022

Material-Scientist commented Jan 17, 2022

jbbutler commented May 3, 2023

rabernat commented May 4, 2023

khaeru commented May 4, 2023

hameerabbasi commented May 4, 2023

khaeru commented May 4, 2023

rabernat commented May 4, 2023

jbbutler commented May 12, 2023

dcherian commented Jun 7, 2023

crusaderky commented Aug 14, 2019 •

edited

Loading

ivirshup commented Aug 15, 2019 •

edited

Loading

SimonHeybrock commented May 22, 2020 •

edited

Loading

SimonHeybrock commented May 27, 2020 •

edited

Loading

Material-Scientist commented Jan 16, 2022 •

edited

Loading