Update: on the latest v3.0.0-beta.1, writing any DataTree object to zarr fails with ValueError("compressor cannot be used for arrays with version 3. Use bytes-to-bytes codecs instead."), even though no compressor is specified.
I am primarily using zarr to store 4D spatiotemporal data, and was initially inspired by the work of carbonplan. As I scale the temporal dimension of xarray DataTree objects, I end up generating very large graphs, which crash dask when I attempt to write out a DataTree object with dt.to_zarr().
Following this 4D example, I get a DataTree object with 6 groups containing the following dimensions and chunk sizes:
Using this structure, the largest dataset at level 6 is 1.5 GB, which I am able to write to a Zarr pyramid without any issues:
However, keeping everything else equal, if I scale the temporal dimension (month) by a factor of 4, from 12 to 48 months, the dask graph size becomes too large (6 GB):
At level 6, the 6 GB graph crashes dask.
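As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes the level-6 size grows linearly with the length of the month dimension while everything else stays fixed; the variable names are only illustrative.

```python
# Back-of-the-envelope scaling of the level-6 size.
# Assumption: size is proportional to the length of the month dimension.
base_months = 12
scaled_months = 48
base_size_gb = 1.5  # largest dataset at level 6 with 12 months

scale = scaled_months / base_months
scaled_size_gb = base_size_gb * scale

print(f"scale factor: {scale:g}x, estimated level-6 size: {scaled_size_gb:g} GB")
```

Under that assumption the estimate matches the 6 GB reported above.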
I am running on a 12-core M2 macOS with 64 GiB RAM, and my dask configuration is:
Some of the things I have tried:
Starting the dask client with fewer workers and a higher memory limit (e.g. 1 worker with a 64 GB limit) and with more workers and a lower memory limit (e.g. 12 workers with a 5 GB limit), but the error is the same
Setting higher array chunk sizes
dask.delayed.Delayed object, but this does not seem to be implemented yet for DataTree objects
None of these worked though. Will it be possible in zarr 3.0 to write out DataTrees to zarr in parallel?
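For reference, the two worker layouts from the list above could be set up roughly as follows. This is only a sketch, assuming a local `dask.distributed` cluster: `WORKER_LAYOUTS`, `start_client` and `total_memory_gb` are hypothetical names, and threads per worker is left at dask's default since it is not given here.

```python
# Sketch of the two worker layouts described above.
# Assumption: a local dask.distributed cluster; only n_workers and
# memory_limit are taken from the post, everything else is dask's default.
WORKER_LAYOUTS = {
    "single_large_worker": {"n_workers": 1, "memory_limit": "64GB"},
    "many_small_workers": {"n_workers": 12, "memory_limit": "5GB"},
}


def total_memory_gb(layout):
    """Total memory budget across workers, in GB (hypothetical helper)."""
    per_worker_gb = int(layout["memory_limit"][:-2])  # strip the "GB" suffix
    return layout["n_workers"] * per_worker_gb


def start_client(layout_name):
    """Start a local cluster for one of the layouts (hypothetical helper)."""
    # Imported lazily so the layouts can be inspected without dask installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(**WORKER_LAYOUTS[layout_name])
    return Client(cluster)


for name, layout in WORKER_LAYOUTS.items():
    print(name, total_memory_gb(layout), "GB total")
```

Calling `start_client("many_small_workers")` before `dt.to_zarr()` would reproduce the second layout. In a script this call belongs under an `if __name__ == "__main__":` guard.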
Are there any suggestions or xarray best practices (dask configuration, limiting graph size) for how to write out Zarr data pyramids without running into these errors for larger dimensions? Is it expected that the task graph size remains the same despite re-chunking? I would assume the specs of my machine should be sufficient to handle writing a 6 GB Zarr pyramid at the scale listed in the above example, but I am not sure in which direction to go.