improve to_zarr doc about chunking #4048

Merged
merged 7 commits into from
May 20, 2020
Changes from 2 commits
6 changes: 6 additions & 0 deletions xarray/core/dataset.py
@@ -1604,6 +1604,12 @@ def to_zarr(
References
----------
https://zarr.readthedocs.io/

Notes
-----
zarr may automatically chunk a DataArray if it is not chunked or
if `chunks` is not set to -1 in its `encoding` attribute or in the
argument of the `encoding` parameter.
Member
Can we instead state what does happen if a variable is already chunked or chunks can be found in encoding?

Something like

  • If chunks are found in the encoding argument or attribute corresponding to any DataArray, those chunks are used.
  • Otherwise, if the DataArray is already a dask array, it is written with those chunks.
  • Finally, if no other chunks are found, Zarr uses its own heuristics to choose automatic chunk sizes.
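
The three-step precedence in the list above can be sketched in plain Python. This is only an illustration of the ordering as worded in this comment, not xarray's actual implementation: `resolve_chunks` and its parameters are hypothetical names, and returning `None` is an assumed stand-in for "let Zarr pick the chunk sizes".

```python
from typing import Optional, Tuple

Chunks = Tuple[int, ...]


def resolve_chunks(
    encoding_chunks: Optional[Chunks],
    dask_chunks: Optional[Chunks],
) -> Optional[Chunks]:
    """Hypothetical helper: pick chunk sizes in the order described above."""
    # 1. Chunks given in the encoding argument or attribute take priority.
    if encoding_chunks is not None:
        return encoding_chunks
    # 2. Otherwise, reuse the chunks of an already-chunked (dask) array.
    if dask_chunks is not None:
        return dask_chunks
    # 3. Nothing specified: None stands for "Zarr chooses automatically".
    return None


# Encoding wins over existing dask chunks.
print(resolve_chunks((10, 10), (5, 5)))
# Dask chunks are used when the encoding has none.
print(resolve_chunks(None, (5, 5)))
# Neither is set, so the choice is left to Zarr.
print(resolve_chunks(None, None))
```

Writing the rule as an ordered fall-through makes it clear that each case only applies when the previous one did not, which is the point the proposed docstring wording is meant to convey.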

Contributor Author
I closely followed your suggestion.
Note that I am not sure about the docstring style for a list-like note like this one.

"""
if encoding is None:
    encoding = {}