automatic chunking of zarr archive #4046
Comments
Thanks for raising this useful issue. There are two ways to control Zarr chunks:
If neither of these is present, Xarray creates the zarr arrays with no chunks specified. In this case, zarr will choose the chunks automatically for you. This behavior is described in the Zarr docs:
You can override this default per variable by specifying a single global chunk in encoding: ds.foo.encoding['chunks'] = -1 or, at write time, ds.to_zarr('test.zarr', mode='w', encoding={'foo': {'chunks': -1}})
I agree that none of this is described well in the Xarray docs. A PR to improve the docs would be most welcome. 😉
Thanks for this speedy reply @rabernat! Improving docs is still within my reach (I hope) and I will give it a shot.
I store data that is not chunked in a zarr archive, and the resulting zarr archive is chunked.
This may be a simple usage question.
I don't know how to turn this behavior off.
Code sample
Here is a minimal example that reproduces the issue:
returns:
Expected Output
I would expect the archive not to be chunked.
Versions
Output of xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Mar 23 2020, 23:03:20)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.12.53-60.30-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4
xarray: 0.15.2.dev29+g6048356
pandas: 1.0.3
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.4.0
cftime: 1.1.1.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.13.0
distributed: 2.13.0
matplotlib: 3.2.1
cartopy: 0.17.0
seaborn: 0.10.0
numbagg: None
pint: None
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: None
IPython: 7.13.0
sphinx: None