Zarr chunking (GH2300) #2487
Hello @lilyminium! Thanks for updating the PR.
Comment last updated on October 30, 2018 at 17:48 Hours UTC
This looks fine to me. I actually thought we had fixed this already, but I guess not. Can you add a whatsnew entry in the docs as well?
Thanks for this PR! I agree that the logic in _determine_zarr_chunks was incorrect... it makes sense to have a special case for the final chunk on each dimension and allow it to be smaller.
But I don't understand how this fixes #2300. The error you were getting there was NotImplementedError. I don't see how this PR overcomes that. Could you clarify? Sorry if this is obvious, but I just don't get it.
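For readers following along, the final-chunk special case discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `uniform_chunk_size` and `zarr_chunks_from_dask` are hypothetical names, and the input is assumed to be a dask-style tuple of per-dimension chunk tuples.

```python
from typing import Sequence, Tuple


def uniform_chunk_size(dim_chunks: Sequence[int]) -> int:
    """Return the one chunk size usable for a dimension, or raise ValueError.

    Hypothetical helper for illustration; it is not xarray's
    _determine_zarr_chunks.
    """
    if not dim_chunks:
        raise ValueError("dimension has no chunks")
    first = dim_chunks[0]
    # Every chunk except the last must match the first chunk exactly.
    if any(size != first for size in dim_chunks[:-1]):
        raise ValueError(f"non-uniform chunks before the final chunk: {tuple(dim_chunks)}")
    # Special case: the final chunk may be smaller, but never larger.
    if dim_chunks[-1] > first:
        raise ValueError(f"final chunk is larger than the others: {tuple(dim_chunks)}")
    return first


def zarr_chunks_from_dask(chunks: Sequence[Sequence[int]]) -> Tuple[int, ...]:
    """Collapse dask-style chunks into one chunk size per dimension."""
    return tuple(uniform_chunk_size(dim_chunks) for dim_chunks in chunks)
```

With this sketch, `((4, 4, 2), (5, 5))` collapses to `(4, 5)`, while `((4, 2, 4),)` and `((2, 4),)` are rejected.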
Thanks for this contribution!
We are still missing an entry in whatsnew. Please add it to give yourself credit for your contribution! :)
606fa56 to 6dc1f59
Forgot I hadn't done that yet, thanks!
Thanks!
* upstream/master: (122 commits)
  add missing , and article in error message (pydata#2557)
  Add libnetcdf, libhdf5, pydap and cfgrib to xarray.show_versions() (pydata#2555)
  revert to dev version for 0.11.1
  Release xarray v0.11
  DOC: update whatsnew for xarray 0.11 release (pydata#2548)
  Drop the hack needed to use CachingFileManager as we don't use it anymore. (pydata#2544)
  add full test env for py37 ci env (pydata#2545)
  Remove old-style resample example in documentation (pydata#2543)
  Stop loading tutorial data by default (pydata#2538)
  Remove the old syntax for resample. (pydata#2541)
  Remove use of deprecated, unused keyword. (pydata#2540)
  Deprecate inplace (pydata#2524)
  Zarr chunking (GH2300) (pydata#2487)
  Include multidimensional stacking groupby in docs (pydata#2493) (pydata#2536)
  Switch enable_cftimeindex to True by default (pydata#2516)
  Raise more informative error when converting tuples to Variable. (pydata#2523)
  Global option to always keep/discard attrs on operations (pydata#2482)
  Remove tests where answers change in cftime 1.0.2.1 (pydata#2522)
  Finish deprecation cycle for DataArray.__contains__ checking array values (pydata#2520)
  Fix bug where OverflowError is not being raised (pydata#2519)
  ...
to_zarr performance #2300
I don't fully understand the ins and outs of Zarr, but it seems that if a Dataset can be serialised with a smaller end chunk to begin with, then saving a Dataset constructed from Zarr should not have an issue with that either.
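To make the round-trip argument concrete, here is a small sketch of the chunk layout that a fixed chunk size produces along one dimension. It assumes the usual regular-grid behaviour, where the array length need not be a multiple of the chunk size; `regular_chunk_layout` is a hypothetical helper and not part of Zarr or xarray.

```python
from typing import Tuple


def regular_chunk_layout(length: int, chunk_size: int) -> Tuple[int, ...]:
    """Chunk sizes seen when reading `length` elements in chunks of `chunk_size`.

    Hypothetical helper for illustration only.
    """
    if length < 0 or chunk_size <= 0:
        raise ValueError("length must be >= 0 and chunk_size must be > 0")
    full_chunks, remainder = divmod(length, chunk_size)
    # Any leftover elements form one smaller chunk at the end.
    tail = (remainder,) if remainder else ()
    return (chunk_size,) * full_chunks + tail
```

Under that assumption, a dimension of length 10 stored with a chunk size of 4 reads back as `(4, 4, 2)`, so a Dataset opened from such a store already carries a smaller end chunk when it is written out again.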