(This isn't really a Pangeo-Forge issue, but it's something I realized recently that I think is quite relevant for storing typical Pangeo datasets)
The standard practice I've seen with Pangeo is to use relatively large chunk sizes, e.g., ~100 MB. This is similar to the recommended chunk sizes for computation with dask.array.
I think this is unnecessarily large, and you could get away with much smaller chunk sizes as long as the underlying Zarr store uses concurrent/asynchronous IO and multiple Zarr chunks fit into each Dask chunk. The bigger Dask chunks keep Dask's overhead bounded (e.g., it doesn't build up unreasonably large task graphs), and the concurrent access to the Zarr store (within each Dask chunk) ensures that data can still be read from Zarr at roughly the same speed.
For example, in the NOAA OISST example, you might keep the Zarr arrays smaller (e.g., one time-step per chunk for ~4 MB per array chunk), but simply open the dataset up with different chunks, e.g., `xr.open_zarr(path, chunks={'time': '100MB'})`.
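Here's a minimal sketch of that pattern, using a toy dataset on roughly the OISST grid (720 × 1440 at float32, so ~4 MB per time step); the store name, variable name, and chunk counts are illustrative:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy dataset shaped roughly like OISST: each time step is 720 x 1440
# float32 values, i.e. ~4 MB per time step.
ds = xr.Dataset(
    {"sst": (("time", "lat", "lon"),
             np.zeros((30, 720, 1440), dtype="float32"))},
    coords={"time": pd.date_range("2020-01-01", periods=30)},
)

# Write with small Zarr chunks: one time step per chunk (~4 MB each).
ds.chunk({"time": 1}).to_zarr("oisst-small-chunks.zarr", mode="w")

# Open with much larger Dask chunks; each Dask chunk now spans many
# small Zarr chunks, which can be fetched concurrently.
ds_big = xr.open_zarr("oisst-small-chunks.zarr", chunks={"time": 25})
print(ds_big.sst.data.chunksize)  # (25, 720, 1440) -> ~100 MB per Dask chunk
```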
Why would this be a good idea?
Smaller chunks mean significantly less IO for previewing data, which makes a big difference for interactive visualization.
Smaller chunks should reduce the need for rechunking for different types of analysis.
File systems and object stores can typically store much smaller chunks of data (~1-10 MB is a typical range) than Dask prefers for array chunks without meaningful overhead, so there's a fair amount of additional flexibility available.
This mirrors the standard guidance to size thread pools based on how they are used: CPU-bound tasks should get roughly one thread per core, but IO-bound tasks can make good use of many more threads.
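As a rough illustration of that analogy (standard library only; the worker counts are illustrative, not a recommendation):

```python
import os
from concurrent.futures import ThreadPoolExecutor

n_cores = os.cpu_count() or 1

# CPU-bound work: roughly one thread per core.
cpu_pool = ThreadPoolExecutor(max_workers=n_cores)

# IO-bound work (e.g. fetching many small objects from an object store):
# far more threads than cores can be kept busy waiting on the network.
io_pool = ThreadPoolExecutor(max_workers=8 * n_cores)
```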
This is a good suggestion, and I agree with you in theory. This was one of the big motivations for pursuing async support in the first place (see discussion in zarr-developers/zarr-python#536).
However, the tradeoff is that a huge number of chunks translates to a huge number of files / objects to manage. For 1 TB of data in 4 MB chunks, there are 250,000 chunks to keep track of! This creates a big overhead on the object store if you ever want to delete / move / rename the data.