Drop dimension_names from v3 #219
Comments
There was already quite a bit of discussion regarding whether to:
See previous discussion: While there wasn't a strong reason to favor 1 vs 2, dimension names are quite broadly applicable across many different zarr implementations and use cases, and some zarr implementations (such as Neuroglancer and TensorStore) directly make use of dimension names. There was therefore strong support for choosing either 1 or 2 rather than 3, in order to promote a single way to specify dimension names for better common-denominator interoperability between e.g. xarray, netcdf, and OME-zarr. My hope is that for zarr v3, xarray, netcdf and OME-zarr all make use of this.
I think the general argument in favor of (1) rather than (2) is that dimension names are more universally applicable than other metadata like units.
Thanks for explaining the previous discussion. I agree that dimension names are broadly applicable. However, a flat list of strings may not be enough to specify all the metadata that a community needs, which will create parallel metadata. For example, in OME-Zarr we already store metadata for axes (=dimensions) with additional fields such as
Agreed that there are many other per-dimension attributes that may be useful. For example, in addition to the ones you mentioned, I'm planning to propose a zarr extension for non-zero origins, which would need a way to specify a lower bound and grid offset for each dimension. I'm also planning to propose a "resizable" attribute (#212). In general, there is the choice of whether to represent per-dimension attributes using a "row"-based organization similar to your example above, e.g.:
or to use an equivalent columnar representation:
{"dimension_names": ["x", "y", "c", null],
 "attributes": {
   "dimension_units": ["meter", "meter", null, null],
   "dimension_type": ["space", "space", null, null]
 },
 ...
}
For the row-based organization we are also effectively adding the concept of per-dimension user-defined attributes. As I see it, we can basically accomplish the same thing with either representation, but there are pros and cons to each approach:
{"dimensions": [{"name": "x", "coordinate_array": {"must_understand": false, "path": "xxxxx"}}, ...],
 ...}
However, it is not clear whether adding an optional per-dimension attribute to the core metadata is particularly useful, given that it could instead be added as a user attribute.
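To make the equivalence between the two layouts concrete, here is a minimal Python sketch that converts between the columnar and row-based forms. The function names `columnar_to_rows` and `rows_to_columnar` are hypothetical helpers for illustration only; neither is part of the zarr spec or of any zarr library, and the attribute keys are simply the ones used in the example above.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]
Columns = Dict[str, List[Any]]


def columnar_to_rows(names: List[Optional[str]], columns: Columns) -> List[Row]:
    """Turn parallel per-dimension lists into one dict per dimension.

    A null (None) entry in a column means "not set for this dimension",
    so the key is simply omitted from that dimension's row.
    """
    for key, values in columns.items():
        if len(values) != len(names):
            raise ValueError(f"column {key!r} has {len(values)} entries, "
                             f"expected {len(names)}")
    rows = []
    for i, name in enumerate(names):
        row: Row = {"name": name}
        for key, values in columns.items():
            if values[i] is not None:
                row[key] = values[i]
        rows.append(row)
    return rows


def rows_to_columnar(rows: List[Row]) -> Tuple[List[Optional[str]], Columns]:
    """Inverse direction: collect every key seen in any row into a column."""
    names = [row.get("name") for row in rows]
    keys = sorted({k for row in rows for k in row if k != "name"})
    columns = {k: [row.get(k) for row in rows] for k in keys}
    return names, columns


names = ["x", "y", "c", None]
columns = {
    "dimension_units": ["meter", "meter", None, None],
    "dimension_type": ["space", "space", None, None],
}
rows = columnar_to_rows(names, columns)
```

One asymmetry this sketch exposes: a column that is null for every dimension disappears on the way to rows and cannot be recovered on the way back, so the two forms are only equivalent for attributes that are set on at least one dimension.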
This is something that could still be added later, as an alternative to
"dimensions": {
  "x": { // "x" refers to a dimension name
    "type": "space",
    "unit": "nanometer"
  }
}
This would then be in addition to
Simply having dimension names without further properties was requested by multiple people in different threads, which I think justifies adding it to v3 atm. I don't think that this is the case for any more complex fields. Since this can be changed and added later as well, I'm strongly in favor of keeping the spec as-is for now.
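As a rough illustration of how a name-keyed `dimensions` mapping like the one above could sit alongside a `dimension_names` list, the following Python sketch resolves the mapping back into array-axis order. The `dimensions` field is only the hypothetical extension floated in this comment, not part of the v3 spec, and `per_axis_properties` is an invented helper name.

```python
from typing import Any, Dict, List, Optional


def per_axis_properties(
    dimension_names: List[Optional[str]],
    dimensions: Dict[str, Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return one property dict per axis, in array order.

    Axes whose name is null (None) cannot be addressed by a name-keyed
    mapping, so they always resolve to an empty dict here.
    """
    known = {name for name in dimension_names if name is not None}
    unknown = set(dimensions) - known
    if unknown:
        raise ValueError(
            f"properties given for unknown dimension names: {sorted(unknown)}"
        )
    return [
        dict(dimensions.get(name, {})) if name is not None else {}
        for name in dimension_names
    ]


props = per_axis_properties(
    ["x", "y", None],
    {"x": {"type": "space", "unit": "nanometer"}},
)
```

Keying by name keeps the flat `dimension_names` list as the single source of axis order, at the cost that unnamed dimensions cannot carry extra properties in this layout.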
Well, OME-NGFF has that. But I don't want this to block ZEP1, either.
@normanrz Can we close this issue?
I was going through the v3 spec and noticed the dimension_names attribute. I was wondering if that might better be placed in the upcoming metadata convention ZEP? There are already community-specific conventions for assigning names (and other metadata) to dimensions, e.g. OME-Zarr or xarray zarr dimension encoding.