Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xarray allows several types for netcdf attributes. Is it expected ? #7093

Open
ghislainp opened this issue Sep 27, 2022 · 3 comments
Open

xarray allows several types for netcdf attributes. Is it expected ? #7093

ghislainp opened this issue Sep 27, 2022 · 3 comments
Labels
needs discussion topic-metadata Relating to the handling of metadata (i.e. attrs and encoding)

Comments

@ghislainp
Copy link
Contributor

What is your issue?

Xarray is permissive regarding the type of the attributes. If using a wrong type, the error reveals the valid types: For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple

Using a non iterable type used to raise an Exception when reading the saved netcdf, but this is now solved with #7085

The pending question is whether it is valid to save netcdf attributes with type other than a string or not.
The following lines are working (in a notebook):

xr.DataArray([1, 2, 3], attrs={'units': 1}, name='x').to_netcdf("tmp.nc")
!ncdump tmp.nc

xr.DataArray([1, 2, 3], attrs={'units': np.nan}, name='x').to_netcdf("tmp.nc")
!ncdump tmp.nc

xr.DataArray([1, 2, 3], attrs={'units': ['xarray', 'is', 'very', 'permissive', ]}, name='x').to_netcdf("tmp.nc")
!ncdump tmp.nc

On the other hand, the following line raises an error:

xr.DataArray([1, 2, 3], attrs={'units': None}, name='x').to_netcdf("tmp.nc")
@ghislainp ghislainp added the needs triage Issue that has not been reviewed by xarray team member label Sep 27, 2022
@headtr1ck
Copy link
Collaborator

What exactly do you mean by netcdf attributes? The special meaning ones like units, _FillValue or long_name?

In general xarray is quite relaxed on what users put into the attrs, internally it is simply a dict of anything (see e.g. #7111 where people put in recursive DataArrays).

As soon as you try to write it into a netCDF it has to be serializable, but that's why it raises an error.

Personally I think the current behavior is fine (ofc, plotting etc should be able to deal with non-standard units etc.)

@ghislainp
Copy link
Contributor Author

  • this issue was submitted because of solve a bug when the units attribute is not a string  #7085 (comment)
  • I also tend to agree that this behavior is fine for the non-specific netcdf attributes.
  • For specific attributes as unit and _FillValue, is it also fine ? I would expect that Dataset.to_netcdf check the type (string for unit and the type of the variable for _FillValue and raise at least a warning. It is currently possible to save a number for the unit... Is the resulting netcdf CF-compliant ?
  • if any kind of types for attributes is allowed, this requires to chase bugs of the kind solved in solve a bug when the units attribute is not a string  #7085.

@headtr1ck headtr1ck added needs discussion topic-metadata Relating to the handling of metadata (i.e. attrs and encoding) and removed needs triage Issue that has not been reviewed by xarray team member labels Oct 3, 2022
@acrosby
Copy link

acrosby commented Oct 4, 2022

I believe NetCDF, CF-Conventions, and xarray should all be considered independently. It is not necessarily the responsibility of xarray serialization to ensure a CF-complaint file...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs discussion topic-metadata Relating to the handling of metadata (i.e. attrs and encoding)
Projects
None yet
Development

No branches or pull requests

3 participants