[Bug]: xarray.open_mfdataset drops .encoding attributes, need to handle with xcdat.open_mfdataset() #308

Closed
tomvothecoder opened this issue Aug 10, 2022 · 0 comments · Fixed by #309
Assignees
Labels
type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.


tomvothecoder commented Aug 10, 2022

What happened?

Related to #302 (comment).

Nice catch! I found an open xarray issue related to xr.open_mfdataset() dropping encoding attributes.

Somebody mentions this about the issue here pydata/xarray#2436 (comment):

However, there is another problem related to xarray.open_mfdataset:
The encoding dictionary gets lost somewhere during the merging operation of the datasets of the respective files (https://github.com/pydata/xarray/issues/2436).

This leads to problems for example with cf-xarray when trying to detect coordinates or bounds, but also leads to problems related to the time axis encoding apparently (as seen in the linked issue). I managed at least to avoid the problems for cf-xarray bounds and coordinates detection by using the decode functionality of xarray only after the datasets have been read in (leaving however the unnecessary time dimension in place ...):

ds = xarray.open_mfdataset("/path/to/files/*.nc")
ds = xarray.decode_cf(ds, decode_coords="all")

I think we might need to use the preprocess kwarg of open_mfdataset to save the .encoding dict, or apply another workaround such as one of those suggested in the comments on that issue.

The temporal averaging APIs require decoded time coordinates, and the ds.time.encoding dict must contain the "calendar" key.
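A quick way to check this requirement before calling a temporal averaging method could look like the sketch below. `missing_encoding_keys` is a hypothetical helper, not part of xcdat or xarray; it only assumes that `ds[coord].encoding` is a dict, as it is for xarray datasets.

```python
def missing_encoding_keys(ds, coord="time", required=("calendar",)):
    """Return the required encoding keys that are absent from ds[coord].

    Hypothetical helper: works on any object where ds[coord].encoding
    is a dict (for example an xarray.Dataset).
    """
    encoding = ds[coord].encoding
    return [key for key in required if key not in encoding]
```

An empty list means the keys are present; with the dataset from the example in this issue, where `ds.time.encoding` is `{}`, the result would be `["calendar"]`.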

What did you expect to happen?

The .encoding attributes should remain persistent after running xcdat.open_mfdataset().

This does not affect xcdat.open_dataset().

Minimal Complete Verifiable Example

>>> import xcdat
>>> ds = xcdat.open_mfdataset(
...     "/p/css03/cmip5_css02/data/cmip5/output1/ICHEC/EC-EARTH/historical/mon/atmos/Amon/r1i1p1/tas/1/*.nc"
... )
>>> ds.time.encoding
{}

Relevant log output

No response

Anything else we need to know?

Possible Solutions

  1. Store .encoding and copy it back to the final merged dataset (maybe with preprocess kwarg)
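Solution 1 could be sketched roughly as follows. This is only an illustration under assumptions: the names `restore_encoding` and `open_mfdataset_keep_encoding` are hypothetical, the time encoding of the first file is assumed to be representative of all files, and the first file is opened separately instead of going through the `preprocess` kwarg.

```python
import glob


def restore_encoding(ds, saved_encoding, coord="time"):
    """Copy a saved encoding dict back onto ds[coord] if it was dropped.

    Hypothetical helper. An existing, non-empty encoding is left untouched.
    """
    if not ds[coord].encoding:
        ds[coord].encoding = dict(saved_encoding)
    return ds


def open_mfdataset_keep_encoding(pattern, coord="time", **kwargs):
    """Hypothetical wrapper: open the files, then restore the time encoding."""
    import xarray as xr  # imported lazily so restore_encoding has no dependencies

    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r}")

    # Read the encoding from the first file only; assumes all files share it.
    with xr.open_dataset(paths[0]) as first:
        saved = dict(first[coord].encoding)

    ds = xr.open_mfdataset(paths, **kwargs)
    return restore_encoding(ds, saved, coord=coord)
```

Usage would be along the lines of `ds = open_mfdataset_keep_encoding("/path/to/files/*.nc")`. One caveat: the saved dict can carry file-specific keys such as `source`, which then describe only the first file rather than the merged dataset.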

Related Issues

Environment

xcdat 0.3.0

INSTALLED VERSIONS

commit: None
python: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:56:21)
[GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.45.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 2022.3.0
pandas: 1.4.3
numpy: 1.22.4
scipy: 1.9.0
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.6.1
distributed: 2022.6.1
matplotlib: 3.5.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.7.1
cupy: None
pint: None
sparse: 0.13.0
setuptools: 63.3.0
pip: 22.1.2
conda: None
pytest: 7.1.2
IPython: 8.4.0
sphinx: 4.5.0

@tomvothecoder tomvothecoder added the type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label Aug 10, 2022
@tomvothecoder tomvothecoder self-assigned this Aug 10, 2022
@tomvothecoder tomvothecoder moved this to Todo in v0.3.1 Aug 10, 2022
Repository owner moved this from Todo to Done in v0.3.1 Aug 15, 2022