save "encoding" when using open_mfdataset #2436
Comments
Do you know you can access them in the |
It would be ok but it is (or looks) empty when I use open_dataset() |
@sbiner are you looking at the encoding attribute of the full Dataset or the time variable? The time variable should retain the calendar encoding (the Dataset will not). E.g.:
|
@spencerkclark Yes, I was looking at time.encoding. Following your example I did some tests, and the problem is related to the fact that I am opening multiple netCDF files with open_mfdataset. In that case time.encoding is empty, whereas it is as expected when opening any one of the files with open_dataset instead. |
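To make the distinction above concrete, here is a minimal sketch of where the calendar information lives after a plain open_dataset call. The file name, the variable name snw, and the toy daily time axis are assumptions for illustration only; the calendar printed is whatever xarray writes by default for these dates, not the noleap calendar from the original files.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data: four daily time steps of a variable named "snw".
times = pd.date_range("2000-01-01", periods=4, freq="D")
ds = xr.Dataset({"snw": ("time", np.arange(4.0))}, coords={"time": times})

# Write a single file to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "toy.nc")
ds.to_netcdf(path)

with xr.open_dataset(path) as opened:
    # Encoding of the time variable: holds the on-disk units and calendar.
    time_encoding = dict(opened["time"].encoding)
    # Encoding of the Dataset itself: file-level information only.
    dataset_encoding = dict(opened.encoding)

print(time_encoding.get("units"), time_encoding.get("calendar"))
print(sorted(dataset_encoding))
```

With a single file, the units and calendar should appear in the time variable's encoding and not in the Dataset's encoding.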
If So I think to fix this then either |
@TomNicholas I forgot about this sorry. I just made a quick check with the latest xarray master and I still have the problem ... see code. Related question but maybe out of line, is there any way to know that the snw.time type is cftime.DatetimeNoLeap (as it is visible in the overview of
|
No worries!
#3498 added a new keyword argument to
If this is the case, then to solve your original problem, you could also try using the
def store_encoding(ds):
    encoding = ds['time'].encoding
    ds.time.attrs['calendar_encoding'] = encoding
    return ds
snw = xr.open_mfdataset(l_f, combine='nested', concat_dim='time',
                        master_file=l_f[0], preprocess=store_encoding)['snw']
I'm not familiar with these classes, but presumably you mean more than just checking with
from cftime import DatetimeNoLeap
print(isinstance(snw.time.values, DatetimeNoLeap)) |
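A variant of the preprocess idea above, as a rough sketch: instead of storing the encoding dictionary in attrs, the callback records each file's time encoding in an ordinary Python dict before the datasets are combined. The names record_time_encoding and captured, the file names, and the toy data are all assumptions for illustration. The sketch relies on ds.encoding["source"] holding the file path inside preprocess, and open_mfdataset needs dask installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data split across two files along time.
tmpdir = tempfile.mkdtemp()
times = pd.date_range("2000-01-01", periods=6, freq="D")
full = xr.Dataset({"snw": ("time", np.arange(6.0))}, coords={"time": times})
paths = []
for i, part in enumerate([full.isel(time=slice(0, 3)), full.isel(time=slice(3, 6))]):
    path = os.path.join(tmpdir, f"part{i}.nc")
    part.to_netcdf(path)
    paths.append(path)

# Maps each source file to a copy of its time encoding.
captured = {}


def record_time_encoding(ds):
    """Hypothetical preprocess hook: remember the per-file time encoding."""
    source = ds.encoding.get("source", f"file{len(captured)}")
    captured[source] = dict(ds["time"].encoding)
    return ds


with xr.open_mfdataset(paths, combine="by_coords",
                       preprocess=record_time_encoding) as combined:
    n_times = combined.sizes["time"]

for source, enc in captured.items():
    print(os.path.basename(source), enc.get("units"), enc.get("calendar"))
```

Keeping the encodings outside the dataset avoids putting a dictionary into attrs, and leaves the per-file units and calendars available for inspection after the merge.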
#3498 says something about a
I tried and it did not work ...
Yes, I was more thinking of something like
|
unfortunately,
>>> import cftime
>>> isinstance(ds.time.values[0], cftime.DatetimeNoLeap)
True
>>> type(ds.time.values[0])
<class 'cftime._cftime.DatetimeNoLeap'>
In #3498, the original proposal was to name the new kwarg
Before trying to help with debugging your issue: could you post the output of
Also, could you try to demonstrate your issue using a synthetic example? I've been trying to reproduce it with:
In [14]: units = 'days since 2000-02-25'
    ...: times = cftime.num2date(np.arange(7), units=units, calendar='365_day')
    ...: for x in range(5):
    ...:     ds = xr.DataArray(
    ...:         np.arange(x, 7 + x).reshape(7, 1),
    ...:         coords={"time": times, "x": [x]},
    ...:         dims=['time', "x"],
    ...:         name='a',
    ...:     ).to_dataset()
    ...:     ds.to_netcdf(f'data-noleap{x}.nc')
    ...: paths = sorted(glob.glob("data-noleap*.nc"))
    ...: with xr.open_mfdataset(paths, combine="by_coords") as ds:
    ...:     print(ds.time.encoding)
    ...:
{'zlib': False, 'shuffle': False, 'complevel': 0, 'fletcher32': False, 'contiguous': True, 'chunksizes': None, 'source': '.../data-noleap0.nc', 'original_shape': (7,), 'dtype': dtype('int64'), 'units': 'days since 2000-02-25 00:00:00.000000', 'calendar': 'noleap'} |
This example works without |
I removed it since it doesn't change anything. |
I use the following, which seems to work for me but I thought something shorter and more elegant could be done ...
Yes,
Here is the output:
I used your code and it works for me also. I noticed the synthetic file is
Here is an output of
I guess the next option could be to go into xarray code to try to find what the problem is but I would need some direction for doing this. |
Description
Any news about this issue? I am facing the same problem and I had to get the calendars by hand... I tried to update xarray but there is still the same issue of missing the
Step to reproduce
Here is a simple example to illustrate:
import xarray as xr
ds = xr.tutorial.open_dataset('air_temperature')
ds.time.encoding
that gives:
{'source': '/home/lalandmi/.xarray_tutorial_data/air_temperature.nc',
 'original_shape': (2920,),
 'dtype': dtype('float32'),
 'units': 'hours since 1800-01-01',
 'calendar': 'standard'}
Let's split this dataset and try to read it back with open_mfdataset:
ds.sel(time='2013').to_netcdf('tutorial_air_temperature_2013.nc')
ds.sel(time='2014').to_netcdf('tutorial_air_temperature_2014.nc')
ds_mf = xr.open_mfdataset('tutorial_air_temperature_*.nc', combine='by_coords')
ds_mf.time.encoding
that results in an empty dictionary:
{}
Adding some arguments
Xarray version
xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.4 | packaged by conda-forge | (default, Jul 17 2020, 15:16:46)
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.19.0-9-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.0
pandas: 1.0.5
numpy: 1.19.0
scipy: None
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.21.0
distributed: 2.21.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.2.0.post20200712
pip: 20.1.1
conda: None
pytest: None
IPython: 7.16.1
sphinx: None |
Any update on this issue? I'm working on code where I want to make sure I have consistent calendars for all my inputs. Couldn't we add an option to use the encoding from the first file in the list, or something similar? |
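Until such an option exists, one possible workaround can be sketched in user code: read the time encoding of each file with open_dataset, check that the calendars agree, then copy the first file's units and calendar onto the combined dataset. This is only an illustration of the idea in the comment above, not an xarray feature. The helper read_time_encoding, the file names, and the toy data are assumptions, and open_mfdataset needs dask installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data split across two files along time.
tmpdir = tempfile.mkdtemp()
times = pd.date_range("2000-01-01", periods=6, freq="D")
full = xr.Dataset({"snw": ("time", np.arange(6.0))}, coords={"time": times})
paths = []
for i, part in enumerate([full.isel(time=slice(0, 3)), full.isel(time=slice(3, 6))]):
    path = os.path.join(tmpdir, f"part{i}.nc")
    part.to_netcdf(path)
    paths.append(path)


def read_time_encoding(path):
    """Hypothetical helper: return the on-disk units and calendar of one file."""
    with xr.open_dataset(path) as ds:
        enc = ds["time"].encoding
        return {key: enc.get(key) for key in ("units", "calendar")}


per_file = [read_time_encoding(p) for p in paths]

# Refuse to continue if the inputs do not share a single calendar.
calendars = {enc["calendar"] for enc in per_file}
if len(calendars) != 1:
    raise ValueError(f"inconsistent calendars: {calendars}")

with xr.open_mfdataset(paths, combine="by_coords") as combined:
    # Re-attach the first file's units and calendar to the combined time axis.
    combined["time"].encoding.update(per_file[0])
    restored = dict(combined["time"].encoding)

print(restored["units"], restored["calendar"])
```

Only units and calendar are copied here. Other encoding keys such as source or original_shape describe a single file and would be misleading on the combined dataset.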
annoying issue with xarray pydata/xarray#2436
I like the automatic decoding of the time variable when reading netCDF files, but I often need to keep the calendar attribute of the time variable.
Would it be possible to keep those attributes in the Dataset/DataArray returned by open_dataset?