[Bug]: Issues opening up datasets with multiple coord vars for the same axis mapped to single dim/independent dims (IPSL ESM files) #285
Comments
Thanks for sharing this – we had not come across this case yet (looking at CMIP6 data). This file seems to have two time variables (with identical time values in each), but I could imagine that some netCDF files would have two distinct time axes. I see that this issue results because xcdat is expecting one time axis and has logic built around checking/decoding a single time axis, but this file has This is going to need a closer look. |
Thank you for your interest. Note that we could specify the name of the right time axis in the call to open_dataset. However, if different variables in the file use different time axes, we are screwed with this method. |
By specifying the name when you open the dataset, you could at least work with all the variables that share that axis. Then open a new session for the variables with the other time axis. |
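A minimal sketch of that stopgap in plain xarray, assuming the names of the time axes in the file are already known. The helper `select_time_axis` and the variable names `time_a`, `time_b`, `tas` and `pr` are hypothetical and are not part of xcdat; the idea is only to keep the variables that share one time axis and drop the rest.

```python
import numpy as np
import xarray as xr


def select_time_axis(ds, keep, time_names):
    """Return a copy of ds restricted to the variables that share one time axis.

    `keep` is the time axis to retain; `time_names` lists every time axis
    in the file. Both are supplied by the caller (nothing is auto-detected).
    """
    others = [name for name in time_names if name != keep]
    # Drop the other time coordinate variables themselves, if present.
    to_drop = {name for name in others if name in ds.variables}
    # Drop every data variable that is defined along one of the other axes.
    for name, var in ds.data_vars.items():
        if any(other in var.dims for other in others):
            to_drop.add(name)
    return ds.drop_vars(sorted(to_drop))


# Synthetic dataset with two distinct time axes (illustrative values only).
ds = xr.Dataset(
    data_vars={
        "tas": ("time_a", np.arange(3.0)),
        "pr": ("time_b", np.arange(4.0)),
    },
    coords={
        "time_a": ("time_a", np.array([0.5, 1.5, 2.5]), {"axis": "T"}),
        "time_b": ("time_b", np.array([0.0, 1.0, 2.0, 3.0]), {"axis": "T"}),
    },
)

ds_a = select_time_axis(ds, keep="time_a", time_names=["time_a", "time_b"])
ds_b = select_time_axis(ds, keep="time_b", time_names=["time_a", "time_b"])
```

Each subset then contains a single time axis, so the two subsets can be processed separately, one "session" per time axis.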
I like @rljacob's idea (at least as a possible stopgap solution). I think you could potentially implement this by taking in an optional variable in I am hitting a similar error opening CESM2 LE data (which signals this issue may affect E3SM files, too).
As a temporary solution, I am opening this data with xarray. xcdat basically uses |
@pochedls is there a way you can make the Also, I think adding something in the documentation about the logic of finding the axes and how the Clarifying how you handle the
|
I looked into this a little further and have some ideas to address the issue in Test 1 and Test 2. The immediate problem is when opening the dataset and getting the time axis in the
This occurs because the dataset has multiple time axes. There are a couple of potentially simple solutions to this:
A more complete solution might look like:
I think this would decode all time axes, which I think is what we should do (if possible). I do think temporal operations will still need to be updated to be able to select the appropriate time coordinate (e.g., by selecting the time coordinate associated with a specified data_var). One note that might simplify the main logic above (at Step 2). Our version of |
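One way to picture "selecting the time coordinate associated with a specified data_var" is the sketch below. It is not xcdat's implementation: the helpers `find_time_coords` and `time_coord_for` are hypothetical, and the detection rule (a variable counts as a time coordinate if it has `axis == "T"` or `standard_name == "time"`) is a simplifying assumption. The synthetic dataset only mimics the layout of the IPSL file, with `time_counter` and `time_centered` both mapped to the `time_counter` dimension.

```python
import numpy as np
import xarray as xr


def find_time_coords(ds):
    """Return the sorted names of all variables that look like time coordinates."""
    names = []
    for name, var in ds.variables.items():
        if var.attrs.get("axis") == "T" or var.attrs.get("standard_name") == "time":
            names.append(name)
    return sorted(names)


def time_coord_for(ds, data_var):
    """Pick the time coordinate associated with one data variable."""
    candidates = find_time_coords(ds)
    dims = ds[data_var].dims
    # Prefer a candidate that is itself a dimension of the data variable.
    for name in candidates:
        if name in dims:
            return name
    # Otherwise accept a candidate defined only along the variable's dimensions.
    for name in candidates:
        if set(ds[name].dims) <= set(dims):
            return name
    raise KeyError(f"No time coordinate found for {data_var!r}")


time_attrs = {"standard_name": "time", "units": "days since 1850-01-01"}
values = np.array([15.5, 45.0, 74.5])

ds = xr.Dataset(
    data_vars={
        "thetao": (("time_counter", "y"), np.zeros((3, 2))),
        "time_centered": ("time_counter", values, time_attrs),
    },
    coords={
        "time_counter": ("time_counter", values, {"axis": "T", **time_attrs}),
    },
)

all_time_coords = find_time_coords(ds)
chosen = time_coord_for(ds, "thetao")
```

Here both time variables are found, but `time_counter` is chosen for `thetao` because it is the dimension coordinate. A temporal operation could use the same kind of lookup before grouping or weighting.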
In looking at Test 3, I see another issue here. The time axis needed for temporal operations is not actually assigned as a dimension. I thought this could be fixed by dropping
But this yields a different error:
@tomvothecoder - do you think this is a separate issue in the temporal averaging logic? |
Yes, this issue is related to the I opened up a separate issue and the related PR that fixes this: |
I just wanted to note that I hit this error message when using the regridder (use the code in this issue with one of the two datasets below).
It's possible that efforts to address this issue (e.g., here) will fix the regridding issue. |
Hi @oliviermarti and @jypeter, Thank you for your patience as we worked to solve this issue. I just wanted to inform you that this issue is now resolved in xcdat=0.4.0 (#343). Example code:
import numpy as np
import xarray as xr
import xcdat as xc
print(xc.__version__)
# 0.4.0
TS_File ='https://thredds-su.ipsl.fr/thredds/dodsC/ipsl_thredds/omamce/TestCases/XCDAT/thetao_CF_1.nc'
dTS = xc.open_dataset(TS_File, add_bounds=True, decode_times=True)
print(dTS)
# <xarray.Dataset>
# Dimensions: (y: 149, x: 182, olevel: 31, time_counter: 12,
# axis_nbounds: 2, bnds: 2)
# Coordinates:
# * olevel (olevel) float32 5.0 15.0 25.0 ... 4.75e+03 5.25e+03
# * time_counter (time_counter) object 1850-01-16 12:00:00 ... 1850-...
# Dimensions without coordinates: y, x, axis_nbounds, bnds
# Data variables:
# nav_lat (y, x) float32 ...
# nav_lon (y, x) float32 ...
# thetao (time_counter, olevel, y, x) float32 ...
# time_centered (time_counter) float64 ...
# time_centered_bounds (time_counter, axis_nbounds) float64 ...
# time_counter_bnds (time_counter, bnds) object 1850-01-01 18:00:00 ......
# olevel_bnds (olevel, bnds) float32 -0.000237 10.0 ... 5.5e+03
# Attributes:
# name: CM6-SW-pi-Par01_1m_18500101_18501231_grid_T
# description: Created by xios
# title: Created by xios
# Conventions: CF-1.6
# timeStamp: 2021-Oct-22 08:27:34 GMT
# uuid: bf661a39-5d4a-4999-aaae-56f09c3123c5
# LongName: IPSLCM6.1.10-LR
# NCO: netCDF Operators version 4.9.9 (Homepage...
# _NCProperties: version=2,netcdf=4.7.4,hdf5=1.10.6
# history: Thu Jul 28 16:24:47 2022: ncatted -a coo...
# DODS_EXTRA.Unlimited_Dimension: time_counter
tos_year = dTS.temporal.group_average("thetao", freq="year", weighted=True)
# <xarray.Dataset>
# Dimensions: (y: 149, x: 182, olevel: 31, bnds: 2, time_counter: 1)
# Coordinates:
# * olevel (olevel) float32 5.0 15.0 25.0 ... 4.25e+03 4.75e+03 5.25e+03
# * time_counter (time_counter) object 1850-01-01 00:00:00
# Dimensions without coordinates: y, x, bnds
# Data variables:
# nav_lat (y, x) float32 ...
# nav_lon (y, x) float32 ...
# olevel_bnds (olevel, bnds) float32 -0.000237 10.0 10.0 ... 5e+03 5.5e+03
# thetao (time_counter, olevel, y, x) float64 nan nan nan ... nan nan
# Attributes:
# name: CM6-SW-pi-Par01_1m_18500101_18501231_grid_T
# description: Created by xios
# title: Created by xios
# Conventions: CF-1.6
# timeStamp: 2021-Oct-22 08:27:34 GMT
# uuid: bf661a39-5d4a-4999-aaae-56f09c3123c5
# LongName: IPSLCM6.1.10-LR
# NCO: netCDF Operators version 4.9.9 (Homepage...
# _NCProperties: version=2,netcdf=4.7.4,hdf5=1.10.6
# history: Thu Jul 28 16:24:47 2022: ncatted -a coo...
# DODS_EXTRA.Unlimited_Dimension: time_counter
In your example file:
What changed in
|
On 10 Nov 2022 at 23:46, Tom Vo ***@***.***> wrote:
Hi @oliviermarti and @jypeter,
Thank you for your patience as we worked to solve this issue. I just wanted to inform you that this issue is now resolved in xcdat=0.4.0 (#343). You should be able to add bounds, decode times, and perform xcdat temporal operations now.
Give it a shot and let us know how it goes!
Hi,
Sorry for this late response.
This xcdat revision is OK. I can read and manipulate my data now. Thanks !
Olivier
|
What happened?
The IPSL Earth System Model output files have two time variables, and xcdat refuses to open them.
Developer Notes (@tomvothecoder, 8/30/22)
Dataset Info:
time_counter
time_counter and time_variable mapped to time_counter dimension
How xcdat handles coords:
cf_xarray.cf["T"] to get the coord var, which breaks if there are multiple coord vars found
cf_xarray.cf[["T"]] to return a Dataset with all the coord vars mapped to an axis
What did you expect to happen?
No response
Minimal Complete Verifiable Example
Relevant log output
Anything else we need to know?
No response
Environment
xcdat.version : '0.3.0'
xr.show_versions() :
INSTALLED VERSIONS
commit: None
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:03:09) [Clang 13.0.1 ]
python-bits: 64
OS: Darwin
OS-release: 21.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.3.0
pandas: 1.4.3
numpy: 1.22.4
scipy: 1.8.1
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.1
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.6.1
distributed: 2022.6.1
matplotlib: 3.5.2
cartopy: 0.20.2
seaborn: None
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: 0.19.2
sparse: 0.13.0
setuptools: 63.1.0
pip: 22.1.2
conda: 4.13.0
pytest: None
IPython: 8.4.0
sphinx: None