[Bug]: coordinate attributes not always retained during group_average operations #529

Closed
pochedls opened this issue Aug 10, 2023 · 5 comments
Labels
type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@pochedls
Collaborator

pochedls commented Aug 10, 2023

What happened?

Scenario A (xcdat v0.5.0 + xarray >=v2023.3.0)

This scenario affects xcdat v0.5.0 (regardless of xarray version). In this scenario, xcdat does not create time bounds (time_bnds) for annually averaged data.

import xcdat as xc

fn = '/p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2/amip/r1i1p1f1/Amon/tas/gn/v20190218/tas_Amon_CESM2_amip_r1i1p1f1_gn_195001-201412.nc'
ds = xc.open_dataset(fn)
print('Time bounds in dataset: ', 'time_bnds' in ds.variables)
ds = ds.temporal.group_average('tas', freq='year')
print('Time bounds in dataset after taking annual average: ', 'time_bnds' in ds.variables)

Time bounds in dataset: True
Time bounds in dataset after taking annual average: False

Scenario B (xcdat 0.5.0 and xarray 2023.7.0)

This scenario affects xcdat v0.5.0 (with xarray v2023.7.0, but not xarray v2023.3.0). In this scenario, xcdat drops axis attributes after performing a group_average operation.

import xcdat as xc

fn = '/p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2/amip/r1i1p1f1/Amon/tas/gn/v20190218/tas_Amon_CESM2_amip_r1i1p1f1_gn_195001-201412.nc'
ds = xc.open_dataset(fn)
ds.lon.attrs
ds = ds.sel(time=slice("1950-01-01", "2000-12-30"))
ds = ds.temporal.group_average('tas', freq='year')
ds.bounds.get_bounds("X")

KeyError: "No bounds data variables were found for the 'X' axis. Make sure the dataset has bound data vars and their names match the 'bounds' attributes found on their related time coordinate variables. Alternatively, you can add bounds with xcdat.add_missing_bounds or xcdat.add_bounds."

This happens because the attributes for ds.lon are missing after taking the group average. Inspecting ds.lon.attrs returns:

{}

(They were not missing before taking the group average)

What did you expect to happen? Are there any possible answers you came across?

In scenario A, I would have expected annual bounds (this isn't that critical, though).

In scenario B, I would have expected ds.lon.attrs to persist after ds.temporal.group_average() operations. Note that this issue has downstream effects (e.g., taking spatial averages which need the lat/lon bounds).
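
For environments that cannot yet move to a fixed release, one possible workaround for scenario B is to snapshot the coordinate attributes before calling group_average and to restore any that come back empty. The sketch below is a minimal, duck-typed illustration. The helper names snapshot_coord_attrs and restore_coord_attrs are hypothetical (they are not part of xcdat or xarray), and the code assumes only that the dataset exposes .coords and per-coordinate .attrs dictionaries, as xarray datasets do. It has not been run against the affected xcdat/xarray combination.

```python
def snapshot_coord_attrs(ds):
    """Return a copy of every coordinate's attrs dict, keyed by coordinate name.

    Hypothetical helper: assumes `ds.coords` is a mapping of coordinate
    names to objects with a dict-like `.attrs` (as on an xarray Dataset).
    """
    return {name: dict(ds.coords[name].attrs) for name in ds.coords}


def restore_coord_attrs(ds, saved):
    """Refill coordinate attrs that came back empty after an operation.

    Coordinates that still carry attributes are left untouched, so nothing
    written by the operation itself is overwritten. Returns the names of
    the coordinates that were restored.
    """
    restored = []
    for name, attrs in saved.items():
        if name in ds.coords and attrs and not ds.coords[name].attrs:
            ds.coords[name].attrs.update(attrs)
            restored.append(name)
    return restored


# Intended usage with the scenario B example (left commented out because it
# needs xcdat and the data file):
#
#   saved = snapshot_coord_attrs(ds)
#   ds = ds.temporal.group_average('tas', freq='year')
#   restore_coord_attrs(ds, saved)
#   ds.bounds.get_bounds("X")
```

Restoring only empty attribute dictionaries is a deliberate choice in this sketch: it avoids clobbering attributes on coordinates (such as time) that the averaging step may legitimately have rewritten.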

Minimal Complete Verifiable Example (MCVE)

No response

Relevant log output

No response

Anything else we need to know?

I'm less worried about Scenario A. Scenario B can actually cause downstream problems.

Environment

xcdat v0.5.0
xarray version v2023.7.0

@pochedls pochedls added the type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label Aug 10, 2023
@tomvothecoder
Collaborator

tomvothecoder commented Aug 16, 2023

In scenario A, I would have expected annual bounds (this isn't that critical, though).

In the scenario where a user runs group_average(), they can decide which time bounds method to use to generate time bounds (add_time_bounds(method="freq") or add_time_bounds(method="midpoint")). An alternative is to detect the method used for the existing bounds, then recreate them internally after group averaging.
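
To make the difference between the two methods concrete, here is a small standard-library sketch of what frequency-based and midpoint-based bounds could look like for annual averages. It is not xcdat's implementation. The function names are hypothetical, and it assumes that each annual average is labelled at the start of its year.

```python
from datetime import datetime


def freq_bounds_yearly(times):
    """Bounds that span each point's calendar year: Jan 1 to the next Jan 1."""
    return [(datetime(t.year, 1, 1), datetime(t.year + 1, 1, 1)) for t in times]


def midpoint_bounds(times):
    """Bounds placed at the midpoints between neighbouring time points.

    The outer edges are extrapolated by mirroring the nearest midpoint
    around the first and last time points.
    """
    if len(times) < 2:
        raise ValueError("midpoint bounds need at least two time points")
    mids = [a + (b - a) / 2 for a, b in zip(times, times[1:])]
    first_edge = times[0] - (mids[0] - times[0])
    last_edge = times[-1] + (times[-1] - mids[-1])
    edges = [first_edge, *mids, last_edge]
    return list(zip(edges, edges[1:]))


# Assumed labelling for this sketch: one point per year, at Jan 1.
times = [datetime(1950, 1, 1), datetime(1951, 1, 1), datetime(1952, 1, 1)]

for t, fb, mb in zip(times, freq_bounds_yearly(times), midpoint_bounds(times)):
    print(t.date(), "freq:", fb, "midpoint:", mb)
```

With start-of-year labels, the frequency-based bounds cover the calendar year that was averaged, while the midpoint-based bounds are centred on the label and therefore straddle two calendar years. That is why the choice of method matters when bounds are regenerated after group averaging.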

Scenario B (xcdat 0.5.0 and xarray 2023.7.0)

I believe this scenario is a duplicate of #458 and was resolved by #465, which will be included in v0.6.0.
Can you try the same code out on the main branch with the latest dev environment?

@pochedls
Collaborator Author

This works on the main branch. We might consider a feature in the future that adds the correct bounds after group averaging, but that can be raised as a separate issue. I think this can be closed.

@tomvothecoder
Collaborator

This works on the main branch. We might consider a feature in the future that adds the correct bounds after group averaging, but that can be raised as a separate issue. I think this can be closed.

I opened #534 to cover scenario A.

@tomvothecoder
Collaborator

@pochedls I just released v0.5.0 patch 1 on conda-forge to fix scenario B.

conda/mamba should automatically pull the latest version. If you want to update xcdat in an existing environment, you can run mamba update -c conda-forge xcdat or mamba install -c conda-forge xcdat=0.5.0=pyhd8ed1ab_1

I ran your example for scenario B below and it works with v0.5.0 now.

import xcdat as xc

fn = '/p/css03/esgf_publish/CMIP6/CMIP/NCAR/CESM2/amip/r1i1p1f1/Amon/tas/gn/v20190218/tas_Amon_CESM2_amip_r1i1p1f1_gn_195001-201412.nc'
ds = xc.open_dataset(fn)
ds.lon.attrs
ds = ds.sel(time=slice("1950-01-01", "2000-12-30"))
ds = ds.temporal.group_average('tas', freq='year')
ds.bounds.get_bounds("X")

@tomvothecoder
Collaborator

Also just an FYI @xCDAT/core-developers

@tomvothecoder tomvothecoder changed the title [Bug]: bounds not always retained during group_average operations [Bug]: coordinate attributes not always retained during group_average operations Aug 30, 2023