Add general CORDEX coordinate boundary fix #184

Closed

Conversation

aperezpredictia
Contributor

@aperezpredictia aperezpredictia commented Jul 24, 2019

Description

CORDEX datasets do not provide rlat / rlon / lat / lon boundaries, so we have to reuse the function already created for bcc, guessing the boundaries of rlat / rlon first in order to interpolate them to lat / lon afterwards.

This PR is part of #182 , which is being separated into parts.

In the documentation, we should write something like '''Fixes for cordex-eur11-44''' instead of '''Fixes for bcc-csm1-1.'''

Closes #issue_number

Link to documentation:


Before you get started

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number of pull requests:

@zklaus

zklaus commented Jul 24, 2019

This PR depends on #183.

@mattiarighi mattiarighi added the enhancement (New feature or request) and cmor (Related to the CMOR standard) labels Jul 24, 2019
@bouweandela bouweandela changed the base branch from development to master January 3, 2020 12:11
@bouweandela
Member

@zklaus Now that #183 is merged, I think it would be good to continue with this pull request? Or should #185 go first?

@jvegreg
Contributor

jvegreg commented May 5, 2020

I reworked this pull request a bit, but I need a CORDEX dataset that actually requires this fix in order to test it.

@aperezpredictia , @mwjury can you provide an example?

@aperezpredictia
Contributor Author

Hello, you can just try with this dataset: https://cloud.predictia.es/index.php/s/E6IqhWhhy5ghlMi

@jvegreg
Contributor

jvegreg commented May 5, 2020

Ready to test from my side

To simplify the management of CORDEX, I slightly modified the way we load the fixes so that we can define generic project fixes that are applied to all datasets of a project. The CORDEX fix is implemented that way.
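One way to picture the "generic project fix" idea is a lookup that collects project-wide fixes first and then adds dataset-specific ones. The sketch below is purely illustrative: `FIXES`, `get_fixes`, the fix names and the dataset name `ExampleModel` are all hypothetical and do not mirror the actual ESMValCore loader.

```python
"""Toy lookup of project-wide and dataset-specific fixes (all names hypothetical)."""

# Registry keyed by (project, dataset); dataset=None marks a project-wide fix.
FIXES = {
    ("CORDEX", None): ["AddCoordinateBounds"],
    ("CORDEX", "ExampleModel"): ["FixUnits"],
}


def get_fixes(project, dataset):
    """Return project-wide fixes followed by dataset-specific ones."""
    generic = FIXES.get((project, None), [])
    specific = FIXES.get((project, dataset), [])
    return generic + specific


print(get_fixes("CORDEX", "ExampleModel"))  # project-wide fix comes first
print(get_fixes("CORDEX", "OtherModel"))    # only the project-wide fix applies
```

The point of the design is that a dataset without its own fix module still receives the project-level fix.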

@mwjury
Contributor

mwjury commented May 6, 2020

Thanks @jvegasbsc and @aperezpredictia, works like a charm!

I tested it also for extract_region (works fine) and area_statistics (doesn't, see #631).

Member

@bouweandela bouweandela left a comment

Great to see progress here! Just a small comment, could you have a look?

Review comment on esmvalcore/cmor/_fixes/cordex/project.py (outdated, resolved)
bouweandela previously approved these changes May 6, 2020
@bouweandela
Member

@mattiarighi Could you please run a final test and merge?

@zklaus

zklaus commented May 7, 2020

The thing that is special about CORDEX is that basically all coordinate systems are rotated lat-lon systems. Taking this into account would be better than using the rather crude interpolation to which we are condemned in situations like the bcc grid.

@bouweandela
Member

> Taking this into account

Where and how should this be taken into account? Would that affect this pull request?

@zklaus

zklaus commented May 11, 2020

> Where and how should this be taken into account? Would that affect this pull request?

Yes, this would affect this pull request.

This PR calculates boundaries for 2d lat-lon coordinates in a generic way, that is, it interpolates a spline to put cell corners roughly between cell centers in 2d space. This approach is appropriate where the cell center coordinates have no simple relation, or where that relation is not correctly available, as in the case of the bcc-csm models, where it allows us to still get somewhat meaningful cells.

However, here we have a simple rotated grid where not only the 1d rotated coordinates rlat and rlon are available, but even the full coordinate system information. The exact cell corners can be calculated simply by using guess_bounds on the 1d rlat and rlon, turning this into the full 2d boundary field with meshgrid, and then rotating back with iris.analysis.cartography.unrotate_pole.
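A minimal NumPy-only sketch of the three steps described above (guess 1d bounds, build the 2d corner field, rotate back). It is not the code from this PR: `guess_bounds_1d` and `unrotate` are hand-written stand-ins for iris' `guess_bounds` and `iris.analysis.cartography.unrotate_pole`, the corner ordering is an assumption, and the grid is made up. The pole values are the ones commonly quoted for the EUR-11 domain and should be read from the file's coordinate system in practice.

```python
import numpy as np


def guess_bounds_1d(points):
    """Place cell edges halfway between points, extrapolating at both ends."""
    points = np.asarray(points, dtype=float)
    mid = 0.5 * (points[:-1] + points[1:])
    first = points[0] - (mid[0] - points[0])
    last = points[-1] + (points[-1] - mid[-1])
    return np.concatenate([[first], mid, [last]])


def unrotate(rlon, rlat, pole_lon, pole_lat):
    """Convert rotated-pole coordinates (degrees) to geographic lon/lat.

    Stand-in for iris.analysis.cartography.unrotate_pole, assuming the usual
    convention where the rotated origin lies on the meridian opposite the
    rotated pole.
    """
    lam, phi = np.deg2rad(rlon), np.deg2rad(rlat)
    phi_p = np.deg2rad(pole_lat)
    # Unit vector in the rotated frame.
    x, y, z = np.cos(phi) * np.cos(lam), np.cos(phi) * np.sin(lam), np.sin(phi)
    # Tilt the frame so that the rotated pole ends up at latitude pole_lat.
    x_geo = np.sin(phi_p) * x - np.cos(phi_p) * z
    z_geo = np.cos(phi_p) * x + np.sin(phi_p) * z
    lat = np.rad2deg(np.arcsin(np.clip(z_geo, -1.0, 1.0)))
    lon = np.rad2deg(np.arctan2(y, x_geo)) + pole_lon + 180.0
    lon = (lon + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return lon, lat


def corner_bounds(rlon, rlat, pole_lon, pole_lat):
    """Return (lon_bounds, lat_bounds), each shaped (nlat, nlon, 4)."""
    lon_edges, lat_edges = np.meshgrid(
        guess_bounds_1d(rlon), guess_bounds_1d(rlat)
    )
    lon_e, lat_e = unrotate(lon_edges, lat_edges, pole_lon, pole_lat)

    def corners(edges):
        # Counterclockwise: lower-left, lower-right, upper-right, upper-left.
        return np.stack(
            [edges[:-1, :-1], edges[:-1, 1:], edges[1:, 1:], edges[1:, :-1]],
            axis=-1,
        )

    return corners(lon_e), corners(lat_e)


# Tiny made-up 0.11 degree grid around the rotated origin.
rlon = np.array([-0.11, 0.0, 0.11])
rlat = np.array([-0.11, 0.0, 0.11])
lon_bnds, lat_bnds = corner_bounds(rlon, rlat, pole_lon=-162.0, pole_lat=39.25)
print(lon_bnds.shape, lat_bnds.shape)
```

Because the corners are computed in the rotated system and only then rotated back, neighbouring cells share exactly the same corner values, which the spline-based approach does not guarantee.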

@bouweandela bouweandela dismissed their stale review May 12, 2020 12:51

Thanks for explaining @zklaus! @jvegasbsc Could you please have a look?

@mwjury
Contributor

mwjury commented Jun 9, 2020

I have been working a bit on this and the procedure suggested by @zklaus.
Currently attached project.txt (or here).

I ran into some problems, as some of the models have wrongly defined coordinate systems and/or iris has trouble reading Lambert conformal conic coordinate systems.
I checked for differences with the present latitude and longitude grid points (arrays): the largest were around +/- 0.005 (for one model; for all others around 1e-06 or well below). That leaves the question of whether the latitude and longitude grid points (arrays) should also be replaced, or whether only the boundaries should be added.
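A check of this kind can be sketched as follows. The arrays are made-up values standing in for the latitudes stored in a file and the latitudes recomputed from rlat / rlon, and `TOLERANCE` is a hypothetical threshold, not one taken from project.txt.

```python
import numpy as np


def max_abs_difference(stored, derived):
    """Largest absolute difference between two coordinate arrays."""
    return float(np.max(np.abs(np.asarray(stored) - np.asarray(derived))))


# Made-up values: three tiny float-level deviations and one of 0.005 degrees.
stored_lat = np.array([[45.000, 45.110], [45.220, 45.330]])
derived_lat = stored_lat + np.array([[1e-6, -2e-6], [0.0, 5e-3]])

TOLERANCE = 1e-4  # hypothetical threshold, in degrees
diff = max_abs_difference(stored_lat, derived_lat)
if diff > TOLERANCE:
    print(f"Stored and derived latitudes differ by up to {diff:.4f} degrees")
```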

I tested it on virtually all CORDEX EUR-44 models.

@zklaus, @jvegasbsc could you have a look please.

@mwjury
Contributor

mwjury commented Jun 17, 2020

@zklaus @jvegasbsc I made some updates and had to fix an error caused by a proj.db problem on my side.
Here is the new project.txt, which is now also tested for CORDEX EUR-11 models.

@aperezpredictia
Contributor Author

What is the status of this PR? Is there any way I can help you?

@bouweandela bouweandela added the fix for dataset (Related to dataset-specific fix files) label Aug 20, 2020
@jvegreg
Contributor

jvegreg commented Aug 24, 2020

Sorry @mwjury for the lack of response. A couple of questions:

  • Should the code in project.txt replace the current fix?

  • Are you happy enough with it that I can replace it straight away and focus on this alternative?

> I checked for differences with the present latitude and longitude grid points (arrays): the largest were around +/- 0.005 (for one model; for all others around 1e-06 or well below). That leaves the question of whether the latitude and longitude grid points (arrays) should also be replaced, or whether only the boundaries should be added.

The e-06 we can ignore safely, I think. Looks like float issues to me

About the 0.005 one, what is the resolution of the dataset? At 45° lat, using 45.005° is wrong by ~500 m, so it will only be significant for high-resolution (HR) experiments. In any case, it is a model issue.
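As a quick back-of-the-envelope check of these magnitudes, the sketch below converts a latitude offset into a distance along a meridian, assuming a spherical Earth with a mean radius of 6371 km. On that assumption, 0.005° comes out at roughly 556 m and 1e-06° at about 0.1 m. The helper name is made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def meridional_offset_m(delta_lat_deg):
    """Distance along a meridian for a latitude offset given in degrees."""
    return EARTH_RADIUS_M * math.radians(delta_lat_deg)


print(f"{meridional_offset_m(0.005):.0f} m")
print(f"{meridional_offset_m(1e-6):.2f} m")
```

For a latitude offset the result does not depend on the latitude itself; an offset in longitude would additionally scale with the cosine of the latitude.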

Member

@bouweandela bouweandela left a comment

It would be nice to also have some unit tests for this, as the code looks pretty complicated, so without unit tests this will become very difficult to maintain/update.

@bouweandela
Member

@jvegasbsc We now also have pre-commit hooks, you can start using them by running pre-commit install, they are recommended ;-)

@mwjury
Contributor

mwjury commented Oct 5, 2020

Sorry just returned from my parental leave. Thanks for pushing that forward @jvegasbsc . Let me know if I can help.

@jvegreg
Contributor

jvegreg commented Oct 5, 2020

> It would be nice to also have some unit tests for this, as the code looks pretty complicated, so without unit tests this will become very difficult to maintain/update.

I will prepare something, but it will be a regression test, as I do not have the knowledge to do proper unit testing on this

@aperezpredictia
Contributor Author

Hi, is there any news about this PR? What is the current status of using the CORDEX project in ESMValTool?

@thomascrocker
Contributor

thomascrocker commented Jun 7, 2021

As mentioned in #772 this PR needs merging in order for regridding with many CORDEX models to work. It is also very handy because it also resolves some issues related to some of the CORDEX models that have LambertConformal projections.

Another potential issue mentioned in #772 by @zklaus and also at #184 (comment) is that of the minor rounding / machine precision size differences that exist on CORDEX grids. I'm not sure if there are two possible issues here though?

  1. Minor (machine precision / rounding) differences in the native grid between institutions from models that are on the same domain and grid (i.e. grid points and bounds should be identical)
  2. Minor differences between the derived regular grid lat and lon coordinates and bounds that the methodology in this PR produces vs those that are already stored in the original data files themselves.

My understanding is that this PR warns about number 2, but number 1 is still a potential issue. It might be possible to fix both by rounding all coordinate data to a set precision level, or perhaps there could be another solution involving pre defined standard grids for each domain stored somewhere, it might take some time to come to an agreement on the best solution to this.
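The rounding idea can be sketched as follows. The rlat values and the precision `DECIMALS` are made up for illustration; a real precision level would have to be agreed on per domain.

```python
import numpy as np

DECIMALS = 4  # hypothetical precision, to be agreed on per domain

# Made-up rlat values from two institutions that should share a grid.
rlat_a = np.array([-23.375, -23.265, -23.155])
rlat_b = rlat_a + np.array([3e-7, -2e-7, 1e-7])

# Exact comparison fails because of the tiny deviations ...
print(np.array_equal(rlat_a, rlat_b))
# ... while the rounded coordinates compare equal.
print(np.array_equal(np.round(rlat_a, DECIMALS), np.round(rlat_b, DECIMALS)))
```

One caveat of plain rounding is that two values lying on either side of a rounding boundary can still end up different, which is one argument for the alternative of predefined standard grids.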

My feeling is that perhaps we try to get this PR merged (it seems the blocker at the moment is some comprehensive tests for these fixes), and then deal with solving the minor differences in coordinates in a separate issue / PR. Essentially this PR resolves an outstanding issue that will unlock progress in other areas (i.e. regridding). The coordinate precision issue should probably be addressed, but isn't a hold-up to the essence of this PR, so maybe that should become a separate issue?

@codecov

codecov bot commented Oct 7, 2021

Codecov Report

Merging #184 (78b6893) into main (20cefc7) will decrease coverage by 0.98%.
The diff coverage is 21.52%.

❗ Current head 78b6893 differs from pull request most recent head 2ed7ca3. Consider uploading reports for the commit 2ed7ca3 to get more accurate results

@@            Coverage Diff             @@
##             main     #184      +/-   ##
==========================================
- Coverage   88.04%   87.06%   -0.99%     
==========================================
  Files         194      195       +1     
  Lines        9765     9897     +132     
==========================================
+ Hits         8598     8617      +19     
- Misses       1167     1280     +113     
Impacted Files                               Coverage Δ
esmvalcore/cmor/_fixes/cordex/project.py     13.74% <13.74%> (ø)
esmvalcore/cmor/_fixes/fix.py                96.36% <100.00%> (+0.06%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 20cefc7...2ed7ca3. Read the comment docs.

@thomascrocker
Contributor

Just to bump this PR, I've started doing some work with CORDEX data from the CAM-44 domain and needed this PR in order to do regridding and area averaging of the data. (As was also the case with models from the EUR-11 domain)

@bouweandela
Member

bouweandela commented Mar 4, 2022

@thomascrocker Unfortunately @jvegreg who was previously working on this has left the project, so is unable to finish this.

To get this merged or make the functionality available in some other way, we first need to agree on what the correct implementation is. Unfortunately, I personally do not have the required knowledge to say anything meaningful about that. @zklaus Would you be able to comment on this? @ESMValGroup/tech-reviewers or @ESMValGroup/science-reviewers Can anyone who is knowledgeable on CORDEX data please have a look at this and comment?

If there is agreement that the approach here is viable, then the way towards getting this merged would be someone having a look at the checklist in the top post above and seeing what still needs to be done and following up by actually doing it.

@thomascrocker
Contributor

@bouweandela
Thanks for the update, unfortunately the project I am working on that is using this functionality is coming to an end this month.
However, I recently sat in on a meeting (including @axel-lauer) around scoping for another project that may well require this functionality https://climate.esa.int/en/projects/cmug/about/
We are hoping that it might be possible to find resource in the budget for a software engineer that could potentially look into this, but there are no guarantees unfortunately.

@nhsavage

Hi all, can someone who was involved in the review of this work please confirm what the status is and how we can move it forward? This seems to be a real foundational issue for the use of CORDEX. I might be able to help with some of the coding and/or review. It would also be really useful to have a simple example, in an issue, of the problem this fix is trying to resolve. It would probably also help to make it more granular by starting only with rotated pole models and then building from there.

@sloosvel
Contributor

@pepcos and I will be picking up the CORDEX-related issues, but I have to say that we are going to need some context first, as we are a bit lost at the moment.

@nhsavage

I can probably help provide some context, but I have as yet limited experience with using ESMValTool myself (and no funded time to work on it). @thomascrocker - do you think we could together write some failing recipes for, e.g.,

a. single rotated pole model which has good metadata
b. rotated pole with iffy metadata
c. lambert conformal with good metadata
d. lambert conformal with iffy metadata

and open issues for each of these?

@sloosvel
Contributor

@pepcos and I have some experience with ESMValTool, so it would be no problem to help. We very much have an interest in this, but right now, in the middle of the release, is not the best time. We will start working on this after the release is done.

@thomascrocker
Contributor

> I can probably help provide some context, but I have as yet limited experience with using ESMValTool myself (and no funded time to work on it). @thomascrocker - do you think we could together write some failing recipes for, e.g.,
>
> a. single rotated pole model which has good metadata
> b. rotated pole with iffy metadata
> c. lambert conformal with good metadata
> d. lambert conformal with iffy metadata
>
> and open issues for each of these?

Good idea, it shouldn't be too much work to do so. I will hopefully be able to get around to it sometime next week.

@thomascrocker
Contributor

OK, I spent much of Friday afternoon looking at this, and have created a simple recipe. Also.. there is a possibility of the work that sparked my interest in this issue being useful in another funded project so it might be I can justify a little more time on this... 🤞

Simple recipe below:

documentation:
  title: Demo recipe for PR 184

  description:
    Some example problem models for PR 184
    What if I want to regrid a regional model file?

  authors:
    - predoi_valeriu

datasets:
# Rotated pole model (coordinate name error, fails CMOR check)
- {institute: MOHC, driver: MPI-M-MPI-ESM-LR, dataset: HadREM3-GA7-05, project: CORDEX, ensemble: r1i1p1, mip: mon, rcm_version: v1, domain: EUR-11}

# Rotated pole model
- {institute: SMHI, driver: MOHC-HadGEM2-ES, dataset: RCA4, project: CORDEX, ensemble: r1i1p1, mip: mon, rcm_version: v1, domain: EUR-11}

# lambert conformal model
- {institute: CNRM, driver: MOHC-HadGEM2-ES, dataset: ALADIN63, project: CORDEX, ensemble: r1i1p1, mip: mon, rcm_version: v1, domain: EUR-11}

preprocessors:
  preproc:
    regrid:
      target_grid:
        start_latitude: 50
        end_latitude: 60
        start_longitude: 0
        end_longitude: 10
        step_latitude: 1
        step_longitude: 1
      scheme: linear

diagnostics:
  temp:
    description: Regrid some data
    variables:
      tas:
        start_year: 1981
        end_year: 2000
        exp: historical
        preprocessor: preproc
    scripts: null

The first dataset listed, HadREM3-GA7-05, is a rotated pole model that fails CMOR checking and is an example of #1044.

The second dataset, RCA4, is another rotated pole model that passes CMOR checking. However, the regridding step of the preprocessor fails, I think because the dataset has no bounds on its coordinates (i.e. the problem this PR is trying to fix):

2022-07-08 15:25:19,257 UTC [18702] WARNING There were warnings in variable tas:
 Can not guess bounds for coordinate lon from var lon: Multi-dimensional coordinate not supported: 'longitude'
 Can not guess bounds for coordinate lat from var lat: Multi-dimensional coordinate not supported: 'latitude'
....
2022-07-08 15:25:19,517 UTC [18702] WARNING There were warnings in variable tas:
 Coordinate lon from var lon does not have bounds
 Coordinate lat from var lat does not have bounds
....
  File "/opt/scitools/conda/environments/esmvaltool-2.5.0/lib/python3.9/site-packages/esmvalcore/preprocessor/_regrid_esmpy.py", line 112, in is_lon_circular
    seam = (lon.bounds[1:-1, -1, (1, 2)]
TypeError: 'NoneType' object is not subscriptable

If this PR is applied though, the recipe will run fine.

The final dataset, ALADIN63, is an example of a Lambert Conformal grid and appears to work fine, because the file contains bounds on the coords. I'm trying to remember what the issue with Lambert Conformal grids mentioned earlier in this PR is. I'll update if I do remember (perhaps it has been fixed in iris in the 3 years since this PR was created?). Certainly #1156 is still an issue with these files, and this PR only helps by avoiding the problem through regridding first, which isn't always the best thing to do.

@thomascrocker
Contributor

thomascrocker commented Jul 8, 2022

Just having a look at the details of this PR: the Lambert conformal fixes are in this line here:

def fix_coordinate_system(cube):

I.e. dealing with some models that have the coord system badly defined. I suspect this would be better applied as model-specific fixes (although maybe with this function stored in a shared module that can be called from each fix individually) rather than as a blanket fix for everything from a project. In any case, it's probably better off as a separate issue / PR, leaving this one to deal solely with the bounds issues.

@sloosvel
Contributor

I think a good part of this work is now covered by #1765. I guess this can be closed.

@sloosvel sloosvel closed this Dec 20, 2022
Labels
cmor Related to the CMOR standard enhancement New feature or request fix for dataset Related to dataset-specific fix files
9 participants