-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
State of the art FX variables handling without preprocessing #557
Conversation
I will keep maintaining #439 because @ledm and @LisaBock are still using it to get IPCC stuffs done, but I reckon that upon approval of this PR we can safely close #511 and start thinking how to get us rid of ancestors and implement the fx vars preprocessing via |
thanks @valeriupredoi I tried some testing. It seems to work perfectly for UKESM1. I processed historical data and it correctly picked up the piControl sftlf file. I used pre-processor: I have also combined it with annual meaning in the pre-processor: Both of these bit-compare with raw CMIP6 files I pulled back and processed by hand. I'll also give it a try with some other models. |
Note that now you can throw nothing or anything including the kitchen sink or anything inbetween at the preprocessors that use
|
cheers muchly @ChrisJones-MOHC - was just writing a comment-example 🍺 |
just to confirm secondary test - it works also on CNRM-CM6-1 where it picked up sftlf from amip. Again results agree with processing manually. |
for fxvar in var['fx_files']} | ||
settings[step]['fx_files'] = fx_files_dict | ||
logger.info(msg, step, pformat(fx_files_dict)) | ||
"""Update fx settings depending on the needed method.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is needlessly complicated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hang on, looking at it in a jiffy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok simplified now courtesy to @zklaus last suggestions, happy Bouwe or unhappy Bouwe? 🍺
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, much better! I still think the function inside a function _get_fx_vars_from_attribute
is not really needed, or do you have a particular reason for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deal with it 😆
Co-Authored-By: Bouwe Andela <bouweandela@users.noreply.github.com>
Co-Authored-By: Klaus Zimmermann <klaus.zimmermann@smhi.se>
waddup, guys? 🍺 |
doc/esmvalcore/preprocessor.rst
Outdated
weighting_landsea_fraction: | ||
area_type: land | ||
exclude: ['CanESM2', 'reference_dataset'] | ||
fx_files: [{'short_name': 'sftlf', 'exp': 'piControl'}, {'short_name': 'sftof', 'exp': 'piControl'}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to use the keyword fx_variables
instead of fx_files
? The current name suggests you have to give it a path to a certain file which might be confusing (see for example https://esmvaltool.readthedocs.io/projects/esmvalcore/en/latest/recipe/preprocessor.html#extract-shape with the shapefile
keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this looks like it could be kept more readable by using the same format as for data sets:
datasets:
- {dataset: CanESM2, project: CMIP5, exp: historical, ensemble: r1i1p1, start_year: 2001, end_year: 2004}
- {dataset: UKESM1-0-LL, project: CMIP6, exp: historical, ensemble: r1i1p1f2, start_year: 2001, end_year: 2004, grid: gn}
- {dataset: EC-EARTH3, alias: custom_alias, project: CMIP6, exp: historical, ensemble: r1i1p1f1, start_year: 2001, end_year: 2004, grid: gn}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced with fx_variables
-> the user can specify whatever keys they want and in whatever order they want them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the user can specify whatever keys they want and in whatever order they want them
sorry for not being clear, I didn't mean the should add these specific keys and order, just the way it is formatted, so instead of the current:
fx_files: [{'short_name': 'sftlf', 'exp': 'piControl'}, {'short_name': 'sftof', 'exp': 'piControl'}]
have the other format in the documentation:
fx_files:
- {'short_name': 'sftlf', 'exp': 'piControl'}
- {'short_name': 'sftof', 'exp': 'piControl'}
But it's just a suggestion. Polishing nail extensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I saw you now replaced the argument fx_variables
on lots of other places as well, but the docstrings still say that these fx_variables
then should be a dictionary with {field: filename of fx variable}
. There's also some other places where still the name fx_files
is used. maybe check whether in these instances that you now changed it's really not the full filename but just the variable name? If you change these couple, then change the other occurrences as well. (e.g def tile_grid_areas(cube, fx_files):
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @valeriupredoi how was the pizza? Did you have a look at this already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great pizza, cheers, I can't remember the toppings though 😁 So here's the thing: the way the user can ask for fx_variables
in the recipe (as arg of the preprocessor step) can be either as a list of short names eg fx_variables: ['sftlf', 'sftof']
or as a list of dicts eg fx_variables: [{short_name: sftlf, mip: fx}, {short_name: sftof, mip: fx}]
so making the listing look like the variable dictionary will mean that the user will be confused as to why it could also be just a simple list; tile_grid_areas
actually takes as argument the fx files and not the variables, so that's why I left it with fx_files
in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch though - I just realized I missed adding documentation for these funcs (now added) 🍺
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, if you think this won't cause confusion I'm perfectly fine with this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be fine, RTFM will play a good part here 😁
esmvalcore/_recipe.py
Outdated
var = dict(variable) | ||
var_project = variable['project'] | ||
get_project_config(var_project) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't seem to use the returned value from get_project_config()
here, do you intend to only use the check and error raising? And if so, can't we make it more clear that you're doing that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replaced with an actual RecipeError'
ping @valeriupredoi |
Soorry folks, on Easter hols (staying at home this year 😁 ), will revisit this bright and early on Tuesday 🐰 |
ayt, guys, have now plugged them suggestions, what say yous? 🍺 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I think it's ready to merge.
cheers @JaroCamphuijsen mate, nice review 🍺 |
This PR is the abridged version of #511 and includes all the modifications from #511 and #439 that @schlunma and myself have implemented and @zklaus toiled to review and suggest a number of very useful changes; it however doesn't include the hacky way of preprocessing fx data as in #511 and #439 since we may want to hold that in the background until implementing the iris
cell_measures
parsing as @bouweandela suggested; it also includes functionality to solve @ChrisJones-MOHC issue reported in #547Most of it has been reviewed by @zklaus and should be in good shape, but alas, another pass and testing is needed; review for tests is still needed as well.
Cheers 🍺