Check mandatory ATMODAT requirements w.r.t. the description of Model’s Axes #114

Open
atmodatcode opened this issue Jan 25, 2022 · 5 comments
Assignees
Labels
enhancement New feature or request wontfix This will not be worked on for now

Comments

@atmodatcode
Member

atmodatcode commented Jan 25, 2022

The ATMODAT Standard v3.0, Section 4.3 "Specifications for File Formats and Standards", states:
"The ATMODAT standard requires that:
[...]
• NetCDF file headers include description of time, coordinate and vertical axes according to Appendix E.
[...] "

How do we address this requirement?

I suggest that we add a note in the checker results that we are not checking for this requirement and that users should check this requirement themselves. What do you think?

@atmodatcode atmodatcode added the enhancement New feature or request label Jan 25, 2022
@jkretz
Collaborator

jkretz commented Jan 25, 2022

Well, that should be covered by the CF-Checker, as far as I know.

@atmodatcode
Member Author

Not necessarily.
See Appendix E of the ATMODAT Standard:
"E. Description of Model’s Axes
Providing horizontal, vertical and temporal axes is optional in the CF Conventions. The conventions just prescribe how these axes have to be described when they are provided. However, having spatial and temporal information is often required for a proper reuse of atmospheric model data. Therefore, this standard requires this information under specific conditions:
• If data are horizontally resolved (e.g. lon + lat or x + y), then horizontal coordinate axes should be provided.
• If data have reasonable vertical information (e.g. pressure or height), then the vertical axis should be described.
• If data are not static in time (e.g. via dimension and variable time), then the time axis should be provided."

So, if you have time series NetCDF files where each timestep is stored in a separate file, users might omit the time dimension from the data variables. They might write var1(lat,lon) instead of var1(time,lat,lon), which is what the ATMODAT Standard asks for. The CF-Checker won't complain about var1(lat,lon) ...
I understand that this is hard to capture with a checker, but we could put a short info message at the bottom of the short summary to make users aware:

e.g.
Short summary
bla..bla
Please note:
• If data are horizontally resolved (e.g. lon + lat or x + y), then horizontal coordinate axes should be provided.
• If data have reasonable vertical information (e.g. pressure or height), then the vertical axis should be described.
• If data are not static in time (e.g. via dimension and variable time), then the time axis should be provided.
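
A minimal sketch of how such a note could be appended to the short summary. The names `AXES_NOTE` and `append_axes_note` are hypothetical and not taken from the checker's code; the note text is copied from Appendix E as quoted above.

```python
# Hypothetical helper: neither name below comes from the atmodat checker code.
AXES_NOTE = (
    "Please note:",
    "• If data are horizontally resolved (e.g. lon + lat or x + y), "
    "then horizontal coordinate axes should be provided.",
    "• If data have reasonable vertical information (e.g. pressure or height), "
    "then the vertical axis should be described.",
    "• If data are not static in time (e.g. via dimension and variable time), "
    "then the time axis should be provided.",
)


def append_axes_note(short_summary: str) -> str:
    """Return the short summary text with the Appendix E reminder appended."""
    return short_summary.rstrip("\n") + "\n" + "\n".join(AXES_NOTE) + "\n"


if __name__ == "__main__":
    print(append_axes_note("Short summary\nbla..bla"))
```

Because the note is static text, it would be shown for every file regardless of content; it informs users about the requirement rather than checking it.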

@jkretz
Collaborator

jkretz commented Jan 25, 2022

Well var1(lat,lon) is static in time, therefore there is no need to check anything. And if a timeseries is stored in separate files, I wouldn't call that timeseries which makes it static again.
This is definitely an edge case that we cannot cover. Furthermore, the wording is "should", so I would avoid outputting a warning that we would have to display every single time, since we don't know what potential user data will look like.
Let's focus on things we can improve for now and leave that for later.

@atmodatcode
Member Author

I don't agree with your "And if a timeseries is stored in separate files, I wouldn't call that timeseries which makes it static again.". If a variable is time-dependent by nature (e.g. precipitation model output at a given time step), then it is not static, even if the output file contains only a single timestep. It is essential that users add time as a dimension, because the time information is then properly described by the time coordinate variable and can be read in a machine-actionable manner. This also allows users to merge the individual timesteps with e.g. cdo mergetime.

Saving the individual timesteps of a time series in individual NetCDF files is actually quite common, especially for multidimensional data, where storing more than one timestep in a single file makes file sizes hard to handle.
Look at the header of such a dataset:

dimensions:
    time = UNLIMITED ; // (1 currently)
    lon = 384 ;
    lat = 192 ;
variables:
    double time(time) ;
        time:standard_name = "time" ;
        time:long_name = "time" ;
        time:units = "days since 1850-01-01 00:00:00" ;
        time:calendar = "proleptic_gregorian" ;
        time:axis = "T" ;
    float rldscs(time, lat, lon) ;
    ....
data:
    time = 0.0625 ;
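
To illustrate what "machine-actionable" means for the header above, here is a minimal sketch that decodes the single time value from its units attribute. The function name `decode_days_since` is hypothetical, and the sketch only handles the "days since YYYY-MM-DD HH:MM:SS" form; it relies on Python's datetime, which uses the proleptic Gregorian calendar and therefore matches the calendar attribute shown. A real tool would use a CF-aware library instead.

```python
from datetime import datetime, timedelta


def decode_days_since(value: float, units: str) -> datetime:
    """Decode a CF-style 'days since <reference>' time value.

    Assumption: units have the exact form 'days since YYYY-MM-DD HH:MM:SS'
    and the calendar is proleptic Gregorian, as in the header above.
    """
    prefix = "days since "
    if not units.startswith(prefix):
        raise ValueError(f"unsupported units: {units!r}")
    reference = datetime.strptime(units[len(prefix):], "%Y-%m-%d %H:%M:%S")
    return reference + timedelta(days=value)


if __name__ == "__main__":
    # 0.0625 days is 1.5 hours after the reference date.
    print(decode_days_since(0.0625, "days since 1850-01-01 00:00:00"))
    # prints 1850-01-01 01:30:00
```

With var1(lat,lon) and no time coordinate variable, there is nothing in the file to decode this way, so the timestamp would have to be guessed from the file name or other external information.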

Static files are e.g. land-sea masks: temporal changes over the model period are considered negligible, so the time dimension is considered irrelevant.

I think we need to discuss this issue with the AtMoDat team.

.....

@jkretz
Collaborator

jkretz commented Jan 25, 2022

Sounds good to discuss that later.

@jkretz jkretz added the wontfix This will not be worked on for now label Jan 25, 2022