Yaml update (NOAA-GFDL#1269)
 documentation update
uramirez8707 authored and rem1776 committed May 1, 2024
1 parent d28dbf2 commit 8514ef8
Showing 4 changed files with 139 additions and 238 deletions.
92 changes: 40 additions & 52 deletions diag_manager/README.md
@@ -1,9 +1,9 @@
## Diag Table Yaml Format:

The purpose of this documents is to explain the diag_table yaml format.
The purpose of this document is to explain the diag_table yaml format.

## Contents
- [1. Coverting from legacy ascii diag_table format](README.md#1-coverting-from-legacy-ascii-diag_table-format)
- [1. Converting from legacy ascii diag_table format](README.md#1-converting-from-legacy-ascii-diag_table-format)
- [2. Diag table yaml sections](README.md#2-diag-table-yaml-sections)
- [2.1 Global Section](README.md#21-global-section)
- [2.2 File Section](README.md#22-file-section)
@@ -15,7 +15,7 @@ The purpose of this documents is to explain the diag_table yaml format.
- [2.6 Sub_region Section](README.md#26-sub_region-section)
- [3. More examples](README.md#3-more-examples)

### 1. Coverting from legacy ascii diag_table format
### 1. Converting from legacy ascii diag_table format

To convert the legacy ascii diag_table format to this yaml format, the python script [**diag_table_to_yaml.py**](https://github.com/NOAA-GFDL/fms_yaml_tools/blob/aafc3293d45df2fc173d3c7afd8b8b0adc18fde4/fms_yaml_tools/diag_table/diag_table_to_yaml.py#L23-L26) can be used. To confirm that your diag_table.yaml was created correctly, the python script [**is_valid_diag_table_yaml.py**](https://github.com/NOAA-GFDL/fms_yaml_tools/blob/aafc3293d45df2fc173d3c7afd8b8b0adc18fde4/fms_yaml_tools/diag_table/is_valid_diag_table_yaml.py#L24-L27) can be used.

@@ -41,10 +41,10 @@ diag_files:

### 2.1 Global Section
The diag_yaml requires the “title” and the “base_date” keys.
- The **title** is a string that labels the diag yaml. The equivalent in the diag table would be the experiment. It is recommended that each diag_yaml have a separate title label that is descriptive of the experiment that is using it.
- The **basedate** is an array of 6 integer indicating the base_date in the format [year month day hour minute second].
- The **title** is a string that labels the diag yaml. The equivalent in the legacy diag_table would be the experiment. It is recommended that each diag_yaml have a separate title label that is descriptive of the experiment that is using it.
- The **base_date** is an array of 6 integers indicating the base_date in the format [year month day hour minute second].
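
The two required global keys can be checked mechanically. The following is a minimal sketch, not part of FMS or fms_yaml_tools: the function name `check_global_section` is hypothetical, and it assumes the table has already been loaded into a Python dict with the date stored under `base_date`, either as a list or as a whitespace-separated string such as `2 1 1 0 0 0`.

```python
def check_global_section(table):
    """Return a list of problems found in the global section of a diag table dict."""
    problems = []

    title = table.get("title")
    if not isinstance(title, str) or not title.strip():
        problems.append("title must be a non-empty string")

    base_date = table.get("base_date")
    # Accept either a YAML list or the whitespace-separated form "2 1 1 0 0 0".
    parts = base_date.split() if isinstance(base_date, str) else base_date
    try:
        values = [int(p) for p in parts]
    except (TypeError, ValueError):
        values = None
    if values is None or len(values) != 6:
        problems.append("base_date must hold 6 integers: year month day hour minute second")
    return problems


print(check_global_section({"title": "ESM4_piControl", "base_date": "2 1 1 0 0 0"}))
print(check_global_section({"title": "ESM4_piControl", "base_date": "2 1 1"}))
```

An empty list means both keys passed this check; it says nothing about the rest of the table.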

**Example:**
**Example:**

In the YAML format:
```yaml
@@ -59,27 +59,28 @@ ESM4_piControl
```

### 2.2 File Section
The files are listed under the diag_files section as a dashed array.
The files are listed under the diag_files section as a dashed array.

Below are the **required** keys needed to define each file.
- **file_name** is a string that defines the name of the file. Do not add ".nc" and "tileX" to the filename as this will handle by FMS.
- **freq** is an integer that defines the frequency that data will be written. The acceptable values are:
  - =-1: output at the end of the run only
  - =0: output every timestep
  - \>0: output frequency
- **freq_units** is a string that defines the units of the frequency from above. The acceptable values are seconds, minutes, hours, days, months, years.
- **time_units** is a string that defines units for time. The acceptable values are seconds, minutes, hours, days, months, years.
- **file_name** is a string that defines the name of the file. Do not add ".nc" and "tileX" to the filename as this will be handled by FMS.
- **freq** defines the frequency and the units that data will be written
  - The acceptable values for freq are:
    - =-1: output at the end of the run only
    - =0: output every timestep
    - \>0 units: output frequency and units (with a space between the frequency number and units e.g 24 hours)
  - Values of -1 or 0 do not require units.
  - The acceptable values for units are seconds, minutes, hours, days, months, years.
- **time_units** is a string that defines units for time. The acceptable values are seconds, minutes, hours, days, months, years.
- **unlimdim** is a string that defines the name of the unlimited dimension in the output netcdf file, usually “time”.
- **varlist** is a subsection that lists all of the variables in the file
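
The combined `freq` entry described above (a number, optionally followed by a space and units) can be split and checked as follows. This is an illustrative sketch only, not the parser FMS uses; the names `parse_freq` and `VALID_UNITS` are hypothetical.

```python
VALID_UNITS = ("seconds", "minutes", "hours", "days", "months", "years")


def parse_freq(value):
    """Split a freq entry into (number, units); units is None when omitted."""
    tokens = str(value).split()
    if not tokens or len(tokens) > 2:
        raise ValueError(f"expected '<number> <units>', got {value!r}")
    number = int(tokens[0])
    units = tokens[1] if len(tokens) == 2 else None
    if number < -1:
        raise ValueError("freq must be -1, 0, or a positive integer")
    # Only positive frequencies need units; -1 and 0 may omit them.
    if number > 0 and units is None:
        raise ValueError("a positive freq requires units, e.g. '24 hours'")
    if units is not None and units not in VALID_UNITS:
        raise ValueError(f"unknown units {units!r}")
    return number, units


for entry in ("6 hours", "24 days", -1, 0, "-1 days"):
    print(repr(entry), "->", parse_freq(entry))
```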

**Example:** The following creates a file with data written every 6 hours.
**Example:** The following creates a file with data written every 6 hours.

In the YAML format:
```yaml
diag_files:
- file_name: atmos_6hours
  freq: 6
  freq_units: hours
  freq: 6 hours
  time_units: hours
  unlimdim: time
  varlist:
@@ -93,10 +94,9 @@ In the legacy ascii format:

**NOTE:** The fourth column (file_format) has been deprecated. Netcdf files will always be written.

Below are some *optional* keys that may be added.
Below are some *optional* keys that may be added.
- **write_file** is a logical that indicates if you want the file to be created (default is true). This is a new feature that is not supported by the legacy ascii diag_table.
- **new_file_freq** is a integer that defines the frequency for closing the existing file
- **new_file_freq_units** is a string that defines the time units for creating a new file. Required if “new_file_freq” used. The acceptable values are seconds, minuts, hours, days, months, years.
- **new_file_freq** is a string that defines the frequency and the frequency units (with a space between the frequency number and units) for closing the existing file
- **start_time** is an array of 6 integers indicating when to start the file for the first time. It is in the format [year month day hour minute second]. Requires “new_file_freq”.
- **filename_time** is the time used to set the name of new files when using new_file_freq. The acceptable values are begin (which will use the beginning of the file's time bounds), middle (which will use the middle of the file's time bounds), and end (which will use the end of the file's time bounds). The default is middle.

@@ -105,12 +105,10 @@ Below are some *optional* keys that may be added.
In the YAML format:
```yaml
- file_name: ocn%4yr%2mo%2dy%2hr
  freq: 6
  freq: 6 hours
  freq_units: hours
  time_units: hours
  unlimdim: time
  new_file_freq: 6
  new_file_freq_units: hours
  new_file_freq: 6 hours
  start_time: 2020 1 1 0 0 0
```
@@ -127,29 +125,25 @@ ocn_2020_01_01_15.nc for time_bnds [12,18]
ocn_2020_01_01_21.nc for time_bnds [18,24]
```

**NOTE** If using the new_file_freq, there must be a way to distinguish each file, as it was done in the example above.
**NOTE** If using the new_file_freq, there must be a way to distinguish each file, as it was done in the example above.
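
The time stamps in the file names of the example above follow from `start_time`, `new_file_freq`, and `filename_time`. The sketch below reproduces that arithmetic under stated assumptions: the function `file_label_times` is hypothetical, it handles a frequency given in hours only, and the underscore-separated name layout simply mirrors the file names listed in the example rather than any documented FMS rule.

```python
from datetime import datetime, timedelta


def file_label_times(start, new_file_freq_hours, n_files, filename_time="middle"):
    """Yield (time_bnds, label_time) for n_files consecutive files starting at start."""
    # Fraction of the file's time bounds used to label the file.
    offset = {"begin": 0.0, "middle": 0.5, "end": 1.0}[filename_time]
    step = timedelta(hours=new_file_freq_hours)
    for i in range(n_files):
        lower = start + i * step
        bounds = (i * new_file_freq_hours, (i + 1) * new_file_freq_hours)
        yield bounds, lower + offset * step


start = datetime(2020, 1, 1, 0, 0, 0)
for bnds, label in file_label_times(start, 6, 4):
    print(f"ocn_{label:%Y_%m_%d_%H}.nc for time_bnds {list(bnds)}")
```

With the default `middle`, a file covering hours 12 to 18 is labelled with hour 15, which matches the listing in the example.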

- **file_duration** is an integer that defines how long the file should receive data after start time in “file_duration_units”. This optional field can only be used if the start_time field is present. If this field is absent, then the file duration will be equal to the frequency for creating new files. The file_duration_units field must also be present if this field is present.
- **file_duration_units** is a string that defines the file duration units. The acceptable values are seconds, minutes, hours, days, months, years.
- **file_duration** is a string that defines how long the file should receive data after the start time, written as a number and its units separated by a space (e.g. 12 hours). This optional field can only be used if the start_time field is present. If this field is absent, then the file duration will be equal to the frequency for creating new files.
- **global_meta** is a subsection that lists any additional global metadata to add to the file. This is a new feature that is not supported by the legacy ascii diag_table.
- **sub_region** is a subsection that defines the four corners of a subregional section to capture.

### 2.2.1 Flexible output timings

In order to provide more flexibility in output timings, the new diag_table yaml format allows for different file frequencies for the same file by allowing the `freq`, `freq_units`, `new_file_freq`, `new_file_freq_units`, `file_duration`, `file_duration_units` keys to accept array of integers/strings.
In order to provide more flexibility in output timings, the diag_table yaml format allows for different file frequencies for the same file by allowing the `freq`, `new_file_freq`, and `file_duration` keys to accept a comma-separated list.
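
A comma-separated timing list can be broken into (number, units) pairs as sketched below. This is only an illustration: `split_timings` is a hypothetical helper, every entry is assumed to carry units, and lining up the i-th entries of the three keys is an assumption made for this sketch. The values mirror the flexible-timing example in this section.

```python
def split_timings(value):
    """Turn '6 hours, 3 hours, 1 hours' into [(6, 'hours'), (3, 'hours'), (1, 'hours')]."""
    pairs = []
    for item in str(value).split(","):
        # Each item is expected to look like '<number> <units>'.
        number, units = item.split()
        pairs.append((int(number), units))
    return pairs


file_entry = {
    "freq": "1 hours, 1 hours, 1 hours",
    "new_file_freq": "6 hours, 3 hours, 1 hours",
    "file_duration": "12 hours, 3 hours, 9 hours",
}
timings = {key: split_timings(value) for key, value in file_entry.items()}
segments = zip(timings["freq"], timings["new_file_freq"], timings["file_duration"])
for i, (freq, new_file_freq, file_duration) in enumerate(segments):
    print(i, "freq:", freq, "new_file_freq:", new_file_freq, "file_duration:", file_duration)
```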

For example,
For example,
``` yaml
- file_name: flexible_timing%4yr%2mo%2dy%2hr
  freq: 1 1 1
  freq_units: hours hours hours
  freq: 1 hours, 1 hours, 1 hours
  time_units: hours
  unlimdim: time
  new_file_freq: 6 3 1
  new_file_freq_units: hours hours hours
  new_file_freq: 6 hours, 3 hours, 1 hours
  start_time: 2 1 1 0 0 0
  file_duration: 12 3 9
  file_duration_units: hours hours hours
  file_duration: 12 hours, 3 hours, 9 hours
  filename_time: begin
  varlist:
  - module: ocn_mod
@@ -195,7 +189,7 @@ In the *yaml diag_table*:
The variables in each file are listed under the varlist section as a dashed array.

- **var_name:** is a string that defines the variable name as it is defined in the register_diag_field call in the model
- **reduction:** is a string that describes the data reduction method to perform prior to writing data to disk. Acceptable values are average, diurnalXX (where XX is the number of diurnal samples), powXX (where XX is the power level), min, max, none, rms, and sum.
- **reduction:** is a string that describes the data reduction method to perform prior to writing data to disk. Acceptable values are average, diurnalXX (where XX is the number of diurnal samples), powXX (where XX is the power level), min, max, none, rms, and sum.
- **module:** is a string that defines the module where the variable is registered in the model code
- **kind:** is a string that defines the type of variable as it will be written out in the file. Acceptable values are r4, r8, i4, and i8

@@ -214,7 +208,7 @@ In the legacy ascii format:
```
"moist", "precip", "precip", "atmos_8xdaily", "all", .true., "none", 2
```
**NOTE:** The fifth column (time_sampling) has be deprecated. The reduction_method (`.true.`) has been replaced with `average`. The output name was not included in the yaml because it is the same as the var_name.
**NOTE:** The fifth column (time_sampling) has been deprecated. The reduction_method (`.true.`) has been replaced with `average`. The output name was not included in the yaml because it is the same as the var_name.

which corresponds to the following model code
```F90
@@ -226,15 +220,15 @@ where:
- `axes` are the ids of the axes the variable is a function of
- `Time` is the model time

Below are some *optional* keys that may be added.
Below are some *optional* keys that may be added.
- **write_var:** is a logical that is set to false if the user doesn’t want the variable to be written to the file (default: true).
- **out_name:** is a string that defines the name of the variable that will be written to the file (default same as var_name)
- **long_name:** is a string defining the long_name attribute of the variable. It overwrites the long_name in the variable's register_diag_field call
- **attributes:** is a subsection with any additional metadata to add to the variable in the netcdf file. This is a new feature that is not supported by the legacy ascii diag_table.
- **zbounds:** is a 2-member array of integers that defines the bounds of the z axis (zmin, zmax). It is optional; the default is no limits.
- **zbounds:** is a 2-member array of integers that defines the bounds of the z axis (zmin, zmax). It is optional; the default is no limits.

### 2.4 Variable Metadata Section
Any aditional variable attributes can be added for each varible can be listed under the attributes section as a dashed array. The key is attribute name and the value is the attribute value.
Any additional variable attributes can be listed under the attributes section as a dashed array. The key is the attribute name and the value is the attribute value.

**Example:**

@@ -286,15 +280,12 @@ title: test_diag_manager
base_date: 2 1 1 0 0 0
diag_files:
- file_name: wild_card_name%4yr%2mo%2dy%2hr
  freq: 6
  freq_units: hours
  freq: 6 hours
  time_units: hours
  unlimdim: time
  new_file_freq: 6
  new_file_freq_units: hours
  new_file_freq: 6 hours
  start_time: 2 1 1 0 0 0
  file_duration: 12
  file_duration_units: hours
  file_duration: 12 hours
  varlist:
  - module: test_diag_manager_mod
    var_name: sst
@@ -303,8 +294,7 @@ diag_files:
  global_meta:
  - is_a_file: true
- file_name: normal
  freq: 24
  freq_units: days
  freq: 24 days
  time_units: hours
  unlimdim: records
  varlist:
@@ -322,8 +312,7 @@ diag_files:
    corner3: -60, 0
    corner4: -60, 75
- file_name: normal2
  freq: -1
  freq_units: days
  freq: -1 days
  time_units: hours
  unlimdim: records
  write_file: true
@@ -346,8 +335,7 @@ diag_files:
    corner3: 10, 25
    corner4: 20, 25
- file_name: normal3
  freq: -1
  freq_units: days
  freq: -1 days
  time_units: hours
  unlimdim: records
  write_file: false