Test new BNU soil type and VIIRS vegetation type data #821

Closed
GeorgeGayno-NOAA opened this issue May 5, 2023 · 35 comments · Fixed by #843
Assignees
Labels
enhancement New feature or request

Comments

@GeorgeGayno-NOAA
Collaborator

GeorgeGayno-NOAA commented May 5, 2023

A new version of the VIIRS vegetation type data was created that removes spurious snow points in warm regions.

On hera: /scratch2/NCEPDEV/land/data/input_data/vegetation_type/vegetation_type.viirs.v3.igbp.30s.nc

A new version of the BNU soil type data was created that is compatible with the updated vegetation data. It also includes additional records for the percentage of sand and clay.

On hera: /scratch2/NCEPDEV/land/data/input_data/soil_type/soil_type.bnu.v3.30s.nc

Test these new data in sfc_climo_gen and regenerate all GFS grids (including new soil color field).

Using 30s data requires more MPI tasks. Because of a quirk in the ESMF library, using too many tasks with coarse input data, such as our substrate temperature data set (2.6x1.5 degree), will cause a library error. This can be resolved by using a higher resolution version (0.5-degree) of this data:

On hera: /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/UFS_UTILS/fix/sfc_climo.test/substrate_temperature.gfs.0.5.nc

(Code to create the new 0.5-degree data on Hera: /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/climo_fields_netcdf/sorc/fv3.tbot.gfs.hires.netcdf)
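To put the two substrate temperature grids side by side, here is a small sketch using the dimensions reported further down in this thread (138 x 116 for the 2.6x1.5-degree file, 720 x 360 for the 0.5-degree file). The 84-task count is the 7 nodes x 12 tasks per node used in one of the timing runs. How ESMF actually decomposes a grid across tasks is not described in this issue, so the rows-per-task figure is only a rough illustration, and both helper names are hypothetical.

```shell
# Illustrative only: compare the size of the coarse and 0.5-degree
# substrate temperature grids. Helper names are hypothetical.
grid_points() {
  # $1 = nx, $2 = ny
  echo $(( $1 * $2 ))
}

rows_per_task() {
  # $1 = ny, $2 = number of MPI tasks; average rows per task, one decimal.
  awk -v ny="$1" -v n="$2" 'BEGIN { printf "%.1f\n", ny / n }'
}

ntasks=$(( 7 * 12 ))
echo "2.6x1.5 degree: $(grid_points 138 116) points, $(rows_per_task 116 "$ntasks") rows/task"
echo "0.5 degree:     $(grid_points 720 360) points, $(rows_per_task 360 "$ntasks") rows/task"
```

The coarse grid has only slightly more latitude rows than tasks in this configuration, which is at least consistent with the library error appearing only when many tasks are combined with coarse input.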

Related to #803.

@GeorgeGayno-NOAA
Collaborator Author

Sanath is testing this dataset. However, he does not need the sand and clay records. I will create a branch where sfc_climo_gen ignores these records for now (the program will crash when it encounters these additional records).

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue May 5, 2023
@GeorgeGayno-NOAA GeorgeGayno-NOAA changed the title Add processing of sand and clay percentages to sfc_climo_gen Test new BNU soil type and VIIRS vegetation type data May 12, 2023
@GeorgeGayno-NOAA
Collaborator Author

GeorgeGayno-NOAA commented May 12, 2023

Test the new data with the sfc_climo_gen utility scripts on Hera.

Using f9e63a3, I tried creating a global C1152 grid using the "fraction of each category" method (vegsoilt_frac=.true.). I chose these resources:

#SBATCH --nodes=6 --ntasks-per-node=9
#SBATCH --partition=bigmem
#SBATCH -q debug
#SBATCH -t 00:30:00

After 30 minutes, the job timed out before making it past the vegetation type processing, which is the first field processed.

I reran the same grid, but with vegsoilt_frac=.false. and these resources:

#SBATCH --nodes=3 --ntasks-per-node=14
#SBATCH --partition=bigmem
#SBATCH -q debug
#SBATCH -t 00:30:00

The job ran successfully in about 6 minutes.

Using vegsoilt_frac=.true. is very expensive.

@GeorgeGayno-NOAA
Collaborator Author

Run some timing tests using the vegsoilt_frac=.true. option (on Hera).

For a C384 grid:

#SBATCH --nodes=6 --ntasks-per-node=12
#SBATCH --partition=bigmem
#SBATCH --exclusive

It ran successfully in 22 minutes.

@GeorgeGayno-NOAA
Collaborator Author

The above test was rerun at C768. It took 25 minutes.

(Note: The first try timed out after 30 minutes. The veg file was created at about 28 minutes, so I assumed this test would take about an hour. That was not the case this time. I don't know why run times are so variable on Hera. Requesting 'exclusive' use of the nodes does not eliminate the variability.)

@GeorgeGayno-NOAA
Collaborator Author

Retried running a C1152 grid with vegsoilt_frac=.true. on Hera. Requested the following resources:

#SBATCH --nodes=7 --ntasks-per-node=12
#SBATCH --partition=bigmem
#SBATCH --exclusive

It ran to completion in one hour and one minute. The files were visualized using ncview and looked good.

@GeorgeGayno-NOAA
Collaborator Author

Tried running a C3072 grid with vegsoilt_frac=.false. on Hera. Used the following resources:

#SBATCH --nodes=7 --ntasks-per-node=12
#SBATCH --partition=bigmem
#SBATCH --exclusive

It ran successfully in 6 minutes.

@GeorgeGayno-NOAA
Collaborator Author

Ran a regional C3445 case on Hera using the 'orog' files here: /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/UFS_UTILS/fix/orog/C3445

Set vegsoilt_frac=.true. and used the following resources:

#SBATCH --nodes=5 --ntasks-per-node=12
#SBATCH --partition=bigmem
#SBATCH --exclusive

It ran successfully in 16 minutes.

@GeorgeGayno-NOAA
Collaborator Author

The timing tests above were done to determine whether lower-resolution versions of the new 'viirs' and 'bnu' data are needed. As long as a developer has access to a machine with large memory, only the high-resolution (30s) data is needed. On Hera, users can request 'bigmem' nodes, and WCOSS2 has sufficient memory as well. However, users trying to run on a laptop could have problems.

@barlage - I recommend we don't create lower-resolution versions of these files right now. They can always be created later.
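For reference, the timing runs reported above can be collected in one place. This is only a sketch that restates the numbers from the earlier comments (all on Hera 'bigmem' nodes) and computes the total MPI task count for each run. The function names are hypothetical and nothing here is part of sfc_climo_gen.

```shell
# Total MPI tasks for a run: $1 = nodes, $2 = tasks per node.
total_tasks() {
  echo $(( $1 * $2 ))
}

# Fields: grid, vegsoilt_frac, nodes, tasks per node, reported result.
# The C768 run had one earlier attempt that timed out after 30 minutes.
summarize_runs() {
  while read -r grid frac nodes tpn result; do
    printf '%-15s vegsoilt_frac=%-8s tasks=%-3s %s\n' \
      "$grid" "$frac" "$(total_tasks "$nodes" "$tpn")" "$result"
  done <<'EOF'
C1152 .true. 6 9 timed out after 30 min
C1152 .false. 3 14 about 6 min
C384 .true. 6 12 22 min
C768 .true. 6 12 25 min
C1152 .true. 7 12 61 min
C3072 .false. 7 12 6 min
C3445-regional .true. 5 12 16 min
EOF
}

summarize_runs
```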

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue May 31, 2023
@GeorgeGayno-NOAA
Collaborator Author

GeorgeGayno-NOAA commented Jun 1, 2023

So the following snow contaminated files should be removed from all machines:

Under ./fix/sfc_climo/20221017:

  • soil_type.bnu.30s.nc
  • vegetation_type.viirs.igbp.0.03.nc
  • vegetation_type.viirs.igbp.0.1.nc
  • vegetation_type.viirs.igbp.conus.30s.nc
  • vegetation_type.viirs.igbp.0.05.nc
  • vegetation_type.viirs.igbp.30s.nc
  • vegetation_type.viirs.igbp.nh.30s.nc

Under fix/sfc_climo/20220805:

  • vegetation_type.viirs.igbp.0.03.nc
  • vegetation_type.viirs.igbp.0.05.nc
  • vegetation_type.viirs.igbp.0.1.nc
  • vegetation_type.viirs.igbp.conus.0.01.nc

The updated files (without snow contamination) should be added somewhere under fix/sfc_climo

  • /scratch2/NCEPDEV/land/data/input_data/vegetation_type/vegetation_type.viirs.v3.igbp.30s.nc
  • /scratch2/NCEPDEV/land/data/input_data/soil_type/soil_type.bnu.v3.30s.nc

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Jun 1, 2023
@GeorgeGayno-NOAA
Collaborator Author

So the following snow contaminated files should be removed from all machines:

Under ./fix/sfc_climo/20221017:

  • soil_type.bnu.30s.nc
  • vegetation_type.viirs.igbp.0.03.nc
  • vegetation_type.viirs.igbp.0.1.nc
  • vegetation_type.viirs.igbp.conus.30s.nc
  • vegetation_type.viirs.igbp.0.05.nc
  • vegetation_type.viirs.igbp.30s.nc
  • vegetation_type.viirs.igbp.nh.30s.nc

Under fix/sfc_climo/20220805:

  • vegetation_type.viirs.igbp.0.03.nc
  • vegetation_type.viirs.igbp.0.05.nc
  • vegetation_type.viirs.igbp.0.1.nc
  • vegetation_type.viirs.igbp.conus.0.01.nc

The updated files (without snow contamination) should be added somewhere under fix/sfc_climo

  • /scratch2/NCEPDEV/land/data/input_data/vegetation_type/vegetation_type.viirs.v2.igbp.30s.nc
  • /scratch2/NCEPDEV/land/data/input_data/soil_type/soil_type.bnu.v2.30s.nc

Tagging @WalterKolczynski-NOAA

@WalterKolczynski-NOAA
Contributor

@GeorgeGayno-NOAA Please stage a new, complete sfc_climo directory on Hera and I will copy it to the authoritative fix locations as 20230601. Then we can update the version used by ufs_utils and global-workflow. I do not want to modify/remove files from set existing versions.

@WalterKolczynski-NOAA
Contributor

@GeorgeGayno-NOAA @HelinWei-NOAA Is this NOAA-EMC/global-workflow#1617 or separate? If it is separate, there should be a g-w issue for it.

Sorry I didn't lead with that, there have been multiple simultaneous fix requests and I got them mixed up.

@WalterKolczynski-NOAA
Contributor

Note: I am on leave next week, so if we can't get this done Friday, it will sit for a week.

@HelinWei-NOAA
Collaborator

@barlage What is your opinion? Are all data ready?

@GeorgeGayno-NOAA
Collaborator Author

@GeorgeGayno-NOAA @HelinWei-NOAA Is this NOAA-EMC/global-workflow#1617 or separate? If it is separate, there should be a g-w issue for it.

Sorry I didn't lead with that, there have been multiple simultaneous fix requests and I got them mixed up.

It is separate. NOAA-EMC/global-workflow#1617 is for soil color, which is not the same as soil type.

@barlage
Collaborator

barlage commented Jun 2, 2023

@HelinWei-NOAA there was a minor issue encountered yesterday with the processing of the high res data in sfc_climo_gen, namely the number of processors required to generate grids with the high res data is more than can be handled with the very low res slope data (limitation of ESMF?). @GeorgeGayno-NOAA has a potential fix but @sanatcumar still needs to generate all the tiled files (sanath - you may need to create these in parallel to speed up the process). I'm hopeful that we can still get this in before @WalterKolczynski-NOAA goes on leave.

@WalterKolczynski-NOAA
Contributor

@GeorgeGayno-NOAA @HelinWei-NOAA Is this NOAA-EMC/global-workflow#1617 or separate? If it is separate, there should be a g-w issue for it.
Sorry I didn't lead with that, there have been multiple simultaneous fix requests and I got them mixed up.

It is separate. NOAA-EMC/global-workflow#1617 is for soil color, which is not the same as soil type.

Okay, then please create a new global-workflow issue using the "Fix File Update" template.

@sanatcumar
Collaborator

sanatcumar commented Jun 5, 2023

Successfully tested this version of sfc_climo_gen on Hera.

Regenerated a set of surface fields with the following settings:

in /util/sfc_climo_gen/sfc_gen.sh
export vegsoilt_frac=.true. # for res lower and equal to 1152
export vegsoilt_frac=.false. # for res higher than 1152
export FIX_FV3=/scratch1/NCEPDEV/global/glopara/fix/orog/20220805/C${res}

in /ush/sfc_climo_gen.sh
SOIL_TYPE_FILE="/scratch2/NCEPDEV/land/data/input_data/soil_type/soil_type.bnu.v2.30s.nc"
VEG_TYPE_FILE="/scratch2/NCEPDEV/land/data/input_data/vegetation_type/vegetation_type.viirs.v2.igbp.30s.nc"

results at
/scratch2/NCEPDEV/stmp1/Sanath.Kumar/repo_viirs30sV2_bnuV2
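The resolution rule described above (fractional categories up to and including C1152, dominant category above that) could be expressed as a small helper. This is a hedged sketch: `select_vegsoilt_frac` is a hypothetical function name and is not part of sfc_gen.sh, which sets the variable with a plain `export`.

```shell
# Hypothetical helper: pick the vegsoilt_frac setting from the
# cubed-sphere resolution number (e.g. 384 for C384).
select_vegsoilt_frac() {
  if [ "$1" -le 1152 ]; then
    echo ".true."
  else
    echo ".false."
  fi
}

for res in 384 768 1152 3072; do
  echo "C${res}: vegsoilt_frac=$(select_vegsoilt_frac "$res")"
done
```

In a driver script this might be used as `export vegsoilt_frac=$(select_vegsoilt_frac "$res")`, assuming `res` holds the bare resolution number.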

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 11, 2023
@GeorgeGayno-NOAA
Collaborator Author

@barlage and @sanatcumar - This is just about ready to merge. Should we merge this before or after #825?

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 14, 2023
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 15, 2023
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 18, 2023
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 18, 2023
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 18, 2023
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 21, 2023
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 21, 2023
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 21, 2023
@GeorgeGayno-NOAA
Collaborator Author

The new data was tested on a C96 global uniform grid. The branch at 397cd43 was used on Hera. A local copy of the ./fix/sfc_climo data was created under my personal directory. This directory contained the new veg, soil and substrate temperature data.

To test the current data (with the spurious snow points), the ./ush/sfc_climo_gen.sh script was updated to point to the official directory as follows:

 cat << EOF > ./fort.41
 &config
 input_facsf_file="${input_sfc_climo_dir}/facsf.1.0.nc"
-input_substrate_temperature_file="${input_sfc_climo_dir}/substrate_temperature.gfs.0.5.nc"
+input_substrate_temperature_file="/scratch1/NCEPDEV/global/glopara/fix/sfc_climo/20221017/substrate_temperature.2.6x1.5.nc"
 input_maximum_snow_albedo_file="${input_sfc_climo_dir}/maximum_snow_albedo.0.05.nc"
 input_snowfree_albedo_file="${input_sfc_climo_dir}/snowfree_albedo.4comp.0.05.nc"
 input_slope_type_file="${input_sfc_climo_dir}/slope_type.1.0.nc"
-input_soil_type_file="${SOIL_TYPE_FILE}"
+input_soil_type_file="/scratch1/NCEPDEV/global/glopara/fix/sfc_climo/20221017/soil_type.bnu.30s.nc"
 input_soil_color_file="${input_sfc_climo_dir}/soil_color.clm.0.05.nc"
-input_vegetation_type_file="${VEG_TYPE_FILE}"
+input_vegetation_type_file="/scratch1/NCEPDEV/global/glopara/fix/sfc_climo/20221017/vegetation_type.viirs.igbp.30s.nc"
 input_vegetation_greenness_file="${input_sfc_climo_dir}/vegetation_greenness.0.144.nc"
 mosaic_file_mdl="$mosaic_file"

@GeorgeGayno-NOAA
Collaborator Author

Results of the above test: All vegetation type files were identical except for tile6. Using the nccmp utility, there were 9 points that were different:

/scratch2/NCEPDEV/stmp1/George.Gayno/my_grids.branch/C96/fix_sfc $ nccmp -dmfsqS C96.vegetation_type.tile6.nc /scratch2/NCEPDEV/stmp1/George.Gayno/my_grids.develop/C96/fix_sfc/C96.vegetation_type.tile6.nc
Variable        Group Count Sum AbsSum Min Max Range Mean StdDev
vegetation_type /         9 -36     36  -4  -4     0   -4      0

The differences were confined to Antarctica. The new data has category '11' (Permanent wetlands). That does not make sense.

Screenshot (45)
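A quick arithmetic check of the nccmp statistics row: 9 differing points with a mean difference of -4 give the reported sum of -36. The second step below rests on an assumption that is not confirmed in this thread, namely that the difference is reported as first file minus second file (branch minus develop). Under that assumption the develop value at those points would be 15, which is "Snow and Ice" in the IGBP classification. The function names are hypothetical.

```shell
# Consistency check on the nccmp statistics row (Count, Mean, Sum).
stats_sum() {
  # $1 = count, $2 = mean difference
  echo $(( $1 * $2 ))
}

# ASSUMPTION: difference = branch value minus develop value.
implied_develop_category() {
  # $1 = category in the new (branch) data, $2 = reported difference
  echo $(( $1 - $2 ))
}

echo "count x mean = $(stats_sum 9 -4)"
echo "implied develop category = $(implied_develop_category 11 -4)"
```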

@barlage FYI.

@GeorgeGayno-NOAA
Collaborator Author

Differences between the current and new substrate temperature data were larger. The new data is a remapping of the current global 2.6x1.5 degree dataset to a 0.5-degree dataset. The copygb utility was used for this mapping. For details see (on Hera) /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/climo_fields_netcdf/sorc/fv3.tbot.gfs.hires.netcdf

Here is a plot of tile 3 - using the new data, current data, and the new minus the current. In the broad sense, the plots are very similar. But near the poles, the differences can approach 2 degrees.

Screenshot (46)

@sanatcumar and @barlage - I rechecked my remapping procedure and I don't see anything I am doing wrong. Should we be concerned about these differences? Or should we implement with the idea that we will replace this dataset (which is probably about 30 years old) with a newer and higher resolution dataset? Recall, the only reason for remapping to 0.5-degree is to avoid that quirky ESMF library issue.

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Aug 21, 2023
@GeorgeGayno-NOAA
Collaborator Author

Differences between the current and new substrate temperature data were larger. The new data is a remapping of the current global 2.6x1.5 degree dataset to a 0.5-degree dataset. The copygb utility was used for this mapping. For details see (on Hera) /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/climo_fields_netcdf/sorc/fv3.tbot.gfs.hires.netcdf

Here is a plot of tile 3 - using the new data, current data, and the new minus the current. In the broad sense, the plots are very similar. But near the poles, the differences can approach 2 degrees.

Screenshot (46)

@sanatcumar and @barlage - I rechecked my remapping procedure and I don't see anything I am doing wrong. Should we be concerned about these differences? Or should we implement with the idea that we will replace this dataset (which is probably about 30 years old) with a newer and higher resolution dataset? Recall, the only reason for remapping to 0.5-degree is to avoid that quirky ESMF library issue.

I think the problem may be a mistake in the geo referencing in the current file, not the new file. Investigating.

@GeorgeGayno-NOAA
Collaborator Author

After some investigation, I found that the latitude records in the official substrate temperature file (/scratch1/NCEPDEV/global/glopara/fix/sfc_climo/20221017/substrate_temperature.2.6x1.5.nc) are not correct.

The grib1 version of that file has this geo-referencing:

/scratch1/NCEPDEV/global/glopara/fix/am/20220805 $ wgrib -V global_tg3clim.2.6x1.5.grb
Using NCEP opn table, see -ncep_opn, -ncep_rean options
rec 1:0:date 1900010100 TMP kpds5=11 kpds6=111 kpds7=500 levels=(1,244) grid=255 500 cm down 0-1yr product:ave@1yr:
  TMP=Temp. [K]
  timerange 51 P1 0 P2 1 TimeU 4  nx 138 ny 116 GDS grid 0 num_in_ave 1 missing 0
  center 7 subcenter 0 process 80 Table 2 scan: WE:NS winds(N/S)
  latlon: lat  90.000000 to -88.125000 by 1.539000  nxny 16008
          long 0.000000 to -1.875000 by 2.609000, (138 x 116) scan 0 mode 128 bdsgrid 1
  min/max data 215.5 302.4  num bits 10  BDS_Ref 2155  DecScale 1 BinScale 0

The latitude of the first point is 90.0. In the official netcdf version used by sfc_climo_gen, the first few latitudes are:

lat = 89.2305, 87.6915, 86.1525, 84.6135,

So, the data are shifted to the south by about 0.77 degrees, which is half of the 1.539-degree latitude spacing in that file.
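The size of the shift can be checked directly from the numbers quoted above: the first latitude in the GRIB header (90.0), the first latitude in the netcdf file (89.2305), and the 1.539-degree latitude increment. This is only an arithmetic sketch with hypothetical helper names.

```shell
# Difference between two latitudes, four decimal places.
lat_shift() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.4f\n", a - b }'
}

# Half of a grid spacing, four decimal places.
half_spacing() {
  awk -v d="$1" 'BEGIN { printf "%.4f\n", d / 2 }'
}

echo "shift of first row:   $(lat_shift 90.0 89.2305) degrees"
echo "half of grid spacing: $(half_spacing 1.539) degrees"
```

Both values come out the same, which suggests the netcdf latitudes were offset by half a grid cell relative to the GRIB header.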

@GeorgeGayno-NOAA
Collaborator Author

I recreated the OPS substrate temperature file using this code on Hera - /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/climo_fields_netcdf/sorc/fv3.tbot.gfs.netcdf.bugfix

The latitudes now match the grib header:

/scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/UFS_UTILS/fix/sfc_climo.test $ ncdump -v lat substrate_temperature.2.6x1.5.corrected.nc
netcdf substrate_temperature.2.6x1.5.corrected {
dimensions:
        idim = 138 ;
        jdim = 116 ;
        jdim_p1 = 117 ;
        time = 1 ;

data:

 lat = 90, 88.451087, 86.902174, 85.353261, 
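One way to read the corrected values: they match evenly spaced latitudes between the GRIB header endpoints (90.0 and -88.125) over ny = 116 rows. Whether the bugfix code computes them this way is not stated in this thread, so treat the sketch below as an interpretation of the numbers. The function name is hypothetical.

```shell
# Print the first N latitudes of an evenly spaced grid.
# $1 = first latitude, $2 = last latitude, $3 = number of rows, $4 = N
lats_from_endpoints() {
  awk -v first="$1" -v last="$2" -v ny="$3" -v n="$4" 'BEGIN {
    d = (first - last) / (ny - 1)      # spacing implied by the endpoints
    for (i = 0; i < n; i++)
      printf "%s%.6f", (i ? " " : ""), first - i * d
    printf "\n"
  }'
}

lats_from_endpoints 90 -88.125 116 4
```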

@GeorgeGayno-NOAA
Collaborator Author

The new 0.5-degree version of the substrate data was created using a two-step method.

First, the copygb utility was used to convert the grib1 file (/scratch1/NCEPDEV/global/glopara/fix/am/20220805/global_tg3clim.2.6x1.5.grb) to a global 0.5-degree grid (NCEP grid 235, https://www.nco.ncep.noaa.gov/pmb/docs/on388/tableb.html#GRID235). copygb uses the grib header to geo-reference the input data.

Then, the 0.5-degree grib1 file was converted to netcdf. The latitudes are correct:

/scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/UFS_UTILS/fix/sfc_climo.test $ ncdump -v lat substrate_temperature.gfs.0.5.nc
netcdf substrate_temperature.gfs.0.5 {
dimensions:
        idim = 720 ;
        jdim = 360 ;
        jdim_p1 = 361 ;
        time = 1 ;
data:

 lat = 89.75, 89.25, 88.75, 88.25, 87.75, 87.25
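The latitudes in the ncdump output follow a simple pattern: 360 rows at 0.5-degree spacing, starting at 89.75. A minimal sketch of that pattern, assuming the spacing stays uniform for all rows (the dump above shows only the first six), with a hypothetical helper name:

```shell
# Latitude of a row on the 0.5-degree grid, $1 = zero-based row index.
# ASSUMPTION: uniform 0.5-degree spacing starting at 89.75.
half_degree_lat() {
  awk -v i="$1" 'BEGIN { printf "%.2f\n", 89.75 - 0.5 * i }'
}

for row in 0 1 2 3 4 5; do
  half_degree_lat "$row"
done

# Under the same assumption, the last of the 360 rows:
half_degree_lat 359
```

Under that assumption the rows run symmetrically from 89.75 to -89.75, so no row sits exactly on a pole.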

@sanatcumar
Collaborator

Good catch.

@GeorgeGayno-NOAA
Collaborator Author

Run a quick test with the branch at 5ea89bc. Use a C96 global uniform grid and the corrected OPS substrate data. Then, repeat the test using the 0.5-degree data.

The difference on tile3 is now much smaller and makes sense.

Screenshot (48)

The 0.5-degree data is correct and should be implemented.

@GeorgeGayno-NOAA
Collaborator Author

The 'v2' versions of the soil and veg type data still had problems with spurious glaciers. So a version 3 of these files was created.

Updated (and hopefully final) list of files that should be added or removed from all machines:

The following snow contaminated files should be removed:

Under ./fix/sfc_climo/20221017:

  • soil_type.bnu.30s.nc
  • vegetation_type.viirs.igbp.0.03.nc
  • vegetation_type.viirs.igbp.0.1.nc
  • vegetation_type.viirs.igbp.conus.30s.nc
  • vegetation_type.viirs.igbp.0.05.nc
  • vegetation_type.viirs.igbp.30s.nc
  • vegetation_type.viirs.igbp.nh.30s.nc

Under fix/sfc_climo/20220805:

  • vegetation_type.viirs.igbp.0.03.nc
  • vegetation_type.viirs.igbp.0.05.nc
  • vegetation_type.viirs.igbp.0.1.nc
  • vegetation_type.viirs.igbp.conus.0.01.nc

The updated soil and vegetation files (without snow contamination) should be added somewhere under fix/sfc_climo

  • /scratch2/NCEPDEV/land/data/input_data/vegetation_type/vegetation_type.viirs.v3.igbp.30s.nc
  • /scratch2/NCEPDEV/land/data/input_data/soil_type/soil_type.bnu.v3.30s.nc

A new higher-resolution version of the 2.6x1.5 degree soil substrate temperature data should also be added:

  • /scratch1/NCEPDEV/da/George.Gayno/ufs_utils.git/UFS_UTILS/fix/sfc_climo.test/substrate_temperature.gfs.0.5.nc

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Sep 21, 2023
@GeorgeGayno-NOAA
Collaborator Author

The 'v3' versions of the soil and vegetation type data were tested by @sanatcumar. All output looked correct.

GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Sep 22, 2023
GeorgeGayno-NOAA added a commit to GeorgeGayno-NOAA/UFS_UTILS that referenced this issue Sep 26, 2023
that contains the new soil and veg files.

Fixes ufs-community#821.
GeorgeGayno-NOAA added a commit that referenced this issue Sep 26, 2023
Updates to use new versions of the BNU and VIIRS data with spurious ice
points in the tropics removed, and a new higher-res (0.5-degree) version of
the GFS soil substrate data.

Fixes #821 
Fixes #803