
gmtb-gfsphysics optimization: speedup of ccpp init #48

Merged
merged 5 commits into NCAR:features/ccpp on Mar 15, 2018
Conversation

@climbfuji
Collaborator

This PR improves the performance of the CCPP implementation in FV3 v0 by

  • making use of threading (i.e., OpenMP parallel regions) where possible
  • allocating Interstitial(nt) from thread nt rather than thread 1, following the first-touch principle (see the sketch after this list)
  • using new CCPP infrastructure functionality that allows copying an existing CCPP suite data structure from one cdata structure to another
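
As a sketch of the first-touch allocation, assuming hypothetical module, type, and variable names (the actual code lives in GFS_layer/GFS_driver.F90):

```fortran
! Minimal sketch of first-touch allocation of per-thread interstitial data.
! interstitial_t, Interstitial, work, and nwork are illustrative names,
! not the ones used in GFS_layer/GFS_driver.F90.
module interstitial_sketch
   implicit none
   type :: interstitial_t
      real, allocatable :: work(:)
   end type interstitial_t
contains
   subroutine allocate_interstitial(Interstitial, nthrds, nwork)
      type(interstitial_t), intent(inout) :: Interstitial(:)
      integer, intent(in) :: nthrds, nwork
      integer :: nt
      ! schedule(static,1) maps loop index nt to thread nt-1, so each thread
      ! allocates and first-touches its own element; the memory pages then
      ! reside close to the thread that will use them in later time steps.
!$OMP parallel do num_threads(nthrds) private(nt) schedule(static,1)
      do nt = 1, nthrds
         allocate (Interstitial(nt)%work(nwork))
         Interstitial(nt)%work = 0.0
      end do
!$OMP end parallel do
   end subroutine allocate_interstitial
end module interstitial_sketch
```

On a NUMA node, the thread that first writes a memory page determines where the OS places it, which is why allocating and initializing from thread nt instead of thread 1 matters.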

This PR also

  • adds the missing calls to ccpp_finalize in IPD_CCPP_driver.F90 (IPD step 5)
  • performs some cleanup and initializes intent(out) variables in GFS_MP_generic_pre.f90 and GFS_calpreciptype.F90 (see the sketch after this list)
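
The intent(out) initialization matters because any early return would otherwise hand back uninitialized memory. A minimal sketch of the pattern, with illustrative argument names (errmsg/errflg follow the usual CCPP error-handling convention):

```fortran
! Sketch of the intent(out) initialization pattern: every intent(out)
! argument is assigned on entry, so no code path can return uninitialized
! memory. Argument names are illustrative, not those of
! GFS_MP_generic_pre.f90 or GFS_calpreciptype.F90.
subroutine scheme_pre(im, rain, errmsg, errflg)
   implicit none
   integer,          intent(in)  :: im
   real,             intent(out) :: rain(im)
   character(len=*), intent(out) :: errmsg
   integer,          intent(out) :: errflg

   ! Initialize all intent(out) variables before any other work
   rain   = 0.0
   errmsg = ''
   errflg = 0

   ! ... actual computation follows ...
end subroutine scheme_pre
```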

The results are bit-for-bit compatible (tested on Theia/Intel and on a MacBook/GNU, both with OpenMP enabled).

  • …) derived type so that it gets allocated in the right place in memory (first-touch principle): GFS_layer/GFS_driver.F90
  • …ata_block to avoid reading SDF multiple times, cleanup and formatting changes: IPD_layer/IPD_CCPP_driver.F90
@llpcarson
Contributor

Dom - My tests today and yesterday are not giving bit-for-bit results, but I'm not sure that they should. The branch I used yesterday (gmtb-fv3-cleanup-and-error-handling) may or may not have had all of the same fixes as this one. Is this a concern worth tracking down? If your tests show bit-for-bit before/after, that's sufficient for me.

@climbfuji
Collaborator Author

@llpcarson Did you create a new baseline or use my baseline? With the changes to the optimization flags in gmtb-gfsphysics PR47, the results have changed (see the discussion there), and I have not yet updated the "official" baseline. The following one works for me:

/scratch4/BMC/gmtb/Dom.Heinzeller/gmtb-fv3/reference/intel/C96fv3gfs2016092900/

@llpcarson
Contributor

llpcarson commented Mar 14, 2018 via email

@llpcarson
Contributor

Approved

@climbfuji
Collaborator Author

climbfuji commented Mar 14, 2018

My reference is from straight after PR47 went in, i.e., March 12/13, before this morning's merge and the PRs afterwards. I do get bit-for-bit identical results for the C96 test case. But I definitely want to understand why you don't!

@llpcarson
Contributor

llpcarson commented Mar 14, 2018 via email

@climbfuji
Collaborator Author

Great, thank you!

@climbfuji merged commit ed0b2e5 into NCAR:features/ccpp on Mar 15, 2018
@climbfuji deleted the gmtb-gfsphysics-optimization-speedup-of-ccpp-init-rebased branch on March 15, 2018 17:41
hannahcbarnes pushed a commit to hannahcbarnes/ccpp-physics that referenced this pull request Apr 7, 2020
* fv3atm issue NCAR#37: fix the real(8) lat/lon in netcdf file
* fv3atm NCAR#35: Reducing background vertical diffusivities in the inversion layers
* fv3atm NCAR#24: bug in gfsphysics/physics/moninedmf_hafs.f
* fv3atm NCAR#18: Optimize netcdf write component and bugfix for post and samfdeepcnv.f
* set (0-1) bounds for ficein_cpl
* remove cache_size due to older netcdf version 4.5.1 on Mars
* Change ice falling to 0.9 in gfsphysics/physics/gfdl_cloud_microphys.F90
climbfuji pushed a commit to climbfuji/ccpp-physics that referenced this pull request Aug 10, 2020
…r_master

Update gsd/develop from NCAR master