Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass annual-average fields to GLC #1413

Merged
merged 14 commits into from
Apr 27, 2017
Merged

Pass annual-average fields to GLC #1413

merged 14 commits into from
Apr 27, 2017

Conversation

billsacks
Copy link
Member

@billsacks billsacks commented Apr 24, 2017

These changes are from Mariana Vertenstein (primarily) and myself

Until now, the CESM coupler has been sending CISM daily-average fields,
and then CISM has been averaging these fields to annual-averages in
order to do its mass balance time step using annual-average SMB (and
temperature). This has a number of problems:

(1) We could not restart mid-year, because CISM doesn't write the
necessary partial accumulation information to its restart files.

(2) With Bill Lipscomb's new remapping (bilinear + conservation
correction; coming soon), there were too-large corrections needed
when doing the remapping from CLM to CISM on daily fields.

(3) This led to a discrepancy between TG and BG/IG compsets: In BG/IG,
CISM received the annual average of remapped daily fields; in TG,
CISM received the remapped annual average fields. The remapping is
not a linear operation, so there would typically be differences in
these two runs.

This tag, together with cism2_1_30, allows the coupler to make annual
averages before remapping fields to CISM. This was a bit tricky: CISM
still needs to be called once per day in order to support mid-year
restarts (CESM can only restart at a frequency when all components are
called). So now the GLC calling frequency has been separated from the
GLC averaging frequency. Then GLC/CISM is passed a flag saying whether
the inputs are currently valid; CISM only accumulates averages of the
inputs when they are valid. (In the standard case, this means that CISM
accumulates a single value each year, leaving all of the averaging up to
the coupler. But this supports the ability to still do the averaging in
CISM if this is desired - however, note that mid-year restarts won't
work in that case [CISM now aborts if you try to do a mid-year restart
in that configuration].)

This is implemented in such a way that ACME runs with MPAS-LI should be
unaffected: the default GLC_AVG_PERIOD for ACME is
'glc_coupling_period', which makes things collapse to the old behavior.

There are some mostly-unrelated changes in
src/drivers/mct/cime_config/buildnml and
src/drivers/mct/cime_config/namelist_definition_drv.xml. These use
add_default rather than set_value for the various *_cpl_dt entries,
because in general we should not be calling set_value directly. And,
since we're using add_default, we needed to add
skip_default_entry="true" to the relevant xml variables.

In CESM, this change must be coordinated with cism2_1_30 and
CESM-Development/cime_config#6

Test suite: scripts_regression_tests on yellowstone
Also, aux_glc test suite with cism2_1_30:

  • out-of-the-box: tests pass, and bit-for-bit except for IG tests
    without test_coupling
  • with GLC_AVG_PERIOD = 'glc_coupling_period': tests pass and
    bit-for-bit, except for expected failures in tests that restart
    mid-year
    Test baseline: For aux_glc tests, baseline was cism2_1_29
    Test namelist changes: YES - adds new glc_avg_period
    Test status: greater than roundoff-level changes for multi-year runs
    with CISM, other than TG

Fixes none

User interface changes?: new xml variable, GLC_AVG_PERIOD

Code review:

cc @whlipscomb @JeremyFyke @stephenprice @matthewhoffman

Mariana Vertenstein and others added 13 commits April 5, 2017 21:12
Mostly changes comments and documentation

Also removes a trailing comma from a list in python
I'm not sure this is necessary, but it seems safest
Mariana had changed this to use glcrun_avg_alarm. But I think this work
should really be done every time glc is run, even if it's not an
averaging time.
If specifying GLC_AVG_PERIOD = 'glc_coupling_period', we'll
automatically set the averaging clock to match the glc run clock. This
provides more robustness than the earlier hard-coded 'daily' option, in
case the glc coupling period is not actually daily.
Abort if it's ever the case that glcrun_avg_alarm is true, but
glcrun_alarm is false
In principle we shouldn't need to zero the fields at these times -
instead, glc should just ignore the fields at these times. However, some
tests (like an ERS or ERI test that stops the final run segment
mid-year) can fail if we don't explicitly zero the fields, because the
x2g fields can then differ upon restart.

The refactored logic in cesm_comp_mod also results in these changes:

(1) This code:

         !----------------------------------------------------
         !| cpl -> glc
         !----------------------------------------------------

         if (iamin_CPLALLGLCID .and. glc_prognostic) then
            call component_exch(glc, flow='x2c', &
                 infodata=infodata, infodata_string='cpl2glc_run', &
                 mpicom_barrier=mpicom_CPLALLGLCID, run_barriers=run_barriers, &
                 timer_barrier='CPL:C2G_BARRIER', timer_comp_exch='CPL:C2G', &
                 timer_map_exch='CPL:c2g_glcx2glcg', timer_infodata_exch='CPL:c2g_infoexch')
         endif

will now only be executed if glcrun_alarm is true. That's what happened
on master, but the previous version of the branch executed it even if
glcrun_alarm was false.

(2) This code:

                  call cesm_comp_barriers(mpicom=mpicom_CPLID, timer='CPL:GLCPREP_BARRIER')
                  call t_drvstartf ('CPL:GLCPREP',cplrun=.true.,barrier=mpicom_CPLID)
                  if (drv_threading) call seq_comm_setnthreads(nthreads_CPLID)

and then:

                  if (drv_threading) call seq_comm_setnthreads(nthreads_GLOID)
                  call t_drvstopf  ('CPL:GLCPREP',cplrun=.true.)

will happen even if .not. glcrun_avg_alarm
@@ -2171,6 +2190,8 @@ subroutine cesm_run()
if (tod == 0) t24hr_alarm = .true.
if (month==1 .and. day==1 .and. tod==0) t1yr_alarm = .true.

! TODO(wjs, 2017-04-05) I think glcrun_alarm can be removed from infodata: It used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we go ahead and do this as part of the PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made this change, and am rerunning testing

This used to be used by CLM, but is no longer needed.
Copy link
Collaborator

@apcraig apcraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me.

@apcraig
Copy link
Collaborator

apcraig commented Apr 25, 2017

I'm unclear about this statement "Then GLC/CISM is passed a flag saying whether the inputs are currently valid; CISM only accumulates averages of the inputs when they are valid.". I thought the change was that the coupler was accumulating the input. Should this say, "CISM only runs when the inputs are valid"?

@billsacks
Copy link
Member Author

@apcraig - thanks for your review. Sorry, I realized the wording was confusing there because Ieft out a subtlety. I have reworded that paragraph to the following; I hope this makes things more clear:

This tag, together with cism2_1_30, allows the coupler to make annual
averages before remapping fields to CISM. This was a bit tricky: CISM
still needs to be called once per day in order to support mid-year
restarts (CESM can only restart at a frequency when all components are
called). So now the GLC calling frequency has been separated from the
GLC averaging frequency. Then GLC/CISM is passed a flag saying whether
the inputs are currently valid; CISM only accumulates averages of the
inputs when they are valid. (In the standard case, this means that CISM
accumulates a single value each year, leaving all of the averaging up to
the coupler. But this supports the ability to still do the averaging in
CISM if this is desired - however, note that mid-year restarts won't
work in that case [CISM now aborts if you try to do a mid-year restart
in that configuration].)

@billsacks
Copy link
Member Author

I have rerun scripts_regression_tests on the latest commit, and everything passes except for the expected failures in K_TestCimeCase

@jedwards4b
Copy link
Contributor

There are no expected failures on master - the issues you opened on that have been resolved.
Perhaps you need to rebase?

@@ -1387,7 +1387,7 @@
<!-- group seq_timemgr_inparm -->
<!-- =========================== -->

<entry id="atm_cpl_dt" modify_via_xml="ATM_NCPL">
<entry id="atm_cpl_dt" modify_via_xml="ATM_NCPL" skip_default_entry="true">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are these changes for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mvertens - can you address this? I don't remember, if I ever knew at all.

@@ -1,8 +1,4 @@
!===============================================================================
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this file had a lot of clean-up in addition to the glc-required changes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, by @mvertens

@mvertens
Copy link
Contributor

mvertens commented Apr 27, 2017 via email

@mvertens
Copy link
Contributor

mvertens commented Apr 27, 2017 via email

@rljacob
Copy link
Member

rljacob commented Apr 27, 2017

Why are the skip_default_entry="true" additions needed for the glc feature?

@mvertens
Copy link
Contributor

mvertens commented Apr 27, 2017 via email

@billsacks
Copy link
Member Author

@mvertens - Sorry it took me so long to get this PR submitted from the time you handed me the code.

I have a feeling that the skip_default_entry="true" changes were needed in conjunction with this change:

--- a/src/drivers/mct/cime_config/buildnml
+++ b/src/drivers/mct/cime_config/buildnml
@@ -101,7 +101,7 @@ def _create_drv_namelists(case, infile, confdir, nmlgen, files):
             totaldt = cpl_dt * int(ncpl)
             if totaldt != basedt:
                 expect(False, " %s ncpl doesn't divide base dt evenly" %comp)
-            nmlgen.set_value(comp.lower() + '_cpl_dt', value=cpl_dt)
+            nmlgen.add_default(comp.lower() + '_cpl_dt', value=cpl_dt)
             mindt = min(mindt, cpl_dt)
 #        elif comp.lower() is not 'cpl':

although I'm not sure that brings us any closer to remembering why these changes were put in on this branch.

@mvertens
Copy link
Contributor

mvertens commented Apr 27, 2017 via email

@billsacks
Copy link
Member Author

@mvertens - I just tried changing the ATM_NCPL xml variable and the atm_cpl_dt namelist item in user_nl_cpl, and got the same behavior in master and on this branch: Both allowed me to change ATM_NCPL, and neither allowed me to change atm_cpl_dt in user_nl_cpl.

I know you had a reason for this change: I remember your explaining it to me back in January. I just forget what it was....


!---------------------------------------------------------------
! Description
! Set glc inputs to zero
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this is just to generate consistent 'garbage' for times when the field is not valid. One thought would be to fill it with a very obvious garbage value to help protect against the possibility that GLC accidentally tries to actually use it. But I realize just filling with zeroes might be more aesthetically pleasing and such an approach should not be necessary, so feel free to ignore this comment.

@matthewhoffman
Copy link
Contributor

@billsacks , a cursory review does not reveal anything that concerns me. Thank you for implementing these changes with ACME considerations in mind. I think this capability will be useful for us as well once we start working on refining our coupling interval.

@rljacob
Copy link
Member

rljacob commented Apr 27, 2017

I also didn't see any showstoppers.

@billsacks
Copy link
Member Author

@rljacob and @matthewhoffman - thanks for your reviews of this.

@rljacob - I have added some summarizing information in the top comment, summarizing the changes @mvertens made to the driver namelist generation:

There are some mostly-unrelated changes in
src/drivers/mct/cime_config/buildnml and
src/drivers/mct/cime_config/namelist_definition_drv.xml. These use
add_default rather than set_value for the various *_cpl_dt entries,
because in general we should not be calling set_value directly. And,
since we're using add_default, we needed to add
skip_default_entry="true" to the relevant xml variables.

@matthewhoffman - I see your point about sending garbage, but given the tight timeline here and the fact that you didn't see that as critical, I'm going to leave that as is.

@billsacks billsacks changed the title DO NOT MERGE YET: Pass annual-average fields to GLC Pass annual-average fields to GLC Apr 27, 2017
@billsacks billsacks merged commit fe82bd6 into ESMCI:master Apr 27, 2017
@billsacks billsacks deleted the billsacks/glc_tavg_v2 branch April 27, 2017 20:57
jgfouca added a commit that referenced this pull request Jun 2, 2017
Update CIME to cime5.3

Update ACME to use CIME 5.3.x up to ESMCI hash 6156e0a from Thursday 4/13.

scripts_regression_tests and acme_developer tests pass on penn workstation.

This CIME change involves a large cosmetic change in CIME. Several directories have been renamed.

    The main interface to CIME, the scripts dir remains unchanged.
    cime/cime_config is renamed to cime/config
    The Fortran source code has been consolidated in src directory: externals, components, share, and drivers
    The driver_cpl directory has been moved to src/drivers/mct to prepare for multiple coupler options
    The share/csm_share/share directory has been renamed src/share/util.
    The cime script infrastructure source, utils/python/CIME, has moved to scripts/lib/CIME.

@singhbalwinder : please make sure PNNL cluster merge look correct
@mfdeakin-sandia : please look at cprnc/CMakeLists.txt and also confirm no run_acme changes.

Fixes #1311
Fixes #1380
Fixes #1382
Fixes #1383
Fixes #1396
Fixes #1402
Fixes #1416

[BFB]

* agsalin/update-to-cime5.3: (5649 commits)
  Fix titan problem
  Reset invalid pio_numiotasks
  Fix merge mistake
  Re-add eca testmod, was somehow lost in big merge
  Restore ACME version of shr_orb_cosz
  Fixed whitespace errors in Makefile killing COSP
  HOMME test typo fix
  Fixes to get tests to pass.
  Update ACME for new CIME directory structure
  More renaming.
  Rename ACME/cime directories in prep for CIME merge
  HOMME Improvement
  More edits to README
  Suggest a more portable usage of mktemp
  Tweak a comment
  Add comment on acme side
  Fix case for new unit_testing attribute to get_mpirun
  Allow run_tests.py to auto-determine the machine name
  fix an error
  make user-compset a seperate test
  ...
jgfouca added a commit that referenced this pull request Feb 23, 2018
Update CIME to cime5.3

Update ACME to use CIME 5.3.x up to ESMCI hash 6156e0a from Thursday 4/13.

scripts_regression_tests and acme_developer tests pass on penn workstation.

This CIME change involves a large cosmetic change in CIME. Several directories have been renamed.

    The main interface to CIME, the scripts dir remains unchanged.
    cime/cime_config is renamed to cime/config
    The Fortran source code has been consolidated in src directory: externals, components, share, and drivers
    The driver_cpl directory has been moved to src/drivers/mct to prepare for multiple coupler options
    The share/csm_share/share directory has been renamed src/share/util.
    The cime script infrastructure source, utils/python/CIME, has moved to scripts/lib/CIME.

@singhbalwinder : please make sure PNNL cluster merge look correct
@mfdeakin-sandia : please look at cprnc/CMakeLists.txt and also confirm no run_acme changes.

Fixes #1311
Fixes #1380
Fixes #1382
Fixes #1383
Fixes #1396
Fixes #1402
Fixes #1416

[BFB]

* agsalin/update-to-cime5.3: (5649 commits)
  Fix titan problem
  Reset invalid pio_numiotasks
  Fix merge mistake
  Re-add eca testmod, was somehow lost in big merge
  Restore ACME version of shr_orb_cosz
  Fixed whitespace errors in Makefile killing COSP
  HOMME test typo fix
  Fixes to get tests to pass.
  Update ACME for new CIME directory structure
  More renaming.
  Rename ACME/cime directories in prep for CIME merge
  HOMME Improvement
  More edits to README
  Suggest a more portable usage of mktemp
  Tweak a comment
  Add comment on acme side
  Fix case for new unit_testing attribute to get_mpirun
  Allow run_tests.py to auto-determine the machine name
  fix an error
  make user-compset a seperate test
  ...
jgfouca added a commit that referenced this pull request Mar 13, 2018
Update CIME to cime5.3

Update ACME to use CIME 5.3.x up to ESMCI hash 6156e0a from Thursday 4/13.

scripts_regression_tests and acme_developer tests pass on penn workstation.

This CIME change involves a large cosmetic change in CIME. Several directories have been renamed.

    The main interface to CIME, the scripts dir remains unchanged.
    cime/cime_config is renamed to cime/config
    The Fortran source code has been consolidated in src directory: externals, components, share, and drivers
    The driver_cpl directory has been moved to src/drivers/mct to prepare for multiple coupler options
    The share/csm_share/share directory has been renamed src/share/util.
    The cime script infrastructure source, utils/python/CIME, has moved to scripts/lib/CIME.

@singhbalwinder : please make sure PNNL cluster merge look correct
@mfdeakin-sandia : please look at cprnc/CMakeLists.txt and also confirm no run_acme changes.

Fixes #1311
Fixes #1380
Fixes #1382
Fixes #1383
Fixes #1396
Fixes #1402
Fixes #1416

[BFB]

* agsalin/update-to-cime5.3: (5649 commits)
  Fix titan problem
  Reset invalid pio_numiotasks
  Fix merge mistake
  Re-add eca testmod, was somehow lost in big merge
  Restore ACME version of shr_orb_cosz
  Fixed whitespace errors in Makefile killing COSP
  HOMME test typo fix
  Fixes to get tests to pass.
  Update ACME for new CIME directory structure
  More renaming.
  Rename ACME/cime directories in prep for CIME merge
  HOMME Improvement
  More edits to README
  Suggest a more portable usage of mktemp
  Tweak a comment
  Add comment on acme side
  Fix case for new unit_testing attribute to get_mpirun
  Allow run_tests.py to auto-determine the machine name
  fix an error
  make user-compset a seperate test
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants