Implement the IEFLX fixer #1303

kaizhangpnl · 2017-03-10T01:26:01Z

This PR implements the IEFLX fixer
(https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer)

These modifications add to those in PR #1301.

The following variable will be added to the master list if ieflx_opt > 0
SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model)

To switch on the IEFLX fixer, set
ieflx_opt = 2

By default, the IEFLX fixer is switched off.

[BFB]

susburrows · 2017-03-10T01:33:50Z

@kaizhangpnl , @rljacob : I assume we should merge this as soon as possible. Kai, I just want to make sure that you are aware that next is currently frozen, so unless Rob says otherwise we will need to wait until things clear up before merging these changes. I'll coordinate with @singhbalwinder , and merge this after he merges #1301 .

kaizhangpnl · 2017-03-10T01:33:52Z

Hi @susburrows I just assigned a pull request to you. I also assigned a different pull request (#1301) to @singhbalwinder, which should be merged to the master before merging this one. I have discussed both pull requests with @golaz. Thanks!

susburrows · 2017-03-10T01:34:32Z

hah, we timed that almost exactly at the same time :)

kaizhangpnl · 2017-03-10T01:35:24Z

:) Thanks for the information. That's fine.

PeterCaldwell · 2017-03-13T23:33:46Z

Hi Kai –

I went through your code and I believe that it basically works as intended – nice job! I do have some questions/comments though. The first question impacts accuracy, the others just make it easier for other people to understand what you’ve done ;-).

Why isn’t the ieflx correction computed just before it’s applied in phys_run2? In addition to being harder to track what’s going on due to the new code being split up, the SST and qflx values in cam_run1 haven’t been updated yet in cam_run1, so you are using unnecessarily stale information. Why not just compute the correction amount in cam_run2 just before you perform the fix?
I had a really hard time checking whether the units in ieflx_gmean are correct (but they are!) because the density of liquid water is just given the name “thousand” without any explanation. Also, none of the variables you define or use have comments giving their meaning or units. While this isn’t unusual, it causes trouble for other people trying to understand your code.
Why did you add 2 namelist parameters to control this behavior? Couldn’t you ditch the logical l_ieflx_fix and only apply the fix when ieflx_opt is nonzero?
You should clean up ieflx_gmean… it initializes a lot of variables that you never use.

kaizhangpnl · 2017-03-14T13:52:55Z

Hi Peter,

Thanks for your comments. Please find my reply below:

Why isn’t the ieflx correction computed just before it’s applied in phys_run2? In addition to being harder to track what’s going on due to the new code being split up, the SST and qflx values in cam_run1 haven’t been updated yet in cam_run1, so you are using unnecessarily stale information. Why not just compute the correction amount in cam_run2 just before you perform the fix?

The idea was to compute IEFLX using consistent QFLX, SST, and PRECT (variables used during the same physics time step). On the other hand, since we see only tiny changes of QFLX & SST between two adjacent time steps, I think moving the calculation to phys_run2 won't do much harm and I am open to the alternative solution. @philrasch and @golaz, what do you think?

I had a really hard time checking whether the units in ieflx_gmean are correct (but they are!) because the density of liquid water is just given the name “thousand” without any explanation. Also, none of the variables you define or use have comments giving their meaning or units. While this isn’t unusual, it causes trouble for other people trying to understand your code.

I will put some additional notes at places where ieflx_gmean is calculated.

Why did you add 2 namelist parameters to control this behavior? Couldn’t you ditch the logical l_ieflx_fix and only apply the fix when ieflx_opt is nonzero?

The extra namelist variable is used to facilitate a diagnostic calculation, i.e. we can diagnose IEFLX without affecting the simulation. I prefer to leave it as is, so that we don't have to implement it again when we try to improve the internal energy treatment in the future.

You should clean up ieflx_gmean… it initializes a lot of variables that you never use.

Well spotted! The three variables (qflx_glob, rain_glob, snow_glob) were used in a version with more detailed diagnostics, but they are not used in the current code. Will remove them.

PeterCaldwell · 2017-03-14T23:30:54Z

Thanks Kai -

I see what you're saying about wanting to use a consistent SST, QFLX, and PRECT. I'm surprised you don't think QFLX will change much between timesteps since wind and surface temperature could change a lot over half-hour timesteps (particularly over land). I'm also not sure whether the ocean coupling interval could affect this sensitivity. Also, while PRECT is computed in tphysbc using "old" SST and QFLX, it is assumed constant over the whole model step, so I'm not sure we can say it is "inconsistent" with the new SST and QFLX after coupling... And I really do mean "I'm not sure" here - your explanation makes sense and I'm not sure whether or not my suggestion of moving things makes sense.

Thanks for adding the comments about units and stuff. It looks much better. I still think changing the name "thousand" to "rho_water" or "water_density" or something would make the code look more professional. ;-)

kaizhangpnl · 2017-03-15T04:53:23Z

Hi peter,

I'm surprised you don't think QFLX will change much between timesteps since wind and surface temperature could change a lot over half-hour timesteps (particularly over land). I'm also not sure whether the ocean coupling interval could affect this sensitivity. Also, while PRECT is computed in tphysbc using "old" SST and QFLX, it is assumed constant over the whole model step, so I'm not sure we can say it is "inconsistent" with the new SST and QFLX after coupling... And I really do mean "I'm not sure" here - your explanation makes sense and I'm not sure whether or not my suggestion of moving things makes sense.

What I wanted to say was that the global mean diagnostics for qflx and ieflx_up between two time steps were pretty small (most values < 1%). I agree that if we look at the grid-box values of qflx and ieflx_up, the difference would be much larger.

I also don't know which way is better. Fortunately, I have archived the detailed output at each time step from a previous test simulation, so I did an offline calculation. Using the global mean diagnostics as input for the two methods discussed here, the relative difference in the calculated daily mean IEFLX is small (about 4%). However, the absolute difference at each time step could be as large as 1W/m2, and the relative difference 100%. This is because ieflx is a small term compared to ieflx_up and ieflx_dn, and a small relative difference in ieflx_up and ieflx_dn could result a large ieflx.

I will leave it to you and @golaz to decide which one to use.

Thanks for adding the comments about units and stuff. It looks much better. I still think changing the name "thousand" to "rho_water" or "water_density" or something would make the code look more professional. ;-)

Done.

PeterCaldwell · 2017-03-16T01:01:45Z

In thinking more about where to compute the correction, it seems like you have the choice between using stale SST and QFLX information or using PRECT which was computed off an earlier (and therefore potentially inconsistent) surface state. My intuition is that stale input is worse than old PRECT information. I've pinged @golaz for his vote.

kaizhangpnl · 2017-03-16T01:12:53Z

I think the best way is to follow what the ocean model does for the energy fix. If new SST and QFLX are used to calculate the internal energy exchange in the ocean model, we'd better adopt your approach, and vice versa.

I've tried the alternative method in a short simulation and it runs well, so if we are going to use the new approach, I can commit it any time.

golaz · 2017-03-22T00:24:56Z

@kaizhangpnl, adding my two cents to the discussion.

I have a slight preference for @PeterCaldwell suggestion in terms of code placement. Since it's not obvious that either option is the better or the more consistent one, Peter's suggestion would offer the advantage of having more of the code changes grouped together, which would make it more readable. But if you feel strongly about this or think this would take a lot of time to accomplish, I can live with the current option.
In subroutine ieflx_gmean, I'm wondering why you are computing separate global means, one for up and one for down. Since calls to gmean involves parallel communications, their cost is not free. Wouldn't it be more efficient to only compute the global mean on the net flux? Maybe I'm missing something.
I would prefer if IEFLX was not output by default, since it is not directly used as a 2d field, only it's global mean is used.
I'm also wondering whether we should keep the original definition for the output variable SHFLX (removing the modifications in cam_diagnostics.F90) and add a new variable that is the correction term to SHFLX (=ieflx_glob). Not sure what the best name for it would be, SHFLXFIX, IEFIX, ...?

worleyph · 2017-03-22T00:37:14Z

@golaz and @kaizhangpnl, ieflx_gmean could be rewritten to calculate the mean of two fields (up and down) in one call to gmean. This has very close to the same MPI cost as doing just one gmean. FYI.

worleyph · 2017-03-22T00:42:52Z

However, unless there are numerical reasons to compute ieup_glob and iedn_glob before differencing, then taking the difference first and then calculating the gmean would be more efficient.

kaizhangpnl · 2017-03-22T04:27:03Z

@golaz and @worleyph : Thanks for your comments. The two global averages (up/down) were calculated for diagnostics and it is indeed better to compute only one global average. I will change the code following your recommendation.

golaz

@kaizhangpnl, thank you for making the changes. The code modifications look good to me now and should be ready to be merged. Just to be safe, I'd like to ask @PeterCaldwell to take one more look at them as well.

PeterCaldwell

I just took a quick glance since I looked at this in detail before and the new changes just seem to move things around and fix issues that I and others have pointed out. Thanks for your hard work on this, Kai!

kaizhangpnl · 2017-04-04T16:05:37Z

Thank you, @PeterCaldwell and @golaz for your review!

susburrows · 2017-04-05T21:51:13Z

I rebased this to current master in order to avoid duplication of ea58ec0 , which appears in PR #1301

susburrows · 2017-04-05T21:52:43Z

@kaizhangpnl , @rljacob : since this is a bug fix PR, is there a bug / issue report that should be associated with it?

rljacob · 2017-04-07T18:06:24Z

Bisect confirmed that this PR generates a runtime error on melvin (GCC 5.3). GCC 5.3 is also available on Anvil to help you resolve this. Please revert this commit from next.

kaizhangpnl · 2017-04-07T20:51:40Z

Thanks for the note. I am trying to identify the reason.

…into next (PR #1303)" This reverts commit c2710c9, reversing changes made to d47e818. This PR broke integration testing. Atmosphere tests on melvin with GNU compiler are failing. This commit reverts the PR merge to next while we investigate the cause of the test failures.

susburrows · 2017-04-07T21:01:20Z

As Rob noted above, this PR caused runtime failures in atmosphere-only compset tests on melvin (gnu compiler). I've reverted the merge to next until we can sort out the issue.

Error message:
At line 259 of file buffer.F90.inAt line 259 of file buffer.F90.inFortran runtime error: Array bound mismatch for dimension 1 of array 'ptr' (16/15)

kaizhangpnl · 2017-04-07T23:06:57Z

Does anyone know how to configure the model to use the gnu compiler? The old method doesn't work with CIME5.

kaizhangpnl · 2017-04-07T23:11:51Z

By the way, the model compiled with Intel ran well in the debug mode on anvil.

rljacob · 2017-04-07T23:21:28Z

We need to know what kind of error you're seeing and what command you're trying to do.

kaizhangpnl · 2017-04-07T23:29:13Z

I used
$xmlchange_exe --id COMPILER --val "gnu"
in the acme runscript. It did modify env_build.xml:

<entry id="COMPILER" value="gnu">

but when I compiled the model, it said:

mpicc -c -I/blues/gpfs/home/kaizhang/model/ACME_20170407_ieflx/cime/share/timing -O2 -fp-model precise -std=gnu99 -O0 -g -DFORTRANUNDERSCORE -DNO_R16 -DCPRINTEL -DHAVE_MPI /blues/gpfs/home/kaizhang/model/ACME_20170407_ieflx/cime/share/timing/gptl.c
gcc: error: precise: No such file or directory
gcc: error: unrecognized command line option ‘-fp-model’
gmake: *** [gptl.o] Error 1

It seems to me that the model was still being compiled by intel.

kaizhangpnl · 2017-04-07T23:41:34Z

my case directory is :

/home/kaizhang/model/cases/ACME_20170407_ieflx_FC5AV1C-L_ne30_ne30_EXP01_DEBUG_gnu_256p

and build directory is:

/lcrc/group/acme/kaizhang/acme_scratch/ACME_20170407_ieflx_FC5AV1C-L_ne30_ne30_EXP01_DEBUG_gnu_256p/bld

rljacob · 2017-04-08T00:36:27Z

Are you trying to use an existing case directory? If so, you have to do "case.setup -c" after changing a compiler.
I can't read your case directory. But this command worked as was able to build:
./create_newcase -case Fgnudebug -compset FC5AV1C-L -res ne30_ne30 -compiler gnu --machine anvil

kaizhangpnl · 2017-04-08T03:39:38Z

Thanks for your help! The command you recommended works for me as well. However, even after "case.setup -c" the method I used (xmlchange_exe --id COMPILER --val "gnu") still fails with the same error message.

Anyway, I reproduced the gnu-runtime-error problem on anvil. The bug has been found and the model runs well now with gnu on anvil. I will commit the bug fix after a further test.

kaizhangpnl · 2017-04-08T16:52:45Z

The bug fix is committed.

susburrows · 2017-04-10T17:37:19Z

Fantastic, thanks Kai! I'll squash your commits before merging. @rljacob : looks like melvin is reporting essentially a clean dashboard. Am I clear to merge this to next for integration testing?

rljacob · 2017-04-10T17:38:19Z

Yes. I think you have to "revert the revert" to put this back on next.

susburrows · 2017-04-10T17:51:52Z

Thanks Rob. I won't have to do that since I will squash the commits together, so it will have a new hash.

Information on the IE fixer: https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer These modifications add to those in PR #1301. The following variable will be added to the master list if ieflx_opt > 0 SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model) To switch on the IEFLX fixer, set ieflx_opt = 2 By default, the IEFLX fixer is switched off. [BFB]

#1303) This PR implements the IEFLX fixer (https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer) These modifications add to those in PR #1301. The following variable will be added to the master list if ieflx_opt > 0 SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model) To switch on the IEFLX fixer, set ieflx_opt = 2 By default, the IEFLX fixer is switched off. [BFB]

This PR implements the IEFLX fixer (https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer) These modifications add to those in PR #1301. The following variable will be added to the master list if ieflx_opt > 0 SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model) To switch on the IEFLX fixer, set ieflx_opt = 2 By default, the IEFLX fixer is switched off. [BFB] * kaizhangpnl/atm/check_energy_bugfix_AND_ieflx: Implement the IEFLX fixer

PeterCaldwell · 2017-04-25T23:10:57Z

@kaizhangpnl - I'm trying to use this change now and I notice that the namelist variable l_ieflx_fix is defined but is no longer actually used anywhere. Also, the impact of different ieflx_opt choices (or even the set of allowable choices) is not explained anywhere. Fixing these things isn't a huge priority, but should probably be done to avoid confusion in the future. Sorry for not catching this in my last review.

kaizhangpnl · 2017-04-26T22:01:56Z

@PeterCaldwell Thanks for your comments. I thought the code (the two equations) were self-explaining, but it seemed not the case. Apart from the simple description of the equations, probably more background information is needed (as documented on the confluence page). I will create a separate push request later.

For now, please see the commit summary and confluence link below:

This PR implements the IEFLX fixer
(https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer)

These modifications add to those in PR #1301.

The following variable will be added to the master list if ieflx_opt > 0
SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model)

To switch on the IEFLX fixer, set
ieflx_opt = 2

By default, the IEFLX fixer is switched off.

[BFB]

@jedwards4b

…ewcase Add input-dir to create_newcase Test suite: code-checker Test baseline: Test namelist changes: Test status: bit for bit Fixes #1303 User interface changes?: None Code review: @jedwards4b

PeterCaldwell · 2017-05-02T15:23:03Z

components/cam/src/physics/cam/check_energy.F90

    if(is_subcol_on()) then
      call pbuf_register_subcol('TEOUT', 'phys_register', teout_idx)
+      call pbuf_register_subcol('IEFLX', 'phys_register', ieflx_idx)


@kaizhangpnl - @tangq noticed that the line just above is a bug: the code will crash if you try to pbuf_register_subcol before you pbuf_add_field. We didn't catch this because we never try running in subcolumn mode. This line isn't needed, however, since the same operation is done correctly just below.

Since this has already been merged to master, please open an issue for this bug.

kaizhangpnl · 2017-05-02T17:05:55Z

@PeterCaldwell Yes, this is a bug. It seems I forgot to remove this line during the last round of code cleanup. I will create a new PR to fix it.

PeterCaldwell · 2017-05-02T17:19:13Z

Yup, no big deal. I missed it in my review and it doesn't affect any of the runs we are doing these days. Might as well fix it though.

kaizhangpnl assigned susburrows Mar 10, 2017

susburrows added Atmosphere BFB PR leaves answers BFB bug fix PR labels Mar 10, 2017

kaizhangpnl requested review from golaz and philrasch March 10, 2017 01:36

golaz requested a review from PeterCaldwell March 28, 2017 23:23

golaz approved these changes Mar 28, 2017

View reviewed changes

PeterCaldwell approved these changes Apr 3, 2017

View reviewed changes

golaz added this to the v1.0beta2 milestone Apr 4, 2017

susburrows changed the title ~~Implemented the IEFLX fixer~~ Implement switch for the IEFLX fixer Apr 5, 2017

susburrows changed the title ~~Implement switch for the IEFLX fixer~~ Implement the IEFLX fixer Apr 5, 2017

susburrows force-pushed the kaizhangpnl/atm/check_energy_bugfix_AND_ieflx branch from 72e839c to af7873e Compare April 5, 2017 21:49

susburrows force-pushed the kaizhangpnl/atm/check_energy_bugfix_AND_ieflx branch from bd7d7d0 to 32edd82 Compare April 10, 2017 17:56

susburrows merged commit 32edd82 into master Apr 11, 2017

PeterCaldwell reviewed May 2, 2017

View reviewed changes

PeterCaldwell mentioned this pull request May 2, 2017

minor bug in atm IEFLX code #1479

Closed

kaizhangpnl mentioned this pull request May 5, 2017

Further modifications to check_energy.F90 #1498

Merged

kaizhangpnl deleted the kaizhangpnl/atm/check_energy_bugfix_AND_ieflx branch July 25, 2017 17:15

rljacob mentioned this pull request Feb 5, 2019

Create F1850SC5-CMIP6 compset #2721

Merged

Implement the IEFLX fixer #1303

Implement the IEFLX fixer #1303

Conversation

kaizhangpnl commented Mar 10, 2017 • edited by susburrows Loading

susburrows commented Mar 10, 2017

kaizhangpnl commented Mar 10, 2017

susburrows commented Mar 10, 2017

kaizhangpnl commented Mar 10, 2017

PeterCaldwell commented Mar 13, 2017

kaizhangpnl commented Mar 14, 2017

PeterCaldwell commented Mar 14, 2017

kaizhangpnl commented Mar 15, 2017

PeterCaldwell commented Mar 16, 2017

kaizhangpnl commented Mar 16, 2017 • edited Loading

golaz commented Mar 22, 2017

worleyph commented Mar 22, 2017

worleyph commented Mar 22, 2017

kaizhangpnl commented Mar 22, 2017

golaz left a comment

Choose a reason for hiding this comment

PeterCaldwell left a comment

Choose a reason for hiding this comment

kaizhangpnl commented Apr 4, 2017

susburrows commented Apr 5, 2017

susburrows commented Apr 5, 2017

rljacob commented Apr 7, 2017

kaizhangpnl commented Apr 7, 2017

susburrows commented Apr 7, 2017 • edited Loading

kaizhangpnl commented Apr 7, 2017

kaizhangpnl commented Apr 7, 2017

rljacob commented Apr 7, 2017

kaizhangpnl commented Apr 7, 2017 • edited Loading

kaizhangpnl commented Apr 7, 2017

rljacob commented Apr 8, 2017

kaizhangpnl commented Apr 8, 2017

kaizhangpnl commented Apr 8, 2017

susburrows commented Apr 10, 2017

rljacob commented Apr 10, 2017

susburrows commented Apr 10, 2017 • edited Loading

PeterCaldwell commented Apr 25, 2017

kaizhangpnl commented Apr 26, 2017

PeterCaldwell May 2, 2017 • edited Loading

Choose a reason for hiding this comment

rljacob May 2, 2017

Choose a reason for hiding this comment

kaizhangpnl commented May 2, 2017

PeterCaldwell commented May 2, 2017

kaizhangpnl commented Mar 10, 2017 •

edited by susburrows

Loading

kaizhangpnl commented Mar 16, 2017 •

edited

Loading

susburrows commented Apr 7, 2017 •

edited

Loading

kaizhangpnl commented Apr 7, 2017 •

edited

Loading

susburrows commented Apr 10, 2017 •

edited

Loading

PeterCaldwell May 2, 2017 •

edited

Loading