Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the IEFLX fixer #1303

Merged
merged 1 commit into from
Apr 11, 2017

Conversation

kaizhangpnl
Copy link
Contributor

@kaizhangpnl kaizhangpnl commented Mar 10, 2017

This PR implements the IEFLX fixer
(https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer)

These modifications add to those in PR #1301.

The following variable will be added to the master list if ieflx_opt > 0
SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model)

To switch on the IEFLX fixer, set
ieflx_opt = 2

By default, the IEFLX fixer is switched off.

[BFB]

@susburrows
Copy link
Contributor

@kaizhangpnl , @rljacob : I assume we should merge this as soon as possible. Kai, I just want to make sure that you are aware that next is currently frozen, so unless Rob says otherwise we will need to wait until things clear up before merging these changes. I'll coordinate with @singhbalwinder , and merge this after he merges #1301 .

@kaizhangpnl
Copy link
Contributor Author

Hi @susburrows I just assigned a pull request to you. I also assigned a different pull request (#1301) to @singhbalwinder, which should be merged to the master before merging this one. I have discussed both pull requests with @golaz. Thanks!

@susburrows
Copy link
Contributor

hah, we timed that almost exactly at the same time :)

@susburrows susburrows added Atmosphere BFB PR leaves answers BFB bug fix PR labels Mar 10, 2017
@kaizhangpnl
Copy link
Contributor Author

:) Thanks for the information. That's fine.

@PeterCaldwell
Copy link
Contributor

Hi Kai –

I went through your code and I believe that it basically works as intended – nice job! I do have some questions/comments though. The first question impacts accuracy, the others just make it easier for other people to understand what you’ve done ;-).

  1. Why isn’t the ieflx correction computed just before it’s applied in phys_run2? In addition to being harder to track what’s going on due to the new code being split up, the SST and qflx values in cam_run1 haven’t been updated yet in cam_run1, so you are using unnecessarily stale information. Why not just compute the correction amount in cam_run2 just before you perform the fix?

  2. I had a really hard time checking whether the units in ieflx_gmean are correct (but they are!) because the density of liquid water is just given the name “thousand” without any explanation. Also, none of the variables you define or use have comments giving their meaning or units. While this isn’t unusual, it causes trouble for other people trying to understand your code.

  3. Why did you add 2 namelist parameters to control this behavior? Couldn’t you ditch the logical l_ieflx_fix and only apply the fix when ieflx_opt is nonzero?

  4. You should clean up ieflx_gmean… it initializes a lot of variables that you never use.

@kaizhangpnl
Copy link
Contributor Author

Hi Peter,

Thanks for your comments. Please find my reply below:

Why isn’t the ieflx correction computed just before it’s applied in phys_run2? In addition to being harder to track what’s going on due to the new code being split up, the SST and qflx values in cam_run1 haven’t been updated yet in cam_run1, so you are using unnecessarily stale information. Why not just compute the correction amount in cam_run2 just before you perform the fix?

The idea was to compute IEFLX using consistent QFLX, SST, and PRECT (variables used during the same physics time step). On the other hand, since we see only tiny changes of QFLX & SST between two adjacent time steps, I think moving the calculation to phys_run2 won't do much harm and I am open to the alternative solution. @philrasch and @golaz, what do you think?

I had a really hard time checking whether the units in ieflx_gmean are correct (but they are!) because the density of liquid water is just given the name “thousand” without any explanation. Also, none of the variables you define or use have comments giving their meaning or units. While this isn’t unusual, it causes trouble for other people trying to understand your code.

I will put some additional notes at places where ieflx_gmean is calculated.

Why did you add 2 namelist parameters to control this behavior? Couldn’t you ditch the logical l_ieflx_fix and only apply the fix when ieflx_opt is nonzero?

The extra namelist variable is used to facilitate a diagnostic calculation, i.e. we can diagnose IEFLX without affecting the simulation. I prefer to leave it as is, so that we don't have to implement it again when we try to improve the internal energy treatment in the future.

You should clean up ieflx_gmean… it initializes a lot of variables that you never use.

Well spotted! The three variables (qflx_glob, rain_glob, snow_glob) were used in a version with more detailed diagnostics, but they are not used in the current code. Will remove them.

@PeterCaldwell
Copy link
Contributor

Thanks Kai -

I see what you're saying about wanting to use a consistent SST, QFLX, and PRECT. I'm surprised you don't think QFLX will change much between timesteps since wind and surface temperature could change a lot over half-hour timesteps (particularly over land). I'm also not sure whether the ocean coupling interval could affect this sensitivity. Also, while PRECT is computed in tphysbc using "old" SST and QFLX, it is assumed constant over the whole model step, so I'm not sure we can say it is "inconsistent" with the new SST and QFLX after coupling... And I really do mean "I'm not sure" here - your explanation makes sense and I'm not sure whether or not my suggestion of moving things makes sense.

Thanks for adding the comments about units and stuff. It looks much better. I still think changing the name "thousand" to "rho_water" or "water_density" or something would make the code look more professional. ;-)

@kaizhangpnl
Copy link
Contributor Author

Hi peter,

I'm surprised you don't think QFLX will change much between timesteps since wind and surface temperature could change a lot over half-hour timesteps (particularly over land). I'm also not sure whether the ocean coupling interval could affect this sensitivity. Also, while PRECT is computed in tphysbc using "old" SST and QFLX, it is assumed constant over the whole model step, so I'm not sure we can say it is "inconsistent" with the new SST and QFLX after coupling... And I really do mean "I'm not sure" here - your explanation makes sense and I'm not sure whether or not my suggestion of moving things makes sense.

What I wanted to say was that the global mean diagnostics for qflx and ieflx_up between two time steps were pretty small (most values < 1%). I agree that if we look at the grid-box values of qflx and ieflx_up, the difference would be much larger.

I also don't know which way is better. Fortunately, I have archived the detailed output at each time step from a previous test simulation, so I did an offline calculation. Using the global mean diagnostics as input for the two methods discussed here, the relative difference in the calculated daily mean IEFLX is small (about 4%). However, the absolute difference at each time step could be as large as 1W/m2, and the relative difference 100%. This is because ieflx is a small term compared to ieflx_up and ieflx_dn, and a small relative difference in ieflx_up and ieflx_dn could result a large ieflx.

I will leave it to you and @golaz to decide which one to use.

Thanks for adding the comments about units and stuff. It looks much better. I still think changing the name "thousand" to "rho_water" or "water_density" or something would make the code look more professional. ;-)

Done.

@PeterCaldwell
Copy link
Contributor

In thinking more about where to compute the correction, it seems like you have the choice between using stale SST and QFLX information or using PRECT which was computed off an earlier (and therefore potentially inconsistent) surface state. My intuition is that stale input is worse than old PRECT information. I've pinged @golaz for his vote.

@kaizhangpnl
Copy link
Contributor Author

kaizhangpnl commented Mar 16, 2017

I think the best way is to follow what the ocean model does for the energy fix. If new SST and QFLX are used to calculate the internal energy exchange in the ocean model, we'd better adopt your approach, and vice versa.

I've tried the alternative method in a short simulation and it runs well, so if we are going to use the new approach, I can commit it any time.

@golaz
Copy link
Contributor

golaz commented Mar 22, 2017

@kaizhangpnl, adding my two cents to the discussion.

  1. I have a slight preference for @PeterCaldwell suggestion in terms of code placement. Since it's not obvious that either option is the better or the more consistent one, Peter's suggestion would offer the advantage of having more of the code changes grouped together, which would make it more readable. But if you feel strongly about this or think this would take a lot of time to accomplish, I can live with the current option.
  2. In subroutine ieflx_gmean, I'm wondering why you are computing separate global means, one for up and one for down. Since calls to gmean involves parallel communications, their cost is not free. Wouldn't it be more efficient to only compute the global mean on the net flux? Maybe I'm missing something.
  3. I would prefer if IEFLX was not output by default, since it is not directly used as a 2d field, only it's global mean is used.
  4. I'm also wondering whether we should keep the original definition for the output variable SHFLX (removing the modifications in cam_diagnostics.F90) and add a new variable that is the correction term to SHFLX (=ieflx_glob). Not sure what the best name for it would be, SHFLXFIX, IEFIX, ...?

@worleyph
Copy link
Contributor

@golaz and @kaizhangpnl, ieflx_gmean could be rewritten to calculate the mean of two fields (up and down) in one call to gmean. This has very close to the same MPI cost as doing just one gmean. FYI.

@worleyph
Copy link
Contributor

However, unless there are numerical reasons to compute ieup_glob and iedn_glob before differencing, then taking the difference first and then calculating the gmean would be more efficient.

@kaizhangpnl
Copy link
Contributor Author

@golaz and @worleyph : Thanks for your comments. The two global averages (up/down) were calculated for diagnostics and it is indeed better to compute only one global average. I will change the code following your recommendation.

Copy link
Contributor

@golaz golaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaizhangpnl, thank you for making the changes. The code modifications look good to me now and should be ready to be merged. Just to be safe, I'd like to ask @PeterCaldwell to take one more look at them as well.

Copy link
Contributor

@PeterCaldwell PeterCaldwell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just took a quick glance since I looked at this in detail before and the new changes just seem to move things around and fix issues that I and others have pointed out. Thanks for your hard work on this, Kai!

@golaz golaz added this to the v1.0beta2 milestone Apr 4, 2017
@kaizhangpnl
Copy link
Contributor Author

Thank you, @PeterCaldwell and @golaz for your review!

@susburrows susburrows changed the title Implemented the IEFLX fixer Implement switch for the IEFLX fixer Apr 5, 2017
@susburrows susburrows changed the title Implement switch for the IEFLX fixer Implement the IEFLX fixer Apr 5, 2017
@susburrows susburrows force-pushed the kaizhangpnl/atm/check_energy_bugfix_AND_ieflx branch from 72e839c to af7873e Compare April 5, 2017 21:49
@susburrows
Copy link
Contributor

I rebased this to current master in order to avoid duplication of ea58ec0 , which appears in PR #1301

@susburrows
Copy link
Contributor

@kaizhangpnl , @rljacob : since this is a bug fix PR, is there a bug / issue report that should be associated with it?

@rljacob
Copy link
Member

rljacob commented Apr 7, 2017

Bisect confirmed that this PR generates a runtime error on melvin (GCC 5.3). GCC 5.3 is also available on Anvil to help you resolve this. Please revert this commit from next.

@kaizhangpnl
Copy link
Contributor Author

Thanks for the note. I am trying to identify the reason.

susburrows added a commit that referenced this pull request Apr 7, 2017
…into next (PR #1303)"

This reverts commit c2710c9, reversing
changes made to d47e818.

This PR broke integration testing.  Atmosphere tests on melvin with
GNU compiler are failing.  This commit reverts the PR merge to next
while we investigate the cause of the test failures.
@susburrows
Copy link
Contributor

susburrows commented Apr 7, 2017

As Rob noted above, this PR caused runtime failures in atmosphere-only compset tests on melvin (gnu compiler). I've reverted the merge to next until we can sort out the issue.

Error message:
At line 259 of file buffer.F90.inAt line 259 of file buffer.F90.inFortran runtime error: Array bound mismatch for dimension 1 of array 'ptr' (16/15)

@kaizhangpnl
Copy link
Contributor Author

Does anyone know how to configure the model to use the gnu compiler? The old method doesn't work with CIME5.

@kaizhangpnl
Copy link
Contributor Author

By the way, the model compiled with Intel ran well in the debug mode on anvil.

@rljacob
Copy link
Member

rljacob commented Apr 7, 2017

We need to know what kind of error you're seeing and what command you're trying to do.

@kaizhangpnl
Copy link
Contributor Author

kaizhangpnl commented Apr 7, 2017

I used
$xmlchange_exe --id COMPILER --val "gnu"
in the acme runscript. It did modify env_build.xml:

<entry id="COMPILER" value="gnu">

but when I compiled the model, it said:

mpicc -c -I/blues/gpfs/home/kaizhang/model/ACME_20170407_ieflx/cime/share/timing -O2 -fp-model precise -std=gnu99 -O0 -g -DFORTRANUNDERSCORE -DNO_R16 -DCPRINTEL -DHAVE_MPI /blues/gpfs/home/kaizhang/model/ACME_20170407_ieflx/cime/share/timing/gptl.c
gcc: error: precise: No such file or directory
gcc: error: unrecognized command line option ‘-fp-model’
gmake: *** [gptl.o] Error 1

It seems to me that the model was still being compiled by intel.

@kaizhangpnl
Copy link
Contributor Author

my case directory is :

/home/kaizhang/model/cases/ACME_20170407_ieflx_FC5AV1C-L_ne30_ne30_EXP01_DEBUG_gnu_256p

and build directory is:

/lcrc/group/acme/kaizhang/acme_scratch/ACME_20170407_ieflx_FC5AV1C-L_ne30_ne30_EXP01_DEBUG_gnu_256p/bld

@rljacob
Copy link
Member

rljacob commented Apr 8, 2017

Are you trying to use an existing case directory? If so, you have to do "case.setup -c" after changing a compiler.
I can't read your case directory. But this command worked as was able to build:
./create_newcase -case Fgnudebug -compset FC5AV1C-L -res ne30_ne30 -compiler gnu --machine anvil

@kaizhangpnl
Copy link
Contributor Author

Thanks for your help! The command you recommended works for me as well. However, even after "case.setup -c" the method I used (xmlchange_exe --id COMPILER --val "gnu") still fails with the same error message.

Anyway, I reproduced the gnu-runtime-error problem on anvil. The bug has been found and the model runs well now with gnu on anvil. I will commit the bug fix after a further test.

@kaizhangpnl
Copy link
Contributor Author

The bug fix is committed.

@susburrows
Copy link
Contributor

Fantastic, thanks Kai! I'll squash your commits before merging. @rljacob : looks like melvin is reporting essentially a clean dashboard. Am I clear to merge this to next for integration testing?

@rljacob
Copy link
Member

rljacob commented Apr 10, 2017

Yes. I think you have to "revert the revert" to put this back on next.

@susburrows
Copy link
Contributor

susburrows commented Apr 10, 2017

Thanks Rob. I won't have to do that since I will squash the commits together, so it will have a new hash.

Information on the IE fixer:
https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer

These modifications add to those in PR #1301.

The following variable will be added to the master list if ieflx_opt > 0
SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model)

To switch on the IEFLX fixer, set
ieflx_opt = 2

By default, the IEFLX fixer is switched off.

[BFB]
@susburrows susburrows force-pushed the kaizhangpnl/atm/check_energy_bugfix_AND_ieflx branch from bd7d7d0 to 32edd82 Compare April 10, 2017 17:56
susburrows added a commit that referenced this pull request Apr 10, 2017
#1303)

This PR implements the IEFLX fixer
(https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer)

These modifications add to those in PR #1301.

The following variable will be added to the master list if ieflx_opt > 0
SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model)

To switch on the IEFLX fixer, set
ieflx_opt = 2

By default, the IEFLX fixer is switched off.

[BFB]
@susburrows susburrows merged commit 32edd82 into master Apr 11, 2017
susburrows added a commit that referenced this pull request Apr 11, 2017
This PR implements the IEFLX fixer
(https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer)

These modifications add to those in PR #1301.

The following variable will be added to the master list if ieflx_opt > 0
SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model)

To switch on the IEFLX fixer, set
ieflx_opt = 2

By default, the IEFLX fixer is switched off.

[BFB]

* kaizhangpnl/atm/check_energy_bugfix_AND_ieflx:
  Implement the IEFLX fixer
@PeterCaldwell
Copy link
Contributor

@kaizhangpnl - I'm trying to use this change now and I notice that the namelist variable l_ieflx_fix is defined but is no longer actually used anywhere. Also, the impact of different ieflx_opt choices (or even the set of allowable choices) is not explained anywhere. Fixing these things isn't a huge priority, but should probably be done to avoid confusion in the future. Sorry for not catching this in my last review.

@kaizhangpnl
Copy link
Contributor Author

@PeterCaldwell Thanks for your comments. I thought the code (the two equations) were self-explaining, but it seemed not the case. Apart from the simple description of the equations, probably more background information is needed (as documented on the confluence page). I will create a separate push request later.

For now, please see the commit summary and confluence link below:

This PR implements the IEFLX fixer
(https://acme-climate.atlassian.net/wiki/display/ATM/Simulation+with+the+IEFLX+fixer)

These modifications add to those in PR #1301.

The following variable will be added to the master list if ieflx_opt > 0
SHFLXFIX : SHFLX plus global mean IEFLX (sensible heat flux used in the atmosphere model)

To switch on the IEFLX fixer, set
ieflx_opt = 2

By default, the IEFLX fixer is switched off.

[BFB]

agsalin pushed a commit that referenced this pull request May 1, 2017
…ewcase

Add input-dir to create_newcase
Test suite: code-checker
Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes #1303

User interface changes?: None

Code review: @jedwards4b
if(is_subcol_on()) then
call pbuf_register_subcol('TEOUT', 'phys_register', teout_idx)
call pbuf_register_subcol('IEFLX', 'phys_register', ieflx_idx)
Copy link
Contributor

@PeterCaldwell PeterCaldwell May 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaizhangpnl - @tangq noticed that the line just above is a bug: the code will crash if you try to pbuf_register_subcol before you pbuf_add_field. We didn't catch this because we never try running in subcolumn mode. This line isn't needed, however, since the same operation is done correctly just below.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this has already been merged to master, please open an issue for this bug.

@kaizhangpnl
Copy link
Contributor Author

@PeterCaldwell Yes, this is a bug. It seems I forgot to remove this line during the last round of code cleanup. I will create a new PR to fix it.

@PeterCaldwell
Copy link
Contributor

Yup, no big deal. I missed it in my review and it doesn't affect any of the runs we are doing these days. Might as well fix it though.

@kaizhangpnl kaizhangpnl deleted the kaizhangpnl/atm/check_energy_bugfix_AND_ieflx branch July 25, 2017 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Atmosphere BFB PR leaves answers BFB
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants