Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add MACHINFO and ENVINFO to env files #466

Merged
merged 6 commits into from
Jun 16, 2020
Merged

Conversation

apcraig
Copy link
Contributor

@apcraig apcraig commented Jun 11, 2020

PR checklist

  • Short (1 sentence) summary of your PR:
    Add MACHINFO and ENVINFO variables to env files
  • Developer(s):
    apcraig
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    ran a small test suite on cheyenne to verify scripts are working
    https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#f31cfc99e29bcef50d4e3b46c129c838a8a72ee8
    Can see the new output in the individual machine/env file.
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:

Also updated the scripts implementation to consistently use "env" and "mach". Shifted internal variables from "env" to "mach" and from "compiler" to "env".

MACHINFO and ENVINFO were added on many machines but not all. These are not required variables but are recommended. I am happy to add information from other machines before we merge this PR if that information is provided to me. These variables are used in the test suites and test reporting to better capture the machines and system software that is being tested.

…re self consistent using mach and env instead of env and compiler
@apcraig apcraig self-assigned this Jun 11, 2020
Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like ICE_MACHINE_MACHINFO.
It seems like ICE_MACHINE_ENVINFO will be a maintenance headache. Are you sure you want to do it this way? Is is possible to instead define a command for each machine which, when executed, provides the info? For badger it would be 'module list', but I could see others using 'env' + 'grep' or similar.

@apcraig
Copy link
Contributor Author

apcraig commented Jun 11, 2020

module list is probably not going to be adequate. see some discussion here #458. sys admins can name modules anyway they want and often they are not complete. you really need to do an ifort --version and review the env settings that are generated when the modules are loaded in some cases. I understand the envinfo seems like a pain, but it only needs to be update anytime the modules (or port more generally) is created or changed. I think the burden will not be that big.

I don't know a good general way to supply useful info consistently across machines at this point. I do worry that the wrong information is worse than no information, so think we need to be mindful to update the envinfo when the port is modified.

@eclare108213
Copy link
Contributor

On some machines, the modules are loaded as the default, which can change without users necessarily noticing. At LANL, we get an email every few weeks showing new and deprecated stuff, and I'm not even sure if they say when a default assignment is changing. I don't generally pay much attention until something breaks, but I do tend to prefer the default so that I don't have to change it myself. Maybe in this kind of case it's better to not supply the info in our files.

@apcraig
Copy link
Contributor Author

apcraig commented Jun 11, 2020

That's a good point. We have the same issue with the conda and travis setup. We really don't know what version we're getting. You can look at the implementation for conda and travis. Leaving out details in the INFO env variables or not defining them at all is a reasonable approach.

I have mixed feelings about just using the default modules on machines in general, but that's a separate discussion. I understand why it can make sense and be convenient sometimes.

@phil-blain
Copy link
Member

Hi Tony. This looks good to me.

I've added the info for our supercomputers in this branch: https://github.com/phil-blain/CICE/tree/add-eccc-machineinfo.
You should be able to

git pull https://github.com/phil-blain/CICE.git add-eccc-machineinfo

to add this commit to this branch, it should fast-forward.

cesium, fram and millikan are just workstations that sit under our desks and that we use for development, so I think it's not worth documenting the environment for these.

@apcraig
Copy link
Contributor Author

apcraig commented Jun 12, 2020

Thanks @phil-blain! I have pulled your changes onto my branch.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not thrilled with having this info hard-wired, but I'll approve to keep things moving forward. Maybe we can come up with a better approach in the future. Thanks for implementing -

@apcraig
Copy link
Contributor Author

apcraig commented Jun 16, 2020

I will look into capturing information at run time separately. As an aside, there is no reason you can't do something like

setenv ICE_MACHINE_ENVINFO module list > & tmp | cat tmp

that will capture all the loaded modules at the time. Of course, that can leave you with something like

echo $ICE_MACHINE_ENVINFO
Currently Loaded Modulefiles: 1) modules/3.2.10.6 2) cce/8.6.4 3) craype-network-aries 4) craype/2.5.13 5) cray-libsci/17.11.1 6) udreg/2.3.2-6.0.7.1_5.13__g5196236.ari 7) ugni/6.0.14.0-6.0.7.1_3.13__gea11d3d.ari 8) pmi/5.0.12 9) dmapp/7.1.1-6.0.7.1_6.2__g45d1b37.ari 10) gni-headers/5.0.12.0-6.0.7.1_3.11__g3b1768f.ari 11) xpmem/2.2.15-6.0.7.1_5.11__g7549d06.ari 12) job/2.2.3-6.0.7.1_5.44__g6c4e934.ari 13) dvs/2.7_2.2.120-6.0.7.1_12.1__g74cb2cc4 14) alps/6.6.43-6.0.7.1_5.46__ga796da32.ari 15) rca/2.2.18-6.0.7.1_5.48__g2aa4f39.ari 16) atp/2.1.1 17) perftools-base/6.5.2 18) PrgEnv-cray/6.0.4 19) cray-mpich/7.6.3 20) java/jdk1.8.0_152 21) eproxy/2.0.22-6.0.7.1_7.5__g1ebe45c.ari 22) craype-broadwell 23) craype-hugepages2M 24) pbs 25) ccm/2.5.4-6.0.7.1_5.27__g394754f.ari

I'm still not convinced there is a universal solution, but will continue to think about it and try other things.

@apcraig apcraig merged commit 821a68b into CICE-Consortium:master Jun 16, 2020
@apcraig apcraig deleted the swdoc branch August 17, 2022 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants