Fix CaltechHPC environment #5959
Marking priority because BBH evolutions are essentially broken on CaltechHPC at the moment.
I'm testing this now. I'll report back once I'm confident everything is working smoothly (or as non-roughly as possible).
Ok I have tested this out and am now confident that this will work on all nodes of CaltechHPC (skylake, cascadelake, icelake).
Note that we had to add the environment variable `FI_PROVIDER=tcp` to the `skylake-2024-04` environment because the MPI module kept trying to use UCX for the fabric. Because of #3886 we can't use UCX, so we go with TCP instead.
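For readers following along, here is a minimal sketch of how the variable might be set when the environment is loaded. It assumes the export lives inside `spectre_load_modules` and omits the actual module list, which is not shown in this conversation; treat the placement as an illustration rather than the exact change in this PR.

```shell
# Sketch only: the real function also loads the compiler, MPI, and other
# modules, which are omitted here because they are not shown in this thread.
spectre_load_modules() {
    # ... module load commands omitted ...

    # Ask libfabric to use its TCP provider so the MPI module does not
    # try to use UCX for the fabric (UCX is unusable because of #3886).
    export FI_PROVIDER=tcp
}
```

After sourcing the environment file and calling `spectre_load_modules`, running `echo "$FI_PROVIDER"` is a quick way to confirm the setting took effect in the current shell.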
- -D ENABLE_PARAVIEW=ON \
+ -D ENABLE_PARAVIEW=OFF \
Were there issues finding paraview? This is really only an issue for the CLI so not an immediate priority.
No issues, I sent a request to the CaltechHPC admins to build ParaView with the same Python etc as the rest of the build. They haven't responded yet.
spectre_load_modules() {
[random line] Should we remove the `caltech_hpc_gcc_icelake.sh` env now and rename this one to just `caltech_hpc_gcc.sh`? I think that'll be less confusing for users.
Yes I think so. Now or later?
I'd say now. Just rip the Band-Aid off. Be sure to also change the cluster installation instructions, maybe adding a little description that whatever node you build on, you can only run on that type of node or newer.
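As a rough illustration of the "same type of node or newer" rule: a newer CPU generation reports feature flags that an older one lacks, so a binary tuned on the newer node may not run on the older one. The sketch below is hypothetical and not part of SpECTRE; the helper name `has_cpu_flag` and the choice of `avx512_vnni` (a flag that cascadelake reports and skylake does not) are assumptions made for the example, and it only works on Linux where `/proc/cpuinfo` exists.

```shell
# Hypothetical helper: succeeds if the space-separated flag list in $1
# contains the exact flag name given in $2.
has_cpu_flag() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read this node's CPU feature flags (empty if /proc/cpuinfo is unavailable).
node_flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2 || true)

if has_cpu_flag "$node_flags" avx512_vnni; then
    echo "This node reports avx512_vnni"
else
    echo "This node does not report avx512_vnni"
fi
```

Running this on the build node and on the target node gives a quick hint about whether the target is at least as new as the node the environment was built on.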
ok done
45f6225 to 0ba915b
This allows extra params like `-p reservation=sxs` to be passed to BBH pipeline commands.
- The previous environment had an issue with loading numeric initial data (took ~40min, now takes 16 seconds).
- It also ran BBH evolutions extremely slowly for some reason.
- It was actually built for cascadelake, not skylake.

Kyle and I reinstalled the environment on skylake, so we should be able to run on all available nodes on CaltechHPC now. I confirmed that the IO issues are fixed and a BBH runs at reasonable speed.
9ecf5cc to 89c1526