-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix various cluster issues #1807
Fix various cluster issues #1807
Conversation
Job Mingw Test on cb12a1d : invalidated by @joshua-cogliati-inl failed in fetch |
Job Mingw Test on e76a2bc : invalidated by @joshua-cogliati-inl failed in fetch |
1 similar comment
Job Mingw Test on e76a2bc : invalidated by @joshua-cogliati-inl failed in fetch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes look good. Noting that adding profiling to RrR jobs may slow them down a bit, but is probably worth it in the feedback.
After many re-tries, it appears there's an issue on the Windows testing machine, specifically with removing an "out" file from the rom trainer test: raven/tests/framework/CodeInterfaceTests/RAVEN/ROM/FirstMRun/4/out~test_rom_trainer |
If tests pass without further modifications, this is approved for merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshua-cogliati-inl Thanks for fixing the cluster tests. I have some minor comments for your consideration.
#self.thread in ray.wait([self.thread], timeout=waitTimeOut)[0] | ||
#which ran slower in ray 1.9 | ||
else: | ||
self.thread.finished |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this line be return self.thread.finished
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. (Wait, you mean this is Python, not LISP?)
@@ -1,5 +1,5 @@ | |||
<?xml version="1.0" ?> | |||
<Simulation verbosity="debug"> | |||
<Simulation verbosity="debug" profile="jobs"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think TestInfo need to be added, and the revisions node need to be updated to reflect the python command change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, this isn't a test file, it's a file run by a test file. When we first added this test we had a discussion about it, and decided that the "inner" of the RrR tests should not be considered the test file; rather the "outer" should. I think the philosophical idea was that the "outer" is actually the test, while the "inner" doesn't get seen by the testing harness, and it would be confusing if this data was loaded as if it were a separate test into the regression test documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good to me.
Job Mingw Test on 28c7722 : invalidated by @joshua-cogliati-inl failed in fetch |
Job Mingw Test on e76a2bc : invalidated by @joshua-cogliati-inl failed in fetch |
Job Test qsubs sawtooth on e2dab00 : invalidated by @joshua-cogliati-inl FAILED: Failed tests/cluster_tests/test_mpi Diff tests/cluster_tests/test_pbsdsh Diff tests/cluster_tests/test_mpiqsub_parameters Diff tests/cluster_tests/AdaptiveSobol/test_parallel_adaptive_sobol Diff tests/cluster_tests/test_mpi_forced Diff tests/cluster_tests/test_mpiqsub_nosplit Diff tests/cluster_tests/RavenRunsRaven/ROM Diff tests/cluster_tests/RavenRunsRaven/Code |
Job Test qsubs sawtooth on 6008ff6 : invalidated by @joshua-cogliati-inl testing to see if running on /scratch |
72c1906
to
1fc5ac2
Compare
@@ -14,7 +14,7 @@ | |||
""" | |||
Created on Mar 5, 2013 | |||
|
|||
@author: alfoa, cogljj, crisr | |||
@author: alfoa, cogljj, crisr, talbpw, maljdp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't you blame me for this 😁
Pull Request Description
What issue does this change request address?
Closes #1809
What are the significant changes in functionality due to this change request?
Checks for establish_conda_env.sh before sourcing.
Runs same version of python in RavenRunsRaven code test (and adds feature to be able to do this.)
Export RAVEN_FRAMEWORK_DIR for MPILegacySimulationMode
Switches to using ray.get instead of ray.wait which works better empirically.
Adds some extra debugging information.
For Change Control Board: Change Request Review
The following review must be completed by an authorized member of the Change Control Board.
<internalParallel>
to True.raven/tests/framework/user_guide
andraven/docs/workshop
) have been changed, the associated documentation must be reviewed and assured the text matches the example.