ntest_simulation_module failing without MPI #3120

Closed
prckent opened this issue Apr 20, 2021 · 4 comments

@prckent
Contributor

prckent commented Apr 20, 2021

Describe the bug
https://cdash.qmcpack.org/CDash/testSummary.php?project=1&name=ntest_nexus_simulation_module&date=2021-04-20

The most recent updates (#3108) fixed the MPI case but not the no-MPI case.

Edit: Interestingly, the Intel2019.1 build passed while the gcc, llvm, and pgi builds did not. Perhaps enough of Intel MPI was picked up even though it was not enabled(???)

Test output

Test name     : simulation_module
Test sublabel : test_execute
Test exception: "AssertionError: "
Test backtrace:
  File "/scratch/pk7/QMCPACK_CI_BUILDS_DO_NOT_REMOVE/qmcpack/nexus/bin/nxs-test", line 478, in run
    self.operation()
  File "/scratch/pk7/QMCPACK_CI_BUILDS_DO_NOT_REMOVE/qmcpack/nexus/bin/nxs-test", line 1064, in simulation
    nunit('execute')
  File "/scratch/pk7/QMCPACK_CI_BUILDS_DO_NOT_REMOVE/qmcpack/nexus/bin/nxs-test", line 349, in nunit
    run_external_unit_test(test_name,unit_test)
  File "/scratch/pk7/QMCPACK_CI_BUILDS_DO_NOT_REMOVE/qmcpack/nexus/bin/nxs-test", line 388, in run_external_unit_test
    unit_test()
  File "/scratch/pk7/QMCPACK_CI_BUILDS_DO_NOT_REMOVE/qmcpack/nexus/tests/unit/test_simulation_module.py", line 2386, in test_execute
    assert(open(outfile,'r').read().strip()=='run')

Test status: fail

To Reproduce

Latest develop, software versions as nitrogen/sulfur. (ornl_versions.sh)

Expected behavior

Test passes

@jtkrogel
Contributor

jtkrogel commented Jun 4, 2021

Nexus assumes you are going to run MPI applications with it. MPI availability is basically a requirement then for a correct Nexus install, which this failing test exposes.

For the purpose of running tests with QMCPACK "no MPI", perhaps a flag should be passed to nxs-test (e.g. --no_mpi) by ctest for these builds? The flag could modify the Nexus tests to not require MPI.

@prckent
Contributor Author

prckent commented Jun 11, 2021

I just tracked this down again, forgetting the answer was already here (rainy Friday).

cat ../qmcpack/nexus/tests/unit/test_simulation_output/test_execute/runs/test_sim66.err 
/bin/sh: line 2: mpirun: command not found

A couple of questions around lightweight ways to improve the situation:

Q. How should one use Nexus with no MPI library installed? If it is in practice today a requirement then we need to plan appropriately for installations and builds for workshops.

Q. Why not check for mpirun on the path and skip this test if not found? Or ignore this error if mpirun is mentioned in the error?
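The second idea, ignoring the failure when the error output mentions mpirun, could be sketched roughly as below. This is only an illustration: `mpirun_missing` is a hypothetical helper, not part of Nexus, and it assumes the shell reports the problem in the same "command not found" form seen in test_sim66.err above.

```python
def mpirun_missing(stderr_text, launcher="mpirun"):
    """Return True if stderr suggests the shell could not find the launcher.

    Hypothetical helper: matches lines such as
    '/bin/sh: line 2: mpirun: command not found'.
    """
    needle = launcher.lower()
    for line in stderr_text.splitlines():
        lowered = line.lower()
        # Require both the launcher name and a "not found" message on one line
        if needle in lowered and "not found" in lowered:
            return True
    return False
```

A test could read the simulation's .err file, pass its contents to this helper, and treat a match as "MPI unavailable" rather than as a genuine failure.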

@jtkrogel
Contributor

Nexus can be used w/o MPI, but that is not the typical use case. I think "tests pass" should imply "examples will work", and all the examples use MPI. I'm happy to supply e.g. a --no_mpi flag to the test system for cases where you know you won't be using Nexus with MPI and/or don't care about MPI; this would modify the test to not use mpirun.

I likely can also improve the test failure message (exposed with --verbose) to state that mpirun was not found.
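A rough sketch of how such a flag might be wired up is below. It is a guess at the shape only: the --no_mpi option is just the proposal from this thread, and `build_parser` and `launch_command` are hypothetical names that do not reflect how nxs-test is actually structured.

```python
import argparse


def build_parser():
    """Build a parser with the proposed (not yet existing) --no_mpi option."""
    parser = argparse.ArgumentParser(prog="nxs-test-sketch")
    parser.add_argument(
        "--no_mpi",
        action="store_true",
        help="run tests without requiring an MPI launcher",
    )
    return parser


def launch_command(executable, no_mpi, ranks=1):
    """Return the command line a test would run, with or without mpirun."""
    if no_mpi:
        # Serial launch: call the executable directly
        return executable
    return "mpirun -np {} {}".format(ranks, executable)
```

With something like this, ctest could pass --no_mpi for the no-MPI builds and leave the default behavior unchanged everywhere else.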

@prckent
Contributor Author

prckent commented Jun 14, 2021

We also use "tests pass" for installations though, so I think an update is needed to handle these cases.

Suggestion to keep this super simple:

  • Rename this test now to include mpi in the name. That will make the origin of the failure clear.
  • Sometime in the future, skip MPI-dependent tests when mpirun is not found on the path. Preferable to yet another flag to keep track of, configure in our testing, document etc.
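
The second suggestion could look roughly like this. It is a minimal sketch using the standard library's `shutil.which`; `mpirun_available` and `run_mpi_dependent_test` are hypothetical names, and the real nxs-test harness may need to report skips differently.

```python
import shutil


def mpirun_available(launcher="mpirun"):
    """Return True if the launcher can be found on the current PATH."""
    return shutil.which(launcher) is not None


def run_mpi_dependent_test(test_func, launcher="mpirun"):
    """Run test_func only when the launcher exists; otherwise report a skip."""
    if not mpirun_available(launcher):
        # No launcher on PATH: skip instead of failing with AssertionError
        return "skipped"
    test_func()
    return "passed"
```

This keeps the decision automatic, so nothing extra has to be configured in ctest or documented for users.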
