Exec format error when running Optimas with HiPACE++ on Maxwell #235
Comments
I think it is most likely picking up mpich and then using openMPI in your subprocess (assuming you are using an env_script). The quickest way to force it is probably to specify openmpi. In libEnsemble this would be either

exctr = MPIExecutor(custom_info={"mpi_runner": "openmpi"})

or as a platform spec:

libE_specs["platform_specs"] = {
    "mpi_runner": "openmpi",
}

In Optimas, if your exploration object is called exp:

exp.libE_specs["platform_specs"] = {
    "mpi_runner": "openmpi",
} |
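To illustrate why the runner choice matters, here is a minimal sketch of how the per-node flag differs between the two MPI families. The table and the `build_mpi_command` helper are hypothetical and written only for this example; they are not libEnsemble's implementation. The flags themselves (`-ppn` for MPICH's Hydra launcher, `-npernode` for Open MPI) are the commonly documented ones, so a command line built for one runner can be rejected by the other.

```python
# Illustrative flag table only; NOT how libEnsemble builds its run lines.
RUNNER_FLAGS = {
    "mpich": {"launcher": "mpirun", "nprocs": "-np", "per_node": "-ppn"},
    "openmpi": {"launcher": "mpirun", "nprocs": "-np", "per_node": "-npernode"},
}


def build_mpi_command(runner, nprocs, procs_per_node, app_args):
    """Return an argv list for a hypothetical launch with the given runner."""
    flags = RUNNER_FLAGS[runner]
    return [
        flags["launcher"],
        flags["nprocs"], str(nprocs),
        flags["per_node"], str(procs_per_node),
        *app_args,
    ]


# "hipace" and "input_file" are placeholder application arguments.
mpich_cmd = build_mpi_command("mpich", 4, 2, ["hipace", "input_file"])
openmpi_cmd = build_mpi_command("openmpi", 4, 2, ["hipace", "input_file"])
print(" ".join(mpich_cmd))
print(" ".join(openmpi_cmd))
```

If the detected runner is mpich but the `mpirun` found at run time belongs to Open MPI, the first form is the one that would be handed to the wrong launcher.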
Hang on, I just realised Optimas has a more direct option to set this in your calling script:

ev = TemplateEvaluator(
    ...,
    env_mpi='openmpi',
)

If you already have this and it's failing, let me know. |
Hey, Thanks for the fast response. So I think you're right that using the Here's what the submission file that Optimas generates looks like now:
The machine file that it refers to also seems to be generated okay (I think?) Unfortunately there's no indication in the .err or .out files about what the exact issue might be now... Let me know if I can provide more details. Cheers, |
What happens now if you run:
by itself? You could try without the machinefile and/or in an interactive session, and set the node name in the machinefile to the node you're on. I wonder if something is needed on your system, like starting the file with: #!/bin/bash I would see if you can get that file to run on your system. Let me know what it gives you. You could also see if sourcing the file makes a difference. |
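The shebang suggestion can be checked in isolation. The sketch below assumes a POSIX system with `/bin/sh` and a temporary directory that allows execution. It writes two small scripts (file names and contents are made up for the example) and launches them from Python without a shell. On Linux, executing a text file that has no `#!` line directly is the classic source of "Exec format error" (errno `ENOEXEC`), while the same file runs fine when passed to a shell.

```python
import errno
import os
import stat
import subprocess
import tempfile


def write_script(directory, name, body):
    """Write an executable text file and return its path."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(body)
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def try_exec(argv):
    """Run argv without a shell; return 'ok' or the symbolic errno name."""
    try:
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))


with tempfile.TemporaryDirectory() as tmp:
    no_shebang = write_script(tmp, "no_shebang.sh", "echo hello\n")
    with_shebang = write_script(tmp, "with_shebang.sh", "#!/bin/sh\necho hello\n")
    results = {
        "direct, no shebang": try_exec([no_shebang]),
        "direct, with shebang": try_exec([with_shebang]),
        "via /bin/sh, no shebang": try_exec(["/bin/sh", no_shebang]),
    }

for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

The example uses `#!/bin/sh` instead of `#!/bin/bash` only so that it also runs on minimal systems without bash.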
Hey, Sorry for the delayed reply, was doing some investigating. So submitting the file as a batch job works fine (I submit specifically to the node mentioned in the machine file as well). But when I run interactively I get these errors:
So it seems like there's something dodgy going on with the environments that I have yet to understand? |
On some systems there are differences between running interactively and running in batch, such as whether your .bashrc is run. This varies from one system to another. It seems hipace is picking up the wrong standard C++ library. You could echo your LD_LIBRARY_PATH in batch and try to replicate it. You said you had your original issue with the libEnsemble docs example script. Perhaps check if that works now to see if libEnsemble is still an issue. |
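One way to make that comparison systematic is to save the output of `env` in both sessions and diff the path-like variables. The sketch below is only an illustration: the helper names and the sample paths are invented, and it assumes one `KEY=VALUE` pair per line (values containing newlines would need more careful parsing).

```python
def parse_env(text):
    """Parse `env` output (one KEY=VALUE per line) into a dict."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key:
            env[key] = value
    return env


def split_path(value):
    """Split a colon-separated search path, dropping empty entries."""
    return [part for part in value.split(":") if part]


def diff_env(batch_text, interactive_text, keys=("LD_LIBRARY_PATH", "PATH")):
    """Report search-path entries present in only one of the two dumps."""
    batch = parse_env(batch_text)
    interactive = parse_env(interactive_text)
    report = {}
    for key in keys:
        b = split_path(batch.get(key, ""))
        i = split_path(interactive.get(key, ""))
        report[key] = {
            "only_batch": [p for p in b if p not in i],
            "only_interactive": [p for p in i if p not in b],
        }
    return report


# Hypothetical dumps; in practice, paste the output of `env` from each session.
batch_dump = "PATH=/opt/gcc/bin:/usr/bin\nLD_LIBRARY_PATH=/opt/gcc/lib64:/usr/lib64\n"
interactive_dump = "PATH=/usr/bin\nLD_LIBRARY_PATH=/usr/lib64\n"

report = diff_env(batch_dump, interactive_dump)
for key, sides in report.items():
    print(key, sides)
```

A library directory that shows up only on the batch side is a natural first candidate for where the working C++ standard library comes from.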
Should be fixed by Libensemble/libensemble#1392 which runs user scripts in shell. |
Sorry for the slow reply, was away for the last half of last week. I've just now installed the version of libensemble in the git branch you referenced and can confirm that this seems to have solved my issue. Thanks very much for your help :) |
Hi all,
I have been trying to do some simple grid scans with Optimas and HiPACE++ on Maxwell and am encountering an error that seems to be related to the submission script that Optimas (or libEnsemble?) generates, e.g.:
I checked, and this also seems to happen with the examples provided in the docs.
Here is an example of one of the submission scripts that are generated:
I tried submitting the same script as a single batch job, which also fails, but the error then seems to suggest that --ppn isn't a valid parameter. Is this a bug, or perhaps something to do with having the wrong version of openmpi? Or maybe even a peculiarity of Maxwell? Any advice is appreciated!
Cheers,
Lewis