Switch to CMake for compilation #5

Merged: 25 commits from cmake into main, Apr 18, 2024
Conversation

@RaulPPelaez (Contributor) commented Feb 25, 2024

In this PR I update the build system in libMobility:

  1. CMake is now used instead of Makefiles. Dependencies such as MKL are autodetected and linked.
  2. Added a conda environment.yml with all required dependencies.
  3. Added an install rule so that users can simply "import libMobility" after running make install.
  4. Removed most of the git submodules, leaving only UAMMD (which will be added as a conda dependency once UAMMD is available in conda-forge) and Lanczos.
  5. Added a CI script that tries to compile, install, and import every solver each time a PR is opened or merged.

By installing libMobility into the conda environment, all libraries (Python and C++) and headers become available without any user intervention, as if they were regular system libraries.

For instance, a user seeking to link NBody into their C++ code can simply link with -llibMobility_NBody and include the headers as:

#include "MobilityInterface/MobilityInterface.h"
#include"libMobility/solvers/NBody/mobility.h"
#include"libMobility/solvers/PSE/mobility.h"

without having to add any "-L" or "-I" flags.

Additionally, after calling make install, the user can import libMobility without extra steps.

cc @brennansprinkle @rykerfish please check this out and let me know if you run into any trouble with the latest install instructions.

@RaulPPelaez (Contributor, Author)

The CI does not have a GPU available, so the only thing it can do is check that compilation succeeds and that the different solvers can be imported from Python (which does not require CUDA).
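A minimal sketch of such an import check (the solver names are taken from this thread; treating each one as a submodule of libMobility is an assumption):

import importlib

# Hypothetical smoke test: importing each solver exercises library loading
# without needing a GPU. The module layout is an assumption, not the repo's.
for name in ["SelfMobility", "NBody", "PSE"]:
    module = importlib.import_module(f"libMobility.{name}")
    print(f"imported {module.__name__}")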

@RaulPPelaez (Contributor, Author)

You can check the output of the CI script by clicking on "Actions" in the top menu. You can also see the run for the latest commit in this PR by clicking on "Details" at the bottom of this page.

@RaulPPelaez (Contributor, Author)

@rykerfish It would be great if you could review and approve this PR.

@RaulPPelaez mentioned this pull request on Feb 25, 2024 (Merged)
@rykerfish (Contributor) left a comment

After a few minor dependency problems on my side (missing specific CUDA libraries, e.g. cuFFT), compilation in both single and double precision works great, much easier than before.

Looking at the examples in both Python and C++, I found some oddities; I will follow up with a commit and another comment shortly. I approved anyway, since the compilation at least behaves normally and the other issues might be on my side.

@rykerfish (Contributor)

I (understandably) don't have push permissions to this branch, so I'll detail the issues here.

  1. On the Python side, I had to use import libMobility.Solver specifically to access any of the solvers (e.g. import libMobility.SelfMobility). When constructing a solver, the constructor had to be called explicitly as libMobility.SelfMobility.SelfMobility(args), which leads to a repetitive naming convention (see the snippet after this list). Was there a better way around this that I didn't figure out?
  2. All solvers except SelfMobility gave me issues in both the Python and C++ examples. SelfMobility worked as expected.
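To spell out the repetition from item 1 (constructor arguments elided):

import libMobility.SelfMobility

# The module and the class share a name, so construction reads Solver.Solver.
solver = libMobility.SelfMobility.SelfMobility(...)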

For the Python example, NBody would give floating-point exceptions in the call to MDot. For PSE, I encountered the error below:
RuntimeError: CudaCheckError() failed at /home/rfish/school/codes/libMobility/third_party/uammd/src/utils/ParticleSorter.cuh:234: the provided PTX was compiled with an unsupported toolchain. - code: 222

On the C++ side, it would randomly pick between two different errors. The first was an out-of-memory error from CUDA; the second is shown below:
terminate called after throwing an instance of 'thrust::system::system_error' what(): __copy::trivial_device_copy H->D: failed: cudaErrorInvalidValue: invalid argument

All of the above could absolutely be an issue with my installation (nvcc is version 12.3), but I have been using older versions of libMobility for Brennan's fiber-related projects, which gives me some assurance that my setup works.

@RaulPPelaez (Contributor, Author)

I added you as a maintainer; I thought I already had! You should have push access to this branch now. But of course, feel free to just open a new PR with my commits.

@RaulPPelaez (Contributor, Author)

  1. We should be able to overcome that by adding an __init__.py under solvers that just makes every SolverName/SolverName available as SolverName (there used to be a file like that under /python). That way, libMobility.solvers would include every solver under that namespace; see the sketch at the end of this comment.
  2. Those are all CUDA installation issues. I would bet the problem is that your NVIDIA driver is too old for CUDA 12.3.

If updating the driver is not a possibility, then you need to use a previous version of CUDA. Conda-forge does not expose the convenient cuda-nvcc and similar packages for CUDA < 12; you would have to either switch to cudatoolkit-dev, use the nvidia channel, or just use the system's CUDA installation.
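A minimal sketch of that __init__.py (the solver list and the SolverName/SolverName module paths are assumptions based on the names mentioned in this thread):

# Hypothetical solvers/__init__.py: re-export each SolverName/SolverName
# class as SolverName, so users can write libMobility.solvers.NBody instead
# of libMobility.solvers.NBody.NBody. Relative paths are an assumption.
from .SelfMobility.SelfMobility import SelfMobility
from .NBody.NBody import NBody
from .PSE.PSE import PSE

__all__ = ["SelfMobility", "NBody", "PSE"]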

@rykerfish (Contributor)

Thanks for that. I had forgotten to update my drivers when I updated from CUDA 12.3 to 12.4; updating them made most things good to go.

One note: the NBody example in Python can still cause exceptions. Setting NperBatch to -1 can leave NperBatch uninitialized, since it defaults to the number of particles; if initialize hasn't been called (and it hasn't yet in example.py), then the number of particles is uninitialized as well, leaving NperBatch with a random value. Random values like 0, or values larger than available memory, caused floating-point exceptions or out-of-memory errors.

I'm still having unknown issues with the C++ example, which I'll look into more soon.

Now that things are mostly working on my side, I plan to experiment and get more used to libMobility. I'll keep notes so I can ask you any questions that come up, and hopefully I can use them to start working on documentation as per #6, if that sounds good to you!

Commits added to the PR (messages truncated):
…tion so that modules can be imported differently.
…ositions weren't getting set and NBody wasn't getting the batch size initialized.
@rykerfish (Contributor)

@RaulPPelaez Sorry about the delay, but I finally got around to making the changes I wanted here. I set up an __init__.py file and (after some learning) adjusted CMake to put it in the right place, which fixes the solver naming issue in the Python imports.

I was seeing random behavior breaking the example codes, so I made those work on my end. I think the main problem is that calling NBody's setParameters() before initialize() can give undefined behavior, since NBody's numberParticles parameter won't be initialized yet, which makes the default setting for NperBatch unsafe. Should that behavior be modified? To keep the parameters decoupled, we could check within setParameters() whether the solver has been initialized (a sketch of that idea is below), but there is probably a better solution.
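A minimal Python-flavored sketch of that guard (the real interface is C++; the method and parameter names come from this thread, everything else is an assumption):

# Hypothetical guard: refuse the NperBatch default until initialize() has run.
class NBody:
    def __init__(self):
        self._initialized = False
        self.numberParticles = 0

    def initialize(self, numberParticles, **params):
        self.numberParticles = numberParticles
        self._initialized = True

    def setParameters(self, NperBatch=-1):
        if NperBatch == -1:  # default: batch size equals the particle count
            if not self._initialized:
                raise RuntimeError(
                    "setParameters: NperBatch defaults to the number of "
                    "particles; call initialize() first or pass NperBatch."
                )
            NperBatch = self.numberParticles
        self.NperBatch = NperBatch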

@RaulPPelaez (Contributor, Author)

That is a good finding about setParameters. Please open another PR for that; in this one, we should just get the CMake compilation working.

@RaulPPelaez (Contributor, Author)

See here for an example of a CMakeLists.txt that installs a Python library:
https://github.com/openmm/NNPOps/blob/d15cb9196e283b6b55f88a93d85232458f64fa18/CMakeLists.txt#L103-L119
I think what you did is good, however. Please review the PR and let's merge it!

@rykerfish rykerfish self-requested a review April 18, 2024 21:22
@rykerfish rykerfish merged commit 633d3a4 into main Apr 18, 2024
1 check passed
@RaulPPelaez RaulPPelaez deleted the cmake branch May 4, 2024 19:29