Support for multiple OpenMP runtimes in the same process #47

Open
rohany opened this issue May 24, 2022 · 3 comments

@rohany
rohany commented May 24, 2022

I'm using TBLIS in a system that has support for running multiple OpenMP runtimes within the same process, which is somewhat unusual. I'm tracking down some weird performance issues (StanfordLegion/legion#1266) when using TBLIS in this situation, and am wondering if there are some architectural issues within TBLIS (such as global state / locks) that could cause interference between independent TBLIS calls on these different OpenMP runtimes.

@devinamatthews
Owner

During normal computation, the only global locking happens when checking out a block of memory from the global pool. This should scale roughly the same as if you were using one OpenMP runtime across all of the cores in the first place. I'm not exactly sure what "running multiple OpenMP runtimes within the same process" even means, though. Everything else operates explicitly among the threads spawned by #pragma omp parallel, so if those thread groups are distinct then there shouldn't be any performance impact.
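
To make the locking pattern described above concrete, here is a minimal sketch of two independent thread groups sharing one lock-protected block pool. It is an illustration only, not TBLIS's actual implementation: `BlockPool`, `run_group`, and `run_two_groups` are hypothetical names, and Python threads stand in for the OpenMP thread teams.

```python
import threading


class BlockPool:
    """Hypothetical global pool; the lock guards only checkout and release."""

    def __init__(self, block_size):
        self._block_size = block_size
        self._free = []
        self._lock = threading.Lock()
        self.allocated = 0  # number of blocks ever created

    def checkout(self):
        with self._lock:  # the only point where the thread groups contend
            if self._free:
                return self._free.pop()
            self.allocated += 1
        # Allocation of a fresh block happens outside the lock.
        return bytearray(self._block_size)

    def release(self, block):
        with self._lock:
            self._free.append(block)


def worker(pool, results, index):
    block = pool.checkout()
    block[0] = index % 256      # work on the private block, no lock held
    results[index] = block[0]
    pool.release(block)


def run_group(pool, size, results, offset):
    """Stand-in for one thread team, e.g. one '#pragma omp parallel' region."""
    threads = [
        threading.Thread(target=worker, args=(pool, results, offset + i))
        for i in range(size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_two_groups(pool, size=4):
    """Run two distinct thread groups that share only the pool."""
    results = [None] * (2 * size)
    groups = [
        threading.Thread(target=run_group, args=(pool, size, results, g * size))
        for g in range(2)
    ]
    for g in groups:
        g.start()
    for g in groups:
        g.join()
    return results
```

In this sketch the two groups interact only inside `checkout` and `release`, so contention should depend on how often blocks are checked out rather than on how the threads are grouped.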

@jeffhammond

Multiple OpenMP runtimes in a single process is not a legal use case for OpenMP. Nothing in the specification requires it to work, and there are good reasons why it cannot work.

If you use the KMP runtime from Intel or LLVM, you'll find that it supports the GOMP symbols required to interoperate with OpenMP code built against GOMP. Just make sure only KMP is in the library load path.

You might need to set an Intel compiler flag to force the use of GOMP symbols instead of IOMP5 for this to work perfectly.

NVHPC (formerly PGI) also has a GOMP compatible runtime. The same caveat about a compiler flag for the OpenMP runtime ABI may apply.

You'll find interoperability is more robust for common OpenMP features. Some parts of OpenMP 4+ tasking and target offload are less reliable when mixing runtimes, but I don't think that's relevant to TBLIS.
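
One way to check the "only KMP in the library load path" advice above is to look at which OpenMP runtime libraries are actually mapped into the process. The sketch below assumes Linux (it reads `/proc/self/maps`) and assumes the usual shared-library names `libgomp`, `libiomp5`, and `libomp`; the function names are hypothetical, and a custom-named or statically linked runtime would not be detected.

```python
import os
import re

# Assumed file names: GNU libgomp, Intel libiomp5, LLVM libomp,
# optionally followed by a version suffix such as ".1.0.0".
_RUNTIME_NAME = re.compile(r"^(libgomp|libiomp5|libomp)\.so(\.\d+)*$")


def openmp_runtimes_in_maps(maps_text):
    """Return the set of OpenMP runtime names found in /proc/<pid>/maps text."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, device, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        match = _RUNTIME_NAME.match(os.path.basename(fields[5].strip()))
        if match:
            found.add(match.group(1))
    return found


def openmp_runtimes_in_this_process():
    """Linux only: inspect the shared libraries mapped into this process."""
    with open("/proc/self/maps") as maps:
        return openmp_runtimes_in_maps(maps.read())
```

If the returned set contains more than one name, for example both `libgomp` and `libomp`, then more than one OpenMP runtime is loaded and the load path likely needs adjusting.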
