Read CPU mask of main thread early to avoid reading it after it's been reset by other libraries #738
Merged
msimberg changed the title from "Get cpu mask early" to "Read CPU mask of main thread early to avoid reading it after it's been reset by other libraries" on Aug 14, 2023
bors try
try: Build failed
bors try
try: Build failed
aurianer approved these changes on Aug 15, 2023
msimberg force-pushed the `get-cpu-mask-early` branch from 4633bb2 to 7426fad on August 15, 2023 12:56
bors try
bors merge
bors bot added a commit that referenced this pull request on Aug 15, 2023
738: Read CPU mask of main thread early to avoid reading it after it's been reset by other libraries r=msimberg a=msimberg

Part of #568. This is part one that helps work around problems when using pika together with OpenMP. This makes `topology` store the current CPU mask on construction that can then be retrieved using `get_cpubind_mask_main_thread`. This also makes sure that the `topology` object is constructed early in a global constructor so that the mask is actually read on the main thread.

This change helps if OpenMP sets the binding of the main thread when it first encounters a parallel region. LLVM OpenMP seems to behave this way. However, GNU OpenMP actually sets the binding in a global constructor (https://github.com/gcc-mirror/gcc/blob/7879f589af911ea6a910d08919014b0b2df1b4b1/libgomp/env.c#L2409), and since it's called in a shared library we can't directly influence when that happens. I will open a follow-up PR to also allow overriding the mask explicitly for the cases when GNU OpenMP is used (most cases...). One can in theory control the order in which constructors in shared libraries are called by playing around with linking order, but this seems like a very error-prone approach so I'm not even going to try to go that way.

Fly-by: I've renamed `create_topology` to `get_topology` since that's what it's primarily used for (it does also create the object, but only on the first call...).

Co-authored-by: Mikael Simberg <mikael.simberg@iki.fi>
Build failed
bors merge
Merge conflict.
Part of #568. This is part one of a workaround for problems when using pika together with OpenMP. This makes `topology` store the current CPU mask on construction; the mask can then be retrieved using `get_cpubind_mask_main_thread`. This also makes sure that the `topology` object is constructed early, in a global constructor, so that the mask is actually read on the main thread.

This change helps if OpenMP sets the binding of the main thread when it first encounters a parallel region. LLVM OpenMP seems to behave this way. However, GNU OpenMP actually sets the binding in a global constructor (https://github.com/gcc-mirror/gcc/blob/7879f589af911ea6a910d08919014b0b2df1b4b1/libgomp/env.c#L2409), and since that constructor is called in a shared library we can't directly influence when that happens. I will open a follow-up PR to also allow overriding the mask explicitly for the cases when GNU OpenMP is used (most cases...). One can in theory control the order in which constructors in shared libraries are called by playing around with linking order, but this seems like a very error-prone approach so I'm not even going to try to go that way.

Fly-by: I've renamed `create_topology` to `get_topology` since that's what it's primarily used for (it does also create the object, but only on the first call...).
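
For readers unfamiliar with the pattern, the idea of reading the main thread's CPU mask in a global constructor can be sketched roughly as follows. This is not pika's implementation: the names `topology_sketch`, `global_topology_sketch`, and `get_topology_sketch` are hypothetical, and the sketch assumes Linux with glibc and uses `sched_getaffinity` directly rather than whatever pika's `topology` uses internally. Only the accessor name `get_cpubind_mask_main_thread` is taken from the description above.

```cpp
// Linux/glibc-only sketch: sched_getaffinity and the CPU_* macros are GNU extensions.
#include <sched.h>

#include <cstddef>

// Hypothetical stand-in for a topology class that snapshots the CPU mask
// of whichever thread constructs it.
class topology_sketch
{
public:
    topology_sketch() noexcept
    {
        CPU_ZERO(&main_thread_mask_);
        // A pid of 0 means "the calling thread".
        valid_ = sched_getaffinity(0, sizeof(main_thread_mask_), &main_thread_mask_) == 0;
    }

    // Returns the mask captured at construction, not the current mask.
    cpu_set_t const& get_cpubind_mask_main_thread() const noexcept
    {
        return main_thread_mask_;
    }

    // True if the mask was read successfully at construction.
    bool valid() const noexcept { return valid_; }

    // Number of CPUs in the captured mask.
    std::size_t count() const noexcept
    {
        return static_cast<std::size_t>(CPU_COUNT(&main_thread_mask_));
    }

private:
    cpu_set_t main_thread_mask_;
    bool valid_ = false;
};

// Namespace-scope object: its constructor runs during static initialization,
// before main() starts and on the thread that will run main(). Its order
// relative to global constructors in other shared libraries is not controlled
// here, which is the limitation described above for GNU OpenMP.
topology_sketch global_topology_sketch;

// Accessor that only returns the already-constructed object.
topology_sketch const& get_topology_sketch() noexcept
{
    return global_topology_sketch;
}
```

Calling `get_topology_sketch().get_cpubind_mask_main_thread()` later in the program returns the mask as it was during static initialization, even if another library has changed the main thread's affinity in the meantime, provided that library did not change it in an earlier global constructor.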