
prov/util: Switch ep_list_lock to always be mutex #9377

Merged
merged 1 commit into ofiwg:main on Sep 27, 2023

Conversation

a-szegel
Contributor

Switch ep_list_lock to always be a mutex lock until we can agree on a better long-term decision. The issue is that some endpoints want the ability to not require FI_THREAD_SAFE at the domain level, but they still want to be able to control the endpoint from a separate thread. The ep_list_lock is used for both the data path and the control path. In order to adhere to the current API, and not break providers that follow it, we need to revert this lock to a mutex until we can find an optimal solution for everyone.
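
As a minimal sketch of the kind of change being described (not the actual diff), assuming libfabric's ofi_genlock API (`ofi_genlock_init`, `enum ofi_lock_type`); the helper name `init_ep_list_lock()` is hypothetical:

```c
/* Sketch only -- not the actual diff.  Assumes libfabric's internal
 * ofi_genlock API and a util CQ/cntr that embeds ep_list_lock. */
#include <ofi_lock.h>

static int init_ep_list_lock(struct ofi_genlock *ep_list_lock)
{
	/*
	 * Previously the lock type could follow the domain threading
	 * model, so a non-FI_THREAD_SAFE domain could end up with a
	 * no-op lock here.  Because ep_list_lock protects control-path
	 * operations (e.g. binding/unbinding EPs) as well as the data
	 * path, force it back to a real mutex for now.
	 */
	return ofi_genlock_init(ep_list_lock, OFI_LOCK_MUTEX);
}
```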

@a-szegel
Contributor Author

I do not think this is the right long term solution, but it gives us time to come up with a better option (without fully reverting).

@shefty
Member

shefty commented Sep 27, 2023

I agree that we don't want to revert completely. We will also need to update the other places where this change was made. Counter and EQ? (I don't remember).

FWIW, I think apps using msg ep's may be more impacted by the threading options than rdm ep's. That's where my concern is coming from.

Switch ep_list_lock to always be a mutex lock until we can agree on a
better long-term decision. The issue is that some endpoints want the
ability to not require FI_THREAD_SAFE at the domain level, but they
still want to be able to control the endpoint from a separate thread.
The ep_list_lock is used for both the data path and the control path.
In order to adhere to the current API, and not break providers that
follow it, we need to revert this lock to a mutex until we can find an
optimal solution for everyone.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>
@a-szegel
Contributor Author

Done. I don't want to trade use cases. I want to get the optimal approach for both use cases.

@shefty shefty left a comment
Member

thanks!

@shefty
Copy link
Member

shefty commented Sep 27, 2023

I agree. However, I think doing so may require changing the behavior outlined in the man pages. That's doable for version 2, but may be difficult in v1.

@shefty shefty merged commit 64b31be into ofiwg:main Sep 27, 2023
8 checks passed
a-szegel added a commit to a-szegel/libfabric that referenced this pull request Oct 23, 2023
I mistakenly reverted the wrong lock in PR#9377 and caused a
performance regression.  This change makes ep->lock depend on the
domain threading model and properly reverts util_cntr's ep_list_lock to
always be a mutex.  This recovers most of the lost performance.

Signed-off-by: Seth Zegelstein <szegel@amazon.com>
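
A rough sketch of what this follow-up might look like (not the actual commit), assuming the same ofi_genlock helpers and struct util_domain's threading field; the helper names `ep_lock_type()` and `ep_init_lock()` are hypothetical:

```c
/* Sketch only -- not the actual follow-up commit. */
#include <ofi_lock.h>
#include <ofi_util.h>

/* Hypothetical helper: pick the data-path lock type for ep->lock. */
static enum ofi_lock_type ep_lock_type(struct util_domain *domain)
{
	/* Only take a real mutex when the app asked for FI_THREAD_SAFE;
	 * otherwise the data path can use a no-op lock. */
	return (domain->threading == FI_THREAD_SAFE) ?
	       OFI_LOCK_MUTEX : OFI_LOCK_NOOP;
}

static int ep_init_lock(struct util_ep *ep, struct util_domain *domain)
{
	/* ep->lock follows the domain threading model again, while the
	 * cntr/cq ep_list_lock stays an unconditional mutex. */
	return ofi_genlock_init(&ep->lock, ep_lock_type(domain));
}
```
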
a-szegel added a commit to a-szegel/libfabric that referenced this pull request Oct 23, 2023
I mistakenly reverted ep->lock to always be an ofi_mutex.
Restore ep->lock to depend on domain threading model

Signed-off-by: Seth Zegelstein <szegel@amazon.com>
a-szegel added a commit to a-szegel/libfabric that referenced this pull request Oct 25, 2023
I mistakenly reverted ep->lock to always be an ofi_mutex.
Restore ep->lock to depend on domain threading model

Signed-off-by: Seth Zegelstein <szegel@amazon.com>
j-xiong pushed a commit that referenced this pull request Oct 26, 2023
I mistakenly reverted ep->lock to always be an ofi_mutex.
Restore ep->lock to depend on domain threading model

Signed-off-by: Seth Zegelstein <szegel@amazon.com>
Juee14Desai pushed a commit to Juee14Desai/libfabric that referenced this pull request Jan 31, 2024
I mistakenly reverted ep->lock to always be an ofi_mutex.
Restore ep->lock to depend on domain threading model

Signed-off-by: Seth Zegelstein <szegel@amazon.com>