Added Intel ThreadBuildingBlocks #1315

Closed · jmjatlanta wants to merge 2 commits

Conversation

jmjatlanta (Contributor) commented Sep 10, 2018

Another option to fix issue #1256

Added Intel Threading Building Blocks (TBB) to give us a library for a concurrent unordered set. This library has the following caveats:

  • This is not the latest from Intel. It is the 2017 version, maintained by GitHub user wjakob. The official Intel version implements CMake in a way that I (and others) cannot figure out. I am still working on this in my tbb branch.
  • The function erase is not part of tbb::concurrent_unordered_set, although it provides an unsafe_erase function. We call erase in 4 or 5 places and need to look hard at those areas for ways to protect ourselves (see the sketch below).
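
A minimal sketch of the trade-off in the second caveat, assuming the classic TBB headers from the wjakob mirror; the peer set and thread counts are hypothetical, not code from this branch:

```cpp
#include <tbb/concurrent_unordered_set.h>

#include <iostream>
#include <thread>
#include <vector>

int main()
{
    tbb::concurrent_unordered_set<int> peers;

    // Concurrent insertion from several threads is supported without
    // any external locking.
    std::vector<std::thread> writers;
    for (int t = 0; t < 4; ++t)
        writers.emplace_back([&peers, t] {
            for (int i = 0; i < 1000; ++i)
                peers.insert(t * 1000 + i);
        });
    for (auto& w : writers)
        w.join();

    // There is no erase(); only unsafe_erase(), which the caller must
    // guarantee does not run concurrently with any other operation on
    // the set. That is why the 4 or 5 erase call sites need auditing.
    peers.unsafe_erase(0);

    std::cout << peers.size() << '\n';  // expect 3999
    return 0;
}
```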

Please take a look and add your thoughts.

abitmore (Member)

pmconrad (Contributor)

Interesting. Thanks for the link @abitmore .

As far as I understand, erase creates problems not at the point where it is used, but because it interferes with all other threads that are currently using the object in question. So effectively this change doesn't help us at all.

Similarly, #1308 doesn't help us because it protects us only when getting iterators but not when using them.

Right now I don't see an easy solution.

jmjatlanta (Contributor, Author) commented Sep 14, 2018

As for erase: I believe we are protected if someone else already has the peer_connection_ptr, as it is reference counted (a std::shared_ptr).

As for the iterator: I have seen other implementations that make the guts of the "for loop" a lambda, and either:

  1. lock the collection with a lock guard and then call std::for_each, or
  2. provide a for_each member function that does what option 1 does.

Personally I like option 1, as it is more expressive, but option 2 hides complexity (option 1 requires the mutex to be exposed). Pick your poison; a sketch of both options follows below.
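
A minimal sketch of both options, using a plain std::unordered_set plus std::mutex; the names (peer_connection, peer_set_v1, peer_set_v2) are hypothetical stand-ins, not the node's actual classes:

```cpp
#include <algorithm>
#include <iostream>
#include <memory>
#include <mutex>
#include <unordered_set>

struct peer_connection { int id; };
using peer_connection_ptr = std::shared_ptr<peer_connection>;

// Option 1: the mutex is exposed; callers lock it and call std::for_each.
struct peer_set_v1
{
    std::mutex mtx;                                   // exposed to callers
    std::unordered_set<peer_connection_ptr> peers;
};

// Option 2: the mutex is hidden behind a for_each member function.
class peer_set_v2
{
    mutable std::mutex mtx_;
    std::unordered_set<peer_connection_ptr> peers_;
public:
    void insert(peer_connection_ptr p)
    {
        std::lock_guard<std::mutex> guard(mtx_);
        peers_.insert(std::move(p));
    }

    template <typename Fn>
    void for_each(Fn fn) const
    {
        std::lock_guard<std::mutex> guard(mtx_);      // held for the whole loop
        std::for_each(peers_.begin(), peers_.end(), fn);
    }
};

int main()
{
    peer_set_v1 v1;
    v1.peers.insert(std::make_shared<peer_connection>(peer_connection{1}));
    {
        std::lock_guard<std::mutex> guard(v1.mtx);    // option 1: caller locks explicitly
        std::for_each(v1.peers.begin(), v1.peers.end(),
                      [](const peer_connection_ptr& p) { std::cout << p->id << '\n'; });
    }

    peer_set_v2 v2;
    v2.insert(std::make_shared<peer_connection>(peer_connection{2}));
    v2.for_each([](const peer_connection_ptr& p) { std::cout << p->id << '\n'; });  // option 2

    return 0;
}
```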

The side effect, of course, is that nobody can iterate (or do anything else) while someone else is iterating. But in this code, I believe we can mitigate its impact:

  1. The code was originally written to do everything in one thread, so adding multithreading here SHOULD not affect performance too much.
  2. We should review each lambda as we build it, to optimize it and hopefully shorten its duration.

Please shoot holes in my erase theory and lambda idea. I am looking forward to your comments.

pmconrad (Contributor)

Yes, shared_ptr protects the thing it points to, but the shared_ptr itself is endangered by erase.

I think your idea for iteration may cause problems if the code within the loop yields. Not sure if it does that.
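
A minimal single-threaded sketch of that distinction: a copied shared_ptr keeps the peer object alive, but a reference to the shared_ptr stored inside the container dangles once the element is erased (in the real code the erase would come from another thread). The names here are hypothetical:

```cpp
#include <iostream>
#include <memory>
#include <unordered_set>

struct peer_connection { int id; };
using peer_connection_ptr = std::shared_ptr<peer_connection>;

int main()
{
    std::unordered_set<peer_connection_ptr> active_peers;
    auto result = active_peers.insert(std::make_shared<peer_connection>(peer_connection{7}));

    peer_connection_ptr my_copy = *result.first;       // safe: owns its own reference count
    const peer_connection_ptr& in_set = *result.first; // refers to the element inside the set
    (void)in_set;                                      // silence unused-variable warnings

    active_peers.erase(result.first);                  // stands in for another thread calling erase

    std::cout << my_copy->id << '\n';                  // fine: the copied shared_ptr keeps the peer alive
    // std::cout << in_set->id << '\n';                // undefined behavior: in_set now dangles
    return 0;
}
```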

abitmore (Member)

AFAICS, network_broadcast_api and the witness plugin (block production) push data into the P2P layer from the outside, so more than one thread is accessing data in the module, and those threads may or may not touch the active connection set.

abitmore (Member) commented Sep 14, 2018

If the code within the loop yields while other threads are blocked there, I guess the code will resume?

jmjatlanta (Contributor, Author)

I think your idea for iteration may cause problems if the code within the loop yields. Not sure if it does that.

@pmconrad I may be misunderstanding what you are saying, but here is a demo that may help: https://gist.github.com/jmjatlanta/696fd4a00a18c9ca014a2a8a446d1f04

pmconrad (Contributor)

Thanks. Looks like the lock is kept during the yield. Makes sense.

There is the danger that if we lock the entire loop, and call other operations (add/erase) on the list that require locking, then we will create a deadlock.

jmjatlanta (Contributor, Author) commented Sep 28, 2018

There is the danger that if we lock the entire loop, and call other operations (add/erase) on the list that require locking, then we will create a deadlock.

We could use careful coding, or a smarter mutex (such as a recursive_mutex). My vote is for the smarter mutex; see the sketch below.
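
A minimal sketch of the smarter-mutex idea, assuming std::recursive_mutex and a hypothetical peer_set wrapper (not the node's actual code); it iterates over a snapshot so that a nested erase does not invalidate the loop's iterators:

```cpp
#include <algorithm>
#include <iostream>
#include <mutex>
#include <unordered_set>

class peer_set
{
    std::recursive_mutex mtx_;
    std::unordered_set<int> peers_;
public:
    void insert(int id)
    {
        std::lock_guard<std::recursive_mutex> guard(mtx_);
        peers_.insert(id);
    }

    void erase(int id)
    {
        // Re-entrant: may be called from inside for_each on the same thread.
        std::lock_guard<std::recursive_mutex> guard(mtx_);
        peers_.erase(id);
    }

    template <typename Fn>
    void for_each(Fn fn)
    {
        std::lock_guard<std::recursive_mutex> guard(mtx_);
        // Iterate over a snapshot so a nested erase() cannot invalidate
        // the iterators we are walking.
        std::unordered_set<int> snapshot = peers_;
        std::for_each(snapshot.begin(), snapshot.end(), fn);
    }
};

int main()
{
    peer_set peers;
    peers.insert(1);
    peers.insert(2);

    // With a plain std::mutex this nested erase() would self-deadlock;
    // std::recursive_mutex lets the same thread acquire the lock again.
    peers.for_each([&peers](int id) { if (id == 1) peers.erase(id); });

    peers.for_each([](int id) { std::cout << id << '\n'; });  // prints 2
    return 0;
}
```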

jmjatlanta (Contributor, Author)

My brain merged two things together earlier, and I need to sort it out.

I was proposing a lock around iteration. This is our own lock, separate from the locking mechanisms inside the collection. That means add/erase will still use their internal locking, while our lock only prevents other iterations from starting while we hold it. That protects us from other iterators, but not from add/erase.

If we want to "stop the world" while we iterate, we must be in control when iterating, adding, and erasing. We can do that in our code. But IMO, by the time we did that, we would have more or less what #1308 already is.

pmconrad (Contributor)

It might be a good idea to create a high-level view of how the P2P layer operates.

abitmore (Member) commented Feb 18, 2019

I guess this won't be merged? Close it?

jmjatlanta (Contributor, Author)

I guess this won't be merged? Close it?

I am guessing it won't be merged. But I was holding off on closing it until we reach a resolution on how to handle the concurrency issue with unordered_set. Should it be decided that my implementation of a concurrent unordered_set is not what we want, we will have to look at other options. Honestly, I feel that TBB is still not what we want, due to the way fc does threading contexts.

jmjatlanta (Contributor, Author)

Intel TBB locking will conflict with fc-context and fc-thread. AFAICS, TBB is not an option. Closing.

jmjatlanta closed this Mar 22, 2019
pmconrad removed this from the 3.1.0 - Feature Release milestone Apr 12, 2019