Run Blockchain actor on its own thread using a PinnedDispatcher. #681
The |
Maybe you don’t understand me. The snapshot is not guaranteed to get the most recent version of the database if it isn’t in the same thread. |
To be clear: creating a snapshot on one thread is not guaranteed to get the latest version of the database if the database was recently written to from another thread. |
neo/neo/Persistence/LevelDB/DbSnapshot.cs Line 33 in 5ce0e4c
The internal implementation of |
It is thread safe but it will not be guaranteed to get the same version of the database. |
It will be guaranteed to get the same version of the database. That's why snapshots were designed for LevelDB. |
I have run multiple tests and found that it does not always get the latest version of the database if the database was very recently changed on another thread. It doesn’t matter that the snapshot is created after the write; it is a matter of memory visibility. LevelDB is thread safe in that it won’t cause corruption in this case, but it is not thread safe in the sense of guaranteeing that the snapshot will see the latest version of the database. |
I’ve tried two different versions of LevelDB, and in practice, both of them do not work as you are saying. |
Snapshot in LevelDB is meant to give you a read-only version of the database. It isn’t guaranteed to give you the latest version if asking on another thread. It will give you a consistent version from the perspective of batch writes, but not necessarily the latest. |
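To make the "consistent but not necessarily latest" wording concrete, here is a minimal toy sketch in Python. It is not LevelDB's implementation and does not model the cross-thread visibility question being debated here; `ToyStore` and `ToySnapshot` are hypothetical names used only for illustration. It shows only that a snapshot is a frozen view fixed at creation time, so a batch written afterwards is never visible through it:

```python
class ToyStore:
    """Toy multi-version key-value store (illustration only, not LevelDB)."""

    def __init__(self):
        self._versions = {}  # key -> list of (sequence, value), oldest first
        self._sequence = 0   # bumped once per batch write

    def write_batch(self, items):
        """Apply every item in the batch under a single new sequence number."""
        self._sequence += 1
        for key, value in items.items():
            self._versions.setdefault(key, []).append((self._sequence, value))

    def snapshot(self):
        """Capture a read-only view fixed at the current sequence number."""
        return ToySnapshot(self, self._sequence)


class ToySnapshot:
    def __init__(self, store, sequence):
        self._store = store
        self.sequence = sequence

    def get(self, key):
        """Return the newest value written at or before this snapshot's sequence."""
        for seq, value in reversed(self._store._versions.get(key, [])):
            if seq <= self.sequence:
                return value
        return None


store = ToyStore()
store.write_batch({"height": 5, "hash": "block-5"})
old_view = store.snapshot()          # frozen after block 5
store.write_batch({"height": 6, "hash": "block-6"})
new_view = store.snapshot()          # frozen after block 6

# The old snapshot is internally consistent (height and hash match) but not the latest.
assert (old_view.get("height"), old_view.get("hash")) == (5, "block-5")
assert (new_view.get("height"), new_view.get("hash")) == (6, "block-6")
```

Whether a snapshot created on thread B is guaranteed to observe a write that just completed on thread A is a separate question, and this single-threaded toy says nothing about it.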
Can you put the test code on the conversation? |
You can easily test it by running code that imports 1 block at a time from a chain.acc file. You will see that it will fail after a while due to the snapshot getting a previous view of the database. |
Do you mean the |
No I will give an example expressing the problem: Persist block 5
(Thread switch) Persist block 6
|
Getting a snapshot always happens after the last commit, so why is it possible to get the block 4 snapshot? Unless the commit is not yet complete when the next snapshot is taken. |
My suspicion is related to memory visibility across threads and what is happening internally in LevelDB. It also may still be necessary for synchronous writes to ensure the snapshot is complete in addition to this change, to fully prevent the problem. That was the PR I previously closed #680 . However, to be safe, I think we may need both. |
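For readers unfamiliar with the term, a "synchronous write" generally means the write call does not return until the data has been pushed to persistent storage, rather than only handed to the operating system. The sketch below illustrates that general idea with a plain append-only file in Python; it is not the change from #680 and does not call LevelDB, and `append_record` is a hypothetical helper:

```python
import os
import tempfile


def append_record(path, record, sync):
    """Append one record; with sync=True, wait until the OS reports it is on disk."""
    with open(path, "ab") as log:
        log.write(record)
        log.flush()                 # push Python's buffer to the operating system
        if sync:
            os.fsync(log.fileno())  # ask the OS to flush its cache to storage


log_path = os.path.join(tempfile.mkdtemp(), "toy.log")
append_record(log_path, b"block 5\n", sync=False)  # returns once the OS has the bytes
append_record(log_path, b"block 6\n", sync=True)   # returns only after fsync completes

with open(log_path, "rb") as log:
    contents = log.read()
```

Both calls leave the same bytes in the file; the difference is only in when the call returns and what survives a crash, which is why synchronous writes are slower.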
…in the latest data.
In conclusion, it needs synchronous writes from #680, but that alone isn't enough; it also needs the |
LevelDB should be using memory barriers and CAS primitives to be able to ensure the latest data will be read from other threads, so that these changes shouldn’t be necessary. We probably need to deep dive into LevelDB to see what is happening. Also maybe these issues are specific to the OS on which I’m testing. |
Even with these changes I'm still having syncing problems. I'm going to close this until I fully root cause the issue and am able to sync the chain without issues. |
So it seems my issues were actually caused by a bug in LevelDB. I no longer seem to encounter the issues when using LevelDB with the following changes applied: |
In testing the improved `ImportBlocks` plugin neo-project/neo-modules#66 I encountered similar issues as I had previously described here neo-project/neo-vm#53. When I had encountered these issues in the past, it was also when using code that imported blocks using multiple `Import` messages. I've finally root caused the issue to be related to the `Blockchain` actor running on different OS threads, and I will describe the issue in more detail below.

Specifically, consider this line that gets a snapshot of the database:
neo/neo/Ledger/Blockchain.cs
Line 450 in 5ce0e4c
This line works reliably to get the latest snapshot, when the same OS thread previously persisted the block through the following code:
neo/neo/Ledger/Blockchain.cs
Line 623 in 5ce0e4c
However, if the actor switched threads between invocations, LevelDB does not ensure that the snapshot subsequently retrieved will be the latest snapshot (it doesn't guarantee atomic access to the latest snapshot last written across threads). By forcing the `Blockchain` actor to run on the same thread using a `PinnedDispatcher`, the consistency is preserved and the updated `ImportBlocks` plugin from neo-project/neo-modules#66 works without any issues. Also, the code as it is now could hit this bug in regular operation if two blocks are persisted in quick succession.
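As a language-agnostic illustration of what pinning an actor to one thread means, here is a minimal Python sketch. It is not Akka.NET code and not the change in this PR; `PinnedActor` is a hypothetical class that approximates a pinned dispatcher with a single-worker executor, so every message is processed in order on the same dedicated OS thread:

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class PinnedActor:
    """Hypothetical actor whose messages all run on one dedicated thread."""

    def __init__(self):
        # A single worker means no message can ever be handled on another thread.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self.state = []          # only touched from the worker thread
        self.thread_ids = set()  # records which threads handled messages

    def tell(self, message):
        """Enqueue a message; it is processed later, in FIFO order."""
        return self._executor.submit(self._receive, message)

    def _receive(self, message):
        self.thread_ids.add(threading.get_ident())
        self.state.append(message)

    def stop(self):
        """Drain the queue and stop the worker thread."""
        self._executor.shutdown(wait=True)


actor = PinnedActor()
for height in range(1, 6):
    actor.tell(f"persist block {height}")
actor.stop()

# Every message was handled by the same thread, in the order it was sent.
assert len(actor.thread_ids) == 1
assert actor.state == [f"persist block {h}" for h in range(1, 6)]
```

With a shared thread pool (more than one worker), consecutive messages to the same actor may be handled on different threads, which is the situation described above.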