Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[base node] Block chain db locks up with read-write lock #4668

Closed
hansieodendaal opened this issue Sep 13, 2022 · 1 comment · Fixed by #4674
Closed

[base node] Block chain db locks up with read-write lock #4668

hansieodendaal opened this issue Sep 13, 2022 · 1 comment · Fixed by #4674
Assignees

Comments

@hansieodendaal
Copy link
Contributor

hansieodendaal commented Sep 13, 2022

Sometimes the block chain db locks up with interlock - this was observed in a big stress test with a base node serving a dedicated wallet, a SHA3 miner and an XMRIG miner.

2022-09-13 00:40:56.936229100 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6576
2022-09-13 00:40:56.936246000 [c::cs::database] TRACE [add_block] acquired write access db lock for block #6576 in 600.00ns
2022-09-13 00:40:57.717735500 [c::cs::database] TRACE [add_block] released write access db lock for block #6576 
2022-09-13 00:58:58.408781900 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6577
2022-09-13 00:58:58.408788900 [c::cs::database] TRACE [add_block] acquired write access db lock for block #6577 in 500.00ns
2022-09-13 00:58:58.926669600 [c::cs::database] TRACE [add_block] released write access db lock for block #6577 
2022-09-13 01:06:23.420061300 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:06:23.420069100 [c::cs::database] TRACE [add_block] acquired write access db lock for block #6578 in 800.00ns
2022-09-13 01:06:23.944608200 [c::cs::database] TRACE [add_block] released write access db lock for block #6578 
2022-09-13 01:06:46.663132800 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:06:47.201773700 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6579
2022-09-13 01:08:23.542110800 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:10:13.128309400 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:19:51.308916800 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:22:17.251338900 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:24:06.834677600 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:27:47.983194000 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:28:46.756211400 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:29:13.694456600 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:44:57.777042700 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:45:40.835611500 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
2022-09-13 01:46:56.703743600 [c::cs::database] TRACE [add_block] waiting for write access to add block block #6578
@hansieodendaal
Copy link
Contributor Author

hansieodendaal commented Sep 13, 2022

Other symptoms: Watched command status failed: deadline has elapsed (see related #4618)

01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 266, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 266, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 166, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 11, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 11, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 9, Rpc: 7/10000, RandomX: #0 with flags FLAG_HARD_AES | FLAG_JIT | FLAG_ARGON2_SSSE3 | FLAG_ARGON2_AVX2 | FLAG_ARGON2
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 10, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 11, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 11, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 13, Rpc: 7/10000
01:05 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 13, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 12, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 11, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 11, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 11, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6577 (Mon, 12 Sep 2022 22:58:53 +0000), Mempool: 21tx (567378g, +/- 5blks), Connections: 18, Banned: 0, Messages (last 60s): 9, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6578 (Mon, 12 Sep 2022 23:06:20 +0000), Mempool: 17tx (459306g, +/- 4blks), Connections: 18, Banned: 0, Messages (last 60s): 11, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6578 (Mon, 12 Sep 2022 23:06:20 +0000), Mempool: 17tx (459306g, +/- 4blks), Connections: 18, Banned: 0, Messages (last 60s): 12, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6578 (Mon, 12 Sep 2022 23:06:20 +0000), Mempool: 17tx (459306g, +/- 4blks), Connections: 18, Banned: 0, Messages (last 60s): 13, Rpc: 7/10000
01:06 v0.38.2, esmeralda, State: Listening, Tip: 6578 (Mon, 12 Sep 2022 23:06:20 +0000), Mempool: 17tx (459306g, +/- 4blks), Connections: 18, Banned: 0, Messages (last 60s): 12, Rpc: 7/10000
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed
Watched command `status` failed: deadline has elapsed

@hansieodendaal hansieodendaal changed the title [base node] Block chain db locks up with interlock [base node] Block chain db locks up with read-write lock Sep 13, 2022
@stringhandler stringhandler moved this from Bugs to In Progress in Tari Esme Testnet Sep 13, 2022
stringhandler pushed a commit that referenced this issue Sep 14, 2022
Description
---
This fixes a deadlock in the code. 

The `RwLock` allows a mutex to exist that allows concurrent reads but blocks concurrent writes, this behaves differently depending on the OS.
On Linux systems: The lock favors reads, meaning that as long as there exists a reader lock, it will allow a new reader lock to open. This means that the system can cause starvation of writers. 
On Mac/Win the lock has equal ordering, this means that as soon as a writer queues for a lock, all additional readers will be blocked till after the writer has acquired and released its lock. 

This behavior can be dangerous if recursive locks are used, as was the case here. 


At about the same time, a block was submitted, and a template was constructed for a new miner. 
The `add_block` process requires a write lock, while the `block template` process requires a read lock. 
The `template process` was first in acquiring a lock on the read, followed shortly by the `add_block` on the blocking for a write. But the deadlock was caused after the `add_block` blocked for a write, the `block template` required an additional read_lock on the calculation of the mmr roots. And thus, the entire `block_chain db `class is deadlocked. 


Fixes: #4668
Repository owner moved this from In Progress to Done in Tari Esme Testnet Sep 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants