Skip to content

Commit

Permalink
Merge #5962: fix: deadlock over cs_main and contributionsCacheCs in d…
Browse files Browse the repository at this point in the history
…kssessionmgr

ded1b5a fix: deadlock over cs_main and contributionsCacheCs in dkssessionmgr (Konstantin Akimov)

Pull request description:

  ## Issue being fixed or feature implemented
  **It fixes rpc failure: "Work queue depth exceeded"**

  As I checked on running `dashd` in deadlock condition:
  Thread 78 is a thread that keep `cs_main`:
  ```
  #14 0x0000aaaad1f8d604 in BuildSimplifiedMNListDiff () at evo/simplifiedmns.cpp:364
  ```
  but it is locked by `contributionsCacheCs`
  ```
  #8  llmq::CDKGSessionManager::GetVerifiedContributions () at llmq/dkgsessionmgr.cpp:392
  ```

  On other hand, `contributionsCacheCs` is blocked by Thread 59
  ```
  #17 0x0000aaaad1ba1940 in llmq::CDKGSessionManager::GetVerifiedContributions () at llmq/dkgsessionmgr.cpp:393
  ```
  and it makes circuit lock by waiting `cs_main` in
  ```
  #9  ReadBlockFromDisk () at node/blockstorage.cpp:75
  ```

  See dashpay/dash-issues#69 for more details

  Seems introduced there: #3911

  ## What was done?
  Deadlock is removed by reducing scope of mutex

  ## How Has This Been Tested?
  I reviewed 2 different servers which have status `work queue exceeded`, both have same deadlock, so, this patch should fix this issue. Once this fix is merged, we can test it on testnet.

  ## Breaking Changes
  N/A

  ## Checklist:
  - [x] I have performed a self-review of my own code
  - [x] I have commented my code, particularly in hard-to-understand areas
  - [ ] I have added or updated relevant unit/integration/functional/e2e tests
  - [ ] I have made corresponding changes to the documentation
  - [x] I have assigned this pull request to a milestone

Top commit has no ACKs.

Tree-SHA512: 4fe5c03c464ee6934fb927b897f007b65a8995723196edaffdae067edee7067da151130d4c4bac47d3418fdad5c8e130682f42d7ef9c044380a8c8fff78ee008
  • Loading branch information
PastaPastaPasta committed Apr 2, 2024
2 parents 938cc70 + ded1b5a commit fcee7a5
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion src/llmq/dkgsessionmgr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -390,7 +390,6 @@ void CDKGSessionManager::WriteEncryptedContributions(Consensus::LLMQType llmqTyp

bool CDKGSessionManager::GetVerifiedContributions(Consensus::LLMQType llmqType, const CBlockIndex* pQuorumBaseBlockIndex, const std::vector<bool>& validMembers, std::vector<uint16_t>& memberIndexesRet, std::vector<BLSVerificationVectorPtr>& vvecsRet, std::vector<CBLSSecretKey>& skContributionsRet) const
{
LOCK(contributionsCacheCs);
auto members = utils::GetAllQuorumMembers(llmqType, m_dmnman, pQuorumBaseBlockIndex);

memberIndexesRet.clear();
Expand All @@ -399,6 +398,9 @@ bool CDKGSessionManager::GetVerifiedContributions(Consensus::LLMQType llmqType,
memberIndexesRet.reserve(members.size());
vvecsRet.reserve(members.size());
skContributionsRet.reserve(members.size());

// NOTE: the `cs_main` should not be locked under scope of `contributionsCacheCs`
LOCK(contributionsCacheCs);
for (const auto i : irange::range(members.size())) {
if (validMembers[i]) {
const uint256& proTxHash = members[i]->proTxHash;
Expand Down

0 comments on commit fcee7a5

Please sign in to comment.