Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(core): fixes stale chain metadata being sent to listening state #5039

Conversation

sdbondi
Copy link
Member

@sdbondi sdbondi commented Dec 12, 2022

Description

  • Removes "caching" of peer metadata
  • Send only one peer chain metadata at a time to the listening state
  • increase chain metadata event channel to 20 (so that 20 of the last chain metadata received can be read by listening)

Motivation and Context

Fixes #5037 and #5030 (needs to be confirmed since this case is not easy to reproduce)

What I've observed is that the peer is banned twice. The first time, for 30 minutes (between 04:07:23.802123900 and 2022-12-09 04:41:26.365429300) then again permanently (bad metadata sig). It's clear in the logs that the pong is received before being banned, the node is banned (chain metadata service clears the peer's metadata) but the message is already in the domain pipeline so another one is received. The chain metadata service now keeps the peer's chain metadata and sends it to the listening state every time because it is not cleared.

TL;DR classic race condition.

Order of events:

  • peer is banned for 30 minutes
  • peer ban expires
  • we receive a ping from the peer
  • at almost the same time peer is banned (see logs in this comment)
  • chain metadata service clears the peer from peer_chain_metadata
  • the ping/pong is received (already in the pipeline from before the ban)
  • the chain metadata is added to the vec, and is never cleared (because the peer is not banned again)
  • the peer stays in the list despite being banned and NOT connected, so the header sync continues to try to connect to it

How Has This Been Tested?

Manually: rewind-blockchain and sync, enters sync mode timeously
Removed some tests that test removed functions
Added a new unit test for determine_sync_state

@CjS77 CjS77 added P-acks_required Process - Requires more ACKs or utACKs P-reviews_required Process - Requires a review from a lead maintainer to be merged labels Dec 12, 2022
@sdbondi sdbondi force-pushed the core-remove-chain-metadata-list branch from b22e2ad to f3d2c9c Compare December 12, 2022 07:52
@CjS77 CjS77 removed the P-reviews_required Process - Requires a review from a lead maintainer to be merged label Dec 12, 2022
@stringhandler stringhandler merged commit aaf99b7 into tari-project:development Dec 12, 2022
sdbondi added a commit to sdbondi/tari that referenced this pull request Dec 12, 2022
stringhandler pushed a commit that referenced this pull request Dec 12, 2022
Description
---
- adds check for peer banning in the listening state
- fix build/test error in #5039 

Motivation and Context
---
Do one final check that the peer is not banned before processing chain metadata in the listening state (can happen due to race conditions)
Fix test failure

How Has This Been Tested?
---
@hansieodendaal
Copy link
Contributor

hansieodendaal commented Dec 12, 2022

Nice, I will test it later on with a stress test by unbanning all manually banned peers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P-acks_required Process - Requires more ACKs or utACKs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ChainMetadataService will not clear out stale metadata
4 participants