-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replay Stage start_leader() can use wrong parent fork() #3238
Conversation
core/src/poh_recorder.rs
Outdated
pub fn bank(&self) -> Option<Arc<Bank>> { | ||
self.working_bank.clone().map(|w| w.bank) | ||
} | ||
pub fn tick_height(&self) -> u64 { | ||
self.poh.tick_height | ||
} | ||
pub fn ticks_per_slot(&self) -> u64 { | ||
self.ticks_per_slot | ||
} | ||
// synchronize PoH with a bank | ||
pub fn reset(&mut self, tick_height: u64, blockhash: Hash) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about passing both slot and tick_height to this function instead? might obviate the need for storing ticks_per_slot in poh_recorder
…ly the biggest indexed bank in frozen_slots()
core/src/replay_stage.rs
Outdated
@@ -179,58 +189,55 @@ impl ReplayStage { | |||
bank_forks: &Arc<RwLock<BankForks>>, | |||
poh_recorder: &Arc<Mutex<PohRecorder>>, | |||
cluster_info: &Arc<RwLock<ClusterInfo>>, | |||
blocktree: &Blocktree, | |||
ticks_per_slot: u64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like you need to calculate a slot, why not do this above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
donezo
Codecov Report
@@ Coverage Diff @@
## master #3238 +/- ##
========================================
- Coverage 81.5% 81.4% -0.1%
========================================
Files 131 131
Lines 19920 19954 +34
========================================
+ Hits 16238 16259 +21
- Misses 3682 3695 +13 |
…3238) * Make sure start_leader starts on the last voted block, not necessarily the biggest indexed bank in frozen_slots() * Fix tvu test
* Make sure start_leader starts on the last voted block, not necessarily the biggest indexed bank in frozen_slots() * Fix tvu test
* build(deps): bump anyhow from 1.0.89 to 1.0.90 Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.89 to 1.0.90. - [Release notes](https://github.com/dtolnay/anyhow/releases) - [Commits](dtolnay/anyhow@1.0.89...1.0.90) --- updated-dependencies: - dependency-name: anyhow dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * Update all Cargo files --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Problem
Currently in start_leader(), the "parent" bank of any newly started TPU is the "greatest" bank in the set of all frozen banks.
However, in our current code, we can vote for slots out of order:
For instance, we could vote for slot 21, and then later vote for slot 19 on a different fork. Then
when we start up a TPU later, we assume slot 21 is the parent, even though the PohRecorder started from slot 19
Summary of Changes
Derive the slot instead based on the first tick stored in the PohRecorder cache
Fixes #