Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blockstore: populate duplicate shred proofs for merkle root conflicts #34270

Merged
merged 8 commits into from
Jan 3, 2024

Conversation

AshwinSekar
Copy link
Contributor

Problem

We only construct duplicate proofs for conflicting shreds, LAST_SHRED_IN_SLOT, and erasure meta conflicts.

Summary of Changes

When receiving a new shred check to see if there is a merkle root conflict from an existing merkle root meta. If there is such a conflict, construct a duplicate proof and notify gossip and the duplicate block state machine.

Split from #33889
Fixes #33644

Copy link

codecov bot commented Nov 29, 2023

Codecov Report

Attention: 17 lines in your changes are missing coverage. Please review.

Comparison is base (59dd007) 81.8% compared to head (0c1f848) 81.8%.

Additional details and impacted files
@@           Coverage Diff            @@
##           master   #34270    +/-   ##
========================================
  Coverage    81.8%    81.8%            
========================================
  Files         824      824            
  Lines      222261   222327    +66     
========================================
+ Hits       181887   182009   +122     
+ Misses      40374    40318    -56     

@AshwinSekar AshwinSekar self-assigned this Nov 29, 2023
@AshwinSekar AshwinSekar force-pushed the merkle-root-dup-proof branch from f1d368d to 58802c3 Compare November 30, 2023 22:38
@AshwinSekar AshwinSekar marked this pull request as ready for review November 30, 2023 22:39
@AshwinSekar
Copy link
Contributor Author

Only last commit is relevant, builds on top of #34268 and #34269

@AshwinSekar AshwinSekar force-pushed the merkle-root-dup-proof branch from 58802c3 to e9a174a Compare November 30, 2023 23:01
@AshwinSekar
Copy link
Contributor Author

I have removed the refactor portion of this pr. This can be merged independently of #34268 and #34269

ledger/src/blockstore.rs Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
@AshwinSekar AshwinSekar force-pushed the merkle-root-dup-proof branch from e9a174a to 5b986db Compare December 1, 2023 02:40
@AshwinSekar AshwinSekar requested a review from t-nelson December 1, 2023 03:19
@AshwinSekar AshwinSekar force-pushed the merkle-root-dup-proof branch from 82ac70c to c0e3f85 Compare December 1, 2023 03:38
wen-coding
wen-coding previously approved these changes Dec 1, 2023
if let Some(shred) = just_inserted_shreds.get(&key) {
Cow::Borrowed(shred.payload())
} else {
} else if shred_type == ShredType::Data {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a match expression here?
This code is not safe if ShredType enum is extended with a new variant. (unlikely that it happens but no reason to leave this foot gun here).

self.get_coding_shred(slot, u64::from(index))
.unwrap()
.unwrap_or_else(|| {
panic!("{} {} {:?} must be present!", slot, index, shred_type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What ensures that the shred is present here?
This is a pretty risky method that internally relies on certain presumptions where the caller may not be aware of.
I do see the logic was already present before this commit, but the commit is extending the uses of this method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing. This method is used for cases where a meta structure has already informed you of the shreds existence, where it is preferable to panic if there is an inconsistency.

However if you prefer I can return an option, and have the callers error! and continue instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think returning an Option is better here. Maybe in a separate pr first if that is changing too much of existing code.
this function as it stands is a pretty risky code.

Comment on lines 1531 to 1533
slot: Slot,
index: u64,
index: u32,
shred_type: ShredType,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just take a ShredId instead of these 3 arguments?

@AshwinSekar AshwinSekar force-pushed the merkle-root-dup-proof branch 2 times, most recently from 169f8ce to 38a4e8d Compare December 6, 2023 17:28
@AshwinSekar AshwinSekar marked this pull request as draft December 7, 2023 18:08
@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Dec 22, 2023
@AshwinSekar AshwinSekar force-pushed the merkle-root-dup-proof branch 2 times, most recently from 4e73705 to 0c05052 Compare December 23, 2023 00:23
@AshwinSekar AshwinSekar removed the stale [bot only] Added to stale content; results in auto-close after a week. label Dec 23, 2023
@AshwinSekar AshwinSekar force-pushed the merkle-root-dup-proof branch from 0c05052 to ffc9039 Compare December 23, 2023 00:26
@AshwinSekar AshwinSekar added the feature-gate Pull Request adds or modifies a runtime feature gate label Dec 23, 2023
@AshwinSekar AshwinSekar marked this pull request as ready for review December 23, 2023 03:17
Comment on lines 167 to 168
let send_merkle_root_conflicts = cluster_nodes::check_feature_activation(
&feature_set::merkle_conflict_duplicate_proofs::id(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is using two different names to refer to the same thing,

send_merkle_root_conflicts
merkle_conflict_duplicate_proofs

Can we use the same name for the feature and the flag, so that it is easier to find all the instances?

Comment on lines +183 to +185
// Although this proof can be immediately stored on detection, we wait until
// here in order to check the feature flag, as storage in blockstore can
// preclude the detection of other duplicate proofs in this slot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why blockstore doesn't check the feature flag before sending this?

// A previous shred has been inserted in this batch or in blockstore
// Compare our current shred against the previous shred for potential
// conflicts
if !self.perform_merkle_check(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perform_merkle_check is a pretty confusing name and error prone.
Like by no means it is clear if false is good or bad.

self.get_coding_shred(slot, u64::from(index))
.unwrap()
.unwrap_or_else(|| {
panic!("{} {} {:?} must be present!", slot, index, shred_type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think returning an Option is better here. Maybe in a separate pr first if that is changing too much of existing code.
this function as it stands is a pretty risky code.

Comment on lines 1596 to 1601
/// Returns true if there is no merkle root conflict between
/// the existing `merkle_root_meta` and `shred`
///
/// Otherwise return false and if not already present, add duplicate proof to
/// `duplicate_shreds`.
fn perform_merkle_check(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a better name to be more clear what true or false means at the call site without having to look up this comment. otherwise the usage will be very error prone.

Comment on lines 1622 to 1624
merkle_root_meta.merkle_root(),
merkle_root_meta.first_received_shred_index(),
merkle_root_meta.first_received_shred_type(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just {:?} the merkle_root_meta instead of these 3 lines?

shred: &Shred,
duplicate_shreds: &mut Vec<PossibleDuplicateShred>,
) -> bool {
let new_merkle_root = shred.merkle_root().ok();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we check for the feature_set::merkle_conflict_duplicate_proofs feature activation here in this function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also considered this as an option - since this is code is temporary until the feature flag is turned on I went with this approach to keep the PR small.

There are 183 calls (mostly all testing code) for blockstore.insert_shreds that we would need to be altered and then removed on feature flag cleanup.

I figured it would be easiest to check the feature flag in window_service and once the feature flag has been activated move the blockstore.store_duplicate_slot from window_service to blockstore.insert_shreds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine, if there is no correctness concern.
e.g. when blockstore returns a duplicate proof it will stop processing there and ignores other variants of duplicate proofs.
But if window-service ignores that because the feature is not activated, then it is possible that the other variants of duplicates proofs were also present but not processed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we don't insert into blockstore until window service, other variants of duplicate proofs will not be precluded from being processed if the feature flag is not present.

additionally because they are separate enum variants, multiple signals can be sent to window service if a block violates multiple duplicate conditions.

@AshwinSekar AshwinSekar force-pushed the merkle-root-dup-proof branch from ffc9039 to bf6fea9 Compare January 3, 2024 03:04
@AshwinSekar AshwinSekar merged commit 1908841 into solana-labs:master Jan 3, 2024
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-gate Pull Request adds or modifies a runtime feature gate
Projects
Development

Successfully merging this pull request may close these issues.

Extend duplicate proofs for merkle shreds
3 participants