Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

send duplicate shred proofs for conflicting shred scenarios #32965

Conversation

AshwinSekar
Copy link
Contributor

@AshwinSekar AshwinSekar commented Aug 23, 2023

These are multiple last_shred_in_slot shreds and
coding shreds with conflicting erasure metas.

Problem

These scenarios are detected and stored in blockstore, but never sent to gossip or the state machine.

Summary of Changes

Fix the plumbing so these are sent to gossip and the state machine.

Split from #32862

@AshwinSekar AshwinSekar force-pushed the duplicate-shred-proof-last-slot-erasure-meta branch from 5c90b2c to 3812210 Compare August 24, 2023 00:33
@codecov
Copy link

codecov bot commented Aug 24, 2023

Codecov Report

Merging #32965 (3812210) into master (79c02c8) will increase coverage by 0.0%.
Report is 8 commits behind head on master.
The diff coverage is 89.9%.

❗ Current head 3812210 differs from pull request most recent head 5907204. Consider uploading reports for the commit 5907204 to get more accurate results

@@           Coverage Diff            @@
##           master   #32965    +/-   ##
========================================
  Coverage    81.9%    82.0%            
========================================
  Files         785      784     -1     
  Lines      213295   212931   -364     
========================================
- Hits       174794   174704    -90     
+ Misses      38501    38227   -274     

bw-solana
bw-solana previously approved these changes Aug 25, 2023
Copy link
Contributor

@bw-solana bw-solana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AshwinSekar
Copy link
Contributor Author

Will wait for behzad before merging.

Comment on lines 134 to 141
if shred1.index() != shred2.index() && shred1.shred_type() == ShredType::Code {
if shred1.fec_set_index() != shred2.fec_set_index() {
return Err(Error::InvalidErasureMetaConflict);
}
let erasure_meta =
ErasureMeta::from_coding_shred(shred1).expect("Shred1 should be a coding shred");
if erasure_meta.check_coding_shred(shred2) {
return Err(Error::InvalidErasureMetaConflict);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not technically exhaustive.
Any 2 shreds with different but overlapping erasure sets can be considered duplicate.
In particular doesn't even need to have the same fec_set_index.

Technically the overlapping erasure sets might result in non-duplicate shreds, but a leader shouldn't generate overlapping erasure sets unless malicious or buggy.

We don't currently identify those cases because we just lookup by the erasure set:
https://github.com/solana-labs/solana/blob/5de27d9a0/ledger/src/blockstore.rs#L1197-L1206
but I think that has to be fixed at some point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know, right now i've only replicated the cases that we catch in blockstore.
I created #33037, and will extend the checks to catch these types of conflicts in a follow up pr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put a comment reference to #33037 here?

Comment on lines 32 to 33
// Shred index of the first shred in the proof
pub(crate) shred_index: u32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a use to this any more?
Can we just change this to _unused?

Comment on lines 88 to 91
#[error("invalid last index conflict")]
InvalidLastIndexConflict,
#[error("invalid erasure meta conflict")]
InvalidErasureMetaConflict,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you keep this enum sorted?

Comment on lines 127 to 128
(ix1, true, ix2, false) if ix1 > ix2 => return Err(Error::InvalidLastIndexConflict),
(ix1, false, ix2, true) if ix1 < ix2 => return Err(Error::InvalidLastIndexConflict),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if both are last_in_slot but ix1 != ix2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will get caught by the _ => () case as this is a valid last index conflict proof.

Copy link
Contributor

@behzadnouri behzadnouri Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part (and now the function as a whole) has become too confusing.
We want to check if the shreds indeed indicate a duplicate slot.
So we need to check that the shreds have the same slot and type and then either of these conditions hold:

  • same index, different payload.
  • one has last_in_slot but the other has higher index.
  • the erasure configs are conflicting.

It is more readable to check if these conditions hold than check if one of them does not hold.

@AshwinSekar AshwinSekar force-pushed the duplicate-shred-proof-last-slot-erasure-meta branch 2 times, most recently from cfa2d69 to f45864c Compare August 28, 2023 22:05
Comment on lines 32 to 35

#[allow(dead_code)]
shred_index: u32,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just rename this to _unused.
You will need to update frozen-abi digest in a couple of places but that is fine.
Also please don't add unnecessary blank lines:
https://google.github.io/styleguide/cppguide.html#Vertical_Whitespace

return Err(Error::ShredTypeMismatch);
}

if shred1.index() == shred2.index() && shred1.payload() == shred2.payload() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If shred1.index() != shred2.index() then the payload cannot be the same.
The index is encoded in the payload.
So the first check is unnecessary.

Comment on lines 134 to 141
if shred1.index() != shred2.index() && shred1.shred_type() == ShredType::Code {
if shred1.fec_set_index() != shred2.fec_set_index() {
return Err(Error::InvalidErasureMetaConflict);
}
let erasure_meta =
ErasureMeta::from_coding_shred(shred1).expect("Shred1 should be a coding shred");
if erasure_meta.check_coding_shred(shred2) {
return Err(Error::InvalidErasureMetaConflict);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put a comment reference to #33037 here?

Comment on lines 127 to 128
(ix1, true, ix2, false) if ix1 > ix2 => return Err(Error::InvalidLastIndexConflict),
(ix1, false, ix2, true) if ix1 < ix2 => return Err(Error::InvalidLastIndexConflict),
Copy link
Contributor

@behzadnouri behzadnouri Aug 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part (and now the function as a whole) has become too confusing.
We want to check if the shreds indeed indicate a duplicate slot.
So we need to check that the shreds have the same slot and type and then either of these conditions hold:

  • same index, different payload.
  • one has last_in_slot but the other has higher index.
  • the erasure configs are conflicting.

It is more readable to check if these conditions hold than check if one of them does not hold.

));
assert!(blockstore.has_duplicate_shreds_in_slot(0));
assert_eq!(duplicate_shreds.len(), 1);
matches!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert_matches

matches by itself does not do any assertions here.

));

assert_eq!(duplicate_shreds.len(), 1);
matches!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@@ -344,7 +344,7 @@ impl ErasureMeta {

// Returns true if the erasure fields on the shred
// are consistent with the erasure-meta.
pub(crate) fn check_coding_shred(&self, shred: &Shred) -> bool {
pub fn check_coding_shred(&self, shred: &Shred) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To minimize the public api surface, instead of making these 2 functions public, can you add a method which takes two shreds and returns false if their erasure configs mismatch.
That function can call these 2 functions here internally.

Comment on lines 153 to 155
if !blockstore.has_duplicate_shreds_in_slot(shred_slot) {
if let Some(existing_shred_payload) = blockstore.is_shred_duplicate(&shred) {
blockstore.store_duplicate_slot(
shred_slot,
existing_shred_payload.clone(),
shred.clone().into_payload(),
)?;
(shred, existing_shred_payload)
} else {
// Shred is not duplicate
return Ok(());
}
} else {
// Shred has already been handled
return Ok(());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be more readable if you avoid nested branches:

if blockstore.has_duplicate_shreds_in_slot(shred_slot) {
    return Ok(()); // A duplicate is already recorded.
}
let Some(existing_shred_payload) = blockstore.is_shred_duplicate(&shred) else {
    return Ok(()); // Not a duplicate.
}
blockstore.store_duplicate_slot(
    shred_slot,
    existing_shred_payload.clone(),
    shred.clone().into_payload(),
)?;
(shred, existing_shred_payload)

@AshwinSekar AshwinSekar force-pushed the duplicate-shred-proof-last-slot-erasure-meta branch from f45864c to 8ee8631 Compare August 30, 2023 18:54
behzadnouri
behzadnouri previously approved these changes Aug 31, 2023
Copy link
Contributor

@behzadnouri behzadnouri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, aside from that extraneous #[allow(dead_code)].

@@ -29,7 +29,8 @@ pub struct DuplicateShred {
pub(crate) from: Pubkey,
pub(crate) wallclock: u64,
pub(crate) slot: Slot,
shred_index: u32,
#[allow(dead_code)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you start the field name with underscore, don't need #[allow(dead_code)] anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the review! I've removed the #[allow(dead_code)]

@AshwinSekar AshwinSekar added the automerge Merge this Pull Request automatically once CI passes label Aug 31, 2023
@mergify mergify bot removed the automerge Merge this Pull Request automatically once CI passes label Aug 31, 2023
@mergify
Copy link
Contributor

mergify bot commented Aug 31, 2023

automerge label removed due to a CI failure

@AshwinSekar AshwinSekar merged commit 6d2dae6 into solana-labs:master Aug 31, 2023
@AshwinSekar AshwinSekar self-assigned this Nov 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

3 participants