-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
send duplicate shred proofs for conflicting shred scenarios #32965
send duplicate shred proofs for conflicting shred scenarios #32965
Conversation
5c90b2c
to
3812210
Compare
Codecov Report
@@ Coverage Diff @@
## master #32965 +/- ##
========================================
Coverage 81.9% 82.0%
========================================
Files 785 784 -1
Lines 213295 212931 -364
========================================
- Hits 174794 174704 -90
+ Misses 38501 38227 -274 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Will wait for behzad before merging. |
gossip/src/duplicate_shred.rs
Outdated
if shred1.index() != shred2.index() && shred1.shred_type() == ShredType::Code { | ||
if shred1.fec_set_index() != shred2.fec_set_index() { | ||
return Err(Error::InvalidErasureMetaConflict); | ||
} | ||
let erasure_meta = | ||
ErasureMeta::from_coding_shred(shred1).expect("Shred1 should be a coding shred"); | ||
if erasure_meta.check_coding_shred(shred2) { | ||
return Err(Error::InvalidErasureMetaConflict); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not technically exhaustive.
Any 2 shreds with different but overlapping erasure sets can be considered duplicate.
In particular doesn't even need to have the same fec_set_index
.
Technically the overlapping erasure sets might result in non-duplicate shreds, but a leader shouldn't generate overlapping erasure sets unless malicious or buggy.
We don't currently identify those cases because we just lookup by the erasure set:
https://github.com/solana-labs/solana/blob/5de27d9a0/ledger/src/blockstore.rs#L1197-L1206
but I think that has to be fixed at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know, right now i've only replicated the cases that we catch in blockstore.
I created #33037, and will extend the checks to catch these types of conflicts in a follow up pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put a comment reference to #33037 here?
gossip/src/duplicate_shred.rs
Outdated
// Shred index of the first shred in the proof | ||
pub(crate) shred_index: u32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a use to this any more?
Can we just change this to _unused
?
gossip/src/duplicate_shred.rs
Outdated
#[error("invalid last index conflict")] | ||
InvalidLastIndexConflict, | ||
#[error("invalid erasure meta conflict")] | ||
InvalidErasureMetaConflict, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you keep this enum sorted?
gossip/src/duplicate_shred.rs
Outdated
(ix1, true, ix2, false) if ix1 > ix2 => return Err(Error::InvalidLastIndexConflict), | ||
(ix1, false, ix2, true) if ix1 < ix2 => return Err(Error::InvalidLastIndexConflict), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if both are last_in_slot
but ix1 != ix2
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will get caught by the _ => ()
case as this is a valid last index conflict proof.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part (and now the function as a whole) has become too confusing.
We want to check if the shreds indeed indicate a duplicate slot.
So we need to check that the shreds have the same slot and type and then either of these conditions hold:
- same index, different payload.
- one has
last_in_slot
but the other has higher index. - the erasure configs are conflicting.
It is more readable to check if these conditions hold than check if one of them does not hold.
cfa2d69
to
f45864c
Compare
gossip/src/duplicate_shred.rs
Outdated
|
||
#[allow(dead_code)] | ||
shred_index: u32, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just rename this to _unused
.
You will need to update frozen-abi digest in a couple of places but that is fine.
Also please don't add unnecessary blank lines:
https://google.github.io/styleguide/cppguide.html#Vertical_Whitespace
gossip/src/duplicate_shred.rs
Outdated
return Err(Error::ShredTypeMismatch); | ||
} | ||
|
||
if shred1.index() == shred2.index() && shred1.payload() == shred2.payload() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If shred1.index() != shred2.index()
then the payload cannot be the same.
The index is encoded in the payload.
So the first check is unnecessary.
gossip/src/duplicate_shred.rs
Outdated
if shred1.index() != shred2.index() && shred1.shred_type() == ShredType::Code { | ||
if shred1.fec_set_index() != shred2.fec_set_index() { | ||
return Err(Error::InvalidErasureMetaConflict); | ||
} | ||
let erasure_meta = | ||
ErasureMeta::from_coding_shred(shred1).expect("Shred1 should be a coding shred"); | ||
if erasure_meta.check_coding_shred(shred2) { | ||
return Err(Error::InvalidErasureMetaConflict); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put a comment reference to #33037 here?
gossip/src/duplicate_shred.rs
Outdated
(ix1, true, ix2, false) if ix1 > ix2 => return Err(Error::InvalidLastIndexConflict), | ||
(ix1, false, ix2, true) if ix1 < ix2 => return Err(Error::InvalidLastIndexConflict), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part (and now the function as a whole) has become too confusing.
We want to check if the shreds indeed indicate a duplicate slot.
So we need to check that the shreds have the same slot and type and then either of these conditions hold:
- same index, different payload.
- one has
last_in_slot
but the other has higher index. - the erasure configs are conflicting.
It is more readable to check if these conditions hold than check if one of them does not hold.
ledger/src/blockstore.rs
Outdated
)); | ||
assert!(blockstore.has_duplicate_shreds_in_slot(0)); | ||
assert_eq!(duplicate_shreds.len(), 1); | ||
matches!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_matches
matches
by itself does not do any assertions here.
ledger/src/blockstore.rs
Outdated
)); | ||
|
||
assert_eq!(duplicate_shreds.len(), 1); | ||
matches!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
ledger/src/blockstore_meta.rs
Outdated
@@ -344,7 +344,7 @@ impl ErasureMeta { | |||
|
|||
// Returns true if the erasure fields on the shred | |||
// are consistent with the erasure-meta. | |||
pub(crate) fn check_coding_shred(&self, shred: &Shred) -> bool { | |||
pub fn check_coding_shred(&self, shred: &Shred) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To minimize the public api surface, instead of making these 2 functions public, can you add a method which takes two shreds and returns false
if their erasure configs mismatch.
That function can call these 2 functions here internally.
core/src/window_service.rs
Outdated
if !blockstore.has_duplicate_shreds_in_slot(shred_slot) { | ||
if let Some(existing_shred_payload) = blockstore.is_shred_duplicate(&shred) { | ||
blockstore.store_duplicate_slot( | ||
shred_slot, | ||
existing_shred_payload.clone(), | ||
shred.clone().into_payload(), | ||
)?; | ||
(shred, existing_shred_payload) | ||
} else { | ||
// Shred is not duplicate | ||
return Ok(()); | ||
} | ||
} else { | ||
// Shred has already been handled | ||
return Ok(()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be more readable if you avoid nested branches:
if blockstore.has_duplicate_shreds_in_slot(shred_slot) {
return Ok(()); // A duplicate is already recorded.
}
let Some(existing_shred_payload) = blockstore.is_shred_duplicate(&shred) else {
return Ok(()); // Not a duplicate.
}
blockstore.store_duplicate_slot(
shred_slot,
existing_shred_payload.clone(),
shred.clone().into_payload(),
)?;
(shred, existing_shred_payload)
The scenarios are multiple last_shred_in_slot shreds and coding shreds with conflicting erasure metas.
f45864c
to
8ee8631
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, aside from that extraneous #[allow(dead_code)]
.
gossip/src/duplicate_shred.rs
Outdated
@@ -29,7 +29,8 @@ pub struct DuplicateShred { | |||
pub(crate) from: Pubkey, | |||
pub(crate) wallclock: u64, | |||
pub(crate) slot: Slot, | |||
shred_index: u32, | |||
#[allow(dead_code)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you start the field name with underscore, don't need #[allow(dead_code)]
anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the review! I've removed the #[allow(dead_code)]
automerge label removed due to a CI failure |
These are multiple last_shred_in_slot shreds and
coding shreds with conflicting erasure metas.
Problem
These scenarios are detected and stored in blockstore, but never sent to gossip or the state machine.
Summary of Changes
Fix the plumbing so these are sent to gossip and the state machine.
Split from #32862