-
Notifications
You must be signed in to change notification settings - Fork 781
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Add a flag for storing invalid blocks #4194
Conversation
sorry - deleted my earlier comment, it was for another issue 😅 |
@@ -1015,6 +1025,10 @@ impl<T: BeaconChainTypes> Worker<T> { | |||
); | |||
} | |||
}; | |||
|
|||
if result.is_err() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if storing the BlockError
as well would help with debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we don't need to store the block error because it's already logged above. Although it could be convenient.
My concern is that this PR currently counts EL timeouts as invalid blocks and will store the block SSZ unnecessarily. I guess this is OK if we don't mind the invalid block storage containing a few false positives, although it would hamper an effort to analyze every block in the directory. I suspect in practice we'll use the invalid block storage to look up specific blocks based on Invalid gossip beacon block
log messages or manual reports. In this case, it doesn't matter if there's some junk in the storage, apart from it being a waste of space.
Maybe we could just filter out the most common false positive (TimedOut
) and allow the rest? I think this would be quite pragmatic using a function like is_timeout
on ExecutionPayloadError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I figured out how to reply to this comment. The same comment appears a second time, but without the reply field 🤷
Perhaps writing out the error message is a nice happy medium about the timeout stuff mentioned in #4194 (comment). Super easy access to the error message might make those timeouts easier to ignore. I'll add it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great 🎉 I only have a minor question above.
@@ -1015,6 +1025,10 @@ impl<T: BeaconChainTypes> Worker<T> { | |||
); | |||
} | |||
}; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a trace!
just above this that tries to log the invalid block as SSZ. I think we may as well remove that (I don't think anybody likes running with trace logs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AgeManning IIRC you've used this in the past. Are you OK if we remove it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SSZ will still be accessible on the filesystem, but you'd have to run with the flag provided to get it.
@@ -1015,6 +1025,10 @@ impl<T: BeaconChainTypes> Worker<T> { | |||
); | |||
} | |||
}; | |||
|
|||
if result.is_err() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we don't need to store the block error because it's already logged above. Although it could be convenient.
My concern is that this PR currently counts EL timeouts as invalid blocks and will store the block SSZ unnecessarily. I guess this is OK if we don't mind the invalid block storage containing a few false positives, although it would hamper an effort to analyze every block in the directory. I suspect in practice we'll use the invalid block storage to look up specific blocks based on Invalid gossip beacon block
log messages or manual reports. In this case, it doesn't matter if there's some junk in the storage, apart from it being a waste of space.
Maybe we could just filter out the most common false positive (TimedOut
) and allow the rest? I think this would be quite pragmatic using a function like is_timeout
on ExecutionPayloadError
I can't seem to reply to your comment about timeouts @michaelsproul, but perhaps we may be interested in blocks that cause the EL to timeout? EDIT: I hit send accidentally. I can definitely see that accidental timeouts might be annoying, but perhaps it would be good for identifying "bomb blocks" in testnets (and mainnet). It's hard to say what's going to be most useful, I don't feel that strongly either way. |
@paulhauner Yeah good point, lets leave them in and if we find they're annoying we can revisit |
Alrighty @jimmygchen and/or @michaelsproul, we're now storing the errors in a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thanks!!
bors r+ |
## Issue Addressed NA ## Proposed Changes Adds a flag to store invalid blocks on disk for teh debugz. Only *some* invalid blocks are stored, those which: - Were received via gossip (rather than RPC, for instance) - This keeps things simple to start with and should capture most blocks. - Passed gossip verification - This reduces the ability for random people to fill up our disk. A proposer signature is required to write something to disk. ## Additional Info It's possible that we'll store blocks that aren't necessarily invalid, but we had an internal error during verification. Those blocks seem like they might be useful sometimes.
Pull request successfully merged into unstable. Build succeeded! The publicly hosted instance of bors-ng is deprecated and will go away soon. If you want to self-host your own instance, instructions are here. If you want to switch to GitHub's built-in merge queue, visit their help page.
|
## Issue Addressed NA ## Proposed Changes Adds a flag to store invalid blocks on disk for teh debugz. Only *some* invalid blocks are stored, those which: - Were received via gossip (rather than RPC, for instance) - This keeps things simple to start with and should capture most blocks. - Passed gossip verification - This reduces the ability for random people to fill up our disk. A proposer signature is required to write something to disk. ## Additional Info It's possible that we'll store blocks that aren't necessarily invalid, but we had an internal error during verification. Those blocks seem like they might be useful sometimes.
NA Adds a flag to store invalid blocks on disk for teh debugz. Only *some* invalid blocks are stored, those which: - Were received via gossip (rather than RPC, for instance) - This keeps things simple to start with and should capture most blocks. - Passed gossip verification - This reduces the ability for random people to fill up our disk. A proposer signature is required to write something to disk. It's possible that we'll store blocks that aren't necessarily invalid, but we had an internal error during verification. Those blocks seem like they might be useful sometimes.
NA Adds a flag to store invalid blocks on disk for teh debugz. Only *some* invalid blocks are stored, those which: - Were received via gossip (rather than RPC, for instance) - This keeps things simple to start with and should capture most blocks. - Passed gossip verification - This reduces the ability for random people to fill up our disk. A proposer signature is required to write something to disk. It's possible that we'll store blocks that aren't necessarily invalid, but we had an internal error during verification. Those blocks seem like they might be useful sometimes.
Issue Addressed
NA
Proposed Changes
Adds a flag to store invalid blocks on disk for teh debugz. Only some invalid blocks are stored, those which:
Additional Info
It's possible that we'll store blocks that aren't necessarily invalid, but we had an internal error during verification. Those blocks seem like they might be useful sometimes.