allow execution clients several seconds to construct blocks #4012
Conversation
The changes are mostly of a best-effort nature and might be okay for 22.8, given that validator performance is tied to it. I don't have the metrics, though, to see how frequently this change would actually improve the situation.
proc runProposalForkchoiceUpdated(self: ref ConsensusManager) {.async.} =
  withState(self.dag.headState):
    let
      nextSlot = state.data.slot + 1
Edge cases:
- During sync
- When the latest N slots were empty, e.g., we will propose wallSlot + 1, but dag.headState.slot < wallSlot
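To make the empty-slot edge case concrete, here is a minimal Python sketch (illustrative names, not the Nimbus API): deriving the target slot from the head state lags the wall clock whenever recent slots were empty or the node is syncing, while the slot we would actually propose in is always the one after the wall slot.

```python
def next_proposal_slot(wall_slot: int) -> int:
    # The slot a proposal would actually land in is wallSlot + 1,
    # regardless of how far dag.headState.slot lags behind.
    return wall_slot + 1

# With three empty slots, headState.slot + 1 diverges from the real target:
head_state_slot, wall_slot = 100, 103
assert head_state_slot + 1 == 101          # what the quoted code computes
assert next_proposal_slot(wall_slot) == 104  # the slot actually proposed in
```

This is only a sketch of the arithmetic behind the comment above, not a suggested fix.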
Wastes a bit of time on the EL locally, nothing else.
      finalizedBlockRoot: finalizedBlockRoot,
      feeRecipient: feeRecipient)
  except CatchableError as err:
    error "Engine API fork-choice update failed", err = err.msg
strictVerification assert
Not sure if ideal. Those are maybe best when verifying something truly Nimbus-internal, but this isn't. There are exogenous factors involved here, and strictVerification, though strict, shouldn't trigger except on actual Nimbus logic errors fixable solely in Nimbus code.
if lastFcU.headBlockRoot == latestHead and
    lastFcU.finalizedBlockRoot == latestFinalized and
    lastFcU.feeRecipient == feeRecipient:
  some bellatrix.PayloadID(lastFcU.payloadId)
How long does this payloadId stay valid? Considering we also have other fcU tasks scheduled concurrently that may select different heads.
A few seconds at least, within a slot. That's fine, though: if the head/finalized/fee recipient (i.e., the fork choice parameters, though it may be prudent to add the rest, e.g., timestamp) mismatch, it just won't be used.
223f949 adds timestamp, so the validity length per se isn't that crucial. It's always within a slot this way, and that's definitely fine.
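A Python sketch of the caching discipline being discussed (names are illustrative, not the Nimbus code): the cached payloadId is reused only when every fork-choice parameter, including the timestamp, matches exactly; on any mismatch it is simply not used.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LastFcu:
    head_block_root: bytes
    finalized_block_root: bytes
    fee_recipient: bytes
    timestamp: int
    payload_id: bytes

def cached_payload_id(last: Optional[LastFcu], head: bytes, finalized: bytes,
                      fee_recipient: bytes, timestamp: int) -> Optional[bytes]:
    # Reuse the payloadId only on an exact match of all fcU parameters;
    # any divergence (e.g., a concurrent fcU picked a different head)
    # falls through to requesting a fresh forkchoiceUpdated.
    if (last is not None and
            last.head_block_root == head and
            last.finalized_block_root == finalized and
            last.fee_recipient == fee_recipient and
            last.timestamp == timestamp):
        return last.payload_id
    return None
```

Since the timestamp is unique per slot, a stale entry from a previous slot can never match, which bounds the effective validity window to within one slot.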
proc getFeeRecipient(node: BeaconNode,
                     pubkey: ValidatorPubKey,
Looks correct, but it doesn't seem necessary to move this if slated for 22.8. It introduces quite some noise to this PR, also with the keymanagerHost moving.
Hm, yeah, it was one of the issues I wondered about. For now, I agree, it's better to leave it in place and reduce risk, though the duplicate logic should be removed eventually.
223f949 restores the validator_duties getFeeRecipient setup, except for adding a comment to remove it later.
# TODO after things stabilize with this, check for upcoming proposal and
# don't bother sending first fcU, but initially, keep both in place
asyncSpawn self.runProposalForkchoiceUpdated()
There is yet another forkChoiceUpdated in block_processor.nim (expectValidForkchoiceUpdated). The function here (updateHeadWithExecution) is only called when a block is processed with a payload that initially returned an inconclusive result (syncing/accepted). The other one is called on blocks with payloads that already have a valid verdict. It seems a bit counterintuitive to only do this if there was a recent syncing/accepted block but not if there were only valid blocks.
It has to wait until dag.updateHead runs, which happens after the bulk of the block_processor/consensus_manager logic, because that's when/where dag.headState is updated, which is necessary to get the RANDAO information from said state.
f8b4984 runs it when newPayload is VALID.
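A minimal Python sketch of the trigger logic after that change (function and status names are illustrative): the proposal fcU now fires both when newPayload is immediately VALID and on the initially inconclusive path that updateHeadWithExecution handles, addressing the asymmetry raised above.

```python
def should_run_proposal_fcu(new_payload_status: str) -> bool:
    # Fire on an immediately VALID payload as well as on payloads that
    # started out inconclusive (SYNCING / ACCEPTED) and were resolved later.
    # Only a definitively INVALID payload skips the proposal fcU.
    return new_payload_status in ("VALID", "SYNCING", "ACCEPTED")

assert should_run_proposal_fcu("VALID")
assert should_run_proposal_fcu("SYNCING")
assert not should_run_proposal_fcu("INVALID")
```

In either case it still has to run after dag.updateHead, since the RANDAO mix is read from the updated head state.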
let proposer = dag.getProposer(dag.head, slot + 1)
if proposer.isNone():
  return Opt.none((ValidatorIndex, ValidatorPubKey))
Opt.some((proposer.get, dag.validatorKey(proposer.get).get().toPubKey))
lgtm, dag.getProposer already validates that dag.validatorKey will return a some result.
    get_randao_mix(state.data, get_current_epoch(state.data)).data
  feeRecipient = self.getFeeRecipient(
    nextProposer, validatorIndex, nextSlot.epoch)
  headBlockRoot = self.dag.loadExecutionBlockRoot(self.dag.head)
As this is called from an async context, dag.head may already point to a new head by this time. Around the merge, this could be a non-execution block with headBlockRoot.isZero, which is not documented as an allowed value for forkChoiceUpdated (the runForkchoiceUpdated wrapper even asserts against that).
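A Python sketch of the guard being asked for (illustrative names): re-check the possibly-changed head's execution block root and skip the fcU entirely when it is zero, rather than passing a disallowed value or tripping an assert.

```python
from typing import Optional

ZERO_HASH = b"\x00" * 32

def proposal_fcu_head(head_exec_block_root: bytes) -> Optional[bytes]:
    # Around the merge, an async re-read of dag.head may yield a
    # non-execution block whose execution root is all zeroes; that is not
    # an allowed forkChoiceUpdated head, so skip instead of asserting.
    if head_exec_block_root == ZERO_HASH:
        return None
    return head_exec_block_root

assert proposal_fcu_head(ZERO_HASH) is None       # pre-merge head: skip fcU
assert proposal_fcu_head(b"\x01" * 32) is not None  # execution head: proceed
```

This only illustrates the zero-root check; the actual fix would live where headBlockRoot is read in runProposalForkchoiceUpdated.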
if lastFcU.headBlockRoot == latestHead and
    lastFcU.finalizedBlockRoot == latestFinalized and
    lastFcU.feeRecipient == feeRecipient:
If latestHead ends up as zero (due to no terminalBlockHash being reached / pre-merge), this may be triggered unintentionally (as lastFcU was never initialized) and resolve to some default(PayloadId), so it will run execution_engine.getPayload instead of using the pre-merge default(bellatrix.ExecutionPayload).
getExecutionPayload is only called in a merge context: nimbus-eth2/beacon_chain/validators/validator_duties.nim, lines 487 to 506 in e7f6aac:
elif slot.epoch < node.dag.cfg.BELLATRIX_FORK_EPOCH or
    not (
      is_merge_transition_complete(proposalState.bellatrixData.data) or
      ((not node.eth1Monitor.isNil) and
       node.eth1Monitor.terminalBlockHash.isSome)):
  # https://github.com/nim-lang/Nim/issues/19802
  (static(default(bellatrix.ExecutionPayload)))
else:
  let
    pubkey = node.dag.validatorKey(validator_index)
    maybeExecutionPayload = (await getExecutionPayload(
      node, proposalState,
      slot.epoch, validator_index,
      # TODO https://github.com/nim-lang/Nim/issues/19802
      if pubkey.isSome: pubkey.get.toPubKey else: default(ValidatorPubKey)))
  if maybeExecutionPayload.isNone:
    warn "Unable to get execution payload. Skipping block proposal",
      slot, validator_index
    return ForkedBlockResult.err("Unable to get execution payload")
  maybeExecutionPayload.get
223f949 (a) adds a timestamp check, which should never accidentally match, and (b) adds an explicit initialization check (which makes things ugly enough that I'd typically avoid it, but it's more risk-conservative).

That's the intent, yes. It's not an optimal design from scratch.
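To illustrate why the explicit initialization check matters (Python sketch, illustrative names): a zero-valued default cache entry would otherwise accidentally match a zero latestHead pre-merge and hand back a default PayloadId.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FcuCache:
    # Without this flag, the zeroed defaults below would match a zero
    # latestHead pre-merge and return default(PayloadId) unintentionally.
    initialized: bool = False
    head_block_root: bytes = b"\x00" * 32
    payload_id: bytes = b"\x00" * 8

def lookup(cache: FcuCache, head: bytes) -> Optional[bytes]:
    # A never-filled cache must not match anything, not even a zero head.
    if not cache.initialized or cache.head_block_root != head:
        return None
    return cache.payload_id

# Pre-merge: latestHead is zero and the cache was never written.
assert lookup(FcuCache(), b"\x00" * 32) is None
# Post-merge: an explicitly recorded entry still matches.
filled = FcuCache(True, b"\x01" * 32, b"\x02" * 8)
assert lookup(filled, b"\x01" * 32) == b"\x02" * 8
```

Together with the timestamp check, this closes the accidental-match path described earlier in the thread.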
…lization; check timestamp consistency
@@ -34,20 +47,34 @@ type
  # ----------------------------------------------------------------
  eth1Monitor*: Eth1Monitor

  # Allow determination of preferred fee recipient during proposals
  # ----------------------------------------------------------------
  dynamicFeeRecipientsStore: DynamicFeeRecipientsStore
Er, is this a second instance besides the one in the BeaconNode? This is not reasonable, because this object is configured through the REST API (see "/eth/v1/validator/prepare_beacon_proposer"). A separate instance would be useless unless hooked up to that API endpoint.
  self.keymanagerHost[].getSuggestedFeeRecipient(pubkey).valueOr:
    self.keymanagerHost[].defaultFeeRecipient
else:
  self.keymanagerHost[].defaultFeeRecipient
self.keymanagerHost is nil here (see the check above). That's why the original definition of this function needed to access the value from the BeaconNode config.
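A sketch of the fallback order implied by that observation (Python, illustrative names): when the keymanager host is nil, the per-validator suggestion is unavailable, so the node-level default from the BeaconNode config must be used instead of dereferencing a nil handle.

```python
from typing import Optional

def get_fee_recipient(keymanager_suggestions: Optional[dict],
                      pubkey: str,
                      config_default: str) -> str:
    # keymanagerHost can be nil: fall back to the BeaconNode config default
    # rather than dereferencing it.
    if keymanager_suggestions is None:
        return config_default
    # With a live keymanager, prefer the per-validator suggestion, still
    # falling back to the default when none is set for this pubkey.
    return keymanager_suggestions.get(pubkey, config_default)

assert get_fee_recipient(None, "0xkey", "0xdefault") == "0xdefault"
assert get_fee_recipient({"0xkey": "0xcustom"}, "0xkey", "0xdefault") == "0xcustom"
assert get_fee_recipient({}, "0xkey", "0xdefault") == "0xdefault"
```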
Addresses #3952 and #3953