-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle EL offline/unavailable in the Sync API #290
Handle EL offline/unavailable in the Sync API #290
Conversation
a98f51c
to
c5fd9de
Compare
There's a fairly long discussion in APIs channel on discord re. this ticket. I believe the preference would be to not return a 503, but instead have a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just PR comment now.
oh - can we add an entry in changes.md to indicate that the endpoint has been updated? Cheers! |
@rolfyone I added an entry in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Add compatibility with ethereum/beacon-APIs#290 to the beacon node. Behaviour when configured with multiple ELs is not specified; intention suggests to indicate whether all ELs are offline.
Add compatibility with ethereum/beacon-APIs#290 to the beacon node. Behaviour when configured with multiple ELs is not specified; intention suggests to indicate whether all ELs are offline.
Updates the validator client to track the new `el_offline` field in the `/eth/v1/node/syncing` response. ethereum/beacon-APIs#290 If `el_offline` is present, the BN will be treated same as if it was optimistically synced. Note that a BN could report that it is in sync, but may still have intermittent connection issues to the EL, making it impossible to `getPayload` from them since Capella.
Updates the validator client to track the new `el_offline` field in the `/eth/v1/node/syncing` response. ethereum/beacon-APIs#290 If `el_offline` is present, the BN will be treated same as if it was optimistically synced. Note that a BN could report that it is in sync, but may still have intermittent connection issues to the EL, making it impossible to `getPayload` from them since Capella.
is there a rationale for this field? ie what actually useful information does it bring, on top of |
@arnetheduck The rationale for us in Lighthouse is:
Credit to @paulhauner for the bomb block argument. |
I appreciate that Nimbus has decided that they feel it's safe to go ahead and optimistically import against the optimistic sync spec (I mean that genuinely, since the spec may be over-restrictive now the merge transition has passed). However I don't think that there's been enough evidence presented about the liveness-safety of that approach to say that everyone should be optimistically importing on errors and that the API should only recognize that approach. Deciding to optimistically import blocks with an offline/erroring EE is a non-trivial decision which, in my opinion, is neither objectively good or objectively bad. I see that giving CL clients the ability to differentiate between optimistic and offline allows for greater client diversity. |
Is this something we should consider raising an issue on? |
We could, although I think @paulhauner's argument implies that we shouldn't expect every impl to do as Nimbus does |
I'm asking for the rationale simply because the PR fails to include one: I can however see at least a few problems with it:
This seems unrelated to I can see the argument about liveness safety, but this seems to me more of an implementation detail in the BN more than anything else: it's something that should be solved at the optimistic sync spec / execution api spec level rather than leaking into the VC. The internal Nimbus implementation "logically" considers itself to be one of many EL:s connected (we support driving multiple EL:s with a single BN) and performs the checks that an execution client is mandated to perform according to the execution API specification in order to be allowed to return Long story short, because of these differences in implementations and the implied reliance on internal implementation details in how it should be implemented, it risks becoming an obstacle to diversity rather than a boon: we suddenly need to know how VC:s respond to it / what they do with the information in order to "balance" the implementation well. |
When the execution client if offline/unavailable, it can't be properly reported by the sync API response.
Added an
el_offline
field to theGetSyncingStatusResponse
.