Queued child chain blocks that got stuck might cause operator to invalidate the child chain #702
Comments
Investigation led to discovering that this is not an issue with the Watcher or the child chain implementation per se. To begin with, the Watcher validly signals the invalid block. This block spends from
|
Dropping take-aways from slack convo with @InoMurko @JBunCE :
Propositions:
But seriously, this is ugly, because whatever state we hold, if it's not out there on Ethereum it's getting stale every minute 😞
we have this thing called So the same rule applies here I guess. Gist of the rule is that such block must be shown so that it allows exit challenges to go through comfortably before they mature.
only its manifestation via Block submission cannot go through for many reasons. We fail loudly (and There are probably ones that aren't loud enough, but are also much less common, like a bad network congestion condition or child chain being censored, those are also much harder to detect 😞 |
Infura! |
hm, I missed this getting closed, but I think Infura doesn't solve this (see ":account_locked or other reasons for queued blocks to get stuck"). It could actually make it worse, since any Infura-related delay in pushing the submission through can put us in the same bad position. I'll take the liberty to reopen, rephrase the title to demonstrate the issue better and put some clarifications in the description |
OK, to clarify, :account_locked is not what could happen on Infura. What could happen is, for example, that saturation of the network would prevent us from submitting a block "in time". The other part, which makes things a bit more complicated, is the introduction of the secure submission layer. |
As a start, I'll increase observability for the queued up blocks. |
Once we have that, we will be able to observe the queue size by:
On Datadog, we can plot this and create alarms/monitors that get triggered if the queue_size starts rising and it's not drained fast enough. |
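The alerting condition described above (queue size rising and not being drained) can be sketched in a few lines. This is only an illustration in Python, not the monitor that was actually configured: the function name `queue_not_draining`, the `window` and `max_size` parameters, and the idea of feeding it periodic samples of the queue size are all assumptions made for the example.

```python
from typing import Sequence


def queue_not_draining(samples: Sequence[int], window: int = 3, max_size: int = 0) -> bool:
    """Hypothetical check: True when the queue looks stuck.

    `samples` are periodic readings of the number of queued blocks, oldest first.
    The queue is flagged when the last `window` readings are all above `max_size`
    and never decrease, i.e. nothing has been drained over that period.
    """
    if window < 2 or len(samples) < window:
        return False  # not enough data to judge a trend
    recent = samples[-window:]
    above_threshold = all(size > max_size for size in recent)
    never_decreases = all(later >= earlier for earlier, later in zip(recent, recent[1:]))
    return above_threshold and never_decreases


if __name__ == "__main__":
    print(queue_not_draining([0, 1, 2, 3]))  # rising queue
    print(queue_not_draining([3, 2, 1, 0]))  # queue being drained
```

A queue that stays at a constant non-zero size is flagged as well, since a single stuck block is exactly the case this issue is about.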
Closing in favour of #1629 which I'm currently working on |
tl;dr Whenever a formed and queued child chain block (as done by OMG.ChildChain.BlockQueue.Server) gets stuck on any level of the flow (see comments for possible scenarios) it can become stale. Such a stuck block, while valid at the moment of forming, can become invalid over time if, for example, an exit gets started, matures and is processed, resulting in a UTXO spent in the stuck block going missing.
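To make the failure mode concrete, here is a toy model in Python. It is a sketch under simplifying assumptions, not the child chain's actual data structures: `QueuedBlock`, `missing_inputs` and `process_exit` are hypothetical names, and the UTXO set is reduced to a plain set of identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedBlock:
    """Hypothetical stand-in for a formed block waiting to be submitted."""
    number: int
    spent_utxos: frozenset  # identifiers of UTXOs spent by the block's transactions


def missing_inputs(block: QueuedBlock, utxo_set: frozenset) -> frozenset:
    """Inputs the block spends that no longer exist in the given UTXO set."""
    return block.spent_utxos - utxo_set


def process_exit(utxo_set: frozenset, exited_utxo: str) -> frozenset:
    """A processed exit removes the exited UTXO from the set."""
    return utxo_set - {exited_utxo}


if __name__ == "__main__":
    utxos = frozenset({"utxo_a", "utxo_b"})
    block = QueuedBlock(number=1000, spent_utxos=frozenset({"utxo_a"}))

    # Valid at the moment of forming: every input exists.
    print(missing_inputs(block, utxos))

    # While the block is stuck, an exit of utxo_a matures and is processed.
    utxos_after_exit = process_exit(utxos, "utxo_a")

    # The same block now spends a missing UTXO, so submitting it would be invalid.
    print(missing_inputs(block, utxos_after_exit))
```

The block itself never changes; only the state it is validated against does, which is why the staleness grows with the time the block spends in the queue.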
Original synopsis (and original title) below:
Resyncing staging-v0.1 suddenly fails with :tx_execution
when using 112fb91