Improve mindreader/merger behavior when syncing a large chain from block 1 #81

Closed
sduchesneau opened this issue Jul 3, 2020 · 8 comments · Fixed by #109

@sduchesneau (Contributor)

Context

Currently, the mindreader creates "one-block files" and the merger collects them into "merged (100-block) files".

The reason for this two-step architecture is that, with a few mindreader processes running at HEAD, we get different versions of the same block numbers when microforks occur. We want to make sure that any single block (forked out or not) that has gone through our system (mindreader -> relayer -> ...) gets written to a merged blocks file so it can be retrieved afterwards.

This process is NOT MEANT for going over millions of blocks in an attempt to sync/bootstrap a large chain. It is meant to provide a stronger guarantee to clients accessing the service: any block served close to HEAD, in real time, can also be served later, even if it was forked out, so cursors stay valid. When the default settings are used to sync a large chain like eos-mainnet linearly, something is bound to fail (usually an out-of-memory error eventually crashes the mindreader's nodeos instance into a corrupted state).

This has been discussed in #26 and on multiple other occasions.

However, we keep seeing users try to sync the whole chain from the start directly (ex: #80). We talked to some of our users, and they prefer this approach over the operational steps of gathering nodeos snapshots and running them in parallel, even if it takes weeks of sync time (throwing "time" at the problem instead of operational complexity).

Proposed behavior

  1. Make the mindreader able to decide on its own to produce merged blocks files when it is "catching up" (the timestamp of the blocks being processed could be enough to determine that).
  2. Make the mindreader able to flush a "partially-merged" blocks file to disk and append to it on restart (see the sketch at the end of this comment).
  3. Add a dfuseeos sync command that launches only the syncing components:
  • mindreader with auto-merge mode
  • trxdb-loader
  • search-indexer

Note that search-indexer uses small shards (200 blocks) by default. This will be ridiculously inefficient if it covers 128 million blocks, yet search does not work well with large shards close to the head. However, search-archive does not support serving different shard sizes directly, so it cannot be run in the same single process afterwards. A solution will have to be proposed to the user.
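
A minimal sketch of what point 2's flush-and-resume could look like, assuming a simple JSON on-disk format (all types, names, and the file layout here are hypothetical illustrations, not the actual dfuse bundle format):

```go
package bundle

import (
	"encoding/json"
	"os"
)

// block and partialBundle are hypothetical stand-ins for the real types.
type block struct {
	Num  uint64          `json:"num"`
	Data json.RawMessage `json:"data"`
}

type partialBundle struct {
	BaseNum uint64  `json:"base_num"` // first block of the 100-block bundle
	Blocks  []block `json:"blocks"`   // blocks accumulated so far
}

// loadPartial reads a previously flushed partial bundle, if any, so the
// mindreader can append to it after a restart instead of starting over.
func loadPartial(path string) (*partialBundle, error) {
	data, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return nil, nil // no partial bundle on disk, start fresh
	}
	if err != nil {
		return nil, err
	}
	var b partialBundle
	if err := json.Unmarshal(data, &b); err != nil {
		return nil, err
	}
	return &b, nil
}

// flushPartial writes the in-progress bundle so a restart can resume it.
func flushPartial(path string, b *partialBundle) error {
	data, err := json.Marshal(b)
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}
```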

@abourget (Contributor) commented Jul 3, 2020

About point 3 of the proposed behavior:

I'd prefer we do not push processing phases, additional config files, and templates onto users, forcing them to understand ALL the complexities when they are barely getting started.

I'd prefer that the thing just WORKS, and that the default behavior is you have to WAIT.

We can have the programs self-adjust when they know their status, wait before starting, etc., based on catch-up mode.

--

Regarding huge indexes, the search-archive default could be a moving window of 1 week or so; search-archive, knowing it's not near live, would just do nothing.

Eventually, you could start from a snapshot and get a partial sync... but we'll slowly get there.

@matthewdarwin

Point (1) on its own, without any of the other features, is very useful.

@sduchesneau (Contributor, Author)

The flag could look like this: mergeBlockFiles=[always, auto, none].
Also add a reference to a "blockmeta" service that may (or may not) be reachable to tell us the LIB.

The state is "catching up" if:

  1. the blocks are at least 5 minutes late, and the destination merged file already exists
  2. the blocks are at least 5 minutes late, and a connection to a "blockmeta" service tells us that we are processing blocks that are past the LIB
  3. the blocks are at least 12 hours late
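
A minimal Go sketch of these three conditions under the proposed mergeBlockFiles=auto mode (the struct, its fields, and the function are made up for illustration; none of this is an actual dfuse API):

```go
package mindreader

import "time"

// mergeHints gathers the hypothetical inputs needed for the decision.
type mergeHints struct {
	blockTime        time.Time // timestamp of the block being processed
	blockNum         uint64    // number of the block being processed
	mergedFileExists bool      // destination merged file already in storage
	haveBlockmeta    bool      // a blockmeta service was reachable
	libNum           uint64    // LIB reported by blockmeta (if reachable)
}

// catchingUp mirrors the three conditions listed above.
func catchingUp(h mergeHints) bool {
	age := time.Since(h.blockTime)
	switch {
	case age > 5*time.Minute && h.mergedFileExists: // condition 1
		return true
	case age > 5*time.Minute && h.haveBlockmeta && h.blockNum < h.libNum: // condition 2
		return true
	case age > 12*time.Hour: // condition 3
		return true
	}
	return false
}
```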

@sduchesneau (Contributor, Author)

I changed the requirements a bit for simplicity.
The state is "catching up" if:

  1. the destination merged-block file exists (ex: you have a few mindreaders and a merger, and one mindreader crashed and restarted from a snapshot...)
  2. the blocks being processed are at least [x hours] old (12h by default; it is risky to cover "live" blocks like this, because you could end up with the wrong chain if there is a huge reorg and the LIB does not advance on an EOS chain)

sduchesneau self-assigned this on Aug 3, 2020
@sduchesneau (Contributor, Author)

Modified the requirements again; the "destination merged-block exists" condition was too risky close to the head.

So now, the mindreader will produce merged block files automatically if either of these is true when it starts:

  1. the blocks being processed are at least [x hours] old (12h by default; it is risky to cover "live" blocks like this, because you could end up with the wrong chain if there is a huge reorg and the LIB does not advance on an EOS chain)
  2. the mindreader can connect to "blockmeta" and determine the LIB, and the current block num + 100 is behind the LIB (so there is no risk of a fork)

When both conditions become false (when close to HEAD), it will stop producing merged blocks and go back to producing one-block files, being careful to change its operation mode only on boundary blocks (modulo 100).
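
A minimal sketch of this revised startup decision and of the boundary-block switch (function names, signatures, and thresholds are illustrative only, taken from the comment above rather than from the actual mindreader code):

```go
package mindreader

import "time"

// shouldMergeAtStartup returns true if either revised condition holds:
// the blocks are old enough, or we are safely behind the LIB per blockmeta.
func shouldMergeAtStartup(blockTime time.Time, blockNum, libNum uint64) bool {
	if time.Since(blockTime) > 12*time.Hour { // condition 1
		return true
	}
	if blockNum+100 < libNum { // condition 2: well behind the LIB, no fork risk
		return true
	}
	return false
}

// mergingAfter only allows flipping between merged-file mode and
// one-block-file mode on a bundle boundary (block num modulo 100),
// so no merged file is ever left half-covered.
func mergingAfter(currentlyMerging bool, blockNum uint64, stillCatchingUp bool) bool {
	if blockNum%100 != 0 {
		return currentlyMerging // never switch mid-bundle
	}
	return stillCatchingUp
}
```

Gating the mode change on modulo-100 boundaries means every merged file is fully produced by exactly one writer, which is what makes the hand-off to one-block files safe.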

@sduchesneau (Contributor, Author)

Note that if the merger is waiting to produce a merged blocks file [200, 201, ...] and it sees that file appear in the merged-blocks storage, it skips to the next one, so it should operate well with the new mindreader behavior.

This removes the need for: a) a "catchup phase" with a specific mindreader flag, and b) going through hoops to join the batch-produced blocks with the ones produced by the merger.
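
A minimal sketch of that merger-side skip (the FileExists check and the zero-padded file naming are assumptions about the storage layout, not confirmed dfuse internals):

```go
package merger

import "fmt"

// mergedStore is a stand-in for the merged-blocks destination storage.
type mergedStore interface {
	FileExists(name string) bool
}

// nextBundleBase returns the base block of the next bundle the merger
// should assemble, skipping any bundle whose merged file already exists
// (e.g. one uploaded by a catching-up mindreader).
func nextBundleBase(store mergedStore, base uint64) uint64 {
	for store.FileExists(fmt.Sprintf("%010d", base)) {
		base += 100 // this range was already merged by someone else; skip it
	}
	return base
}
```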

@abourget (Contributor) commented Aug 5, 2020

When it skips, are there no chances of missing forks? How does it know what was in that file in order to skip to the right place? What if the merger had a 99b unknown to others?

sduchesneau added a commit that referenced this issue on Aug 5, 2020: "Mindreader auto-merging files and continue from partial merged"
@sduchesneau (Contributor, Author)

> When it skips, are there no chances of missing forks? How does it know what was in that file in order to skip to the right place? What if the merger had a 99b unknown to others?

It only skips when the merger cannot find enough one-block files to complete the merge before a mindreader uploads the merged file in its place. A mindreader will only upload that merged file if it is "catching up" (ex: restarted with >100 blocks behind the LIB, or with those blocks older than 12h...).

So unless you stop the merger for a few minutes, then restart a mindreader from an earlier point in time, let it catch up, and then start the merger again, you won't lose any fork. Even in that particular chain of events, you would only lose a block that got forked out, because the correct fork will be in the merged file.

The risk AND the impact are minimal, and you still get the blocks from the valid fork.
