feat: improve first sync #154

Merged
merged 4 commits into from
Jun 25, 2024

@anxolin anxolin commented Jun 18, 2024

Description

This PR is an attempt to improve the first sync for watch-tower.

Context: Before this PR

The issue with the current model is that:

  1. It enters SYNC mode
  2. In SYNC mode, it checks all pending blocks, from the last processed block up to the tip of the blockchain
  • Instead of actually processing the blocks, it just "saves" each block for later processing (it creates a sync plan)
  3. Once in SYNC, it applies the plan
  • This time it processes the blocks for real

The issue we experience arises because there's a delay between steps 2 and 3, during which the "blocks processed" metric goes down. That's our main metric.

The solution is to not make a plan, and to process blocks as we find them instead of waiting until we are fully caught up.
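To illustrate the difference, here is a minimal sketch comparing the two models. All names (`syncWithPlan`, `syncIncrementally`, the block numbers) are hypothetical and not taken from the watch-tower code, and the real processing is asynchronous. Each function returns the value of a "blocks processed" counter after every step, so the gap in the plan-based model is visible.

```typescript
// Hypothetical sketch: illustrative names only, not the watch-tower implementation.
type BlockNumber = number;

// Old model: record every pending block first, then process the whole plan.
function syncWithPlan(
  pendingBlocks: BlockNumber[],
  handleBlock: (block: BlockNumber) => void,
): number[] {
  const counterTimeline: number[] = [];
  let blocksProcessed = 0;
  const plan: BlockNumber[] = [];

  for (const block of pendingBlocks) {
    plan.push(block); // only saved for later
    counterTimeline.push(blocksProcessed); // the metric does not move
  }
  for (const block of plan) {
    handleBlock(block);
    blocksProcessed += 1;
    counterTimeline.push(blocksProcessed);
  }
  return counterTimeline;
}

// New model: process each block as soon as it is found.
function syncIncrementally(
  pendingBlocks: BlockNumber[],
  handleBlock: (block: BlockNumber) => void,
): number[] {
  const counterTimeline: number[] = [];
  let blocksProcessed = 0;

  for (const block of pendingBlocks) {
    handleBlock(block);
    blocksProcessed += 1;
    counterTimeline.push(blocksProcessed);
  }
  return counterTimeline;
}

const pendingDemoBlocks: BlockNumber[] = [101, 102, 103];
const ignoreBlock = (_block: BlockNumber): void => {};

const planTimeline = syncWithPlan(pendingDemoBlocks, ignoreBlock);
const incrementalTimeline = syncIncrementally(pendingDemoBlocks, ignoreBlock);
console.log(planTimeline); // [0, 0, 0, 1, 2, 3]
console.log(incrementalTimeline); // [1, 2, 3]
```

In the plan-based timeline the counter stays flat while the plan is being built, which is the window in which an alert on "blocks processed" could fire.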

Proposed solution

I just refactored the 2 points where we do the processing of blocks (applying the plan, and when watching for new blocks).

Now I use an auxiliary function that processes the block, updates the metrics, and persists the progress in the database.
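A rough sketch of what such a helper could look like follows. The interfaces `SyncMetrics` and `SyncStore`, the function name `processBlockAndPersist`, and the in-memory objects are assumptions made for illustration; the actual code in `src/services/chain.ts` may be shaped differently and is asynchronous.

```typescript
// Hypothetical types: the real watch-tower interfaces may differ.
interface SyncMetrics {
  blocksProcessed: number;
}
interface SyncStore {
  lastProcessedBlock: number | null;
}

// Single entry point shared by the catch-up path and the new-block watcher.
function processBlockAndPersist(
  blockNumber: number,
  handleBlock: (blockNumber: number) => void,
  metrics: SyncMetrics,
  store: SyncStore,
): void {
  handleBlock(blockNumber); // 1. process the block (throws on failure)
  metrics.blocksProcessed += 1; // 2. update the metric
  store.lastProcessedBlock = blockNumber; // 3. persist the progress
}

const demoMetrics: SyncMetrics = { blocksProcessed: 0 };
const demoStore: SyncStore = { lastProcessedBlock: null };
const seenBlocks: number[] = [];
const recordBlock = (blockNumber: number): void => {
  seenBlocks.push(blockNumber);
};

// Catch-up path: consume the pending blocks one by one.
for (let block = 10; block <= 12; block++) {
  processBlockAndPersist(block, recordBlock, demoMetrics, demoStore);
}

// Watch path: a new block arrives and goes through the same helper.
processBlockAndPersist(13, recordBlock, demoMetrics, demoStore);
```

Because the metric and the stored block number are only updated after `handleBlock` returns, a failed block is neither counted nor recorded as done in this sketch.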

This should help in several ways:

  • We will improve the "blocks processed" metric. This metric is key for our alerts. We don't want to be notified if, during a restart, watch-tower needs some time to consume all pending blocks
  • Additionally, it should help with very big syncs, like first syncs or long downtimes. Before this PR, it was indexing all blocks before processing any of them. This approach should be better because, in case of a crash, the next run will resume the work where it left off.
  • Following from the item above, I suspect the memory issues could be related to applying a big plan when the pod restarts
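The resume-after-crash behaviour from the second bullet can be sketched as follows. This is only an illustration under stated assumptions: `ProgressStore` is an in-memory stand-in for the database, and `catchUp` and the simulated crash are hypothetical names, not part of watch-tower.

```typescript
// Hypothetical in-memory stand-in for the database.
interface ProgressStore {
  lastProcessedBlock: number;
}

// Process every block after the last persisted one, saving progress per block.
function catchUp(
  progress: ProgressStore,
  tip: number,
  handleBlock: (blockNumber: number) => void,
): void {
  for (let block = progress.lastProcessedBlock + 1; block <= tip; block++) {
    handleBlock(block);
    progress.lastProcessedBlock = block; // persisted after every block
  }
}

const progressStore: ProgressStore = { lastProcessedBlock: 100 };
const handledBlocks: number[] = [];
let crashOnce = true;

// Handler that fails the first time it sees block 103, simulating a crash.
const flakyHandler = (blockNumber: number): void => {
  if (blockNumber === 103 && crashOnce) {
    crashOnce = false;
    throw new Error("simulated crash");
  }
  handledBlocks.push(blockNumber);
};

try {
  catchUp(progressStore, 105, flakyHandler); // first run dies at block 103
} catch {
  // the process would exit here
}

const resumedFrom = progressStore.lastProcessedBlock + 1; // 103
catchUp(progressStore, 105, flakyHandler); // next run continues from block 103
console.log(resumedFrom, handledBlocks); // 103 [101, 102, 103, 104, 105]
```

No block is handled twice in this sketch, and the work done before the crash is not repeated on the next run.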

Test

I haven't tested this much, and it is a very sensitive change. I would love to get some feedback first, then I'd like to test it in staging.

I just did a minimal test, running it locally against Arbitrum:

[image]

The watch-tower reached a SYNC state and processed blocks as it found them:

[image]

@anxolin requested review from mfw78 and fleupold June 18, 2024 16:19
@anxolin force-pushed the improve-first-sync branch 2 times, most recently from 3e88069 to 4a52b71 June 18, 2024 16:28
@anxolin requested a review from a team June 19, 2024 08:02
@anxolin merged commit fd653a7 into main Jun 25, 2024
4 checks passed
@anxolin deleted the improve-first-sync branch June 25, 2024 16:01
@github-actions bot locked and limited conversation to collaborators Jun 25, 2024