eth/downloader: drop beacon head updates if the syncer is restarting #27397
Conversation
So if we have this flow of events piling up, it seems to me that a better solution would be to, even if the internals are busy, keep the best/latest update and discard the earlier ones. Once the api is done with whatever it was doing, it takes the latest update. How about something like that?
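For illustration, a minimal, self-contained Go sketch of that "keep only the latest update" idea; headUpdate, coalesceHeads and the channel wiring are hypothetical, not the actual skeleton syncer API:

```go
package main

import (
	"errors"
	"fmt"
)

// headUpdate is a stand-in for a beacon head announcement plus the reply
// channel back to the engine API caller (hypothetical, for illustration).
type headUpdate struct {
	number uint64
	errc   chan error
}

// coalesceHeads drains incoming head updates while the syncer is busy,
// keeping only the most recent one and rejecting the ones it supersedes,
// so engine API callers are never left blocked.
func coalesceHeads(in <-chan headUpdate, busyDone <-chan struct{}) *headUpdate {
	var latest *headUpdate
	for {
		select {
		case ev := <-in:
			if latest != nil {
				latest.errc <- errors.New("superseded by a newer head update")
			}
			latest = &ev
		case <-busyDone:
			return latest // freshest head, or nil if none arrived
		}
	}
}

func main() {
	in := make(chan headUpdate)
	busy := make(chan struct{})

	go func() {
		for i := uint64(1); i <= 3; i++ {
			in <- headUpdate{number: i, errc: make(chan error, 1)}
		}
		close(busy) // pretend the internal restart just finished
	}()
	if latest := coalesceHeads(in, busy); latest != nil {
		fmt.Println("resuming with head", latest.number)
	}
}
```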
Not sure it matters. If there's a header gap, you need to do a sync run afterwards. Now whether you keep the last header or drop all and wait for the next update is kind of similar, though I can see your version being a bit more stable. All in all I'm unhappy about both. That said, the moment the thing locks up, even if we block all goroutines somewhere, the order of insertion after that will be random, so it's gonna get wonky anyway. I guess a possibly better solution would be to have a queue of headers that arrived whilst we were offline and try to consume it when the syncer is restarted. That way we could preserve the inbound order at least and not lose any headers.
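A rough sketch of the "queue headers while offline, replay on restart" alternative mentioned above; offlineQueue and its methods are placeholders, not anything in eth/downloader, and the point is only that arrival order is preserved and nothing is dropped:

```go
package main

import (
	"fmt"
	"sync"
)

// header is a stand-in for the real downloader header type.
type header struct{ number uint64 }

// offlineQueue buffers headers that arrive while the syncer is restarting.
type offlineQueue struct {
	mu      sync.Mutex
	pending []header
}

// push appends a header in arrival order; callers return immediately
// instead of blocking on a busy syncer.
func (q *offlineQueue) push(h header) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.pending = append(q.pending, h)
}

// drain hands back the buffered headers in order and resets the queue;
// the restarted syncer would replay them one by one.
func (q *offlineQueue) drain() []header {
	q.mu.Lock()
	defer q.mu.Unlock()
	queued := q.pending
	q.pending = nil
	return queued
}

func main() {
	var q offlineQueue
	for _, n := range []uint64{101, 102, 103} {
		q.push(header{number: n})
	}
	for _, h := range q.drain() {
		fmt.Println("replaying header", h.number)
	}
}
```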
Ah, I think I can make this more elegant. I'll make the suspend run on its own goroutine instead of the syncer thread and have the syncer keep eating up events in between.
The change LGTM. Minor note though: it "looks" more complex than it actually is, in my humble opinion. For me at least, it would look simpler if you had moved the drainer into a new goroutine and left the other one where it was. Like this:
defer func() {
done := make(chan struct{})
go func() {
// Keep draining the headEvents, to not block sender (updates
// coming in via the beacon RPC api).
for {
select {
case <-done:
return
case event := <-s.headEvents:
event.errc <- errors.New("beacon syncer reorging")
}
}
}()
if filled := s.filler.suspend(); filled != nil {
// If something was filled, try to delete stale sync helpers. If
// unsuccessful, warn the user, but not much else we can do (it's
// a programming error, just let users report an issue and don't
// choke in the meantime).
if err := s.cleanStales(filled); err != nil {
log.Error("Failed to clean stale beacon headers", "err", err)
}
}
close(done)
}()
Do with it as you see fit :)
…thereum#27397)
* eth/downloader: drop beacon head updates if the syncer is restarting
* eth/downloader: v2 of the goroutine spike preventer
…tarting (ethereum#27397)" This reverts commit 0968b3d.
Fixes #27386.
When a reorg happens, the beacon/skeleton syncer needs to stop, swap out its internal state to the new head and restart. If the downloader is busy importing blocks in the meantime, it will prevent the beacon syncer from completing its restart loop until all queued blocks are consumed.
During this "hang", any newly arriving newPayload or forkchoiceUpdate messages will get blocked, as they cannot feed their new data to the beacon syncer. If the "hang" lasts 15 minutes, these update messages keep piling up, each one holding a goroutine and consuming RAM.
The PR "fixes" this issue by dropping update messages on the floor while in this limbo state. This should unblock engine API events and prevent bloat.
That said, it does introduce a wonky behavior: if a newPayload arrives back to back after a forkchoiceUpdate that triggered a reorg, that newPayload might get ignored. This isn't that big of a deal for a syncing node, but if the node is already in sync, it will "miss" a block and will need to wait for the next one to come back into sync. If the entire network does this, however, will it ever come back into sync? Not clear; seems a bit unsafe.