use LRU strategy for shuffling/epoch caches #4196

etan-status · 2022-09-29T01:46:41Z

When EL newPayload is slow (e.g., Raspberry Pi with Besu), the epoch and shuffling caches tend to fill up with multiple copies per epoch when processing gossip and performing validator duties close to wall slot. The old strategy of evicting oldest epoch led to the same item being evicted over and over, leading to blocking of over 5 minutes in extreme cases where alternate epochs/shuffling got loaded repeatedly. Changing the cache eviction strategy to least-recently-used seems to improve the situation drastically. A simple implementation was selected based on single linked-list without a hashtable.

When EL `newPayload` is slow (e.g., Raspberry Pi with Besu), the epoch and shuffling caches tend to fill up with multiple copies per epoch when processing gossip and performing validator duties close to wall slot. The old strategy of evicting oldest epoch led to the same item being evicted over and over, leading to blocking of over 5 minutes in extreme cases where alternate epochs/shuffling got loaded repeatedly. Changing the cache eviction strategy to least-recently-used seems to improve the situation drastically. A simple implementation was selected based on single linked-list without a hashtable.

etan-status · 2022-09-29T01:48:13Z

Example dump, listing all active epochRef while being 1 epoch behind for several minutes due to slow EL import, so new shuffling is re-computed after each individual slot, while gossip from wall clock is processed simultaneously.

eth2_1 | INF 2022-09-29 01:44:44.963+00:00 Done putting epochRef topics="chaindag" key=149979:3b7f5e11:4799296 items="@["28>149979:3b7f5e11:4799296", "1>149977:c24da32b:4799263", "27>149978:3136b1b8:4799295", "11>149976:2316bf41:4799231", "26>149979:a8fab490:4799294", "25>149978:a8fab490:4799294", "23>149978:ff769ac5:4799293", "24>149979:ff769ac5:4799293", "22>149978:a06866fc:4799292", "21>149979:48906c70:4799291", "20>149978:48906c70:4799291", "18>149978:4bfc74e9:4799290", "19>149979:4bfc74e9:4799290", "17>149978:f237566c:4799289", "16>149979:f237566c:4799289", "14>149978:f41c4294:4799288", "15>149979:ee47915e:4799287", "13>149978:ee47915e:4799287", "2>149979:a473b75c:4799286", "12>149978:a473b75c:4799286", "9>149978:950f7360:4799285", "10>149979:950f7360:4799285", "8>149978:484dbee7:4799284", "7>149979:2472c2de:4799283", "6>149978:2472c2de:4799283", "5>149978:96729301:4799282", "0>149979:a5e1bbdd:4799281", "4>149978:a5e1bbdd:4799281", "31>149979:e68d4600:4799280", "3>149978:e68d4600:4799280", "29>149978:e6d39b4d:4799279", "30>149979:dc9217b9:4799278"]"

github-actions · 2022-09-29T04:12:47Z

Unit Test Results

      9 files ±0   678 suites ±0 17m 57s ⏱️ - 2m 53s
1 996 tests ±0 1 847 ✔️ ±0 149 💤 ±0 0 ❌ ±0
8 114 runs ±0 7 941 ✔️ ±0 173 💤 ±0 0 ❌ ±0

Results for commit 352d932. ± Comparison against base commit 1819d79.

arnetheduck · 2022-09-29T05:43:08Z

A simpler implementation will put a timestamp or (shared) counter in each entry, significantly simplifying the implementation - hash table indeed wouldn't make sense for so few entries

arnetheduck · 2022-09-29T05:44:27Z

also - which of the shuffling or epoch caches is it? shuffling caches should be a lot more stable given the 1-epoch lag in selecting them - epoch caches are more fickle indeed

etan-status · 2022-09-29T08:51:05Z

I encountered this with the epoch cache on my Pi, not sure how much the shuffling cache is affected as well. However, spending the 17 extra-bytes is cheaper than risking running into a pessimistic situation; also in situations of heavy forking. Note that the previous solution also is O(n) to access, so there's not much to gain from using the previous logic compared to a LRU scheme (the rationale for shufflingref was copy/pasted from the epochref implementation).

Logical timestamp solution:

https://github.com/apple/HomeKitADK/blob/fb201f98f5fdc7fef6a455054f08b59cca5d1ec8/HAP/HAPPairingBLESessionCache.c#L52-L102

Regular timestamps:

Same but with Moment, downside is 8 byte per item and syscall, so rather pick the logical one above

Single linked-list:

The same one here, but optimized some further to eliminate struct padding between the individual elements by having the index-order stored separately (so it's really just 33 bytes and not 33*4 or 33*8).

Double linked-list:

The fancy solution with hashtable + pointers in both direction, but seems rather overkill for 32 and 16 items.

Considering the size and access frequency, I think the logical timestamp solution may indeed be preferred. Has the advantages that it was also tested for years already ^^ Which one would you prefer?

arnetheduck · 2022-09-29T08:55:02Z

logical is fine - as it iterates through the items it can just collect the highest "timestamp" so far and +1 or something similar - I agree that full LRU is overkill, but I also consider the half-assed singly-linked-list overkill (in terms of maintenance complexity).. a full LRU with hash and all would be generally useful perhaps (and maybe motivate the maintenance), and if we had one we would perhaps use it here too, but as is, the dumb logical timetstamp approach feels more right.

arnetheduck · 2022-09-29T08:56:38Z

not sure how much the shuffling cache is affected as well.

this is important: the epoch cache is cheap per-instance, the shuffling cache is not - also, there may potentially be many epochrefs that reuse the same shuffling, so it might be that we should bump epochref caches to 64 for example

etan-status · 2022-09-29T09:21:53Z

Will add a small debug level log whenever an item is evicted due to capacity limits. This allows analysis in adverse forking situations.

So far, with 32 / 16 unchanged, and only updating to LRU, I managed to get Nimbus+Besu synced for the first time in 3 days on mainnet.

arnetheduck · 2022-09-29T09:47:25Z

beacon_chain/consensus_object_pools/blockchain_dag.nim

@@ -265,6 +265,55 @@ func atSlot*(dag: ChainDAGRef, bid: BlockId, slot: Slot): Opt[BlockSlotId] =
  else:
    dag.getBlockIdAtSlot(slot)

+func nextTimestamp[I, T](cache: var LRUCache[I, T]): uint32 =
+  if cache.timestamp == uint32.high:


most of us lazy people would have gone with a uint64 and no reset loop :)

etan-status · 2022-09-29T10:12:29Z

beacon_chain/consensus_object_pools/blockchain_dag.nim

+          break
+    if min != 0:
+      {.noSideEffect.}:
+        debug "Cache full - evicting LRU", cache = typeof(T).name, capacity = I


Seeing it for both ShufflingRef and EpochRef on Raspi around ~10 minutes behind wall clock.

weird, means there's deep forks happening at a high rate or we're creating shufflings / states unnecessarily .. is this mainnet? I don't remember seeing that much churn

ie might be worth looking into why we're even creating these shufflings / epochrefs

Mainnet with in-BN validators and Besu (slow processing / pessimistic sync)

yeah, but why the shufflings? until we're synced, we shouldn't be making shufflings and after, we should only make them for legit forks

"We're synced" depends on syncHorizon, so there are 3 parallel tasks going on:

Catching up, this is around ~10 minutes behind when full caches are observed

Gossip, validating those triggers new EpochRef to get proposerkey, and as part of epochref init it triggers shuffling

Validator duties, based on however far it is caught up, but forward propagated to current slot

Catching up, this is around ~10 minutes behind when full caches are observed

This should certainly not trigger epochref starvation - ie when were syncing it's a single history where the time-based eviction should work just fine.

Gossip, validating those triggers new EpochRef to get proposerkey, and as part of epochref init it triggers shuffling

does this trigger before we've validated the parent? because if we've validated the parent, we shouldn't need a new shuffling except once per epoch.

Ditto attestations: we can't gossip-check those until the block being voted for has been synced so head-compatible votes shouldn't cause epochrefs

Validator duties, based on however far it is caught up, but forward propagated to current slot

this one is uglier indeed - computing proposers while syncing is messy but necessary, but why isn't the epoch-based cache working out here? it should be roughly equivalent to LRU...

etan-status · 2022-09-29T11:31:15Z

I did 3 syncs from ~10 minutes behind on Mainnet using Pi with validators and Besu 22.7.4 EL + restart script on incorrect INVALID, and it synced within ~45 minutes in all three attempts. Without this patch, I have only managed to get in sync a single time since the merge, with Besu 22.7.2. On the other attempts, the sync got stuck ~10 minutes behind wall clock.

arnetheduck · 2022-09-29T11:36:10Z

The patch is good in general (LRU makes more sense for sure) - but it still sounds weird to me that we're generating that many epochrefs on an ordinary sync - something else is wrong also if this is happening on mainnet, or the various params are not in tune with each other (ie too few caches vs sync horizon etc)

arnetheduck · 2022-09-29T12:32:26Z

Per discussion elsewhere, and for future reference - a significant contributing factor to the poor eviction performance of the pre-LRU design is the usage of epoch: when there are multiple caches from the same epoch, the first one of those will keep getting evicted -this happens when computing proposer lookahead: many epochrefs with the same epoch but slowly increasing dependent block slot - we'd likely see a similar improvement from a cache that uses slot-level granularity based on dependent root distance.

that said, LRU should work in a similar way.

tersec · 2022-09-29T14:20:48Z

This jenkins/nimbus-eth2/macos/aarch64 failure is actually a real finalization failure, not yet another macOS-triggered build timeout

etan-status · 2022-09-29T14:43:58Z

Good catch, looks unrelated to this PR though, so still fine to go ahead with this one.

aarch64 is the one without Geth, so it should be the least complex one of the finalization tests.
Briefly scrolled through the logs, seeing Upgrade failed and Noise handshake, peer infos don't match! messages; not sure if related or normal.

change to logical timestamps + log evictions

b76dd87

arnetheduck reviewed Sep 29, 2022

View reviewed changes

arnetheduck approved these changes Sep 29, 2022

View reviewed changes

etan-status commented Sep 29, 2022

View reviewed changes

etan-status enabled auto-merge (squash) September 29, 2022 11:31

arnetheduck approved these changes Sep 29, 2022

View reviewed changes

rm debug logging

2457c11

etan-status merged commit 5968ed5 into unstable Sep 29, 2022

etan-status deleted the dev/etan/dg-lrucache branch September 29, 2022 14:55

tersec mentioned this pull request Oct 4, 2022

Investigate reported cases of slow block production #4022

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

use LRU strategy for shuffling/epoch caches #4196

use LRU strategy for shuffling/epoch caches #4196

etan-status commented Sep 29, 2022

etan-status commented Sep 29, 2022 •

edited

Loading

github-actions bot commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

etan-status commented Sep 29, 2022 •

edited

Loading

arnetheduck commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

etan-status commented Sep 29, 2022

arnetheduck Sep 29, 2022

etan-status Sep 29, 2022

arnetheduck Sep 29, 2022

arnetheduck Sep 29, 2022

etan-status Sep 29, 2022 •

edited

Loading

arnetheduck Sep 29, 2022

etan-status Sep 29, 2022 •

edited

Loading

arnetheduck Sep 29, 2022

etan-status commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

tersec commented Sep 29, 2022

etan-status commented Sep 29, 2022

use LRU strategy for shuffling/epoch caches #4196

use LRU strategy for shuffling/epoch caches #4196

Conversation

etan-status commented Sep 29, 2022

etan-status commented Sep 29, 2022 • edited Loading

github-actions bot commented Sep 29, 2022

Unit Test Results

arnetheduck commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

etan-status commented Sep 29, 2022 • edited Loading

arnetheduck commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

etan-status commented Sep 29, 2022

arnetheduck Sep 29, 2022

Choose a reason for hiding this comment

etan-status Sep 29, 2022

Choose a reason for hiding this comment

arnetheduck Sep 29, 2022

Choose a reason for hiding this comment

arnetheduck Sep 29, 2022

Choose a reason for hiding this comment

etan-status Sep 29, 2022 • edited Loading

Choose a reason for hiding this comment

arnetheduck Sep 29, 2022

Choose a reason for hiding this comment

etan-status Sep 29, 2022 • edited Loading

Choose a reason for hiding this comment

arnetheduck Sep 29, 2022

Choose a reason for hiding this comment

etan-status commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

arnetheduck commented Sep 29, 2022

tersec commented Sep 29, 2022

etan-status commented Sep 29, 2022

etan-status commented Sep 29, 2022 •

edited

Loading

etan-status commented Sep 29, 2022 •

edited

Loading

etan-status Sep 29, 2022 •

edited

Loading

etan-status Sep 29, 2022 •

edited

Loading