Switch to SDK branch that lowers epoch time #451
Conversation
addr1 := sdk.AccAddress([]byte("addr1---------------"))
coins := sdk.Coins{sdk.NewInt64Coin("stake", 10)}
startAveragingAt := 1000
-	totalNumLocks := 2000
+	totalNumLocks := 5000
This now works just fine with 20000; that would take about 7 seconds on my laptop. I just kept it lower so it doesn't add more time in CI / local testing.
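For context on how a parameter like `totalNumLocks` is typically swept across the values discussed here (2000 / 5000 / 20000), a self-contained sketch of a table-driven Go sub-benchmark; `placeholderWorkload` is a stand-in for the real lock-creation-plus-distribution path, not the repo's actual benchmark:

```go
package bench_test

import (
	"fmt"
	"testing"
)

// BenchmarkLockCounts sweeps the lock count the way totalNumLocks is varied
// in this discussion; placeholderWorkload is a hypothetical stand-in for the
// real lockup/incentives work being measured.
func BenchmarkLockCounts(b *testing.B) {
	for _, totalNumLocks := range []int{2000, 5000, 20000} {
		b.Run(fmt.Sprintf("locks=%d", totalNumLocks), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				placeholderWorkload(totalNumLocks)
			}
		})
	}
}

func placeholderWorkload(n int) {
	s := 0
	for i := 0; i < n; i++ {
		s += i
	}
	_ = s
}
```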
Hrmm, on mainnet this still has a total execution time of 2 minutes, which seems like only a bit over a 2x improvement. Trying to see if I can make the benchmark parameterization comparable.
Ok, benchmark accuracy improved! It's still the case that the bulk of the time before was in the CacheKV effects, but the prior benchmark wasn't capturing the other costs well. Now the time splits roughly as: 30% in bech32 encoding / decoding, 10-15% in emitting events, 8% in validating denoms, and a lot spread across many random places in bank. I'm going to work on knocking out a few of these simple fixes at least, which I'm hoping should give a quick additional 20-30% speedup. After the next commit:
(That's writing an insane 43 GB of data to/from the heap 😬 -- at least only 700 MB of heap space actually needed to be allocated, though.)
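For reference, per-op heap figures like these come straight out of Go's benchmark tooling; a minimal, self-contained sketch (not the actual lockup/incentives benchmark) of how they are surfaced:

```go
package bench_test

import "testing"

// BenchmarkAllocExample illustrates how Go reports heap traffic: running
// `go test -bench . -benchmem` prints bytes allocated per operation, which,
// multiplied across all operations, is where totals like "43 GB written
// to/from heap" come from.
func BenchmarkAllocExample(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		buf := make([]byte, 1024) // 1 KiB allocated per iteration
		_ = buf
	}
}
```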
Codecov Report
@@ Coverage Diff @@
## main #451 +/- ##
=======================================
Coverage 19.00% 19.00%
=======================================
Files 144 144
Lines 22546 22546
=======================================
Hits 4285 4285
Misses 17517 17517
Partials 744 744
Continue to review full report at Codecov.
The above numbers have been improved (with only non-breaking changes) to be:
** Take these with a grain of salt; the benchmark itself seems to have ~8% variance in each number on my machine, due to lock distribution effects. The next big non-state-breaking steps are basically:
There are a lot of state-breaking options to pursue afterwards, plus changes to the logic code to combine processing of all gauges for a denom at once (sketched below). I think the main immediate thing worth doing is (1), and saving the rest for later. We'll start hitting diminishing marginal utility on this side, versus the refactor of the distribution logic. (That could risk being state breaking due to IAVL insertion order dependencies.) The small-to-medium refactoring of the distribution logic should be a 2-3x speedup. (But it's also obviated by the F1 work later.)
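To make the "process all gauges for a denom at once" idea concrete, here is a rough sketch; the Gauge/Lock types and field names are simplified assumptions, not the repo's actual incentives types:

```go
package distribution

// Gauge and Lock are simplified stand-ins for the real incentives types; the
// point of the sketch is the shape of the refactor: load the locks for a
// denom once and share them across every gauge paying out to that denom,
// instead of re-querying state per gauge.
type Gauge struct {
	ID    uint64
	Denom string
}

type Lock struct {
	Owner  string
	Denom  string
	Amount int64
}

func groupGaugesByDenom(gauges []Gauge) map[string][]Gauge {
	byDenom := make(map[string][]Gauge)
	for _, g := range gauges {
		byDenom[g.Denom] = append(byDenom[g.Denom], g)
	}
	return byDenom
}

func distributeAll(gauges []Gauge, locksByDenom map[string][]Lock) {
	for denom, denomGauges := range groupGaugesByDenom(gauges) {
		locks := locksByDenom[denom] // fetched once per denom
		for _, g := range denomGauges {
			distributeGauge(g, locks)
		}
	}
}

func distributeGauge(g Gauge, locks []Lock) {
	// Payout logic elided; `locks` is reused across all gauges
	// sharing the denom.
}
```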
This switches us to the SDK branch that fixes the N^2 issues within the CacheKV store (osmosis-labs/cosmos-sdk#24 )
It also adds some scaffolding so that we can test with LevelDB backends. That ended up being necessary at some point due to a memDB bug, and it also makes benchmarks more accurate by avoiding some weird oddities with the memDB.
(I'm not sure whether the memDB bug still gets triggered under insanely high loads, but it definitely seemed like a bug on Google's side: I traced it back to a perfectly normal set operation that caused Google's BTree to hang indefinitely.)
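For illustration, a minimal sketch of backing a benchmark store with LevelDB instead of memDB; the helper name is hypothetical and the exact API shape depends on the SDK / tm-db versions in use, so treat this as the general idea rather than this PR's scaffolding:

```go
package bench_test

import (
	"testing"

	"github.com/cosmos/cosmos-sdk/store"
	storetypes "github.com/cosmos/cosmos-sdk/store/types"
	dbm "github.com/tendermint/tm-db"
)

// newLevelDBMultiStore is a hypothetical helper: it wires a CommitMultiStore
// to GoLevelDB on disk rather than the in-memory memDB, so benchmarks
// exercise a more realistic backend.
func newLevelDBMultiStore(b *testing.B) storetypes.CommitMultiStore {
	db, err := dbm.NewGoLevelDB("bench", b.TempDir())
	if err != nil {
		b.Fatal(err)
	}
	cms := store.NewCommitMultiStore(db)
	// Mount and load the module store keys here, as the real scaffolding would.
	return cms
}
```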
Before:
After:
Speedup:
~99% speedup of the benchmark!
We should backport this to v3.2.0, and check whether it empirically improves epoch time performance more than the prior WIP v3.2.0 branch did.
Also, interestingly, bech32 encoding / decoding work now takes 10% of the compute time after this change.
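One of the "simple fixes" alluded to earlier would be to avoid re-encoding the same address over and over on hot paths; a hypothetical sketch (not what the SDK does internally) of memoizing the bech32 string form:

```go
package bench_test

import sdk "github.com/cosmos/cosmos-sdk/types"

// addrStringCache is a hypothetical, non-concurrent memoization layer:
// bech32 encoding is pure CPU/string work, so code that stringifies the same
// AccAddress many times (events, map keys, logs) can cache the result.
var addrStringCache = map[string]string{}

func cachedBech32(addr sdk.AccAddress) string {
	key := string(addr) // raw address bytes as the map key
	if s, ok := addrStringCache[key]; ok {
		return s
	}
	s := addr.String() // bech32 encoding happens only on a cache miss
	addrStringCache[key] = s
	return s
}
```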