Switch to SDK branch that lowers epoch time #451
Conversation
addr1 := sdk.AccAddress([]byte("addr1---------------"))
coins := sdk.Coins{sdk.NewInt64Coin("stake", 10)}
startAveragingAt := 1000
-	totalNumLocks := 2000
+	totalNumLocks := 5000
This now works just fine with 20000; that would take about 7 seconds on my laptop. I just kept it lower so it doesn't add more time in CI / local testing.
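For context on how a parameter like `totalNumLocks` is typically swept across the values discussed here (2000 / 5000 / 20000), a self-contained sketch of a table-driven Go sub-benchmark; `placeholderWorkload` is a stand-in for the real lock-creation-plus-distribution path, not the repo's actual benchmark:

```go
package bench_test

import (
	"fmt"
	"testing"
)

// BenchmarkLockCounts sweeps the lock count the way totalNumLocks is varied
// in this discussion; placeholderWorkload is a hypothetical stand-in for the
// real lockup/incentives work being measured.
func BenchmarkLockCounts(b *testing.B) {
	for _, totalNumLocks := range []int{2000, 5000, 20000} {
		b.Run(fmt.Sprintf("locks=%d", totalNumLocks), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				placeholderWorkload(totalNumLocks)
			}
		})
	}
}

func placeholderWorkload(n int) {
	s := 0
	for i := 0; i < n; i++ {
		s += i
	}
	_ = s
}
```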
Hrmm, on mainnet this still has a total execution time of 2 minutes, which seems like only a bit over a 2x improvement. Trying to see if I can make the benchmark parameterization comparable.
Ok, benchmark accuracy improved! It's still the case that the bulk of the time before was in the CacheKV effects, but the prior benchmark wasn't capturing the other costs well. Now the time splits roughly as: 30% in bech32 encoding / decoding, 10-15% in emitting events, 8% in validating denoms, and a lot spread across many random places in bank. I'm going to work on knocking out a few of these simple fixes at least, which I'm hoping should give a quick additional 20-30% speedup. After the next commit:
(That's writing an insane 43 GB of data to/from the heap 😬 -- at least only 700 MB of heap space actually needed to be allocated, though.)
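For reference, per-op heap figures like these come straight out of Go's benchmark tooling; a minimal, self-contained sketch (not the actual lockup/incentives benchmark) of how they are surfaced:

```go
package bench_test

import "testing"

// BenchmarkAllocExample illustrates how Go reports heap traffic: running
// `go test -bench . -benchmem` prints bytes allocated per operation, which,
// multiplied across all operations, is where totals like "43 GB written
// to/from heap" come from.
func BenchmarkAllocExample(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		buf := make([]byte, 1024) // 1 KiB allocated per iteration
		_ = buf
	}
}
```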
Codecov Report
@@ Coverage Diff @@
## main #451 +/- ##
=======================================
Coverage 19.00% 19.00%
=======================================
Files 144 144
Lines 22546 22546
=======================================
Hits 4285 4285
Misses 17517 17517
Partials 744 744
Continue to review full report at Codecov.
The above numbers have been improved (with only non-breaking changes) to be:
** Take these with a grain of salt; the benchmark itself seems to have ~8% variance in each number on my machine, due to lock distribution effects. The next big non-state-breaking steps are basically:
There are a lot of state-breaking options to pursue afterwards, plus changes to the logic code to combine processing of all gauges for a denom at once (sketched below). I think the main immediate thing worth doing is (1), and saving the rest for later. We'll start hitting diminishing marginal utility on this side, versus the refactor of the distribution logic. (That could risk being state breaking due to IAVL insertion order dependencies.) The small-to-medium refactoring of the distribution logic should be a 2-3x speedup. (But it's also obviated by the F1 work later.)
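To make the "process all gauges for a denom at once" idea concrete, here is a rough sketch; the Gauge/Lock types and field names are simplified assumptions, not the repo's actual incentives types:

```go
package distribution

// Gauge and Lock are simplified stand-ins for the real incentives types; the
// point of the sketch is the shape of the refactor: load the locks for a
// denom once and share them across every gauge paying out to that denom,
// instead of re-querying state per gauge.
type Gauge struct {
	ID    uint64
	Denom string
}

type Lock struct {
	Owner  string
	Denom  string
	Amount int64
}

func groupGaugesByDenom(gauges []Gauge) map[string][]Gauge {
	byDenom := make(map[string][]Gauge)
	for _, g := range gauges {
		byDenom[g.Denom] = append(byDenom[g.Denom], g)
	}
	return byDenom
}

func distributeAll(gauges []Gauge, locksByDenom map[string][]Lock) {
	for denom, denomGauges := range groupGaugesByDenom(gauges) {
		locks := locksByDenom[denom] // fetched once per denom
		for _, g := range denomGauges {
			distributeGauge(g, locks)
		}
	}
}

func distributeGauge(g Gauge, locks []Lock) {
	// Payout logic elided; `locks` is reused across all gauges
	// sharing the denom.
}
```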
This switches us to the SDK branch that fixes the N^2 issues within the CacheKV store (osmosis-labs/cosmos-sdk#24 )
It also adds some scaffolding so that we can test with LevelDB backends. That ended up being necessary at some point due to a memDB bug, and it also makes benchmarks more accurate by avoiding some weird oddities with the memDB.
(I'm not sure whether the memDB bug still gets triggered under insanely high loads, but it definitely seemed like a bug on Google's side: I traced it back to a perfectly normal set operation that caused Google's BTree to hang indefinitely.)
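For illustration, a minimal sketch of backing a benchmark store with LevelDB instead of memDB; the helper name is hypothetical and the exact API shape depends on the SDK / tm-db versions in use, so treat this as the general idea rather than this PR's scaffolding:

```go
package bench_test

import (
	"testing"

	"github.com/cosmos/cosmos-sdk/store"
	storetypes "github.com/cosmos/cosmos-sdk/store/types"
	dbm "github.com/tendermint/tm-db"
)

// newLevelDBMultiStore is a hypothetical helper: it wires a CommitMultiStore
// to GoLevelDB on disk rather than the in-memory memDB, so benchmarks
// exercise a more realistic backend.
func newLevelDBMultiStore(b *testing.B) storetypes.CommitMultiStore {
	db, err := dbm.NewGoLevelDB("bench", b.TempDir())
	if err != nil {
		b.Fatal(err)
	}
	cms := store.NewCommitMultiStore(db)
	// Mount and load the module store keys here, as the real scaffolding would.
	return cms
}
```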
Before:
After:
Speedup:
~99% speedup of the benchmark!
We should backport this to v3.2.0, and check whether it empirically improves epoch time performance more than the prior WIP v3.2.0 branch did.
Also, interestingly, bech32 encoding / decoding work now takes 10% of the compute time after this change.
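One of the "simple fixes" alluded to earlier would be to avoid re-encoding the same address over and over on hot paths; a hypothetical sketch (not what the SDK does internally) of memoizing the bech32 string form:

```go
package bench_test

import sdk "github.com/cosmos/cosmos-sdk/types"

// addrStringCache is a hypothetical, non-concurrent memoization layer:
// bech32 encoding is pure CPU/string work, so code that stringifies the same
// AccAddress many times (events, map keys, logs) can cache the result.
var addrStringCache = map[string]string{}

func cachedBech32(addr sdk.AccAddress) string {
	key := string(addr) // raw address bytes as the map key
	if s, ok := addrStringCache[key]; ok {
		return s
	}
	s := addr.String() // bech32 encoding happens only on a cache miss
	addrStringCache[key] = s
	return s
}
```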