WIP: Jae/simulator improvements #2900

jaekwon · 2018-11-26T12:33:26Z

See the Tendermint PR for more details: tendermint/tendermint#2913

This PR makes the CacheKVStore.Iterator/ReverseIterator faster, and also makes simulation faster by faking the IAVL store mount and using a db adapter instead.

Requires corresponding dependency changes in Tendermint and IAVL.

cwgoes · 2018-11-26T13:58:35Z

Working on fixes to this PR in a separate one for clarity - #2901.

baseapp/baseapp.go

cwgoes · 2018-11-26T14:03:28Z

x/mock/simulation/rand_util.go

@@ -62,3 +62,29 @@ func RandTimestamp(r *rand.Rand) time.Time {
 	unixTime := r.Int63n(253373529600)
 	return time.Unix(unixTime, 0)
 }
+
+// Derive a new rand deterministically from a rand.
+// Unlike rand.New(rand.NewSource(seed)), the result is "more random"


Why is this "more random"? I don't really follow - is there a substantive difference between reinitializing a new PRNG and just stepping the same PRNG?

cc @ValarDragon

This is not more random in a meaningful way. This doesn't even increase the repetitivity of the internal RNG state. (Since your combining multiple PRNG's with the same cycle size via xor)

This typically may be done to combine more entropy in some strange settings, however here that is also not the case since each seed is derived from the original r instance.

The rng used in golang is essentially a lagged fibonacci random number generator, with fixes that improve the seeding. (Source: https://groups.google.com/forum/#!topic/golang-nuts/RZ1G3_cxMcM) That has an insane cycle length (2^603) and is sufficiently random for our purposes. It is not suitable for things like monte-carlo simulations, but nor is this. It would make sense to switch to a monte-carlo safe rng in the future.

The repetitivity/cycle-length doesn't matter, as you mentioned. A problem of the original math.Rand (with default Source) is that seed values that have the same remainder when divided by 2^31-1 generate the same pseudo-random sequence.. There are only two billion unique initial random sequences for a new Rand (with default Source)!

In order to make simulation deterministic while being able to make changes in the middle somewhere, we can't just re-use the same *Rand, because changes to the usage of the *Rand of a middle transaction causes rippling changes for all future operations. In some situations this can make debugging more difficult, because e.g. just because you mutate/mute/remove a single operation of a block doesn't mean that you're really testing the same scenario, as later operations of the same block (as an example) would be completely different.

The solution is to construct a tree of rands, as it's already done here... Here, we construct a predetermined set of operations, and then for further logic within an operation, we pass in a derived Rand. In the future we may want to derive third and fourth order rands in this way.

Once you're deriving a bunch of rands in this way, the limitation of 2^31-1 possible sequences does become meaningful. Sometimes you'll end up with the same seed for a derived rand, which means for that branch of rand usage, it's no longer random... the same sequence was used in another branch of simulation logic!

This DeriveRand() function is designed to solve this issue. The possibility of deriving the same rand sequence becomes negligible.

Thanks for the clarification, I didn't know that! The comment should be amended from making the result more "random", which its not, to instead increasing the seed space.

However I still don't think this is worth doing. With there being 2**31 different valid seeds, by application of birthday collisions we'd have to do 54 thousand simulations with no change to the simulator in order for there to be a 50% probability of a single duplicate seed. We'd have to do 8.3 million simulations before there is a 1% probability each successive simulation was for nought. I frankly don't think we will do 54 thousand long simulations ever before changing the simulator, let alone 8.3 million. I don't think this is worth the performance trade-off, as speed of the simulator matters.

cwgoes · 2018-11-26T14:47:12Z

Merging #2901 since it seems non-controversial and I want to debug other sim fixes separately.

* Fix upstream versions * Linter fix * Pin exact revisions, not branches * Rename 'MountStoresIAVL' to 'MountStores'

cwgoes · 2018-11-26T14:48:30Z

The normal Gaia simulation fails with seed 99:

make test_sim_gaia_fast

Simulating... block 14/500, operation 650/696. Invariants broken after BeginBlock
unexpected leftover validator pool coins: 23.2500000000
Too many logs to display, instead writing to simulation_log_2018-11-26 15:32:43.txt
--- FAIL: TestFullGaiaSimulation (3.68s)
    invariants.go:27: 
FAIL

Everything else seems OK, import/export & simulation after import work on all other seeds.

Still running 500-block multi-seed sim.

rigelrozanski · 2018-11-26T17:21:54Z

interestingly enough, this is not an accum issue, I ran the validator accum invariants when this fails and everything is proper. This suggests to me that this is possibly a rounding issue when we withdraw.

…ovements

baseapp/baseapp.go

cmd/gaia/app/app.go

store/cachekvstore.go

alexanderbez · 2018-11-26T18:02:21Z

store/iavlstore_test.go

-// make a tree and save it
-func newTree(t *testing.T, db dbm.DB) (*iavl.MutableTree, CommitID) {
+// make a tree with data from above and save it
+func newAlohaTree(t *testing.T, db dbm.DB) (*iavl.MutableTree, CommitID) {


indeed, 🌴 ?

It's full of some random data include the word "aloha". Differentiates it from a "newTree".

store/prefixstore.go

rigelrozanski · 2018-11-26T21:59:31Z

figured out what the new bug is related to for seed 99- basically in block 14 there are no bonded validators, but validators are still expected to validate for a block or two after they’re no longer bonded - so what is happening is that fees are being paid during those blocks but there is no accum which is being created, hence the fees go to nobody!
oddly enough within this simulation during this brief state when the validator set dips down to zero (but is still running for a block) new validators are bonded and the network seems to want to keep running… so even if I skip this invariant check for block 14, the invariant still fails for block 15 somehow… this seems sketchy to me lol - we should probably shut down everything as soon as there are no bonded tokens left

codecov · 2018-11-26T22:16:50Z

Codecov Report

Merging #2900 into develop will decrease coverage by 0.37%.
The diff coverage is 44%.

@@             Coverage Diff             @@
##           develop    #2900      +/-   ##
===========================================
- Coverage    56.72%   56.35%   -0.38%     
===========================================
  Files          120      120              
  Lines         8396     8416      +20     
===========================================
- Hits          4763     4743      -20     
- Misses        3311     3351      +40     
  Partials       322      322

rigelrozanski · 2018-11-26T22:24:27Z

fixed in #2905

Also opened up a reference issue for the early shutdown (#2906)

I don't forsee this circumstance happening in GoS as AiB will maintain the large backbone validator running... however this should get resolved prelaunch

jaekwon added 4 commits November 26, 2018 04:29

Make simulator operations predetermined

19adbfa

error upon invariance break

380e277

Use new dbm.ReverseIterator API

5c568c8

Simulation speed improvements

c944ade

jaekwon requested review from cwgoes, ebuchman and rigelrozanski as code owners November 26, 2018 12:33

jaekwon mentioned this pull request Nov 26, 2018

WIP: Make simulator operations predetermined #2897

Closed

cwgoes mentioned this pull request Nov 26, 2018

WIP: Fixes to simulator improvements #2901

Merged

cwgoes reviewed Nov 26, 2018

View reviewed changes

Fixes to simulator improvements (#2901)

6058900

* Fix upstream versions * Linter fix * Pin exact revisions, not branches * Rename 'MountStoresIAVL' to 'MountStores'

cwgoes requested a review from zramsay as a code owner November 26, 2018 14:47

jackzampolin mentioned this pull request Nov 26, 2018

Game of Stakes / gaia-9002 release checklist #2754

Closed

27 tasks

Merge remote-tracking branch 'origin/develop' into jae/simulator_impr…

989f0f9

…ovements

alexanderbez reviewed Nov 26, 2018

View reviewed changes

rigelrozanski mentioned this pull request Nov 26, 2018

Early shutdown if no bonded validators remain #2906

Closed

rigelrozanski and others added 3 commits November 27, 2018 00:47

Merge PR #2905: Zero Power Block fees to community pool

92917fe

Update allocation.go

b07b809

Fixes from review

8e8065a

jaekwon merged commit d1e7622 into develop Nov 27, 2018

cwgoes deleted the jae/simulator_improvements branch November 27, 2018 12:44

alexanderbez mentioned this pull request Nov 27, 2018

R4R: Fix starting Gaia Lite #2918

Merged

5 tasks

abelliumnt mentioned this pull request Dec 5, 2018

R4R: Add unit tests for store and merge PRs irisnet/irishub#766

Merged

This was referenced Feb 5, 2019

Upgrade to 0.30 SDK TruStory/truchain#327

Merged

Upgrade to 0.30 SDK TruStory/truchain#353

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Jae/simulator improvements #2900

WIP: Jae/simulator improvements #2900

jaekwon commented Nov 26, 2018 •

edited

Loading

cwgoes commented Nov 26, 2018

cwgoes Nov 26, 2018

ValarDragon Nov 26, 2018

jaekwon Nov 27, 2018

ValarDragon Nov 27, 2018 •

edited

Loading

cwgoes commented Nov 26, 2018

cwgoes commented Nov 26, 2018 •

edited

Loading

rigelrozanski commented Nov 26, 2018

alexanderbez Nov 26, 2018

cwgoes Nov 26, 2018

jaekwon Nov 27, 2018

rigelrozanski commented Nov 26, 2018

codecov bot commented Nov 26, 2018 •

edited

Loading

rigelrozanski commented Nov 26, 2018 •

edited

Loading

WIP: Jae/simulator improvements #2900

WIP: Jae/simulator improvements #2900

Conversation

jaekwon commented Nov 26, 2018 • edited Loading

cwgoes commented Nov 26, 2018

cwgoes Nov 26, 2018

Choose a reason for hiding this comment

ValarDragon Nov 26, 2018

Choose a reason for hiding this comment

jaekwon Nov 27, 2018

Choose a reason for hiding this comment

ValarDragon Nov 27, 2018 • edited Loading

Choose a reason for hiding this comment

cwgoes commented Nov 26, 2018

cwgoes commented Nov 26, 2018 • edited Loading

rigelrozanski commented Nov 26, 2018

alexanderbez Nov 26, 2018

Choose a reason for hiding this comment

cwgoes Nov 26, 2018

Choose a reason for hiding this comment

jaekwon Nov 27, 2018

Choose a reason for hiding this comment

rigelrozanski commented Nov 26, 2018

codecov bot commented Nov 26, 2018 • edited Loading

Codecov Report

rigelrozanski commented Nov 26, 2018 • edited Loading

jaekwon commented Nov 26, 2018 •

edited

Loading

ValarDragon Nov 27, 2018 •

edited

Loading

cwgoes commented Nov 26, 2018 •

edited

Loading

codecov bot commented Nov 26, 2018 •

edited

Loading

rigelrozanski commented Nov 26, 2018 •

edited

Loading