Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix negative stake & invariance bug #3033

Merged
merged 3 commits into from
Dec 8, 2018
Merged

Conversation

jaekwon
Copy link
Contributor

@jaekwon jaekwon commented Dec 7, 2018

Upon debugging #3019, I discovered, in order,:

(1) the transactions of the offending block aren't relevant.
(2) stake.EndBlocker is the offender.
(3) UnbondAllMatureValidatorQueue is the offender.
(4) that RemoveValidator was being called.
(5) the pool is being passed around in places where it is unnecessary.
(6) that this validator had 0 shares and negative staking tokens.
(7) RemoveTokens is not being re-used, and it lacks sanity checks.

There are 2 bugs here that are addressed here:

  • negative staking token can happen because RemoveDelShares isn't clipping properly.
  • RemoveValidator is not removing the tokens properly from the pool.

Also, the pool is removed where unnecessary, and RemoveTokens is reused.

I've verified that these changes would have made 9002 pass at least that offending block, though I'm not certain that I've found all the places where negative staked validators can occur.

TODO: Run simulations. --- Simulation is still failing.
TODO: Create PositiveValidatorStakeInvariant

types/decimal.go Outdated Show resolved Hide resolved
@cwgoes
Copy link
Contributor

cwgoes commented Dec 7, 2018

OK - here's a PR for linter fixes and the non-negative tokens check: #3037

Which simulations are still failing? They seem to pass on CI and for me.

* Update invariant; fix lint

* Fix linter
Copy link
Contributor

@rigelrozanski rigelrozanski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazong debugging Jae.. really appreciate this!

@codecov
Copy link

codecov bot commented Dec 7, 2018

Codecov Report

Merging #3033 into develop will decrease coverage by 0.03%.
The diff coverage is 86.11%.

@@             Coverage Diff             @@
##           develop    #3033      +/-   ##
===========================================
- Coverage     52.2%   52.17%   -0.04%     
===========================================
  Files          140      140              
  Lines         9764     9766       +2     
===========================================
- Hits          5097     5095       -2     
- Misses        4328     4330       +2     
- Partials       339      341       +2

// this happens if shares are zero but tokens are not.
pool := k.GetPool(ctx)
pool.LooseTokens = pool.LooseTokens.Sub(validator.Tokens)
k.SetPool(ctx, pool)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably indicate "burn" tokens here - because that's what's happening.
Also FYI this shouldn't be necessary once the Dec->Int Update is made due to an update to RemoveDelegatorShares https://github.com/cosmos/cosmos-sdk/pull/2958/files#diff-ff4682cf67c96b47c013393f099d1d0eR407

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated comment, added TODO.

issuedTokens := v.DelegatorShareExRate().Mul(delShares)
v.Tokens = v.Tokens.Sub(issuedTokens)
delTokens := v.DelegatorShareExRate().Mul(delShares)
delTokens = sdk.MinDec(delTokens, v.Tokens)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary? When are there insufficient tokens? Would that mean some other delegator got more than they ought to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this logic is updated in #2958 p.s. as mentioned in the previous comment

Copy link
Contributor

@cwgoes cwgoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, spun out one question into an issue - #3041

@cwgoes cwgoes merged commit 1ba93ea into develop Dec 8, 2018
@cwgoes cwgoes deleted the jae/removevalidator_negstake_fix branch December 8, 2018 00:05
jaekwon added a commit that referenced this pull request Dec 8, 2018
Fixes regression introduced by #2984.
Continuiation of #3033 , which didn't fix the simulation issues.
(candidate) Complete solution for #3019, 9002 halt bug.

From #2984, it isn't sufficient to take the fee pool rewards of a validator. Since we don't track delegator accums (as we do with validator accums), and because onValidatorModified >updateValidatorDistInfoFromPool is also being called upon delegation updates (or at least I believe this is the reason), it is necessary to also withdraw self delegation.

TODO: I don't think self-delegation should be required to be modified here... consider using a delegation hook to do the self-delegation withdraw part instead, e.g. splitting the updateValidatorDistInfoFromPool function into two. It might not result in cleaner code, however. Think hard.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants