Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing PowerReduction param causing network halt #9447

Closed
2 of 4 tasks
anilcse opened this issue Jun 2, 2021 · 7 comments · Fixed by #9495
Closed
2 of 4 tasks

Changing PowerReduction param causing network halt #9447

anilcse opened this issue Jun 2, 2021 · 7 comments · Fixed by #9495
Assignees
Milestone

Comments

@anilcse
Copy link
Collaborator

anilcse commented Jun 2, 2021

Summary of Bug

PowerReduction is now changed into an on-chain param. Updating it via a param change proposal is causing the consensus failure. It's occuring on a validator getting jailed once after the param is updated. Here's are the logs: https://pastebin.com/5RDu2mwH

Version

v0.43-beta1

Steps to Reproduce

  • Create a new chain with multiple validators
  • Create a param change proposal to change PowerReduction param
{
    "title": "Staking power reduction param change proposal",
    "description": "Update power reduction",
    "changes": [
        {
            "subspace": "staking",
            "key": "PowerReduction",
            "value": "10000000"
        }
    ],
    "deposit": "10000000stake"
}
  • Vote and let the proposal succeed
  • Bring one of the validatos down and wait for it to jail
  • It causes CONSENSUS FAILURE!

cc @sunnya97 @robert-zaremba @clevinson @aaronc


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@amaury1093
Copy link
Contributor

Is someone taking a look at this? I believe it's one of the last pieces blocking a 0.43-RC

@amaury1093
Copy link
Contributor

amaury1093 commented Jun 9, 2021

I managed to reproduce the error in a script on my am/9447-power-red branch (it's macos-specific, and uses gsed, feel free to tweak to make it work on your OS).

git checkout am/9447-power-red
./power_reduction.sh
# Follow instructions
# Will error with consensus failure at around block 100:
"should never retrieve a jailed validator from the power store"

link to logs of the couple of blocks leading to the consensus error.

@alexanderbez
Copy link
Contributor

alexanderbez commented Jun 9, 2021

The PowerReduction was never meant to be an on-chain param, at least not with further modifications and state changes to account for this. I can't think of what needs to be done, but I don't think it's trivial.

Since you're changing the way the state-machine views and treats validator powers, you're also changing the way validators are retrieved from the state. Specifically, the state-machine returns to Tendermint validator set updates every block (or epoch) via ApplyAndReturnValidatorSetUpdates. Since you've changed PowerReduction, the iteration is now not deterministic in a sense. So it would technically be possible to fetch a jailed validator from the store if the power reduction is changed in a certain way.

@amaury1093
Copy link
Contributor

So the action here is to revert #8505? @sunnya97 are you okay with that?

@amaury1093
Copy link
Contributor

amaury1093 commented Jun 10, 2021

I created a PR to revert 8505: #9495.

I also re-opened #8365 to discuss if it's desirable to have this feature for v0.44, in a way that doesn't break consensus.

@mergify mergify bot closed this as completed in #9495 Jun 14, 2021
mergify bot pushed a commit that referenced this issue Jun 14, 2021
<!--
The default pull request template is for types feat, fix, or refactor.
For other templates, add one of the following parameters to the url:
- template=docs.md
- template=other.md
-->

## Description

Closes: #9447

This PR partially reverts #8505. Namely:
- it removes PowerReduction as a staking on-chain param
- however, it keeps #8505's API changes regarding adding a `powerReduction` function argument to staking functions. This allows us to rely less on global variables in said functions.

<!-- Add a description of the changes that this PR introduces and the files that
are the most critical to review. -->

---

### Author Checklist

*All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.*

I have...

- [ ] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] added `!` to the type prefix if API or client breaking change
- [ ] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting))
- [ ] provided a link to the relevant issue or specification
- [ ] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules)
- [ ] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing)
- [ ] added a changelog entry to `CHANGELOG.md`
- [ ] included comments for [documenting Go code](https://blog.golang.org/godoc)
- [ ] updated the relevant documentation or specification
- [ ] reviewed "Files changed" and left comments if necessary
- [ ] confirmed all CI checks have passed

### Reviewers Checklist

*All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.*

I have...

- [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title
- [ ] confirmed `!` in the type prefix if API or client breaking change
- [ ] confirmed all author checklist items have been addressed 
- [ ] reviewed state machine logic
- [ ] reviewed API design and naming
- [ ] reviewed documentation is accurate
- [ ] reviewed tests and test coverage
- [ ] manually tested (if applicable)
@robert-zaremba
Copy link
Collaborator

Specifically, the state-machine returns to Tendermint validator set updates every block (or epoch) via ApplyAndReturnValidatorSetUpdates. Since you've changed PowerReduction, the iteration is now not deterministic in a sense.

Why it is not deterministic? Since blocks are finalized, big majority of the validators will have the same state. There is no randomness.

@alexanderbez
Copy link
Contributor

alexanderbez commented Aug 19, 2021

@robert-zaremba because you need to go into the x/staking store and change every validator's power (and maybe other indexes) using the new power reduction value. The current x/params and x/gov model doesn't allow for us to execute arbitrary logic on param changes. With the new x/gov and x/param redesign, this will be trivial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants