Skip to content

Commit

Permalink
Implementation of the new validator disabling strategy (#2226)
Browse files Browse the repository at this point in the history
Closes #1966,
#1963 and
#1962.

Disabling strategy specification
[here](#2955). (Updated
13/02/2024)

Implements:
* validator disabling for a whole era instead of just a session
* no more than 1/3 of the validators in the active set are disabled
Removes:
* `DisableStrategy` enum - now each validator committing an offence is
disabled.
* New era is not forced if too many validators are disabled.

Before this PR not all offenders were disabled. A decision was made
based on [`enum
DisableStrategy`](https://github.com/paritytech/polkadot-sdk/blob/bbb6631641f9adba30c0ee6f4d11023a424dd362/substrate/primitives/staking/src/offence.rs#L54).
Some offenders were disabled for a whole era, some just for a session,
some were not disabled at all.

This PR changes the disabling behaviour. Now a validator committing an
offense is disabled immediately till the end of the current era.

Some implementation notes:
* `OffendingValidators` in pallet session keeps all offenders (this is
not changed). However its type is changed from `Vec<(u32, bool)>` to
`Vec<u32>`. The reason is simple - each offender is getting disabled so
the bool doesn't make sense anymore.
* When a validator is disabled it is first added to
`OffendingValidators` and then to `DisabledValidators`. This is done in
[`add_offending_validator`](https://github.com/paritytech/polkadot-sdk/blob/bbb6631641f9adba30c0ee6f4d11023a424dd362/substrate/frame/staking/src/slashing.rs#L325)
from staking pallet.
* In
[`rotate_session`](https://github.com/paritytech/polkadot-sdk/blob/bdbe98297032e21a553bf191c530690b1d591405/substrate/frame/session/src/lib.rs#L623)
the `end_session` also calls
[`end_era`](https://github.com/paritytech/polkadot-sdk/blob/bbb6631641f9adba30c0ee6f4d11023a424dd362/substrate/frame/staking/src/pallet/impls.rs#L490)
when an era ends. In this case `OffendingValidators` are cleared
**(1)**.
* Then in
[`rotate_session`](https://github.com/paritytech/polkadot-sdk/blob/bdbe98297032e21a553bf191c530690b1d591405/substrate/frame/session/src/lib.rs#L623)
`DisabledValidators` are cleared **(2)**
* And finally (still in `rotate_session`) a call to
[`start_session`](https://github.com/paritytech/polkadot-sdk/blob/bbb6631641f9adba30c0ee6f4d11023a424dd362/substrate/frame/staking/src/pallet/impls.rs#L430)
repopulates the disabled validators **(3)**.
* The reason for this complication is that session pallet knows nothing
abut eras. To overcome this on each new session the disabled list is
repopulated (points 2 and 3). Staking pallet knows when a new era starts
so with point 1 it ensures that the offenders list is cleared.

---------

Co-authored-by: ordian <noreply@reusable.software>
Co-authored-by: ordian <write@reusable.software>
Co-authored-by: Maciej <maciej.zyszkiewicz@parity.io>
Co-authored-by: Gonçalo Pestana <g6pestana@gmail.com>
Co-authored-by: Kian Paimani <5588131+kianenigma@users.noreply.github.com>
Co-authored-by: command-bot <>
Co-authored-by: Ankan <10196091+Ank4n@users.noreply.github.com>
  • Loading branch information
7 people authored Apr 26, 2024
1 parent 97f7425 commit 988e30f
Show file tree
Hide file tree
Showing 33 changed files with 777 additions and 702 deletions.
11 changes: 1 addition & 10 deletions polkadot/runtime/parachains/src/disputes/slashing.rs
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,7 @@ use sp_runtime::{
KeyTypeId, Perbill,
};
use sp_session::{GetSessionNumber, GetValidatorCount};
use sp_staking::offence::{DisableStrategy, Kind, Offence, OffenceError, ReportOffence};
use sp_staking::offence::{Kind, Offence, OffenceError, ReportOffence};
use sp_std::{
collections::{btree_map::Entry, btree_set::BTreeSet},
prelude::*,
Expand Down Expand Up @@ -134,15 +134,6 @@ where
self.time_slot.clone()
}

fn disable_strategy(&self) -> DisableStrategy {
match self.kind {
SlashingOffenceKind::ForInvalid => DisableStrategy::Always,
// in the future we might change it based on number of disputes initiated:
// <https://github.com/paritytech/polkadot/issues/5946>
SlashingOffenceKind::AgainstValid => DisableStrategy::Never,
}
}

fn slash_fraction(&self, _offenders: u32) -> Perbill {
self.slash_fraction
}
Expand Down
3 changes: 1 addition & 2 deletions polkadot/runtime/test-runtime/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -313,7 +313,6 @@ parameter_types! {
pub const RewardCurve: &'static PiecewiseLinear<'static> = &REWARD_CURVE;
pub const MaxExposurePageSize: u32 = 64;
pub const MaxNominators: u32 = 256;
pub storage OffendingValidatorsThreshold: Perbill = Perbill::from_percent(17);
pub const MaxAuthorities: u32 = 100_000;
pub const OnChainMaxWinners: u32 = u32::MAX;
// Unbounded number of election targets and voters.
Expand Down Expand Up @@ -349,7 +348,6 @@ impl pallet_staking::Config for Runtime {
type SessionInterface = Self;
type EraPayout = pallet_staking::ConvertCurve<RewardCurve>;
type MaxExposurePageSize = MaxExposurePageSize;
type OffendingValidatorsThreshold = OffendingValidatorsThreshold;
type NextNewSession = Session;
type ElectionProvider = onchain::OnChainExecution<OnChainSeqPhragmen>;
type GenesisElectionProvider = onchain::OnChainExecution<OnChainSeqPhragmen>;
Expand All @@ -364,6 +362,7 @@ impl pallet_staking::Config for Runtime {
type BenchmarkingConfig = runtime_common::StakingBenchmarkingConfig;
type EventListeners = ();
type WeightInfo = ();
type DisablingStrategy = pallet_staking::UpToLimitDisablingStrategy;
}

parameter_types! {
Expand Down
5 changes: 2 additions & 3 deletions polkadot/runtime/westend/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -613,7 +613,6 @@ parameter_types! {
// this is an unbounded number. We just set it to a reasonably high value, 1 full page
// of nominators.
pub const MaxNominators: u32 = 64;
pub const OffendingValidatorsThreshold: Perbill = Perbill::from_percent(17);
pub const MaxNominations: u32 = <NposCompactSolution16 as frame_election_provider_support::NposSolution>::LIMIT as u32;
pub const MaxControllersInDeprecationBatch: u32 = 751;
}
Expand All @@ -634,7 +633,6 @@ impl pallet_staking::Config for Runtime {
type SessionInterface = Self;
type EraPayout = pallet_staking::ConvertCurve<RewardCurve>;
type MaxExposurePageSize = MaxExposurePageSize;
type OffendingValidatorsThreshold = OffendingValidatorsThreshold;
type NextNewSession = Session;
type ElectionProvider = ElectionProviderMultiPhase;
type GenesisElectionProvider = onchain::OnChainExecution<OnChainSeqPhragmen>;
Expand All @@ -647,6 +645,7 @@ impl pallet_staking::Config for Runtime {
type BenchmarkingConfig = runtime_common::StakingBenchmarkingConfig;
type EventListeners = NominationPools;
type WeightInfo = weights::pallet_staking::WeightInfo<Runtime>;
type DisablingStrategy = pallet_staking::UpToLimitDisablingStrategy;
}

impl pallet_fast_unstake::Config for Runtime {
Expand Down Expand Up @@ -1649,7 +1648,7 @@ pub mod migrations {
}

/// Unreleased migrations. Add new ones here:
pub type Unreleased = ();
pub type Unreleased = (pallet_staking::migrations::v15::MigrateV14ToV15<Runtime>,);
}

/// Unchecked extrinsic type as expected by this runtime.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ requests = { memory = "2G", cpu = "1" }
[[relaychain.node_groups]]
name = "honest-validator"
count = 3
args = ["-lparachain=debug"]
args = ["-lparachain=debug,runtime::staking=debug"]

[[relaychain.node_groups]]
image = "{{MALUS_IMAGE}}"
Expand Down
28 changes: 28 additions & 0 deletions prdoc/pr_2226.prdoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
title: Validator disabling strategy in runtime

doc:
- audience: Node Operator
description: |
On each committed offence (no matter slashable or not) the offending validator will be
disabled for a whole era.
- audience: Runtime Dev
description: |
The disabling strategy in staking pallet is no longer hardcoded but abstracted away via
`DisablingStrategy` trait. The trait contains a single function (make_disabling_decision) which
is called for each offence. The function makes a decision if (and which) validators should be
disabled. A default implementation is provided - `UpToLimitDisablingStrategy`. It
will be used on Kusama and Polkadot. In nutshell `UpToLimitDisablingStrategy`
disables offenders up to the configured threshold. Offending validators are not disabled for
offences in previous eras. The threshold is controlled via `DISABLING_LIMIT_FACTOR` (a generic
parameter of `UpToLimitDisablingStrategy`).

migrations:
db: []
runtime:
- reference: pallet-staking
description: |
Renames `OffendingValidators` storage item to `DisabledValidators` and changes its type from
`Vec<(u32, bool)>` to `Vec<u32>`.

crates:
- name: pallet-staking
3 changes: 1 addition & 2 deletions substrate/bin/node/runtime/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -654,7 +654,6 @@ parameter_types! {
pub const SlashDeferDuration: sp_staking::EraIndex = 24 * 7; // 1/4 the bonding duration.
pub const RewardCurve: &'static PiecewiseLinear<'static> = &REWARD_CURVE;
pub const MaxNominators: u32 = 64;
pub const OffendingValidatorsThreshold: Perbill = Perbill::from_percent(17);
pub const MaxControllersInDeprecationBatch: u32 = 5900;
pub OffchainRepeat: BlockNumber = 5;
pub HistoryDepth: u32 = 84;
Expand Down Expand Up @@ -690,7 +689,6 @@ impl pallet_staking::Config for Runtime {
type EraPayout = pallet_staking::ConvertCurve<RewardCurve>;
type NextNewSession = Session;
type MaxExposurePageSize = ConstU32<256>;
type OffendingValidatorsThreshold = OffendingValidatorsThreshold;
type ElectionProvider = ElectionProviderMultiPhase;
type GenesisElectionProvider = onchain::OnChainExecution<OnChainSeqPhragmen>;
type VoterList = VoterList;
Expand All @@ -703,6 +701,7 @@ impl pallet_staking::Config for Runtime {
type EventListeners = NominationPools;
type WeightInfo = pallet_staking::weights::SubstrateWeight<Runtime>;
type BenchmarkingConfig = StakingBenchmarkingConfig;
type DisablingStrategy = pallet_staking::UpToLimitDisablingStrategy;
}

impl pallet_fast_unstake::Config for Runtime {
Expand Down
3 changes: 1 addition & 2 deletions substrate/frame/babe/src/mock.rs
Original file line number Diff line number Diff line change
Expand Up @@ -144,7 +144,6 @@ parameter_types! {
pub const BondingDuration: EraIndex = 3;
pub const SlashDeferDuration: EraIndex = 0;
pub const RewardCurve: &'static PiecewiseLinear<'static> = &REWARD_CURVE;
pub const OffendingValidatorsThreshold: Perbill = Perbill::from_percent(16);
pub static ElectionsBounds: ElectionBounds = ElectionBoundsBuilder::default().build();
}

Expand Down Expand Up @@ -174,7 +173,6 @@ impl pallet_staking::Config for Test {
type UnixTime = pallet_timestamp::Pallet<Test>;
type EraPayout = pallet_staking::ConvertCurve<RewardCurve>;
type MaxExposurePageSize = ConstU32<64>;
type OffendingValidatorsThreshold = OffendingValidatorsThreshold;
type NextNewSession = Session;
type ElectionProvider = onchain::OnChainExecution<OnChainSeqPhragmen>;
type GenesisElectionProvider = Self::ElectionProvider;
Expand All @@ -187,6 +185,7 @@ impl pallet_staking::Config for Test {
type EventListeners = ();
type BenchmarkingConfig = pallet_staking::TestBenchmarkingConfig;
type WeightInfo = ();
type DisablingStrategy = pallet_staking::UpToLimitDisablingStrategy;
}

impl pallet_offences::Config for Test {
Expand Down
3 changes: 1 addition & 2 deletions substrate/frame/beefy/src/mock.rs
Original file line number Diff line number Diff line change
Expand Up @@ -158,7 +158,6 @@ parameter_types! {
pub const SessionsPerEra: SessionIndex = 3;
pub const BondingDuration: EraIndex = 3;
pub const RewardCurve: &'static PiecewiseLinear<'static> = &REWARD_CURVE;
pub const OffendingValidatorsThreshold: Perbill = Perbill::from_percent(17);
pub static ElectionsBoundsOnChain: ElectionBounds = ElectionBoundsBuilder::default().build();
}

Expand Down Expand Up @@ -188,7 +187,6 @@ impl pallet_staking::Config for Test {
type UnixTime = pallet_timestamp::Pallet<Test>;
type EraPayout = pallet_staking::ConvertCurve<RewardCurve>;
type MaxExposurePageSize = ConstU32<64>;
type OffendingValidatorsThreshold = OffendingValidatorsThreshold;
type NextNewSession = Session;
type ElectionProvider = onchain::OnChainExecution<OnChainSeqPhragmen>;
type GenesisElectionProvider = Self::ElectionProvider;
Expand All @@ -201,6 +199,7 @@ impl pallet_staking::Config for Test {
type EventListeners = ();
type BenchmarkingConfig = pallet_staking::TestBenchmarkingConfig;
type WeightInfo = ();
type DisablingStrategy = pallet_staking::UpToLimitDisablingStrategy;
}

impl pallet_offences::Config for Test {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@ pub(crate) const LOG_TARGET: &str = "tests::e2e-epm";
use frame_support::{assert_err, assert_noop, assert_ok};
use mock::*;
use sp_core::Get;
use sp_npos_elections::{to_supports, StakedAssignment};
use sp_runtime::Perbill;

use crate::mock::RuntimeOrigin;
Expand Down Expand Up @@ -127,75 +126,48 @@ fn offchainify_works() {
}

#[test]
/// Replicates the Kusama incident of 8th Dec 2022 and its resolution through the governance
/// Inspired by the Kusama incident of 8th Dec 2022 and its resolution through the governance
/// fallback.
///
/// After enough slashes exceeded the `Staking::OffendingValidatorsThreshold`, the staking pallet
/// set `Forcing::ForceNew`. When a new session starts, staking will start to force a new era and
/// calls <EPM as election_provider>::elect(). If at this point EPM and the staking miners did not
/// have enough time to queue a new solution (snapshot + solution submission), the election request
/// fails. If there is no election fallback mechanism in place, EPM enters in emergency mode.
/// Recovery: Once EPM is in emergency mode, subsequent calls to `elect()` will fail until a new
/// solution is added to EPM's `QueuedSolution` queue. This can be achieved through
/// `Call::set_emergency_election_result` or `Call::governance_fallback` dispatchables. Once a new
/// solution is added to the queue, EPM phase transitions to `Phase::Off` and the election flow
/// restarts. Note that in this test case, the emergency throttling is disabled.
fn enters_emergency_phase_after_forcing_before_elect() {
/// Mass slash of validators shouldn't disable more than 1/3 of them (the byzantine threshold). Also
/// no new era should be forced which could lead to EPM entering emergency mode.
fn mass_slash_doesnt_enter_emergency_phase() {
let epm_builder = EpmExtBuilder::default().disable_emergency_throttling();
let (ext, pool_state, _) = ExtBuilder::default().epm(epm_builder).build_offchainify();

execute_with(ext, || {
log!(
trace,
"current validators (staking): {:?}",
<Runtime as pallet_staking::SessionInterface<AccountId>>::validators()
);
let session_validators_before = Session::validators();

roll_to_epm_off();
assert!(ElectionProviderMultiPhase::current_phase().is_off());
let staking_builder = StakingExtBuilder::default().validator_count(7);
let (mut ext, _, _) = ExtBuilder::default()
.epm(epm_builder)
.staking(staking_builder)
.build_offchainify();

ext.execute_with(|| {
assert_eq!(pallet_staking::ForceEra::<Runtime>::get(), pallet_staking::Forcing::NotForcing);
// slashes so that staking goes into `Forcing::ForceNew`.
slash_through_offending_threshold();

assert_eq!(pallet_staking::ForceEra::<Runtime>::get(), pallet_staking::Forcing::ForceNew);
let active_set_size_before_slash = Session::validators().len();

advance_session_delayed_solution(pool_state.clone());
assert!(ElectionProviderMultiPhase::current_phase().is_emergency());
log_current_time();
// Slash more than 1/3 of the active validators
let mut slashed = slash_half_the_active_set();

let era_before_delayed_next = Staking::current_era();
// try to advance 2 eras.
assert!(start_next_active_era_delayed_solution(pool_state.clone()).is_ok());
assert_eq!(Staking::current_era(), era_before_delayed_next);
assert!(start_next_active_era(pool_state).is_err());
assert_eq!(Staking::current_era(), era_before_delayed_next);
let active_set_size_after_slash = Session::validators().len();

// EPM is still in emergency phase.
assert!(ElectionProviderMultiPhase::current_phase().is_emergency());
// active set should stay the same before and after the slash
assert_eq!(active_set_size_before_slash, active_set_size_after_slash);

// session validator set remains the same.
assert_eq!(Session::validators(), session_validators_before);

// performs recovery through the set emergency result.
let supports = to_supports(&vec![
StakedAssignment { who: 21, distribution: vec![(21, 10)] },
StakedAssignment { who: 31, distribution: vec![(21, 10), (31, 10)] },
StakedAssignment { who: 41, distribution: vec![(41, 10)] },
]);
assert!(ElectionProviderMultiPhase::set_emergency_election_result(
RuntimeOrigin::root(),
supports
)
.is_ok());
// Slashed validators are disabled up to a limit
slashed.truncate(
pallet_staking::UpToLimitDisablingStrategy::<SLASHING_DISABLING_FACTOR>::disable_limit(
active_set_size_after_slash,
),
);

// EPM can now roll to signed phase to proceed with elections. The validator set is the
// expected (ie. set through `set_emergency_election_result`).
roll_to_epm_signed();
//assert!(ElectionProviderMultiPhase::current_phase().is_signed());
assert_eq!(Session::validators(), vec![21, 31, 41]);
assert_eq!(Staking::current_era(), era_before_delayed_next.map(|e| e + 1));
// Find the indices of the disabled validators
let active_set = Session::validators();
let expected_disabled = slashed
.into_iter()
.map(|d| active_set.iter().position(|a| *a == d).unwrap() as u32)
.collect::<Vec<_>>();

assert_eq!(pallet_staking::ForceEra::<Runtime>::get(), pallet_staking::Forcing::NotForcing);
assert_eq!(Session::disabled_validators(), expected_disabled);
});
}

Expand Down Expand Up @@ -253,77 +225,7 @@ fn continuous_slashes_below_offending_threshold() {
}

#[test]
/// Slashed validator sets intentions in the same era of slashing.
///
/// When validators are slashed, they are chilled and removed from the current `VoterList`. Thus,
/// the slashed validator should not be considered in the next validator set. However, if the
/// slashed validator sets its intention to validate again in the same era when it was slashed and
/// chilled, the validator may not be removed from the active validator set across eras, provided
/// it would selected in the subsequent era if there was no slash. Nominators of the slashed
/// validator will also be slashed and chilled, as expected, but the nomination intentions will
/// remain after the validator re-set the intention to be validating again.
///
/// This behaviour is due to removing implicit chill upon slash
/// <https://github.com/paritytech/substrate/pull/12420>.
///
/// Related to <https://github.com/paritytech/substrate/issues/13714>.
fn set_validation_intention_after_chilled() {
use frame_election_provider_support::SortedListProvider;
use pallet_staking::{Event, Forcing, Nominators};

let (ext, pool_state, _) = ExtBuilder::default()
.epm(EpmExtBuilder::default())
.staking(StakingExtBuilder::default())
.build_offchainify();

execute_with(ext, || {
assert_eq!(active_era(), 0);
// validator is part of the validator set.
assert!(Session::validators().contains(&41));
assert!(<Runtime as pallet_staking::Config>::VoterList::contains(&41));

// nominate validator 81.
assert_ok!(Staking::nominate(RuntimeOrigin::signed(21), vec![41]));
assert_eq!(Nominators::<Runtime>::get(21).unwrap().targets, vec![41]);

// validator is slashed. it is removed from the `VoterList` through chilling but in the
// current era, the validator is still part of the active validator set.
add_slash(&41);
assert!(Session::validators().contains(&41));
assert!(!<Runtime as pallet_staking::Config>::VoterList::contains(&41));
assert_eq!(
staking_events(),
[
Event::Chilled { stash: 41 },
Event::ForceEra { mode: Forcing::ForceNew },
Event::SlashReported {
validator: 41,
slash_era: 0,
fraction: Perbill::from_percent(10)
}
],
);

// after the nominator is slashed and chilled, the nominations remain.
assert_eq!(Nominators::<Runtime>::get(21).unwrap().targets, vec![41]);

// validator sets intention to stake again in the same era it was chilled.
assert_ok!(Staking::validate(RuntimeOrigin::signed(41), Default::default()));

// progress era and check that the slashed validator is still part of the validator
// set.
assert!(start_next_active_era(pool_state).is_ok());
assert_eq!(active_era(), 1);
assert!(Session::validators().contains(&41));
assert!(<Runtime as pallet_staking::Config>::VoterList::contains(&41));

// nominations are still active as before the slash.
assert_eq!(Nominators::<Runtime>::get(21).unwrap().targets, vec![41]);
})
}

#[test]
/// Active ledger balance may fall below ED if account chills before unbonding.
/// Active ledger balance may fall below ED if account chills before unbounding.
///
/// Unbonding call fails if the remaining ledger's stash balance falls below the existential
/// deposit. However, if the stash is chilled before unbonding, the ledger's active balance may
Expand Down
Loading

0 comments on commit 988e30f

Please sign in to comment.