Upgrade in-memory and on-disk state representation with tree states #3206
base: unstable
Conversation
## Issue Addressed

Part of #3651.

## Proposed Changes

Add a flag for enabling the light client server, which should be checked before gossip/RPC traffic is processed (e.g. #3693, #3711). The flag is available at runtime from `beacon_chain.config.enable_light_client_server`.

Additionally, a new method `BeaconChain::with_mutable_state_for_block` is added, which I envisage being used for computing light client updates. Unfortunately its performance will be quite poor on average, because it only runs quickly with access to the tree hash cache. Each slot, the tree hash cache is available only for a brief window between the head block being processed and the state advance at 9s into the slot. When the state advance happens, the cache is moved and mutated to get ready for the next slot, which makes it no longer useful for merkle proofs related to the head block.

Rather than spend more time trying to optimise this, I think we should continue prototyping with this code, and I'll make sure `tree-states` is ready to ship before we enable the light client server in prod (cf. #3206).

## Additional Info

I also fixed a bug in the implementation of `BeaconState::compute_merkle_proof` whereby the tree hash cache was moved with `.take()` but never put back with `.restore()`.
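The flag-gating pattern described above (check `enable_light_client_server` before any light client gossip/RPC work) can be sketched as follows. This is a minimal, self-contained illustration — the struct and method names mirror the PR's description but are not Lighthouse's actual types:

```rust
// Hypothetical sketch of gating light client traffic on a config flag.
// `ChainConfig`, `BeaconChain`, and `on_light_client_gossip` are
// illustrative stand-ins, not Lighthouse's real API.

#[derive(Default)]
struct ChainConfig {
    enable_light_client_server: bool,
}

struct BeaconChain {
    config: ChainConfig,
}

#[derive(Debug, PartialEq)]
enum GossipOutcome {
    Ignored,
    Processed,
}

impl BeaconChain {
    /// Bail out early unless the light client server is enabled, mirroring
    /// the check the PR says should run before gossip/RPC processing.
    fn on_light_client_gossip(&self, _message: &[u8]) -> GossipOutcome {
        if !self.config.enable_light_client_server {
            return GossipOutcome::Ignored;
        }
        // Real processing (e.g. computing a light client update) would go here.
        GossipOutcome::Processed
    }
}

fn main() {
    let disabled = BeaconChain { config: ChainConfig::default() };
    assert_eq!(disabled.on_light_client_gossip(b"update"), GossipOutcome::Ignored);

    let enabled = BeaconChain {
        config: ChainConfig { enable_light_client_server: true },
    };
    assert_eq!(enabled.on_light_client_gossip(b"update"), GossipOutcome::Processed);
    println!("ok");
}
```

Keeping the check at the entry point means disabled nodes pay no cost for light client traffic beyond the branch itself.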
Force-pushed from ba7ad3e to 23db089
* Address some clippy lints
* Box errors to fix error size lint
* Add Default impl for Validator
* Address more clippy lints
* Re-implement `check_state_diff`
* Fix misc test compile errors
* Use "rebasing" to minimise BeaconState mem usage
* Update metastruct
* Use upstream milhouse, update cargo lock
* Rebase caches for extra memory savings
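The "rebasing" idea above — deduplicating memory by pointing equal parts of a freshly loaded state at the base state's existing allocations — can be sketched with plain `Arc` sharing. This is an illustrative model, not milhouse's actual API:

```rust
use std::sync::Arc;

// Illustrative sketch of "rebasing": after loading a state, chunks equal to
// the base state's chunks are replaced with clones of the base's Arc
// pointers, so identical data is stored once. `List` is a stand-in type.

#[derive(Clone)]
struct List {
    chunks: Vec<Arc<Vec<u8>>>,
}

impl List {
    /// Share allocations with `base` wherever the contents are equal.
    fn rebase_on(&mut self, base: &List) {
        for (chunk, base_chunk) in self.chunks.iter_mut().zip(&base.chunks) {
            if **chunk == **base_chunk {
                // Drop our private copy and point at the base's allocation.
                *chunk = Arc::clone(base_chunk);
            }
        }
    }
}

fn main() {
    let base = List {
        chunks: vec![Arc::new(vec![1, 2]), Arc::new(vec![3, 4])],
    };
    let mut state = List {
        chunks: vec![Arc::new(vec![1, 2]), Arc::new(vec![9, 9])],
    };
    state.rebase_on(&base);
    // The equal chunk now shares the base's allocation...
    assert!(Arc::ptr_eq(&state.chunks[0], &base.chunks[0]));
    // ...while the differing chunk keeps its own copy.
    assert!(!Arc::ptr_eq(&state.chunks[1], &base.chunks[1]));
    println!("ok");
}
```

With many near-identical states in memory (as in a beacon node's state cache), this kind of structural sharing is where the bulk of the savings comes from.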
* Get tests passing
* Get benchmarks compiling
* Fix EF withdrawals test
* Remove unused deps
* Fix tree_hash panic in tests
* Fix slasher compilation
* Fix ssz_generic test
* Get more tests passing
* Fix EF tests for real
* Fix local testnet scripts
Workaround for #4236
* Fix db-migration-period default
* Fix version regex
* Relocate epoch cache to BeaconState
* Optimize per block processing by pulling previous epoch & current epoch calculation up
* Revert `get_cow` change (no performance improvement)
* Initialize `EpochCache` in epoch processing and load it from state when getting base rewards
* Initialize `EpochCache` at start of block processing if required
* Initialize `EpochCache` in `transition_blocks` if `exclude_cache_builds` is enabled
* Fix epoch cache initialization logic
* Remove FIXME comment
* Cache previous & current epochs in `consensus_context.rs`
* Move `get_base_rewards` from `ConsensusContext` to `BeaconState`
* Update Milhouse version
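The commits above describe a lazy-initialization pattern for the epoch cache: build it at most once per epoch at the start of block/epoch processing, and fail loudly on lookups if initialization was skipped. A minimal sketch, with illustrative field and method names (not Lighthouse's real `EpochCache`):

```rust
// Hypothetical sketch of "initialize EpochCache at start of block
// processing if required". Names and the placeholder reward values are
// illustrative only.

#[derive(Default)]
struct EpochCache {
    initialized_for_epoch: Option<u64>,
    base_rewards: Vec<u64>,
}

struct BeaconState {
    current_epoch: u64,
    epoch_cache: EpochCache,
}

impl BeaconState {
    /// Build the cache unless it is already valid for the current epoch
    /// (cheap no-op otherwise).
    fn initialize_epoch_cache(&mut self) {
        if self.epoch_cache.initialized_for_epoch != Some(self.current_epoch) {
            self.epoch_cache = EpochCache {
                initialized_for_epoch: Some(self.current_epoch),
                // Placeholder per-effective-balance base rewards.
                base_rewards: vec![64; 4],
            };
        }
    }

    /// Lookups error out rather than silently recomputing, so a missing
    /// initialization call is caught immediately.
    fn get_base_reward(&self, index: usize) -> Result<u64, &'static str> {
        if self.epoch_cache.initialized_for_epoch != Some(self.current_epoch) {
            return Err("EpochCache not initialized");
        }
        self.epoch_cache
            .base_rewards
            .get(index)
            .copied()
            .ok_or("index out of range")
    }
}

fn main() {
    let mut state = BeaconState { current_epoch: 10, epoch_cache: EpochCache::default() };
    assert!(state.get_base_reward(0).is_err());
    state.initialize_epoch_cache();
    assert_eq!(state.get_base_reward(0), Ok(64));
    println!("ok");
}
```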
@@ -377,131 +378,3 @@ fn slot_of_prev_restore_point<E: EthSpec>(current_slot: Slot) -> Slot {
    let slots_per_historical_root = E::SlotsPerHistoricalRoot::to_u64();
    (current_slot - 1) / slots_per_historical_root * slots_per_historical_root
}

#[cfg(test)]
mod test {
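As a sanity check of the restore point arithmetic in `slot_of_prev_restore_point`: integer division followed by multiplication, `(current_slot - 1) / n * n`, rounds down to the last multiple of `slots_per_historical_root` strictly before `current_slot`. A standalone version (generic type parameters replaced by plain `u64` for illustration):

```rust
// Standalone model of the restore point arithmetic. The -1 ensures a slot
// that sits exactly on a boundary maps to the *previous* boundary, since a
// restore point must be strictly before the current slot.

fn slot_of_prev_restore_point(current_slot: u64, slots_per_historical_root: u64) -> u64 {
    (current_slot - 1) / slots_per_historical_root * slots_per_historical_root
}

fn main() {
    let n = 8192; // mainnet SlotsPerHistoricalRoot
    // A slot exactly on a boundary maps to the previous boundary...
    assert_eq!(slot_of_prev_restore_point(8192, n), 0);
    // ...and any slot just after a boundary maps back to that boundary.
    assert_eq!(slot_of_prev_restore_point(8193, n), 8192);
    assert_eq!(slot_of_prev_restore_point(20000, n), 16384);
    println!("ok");
}
```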
Intentional delete? `BlockRootsIterator` still exists.
@@ -26,6 +26,7 @@ impl<E: EthSpec> LevelDB<E> {
    let mut options = Options::new();

    options.create_if_missing = true;
    options.write_buffer_size = Some(512 * 1024 * 1024);
Add a comment justifying the increased value.
consensus/state_processing/src/common/initiate_validator_exit.rs
let epoch_participation = state.get_epoch_participation_mut(
    data.target.epoch,
    ctxt.previous_epoch,
    ctxt.current_epoch,
Does this optimization make a difference? Introduced in 2df714e
consensus/types/src/beacon_state.rs
    .map_or(false, VList::has_pending_updates)
    || self
        .inactivity_scores()
        .map_or(false, VList::has_pending_updates)
Forgetting to add a field in a new fork here may cause either a panic or data corruption. This function could be generated by a macro?
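The macro suggestion could look something like the following declarative macro, which generates the `has_pending_updates` OR-chain from a single field list, so a new fork field is added in one place. This is a sketch with stand-in types, not Lighthouse's actual metastruct machinery:

```rust
// Sketch of generating the has_pending_updates check from a field list.
// `VList` and `State` are illustrative stand-ins.

struct VList(bool);

impl VList {
    fn has_pending_updates(&self) -> bool {
        self.0
    }
}

struct State {
    balances: VList,
    inactivity_scores: VList,
}

/// Expand to `false || state.f1.has_pending_updates() || state.f2...`,
/// so omitting a field is a single visible edit site rather than a silent
/// gap in a hand-written chain.
macro_rules! any_pending_updates {
    ($state:expr, $($field:ident),+ $(,)?) => {
        false $(|| $state.$field.has_pending_updates())+
    };
}

fn main() {
    let state = State { balances: VList(false), inactivity_scores: VList(true) };
    assert!(any_pending_updates!(state, balances, inactivity_scores));

    let clean = State { balances: VList(false), inactivity_scores: VList(false) };
    assert!(!any_pending_updates!(clean, balances, inactivity_scores));
    println!("ok");
}
```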
    base_inner,
    |_, self_field, base_field| { self_field.rebase_on(base_field) }
);
}
Not using Deneb's `bimap_beacon_state_deneb_tree_list_fields`.
* Delete unused epoch processing code
* Compare total deltas
* Remove unnecessary apply_pending
* cargo fmt
* Remove newline
    "Updating historic state config";
    "previous_config" => ?disk_config.hierarchy_config,
    "new_config" => ?db.config.hierarchy_config,
);
@michaelsproul I recall you commenting that changing the `hierarchy_config` is not possible without a re-sync. Should this comment state so?
Well, if you see this comment then it has already worked, so I'm not sure it's helpful at this stage. You can change it by running `prune-states` and then restarting with the new config. I think it probably makes sense to just have that in the docs.
* Fix beta compiler lints
* Fix fork choice tests
* Fix op_pool tests

Co-authored-by: dapplion <35266934+dapplion@users.noreply.github.com>
* Fix CLI help text
* Tree states release v5.0.111-exp
* Fix clippy
* Fix beacon chain tests
* Comment out dodgy iterator test
* Fix imports for beta compiler
This is an umbrella PR for fundamental changes to Lighthouse's state representation. The plan is to merge smaller PRs into this one, and then merge it to `unstable` without squashing.

## TODO List

There are a few things that need fixing on this branch before it's ready for primetime:

* `get_state`: the recursion is liable to overflow the stack, so we should switch to an iterative algorithm.
* `jemalloc`
* `FIXME(sproul)`s in the code

## Issues Addressed
Closes #2806
Closes #2805
Closes #2944
Closes #2371
Closes #3208
Closes #3505
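On the `get_state` TODO item: the generic fix for recursion that can overflow the stack is to replace the call stack with an explicit heap-allocated worklist. The sketch below applies that to a simple binary-tree sum as a stand-in; the real `get_state` logic is not shown here:

```rust
// Illustrative recursion-to-iteration conversion. A Vec-backed worklist
// bounds traversal depth by heap size rather than the machine stack.

enum Node {
    Leaf(u64),
    Branch(Box<Node>, Box<Node>),
}

/// Iterative traversal: deep trees cannot overflow the call stack because
/// pending work lives in a heap-allocated Vec.
fn sum_iterative(root: &Node) -> u64 {
    let mut total = 0;
    let mut stack: Vec<&Node> = vec![root];
    while let Some(node) = stack.pop() {
        match node {
            Node::Leaf(v) => total += *v,
            Node::Branch(l, r) => {
                stack.push(l.as_ref());
                stack.push(r.as_ref());
            }
        }
    }
    total
}

fn main() {
    // A degenerate left-leaning tree: each level adds one leaf of value 1.
    let mut root = Node::Leaf(1);
    for _ in 0..10_000 {
        root = Node::Branch(Box::new(root), Box::new(Node::Leaf(1)));
    }
    assert_eq!(sum_iterative(&root), 10_001);
    println!("ok");
}
```

A naive recursive version of the same function would use one stack frame per tree level, which is exactly the failure mode the TODO describes.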