Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Bank/Snapshot) Add prior_roots to bank fields. #23331

Closed
wants to merge 1 commit into from
Closed

(Bank/Snapshot) Add prior_roots to bank fields. #23331

wants to merge 1 commit into from

Conversation

yhchiang-sol
Copy link
Contributor

@yhchiang-sol yhchiang-sol commented Feb 24, 2022

Summary of Changes

Prior roots are defined as roots that are no longer roots in the current bank instance.
This PR adds prior_roots to bank fields to allow snapshots carrying prior roots information.

As the PR changes the snapshot format, it also includes the following changes:

  • the existing newer snapshot becomes older.
  • snapshot version V1_2_0 points to older (i.e., the existing version).
  • the new snapshot version V1_2_1 points to the updated newer, which contains the new prior_roots field.
  • snapshot version still defaults to V1_2_0, and a separate PR will bump the version to V1_2_1.

Test Plan

Update test_bank_serialize_style to include a test prior_roots.
Verify the test will fail if I comment out the forwarding assignment for prior_roots.

@yhchiang-sol yhchiang-sol marked this pull request as draft February 24, 2022 20:37
@yhchiang-sol yhchiang-sol marked this pull request as ready for review February 24, 2022 21:38
@yhchiang-sol
Copy link
Contributor Author

Looks like the deserialization is handled here in deserialize_bank_fields, which is done under deserialize_from via DeserializableVersionedBank:

fn deserialize_bank_fields<R>(
mut stream: &mut BufReader<R>,
) -> Result<(BankFieldsToDeserialize, AccountsDbFields), Error>
where
R: Read,
{
let bank_fields = deserialize_from::<_, DeserializableVersionedBank>(&mut stream)?.into();
let accounts_db_fields = Self::deserialize_accounts_db_fields(stream)?;
Ok((bank_fields, accounts_db_fields))
}

And deserialize_bank_fields is used in bank_from_streams.

Comment on lines +205 to +221
let mut test_prior_roots = vec![0, 1, 2];
bank2.prior_roots.append(&mut test_prior_roots);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I created a test prior_roots, and the field will be verified at line 255 assert!(bank2 == dbank);.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is the right spot for tests like this. Maybe another test can be added that explicitly is said to test for forwards/ backwards compatibility?

Basically I'm not sure what the right behavior is here. An old snapshot should be usable, but will require the Bank to reconstruct prior_roots (if that's needed, or is prior_root always empty when loading from a snapshot?). A new snapshot won't work with an old validator, so that's not something I think we need to test for.

If I can pinpoint what I'm thinking about and trying to articulate, I'll make sure to reply. Thanks for bearing with me 😊

@jeffwashington
Copy link
Contributor

I imagine the abi hash test will fail. That is fun to learn about ;-)
Then, we'll need a snapshot version change, I imagine, because I bet the deserialization will not work from a snapshot that did not have this prior_roots field previously in the bank. That's where the snapshot version comes in (I think). I think you'll have to bump the snapshot version somehow. There's a lot here I have not done yet.
This is off to a good start.

@yhchiang-sol
Copy link
Contributor Author

Looks like serde_snapshot::tests::test_bank_serialize::BankAbiTestWrapperNewer_frozen_abi::test_abi_digest does fail as expected, which means my PR has some effect!

---- serde_snapshot::tests::test_bank_serialize::BankAbiTestWrapperNewer_frozen_abi::test_abi_digest stdout ----
--
  | thread 'serde_snapshot::tests::test_bank_serialize::BankAbiTestWrapperNewer_frozen_abi::test_abi_digest' panicked at 'assertion failed: `(left == right)`
  | left: `"HVyzePMkma8T54PymRW32FAgDXpSdom59K6RnPsCNJjj"`,
  | right: `"13foLhXovret1Wns2W1UzqagFmMFiUKqYirggMH7BFsE"`: Possibly ABI changed? Confirm the diff by rerunning before and after this test failed with SOLANA_ABI_DUMP_DIR!', runtime/src/serde_snapshot/tests.rs:310:5
  | stack backtrace:
  | 0: rust_begin_unwind
  | at /rustc/777bb86bcdbc568be7cff6eeeaaf81a89b4aa50b/library/std/src/panicking.rs:577:5
  | 1: core::panicking::panic_fmt
  | at /rustc/777bb86bcdbc568be7cff6eeeaaf81a89b4aa50b/library/core/src/panicking.rs:110:14
  | 2: core::panicking::assert_failed_inner
  | 3: core::panicking::assert_failed
  | at /rustc/777bb86bcdbc568be7cff6eeeaaf81a89b4aa50b/library/core/src/panicking.rs:149:5
  | 4: solana_runtime::serde_snapshot::tests::test_bank_serialize::BankAbiTestWrapperNewer_frozen_abi::test_abi_digest
  | at ./src/serde_snapshot/tests.rs:310:5
  | 5: solana_runtime::serde_snapshot::tests::test_bank_serialize::BankAbiTestWrapperNewer_frozen_abi::test_abi_digest::{{closure}}
  | at ./src/serde_snapshot/tests.rs:310:5
  | 6: core::ops::function::FnOnce::call_once
  | at /rustc/777bb86bcdbc568be7cff6eeeaaf81a89b4aa50b/library/core/src/ops/function.rs:227:5
  | 7: core::ops::function::FnOnce::call_once
  | at /rustc/777bb86bcdbc568be7cff6eeeaaf81a89b4aa50b/library/core/src/ops/function.rs:227:5
  | note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.


I will dive deeper into it!

@brooksprumo
Copy link
Contributor

I'll take a look at this PR in-depth soon; just wanted to share this PR of mine where I did just this (to add a new snapshot version):
#22467

@yhchiang-sol
Copy link
Contributor Author

I'll take a look at this PR in-depth soon; just wanted to share this PR of mine where I did just this (to add a new snapshot version):
#22467

Thanks for the information, @brooksprumo. It is indeed more complicated than I thought.

Quick question: other than test and util, do we always keep only at most two versions of serde_snapshot? Naming older and newer? When we introduce a new one, we always move the existing "newer" into "older", then start modifying the "newer"? Once the transition is done and we purge the older one?

In case the first answer is no, then do we want to rename those "older" to something like "verXXX" or something?

@yhchiang-sol
Copy link
Contributor Author

Updated the PR description. Specifically adds the followings:

  • the existing newer snapshot becomes older.
  • snapshot version V1_2_0 points to older (i.e., the existing version).
  • the new snapshot version V1_2_1 points to the updated newer, which contains the new prior_roots field.
  • snapshot version still defaults to V1_2_0, and a separate PR will bump the version to V1_2_1.

@Ashry00

This comment was marked as spam.

@brooksprumo
Copy link
Contributor

@yhchiang-sol Please do not merge based on Ashry00's approval.

@brooksprumo
Copy link
Contributor

@yhchiang-sol

Quick question: other than test and util, do we always keep only at most two versions of serde_snapshot? Naming older and newer? When we introduce a new one, we always move the existing "newer" into "older", then start modifying the "newer"? Once the transition is done and we purge the older one?

In case the first answer is no, then do we want to rename those "older" to something like "verXXX" or something?

Historically (afaict), the max that we've kept around is two versions. Here are two (big) PRs for the history: PR #9980 and PR #10581. (If folks have older snapshots and need to boot them, then they'll need to run that older version of the validator (or other bin) anyway.)

The naming was/is "older" and "newer", yes. In those PRs you'll see comments where initially the files/names were based on the version number, but that was changed since moving snapshot versions (default and supported) should happen relatively quickly (sorry for not getting the specific comment!).

And yes, purging "older" happens once we say that older snapshot versions cannot be loaded by the cluster anymore.

@brooksprumo
Copy link
Contributor

@yhchiang-sol

Can you also provide context for this PR? What is prompting the addition/use of prior roots? I'm also curious about the use of prior_roots (the field in Bank), its anticipated size, and how it grows—especially since Bank is shared between threads (behind an Arc), which will influence how this field can be modified or not (or copied!).

@yhchiang-sol
Copy link
Contributor Author

Can you also provide context for this PR? What is prompting the addition/use of prior roots? I'm also curious about the use of prior_roots (the field in Bank), its anticipated size, and how it grows—especially since Bank is shared between threads (behind an Arc), which will influence how this field can be modified or not (or copied!).

I think @jeffwashington would be the best person to answer this question :p

@jeffwashington
Copy link
Contributor

jeffwashington commented Feb 25, 2022

Can you also provide context for this PR?

For eliminating rewrites/ancient append vecs as well as a change @carllin is working on, we need the ability to remember slots within that last epoch that WERE roots even if they no longer contain any accounts. Clean and shrink cause us to remove outdated accounts from a slot's appendvec over time. If all accounts are removed from a slot (because they were updated later or became 0 lamport in that slot), then the appendvec is removed and the fact that a root existed at that slot is lost. We need to retain knowledge of slots (within the last epoch) that used to exist but are now gone/missing/ghosts. Bank is possibly not the most intuitive place to persist this. Somewhere in the snapshot, we need to persist this list of slots which now have no non-empty appendvecs but which WERE roots within the last epoch.

@brooksprumo
Copy link
Contributor

Can you also provide context for this PR?

For eliminating rewrites/ancient append vecs as well as a change @carllin is working on, we need the ability to remember slots within that last epoch that WERE roots even if they no longer contain any accounts. Clean and shrink cause us to remove outdated accounts from a slot's appendvec over time. If all accounts are removed from a slot (because they were updated later or became 0 lamport in that slot), then the appendvec is removed and the fact that a root existed at that slot is lost. We need to retain knowledge of slots (within the last epoch) that used to exist but are now gone/missing/ghosts. Bank is possibly not the most intuitive place to persist this. Somewhere in the snapshot, we need to persist this list of slots which now have no non-empty appendvecs but which WERE roots within the last epoch.

@jeffwashington Thanks! Now I remember you explaining this to me for eliminating rewrites. This might need to be in Bank, since the list of prior roots could change in each bank based on the transactions it processes; or at least I'm not sure how to remove it from Bank without just duplicating the same information somewhere else...

@yhchiang-sol Can you add a doc comment (///) to the field in Bank that distills Jeff's comment?

Also, maybe this prior_roots should be a DashSet? Since I think it'll need to be behind something that handles synchronization (like a Mutex<Vec<Slot>>), and I'm making assumptions on the usage patterns being mostly reads, then a Set (and DashSet in particular) would optimize for reads and support Sync.

Thinking more, I'm wondering if the ordering for this PR (updating the snapshot version) should come after introducing/using prior_roots?

@jeffwashington
Copy link
Contributor

Also, maybe this prior_roots should be a DashSet? Since I think it'll need to be behind something that handles synchronization (like a Mutex<Vec<Slot>>), and I'm making assumptions on the usage patterns being mostly reads, then a Set (and DashSet in particular) would optimize for reads and support Sync.

We might need to distinguish what gets persisted vs what exists in memory. Remember that these are old roots. Meaning all contents have been roots. From that sense, every bank will have the same view on the shared data. Roots today are stored from accounts. The in-memory, runtime data structure will probably be a RootsTracker bit field. The persisted information looks like Vec. We can tweak the exact data structures and who has what later. The real value in this pr to me is that we are working through what it takes to add info like this to the snapshot. If we can persist Vec, we can save and reconstruct correctly. Perhaps we do even better.

@yhchiang-sol
Copy link
Contributor Author

Updated the PR:

  • Change V1_2_1 to V1_3_0.
  • Keep using Vec<Slot> based on the discussion above.
  • Improve comments in the test.

@yhchiang-sol
Copy link
Contributor Author

Updated abi hash. Included code comments for prior_roots in bank.rs

@codecov
Copy link

codecov bot commented Mar 2, 2022

Codecov Report

Merging #23331 (6cbe253) into master (eff085f) will increase coverage by 0.1%.
The diff coverage is 87.8%.

❗ Current head 6cbe253 differs from pull request most recent head 7e60cd5. Consider uploading reports for the commit 7e60cd5 to get more accurate results

@@            Coverage Diff            @@
##           master   #23331     +/-   ##
=========================================
+ Coverage    81.3%    81.4%   +0.1%     
=========================================
  Files         572      576      +4     
  Lines      155876   156849    +973     
=========================================
+ Hits       126815   127794    +979     
+ Misses      29061    29055      -6     

@solana-labs solana-labs temporarily blocked Ashry00 Mar 3, 2022
@mvines
Copy link
Member

mvines commented Mar 3, 2022

Could we add prior_roots by making them optional in the current snapshot format rather than adding an entirely new snapshot version? This could then be rolled out in two stages:

  1. Update the existing branches to accept / ignore prior_roots
  2. New code on master to write prior_roots

A new snapshot format entirely is a serious PITA to roll out. Trusted validators need to serve the old format until everybody updates to a software version that supports the new format, so lots of server ops coordination

@jeffwashington
Copy link
Contributor

  1. Update the existing branches to accept / ignore prior_roots

Yes. This sounds amazing. Can you please point @yhchiang-sol to an example of this?

@mvines
Copy link
Member

mvines commented Mar 3, 2022

  1. Update the existing branches to accept / ignore prior_roots

Yes. This sounds amazing. Can you please point @yhchiang-sol to an example of this?

It looks like simply adding the #[serde(deserialize_with = "default_on_eof")] directive above the declaration of BankFieldsToDeserialize::prior_roots might do it. That's generally the direction I think we can take here to make serde deserialize the field if it exists and skip it otherwise:

pub fn default_on_eof<'de, T, D>(d: D) -> Result<T, D::Error>

If a vector of length 0 for prior_roots is significant, the definition could be changed slightly from pub(crate) prior_roots: Vec<Slot> to pub(crate) prior_roots: Option<Vec<Slot>>, so that the "older" snapshots would deserialize prior_roots to None while "newer" snapshots deserialize to Some(_).

@jeffwashington
Copy link
Contributor

jeffwashington commented Mar 3, 2022

/// longer contain any accounts. Without remembering those prior roots,
/// accounts that we skip rewrites might have their rent collection time
/// point to the incorrect roots as their correct roots were removed.
pub(crate) prior_roots: Vec<Slot>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carllin is this a sufficient data structure to store what you were looking into regarding slashing and such?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'll require (Slot, Hash) to be perfectly safe, but I wouldn't worry about that for now if that imposes a big burden.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hash = bank hash at that slot?
Are you confident this is useful info for what you're doing? Do we keep around the bank hashes for past slots during the entire epoch? I can look. Just thought you may know.
I could just store default hashes at the moment or Option with None. So, they'd be present in the format, but not populated.

@jeffwashington
Copy link
Contributor

here is an impl of optional persistence of prior_roots based on what @mvines said

I think this is the piece we need to get into probably 1.9 asap so it gets propagated.
We need to confirm this meets @carllin 's needs. @CriesofCarrots and I are running a kin sim now where these snapshots are used by all validators and rpc servers. Hopefully this demonstrates that this will work for sure for my eliminate rewrites use case.

@jeffwashington
Copy link
Contributor

Adding it to bank persistence doesn't work as expected because we serialize bank, then we serialize accounts db, so reading a bank field that don't exist causes errors in deserializing accounts db. Thankfully, 'prior_roots' belongs better in accounts_db ANYWAY, so we CAN add the field to the end of what we deserialize/serialize for accounts_db.

While we're at it, I may try to figure out how to deal with the expectation that we know the hash of the accounts at that slot at the time we serialize. We would like to calculate the full accounts snapshot hash in the background-background, allowing the flush, clean, shrink loop to run more often in the background instead of being gated by calculating the hash...

runtime/src/bank.rs Outdated Show resolved Hide resolved
runtime/src/serde_snapshot/newer.rs Show resolved Hide resolved
@yhchiang-sol
Copy link
Contributor Author

  • Move comments to struct Bank.
  • Keep the updated abi hash as using #[serde(deserialize_with = "default_on_eof")] will still change the hash.
  • rebase

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants