
[wip] Correctly purge zero lamport accounts #7013

Closed
wants to merge 41 commits into master from no-bad-account-in-snapshot

Conversation


@ryoqun ryoqun commented Nov 18, 2019

[READ FIRST] REVIEW POINTS

This is still a draft. The code is full of all kinds of dirtiness/roughness. It seems the current impl mostly works as intended.

I want reviews on the following points, to be sure I can continue in the correct direction before wasting time going in a wrong one:

Problem

Sometimes snapshot restore fails (#6890). At first I thought this was a vote-account-specific issue, but that turned out not to be the case. The more general problem is that snapshot restoration currently can't handle any kind of account with a 0 lamport balance, despite a prior attempt to fix this.
This means it can easily be triggered just by running solana pay XXXXXX 0 lamports. (By the way, solana pay should at least warn about such a meaningless transfer of 0 lamports...)

The problem is that removed zero-lamport accounts can reappear when restoring from snapshots, although we intended to purge them when creating snapshots.
As for in-process state consistency, the validator correctly purges zero-lamport accounts via purge_zero_lamport_accounts. And indeed, it can successfully do this periodically while creating newer snapshots until it restarts, because the running validator never actually uses the snapshot.

Then, when restoring from a snapshot after a simple restart or on a different validator's startup, solana-validator loads the snapshot, and such a bad snapshot aborts the restoration.

The reason zero-lamport accounts can reappear is that while purge_zero_lamport_accounts does correctly purge zero-lamport accounts from the in-process AccountsIndex, it fails to update the ref-counted AccountStorage correctly.
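
To make the mechanism concrete, here is a toy model (stand-in types, not the real accounts_db structures) of why an account purged only from the index comes back: the snapshot serializes the storage, and restore regenerates the index from it.

```rust
use std::collections::HashMap;

type Pubkey = &'static str; // stand-in for a real pubkey
type Slot = u64;

#[derive(Debug)]
struct StoredAccount {
    pubkey: Pubkey,
    lamports: u64,
}

// One append-vec-like storage entry per slot in this toy model.
type Storage = HashMap<Slot, Vec<StoredAccount>>;
// index: pubkey -> slot holding its latest version
type Index = HashMap<Pubkey, Slot>;

// Rebuild the index by scanning storage, as a snapshot restore must do.
fn generate_index(storage: &Storage) -> Index {
    let mut index = Index::new();
    for (&slot, accounts) in storage {
        for account in accounts {
            // the newest slot wins, zero lamports or not
            let entry = index.entry(account.pubkey).or_insert(slot);
            if slot > *entry {
                *entry = slot;
            }
        }
    }
    index
}

fn main() {
    let mut storage = Storage::new();
    storage.insert(1, vec![StoredAccount { pubkey: "A", lamports: 10 }]);
    storage.insert(2, vec![StoredAccount { pubkey: "A", lamports: 0 }]);

    // The running validator purges the zero-lamport account, but only from
    // its in-memory index; the stored data is left untouched.
    let mut index = generate_index(&storage);
    index.remove("A");
    assert!(index.get("A").is_none()); // in-process state looks clean

    // A restart restores from the snapshot (i.e. from storage), so the
    // zero-lamport account reappears in the rebuilt index.
    let restored = generate_index(&storage);
    assert_eq!(restored.get("A"), Some(&2));
}
```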

Solution

First, leaving zero-lamport accounts as-is in the AccountStorage is not acceptable, because this allows malicious snapshot-serving validators to enlarge the active dataset of the account db as they like via a crafted snapshot. They can also shield otherwise legitimate accounts behind a zero balance.

On the other hand, merely decrementing the ref-count of AccountStorage in the hope that its GC nature solves everything for us is not sufficient, because accounts are first grouped, and those groups are ref-counted as the minimum unit. When the group containing a zero-lamport account also contains other active accounts, the group as a whole cannot be purged.

Also, for a sound solution it's preferable to preserve copy-on-write semantics wherever possible, so an in-place update isn't welcome.

Within these constraints, I'd like to propose retaining the zero-lamport account with the greatest slot value, while purging older ones.

This is not exactly a straight fix, because there isn't a good way to convey this info (the purged-ness of accounts) to the next invocation or to a joining validator.

So, in my solution, the retained zero-lamport account works as a special marker for the purged entry. Older account entries shadowed by copy-on-writes will eventually be removed as the cluster moves the state forward and uses fewer old slots, and finally the special marker itself can be removed. I think this introduces exploitable fragmentation, as normal accounts can cause today; to fix that, we'll need a general compaction mechanism.
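
A minimal sketch of the proposed rule, assuming a per-pubkey list of rooted (slot, AccountInfo) versions; these are stand-in types, not the actual AccountsIndex/AccountStorage code, and the real purge also has to reconcile storage ref-counts:

```rust
type Slot = u64;

#[derive(Debug, Clone)]
struct AccountInfo {
    lamports: u64,
    // storage id / offset omitted in this sketch
}

// Given all rooted versions of one account, return the versions to purge:
// if the latest version has zero lamports, purge everything older and keep
// the newest zero-lamport entry as the "purged" marker.
fn versions_to_purge(mut versions: Vec<(Slot, AccountInfo)>) -> Vec<(Slot, AccountInfo)> {
    versions.sort_by_key(|(slot, _)| *slot);
    let latest_is_zero = versions
        .last()
        .map(|(_, info)| info.lamports == 0)
        .unwrap_or(false);
    if latest_is_zero {
        versions.pop(); // retain the marker entry with the greatest slot
        versions        // everything older gets purged
    } else {
        Vec::new() // latest version is funded: nothing special in this sketch
    }
}

fn main() {
    let versions = vec![
        (3, AccountInfo { lamports: 10 }),
        (7, AccountInfo { lamports: 0 }),
        (5, AccountInfo { lamports: 2 }),
    ];
    let purged = versions_to_purge(versions);
    // slots 3 and 5 are purged; the zero-lamport entry at slot 7 remains
    assert_eq!(purged.iter().map(|(s, _)| *s).collect::<Vec<_>>(), vec![3, 5]);
}
```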

Other possible solutions

  • In-place overwrite of accounts data? (Maybe bad for performance, rollback, bankhash, security.)
  • Introduce some end-of-life marker for zero-lamport accounts? (What about the bankhash? Moderate cost to implement ≈ considerable risk of more bugs in such a low-level system component.)
  • Teach the serializer/deserializer to specially handle accounts marked as purged with a tombstone? (This might be a leaky abstraction; on top of that, the same implementation cost and risks apply.)
  • Start serializing/deserializing the index as well? (Takes more time for snapshot creation.)

@ryoqun ryoqun requested review from sakridge and mvines November 18, 2019 17:02
@@ -841,6 +842,25 @@ impl AccountsDB {
infos
}

pub fn verify_zero_lamports(&self, _slot: Slot, ancestors: &HashMap<Slot, usize>) -> bool {
ryoqun (Member, Author):

This moved from here.

let list = accounts_index.account_maps.get(&pubkey).unwrap().read().unwrap();
let list: Vec<(Slot, AccountInfo)> = list.iter().filter(|(slot, _)| accounts_index.is_root(*slot)).cloned().collect();
error!("ryoqun had zero?: {}, {:?}, {}, list: {}", pubkey, account, _b, list.len());
if list.len() > 1 {
ryoqun (Member, Author):

This way, I think we can prevent attackers from shielding legitimate accounts behind a zero balance.

@@ -309,12 +309,13 @@ pub fn process_ping(
let mut submit_count = 0;
let mut confirmed_count = 0;
let mut confirmation_time: VecDeque<u64> = VecDeque::with_capacity(1024);
let lamport = 0;
ryoqun (Member, Author):

These are needed for testing only. I'll create a separate PR for this addition to solana ping.

ryoqun (Member, Author):

By the way, this is done in #7029.

@ryoqun ryoqun force-pushed the no-bad-account-in-snapshot branch from 5571cae to 6c258bb Compare November 21, 2019 09:31
@@ -224,6 +226,7 @@ pub fn bank_from_archive<P: AsRef<Path>>(
unpacked_accounts_dir,
)?;

bank.purge_zero_lamport_accounts();
ryoqun (Member, Author):

As discussed, we now purge zero-lamport accounts both when saving and when loading snapshots.

ryoqun (Member, Author):

In contrast to this, this call cannot be omitted, for security reasons.


pub fn restore_account_count(&self) {
let mut count_and_status = self.count_and_status.write().unwrap();
*count_and_status = (self.all_existing_accounts().len(), count_and_status.1);
ryoqun (Member, Author):

This effectively makes deserialization ignore the count_and_status in the snapshot.

ryoqun (Member, Author):

TODO: rewrite the code to avoid this unneeded serialization in the first place.

self.cleanup_dead_slots(&mut dead_slots, last_root);
error!("ryoqun: zla dead slots 2 {:?}", dead_slots);
for slot in dead_slots {
self.purge_slot(slot);
ryoqun (Member, Author):

Not totally sure, but this is needed?

.cloned()
.collect();
for storage in storage_maps.into_iter() {
storage.restore_account_count();
ryoqun (Member, Author):

Note this comment.

ryoqun (Member, Author):

This code is tightly paired with this code.

slot_id, store.slot_id,
"AccountDB::accounts_index corrupted. Storage should only point to one slot"
);
store.remove_account();
ryoqun (Member, Author):

This maintains the count in count_and_status for updated non-zero-lamport accounts.


pub fn restore_account_count(&self) {
let mut count_and_status = self.count_and_status.write().unwrap();
let new_count = self.all_existing_accounts().len();
@sakridge sakridge (Member) commented Nov 21, 2019:

I don't think this is right actually. This version of the account could be masked by a version of the account on another fork. This should only count accounts where the latest version is represented in this append vec.

@ryoqun ryoqun (Member, Author) commented Nov 21, 2019:

Thanks for the early review!

Yes, your concern is right.

To make sure we don't count older versions of accounts, we gradually decrease the count there by calling remove_account, after first setting it to the full count of stored accounts here. I think calculating the right count directly here is difficult, because this method can't tell whether any newer representation of an account exists elsewhere.

As for the freshness check of the actual account contents, like the counts, we realize it by replacing older accounts with newer ones in chronological order when generating the index, as the original implementation does; the handling is similar to that of the counts.

Yeah, the code is a bit difficult to read. Maybe I can add a comment to this method, or even embed the code directly into generate_index, as that is its sole caller (and should remain so in the future). Even if we completely transition to not serializing the count, I'll leave a comment explaining that.
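
Roughly, the interplay between restore_account_count and remove_account reads like the following stand-in sketch (simplified types and a simplified newest-version rule, not the real AccountStorageEntry / generate_index code):

```rust
use std::collections::HashMap;

type Pubkey = u64;       // stand-in
type WriteVersion = u64; // stand-in

struct StorageEntry {
    stored: Vec<(Pubkey, WriteVersion)>, // accounts physically in this append vec
    count: usize,                        // live-account count being reconstructed
}

impl StorageEntry {
    // Step 1: ignore the count serialized in the snapshot and start from the
    // full number of physically stored accounts.
    fn restore_account_count(&mut self) {
        self.count = self.stored.len();
    }

    // Step 2: called once per stale (shadowed) account; returns the new count.
    fn remove_account(&mut self) -> usize {
        self.count = self.count.saturating_sub(1);
        self.count
    }
}

fn reconcile_counts(storages: &mut [StorageEntry]) {
    // Latest write version per pubkey across all storages, standing in for
    // what index generation learns while rebuilding the index.
    let mut newest: HashMap<Pubkey, WriteVersion> = HashMap::new();
    for storage in storages.iter() {
        for &(pubkey, version) in &storage.stored {
            let entry = newest.entry(pubkey).or_insert(version);
            if version > *entry {
                *entry = version;
            }
        }
    }
    // Any stored version older than the newest one costs a decrement.
    for storage in storages.iter_mut() {
        storage.restore_account_count();
        let stale = storage
            .stored
            .iter()
            .filter(|(pubkey, version)| newest[pubkey] != *version)
            .count();
        for _ in 0..stale {
            storage.remove_account();
        }
    }
}

fn main() {
    let mut storages = vec![
        StorageEntry { stored: vec![(1, 10), (2, 11)], count: 0 },
        StorageEntry { stored: vec![(1, 20)], count: 0 }, // newer version of pubkey 1
    ];
    reconcile_counts(&mut storages);
    assert_eq!(storages[0].count, 1); // (1, 10) is shadowed by (1, 20)
    assert_eq!(storages[1].count, 1);
}
```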

@codecov

codecov bot commented Nov 21, 2019

Codecov Report

Merging #7013 into master will increase coverage by 9.3%.
The diff coverage is 95.9%.

@@           Coverage Diff            @@
##           master   #7013     +/-   ##
========================================
+ Coverage    70.1%   79.4%   +9.3%     
========================================
  Files         230     230             
  Lines       51040   45317   -5723     
========================================
+ Hits        35782   36003    +221     
+ Misses      15258    9314   -5944

@@ -151,6 +151,7 @@ where
}

pub fn add_snapshot<P: AsRef<Path>>(snapshot_path: P, bank: &Bank) -> Result<()> {
trace!("ryoqun saving snapshot");
bank.purge_zero_lamport_accounts();
@ryoqun ryoqun (Member, Author) commented Nov 22, 2019:

Note: We can reduce the frequency of purge_zero_lamport_accounts invocations for a performance gain, because the invocation is completely optional when creating snapshots, other than freeing up some memory consumed by zero-lamport accounts.

storage.all_existing_accounts().iter().for_each(|a| {
if a.meta.pubkey == *pubkey
&& *version != a.meta.write_version
&& storage.remove_account() == 0
ryoqun (Member, Author):

This way we reduce the count by the number of older accounts.

@ryoqun ryoqun (Member, Author) commented Nov 22, 2019

@sakridge Finally, I settled on the algorithm for the count reconstruction when loading snapshots and got CI green. It seems this impl is now working fine locally. I've also replied to your comment. Does this PR look right? There are still rough edges.

TODOs (once the soundness of this PR is confirmed)

  • Update PR description.
  • Stop serializing the count.
  • Clean up unit tests/impl.
  • Add more unit tests and local-cluster tests.

@sakridge sakridge (Member) commented:

> @sakridge Finally, I settled on the algorithm for the count reconstruction when loading snapshots and got CI green. It seems this impl is now working fine locally. I've also replied to your comment. Does this PR look right? There are still rough edges.
>
> TODOs (once the soundness of this PR is confirmed)
>
>   • Update PR description.
>   • Stop serializing the count.
>   • Clean up unit tests/impl.
>   • Add more unit tests and local-cluster tests.

@ryoqun Is there a difference between this logic and just fixing up the count_and_status beforehand and then doing the clean as in #7010?

If that solution is equivalent, it seems like a lot less code and logic than this, and you also have some very expensive loops over all account vecs and the accounts_index pubkey map. I think some of those could be reduced, but if we can share the purge logic rather than having the separate generate_index logic do it, then I think that would be preferable.

@ryoqun ryoqun (Member, Author) commented Nov 24, 2019

> @ryoqun Is there a difference between this logic and just fixing up the count_and_status beforehand and then doing the clean as in #7010?

@sakridge There is no difference between this logic and the final form of PR #7010. Your solution is better than mine. :)

I misunderstood that we must precisely manage the count in the AccountStorageEntry in-band/online to avoid resource leaks across validator restarts, but it seems this is not a hard requirement after all. It will be enough for us to have an out-of-band/offline vacuum tool (if needed in the future) for cleaning up unused AppendVecs that are never referenced from the current account index.

So, I'm closing this. I'm really confident that your fix (#7010) does fix the problem in #6890.

Thanks for working on this and actually writing the better solution!

@sakridge sakridge (Member) commented:

> > @ryoqun Is there a difference between this logic and just fixing up the count_and_status beforehand and then doing the clean as in #7010?
>
> @sakridge There is no difference between this logic and the final form of PR #7010. Your solution is better than mine. :)
>
> I misunderstood that we must precisely manage the count in the AccountStorageEntry in-band/online to avoid resource leaks across validator restarts, but it seems this is not a hard requirement after all. It will be enough for us to have an out-of-band/offline vacuum tool (if needed in the future) for cleaning up unused AppendVecs that are never referenced from the current account index.
>
> So, I'm closing this. I'm really confident that your fix (#7010) does fix the problem in #6890.
>
> Thanks for working on this and actually writing the better solution!

Ok. Thanks so much for your help root-causing the issue. That really helped.

}

#[test]
fn test_accounts_db_serialize_zero_and_free() {

@ryoqun ryoqun force-pushed the no-bad-account-in-snapshot branch from f51930a to 8b54fba Compare November 26, 2019 17:20
let mut current_slot = 1;
accounts.store(current_slot, &[(&pubkey, &account)]);
//accounts.store(current_slot, &[(&pubkey2, &account)]); // <= Enabling this line makes even my PR fail...
error!("#1: {:#?}", accounts.get_storage_entries()); // #1
ryoqun (Member, Author):

Steps showing how things evolve:

#1: Add a test account with some lamports at slot 1: the account data is stored into the index, and a new storage entry is created with slot: 0, count: 1, status: Available.

#2: Make the test account unfunded at slot 2: the account data is stored into the index, and another new storage entry is created with slot: 1, count: 1, status: Available.

#3: Fill the StorageEntry created at #2 with outdated account data: to make that happen, repeatedly store other unrelated accounts, resulting in the StorageEntry with slot: 1, count: 1, status: Full, plus a new StorageEntry containing the spilled ones.

#4: The assertion that the zero-lamport account exists succeeds: at this moment, the index holds the account data (lamports=0) inserted at #2.

#5: Purge the zero-lamport account (simulating the call of this function from periodic snapshot creation).

#6: On master: because #5's purge decrements the #2 StorageEntry's count to 0, it gets flushed, resulting in count: 0, status: Available.
    In my PR: dead_slots are purged in purge_zero_lamport_accounts, so #1's StorageEntry is also purged.

#7: On master: the StorageEntry from #1 is still alive.
    In my PR: all slots containing the test account are now refreshed as blank.
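
For concreteness, here is the same trace as a tiny self-contained toy (stand-in types and hand-picked counts, not the real AccountsDB test), showing why master leaves the #1 entry alive:

```rust
#[derive(Debug)]
enum Status {
    Available,
    Full,
}

#[derive(Debug)]
struct StorageEntry {
    slot: u64,
    count: usize,
    status: Status,
}

fn main() {
    // #1: funded test account stored -> first storage entry
    let mut entries = vec![StorageEntry { slot: 0, count: 1, status: Status::Available }];

    // #2: zero-lamport update lands in a new storage entry for the next slot
    entries.push(StorageEntry { slot: 1, count: 1, status: Status::Available });

    // #3: unrelated stores fill that append vec; spilled accounts go elsewhere
    entries[1].status = Status::Full;
    entries.push(StorageEntry { slot: 1, count: 3, status: Status::Available });

    // #4: (the index lookup of the zero-lamport version is not modeled here)

    // #5: purging the zero-lamport account decrements the count of the
    //     storage entry created at #2
    entries[1].count -= 1;

    // #6 on master: that entry is recycled once its count hits 0 ...
    assert_eq!(entries[1].count, 0);
    entries[1].status = Status::Available;

    // #7 on master: ... but the entry from #1 still holds the old funded
    //    version of the account, so it can resurface from a snapshot.
    assert_eq!(entries[0].count, 1);

    // In this PR, purge_zero_lamport_accounts also purges dead slots, so the
    // #1 entry would be removed as well (not modeled further in this sketch).
    println!("{:#?}", entries);
}
```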

@ryoqun ryoqun (Member, Author) commented Nov 26, 2019:

@sakridge I found that this costly recalculation of the count doesn't really help much... I found this non-passing test scenario even with the impl.

ryoqun (Member, Author):

So, a yet more correct solution would be to stop doing store.remove_account() for zero-lamport accounts, without any of these complex things.

@ryoqun ryoqun (Member, Author) commented Dec 3, 2019

(I will close this PR once we're really sure we've fixed the snapshot stability issue, after some idling on the TdS cluster.)

@ryoqun ryoqun closed this Dec 3, 2019
@ryoqun ryoqun (Member, Author) commented Dec 3, 2019

> (I will close this PR once we're really sure we've fixed the snapshot stability issue, after some idling on the TdS cluster.)

Err, well I clicked the close button. Ok, close 'em all!
