Speedup bigtable block upload by factor of 8-10x #24534
Conversation
Generally looks pretty good; thanks for digging into this! A couple small questions, and I'd like to play around with it a bit...
this line is sus: https://github.com/solana-labs/solana/blob/master/ledger/src/bigtable_upload.rs#L173 (using a blocking receiver and passing it to an async context?) cc @mvines
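For context, the concern is that calling a blocking recv() on a tokio worker thread can park the executor. Below is a minimal sketch (not the PR's actual fix) of one way to keep the blocking call off the async workers, assuming a std mpsc receiver; the actual channel type in bigtable_upload.rs may differ:

use std::sync::mpsc;

// Receive one item without tying up a tokio worker thread: the blocking
// recv() runs on tokio's dedicated blocking pool instead.
async fn recv_async<T: Send + 'static>(rx: mpsc::Receiver<T>) -> (Option<T>, mpsc::Receiver<T>) {
    tokio::task::spawn_blocking(move || {
        let item = rx.recv().ok(); // blocks here, but not on an async worker
        (item, rx)                 // hand the receiver back for the next call
    })
    .await
    .expect("blocking recv task panicked")
}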
Codecov Report
@@            Coverage Diff            @@
##           master   #24534     +/-   ##
=========================================
- Coverage    82.1%    82.0%     -0.1%
=========================================
  Files         612      610        -2
  Lines      168534   168373      -161
=========================================
- Hits       138396   138152      -244
- Misses      30138    30221       +83
What kind of upload speeds are you seeing with this changeset?
Why did you change your mind about the blockstore threads?
i think i was seeing ~200mbit/s, but can't remember specifics. i can test it again by disabling bigtable uploads on our node, running for ~1-2 hours, stopping it, then using the ledger-tool upload command. lmk if that's something you're interested in! we're running on a 25gbps server so we have plenty of bandwidth left over. if we could have a threadpool that's constantly popping items off a queue and running, i think we could get that number way up (as opposed to running NUM_PARALLEL uploads and waiting for them all to complete before popping more off).
@t-nelson mentioned something about number of CPUs. I feel like that's probably better, but tbh don't have any super strong feelings about it. more is probably better tho, esp. if you're trying to catch up! during normal operation i'm seeing it keep up just fine, uploading 2-3 blocks at a time at the tip
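A hedged sketch of the threadpool-vs-batches difference described above, not the PR's actual code: upload_block is a made-up stand-in for the real bigtable write, and the streaming variant uses buffer_unordered to keep a fixed number of uploads in flight at all times instead of waiting on whole batches:

use futures::stream::{self, StreamExt};

// Stand-in for the real per-block bigtable upload.
async fn upload_block(slot: u64) -> Result<usize, String> {
    Ok(slot as usize)
}

// Batched: start NUM_PARALLEL uploads, wait for the slowest, then repeat.
async fn upload_batched(slots: Vec<u64>, num_parallel: usize) {
    for chunk in slots.chunks(num_parallel) {
        let batch: Vec<_> = chunk.iter().map(|&slot| upload_block(slot)).collect();
        futures::future::join_all(batch).await;
    }
}

// Streaming: keep num_parallel uploads in flight continuously, starting a
// new one as soon as any finishes.
async fn upload_streaming(slots: Vec<u64>, num_parallel: usize) {
    stream::iter(slots)
        .map(upload_block)
        .buffer_unordered(num_parallel)
        .collect::<Vec<_>>()
        .await;
}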
just wrote a script to check our bigtable highest slot against yours:
[slot-lag comparison output omitted; one run per build: git commit 00c5ec9 (num_cpus: 48), git commit 62a40f1, git commit 0d797e2 (master)]
in summary, seems like uploading blocks is network + cpu bound. haven't profiled, but guessing that compressing each item that gets uploaded to bigtable 3 different ways is expensive 😆
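To illustrate the cost being guessed at, here is a simplified, hypothetical version of the "compress several ways, keep the smallest" pattern. Only gzip and no-compression are shown (the real storage code may try other codecs), and compress_best is an illustrative name, not necessarily the crate's API:

use flate2::{write::GzEncoder, Compression};
use std::io::Write;

fn gzip(data: &[u8]) -> Vec<u8> {
    let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
    encoder.write_all(data).expect("gzip write failed");
    encoder.finish().expect("gzip finish failed")
}

// Every candidate is fully computed even though only one is kept, which is
// where the extra CPU time per uploaded item goes.
fn compress_best(data: &[u8]) -> Vec<u8> {
    vec![data.to_vec(), gzip(data)]
        .into_iter()
        .min_by_key(|candidate| candidate.len())
        .unwrap()
}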
force-pushed from 72df590 to 92ce23b
@CriesofCarrots are you running this on your warehouse nodes now? looks like they're caught up :)
No, I'm not sure why we're caught up suddenly 🤷♀️ (although Joe might know)
all good! feel free to ping me here or discord if you need anything! just rebased!
Hey @buffalu , sorry for the delay here. I'm hoping to wrap up my changes in this area today/tomorrow. I played around with some of your changes on top, and it seems like the new spawns do make a huge difference. But reading the blocks from Blockstore was never a limiting factor on our warehouse nodes, so I think I'd like to first try a smaller changeset here without the multiple Blockstore threads. What do you think?
from what i remember i do think i was seeing blockstore reads take ~100ms per block, so it might be worth checking to make sure it's not going to be a limiting factor if someone wants to crank it up more (i.e. using ledger-tool / a custom number of cpus in the config).
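A rough sketch of the multiple-read-threads idea under discussion, with made-up names and a stubbed block fetch standing in for the real Blockstore call; it only illustrates the shape of the change, not the PR's code:

use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Several reader threads pull slots from a shared work queue, fetch the
// block (stubbed out here), and push it onto a channel the uploader drains.
fn spawn_readers(slots: Vec<u64>, num_threads: usize) -> mpsc::Receiver<(u64, Vec<u8>)> {
    let (block_sender, block_receiver) = mpsc::channel();
    let work_queue = Arc::new(Mutex::new(slots));

    for _ in 0..num_threads {
        let block_sender = block_sender.clone();
        let work_queue = Arc::clone(&work_queue);
        thread::spawn(move || loop {
            let maybe_slot = work_queue.lock().unwrap().pop();
            let slot = match maybe_slot {
                Some(slot) => slot,
                None => break, // queue drained; this reader is done
            };
            // Stand-in for the real blockstore read, which the comment above
            // pegs at roughly 100ms per block.
            let block_bytes = vec![0u8; 1024];
            if block_sender.send((slot, block_bytes)).is_err() {
                break; // uploader hung up
            }
        });
    }
    block_receiver
}

The uploader side can then drain block_receiver while the reads proceed in parallel, so it is no longer idle waiting on a single reader.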
@buffalu , I'm finally done mucking around in the bigtable_upload files. Are you still willing to rebase this?
yeah! gimme a few hours, will get to later after meetings 😢 |
@CriesofCarrots can you elaborate on what you mean by "It's never the bottleneck on our nodes, but just talked to a partner blocked by it."? are they running with the new tokio::spawn but a single blockstore thread, or is bigtable upload just slow in general?
They are running with vanilla v1.9 (don't recall which patch), and it was clear from the timings that the upload thread was sitting idle waiting for blockstore reads
force-pushed from 92ce23b to eb95861
ok should be good. caught a potential unwrap error + fixed that. lmk if anything else sticks out
I started reviewing this, but it looks like we lost your changes to use num_cpus as the basis for num_blocks_to_upload_in_parallel. Can you restore that? 🙏
yeah, assumed we wanted to use your config already. do you think it makes sense for num blockstore threads = num sending threads = num_blocks_to_upload_in_parallel?
Oh gotcha. Sorry if that was confusing. I just set up relative values.
Yes, that makes sense to me
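A small, hedged sketch of what tying all three knobs to one machine-derived value could look like; the names are illustrative and not the config fields actually used in the PR:

// num_cpus is a real crate; everything else here is illustrative.
fn upload_parallelism() -> (usize, usize, usize) {
    let cpus = num_cpus::get().max(1);
    // One value drives all three: blockstore reader threads, sending tasks,
    // and the number of blocks uploaded in parallel.
    let num_blocks_to_upload_in_parallel = cpus;
    let blockstore_read_threads = num_blocks_to_upload_in_parallel;
    let sending_tasks = num_blocks_to_upload_in_parallel;
    (blockstore_read_threads, sending_tasks, num_blocks_to_upload_in_parallel)
}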
f1bbebe
to
a29f94e
Compare
this prob doesn't make sense for this PR, but we also noticed that using separate LedgerStorage instances can massively speed things up too. we can get ~500-1k blocks per second reading with this change across 64 threads and 500-slot request sizes. cloning causes them to use the same channel and TCP connection; if you create a new instance for each thread, they'd each have their own connection + channel
Yeah, let's look at that separately
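For illustration only, a hedged sketch of the clone-one-instance vs. new-instance-per-task distinction mentioned above. LedgerStorage here is a local placeholder with made-up methods, not the real solana_storage_bigtable type or its constructor:

use std::sync::Arc;

// Placeholder standing in for the real LedgerStorage; in the real crate,
// constructing a fresh instance opens its own channel/TCP connection.
#[derive(Clone)]
struct LedgerStorage;

impl LedgerStorage {
    async fn connect() -> Self {
        LedgerStorage
    }
    async fn get_blocks(&self, _start_slot: u64, _limit: usize) -> Vec<u64> {
        Vec::new()
    }
}

// Sharing one clone: every task funnels requests through one connection.
async fn fetch_shared(storage: Arc<LedgerStorage>, ranges: Vec<(u64, usize)>) {
    let tasks: Vec<_> = ranges
        .into_iter()
        .map(|(start, limit)| {
            let storage = Arc::clone(&storage);
            tokio::spawn(async move { storage.get_blocks(start, limit).await })
        })
        .collect();
    futures::future::join_all(tasks).await;
}

// One instance per task: each task gets its own connection, which is the
// setup the ~500-1k blocks/s read rate above refers to.
async fn fetch_per_task(ranges: Vec<(u64, usize)>) {
    let tasks: Vec<_> = ranges
        .into_iter()
        .map(|(start, limit)| {
            tokio::spawn(async move {
                let storage = LedgerStorage::connect().await;
                storage.get_blocks(start, limit).await
            })
        })
        .collect();
    futures::future::join_all(tasks).await;
}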
storage-bigtable/src/lib.rs (outdated)
    let results = futures::future::join_all(tasks).await;
    let results: Vec<_> = results.into_iter().map(|r| r.unwrap()).collect();
Let's map the tokio Error to something, just to be safe. Can you add an Error variant?
Also, can we rewrite this whole block to only iterate once?
Something like:
let mut bytes_written = 0;
let mut maybe_first_err: Option<Error> = None;
for result in results {
    match result {
        Err(err) => {
            if maybe_first_err.is_none() {
                maybe_first_err = Some(Error::TokioError(err));
            }
        }
        Ok(Err(err)) => {
            if maybe_first_err.is_none() {
                maybe_first_err = Some(Error::BigTableError(err));
            }
        }
        Ok(Ok(bytes)) => {
            bytes_written += bytes;
        }
    }
}
if let Some(err) = maybe_first_err {
    return Err(err);
}
One additional request: when you pin
I have this running on two testnet warehouse nodes, and it's working great! I noticed a couple logging things (see comments).
Otherwise, just the Cargo.lock reverts, and I think that's it from me!
Added multiple blockstore read threads. Run the bigtable upload in tokio::spawn context. Run bigtable tx and tx-by-addr uploads in tokio::spawn context.
force-pushed from 5e2042f to 6a299d2
lgtm. just some nits
Thanks for all the iterations on this @buffalu !
Merging on red; downstream-anchor-projects has been removed on master (saving CI the load of rebasing this)
…25278) * Speedup bigtable block upload by factor of 8-10x (#24534)

Added multiple blockstore read threads. Run the bigtable upload in tokio::spawn context. Run bigtable tx and tx-by-addr uploads in tokio::spawn context.

(cherry picked from commit 6bcadc7)

# Conflicts:
#	Cargo.lock
#	programs/bpf/Cargo.lock
#	storage-bigtable/Cargo.toml

* Fix conflicts

Co-authored-by: buffalu <85544055+buffalu@users.noreply.github.com>
Co-authored-by: Tyera Eulberg <tyera@solana.com>
Problem
We were seeing our bigtable block upload running behind the tip by several thousand blocks. We noticed it was uploading only about one block every ~500ms.
Summary of Changes
Added multiple blockstore read threads.
Run the bigtable upload in tokio::spawn context.
Run bigtable tx and tx-by-addr uploads in tokio::spawn context.
We're seeing around 50-60ms per block upload now, so we should be able to easily catch up and maintain bigtable state at the tip.
I think there's more to squeeze out, but I'm curious what you guys think about this before I go down the next optimization rabbit hole.
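For readers skimming the summary, a hedged sketch of the overall shape of the change: every name below (put_block, put_tx_rows, put_tx_by_addr_rows, ConfirmedBlock) is a stand-in, not the actual code in bigtable_upload.rs; it only shows the tokio::spawn structure described above:

use futures::future::join_all;

// Stand-ins for the real block payload and bigtable put calls.
struct ConfirmedBlock;

async fn put_block(_slot: u64, _block: &ConfirmedBlock) -> usize {
    0
}
async fn put_tx_rows(_slot: u64) -> usize {
    0
}
async fn put_tx_by_addr_rows(_slot: u64) -> usize {
    0
}

// Upload one block: the tx and tx-by-addr writes run as their own spawned
// tasks instead of sequentially after the block write.
async fn upload_one(slot: u64, block: ConfirmedBlock) -> usize {
    let tx_task = tokio::spawn(put_tx_rows(slot));
    let tx_by_addr_task = tokio::spawn(put_tx_by_addr_rows(slot));
    let block_bytes = put_block(slot, &block).await;
    block_bytes + tx_task.await.unwrap() + tx_by_addr_task.await.unwrap()
}

// Upload a batch: each block gets its own tokio::spawn, so uploads overlap
// instead of completing roughly one every 500ms.
async fn upload_batch(blocks: Vec<(u64, ConfirmedBlock)>) -> usize {
    let tasks: Vec<_> = blocks
        .into_iter()
        .map(|(slot, block)| tokio::spawn(upload_one(slot, block)))
        .collect();
    join_all(tasks).await.into_iter().map(|r| r.unwrap()).sum()
}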