Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(provider): add block_range_with_senders #5647

Conversation

yash-atreya
Copy link
Contributor

@yash-atreya yash-atreya commented Dec 1, 2023

Mentioned in #5497 here

#5620 and #5630 address performance improvements in block-level tracing by retrieving senders of the txs from cache instead of recovering the signer from signed_transaction.
This PR attempts to make a similar enhancement, however, by using the db instead of the cache.

a2a6a05 adds a new method to BlockReader for reading a range of blocks (like block_range) along with senders. This method is named block_range_with_senders and returns ProviderResult<Vec<BlockWithSenders>>.

5f69696 replace block_range with block_range_with_senders in trace_filter

crates/rpc/rpc/src/trace.rs Outdated Show resolved Hide resolved
@@ -1213,6 +1213,69 @@ impl<TX: DbTx> BlockReader for DatabaseProvider<TX> {
}
Ok(blocks)
}

fn block_range_with_senders(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ptal @joshieDo

@yash-atreya yash-atreya requested a review from mattsse December 4, 2023 02:04
@joshieDo
Copy link
Collaborator

joshieDo commented Dec 4, 2023

i like the direction, especially because we were recovering for every tx individually. However... Senders table can actually be pruned, and this would mess up this call.

There's Block::with_senders which recovers all senders for a block in parallel if worth it. There should be some conditional here, if we werent able to read all from the DB, then calculate it using this.

A benchmark of Block::with_senders vs reading from the table would be interesting, i'm not sure it's always worth it to read from db instead of calculating.

@joshieDo
Copy link
Collaborator

joshieDo commented Dec 4, 2023

fn block_with_senders actually calls with_senders and doesn't read from the DB. We probably want to harmonize here, but I don't know which one is faster (maybe doing some simple heuristic here). but there should always be the fallback of calculating it in case its pruned

@yash-atreya
Copy link
Contributor Author

yash-atreya commented Dec 9, 2023

@joshieDo @mattsse
Just summarizing how to move forward from here:

  • Check if the Senders table is pruned.

  • Fetch from db using Block::with_senders if not pruned or recover_signers if pruned or if some senders are not found for the given range

  • Benchmark recovering vs reading from db

@github-actions github-actions bot added the S-stale This issue/PR is stale and will close with no further activity label Dec 31, 2023
@github-actions github-actions bot closed this Jan 7, 2024
@mattsse mattsse reopened this Jan 7, 2024
@github-actions github-actions bot removed the S-stale This issue/PR is stale and will close with no further activity label Jan 8, 2024
@gakonst
Copy link
Member

gakonst commented Jan 12, 2024

@joshieDo I think this is right?

@yash-atreya do you have a rough timeline for moving forward with this PR?

@yash-atreya
Copy link
Contributor Author

@yash-atreya do you have a rough timeline for moving forward with this PR?

Will try to get this over the line by the end of the coming week.

@yash-atreya
Copy link
Contributor Author

Hey @joshieDo @mattsse ptal.

I was able to test on my archive node.

Before i.e latest alpha.16:
Screenshot 2024-01-24 at 5 50 10 PM

After improvement:
Screenshot 2024-01-24 at 5 52 15 PM

However, this is the best case improvement without having to recover any signer as I'm an running an archive.

The margin of improvement will decrease with different pruning configuration of a node.

A benchmark of Block::with_senders vs reading from the table would be interesting, i'm not sure it's always worth it to read from db instead of calculating.

I'd like to test and benchmark this against different pruning configurations. But need some pointers about how to navigate that.

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can use the two existing functions here and zip the results?

block_range and get_take_block_transaction_range

senders_cursor.walk_range(tx_range)?.collect::<Result<Vec<_>, _>>()?;

// This approach is similar to what is done in
// `get_take_block_transaction_range`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use the existing function here?

Copy link
Contributor Author

@yash-atreya yash-atreya Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. block_range can be used here, but we still need the body_indices cursor to get the TxNumber's in order to get the tx_range to walk using the senders_cursor and also populate any missing senders after recovering them.
  2. get_take_block_transaction_range cannot be used as it will require changing the trait bounds of the BlockReader implementation to add DbTxMut.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, since this is a read-only method and TAKE would always be false in get_take_block_transaction_range. I don't think it makes sense to use it.

@mattsse
Copy link
Collaborator

mattsse commented Feb 1, 2024

ptal @joshieDo

I don't like how this effectively duplicates the other functions, this should be easier

@mattsse
Copy link
Collaborator

mattsse commented Feb 6, 2024

friendly bump @joshieDo

@emhane emhane added the C-perf A change motivated by improving speed, memory usage or disk footprint label Feb 17, 2024
@emhane emhane added A-rpc Related to the RPC implementation A-db Related to the database C-enhancement New feature or request labels Feb 17, 2024
@yash-atreya
Copy link
Contributor Author

@joshieDo friendly bump here.

@joshieDo
Copy link
Collaborator

joshieDo commented Feb 29, 2024

Sorry for the delay, actually missed the numerous notifications. Unfortunately with #5191 , some of this code has to be transitioned to the new paradigm, where Headersand Transactions live both on database and static files.

There are some examples on the usage (eg. ProviderFactory::receipts_by_tx_range, DatabaseProvider::transactions_by_tx_range_with_cursor), to get data from both storage options under one api.

Once again I'm sorry, I can take this PR forward with the new changes starting next week or two if you wish, given the delay and overall changes

@gakonst
Copy link
Member

gakonst commented Mar 1, 2024

Let's pause on this as not P0 and Yash's on Foundry stuff, and we can revisit if you have capacity to take over post beta @joshieDo ?

@github-actions github-actions bot added the S-stale This issue/PR is stale and will close with no further activity label Mar 23, 2024
@github-actions github-actions bot closed this Mar 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-db Related to the database A-rpc Related to the RPC implementation C-enhancement New feature or request C-perf A change motivated by improving speed, memory usage or disk footprint S-stale This issue/PR is stale and will close with no further activity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants