Create benchmark for tracking indexing performance #94

Closed
casey opened this issue Feb 1, 2022 · 5 comments · Fixed by #111

Comments

casey (Collaborator) commented Feb 1, 2022

This issue is for tracking performance related to initial indexing of the blockchain.

@cberner What would be a useful benchmark for tracking index speed? I improved the integration test block and transaction generation code, so I could pretty easily create any number of blocks and transactions, with arbitrary relationships, spread across any number of blockfiles.

The only thing relevant to indexing that's changed since we last worked on the code together is that I now store the full blocks in the redb database, using a new hash-to-block table. I think there are no real guarantees about the contents of bitcoind's blocks directory, so I want to move away from using it as much as possible, except perhaps for initial indexing.

@cberner
Copy link
Contributor

cberner commented Feb 2, 2022

I think the easiest would be to track bytes of blockfiles ingested per second. That seems like a good metric, unless there are going to be blocks that are small but really expensive to process.
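
A minimal sketch of that metric, for reference; `benchmark_indexing` and `index_blockfile` are hypothetical stand-ins for the real benchmark harness and indexing entry point:

```rust
use std::time::Instant;

// Hypothetical throughput benchmark: feed a pile of raw blockfile bytes
// through the indexer and report bytes ingested per second.
fn benchmark_indexing(blockfiles: &[Vec<u8>]) {
    let total_bytes: usize = blockfiles.iter().map(|b| b.len()).sum();

    let start = Instant::now();
    for blockfile in blockfiles {
        index_blockfile(blockfile); // placeholder for the real indexing code
    }
    let elapsed = start.elapsed().as_secs_f64();

    println!(
        "indexed {} bytes in {:.2}s ({:.2} MiB/s)",
        total_bytes,
        elapsed,
        total_bytes as f64 / (1024.0 * 1024.0) / elapsed
    );
}

fn index_blockfile(_bytes: &[u8]) {
    // stand-in for whatever parses and indexes a blockfile
}
```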

casey (Collaborator, Author) commented Feb 2, 2022

> I think the easiest would be to track bytes of blockfiles ingested per second.

Sounds good.

> That seems like a good metric, unless there are going to be blocks that are small but really expensive to process.

Transactions with many inputs and outputs will probably be more expensive to process on a per-byte basis, but maybe not enough to be worth thinking hard about.

I'll probably just generate a few gigs of "average" blocks, whatever that means.

By the way, I'm using APFS, which is a copy-on-write filesystem. Will that have any weird performance consequences for redb?

cberner (Contributor) commented Feb 2, 2022

Hmm, do you know what granularity it's copy-on-write at? I'd guess no, given that things like postgres or sqlite work on APFS. If the granularity is really coarse, then you'd want to maximize the size of your write transactions.
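
As an illustration of what "maximize the size of your write transactions" could look like, here is a rough sketch of batching many block insertions into a single redb write transaction. It assumes the current redb API shape (`Database::begin_write`, `open_table`, `insert`, `commit`) and a hypothetical `hash_to_block` table; the crate was pre-1.0 when this thread was written, so the exact names may differ:

```rust
use redb::{Database, Error, TableDefinition};

// Hypothetical hash-to-block table, keyed by block hash bytes.
const HASH_TO_BLOCK: TableDefinition<&[u8], &[u8]> = TableDefinition::new("hash_to_block");

// Insert a whole batch of blocks under one write transaction. A single
// commit touches each copy-on-write page (or filesystem extent) once,
// instead of once per block.
fn write_blocks(db: &Database, blocks: &[(Vec<u8>, Vec<u8>)]) -> Result<(), Error> {
    let txn = db.begin_write()?;
    {
        let mut table = txn.open_table(HASH_TO_BLOCK)?;
        for (hash, raw_block) in blocks {
            table.insert(hash.as_slice(), raw_block.as_slice())?;
        }
    } // table is dropped here so the transaction can be committed
    txn.commit()?;
    Ok(())
}
```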

casey (Collaborator, Author) commented Feb 2, 2022

Would it be the same as the block size? If so, my APFS filesystem, which was initialized with default values, uses 4096 bytes/block.

cberner (Contributor) commented Feb 2, 2022

It sounds like yes? If so, then that should be totally fine. 4096 is the same as the page size, and redb is copy-on-write at the page level (I removed those complicated delta writes).
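
For reference, the filesystem's preferred block size can be checked from Rust on macOS or Linux via `MetadataExt::blksize`; the database path below is a hypothetical placeholder:

```rust
use std::fs;
use std::os::unix::fs::MetadataExt;

fn main() -> std::io::Result<()> {
    // st_blksize: the filesystem's preferred I/O block size for this file.
    // On a default-formatted APFS volume this reports 4096.
    let meta = fs::metadata("index.redb")?; // hypothetical database path
    println!("preferred block size: {} bytes", meta.blksize());
    Ok(())
}
```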

casey linked a pull request Feb 13, 2022 that will close this issue