
Index is slow #371

Closed
4 of 8 tasks
casey opened this issue Aug 24, 2022 · 6 comments

casey (Collaborator) commented Aug 24, 2022

What we know

The ordinals.com index is syncing very slowly. At the current rate, around 3 seconds per block, it will take about 9 days to sync to chain tip. Previous syncs were more like 36 hours, so this is a severe performance regression.

A possible culprit is the new OUTPOINT_TO_TXID table. This table maps outpoints to the txid, if any, that spends them. It's required to improve the error messages when we don't have an outpoint in the OUTPOINT_TO_ORDINAL_RANGES table.
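For reference, the new table is conceptually just a key-value map from a serialized outpoint to the txid that spends it. Here's a minimal sketch of such a table with redb, using plain byte slices rather than ord's actual key/value types, and a current redb API that may differ from the version in use here:

```rust
use redb::{Database, ReadableTable, TableDefinition};

// Hypothetical stand-in for the real table: key is a serialized outpoint
// (txid || vout, 36 bytes), value is the spending transaction's txid (32 bytes).
const OUTPOINT_TO_TXID: TableDefinition<&[u8], &[u8]> =
    TableDefinition::new("OUTPOINT_TO_TXID");

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let db = Database::create("example.redb")?;

    // Record that some outpoint was spent by some txid.
    let wtx = db.begin_write()?;
    {
        let mut table = wtx.open_table(OUTPOINT_TO_TXID)?;
        let outpoint = [0u8; 36]; // placeholder serialized outpoint
        let spending_txid = [1u8; 32]; // placeholder txid
        table.insert(outpoint.as_slice(), spending_txid.as_slice())?;
    }
    wtx.commit()?;

    // Look the outpoint back up, e.g. to improve an error message.
    let rtx = db.begin_read()?;
    let table = rtx.open_table(OUTPOINT_TO_TXID)?;
    let key = [0u8; 36];
    if let Some(value) = table.get(key.as_slice())? {
        println!("spent by {:?}", value.value());
    }
    Ok(())
}
```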

The database used to top out at around 50 GB, but is currently 89 GB and growing.

Things to try

  • Revert the change and redeploy, but save the old database for inspection. Gotta have a working block explorer! (@casey)
  • Using the logs, graph the time it took each block to sync. What does this graph look like? (@terror)
  • Make a scatter plot of ordinal ranges written vs. time per block. Is there a trend? (@terror)
  • Make a scatter plot of transactions per block vs. time per block. Is there a trend? (@terror)
  • If we estimate how much space each table should take up, given how much data it contains, and add the estimates together, is the total about the size of the database? If not, maybe we have a space leak, either in our use of redb or in redb itself. (See the back-of-envelope sketch after this list.)
  • Figure out how much RAM the additional table will take up. Is it under 128 GB? If so, deploy to a new server with 128 GB of RAM.
  • Can we trigger this locally? What about on a memory-constrained VM or server?
  • Ping Chris once we know a little more; he might have ideas. I think lmdb didn't have this performance cliff, so if there are any differences, that might help.
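For the size-estimation item above, a back-of-envelope sketch might look like the following; the entry count and overhead factor are placeholders to be replaced with real figures from the index:

```rust
/// Rough size estimate for a key-value table: entries * (key + value bytes),
/// inflated by an assumed B-tree/page overhead factor.
fn table_size_estimate(entries: u64, key_bytes: u64, value_bytes: u64, overhead: f64) -> f64 {
    (entries * (key_bytes + value_bytes)) as f64 * overhead
}

fn main() {
    // Placeholder count -- substitute the real number of spent outpoints.
    let spent_outpoints: u64 = 1_000_000_000;
    let overhead = 1.5; // assumed B-tree + page overhead, also a guess

    // OUTPOINT_TO_TXID: 36-byte outpoint key -> 32-byte txid value.
    let outpoint_to_txid = table_size_estimate(spent_outpoints, 36, 32, overhead);

    println!(
        "OUTPOINT_TO_TXID ≈ {:.1} GB",
        outpoint_to_txid / 1_000_000_000.0
    );
}
```

Summing estimates like this across all tables and comparing against the actual file size would show whether the growth is explained by the data or points to a space leak.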
casey changed the title from "Index is slow with new table" to "Index is slow" on Aug 24, 2022
casey (Collaborator, Author) commented Sep 3, 2022

Blocked on cberner/redb#344.

casey (Collaborator, Author) commented Sep 7, 2022

@virtuallythere, we should benchmark this more, but looking at the flamegraph you posted, it seems like there's already a juicy target for optimization: cberner/redb#344 (comment)

We're spending around 40% of the time requesting blocks from Bitcoin Core and in the thread and context-switch overhead from parallelizing transaction ID calculation. That could be merged into one background thread that fetches new blocks and calculates TXIDs for their transactions in advance. (Alternatively, if it's faster, we could just not calculate TXIDs ahead of time.)
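A minimal sketch of that prefetching idea, with a stubbed-out `fetch_block` standing in for the actual Bitcoin Core RPC call and txid hashing (the names and types are illustrative, not ord's):

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

// Placeholder for a fully deserialized block with txids already computed.
struct PreparedBlock {
    height: u64,
    txids: Vec<[u8; 32]>,
}

// Stand-in for the RPC call to Bitcoin Core; returns None past the chain tip.
fn fetch_block(height: u64) -> Option<PreparedBlock> {
    if height > 1_000 {
        return None;
    }
    Some(PreparedBlock {
        height,
        txids: Vec::new(), // txids would be computed here, off the indexing thread
    })
}

// Spawn a background thread that keeps a small buffer of blocks ready,
// so the indexing thread never waits on the network or on txid hashing.
fn spawn_fetcher(start_height: u64, buffer: usize) -> Receiver<PreparedBlock> {
    let (tx, rx) = sync_channel(buffer);
    thread::spawn(move || {
        let mut height = start_height;
        while let Some(block) = fetch_block(height) {
            if tx.send(block).is_err() {
                break; // indexer hung up
            }
            height += 1;
        }
    });
    rx
}

fn main() {
    let blocks = spawn_fetcher(0, 32);
    for block in blocks {
        // Index the block here; fetching and txid hashing happen in the background.
        let _ = (block.height, block.txids.len());
    }
}
```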

veryordinally (Collaborator) commented

@casey Hah, I came to the same conclusion this morning. Here are my thoughts.

What we know (from the flamegraphs)

  1. We spend about 15% of the time on what I categorized as "Rust parallelization" or "thread overhead" (the right-hand part of the flamegraph)
  2. Of the rest, 25% is getting/hashing blocks from Bitcoin Core and 75% is "database"
  3. Of the database time, roughly 1/3 each goes to insert, delete, and commit

What I think we should do based on that

  1. The quickest win for reducing elapsed time is likely to move requesting blocks to a separate thread, using a queue to always have enough blocks ready for downstream processing. My gut guess is that this should cut out about 25% of elapsed time (it may be a little more, but it's hard for me to predict how the 15% of "thread overhead" behaves).
  2. The biggest potential for substantial time savings is the "batched update" strategy that I suggested last night, but didn't express very well. The idea is to exploit the assumed "temporal locality" of ordinal movements by using an in-memory index (likely just a btree for the index, and a list for the outputs to be deleted) to batch updates over some chunk of n blocks (with useful values of n probably ranging between a few thousand and tens of thousands, depending on available memory and on what kind of locality patterns we find), and then only apply the aggregated, batched updates to redb every n blocks. Why do I think that would be faster? On the plus side, we only have to do high-frequency updates to a much smaller, in-memory data structure. B-tree insert/delete should be O(log n), so an update to a tree 10% of the size might be twice as fast comparing in-memory to in-memory, and the difference should be bigger if it's in-memory vs. mmapped to disk. On the other hand, if we can't do any meaningful batching of updates within the n blocks we consider, we're just duplicating work and might even be slower. So, before we dive into this, we should collect some simple statistics about locality properties. (A rough sketch of the cache idea is below.)
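A rough sketch of that cache idea, assuming most outpoints are created and spent within the same batch so their insert and delete cancel out in memory before ever reaching redb (types, names, and the flush hooks are illustrative):

```rust
use std::collections::HashMap;

type OutPoint = [u8; 36];
type OrdinalRanges = Vec<(u64, u64)>;

// In-memory overlay over the on-disk outpoint-to-ordinal-ranges table.
#[derive(Default)]
struct RangeCache {
    // Outpoints created since the last flush, with their ordinal ranges.
    inserted: HashMap<OutPoint, OrdinalRanges>,
    // Outpoints spent since the last flush that were *not* created in this batch,
    // and therefore still need a delete against the on-disk table.
    deleted: Vec<OutPoint>,
}

impl RangeCache {
    fn insert(&mut self, outpoint: OutPoint, ranges: OrdinalRanges) {
        self.inserted.insert(outpoint, ranges);
    }

    // Spending an outpoint that was created inside the current batch never
    // touches the database at all -- the insert and the delete cancel out.
    fn remove(&mut self, outpoint: &OutPoint) -> Option<OrdinalRanges> {
        match self.inserted.remove(outpoint) {
            Some(ranges) => Some(ranges),
            None => {
                self.deleted.push(*outpoint);
                None // caller falls back to reading the on-disk table
            }
        }
    }

    // At commit time, only the surviving inserts and the cross-batch deletes
    // are applied; `write` and `delete` stand in for the redb table calls.
    fn flush(
        &mut self,
        mut write: impl FnMut(&OutPoint, &OrdinalRanges),
        mut delete: impl FnMut(&OutPoint),
    ) {
        for (outpoint, ranges) in self.inserted.drain() {
            write(&outpoint, &ranges);
        }
        for outpoint in self.deleted.drain(..) {
            delete(&outpoint);
        }
    }
}

fn main() {
    let mut cache = RangeCache::default();
    cache.insert([0u8; 36], vec![(0, 5_000_000_000)]);

    // Spent within the same batch: no database write will ever happen for it.
    assert!(cache.remove(&[0u8; 36]).is_some());

    // Spent but created in an earlier batch: recorded for deletion at flush time.
    cache.remove(&[1u8; 36]);

    cache.flush(
        |outpoint, ranges| println!("write {:?} -> {:?}", &outpoint[..4], ranges),
        |outpoint| println!("delete {:?}", &outpoint[..4]),
    );
}
```

Whether this wins in practice depends on the locality statistics mentioned above: the more outpoints that are created and spent within the same batch, the more redb writes cancel out in memory.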

veryordinally (Collaborator) commented

Here are the key results from using https://crates.io/crates/cargo-instruments to profile indexing mainnet from block 450006 to block 451008 (why the unround numbers, you may ask: there seems to be an off-by-some bug in --height-limit, see #517):

[Screenshot: cargo-instruments time profile, 2022-09-11 15:13]

The command used was:

```
cargo instruments --template time -- --cookie-file /opt/node/bitcoin/data/.cookie --data-dir profiling-data --height-limit 451000 index
```

This started with an index that was already at block 450006, so around 1000 blocks were indexed. I expect this to give us a pretty representative view of where the time goes when indexing mainnet with full blocks.

This is using the most recent master as of Sep 11 (commit 5987958).

veryordinally (Collaborator) commented

Working on #500 to help prioritize further optimization efforts.

veryordinally added a commit to veryordinally/ord that referenced this issue Oct 25, 2022
Working on issue ordinals#371
Most UTXOs are short-lived, so to speed up indexing we introduce an in-memory HashMap that collects all outpoint-to-ordinal-range mappings since the last DB commit and serves as a cache. On mainnet this avoids around 80% of the writes to the redb BTree and results in a 2x speedup for indexing, at the expense of slightly higher RAM usage.
Also introduces more detailed logging and timekeeping.
casey changed the title from "Index is slow" to "Speed up index" on Oct 26, 2022
casey changed the title from "Speed up index" back to "Index is slow" on Oct 26, 2022
casey (Collaborator, Author) commented Oct 26, 2022

The index is still slow, but this particular issue isn't appearing anymore, so closing in favor of a new issue, #713.

casey closed this as completed on Oct 26, 2022