
Speed up find #149

Merged: 8 commits merged into master from fast-find on Feb 21, 2022
Conversation

casey (Collaborator) commented Feb 20, 2022

No description provided.

casey marked this pull request as ready for review February 21, 2022 01:18
casey enabled auto-merge (squash) February 21, 2022 01:23
casey merged commit a5c26bc into master February 21, 2022
casey deleted the fast-find branch February 21, 2022 01:26
casey (Collaborator, Author) commented Feb 21, 2022

@cberner This PR implements the ordinal lookup strategy that you came up with, with some small tweaks.

The keys are (ordinal, block, transaction), encoded big-endian, and the values are satpoints, i.e. txid, output index, satoshi offset.

For every newly moved ordinal range, an entry is inserted mapping (range-start, block, transaction) to the satpoint of the first ordinal in the range.
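
As a minimal sketch of that key/value layout (the struct names and exact field widths here are assumptions for illustration, not ord's actual types):

```rust
// Sketch of the key/value layout: keys sort by (ordinal, block, transaction),
// values locate the satoshi at the start of the moved range.

struct Key {
    ordinal: u64,     // first ordinal of the newly moved range
    block: u64,       // height of the block that moved it
    transaction: u64, // index of the transaction within that block
}

impl Key {
    // Big-endian encoding so that lexicographic byte order matches numeric order.
    fn encode(&self) -> [u8; 24] {
        let mut bytes = [0u8; 24];
        bytes[0..8].copy_from_slice(&self.ordinal.to_be_bytes());
        bytes[8..16].copy_from_slice(&self.block.to_be_bytes());
        bytes[16..24].copy_from_slice(&self.transaction.to_be_bytes());
        bytes
    }
}

struct SatPoint {
    txid: [u8; 32],    // transaction containing the ordinal
    output_index: u32, // which output of that transaction
    offset: u64,       // satoshi offset within the output
}
```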

To look up an ordinal, you take the reversed range between &[] and (ordinal, u64::max_value(), u64::max_value()) and walk backwards until you find an entry.

This has the upside that the math is a bit simpler, since you don't have to sort some of the key fields in different orders, but the downside that you have to check whether the ordinal has been mined as of the index height before doing a query.
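
A rough illustration of the reverse scan, using a BTreeMap as a stand-in for the redb table and the hypothetical Key/SatPoint structs from the sketch above:

```rust
use std::collections::BTreeMap;

// Stand-in index: encoded key bytes -> satpoint of the first ordinal in the range.
type Index = BTreeMap<[u8; 24], SatPoint>;

fn find(index: &Index, ordinal: u64) -> Option<&SatPoint> {
    // The caller must already have checked that `ordinal` has been mined as of
    // the index height; otherwise the scan still matches some lower range and
    // returns a bogus location.
    let upper = Key {
        ordinal,
        block: u64::MAX,
        transaction: u64::MAX,
    }
    .encode();

    // Walk backwards from the upper bound: the first entry found is the range
    // whose start is the greatest key at or below (ordinal, MAX, MAX).
    // (The exact position within that range would then be derived from the
    // distance between `ordinal` and the range start.)
    index.range(..=upper).next_back().map(|(_key, satpoint)| satpoint)
}
```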

casey (Collaborator, Author) commented Feb 21, 2022

@cberner Bitcoin comes with basic serialization, but annoyingly it serializes everything little-endian, so I have to write custom serializers when I need big-endian values so that keys sort in the right order.
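
For example (a standalone illustration of the sorting problem, not ord code):

```rust
fn main() {
    // With little-endian encoding, byte-wise comparison breaks numeric order:
    assert!(1u64.to_le_bytes() > 256u64.to_le_bytes()); // [1,0,..] > [0,1,..]
    // Big-endian encoding keeps byte order and numeric order in agreement:
    assert!(1u64.to_be_bytes() < 256u64.to_be_bytes());
}
```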

cberner (Contributor) commented Feb 21, 2022

Nice! Have you tried indexing the whole blockchain yet?

casey (Collaborator, Author) commented Feb 21, 2022

I've indexed part of it, but haven't done the whole thing. Indexing got really slow at around block 175,000: earlier blocks took well under a second each, but at that height blocks started taking a minute or more. The blocks at that height don't differ in size or structure from earlier ones, so I think it's gotta be a space leak, or something else bad happening in redb when the database gets large. I haven't opened an issue yet because I haven't found a convenient way to reproduce it, or profiled it to see what's taking up the time.

By the way, what do you recommend for generating a flamegraph? I tried using flamegraph-rs but it was extremely slow and I gave up.

If you're feeling ambitious, you could sync a Bitcoin Core node and then try indexing the chain to see how it performs yourself. It should be pretty straightforward, although it does take ~500 GiB for the chain, plus however much space the ord database takes up.

We'll give it another shot in the next few days though, and hopefully come up with an easy way to reproduce the issue, or maybe find out that it's a problem on our end.

cberner (Contributor) commented Feb 21, 2022

I use flamegraph-rs and it works great, but yeah, you need to keep the profiling time short. Maybe you can artificially abort the indexing after a few seconds on one of those slow blocks? I usually run it on my benchmarks, which take maybe 10-20 seconds to run.
