Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build index in parallel #458

Closed
casey opened this issue Sep 2, 2022 · 2 comments
Closed

Build index in parallel #458

casey opened this issue Sep 2, 2022 · 2 comments

Comments

@casey
Copy link
Collaborator

casey commented Sep 2, 2022

The ordinal index is currently single threaded. It would be nice to figure out how to build it in parallel. One thing that will reduce parallelism is the fact that redb writers must be serialized.

The two sources of parallelism I can think of are:

  • Process blocks in parallel
  • Process transactions in parallel

Straw man proposal:

  • Create queue of transactions to process
  • Create a pool of workers that grabs transactions from the queue, figures out which ranges are in the inputs, which ranges are in the outputs
  • Workers then either commit those changes to the database, or communicate them to a writer thread that writes them to the database

This is blocked on #111, benchmarks, since you can't optimize what you can't benchmark.

@veryordinally
Copy link
Collaborator

Started looking at literature after our discussion last night. Found an interesting PhD thesis https://tuprints.ulb.tu-darmstadt.de/19668/1/Identification_of_Suitable_Parallelization_Patterns_for_Sequential_Programs_Ul_Huda.pdf

@casey
Copy link
Collaborator Author

casey commented Nov 15, 2022

I think this is either impossible or will complicate the codebase enormously, so closing for now.

@casey casey closed this as completed Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants