[META] Concurrent Searching #2587

Closed
2 of 9 tasks
reta opened this issue Mar 24, 2022 · 4 comments
Labels
enhancement Enhancement or improvement to existing feature or request Search:Aggregations

Comments

@reta
Collaborator

reta commented Mar 24, 2022

Is your feature request related to a problem? Please describe.
At least since Apache Lucene 6.x, there has been an experimental low-level API that allows search execution to be parallelized across segments [3]. As of the latest Apache Lucene 8.10.1, the API is still marked as experimental (see [1]). Community feedback on this feature has been positive so far (see [2]), and there is a good chance that, for certain kinds of indices, parallelizing the search over segments could bring performance benefits.

[1] https://lucene.apache.org/core/8_10_1/core/org/apache/lucene/search/IndexSearcher.html#search-org.apache.lucene.search.Query-org.apache.lucene.search.CollectorManager-
[2] https://engineeringblog.yelp.com/2021/09/nrtsearch-yelps-fast-scalable-and-cost-effective-search-engine.html
[3] https://blog.mikemccandless.com/2019/10/concurrent-query-execution-in-apache.html
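To make the idea concrete, here is a toy sketch of the general pattern: search each segment independently on a thread pool, then reduce the partial results into one top-k list. It is not the Lucene API. The segment representation (lists of `(doc_id, score)` pairs) and the function names `search_segment` and `concurrent_search` are assumptions made purely for illustration.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an index: each segment is a list of (doc_id, score) hits.
def search_segment(segment, k):
    """Collect the top-k hits of one segment (the per-segment collection step)."""
    return heapq.nlargest(k, segment, key=lambda hit: hit[1])

def concurrent_search(segments, k, max_workers=4):
    """Search all segments on a thread pool, then merge the partial top-k lists."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial = list(pool.map(lambda seg: search_segment(seg, k), segments))
    merged = [hit for hits in partial for hit in hits]
    # Reduce step: the global top-k is drawn from the per-segment top-k lists.
    return heapq.nlargest(k, merged, key=lambda hit: hit[1])

segments = [
    [("a1", 0.9), ("a2", 0.1)],
    [("b1", 0.5), ("b2", 0.7)],
    [("c1", 0.3)],
]
top = concurrent_search(segments, k=2)
```

In this sketch the slowest segment bounds the overall latency, which is one reason the benefit would depend on how the index is split into segments.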

Describe the solution you'd like
Support concurrent search over Apache Lucene segments.

Describe alternatives you've considered
N/A

Additional context

@msfroh
Collaborator

msfroh commented Apr 13, 2023

@reta -- out of curiosity, is there any plan to link concurrent search with a merge policy that evens out segment sizes?

On Amazon Product Search, we used a merge policy that dynamically adjusts the max segment size (to something like min(5GB, max(1GB, totalIndexSize/5)), IIRC) combined with a merge-on-commit setting that would merge all segments less than some threshold (something like 100MB). Basically, "lower the ceiling and raise the floor".
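As a rough illustration of the sizing rule quoted above, the following sketch computes the dynamic maximum segment size for a few index sizes. The comment gives the constants from memory ("something like", "IIRC"), so the values here are approximate, and the function and parameter names are hypothetical rather than taken from any merge policy implementation.

```python
GB = 1024 ** 3

def dynamic_max_segment_bytes(total_index_bytes,
                              floor=1 * GB,
                              ceiling=5 * GB,
                              target_segments=5):
    """Approximate rule from the comment: min(5GB, max(1GB, totalIndexSize/5)).

    All defaults are assumptions based on the recalled values.
    """
    return min(ceiling, max(floor, total_index_bytes // target_segments))

small = dynamic_max_segment_bytes(2 * GB)    # the 1GB floor applies
medium = dynamic_max_segment_bytes(10 * GB)  # one fifth of the index
large = dynamic_max_segment_bytes(100 * GB)  # the 5GB ceiling applies
```

Under this rule a mid-sized index ends up with roughly five large segments of similar size, which is the kind of even split that parallel per-segment search could exploit.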

(I talked about this around minute 14:00 of https://www.youtube.com/watch?v=UwclHSeE_B8. Sorry for the shameless plug. 😁 )

@reta
Collaborator Author

reta commented Apr 13, 2023

@msfroh this is a great idea, I think. I remember we discussed that at OpenSearchCon as well. I will follow up with the issue, thank you!

@hdhalter

Hi @yigithub, if anything here is related to 2.10, please create a doc issue or PR for the update. Thanks!

@yigithub yigithub self-assigned this Jan 22, 2024
@sohami
Collaborator

sohami commented Jun 3, 2024

Closing this META issue as concurrent search is GA now. Other improvements or follow-up items can be tracked separately in their respective issues, which are also tracked via the Concurrent Search Project Board.

@sohami sohami closed this as completed Jun 3, 2024
Projects
Status: Done
7 participants