Add Lucene specific file extensions to core HybridFS #721
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
Shouldn't we load these files early, using something like this: https://github.com/opensearch-project/OpenSearch/blob/2.5/server/src/main/java/org/opensearch/index/IndexModule.java#L147
Overall, the code looks good to me. Please reply to Jack's comment about adding another file type.
Codecov Report
@@ Coverage Diff @@
## main #721 +/- ##
============================================
- Coverage 84.64% 84.43% -0.22%
Complexity 1072 1072
============================================
Files 151 152 +1
Lines 4345 4356 +11
Branches 389 389
============================================
Hits 3678 3678
- Misses 489 498 +9
- Partials 178 180 +2
* Add lucene vector specific file extensions for io with mmap Signed-off-by: Martin Gaievski <gaievski@amazon.com> (cherry picked from commit 8a2aa04)
Signed-off-by: Martin Gaievski <gaievski@amazon.com>
Description
This change adds the file extensions of Lucene 9.4 vector value files to the list of extensions that core OpenSearch opens with MMap file I/O when the HybridFS store type is used. This improves performance at p99 for both data ingestion and queries. The setting is abstracted at the engine level, with a specific implementation for Lucene. It is set at the cluster-defaults level; index-specific overrides take priority over it.
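As a rough illustration of the behavior described above, the following sketch merges an engine-specific extension list into a default list and lets an index-level override win. It is not the code from this PR: the setting key `index.store.hybrid.mmap.extensions`, the extensions `vec` and `vex`, the sample core defaults, and all class and method names are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and values below are assumptions, not the PR's code.
class MmapExtensionsSketch {

    // Assumed name of the core setting that lists extensions opened with mmap.
    static final String HYBRID_MMAP_EXTENSIONS_KEY = "index.store.hybrid.mmap.extensions";

    // Assumed extensions of Lucene 9.4 vector value files.
    static final List<String> LUCENE_VECTOR_EXTENSIONS = List.of("vec", "vex");

    // Append engine-specific extensions to the core defaults,
    // dropping duplicates while keeping the original order.
    static List<String> mergeExtensions(List<String> coreDefaults, List<String> engineExtensions) {
        Set<String> merged = new LinkedHashSet<>(coreDefaults);
        merged.addAll(engineExtensions);
        return new ArrayList<>(merged);
    }

    // An index-level override, when present (non-null), takes priority
    // over the cluster-level default.
    static List<String> resolve(List<String> clusterDefault, List<String> indexOverride) {
        return indexOverride != null ? indexOverride : clusterDefault;
    }

    public static void main(String[] args) {
        // Illustrative subset of core defaults, not the real list.
        List<String> coreDefaults = List.of("nvd", "dvd", "tim");

        List<String> clusterDefault = mergeExtensions(coreDefaults, LUCENE_VECTOR_EXTENSIONS);
        System.out.println(HYBRID_MMAP_EXTENSIONS_KEY + " (cluster default) = " + clusterDefault);

        // An index that sets its own list ignores the cluster default entirely.
        List<String> indexOverride = List.of("nvd", "dvd");
        System.out.println(HYBRID_MMAP_EXTENSIONS_KEY + " (index override) = "
                + resolve(clusterDefault, indexOverride));
    }
}
```

Keeping the merge order-preserving and duplicate-free means an extension that is already in the core defaults is not listed twice if an engine also declares it.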
Tested locally on 1M dataset
Before change (baseline):
data ingestion
query
With the change
data ingestion
query
Issues Resolved
#637
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.