Support Elasticsearch Approximate Nearest Neighbors #5557

Closed
AndreasR90 opened this issue Aug 12, 2023 · 6 comments
Labels: 1.x, wontfix (This will not be worked on)


@AndreasR90

Elasticsearch 8.0 and later includes an approximate nearest neighbor (ANN) search based on HNSW. The corresponding blog post https://www.elastic.co/blog/introducing-approximate-nearest-neighbor-search-in-elasticsearch-8-0 indicates that it gives a significant query-time speedup over the exact kNN match that is currently used. The obvious downside is that not all actual nearest neighbors are found. In my opinion, the choice of algorithm should be left to the Haystack user.

Ideally, the ElasticsearchDocumentStore (for Elasticsearch >= 8) would take an additional argument that lets the user choose which query is used.

@anakin87
Member

Hello, @AndreasR90!

Related: #2810

@bogdankostic do you have any insights to share on this point?

@bogdankostic
Contributor

I haven't tried approximate kNN with Elasticsearch 8 yet, but I agree with @AndreasR90 that we should allow setting the index_type for ElasticsearchDocumentStore, just as we do with OpenSearch.

I had a quick look at the Elasticsearch documentation, and it seems that Elasticsearch always creates an index of type HNSW, so indexing time wouldn't even increase for users who choose approximate kNN over exact kNN with Elasticsearch 8.

To perform an approximate knn search, we would just need to set the knn option in the request body instead of using script_score.
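
For illustration only, here is a rough sketch of how the two request bodies might differ. It is not the Haystack implementation: the helper names, the `embedding` field name, and the `approximate` flag are all hypothetical. It also assumes an Elasticsearch 8.x release whose `_search` API accepts a top-level `knn` section, and a `dense_vector` field that is mapped with indexing enabled.

```python
from typing import Any, Dict, List


def exact_knn_body(query_emb: List[float], top_k: int,
                   field: str = "embedding") -> Dict[str, Any]:
    """Exact kNN: score every matching document with a script_score query."""
    return {
        "size": top_k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps the score non-negative,
                    # which Elasticsearch requires for script scores.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": query_emb},
                },
            }
        },
    }


def approximate_knn_body(query_emb: List[float], top_k: int,
                         field: str = "embedding",
                         num_candidates: int = 100) -> Dict[str, Any]:
    """Approximate kNN: use the top-level knn option of the search request."""
    return {
        "size": top_k,
        "knn": {
            "field": field,
            "query_vector": query_emb,
            "k": top_k,
            # num_candidates must not be smaller than k.
            "num_candidates": max(num_candidates, top_k),
        },
    }


def build_search_body(query_emb: List[float], top_k: int,
                      approximate: bool = False) -> Dict[str, Any]:
    """Hypothetical switch that lets the caller pick the query type."""
    if approximate:
        return approximate_knn_body(query_emb, top_k)
    return exact_knn_body(query_emb, top_k)
```

Either dictionary could then be sent as the body of a search request. A user-facing argument on the document store would only need to decide which of the two builders is called.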

@AndreasR90
Author

AndreasR90 commented Aug 14, 2023

I had a closer look at this yesterday and have a first implementation of this feature. I can create a draft PR this afternoon. What do you think, @bogdankostic?

@bogdankostic
Contributor

@AndreasR90 Yes, creating a draft PR would be awesome. ⭐

@AndreasR90
Author

Hi @bogdankostic,
as promised I opened the Draft PR yesterday. Feel free to have a look and provide feedback 😊

@bogdankostic self-assigned this Aug 30, 2023
@masci added the 1.x and wontfix (This will not be worked on) labels Jan 8, 2024
@masci
Contributor

masci commented Jan 8, 2024

Closing as won't fix, Haystack 2.x supports HNSW.

@masci closed this as completed Jan 8, 2024
4 participants