Fix bug where off-heap scorer would kick on even for float vectors #13850

Merged
merged 1 commit into from
Oct 2, 2024

Conversation

benwtrent
Member

Introduced in the major refactor #13779.

Off-heap scoring is only implemented for byte[] vectors, so it isn't enough to verify that the vector provider also satisfies the HasIndexSlice interface. The vectors must also be byte vectors; otherwise, the slice iteration and scoring are completely nonsensical, causing HNSW graph building to run until the heat death of the universe.
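
To make the shape of the check concrete, here is a minimal sketch. It does not use Lucene's real classes: `Encoding`, `SliceBacked` (standing in for a marker like HasIndexSlice), `VectorSource`, and both `*Choice` methods are hypothetical names invented for illustration. It only shows why testing the slice interface alone is insufficient and why the encoding has to be checked as well.

```java
/** Hypothetical stand-ins for illustration; these are not Lucene's real types. */
final class ScorerSelectionSketch {

  enum Encoding { BYTE, FLOAT32 }

  /** Stand-in for a marker interface such as HasIndexSlice. */
  interface SliceBacked {}

  /** Stand-in for a vector provider that knows its own encoding. */
  interface VectorSource {
    Encoding encoding();
  }

  /** A provider backed by an index slice; it may hold bytes or floats. */
  static final class OffHeapVectors implements VectorSource, SliceBacked {
    private final Encoding encoding;

    OffHeapVectors(Encoding encoding) {
      this.encoding = encoding;
    }

    @Override
    public Encoding encoding() {
      return encoding;
    }
  }

  /** A provider with no index slice behind it. */
  static final class OnHeapVectors implements VectorSource {
    private final Encoding encoding;

    OnHeapVectors(Encoding encoding) {
      this.encoding = encoding;
    }

    @Override
    public Encoding encoding() {
      return encoding;
    }
  }

  /** Buggy shape: only the slice interface is checked. */
  static String buggyChoice(VectorSource v) {
    return (v instanceof SliceBacked) ? "off-heap" : "default";
  }

  /** Fixed shape: the vectors must be byte vectors AND slice-backed. */
  static String fixedChoice(VectorSource v) {
    boolean isByte = v.encoding() == Encoding.BYTE;
    return (isByte && v instanceof SliceBacked) ? "off-heap" : "default";
  }

  public static void main(String[] args) {
    VectorSource floats = new OffHeapVectors(Encoding.FLOAT32);
    System.out.println(buggyChoice(floats)); // float data wrongly sent to the byte-only path
    System.out.println(fixedChoice(floats)); // float data falls back to the default path
  }
}
```

In this sketch, slice-backed byte vectors still take the off-heap path under both versions; only the float case changes.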

@benwtrent benwtrent requested review from msokolov and ChrisHegarty and removed request for msokolov October 2, 2024 13:10
Contributor

@ChrisHegarty ChrisHegarty left a comment

Thanks for finding and fixing this, Ben. LGTM

@benwtrent
Member Author

I am going to merge and backport to 10_0_0 & 10x. I am not adding a changes entry, as this could be seen as a bugfix for an unreleased feature (the vector API rewrite). If folks disagree with this, I will happily add a bugfix item.

@benwtrent benwtrent merged commit 56e9468 into apache:main Oct 2, 2024
3 checks passed
@benwtrent benwtrent deleted the bugfix-use-correct-scorer branch October 2, 2024 13:27
benwtrent added a commit that referenced this pull request Oct 2, 2024
…13850)

Introduced in the major refactor #13779.

Off-heap scoring is only implemented for byte[] vectors, so it isn't enough to verify that the vector provider also satisfies the HasIndexSlice interface. The vectors must also be byte vectors; otherwise, the slice iteration and scoring are completely nonsensical, causing HNSW graph building to run until the heat death of the universe.
ChrisHegarty added a commit that referenced this pull request Oct 2, 2024
This is a test-only change that verifies the behaviour when float vector values are passed to our FlatVectorsScorer implementations. This would have caught the bug causing #13844, subsequently fixed by #13850.
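
As a rough illustration of what a check on float input can detect, the sketch below stores two float vectors in little-endian buffers and scores them two ways: by decoding floats, and by reading the first `dim` raw bytes as if they were byte vectors. Everything here (`FloatScorerCheckSketch`, `encode`, `floatDot`, `byteDot`) is a hypothetical helper written for this example, not the actual Lucene test or scorer code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helpers for illustration; not the actual Lucene test. */
final class FloatScorerCheckSketch {

  /** Stores a float vector as little-endian bytes, like a slice of float data. */
  static ByteBuffer encode(float[] v) {
    ByteBuffer buf = ByteBuffer.allocate(v.length * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    for (float f : v) {
      buf.putFloat(f);
    }
    buf.flip();
    return buf;
  }

  /** Correct path: decode each component as a float before multiplying. */
  static float floatDot(ByteBuffer a, ByteBuffer b, int dim) {
    float sum = 0f;
    for (int i = 0; i < dim; i++) {
      sum += a.getFloat(i * Float.BYTES) * b.getFloat(i * Float.BYTES);
    }
    return sum;
  }

  /** Wrong path for float data: treat the first dim raw bytes as the components. */
  static int byteDot(ByteBuffer a, ByteBuffer b, int dim) {
    int sum = 0;
    for (int i = 0; i < dim; i++) {
      sum += a.get(i) * b.get(i);
    }
    return sum;
  }

  public static void main(String[] args) {
    ByteBuffer a = encode(new float[] {1f, 2f});
    ByteBuffer b = encode(new float[] {3f, 4f});
    System.out.println(floatDot(a, b, 2)); // 1*3 + 2*4
    System.out.println(byteDot(a, b, 2)); // unrelated to the float result
  }
}
```

A test along these lines fails as soon as float data is routed through byte-oriented scoring, because the two results disagree.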