Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: change scalar index to return RowIdTreeMap instead of u64 array #2587

Merged
merged 3 commits into from
Jul 11, 2024

Conversation

westonpace
Copy link
Contributor

@westonpace westonpace commented Jul 11, 2024

The filtering paths that use scalar indices are all built to use RowIdTreeMap. Scalar indices were built to use UInt64Array. This means we have to copy from one to the other (an O(N log(N)) operation. When we just had btree indices this was a fairly cheap operation since the indices shouldn't be returning large #'s of rows anyways. Now that we have BITMAP and LABEL_LIST we can have indices that return very large #'s of rows and this copy was expensive. By changing the scalar indices to use RowIdTreeMap directly we skip the conversion.

NOTE: any LABEL_LIST or BITMAP index that was previously created will be invalid and will need to be recreated. Since we haven't had an official release of these index types I am not considering this a breaking change.

Copy link
Contributor

@eddyxu eddyxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTm

@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 87.58621% with 18 lines in your changes missing coverage. Please review.

Project coverage is 79.91%. Comparing base (a237fca) to head (e0ef6a9).

Files Patch % Lines
rust/lance-core/src/utils/mask.rs 66.66% 6 Missing ⚠️
rust/lance-index/src/scalar/bitmap.rs 83.87% 5 Missing ⚠️
rust/lance-index/src/scalar/label_list.rs 85.18% 1 Missing and 3 partials ⚠️
rust/lance-index/src/scalar/lance_format.rs 95.74% 0 Missing and 2 partials ⚠️
rust/lance-index/src/scalar/flat.rs 83.33% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2587      +/-   ##
==========================================
- Coverage   79.93%   79.91%   -0.02%     
==========================================
  Files         212      212              
  Lines       61660    61639      -21     
  Branches    61660    61639      -21     
==========================================
- Hits        49285    49256      -29     
- Misses       9420     9445      +25     
+ Partials     2955     2938      -17     
Flag Coverage Δ
unittests 79.91% <87.58%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@westonpace westonpace merged commit c89223b into lancedb:main Jul 11, 2024
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants