fix: filter the vec index in function Index::scalar_index_info (#3000)
Creating a vector index fails with an error when the schema has a nullable field.


![image](https://github.com/user-attachments/assets/d323b9e0-6d39-4293-92b7-3ba3adfc8b2b)

When the vector column already has an index built, a filter is set on the
dataset scanner in the function `compute_partitions`:
```python
    if dataset.schema.field(column).nullable and filter_nan:
        filt = f"{column} is not null"
    else:
        filt = None
```

The scanner then uses the scalar index scanner, which treats the
vector index as a scalar index (B-tree index).

This PR filters out vector indices in `scalar_index_info`, so it now
returns only scalar indices.
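
The filtering rule can be sketched in isolation as follows. This is a minimal illustration, not the Lance implementation: `DataType`, `Field`, `IndexMeta`, `scalar_index_names`, and `example_dataset` are hypothetical stand-ins defined here so the snippet compiles on its own. It assumes, as the change does, that an index over a `FixedSizeList` column is a vector index.

```rust
// Hypothetical stand-ins for the Arrow/Lance types; only their shape matters here.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Float32,
    // (element type, list length), loosely mirroring Arrow's FixedSizeList.
    FixedSizeList(Box<DataType>, i32),
}

#[derive(Debug, Clone)]
struct Field {
    id: i32,
    data_type: DataType,
}

#[derive(Debug, Clone)]
struct IndexMeta {
    name: String,
    fields: Vec<i32>, // ids of the indexed fields
}

/// Returns the names of single-column indices whose column is not a
/// fixed-size list, i.e. the indices treated as scalar indices.
fn scalar_index_names(schema: &[Field], indices: &[IndexMeta]) -> Vec<String> {
    indices
        .iter()
        .filter(|idx| {
            // An index counts as a vector index if any indexed field is a FixedSizeList.
            let is_vector_index = idx.fields.iter().any(|id| {
                schema
                    .iter()
                    .find(|f| f.id == *id)
                    .map_or(false, |f| matches!(f.data_type, DataType::FixedSizeList(_, _)))
            });
            idx.fields.len() == 1 && !is_vector_index
        })
        .map(|idx| idx.name.clone())
        .collect()
}

/// A made-up dataset: one B-tree index on an integer column and one
/// vector index on a 128-dimensional float column.
fn example_dataset() -> (Vec<Field>, Vec<IndexMeta>) {
    let schema = vec![
        Field { id: 0, data_type: DataType::Int64 },
        Field {
            id: 1,
            data_type: DataType::FixedSizeList(Box::new(DataType::Float32), 128),
        },
    ];
    let indices = vec![
        IndexMeta { name: "id_btree".to_string(), fields: vec![0] },
        IndexMeta { name: "vec_ivf_pq".to_string(), fields: vec![1] },
    ];
    (schema, indices)
}
```

Calling `scalar_index_names` on the output of `example_dataset()` keeps `id_btree` and drops `vec_ivf_pq`, so a `{column} is not null` filter on the vector column no longer resolves to a scalar index.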
SaintBacchus authored Oct 14, 2024
1 parent 5c8f565 commit f98ffdd
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion rust/lance/src/index.rs
@@ -781,7 +781,14 @@ impl DatasetIndexInternalExt for Dataset {
         let indices = self.load_indices().await?;
         let schema = self.schema();
         let mut indexed_fields = Vec::new();
-        for index in indices.iter().filter(|idx| idx.fields.len() == 1) {
+        for index in indices.iter().filter(|idx| {
+            let idx_schema = schema.project_by_ids(idx.fields.as_slice());
+            let is_vector_index = idx_schema
+                .fields
+                .iter()
+                .any(|f| matches!(f.data_type(), DataType::FixedSizeList(_, _)));
+            idx.fields.len() == 1 && !is_vector_index
+        }) {
             let field = index.fields[0];
             let field = schema.field_by_id(field).ok_or_else(|| Error::Internal {
                 message: format!(
