doc: pgvector compatibility
Signed-off-by: cutecutecat <junyuchen@tensorchord.ai>
cutecutecat committed Jan 18, 2024
1 parent 6ba9402 commit ed18755
Showing 2 changed files with 80 additions and 0 deletions.
1 change: 1 addition & 0 deletions .vitepress/config.mts
@@ -66,6 +66,7 @@ export default defineConfig({
{ text: 'Indexing', link: '/usage/indexing' },
{ text: 'Search', link: '/usage/search' },
{ text: 'Monitoring', link: '/usage/monitoring' },
{ text: 'Compatible with pgvector', link: '/usage/compatibility' },
]
},
{
79 changes: 79 additions & 0 deletions src/usage/compatibility.md
@@ -0,0 +1,79 @@
# Compatible with pgvector

pgvecto.rs can be configured to be compatible with `pgvector` for the following operations:
- Create table
- Create vector indexes
- Search vectors

`Create table` and `search vectors` are supported natively, with no extra configuration.

`Create vector indexes` requires compatibility mode, which is enabled with `SET vectors.pgvector_compat=on;`.


## Examples

The following example enables compatibility mode, builds an index, and runs a search:
```sql
DROP TABLE IF EXISTS t;
SET vectors.pgvector_compat=on;
CREATE TABLE t (val vector(3));
INSERT INTO t (val) SELECT ARRAY[random(), random(), random()]::real[] FROM generate_series(1, 1000);
CREATE INDEX hnsw_l2_index ON t USING hnsw (val vector_l2_ops);
SELECT COUNT(1) FROM (SELECT 1 FROM t ORDER BY val <-> '[0.5,0.5,0.5]' limit 100) t2;
DROP INDEX hnsw_l2_index;
```

Multiple types of indexes are accepted:
```sql
SET vectors.pgvector_compat=on;
-- [hnsw + vector_l2_ops] index with default options
CREATE INDEX hnsw_l2_index ON t USING hnsw (val vector_l2_ops);
-- [hnsw + vector_cosine_ops] index with single ef_construction option
CREATE INDEX hnsw_cosine_index ON t USING hnsw (val vector_cosine_ops) WITH (ef_construction = 80);
-- anonymous [hnsw + vector_ip_ops] with all options
CREATE INDEX ON t USING hnsw (val vector_ip_ops) WITH (ef_construction = 80, m = 12);
-- [ivfflat + vector_l2_ops] index with default options
CREATE INDEX ivfflat_l2_index ON t USING ivfflat (val vector_l2_ops);
-- [ivfflat + vector_cosine_ops] index with all options
CREATE INDEX ivfflat_cosine_index ON t USING ivfflat (val vector_cosine_ops) WITH (lists = 80);
-- anonymous [ivfflat + vector_ip_ops] with all options
CREATE INDEX ON t USING ivfflat (val vector_ip_ops) WITH (lists = 80);
```

## Limitation

We strive to keep the user experience consistent with `pgvector`, but compatibility mode still has limitations in two respects:

- Some features of pgvecto.rs are not available in `compatibility mode`.
- Some features of pgvector behave differently in `compatibility mode`.

### Unavailable features of pgvecto.rs
When pgvecto.rs is running in `compatibility mode`, some features of pgvecto.rs are not available:
- `flat` index
- Quantization, including `scalar quantization` and `product quantization`
- Prefilter and vbase

For the `ivfflat` and `hnsw` indexes, only the following options are available.
Their default values **differ from those of native pgvecto.rs** and match the defaults of `pgvector`.

Options for `ivfflat`.

| Key | Type | Default | Description |
| ---------------- | ------- | ------- | ----------------------------------------- |
| lists | integer | `100` | Number of cluster units. |

Options for `hnsw`.

| Key             | Type    | Default | Description                            |
| --------------- | ------- | ------- | -------------------------------------- |
| m | integer | `16` | Maximum degree of the node. |
| ef_construction | integer | `64` | Search extent in construction. |
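
As a rough sketch of what these defaults imply, assuming the table `t` from the examples above and taking the values directly from the two tables, each pair of statements below should produce identically configured indexes. The index names are hypothetical and chosen only for illustration.

```sql
SET vectors.pgvector_compat=on;

-- hnsw: omitting WITH relies on the compatibility-mode defaults ...
CREATE INDEX hnsw_default_index ON t USING hnsw (val vector_l2_ops);
-- ... which, per the table above, is the same as spelling them out.
CREATE INDEX hnsw_explicit_index ON t USING hnsw (val vector_l2_ops) WITH (m = 16, ef_construction = 64);

-- ivfflat: the default number of lists is 100.
CREATE INDEX ivfflat_default_index ON t USING ivfflat (val vector_l2_ops);
CREATE INDEX ivfflat_explicit_index ON t USING ivfflat (val vector_l2_ops) WITH (lists = 100);
```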

### Difference from pgvector

Even in `compatibility mode`, pgvecto.rs still differs from pgvector in some ways.

Known differences include, but are not limited to:
- Creating vector indexes is asynchronous in pgvecto.rs, whereas it is synchronous in pgvector.
- The query options [ivfflat.probes](https://github.com/pgvector/pgvector#query-options-1) and [hnsw.ef_search](https://github.com/pgvector/pgvector?tab=readme-ov-file#query-options) must be replaced with `vectors.ivf_nprobe` and `vectors.hnsw_ef_search`, respectively.
- [Vector Functions](https://github.com/pgvector/pgvector?tab=readme-ov-file#vector-functions) and [Aggregate Functions](https://github.com/pgvector/pgvector?tab=readme-ov-file#aggregate-functions) are not supported.
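
The query-option renaming can be illustrated with a minimal sketch. It assumes the table `t` and one of the indexes from the examples above; the values `10` and `100` are arbitrary and are not recommendations.

```sql
-- pgvector would use: SET ivfflat.probes = 10;
SET vectors.ivf_nprobe = 10;

-- pgvector would use: SET hnsw.ef_search = 100;
SET vectors.hnsw_ef_search = 100;

-- The search query itself is written the same way as in pgvector.
SELECT * FROM t ORDER BY val <-> '[0.5,0.5,0.5]' LIMIT 10;
```

When porting an application from pgvector, only the `SET` statements need to change; the `ORDER BY ... <->` query can stay as it is.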
