Signed-off-by: cutecutecat <junyuchen@tensorchord.ai>
1 parent
6ba9402
commit ed18755
Showing 2 changed files with 80 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,79 @@ | ||
# Compatible with pgvector | ||
|
||
pgvecto.rs can be configured to be compatible with `pgvector` for the following operations: | ||
- Create table | ||
- Create vector indexes | ||
- Search vectors | ||
|
||
For `create table` and `search vectors`, pgvecto.rs supports this feature natively. | ||
|
||
For `create vector indexes`, this feature must be enabled with `SET vectors.pgvector_compat=on;`. | ||
|
||
|
||
## Examples | ||
|
||
It's easy to enable compatibility mode and run a search. | ||
```sql | ||
DROP TABLE IF EXISTS t; | ||
SET vectors.pgvector_compat=on; | ||
CREATE TABLE t (val vector(3)); | ||
INSERT INTO t (val) SELECT ARRAY[random(), random(), random()]::real[] FROM generate_series(1, 1000); | ||
CREATE INDEX hnsw_l2_index ON t USING hnsw (val vector_l2_ops); | ||
SELECT COUNT(1) FROM (SELECT 1 FROM t ORDER BY val <-> '[0.5,0.5,0.5]' limit 100) t2; | ||
DROP INDEX hnsw_l2_index; | ||
``` | ||
|
||
Multiple types of indexes are accepted: | ||
```sql | ||
SET vectors.pgvector_compat=on; | ||
-- [hnsw + vector_l2_ops] index with default options | ||
CREATE INDEX hnsw_l2_index ON t USING hnsw (val vector_l2_ops); | ||
-- [hnsw + vector_cosine_ops] index with single ef_construction option | ||
CREATE INDEX hnsw_cosine_index ON t USING hnsw (val vector_cosine_ops) WITH (ef_construction = 80); | ||
-- anonymous [hnsw + vector_ip_ops] with all options | ||
CREATE INDEX ON t USING hnsw (val vector_ip_ops) WITH (ef_construction = 80, m = 12); | ||
-- [ivfflat + vector_l2_ops] index with default options | ||
CREATE INDEX ivfflat_l2_index ON t USING ivfflat (val vector_l2_ops); | ||
-- [ivfflat + vector_cosine_ops] index with all options | ||
CREATE INDEX ivfflat_cosine_index ON t USING ivfflat (val vector_cosine_ops) WITH (lists = 80); | ||
-- anonymous [ivfflat + vector_ip_ops] with all options | ||
CREATE INDEX ON t USING ivfflat (val vector_ip_ops) WITH (lists = 80); | ||
``` | ||
|
||
## Limitations | ||
|
||
In compatibility mode we strive to keep the user experience consistent with `pgvector`, but two kinds of limitations remain: | ||
|
||
- Some features of pgvecto.rs are not available in `compatibility mode`. | ||
- Some features of pgvector behave differently in `compatibility mode`. | ||
|
||
### Unavailable features of pgvecto.rs | ||
When pgvecto.rs is running in `compatibility mode`, some features of pgvecto.rs are not available: | ||
- `flat` index | ||
- Quantization, including `scalar quantization` and `product quantization` | ||
- prefilter and vbase | ||
|
||
For `ivfflat` and `hnsw` indexes, only the following options are available. | ||
Their default values **differ from the pgvecto.rs originals** and match those of `pgvector`. | ||
|
||
Options for `ivfflat`. | ||
|
||
| Key | Type | Default | Description | | ||
| ---------------- | ------- | ------- | ----------------------------------------- | | ||
| lists | integer | `100` | Number of cluster units. | | ||
|
||
Options for `hnsw`. | ||
|
||
| Key | Type | Default | Description | | ||
| --------------- | ------- | ------- | -------------------------------------- | | ||
| m | integer | `16` | Maximum degree of the node. | | ||
| ef_construction | integer | `64` | Search extent in construction. | | ||
|
||
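
As a minimal sketch of how these options are used, the statements below write out the defaults from the two tables above explicitly. They assume the table `t` from the Examples section already exists; the index names are hypothetical and chosen only for illustration.

```sql
-- Creating indexes with pgvector syntax requires compatibility mode.
SET vectors.pgvector_compat=on;
-- hnsw with the documented defaults written out (m = 16, ef_construction = 64)
CREATE INDEX hnsw_defaults_index ON t USING hnsw (val vector_l2_ops) WITH (m = 16, ef_construction = 64);
-- ivfflat with the documented default written out (lists = 100)
CREATE INDEX ivfflat_defaults_index ON t USING ivfflat (val vector_l2_ops) WITH (lists = 100);
```

Given the defaults listed above, omitting the `WITH` clause should select the same values.
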
### Difference from pgvector | ||
|
||
Even in `compatibility mode`, pgvecto.rs still differs from pgvector in some respects. | ||
|
||
Known differences include, but are not limited to: | ||
- Index creation is asynchronous in pgvecto.rs, whereas it is synchronous in pgvector. | ||
- The query option [ivfflat.probes](https://github.com/pgvector/pgvector#query-options-1) must be replaced by `vectors.ivf_nprobe`, and [hnsw.ef_search](https://github.com/pgvector/pgvector?tab=readme-ov-file#query-options) by `vectors.hnsw_ef_search`. | ||
- [Vector Functions](https://github.com/pgvector/pgvector?tab=readme-ov-file#vector-functions) and [Aggregate Functions](https://github.com/pgvector/pgvector?tab=readme-ov-file#aggregate-functions) are not supported. |
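
To illustrate the query option mapping above, the sketch below sets the pgvecto.rs options in place of their pgvector counterparts before running a search. It assumes the table `t` from the Examples section; the values 10 and 100 are illustrative, not recommendations.

```sql
-- pgvector would use: SET ivfflat.probes = 10;
SET vectors.ivf_nprobe = 10;
-- pgvector would use: SET hnsw.ef_search = 100;
SET vectors.hnsw_ef_search = 100;
-- The search itself is written the same way as in pgvector.
SELECT * FROM t ORDER BY val <-> '[0.5,0.5,0.5]' LIMIT 10;
```
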