fix: Pgvector patch #4103

Merged
merged 4 commits into from
Apr 16, 2024

Conversation

HaoXuAI (Collaborator) commented Apr 16, 2024

What this PR does / why we need it:

Makes the Postgres online store backward compatible with the existing write and read paths when pgvector is enabled. This is done by adding a new column, vector_value, that stores the vector value when pgvector is enabled.
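
As a rough illustration of the schema change described above, the sketch below builds a CREATE TABLE statement that adds a nullable vector_value column only when pgvector is enabled. The function name, the other columns, and the `dim` parameter are assumptions made for illustration and are not taken from this PR; only the vector_value column name and pgvector's `vector(n)` type are given.

```python
def create_table_sql(table: str, pgvector_enabled: bool, dim: int = 3) -> str:
    """Return a CREATE TABLE statement for a hypothetical online-store table."""
    columns = [
        "entity_key BYTEA",
        "feature_name TEXT",
        "value BYTEA",  # serialized feature value, present with or without pgvector
    ]
    if pgvector_enabled:
        # vector(n) is the column type provided by the pgvector extension.
        # The column is nullable so non-vector features can share the table.
        columns.append(f"vector_value vector({dim}) NULL")
    body = ",\n    ".join(columns)
    return f"CREATE TABLE IF NOT EXISTS {table} (\n    {body}\n);"


if __name__ == "__main__":
    print(create_table_sql("demo_features", pgvector_enabled=True))
```

Because the existing value column is untouched, readers and writers that know nothing about vectors keep working against the same table.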

Which issue(s) this PR fixes:

Fixes

Signed-off-by: cmuhao <sduxuhao@gmail.com>
-    "pgvector_enabled" in config.online_config
-    and config.online_config["pgvector_enabled"]
+    "pgvector_enabled" in config.online_store
+    and config.online_store.pgvector_enabled
Collaborator

Shouldn't this be some sort of feature-view-level config? This global config makes it impossible to use Postgres for both traditional feature lookup and vector search at the same time, isn't that right?

Collaborator Author

That's a good point. Before this PR we couldn't do that because the vector was mixed in with the feature value. Now it's doable, since the vector is a separate feature. Let me see how that would work.

Collaborator Author

@tokoko On second thought, I would rather not touch the feature view for now, until we figure out a better design.
Also, this still won't affect the original read/write flow, with or without pgvector_enabled.
And Postgres users have to install the extension before they can use pgvector, so it's better to keep this config at a high level as well.
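
To make the "original read/write flow is unaffected" claim concrete, here is a minimal sketch of a store-level flag guarding only the vector column. `OnlineStoreSettings` and `row_to_write` are hypothetical names used for illustration, not Feast classes or functions; only the pgvector_enabled flag and the vector_value column come from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OnlineStoreSettings:
    """Hypothetical stand-in for the store-level config; not Feast's class."""
    pgvector_enabled: bool = False


def row_to_write(
    settings: OnlineStoreSettings,
    feature_name: str,
    value: bytes,
    vector_text: Optional[str] = None,
) -> dict:
    """Build one row; the regular value is written regardless of the flag."""
    row = {"feature_name": feature_name, "value": value, "vector_value": None}
    if settings.pgvector_enabled:
        # Only the extra column depends on the global flag.
        row["vector_value"] = vector_text
    return row


if __name__ == "__main__":
    print(row_to_write(OnlineStoreSettings(True), "embedding", b"\x01", "[1,2]"))
```

The trade-off raised in this thread still applies: the flag is global to the store, so it cannot be switched per feature view.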

franciscojavierarceo (Member) left a comment

> This global config makes it impossible to use Postgres for both traditional feature lookup and vector search at the same time, isn't that right?

Agreed with @tokoko here. I'll approve, but we should plan to tackle this soon, because context-aware RAG would be very useful when you want to inject structured data into the context along with the documents.

HaoXuAI merged commit 5c4a9c5 into master on Apr 16, 2024
24 checks passed
HaoXuAI (Collaborator, Author) commented Apr 16, 2024

> > This global config makes it impossible to use Postgres for both traditional feature lookup and vector search at the same time, isn't that right?
>
> Agreed with @tokoko here. I'll approve, but we should plan to tackle this soon, because context-aware RAG would be very useful when you want to inject structured data into the context along with the documents.

Hmmm, get_online_features will still work :)

tokoko (Collaborator) commented Apr 16, 2024

@HaoXuAI I also think this isn't urgent, but we should definitely aim for a good design for how these two worlds will interact in the future. For example, as @franciscojavierarceo pointed out, could there be a use case where a single feature service returns results from both feature lookup and vector search?

In this specific case, I was referring to the fact that when pgvector is enabled, the materialize step calls the get_list_val_str function, which raises an exception for many data types. So, if I'm not mistaken, materialization for both types of features can't coexist right now.

HaoXuAI (Collaborator, Author) commented Apr 16, 2024

> @HaoXuAI I also think this isn't urgent, but we should definitely aim for a good design for how these two worlds will interact in the future. For example, as @franciscojavierarceo pointed out, could there be a use case where a single feature service returns results from both feature lookup and vector search?
>
> In this specific case, I was referring to the fact that when pgvector is enabled, the materialize step calls the get_list_val_str function, which raises an exception for many data types. So, if I'm not mistaken, materialization for both types of features can't coexist right now.

With this PR it should work: get_list_val_str returns null for value types it doesn't accept, and the vector_value field is nullable, so materialization should still work.
Good call on that, though. I think we can revisit the design once another online store is implemented.
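
The null fallback described above can be sketched roughly as follows. This is not the actual get_list_val_str implementation; `vector_text_or_none` is a hypothetical helper that works on plain Python values, and the rule for what counts as an acceptable vector (a non-empty list of numbers) is an assumption. The `[x,y,z]` text form is the input format pgvector accepts for its vector type.

```python
from typing import Any, Optional


def vector_text_or_none(value: Any) -> Optional[str]:
    """Return a pgvector-style text literal, or None if value is not a numeric list."""
    if not isinstance(value, (list, tuple)) or len(value) == 0:
        return None  # scalars, strings, and empty lists are stored as NULL
    for item in value:
        # bool is a subclass of int, so it is excluded explicitly
        if isinstance(item, bool) or not isinstance(item, (int, float)):
            return None
    return "[" + ",".join(str(item) for item in value) + "]"


if __name__ == "__main__":
    for candidate in ([1.0, 2.5], "abc", 42, ["a", "b"]):
        print(repr(candidate), "->", vector_text_or_none(candidate))
```

Returning None instead of raising means a row with a non-vector feature simply gets NULL in vector_value, which is why vector and non-vector features can be materialized into the same table.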

lokeshrangineni pushed a commit to lokeshrangineni/feast that referenced this pull request Apr 16, 2024
jeremyary pushed a commit that referenced this pull request Apr 17, 2024
# [0.37.0](v0.36.0...v0.37.0) (2024-04-17)

### Bug Fixes

* Pgvector patch ([#4103](#4103)) ([5c4a9c5](5c4a9c5))
* Remove top-level grpc import in cli ([#4107](#4107)) ([4362b6c](4362b6c))

### Features

* Add tags to dynamodb config ([#4100](#4100)) ([b08b8d5](b08b8d5))
3 participants