feat(server): separate face search relation #10371
Can we also mark the clip embeddings as external?
Yep, this makes the clip embeddings external too.
await queryRunner.query(`ALTER TABLE asset_faces ADD COLUMN "embedding" vector(512)`);
await queryRunner.query(`ALTER TABLE face_search ALTER COLUMN embedding SET STORAGE DEFAULT`);
await queryRunner.query(`ALTER TABLE smart_search ALTER COLUMN embedding SET STORAGE DEFAULT`);
Ah, I missed this line. Nice!
Description
This PR addresses a subtle issue with the current facial recognition. Each time a face is assigned or reassigned to a person, during both the initial and later facial recognition runs, an additional duplicate embedding is inserted into the face vector index. This can degrade the index, as the majority of its entries end up being duplicates, which in turn leads to faces sometimes not being recognized when they should be.
This PR moves the embedding into a separate table with a one-to-one relation, similar to how smart search is handled. As a result, changes to the face, such as which person it's assigned to, have no effect on the index. A smaller but notable benefit is that these changes become faster and produce less WAL.
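As a rough illustration of the table split described above, the statements could look something like the sketch below. This is not the PR's actual migration: the `"faceId"` and `"id"` column names and the helper name `buildSeparateEmbeddingStatements` are assumptions made for the example, and recreating the vector index on the new table is left out.

```typescript
// Sketch only: "faceId" and "id" are assumed column names, not taken from the PR.
// Recreating the vector index on face_search is intentionally omitted.
function buildSeparateEmbeddingStatements(dimensions: number = 512): string[] {
  return [
    // One row per face; the primary key doubles as the one-to-one foreign key.
    `CREATE TABLE face_search (
       "faceId" uuid PRIMARY KEY REFERENCES asset_faces ("id") ON DELETE CASCADE,
       "embedding" vector(${dimensions}) NOT NULL
     )`,
    // Copy the existing embeddings across so no data is lost.
    `INSERT INTO face_search ("faceId", "embedding")
       SELECT "id", "embedding" FROM asset_faces WHERE "embedding" IS NOT NULL`,
    // After this, updates to asset_faces rows no longer involve the vector column.
    `ALTER TABLE asset_faces DROP COLUMN "embedding"`,
  ];
}
```

In a TypeORM migration, each statement would be passed to `queryRunner.query(...)` in order.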
Another notable benefit is that this makes supporting manually added faces easier, as an embedding is no longer required.
It also sets the embedding columns' storage to external so Postgres doesn't try to compress the embeddings, following the finding here.
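A minimal sketch of the storage change, assuming it is applied to the `embedding` column of both `face_search` and `smart_search` (the two tables whose storage the revert in the diff resets to `DEFAULT`). The helper name `embeddingStorageStatements` is hypothetical.

```typescript
// EXTERNAL stores large values out of line without compression;
// DEFAULT restores the column type's default strategy (used when reverting).
type StorageMode = "EXTERNAL" | "DEFAULT";

// Hypothetical helper: builds one ALTER TABLE statement per table.
function embeddingStorageStatements(
  mode: StorageMode,
  tables: string[] = ["face_search", "smart_search"],
): string[] {
  // Note: SET STORAGE only affects rows written afterwards;
  // it does not rewrite rows already in the table.
  return tables.map((table) => `ALTER TABLE ${table} ALTER COLUMN embedding SET STORAGE ${mode}`);
}
```

Calling it with `"EXTERNAL"` gives the forward direction, and `"DEFAULT"` reproduces the two revert statements shown in the diff.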
Fixes #10277
How Has This Been Tested?
Tested that the migration succeeds without loss of data and that both the face detection and facial recognition jobs continue to work. Also confirmed that the result of
SELECT idx_tuples FROM pg_vector_index_stat WHERE indexname = 'face_index';
is identical to the number of faces, i.e. the index contains no duplicate embeddings.
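The same check could be scripted roughly as follows. This is a sketch under assumptions: the face count is taken from `face_search` (the PR text only says "the number of faces"), the `Query` type stands in for something like `queryRunner.query`, and the helper names are hypothetical.

```typescript
type Row = Record<string, unknown>;
type Query = (sql: string) => Promise<Row[]>;

// Query from the PR description.
const INDEX_TUPLES_SQL = `SELECT idx_tuples FROM pg_vector_index_stat WHERE indexname = 'face_index'`;
// Assumption: one face_search row per face that has an embedding.
const FACE_COUNT_SQL = `SELECT count(*) AS count FROM face_search`;

// Reads a single numeric value; drivers often return bigint columns as strings.
function toCount(rows: Row[], column: string): number {
  if (rows.length !== 1) throw new Error(`expected exactly one row for "${column}"`);
  const value = Number(rows[0][column]);
  if (!Number.isFinite(value)) throw new Error(`column "${column}" is not numeric`);
  return value;
}

// Index tuples minus face rows; 0 means the two counts agree.
function duplicateEmbeddings(indexRows: Row[], countRows: Row[]): number {
  return toCount(indexRows, "idx_tuples") - toCount(countRows, "count");
}

async function checkFaceIndex(query: Query): Promise<number> {
  const indexRows = await query(INDEX_TUPLES_SQL);
  const countRows = await query(FACE_COUNT_SQL);
  return duplicateEmbeddings(indexRows, countRows);
}
```

A positive result would indicate more index tuples than faces, which is the duplication this PR is meant to eliminate.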