Add support for creating index by passing embeddings explicitly #1597

Closed
abhinav-upadhyay opened this issue Mar 11, 2023 · 1 comment

Comments

@abhinav-upadhyay
Contributor

I have a use case where I want to create multiple indices over the same set of documents; each index is built according to some criterion so that I can query against the right subset of documents. (I am using FAISS at the moment, which does not have great options for filtering within one giant index, so the recommendation is to create multiple indices instead.)

It would be expensive to generate embeddings by calling the OpenAI API for each document multiple times just to populate each of the indices. An interface similar to add_texts and add_documents that lets the user pass the embeddings explicitly might be one way to achieve this.
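To make the request concrete, here is a rough sketch of the kind of call I have in mind. The add_embeddings method below is hypothetical; its name and signature are only illustrative of the proposed shape, while embed_documents and from_texts are existing APIs:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

embedding_model = OpenAIEmbeddings()

# Existing API: the store embeds the texts itself.
store = FAISS.from_texts(["first document", "second document"], embedding_model)

# Proposed (hypothetical) API: pass pre-computed (text, vector) pairs directly,
# analogous to add_texts / add_documents but without triggering new embedding calls.
reused_texts = ["third document", "fourth document"]
reused_vectors = embedding_model.embed_documents(reused_texts)  # computed once, reusable across indices
store.add_embeddings(list(zip(reused_texts, reused_vectors)))
```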

As I write this, I realize I might be able to work around it by passing FAISS a wrapper as the embedding function, one that internally caches the embedding for each document and avoids duplicate calls to the embeddings API.
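A minimal sketch of that workaround, assuming FAISS.from_texts only relies on the standard embed_documents / embed_query methods of the Embeddings interface (the CachedEmbeddings class below is my own helper, not part of the library):

```python
from typing import Dict, List

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS


class CachedEmbeddings:
    """Wraps an embeddings model and caches per-text vectors to avoid duplicate API calls."""

    def __init__(self, base) -> None:
        self.base = base
        self._cache: Dict[str, List[float]] = {}

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Only call the underlying API for texts we have not embedded before.
        missing = [t for t in texts if t not in self._cache]
        if missing:
            for text, vector in zip(missing, self.base.embed_documents(missing)):
                self._cache[text] = vector
        return [self._cache[t] for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return self.base.embed_query(text)


embeddings = CachedEmbeddings(OpenAIEmbeddings())
docs_a = ["shared doc 1", "shared doc 2"]
docs_b = ["shared doc 1", "shared doc 3"]

# The second index only pays for "shared doc 3"; the overlapping document
# is served from the cache instead of another API call.
index_a = FAISS.from_texts(docs_a, embeddings)
index_b = FAISS.from_texts(docs_b, embeddings)
```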

However, I am creating this issue in case others also think that an add_embeddings API, or something similar, would be a good idea.

abhinav-upadhyay added a commit to abhinav-upadhyay/langchain that referenced this issue Mar 13, 2023
This allows users to pass pre-created embeddings explicitly to be
indexed by the vector store. If users wish to create the embeddings
for their documents themselves and reuse them, this can save extra
calls to the embeddings API endpoints.

Fixes langchain-ai#1597
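A sketch of the usage this kind of change enables, assuming a from_embeddings-style constructor that accepts (text, vector) pairs; the exact name and signature in the referenced commit may differ:

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

documents = {
    "report-2022.txt": "annual report for 2022 ...",
    "report-2023.txt": "annual report for 2023 ...",
    "faq.txt": "frequently asked questions ...",
}

embedding_model = OpenAIEmbeddings()
texts = list(documents.values())

# A single embeddings API pass over the whole corpus.
vectors = embedding_model.embed_documents(texts)
pairs = list(zip(texts, vectors))

# Reuse the same pre-computed vectors to build several criterion-specific
# indices (here: only the reports vs. everything) without re-embedding anything.
report_pairs = [pair for name, pair in zip(documents, pairs) if name.startswith("report")]
reports_index = FAISS.from_embeddings(report_pairs, embedding_model)
full_index = FAISS.from_embeddings(pairs, embedding_model)
```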
@dosubot

dosubot bot commented Aug 17, 2023

Hi, @abhinav-upadhyay! I'm here to help the LangChain team manage their backlog, and I wanted to let you know that we are marking this issue as stale.

Based on my understanding, you are requesting support for creating multiple indices over the same documents, based on different criteria, by passing embeddings explicitly. You suggested an add_embeddings API to avoid generating embeddings multiple times for each document. However, there hasn't been any recent activity on this issue.

Could you please let us know if this issue is still relevant to the latest version of the LangChain repository? If it is, please comment on the issue to let the LangChain team know. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding and contribution to the LangChain project!

@dosubot added the stale label on Aug 17, 2023
@dosubot closed this as not planned on Aug 24, 2023
@dosubot removed the stale label on Aug 24, 2023