Add support for creating index by passing embeddings explicitly #1597
This allows users to pass pre-created embeddings explicitly to be indexed by the vector store. If users wish to create the embeddings for their documents once and reuse them, this saves extra calls to the embeddings API endpoints. Fixes langchain-ai#1597
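To illustrate the "embed once, index many times" pattern this change enables, here is a minimal sketch. `TinyVectorStore`, `fake_embed`, and the `add_embeddings` method are all hypothetical stand-ins for a real vector store such as FAISS and a real embeddings endpoint; they are not LangChain APIs.

```python
import math

def fake_embed(text):
    # Hypothetical stand-in for a real embeddings API call (e.g. OpenAI);
    # returns a deterministic 3-d vector derived from the text.
    return [float(len(text)), float(sum(map(ord, text)) % 97), 1.0]

class TinyVectorStore:
    """Minimal in-memory stand-in for a vector store such as FAISS."""

    def __init__(self):
        self._entries = []  # list of (text, vector) pairs

    def add_embeddings(self, text_embeddings):
        # Hypothetical API: accept (text, embedding) pairs directly,
        # so no embedding function needs to be invoked here.
        self._entries.extend(text_embeddings)

    def similarity_search(self, query_vector, k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)

        ranked = sorted(self._entries,
                        key=lambda e: cosine(query_vector, e[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

docs = ["alpha doc", "a much longer beta document"]

# Embed each document exactly once...
pairs = [(d, fake_embed(d)) for d in docs]

# ...then reuse the same (text, embedding) pairs to build two
# independent indices without re-calling the embeddings API.
index_a = TinyVectorStore()
index_a.add_embeddings(pairs)
index_b = TinyVectorStore()
index_b.add_embeddings(pairs)
```

The key point is that the embedding cost is paid once per document, regardless of how many indices are built from the pairs.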
Hi, @abhinav-upadhyay! I'm here to help the LangChain team manage their backlog, and I wanted to let you know that we are marking this issue as stale. Based on my understanding, you are requesting support for creating multiple indices based on different criteria using explicit embeddings, and you have suggested an add_embeddings API or a similar interface. Could you please let us know if this issue is still relevant to the latest version of the LangChain repository? If it is, please comment on the issue to let the LangChain team know. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days. Thank you for your understanding and contribution to the LangChain project!
I have a use case where I want to create multiple indices over the same set of documents: each index is built from documents matching some criterion, so that I can query the right subset. (I am using FAISS at the moment, which does not have great options for filtering within one giant index, so the recommendation is to create multiple indices.)
It would be expensive to generate embeddings by calling the OpenAI API for each document multiple times to populate each index. I think an interface similar to
add_texts
and add_documents
that allows the user to pass the embeddings explicitly might be an option to achieve this. As I write this, I realize I might be able to work around it by passing FAISS a wrapper embedding function that internally caches the embedding for each document and avoids duplicate calls to the embeddings API.
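The caching-wrapper workaround described above can be sketched as follows. `CachingEmbedder` and `expensive_embed` are illustrative names, not part of LangChain; any callable that a vector store accepts as its embedding function could be wrapped this way.

```python
class CachingEmbedder:
    """Wraps an embedding function and memoizes results per input text,
    so indexing the same document repeatedly costs one underlying call."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._cache = {}
        self.api_calls = 0  # for illustration: count underlying calls

    def __call__(self, text):
        if text not in self._cache:
            self.api_calls += 1
            self._cache[text] = self._embed_fn(text)
        return self._cache[text]

# Hypothetical stand-in for an expensive embeddings endpoint.
def expensive_embed(text):
    return [float(len(text)), 0.5]

embed = CachingEmbedder(expensive_embed)
docs = ["doc one", "doc two"]

# Building two indices over the same documents triggers only one
# underlying embedding call per unique document.
vectors_for_index_a = [embed(d) for d in docs]
vectors_for_index_b = [embed(d) for d in docs]
```

One caveat of this workaround: the cache is keyed on the exact text, so it only helps when the same strings are re-embedded, and it holds every embedding in memory for the lifetime of the wrapper.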
However, I am creating this issue in case others also think that an
add_embeddings
API or something similar is a good idea.