QA on documents with the LangChain framework, a Hugging Face LLM, and prompt templates
LangChain is a framework designed to simplify the creation of LLM applications. The core building block of LangChain applications is the LLMChain, which combines three components:
- Large language model: the core engine
- Prompt template: provides instructions to the language model
- Output parser: translates the raw output from the LLM into a more workable format
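To make the three pieces concrete, here is a minimal sketch in plain Python. It does not use LangChain's own classes; `PROMPT`, `fake_llm`, `parse_output`, and `run_chain` are hypothetical names chosen for illustration, and `fake_llm` is a stand-in that only imitates the untidy raw text a real model might return.

```python
from string import Template

# Prompt template: fixed instructions with placeholders filled in per call.
PROMPT = Template(
    "Answer the question using only the context.\n"
    "Context: $context\n"
    "Question: $question\n"
    "Answer:"
)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call.

    It echoes the context line, padded with whitespace and followed by
    extra chatter, to imitate messy raw model output.
    """
    context_line = next(
        line for line in prompt.splitlines() if line.startswith("Context: ")
    )
    answer = context_line[len("Context: "):]
    return "\n  " + answer + "  \nQuestion: something else"


def parse_output(raw: str) -> str:
    """Output parser: keep only the first non-empty line, stripped."""
    for line in raw.splitlines():
        if line.strip():
            return line.strip()
    return ""


def run_chain(llm, context: str, question: str) -> str:
    """Chain the three steps: fill the template, call the model, parse."""
    prompt = PROMPT.substitute(context=context, question=question)
    return parse_output(llm(prompt))


if __name__ == "__main__":
    print(run_chain(fake_llm, "LangChain is a framework.", "What is LangChain?"))
```

In a real project the `llm` argument would be a call to a hosted model, and the template and parser would come from the framework; the flow of template, model, parser stays the same.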
For the LLM, I access the model through the Hugging Face API. The procedure for my project:
- Load the document
- Split the document into chunks for embedding and storage in a vector store
- Embed the chunks with sentence-transformers. Find more at SBERT.net https://lnkd.in/eZNVEu_6
- Run similarity_search on the vector store (db) with my query. This finds the chunks whose embeddings are at the lowest distance from the query embedding.
- Load the LLM from HuggingFaceHub
- Feed the LLMChain the lowest-distance chunks together with the LLM
- For better answers, try different models suited to the specific task (e.g. text_to_text)
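The retrieval part of the steps above (split, embed, search by distance, build the prompt) can be sketched without any external libraries. This is a toy version under stated assumptions: the bag-of-words `embed` function is a stand-in for a sentence-transformers model, the in-memory list is a stand-in for a real vector store, and all function names here are hypothetical rather than LangChain's API.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 8, overlap: int = 2) -> list:
    """Split text into overlapping chunks of `chunk_size` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity; lower means the texts are closer."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def build_store(chunks: list) -> list:
    """Toy vector store (assumption): a list of (chunk, embedding) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def similarity_search(query: str, store: list, k: int = 2) -> list:
    """Return the k chunks with the lowest distance to the query."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: cosine_distance(query_vec, item[1]))
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(chunks: list, question: str) -> str:
    """Place the retrieved chunks into the prompt sent to the LLM."""
    context = "\n".join(chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    document = (
        "LangChain is a framework for building LLM applications. "
        "Sentence transformers turn text into embedding vectors. "
        "A vector store keeps the embeddings so similar chunks can be found quickly."
    )
    store = build_store(split_into_chunks(document))
    top_chunks = similarity_search("embedding vectors", store, k=1)
    print(build_prompt(top_chunks, "What do sentence transformers produce?"))
```

With a real embedding model the distances reflect meaning rather than shared words, so the choice of model and chunk size affects which chunks reach the LLM.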