Pipenv is a dependency manager for Python projects. It is similar to npm (Node.js) or Bundler (Ruby).
pip install --user pipenv
Flask is a simple web application framework for easily building web apps. It is based on the WSGI toolkit and the Jinja2 template engine.
pipenv install flask
`app.py` is the entry point file which defines all the routes for the web application.
Run the following command to start the demo hello world app on port 5002 in debug mode:
flask --app app.py run --debug -p 5002
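If you are starting from scratch, a minimal `app.py` could look like this sketch (illustrative only; the file in this repo may define more routes):

```python
# Minimal app.py sketch (illustrative; the repo's version may differ)
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello, world!"
```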
Update the .gitignore file with `__pycache__` so that cache files are not stored in Git.
LangChain is a framework for developing applications powered by language models. It creates useful abstractions over LLMs to assemble complex applications as chains.
pipenv install langchain
Integration with OpenAI's API is provided as a module that works with LangChain. Install it:
pipenv install openai
The OpenAI API expects a key to authenticate (and charge) requests to its ChatGPT model. Configure it:
- Create a file called `env.py` (python-dotenv is the right way, but this is a workshop 🤷)
- Add a new constant `OPENAI_API_KEY` to it with a valid key from your account page on openai.com
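A hypothetical `env.py` could be as simple as:

```python
# env.py -- hypothetical contents; replace the placeholder with your own key
OPENAI_API_KEY = "sk-..."  # placeholder, not a real key
```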
Let us create a simple hello world chat endpoint to wire things together:
- Create a new POST endpoint in `app.py` that instantiates an `OpenAI` object with the key defined above (`OPENAI_API_KEY`); see the sketch after this list
- Make a POST call with a question and see its response:
curl --location --request POST 'localhost:5002/chat' --header 'Content-Type: application/json' --data-raw '{"question" : "Hello. How are you?"}'
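A minimal sketch of such an endpoint, assuming the classic `langchain.llms.OpenAI` interface this workshop installs (the exact route body and the `answer` response key are assumptions, not the repo's definitive code):

```python
# Hypothetical /chat endpoint sketch for app.py
from flask import Flask, request, jsonify
from langchain.llms import OpenAI
from env import OPENAI_API_KEY

app = Flask(__name__)

@app.route("/chat", methods=["POST"])
def chat():
    # Instantiating the LLM per request is fine only for a workshop
    llm = OpenAI(openai_api_key=OPENAI_API_KEY)
    question = request.json["question"]
    answer = llm(question)  # send the question to the model
    return jsonify({"answer": answer})
```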
Congratulations, you have an API powered by an LLM at your service now! 😊
Given that this is an LLM workshop, let us not waste time building the chat window interface. Feel free to use the HTML and JS files in this repo. Don't lose lots of time tweaking and customizing them 😉
- Copy the `chat.html` file inside the templates folder (this is where Flask will look for HTML files)
- Copy the `chat.css` file inside the static folder
- Also copy the `chat.js` file, which has helper methods to send and render received messages
- Update the `app.py` route to return the HTML template for the root path `/` (see the sketch after this list)
- Reload the browser window to see the page rendered with a chat window
- This code calls the API endpoint `/chat` when the "Send" button is clicked. Try sending a message and watch the server logs!
RAG is an AI framework for retrieving facts from an external knowledge base to ground large language models with appropriate context so they generate relevant and better responses.
Qdrant is a vector similarity search engine built with Rust, with an easy-to-use API to store and search vectors.
- Pull the Qdrant Docker image using the command
docker pull qdrant/qdrant
- Start Qdrant:
docker run -p 6333:6333 -v $(pwd)/qdrant_storage:/qdrant/storage:z qdrant/qdrant
- Qdrant will start running on port 6333 and will store all the vector data in the folder `qdrant_storage` (include this in the .gitignore file as well)
- Have a look at https://qdrant.tech/documentation/quick-start/ if you want to play with a vector store before integrating it with an LLM
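To quickly check that Qdrant is up, you can hit its REST API; GET /collections lists the existing collections:

curl http://localhost:6333/collections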
Let us load a large document into Qdrant as chunks of vectors, using the OpenAIEmbeddings functions to convert the text into embeddings.
Install all the dependent modules that are needed for us to load the text as embeddings into the vector database:
pipenv install qdrant-client
pipenv install "openai[embeddings]"
pipenv install tiktoken
Let us define a constant in our `env.py` file to store the database URL for our application to connect to. Name this constant `QDRANT_URL`. If you are running your database locally as per the above step, its value will be http://localhost:6333.
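So a hypothetical `env.py` would now hold both constants:

```python
# env.py -- hypothetical contents
OPENAI_API_KEY = "sk-..."  # placeholder, not a real key
QDRANT_URL = "http://localhost:6333"
```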
To make it easy, I have included a huge insurance terms document in this repo (data/insurance.txt). Let us load this document as chunks into our vector database using the script in load_vectordb.py. It does the following:
- Loads the file using `TextLoader`
- Splits the contents of the file into smaller chunks using `RecursiveCharacterTextSplitter`
- Converts these chunks one by one into embeddings using `OpenAIEmbeddings` and stores them in the vector database under the collection name `insurance`
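A sketch of what load_vectordb.py roughly does, assuming the pre-1.0 LangChain API this workshop installs (the chunk sizes here are assumptions; check the actual script in the repo):

```python
# Hypothetical sketch of load_vectordb.py
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Qdrant
from env import OPENAI_API_KEY, QDRANT_URL

# Load the document and split it into overlapping chunks
documents = TextLoader("data/insurance.txt").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(documents)

# Embed the chunks and store them under the "insurance" collection
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
Qdrant.from_documents(chunks, embeddings, url=QDRANT_URL, collection_name="insurance")
```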
Note: To run the script load_vectordb.py, use the command pipenv shell first to get into the context where all the dependent modules are loaded.
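For example (assuming the script is invoked directly with python):

pipenv shell
python load_vectordb.py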
Now, let us verify that the document chunks are loaded properly into our vector database. The script in the file test_vectordb.py will help you with that. It does the following:
- It creates a `QdrantClient` instance to connect to the database running locally (using the `QDRANT_URL` constant defined above)
- Creates an instance of a vector store with which we can query the database
- Runs a similarity search for the given query and prints the results on the screen for our verification
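A sketch of what test_vectordb.py roughly does (the query string is an assumption):

```python
# Hypothetical sketch of test_vectordb.py
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Qdrant
from qdrant_client import QdrantClient
from env import OPENAI_API_KEY, QDRANT_URL

client = QdrantClient(url=QDRANT_URL)
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
vectorstore = Qdrant(client=client, collection_name="insurance", embeddings=embeddings)

# Similarity search returns the stored chunks closest to the query
for doc in vectorstore.similarity_search("What is a premium?"):
    print(doc.page_content)
```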
Note: To run the script test_vectordb.py, use the command pipenv shell to get into the context where all the dependent modules are loaded.
Let us write a small helper class called `Bot` that will help us wire all the things together. Please note we are instantiating lots of classes here for every request. We can get away with this because it is a workshop. Restructure the code efficiently for production applications 😊
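One possible shape for `Bot` is sketched below, assuming a `RetrievalQA` chain as in the reference that follows (the class layout and `ask` method are illustrative, not the repo's definitive code):

```python
# Hypothetical Bot helper wiring the vector store and the LLM together
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Qdrant
from qdrant_client import QdrantClient
from env import OPENAI_API_KEY, QDRANT_URL

class Bot:
    def __init__(self):
        client = QdrantClient(url=QDRANT_URL)
        embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
        vectorstore = Qdrant(client=client, collection_name="insurance", embeddings=embeddings)
        llm = OpenAI(openai_api_key=OPENAI_API_KEY)
        # RetrievalQA fetches the relevant chunks and feeds them to the LLM
        self.chain = RetrievalQA.from_chain_type(llm=llm, retriever=vectorstore.as_retriever())

    def ask(self, question: str) -> str:
        return self.chain.run(question)
```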
Ref: https://python.langchain.com/docs/use_cases/question_answering/