This is a chatbot that can answer questions about the Duke MEng AIPI program. It is built with Streamlit as the frontend and a Mistral-7B model fine-tuned on instruction data.
- Scraped the internal and external program websites for the Duke MEng AIPI program.
- Iterated over each link present in the `sitemap.xml` file, extracted the text from each page, and saved it in a JSON file.
- Also copied over the FAQ doc from the internal program website.
- Once I had a list of all these files, iterated over each file and passed the scraped data to Gemini for further cleaning and better formatting (see the sketch below).
- Saved all the cleaned data in a single text file: `data/processed/context.txt`
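A minimal sketch of that pipeline, assuming `requests`, `beautifulsoup4` (with `lxml`), and the `google-generativeai` client are installed. The sitemap URL, file paths, and Gemini model name below are illustrative, not the exact ones used:

```python
import json
import requests
from bs4 import BeautifulSoup
import google.generativeai as genai

SITEMAP_URL = "https://example.edu/sitemap.xml"  # illustrative URL

# Collect every page URL listed in the sitemap.
sitemap = BeautifulSoup(requests.get(SITEMAP_URL, timeout=30).text, "xml")
urls = [loc.text for loc in sitemap.find_all("loc")]

# Extract the visible text from each page and save it all as JSON.
pages = {}
for url in urls:
    page = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    pages[url] = page.get_text(separator="\n", strip=True)
with open("data/raw/pages.json", "w") as f:
    json.dump(pages, f, indent=2)

# Pass each scraped page to Gemini for cleaning and reformatting,
# then append everything into a single context file.
genai.configure(api_key="YOUR_GEMINI_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")  # model name is an assumption
with open("data/processed/context.txt", "w") as out:
    for url, text in pages.items():
        cleaned = model.generate_content(
            "Clean and reformat the following scraped page text:\n\n" + text
        )
        out.write(cleaned.text + "\n\n")
```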
There are three primary components to the system:
- Vector Database
- HuggingFace Inference Endpoint
- Streamlit Application
Workflow:
Once the `context.txt` file was created, the document was chunked by paragraphs. Each paragraph was then converted to vector embeddings using the `all-MiniLM-L6-v2` model, and these embeddings were stored in ChromaDB. ChromaDB was hosted on Azure on a compute-optimized instance (a `Standard_F4s_v2`; see the cost breakdown below).
The script for ingestion can be found in `scripts/ingest_data_into_vector_db.py`.
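A minimal sketch of the same flow, assuming `sentence-transformers` and `chromadb` are installed (the host, port, and collection name are illustrative):

```python
import chromadb
from sentence_transformers import SentenceTransformer

# Chunk the cleaned context file by paragraphs (blank-line separated).
with open("data/processed/context.txt") as f:
    paragraphs = [p.strip() for p in f.read().split("\n\n") if p.strip()]

# Embed each paragraph with all-MiniLM-L6-v2.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(paragraphs).tolist()

# Store the embeddings in the remote ChromaDB instance on Azure.
client = chromadb.HttpClient(host="your-azure-host", port=8000)  # illustrative
collection = client.get_or_create_collection(name="aipi_context")
collection.add(
    ids=[str(i) for i in range(len(paragraphs))],
    documents=paragraphs,
    embeddings=embeddings,
)
```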
The model is deployed on a dedicated, protected HuggingFace inference endpoint that can only be accessed with an `HF_TOKEN`, which is injected into the Streamlit app as an environment variable.
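Inside the app, the token is read at runtime roughly like this (a sketch; the secret itself lives only in the deployment environment, never in code):

```python
import os

# HF_TOKEN is injected by the deployment environment;
# the app only reads it at runtime.
HF_TOKEN = os.environ["HF_TOKEN"]
```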
Finally, this model API endpoint is called from the Streamlit interface.
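A minimal sketch of that call using `huggingface_hub.InferenceClient` (the endpoint URL and generation parameters are illustrative):

```python
import os

import streamlit as st
from huggingface_hub import InferenceClient

# Point the client at the protected endpoint, authenticating with HF_TOKEN.
client = InferenceClient(
    model="https://your-endpoint.endpoints.huggingface.cloud",  # illustrative URL
    token=os.environ["HF_TOKEN"],
)

question = st.text_input("Ask a question about the AIPI program")
if question:
    # In the real app, relevant chunks retrieved from ChromaDB would be
    # prepended to the prompt before calling the fine-tuned Mistral-7B model.
    answer = client.text_generation(question, max_new_tokens=256)
    st.write(answer)
```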
Using Human-as-a-Judge for the performance metric: three testers, including myself, evaluated the model's responses to the same 20 questions. On average, the model answered 16/20 questions correctly. A sample of the questions used:
1. What courses does Professor Brinnae Bent teach?
2. Who is the director of the AIPI program?
3. What are some housing options nearby?
4. How many credits do I need to graduate?
5. What courses can I take in the fall?
...
| Component | Hardware | Cost |
| --- | --- | --- |
| Training (RunPod) | 2x NVIDIA H100 (80 GB VRAM each) + 125 GB RAM | $4.59/hr |
| Inference (HuggingFace Inference Endpoint, dedicated) | 1x NVIDIA A100 (80 GB VRAM) + 145 GB RAM | $4.00/hr |
| Hosting ChromaDB (Azure) | 1x Standard_F4s_v2 | $0.0169/hr |
- Training: Quantize the model and use techniques like QLoRA for fine-tuning; this way it could be trained on much cheaper hardware instead of one big GPU, but training would be slower (see the sketch after this list).
- Inference: This was the cheapest option available. Both RunPod serverless and HuggingFace were tried; even the smaller T4 GPUs don't work because it's a 7B model.
- Hosting ChromaDB (Azure): A smaller instance with fewer vCPUs and less RAM could be used here.
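For reference, a minimal sketch of a QLoRA setup with `transformers`, `peft`, and `bitsandbytes` (the hyperparameters here are illustrative, not values used in this project):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load Mistral-7B in 4-bit NF4 quantization to cut memory requirements.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small trainable LoRA adapters; only these are updated during
# fine-tuning, so the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=16,  # illustrative rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```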
Use this link