Q&A System using BERT and Faiss Vector Database

Overview

This project is a Question & Answer system implemented using DistilBERT for text representation and Faiss (Facebook AI Similarity Search) for efficient similarity search in a vector database. The system is designed to provide accurate and relevant answers to user queries by searching through a large collection of documents.

Features

DistilBERT-based Text Representation: Utilizes the DistilBERT model to convert questions and documents into dense vector representations.
Faiss Vector Database: Stores the vector representations of the documents for fast similarity search.
Efficient Retrieval: Finds the most relevant documents to a given question by performing efficient similarity searches in the Faiss vector database.
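To illustrate the text-representation step: a transformer such as DistilBERT produces one vector per token, so the token vectors must be reduced to a single dense vector per text before they can be stored. This README does not say which reduction the project uses; the sketch below assumes masked mean pooling followed by L2 normalization, which is one common choice. The function names `mean_pool` and `l2_normalize` are hypothetical, and the toy array stands in for real model output.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sequence, ignoring padding positions.

    token_embeddings: (batch, seq_len, hidden) float array
    attention_mask:   (batch, seq_len) array with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so inner product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

# Toy stand-in for model output: one sequence, three positions, hidden size 2.
# The third position is padding and must not influence the result.
tokens = np.array([[[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])

pooled = mean_pool(tokens, mask)      # [[2.0, 4.0]]
embedding = l2_normalize(pooled)      # unit-length vector, ready for indexing
```

With a real model, `token_embeddings` would be the last hidden state and `attention_mask` would come from the tokenizer.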

Installation

Requirements

Python 3.x
PyTorch
Transformers
Faiss
Streamlit (for the web-based interface)

Setup

Clone the repository:

git clone https://github.com/VuBacktracking/bert-faiss-qa-system.git

Install the dependencies:

pip install -r requirements.txt

Train and download the DistilBERT model:

python3 trainer.py

Note: My fine-tuned model is available at https://huggingface.co/vubacktracking/distilbert-base-uncased-finetuned-squad2

Build the Faiss vector database:

python3 faiss_index.py

Usage

Streamlit Web App Interface

streamlit run app.py

Open your web browser and navigate to http://localhost:8501/ to use the web-based Q&A system.

How it Works

BERT Embeddings:
- The preprocessed text is converted into vector embeddings using the DistilBERT model.
Faiss Indexing:
- The DistilBERT embeddings of the documents are indexed in the Faiss vector database.
Query Processing:
- When a user inputs a question, the question is converted into a DistilBERT embedding.
- Faiss is used to find the most similar embeddings (i.e., the most relevant documents) to the question embedding.
Answer Extraction:
- The relevant documents are ranked, and the most relevant answer passages are extracted and presented to the user.
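The query-processing and answer-extraction steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the brute-force inner-product search in NumPy stands in for a flat Faiss index (in Faiss, roughly `faiss.IndexFlatIP(dim)`, `index.add(vectors)`, `index.search(queries, k)`), the document vectors and logits are hand-made toy values rather than DistilBERT output, and the names `build_index`, `search`, and `best_span` are hypothetical. Span selection follows the usual SQuAD-style rule of maximizing start score plus end score; handling of unanswerable questions is omitted.

```python
import numpy as np

def build_index(doc_vectors):
    """L2-normalize document vectors so inner product equals cosine similarity."""
    vectors = np.asarray(doc_vectors, dtype="float32")
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

def search(index, query_vector, k):
    """Return the k (doc_id, score) pairs with the highest cosine similarity."""
    query = np.asarray(query_vector, dtype="float32")
    query = query / max(np.linalg.norm(query), 1e-12)
    scores = index @ query
    top = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in top]

def best_span(start_logits, end_logits, max_answer_len=30):
    """Pick token span (i, j), i <= j < i + max_answer_len, maximizing start[i] + end[j]."""
    start = np.asarray(start_logits, dtype="float32")
    end = np.asarray(end_logits, dtype="float32")
    scores = start[:, None] + end[None, :]
    ones = np.ones_like(scores, dtype=bool)
    valid = np.triu(ones) & ~np.triu(ones, k=max_answer_len)
    scores = np.where(valid, scores, -np.inf)
    i, j = np.unravel_index(np.argmax(scores), scores.shape)
    return int(i), int(j)

# Retrieval with toy 3-dimensional vectors (real embeddings would be much larger).
documents = [
    "Paris is the capital of France.",
    "Faiss performs similarity search.",
    "DistilBERT is a distilled version of BERT.",
]
index = build_index([[0.9, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.2, 0.9]])
hits = search(index, [0.1, 0.9, 0.1], k=2)
top_documents = [documents[i] for i, _ in hits]

# Answer extraction with toy logits over the tokens of a retrieved passage.
context_tokens = ["faiss", "was", "built", "by", "facebook", "ai"]
start_logits = [0.5, 0.1, 0.1, 0.2, 4.0, 1.0]
end_logits = [0.2, 0.1, 0.1, 0.3, 1.5, 3.5]
i, j = best_span(start_logits, end_logits)
answer = " ".join(context_tokens[i : j + 1])
```

An exhaustive search like this scales linearly with the number of documents; approximate Faiss index types trade some accuracy for speed on larger collections.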
