This directory defines the backend for the Byte-Barometer application. It also provides utility for populating the relevant vector database indices.
The backend application offers a WebSocket endpoint where clients can query for a subject, making use of the hybrid embedding search capabilities of some vector databases. In short, a query involves the following steps:

- The query string is received over the WebSocket, and the sending client's id is noted.
- The query string is converted into a sparse embedding and a dense embedding, using SPLADE and OpenAI respectively. These are weighted to prioritize between semantic and conventional (keyword) search.
- A vector database query is performed to find the N closest entries, sorted by relevancy.
- Aspect-based sentiment analysis is applied to the entries, with the query string as the aspect, and results are published back to the client as they become available.
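The weighting step above can be sketched as a convex combination of the two query vectors, a common pattern for hybrid search. The helper name, the `alpha` parameter, and the sparse-vector dict layout here are illustrative assumptions, not the actual implementation:

```python
def hybrid_scale(dense, sparse, alpha):
    """Weight dense and sparse query vectors.

    alpha=1.0 gives pure semantic (dense) search,
    alpha=0.0 gives pure keyword (sparse) search.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": sparse["indices"],
        "values": [v * (1 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse

# Example: favour semantic search slightly
dense, sparse = hybrid_scale(
    [0.2, 0.4], {"indices": [7, 42], "values": [1.0, 0.5]}, alpha=0.8
)
```

Both scaled vectors would then be sent together in a single vector database query.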
The backend has been containerized, but in order to make use of GPU acceleration it is necessary to install the NVIDIA container toolkit (`nvidia-container-toolkit`) on the host system. Ensure that `nvidia-smi` reports as expected both on the host and inside the container.
Create an `.env` file in the root of the project and add your Hugging Face, OpenAI, and Pinecone API keys and environment details:
```
HUGGINGFACE_API_KEY=<api-key>
OPENAI_API_KEY=<api-key>
PINECONE_ENVIRONMENT=<environment>
PINECONE_API_KEY=<api-key>
PINECONE_INDEX=<index-name>
ENABLE_GPU=True
```
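A minimal sketch of how the backend might consume these variables at startup; the `load_settings` helper is hypothetical, but the variable names match the `.env` file above:

```python
import os

def load_settings():
    """Read required settings from the environment, failing fast if
    any are missing (hypothetical helper, names match .env above)."""
    required = [
        "HUGGINGFACE_API_KEY",
        "OPENAI_API_KEY",
        "PINECONE_ENVIRONMENT",
        "PINECONE_API_KEY",
        "PINECONE_INDEX",
    ]
    missing = [name for name in required if name not in os.environ]
    if missing:
        raise RuntimeError(f"Missing environment variables: {missing}")
    settings = {name: os.environ[name] for name in required}
    # ENABLE_GPU is an optional boolean flag, defaulting to off
    settings["ENABLE_GPU"] = os.environ.get("ENABLE_GPU", "False") == "True"
    return settings
```

Failing fast on missing keys keeps misconfiguration errors close to startup rather than surfacing mid-query.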
The backend application will regularly fetch, process, and store new comments from Hacker News so that they may be queried. However, this only applies to new comments; to populate the index with an initial set of data you can do the following:

```shell
source .venv/bin/activate
# Last two months, up to 200 000 documents
python3 populate.py -l 5184000 -d 200000
```
Alternatively, if you prefer the Docker image:

```shell
docker run --gpus all -v .:/app -it --env-file ../.env --entrypoint python3 byte-barometer populate.py -l 72000 -d 10000
```
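Assuming `-l` is a lookback window in seconds and `-d` a document cap, as the comments above suggest, the flag values can be derived with quick arithmetic (a sanity check, not part of the application):

```python
SECONDS_PER_DAY = 60 * 60 * 24  # 86400

two_months = 60 * SECONDS_PER_DAY  # two months approximated as 60 days
twenty_hours = 20 * 3600           # the shorter window in the docker example

print(two_months)    # 5184000, matching -l in the first command
print(twenty_hours)  # 72000, matching -l in the docker command
```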