-
Clone the repository:
git clone git@github.com:knmlprz/ChatKNML.git
-
Navigate to the project directory:
cd ChatKNML
-
Create a new branch for your development work:
git checkout -b {your_branch_name}
-
Make sure you are working on the correct branch:
git status
-
Copy the .env.example file:
cp .env.example .env
Modify the environment variables to suit your requirements.
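After editing, it can help to confirm that .env still defines every variable listed in .env.example. The following is a minimal sketch, not part of the project: the helper names `env_keys` and `missing_keys` are hypothetical, and it assumes both files use plain `KEY=value` lines.

```shell
#!/bin/sh
# Hypothetical helper: print the variable names defined in an env file,
# one per line, sorted. Comments and blank lines are ignored.
env_keys() {
    grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort -u
}

# Hypothetical helper: print keys present in the first file
# but absent from the second.
missing_keys() {
    ex=$(mktemp)
    ac=$(mktemp)
    env_keys "$1" > "$ex"
    env_keys "$2" > "$ac"
    comm -23 "$ex" "$ac"
    rm -f "$ex" "$ac"
}

# Only run the comparison when both files exist in the current directory.
if [ -f .env.example ] && [ -f .env ]; then
    missing_keys .env.example .env
fi
```

Run from the project root, it prints nothing when .env covers all keys from .env.example, and otherwise lists the missing names.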
-
Launch the services using the "dev" profile:
docker compose --profile dev up
or using the "prod" profile:
docker compose --profile prod up
-
Download the model (required for the llm-embedding service to work; file size 3.6 GB):
curl -o ./llm/models/llama-2-7b.Q3_K_L.gguf -L https://huggingface.co/TheBloke/Llama-2-7B-GGUF/resolve/main/llama-2-7b.Q3_K_L.gguf
or
wget -P ./llm/models https://huggingface.co/TheBloke/Llama-2-7B-GGUF/resolve/main/llama-2-7b.Q3_K_L.gguf
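Because an interrupted download leaves a truncated file, a quick size check before starting the services can save some debugging. This is only a sketch: `model_ok` is a hypothetical helper, and the 3000000000-byte threshold is an assumed lower bound chosen to sit safely below the 3.6 GB stated above, not an exact expected size.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file exists and is at least
# min_bytes large.
model_ok() {
    path=$1
    min_bytes=$2
    [ -f "$path" ] || return 1
    # tr strips the padding some wc implementations add around the number.
    size=$(wc -c < "$path" | tr -d ' ')
    [ "$size" -ge "$min_bytes" ]
}

MODEL=./llm/models/llama-2-7b.Q3_K_L.gguf

# Assumed threshold: roughly 3 GB, below the ~3.6 GB model file.
if model_ok "$MODEL" 3000000000; then
    echo "model present: $MODEL"
else
    echo "model missing or incomplete: $MODEL"
fi
```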
-
Launch llm and embedding:
Running on CPU:
docker compose --profile cpu up
Running on GPU:
docker compose --profile gpu up
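The choice between the two profiles could be scripted along these lines. This is a sketch under assumptions: `pick_profile` is a hypothetical helper, and it treats a working `nvidia-smi` as a sign that the "gpu" profile is usable, which may not match what the project's gpu setup actually requires. It only prints the suggested command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print "gpu" if nvidia-smi is installed and runs
# successfully, otherwise print "cpu".
pick_profile() {
    if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
        echo gpu
    else
        echo cpu
    fi
}

PROFILE=$(pick_profile)
# Print the command instead of executing it, so the choice can be reviewed.
echo "docker compose --profile $PROFILE up"
```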
Swagger with endpoints for completions (llm + embedding) and for embedding only is here