build/docs: maintain backwards compatibility
winstxnhdw committed Feb 25, 2024
1 parent f523aa8 commit fdda767
Showing 4 changed files with 16 additions and 7 deletions.
1 change: 1 addition & 0 deletions Dockerfile
@@ -6,5 +6,6 @@ ENV OMP_NUM_THREADS 4
ENV CT2_USE_EXPERIMENTAL_PACKED_GEMM 1
ENV CT2_FORCE_CPU_ISA AVX512
ENV WORKER_COUNT 2
ENV EVENTS_PER_WINDOW 15

EXPOSE $APP_PORT
3 changes: 2 additions & 1 deletion Dockerfile.build
@@ -26,7 +26,8 @@ FROM python:slim
ENV HOME /home/user
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
ENV EVENTS_PER_WINDOW 15
ENV SERVER_PORT 5000
ENV EVENTS_PER_WINDOW 100000

RUN useradd -m -u 1000 user

1 change: 1 addition & 0 deletions Dockerfile.cuda-build
@@ -26,6 +26,7 @@ FROM nvidia/cuda:12.3.1-runtime-ubuntu22.04
ENV HOME /home/user
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
ENV SERVER_PORT 5000
ENV EVENTS_PER_WINDOW 100000
ENV USE_CUDA True

18 changes: 12 additions & 6 deletions README.md
@@ -250,9 +250,7 @@ You can self-host the API and access the Swagger UI at [localhost:7860/api/docs]

```bash
docker run --rm \
-e SERVER_PORT=5000 \
-e APP_PORT=7860 \
-e EVENTS_PER_WINDOW=100000 \
-p 7860:7860 \
ghcr.io/winstxnhdw/nllb-api:main
```
@@ -269,9 +267,7 @@ After creating your permissible cache directory, you can mount it to the container

```bash
docker run --rm \
-e SERVER_PORT=5000 \
-e APP_PORT=7860 \
-e EVENTS_PER_WINDOW=100000 \
-p 7860:7860 \
-v ./cache:/home/user/.cache \
ghcr.io/winstxnhdw/nllb-api:main
@@ -286,16 +282,26 @@ You can pass the following environment variables to optimise the API for your ow
```bash
docker run --rm \
-e SERVER_PORT=5000 \
-e APP_PORT=7860 \
-e EVENTS_PER_WINDOW=100000 \
-e OMP_NUM_THREADS=6 \
-e WORKER_COUNT=1 \
-p 7860:7860 \
-v ./cache:/home/user/.cache \
ghcr.io/winstxnhdw/nllb-api:main
```

### Rate Limiting

You can limit the number of requests allowed per minute by setting the `EVENTS_PER_WINDOW` environment variable.

```bash
docker run --rm \
-e APP_PORT=7860 \
-e EVENTS_PER_WINDOW=15 \
-p 7860:7860 \
ghcr.io/winstxnhdw/nllb-api:main
```
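
The rate-limited `docker run` example above could also be written as a Compose service. This is a hypothetical sketch and not part of this commit; the file name and service name are assumptions:

```yaml
# docker-compose.yml -- hypothetical equivalent of the `docker run` example above
services:
  nllb-api:
    image: ghcr.io/winstxnhdw/nllb-api:main
    environment:
      APP_PORT: 7860
      EVENTS_PER_WINDOW: 15  # requests allowed per rate-limit window
    ports:
      - '7860:7860'
```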

### CUDA Support

You can accelerate your inference with CUDA by building and using `Dockerfile.cuda-build` instead.
