build/docs: maintain backwards compatibility
winstxnhdw committed Feb 25, 2024
1 parent f523aa8 commit fdda767
Showing 4 changed files with 16 additions and 7 deletions.
1 change: 1 addition & 0 deletions Dockerfile
@@ -6,5 +6,6 @@ ENV OMP_NUM_THREADS 4
ENV CT2_USE_EXPERIMENTAL_PACKED_GEMM 1
ENV CT2_FORCE_CPU_ISA AVX512
ENV WORKER_COUNT 2
ENV EVENTS_PER_WINDOW 15

EXPOSE $APP_PORT
3 changes: 2 additions & 1 deletion Dockerfile.build
@@ -26,7 +26,8 @@ FROM python:slim
ENV HOME /home/user
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
ENV EVENTS_PER_WINDOW 15
ENV SERVER_PORT 5000
ENV EVENTS_PER_WINDOW 100000

RUN useradd -m -u 1000 user

1 change: 1 addition & 0 deletions Dockerfile.cuda-build
@@ -26,6 +26,7 @@ FROM nvidia/cuda:12.3.1-runtime-ubuntu22.04
ENV HOME /home/user
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
ENV SERVER_PORT 5000
ENV EVENTS_PER_WINDOW 100000
ENV USE_CUDA True

18 changes: 12 additions & 6 deletions README.md
@@ -250,9 +250,7 @@ You can self-host the API and access the Swagger UI at [localhost:7860/api/docs]

```bash
docker run --rm \
-e SERVER_PORT=5000 \
-e APP_PORT=7860 \
-e EVENTS_PER_WINDOW=100000 \
-p 7860:7860 \
ghcr.io/winstxnhdw/nllb-api:main
```
@@ -269,9 +267,7 @@ After creating your permissible cache directory, you can mount it to the container

```bash
docker run --rm \
-e SERVER_PORT=5000 \
-e APP_PORT=7860 \
-e EVENTS_PER_WINDOW=100000 \
-p 7860:7860 \
-v ./cache:/home/user/.cache \
ghcr.io/winstxnhdw/nllb-api:main
@@ -286,16 +282,26 @@ You can pass the following environment variables to optimise the API for your ow
```bash
docker run --rm \
-e SERVER_PORT=5000 \
-e APP_PORT=7860 \
-e EVENTS_PER_WINDOW=100000 \
-e OMP_NUM_THREADS=6 \
-e WORKER_COUNT=1 \
-p 7860:7860 \
-v ./cache:/home/user/.cache \
ghcr.io/winstxnhdw/nllb-api:main
```

### Rate Limiting

You can limit the number of requests allowed per minute by setting the `EVENTS_PER_WINDOW` environment variable.

```bash
docker run --rm \
-e APP_PORT=7860 \
-e EVENTS_PER_WINDOW=15 \
-p 7860:7860 \
ghcr.io/winstxnhdw/nllb-api:main
```
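
The rate-limited `docker run` example above could also be written as a Compose service. This is a hypothetical sketch and not part of this commit; the file name and service name are assumptions:

```yaml
# docker-compose.yml -- hypothetical equivalent of the `docker run` example above
services:
  nllb-api:
    image: ghcr.io/winstxnhdw/nllb-api:main
    environment:
      APP_PORT: 7860
      EVENTS_PER_WINDOW: 15  # requests allowed per rate-limit window
    ports:
      - '7860:7860'
```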

### CUDA Support

You can accelerate your inference with CUDA by building and using `Dockerfile.cuda-build` instead.
