2.2.5 Backend: Aphrodite Engine

av edited this page Sep 14, 2024 · 2 revisions

Handle: aphrodite
URL: http://localhost:33921

PygmalionAI's large-scale inference engine

Starting

# [Optional] pre-pull the image, ~5GB
harbor pull aphrodite

# Start the service
harbor up aphrodite

# [Optional] Provide a Hugging Face token
# when loading gated or private models
harbor hf token <your-token>
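
Once the service is up, it can be handy to wait for it to start answering before sending requests. The sketch below polls the URL from the header of this page. It assumes that Aphrodite serves an OpenAI-compatible API with a /v1/models route on that port, which this page does not state. APHRODITE_URL, aphrodite_models_url and aphrodite_wait are hypothetical helper names and are not part of Harbor.

```shell
# Base URL from the header of this page; override through the environment.
APHRODITE_URL="${APHRODITE_URL:-http://localhost:33921}"

# Build the URL of the model listing endpoint.
# ASSUMPTION: an OpenAI-compatible /v1/models route is exposed here.
aphrodite_models_url() {
  # Strip one trailing slash so the path is not doubled
  printf '%s/v1/models\n' "${APHRODITE_URL%/}"
}

# Poll the endpoint until it answers or the retry budget runs out.
# Usage: aphrodite_wait [retries] [delay_seconds]
aphrodite_wait() {
  aw_retries="${1:-30}"
  aw_delay="${2:-2}"
  aw_i=0
  while [ "$aw_i" -lt "$aw_retries" ]; do
    # -f turns HTTP errors into a non-zero exit code
    if curl -fsS "$(aphrodite_models_url)" >/dev/null 2>&1; then
      return 0
    fi
    aw_i=$((aw_i + 1))
    sleep "$aw_delay"
  done
  return 1
}

# Example (run manually after `harbor up aphrodite`):
#   aphrodite_wait && curl -s "$(aphrodite_models_url)"
```

Large models can take a while to load, so the defaults of 30 retries with a 2 second delay may need to be raised.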

Models

# Open HF Search to find the models
harbor find gptq awq

# Download model repo to the global HF cache
# user/repo format
harbor hf download infly/INF-34B-Chat-AWQ

# Get/set the model to run
# in the aphrodite engine
harbor aphrodite model infly/INF-34B-Chat-AWQ
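
Both commands above expect the model in user/repo format. When scripting the download and selection together, a loose format check can catch a mistyped identifier early. This is only a sketch: is_repo_id is a hypothetical helper, and it checks the shape of the string only, not whether the repository exists or is valid on Hugging Face.

```shell
# Return 0 when the argument looks like user/repo, 1 otherwise.
# Loose shape check only: exactly one slash with text on both sides.
is_repo_id() {
  case "$1" in
    */*/*) return 1 ;;  # more than one slash
    /*|*/) return 1 ;;  # empty user or empty repo part
    */*)   return 0 ;;  # one slash, text on both sides
    *)     return 1 ;;  # no slash at all
  esac
}

# Example (run manually, requires Harbor):
#   repo="infly/INF-34B-Chat-AWQ"
#   if is_repo_id "$repo"; then
#     harbor hf download "$repo" && harbor aphrodite model "$repo"
#   else
#     echo "expected user/repo, got: $repo" >&2
#   fi
```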

Configuration

# See available options
harbor run aphrodite --help

# Get the extra arguments for the aphrodite engine
# (pass a value after "args" to set them)
harbor aphrodite args