Tenstorrent Inference Server (tt-inference-server) is the repository of available model APIs for deploying on Tenstorrent hardware.
https://github.com/tenstorrent/tt-inference-server
Please follow the setup instructions found in each model folder's README.md.
| Model | Hardware |
|---|---|
| Llama 3.1 70B | TT-QuietBox & TT-LoudBox |
| Mistral 7B | n150 and n300 |