
# The default model repository of openllm

This repository (on the `main` branch) is already included by openllm by default.

If you want more up-to-date but untested models, add our `nightly` branch:

```bash
openllm repo add nightly https://github.com/bentoml/openllm-models@nightly
```
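
If your openllm version includes a repo listing command (an assumption; the CLI surface may differ across releases), you can confirm the new repository was registered:

```bash
# List registered model repositories; "nightly" should appear alongside "default".
openllm repo list
```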

## Supported Models

```bash
$ openllm repo update
$ openllm model list
model              version                                     repo     required GPU RAM    platforms
-----------------  ------------------------------------------  -------  ------------------  -----------
codestral          codestral:22b-v0.1-fp16-c5de                default  80G                 linux
gemma              gemma:2b-instruct-fp16-e2a4                 default  12G                 linux
                   gemma:7b-instruct-fp16-0686                 default  24G                 linux
                   gemma:7b-instruct-awq-4bit-873b             default  12G                 linux
gemma2             gemma2:9b-instruct-fp16-83e0                default  24G                 linux
                   gemma2:27b-instruct-fp16-f15e               default  80G                 linux
jamba1.5           jamba1.5:mini-fp16-84d4                     default  80Gx4               linux
llama2             llama2:7b-chat-fp16-b5d9                    default  16G                 linux
                   llama2:7b-chat-awq-4bit-475c                default  12G                 linux
                   llama2:13b-chat-fp16-4e91                   default  40G                 linux
                   llama2:70b-chat-fp16-9cdd                   default  80Gx2               linux
llama3             llama3:8b-instruct-fp16-74ee                default  24G                 linux
                   llama3:8b-instruct-awq-4bit-fc96            default  12G                 linux
                   llama3:70b-instruct-fp16-e7d3               default  80Gx2               linux
                   llama3:70b-instruct-awq-4bit-a475           default  80G                 linux
llama3.1           llama3.1:8b-instruct-fp16-905c              default  24G                 linux
                   llama3.1:8b-instruct-awq-4bit-374a          default  12G                 linux
                   llama3.1:70b-instruct-fp16-c48a             default  80Gx2               linux
                   llama3.1:70b-instruct-awq-4bit-4fb4         default  80G                 linux
                   llama3.1:405b-instruct-awq-4bit-797f        default  80Gx4               linux
llama3.1-nemotron  llama3.1-nemotron:70b-instruct-fp16-981c    default  80Gx2               linux
llama3.2           llama3.2:1b-instruct-fp16-e7a2              default  12G                 linux
                   llama3.2:1b-instruct-ggml-fp16-linux-60fa   default                      linux
                   llama3.2:1b-instruct-ggml-fp16-darwin-8d35  default                      macos
                   llama3.2:3b-instruct-fp16-eb4d              default  12G                 linux
                   llama3.2:11b-vision-instruct-9ca1           default  80G                 linux
mistral            mistral:7b-instruct-fp16-268f               default  24G                 linux
                   mistral:7b-instruct-awq-4bit-db1e           default  12G                 linux
                   mistral:24b-instruct-nemo-e34c              default  80G                 linux
mistral-large      mistral-large:123b-instruct-fp16-d9ee       default  80Gx4               linux
                   mistral-large:123b-instruct-awq-4bit-399e   default  80G                 linux
mixtral            mixtral:8x7b-instruct-v0.1-fp16-b6ea        default  80Gx2               linux
                   mixtral:8x7b-instruct-v0.1-awq-4bit-ddae    default  40G                 linux
phi3               phi3:3.8b-instruct-fp16-45ff                default  12G                 linux
                   phi3:3.8b-instruct-ggml-q4-463e             default                      macos
pixtral            pixtral:12b-240910-4d87                     default  80G                 linux
qwen2              qwen2:0.5b-instruct-fp16-9f3e               default  12G                 linux
                   qwen2:1.5b-instruct-fp16-8e30               default  12G                 linux
                   qwen2:7b-instruct-fp16-b00f                 default  24G                 linux
                   qwen2:7b-instruct-awq-4bit-7a7b             default  12G                 linux
                   qwen2:57b-a14b-instruct-fp16-eb8e           default  80Gx2               linux
                   qwen2:72b-instruct-fp16-94de                default  80Gx2               linux
                   qwen2:72b-instruct-awq-4bit-6b8b            default  80G                 linux
qwen2.5            qwen2.5:0.5b-instruct-fp16-bb8e             default  12G                 linux
                   qwen2.5:1.5b-instruct-fp16-2af3             default  12G                 linux
                   qwen2.5:3b-instruct-fp16-fc88               default  12G                 linux
                   qwen2.5:7b-instruct-fp16-7b24               default  24G                 linux
                   qwen2.5:14b-instruct-fp16-5fb9              default  80G                 linux
                   qwen2.5:14b-instruct-ggml-q4-darwin-1cf2    default                      macos
                   qwen2.5:14b-instruct-ggml-q8-darwin-f06a    default                      macos
                   qwen2.5:32b-instruct-fp16-7c97              default  80G                 linux
                   qwen2.5:32b-instruct-awq-4bit-45f4          default  40G                 linux
                   qwen2.5:32b-instruct-ggml-fp16-darwin-809c  default                      macos
                   qwen2.5:72b-instruct-fp16-4256              default  80Gx2               linux
                   qwen2.5:72b-instruct-ggml-q4-darwin-a138    default                      macos
qwen2vl            qwen2vl:7b-instruct-fp16-b706               default  24G                 linux
```
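
Any model in this list can be served once the repository index is up to date. A minimal sketch, using the `llama3.2` tag from the table above (exact tag resolution may vary by openllm version; run `openllm model list` to see the names your install accepts):

```bash
# Serve a small model from the default repository on a Linux GPU machine.
openllm serve llama3.2:1b-instruct-fp16-e7a2
```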

## Development Guide

Open PRs against the `nightly` branch to add new models or update existing ones.

You can also fork this repo and add your own models, then use `openllm repo add` to register your own model repository, as sketched below.
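
For example, after forking (`my-models` and `<your-username>` are placeholders for your own repo alias and GitHub account):

```bash
# Register a forked model repository under a custom name.
openllm repo add my-models https://github.com/<your-username>/openllm-models@main
```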