Releases: TabbyML/tabby
v0.2.1
🚀 Features
Chat Model & Web Interface
We have introduced a new argument, --chat-model, which allows you to specify the model for the chat playground located at http://localhost:8080/playground.
To utilize this feature, use the following command in the terminal:
tabby serve --device metal --model TabbyML/StarCoder-1B --chat-model TabbyML/Mistral-7B
ModelScope Model Registry
Users in mainland China have faced challenges accessing Hugging Face for various reasons. The Tabby team is actively addressing this by mirroring models to modelscope.cn, a model hosting provider in mainland China.
# Download from the ModelScope registry
TABBY_REGISTRY=modelscope tabby download --model TabbyML/WizardCoder-1B
🧰 Fixes and improvements
- Implemented more accurate UTF-8 incremental decoding.
- Fixed the stop words implementation by utilizing RegexSet to isolate the stop word group.
- Improved model downloading logic: Tabby now fetches the latest model version when the remote has changed and the local cache key has become stale.
- Set the default num_replicas_per_device for the ctranslate2 backend to increase parallelism.
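The stop-word fix groups every stop word into a single matcher, so the decoder can halt as soon as any of them appears in the generated text. A minimal Python sketch of the same idea (Tabby's actual implementation uses Rust's RegexSet; the stop words below are made up for illustration):

```python
import re

def make_stop_matcher(stop_words):
    # Combine all stop words into one alternation group so a single
    # scan finds whichever stop word occurs first in the text.
    pattern = "|".join(re.escape(w) for w in stop_words)
    return re.compile(f"({pattern})")

def truncate_at_stop_word(text, matcher):
    # Cut the generated text at the first stop word, if any.
    m = matcher.search(text)
    return text[: m.start()] if m else text

matcher = make_stop_matcher(["\ndef ", "\nclass ", "<|endoftext|>"])
print(truncate_at_stop_word("    return x\ndef next():", matcher))
```

Grouping the alternation keeps the scan to a single pass regardless of how many stop words are configured.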
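The refreshed download logic amounts to cache-key validation: record the remote version identifier alongside the cached model, and re-download when the remote identifier no longer matches. A hedged sketch of the pattern (illustrative only; the meta.json layout and helper names here are hypothetical, not Tabby's actual code):

```python
import json
from pathlib import Path

def needs_redownload(cache_dir: Path, remote_revision: str) -> bool:
    # The cache key is the revision recorded at download time; if the
    # remote has moved on, the cached copy is stale.
    meta = cache_dir / "meta.json"
    if not meta.exists():
        return True
    cached_revision = json.loads(meta.read_text()).get("revision")
    return cached_revision != remote_revision

def record_revision(cache_dir: Path, revision: str) -> None:
    # Persist the revision we just downloaded as the new cache key.
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / "meta.json").write_text(json.dumps({"revision": revision}))
```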
💫 New Contributors
Full Changelog: v0.1.2...v0.2.1
v0.2.0-rc.0
v0.2.0
fix: playground environment misconfig
v0.1.2
Patch Release
- docs: add model spec (unstable) version in #457
- docs: update vim documentation in #453
- fix(tabby): fix swagger's local server use local port in #458
- feat: Update Dockerfile to ctranslate 3.20.0 in #460
New Contributors
Full Changelog: v0.1.1...v0.1.2
v0.1.1
🍺 Homebrew support (Apple M1/M2)
brew install tabbyml/tabby/tabby
# Start with StarCoder-1B
tabby serve --device metal --model TabbyML/StarCoder-1B
📦 Artifacts
- Binary for Mac OSX M1/M2, with Metal support: tabby_aarch64-apple-darwin.
- Binary for Linux x86/64, CPU only: tabby_x86_64-unknown-linux-gnu.
- Docker image with NVIDIA CUDA Support: tabbyml/tabby:0.1.1.
🚀 Features
- Metal inference backend for Apple M1/M2: #391
- StarCoder model series support: ggerganov/llama.cpp#3187
- Experimental support for HTTP API backends: #410, #419
🧰 Improvements
- Improve default suffix handling in FIM inference. #400
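Fill-in-the-middle (FIM) prompts interleave the code before and after the cursor, and an empty suffix can degrade completions, hence the non-empty default. A rough sketch of FIM prompt assembly using StarCoder-style sentinel tokens (illustrative; the newline default is an assumption, not Tabby's exact implementation):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    # StarCoder-style fill-in-the-middle sentinels: the model generates
    # the "middle" given the code before and after the cursor.
    if not suffix:
        suffix = "\n"  # hypothetical non-empty default (cf. #400)
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

print(build_fim_prompt("def add(a, b):\n    return ", ""))
```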
💫 New Contributors
- @sunny0826 made their first contribution in #399
- @leiwen83 made their first contribution in #421
Full Changelog: v0.0.1...v0.1.1
v0.1.0-rc.1
feat: add LLAMA_CPP_LOG_LEVEL to control log level of llama.cpp (#436)
v0.1.0-rc.0
📦 Artifacts
- Binary for Mac OSX M1/M2, with Metal support provided by llama.cpp: tabby_aarch64-apple-darwin.
- Binary for Linux x86/64, CPU only: tabby_x86_64-unknown-linux-gnu.
- Docker image with NVIDIA CUDA Support: tabbyml/tabby:0.1.0-rc.0.
🚀 Features
llama.cpp as the Metal inference backend on Apple M1/M2
- feat: llama.cpp for metal support [TAB-146] by @wsxiaoys in #391
- feat: support cancellation in llama backend [TAB-146] by @wsxiaoys in #392
- feat: tune llama metal backend performance by @wsxiaoys in #393
- fix: ensure default suffix to be non-empty by @wsxiaoys in #400
- feat: turn on metal device by default on macosx / aarch64 devices by @wsxiaoys in #398
- feat: implement input truncation with options.max_input_length by @wsxiaoys in #415
- feat: implement input truncation for llama-cpp-bindings by @wsxiaoys in #416
Experimental support for HTTP API backends.
- feat: add vertex api bindings by @wsxiaoys in #410
- feat: add support vertex-ai http bindings by @wsxiaoys in #419
- feat: add support fastchat http bindings by @leiwen83 in #421
🧰 Improvements
- Improve default suffix handling in FIM inference. #400
💫 New Contributors
- @sunny0826 made their first contribution in #399
- @leiwen83 made their first contribution in #421
Full Changelog: v0.0.1...v0.1.0-rc.0
v0.1.0
📦 Artifacts
- Binary for Mac OSX M1/M2, with Metal support provided by llama.cpp: tabby_aarch64-apple-darwin.
- Binary for Linux x86/64, CPU only: tabby_x86_64-unknown-linux-gnu.
- Docker image with NVIDIA CUDA Support: tabbyml/tabby:0.1.1.
🚀 Features
- Metal inference backend for Apple M1/M2: #391
- StarCoder model series support: ggerganov/llama.cpp#3187
- Experimental support for HTTP API backends: #410, #419
🧰 Improvements
- Improve default suffix handling in FIM inference. #400
💫 New Contributors
- @sunny0826 made their first contribution in #399
- @leiwen83 made their first contribution in #421
Full Changelog: v0.0.1...v0.1.0
v0.0.1
📦 Artifacts
- Binary for Mac OSX M1/M2, CPU only: tabby_aarch64-apple-darwin.
- Binary for Linux x86/64, CPU only: tabby_x86_64-unknown-linux-gnu.
- Docker image with NVIDIA CUDA Support: tabbyml/tabby:v0.0.1.
🚀 Features
- Support FIM inference.
- Initial support of indexing with tabby scheduler.
- Support CodeLlama series models.
🧰 Improvements
- Support early cancellation to reduce GPU workload, increasing capacity of tabby server.
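Early cancellation means the server stops decoding as soon as the client disconnects, freeing the GPU for queued requests. A minimal asyncio sketch of the pattern (illustrative only; Tabby's server is written in Rust, and these names are hypothetical):

```python
import asyncio

async def generate_tokens(n: int):
    # Stand-in for an expensive, token-by-token decoding loop.
    for i in range(n):
        await asyncio.sleep(0)  # yield control so cancellation can take effect
        yield f"tok{i}"

async def serve_request(cancel: asyncio.Event, max_tokens: int = 100):
    # Decode until the token budget is exhausted or the client goes away.
    produced = []
    async for tok in generate_tokens(max_tokens):
        if cancel.is_set():  # client disconnected: stop decoding early
            break
        produced.append(tok)
    return produced

async def main():
    cancel = asyncio.Event()
    cancel.set()  # simulate the client disconnecting immediately
    out = await serve_request(cancel)
    print(len(out))  # 0: no tokens decoded after cancellation

asyncio.run(main())
```

Checking the cancellation flag between tokens is what lets the server reclaim GPU time mid-generation instead of running every request to completion.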
💫 New Contributors
- @Allesanddro made their first contribution in #233
- @prologic made their first contribution in #250
- @pauchiner made their first contribution in #253
- @ghthor made their first contribution in #255
- @sandrofigo made their first contribution in #353
v0.0.1-rc.2
📦 Artifacts
- Binary for Mac OSX M1/M2, CPU only: tabby_aarch64-apple-darwin.
- Binary for Linux x86/64, CPU only: tabby_x86_64-unknown-linux-gnu.
- Docker image with NVIDIA CUDA Support: tabbyml/tabby:0.0.1-rc.2.
🚀 Features
- Support FIM inference.
- Initial support of indexing with tabby scheduler.
- Support CodeLlama series models.
🧰 Improvements
- Support early cancellation to reduce GPU workload, increasing capacity of tabby server.
💫 New Contributors
- @Allesanddro made their first contribution in #233
- @prologic made their first contribution in #250
- @pauchiner made their first contribution in #253
- @ghthor made their first contribution in #255
- @sandrofigo made their first contribution in #353