v0.1.0-rc.0 (Pre-release)
Released by github-actions on 12 Sep 06:52
📦 Artifacts
- Binary for macOS on Apple Silicon (M1/M2), with Metal support provided by llama.cpp: tabby_aarch64-apple-darwin.
- Binary for Linux x86_64, CPU only: tabby_x86_64-unknown-linux-gnu.
- Docker image with NVIDIA CUDA support: tabbyml/tabby:0.1.0-rc.0.
🚀 Features
llama.cpp as the Metal inference backend on Apple M1/M2
- feat: llama.cpp for metal support [TAB-146] by @wsxiaoys in #391
- feat: support cancellation in llama backend [TAB-146] by @wsxiaoys in #392
- feat: tune llama metal backend performance by @wsxiaoys in #393
- fix: ensure default suffix to be non-empty by @wsxiaoys in #400
- feat: turn on metal device by default on macosx / aarch64 devices by @wsxiaoys in #398 (see the sketch after this list)
- feat: implement input truncation with options.max_input_length by @wsxiaoys in #415
- feat: implement input truncation for llama-cpp-bindings by @wsxiaoys in #416 (see the sketch after this list)
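Two of the changes above are easy to picture in code: the platform check behind #398 (default to Metal on Apple Silicon) and the input truncation from #415/#416 (bound how much context reaches the model). The Rust sketch below is a minimal illustration only; the names TextGenerationOptions, default_device, and truncate_prompt are hypothetical, the string device identifiers are assumed, and real truncation may count tokens rather than characters.

```rust
/// Hypothetical mirror of options.max_input_length from #415.
pub struct TextGenerationOptions {
    pub max_input_length: usize,
}

/// Sketch of the platform check behind #398: prefer the Metal device on
/// Apple Silicon builds, fall back to CPU everywhere else.
fn default_device() -> &'static str {
    if cfg!(all(target_os = "macos", target_arch = "aarch64")) {
        "metal"
    } else {
        "cpu"
    }
}

/// Sketch of tail truncation: keep the most recent context so the prompt
/// fits within max_input_length. Real code may count tokens, not chars.
fn truncate_prompt<'a>(prompt: &'a str, options: &TextGenerationOptions) -> &'a str {
    let len = prompt.chars().count();
    if len <= options.max_input_length {
        return prompt;
    }
    // Locate the byte offset where the final max_input_length characters
    // begin, so multi-byte characters are never split.
    let skip = len - options.max_input_length;
    match prompt.char_indices().nth(skip) {
        Some((idx, _)) => &prompt[idx..],
        None => prompt,
    }
}

fn main() {
    let opts = TextGenerationOptions { max_input_length: 8 };
    assert_eq!(truncate_prompt("fn main() {}", &opts), "ain() {}");
    println!("default device: {}", default_device());
}
```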
Experimental support for HTTP API backends (see the sketch after this list).
- feat: add vertex api bindings by @wsxiaoys in #410
- feat: add support vertex-ai http bindings by @wsxiaoys in #419
- feat: add support fastchat http bindings by @leiwen83 in #421
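These bindings let Tabby delegate completion requests to an external inference server over HTTP instead of running the model in-process. As a rough sketch of the shape such a call takes, the snippet below posts a prompt to an OpenAI-compatible /v1/completions endpoint, which is how FastChat exposes its models; the function name, field choices, and endpoint URL are assumptions for illustration, not the crate's actual API.

```rust
use serde_json::json;

/// Sketch of an HTTP completion call in the style of the FastChat binding
/// (#421). FastChat serves an OpenAI-compatible API; everything named here
/// is illustrative rather than Tabby's real interface.
async fn http_complete(
    client: &reqwest::Client, // reqwest built with the `json` feature
    endpoint: &str,           // e.g. "http://localhost:8000/v1/completions" (assumed)
    model: &str,
    prompt: &str,
    max_tokens: usize,
) -> anyhow::Result<String> {
    let body = json!({
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
    });
    let resp: serde_json::Value = client
        .post(endpoint)
        .json(&body)
        .send()
        .await?
        .json()
        .await?;
    // OpenAI-style responses put the generated text in choices[0].text.
    Ok(resp["choices"][0]["text"]
        .as_str()
        .unwrap_or_default()
        .to_string())
}
```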
🧰 Improvements
- Improve default suffix handling in FIM inference. #400
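For context on this item: fill-in-the-middle (FIM) prompts interleave the code before and after the cursor, and an empty suffix can yield a degenerate prompt, hence the non-empty default from #400. Below is a minimal sketch of the idea using StarCoder-style FIM tokens for illustration; the function name and the "\n" fallback are assumptions, not necessarily what Tabby defaults to.

```rust
/// Sketch of FIM prompt assembly with a guaranteed non-empty suffix,
/// mirroring the intent of #400. Token markers are StarCoder-style and
/// the "\n" fallback is an assumed default.
fn build_fim_prompt(prefix: &str, suffix: &str) -> String {
    let suffix = if suffix.is_empty() { "\n" } else { suffix };
    format!("<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>")
}

fn main() {
    // With no code after the cursor, the suffix still carries a newline.
    assert_eq!(
        build_fim_prompt("fn add(a: i32, b: i32) -> i32 {", ""),
        "<fim_prefix>fn add(a: i32, b: i32) -> i32 {<fim_suffix>\n<fim_middle>"
    );
}
```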
💫 New Contributors
- @sunny0826 made their first contribution in #399
- @leiwen83 made their first contribution in #421
Full Changelog: v0.0.1...v0.1.0-rc.0