v0.1.0-rc.0

Pre-release
@github-actions github-actions released this 12 Sep 06:52
· 2559 commits to main since this release
3af32c8

📦 Artifacts

🚀 Features

llama.cpp as a Metal inference backend on Apple M1/M2

  • feat: llama.cpp for metal support [TAB-146] by @wsxiaoys in #391
  • feat: support cancellation in llama backend [TAB-146] by @wsxiaoys in #392
  • feat: tune llama metal backend performance by @wsxiaoys in #393
  • fix: ensure default suffix to be non-empty by @wsxiaoys in #400
  • feat: turn on metal device by default on macosx / aarch64 devices by @wsxiaoys in #398
  • feat: implement input truncation with options.max_input_length by @wsxiaoys in #415
  • feat: implement input truncation for llama-cpp-bindings by @wsxiaoys in #416
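The input truncation added in #415 and #416 can be pictured with a minimal sketch. The release notes only name the option `options.max_input_length`; the function name `truncate_input`, the use of a token list, and the choice to keep the most recent tokens (dropping the oldest) are assumptions made for illustration, not the project's confirmed behavior.

```python
from typing import List


def truncate_input(tokens: List[int], max_input_length: int) -> List[int]:
    """Return at most max_input_length tokens from the end of the input.

    Assumption: the tokens closest to the cursor matter most for code
    completion, so the oldest tokens are the ones dropped.
    """
    if max_input_length <= 0:
        # Guard: tokens[-0:] would return the whole list, not an empty one.
        return []
    return tokens[-max_input_length:]


if __name__ == "__main__":
    prompt_tokens = [101, 102, 103, 104, 105]
    print(truncate_input(prompt_tokens, 3))
```

Input that already fits within the limit passes through unchanged, so the option acts only as an upper bound.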

Experimental support for an HTTP API backend.

🧰 Improvements

  • Improve default suffix handling in FIM inference. #400
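A rough sketch of what a non-empty default suffix means for a fill-in-the-middle (FIM) prompt follows. The template string, the newline default, and the name `build_fim_prompt` are all hypothetical; the release notes only state that the default suffix is now guaranteed to be non-empty.

```python
from typing import Optional

# Assumption: a single newline stands in for a missing suffix.
DEFAULT_SUFFIX = "\n"

# Illustrative template only; real models each define their own FIM markers.
FIM_TEMPLATE = "<PRE> {prefix} <SUF>{suffix} <MID>"


def build_fim_prompt(prefix: str, suffix: Optional[str] = None) -> str:
    """Build a FIM prompt, substituting a default when the suffix is empty."""
    if not suffix:
        # Covers both None and the empty string.
        suffix = DEFAULT_SUFFIX
    return FIM_TEMPLATE.format(prefix=prefix, suffix=suffix)


if __name__ == "__main__":
    print(repr(build_fim_prompt("def add(a, b):", "")))
```

Falling back to a default keeps the prompt shape identical whether or not the editor supplies text after the cursor, which is one plausible reason for the fix.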

💫 New Contributors

Full Changelog: v0.0.1...v0.1.0-rc.0