Releases: michaelfeil/infinity

0.0.3

30 Oct 13:58
8116680

What's Changed

  • Add Flash-Attention + optimum BetterTransformer by @michaelfeil in #20
  • Improve the real-time / sleep strategy and use async/await for queues and result futures, slightly reducing latency, by @michaelfeil in #12
  • Add a better FIFO queueing strategy, so requests now have an upper bound on how long they queue, by @michaelfeil in #19
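
The queueing changes in #12 and #19 can be illustrated with a small sketch. This is not Infinity's actual implementation: the class `FifoBatcher`, its `submit`/`run` methods, and the stand-in "model" (string length) are hypothetical names chosen for illustration. It assumes a strict first-in, first-out queue drained in fixed-size batches, with each caller awaiting an `asyncio` future for its result.

```python
import asyncio
from collections import deque


class FifoBatcher:
    """Hypothetical sketch: strict FIFO queue drained in batches."""

    def __init__(self, max_batch_size: int = 2) -> None:
        self.max_batch_size = max_batch_size
        self.batches = []              # record of processed batches, for inspection
        self._pending = deque()        # (text, future) pairs, oldest on the left
        self._wakeup = asyncio.Event()

    async def submit(self, text: str) -> int:
        # Each request gets its own future and awaits it instead of polling.
        future = asyncio.get_running_loop().create_future()
        self._pending.append((text, future))
        self._wakeup.set()
        return await future

    async def run(self) -> None:
        while True:
            # Sleep until at least one request is queued.
            await self._wakeup.wait()
            self._wakeup.clear()
            while self._pending:
                size = min(self.max_batch_size, len(self._pending))
                # Always take the oldest requests first (FIFO).
                batch = [self._pending.popleft() for _ in range(size)]
                self.batches.append([text for text, _ in batch])
                for text, future in batch:
                    future.set_result(len(text))  # stand-in for a model call
                await asyncio.sleep(0)  # yield to other tasks between batches


async def demo():
    batcher = FifoBatcher(max_batch_size=2)
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(
        *(batcher.submit(text) for text in ["a", "bb", "ccc"])
    )
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return results, batcher.batches
```

Under these assumptions, a request with k requests ahead of it waits for at most ceil((k + 1) / max_batch_size) batches, which is one way a FIFO strategy can bound queueing time. Calling `asyncio.run(demo())` returns the results in submission order together with the batches that were formed.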

Docs:

Full Changelog: 0.0.2rc0...0.0.3

0.0.2

22 Oct 10:51

What's Changed

Full Changelog: 0.0.1...0.0.2rc0

0.0.1

12 Oct 16:41

Initial release of Infinity

0.0.1-dev3

12 Oct 16:07
Pre-release

What's Changed

Full Changelog: 0.0.1-dev2...0.0.1-dev3

0.0.1-dev2 - Speedups

12 Oct 01:46
3ed24bb
Pre-release

  • Adds a new dependency (orjson) for faster response serialization (about 300% speedup)
  • Uses torch.inference_mode() and delays moving results to the CPU (about 10% speedup)
  • Adds uvicorn[standard] (slightly faster, possibly 2-5%)
  • Updates the README

#2

0.0.1-dev1

11 Oct 18:08
Pre-release

This is a release for testing the CI of Infinity.