Releases: michaelfeil/infinity
0.0.3
What's Changed
- add Flash-Attention + optimum BetterTransformer by @michaelfeil in #20
- Improve real-time / sleep strategy; use async await for queues and result futures, reducing latency a bit by @michaelfeil in #12
- add better FIFO queueing strategy: your requests now have an upper bound on how long they queue by @michaelfeil in #19
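
The queueing changes above can be pictured with a minimal sketch. This is not Infinity's actual implementation: the class `BatchQueue`, its parameters `max_batch_size` and `max_wait_s`, and the stand-in "inference" (returning the text length) are all hypothetical. It only illustrates the general pattern of awaiting a FIFO queue and per-request futures instead of polling with sleeps, with a deadline that bounds how long a request waits for its batch to fill.

```python
import asyncio
import time


class BatchQueue:
    """Hypothetical FIFO batching queue; a sketch, not Infinity's code."""

    def __init__(self, max_batch_size: int = 4, max_wait_s: float = 0.01):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self._queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, item: str) -> int:
        # Each request carries its own future; the caller awaits it
        # instead of polling for the result.
        future = asyncio.get_running_loop().create_future()
        await self._queue.put((item, future))
        return await future

    async def worker(self) -> None:
        while True:
            # Wait (without a sleep loop) until the oldest request arrives.
            batch = [await self._queue.get()]
            deadline = time.monotonic() + self.max_wait_s
            # Fill the batch in arrival order, but never wait past the deadline.
            while len(batch) < self.max_batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(
                        await asyncio.wait_for(self._queue.get(), remaining)
                    )
                except asyncio.TimeoutError:
                    break
            # Stand-in for model inference: the "result" is the text length.
            for item, future in batch:
                if not future.done():
                    future.set_result(len(item))


async def main() -> list:
    queue = BatchQueue()
    worker_task = asyncio.create_task(queue.worker())
    results = await asyncio.gather(
        *(queue.submit(text) for text in ["a", "bb", "ccc"])
    )
    worker_task.cancel()
    return results
```

Running `asyncio.run(main())` submits three requests concurrently and returns their results in submission order. The deadline is measured from the moment the worker picks up the first request of a batch, so in this sketch the bound applies to batch-filling time only.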
Docs:
- Docs: Update README.md by @michaelfeil in #8
- Update description. Update pyproject.toml by @michaelfeil in #9
- Refactor model dir by @michaelfeil in #10
- Update README.md by @michaelfeil in #14
- Update README.md by @michaelfeil in #15
Full Changelog: 0.0.2rc0...0.0.3
0.0.2
What's Changed
- Docs: Update README.md by @michaelfeil in #8
- Update description. Update pyproject.toml by @michaelfeil in #9
- Refactor model dir by @michaelfeil in #10
- Improve real-time / sleep strategy, async await for queues and result futures by @michaelfeil in #12
Full Changelog: 0.0.1...0.0.2rc0
0.0.1
Initial release of Infinity
0.0.1-dev3
What's Changed
- startup msg, log handling, import by @michaelfeil in #4
- update CI to release pypi by @michaelfeil in #7
Full Changelog: 0.0.1-dev2...0.0.1-dev3
0.0.1-dev2 - Speedups
- adds new dependency (orjson) for faster response serialization (roughly 300%)
- uses torch.inference_mode() and delays moving tensors to the CPU (roughly 10%)
- adds uvicorn[standard], slightly faster (an estimated 2-5%)
- updates the README
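
As a rough illustration of the serialization change, the sketch below encodes a response body with orjson when it is installed and falls back to the standard library otherwise. The helper `serialize` and the payload shape are assumptions made for this example, not taken from Infinity's code. `orjson.dumps` returns `bytes` directly, which avoids a separate encoding step.

```python
import json

try:
    import orjson  # third-party; assumed installed via `pip install orjson`
except ImportError:
    orjson = None


def serialize(payload: dict) -> bytes:
    """Hypothetical helper: encode a response body as JSON bytes."""
    if orjson is not None:
        # orjson.dumps returns bytes, ready to send as the response body.
        return orjson.dumps(payload)
    # Standard-library fallback: build a str, then encode it to bytes.
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


# Illustrative payload only; the real response schema may differ.
response = {
    "object": "embedding",
    "data": [{"index": 0, "embedding": [0.1, 0.2, 0.3]}],
}
body = serialize(response)
```

Either path yields bytes that decode back to the same dictionary, so the fallback keeps the example runnable where orjson is unavailable.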
0.0.1-dev1
This is a release for testing the CI of Infinity.