Releases: michaelfeil/infinity
0.0.33
What's Changed
- fix-orjson by @michaelfeil in #201
- Add EngineArray Multi-Model [1/3] by @michaelfeil in #200
- Openapi tests by @michaelfeil in #199
- refactor BatchHandler into ModelWorker by @michaelfeil in #202
- Add fp32 as runtime dtype by @michaelfeil in #211
Full Changelog: 0.0.32...0.0.33
0.0.32
What's Changed
You can now run a model under an alias. This makes it easier to refer to the model when communicating with the API.
infinity_emb --served-model-name "your_nickname"
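As a rough sketch of how the alias might then be used on the client side, the snippet below builds an OpenAI-style embeddings request whose `model` field carries the served model name. The host and port (`http://localhost:7997`), the `/embeddings` route, and the payload shape are assumptions for illustration; check them against your own deployment. The helper `build_embedding_request` is hypothetical and not part of infinity.

```python
import json
import urllib.request


def build_embedding_request(base_url, served_model_name, texts):
    """Build (but do not send) an OpenAI-style embeddings request.

    Assumption: the server accepts a JSON body with "model" and "input"
    at <base_url>/embeddings. Adjust to match your deployment.
    """
    payload = {"model": served_model_name, "input": list(texts)}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "your_nickname" is the alias passed to --served-model-name above.
request = build_embedding_request(
    "http://localhost:7997",  # assumed address of a locally running server
    "your_nickname",
    ["hello world"],
)

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

The point of the alias is that client code like this can keep a short, stable name even if the underlying model path changes.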
You can now preload models. This acts as a "download and load into RAM" test. Upon execution, all files are cached, which speeds up subsequent loads. For additional speedups, use --no-model-warmup to skip model warmup after loading.
infinity_emb --preload-only --model-name-or-path BAAI/bge-large-en-v1.5
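If you want to run the preload step from a script (for example while building an image), something like the following sketch could assemble the command. It assumes the flag is spelled `--model-name-or-path` and that `--preload-only` and `--no-model-warmup` can be combined in one invocation; neither is confirmed here. The helper `preload_command` is hypothetical.

```python
import shlex


def preload_command(model_name_or_path, skip_warmup=False):
    """Return the argv list for a preload-only run of infinity_emb.

    Assumption: --preload-only and --no-model-warmup may be passed together.
    """
    args = [
        "infinity_emb",
        "--preload-only",
        "--model-name-or-path",
        model_name_or_path,
    ]
    if skip_warmup:
        args.append("--no-model-warmup")
    return args


command = preload_command("BAAI/bge-large-en-v1.5", skip_warmup=True)
print(shlex.join(command))

# To actually execute it (requires infinity_emb to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell quoting issues when model paths contain unusual characters.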
PRs
- feat: add served_model_name argument for the infinity_server by @bufferoverflow in #180
- FIX: import crossencoder without torch installed and git push of creds by @michaelfeil in #181
- update default model_name to be unified name across routes by @michaelfeil in #179
- python39 type hints by @michaelfeil in #182
- pydantic cli / args validation by @michaelfeil in #183
- update defered moving to cpu & type hints improvement by @michaelfeil in #187
- Update README.md - add Contributors by @michaelfeil in #189
- update infinity offline solution by @michaelfeil in #195
- update offline-mode: deployment docs v2 by @michaelfeil in #196
New Contributors
- @bufferoverflow made their first contribution in #180 Thanks!
Full Changelog: 0.0.31...0.0.32
0.0.31
What's Changed
- Create ISSUE_TEMPLATE by @michaelfeil in #168
- bump sentence transformers to v.2.6.0 by @michaelfeil in #169
- Embedding quant by @michaelfeil in #170
- refactor ENUM..TypeHint into a function by @michaelfeil in #172
- refactored more imports by @michaelfeil in #171
- redirect to /docs and optional imports by @michaelfeil in #175
- update typing by @michaelfeil in #176
- update lock by @michaelfeil in #177
Full Changelog: 0.0.30...0.0.31
0.0.30
What's Changed
- remove fastembed by @michaelfeil in #141
- Sentence transformers bump to 2.5.0 by @michaelfeil in #142
- Revert "Sentence transformers bump to 2.5.0" by @michaelfeil in #143
- Update README.md by @michaelfeil in #145
- update poetry lock - sentence-transformers 2.5.0 by @michaelfeil in #144
- Support for Inferentia2 (draft) by @michaelfeil in #118
- Add bettertransformer to cli by @michaelfeil in #152
- Fp8 support by @michaelfeil in #153
- Some docstring and typing fixes by @lckr in #156
- add async tokenization to reranker in torch by @michaelfeil in #154
- Update README.md by @sherwin684 in #167
New Contributors
- @lckr made their first contribution in #156
- @sherwin684 made their first contribution in #167
Full Changelog: 0.0.29...0.0.30
0.0.29
What's Changed
- OpenAI models compatability and update docs and by @michaelfeil in #140
This will be the last release with fastembed. fastembed and optimum provide similar capabilities, so please use optimum going forward.
Full Changelog: 0.0.28...0.0.29
0.0.28
What's Changed
- add macos ci by @michaelfeil in #133
- Quantization: int8 by @michaelfeil in #134
- add docs via mkdocs by @michaelfeil in #137
Full Changelog: 0.0.27...0.0.28
0.0.27
What's Changed
- BREAKING: EngineArgs by @michaelfeil in #124
  - New stable interface: batch-size=32 is now the default, michaelfeil/bge-small is the default model, and you can now override the pooling method.
- multiple os ci and python 3.12 support by @michaelfeil in #131
- add michaelfeil/bge-small as default model by @michaelfeil in #135
Full Changelog: 0.0.26...0.0.27
0.0.26
What's Changed
- hf_transfer is automatically used by @michaelfeil in #112
- add revision to onnx by @michaelfeil in #109
- bump sentence-transformers to 2.4.0 by @michaelfeil in #113
- Adds benchmarking by @michaelfeil in #110
- ONNX/Optimum now works on windows by @michaelfeil in #117
Full Changelog: 0.0.25...0.0.26
0.0.25
What's Changed
- add engine args similar to vllm by @michaelfeil in #102
- ct2 bump by @michaelfeil in #103
- Deps free by @michaelfeil in #104
- fix: cli start by @michaelfeil in #105
Full Changelog: 0.0.24...0.0.25
0.0.24
What's Changed
- Update README.md contribution guidelines by @michaelfeil in #91
- Update tensorrt, onnxruntime, cuda base by @michaelfeil in #93
- Update dependencies by @NirantK in #96
- Torch dynamic shapes by @michaelfeil in #97
- update poetry version + cache in ci by @michaelfeil in #99
- pydantic upgrade by @michaelfeil in #100
- pydantic-v1-backwards-fixes by @michaelfeil in #101
New Contributors
Full Changelog: 0.0.23...0.0.24