v0.4.0
Release Notes
- Support for new models such as Qwen 2, LLaMA 3.1, Gemma 2, Phi-3 small on CPU
- Support for building models that were already quantized with AWQ or GPTQ
- Performance improvements for Intel and Arm CPUs
- Packaging and language bindings
  - Added Java bindings (build from source)
  - Separated OnnxRuntime.dll and directml.dll out of the GenAI package to improve usability
  - Published packages for Windows Arm
  - Added support for Android (build from source)