Olive-ai 0.6.0
Examples
The following examples are added:
- Add LLM sample for DirectML #1082 #1106
  - This adds an LLM sample for DirectML that can convert and quantize a wide range of LLMs from HuggingFace. The Dolly, Phi, and LLaMA 2 folders were removed and replaced with a more generic LLM example that supports many LLMs, including but not limited to Phi-2, Mistral, and LLaMA 2.
- Add Gemma to DML LLM sample #1138
- Llama2 optimization with multi-ep managed env #1087
- Llama2: Multi-lora example notebook, Custom generator #1114
- Search Optimal optimization among multiple EPs #1092
Olive CLI updates
- Previous commands `python -m olive.workflows.run` and `python -m olive.platform_sdk.qualcomm.configure` are deprecated. Use `olive run` or `python -m olive` instead. #1129
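As an illustration, the migration is a drop-in replacement on the command line. The `--config` flag and workflow file name below are assumptions meant to mirror a typical existing invocation; check them against your own setup:

```shell
# Deprecated entry point (still works for now, but scheduled for removal):
#   python -m olive.workflows.run --config my_workflow.json
# Preferred replacement, same config file and behavior (assumed equivalent):
olive_cmd="olive run --config my_workflow.json"
echo "$olive_cmd"
```

The `python -m olive` form accepts the same subcommands, so CI scripts that cannot rely on the `olive` console script being on PATH can switch to it instead.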
Passes (optimization techniques)
- Pytorch
- ONNXRuntime
  - `ExtractAdapters` pass supports int4 quantized models and exposes the external data config options to users. #1083
  - `ModelBuilder`: Converts a Huggingface/AML generative PyTorch model to an ONNX model using ONNX Runtime Generative AI >= 0.2.0. #1089 #1073 #1110 #1112 #1118 #1130 #1131 #1141 #1146 #1147 #1154
  - `OnnxFloatToFloat16`: Use the ORT float16 converter #1132
  - `NVModelOptQuantization`: Quantize ONNX model with Nvidia-ModelOpt. #1135
  - `OnnxIOFloat16ToFloat32`: Converts float16 model inputs/outputs to float32. #1149
- [Vitis AI] Make Vitis AI techniques compatible with ORT 1.18 #1140
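As a sketch, the new passes slot into a workflow config like any other Olive pass. The pass type names come from this release, but the pass aliases and the `precision` option shown are assumptions to be checked against the pass documentation:

```json
{
  "passes": {
    "builder": { "type": "ModelBuilder", "config": { "precision": "int4" } },
    "to_fp16": { "type": "OnnxFloatToFloat16" },
    "extract": { "type": "ExtractAdapters" }
  }
}
```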
Data Config
- Remove name ambiguity in dataset configuration #1111
- Remove HfConfig::dataset references in examples and tests #1113
Engine
- Add aml deployment packaging. #1090
System
- Make the accelerator EP optional in Olive systems for non-ONNX passes. #1072
Data
- Add AML resource support for data configs.
- Add audio classification data preprocess function.
Model
- Provide built-in kv_cache_config for generative models' io_config #1121
- Convert MLflow transformers models to Huggingface format so they can be consumed by passes that require Huggingface format. #1150
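A minimal sketch of how the built-in KV-cache description might be enabled in a generative model's io_config; the `kv_cache` flag and the input names below are assumptions drawn from the change description, not verified field names:

```json
{
  "io_config": {
    "input_names": ["input_ids", "attention_mask"],
    "kv_cache": true
  }
}
```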
Metrics
Dependencies:
- Support onnxruntime 1.17.3