Olive-ai 0.7.1
Command Line Interface
New command line tools have been added and existing tools have been improved.
- olive --help works as expected.
- auto-opt:
  - The command chooses a set of passes compatible with the provided model type, precision and accelerator information.
  - New options to split a model, either using --num-splits or --cost-model.
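As a rough illustration of the split options, the sketch below assembles an `olive auto-opt` command line in Python without running it. Only `--num-splits` and `--cost-model` come from these notes. The helper name `build_auto_opt_command`, the `--model_name_or_path` and `--output_path` flags, and the rule that the two split options are mutually exclusive are assumptions made for illustration; check `olive auto-opt --help` for the actual interface.

```python
import shlex
from typing import List, Optional


def build_auto_opt_command(
    model: str,
    output_path: str,
    num_splits: Optional[int] = None,
    cost_model: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `olive auto-opt` with at most one split option.

    Hypothetical helper: it only builds the argument list and never runs Olive.
    """
    # Assumption: the notes say "either ... or", so passing both is rejected here.
    if num_splits is not None and cost_model is not None:
        raise ValueError("pass either num_splits or cost_model, not both")
    if num_splits is not None and num_splits < 1:
        raise ValueError("num_splits must be a positive integer")

    # Assumed flag names for the model and output location.
    cmd = [
        "olive", "auto-opt",
        "--model_name_or_path", model,
        "--output_path", output_path,
    ]
    if num_splits is not None:
        cmd += ["--num-splits", str(num_splits)]
    elif cost_model is not None:
        cmd += ["--cost-model", cost_model]
    return cmd


if __name__ == "__main__":
    # Print a shell-ready command; hand the list to subprocess.run to execute it.
    print(shlex.join(build_auto_opt_command("my-model", "out", num_splits=2)))
```

Building the list first keeps the example free of side effects; the model name and paths shown are placeholders.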
Improvements
- ExtractAdapters:
  - Support lora adapter nodes in Stable Diffusion unet or text-embedding models.
  - Default initializers for quantized adapter to run the model without adapter inputs.
- GPTQ:
  - Avoid saving unused bias weights (all zeros).
  - Set use_exllama to False by default to allow exporting and fine-tuning external GPTQ checkpoints.
- AWQ: Patch autoawq to run quantization on newer transformers versions.
- Atomic SharedCache operations.
- New CaptureSplitInfo and Split passes to split models into components. The number of splits can be user provided or inferred from a cost model.
- disable_search is deprecated in the pass configuration of an olive workflow config.
- OrtSessionParamsTuning redone to use olive search features.
- OrtModelOptimizer renamed to OrtPeepholeOptimizer, with some bug fixes.
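The rename and the deprecation above may require touching existing workflow configs. The following is a minimal sketch of one way to update a config that has been loaded into a Python dict. It assumes a top-level `passes` mapping whose entries carry `type` and, optionally, `disable_search` keys; that layout and the helper name `migrate_workflow_config` are illustrative assumptions, not something these notes specify. Dropping the deprecated key is also just one possible response to the deprecation.

```python
import copy
from typing import Any, Dict

OLD_TYPE = "OrtModelOptimizer"     # name before this release (from the notes)
NEW_TYPE = "OrtPeepholeOptimizer"  # name after this release (from the notes)


def migrate_workflow_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` with the renamed pass type applied and the
    deprecated `disable_search` key removed from each pass entry.

    Hypothetical helper; the config layout is assumed, not taken from Olive.
    """
    migrated = copy.deepcopy(config)  # leave the caller's dict untouched
    for pass_config in migrated.get("passes", {}).values():
        # Compare case-insensitively so "ortmodeloptimizer" is also renamed.
        if str(pass_config.get("type", "")).lower() == OLD_TYPE.lower():
            pass_config["type"] = NEW_TYPE
        # Remove the deprecated key if present; do nothing otherwise.
        pass_config.pop("disable_search", None)
    return migrated


if __name__ == "__main__":
    old = {
        "passes": {
            "optimize": {"type": "OrtModelOptimizer", "disable_search": True},
        }
    }
    print(migrate_workflow_config(old))
```

Working on a deep copy makes it easy to diff the migrated config against the original before writing anything back to disk.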
Examples:
- Stable Diffusion: New MultiLora Example
- Phi3: New int quantization example using nvidia-modelopt