Olive-ai 0.7.1

@jambayk released this 12 Nov 20:57

Command Line Interface

New command line tools have been added and existing tools have been improved.

  • olive --help works as expected.
  • auto-opt:
    • The command chooses a set of passes compatible with the provided model type, precision and accelerator information.
    • New options to split a model into components, using either --num-splits or --cost-model.
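
A minimal sketch of how the new splitting options might be used with auto-opt; only --num-splits and --cost-model come from these notes, while the remaining flags, the model name, and the cost-model file are illustrative assumptions.

    # Split the optimized model into a fixed number of components.
    # Flags other than --num-splits/--cost-model are illustrative assumptions.
    olive auto-opt \
        --model_name_or_path microsoft/Phi-3-mini-4k-instruct \
        --device gpu \
        --precision int4 \
        --num-splits 2 \
        --output_path models/phi3-auto-opt

    # Or let auto-opt infer the split points from a cost model (hypothetical file).
    olive auto-opt \
        --model_name_or_path microsoft/Phi-3-mini-4k-instruct \
        --device gpu \
        --precision int4 \
        --cost-model cost_model.csv \
        --output_path models/phi3-auto-opt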

Improvements

  • ExtractAdapters:
    • Support LoRA adapter nodes in Stable Diffusion UNet or text-embedding models.
    • Add default initializers for quantized adapters so the model can run without adapter inputs.
  • GPTQ:
    • Avoid saving unused bias weights (all zeros).
    • Set use_exllama to False by default to allow exporting and fine-tuning external GPTQ checkpoints.
  • AWQ: Patch autoawq to run quantization on newer transformers versions.
  • SharedCache operations are now atomic.
  • New CaptureSplitInfo and Split passes to split models into components. The number of splits can be provided by the user or inferred from a cost model (see the sketch after this list).
  • disable_search has been deprecated in pass configurations within an Olive workflow config.
  • OrtSessionParamsTuning has been reworked to use Olive's search features.
  • OrtModelOptimizer has been renamed to OrtPeepholeOptimizer, along with some bug fixes.
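
As a rough sketch, the new split passes could also be wired into a workflow config and run from the CLI; the pass type names CaptureSplitInfo and Split come from these notes, but field names such as num_splits and the surrounding config shape are illustrative assumptions rather than a verified schema.

    # split_workflow.json (illustrative contents; only the pass names
    # CaptureSplitInfo and Split are taken from these notes):
    #
    # {
    #     "input_model": { "type": "HfModel", "model_path": "microsoft/Phi-3-mini-4k-instruct" },
    #     "passes": {
    #         "capture_split_info": { "type": "CaptureSplitInfo", "num_splits": 2 },
    #         "split": { "type": "Split" }
    #     },
    #     "output_dir": "models/phi3-split"
    # }

    # Run the workflow with the Olive CLI.
    olive run --config split_workflow.json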

Examples

  • Stable Diffusion: new MultiLora example.
  • Phi3: new int quantization example using nvidia-modelopt.