Olive-ai 0.7.1

@jambayk released this 12 Nov 20:57

Command Line Interface

New command line tools have been added and existing tools have been improved.

  • olive --help works as expected.
  • auto-opt:
    • The command chooses a set of passes compatible with the provided model type, precision and accelerator information.
    • New options to split a model into components, using either --num-splits or --cost-model.
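
A minimal sketch of how the new splitting options might be used with auto-opt; only --num-splits and --cost-model come from these notes, while the remaining flags, the model name, and the cost-model file are illustrative assumptions.

    # Split the optimized model into a fixed number of components.
    # Flags other than --num-splits/--cost-model are illustrative assumptions.
    olive auto-opt \
        --model_name_or_path microsoft/Phi-3-mini-4k-instruct \
        --device gpu \
        --precision int4 \
        --num-splits 2 \
        --output_path models/phi3-auto-opt

    # Or let auto-opt infer the split points from a cost model (hypothetical file).
    olive auto-opt \
        --model_name_or_path microsoft/Phi-3-mini-4k-instruct \
        --device gpu \
        --precision int4 \
        --cost-model cost_model.csv \
        --output_path models/phi3-auto-opt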

Improvements

  • ExtractAdapters:
    • Support LoRA adapter nodes in Stable Diffusion UNet or text-embedding models.
    • Add default initializers for quantized adapters so the model can run without adapter inputs.
  • GPTQ:
    • Avoid saving unused bias weights (all zeros).
    • Set use_exllama to False by default to allow exporting and fine-tuning external GPTQ checkpoints.
  • AWQ: Patch autoawq to run quantization on newer transformers versions.
  • SharedCache operations are now atomic.
  • New CaptureSplitInfo and Split passes to split models into components. The number of splits can be provided by the user or inferred from a cost model (see the sketch after this list).
  • disable_search has been deprecated in pass configurations within an Olive workflow config.
  • OrtSessionParamsTuning has been reworked to use Olive's search features.
  • OrtModelOptimizer has been renamed to OrtPeepholeOptimizer, along with some bug fixes.
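
As a rough sketch, the new split passes could also be wired into a workflow config and run from the CLI; the pass type names CaptureSplitInfo and Split come from these notes, but field names such as num_splits and the surrounding config shape are illustrative assumptions rather than a verified schema.

    # split_workflow.json (illustrative contents; only the pass names
    # CaptureSplitInfo and Split are taken from these notes):
    #
    # {
    #     "input_model": { "type": "HfModel", "model_path": "microsoft/Phi-3-mini-4k-instruct" },
    #     "passes": {
    #         "capture_split_info": { "type": "CaptureSplitInfo", "num_splits": 2 },
    #         "split": { "type": "Split" }
    #     },
    #     "output_dir": "models/phi3-split"
    # }

    # Run the workflow with the Olive CLI.
    olive run --config split_workflow.json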

Examples

  • Stable Diffusion: new MultiLora example.
  • Phi3: new int quantization example using nvidia-modelopt.