Release Notes
Changes relative to 7.2 (including features from the 8.0b1 and 8.0b2 releases)
- Support for Latest Dependencies
  - Compatible with the latest `protobuf` Python package, which improves serialization latency.
  - Support for `torch 2.4.0`, `numpy 2.0`, and `scikit-learn 1.5`.
- Support for stateful Core ML models
  - Updates the converter to produce Core ML models with the State Type (a new type introduced in iOS 18 / macOS 15).
  - Adds a toy stateful attention example model to show how to use an in-place KV cache.
- Increased conversion support coverage for models produced by `torch.export`
  - Op translation support is at 56% parity with our mature `torch.jit.trace` converter.
  - Representative deep learning models (mobilebert, deeplab, edsr, mobilenet, vit, inception, resnet, wav2letter, emformer) are supported.
  - Representative foundation models (llama, stable diffusion) are supported.
  - Models quantized by `ct.optimize.torch` can be exported by `torch.export` and then converted.
- New Compression Features
  - `coremltools.optimize`
    - Support for compression at more granularities: blockwise quantization and grouped channelwise palettization.
    - 4-bit weight quantization and 3-bit palettization.
    - Support for joint compression modes (8-bit look-up tables for palettization, pruning + quantization/palettization).
    - Vector palettization by setting `cluster_dim > 1`, and palettization with per-channel scale by setting `enable_per_channel_scale=True`.
    - Experimental activation quantization (takes a W16A16 Core ML model and produces a W8A8 model).
  - API updates for `coremltools.optimize.coreml` and `coremltools.optimize.torch`
    - Support for some models quantized by `torchao` (including ops produced by torchao, such as `_weight_int4pack_mm`).
    - Support for more ops in the `quantized_decomposed` namespace, such as `embedding_4bit`.
- Support for new ops and bug fixes for existing ops
  - Compression-related ops: `constexpr_blockwise_shift_scale`, `constexpr_lut_to_dense`, `constexpr_sparse_to_dense`, etc.
  - Updates to the GRU op.
  - SDPA op `scaled_dot_product_attention`.
  - `clip` op.
- Updated the model loading API
  - Support for `optimizationHints`.
  - Support for loading specific functions for prediction.
- New utilities in `coremltools.utils`
  - `coremltools.utils.MultiFunctionDescriptor` and `coremltools.utils.save_multifunction`, for creating an mlprogram with multiple functions that can share weights.
  - `coremltools.models.utils.bisect_model` can break a large Core ML model into two smaller models of similar size.
  - `coremltools.models.utils.materialize_dynamic_shape_mlmodel` can convert a flexible-input-shape model into a static-input-shape model.
- Various other bug fixes, enhancements, cleanups, and optimizations
- Special thanks to our external contributors for this release: @sslcandoit @FL33TW00D @dpanshu @timsneath @kasper0406 @lamtrinhdev @valfrom @teelrabbit @igeni @Cyanosite