doc: fix some grammar errors
caozhuozi committed Dec 24, 2024
1 parent 5f26421 commit 4645c6e
Showing 1 changed file with 13 additions and 17 deletions: docs/spec.md

# Model Format Specification

The specification defines an open standard for packaging and distributing Artificial Intelligence models as OCI artifacts, adhering to [the OCI image specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification).

The goal of this specification is to outline a blueprint and enable the creation of interoperable solutions for packaging and retrieving AI/ML models by leveraging the existing OCI ecosystem, thereby facilitating efficient model management, deployment, and serving in cloud-native environments.

## Use Cases

* An OCI Registry can store and manage AI/ML model artifacts, with model versions, metadata, and parameters retrievable and displayable.
* A Data Scientist can package models and upload them to a registry, facilitating collaboration with MLOps Engineers while simplifying the deployment process to efficiently bring models into production.
* A Model Serving/Deployment Platform can understand the AI/ML model format, configuration, and other details, identify the required server runtime (as well as startup parameters, necessary resources, etc.), and serve the model in Kubernetes by [mounting it directly as a volume source](https://kubernetes.io/blog/2024/08/16/kubernetes-1-31-image-volume-source/) without the need to pre-download it in an init container or bundle it within the server runtime container, as sketched below.
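To make the volume-source use case concrete, the following is a minimal, hypothetical Pod sketch that mounts a model artifact via the Kubernetes `image` volume source described in the linked blog post. The registry references, image names, and mount path are placeholders rather than values defined by this specification.

```yaml
# A minimal sketch, not part of this specification: the image references, mount
# path, and model artifact reference are placeholders. The `image` volume source
# requires the ImageVolume feature gate (alpha in Kubernetes v1.31).
apiVersion: v1
kind: Pod
metadata:
  name: model-server
spec:
  containers:
    - name: runtime
      image: registry.example.com/serving/runtime:latest  # placeholder serving runtime image
      volumeMounts:
        - name: model
          mountPath: /models/demo  # the model artifact's layers are exposed here read-only
  volumes:
    - name: model
      image:
        reference: registry.example.com/models/demo:v1  # placeholder model artifact reference
        pullPolicy: IfNotPresent
```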

## Overview

The model specification follows the [OCI image format specification](https://github.com/opencontainers/image-spec/blob/main/spec.md#image-format-specification) and leverages its [guidelines for artifact usage](https://github.com/opencontainers/image-spec/blob/main/manifest.md#guidelines-for-artifact-usage) for AI/ML model storage and distribution.

![manifest](./img/manifest.svg)
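As a rough, non-normative sketch of how such a manifest could be laid out under those guidelines: the `artifactType` value, the config media type, and all digests, sizes, and annotation values below are illustrative placeholders, not values defined by this excerpt; only the layer media types come from the list later in this document.

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.cnai.model.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.cnai.model.config.v1+json",
    "digest": "sha256:<config-digest>",
    "size": 1024
  },
  "layers": [
    {
      "mediaType": "application/vnd.cnai.model.layer.v1.tar",
      "digest": "sha256:<weights-digest>",
      "size": 4294967296
    },
    {
      "mediaType": "application/vnd.cnai.model.doc.v1.tar",
      "digest": "sha256:<docs-digest>",
      "size": 8192
    },
    {
      "mediaType": "application/vnd.cnai.model.config.v1.tar",
      "digest": "sha256:<model-config-digest>",
      "size": 4096
    }
  ],
  "annotations": {
    "org.opencontainers.image.title": "<model-name>"
  }
}
```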


Implementations MUST support at least the following media types:

- `application/vnd.cnai.model.layer.v1.tar`: The layer is a tarball that contains the model weight file. If the model has multiple weight files, they should be packaged into separate layers (see the example after this list).
- `application/vnd.cnai.model.layer.v1.tar+gzip`: The layer is a tarball that contains the model weight file and is compressed with gzip. If the model has multiple weight files, they should be packaged into separate layers. However, it is recommended to package model weight files without compression so that the container runtime does not need to decompress the model layer: if the weight files are compressed, the container runtime will spend a long time decompressing them.
- `application/vnd.cnai.model.doc.v1.tar`: The layer is a tarball that contains the model documentation files, such as README.md, LICENSE, etc.
- `application/vnd.cnai.model.config.v1.tar`: The layer is a tarball that contains the model configuration files, such as config.json, tokenizer.json, generation_config.json, etc.
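As an illustration of the separate-layer rule for weight files, a model whose weights are split across two files might be represented by two uncompressed weight layers. This is a non-normative sketch: the shard file names, digests, and sizes are placeholders, and using the per-layer `org.opencontainers.image.title` annotation to record the original file name is just one possible convention, not something mandated by this excerpt.

```json
[
  {
    "mediaType": "application/vnd.cnai.model.layer.v1.tar",
    "digest": "sha256:<shard-1-digest>",
    "size": 4831838208,
    "annotations": { "org.opencontainers.image.title": "model-00001-of-00002.safetensors" }
  },
  {
    "mediaType": "application/vnd.cnai.model.layer.v1.tar",
    "digest": "sha256:<shard-2-digest>",
    "size": 4831838208,
    "annotations": { "org.opencontainers.image.title": "model-00002-of-00002.safetensors" }
  }
]
```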