docs: add an introduction readme (#6)
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Showing 7 changed files with 73 additions and 8 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,37 @@ | ||
# CNAI Model Format Specification | ||
# CNAI Model Specification Proposal | ||
|
||
[![LICENSE](https://img.shields.io/github/license/CloudNativeAI/model-spec.svg?style=flat-square)](https://github.com/CloudNativeAI/model-spec/blob/main/LICENSE) | ||
[![GoDoc](https://godoc.org/github.com/CloudNativeAI/model-spec?status.svg)](https://godoc.org/github.com/CloudNativeAI/model-spec) | ||
|
||
The Cloud Native Artificial Intelligence (CNAI) Model Format Specification is a specification for a model format that is designed to be used in cloud native environments. | ||
The Cloud Native Artificial Intelligence (CNAI) Model Specification aims to provide a standard way to package, distribute and run AI models in a cloud native environment. | ||
|
||
For details, see the [specification](docs/v1/spec.md). | ||
## Rationale | ||
|
||
Looking back, infrastructure has evolved through distinct ages. The first was the machine-centric age, which gave rise to GNU/Linux and a boom of Linux distributions. Next came the virtual-machine-centric age, which brought the rise of cloud computing and the development of virtualization technologies. The third is the container-centric age, marked by the rise of container technologies like Docker and Kubernetes. The fourth, which has just begun, is the AI-model-centric age, where we expect a burst of technologies and projects around AI model development and deployment. | ||
|
||
![img](docs/img/infra-trends.png) | ||
|
||
Each of the new ages has brought new technologies and new ways of thinking. The container centric infrastructure has brought us the OCI image specification, which has become the standard for packaging and distributing software. The AI model centric infrastructure will bring us new ways of packaging and distributing AI models. The model specification is an attempt to define a standard to help package, distribute and run AI models in a cloud native environment. | ||
|
||
## Current Work | ||
|
||
There are two versions of specifications proposed, both of which are under development: | ||
|
||
* v1: The first version of the specification provides a compatible way to package and distribute models based on the current [OCI image specification](https://github.com/opencontainers/image-spec/) and [the artifacts guidelines](https://github.com/opencontainers/image-spec/blob/main/manifest.md#guidelines-for-artifact-usage). For compatibility reasons, it only contains part of the model metadata and handles model artifacts as opaque binaries. However, it provides a convenient way to package AI models in the container image format, and the resulting images can be used as [OCI volume sources](https://github.com/kubernetes/enhancements/issues/4639) in Kubernetes environments. | ||
* v2: The second version of the specification, still at an early stage, includes a model image specification and a model runtime specification. The model image specification packages models with details like model artifacts, metadata, configuration, and runtime environment. The model runtime specification defines how to run the packaged models in a cloud native environment. It builds a foundation for promoting AI models as a first-class citizen in the cloud native ecosystem, and lets users build once and run anywhere. | ||
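
As a rough illustration of the v1 approach, the sketch below builds an OCI image manifest that carries model weights as a single opaque layer. It is not taken from the specification: the `application/vnd.example.*` media types, the `descriptor` and `build_manifest` helpers, and the sample values are assumptions made up for this example. Only the manifest field names, the manifest media type, the empty-config media type, and the `org.opencontainers.image.title` annotation key come from the OCI image specification.

```python
import hashlib
import json

# Media types defined by the OCI image specification.
OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"
OCI_EMPTY = "application/vnd.oci.empty.v1+json"

# Hypothetical media types, invented for this illustration only.
ARTIFACT_TYPE = "application/vnd.example.model.v1"
WEIGHTS_LAYER = "application/vnd.example.model.weights.v1"


def descriptor(media_type: str, blob: bytes) -> dict:
    """Build an OCI content descriptor (media type, digest, size) for a blob."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


def build_manifest(weights: bytes, annotations: dict) -> dict:
    """Wrap opaque model bytes in an OCI image manifest."""
    return {
        "schemaVersion": 2,
        "mediaType": OCI_MANIFEST,
        "artifactType": ARTIFACT_TYPE,
        # The artifact guidelines allow an empty JSON config when none is needed.
        "config": descriptor(OCI_EMPTY, b"{}"),
        # The model is treated as an opaque binary layer.
        "layers": [descriptor(WEIGHTS_LAYER, weights)],
        "annotations": annotations,
    }


if __name__ == "__main__":
    manifest = build_manifest(
        b"fake-weights",
        {"org.opencontainers.image.title": "demo-model"},
    )
    print(json.dumps(manifest, indent=2))
```

Because the layer is opaque, a registry can store and serve it without understanding the model format, which is the compatibility trade-off the v1 description mentions.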
|
||
We consider the two versions incremental steps toward a standard model specification. The v1 specification is a simple and compatible way to package AI models in the container image format, while the v2 specification is a more comprehensive and cloud native way to package, distribute, and run AI models. | ||
|
||
For details, please see [the v1 specification](docs/v1/spec.md) and [the v2 specification introduction](docs/v2/intro.md). | ||
|
||
## LICENSE | ||
|
||
Apache 2.0 License. Please see [LICENSE](LICENSE) for more information. | ||
|
||
## Contributing | ||
|
||
Any feedback, suggestions, and contributions are welcome. Please feel free to open an issue or pull request. | ||
|
||
In particular, we look forward to integrating the model specification with different model registry implementations (like [Harbor](https://goharbor.io/) and [Kubeflow model registry](https://www.kubeflow.org/docs/components/model-registry/overview/)), as well as existing model centric infrastructure projects like [Kubeflow](https://www.kubeflow.org/), [ollama](https://github.com/ollama/ollama), [Huggingface](https://huggingface.co/), [Lepton](https://www.lepton.ai/), and others. | ||
|
||
Enjoy! |
File renamed without changes
File renamed without changes
File renamed without changes
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
# Model Specification Version 2 | ||
|
||
## Overview | ||
|
||
The core of the v2 model specification is the definition of the model artifact, metadata and runtime environment. | ||
|
||
The model artifact is a collection of files that represent the AI model. It consists of the model configuration, model weights, model tokenizer, and other model resources. | ||
|
||
The model metadata is general information about the model, such as the model name, version, model family, description, author, license, and architecture. A model registry can parse the model metadata to display the model information. | ||
|
||
The model runtime environment is the environment in which the model runs. It includes the inference engine information, such as version, configuration, dependencies, and environment variables. | ||
|
||
The model artifact, metadata and runtime environment are organized in a model manifest, which is a JSON file that describes the model. The model manifest is used to package and distribute the model, and can be stored in a model registry and downloaded by a model runtime. | ||
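
To make the three sections concrete, here is a minimal sketch of how such a manifest might be modeled and serialized to JSON. The v2 specification does not yet fix a schema, so every class name, field name, and sample value below is an assumption chosen to mirror the lists in the overview, not the actual format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelArtifact:
    # Hypothetical fields: paths to the files that make up the model.
    config: str
    weights: list[str]
    tokenizer: str
    resources: list[str] = field(default_factory=list)


@dataclass
class ModelMetadata:
    # Hypothetical fields mirroring the metadata listed in the overview.
    name: str
    version: str
    family: str
    description: str
    author: str
    license: str
    architecture: str


@dataclass
class RuntimeEnvironment:
    # Hypothetical fields describing the inference engine.
    engine: str
    engine_version: str
    configuration: dict[str, str] = field(default_factory=dict)
    dependencies: list[str] = field(default_factory=list)
    env: dict[str, str] = field(default_factory=dict)


@dataclass
class ModelManifest:
    artifact: ModelArtifact
    metadata: ModelMetadata
    runtime: RuntimeEnvironment

    def to_json(self) -> str:
        """Serialize the whole manifest as a JSON document."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def display_name(manifest_json: str) -> str:
    """What a registry might do: read only the metadata section."""
    meta = json.loads(manifest_json)["metadata"]
    return f"{meta['name']}:{meta['version']} ({meta['architecture']})"


example = ModelManifest(
    artifact=ModelArtifact("config.json", ["weights-0001.bin"], "tokenizer.json"),
    metadata=ModelMetadata(
        "demo-llm", "0.1.0", "demo", "Illustrative model",
        "example author", "Apache-2.0", "transformer",
    ),
    runtime=RuntimeEnvironment("example-engine", "1.0"),
)
```

Keeping the metadata in its own section means a registry can call something like `display_name(example.to_json())` without touching the artifact or runtime details.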
|
||
With a properly defined model specification, we can package the AI models in a model repository into a model image and push that image to a model registry. The model runtime can then pull and run the model image, either as a standalone package or as a read-only volume source in a container. | ||
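
The push and pull flow can be pictured with a toy, in-memory stand-in for a model registry. This is only a sketch under stated assumptions: `InMemoryRegistry`, its `push` and `pull` methods, and the `models/demo:v1` reference are hypothetical names, and a real registry would serve content over a network protocol rather than from a dictionary.

```python
import hashlib
import json


class InMemoryRegistry:
    """Toy content-addressed store standing in for a model registry."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}  # digest -> manifest bytes
        self._tags: dict[str, str] = {}     # reference -> digest

    def push(self, reference: str, manifest: dict) -> str:
        """Store a manifest under its digest and point the reference at it."""
        data = json.dumps(manifest, sort_keys=True).encode("utf-8")
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        self._tags[reference] = digest
        return digest

    def pull(self, reference: str) -> dict:
        """Resolve a reference, verify the stored bytes, and decode them."""
        digest = self._tags[reference]
        data = self._blobs[digest]
        if "sha256:" + hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("digest mismatch")
        return json.loads(data)


if __name__ == "__main__":
    registry = InMemoryRegistry()
    digest = registry.push("models/demo:v1", {"metadata": {"name": "demo"}})
    print(digest, registry.pull("models/demo:v1"))
```

Addressing content by digest lets the puller verify that what it received is exactly what was pushed, which matters when the same image is mounted read-only by many containers.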
|
||
## Goals | ||
|
||
The goals of developing the model specification are: | ||
|
||
* To provide a way for developers to package and distribute AI models in a cloud native environment. | ||
* To promote AI models as a first-class citizen and pave the way for the infrastructure to be organized around AI models. | ||
* To define a general model artifact, metadata, and runtime environment, so that the model can be easily understood and managed by any component of the infrastructure. | ||
* To define a general model format description to allow easy integration of models with model runtimes. | ||
|
||
## Non-Goals | ||
|
||
* To build standard interfaces for model management tools to build, distribute, manage, and run AI models. | ||
|
||
The model specification is designed to be a foundation for building standard interfaces to build, distribute, manage, and run AI models. However, the model specification itself does not define such interfaces. | ||
|
||
## Plans | ||
|
||
The model specification is still pretty rough. It is a living document and will evolve over time. Future work includes: | ||
|
||
* Figure out the details of the AI model artifact, metadata, and runtime environment. | ||
* Define a general transformer architecture abstraction so that LLMs can be built once and run everywhere. | ||
* Develop tools to build and save AI models in a model registry. | ||
* Develop tools to pull and run AI models in a model runtime. | ||
* Modify [vllm](https://github.com/vllm-project/vllm) to support the model specification and run any transformer-architecture LLM without modification. |