From 5c681a0c10ac7557393fd31b7813d19bea7a9ec5 Mon Sep 17 00:00:00 2001 From: Yi Chen Date: Tue, 9 Apr 2024 20:14:15 +0800 Subject: [PATCH] Update model manage documentation Signed-off-by: Yi Chen --- .../benchmark/benchmark_torchscript.md | 0 docs/model/analyze/index.md | 19 + .../optimize/optimize_torchscript.md | 0 .../profile/1-torchscript-profile-result.jpg | Bin .../profile/profile_torchscript.md | 0 docs/model/index.md | 344 +++++++++++++++++- mkdocs.yml | 2 + 7 files changed, 355 insertions(+), 10 deletions(-) rename docs/model/{ => analyze}/benchmark/benchmark_torchscript.md (100%) create mode 100644 docs/model/analyze/index.md rename docs/model/{ => analyze}/optimize/optimize_torchscript.md (100%) rename docs/model/{ => analyze}/profile/1-torchscript-profile-result.jpg (100%) rename docs/model/{ => analyze}/profile/profile_torchscript.md (100%) diff --git a/docs/model/benchmark/benchmark_torchscript.md b/docs/model/analyze/benchmark/benchmark_torchscript.md similarity index 100% rename from docs/model/benchmark/benchmark_torchscript.md rename to docs/model/analyze/benchmark/benchmark_torchscript.md diff --git a/docs/model/analyze/index.md b/docs/model/analyze/index.md new file mode 100644 index 000000000..6d1672f93 --- /dev/null +++ b/docs/model/analyze/index.md @@ -0,0 +1,19 @@ +# Model Analyze Guide + +Welcome to the Arena Model Analyze Guide! This guide covers how to use the `arena cli` to profile a model to find performance bottlenecks, how to use TensorRT to optimize inference performance, and how to benchmark a model to get inference metrics such as QPS, latency, and GPU usage. This page outlines the most common situations and questions that bring readers to this section. + +## Who should use this guide? + +After training, you may have one or more models. If you want to measure a model's performance and get guidance on optimizing it when the performance does not meet your requirements, this guide is for you.
We have included detailed usage instructions for managing model profile and optimize jobs. + +## Profile the model + +* How to [profile the pytorch torchscript module](profile/profile_torchscript.md). + +## Optimize the model + +* I want to [optimize the torchscript module with tensorrt](optimize/optimize_torchscript.md). + +## Benchmark the model inference + +* I want to [benchmark the torchscript inference performance](benchmark/benchmark_torchscript.md). diff --git a/docs/model/optimize/optimize_torchscript.md b/docs/model/analyze/optimize/optimize_torchscript.md similarity index 100% rename from docs/model/optimize/optimize_torchscript.md rename to docs/model/analyze/optimize/optimize_torchscript.md diff --git a/docs/model/profile/1-torchscript-profile-result.jpg b/docs/model/analyze/profile/1-torchscript-profile-result.jpg similarity index 100% rename from docs/model/profile/1-torchscript-profile-result.jpg rename to docs/model/analyze/profile/1-torchscript-profile-result.jpg diff --git a/docs/model/profile/profile_torchscript.md b/docs/model/analyze/profile/profile_torchscript.md similarity index 100% rename from docs/model/profile/profile_torchscript.md rename to docs/model/analyze/profile/profile_torchscript.md diff --git a/docs/model/index.md b/docs/model/index.md index 6d1672f93..c61792f4d 100644 --- a/docs/model/index.md +++ b/docs/model/index.md @@ -1,19 +1,343 @@ -# Model Analyze Guide +# Model Manage Guide -Welcome to the Arena Model Analyze Guide! This guide covers how to use the `arena cli` to profile the model to find performance bottleneck, and how to use tensorrt to optimize the inference performance, you can also benchmark the model to get inference metrics like qps, latency, gpu usage and so on. This page outlines the most common situations and questions that bring readers to this section. +Welcome to the Arena Model Manage Guide! This guide covers how to use the `arena model` subcommand to manage registered models and model versions.
This page outlines the most common situations and questions that bring readers to this section. -## Who should use this guide? +## Who Should Use this Guide? -After training you may get some models. If you want to know the model performance, and get some guidance to optimize the model if the performance is not meet you requirements, this guide is for you. we have included detailed usages for managing model profile and optimize job. +If you want to use arena to manage models, this guide is for you. We have included detailed usage instructions for managing models. -## Profile the model +## Prerequisites -* How to [profile the pytorch torchscript module](profile/profile_torchscript.md). +Arena now uses [MLflow](https://mlflow.org/) as its model registry backend, so you first need to run an MLflow tracking server with a database as its storage backend. See [MLflow Tracking Server](https://mlflow.org/docs/latest/tracking/server.html) for detailed information. -## Optimize the model +## Setup -* I want to [optimize the torchscript module with tensorrt](optimize/optimize_torchscript.md). +### Access MLflow Tracking Server In Non-proxied Mode -## Benchmark the model inference +To access the MLflow tracking server in non-proxied mode, set the `MLFLOW_TRACKING_URI` environment variable as follows: -* I want to [benchmark the torchscript inference performance](benchmark/benchmark_torchscript.md). +```shell +export MLFLOW_TRACKING_URI=http://: +``` + +Replace `` with the hostname or IP address of your MLflow tracking server, and `` with the port number on which the tracking server is listening. + +### Access MLflow Tracking Server In Proxied Mode + +If you run the MLflow tracking server within a Kubernetes cluster and do not set the `MLFLOW_TRACKING_URI` environment variable, then Arena will search for services named `ack-mlflow` or `mlflow` across all namespaces and create a model client proxied by the Kubernetes API server. If no such service is found, an error will be thrown.
If multiple services are found, the first one will be used. + +### Configure Basic Authentication + +When the MLflow tracking server is secured with basic authentication, set up the `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` environment variables to ensure that your MLflow client can authenticate with the tracking server successfully: + +```shell +export MLFLOW_TRACKING_USERNAME= +export MLFLOW_TRACKING_PASSWORD= +``` + +Remember to replace `` and `` with your actual username and password for the MLflow tracking server. + +
+ Warning
+ When accessing the MLflow tracking server in proxied mode, basic authentication is not supported because the API server proxy strips the Authorization HTTP header. +
+ +## Model Management + +### Create a Model Version + +```shell +$ arena model create \ + --name my-model \ + --tags key1,key2=value2 \ + --description "This is some description about my-model" \ + --version-tags key3,key4=value4 \ + --version-description "This is some description about my-model v1" \ + --source pvc://my-pvc/models/my-model/1 +INFO[0000] registered model "my-model" created +INFO[0000] model version 1 for "my-model" created +``` + +### Get a Registered Model or Model Version + +Get a registered model named `my-model`: + +```shell +$ arena model get \ + --name my-model +Name: my-model +LatestVersion 1 +CreationTime: 2024-04-09T19:53:15+08:00 +LastUpdatedTime: 2024-04-09T19:53:15+08:00 +Description: + This is some description about my-model +Tags: + createdBy: arena + key1: + key2: value2 +Versions: + Version Source + --- --- + 1 pvc://my-pvc/models/my-model/1 +``` + +Get model version `1` of registered model named `my-model`: + +```shell +$ arena model get \ + --name my-model \ + --version 1 +Name: my-model +Version: 1 +CreationTime: 2024-04-09T19:53:15+08:00 +LastUpdatedTime: 2024-04-09T19:53:15+08:00 +Source: pvc://my-pvc/models/my-model/1 +Description: + This is some description about my-model v1 +Tags: + createdBy: arena + key4: value4 + key3: +``` + +### List All Registered Models + +```shell +$ arena model list +NAME LATEST_VERSION LAST_UPDATED_TIME +my-model 1 2024-04-09T19:53:15+08:00 +``` + +### Update a Registered Model or Model Version + +Update registered model named `my-model`: + +```shell +$ arena model update \ + --name my-model \ + --description "This is some updated description" \ + --tags key1=updatedValue1,key2=updatedValue2 +INFO[0000] registered model "my-model" updated +``` + +Update version `1` of model named `my-model`: + +```shell +$ arena model update \ + --name my-model \ + --version 1 \ + --version-description "This is some updated description about version 1" \ + --version-tags key3=newValue3,key4=newValue4 +INFO[0000] 
model version "my-model/1" updated +``` + +To delete tags, append `-` to each tag as follows: + +```shell +$ arena model update \ + --name my-model \ + --tags key1-,key2=value2- \ + --version 1 \ + --version-tags key3-,key4=value4- +INFO[0000] registered model "my-model" updated +INFO[0000] model version "my-model/1" updated +``` + +This deletes the tags with keys `key1` and `key2` from the registered model named `my-model`, and the tags with keys `key3` and `key4` from model version `1`. + +### Delete a Registered Model or Model Version + +Delete a registered model named `my-model` with confirmation: + +```shell +$ arena model delete \ + --name my-model +Delete a registered model will cascade delete all its model versions. Are you sure you want to perform this operation? (yes/no) +yes +registered model "my-model" deleted +``` + +Or you can delete a registered model without confirmation by adding the `--force` flag: + +```shell +$ arena model delete \ + --name my-model \ + --force +registered model "my-model" deleted +``` + +Delete model version `1` of the registered model named `my-model` with confirmation: + +```shell +$ arena model delete \ + --name my-model \ + --version 1 +Are you sure you want to perform this operation? (yes/no) +yes +model version "my-model/1" deleted +``` + +Or you can delete a model version without confirmation by adding the `--force` flag: + +```shell +$ arena model delete \ + --name my-model \ + --version 1 \ + --force +model version "my-model/1" deleted +``` + +
+ Warning:
+ Deleting a registered model cascade-deletes all of its model versions, so do it carefully. +
+ +## Register a Model Version When Submitting a Training Job + +### Submit a Training Job + +When submitting a training job, you can register a model version at the same time by specifying the following flags: + +- `--model-name`: The name of the model to be registered. Upon successful submission of the training job, the model (if it doesn't exist) and a new model version will be created. +- `--model-source`: The model source is a URI that specifies the location of the model, for example `s3://my-bucket/path/to/model` or `pvc://namespace/pvc-name/path/to/model`. In the example below, the model produced by training is stored in the `/bloom-560m-sft` directory on the `training-data` PVC in the `default` namespace. + +```shell +$ arena submit pytorchjob \ + --name=bloom-sft \ + --gpus=1 \ + --image=registry.cn-hangzhou.aliyuncs.com/acs/deepspeed:v0.9.0-chat \ + --data=training-data:/model \ + --model-name=my-model \ + --model-source=pvc://default/training-data/bloom-560m-sft \ + "cd /model/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step1_supervised_finetuning && bash training_scripts/other_language/run_chinese.sh /model/bloom-560m-sft" +pytorchjob.kubeflow.org/bloom-sft created +INFO[0001] The Job bloom-sft has been submitted successfully +INFO[0001] You can run `arena get bloom-sft --type pytorchjob -n default` to check the job status +INFO[0001] registered model "my-model" created +INFO[0001] model version 1 for "my-model" created +``` + +### Get Information About the Training Job + +Querying the training job shows that it is associated with version `1` of the model named `my-model`: + +```shell +$ arena get bloom-sft +Name: bloom-sft +Status: PENDING +Namespace: default +Priority: N/A +Trainer: PYTORCHJOB +Duration: 37s +CreateTime: 2024-04-10 16:36:39 +EndTime: +ModelName: my-model +ModelVersion: 1 +ModelSource: pvc://default/training-data/bloom-560m-sft + +Instances: + NAME STATUS AGE IS_CHIEF GPU(Requested) NODE + ---- ------ --- -------- 
-------------- ---- + bloom-sft-master-0 Pending 37s true 1 N/A +``` + +### Get Information About the Model Version Associated with the Training Job + +```shell +$ arena model get \ + --name my-model \ + --version 1 +Name: my-model +Version: 1 +CreationTime: 2024-04-10T16:36:39+08:00 +LastUpdatedTime: 2024-04-10T16:36:39+08:00 +Source: pvc://default/training-data/bloom-560m-sft +Description: + arena submit pytorchjob \ + --data training-data:/model \ + --gpus 1 \ + --image registry.cn-hangzhou.aliyuncs.com/acs/deepspeed:v0.9.0-chat \ + --model-name my-model \ + --model-source pvc://default/training-data/bloom-560m-sft \ + --name bloom-sft \ + "cd /model/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step1_supervised_finetuning && bash training_scripts/other_language/run_chinese.sh /model/bloom-560m-sft" +Tags: + createdBy: arena + arena.kubeflow.org/uid: 3399d840e8b371ed7ca45dda29debeb1 + modelName: my-model +``` + +## Refer a Model Version When Submitting a Serving Job + +### Submit a Serving Job + +When submitting a serving job, you can associate it with a model by specifying `--model-name` and `--model-version` flags. It is necessary to ensure that the model used by the serving job is the one specified. 
+ +```shell +$ arena serve custom \ + --name=bloom-tgi-inference \ + --gpus=1 \ + --version=v1 \ + --replicas=1 \ + --restful-port=8080 \ + --data=training-data:/model \ + --model-name=my-model \ + --model-version=1 \ + --image=text-generation-inference:0.8 \ + "text-generation-launcher --disable-custom-kernels --model-id /model/bloom-560m-sft --num-shard 1 -p 8080" +service/bloom-tgi-inference-v1 created +deployment.apps/bloom-tgi-inference-v1-custom-serving created +INFO[0001] The Job bloom-tgi-inference has been submitted successfully +INFO[0001] You can run `arena serve get bloom-tgi-inference --type custom-serving -n default` to check the job status +``` + +### Get Information About the Serving Job + +Querying the serving job shows that it is associated with version `1` of the model named `my-model`: + +```shell +$ arena serve get bloom-tgi-inference +Name: bloom-tgi-inference +Namespace: default +Type: Custom +Version: v1 +Desired: 1 +Available: 0 +Age: 7s +Address: 172.16.166.93 +Port: RESTFUL:8080 +ModelName: my-model +ModelVersion: 1 +ModelSource: pvc://default/training-data/bloom-560m-sft + +Instances: + NAME STATUS AGE READY RESTARTS NODE + ---- ------ --- ----- -------- ---- + bloom-tgi-inference-v1-custom-serving-86cc9fb59c-dcxdp Pending 7s 0/1 0 +``` + +### Get Information About the Model Associated With the Serving Job + +```shell +$ arena model get \ + --name my-model \ + --version 1 +Name: my-model +Version: 1 +CreationTime: 2024-04-10T16:36:39+08:00 +LastUpdatedTime: 2024-04-10T16:36:39+08:00 +Source: pvc://default/training-data/bloom-560m-sft +Description: + arena submit pytorchjob \ + --data training-data:/model \ + --gpus 1 \ + --image registry.cn-hangzhou.aliyuncs.com/acs/deepspeed:v0.9.0-chat \ + --model-name my-model \ + --model-source pvc://default/training-data/bloom-560m-sft \ + --name bloom-sft \ + "cd /model/DeepSpeedExamples/applications/DeepSpeed-Chat/training/step1_supervised_finetuning && bash 
training_scripts/other_language/run_chinese.sh /model/bloom-560m-sft" +Tags: + createdBy: arena + arena.kubeflow.org/uid: 3399d840e8b371ed7ca45dda29debeb1 + modelName: my-model +``` diff --git a/mkdocs.yml b/mkdocs.yml index 095379915..59647a50c 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -9,6 +9,8 @@ nav: - User Guide: - Training Job Guide: training/index.md - Serving Job Guide: serving/index.md + - Model Manage Guide: model/index.md + - Model Analyze Guide: model/analyze/index.md - Display Resource Usage Guide: top/index.md - Supports Multiple Users Guide: multiple-users.md - Isolate Users In Namespace: isolate-users-in-namespace.md