Releases: kubeflow/arena
v0.9.9
Release 0.9.9
Changed
- Update the SDK and Java SDK unit tests.
- Fix panic when a pod fails to start.
- Support setting the image pull policy for jobs.
- Support the new training type DeepSpeed.
- Fix evaluator node selector.
- Fix serve update creating duplicate env and toleration entries.
Please follow the Get started Guide to install.
v0.9.8
Release 0.9.8
Changed
- Support setting ttlAfterFinished for Cron tfjob.
- Add a Dockerfile for the DeepSpeed base image.
- Move policy v1beta1 to v1.
- Fix the evaluatejob job YAML in charts.
Please follow the Get started Guide to install.
v0.9.7
Release 0.9.7
Changed
- Support setting TTLSecondsAfterFinished in Builder.
Please follow the Get started Guide to install.
v0.9.6
Release 0.9.6
Changed
- Add ownerReference for configmap and tensorboard.
Please follow the Get started Guide to install.
v0.9.5
Release 0.9.5
Changed
- Add imagePullSecret and shareMemory for arena serve.
- Support TTLSecondsAfterFinished.
- Support TFJob StartingDeadlineSeconds.
- Support TFJob/PytorchJob ActiveDeadlineSeconds.
Please follow the Get started Guide to install.
v0.9.4
Release 0.9.4
Changed
- Fix serve update when limits is null.
- Fix arena serve update bug.
- Fix model serve args bug.
- Add toleration dedup for arena serve update.
Please follow the Get started Guide to install.
v0.9.3
Release 0.9.3
Changed
- Optimize arena submit etjob for spot.
- Fix arena serve update bug.
- Fix model serve args bug.
- Add toleration dedup for arena serve update.
Please follow the Get started Guide to install.
v0.9.2
Release 0.9.2
Fixed
- Fix serve triton bugs.
Changed
- Skip updating CRDs when upgrading arena-artifacts.
- Modify how Toleration is supported.
Added
- Update images and support clean all policy for tfjob.
- Support the submit parameters useHostNetwork, useHostIPC, and useHostPID.
- Support arena serve update.
- Support custom scheduler name.
Please follow the Get started Guide to install.
v0.9.1
Release 0.9.1
Fixed
- Fix the bug that caused pytorchjob with RDMA to fail to run.
- Fix the bug that displayed GPU core resources on nodes incorrectly.
- Fix the bug that added evaluator and tensorboard to the pod group.
Changed
- Refactor installation.
- Rename restful-serving to http-serving for deployment services.
- Optimize the operators to skip adding Completed jobs to the queue.
Added
- Adapt modeljob to Helm 3.
- Cron workload supports custom labels.
- Java SDK submits training job with --label.
- Add resource limits for tfjob.
- Add subpathexpr for job.
Please follow the Get started Guide to install.
v0.9.0
Release 0.9.0
Fixed
- Fix the bug in arena update serving with a specified kubeconfig.
- Fix the bug where evaluatejob status was not returned.
- Fix the bug where the default shell type was not set in the arena client.
- Fix the bug when installing arena while kubedl-operator already exists.
- Fix the mpi-operator crash.
Added
- Add command 'arena model' to support model profile/benchmark/optimize/evaluate before deployment.
- Mark 'arena evaluate' as deprecated, as it has been merged into 'arena model evaluate'.
- Upgrade git-sync image version to support git token.
- Upgrade arena java sdk to the latest version.
- Support executing shell commands with a custom shell type such as sh or bash.
- Support --clean-task-policy for mpijob.
- Add arena-artifacts to adapt k8s 1.22.
- Support prometheus url token.
- Upgrade the helm version to v3.7.2 and kube client version to v1.23.0.
Please follow the Get started Guide to install.