- Move away from static GKE version and use RAPID release default.
0.4.0 - 2024-04-17
- Enable Autoprovisioning support.
- Support deletion of subnets when cluster is deleted.
- Integrate Vertex AI functionality to create Vertex AI Tensorboard and upload logs in Tensorboard directory to Vertex AI Tensorboard.
- Integrate Pathways.
- Add retry logic to Kueue, Jobset and cluster credential steps.
- Bump Kueue version to 0.6.1.
- Add XPK inspector.
0.3.0 - 2024-02-26
- Bump Jobset version to 0.3.2.
- Bump Kueue version to 0.6.0.
- Add single GPU support. Multislice and A3 GPU optimizations are in progress.
- Add CPU single and multislice support to XPK. Tested up to 1500 VMs.
- Fail workload creation early if the cluster doesn't have that resource type.
- Enable deletion of multiple workloads in parallel based on cluster name and filters.
- Add --enable-debug-logs to workload create to add debug logs to a workload.
- Support SIGTERM handling in xpk workload command, and propagate exit code from user jobs to cloud composer UI.
- Add sidecar container to display stack traces, with details in the README.
- Add label for xpk initiated TPU pods.
0.2.0 - 2023-12-07
- Add a check that the reservation exists, and provide help if the check fails
- Add error message and self-help instructions to the README for troubleshooting problems
- Add v5p support
- Add xpk cluster create flags for reservation/on-demand/spot
- Change GKE version to 1.28.3-gke.1286000
- Change cpu node pool defaults to be better adapted to demand
- Fix empty results from filter-by-status=QUEUED / FAILED / RUNNING
- Fix parallel execution of node pool commands (concurrent ops)
- Fix pip-changelog pointing to the wrong package
0.1.0 - 2023-11-17
- Initial release of xpk PyPI package