feat: Update to mcad v1.34.1 support and torchx 0.6.0 #790

Merged (1 commit) on Oct 11, 2023
14 changes: 7 additions & 7 deletions guidebooks/kubernetes/mcad/install/mcad.sh
@@ -1,17 +1,16 @@
WORKDIR=$(mktemp -d) && cd $WORKDIR

GITHUB=github.com
-#ORG=IBM
-ORG=starpit
+ORG=project-codeflare
REPO=multi-cluster-app-dispatcher
-#BRANCH=quota-management
-BRANCH=002
+BRANCH=v1.34.1
+LOCAL=multi-cluster-app-dispatcher
SUBDIR=deployment/mcad-controller

echo "Installing Advanced Pod Manager"

# sparse clone
-if [ -n "$BRANCH" ]; then BRANCHOPT="-b $BRANCH"; fi
+if [ -n "$BRANCH" ]; then BRANCHOPT="-b $BRANCH $LOCAL"; fi
(git clone -q --no-checkout --filter=blob:none https://${GITHUB}/${ORG}/${REPO}.git ${BRANCHOPT} && \
cd $REPO && \
git sparse-checkout set --cone $SUBDIR && git checkout ${BRANCH-master})
@@ -24,16 +23,17 @@ if [ -n "$CI" ]; then
RESOURCES="--set resources.limits.cpu=200m --set resources.requests.cpu=200m --set resources.limits.memory=750Mi --set resources.requests.memory=750Mi"
fi

-IMAGE=darroyo/mcad-controller
+IMAGE=quay.io/project-codeflare/mcad-controller
cd $REPO/$SUBDIR &&
helm upgrade --install --wait mcad . \
${KUBE_CONTEXT_ARG_HELM} ${RESOURCES} \
--namespace kube-system \
--set loglevel=4 \
--set image.repository=$IMAGE \
-    --set image.tag=quota-management-v1.29.40 \
+    --set image.tag=release-v1.34.1 \
--set image.pullPolicy=IfNotPresent \
--set configMap.name=mcad-controller-configmap \
--set configMap.quotaEnabled='"false"' \
--set configMap.preemptionEnabled='"true"' \
--set coscheduler.rbac.apiGroup="scheduling.sigs.k8s.io" \
--set coscheduler.rbac.resource="podgroups"
@@ -8,7 +8,7 @@ metadata:
component: ray-head
type: ray
ray-cluster-name: {{ .Values.clusterName }}
-appwrapper.mcad.ibm.com: {{ .Values.clusterName }}
+appwrapper.workload.codeflare.dev: {{ .Values.clusterName }}
app.kubernetes.io/name: {{ .Values.clusterName }}
app.kubernetes.io/instance: {{ .Values.clusterName }}
app.kubernetes.io/owner: {{ .Values.userName | default "unknown" }}
@@ -27,7 +27,7 @@ spec:
labels:
component: ray-head
type: ray
-appwrapper.mcad.ibm.com: {{ .Values.clusterName }}
+appwrapper.workload.codeflare.dev: {{ .Values.clusterName }}
app.kubernetes.io/name: {{ .Values.clusterName }}
app.kubernetes.io/instance: {{ .Values.clusterName }}
app.kubernetes.io/owner: {{ .Values.userName | default "unknown" }}
@@ -18,7 +18,7 @@ spec:
labels:
component: ray-worker
type: ray
-appwrapper.mcad.ibm.com: {{ .Values.clusterName }}
+appwrapper.workload.codeflare.dev: {{ .Values.clusterName }}
app.kubernetes.io/name: {{ .Values.clusterName }}
app.kubernetes.io/instance: {{ .Values.clusterName }}
app.kubernetes.io/owner: {{ .Values.userName | default "unknown" }}
@@ -19,7 +19,7 @@

{{- else }}
{{- if .Values.mcad.enabled }}
-apiVersion: mcad.ibm.com/v1beta1
+apiVersion: workload.codeflare.dev/v1beta1
kind: AppWrapper
metadata:
name: {{ .Values.clusterName }}
2 changes: 1 addition & 1 deletion guidebooks/ml/torchx/install/cli.md
@@ -32,5 +32,5 @@ if which pip3.10; then
fi
--8<-- "./activate.sh"
pip3 --version
-pip3 install "torchx[dev]==${TORCHX_PIP_VERSION}"
+pip3 install "torchx==${TORCHX_PIP_VERSION}" "kubernetes==${KUBERNETES_PIP_VERSION}"
```
6 changes: 5 additions & 1 deletion guidebooks/ml/torchx/install/path.md
@@ -1,5 +1,9 @@
```shell
-export TORCHX_PIP_VERSION=${TORCHX_PIP_VERSION-0.5.0}
+export TORCHX_PIP_VERSION=${TORCHX_PIP_VERSION-0.6.0}
```

```shell
export KUBERNETES_PIP_VERSION=${KUBERNETES_PIP_VERSION-18.20.0}
```
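
The version pins above rely on the shell's `${VAR-default}` expansion, which substitutes the default only when the variable is unset. The following is a minimal sketch of that behavior, not part of the PR; the helper names `unset_case`, `override_case`, `dash_case`, and `colon_case` are hypothetical and exist only for illustration.

```shell
# Sketch: how ${VAR-default} behaves (POSIX parameter expansion).

# Case 1: variable unset, so the default 0.6.0 applies.
unset TORCHX_PIP_VERSION
export TORCHX_PIP_VERSION=${TORCHX_PIP_VERSION-0.6.0}
unset_case=$TORCHX_PIP_VERSION

# Case 2: a caller pinned an older version beforehand, so it is kept.
TORCHX_PIP_VERSION=0.5.0
export TORCHX_PIP_VERSION=${TORCHX_PIP_VERSION-0.6.0}
override_case=$TORCHX_PIP_VERSION

# Case 3: an empty value counts as "set" for "-", but not for ":-".
TORCHX_PIP_VERSION=""
dash_case=${TORCHX_PIP_VERSION-0.6.0}     # stays empty
colon_case=${TORCHX_PIP_VERSION:-0.6.0}   # falls back to 0.6.0

echo "unset=$unset_case override=$override_case dash='$dash_case' colon=$colon_case"
```

One consequence worth noting: exporting an empty `TORCHX_PIP_VERSION` or `KUBERNETES_PIP_VERSION` before these steps would presumably produce a malformed `pip3 install` requirement such as `torchx==`, since the `-` form does not replace empty values.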

python 3.9.6 on macOS does not handle spaces in the venv path, and on
2 changes: 1 addition & 1 deletion guidebooks/ml/torchx/run/instance-label.md
@@ -1,5 +1,5 @@
```shell
-export TORCHX_INSTANCE_LABEL=appwrapper.mcad.ibm.com=${TORCHX_INSTANCE}
+export TORCHX_INSTANCE_LABEL=appwrapper.workload.codeflare.dev=${TORCHX_INSTANCE}
```
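
As a rough illustration of how the renamed label might be consumed, the sketch below builds the `key=value` string and composes a label-selector query with `kubectl get pods -l`. This is an assumption about usage rather than something the PR shows: the placeholder instance name `my-run` and the variable `selector_cmd` are hypothetical, and the command is only printed, not executed.

```shell
# Assumption: TORCHX_INSTANCE is set by an earlier step; "my-run" is a placeholder.
TORCHX_INSTANCE=${TORCHX_INSTANCE-my-run}

# The label key follows the new workload.codeflare.dev API group.
export TORCHX_INSTANCE_LABEL=appwrapper.workload.codeflare.dev=${TORCHX_INSTANCE}

# Compose (but do not run) a selector query; add a namespace flag as needed.
selector_cmd="kubectl get pods -l ${TORCHX_INSTANCE_LABEL}"
echo "$selector_cmd"
```

Resources still labeled with the old `appwrapper.mcad.ibm.com` key would not match this selector, so anything created before the upgrade would need to be queried with the old key.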

```shell