Use subpath for OCI Models #411

Merged
merged 1 commit into from
Nov 5, 2024

Member

@rhatdan rhatdan commented Nov 4, 2024

Adding subpath=/models to the Mount command in quadlet.

Currently model-cars store the AI model in the /models subdirectory, but standard model-raw images store it in /.

Changing both model-cars and model-raw to use the same /models directory allows quadlets to support either model type.

The big difference between the two is that model-cars include executables with the image, whereas a model-raw is just the model.

Also found a crash when running ramalama list with images that have no tag.

Improved tests to check model type flags.
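
As an illustration of the quadlet change described above, here is a minimal sketch of how such a Mount entry could be assembled. It assumes a Podman release whose image mounts accept a `subpath` option; the helper name `quadlet_mount_line` and its defaults are hypothetical and not taken from the ramalama source.

```python
def quadlet_mount_line(image: str,
                       destination: str = "/mnt/models",
                       subpath: str = "/models") -> str:
    """Build a quadlet Mount= entry exposing only `subpath` of an OCI image.

    Hypothetical helper: the name and the defaults are assumptions made
    for illustration, not ramalama's actual code.
    """
    options = [
        "type=image",
        f"source={image}",
        f"destination={destination}",
        f"subpath={subpath}",
    ]
    return "Mount=" + ",".join(options)


if __name__ == "__main__":
    print(quadlet_mount_line("quay.io/rhatdan/tiny-car:latest"))
```

Because both image types keep the model under /models after this change, the same entry would apply to a model-car and a model-raw image alike.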

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
COPY --from=builder /mnt/models /
COPY {model} /{model_name}
COPY --from=builder /models /models
COPY {model} /models/{model_name}
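
The effect of the COPY lines above can be sketched as follows. This is a simplified illustration only: it omits the builder stage that appears in the diff, and the function names and the `base` parameter are hypothetical.

```python
MODEL_DIR = "/models"  # shared location for both image types after this change


def model_path(model_name: str) -> str:
    """Path of the model inside the image, regardless of image type."""
    return f"{MODEL_DIR}/{model_name}"


def containerfile(model: str, model_name: str, base: str = "scratch") -> str:
    """Render a simplified Containerfile (hypothetical helper).

    With base="scratch" the result resembles a model-raw image (model only);
    passing a runtime image as `base` resembles a model-car image.
    """
    return f"FROM {base}\nCOPY {model} {model_path(model_name)}\n"


if __name__ == "__main__":
    print(containerfile("granite.gguf", "granite.gguf"))
```

Either variant places the model at the same path, which is what lets one mount definition serve both.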
Member

Would it be possible to annotate this layer from the manifest, or the manifest.config?

Maybe this can be used when you want to pull only the "raw" portion, given a KServe Modelcar format.

i.e. you are pulling quay.io/my/modelcar, but in this [ramalama] context you are interested in pulling only the model layer(s).

wdyt?
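
To make the annotation idea concrete, here is a minimal sketch that operates on an OCI image manifest loaded as a Python dict. OCI layer descriptors may carry an optional `annotations` map; the annotation key used here is a made-up placeholder, and both function names are hypothetical.

```python
import copy

# Hypothetical annotation key, chosen only for this example.
MODEL_LAYER_ANNOTATION = "org.example.model.layer"


def annotate_model_layer(manifest: dict, digest: str) -> dict:
    """Return a copy of `manifest` with the layer `digest` marked as the model layer."""
    result = copy.deepcopy(manifest)
    for layer in result.get("layers", []):
        if layer.get("digest") == digest:
            layer.setdefault("annotations", {})[MODEL_LAYER_ANNOTATION] = "true"
            return result
    raise ValueError(f"no layer with digest {digest}")


def model_layers(manifest: dict) -> list:
    """Select the layer descriptors that carry the model-layer annotation."""
    return [
        layer for layer in manifest.get("layers", [])
        if layer.get("annotations", {}).get(MODEL_LAYER_ANNOTATION) == "true"
    ]


if __name__ == "__main__":
    manifest = {
        "schemaVersion": 2,
        "layers": [
            {"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
             "digest": "sha256:aaa", "size": 10},
            {"mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
             "digest": "sha256:bbb", "size": 20},
        ],
    }
    annotated = annotate_model_layer(manifest, "sha256:bbb")
    print([layer["digest"] for layer in model_layers(annotated)])
```

A client interested only in the model could then fetch just the descriptors returned by `model_layers` instead of the whole image.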

Member Author

You have studied this more than me, so I am fine with going along with what you want. Can we set up some time to discuss this today?

Member Author

I have working code to generate a new style k8s yaml file, but in a different local PR. It is hung up on the support for subPath in k8s.

Member

> You have studied this more than me, so I am fine with going along with what you want. Can we set up some time to discuss this today?

This week is Summit Connect Rome and my agenda is clogged, sorry; I would prefer async, or online from next week. In short, since the manifest is built locally, it would be helpful if the layer containing just the model were identified by an annotation, either at the manifest level or at the manifest.config level. wdyt?

> a new style k8s yaml file

Not sure what is meant here? 🤔 Do you have an example to share, please, to give more context / the idea?

Member Author

@rhatdan rhatdan Nov 5, 2024

$ ramalama --runtime=vllm serve --name MyGraniteModel --generate=kube oci://quay.io/rhatdan/tiny-car:latest
# Save the output of this file and use kubectl create -f to import
# it into Kubernetes.
#
# Created with ramalama-0.0.4
apiVersion: v1
kind: Deployment
metadata:
  labels:
    app: MyGraniteModel
  name: MyGraniteModel
spec:
  containers:
  - name: MyGraniteModel
    image: quay.io/ramalama/ramalama:latest
    command: ["vllm"]
    args: ['serve', '--port', '8080', 'quay.io/rhatdan/tiny-car:latest']
    ports:
    - containerPort: 8080
    volumeMounts:
    - mountPath: /mnt/models
      name: model
    - mountPath: /dev/dri
      name: dri
  volumes:
  - name: model
    image:
      reference: "quay.io/rhatdan/tiny-car:latest"
      pullPolicy: IfNotPresent
      subPath: "/models"
  - name: dri
    hostPath:
      path: /dev/dri

Member Author

Theoretically this kube.yaml could take your model.car design and allow it to be mounted directly into a container/pod. There would be no need to run a separate sleep process or to search through /proc/PID ... for the rootfs of the image.

Member

Right, KEP-4639, if complemented by subPath, will likely benefit KServe deployments in a similar way. For the time being, I think having the models in /models and linking, plus annotation (not in this PR), would work as an interim solution; wdyt?

Member Author

Yes, that is the goal. OpenShift should support the new way some time in Feb/March.

@rhatdan rhatdan merged commit 31002af into containers:main Nov 5, 2024
11 checks passed