Proposal: Kubeflow Serverless Serving CRD #2306
Comments
@cliveseldon would be great to get your input on this. |
@ellis-bigelow https://github.com/nuclio/nuclio |
@ellis-bigelow some initial thoughts from someone working on Seldon
|
@cliveseldon thanks for the input! I wonder if, at a high level, it would make sense to be able to drop in a "Kubeflow Service" for a "Seldon Model". We both want to support gRPC/REST, canarying, Istio routing, etc. It seems to me that your controller is responsible for instantiating the actual server infrastructure, so there would need to be some sort of inversion of control. Your controller would need to have some awareness of a Kubeflow service, but I imagine it wouldn't be too difficult for the graph to recognize it via annotations and route to it via Istio VirtualServices instead of creating the resources itself. Knative scales to zero by default, so it wouldn't bring up any resources until traffic starts to flow (i.e., being routed by Seldon). Does this seem like a workable pattern? Would this approach prevent any of Seldon's features like drift, outliers, transformations?
Responses to your points below.
#1) The Knative team explicitly removed some of the complexity from the PodTemplateSpec. I haven't seen data to give me particularly strong feelings either way, so I've been deferring to their judgement. What pieces of the PodTemplateSpec are you seeing being used by customers that aren't supported by the Knative API? Off the top of my head, volumes might be an issue. I'm all for enabling these by bypassing Knative with an admission controller.
#2) Seldon must have some interface that your deployments conform to. Can you outline that for me?
#3) We absolutely want to encapsulate some of these features into the CRD if possible. As with yours, it should be framework agnostic.
#4) The interface is at the Istio mesh. I'm architecturally agnostic to Ambassador vs Istio vs other ingress (though I've had performance issues with Ambassador in the recent past).
#5) Yep
#6) I think this is one of the most critical pieces. It may make even more sense for Seldon given that nodes in your graph will likely have very different scaling characteristics. |
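For illustration only, the canarying and Istio routing mentioned in the comment above could look roughly like the sketch below. It builds an Istio VirtualService manifest that splits traffic by weight between two destinations. The function name, resource name, and host names are hypothetical and come from neither Seldon nor the prototype; only the VirtualService field layout follows Istio's networking API.

```python
import json


def canary_virtual_service(name, host, stable_host, canary_host, canary_percent):
    """Build an Istio VirtualService dict that splits traffic by weight.

    All argument values are illustrative assumptions; nothing here is
    taken from the Seldon controller or the Kubeflow prototype.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": name},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    "route": [
                        # The stable destination receives the remaining traffic.
                        {
                            "destination": {"host": stable_host},
                            "weight": 100 - canary_percent,
                        },
                        # The canary destination receives canary_percent of traffic.
                        {
                            "destination": {"host": canary_host},
                            "weight": canary_percent,
                        },
                    ]
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = canary_virtual_service(
        name="example-model",                       # hypothetical
        host="example-model.default.example.com",   # hypothetical
        stable_host="example-model-stable",         # hypothetical
        canary_host="example-model-canary",         # hypothetical
        canary_percent=10,
    )
    print(json.dumps(manifest, indent=2))
```

A controller following the inversion-of-control idea would emit a manifest like this to route to an existing service instead of creating the serving resources itself.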
@ellis-bigelow Thanks for your detailed comments. To start with the more particular questions:
It would be great to get further feedback once you have looked at our architecture some more. In general, my present thinking is along your lines: the Seldon CRD Operator would create the underlying Knative Service CRDs if running in that environment. Feel free to connect with me directly on the Kubeflow/Seldon Slack for further quick chats. |
@cliveseldon that overview and those links really helped my understanding. Let's connect over Slack late this week or early next, and then follow up in here. I'd love to learn about:
|
Any further thoughts on this proposal? Keep it open or close it? |
What is Kubeflow's process for keeping proposals open versus closing them? I'm still coordinating with MSFT and Bloomberg engineers on this proposal. |
This proposal has now spawned the project: https://github.com/kubeflow/kfserving. Follow from there. |
Note: See the latest at https://github.com/kubeflow/kfserving
Hi Kubeflow Community,
I've been toying around with model serving on Knative. I initially prototyped a ksonnet component for serving models using arbitrary model servers and found it to be quite cumbersome, both from a development perspective and from a consumption perspective. The number of YAML files and if statements is non-trivial, and I thought there must be a better approach.
The high-level idea is that we should be able to distill all of the Kubernetes details down into a few ML-specific parameters and package that concept as a Kubernetes CRD. This CRD can serve as a building block for integration into things like ML Pipelines and Model Microservice Architectures, built on the backbone of Istio and Knative.
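As a rough illustration of what "a few ML-specific parameters" might look like, here is a minimal sketch that renders a small set of fields into a custom resource manifest. The API group `example.kubeflow.org`, the kind `ModelService`, and every field name are hypothetical placeholders chosen for this example; they are not the schema from the design doc or the prototype.

```python
import json
from dataclasses import dataclass


@dataclass
class ModelServiceSpec:
    """Hypothetical ML-specific parameters for a serving custom resource."""

    name: str
    framework: str          # e.g. "tensorflow"; illustrative only
    model_uri: str          # where the trained model artifact lives
    min_replicas: int = 0   # 0 allows scale-to-zero
    max_replicas: int = 1

    def to_manifest(self) -> dict:
        """Render the parameters as a Kubernetes-style manifest dict."""
        if self.min_replicas < 0 or self.max_replicas < max(self.min_replicas, 1):
            raise ValueError("need 0 <= min_replicas <= max_replicas and max_replicas >= 1")
        return {
            "apiVersion": "example.kubeflow.org/v1alpha1",  # hypothetical group/version
            "kind": "ModelService",                         # hypothetical kind
            "metadata": {"name": self.name},
            "spec": {
                "framework": self.framework,
                "modelUri": self.model_uri,
                "minReplicas": self.min_replicas,
                "maxReplicas": self.max_replicas,
            },
        }


if __name__ == "__main__":
    spec = ModelServiceSpec(
        name="flowers",
        framework="tensorflow",
        model_uri="gs://example-bucket/flowers",  # hypothetical location
    )
    print(json.dumps(spec.to_manifest(), indent=2))
```

The point of the sketch is that the user supplies only the model location, framework, and scaling bounds, while a controller would be responsible for expanding that into the underlying Knative and Istio resources.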
With that, I'll delegate the rest of the details to the doc:
https://docs.google.com/document/d/1_s8CYdhlrQRu4BX2m7adQhVt_OTr4WSZXgUY0Z77GzY
A prototype is available here:
https://github.com/ellis-bigelow/serving