How to use ServingRuntime autoscaling? #374
Comments
@Jooho -- there may be a need to update our docs to clear up some of the confusion :-)
Thank you. In the meantime... can you tell me how to activate the serving runtimes autoscaling? :D
I'm sorry for causing confusion and thank you for providing the questions. I will answer each of the questions you have raised.
In order to enable HPA, you can add this annotation for the specific ServingRuntime.
Correct. If you want to enable HPA, you have to remove replicas from the ServingRuntime spec.
HPA uses a webhook, so you have to update all of the YAML files.
Additional comments: If you have further questions, please let me know.
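For anyone following along, the two manifest changes described above (add the annotation, drop replicas from the spec) could be sketched roughly like this. It is only a minimal Python sketch, not the project's own tooling. The exact annotation is not shown in this thread, so the key and value below are assumptions, and the enable_hpa helper and the example runtime name are hypothetical.

```python
import copy

# ASSUMPTION: the annotation key/value is not preserved in this thread;
# check the ModelMesh scaling docs for the one your version expects.
ASSUMED_HPA_ANNOTATION = ("serving.kserve.io/autoscalerClass", "hpa")


def enable_hpa(runtime, annotation=ASSUMED_HPA_ANNOTATION):
    """Return a copy of a ServingRuntime manifest prepared for HPA.

    Hypothetical helper: adds the given annotation and removes
    spec.replicas so the replica count is no longer fixed in the spec.
    """
    updated = copy.deepcopy(runtime)
    key, value = annotation
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[key] = value
    # Drop the fixed replica count if it is present.
    updated.get("spec", {}).pop("replicas", None)
    return updated


# Hypothetical ServingRuntime manifest, loaded as a plain dict.
runtime = {
    "apiVersion": "serving.kserve.io/v1alpha1",
    "kind": "ServingRuntime",
    "metadata": {"name": "example-runtime"},
    "spec": {"replicas": 2, "multiModel": True},
}

patched = enable_hpa(runtime)
```

The original dict is left untouched, so the patched copy can be compared against it before anything is applied to the cluster.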
@andreapairon -- when you get a chance to try it out, would you be willing to open a PR to update our docs?
@andreapairon No, it does not need a Knative installation, but the cluster has to support metrics.
We have to update the Go code for the latest Kubernetes versions: OpenShift v4.13 and K8s v1.26 no longer have the beta2 version of HPA (deprecated).
That's completed now with #403, but this issue is more about better documenting how to use autoscaling, right? Maybe we can either open a new issue re: improving autoscaling documentation, or repurpose/rename this one to better capture that.
Hi all,
what is the correct way to use HPA autoscaling on a ServingRuntime?
Should I remove the replicas property under spec?
Should I update all the YAML files involved in the "Enable HPA..." commit or use another version of the ModelMesh controller image?
It's not very easy to understand how to use the autoscaling from the scaling documentation page.