Allow HPA minReplicas other than 1 while still scaling to 0 #1838
Ha, interesting. So basically you want to have a minimum of 5 replicas, but when there is no work you want to scale to 0, i.e. decoupling the min replica count from scale-to-zero. Is that a good summary?
@tomkerkhove That is exactly what we want! :-)
That's a viable ask, are you willing to contribute this?
I am not experienced with developing in Go/Kubernetes modules, so if you had the time to do it, I'd prefer that. However, I would have a look at it if you won't be able to work on it in the near future.
Unfortunately I can't promise anything :) So let's see.
Do you have some hints maybe? Is it as simple as introducing a new config option and using it here if configured?
It is not just that, the loop logic needs to be modified as well. @philipp94831 OK, so ping me in two weeks and I'll let you know if I am able to implement this in the near future.
Hey @zroubalik, can you please point out the loop where modifications need to be made, along with some changes here? I would be glad if I could contribute. Hoping it's not too complex! :)
@ChayanBansal it might not be needed, but we need to be sure that the loop in here is working correctly with the new feature:
I mean setting the right number of replicas etc. I've been very briefly checking that and it should work as it is implemented currently, but I am not sure. Would be really great if you can contribute this! Thanks :)
Hi @ChayanBansal, I really appreciate that you would like to contribute to this! Do you think you will be able to implement this? If so, how long do you estimate it will take?
Hi @zroubalik, do you think you will be able to implement this in the near future?
@philipp94831 I will give it a try :)
I am thinking about how we should expose this configuration in the ScaledObject spec, wrt naming and UX:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name: target
  pollingInterval: 30
  cooldownPeriod: 300
  scaleToZero: true/false
  minReplicaCount: 5
  maxReplicaCount: 100
```

@philipp94831 @tomkerkhove WDYT?
I like that :-) I assume that scaleToZero will be true by default? Or at least if
Yeah, exactly. Let's wait and see what Tom's opinion is :)
The only concern I have is whether this field should be "hidden" in
Advanced or so might make sense, since by default we should recommend scale to zero indeed. What about
What about
Doesn't that imply it will always be scaled to 0? 😁
Fair enough 🤷♂️ 😅 But you can look at it this way: always scale to zero, even if minReplicas is > 0 ... I don't know 😄
I think the new Wouldn't it be better to keep the meaning of
The ScaledObject will be validated and an error will be returned in this case. And wrt the confusion, that's why I want to move it to
I was thinking about a similar approach using
And IMHO that's way more complex logic that would need to be changed in KEDA, but I am checking that right now.
So the question is whether we want to support:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name: target
  minReplicaCount: 5
  maxReplicaCount: 100
  advanced:
    (always)scaleToZero: true/false
```

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name: target
  minReplicaCount: 2
  initialReplicaCount: 5
  maxReplicaCount: 100
```

or

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name: target
  minReplicaCount: 0
  initialReplicaCount: 5
  maxReplicaCount: 100
```

I am curious whether use case
I think
Hi @zroubalik, any update on this? :-)
I think he was mainly waiting for me but I forgot to circle back on this, sorry about that! If we take a step back on what we want to achieve, it is defining how many instances we want to have when there is no work to be done. I think
IMHO we already have that in the form of
I was thinking: would it be possible to take the use case described in #692, which is probably a bit more "generic" (scale down to 0 when there's no work, but scale up immediately when there's work to be done, but not enough to reach the threshold), and make it so that it supports the use case described in this issue as well? Perhaps by defining a `replicaCountIncrement`:

```yaml
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: my-kafka-processor-keda
spec:
  scaleTargetRef:
    deploymentName: my-kafka-processor
  pollingInterval: 30
  minReplicaCount: 0
  replicaCountIncrement: 5
  maxReplicaCount: 10
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: awesome.servicebus.windows.net:9093
        consumerGroup: my-kafka-worker
        topic: my-topic
        lagThreshold: "100"
        startThreshold: "1"
```

The above basically means:
The
While scaling up by increments of 5 would allow us to immediately scale from 0 to 5, it would also mean that the next step would be 10 replicas, while it would be preferable for us to still scale normally after that, i.e., 6 as the next step.
Yeah, that's a completely different use case.
@tomkerkhove it is not about the number of instances when there is no work to be done, that's clear -> It is about the initial number of replicas (if min is 0) that should be set when there is some work to do, so instead of scaling to 1 it should scale, for example, to 5.
That's if you think about how it is today, but if you ignore that and think purely conceptually, I think we have it wrong today and
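The distinction being drawn here can be made concrete with a small sketch. This is a hypothetical Python illustration of the proposed behavior (the helper and the `initial_replicas` parameter are assumptions for illustration only, not KEDA's actual reconciliation code); the scale-up step uses the standard HPA rule, desired = ceil(current × currentMetric / targetMetric):

```python
import math

# Hypothetical sketch of the proposed behavior, not KEDA's actual code.
# `initial_replicas` models the proposed initialReplicaCount field.
def desired_replicas(current, metric, target,
                     min_replicas, max_replicas, initial_replicas):
    if metric == 0:
        return 0  # no work at all: KEDA scales the workload to zero
    if current == 0:
        # Scale-from-zero jumps straight to the initial count...
        return min(initial_replicas, max_replicas)
    # ...after which the standard HPA formula takes over:
    # desired = ceil(current * currentMetric / targetMetric)
    desired = math.ceil(current * metric / target)
    return max(min_replicas, min(desired, max_replicas))

# From 0 with pending work: jump to 5 instead of 1.
print(desired_replicas(0, 120, 100, 0, 100, 5))  # 5
# From 5 with slightly more lag: scale to 6, not in increments of 5.
print(desired_replicas(5, 120, 100, 0, 100, 5))  # 6
```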
Are you proposing to add a new field, so that

```yaml
minReplicaCount: 5
maxReplicaCount: 20
```

means "Scale to 5 when there's no work", and

```yaml
idleReplicaCount: 0
minReplicaCount: 5
maxReplicaCount: 20
```

means "Scale to 0 when there's no work, but scale to a minimum of 5 when there is work"?
Yeah, the other proposal is not to change the meaning of `minReplicaCount`:

```yaml
minReplicaCount: 0
initialReplicaCount: 5
maxReplicaCount: 20
```

Or the new property could be named
This sounds like what I had in mind as well, +1 on this.
I agree that this approach is easier to explain than
All in all I do think this use case is a bit too specific/exotic to warrant a toggle in the spec. Another wild idea would be to allow for dynamic values in
I do not agree, since you might want to always have 1 instance up and running rather than 0, to have faster scaling/uptime. A simple scenario is cluster capacity running out: you want to scale from 0 to 1 but you can't, since the cluster autoscaler still needs to kick in, so you have to wait, which could have platform impact.
So should I go with the
I would introduce replicas with 3 fields underneath it and deprecate the current ones, but that might be overkilling it.
That's overkill imho, and this would break 99.9% of currently deployed and defined ScaledObjects. Optional
It would be backwards compatible though, but just flat is fine for me for now as well.
@tomkerkhove should
I'd say keep them next to the rest; in our next major version we'll move all of them to
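Putting the flat-field decision together, a ScaledObject using the `idleReplicaCount` name floated above might look like this (illustrative sketch only; per the validation note earlier in the thread, the idle count would have to be lower than `minReplicaCount`):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name: target
  idleReplicaCount: 0   # replica count while no trigger is active
  minReplicaCount: 5    # floor once there is work to do
  maxReplicaCount: 100
```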
Thank you so much @zroubalik 🙏
@philipp94831 you can give it a try if you want to and give me feedback, just use images with
Hi @zroubalik, due to vacation we only now had the time to test it. We switched to
@philipp94831 you need to install the new CRD as well, so your k8s cluster knows the updated ScaledObject: https://github.com/kedacore/keda/blob/main/config/crd/bases/keda.sh_scaledobjects.yaml
@zroubalik Ah yes. We install KEDA using Helm. Is the chart also published in a
@philipp94831 You will have to manually update the CRD; the rest of the resources can come from Helm.
@zroubalik Everything works perfectly and as expected, thanks again!
@philipp94831 glad to hear that!
@zroubalik Do you have any timeline on making a stable release?
A few weeks at maximum.
Proposal
We currently have a scenario where we need to scale our deployments to 0 or have at least, e.g., 5 replicas available. We were not able to achieve this by overriding the HPA scale behavior. In the code, the HPA minReplicas are always set to 1 if we want to scale to 0.
Therefore, we propose to be able to explicitly configure HPA minReplicas.
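As a rough sketch of the proposal (a hypothetical helper, not KEDA's actual code): the HPA that KEDA creates would take its minReplicas from the configured minReplicaCount instead of a hardcoded 1, while the transition between 0 and the minimum remains KEDA's own responsibility, since an HPA cannot manage 0 replicas itself:

```python
# Hypothetical illustration of the proposal, not KEDA's actual code.
def hpa_min_replicas(min_replica_count: int) -> int:
    # An HPA's minReplicas must be >= 1; the 0-replica state is
    # reached by KEDA scaling the workload down directly.
    return max(1, min_replica_count)

print(hpa_min_replicas(5))  # 5, instead of the hardcoded 1
print(hpa_min_replicas(0))  # 1, the HPA itself never manages 0
```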
Use-Case
We are running Kafka Streams apps on Kubernetes and automatically scale them using KEDA. If there is no message lag, we can safely scale to 0. However, when processing messages, we use Kafka Streams state stores, which are loaded into the memory of our pods. Because the resources of a single pod are insufficient, we replicate our deployment and thus distribute the state and require fewer resources per pod.
Anything else?
No response