Allow HPA minReplicas other than 1 while still scaling to 0 #1838

Closed
philipp94831 opened this issue May 29, 2021 · 53 comments · Fixed by #1958
Labels
feature-request (All issues for new features that have not been committed to), needs-discussion

Comments

@philipp94831

Proposal

We currently have a scenario where we need to scale our deployments to 0 or have at least, e.g., 5 replicas available. We were not able to achieve this by overriding the HPA scaling behavior. In the code, the HPA's minReplicas is always set to 1 if we want to scale to 0.

Therefore, we propose to be able to explicitly configure HPA minReplicas.

Use-Case

We are running Kafka Streams apps on Kubernetes and automatically scale them using KEDA. If there is no message lag, we can safely scale to 0. However, when processing messages, we use Kafka Streams state stores, which are loaded into the memory of our pods. Because the resources of a single pod are insufficient, we replicate our deployment, which distributes the state and requires fewer resources per pod.
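
For illustration, a minimal sketch of the kind of ScaledObject we use (deployment name, topic, consumer group, and threshold are placeholder values):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: streams-app
spec:
  scaleTargetRef:
    name: streams-app
  pollingInterval: 30
  cooldownPeriod: 300
  minReplicaCount: 0
  maxReplicaCount: 10
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka:9092
      consumerGroup: streams-app
      topic: input-topic
      lagThreshold: "100"

With this, KEDA scales the deployment to 0 when there is no lag, but as soon as work arrives the first step up is a single replica rather than the 5 we need.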

Anything else?

No response

philipp94831 added the feature-request and needs-discussion labels May 29, 2021
@tomkerkhove
Member

Ha, interesting. So basically you want to have a minimum of 5 replicas, but when there is no work you want to scale to 0. So basically decoupling the minimum replica count from scaling to 0.

Is that a good summary?

@philipp94831
Author

@tomkerkhove That is exactly what we want! :-)

@zroubalik
Member

That's a viable ask, are you willing to contribute this?

@philipp94831
Author

That's a viable ask, are you willing to contribute this?

I am not experienced with developing Go/Kubernetes modules, so if you had the time to do it, I'd prefer that. However, I will have a look at it if you aren't able to work on it in the near future.

@zroubalik
Member

Unfortunately I can't promise anything :) So let's see.

@philipp94831
Author

Do you have some hints maybe? Is it as simple as introducing a new config option and using it here if configured?

@zroubalik
Member

It is not just that, but the loop logic needs to be modified as well.

@philipp94831 OK, so ping me in two weeks and I'll let you know if I am able to implement this in the near future.

@ChayanBansal

ChayanBansal commented Jun 12, 2021

Hey @zroubalik, can you please point out the loop where modifications need to be made, along with some of the changes needed here? I would be glad if I could contribute. Hoping it's not too complex! :)

@zroubalik
Member

@ChayanBansal it might not be needed, but we need to be sure that the loop in here is working correctly with the new feature:

func (e *scaleExecutor) RequestScale(ctx context.Context, scaledObject *kedav1alpha1.ScaledObject, isActive bool) {

I mean setting the right number of replicas, etc. I've only briefly checked that, and it should work as it is implemented currently, but I am not sure.

Would be really great if you can contribute this! Thanks :)

@philipp94831
Author

Hi @ChayanBansal, I really appreciate that you would like to contribute to this! Do you think you will be able to implement this? If so, how long do you estimate it to take?

@philipp94831
Author

It is not just that, but the loop logic needs to be modified as well.

@philipp94831 OK, so ping me in two weeks and I'll let you know if I am able to implement this in the near future.

Hi @zroubalik, do you think you will be able to implement this in the near future?

@zroubalik
Member

@philipp94831 I will give it a try :)

zroubalik self-assigned this Jun 24, 2021
@zroubalik
Member

I am thinking about how we should expose this configuration in the ScaledObject spec, with regard to naming and UX. A scaleToZero boolean field?

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name:  target
  pollingInterval: 30                        
  cooldownPeriod:  300                            
  scaleToZero: true/false
  minReplicaCount: 5                     
  maxReplicaCount: 100  

@philipp94831 @tomkerkhove WDYT?

@philipp94831
Author

I like that :-) I assume that scaleToZero will be true by default? Or at least if minReplicaCount is 0? So that nothing changes for users who used minReplicaCount: 0.

@zroubalik
Member

Yeah, exactly. Let's wait and see what Tom's opinion is :)

@zroubalik
Member

The only concern I have is whether this field should be "hidden" in the spec.advanced section.

@tomkerkhove
Member

Advanced or so might make sense, since by default we should indeed recommend scaling to zero.

What about allowScaleToZero or useScaleToZero though? Otherwise it's not clear what the intent is, maybe? (I think things through too much 😂)

@zroubalik
Member

What about alwaysScaleToZero ?

@tomkerkhove
Member

Doesn't that imply it will always be scaled to 0? 😁

@zroubalik
Member

Fair enough 🤷‍♂️ 😅 But you can look at it this way: always scale to zero, even if minReplicas is > 0 ...

I don't know 😄

@svmaris

svmaris commented Jun 24, 2021

I think the new allowScaleToZero or scaleToZero boolean in the spec will cause confusion and allow for weird combinations. What if minReplicas is set to 0, but scaleToZero is false?

Wouldn't it be better to keep the meaning of minReplicas: 0 to be "allow scale to zero" and introduce a new field that specifies the minimum number of replicas you want to have running when there's actually work to be done? Like for example initialReplicas / defaultReplicas / baselineReplicas ?

@zroubalik
Member

zroubalik commented Jun 24, 2021

I think the new allowScaleToZero or scaleToZero boolean in the spec will cause confusion and allow for weird combinations. What if minReplicas is set to 0, but scaleToZero is false?

The ScaledObject will be validated and an error will be returned in this case. And with regard to the confusion, that's why I want to move it to the spec.advanced section, with solid documentation.

Wouldn't it be better to keep the meaning of minReplicas: 0 to be "allow scale to zero" and introduce a new field that specifies the minimum number of replicas you want to have running when there's actually work to be done? Like for example initialReplicas / defaultReplicas / baselineReplicas ?

I was thinking about a similar approach using initialReplicas, but then we will need to cover cases where minReplicas != 0, for example:

minReplicas = 2
initialReplicas = 5 
maxReplicas = 10

And IMHO that's way more complex logic that would need to be changed in KEDA, but I am checking that right now.

@tomkerkhove
Member

allowScaleToZero or scaleToZero are fine for me

@zroubalik
Member

zroubalik commented Jun 24, 2021

So the question is, whether we want to support:

  1. Scale to zero with minReplicas > 0:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name:  target                
  minReplicaCount: 5                     
  maxReplicaCount: 100  
  advanced:
    (always)scaleToZero: true/false
  2. Arbitrary minReplicas with the initialReplicas approach, allowing more flexibility:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name:  target                
  minReplicaCount: 2
  initialReplicaCount: 5                   
  maxReplicaCount: 100  

or

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name:  target                
  minReplicaCount: 0
  initialReplicaCount: 5                   
  maxReplicaCount: 100  

I am curious whether use case 2 is something that would be used by anyone.

@svmaris

svmaris commented Jun 24, 2021

I think baselineReplicaCount or baseReplicaCount is a better name, because otherwise you might assume that the number of replicas will drop below 5 if there's work to do (but not enough to warrant 5 replicas), which is not what we want to achieve here. My point was mainly about not changing the meaning of minReplicaCount, because minReplicaCount: 5 with scaleToZero: true sounds like a contradiction.

@philipp94831
Author

Hi @zroubalik, any update on this? :-)

@tomkerkhove
Member

I think he was mainly waiting for me but I forgot to circle back on this, sorry about that!

If we take a step back, what we want to achieve is defining how many instances we want to have when there is no work to be done.

I think baseline/base is a bit too ambiguous but initial implies that it is only at the beginning. What about idle? Although that might be odd as well to some.

@svmaris

svmaris commented Jul 7, 2021

If we take a step back, what we want to achieve is defining how many instances we want to have when there is no work to be done.

IMHO we already have that in the form of minReplicaCount. The use case here is that we want to define the minimum number of replicas when there is work to be done. But maybe this use case is a bit too exotic to support directly.

I was thinking: Would it be possible to take the use case described in #692, which is probably a bit more "generic" (scale down to 0 when there's no work, but scale up immediately when there's work to be done, but not enough to reach the threshold) and make it so that it supports the use case described in this issue as well? Perhaps by defining a replicaCountIncrement that defines by how many replicas it should scale up at the same time?

apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: my-kafka-processor-keda
spec:
  scaleTargetRef:
    deploymentName: my-kafka-processor
  pollingInterval: 30
  minReplicaCount: 0
  replicaCountIncrement: 5
  maxReplicaCount: 10
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: awesome.servicebus.windows.net:9093
      consumerGroup: my-kafka-worker
      topic: my-topic
      lagThreshold: "100"
      startThreshold: "1"

The above basically means:

  • Scale to zero when there's no work to be done
  • Scale up when there's at least 1 message in the queue
  • Scale up by increments of 5 replicas at a time

The replicaCountIncrement could be used to make sure there's a replica running in each availability zone for example (using anti-affinity rules).

@philipp94831
Author

The above basically means:

  • Scale to zero when there's no work to be done
  • Scale up when there's at least 1 message in the queue
  • Scale up by increments of 5 replicas at a time

While scaling up by increments of 5 would allow us to immediately scale from 0 to 5, it would also mean that the next step would be 10 replicas, whereas we would prefer to then scale normally, i.e., to 6 as the next step.

@zroubalik
Member

The above basically means:

  • Scale to zero when there's no work to be done
  • Scale up when there's at least 1 message in the queue
  • Scale up by increments of 5 replicas at a time

While scaling up by increments of 5 would allow us to immediately scale from 0 to 5, it would also mean that the next step would be 10 replicas, whereas we would prefer to then scale normally, i.e., to 6 as the next step.

Yeah, that's a different use case completely.

@zroubalik
Member

zroubalik commented Jul 7, 2021

I think he was mainly waiting for me but I forgot to circle back on this, sorry about that!

If we take a step back, what we want to achieve is defining how many instances we want to have when there is no work to be done.

I think baseline/base is a bit too ambiguous but initial implies that it is only at the beginning. What about idle? Although that might be odd as well to some.

@tomkerkhove it is not about the number of instances when there is no work to be done, that's clear -> minReplicaCount

It is about the initial number of replicas (if min is 0) that should be set when there is some work to do, so instead of scaling to 1 it should scale, for example, to 5.

@tomkerkhove
Member

not about the number of instances when there is no work to be done, that's clear -> minReplicaCount

That's if you think about how it is today, but if you ignore that and think purely conceptually, I think we have it wrong today and idle feels better. But it is what it is.

@svmaris

svmaris commented Jul 7, 2021

That's if you think about how it is today, but if you ignore that and think purely conceptually, I think we have it wrong today and idle feels better. But it is what it is.

Are you proposing to add a new field idleReplicaCount that defaults to the value of minReplicaCount when not defined and "changes" the meaning of minReplicaCount when it is defined, but has a different value? So that:

  minReplicaCount: 5
  maxReplicaCount: 20

means "Scale to 5 when there's no work" (idleReplicaCount is set to 5 implicitly in this case)

and

  idleReplicaCount: 0
  minReplicaCount: 5
  maxReplicaCount: 20

means "Scale to 0 when there's no work, but scale to a minimum of 5 when there is work" ?

@zroubalik
Member

zroubalik commented Jul 7, 2021

Yeah, the other proposal is not to change the meaning of minReplicaCount, something like this:

  minReplicaCount: 0
  initialReplicaCount: 5
  maxReplicaCount: 20

Or the new property could be named baseReplicaCount as suggested above.

@tomkerkhove
Member

That's if you think about how it is today, but if you ignore that and think purely conceptually, I think we have it wrong today and idle feels better. But it is what it is.

Are you proposing to add a new field idleReplicaCount that defaults to the value of minReplicaCount when not defined and "changes" the meaning of minReplicaCount when it is defined, but has a different value? So that:

  minReplicaCount: 5
  maxReplicaCount: 20

means "Scale to 5 when there's no work" (idleReplicaCount is set to 5 implicitly in this case)

and

  idleReplicaCount: 0
  minReplicaCount: 5
  maxReplicaCount: 20

means "Scale to 0 when there's no work, but scale to a minimum of 5 when there is work" ?

This sounds like what I had in mind as well, +1 on this

@svmaris

svmaris commented Jul 7, 2021

This sounds like what I had in mind as well, +1 on this

I agree that this approach is easier to explain than initialReplicaCount/baseReplicaCount. I also think there's no real reason to allow idleReplicaCount to be anything other than 0, so maybe the scaleToZero: true/false (that defaults to false) that @zroubalik originally proposed isn't so bad after all 😉

All in all I do think this use case is a bit too specific/exotic to warrant a toggle in the spec. Another wild idea would be to allow for dynamic values in minReplicaCount and maxReplicaCount. For example, Argo Workflows is using Go templates + the Sprig library to do just this. If the value of the scaler is exposed as a variable, you can set the minReplicaCount to 0 if there's no work and to 5 if there is.
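
For illustration only, a purely hypothetical sketch of what such templated values might look like; this syntax is not supported by KEDA and the .ScalerValue variable is invented:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name: target
  minReplicaCount: "{{ if gt .ScalerValue 0.0 }}5{{ else }}0{{ end }}"
  maxReplicaCount: 20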

@tomkerkhove
Member

This sounds like what I had in mind as well, +1 on this

I agree that this approach is easier to explain than initialReplicaCount/baseReplicaCount. I also think there's no real reason to allow idleReplicaCount to be anything other than 0, so maybe the scaleToZero: true/false (that defaults to false) that @zroubalik originally proposed isn't so bad after all 😉

I do not agree since you might want to always have 1 instance up and running rather than 0 to have faster scaling/uptime. A simple scenario is cluster capacity running out and you want to scale from 0 to 1 but you can't since the cluster autoscaler still needs to kick in so you have to wait, which could have platform impact.

@zroubalik
Member

So should I go with idleReplicaCount right in the spec root and then change the meaning of minReplicaCount a little bit?

@tomkerkhove
Member

I would introduce replicas with 3 fields underneath it and deprecate the current ones, but that might be overkill.

@zroubalik
Member

I would introduce replicas with 3 fields underneath it and deprecate the current ones, but that might be overkill.

That's overkill imho and this would break 99.9% of currently deployed and defined ScaledObjects. An optional idleReplicaCount, on the other hand, doesn't break anything: if it is not defined, the behaviour is still the same.

@tomkerkhove
Member

It would be backwards compatible though, but just keeping them flat is fine for me for now as well.

@zroubalik
Member

@tomkerkhove should idleReplicaCount live in the .spec root next to minReplicaCount & maxReplicaCount, or in .spec.advanced?

@tomkerkhove
Member

I'd say keep them next to the rest; in our next major version we'll move all of them to replicas.xyz?
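
For reference, a sketch of how a ScaledObject spec would then look with the flat fields (values are illustrative):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-so
spec:
  scaleTargetRef:
    name: target
  idleReplicaCount: 0
  minReplicaCount: 5
  maxReplicaCount: 100

Here idleReplicaCount is the replica count while no trigger is active, and minReplicaCount becomes the floor once there is work to do.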

@philipp94831
Author

Thank you so much @zroubalik 🙏

@zroubalik
Member

Thank you so much @zroubalik 🙏

@philipp94831 you can give it a try if you want and give me feedback; just use images with the :main tag.

@philipp94831
Author

philipp94831 commented Jul 26, 2021

Thank you so much @zroubalik 🙏

@philipp94831 you can give it a try if you want and give me feedback; just use images with the :main tag.

Hi @zroubalik, due to vacation we only now had time to test it. We switched our KEDA installation to the :main tag, but we are getting the following error when creating a ScaledObject: error validating data: ValidationError(ScaledObject.spec): unknown field "idleReplicaCount" in sh.keda.v1alpha1.ScaledObject.spec. Do you know what causes this?

@zroubalik
Member

@philipp94831 you need to install the new CRD as well, so your k8s cluster knows about the updated ScaledObject: https://github.com/kedacore/keda/blob/main/config/crd/bases/keda.sh_scaledobjects.yaml

@philipp94831
Author

@philipp94831 you need to install the new CRD as well, so your k8s cluster knows about the updated ScaledObject: https://github.com/kedacore/keda/blob/main/config/crd/bases/keda.sh_scaledobjects.yaml

@zroubalik Ah yes. We install KEDA using Helm. Is the chart also published with a main version, or do we have to manually apply the definition?

@zroubalik
Member

@philipp94831 You will have to manually update the CRD; the rest of the resources can come from the Helm chart.

@philipp94831
Author

@zroubalik Everything works perfectly and as expected, thanks again!

@zroubalik
Member

@philipp94831 glad to hear that!

@philipp94831
Author

@zroubalik Do you have any timeline on making a stable release?

@zroubalik
Member

A few weeks at maximum.
