StatefulSet not scaled #1940

Closed
avdhoot opened this issue Jul 7, 2021 · 11 comments
Labels
bug Something isn't working stale All issues that are marked as stale due to inactivity

Comments

@avdhoot

avdhoot commented Jul 7, 2021

Report

KEDA does not scale the StatefulSet even though the trigger condition is met. In the attached screenshot, the KEDA metric is above the trigger threshold (75), but the StatefulSet has not scaled.

[Image: screenshot of the KEDA metric above the threshold of 75]

Expected Behavior

After the metric crosses the threshold, the StatefulSet should scale.

Actual Behavior

After the metric crosses the threshold, the StatefulSet does not scale.

Steps to Reproduce the Problem

Logs from KEDA operator

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: fluentd-logs
  namespace: fluent
spec:
  scaleTargetRef:
    name: fluentd-logs
    kind: StatefulSet
  pollingInterval: 30
  minReplicaCount: 2
  maxReplicaCount: 10
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          policies:
          - periodSeconds: 60
            type: Pods
            value: 1
        scaleDown:
          stabilizationWindowSeconds: 900
          policies:
          - periodSeconds: 300
            type: Pods
            value: 1
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus-infra.prometheus.svc.cluster.local:9090
      metricName: namespace_pod_name_container_name:container_cpu_usage_seconds_total:sum_rate
      query:  avg(namespace_pod_name_container_name:container_cpu_usage_seconds_total:sum_rate{namespace="fluent", container_name="fluentd"}) * 100
      threshold: '75'
Name:   keda-hpa-fluentd-logs
Namespace: fluent
Labels: app.kubernetes.io/managed-by=keda-operator
        app.kubernetes.io/name=keda-hpa-fluentd-logs
        app.kubernetes.io/part-of=fluentd-logs
        app.kubernetes.io/version=2.2.0
        scaledObjectName=fluentd-logs
Annotations:       <none>
CreationTimestamp: Wed, 07 Jul 2021 14:44:08 +0530
Reference:         StatefulSet/fluentd-logs
Metrics: ( current / target )
  "prometheus-http---prometheus-infra-prometheus-svc-cluster-local-9090-(namespace_pod_name_container_name-container_cpu_usage_seconds_total-sum_rate{namespace=\"fluent\", container_name=\"fluentd\"}" (target average value):  46 / 75
Min replicas: 2
Max replicas: 10
Behavior:
  Scale Up:
    Stabilization Window: 0 seconds
    Select Policy: Max
    Policies:
      - Type: Pods  Value: 1  Period: 60 seconds
  Scale Down:
    Stabilization Window: 900 seconds
    Select Policy: Max
    Policies:
      - Type: Pods  Value: 1  Period: 300 seconds
StatefulSet pods:   2 current / 2 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from external metric prometheus-http---prometheus-infra-prometheus-svc-cluster-local-9090-(namespace_pod_name_container_name-container_cpu_usage_seconds_total-sum_rate{namespace="fluent", container_name="fluentd"}(&LabelSelector{MatchLabels:map[string]string{scaledObjectName: fluentd-logs,},MatchExpressions:[]LabelSelectorRequirement{},})
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:           <none>

KEDA Version

2.2.0

Kubernetes Version

1.19

Platform

Amazon Web Services

Scaler Details

prometheus

Anything else?

I am not sure why the current value in the HPA is 46, when the external metrics API returns a higher value:

❯ kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/fluent/prometheus-http---prometheus-infra-prometheus-svc-cluster-local-9090-(namespace_pod_name_container_name-container_cpu_usage_seconds_total-sum_rate{namespace=\"fluent\", container_name=\"fluentd\"}" | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "prometheus-http---prometheus-infra-prometheus-svc-cluster-local-9090-(namespace_pod_name_container_name-container_cpu_usage_seconds_total-sum_rate{namespace=\"fluent\", container_name=\"fluentd\"}",
      "metricLabels": null,
      "timestamp": "2021-07-07T10:52:25Z",
      "value": "93"
    }
  ]
}
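The 46 shown by the HPA is consistent with the raw metric being split across the 2 current replicas. Below is a minimal sketch of that arithmetic, assuming the HPA reports this external metric as a per-pod average and that `value` is a plain integer string (Kubernetes quantities can also carry suffixes, which this sketch does not handle). `per_replica_average` is a hypothetical helper for illustration, not a KEDA or Kubernetes API, and the metric name is shortened.

```python
import json

# Payload shaped like the kubectl output above; the metric name is shortened.
payload = """
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "prometheus-example-metric",
      "metricLabels": null,
      "timestamp": "2021-07-07T10:52:25Z",
      "value": "93"
    }
  ]
}
"""


def per_replica_average(metric_list: dict, current_replicas: int) -> float:
    """Sum all returned metric values and divide by the current replica count.

    Assumption: every "value" is a plain integer string.
    """
    total = sum(int(item["value"]) for item in metric_list["items"])
    return total / current_replicas


metrics = json.loads(payload)
# With 2 replicas the per-pod figure is about half the raw value,
# which is below the threshold of 75 even though the raw value is above it.
print(per_replica_average(metrics, current_replicas=2))
```

The raw value changes between samples, so the figure in the HPA status will not match a later `kubectl get --raw` call exactly.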

@avdhoot avdhoot added the bug Something isn't working label Jul 7, 2021
@avdhoot avdhoot changed the title staefulset not scaled StatefulSet not scaled Jul 7, 2021
@zroubalik
Member

Is this really related to StatefulSet? What happens if you use Deployment?

From the HPA output, it seems that it does not need to scale. Are you sure that your query is correct?

@coderanger
Contributor

Also, if all you want to scale on is CPU usage, normal HPA objects (or the passthrough for HPA's cpu/memory metric scaling) might be a better fit than round-tripping through KEDA.

@avdhoot
Author

avdhoot commented Jul 9, 2021

@zroubalik Sorry for the confusion. I do not think it is related to StatefulSet. I can change the issue title if you want.

It looks like the HPA controller divides the current metric value by the current number of replicas (ref). We currently have 2 pods, so the current metric (91) is divided by 2, which gives about 46. The trigger threshold is 75, so the HPA never decides to scale.

If the above theory is right, I am not sure how people scale on aggregated metrics that are unrelated to the number of pods.

Please let me know if I am wrong; this is my attempt to understand why it happens.

@coderanger
That is plan B, but I wanted to understand why this is not working. In theory it should.

@coderanger
Contributor

All metrics computations can be in either Value or AverageValue mode. In general this is currently hard-coded per scaler based on the first use case it dealt with. There is a vague plan to make it configurable and more consistent overall but for now just check each scaler's code to see which mode it uses.
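To make the difference between the two modes concrete, here is a simplified sketch of the replica calculation described in the Kubernetes HPA algorithm documentation. It assumes the default tolerance of 0.1 and ignores min/max replica limits, scaling policies, and pod readiness. The function names are hypothetical and are not part of KEDA or Kubernetes.

```python
import math

TOLERANCE = 0.1  # assumed default HPA tolerance


def desired_replicas_average_value(total_metric: float, target_per_pod: float,
                                   current_replicas: int) -> int:
    """AverageValue mode: the metric is treated as a total shared by all pods."""
    ratio = total_metric / (target_per_pod * current_replicas)
    if abs(1.0 - ratio) <= TOLERANCE:
        return current_replicas  # close enough to target, no change
    return math.ceil(total_metric / target_per_pod)


def desired_replicas_value(metric: float, target: float,
                           current_replicas: int) -> int:
    """Value mode: the metric is compared to the target as-is."""
    ratio = metric / target
    if abs(1.0 - ratio) <= TOLERANCE:
        return current_replicas  # close enough to target, no change
    return math.ceil(ratio * current_replicas)


# Numbers from this issue: raw metric 93, threshold 75, 2 current replicas.
print(desired_replicas_average_value(93, 75, 2))  # stays at 2
print(desired_replicas_value(93, 75, 2))          # asks for 3
```

With the numbers from this issue, AverageValue mode keeps the StatefulSet at 2 replicas, while Value mode would request a third. This matters for queries such as `avg(...)`, which already return a per-pod figure that does not grow with the replica count.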

@zroubalik
Member

I think that it is pretty safe to say that all scalers are using AverageValue mode. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details

@zroubalik
Member

Just for a reference: #1314 for enabling Value mode for Rabbit MQ, but I'd rather see it done on a global level, as @coderanger mentioned.

@avdhoot
Author

avdhoot commented Jul 9, 2021

Thanks for confirming the behavior & providing ref. Any idea how you guys think it should get implemented?

@avdhoot
Author

avdhoot commented Jul 10, 2021

Suggestion: any thoughts on exposing MetricTargetType in scaler.metadata, like this?

  triggers:
  - type: prometheus
    metadata:
      serverAddress: 
      metricTargetType: AverageValue # default value can be Value
      metricName: 
      query:  
      threshold: 
     

@zroubalik
Member

zroubalik commented Jul 12, 2021

I would love to see some generic approach for all scalers.
So it might even be a new field next to the metadata section (similar to how this PR adds Fallback: https://github.com/kedacore/keda/pull/1910/files#diff-33506d72fc24194f1ac7ad0a8963c2f19a49c4a46b6d559c70f8f2b5c27d0837R110).

The only thing I am concerned about is the actual behavior when there are multiple triggers in one ScaledObject with mixed metric target types. For example, 2 triggers in a ScaledObject, the first using AverageValue and the second using Value.

So eventually we might need to configure this at the ScaledObject level and apply it to all triggers?

But I am not sure about this, and the topic needs some investigation.
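As background for the mixed-types question: the Kubernetes HPA documentation states that when several metrics are specified, a desired replica count is computed for each metric and the largest one is chosen. The sketch below illustrates that rule with one AverageValue trigger and one Value trigger. It omits the tolerance check and scaling policies for brevity. The `Trigger` class, the trigger names, and the second trigger's numbers are hypothetical, and whether KEDA should allow such a mix is exactly the open question above.

```python
import math
from dataclasses import dataclass


@dataclass
class Trigger:
    name: str
    target_type: str  # "Value" or "AverageValue"
    metric: float     # current raw metric value
    target: float     # configured threshold


def desired_for(trigger: Trigger, current_replicas: int) -> int:
    """Replica proposal for a single trigger (tolerance check omitted)."""
    if trigger.target_type == "AverageValue":
        return math.ceil(trigger.metric / trigger.target)
    if trigger.target_type == "Value":
        return math.ceil(trigger.metric / trigger.target * current_replicas)
    raise ValueError(f"unknown target type: {trigger.target_type}")


def desired_replicas(triggers: list, current_replicas: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Take the largest per-trigger proposal, then clamp to the allowed range."""
    proposal = max(desired_for(t, current_replicas) for t in triggers)
    return max(min_replicas, min(max_replicas, proposal))


triggers = [
    Trigger("cpu-average", "AverageValue", metric=93, target=75),  # proposes 2
    Trigger("cpu-absolute", "Value", metric=93, target=75),        # proposes 3
]
print(desired_replicas(triggers, current_replicas=2, min_replicas=2, max_replicas=10))
```

Under this rule each trigger is evaluated independently, so mixing target types is at least well defined at the HPA level: the trigger that asks for the most replicas wins.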

@stale

stale bot commented Oct 13, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale All issues that are marked as stale due to inactivity label Oct 13, 2021
@stale

stale bot commented Oct 20, 2021

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Oct 20, 2021