Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STAN scaler does not scale from 0 #519

Closed
cwoolum opened this issue Dec 18, 2019 · 21 comments
Closed

STAN scaler does not scale from 0 #519

cwoolum opened this issue Dec 18, 2019 · 21 comments
Labels
bug Something isn't working

Comments

@cwoolum
Copy link
Contributor

cwoolum commented Dec 18, 2019

A clear and concise description of what the bug is.

Expected Behavior

When there messages in the queue and the HPA is scaled down to 0, the scaler should create at least one instance of a pod to create the subscription with STAN.

Actual Behavior

When the loop runs, The logic checks if the pending count of messages is greater than 0 and that the queue names match. It never takes into account if the subscription does not exist.

Steps to Reproduce the Problem

  1. Create a new ScaledObject that can scale to 0
apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  name: test-scaledobject
  labels:
    deploymentName: test
spec:
  pollingInterval: 10   # Optional. Default: 30 seconds
  cooldownPeriod: 30   # Optional. Default: 300 seconds
  minReplicaCount: 0   # Optional. Default: 0
  maxReplicaCount: 5  # Optional. Default: 100  
  scaleTargetRef:
    deploymentName: test
  triggers:
  - type: stan
    metadata:
      natsServerMonitoringEndpoint: "stan.default:8222"
      queueGroup: "place_indexer"
      durableName: "ImDurable"
      subject: myQueue
      lagThreshold: "100" 
  1. cURL the STAN channel endpoint. It will return a response like:
{
  "name": "myQueue",
  "msgs": 500,
  "bytes": 0,
  "first_seq": 0,
  "last_seq": 0,
  "subscriptions": []
  1. Keda will ignore the msgs field because there are no active subscriptions.
  2. If minReplicaCount is set to 1, scaling works normally.

Specifications

  • Version: 1.0.0
  • Platform: Linux
  • Scaler(s): STAN
@cwoolum cwoolum added the bug Something isn't working label Dec 18, 2019
@balchua
Copy link
Contributor

balchua commented Dec 21, 2019

Thanks @cwoolum for spotting this one. Its strange that I didn't see this one, sure tested it will min replicas 0. Anyway thanks!

@balchua
Copy link
Contributor

balchua commented Dec 24, 2019

I think i know how i missed this one. My initial deployment has a replica of 1. So atleast there is a registered subscriber. Once you get past that it can scale from 0 afterwards.

@cwoolum
Copy link
Contributor Author

cwoolum commented Dec 24, 2019

If it scales to zero though, the subscription will go away if you are draining and closing the connection. Even if you don't drain and close, the subscription will drop off after a short period(5 minutes?) And that prevents it from scaling back up.

@balchua
Copy link
Contributor

balchua commented Dec 24, 2019

I was testing it with this program https://github.com/balchua/gonuts. Subscriber seems to stick around.

@cwoolum
Copy link
Contributor Author

cwoolum commented Dec 24, 2019

https://github.com/balchua/gonuts/blob/c0ce3d6448fc1dafeaf5695136d008a04a696873/sub/main.go#L150

You are calling sc.Close() but not sc.Unsubscribe() when your subscriber app ends.

If your service scales up to 20 instances, you'll have 20 subscribers in STAN. Now say you scale back down to 1, each of those instance need to unsubscribe so that STAN doesn't keep trying to deliver messages to them, right?

I could be thinking about this the wrong way though.

@balchua
Copy link
Contributor

balchua commented Dec 25, 2019

Thanks for the explanation.👍 I remember setting the maxinflight to 1 so it round robins it to different subscribers. If the original subscriber goes down, i think stan will try to deliver it to the same subscriber until certain period of time. But i maybe wrong here too.

@cwoolum
Copy link
Contributor Author

cwoolum commented Dec 25, 2019

I do use MaxInFlight but set it to 50 since each one of my pods can handle that number of concurrent requests.

I saw that one of the newer versions of STAN/NATS does have a keepalive that it sends periodically to make sure a subscriber is still listening but I'm not sure of the relationship between a connection being open and a subscription existing for it. I'm going to open up an issue in the STAN repo and see if the team can provide some best practices on it.

@cwoolum
Copy link
Contributor Author

cwoolum commented Dec 25, 2019

I was able to find the docs for this. It looks like I was doing it wrong. For durable subscriptions, you don't want to Unsubscribe but only Close the connection.

The main difference between a queue subscription and a durable one, is that when the last member leaves the group, the state of the group will be maintained by the server. Later, when a member rejoins the group, the delivery will resume.
Note: For a durable queue subscription, the last member to  unsubscribe  (not simply close) causes the group to be removed from the server.

I'll try removing the Unsubscribe call from my code and see how that affects it.

@balchua
Copy link
Contributor

balchua commented Dec 26, 2019

@cwoolum thanks for checking the docs. Did it work for your when you removed the Unsubscribe?

@cwoolum
Copy link
Contributor Author

cwoolum commented Dec 27, 2019

I see the subscriptions are still there but I'm not sure if the messages are getting delivered successfully now. I'm seeing a large number of pending_count on the pods that were taken down as part of the scale down.

{
  "name": "place.inserted",
  "msgs": 653382,
  "bytes": 358537710,
  "first_seq": 1,
  "last_seq": 653382,
  "subscriptions": [
    {
      "client_id": "xxxplaceinserter-5595885df6-v57vq",
      "inbox": "_INBOX.AQ77TV2TOF1DVJUUW109X8",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD3HS",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653379,
      "pending_count": 0,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-zf6js",
      "inbox": "_INBOX.ZQCLS4F06A4AMM1V78KYZ8",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD3Qt",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653382,
      "pending_count": 110,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-tjwzt",
      "inbox": "_INBOX.SD4GQBUAJY4L8SPRIUYNGH",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD4Dx",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653382,
      "pending_count": 110,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-2cd2v",
      "inbox": "_INBOX.XAERJTARSEMGR9N3ZUHZ1J",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD44W",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653382,
      "pending_count": 110,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-mv9fr",
      "inbox": "_INBOX.7PWDZ4ZLPWOA311VJKKC6V",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD3yr",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653376,
      "pending_count": 110,
      "is_stalled": false
    },
    {
      "client_id": "xxxplaceinserter-7b79867b48-xzz2s",
      "inbox": "_INBOX.DSA21NRUGOLG1DBW8CIP52",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfD40k",
      "queue_name": "UpdatedDurable:place_inserter",
      "is_durable": true,
      "is_offline": false,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 653378,
      "pending_count": 110,
      "is_stalled": false
    }
  ]
}

@cwoolum
Copy link
Contributor Author

cwoolum commented Dec 27, 2019

After the scale down occurred, the number of subscribers dropped to 1. I'm not sure the time it took for that to happen though.

I opened an issue to ask the best practices for this.

nats-io/nats-streaming-server#1001

@balchua
Copy link
Contributor

balchua commented Dec 27, 2019

As far as i know stan will direct the pending messages to available subscribers. But how long before it takes that decision.

@cwoolum
Copy link
Contributor Author

cwoolum commented Dec 31, 2019

I'm going to keep testing to make sure none of the messages are getting stuck. I keep running into the error Operation cannot be fulfilled on scaledobjects.keda.k8s.io \"mytest-scaledobject\": the object has been modified; please apply your changes to the latest version and try again. I've tried to delete both the HPA and ScaledObject multiple times but it keeps occurring. Any idea?

@balchua
Copy link
Contributor

balchua commented Jan 1, 2020

When this error shows up does it stop scaling? Care to show what kubectl get hpa looks like? And also your scaledobject yaml.
Thanks!

@zroubalik
Copy link
Member

@cwoolum could you please decribe steps leading to this error + more details on ScaledObject and HPA. thanks

@cwoolum
Copy link
Contributor Author

cwoolum commented Jan 3, 2020

Okay, the modified issue was due to an old CPU based HPA conflicting with the Keda Scaler but I was able to recreate the issue with the pod not scaling from 0. Here are the requested items.

apiVersion: keda.k8s.io/v1alpha1
kind: ScaledObject
metadata:
  creationTimestamp: '2020-01-03T06:52:29Z'
  finalizers:
    - finalizer.keda.k8s.io
  generation: 3
  labels:
    deploymentName: turnouttplaceindexer
  name: turnouttplaceindexer-scaledobject
  namespace: api
spec:
  cooldownPeriod: 30
  maxReplicaCount: 5
  minReplicaCount: 0
  pollingInterval: 10
  scaleTargetRef:
    deploymentName: turnouttplaceindexer
  triggers:
    - metadata:
        durableName: UpdatedDurable
        lagThreshold: '100'
        natsServerMonitoringEndpoint: 'stan.default:8222'
        queueGroup: place_indexer
        subject: place.updated
      type: stan
status:
  externalMetricNames:
    - lagThreshold
  lastActiveTime: '2020-01-03T15:56:05Z'
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: keda-hpa-turnouttplaceindexer
  namespace: api
  selfLink: >-
    /apis/autoscaling/v2beta1/namespaces/api/horizontalpodautoscalers/keda-hpa-turnouttplaceindexer
  uid: 8a0bd442-13df-43db-9897-0b96ffae7e4f
  resourceVersion: '25270090'
  creationTimestamp: '2020-01-03T15:52:05Z'
  ownerReferences:
    - apiVersion: keda.k8s.io/v1alpha1
      kind: ScaledObject
      name: turnouttplaceindexer-scaledobject
      uid: 8f9f9d7f-022a-4f95-ba70-c30cb155337c
      controller: true
      blockOwnerDeletion: true
spec:
  scaleTargetRef:
    kind: Deployment
    name: turnouttplaceindexer
    apiVersion: apps/v1
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: External
      external:
        metricName: lagThreshold
        metricSelector:
          matchLabels:
            deploymentName: turnouttplaceindexer
        targetAverageValue: '100'
status:
  currentReplicas: 0
  desiredReplicas: 0
  currentMetrics: null
  conditions:
    - type: AbleToScale
      status: 'True'
      lastTransitionTime: '2020-01-03T15:52:20Z'
      reason: SucceededGetScale
      message: the HPA controller was able to get the target's current scale
    - type: ScalingActive
      status: 'False'
      lastTransitionTime: '2020-01-03T15:57:12Z'
      reason: ScalingDisabled
      message: scaling is disabled since the replica count of the target is zero
    - type: ScalingLimited
      status: 'True'
      lastTransitionTime: '2020-01-03T15:55:40Z'
      reason: TooFewReplicas
      message: >-
        the desired replica count is increasing faster than the maximum scale
        rate

Here is the response from STAN:

{
  "name": "place.updated",
  "msgs": 1000000,
  "bytes": 131655679,
  "first_seq": 1752831,
  "last_seq": 2752830,
  "subscriptions": [
    {
      "client_id": "turnouttplaceindexer-68c66f7545-dhwqc",
      "inbox": "_INBOX.6Z7TV1D0B84HNLMDPIM8IW",
      "ack_inbox": "_INBOX.INczX1devhkKAX12cfuiaw",
      "queue_name": "UpdatedDurable:place_indexer",
      "is_durable": true,
      "is_offline": true,
      "max_inflight": 100,
      "ack_wait": 60,
      "last_sent": 2751730,
      "pending_count": 0,
      "is_stalled": false
    }
  ]
}

And here are the logs from KEDA

2020-01-03T15:54:24.630417316Z {"level":"info","ts":1578066864.6303,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
2020-01-03T15:54:24.730667858Z {"level":"info","ts":1578066864.7304275,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"scaledobject-controller"}
2020-01-03T15:54:24.830825399Z {"level":"info","ts":1578066864.8306682,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"scaledobject-controller","worker count":1}
2020-01-03T15:54:25.156227482Z {"level":"info","ts":1578066865.1561325,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:54:25.156241982Z {"level":"info","ts":1578066865.1561651,"logger":"controller_scaledobject","msg":"Detecting ScaleType from ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:54:25.156247382Z {"level":"info","ts":1578066865.1561697,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:54:25.181735746Z {"level":"info","ts":1578066865.1815946,"logger":"controller_scaledobject","msg":"Updated HPA according to ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject","HPA.Namespace":"api","HPA.Name":"keda-hpa-turnouttplaceindexer"}

2020-01-03T15:57:10.682897812Z {"level":"info","ts":1578067030.6824272,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:57:10.684467522Z {"level":"info","ts":1578067030.682496,"logger":"controller_scaledobject","msg":"Detecting ScaleType from ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:57:10.684494823Z {"level":"info","ts":1578067030.6825025,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject"}
2020-01-03T15:57:10.695899496Z {"level":"info","ts":1578067030.6957793,"logger":"controller_scaledobject","msg":"Updated HPA according to ScaledObject","Request.Namespace":"api","Request.Name":"turnouttplaceindexer-scaledobject","HPA.Namespace":"api","HPA.Name":"keda-hpa-turnouttplaceindexer"}
2020-01-03T15:57:10.718530941Z {"level":"info","ts":1578067030.7182388,"logger":"scalehandler","msg":"Successfully scaled deployment to 0 replicas","ScaledObject.Namespace":"api","ScaledObject.Name":"turnouttplaceindexer-scaledobject","ScaledObject.ScaleType":"deployment","Deployment.Namespace":"api","Deployment.Name":"turnouttplaceindexer"}

There are messages pending for this subscription but Keda doesn't seem to be picking them up. The pods do call close on the NATS and STAN connections but do not Unsubscribe.

@balchua
Copy link
Contributor

balchua commented Jan 3, 2020

@cwoolum i think i know what the issue is, the stan scaler actually uses the channels' msg attribute and subtract that using the subscriber's last_sent. Unfortunately last_sent is a sequence number.
Looking at the STAN response you provided, msg=1000000 and last_sent=2751730 the resulting pending lag will be negative.
I think the better approach is to use the channel information's last_seq instead of msg.
Wdyt?

@cwoolum
Copy link
Contributor Author

cwoolum commented Jan 6, 2020

I think that makes sense. I created PR #533 to make the change. It also adds additional logging to help diagnose issues.

@balchua
Copy link
Contributor

balchua commented Jan 7, 2020

@cwoolum does PR #519 addresses this issue?

@cwoolum
Copy link
Contributor Author

cwoolum commented Jan 10, 2020

Yes, I just did a full deploy using the master image tag and scale up/down is working for me.

@cwoolum cwoolum closed this as completed Jan 10, 2020
@balchua
Copy link
Contributor

balchua commented Jan 10, 2020

Thanks @cwoolum

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants