
KEDA Operator crash with SO option restoreToOriginalReplicaCount #2872

Closed
karimrut opened this issue Apr 1, 2022 · 6 comments · Fixed by #2881
Labels
bug Something isn't working

Comments

@karimrut

karimrut commented Apr 1, 2022

Report

We are having an issue with KEDA and the advanced option "restoreToOriginalReplicaCount".
If the SO is created with an existing deployment as "scaleTargetRef", there is no issue.

Sometimes the SO is created when the deployment no longer exists. Creating the SO by itself is not a problem, but if we delete the SO before any deployment has started to use it, keda-operator crashes with:
panic: runtime error: invalid memory address or nil pointer dereference

The last debug log before the crash, every time this happens, is:
DEBUG scalehandler ScaleObject was not found in controller cache

The SO deletion gets stuck: kubectl prints “scaledobject.keda.sh "cron-scaledobject" deleted” even though the SO is still there.
To resolve the issue you have to edit the SO and remove the two lines “finalizers:” and “- finalizer.keda.sh”.
Then KEDA runs as normal.
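If you prefer not to edit the object by hand, the finalizer can also be cleared with a patch (just a sketch, assuming the SO from the example below in the default namespace):

kubectl patch scaledobject cron-scaledobject -n default --type=merge -p '{"metadata":{"finalizers":[]}}'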

If you do not use "restoreToOriginalReplicaCount" and you delete the SO, there is no problem.
If the KEDA operator is not running, it is also not an issue, even if restoreToOriginalReplicaCount is set to true.

Expected Behavior

Expect the operator to keep running as normal.

Actual Behavior

The operator is crashing with:
panic: runtime error: invalid memory address or nil pointer dereference

Steps to Reproduce the Problem

  1. Create a cron, cpu, or memory "ScaledObject" with the option "restoreToOriginalReplicaCount" set to true.
  2. Set "scaleTargetRef" to a deployment that does not exist.
  3. Run kubectl delete so cron-scaledobject -n default and the operator will crash (see the commands after the example for watching the crash).
  4. Example scaler:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: my-deployment
  advanced:
    restoreToOriginalReplicaCount: true
  triggers:
  - type: cron
    metadata:
      timezone: Asia/Kolkata
      start: 30 * * * *
      end: 45 * * * *
      desiredReplicas: "10"
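To watch the crash from step 3 (assuming KEDA is installed in the keda namespace with the default deployment name):

kubectl get pods -n keda -w
kubectl logs -n keda deploy/keda-operator --previous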

Logs from KEDA operator

1.6488308249284003e+09	DEBUG	scalehandler	ScaleObject was not found in controller cache	{"key": "scaledobject.<ns>.<scaledobject-name>"}
panic: runtime error: invalid memory address or nil pointer dereference

KEDA Version

2.6.1

Kubernetes Version

1.21

Platform

Amazon Web Services

Scaler Details

Cron

Anything else?

No response

@karimrut karimrut added the bug Something isn't working label Apr 1, 2022
@tomkerkhove tomkerkhove moved this to Proposed in Roadmap - KEDA Core Apr 1, 2022
@JorTurFer
Member

Thanks for reporting it, @karimrut
I'll take a look

@JorTurFer JorTurFer self-assigned this Apr 2, 2022
@JorTurFer JorTurFer moved this from Proposed to To Do in Roadmap - KEDA Core Apr 2, 2022
@JorTurFer
Member

Hi @karimrut
I cannot reproduce the bug :(
Maybe I'm missing a step in the middle... could you share the process to reproduce the bug step by step, please?
Please also share the result of this command (just to double-check the version):

kubectl get deploy keda-operator -n KEDA-NAMESPACE -o jsonpath="{.spec.template.spec.containers[0].image}"

@karimrut
Author

karimrut commented Apr 4, 2022

Hi @JorTurFer !
Thanks for looking into it! That's unfortunate.
Here is the result from your command above: ghcr.io/kedacore/keda:2.6.1

I tried to do it from scratch without our CI/CD. Just installed it directly and got the same result. This is happening in 2 different EKS clusters.
“restoreToOriginalReplicaCount” has to be set to “true” and the “scaleTargetRef” used must not exist.

SO yaml:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject
spec:
  scaleTargetRef:
    name: my-deployment
  advanced:
    restoreToOriginalReplicaCount: true
  triggers:
  - type: cron
    metadata:
      timezone: Asia/Kolkata
      start: 30 * * * *
      end: 45 * * * *
      desiredReplicas: "10"

Installing KEDA.

  1. helm repo add kedacore https://kedacore.github.io/charts
  2. helm repo update
  3. helm install keda kedacore/keda --namespace keda
  4. kubectl get deploy keda-operator -n keda -o jsonpath="{.spec.template.spec.containers[0].image}"
  5. Result: ghcr.io/kedacore/keda:2.6.1

Creating SO only.

  1. kubectl apply -f so_only.yaml -n test
  2. kubectl delete so cron-scaledobject -n test
  3. The delete command gets stuck and keda-operator crashes with: panic: runtime error: invalid memory address or nil pointer dereference (see the check below).
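To confirm that it is the KEDA finalizer blocking the deletion (same names as above):

kubectl get so cron-scaledobject -n test -o jsonpath='{.metadata.finalizers}'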

@zroubalik
Member

Thanks for reporting, I will check that @JorTurFer!

@zroubalik zroubalik assigned zroubalik and unassigned JorTurFer Apr 4, 2022
@zroubalik zroubalik moved this from To Do to In Progress in Roadmap - KEDA Core Apr 4, 2022
@zroubalik zroubalik moved this from In Progress to In Review in Roadmap - KEDA Core Apr 4, 2022
@zroubalik
Member

@karimrut good catch. It was a corner case not covered properly. The referenced PR fixes this. Thanks for reporting.
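A simplified sketch of the kind of guard such a finalizer path needs (not the exact code in the referenced PR; it assumes the target is a Deployment and the names below are illustrative):

package scaling

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"

	kedav1alpha1 "github.com/kedacore/keda/v2/apis/keda/v1alpha1"
)

// restoreOriginalReplicaCount is an illustrative sketch: before restoring the
// replica count during finalization, check that a count was ever recorded and
// that the scale target still exists, instead of dereferencing a nil pointer.
func restoreOriginalReplicaCount(ctx context.Context, c client.Client, so *kedav1alpha1.ScaledObject) error {
	if so.Status.OriginalReplicaCount == nil {
		// Nothing was recorded, e.g. the target never existed.
		return nil
	}

	target := &appsv1.Deployment{}
	key := types.NamespacedName{Namespace: so.Namespace, Name: so.Spec.ScaleTargetRef.Name}
	if err := c.Get(ctx, key, target); err != nil {
		if apierrors.IsNotFound(err) {
			// Target is gone; skip the restore instead of panicking.
			return nil
		}
		return err
	}

	replicas := *so.Status.OriginalReplicaCount
	target.Spec.Replicas = &replicas
	return c.Update(ctx, target)
}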

Repository owner moved this from In Review to Ready To Ship in Roadmap - KEDA Core Apr 4, 2022
@karimrut
Author

karimrut commented Apr 6, 2022

Thanks a lot for the help @zroubalik !

@tomkerkhove tomkerkhove moved this from Ready To Ship to Done in Roadmap - KEDA Core Aug 3, 2022