Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support to Redis event source? #240

Closed
cristianoveiga opened this issue May 30, 2019 · 39 comments
Closed

Support to Redis event source? #240

cristianoveiga opened this issue May 30, 2019 · 39 comments
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Looking for support from community scaler

Comments

@cristianoveiga
Copy link

Hi,

I noticed that Redis is currently not supported and also is not present on the planned scalers.

Have you considered some sort of integration?

@jeffhollan
Copy link
Member

Great question. I think the trick with Redis for me has always been what "metric" we can pull from redis to know if there is work to be done, and how much work is done. What I've read in the past from Redis docs is that if you don't have an active client listening for events, an event is sent but lost. Since KEDA uses a polling interval to check for messages, I suspect it means that unless the event happened to be sent while KEDA was checking, KEDA would have no idea about the work to be done and couldn't publish accurate metrics. I'm not sure on this though. I'd love to see a Redis scaler but we'd need a way to know:

  • after being disconnected for some time (maybe 1 second, maybe 1 minute), can we check some Redis endpoint to know if there is work to be processed
  • after being disconnected for some time, can we check some Redis endpoint to know how much work there is to process (to feed the HPA custom metrics)

@cristianoveiga
Copy link
Author

In my use case, I would like to be monitoring the size of a queue (internally implemented as a list in Redis), so I guess the simplest implementation would creating a custom metrics (most likely using custom metrics ) that returns the length of the specified key or a range of keys.

@jeffhollan
Copy link
Member

Yeah that one would work I think. Do you remove the item from the list after processing? So that the length of the list could always == the amount of unprocessed work?

@yaron2
Copy link
Contributor

yaron2 commented May 30, 2019

@cristianoveiga would using LLEN for a list for our Redis scaler do the trick?

https://redis.io/commands/llen

@cristianoveiga
Copy link
Author

Correct @jeffhollan @yaron2 !

The length of the list (s) would give me metric of how much work I have pending.

Please note that Redis has other data types/structures, but this could be a good start.

@yaron2
Copy link
Contributor

yaron2 commented May 30, 2019

Good :)

@jeffhollan jeffhollan added good first issue Good for newcomers help wanted Looking for support from community and removed needs-discussion labels May 30, 2019
@jeffhollan
Copy link
Member

Yes scoping to that is very do-able. Not sure if you want to take a pass at it. I flagged it as help wanted and will update the wiki, but assuming you just did pretty much exactly what the rabbitMq and azure queue scalers are doing, but instead of lenght of queue it's length of list you'd have what you wanted

@cristianoveiga
Copy link
Author

awesome, thanks!

I'm trying to understand the whole custom metrics workflow. If I manage to get it sorted, I would be glad to help send a PR.

I will keep you posted.

@patnaikshekhar
Copy link
Contributor

@jeffhollan I'm happy to help with this one

@patnaikshekhar
Copy link
Contributor

@jeffhollan @yaron2 I've created a pull request for this. Could you verify?

@patnaikshekhar
Copy link
Contributor

@jeffhollan @yaron2 @cristianoveiga this has now been merged into master. Can this issue be closed?

@jeffhollan
Copy link
Member

Awesome work @patnaikshekhar! Thanks a ton for the contribution 🎉.

@asavaritayal can you help just to track work to make sure

  1. The wiki is updated with the schema of the ScaledObject for Redis
  2. The README.md is updated with this in the list of triggers (not sure if that was already taken care of in the PR)

Once that's in I think it's officially now a supported KEDA trigger.

@cristianoveiga
Copy link
Author

This is great! Thanks @patnaikshekhar !

I will take some time this week to try it out.

@ahmelsayed
Copy link
Contributor

btw, make sure to use kedacore/keda:master instead of kedacore/keda:latest to get @patnaikshekhar changes.

@tomkerkhove
Copy link
Member

Anything left open for this one?

It's merged and documented to scale on Redis Lists AFAIK

@tomkerkhove
Copy link
Member

Anything left open for this one?

It's merged and documented to scale on Redis Lists AFAIK

@cristianoveiga
Copy link
Author

Hi @tomkerkhove @patnaikshekhar ,

I was trying to get this working on my cluster, but I am getting an error:

{"level":"info","ts":1575407848.053906,"logger":"controller_scaledobject","msg":"Reconciling ScaledObject","Request.Namespace":"default","Request.Name":"redis-scaledobject"} {"level":"info","ts":1575407848.0539508,"logger":"controller_scaledobject","msg":"Detecting ScaleType from ScaledObject","Request.Namespace":"default","Request.Name":"redis-scaledobject"} {"level":"info","ts":1575407848.0539541,"logger":"controller_scaledobject","msg":"Detected ScaleType = Deployment","Request.Namespace":"default","Request.Name":"redis-scaledobject"} {"level":"info","ts":1575407848.0539687,"logger":"controller_scaledobject","msg":"Creating a new HPA","Request.Namespace":"default","Request.Name":"redis-scaledobject","HPA.Namespace":"default","HPA.Name":"keda-hpa-custom-api"} {"level":"error","ts":1575407848.0556328,"logger":"controller_scaledobject","msg":"Error updating scaledObject status with used externalMetricNames","Request.Namespace":"default","Request.Name":"redis-scaledobject","error":"the server could not find the requested resource (put scaledobjects.keda.k8s.io redis-scaledobject)","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\ngh.neting.cc/kedacore/keda/pkg/controller/scaledobject.(*ReconcileScaledObject).getScaledObjectMetricSpecs\n\tkeda/pkg/controller/scaledobject/scaledobject_controller.go:351\ngh.neting.cc/kedacore/keda/pkg/controller/scaledobject.(*ReconcileScaledObject).newHPAForScaledObject\n\tkeda/pkg/controller/scaledobject/scaledobject_controller.go:297\ngh.neting.cc/kedacore/keda/pkg/controller/scaledobject.(*ReconcileScaledObject).reconcileDeploymentType\n\tkeda/pkg/controller/scaledobject/scaledobject_controller.go:203\ngh.neting.cc/kedacore/keda/pkg/controller/scaledobject.(*ReconcileScaledObject).Reconcile\n\tkeda/pkg/controller/scaledobject/scaledobject_controller.go:147\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.2.2/pkg/internal/controller/controller.go:216\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.2.2/pkg/internal/controller/controller.go:192\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.2.2/pkg/internal/controller/controller.go:171\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20190404173353-6a84e37a896d/pkg/util/wait/wait.go:88"}

Any ideas about what can be causing this issue?

Thanks,

Cristiano

@marchmallow
Copy link
Contributor

Tried to run the redis autoscaler against a Redis cloud service in IBM cloud with TLS but looks like TLS not supported in the current Redis autoscaler, can try to submit a PR for it.. any idea if someone working on it already?

@tomkerkhove
Copy link
Member

That is correct, we don't support this AFAIK.

We are more than happy to accept PRs and I'm not aware of anybody working on it.

Thanks!

@fktkrt
Copy link

fktkrt commented Jan 6, 2020

I am wondering if adding TLSConfig to ClusterOptions/redis.Options might do it. I am thinking something like this as the simplest solution:

TLSConfig: &tls.Config{
	InsecureSkipVerify: true,
},

Reference:
https://github.com/go-redis/redis/blob/master/cluster.go#L147

@marchmallow
Copy link
Contributor

@inuyasha82 is working on adding TLS...

@inuyasha82
Copy link
Contributor

@fktkrt i'm trying to do exactly that! :) stay tuned...

@inuyasha82
Copy link
Contributor

Hi guys, i submitted PR #540 it is in wip state btw, because there is an odd behaviour, apparently when it connects to an SSL queue, it can read the listlength, and create the pod if is scaled down to 0, but for some reason is not scaling to more than one pod even if i add 100 items to the list, so maybe i need to add changes somewhere else, or maybe is just a problem of my local env, but i will appreciate anyone that can help! :)

@tomkerkhove
Copy link
Member

This sounds like kedacore/keda-docs#64

If I remember it correctly, but adding @jeffhollan, it might be because you are using Minikube - Is that possible?

@inuyasha82
Copy link
Contributor

@tomkerkhove i don't know, the thing is that using the changes in PR #540 when using a normal redis queue looks it is scaling correctly to more than 1. When i connect to the tls queue, it scale up, but to no more than one pod.

So the normal queue is deployed in the Minikube environment using helm.
The ssl queue is hosted externally.

@tomkerkhove
Copy link
Member

Question is, where is KEDA running? On Minikube?

@inuyasha82
Copy link
Contributor

yes on MIinkube.

@tomkerkhove
Copy link
Member

Then it works as intended, mini kube has a limitation which causes it not to scale beyond 1

@inuyasha82
Copy link
Contributor

inuyasha82 commented Jan 17, 2020

@tomkerkhove today i tried the implementation in a kubernetes cluster, that is not minukube and i'm seeing the same behaviour.
Redis SSL -> scale maximum to 1
Redis non-ssl -> correct scaling

I took couple of screenshots of the logging (for the non redis-one ignore the pods error, but you can see that the deployments was correctly scaling up (it arrived to 6 even if not evident from the screenshot).

Using Redis non ssl:

Screenshot 2020-01-17 at 15 36 24

While using Redis on ssl:
Screenshot 2020-01-20 at 11 09 27

As you can see from the delpoyments/pods sections (the 2 on the right) when using tls it doesn't scale up to more than 1, while on the other hand using the normal queue it is working.

BTw i'm printing the list length every time the function getRedisListLength is called, and it is always returning an accurate number (check the bottom left section) in both cases.

So i can't figure out what is the problem, since apparently getRedisListLength is returning the correct number every single time.

Maybe do i need to update somewhere else? Someone has an idea of what could be the reason?

@balchua
Copy link
Contributor

balchua commented Jan 21, 2020

@inuyasha82 whats the output of the hpa?
kubectl get hpa

@inuyasha82
Copy link
Contributor

So this is the output when i use a redis with TLS:

Screenshot 2020-01-21 at 14 21 04

Instead if i'm connecting to a non tls queue i can see some values:

Screenshot 2020-01-21 at 16 53 09

When using the TLS queue apparently is not getting any metrics information, so tried to describe the hpa this is the output:

Screenshot 2020-01-22 at 09 54 52

It looks like that it can't extract any metric from the hpa on the TLS queue.

There is one difference between the twho redis server, usually the non tls one is deployed in the same kubernetes cluster, while the TLS redis queue i'm using is a service deployed on the cloud (so not in the same cluster), could that be the reason why it doesn't scale up?

@balchua
Copy link
Contributor

balchua commented Jan 22, 2020

@inuyasha82 you have set your minpod to 1. Perhaps that's the reason why you have 1 pod running. Although from your previous log, i can see the return value of getRedisListLength.

Would you be able to print some logs in the GetMetrics method instead of just in IsActive? Im guessing there is an error somewhere. It couldnt be because of redis server outside kubernetes cluster thats causing it as far as i know.

Maybe it is failing and end up in this block of code in GetMetrics method.

	if err != nil {
		redisLog.Error(err, "error getting list length")
		return []external_metrics.ExternalMetricValue{}, err
	}

Im just guessing. 😁

@inuyasha82
Copy link
Contributor

@balchua i'll give it a try, btw consider that the setup used for both tls and non tls is identical, the only difference is the host and password (And the new parameter i added to enableTLS)

@fktkrt
Copy link

fktkrt commented Jan 22, 2020

Are you using metrics-server?
Maybe redeploying it with the following arguments can help

# edit metric-server deployment to add the flags
# args:
# - --kubelet-insecure-tls
# - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
$ kubectl edit deploy -n kube-system metrics-server

https://github.com/kubernetes-sigs/metrics-server

@marchmallow
Copy link
Contributor

Started to look at this with @inuyasha82 - realised missed to deploy the locally built keda-metrics-adapter

Hence should update the documentation in the README with something like this:

helm install . --set image.keda=kedacore/keda:$IMAGE_TAG,image.metricsAdapter=kedacore/keda-metrics-adapter:$IMAGE_TAG,image.pullPolicy=IfNotPresent

Will send a pull request for that. Retesting to see if this fixes the problem.

@tomkerkhove
Copy link
Member

Thanks for sending the PR!

@inuyasha82
Copy link
Contributor

So @tomkerkhove @fktkrt @balchua so tested again using the keda-metrics-adatper from the custom build, and now it is correctly scaling up and down.

So the problem was that it was using the keda-metrics-adapter from master and my keda operator from the custom tag. So as @marchmallow suggested could be a good idea to add this in the README.

So i think that the PR #540 can be reviewed now.

@tomkerkhove
Copy link
Member

I presume we can close this one once the docs are there for new TLS changes?

@marchmallow
Copy link
Contributor

I think this can be closed indeed. @inuyasha82 also submitted a PR to add an option for choosing which redis DB to use. It seems pretty feature complete now. Looking forward to get keda into our production environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers help wanted Looking for support from community scaler
Projects
None yet
Development

No branches or pull requests