Auto down scaling collectors without losing information #30916

Closed
chkp-yairt opened this issue Jan 31, 2024 · 3 comments
Labels
enhancement (New feature or request), needs triage (New item requiring triage)

Comments

@chkp-yairt

Component(s)

cmd/otelcontribcol

Is your feature request related to a problem? Please describe.

Our current architecture has a load balancer that accepts all data as OTLP and divides the load among several OpenTelemetry Collector deployments, which in turn send the data to the required exporter endpoints (Loki, Mimir, Tempo).
Under high load, autoscaling kicks in and increases the number of deployments.
The issue arises when the load decreases and the number of deployments needs to decrease as well.
When a pod goes down, it deletes all the data it was holding, which leads to data loss.
This happens regardless of whether persistent storage is used, because each pod is unaware of the others.

Describe the solution you'd like

The solution, if possible, is that when the number of pods decreases, each pod being removed first stops receiving data, then clears out all of its queues, and only then shuts down.
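
To make the requested ordering concrete (stop receiving, drain the queues, then exit), here is a minimal Python sketch. It is not collector code: `DrainingWorker`, `receive`, `shutdown`, and the `export` callable are hypothetical names used only for illustration, and the queue is a plain in-memory one.

```python
import queue
import threading


class DrainingWorker:
    """Hypothetical stand-in for one pod that buffers items in an in-memory queue."""

    def __init__(self, export):
        self._export = export          # assumed callable that ships one item downstream
        self._queue = queue.Queue()
        self._accepting = True
        self._lock = threading.Lock()

    def receive(self, item):
        """Queue an item unless shutdown has begun; return False when refused."""
        with self._lock:
            if not self._accepting:
                return False           # the sender is expected to retry against another pod
            self._queue.put(item)
            return True

    def shutdown(self):
        """Stop receiving first, then export everything still queued.

        Returns the number of items drained; the caller exits afterwards.
        """
        with self._lock:
            self._accepting = False    # step 1: no new items can enter the queue
        drained = 0
        while True:                    # step 2: flush whatever is already buffered
            try:
                item = self._queue.get_nowait()
            except queue.Empty:
                break
            self._export(item)
            drained += 1
        return drained                 # step 3: safe to terminate the process
```

In Kubernetes, a pod receives SIGTERM and is given a termination grace period before it is killed, so a `shutdown()` like this would have to be triggered from a SIGTERM handler and finish within that period. Whether the collector and the Helm chart can be configured to behave this way is exactly the open question of this issue.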

Describe alternatives you've considered

Using EFS, with the queue folders shared among the pods, so that a remaining pod can pick up where the dropped pod left off.
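
As a rough illustration of this alternative, the sketch below lets a surviving pod claim a dropped pod's queue folder on a shared volume and replay its contents. Everything here is an assumption made for the example: the one-folder-per-pod layout, the one-file-per-item format, and the function names `claim_orphaned_queue` and `replay`. It does not reflect how the collector actually stores its persistent queue on disk, and it assumes the caller already knows which pod was dropped.

```python
import os
from pathlib import Path


def claim_orphaned_queue(shared_root, orphan_pod, claimer_pod):
    """Try to take ownership of orphan_pod's queue folder by renaming it.

    Returns the claimed path, or None if the folder is already gone
    (for example because another pod claimed it first).
    """
    src = Path(shared_root) / orphan_pod
    dst = Path(shared_root) / f"{orphan_pod}.claimed-by-{claimer_pod}"
    try:
        os.rename(src, dst)    # a later claimer finds the source missing
    except FileNotFoundError:
        return None
    return dst


def replay(claimed_dir, export):
    """Export each queued file in name order, deleting it once exported."""
    count = 0
    for path in sorted(claimed_dir.iterdir()):
        export(path.read_bytes())
        path.unlink()          # remove only after export returned without raising
        count += 1
    claimed_dir.rmdir()        # folder is empty now, so the claim can be cleaned up
    return count
```

The rename is used as the claim step so that two surviving pods do not replay the same folder. How reliable that is on a network file system such as EFS would need to be verified before relying on it.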

Additional context

This issue also happens when using StatefulSet mode, not just Deployment mode.

@chkp-yairt added the enhancement and needs triage labels on Jan 31, 2024
@crobert-1
Member

Hello @chkp-yairt, are you using the operator and the Helm chart, or just one or the other? Maybe neither?

@chkp-yairt
Author

Hi @crobert-1, I'm using the Helm chart but not the operator. Is there a feature in the operator that provides this capability?

@crobert-1
Member

I'm not sure whether the operator provides this functionality. From my understanding, you may be better served by filing this issue in the Helm chart repository, since that is what handles the autoscaling.

Feel free to reopen if I've misunderstood though, happy to help if there's anything specific to the collector itself that can be done!

@crobert-1 closed this as not planned on Feb 1, 2024