Option to limit memory used by traces #190

Open
martin-sucha opened this issue Jul 2, 2021 · 2 comments

Comments

@martin-sucha

It seems that there is no limit on how much memory traces can use.
It would be good to have an option to configure such a limit, so that we can be sure that tracing can't kill the service by exhausting all memory (for example, when a client starts sending x-datadog-sampling-priority: 1 for all requests).
It is better to temporarily lose traces than to kill the service altogether.

What do you think?

See also istio/istio#33073.

@dgoffredo
Contributor

It's difficult to anticipate and track the memory consumption of the tracer as it's currently written.

One as-yet-unplanned work item that we will eventually address is benchmarking the tracer: measuring its resource consumption under a variety of loads. If we find that memory consumption is pathological in some realistic situation, then we could address it somehow, maybe by adding a limit like the one you describe.

I don't know the underlying cause of the issue that you linked. Do you think that the Datadog tracer is ever the culprit of memory exhaustion?

@martin-sucha
Author

We have seen Envoy reach its memory limit with Datadog tracing enabled. When we disabled Datadog tracing, Envoy ran without issues. We then discovered that, by mistake, all the requests (several thousand requests per second) coming to Envoy had headers forcing the traces to be recorded.

The limit was set to 1 GB, as far as I remember. I admit that is not an especially high limit for storing traces, but it's not especially low either, considering that the Istio docs state: "The Envoy proxy uses 0.35 vCPU and 40 MB memory per 1000 requests per second going through the proxy."

So I think it would be good to have a fail-safe option to discard traces instead of OOM killing Envoy in case some client starts sending tracing headers like this by mistake (or deliberately when trying to bring down the service). I'd rather lose some traces (and get notified that it happened) than affect responses to users.
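
To make the request above concrete, here is a minimal sketch of what such a fail-safe could look like. It is not code from the Datadog tracer: `BoundedTraceBuffer`, `Span`, `estimated_size`, and the byte-budget parameter are all hypothetical names invented for illustration, and the size estimate is deliberately crude. The idea is only that a finished trace is rejected (and counted) once the buffer's estimated footprint would exceed a configured budget, instead of growing without bound.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical span record; the real tracer's span type is different.
struct Span {
  std::string name;
  std::string resource;
};

using Trace = std::vector<Span>;

// Crude estimate of one span's memory footprint (an assumption, not an
// exact accounting of allocator overhead).
inline std::size_t estimated_size(const Span& span) {
  return sizeof(Span) + span.name.capacity() + span.resource.capacity();
}

// Buffers finished traces up to a fixed byte budget and drops the rest.
// Not thread-safe: a real tracer would have to guard this with a mutex.
class BoundedTraceBuffer {
 public:
  explicit BoundedTraceBuffer(std::size_t max_bytes) : max_bytes_(max_bytes) {}

  // Returns true if the trace was buffered, false if it was dropped
  // because it would exceed the budget.
  bool try_push(Trace trace) {
    std::size_t bytes = 0;
    for (const Span& span : trace) bytes += estimated_size(span);
    // used_bytes_ never exceeds max_bytes_, so the subtraction cannot wrap.
    if (bytes > max_bytes_ - used_bytes_) {
      ++dropped_traces_;  // expose this as a metric so drops are noticed
      return false;
    }
    used_bytes_ += bytes;
    traces_.push_back(std::move(trace));
    return true;
  }

  // Hands all buffered traces to the caller (e.g. a periodic writer)
  // and releases the budget.
  std::vector<Trace> flush() {
    std::vector<Trace> out = std::move(traces_);
    traces_.clear();
    used_bytes_ = 0;
    return out;
  }

  std::size_t used_bytes() const { return used_bytes_; }
  std::uint64_t dropped_traces() const { return dropped_traces_; }

 private:
  std::size_t max_bytes_;
  std::size_t used_bytes_ = 0;
  std::uint64_t dropped_traces_ = 0;
  std::vector<Trace> traces_;
};
```

With something like this, a flood of requests carrying forced sampling headers would increase `dropped_traces()` rather than Envoy's memory usage, and the drop counter could be reported so that the loss is visible.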
