It seems that there is no limit on how much memory traces can use.
It would be good to have an option to configure such a limit, so that we can be sure tracing can't kill the service by exhausting all memory (for example, when a client starts sending x-datadog-sampling-priority: 1 for all requests).
It is better to temporarily lose traces than to kill the service altogether.
It's difficult to anticipate and track the memory consumption of the tracer as it's currently written.
One as-yet-unplanned work item that we will eventually address is benchmarking the tracer to measure its resource consumption under a variety of loads. If we find that memory consumption is pathological in some realistic situation, we could address it, perhaps by adding a limit like the one you describe.
I don't know the underlying cause of the issue that you linked. Do you think that the Datadog tracer is ever the culprit of memory exhaustion?
We have seen Envoy reach its memory limit with Datadog tracing enabled. With Datadog tracing disabled, Envoy ran without issues. We then discovered that, by mistake, all requests coming to Envoy (several thousand per second) carried headers forcing the traces to be recorded.
The limit was set to 1 GB, as far as I remember. I admit that is not an especially high limit for storing traces, but it's not especially low either, considering that the Istio docs state: "The Envoy proxy uses 0.35 vCPU and 40 MB memory per 1000 requests per second going through the proxy."
So I think it would be good to have a fail-safe option to discard traces instead of OOM killing Envoy in case some client starts sending tracing headers like this by mistake (or deliberately when trying to bring down the service). I'd rather lose some traces (and get notified that it happened) than affect responses to users.
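To make the request concrete, here is a minimal sketch of the kind of fail-safe I have in mind: a buffer with a byte budget that drops new traces once the budget is used up and counts the drops so they can be reported. All names here (`FinishedTrace`, `BoundedTraceBuffer`, `try_push`, `drain`) are hypothetical and are not part of the Datadog tracer's API. Payload size is only a rough stand-in for real memory use, and the sketch is not thread-safe.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>

// Hypothetical stand-in for a finished trace awaiting flush.
struct FinishedTrace {
  std::string payload;  // e.g. serialized spans
  std::size_t size_bytes() const { return payload.size(); }
};

// Hypothetical buffer that enforces a byte budget on pending traces.
// Not thread-safe; a real tracer would need locking around these calls.
class BoundedTraceBuffer {
 public:
  explicit BoundedTraceBuffer(std::size_t max_bytes) : max_bytes_(max_bytes) {}

  // Buffers the trace and returns true, or drops it and returns false
  // if accepting it would exceed the configured budget.
  bool try_push(FinishedTrace trace) {
    const std::size_t size = trace.size_bytes();
    // used_bytes_ never exceeds max_bytes_, so the subtraction cannot wrap.
    if (size > max_bytes_ - used_bytes_) {
      ++dropped_traces_;  // expose this as a metric or log line
      return false;
    }
    used_bytes_ += size;
    traces_.push_back(std::move(trace));
    return true;
  }

  // Hands all buffered traces to the caller (e.g. for a flush to the agent)
  // and frees the budget.
  std::deque<FinishedTrace> drain() {
    std::deque<FinishedTrace> out;
    out.swap(traces_);
    used_bytes_ = 0;
    return out;
  }

  std::size_t used_bytes() const { return used_bytes_; }
  std::uint64_t dropped_traces() const { return dropped_traces_; }

 private:
  std::size_t max_bytes_;
  std::size_t used_bytes_ = 0;
  std::uint64_t dropped_traces_ = 0;
  std::deque<FinishedTrace> traces_;
};
```

The drop counter is what would give us the "get notified that it happened" part: the service keeps answering requests, and we can alert on a non-zero count.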
What do you think?
See also istio/istio#33073.