Memory Leak in dd-trace http plugin #4286
@tlhunter sorry for the tag, not sure of the protocol for raising issues up to folks at dd... wondering if this is being looked at in the v4 and v5 lines
I am also having this issue. We're seeing an elevated memory footprint in our Node application after upgrading the dd-trace npm package. Has anyone on the Datadog side been able to confirm this?
for me, using |
Also noticing increased memory usage, growing at a linear rate after adding in Have disabled the http plugin for now and will report back if that resolves the leak. Edit: Can confirm disabling the http plugin has fixed the leak. |
I'll be looking into this issue today. Does anybody have a simple reproduction of the issue? Ideally a single file that generates outgoing requests? Oftentimes memory leaks are caused by interplay with third-party packages, but without knowing the dependencies everyone in this thread is using I'll be working in the dark.
I've been working on this for a couple hours and have not perfectly reproduced the situation where requests made via fetch with dd-trace enabled causes a memory leak and where setting |
I'm not sure if it is related to this issue, but the tests in an app that I'm working on started failing with out-of-memory errors when I upgraded dd-trace from 4.2.0 to 4.41.0. I didn't have time to identify which version introduced the problem. The app is built with nest.js (v9) and express (v4), running on node v18. The tests use jest v29. Unfortunately I was not able to reproduce this outside the app context, so I can't offer a simple way to reproduce the issue.
Whew, this ended up being an essay, but I hope the information is helpful and is something we can link to in the future.

Background

Here is a pretty straightforward application that I've been using to simulate a high load application (Node.js = v18.20.2). It works by sending millions of requests to a simple Node.js http server which in turn makes an outbound http request to another server. It's running with the latest version of the tracer. Heap snapshots are generated before and after a run of exactly 1,000,000 requests (though at that point exactly 2,000,003 requests had already been made). I then compare the dump of the before and after 1MM requests in Chrome using the comparison feature. In this case I am seeing a ~500kb growth in memory which is pretty much a rounding error and could possibly be attributed to things like optimizations that V8 makes under the hood. The vanilla application without the tracer grows by ~200kb during this same time.

Thanks for the original heapdump, @viral-barot, however just a single request often isn't enough to track down a memory leak. For example, the application may be lazy loading files from disk to serve a request, caches aren't necessarily primed, code isn't optimized, etc. There is constant jitter when doing memory comparisons between two dumps and if 1 object is allocated somewhere it gets lost in the noise. Instead, by tracking exactly 1MM requests, I'm able to look through the heap and keep an eye out for situations where 1 million objects are retained (or multiples of 1 million like 2 million or 500k). For example, in the past when a confirmed memory leak has occurred, and one million requests are made, we would see a million

So there could be a memory leak in the tracer but my attempt to reproduce it has failed. The tracer has code paths for every integration we support so the leak might only occur when a particular pair of integrations is used.
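The counting heuristic described above (look for retained object counts that are a multiple or simple fraction of the request count) could be sketched roughly as follows. This is only an illustration: the input shape (per-constructor object counts copied by hand from two heap snapshots), the function name `suspiciousDeltas`, the tolerance, and the constructor names are all assumptions, not part of the tracer or of Chrome DevTools.

```javascript
// Hypothetical input: { constructorName: objectCount } summaries taken by hand
// from a "before" and an "after" heap snapshot. This does not parse snapshots.
function suspiciousDeltas(before, after, requests, tolerance = 0.01) {
  // Ratios worth flagging: 0.5x, 1x, 2x and 3x the number of requests.
  const ratios = [0.5, 1, 2, 3];
  const findings = [];

  for (const [name, afterCount] of Object.entries(after)) {
    const delta = afterCount - (before[name] ?? 0);
    if (delta <= 0) continue; // nothing extra was retained

    for (const ratio of ratios) {
      const expected = requests * ratio;
      // Allow a little jitter around the expected count.
      if (Math.abs(delta - expected) <= expected * tolerance) {
        findings.push({ name, delta, ratio });
        break;
      }
    }
  }
  return findings;
}

// Made-up counts for illustration only; these are not real measurements.
const before = { ExampleLeakedThing: 12, Buffer: 3400, IncomingMessage: 3 };
const after = { ExampleLeakedThing: 1000010, Buffer: 3900, IncomingMessage: 4 };

console.log(suspiciousDeltas(before, after, 1_000_000));
```

With these made-up counts only `ExampleLeakedThing` is flagged, because its growth is within 1% of the request count, while the small `Buffer` growth is treated as noise.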
In order for me to properly reproduce the memory leak at this point we'll need either of the following:
Defining a Memory Leak

Also, I think it's worth defining what a memory leak is for anyone trying to identify or even reproduce one. A Node.js application of course always consumes some memory; maybe a fresh basic process is 40MB. While serving requests it will use more, perhaps 60MB. After requests are complete it will idle at a certain level, maybe 50MB. Maybe while serving 10,000 concurrent requests the memory usage is 200MB but the process might settle back down around 50MB. So while the memory usage grows and contracts, as defined none of that is a memory leak so far.

Any observability tool, including the Datadog tracer, incurs additional overhead. The amount that is incurred is nearly impossible to measure as it changes depending on the packages used in the application. Even coding patterns and patterns of asynchrony can change the memory usage. However, to just throw out a number, we could pretend that a given basic app has a 20% memory overhead. In such a case we might end up with memory usage like this:
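As a rough sketch of that arithmetic, assuming the overhead is a flat 20% applied to each of the example figures above (the object keys and the helper name `withOverhead` are made up for illustration, and real overhead will not be this uniform):

```javascript
// Example baseline figures in MB, taken from the paragraph above.
const baselineMb = { fresh: 40, serving: 60, idle: 50, peak: 200 };

// Hypothetical flat overhead; real overhead varies with the app and its packages.
const OVERHEAD = 0.2;

// Apply the overhead and round to whole megabytes.
function withOverhead(mb, overhead = OVERHEAD) {
  return Math.round(mb * (1 + overhead));
}

for (const [state, mb] of Object.entries(baselineMb)) {
  console.log(`${state}: vanilla ${mb}MB, traced ${withOverhead(mb)}MB`);
}
```

Every figure is scaled by the same factor, so the traced process still settles back down after a burst; it simply settles at a higher level.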
So while there is a memory overhead (in this case 20%), there isn't a memory leak. This can muddy the waters, though, as a container might be configured to provide enough memory for a vanilla application during peak usage, or for an idling traced application, but not enough for a traced application during peak usage. For example, a memory limit of 225MB would cause this traced app to be killed during peak usage.

Thus a true memory leak is memory usage that grows in an unbounded manner in relation to an operation that is happening somewhere in the application. Generally speaking such growth is related to an operation happening in the application such as an HTTP request / response, gRPC messages being received (or in the case of a CRON style server it might be a function of time such as Essentially, request / response objects can stick around in memory after the request is complete.

Repro Files

Save these files in a directory and then run the following commands; a series of benchmarks, heap dumps, and memory snapshots will be taken:

$ npm install
$ chmod +x ./benchmark.sh
$ ./benchmark.sh
Here's my benchmarking and heap dump generation code:

app.js

console.log('PID:', process.pid);

const http = require('http');
const URL = require('url');
const fs = require('fs');
const path = require('path');
const v8 = require('v8');

const host = 'localhost';
const port = 8000;

// Record the PID so benchmark.sh can kill this process later.
fs.writeFileSync(
  path.join(__dirname, 'app.pid'),
  String(process.pid)
);

const requestListener = async function (req, res) {
  const url = URL.parse(req.url);
  const params = new URLSearchParams(url.search);

  switch (url.pathname) {
    case '/gc':
      // gc() is only defined when node is started with --expose-gc.
      gc();
      res.end('garbage collected');
      break;

    case '/stats': {
      const mb = (process.memoryUsage().rss / 1024 / 1024).toFixed(3) + 'MB';
      res.end(mb);
      break;
    }

    case '/dump': {
      // With no filename parameter, v8 picks a default snapshot name.
      const filename = params.get('filename') ? params.get('filename') + '.heapsnapshot' : undefined;
      console.log(`writing snapshot ${filename}...`);
      v8.writeHeapSnapshot(filename);
      console.log(`wrote snapshot ${filename}.`);
      res.end(`wrote snapshot ${filename}`);
      break;
    }

    default: {
      // Every other path makes one outbound request to the faux server.
      const r = await fetch(`http://localhost:8001`);
      const t = await r.text();
      res.end('client: ' + t);
      break;
    }
  }
};

const server = http.createServer(requestListener);
server.listen(port, host, () => {
  console.log(`Server is running on http://${host}:${port}`);
});

benchmark.sh

echo "START COMMON SERVER"
npm run start-server &
# NO DATADOG
echo "START CLIENT WITHOUT DATADOG"
npm run start-no-dd &
sleep 1
echo "PRIME THE CACHE"
npm run req
npm run req
npm run req
sleep 1
npm run gc
sleep 1
echo "GET THE MEMORY USAGE BEFORE BENCHMARK"
npm run mem > mem-no-dd-0.txt
cat mem-no-dd-0.txt
echo "RUN THE FIRST BENCHMARK"
npm run benchmark
sleep 10
npm run gc
sleep 5
npm run gc
sleep 5
echo "GET THE MEMORY USAGE AFTER BENCHMARK"
npm run mem > mem-no-dd-1.txt
cat mem-no-dd-1.txt
echo "RUN THE SECOND BENCHMARK"
curl "http://localhost:8000/dump?filename=mem-no-dd-0"
npm run benchmark
npm run benchmark
sleep 10
npm run gc
sleep 5
npm run gc
sleep 5
curl "http://localhost:8000/dump?filename=mem-no-dd-1"
echo "GET THE MEMORY USAGE AFTER SECOND BENCHMARK"
npm run mem > mem-no-dd-2.txt
cat mem-no-dd-2.txt
echo "KILL THE NON-DATADOG CLIENT"
kill `cat app.pid`
# WITH DATADOG
echo "START CLIENT WITH DATADOG"
npm run start-dd &
sleep 1
echo "PRIME THE CACHE"
npm run req
npm run req
npm run req
sleep 1
npm run gc
sleep 1
echo "GET THE MEMORY USAGE BEFORE BENCHMARK"
npm run mem > mem-dd-0.txt
cat mem-dd-0.txt
echo "RUN THE FIRST BENCHMARK"
npm run benchmark
sleep 10
npm run gc
sleep 5
npm run gc
sleep 5
echo "GET THE MEMORY USAGE AFTER BENCHMARK"
npm run mem > mem-dd-1.txt
cat mem-dd-1.txt
echo "RUN THE SECOND BENCHMARK"
curl "http://localhost:8000/dump?filename=mem-dd-0"
npm run benchmark
npm run benchmark
sleep 10
npm run gc
sleep 5
npm run gc
sleep 5
curl "http://localhost:8000/dump?filename=mem-dd-1"
echo "GET THE MEMORY USAGE AFTER SECOND BENCHMARK"
npm run mem > mem-dd-2.txt
cat mem-dd-2.txt
echo "KILL THE DATADOG CLIENT"
kill `cat app.pid`

faux-server.js

const http = require('http');

const host = 'localhost';
const port = 8001;

const requestListener = function (req, res) {
  res.end('ok from faux server');
};

const server = http.createServer(requestListener);
server.listen(port, host, () => {
  console.log(`Faux server is running on http://${host}:${port}`);
});

package.json

{
  "name": "gh-4286",
  "version": "1.0.0",
  "description": "memory leak test",
  "main": "app.js",
  "scripts": {
    "start": "echo 'use start-dd or start-no-dd instead' ; exit 1",
    "start-server": "node --expose-gc faux-server.js",
    "start-dd": "node --require='dd-trace/init' --expose-gc app.js",
    "start-no-dd": "node --expose-gc app.js",
    "watch": "watch -n 0.5 ps aux `cat app.pid`",
    "benchmark": "autocannon -a 1000000 http://localhost:8000",
    "req": "curl --silent http://localhost:8000/",
    "gc": "curl --silent http://localhost:8000/gc",
    "mem": "curl --silent http://localhost:8000/stats",
    "dump": "curl --silent http://localhost:8000/dump"
  },
  "author": "Thomas Hunter II <tlhunter@datadog.com>",
  "license": "BSD",
  "dependencies": {
    "dd-trace": "^5.17.0"
  }
}
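One possible way to read the mem-*.txt files that benchmark.sh writes is to pull out the RSS reading from each and look at how much it grew between runs. The sketch below assumes each file contains a "<number>MB" value somewhere in its text (`npm run` may also echo the script name and command into the file); the helper names `parseMb` and `growthPerRun` and the sample contents are hypothetical.

```javascript
// Extract the "<number>MB" reading that the /stats endpoint returns.
// The whole text is searched because npm may prepend its own output.
function parseMb(text) {
  const match = /(\d+(?:\.\d+)?)MB/.exec(text);
  if (!match) throw new Error('no "<number>MB" reading found');
  return Number(match[1]);
}

// Difference between each reading and the one before it, in MB.
function growthPerRun(readings) {
  return readings
    .slice(1)
    .map((mb, i) => Number((mb - readings[i]).toFixed(3)));
}

// Made-up file contents for illustration only; these are not measurements.
const sampleFiles = [
  '> gh-4286@1.0.0 mem\n> curl --silent http://localhost:8000/stats\n\n85.250MB',
  '90.500MB',
  '90.750MB',
];

const readings = sampleFiles.map(parseMb);
console.log(growthPerRun(readings));
```

Growth that flattens out after the first run is what warm-up and overhead tend to look like, whereas growth that keeps scaling with the number of requests is the pattern that fits the definition of a leak given earlier. RSS is a coarse signal, so the heap snapshots remain the better evidence.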
@tlhunter I totally understand your point that it is impossible to fix this kind of issue if you can't reproduce it. I will try to build a sample app to reproduce the issue that I'm having. BTW, I checked multiple versions and the issue appears when I upgrade from v4.18.0 to v4.19.0.
This comment was marked as off-topic.
@rmagrin you bisected / installed a bunch of tracer versions and narrowed it down to exactly v4.19.0? That was a big release and it could be related to any of these commits:
Someone else suspected dc-polyfill could be contributing to the issue, and this commit introduced it. That said, there's no way of even telling whether any of the memory leaks reported in this thread are related at this point.
FYI, I'm closing any replies unrelated to the http integration as being off-topic. If anyone else has concerns about potential memory leaks, please create an issue via the helpdesk. There's a link on the GitHub create issue screen that takes you to the helpdesk. Such issues are private and allow for attaching potentially sensitive information such as Using the helpdesk not only prioritizes these issues but also helps prevent off-topic comments from muddying the water. Memory leaks are highly dependent on things like what combination of packages are used, what version of Node.js is used, etc. |
@tlhunter exactly. I created a branch from my app with one commit for each version of dd-trace between v4.7.0 (the last version I knew didn't have the issue) and v4.41.0, and used git bisect to figure out that v4.19.0 was the one that introduced the issue. Afterwards I double-checked by running the tests with v4.18.0 and v4.19.0 and confirmed that the issue only happens in v4.19.0. A few more data points that could help here:
I have encountered a memory leak issue in the dd-trace package.
When I disable the http plugin with DD_TRACE_DISABLED_INSTRUMENTATIONS=http, there is no memory leak.
I have tried upgrading and downgrading the dd-trace package between versions 3.X and 5.X, and also tried different node versions, but I am still facing the same issue.
node version: 18.17.1
dd-trace version: 5.10.0
restify version: 11.1.0
Memory Heap Snapshot
http plugin Disabled - Heap Snapshot.zip
http plugin Enabled - Heap Snapshot.zip