Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential memory leak in UDP #6189

Closed
mjesun opened this issue Apr 13, 2016 · 12 comments
Closed

Potential memory leak in UDP #6189

mjesun opened this issue Apr 13, 2016 · 12 comments
Labels
dgram Issues and PRs related to the dgram subsystem / UDP. dns Issues and PRs related to the dns subsystem. memory Issues and PRs related to the memory management or memory footprint.

Comments

@mjesun
Copy link

mjesun commented Apr 13, 2016

  • Version: 4.4.2
  • Platform: Linux (Ubuntu Server 8 - Jessie), kernel 3.16
  • Subsystem: dgram.js

UDP sockets seem to have a memory leak. Seems that it's not the first time it had that, so it feels to me that it might be a regression. The following test code reproduces it:

const Crypto = require('crypto');
const Dgram = require('dgram');

let socket = Dgram.createSocket('udp4');

setInterval(() => {
  for (let i = 0; i < 30; i++) {
    socket.send(Crypto.randomBytes(50), 0, 50, 8125, 'example.org');
  }
}, 10);

Memory grows approximately at a rate of 1 MB/s in my machine. I assume that a UDP socket should just send the data (no matter if no one is listening on the other side), so there should not be this huge memory consumption.

@mscdex mscdex added dgram Issues and PRs related to the dgram subsystem / UDP. memory Issues and PRs related to the memory management or memory footprint. labels Apr 13, 2016
@mscdex
Copy link
Contributor

mscdex commented Apr 14, 2016

I believe it's the large number of DNS requests that are happening. Every time you call socket.send() it has to resolve example.org to an IP address. Try this and you should see more stable memory usage:

'use strict';
const Crypto = require('crypto');
const Dgram = require('dgram');
const dns = require('dns');

let socket = Dgram.createSocket('udp4');

dns.resolve('example.org', function(err, addrs) {
  if (err) throw err;
  const addr = addrs[0];
  setInterval(() => {
    for (let i = 0; i < 30; i++) {
      socket.send(Crypto.randomBytes(50), 0, 50, 8125, addr);
    }
  }, 10);
});

@mscdex mscdex added the dns Issues and PRs related to the dns subsystem. label Apr 14, 2016
@mjesun
Copy link
Author

mjesun commented Apr 14, 2016

Very true. I've tested the code and the memory is stable. So it seems the memory leak might be in the DNS resolution? I've checked the dns module and everything seems to be right to me. Could it be on the native side?

@mscdex
Copy link
Contributor

mscdex commented Apr 14, 2016

The underlying issue here is that dns requests are currently made in the thread pool which has a default size of 4. All requests to use the thread pool are queued when no threads are available. Since a dns request is being made here every ~10ms, these requests start stacking up very quickly because the thread pool can't keep up (and dns requests can take some time to complete -- especially if the nameserver is responding slowly).

@cjihrig
Copy link
Contributor

cjihrig commented Apr 14, 2016

This seems to be resolved. Are we good to close this?

@mscdex mscdex added dgram Issues and PRs related to the dgram subsystem / UDP. and removed dgram Issues and PRs related to the dgram subsystem / UDP. labels Apr 14, 2016
@mjesun
Copy link
Author

mjesun commented Apr 14, 2016

What about a host/IP LRU cache? I think it would mitigate the problem. I assume as well that most of the people that is using UDP sockets is not realizing that, if they pass a host to sendTo, a DNS request is going to be made every time.

If a request is being made every 10 ms, a cache of 5-10 seconds should probably kill the problem.

@mscdex
Copy link
Contributor

mscdex commented Apr 14, 2016

While that may seem like a sensible solution, I suspect that in-memory caching may catch people off guard especially if the results in the cache don't automatically expire. At that point you're having to implement ttl and having to decide whether a custom ttl is used or you use the ttl returned by the nameserver (which IIRC we aren't retrieving from c-ares for one reason or another) and other things.

So IMHO adding transparent caching support can be tricky and is liable to cause problems for other use cases. FWIW there have been discussions in the past about returning ttl and other information for use in userland modules, but that would only help you write a more efficient/"correct" cache in userland, it wouldn't help with this particular issue.

@mjesun
Copy link
Author

mjesun commented Apr 14, 2016

I see the problem with the cache; so probably what looks the most reasonable is to use the server's returned TTL, which is unfortunate we can't get. As I said before, I think most of the people using UDP sockets will fall into the same issue (and, in general, not expecting gethostbyname to queue); so I think it's worth taking a look to that. However, if you want to close this ticket, feel free to do that.

@mscdex
Copy link
Contributor

mscdex commented Apr 14, 2016

At the very least it may be worth adding some notes to the documentation in the appropriate places to make this more evident. Perhaps either a separate section in the dns docs or modifying existing wording and then linking to that from various places in other modules' documentation that accept hostnames. If you or anyone else wants to submit a PR for something like that, that would be a good start.

I should also point out that net.Socket() already has support for passing in a custom lookup() function that defaults to dns.lookup(). So I suppose something similar could be added for dgram. This would allow you to implement your own caching system in front of dns.lookup() and/or you could even call out to a pure js dns client from the custom lookup() to avoid the thread pool entirely.

@mjesun
Copy link
Author

mjesun commented Apr 14, 2016

I don't think net.Socket suffers this problem, because you do Socket#connect once, and, unless you kill the socket and reconnect, the DNS resolution will only happen once. The problem is more in send, which is able to accept a host name per call.

@punnerud
Copy link

punnerud commented Jan 26, 2017

Here is the same problem in version v6.9.4 (LTS)
Recordet what is happening while running in debug mode
https://youtu.be/mYMMZzEDAE4 (to big to upload here)
The first package is sendt (23sec in the movie), the rest is send just before the stack is closing (29sec)

Here you can see how the memory is growing to 1,3GB if i increase the loop to 1000000
https://youtu.be/xAAz56bIcm0

Server:

var PORT = 33333;
var HOST = '0.0.0.0';

var dgram = require('dgram');
var server = dgram.createSocket('udp4');

server.on('listening', function () {
    var address = server.address();
    console.log('UDP Server listening on ' + address.address + ":" + address.port);
});

server.on('message', function (message, remote) {
    console.log(remote.address + ':' + remote.port +' - ' + message);

});

server.bind(PORT, HOST);

Client:

var PORT = 33333;
var HOST = '127.0.0.1';

var dgram = require('dgram');
var client = dgram.createSocket('udp4');
for(i=0;i<1000000;i++){
var message = new Buffer('My KungFu is Good! '+ i);
client.send(message, 0, message.length, PORT, HOST, function(err, bytes) {
    if (err) throw err;
	if(i==1000000){
	console.log("done");
}
});
}

@Trott
Copy link
Member

Trott commented Jul 16, 2017

@mscdex @cjihrig @mcollina Should this remain open?

@mcollina
Copy link
Member

@Trott this can be closed, I'm closing it.

@punnerud the problem in your code is caused by the fact that you are scheduling 1000000 messages to be sent before the socket is even ready. This is causing the massive memory allocation spike. You should schedule your messages using a queue and send only a limited number in parallel.

cjihrig added a commit to cjihrig/node that referenced this issue Aug 3, 2017
This commit adds support for custom DNS lookup functions in
dgram sockets. This is similar to the existing feature in net
sockets.

Refs: nodejs#6189
PR-URL: nodejs#14560
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Wyatt Preul <wpreul@gmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
jasnell pushed a commit that referenced this issue Sep 25, 2017
This commit adds support for custom DNS lookup functions in
dgram sockets. This is similar to the existing feature in net
sockets.

Refs: #6189
PR-URL: #14560
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Wyatt Preul <wpreul@gmail.com>
Reviewed-By: Evan Lucas <evanlucas@me.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dgram Issues and PRs related to the dgram subsystem / UDP. dns Issues and PRs related to the dns subsystem. memory Issues and PRs related to the memory management or memory footprint.
Projects
None yet
Development

No branches or pull requests

6 participants