Multi-threaded setup #189
@chr4 You should construct a single OSRM object and share it between requests. By default, only a small pool of worker threads is used. Try adding this to your JS code before creating the OSRM object:
|
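A minimal sketch of what such a snippet could look like, assuming the knob in question is libuv's thread pool size (UV_THREADPOOL_SIZE, which defaults to 4 worker threads); the value 8 is only an example:

```js
// Sketch: raise libuv's worker pool before the first async call so that
// more OSRM requests can be processed in parallel on worker threads.
process.env.UV_THREADPOOL_SIZE = 8; // pick a value matching your core count

var OSRM = require('osrm');
var osrm = new OSRM(); // attach to the shared-memory dataset, as in the code below
```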
Thanks for the hints! Unfortunately, CPU usage stays the same regardless of that setting; the osrm module was installed via npm. Here's the code I'm using (the omitted code just assembles the config hash): var express = require('express');
var OSRM = require('osrm');
var app = express();
var osrm = new OSRM();
[...]
function getDrivingDirections(req, res) {
[...]
osrm.route(config, function(err, result) {
if (err) {
return res.json({
error: err.message
});
} else {
return res.json(result);
}
});
}
app.get('/route/v1/driving/:coordinates', getDrivingDirections);
console.log('Listening on port: ' + 5000);
app.listen(5000); Any further hints? |
@chr4 Sounds like you might be I/O bound at some layer. Almost all of the routing data is loaded into RAM, except the coordinate index (the spatial index file), which is read from disk. If you're not running on an SSD disk, try that. If you've got enough RAM, you can also try moving everything into a ramdisk (e.g. using tmpfs). |
I'm using an SSD RAID that delivers well over 700 MB/s (probably > 1 GB/s). Furthermore, the old setup (using osrm-routed) was able to use all the cores.
Any further ideas? To make sure I got it right: even running multiple node-osrm processes against the shared-memory data shouldn't be limited like this? |
@chr4 Yes, even with multiple processes you should be able to use all the cores. I'm not sure what else to suggest at this stage. There's a bottleneck somewhere that you're going to need to track down. Have you tried removing the OSRM call to see if your Node process can create sufficient CPU load by itself? |
@chr4 actually we tried to reproduce this and saw similar behavior. @daniel-j-h is working on a fix that should utilize all the cores using Nan::AsyncQueueWorker. |
Thanks for looking into this! Let me know if you have something to try out, or in case you need any other feedback. |
We haven't figured out exactly what changed in the libuv behavior since the 0.10.x series, but it seems like it doesn't play nicely anymore. |
For the record, properly using Nan::AsyncQueueWorker fixes this. |
This fixes concurrency issues reported in #189.

References:
- https://github.com/nodejs/nan#asynchronous-work-helpers
- https://github.com/nodejs/nan/blob/master/doc/asyncworker.md#api_nan_async_worker
- https://github.com/nodejs/nan/blob/master/doc/asyncworker.md#api_nan_async_queue_worker
@chr4 I worked on this during the last few days and my pull request has already landed in master. |
Wow, that was quick! Is the fix already released in a version on npm? |
@chr4 it's in 5.1.1, should be good to just npm install. |
I tried out version 5.1.1, but I'm still seeing the same behavior. Is there anything I have to change in my code? Out of curiosity: the CHANGELOG of osrm-backend indicates performance improvements as well, are those related? |
No, these were external contributions. Can you test the following for me? Go into the node-osrm repository and build it from source.
Then execute the following script and watch CPU usage:
EDIT: Sorry, I used the wrong script. |
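The script itself isn't preserved above; as a stand-in, here is a minimal sketch of such a load test, assuming a local berlin-latest.osrm extract and coordinates inside it (both are placeholders, not the original script):

```js
// Stand-in load test: fire many route requests at once and watch whether
// CPU usage climbs past ~100-200%.
process.env.UV_THREADPOOL_SIZE = 8; // assumption: more worker threads than the default 4

var OSRM = require('osrm');
var osrm = new OSRM('./berlin-latest.osrm'); // placeholder dataset path

var query = {coordinates: [[13.388860, 52.517037], [13.397634, 52.529407]]}; // lon,lat
var pending = 10000;

for (var i = 0; i < pending; ++i) {
  osrm.route(query, function(err, result) {
    if (err) throw err;
    if (--pending === 0) console.log('done');
  });
}
```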
When using export CXX=g++ (it won't compile with clang++), I get a number of build errors.
I tried pointing the build to the relevant paths manually, but without success. |
@chr4 sorry about that. You should probably do: |
Sorry to bother again, but I get exactly the same error messages after doing a fresh checkout. This time, though, it seems to work after running the command you suggested.
When running my server.js against the freshly built module, CPU usage still maxes out at around 200%. Next steps? Let me know if there's anything else that I can test out or provide to help. |
For the test-script from above? Hm. Can you try to add the thread pool setting from the first comment at the very top of the script?
This could be caused by something saturating the main thread of your node process with blocking calls that are not OSRM. A good way to test this is to replace all OSRM calls with async timeouts like: var express = require('express');
var OSRM = require('osrm');
var app = express();
var osrm = new OSRM();
[...]
function getDrivingDirections(req, res) {
[...]
setTimeout(function() {
return res.json(someDummyResponse);
}, 100);
}
app.get('/route/v1/driving/:coordinates', getDrivingDirections);
console.log('Listening on port: ' + 5000);
app.listen(5000); Expected behavior of the above script: it still maxes out at 200% CPU. If you can now magically push beyond the 200% we might have another performance bug on our hands with the node bindings. (maybe validation?) |
Setting the thread pool size at the top of the script didn't change anything. You're right. Here's what it looks like when wrapping the osrm calls into setTimeout: function getDrivingDirections(req, res) {
[...]
setTimeout(function() {
osrm.route(config, function(err, result) {
if (err) {
return res.json({
error: err.message
});
} else {
return res.json(result);
}
});
}, 100);
} |
Update: I just replaced the osrm call with a plain dummy timeout (instead of wrapping it) as suggested. |
Hm, a few questions come to mind. To narrow this down you could also try 4.9.1 as the last pre-5 release (where a lot of things changed that could maybe cause regressions). |
We never used 4.9.x in production, only for tests (production still uses the ancient 0.3.x). I suppose even more things changed between 0.3.x and 5.0.x :)) For the record: the new setup is approx. as fast as the old one (which uses all the cores), so it's not like it is slower overall. I just imagine it could be something like three times as fast if we could get it to use all available cores. |
If the script that sends the requests is too slow (for example, because it is not concurrent), the server can't reach full load. If I get a sample of the requests I can try to reproduce this on my machines. |
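For illustration, a minimal sketch of a client that fires requests concurrently, so the client doesn't become the bottleneck (URL, port, and request count are assumptions matching the server code earlier in the thread):

```js
var http = require('http');
http.globalAgent.maxSockets = 64; // allow many parallel connections

var url = 'http://localhost:5000/route/v1/driving/13.388860,52.517037;13.397634,52.529407';
var total = 1000;
var done = 0;

for (var i = 0; i < total; ++i) {
  http.get(url, function(res) {
    res.resume(); // drain the response body
    if (++done === total) console.log('all requests finished');
  }).on('error', console.error);
}
```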
We're generating 1000 requests and then firing them multi-threaded. I've generated a set for you to test with (sorry, hope this won't blow up the GitHub comments; does GitHub Markdown support spoilers? :)) https://gist.github.com/chr4/f404f5bdfe81fe27fce1d33f037391e3
You might be right. |
@chr4 these requests seem to have longitude and latitude swapped? The order for v5 is lon,lat, not lat,lon. |
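For reference, a short sketch of the v5 ordering in both the request path and the bindings (the sample coordinates are arbitrary):

```js
// v5 expects longitude first, then latitude:
//   HTTP path:     /route/v1/driving/13.388860,52.517037;13.397634,52.529407
//   node bindings: coordinates given as [[lon, lat], ...]
var config = {coordinates: [[13.388860, 52.517037], [13.397634, 52.529407]]};
```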
Oh, my bad, sorry. I wasn't sure about the order when generating the URLs; I updated the post above with a link to an updated gist. This obviously affects the results: CPU usage now sometimes spikes briefly to 700-800% (for ~0.1s), but the average is around 200% again. |
With OSRM v5.2.5 via 1/ 2/ |
To be clear here: this shows OSRM itself using all available cores on my machine. That, together with your statement that the average CPU usage is still around 200%, lets me think your bottleneck is not in OSRM at all. |
I'm closing this as resolved for us. We haven't heard back from you in a while. Feel free to test with 5.4 and open a new issue if you still see performance issues. |
Small note: if you're not interested in examining the OSRM object tree, but only care about sending JSON over the network (i.e. you're essentially using node-osrm as an HTTP backend), there may be a cheaper way to get the result out.
Thanks for the advice! Is there anything I can do to allow SHM-type behavior with the node.js wrapper for OSRM? Also (this might be better as a separate issue): is it possible to use MLD-compiled .osrm files with node-osrm? |
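This thread doesn't answer the MLD question, but newer releases of the bindings expose an algorithm option; a sketch, assuming a bindings version that supports it and a dataset prepared with osrm-partition and osrm-customize:

```js
var OSRM = require('osrm');

// Assumption: berlin-latest.osrm was processed with osrm-partition + osrm-customize,
// and the installed bindings version understands the algorithm option.
var osrm = new OSRM({path: './berlin-latest.osrm', algorithm: 'MLD'});
```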
I've been running the internal osrm-backend C++ API in the past, and recently came across the note that it's not supposed to be run in production. On my way to migrating to node-osrm, I came across an issue: I can't get a setup to work that uses more than ~1.5 CPU threads.

As nodejs is single-threaded, I was using the nodejs internal cluster module, and I also tried load-balancing requests using haproxy to two different instances of node-osrm on the same machine.

I'm using

var osrm = new OSRM();

to access the map data via SHM (loaded via osrm-datastore).

While both approaches succeed in spawning multiple instances of node-osrm, and all instances are receiving and processing requests, the overall CPU load on the server is still roughly the same as with the single node-osrm setup. I therefore assume that there's a lock somewhere in the node-osrm -> libosrm stack preventing parallel use.
How do you guys run node-osrm in production? Are you just relying on a single core, or is there something I missed?
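For completeness, a minimal sketch of the cluster setup described above (port and route handler mirror the earlier snippets; the dataset is assumed to have been loaded with osrm-datastore beforehand):

```js
var cluster = require('cluster');
var os = require('os');

if (cluster.isMaster) {
  // one worker per core
  for (var i = 0; i < os.cpus().length; ++i) cluster.fork();
} else {
  var express = require('express');
  var OSRM = require('osrm');
  var app = express();
  var osrm = new OSRM(); // no path: attach to the shared-memory dataset

  app.get('/route/v1/driving/:coordinates', function(req, res) {
    // parse "lon,lat;lon,lat;..." from the URL
    var coords = req.params.coordinates.split(';').map(function(pair) {
      return pair.split(',').map(Number);
    });
    osrm.route({coordinates: coords}, function(err, result) {
      if (err) return res.json({error: err.message});
      return res.json(result);
    });
  });

  app.listen(5000);
}
```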