Skip to content
This repository has been archived by the owner on Jun 8, 2024. It is now read-only.

Refactors Bindinds to use Async Workers #193

Merged
merged 1 commit into from
May 20, 2016
Merged

Refactors Bindinds to use Async Workers #193

merged 1 commit into from
May 20, 2016

Conversation

@daniel-j-h daniel-j-h added the bug label May 19, 2016
@daniel-j-h
Copy link
Member Author

The support code could benefit from some refactoring.
I didn't touch it except for moving it into a separate header.

}

return engine_config;
static Nan::Persistent<v8::Function> init;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is needed for the thread-safe singleton impl. without relying on global constructors.

@TheMarex TheMarex merged commit 18e7c56 into master May 20, 2016
@TheMarex TheMarex deleted the async_worker branch May 20, 2016 17:18
@TheMarex TheMarex mentioned this pull request May 20, 2016
4 tasks
@springmeyer
Copy link

@daniel-j-h - can you share more about what was wrong with the original code, what the specific fix needed was, and how you confirmed the fix impacted performance? I'm asking because its not clear to me whether the fixes here apply or are relevant to other async code I maintain for other node c++ modules.

@daniel-j-h
Copy link
Member Author

@springmeyer a couple of weeks before this pull request I learned about npm / node / nan by writing node bindings for @danpat's route-annotator. For making its features asynchronous I used NAN's beautiful async. abstractions. More concretely: 1/ inherit from Nan::AsyncWorker implementing the interface, 2/ enqueue the worker with Nan::AsyncQueueWorker.

If you take a look at the code before this changeset, node-osrm used libuv directly as in uv_work_t, uv_queue_work etc. --- when the performance regression came up during the sprint, I quickly prototyped the switch to NAN's async. abstractions for the Match service for which @TheMarex and I verified that it solved the problem. Eventually I rewrote all services (and added some abstractions that allows us to share code between the services).

Looping in libuv-wizard @kkaefer here.

@springmeyer
Copy link

springmeyer commented Jun 21, 2016

@daniel-j-h - Thank you for the clarification. If I am understanding correctly then:

  • The code using uv_work_t/uv_queue_work directly was buggy somehow and not performing well
  • Porting to Nan fixed the issue
  • You benchmarked to confirm the performance difference

Is that correct? If so, there are dozens of node c++ modules that we (Mapbox) maintain which need fixed because they are using libuv directly, without any known problems. But your findings indicate they are likely not working correct for some reason. So my further questions are:

  • Was the performance problem restricted to node v4 and not node v0.10?
  • Do you understand what Nan is doing that helped perf? Is it using the libuv API differently?

If you don't know no problem, I'm just trying to understand before diving in to figure out whether we need to change many node c++ modules and why/how. Thanks!

@daniel-j-h
Copy link
Member Author

If I am understanding correctly then:

  • The code using uv_work_t/uv_queue_work directly was buggy somehow and not performing well
  • Porting to Nan fixed the issue
  • You benchmarked to confirm the performance difference

Is that correct?

The code using uv_work_t / uv_queue_work worked fine until we switched to Node 4.x (which updated the bundled libuv version, too). We never traced down the changes in libuv or the reason for the limited concurrency beyond the direct libuv usage, because the AsyncWorker approach fixed it and refactored the code at the same time quite nicely.

If you want to dig into this, this is what I would do:

  • checkout the commit before this changeset
  • get familiar with the source code that is using libuv a bit (maybe you find issues in it)
  • reproduce concurrency issue (see the original issue at the top)
  • read the Nodejs Changelog and get the libuv version number that comes bundled with it
  • check the libuv docs and source for this version and the features the source code is using
  • build a small self contained example reproducing the issue

Was the performance problem restricted to node v4 and not node v0.10?

Yes, I'm not sure though what else changed since I haven't had a close eye on node-osrm before, but I think that was the reason. Maybe @TheMarex or @lbud know more about this.

Do you understand what Nan is doing that helped perf? Is it using the libuv API differently?

From reading the NAN source, it looks like a really thin wrapper around libuv's uv_work_t / uv_queue_work. Take a look here:

Hope that helps.

@springmeyer
Copy link

Thanks @daniel-j-h - I'll try to replicate and dig in when I have time.

@danpat
Copy link
Member

danpat commented Jun 25, 2016

@springmeyer FWIW, it may be that the way node-osrm was using uv_work_queue was incorrect, and other usages @mapbox are just fine.

We noticed this when we were running benchmarks - we were maxing out at 2 CPUs in use and couldn't drive usage any higher. If you aren't seeing that performance problem elsewhere, then it's probably a non-issue.

Rather than diving into what was wrong with our direct usage of uv_work_queue, @daniel-j-h just swapped in AsyncWorker and we got our performance back. This tidied up the code and simplified things a lot, so we ran with it. We do not know exactly what was wrong with our old code, but the problem was triggered by the Node 0.10.x to 4.x update.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants