fs.realpath 70x slower than native #7902
Comments
Cc: @isaacs, @rsms, @piscisaureus, who seem to have written most of the By the way, there was a comment (now deleted) in the |
+1. I think node should use the native BTW, it's only a 13x difference on my MacBook: |
Chiming in here: @arrelid and I spent about a day designing and a couple of days building the asynchronous version of realpath. My personal opinion:
Hitting |
Hm, my guess would be that under most circumstances the synchronous C version is still faster than the asynchronous JS version, since most of the time is spent on CPU. I might certainly be wrong, but it might be worth benchmarking. (I'm not writing the patch here, so please just ignore me if you disagree - it's up to you.)
👍 |
@joliss You are indeed wrong 😄 - realpath will spend 99% of its time on I/O and only a very small amount instructing the CPU. In essence, what realpath does is query the file system until it has found the canonical path to a particular file, where "canonical" means a hard-link leaf in a "real" (no symlinks) directory tree. Basically a bunch of

So clearly calling realpathSync on an "unfortunate" path could freeze up your entire NodeJS process for N time, where N is anywhere from a couple of microseconds to minutes or even hours (e.g. if part of the path that needs to be resolved is on a slow network drive, or the local hard drive is busy doing other work). An asynchronous (or "dispatched", if you will) approach to I/O is a core philosophy on which NodeJS was based.
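To make the "query the file system until the canonical path is found" loop concrete, here is a minimal sketch. It is not Node's actual `fs.realpath` implementation: `realpathSketch` is a hypothetical name, the code assumes POSIX-style paths, it uses the synchronous calls for brevity, and the cap of 40 symlink hops is an arbitrary choice for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not Node's implementation): resolves `p` one component
// at a time, issuing an lstat per component and a readlink per symlink.
function realpathSketch(p) {
  // Assumption: POSIX paths, and process.cwd() is already canonical.
  const start = p.startsWith('/') ? p : process.cwd() + '/' + p;
  let pending = start.split('/').filter(Boolean); // components still to visit
  let resolved = '/';                             // canonical prefix so far
  let hops = 0;
  while (pending.length > 0) {
    const part = pending.shift();
    if (part === '.') continue;
    if (part === '..') {
      // `resolved` contains no symlinks, so its lexical parent is its real parent.
      resolved = path.dirname(resolved);
      continue;
    }
    const next = path.join(resolved, part);
    if (!fs.lstatSync(next).isSymbolicLink()) {
      resolved = next;
      continue;
    }
    if (++hops > 40) throw new Error('too many levels of symbolic links');
    const target = fs.readlinkSync(next);
    // An absolute target restarts from the root; a relative one continues
    // from the directory that contains the link.
    if (target.startsWith('/')) resolved = '/';
    pending = target.split('/').filter(Boolean).concat(pending);
  }
  return resolved;
}

console.log(realpathSketch('.'));
```

Every component costs at least one file system call, which is why the cost depends so heavily on whether those lookups are served from cache or from a slow device.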
I think a lot is going to depend on whether the kernel's dentry cache is hot or not. In the benchmarks that @joliss posted, the dentry for "." is loaded once and then hit over and over again. There's almost no real I/O taking place, nearly all wall clock and CPU time is spent on string operations and marshalling the That's easy to optimize for but probably not representative of a real workload. It would be more interesting to see what performance is like when you hit a large range of paths starting from a cold cache ( |
@bnoordhuis Good point. I assumed the tests were run on uncached paths. Otherwise—as you point out—all the code is doing is essentially javascript object key lookups. Theoretically, realpath should use a minuscule amount of CPU when resolving a path and spend basically all of its time waiting to be woken up by the kernel. |
Yes, my use case here is a CLI tool (Broccoli, for building JavaScript browser apps), where stuff tends to come out of the cache. The problem we have is with the CPU usage of |
I recommend you submit a patch (pull req or patch to the mailing list) where realpathSync takes the libc naïve approach of assuming <=PATH_MAX. |
Here's the flamegraph for I'll take a look into it sometime. |
@trevnorris ... any further updates on this? |
Haven't tried latest v0.12, but io.js v2.3.0 now runs this in ~9us. Still not near optimal, but better. I'll take a quick look. |
Hm. Difference may be from my box. The native code runs at 120ns/op, while the script on io.js v2.3.0 runs in 3430ns/op (which is less than the 9us I was getting before a small change to the test). Either way, that is significant overhead. Going to take a peek at the code.
That method is substantial. Not really sure what's going on. It's outside the realm of trivial fix, and not exactly sure why we aren't just linking to @jasnell I'll let you decide whether to keep this open. Did a comparison on the tip of v0.12 branch and it runs in the same amount of time as io.js v2.3.0. So what I thought was a performance improvement was from a faster hd. |
I agree that linking to |
Ok, I'm going to mark this as a defer but keep it open. The fix would likely be made over in io.js or the converged repo then backported here. |
@jasnell is the issue safe here, or should we open a new one on a non-archive repo? I just want to prevent it from being forgotten.
Either way. It's safe here. The only downside is that it's going to be as visible sitting here as a new issue would be over in nodejs/node. |
i dont see a downside (or was that the point) |
lol.. I mean not going to be as visible sitting here... there aren't that many people looking over these older issues. |
IIRC there wasn't any opposition to using |
@joliss has already identified Node's performance issues with realpath [here](nodejs/node-v0.x-archive#7902).

As per @krisselden's suggestion, we can likely just use readlinkSync here instead.

Real-life example: https://github.com/ember-cli/stress-app

before: 8668.970928ms
after: 7152.094867ms

18% incremental build improvement.

As realpathSync uses readlinkSync internally, this also reduces calls to readlinkSync:

before: 17826 (134.230726ms)
after: 6171 (93.061130ms)
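A rough sketch of what swapping realpathSync for a single readlinkSync call could look like. This is not the code from the referenced change: `resolveLinkOnce` is a hypothetical helper, and it assumes the path is known to be a symlink, that its parent directory is already canonical, and that the link target is not itself a symlink.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: follows exactly one symlink with a single readlink
// call instead of walking every path component as realpathSync does.
// Throws (EINVAL) if linkPath is not a symlink.
function resolveLinkOnce(linkPath) {
  const target = fs.readlinkSync(linkPath);
  // Relative targets are interpreted relative to the link's own directory.
  return path.resolve(path.dirname(linkPath), target);
}
```

When those assumptions do not hold (for example, a symlinked parent directory), the result can differ from what realpathSync returns, so this only fits call sites where the directory layout is under the tool's control.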
Can this be done in uv? |
@trevnorris should I reopen on a different repo?
Opening a new issue or PR over in nodejs/node would likely be good. |
The `fs.realpath` function is 70x slower than native C `realpath`. On my system, `fs.realpath` takes 32 µs, while C `realpath` takes 0.45 µs.

This is a real problem in the Broccoli build tool, where we need to resolve symlinks in hot code paths. Resolving 1000 symlinked files - not an unusual case - would take 45 ms, slowing down the build considerably. [1]
As for a solution: I haven't looked at the `fs.js` source in detail, but it seems we might be able to call the `realpath` function in the C standard library, where available, instead of using our own implementation.

Benchmark code for Node:
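A minimal sketch of such a benchmark, assuming it repeatedly resolves `.` with `fs.realpathSync`. The helper name `microsPerCall` and the iteration count are illustrative and not taken from the original snippet. On Node versions that provide `fs.realpathSync.native`, the sketch also times that variant for comparison.

```javascript
const fs = require('fs');

// Hypothetical helper: average wall-clock time per call of fn(), in microseconds.
function microsPerCall(fn, iterations) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return elapsedNs / iterations / 1000;
}

const iterations = 20000; // arbitrary; raise it for more stable numbers

const jsTime = microsPerCall(() => fs.realpathSync('.'), iterations);
console.log('fs.realpathSync:', jsTime.toFixed(2), 'µs per call');

// Only present on newer Node versions, so guard the call.
if (typeof fs.realpathSync.native === 'function') {
  const nativeTime = microsPerCall(() => fs.realpathSync.native('.'), iterations);
  console.log('fs.realpathSync.native:', nativeTime.toFixed(2), 'µs per call');
}
```

Because the same path is resolved on every iteration, this measures the hot-cache case: mostly call overhead and string handling rather than disk I/O.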
Benchmark code for C:
Run with `gcc -std=gnu99 realpath-benchmark.c -o realpath-benchmark && time ./realpath-benchmark`. This yields 0.45 µs per iteration on my Linux system.

[1] We cannot work around this by using naïve string concatenation, because path_resolution(7) requires that we resolve symlinks in all path components. Here is a gist to show why this matters.
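To illustrate the footnote (this is a separate sketch, not the linked gist): the example below builds a small temporary tree in which a file symlink with a relative target is reached through a symlinked directory. Joining the link target onto the path's directory as a string gives a different, non-existent path from the one `fs.realpathSync` returns. The function name `demoNaiveVsReal` and the directory layout are made up for illustration, and the code assumes a POSIX system where creating symlinks is permitted.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical demo. Layout created under a fresh temp directory `base`:
//   base/real/data.txt
//   base/real/sub/f  -> ../data.txt   (file symlink, relative target)
//   base/alias       -> real/sub      (directory symlink)
function demoNaiveVsReal() {
  const base = fs.realpathSync(fs.mkdtempSync(path.join(os.tmpdir(), 'realpath-demo-')));
  fs.mkdirSync(path.join(base, 'real', 'sub'), { recursive: true });
  fs.writeFileSync(path.join(base, 'real', 'data.txt'), '');
  fs.symlinkSync(path.join('real', 'sub'), path.join(base, 'alias'));
  fs.symlinkSync(path.join('..', 'data.txt'), path.join(base, 'real', 'sub', 'f'));

  const input = path.join(base, 'alias', 'f');
  // Naive approach: glue the link target onto the directory part of the input.
  const naive = path.resolve(path.dirname(input), fs.readlinkSync(input));
  // Full resolution: every component, including `alias`, is checked for symlinks.
  const real = fs.realpathSync(input);
  return { base, naive, real };
}

const result = demoNaiveVsReal();
console.log('naive:', result.naive); // <base>/data.txt, which does not exist
console.log('real: ', result.real);  // <base>/real/data.txt
```

The naive result is wrong because `../data.txt` has to be interpreted relative to the directory that really contains the link (`real/sub`), not relative to the symlinked `alias` directory that appears in the input path.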