Wow, this is really amazing! Really great analysis.
LGTM
wow! :D
I'm trying to dig deeper into what is the actual bit of work that was getting repeated or backed up here. Fundamentally I don't believe it. I've seen it with my own eyes but I still don't believe it.
I was expecting something like this. You might see some more improvement once ipfs/js-datastore-fs#21 lands. We should likely do a call to understand what is happening and why.
The only significant difference I can see in the pull stream code between …
Wow, this is absolutely fantastic!!! Amazing work @achingbrain!! 👏🏽👏🏽👏🏽 @achingbrain would you be down to record a 5 minute screencast (maybe even just 2) to share with the community? Also, would love to have a session on how you approach these performance problems and generate these graphs so that everyone can learn from it.
I was just checking this out with this script:

```js
const Ipfs = require('ipfs')
const Os = require('os')
const Path = require('path')
const Crypto = require('crypto')

const KB = 1024
const MB = KB * 1024

console.log('starting...')

const ipfs = new Ipfs({ repo: Path.join(Os.tmpdir(), `${Date.now()}`) })

ipfs.on('ready', async () => {
  console.log('ready...')

  const data = Crypto.randomBytes(240 * MB)

  console.log('adding...')
  console.time('added')

  const res = await ipfs.add(data)

  console.log(res[0].hash)
  console.timeEnd('added')

  await ipfs.stop()
  process.exit()
})

ipfs.on('error', console.error)

console.log('waiting for ready...')
```

I get the output:

```
starting...
waiting for ready...
Swarm listening on /ip4/127.0.0.1/tcp/4002/ipfs/QmcNzyg3kVu7n1Gmwc1am8BU1kqVh158oLUP1JGmxubstt
Swarm listening on /ip4/192.168.1.68/tcp/4002/ipfs/QmcNzyg3kVu7n1Gmwc1am8BU1kqVh158oLUP1JGmxubstt
Swarm listening on /ip4/127.0.0.1/tcp/4003/ws/ipfs/QmcNzyg3kVu7n1Gmwc1am8BU1kqVh158oLUP1JGmxubstt
ready...
adding...
QmXWAhTKcYSJtSgVoEuPAZsc77pH8aG4HpsYgtBoM5uG7z
added: 6986.134ms
```

Am I doing this right? With 0.34.0-rc.1 I'm getting ~6s minimum, which is definitely way faster than with 0.33 (confirmed ~13s), but it's a way off from <2s.
On our benchmark server we are measuring 2.07s for Go and 2.19s for Node (it dropped from 3s to 2s) to add a 64MB file.
Fixes #9
Converts the chunk-by-chunk passing of blocks to IPLD to be done in parallel instead, using `pull-paramap`.

You can limit the concurrency in `pull-paramap` - I wasn't sure which value to use, so I did a comparison using this test, which uses `pull-buffer-stream` to generate streams of buffers of random bytes. The results were (lower is better):
So there's a slight overhead with 100x concurrency, but unbounded is about the same as 50x on my laptop so in this PR I leave it unbounded.
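For what it's worth, here's a minimal sketch of what bounding the write concurrency with `pull-paramap` looks like - this is illustrative only, not the actual importer code, and `putBlock` is a hypothetical stand-in for the real block store write:

```js
const pull = require('pull-stream')
const paramap = require('pull-paramap')
const crypto = require('crypto')

// Hypothetical stand-in for persisting a block and calling back when done
const putBlock = (block, cb) => setImmediate(() => cb(null, block.length))

pull(
  // Ten 256KB chunks of random bytes standing in for file chunks
  pull.values(Array.from({ length: 10 }, () => crypto.randomBytes(256 * 1024))),
  // Write blocks in parallel; the second argument caps concurrency (e.g. 50).
  // Omit it to leave the parallelism unbounded, as this PR does.
  paramap(putBlock, 50),
  pull.collect((err, sizes) => {
    if (err) throw err
    console.log(`wrote ${sizes.length} blocks`)
  })
)
```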
This is only 0-20% of the file being added. I was intrigued by the jump, so I left it running in unbounded parallel mode to see if there were any other bottlenecks:

The line is reasonably straight, so at least it's constant and that's ok, right? Wrong. Each value on the x axis is 1%, but the y axis is how long it took to ingest that 1%, so it's getting slower over time. Adding a file to IPFS is basically going in O(n log n) time, which is kind of bad for large files.
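As an aside, per-1% timings like those in the graph can be gathered with a simple observer dropped into the pipeline; this is just a sketch and the `timeEachPercent` helper is hypothetical, not something from this PR:

```js
const pull = require('pull-stream')

// Hypothetical helper: logs how long each 1% of the file takes to flow through the pipeline
const timeEachPercent = (totalBytes) => {
  const onePercent = Math.ceil(totalBytes / 100)
  let bytesInBucket = 0
  let bucketStart = Date.now()

  return pull.through((chunk) => {
    bytesInBucket += chunk.length

    while (bytesInBucket >= onePercent) {
      bytesInBucket -= onePercent
      const now = Date.now()
      console.log(`1% ingested in ${now - bucketStart}ms`)
      bucketStart = now
    }
  })
}
```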
What's causing this? It turns out using `pull-stream/throughs/map` to calculate file ingest progress is the culprit, and it allocates a new buffer for each file chunk to boot, so it's slow and memory inefficient.

Switching that out for a `pull-stream/throughs/through` results in the following graph:

Or about 2300x faster.
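Roughly, the difference looks like this - an illustrative sketch, not the actual importer code, where `onProgress` and the copy via `Buffer.from` just stand in for the behaviour described above:

```js
const pull = require('pull-stream')

let bytesRead = 0
const onProgress = (bytes) => { /* e.g. emit a progress event */ }

// map() produces a value for every chunk, so a progress hook written this way
// ends up allocating a copy of each chunk as it passes through
const progressWithMap = pull.map((chunk) => {
  bytesRead += chunk.length
  onProgress(bytesRead)
  return Buffer.from(chunk) // extra allocation per chunk
})

// through() just observes each chunk and passes it along untouched
const progressWithThrough = pull.through((chunk) => {
  bytesRead += chunk.length
  onProgress(bytesRead)
})
```

Both are transforms you would place in the middle of a `pull(...)` pipeline; only the second leaves the chunks alone.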
Writing the file chunks in series results in:
Or about 2200x faster.
So switching from series to parallel writes gets you a modest speed increase, but not changing pull stream data in-flight gets you an enormous boost.
In real-world use, this changed the time it takes to `jsipfs add` a 260MB file to a fresh repo from 13.7s to 1.95s. By comparison, `go-ipfs` takes 1.58s to add the same file to a fresh repo.