parallelize batch flushing #4296

Merged
merged 3 commits into master from feat/parallel-batch on Oct 14, 2017
Conversation

Stebalien
Member

  1. Modern storage devices (e.g., SSDs) tend to be highly parallel.
  2. Allows us to read and write at the same time (avoids pausing while flushing).

This makes ipfs add --local ~3.5x faster with the flatfs datastore (untested
with badger).

fixes #898 (comment)

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>

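A minimal Go sketch of the approach described above, assuming hypothetical Block and Datastore types and field names (not the actual go-ipld-format code; the real code tracks an activeCommits counter, visible in the review excerpts below, while this sketch uses a buffered channel as the semaphore):

package batch

import (
	"runtime"
	"sync"
)

// Block and Datastore are illustrative stand-ins, not the real interfaces.
type Block []byte

type Datastore interface {
	PutMany(blocks []Block) error
}

// ParallelBatchCommits is the number of batch commits that can be in-flight
// before blocking (the constant discussed in the review below).
var ParallelBatchCommits = runtime.NumCPU() * 2

type Batch struct {
	ds      Datastore
	blocks  []Block
	size    int
	maxSize int

	sema chan struct{} // bounds the number of in-flight commits
	wg   sync.WaitGroup

	mu  sync.Mutex
	err error // first error reported by a background commit
}

func NewBatch(ds Datastore, maxSize int) *Batch {
	return &Batch{
		ds:      ds,
		maxSize: maxSize,
		sema:    make(chan struct{}, ParallelBatchCommits),
	}
}

// Put buffers a block and triggers an asynchronous flush once the buffer
// fills, so reading and hashing continue while earlier batches are written.
func (b *Batch) Put(blk Block) error {
	b.blocks = append(b.blocks, blk)
	b.size += len(blk)
	if b.size >= b.maxSize {
		return b.flush()
	}
	return nil
}

// flush hands the current buffer to a goroutine and starts a fresh one.
func (b *Batch) flush() error {
	b.sema <- struct{}{} // blocks if ParallelBatchCommits are already in flight
	b.wg.Add(1)
	go func(blocks []Block) {
		defer b.wg.Done()
		defer func() { <-b.sema }()
		if err := b.ds.PutMany(blocks); err != nil {
			b.mu.Lock()
			b.err = err
			b.mu.Unlock()
		}
	}(b.blocks)

	// Preallocate the next buffer to the size of the one just flushed
	// (the preallocation settled on in the review thread below).
	b.blocks = make([]Block, 0, len(b.blocks))
	b.size = 0

	b.mu.Lock()
	defer b.mu.Unlock()
	return b.err
}

// Commit flushes any remaining blocks and waits for all in-flight commits.
func (b *Batch) Commit() error {
	if err := b.flush(); err != nil {
		return err
	}
	b.wg.Wait()
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.err
}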
@Stebalien
Member Author

We may want to reduce the parallelism. 2*NumCpu is slightly faster than NumCpu but will allocate 2x the memory (on a 4 thread CPU, it will allocate 64MiB instead of 32MiB).

However, we should probably test badger first (it may work better with increased parallelism).
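To make the memory math concrete: the quoted figures imply roughly 8 MiB buffered per in-flight batch (an assumption inferred from the numbers above, not read from the code):

package main

import "fmt"

func main() {
	const perBatch = 8 << 20 // 8 MiB per in-flight batch (assumed)
	const threads = 4        // the 4-thread CPU from the example
	fmt.Printf("NumCpu:   %d MiB\n", threads*perBatch>>20)   // 4 * 8 MiB = 32 MiB
	fmt.Printf("2*NumCpu: %d MiB\n", 2*threads*perBatch>>20) // 8 * 8 MiB = 64 MiB
}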

@Stebalien
Member Author

This makes adds with sync enabled almost as fast as adds with sync disabled with the badger datastore (and the same 2*NumCpu constant seems to work).

@whyrusleeping
Member

This is really nice, great find here :)


// ParallelBatchCommits is the number of batch commits that can be in-flight before blocking.
// TODO: Experiment with multiple datastores, storage devices, and CPUs to find
Member

Can you create issues for this instead of in-code TODOs? (They just get forgotten.)

Member Author

Fine! (...grumble... we'll never get to it anyways)

Member

That might be true, but it is still better than having a TODO in code.
As you said in your issue, someone might just take a stab at it out of pure boredom.

Member Author

You're right, I was just being lazy 🙂.

}(t.blocks)

t.activeCommits++
t.blocks = nil
@Kubuxu
Member
Oct 11, 2017

I would preallocate a buffer of MaxBlocks here, as appending will expand the buffer and cause more allocations and copies.

Member Author

Technically, the max size of this array is 128 pointers to blocks (2KiB). However, it will likely never be greater than 32 pointers (0.5KiB) assuming that we have 256KiB blocks. Does 32 sound like a reasonable default size?

Personally, I don't think that will make much of a difference. We already do 1 allocation per block so this will only add another log(n) allocations.

Member Author

Actually, I just preallocated a blocks array of the same size as the one we just filled. That should be a reasonable guess.

Member

As Go allocates the next power of 2, getting to 128 would take 9 reallocations and copies. IMO it is worth it.

Member Author

You're right; my log(n) estimate was incorrect anyway. It's log(n) per batch but still O(n) overall (7-15% allocation overhead, depending on the block sizes).
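A small self-contained check of the growth behavior being described (the exact capacity sequence is a Go implementation detail, but small slices grow by roughly doubling):

package main

import "fmt"

func main() {
	// Append one element at a time from a nil slice: the runtime roughly
	// doubles capacity on each growth, so every power of two up to 128
	// costs a fresh allocation plus a copy.
	var grown []int
	prev := 0
	for i := 0; i < 128; i++ {
		grown = append(grown, i)
		if cap(grown) != prev {
			prev = cap(grown)
			fmt.Println("grew to cap", prev) // typically 1, 2, 4, ..., 128
		}
	}

	// Preallocating does it in a single allocation, as suggested above.
	prealloc := make([]int, 0, 128)
	fmt.Println("preallocated cap", cap(prealloc)) // 128
}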

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>
It's probably safe to assume that this buffer will be about the same size each flush.

This could cause 1 extra allocation (if this is the last commit) but that's
unlikely to be an issue.

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>
@Stebalien
Member Author

Stebalien commented Oct 11, 2017

After further testing, the effect isn't nearly so pronounced for medium-size files (the tests above were on single large files) and is probably non-existent for small files as we create a new batch per file. We should consider using the same batch when adding multiple small files.
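A sketch of that suggestion, continuing the Batch sketch near the top of this thread (same hypothetical package and types); chunkFile is a made-up stand-in for the real chunker:

// addFiles shares one Batch across many small files instead of creating a
// batch per file, so small adds can still amortize and overlap flushes.
func addFiles(ds Datastore, files [][]byte) error {
	b := NewBatch(ds, 8<<20) // one shared batch for the whole add
	for _, f := range files {
		for _, blk := range chunkFile(f) {
			if err := b.Put(blk); err != nil {
				return err
			}
		}
	}
	return b.Commit() // flush the remainder and wait for in-flight commits
}

// chunkFile splits a file into 256 KiB blocks (a stand-in for the real chunker).
func chunkFile(data []byte) []Block {
	const chunkSize = 256 << 10
	var blocks []Block
	for len(data) > 0 {
		n := chunkSize
		if n > len(data) {
			n = len(data)
		}
		blocks = append(blocks, Block(data[:n]))
		data = data[n:]
	}
	return blocks
}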

@whyrusleeping whyrusleeping merged commit 9c06a0e into master Oct 14, 2017
@whyrusleeping whyrusleeping deleted the feat/parallel-batch branch October 14, 2017 12:34
Stebalien added a commit to Stebalien/go-ipld-format that referenced this pull request Oct 16, 2017
(ipfs/kubo#4296)

1. Modern storage devices (e.g., SSDs) tend to be highly parallel.
2. Allows us to read and write at the same time (avoids pausing while flushing).

fixes ipfs/kubo#898 (comment)
Labels
topic/merkledag, topic/perf
Development

Successfully merging this pull request may close these issues.

ipfs add horrendously slow
4 participants