
Reuse compression objects #1185

Merged 2 commits into IBM:master on Dec 14, 2018
Conversation

@muirdm commented Oct 5, 2018

Reusing lz4/gzip reader and writer objects via sync.Pool greatly reduces CPU usage (saves allocations and creates much less garbage).

Muir Manders added 2 commits October 4, 2018 17:46
Use sync.Pool to reuse lz4 and gzip reader objects across
decompressions. lz4 in particular makes a large allocation per reader,
so you spend all your time in GC if you make a new reader per message.

In a benchmark reading 500 messages/s with 3 consumers and 32
partitions, lz4 consumer CPU fell from ~120% to ~5%; gzip went from
~20% to ~5%.
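
A minimal sketch of the reader-reuse pattern described above (gzip shown; the lz4 reader is pooled the same way via its own Reset method). The pool and function names here are illustrative, not the PR's actual identifiers:

```go
package compress

import (
	"bytes"
	"compress/gzip"
	"io/ioutil"
	"sync"
)

// gzipReaderPool caches *gzip.Reader objects so decompressing a
// message does not allocate a fresh reader every time.
var gzipReaderPool sync.Pool

func decompressGzip(data []byte) ([]byte, error) {
	var (
		err    error
		reader *gzip.Reader
	)
	if r, ok := gzipReaderPool.Get().(*gzip.Reader); ok {
		// Reuse a pooled reader: Reset points it at the new payload.
		reader = r
		err = reader.Reset(bytes.NewReader(data))
	} else {
		// Pool was empty; pay the allocation cost once.
		reader, err = gzip.NewReader(bytes.NewReader(data))
	}
	if err != nil {
		return nil, err
	}
	// Return the reader to the pool for the next message.
	defer gzipReaderPool.Put(reader)
	return ioutil.ReadAll(reader)
}
```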
As with decompression, *lz4.Writer and *gzip.Writer objects can be
reused by calling Reset() between uses. Add two more sync.Pools so
writers can be reused easily.

The gzip writer is only reused when using the default compression. We
could maintain a separate sync.Pool for each gzip compression level,
but that adds more complexity and gzip wasn't that slow without this
optimization.

When producing 100 msgs/second across 10 producers, CPU usage dropped
from 70% to 4% on my machine (lz4). Gzip dropped from 15% to 5%.
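
A minimal sketch of the writer side, assuming the same pool pattern: a pooled *gzip.Writer is reused only at gzip.DefaultCompression, and other levels allocate a fresh writer, as the commit message describes. Names are again illustrative:

```go
package compress

import (
	"bytes"
	"compress/gzip"
	"sync"
)

// gzipWriterPool caches *gzip.Writer objects created at the default
// compression level only.
var gzipWriterPool sync.Pool

func compressGzip(level int, data []byte) ([]byte, error) {
	var (
		err    error
		buf    bytes.Buffer
		writer *gzip.Writer
	)
	if level != gzip.DefaultCompression {
		// One pool per level would add complexity for little gain,
		// so non-default levels just allocate a fresh writer.
		if writer, err = gzip.NewWriterLevel(&buf, level); err != nil {
			return nil, err
		}
	} else {
		writer, _ = gzipWriterPool.Get().(*gzip.Writer)
		if writer == nil {
			writer = gzip.NewWriter(&buf)
		} else {
			// Reset revives the pooled (closed) writer and points it
			// at the new output buffer.
			writer.Reset(&buf)
		}
		defer gzipWriterPool.Put(writer)
	}
	if _, err = writer.Write(data); err != nil {
		return nil, err
	}
	if err = writer.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```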
@bai merged commit 94536b3 into IBM:master on Dec 14, 2018
@qiangyin commented:

Yes, I used 1.17.0 with gzip, and it led to very high GC CPU usage.
