
Reuse compression objects #1185

Merged 2 commits into IBM:master on Dec 14, 2018
Conversation

@muirdm commented Oct 5, 2018

Reusing lz4/gzip reader and writer objects via sync.Pool greatly reduces CPU usage (saves allocations and creates much less garbage).

Muir Manders added 2 commits October 4, 2018 17:46
Use sync.Pool to reuse lz4 and gzip reader objects across
decompressions. lz4 in particular makes a large allocation per reader,
so you spend all your time in GC if you make a new reader per message.

In a benchmark reading 500 messages/s with 3 consumers and 32
partitions, lz4 consumer CPU fell from ~120% to ~5%; gzip went from
~20% to ~5%.
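
A minimal sketch of the reader-reuse pattern described above (gzip shown; the lz4 reader is pooled the same way via its own Reset method). The pool and function names here are illustrative, not the PR's actual identifiers:

```go
package compress

import (
	"bytes"
	"compress/gzip"
	"io/ioutil"
	"sync"
)

// gzipReaderPool caches *gzip.Reader objects so decompressing a
// message does not allocate a fresh reader every time.
var gzipReaderPool sync.Pool

func decompressGzip(data []byte) ([]byte, error) {
	var (
		err    error
		reader *gzip.Reader
	)
	if r, ok := gzipReaderPool.Get().(*gzip.Reader); ok {
		// Reuse a pooled reader: Reset points it at the new payload.
		reader = r
		err = reader.Reset(bytes.NewReader(data))
	} else {
		// Pool was empty; pay the allocation cost once.
		reader, err = gzip.NewReader(bytes.NewReader(data))
	}
	if err != nil {
		return nil, err
	}
	// Return the reader to the pool for the next message.
	defer gzipReaderPool.Put(reader)
	return ioutil.ReadAll(reader)
}
```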
As with decompression, *lz4.Writer and *gzip.Writer objects can be
reused by calling Reset() between uses. Add two more sync.Pools so
writers can be reused easily.

The gzip writer is only reused when using the default compression. We
could maintain a separate sync.Pool for each gzip compression level,
but that adds more complexity and gzip wasn't that slow without this
optimization.

When producing 100 msgs/second across 10 producers, CPU usage dropped
from 70% to 4% on my machine (lz4). Gzip dropped from 15% to 5%.
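
A minimal sketch of the writer side, assuming the same pool pattern: a pooled *gzip.Writer is reused only at gzip.DefaultCompression, and other levels allocate a fresh writer, as the commit message describes. Names are again illustrative:

```go
package compress

import (
	"bytes"
	"compress/gzip"
	"sync"
)

// gzipWriterPool caches *gzip.Writer objects created at the default
// compression level only.
var gzipWriterPool sync.Pool

func compressGzip(level int, data []byte) ([]byte, error) {
	var (
		err    error
		buf    bytes.Buffer
		writer *gzip.Writer
	)
	if level != gzip.DefaultCompression {
		// One pool per level would add complexity for little gain,
		// so non-default levels just allocate a fresh writer.
		if writer, err = gzip.NewWriterLevel(&buf, level); err != nil {
			return nil, err
		}
	} else {
		writer, _ = gzipWriterPool.Get().(*gzip.Writer)
		if writer == nil {
			writer = gzip.NewWriter(&buf)
		} else {
			// Reset revives the pooled (closed) writer and points it
			// at the new output buffer.
			writer.Reset(&buf)
		}
		defer gzipWriterPool.Put(writer)
	}
	if _, err = writer.Write(data); err != nil {
		return nil, err
	}
	if err = writer.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```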
@bai merged commit 94536b3 into IBM:master on Dec 14, 2018
@qiangyin commented:

Yes, I used 1.17.0 with gzip, and it led to very high GC CPU usage.
