
Compress the training data with Google Snappy for higher IO throughput #1533

Closed

futurely opened this issue Dec 5, 2014 · 1 comment

@futurely commented Dec 5, 2014

When computation is faster than data IO, device utilization drops. To narrow the throughput gap, data is usually compressed in many distributed computation and storage frameworks such as Hadoop, HBase, Hive, and Kafka. Among the many compression libraries available for this purpose, Google Snappy is very widely used. It supports many languages and strikes a good balance between compression ratio and speed.
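A minimal sketch of the compress-before-write / decompress-after-read pattern the proposal describes. Snappy itself would need the `python-snappy` bindings; here the standard library's `zlib` at a low compression level stands in as an assumption, since the IO pattern is identical regardless of codec:

```python
import zlib

# Simulated training record batch (in practice, serialized images or protos).
batch = b"pixel-data " * 10_000

# Compress before writing: fewer bytes hit the disk, narrowing the gap
# between compute speed and IO throughput. level=1 approximates the
# fast, moderate-ratio trade-off that Snappy targets.
compressed = zlib.compress(batch, level=1)

with open("batch.z", "wb") as f:
    f.write(compressed)

# The reader performs a smaller (hence faster) read, then decompresses.
with open("batch.z", "rb") as f:
    restored = zlib.decompress(f.read())

assert restored == batch
print(f"raw={len(batch)} bytes, on disk={len(compressed)} bytes")
```

Decompression for codecs in this class is typically much faster than disk reads, which is why the round trip can still be a net win for IO-bound training.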

@shelhamer (Member) commented
Closing, as this is not a grievous bottleneck in many cases, and because it can be handled by specialized layers relevant to the given use case that don't necessarily have general relevance.
