Skip to content

Latest commit

 

History

History

count-min-101

This ipython notebook is a brief, interactive introduction to the Count Min (CM) Sketch, a method of estimating the number of occurances of a particular item in a set. The CM sketch is well-suited to estimating "heavy-hitters" in a streaming data set.

First, the notebook implements an algorithm for the CM sketch, taken from here. Exact and approximate frequencies for the heavy hitters are calculated and compared for various configurations of the CM sketch. Some timing tests are done.