You might initially think of hash tables when discussing hash functions in the context of data structures, but that's just the tip of the iceberg. By using hash functions strategically, we can estimate how many times we've seen a given element in a data stream without actually storing all the elements in that stream. The techniques we'll use to do so generalize to much more sophisticated and clever data structures for "sketching" data and computing statistics without holding the whole data set in memory.
Readings
- Graham Cormode and S. Muthukrishnan. An Improved Data Stream Summary: The Count-Min Sketch and its Applications
Links