An improved data stream summary: the count-min sketch and its applications

https://doi.org/10.1016/j.jalgor.2003.12.001

Abstract

We introduce a new sublinear space data structure, the count-min sketch, for summarizing data streams. Our sketch allows fundamental queries in data stream summarization, such as point, range, and inner product queries, to be approximately answered very quickly; in addition, it can be applied to solve several important problems in data streams, such as finding quantiles and frequent items. The time and space bounds we show for using the CM sketch to solve these problems significantly improve those previously known, typically reducing a factor of 1/ε² to 1/ε.
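To make the point query concrete, here is a minimal sketch of a count-min structure in Python. The abstract gives no construction details, so the following are assumptions for illustration: the class and method names are hypothetical, keys are non-negative integers, updates are non-negative, and the table uses the sizing commonly stated for this structure (width ⌈e/ε⌉, depth ⌈ln(1/δ)⌉) with a simple (a·x + b) mod p hash family.

```python
import math
import random


class CountMinSketch:
    """Illustrative count-min sketch (hypothetical names, not the authors' code).

    Assumes non-negative integer keys and non-negative update counts.
    """

    _PRIME = (1 << 61) - 1  # Mersenne prime for the (a*x + b) mod p hash family

    def __init__(self, epsilon, delta, seed=0):
        # Assumed sizing: width = ceil(e / epsilon), depth = ceil(ln(1 / delta)).
        self.width = math.ceil(math.e / epsilon)
        self.depth = max(1, math.ceil(math.log(1.0 / delta)))
        rng = random.Random(seed)
        # One (a, b) pair per row defines that row's hash function.
        self._hashes = [
            (rng.randrange(1, self._PRIME), rng.randrange(0, self._PRIME))
            for _ in range(self.depth)
        ]
        self._table = [[0] * self.width for _ in range(self.depth)]
        self.total = 0  # sum of all update counts seen so far

    def _bucket(self, row, key):
        a, b = self._hashes[row]
        return ((a * key + b) % self._PRIME) % self.width

    def update(self, key, count=1):
        """Add `count` occurrences of `key` to every row."""
        self.total += count
        for row in range(self.depth):
            self._table[row][self._bucket(row, key)] += count

    def query(self, key):
        """Point query: the minimum counter over all rows."""
        return min(
            self._table[row][self._bucket(row, key)]
            for row in range(self.depth)
        )


# Usage: summarize a small stream and compare estimates with true counts.
sketch = CountMinSketch(epsilon=0.01, delta=0.01)
true_counts = {7: 500, 42: 120, 1001: 3}
for key, count in true_counts.items():
    for _ in range(count):
        sketch.update(key)
estimates = {key: sketch.query(key) for key in true_counts}
```

With non-negative updates, each counter that a key maps to includes that key's full count plus any collisions, so the minimum over rows never underestimates. Under the sizing assumed above, the overestimate is at most ε times the total count with probability at least 1 − δ.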

1 Supported by NSF ITR 0220280 and NSF EIA 02-05116.

2 Supported by NSF CCR 0087022, NSF ITR 0220280 and NSF EIA 02-05116.
