Probabilistic counting

1 November 1983

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

No. 02725428,p. 76-82
https://doi.org/10.1109/sfcs.1983.46

Abstract

We present here a class of probabilistic algorithms with which one can estimate the number of distinct elements in a collection of data (typically a large file stored on disk) in a single pass, using only 0(1) auxiliary storage and 0(1) operations per element. We precisely quantify the accuracy-storage trade-offs: for instance a typical accuracy of about 5% can be achieved using only 256 binary words, even for very large files. The algorithms are totally insensitive to the replicative structure of the elements in the file. They are particularly adapted to data base systems in the context of query optimization and can be implemented in a decentralized manner (thus making them also useful for distributed data base applications).

Keywords

This publication has 4 references indexed in Scilit:

Approximate counting: A detailed analysis
BIT Numerical Mathematics, 1985
The complexity of approximate counting
Published by Association for Computing Machinery (ACM) ,1983
Counting large numbers of events in small registers
Communications of the ACM, 1978
Key-to-address transform techniques
Communications of the ACM, 1971

Cited by 70 articles