Occupancy Models for the Estimation of Block Accesses

Abstract
Estimating the number of block accesses required to retrieve a set of records is an important problem in the design and implementation of data and knowledge systems. The models underlying several of the proposed estimation formulas are treated here as variations of occupancy problems, which have been studied extensively in the probability and statistics literature. Some additional properties of the relevant distributions are presented. An approach based on a sequential occupancy problem is also discussed. This approach estimates the number of records whose retrieval will require a target number of blocks, and in some situations it removes the need to estimate block accesses directly.
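
As a rough illustration of the occupancy view, the sketch below computes expected block accesses under two classical models. The abstract does not name specific formulas; the with-replacement and without-replacement expressions used here are the ones commonly attributed to Cardenas and to Yao, and treating them as the intended models is an assumption. The third function illustrates the sequential view under uniform draws with replacement, which is also an assumption. All function and parameter names are hypothetical.

```python
from math import comb


def expected_blocks_with_replacement(n_blocks: int, k: int) -> float:
    """Expected number of distinct blocks touched when k records are drawn
    uniformly WITH replacement over n_blocks equally likely blocks
    (classical occupancy; assumed here to match the Cardenas-style formula)."""
    if n_blocks <= 0 or k < 0:
        raise ValueError("n_blocks must be positive and k non-negative")
    return n_blocks * (1.0 - (1.0 - 1.0 / n_blocks) ** k)


def expected_blocks_without_replacement(n_records: int, n_blocks: int, k: int) -> float:
    """Expected number of distinct blocks touched when k distinct records are
    drawn WITHOUT replacement from n_records stored evenly in n_blocks blocks
    (assumed here to match the Yao-style formula)."""
    if n_blocks <= 0 or n_records % n_blocks != 0:
        raise ValueError("n_records must be a positive multiple of n_blocks")
    if not 0 <= k <= n_records:
        raise ValueError("k must lie between 0 and n_records")
    per_block = n_records // n_blocks
    # Probability that a given block contributes none of the k records.
    p_miss = comb(n_records - per_block, k) / comb(n_records, k)
    return n_blocks * (1.0 - p_miss)


def expected_records_for_target_blocks(n_blocks: int, target: int) -> float:
    """Sequential occupancy sketch: expected number of uniform draws WITH
    replacement needed until `target` distinct blocks have been touched.
    With i blocks already touched, the wait for a new one is geometric
    with success probability (n_blocks - i) / n_blocks."""
    if not 0 <= target <= n_blocks:
        raise ValueError("target must lie between 0 and n_blocks")
    return sum(n_blocks / (n_blocks - i) for i in range(target))
```

For example, with 100 records stored in 10 blocks and 10 requested records, the without-replacement estimate comes out slightly larger than the with-replacement one, since distinct records cannot repeat and so spread over more blocks.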