Sequentiality and prefetching in database systems

1 September 1978

journal article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Database Systems

Vol. 3 (3), 223-247
https://doi.org/10.1145/320263.320276

Abstract

Sequentiality of access is an inherent characteristic of many database systems. We use this observation to develop an algorithm which selectively prefetches data blocks ahead of the point of reference. The number of blocks prefetched is chosen by using the empirical run length distribution and conditioning on the observed number of sequential block references immediately preceding reference to the current block. The optimal number of blocks to prefetch is estimated as a function of a number of “costs,” including the cost of accessing a block not resident in the buffer (a miss), the cost of fetching additional data blocks at fault times, and the cost of fetching blocks that are never referenced. We estimate this latter cost, described as memory pollution, in two ways. We consider the treatment (in the replacement algorithm) of prefetched blocks, whether they are treated as referenced or not, and find that it makes very little difference. Trace data taken from an operational IMS database system is analyzed and the results are presented. We show how to determine optimal block sizes. We find that anticipatory fetching of data can lead to significant improvements in system operation.

Keywords

This publication has 13 references indexed in Scilit:

Sequential Program Prefetching in Memory Hierarchies
Computer, 1978
Empirical Data Reference Behavior in Data Base Systems
Computer, 1976
Statistical Analysis of Non-stationary Series of Events in a Data Base System
IBM Journal of Research and Development, 1976
Dynamic Improvement of Locality in Virtual Memory Systems
IEEE Transactions on Software Engineering, 1976
The UNIX time-sharing system
Communications of the ACM, 1974
A model for masking rotational latency by dynamic disk allocation
Communications of the ACM, 1974
Characterization of Program Paging in a Time-sharing Environment
IBM Journal of Research and Development, 1973
Principles of Optimal Page Replacement
Journal of the ACM, 1971
Evaluation techniques for storage hierarchies
IBM Systems Journal, 1970
The working set model for program behavior
Communications of the ACM, 1968

Cited by 111 articles