Evaluation of the File Redundancy in Distributed Database Systems

Abstract
This paper addresses file redundancy in distributed database systems: what is the optimal number of file copies, given the ratio r of the frequency of update requests to the frequency of all file access requests (i.e., queries and updates)? Formulations of this type of problem, including optimal file allocation, have been attempted by a number of authors, and some algorithms have been proposed. Although such algorithms can solve particular problems, it seems difficult to draw from them general conclusions applicable to a wide variety of practical distributed database systems. To probe this hard-to-formulate but interesting problem, we construct simplified network models of distributed database systems and compute the optimal number of file copies, as well as their locations, that minimizes the communication cost. For several network types, we plot the optimal number of file copies as a function of the ratio r.
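
To make the question concrete, the following minimal sketch shows one way such a computation could be set up. It is an illustration under stated assumptions, not the model used in this paper: requests are assumed to originate uniformly at every node, a query is assumed to read the nearest copy, an update is assumed to be sent to every copy, and cost is measured in hops on a ring network. The names ring_distances, communication_cost, and best_placement are hypothetical, and the exhaustive search is practical only for small networks.

```python
from itertools import combinations


def ring_distances(n):
    """Hop counts between every pair of nodes on an n-node ring (assumed topology)."""
    return [[min(abs(i - j), n - abs(i - j)) for j in range(n)] for i in range(n)]


def communication_cost(dist, copies, r):
    """Expected cost per request for a given set of copy locations.

    Assumptions (not taken from the paper): requests originate uniformly
    at all nodes; a fraction r of them are updates and 1 - r are queries.
    """
    n = len(dist)
    total = 0.0
    for origin in range(n):
        query = min(dist[origin][c] for c in copies)   # read the nearest copy
        update = sum(dist[origin][c] for c in copies)  # write to every copy
        total += (1 - r) * query + r * update
    return total / n


def best_placement(dist, r):
    """Exhaustively search all non-empty copy sets; return (cost, locations)."""
    n = len(dist)
    best = None
    for k in range(1, n + 1):
        for copies in combinations(range(n), k):
            cost = communication_cost(dist, copies, r)
            # Keep the first placement found unless a strictly cheaper one appears.
            if best is None or cost < best[0] - 1e-12:
                best = (cost, copies)
    return best


if __name__ == "__main__":
    dist = ring_distances(6)
    for r in (0.0, 0.1, 0.2, 0.5, 1.0):
        cost, copies = best_placement(dist, r)
        print(f"r={r:.1f}  copies={len(copies)}  locations={copies}  cost={cost:.3f}")
```

Under these assumptions the two extremes behave as expected: with r = 0 (queries only) every node holds a copy, and with r = 1 (updates only) a single copy is cheapest. Sweeping r between these extremes gives the kind of curve, optimal number of copies versus r, that the abstract describes.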