Determining the optimal file size on tertiary storage systems based on the distribution of query sizes
- 27 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 20 (10993371), 22-31
- https://doi.org/10.1109/ssdm.1998.688108
Abstract
In tertiary storage systems, the data is stored on multiple tape volumes where each tape is further divided into files. Since in many such systems the minimum unit of data transfer is a file, it is an important problem to match file sizes with the access patterns to the data. In general, if the file size is large relative to the query size, it will lead to the transfer of large amounts of irrelevant data, whereas small file sizes will incur an overhead penalty associated with reading each new file. In this work, we analyze the relationship between file sizes and query response times and provide a methodology to compute the optimal file size given information about the distribution of query sizes. Exact closed-form solutions for the cost function are given for two common distributions.
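The trade-off described in the abstract can be illustrated with a small numerical sketch. The cost model below is an assumption made for illustration only and is not the paper's cost function: a query of size q is taken to read ceil(q / f) whole files of size f, and each file read costs a fixed per-file overhead plus f divided by the transfer bandwidth. The names `expected_cost`, `best_file_size`, `overhead`, and `bandwidth` are hypothetical.

```python
import math


def expected_cost(file_size, query_sizes, overhead, bandwidth):
    """Mean response time under a toy model (an assumption, not the paper's).

    Each query reads whole files only, so a query of size q touches
    ceil(q / file_size) files. Every file read pays a fixed overhead
    (e.g. seek/positioning time) plus its transfer time.
    """
    total = 0.0
    for q in query_sizes:
        files_read = math.ceil(q / file_size)
        total += files_read * (overhead + file_size / bandwidth)
    return total / len(query_sizes)


def best_file_size(candidates, query_sizes, overhead, bandwidth):
    """Return the candidate file size with the lowest mean cost.

    This is a brute-force search over a given list of candidates,
    not a closed-form solution.
    """
    return min(
        candidates,
        key=lambda f: expected_cost(f, query_sizes, overhead, bandwidth),
    )


if __name__ == "__main__":
    # Hypothetical workload: sizes in MB, overhead in seconds, bandwidth in MB/s.
    sample_queries = [100, 100, 100]
    for f in [10, 50, 100, 200, 400]:
        print(f, expected_cost(f, sample_queries, overhead=1.0, bandwidth=100.0))
```

With this toy workload, very small files pay the per-file overhead many times and very large files transfer data the query does not need, so the mean cost is lowest at an intermediate file size. A sample of observed query sizes stands in here for the query-size distribution mentioned in the abstract.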