Characterizing reference locality in the WWW
- 24 December 2002
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
In this paper we propose models for both temporal and spatial locality of reference in streams of requests arriving at Web servers. We show that simple models based on document popularity alone are insufficient for capturing either temporal or spatial locality. Instead, we rely on an equivalent, but numerical, representation of a reference stream: a stack distance trace. We show that temporal locality can be characterized by the marginal distribution of the stack distance trace, and we propose models for typical distributions and compare their cache performance to our traces. We also show that spatial locality in a reference stream can be characterized using the notion of self-similarity. Self-similarity describes long-range correlations in the dataset, which is a property that previous researchers have found hard to incorporate into synthetic reference strings. We show that stack distance strings appear to be strongly self-similar, and we provide measurements of the degree of self-similarity in our traces. Finally, we discuss methods for generating synthetic Web traces that exhibit the properties of temporal and spatial locality that we measured in our data.
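The abstract does not spell out how a stack distance trace is built, so the following is only a minimal sketch under the usual LRU stack definition: each request is replaced by the depth at which the requested document currently sits in a most-recently-used-first stack. The names `stack_distances` and `lru_hit_rate`, the use of `None` for first references, and counting cache size in documents rather than bytes are assumptions made for illustration, not details taken from the paper.

```python
from typing import Hashable, Iterable, List, Optional


def stack_distances(requests: Iterable[Hashable]) -> List[Optional[int]]:
    """Map a reference stream to its LRU stack distance trace.

    A first reference has no finite stack distance; it is recorded as None
    (an assumption of this sketch).
    """
    stack: List[Hashable] = []  # index 0 holds the most recently used document
    distances: List[Optional[int]] = []
    for doc in requests:
        if doc in stack:
            depth = stack.index(doc) + 1  # 1-based depth in the LRU stack
            stack.remove(doc)
            distances.append(depth)
        else:
            distances.append(None)
        stack.insert(0, doc)  # the requested document moves to the top
    return distances


def lru_hit_rate(distances: List[Optional[int]], cache_size: int) -> float:
    """Fraction of requests an LRU cache holding `cache_size` documents would hit.

    A request hits exactly when its stack distance is at most the cache size,
    so the marginal distribution of distances determines the hit rate.
    """
    if not distances:
        return 0.0
    hits = sum(1 for d in distances if d is not None and d <= cache_size)
    return hits / len(distances)


trace = ["a", "b", "a", "c", "b", "b", "a"]
dists = stack_distances(trace)
print(dists)                   # [None, None, 2, None, 3, 1, 3]
print(lru_hit_rate(dists, 2))  # 2 hits out of 7 requests
```

The list-based stack costs time proportional to the stack depth per request, which is fine for a small illustration but would likely need a tree-based structure for long server logs.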
This publication has 16 references indexed in Scilit:
- Demand-based document dissemination to reduce traffic and balance load in distributed information systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- The case for geographical push-caching. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Application-level document caching in the Internet. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Main memory caching of Web documents. Computer Networks and ISDN Systems, 1996
- Self-similarity in World Wide Web traffic. Published by Association for Computing Machinery (ACM), 1996
- Using speculation to reduce server load and service time on the WWW. Published by Association for Computing Machinery (ACM), 1995
- A caching relay for the World Wide Web. Computer Networks and ISDN Systems, 1994
- On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Transactions on Networking, 1994
- Properties of the working-set model. Communications of the ACM, 1972
- Evaluation techniques for storage hierarchies. IBM Systems Journal, 1970