Application of sampling methodologies to network traffic characterization

Abstract
The relative performance of different data collection methods in assessing traffic parameters matters when a complete trace of a traffic interval generates a computationally overwhelming amount of data and even capturing summary statistics for all traffic is impractical. This paper studies how well various sampling methods answer questions related to wide area network traffic characterization. Using a packet trace from a network environment that aggregates traffic from a large number of sources, we simulate various sampling approaches, including time-driven and event-driven methods, with both random and deterministic selection patterns, at a variety of granularities. We then compare the sampled traces to the parent population using several metrics that indicate the similarity between two distributions. Our results show that the time-triggered techniques did not perform as well as the packet-triggered ones. Furthermore, the performance differences within each class (packet-based or time-based techniques) are small.
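The kind of experiment described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's method: the trace is synthetic, the function names (make_synthetic_trace, systematic_by_count, random_by_count, systematic_by_time, size_histogram, total_variation) are hypothetical, and total variation distance over packet-size histograms stands in for the unspecified similarity metrics. Because the synthetic packet sizes are independent of arrival times, this sketch is not expected to reproduce the reported gap between time-triggered and packet-triggered sampling.

```python
import random
from bisect import bisect_left
from collections import Counter


def make_synthetic_trace(n, rng, mean_gap=0.01):
    """Hypothetical stand-in for a real trace: n (timestamp, size) pairs."""
    t, trace = 0.0, []
    for _ in range(n):
        t += rng.expovariate(1.0 / mean_gap)  # exponential interarrival times
        trace.append((t, rng.choice([40, 40, 576, 1500])))  # assumed size mix
    return trace


def systematic_by_count(packets, k):
    """Deterministic, packet-triggered: keep every k-th packet."""
    return packets[::k]


def random_by_count(packets, k, rng):
    """Random, packet-triggered: keep each packet with probability 1/k."""
    return [p for p in packets if rng.random() < 1.0 / k]


def systematic_by_time(packets, period):
    """Deterministic, time-triggered: at each timer tick, keep the next arrival."""
    times = [t for t, _ in packets]
    picked, last, n = [], -1, 0
    while times[0] + n * period <= times[-1]:
        i = bisect_left(times, times[0] + n * period)  # first packet at/after tick
        if i != last:  # do not select the same packet for two ticks
            picked.append(packets[i])
            last = i
        n += 1
    return picked


def size_histogram(packets, bin_width=64):
    """Normalized packet-size histogram as {bin_index: fraction}."""
    counts = Counter(size // bin_width for _, size in packets)
    total = sum(counts.values())
    return {b: c / total for b, c in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance between two histograms (0 means identical)."""
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in set(p) | set(q))


if __name__ == "__main__":
    rng = random.Random(0)
    trace = make_synthetic_trace(100_000, rng)
    parent = size_histogram(trace)
    samples = {
        "systematic, every 50th packet": systematic_by_count(trace, 50),
        "random, 1 in 50 packets": random_by_count(trace, 50, rng),
        "systematic, 0.5 s timer": systematic_by_time(trace, 0.5),
    }
    for name, sample in samples.items():
        d = total_variation(parent, size_histogram(sample))
        print(f"{name}: {len(sample)} packets, distance {d:.4f}")
```

Running the same comparison on a real trace, at several values of k and of the timer period, would correspond to the "variety of granularities" mentioned above. The choice of parameter (packet size here) and of distance metric are both placeholders.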
