A Non-Parametric Approach to Generation and Validation of Synthetic Network Traffic

A colloquium given at Columbia University, New York, NY, January 2004.

Abstract: To perform valid experiments, network simulators and testbeds require source-level traffic generators that correspond to valid, contemporary models of application or user behavior. Unfortunately, the Internet is evolving far more rapidly than our present ability to understand and model the mix and use of applications that account for the majority of bytes transferred on the Internet. We describe a method that automatically generates a source-level model of the mix of TCP applications found on a network link, and show that this model is suitable for generating, in a simulation, synthetic traffic that is statistically equivalent to the traffic on the measured network. The method is novel in that it constructs the model in a non-parametric manner, without any knowledge of the applications present in the network. We validate the method empirically using traces from our institution and from the Abilene backbone. The importance of a method that faithfully reproduces the source-level characteristics of actual network flows in experiments is illustrated via a case study of active queue management (AQM) mechanisms. We demonstrate that dramatically different, and contradictory, conclusions are reached regarding the effectiveness of AQM depending on the types of TCP connections generated in the simulation. This serves as an existence proof that our inability to perform experiments with realistic synthetic traffic makes it difficult, if not impossible, to conclude anything fundamental from the experimental procedures followed at present in most network simulations.

