
Abstract Source-level Modeling

"model: (11a) a d... ...ize something (as an atom) that cannot be directly observed."
(Merriam-Webster English Dictionary)

"Anything that has real and lasting value is always a gift from within."
(Franz Kafka, 1883-1924)

Abstract source-level modeling provides a method to describe the workload of a TCP connection at the source level in a manner that is not tied to the specifics of individual applications. The starting point of this method is the observation that, at the transport level, a TCP endpoint does nothing more than send and receive data. Each application (e.g., web browsing, file sharing, etc.) employs its own set of data units for carrying application-level control messages, files, and other information. The actual meaning of the data is irrelevant to TCP, which is only responsible for delivering data in a reliable, ordered, and congestion-responsive manner. As a consequence, we can describe the workload of TCP in terms of the demands by upper layers of the protocol stack for sending and receiving Application Data Units (ADUs). This workload characterization captures only the sizes of the units of data that TCP is responsible for delivering, and abstracts away the details of each application (e.g., the meaning of its ADUs, the size of the socket reads and writes, etc.). The approach makes it feasible to model the entire range of TCP workloads, and not just those that derive from a few well-understood applications, as is the case today. This provides a way to overcome the inherent scalability problem of application-level modeling.

While the work of a TCP endpoint is to send and receive data units, its lifetime is dictated not only by the time these operations take, but also by quiet times in which the TCP connection remains idle, waiting for upper layers to make new demands. TCP is only affected by the duration of these periods of inactivity and not by their cause, which depends on the dynamics of each application (e.g., waiting for user input, processing a file, etc.). Longer lifetimes have an important impact, since the endpoint resources needed to handle TCP state must remain reserved for a longer period of time. Furthermore, the window mechanism in TCP tends to aggregate the data of those ADUs that are sent within a short period of time, reducing the number of segments that have to travel from source to destination. This is only possible when TCP receives a number of back-to-back requests to send data. If these requests are separated by significant quiet times, no aggregation occurs and the data is sent using at least as many segments as ADUs.
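The effect of quiet times on aggregation can be illustrated with a small arithmetic sketch. It is an idealization, not a model of any real TCP implementation: it assumes a hypothetical maximum segment size of 1460 bytes, assumes that back-to-back ADUs are packed perfectly into full segments, and ignores windows, retransmissions, and sender-side heuristics. The function names and the ADU sizes are made up for illustration.

```python
import math

# Assumed maximum segment size in bytes (a common Ethernet value; an assumption here).
ASSUMED_MSS = 1460

def segments_if_back_to_back(adu_sizes, mss=ASSUMED_MSS):
    """Idealized segment count when all ADUs are handed to TCP back to back,
    so that their bytes can be packed together into full segments."""
    return math.ceil(sum(adu_sizes) / mss)

def segments_if_separated(adu_sizes, mss=ASSUMED_MSS):
    """Segment count when every ADU is separated from the next by a significant
    quiet time, so that each ADU is segmented on its own."""
    return sum(math.ceil(size / mss) for size in adu_sizes)

# Four small, hypothetical ADUs (sizes in bytes).
adus = [200, 300, 400, 500]
aggregated = segments_if_back_to_back(adus)
separated = segments_if_separated(adus)
print(aggregated, separated)
```

With these made-up sizes, the separated case needs one segment per ADU, while the back-to-back case fits all 1400 bytes into a single segment under the stated assumptions.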

We have formalized these ideas into the a-b-t model, which describes TCP connections as sets of ADU exchanges and quiet times. The term a-b-t is descriptive of the basic building blocks of this model: a-type ADUs ($a$'s), which are sent from the connection initiator to the connection acceptor; b-type ADUs ($b$'s), which flow in the opposite direction; and quiet times ($t$'s), during which no data segments are exchanged. We will make use of these terms to describe the source-level behavior of TCP connections throughout this dissertation. The a-b-t model has two different flavors depending on whether ADU interleaving is sequential or concurrent. The sequential a-b-t model is used for modeling connections in which only one ADU is being sent from one endpoint to the other at any given point in time. This means that the two endpoints engage in an orderly conversation in which one endpoint will not send a new ADU until it has completely received the previous ADU from the other endpoint. In contrast, the concurrent a-b-t model is used for modeling connections in which both endpoints send and receive ADUs simultaneously.
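The three building blocks can be pictured with a minimal data-structure sketch. This is only one possible representation, assumed here for illustration; the class names `ADU` and `QuietTime`, the `summarize` helper, and all sizes and durations are hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

@dataclass(frozen=True)
class ADU:
    direction: str  # "a": initiator -> acceptor, "b": acceptor -> initiator
    size: int       # ADU size in bytes

    def __post_init__(self):
        # Validate only; no field is modified, so this works with frozen=True.
        if self.direction not in ("a", "b"):
            raise ValueError("direction must be 'a' or 'b'")
        if self.size <= 0:
            raise ValueError("size must be positive")

@dataclass(frozen=True)
class QuietTime:
    duration: float  # seconds during which no data segments are exchanged

Element = Union[ADU, QuietTime]

def summarize(vector: List[Element]) -> Dict[str, float]:
    """Total bytes per direction and total quiet time for one connection."""
    a_bytes = sum(e.size for e in vector if isinstance(e, ADU) and e.direction == "a")
    b_bytes = sum(e.size for e in vector if isinstance(e, ADU) and e.direction == "b")
    quiet = sum(e.duration for e in vector if isinstance(e, QuietTime))
    return {"a_bytes": a_bytes, "b_bytes": b_bytes, "quiet_seconds": quiet}

# A made-up sequential exchange: request, response, pause, request, response.
example = [ADU("a", 329), ADU("b", 403), QuietTime(0.5),
           ADU("a", 403), ADU("b", 25821)]
print(summarize(example))
```

An ordered list like this can only express the sequential flavor; representing concurrent connections would need per-direction timing information, which this sketch deliberately leaves out.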

The a-b-t model not only provides a reasonable description of the workload of TCP at the source level, but it is also simple enough to be populated from measurement. Control data contained in TCP headers provide enough information to determine the number and sizes of the ADUs in a TCP connection and the durations of the quiet times between these ADUs. This makes it possible to convert an arbitrary trace of segment headers into a set of a-b-t connection vectors, in which each vector describes one of the TCP connections in the trace. As long as this process is accurate, this approach provides realistic characterizations of TCP workloads, in the sense that they can be empirically derived from measurements of real Internet links.
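To give a feel for this conversion, the following is a deliberately simplified sketch for a single sequential connection. It is not the analysis technique presented later in this chapter: it assumes segments arrive in order without loss or retransmission, uses payload lengths instead of sequence and acknowledgment numbers, and relies on a hypothetical `quiet_threshold` parameter to decide when an idle gap counts as a quiet time. The tuple layout and function name are likewise assumptions.

```python
from typing import List, Tuple, Union

# (timestamp in seconds, direction "a" or "b", payload length in bytes)
Segment = Tuple[float, str, int]
# ("a", bytes), ("b", bytes), or ("t", seconds)
Element = Tuple[str, Union[int, float]]

def segments_to_vector(segments: List[Segment],
                       quiet_threshold: float = 0.5) -> List[Element]:
    """Group data segments into ADUs and quiet times (simplified sketch).

    An ADU ends when the direction of data flow changes or when the gap
    since the previous data segment reaches quiet_threshold.
    """
    vector: List[Element] = []
    cur_dir = None   # direction of the ADU being accumulated
    cur_size = 0     # bytes accumulated so far for that ADU
    last_ts = None   # timestamp of the previous data segment
    for ts, direction, payload in segments:
        if payload == 0:
            continue  # pure acknowledgments carry no application data
        gap = ts - last_ts if last_ts is not None else 0.0
        if cur_dir is not None and (direction != cur_dir or gap >= quiet_threshold):
            vector.append((cur_dir, cur_size))
            if gap >= quiet_threshold:
                vector.append(("t", gap))
            cur_size = 0
        cur_dir = direction
        cur_size += payload
        last_ts = ts
    if cur_dir is not None:
        vector.append((cur_dir, cur_size))
    return vector

# Made-up trace: a request, a three-segment response, an idle period, a second exchange.
trace = [(0.0, "a", 300), (0.125, "b", 0), (0.25, "b", 1460), (0.375, "b", 1460),
         (0.5, "b", 500), (2.5, "a", 250), (2.75, "b", 800)]
print(segments_to_vector(trace))
```

A real trace would also require handling reordering, retransmissions, and concurrent data in both directions, which is why header fields such as sequence and acknowledgment numbers matter in practice.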

In this chapter, we describe the a-b-t model and its two flavors in detail. For each flavor, we first discuss a number of sample connections that illustrate the power of the a-b-t model to describe TCP connections driven by different applications, and point out some limitations of this approach. We then present a set of techniques for analyzing segment headers in order to construct a-b-t connection vectors and provide a validation of these techniques using traces from synthetic applications. We finally examine the characteristics of a set of real traces from the point of view of the a-b-t model, providing a source-level view of the workload of TCP.



Doctoral Dissertation: Generation and Validation of Empirically-Derived TCP Application Workloads
© 2006 Félix Hernández-Campos