The Sequential a-b-t Model

Client/Server Applications

The a-b-t connection vector of a sequential TCP connection is a sequence of one or more epochs. Each epoch describes the properties of a pair of ADUs exchanged between the two endpoints. The concept of an epoch arises from the client/server structure of many distributed systems, in which one endpoint acts as a client and the other one as a server. The client sends a request for some service (e.g., performing a computation, retrieving some data, etc.) that is followed by a response from the server (e.g., the results of the requested action, a status code, etc.). An epoch represents our abstract characterization of a request/response exchange. An epoch is characterized by the size $a$ of the request and the size $b$ of the response.

**Figure 3.1:** An a-b-t diagram representing a typical ADU exchange in HTTP version 1.0.

The Hypertext Transfer Protocol (HTTP) that underlies the World-Wide Web provides a good example of the kinds of TCP workloads created by client/server applications. Figure 3.1 shows a simple a-b-t diagram that represents a TCP connection between a web browser and a web server, which communicate using the HTTP 1.0 application-layer protocol [BLFF96]. In this example, the web browser (client side) initiates a TCP connection to a web server (server side) and sends a request for an object (e.g., HTML source code, an image, etc.) specified using a URL. This request constitutes an ADU of size 341 bytes. The server then responds by sending the requested object in an ADU of size 2,555 bytes. The representation in the figure captures:

the sequential order of the ADUs within the TCP connection (first the HTTP request then the HTTP response - in this case, order also implies ``causality''),
the direction in which the ADUs flow (above the time line for the ADU sent from the connection initiator to the connection acceptor; below the time line for the ADU sent from the connection acceptor to the connection initiator), and
the sizes of the ADUs (using annotations and the lengths of the rectangles, which are proportional to the number of bytes).

The diagram provides a visualization in the spirit of abstract source-level modeling, since it does not incorporate any specific information about the actual contents of the ADUs. The bytes in the first ADU (HTTP request) represent an HTTP header that includes a URL, and the bytes in the second ADU (HTTP response) represent an HTTP header (with a success code of 200 OK) followed by the requested object (e.g., HTML source code). In this example, the purpose of this particular connection was well-understood, and that allowed us to assign labels to the ADUs (HTTP request and response) and to the TCP endpoints (web browser and server). In general, when we examine how the ADUs flow in an arbitrary TCP connection, we do not have this application-specific information (or we can only guess it). The same diagram (without the HTTP-specific labels) could be used to represent different connections with completely different payloads in ADUs of the same size. The diagram does not include any network-level information either, so this diagram could also represent connections with very different maximum segment sizes, round-trip times, and other network properties below the application level. Note that this example, and the following ones, came from real connections that were actually observed. In some cases, we had access to the actual segment payloads and used them to add annotations to the ADUs. In other cases, we used port numbers and our understanding of the protocols to add these annotations.

**Figure 3.2:** An a-b-t diagram illustrating a persistent HTTP connection.

Some client/server applications use a new connection for each request/response exchange, while other applications reuse a connection for more than one exchange, creating connections with more than one epoch. As long as the application has enough data to send, multi-epoch connections can improve performance substantially, by avoiding the connection establishment delay and TCP's slow start phase. For example, HTTP was revised to support more than one request/response exchange in the same ``persistent'' TCP connection [FGM$^+$97]. Figure 3.2 illustrates this type of interaction. This is a connection between a web browser and a web server, in which the browser first requests the source code of an HTML page, and receives it from the web server, just like in Figure 3.1. However, the use of persistent HTTP makes it possible for the browser to send another request using the same connection. Unlike the example in Figure 3.1, this persistent connection remains open after the first object is downloaded, so the browser can send another request without first closing the connection and reopening a new one. In Figure 3.2 the web browser sends three ADUs that specify three different URLs, and the server responds with three ADUs. Each ADU contains an HTTP header that precedes the actual requested object. If the requested object is not available, the ADU may only contain the HTTP header with an error code. Note that the diagram has been annotated with extra application-level information showing that the first two epochs were the result of requesting objects from the same document (i.e., same web page), and the last epoch was the result of requesting a different document.

The diagram in Figure 3.2 includes two time gaps between epochs (represented with dashed lines). In both cases, these are quiet times in the interaction between the two endpoints. We call the time between the end of one epoch and the beginning of the next the inter-epoch quiet time. The first quiet time in the a-b-t diagram represents processing time in the web browser, which parsed the web page it received, retrieved some objects from the local cache, and then made another request for an object in the same document (one that was not in the local cache). Because of its longer duration, the second quiet time is most likely due to the time taken by the user to read the web page and click on one of the links, starting another page download from the same web server.

As will be discussed in Section 3.3, it is difficult to distinguish quiet times caused by application dynamics, which are relevant for a source-level model, from those due to network performance and characteristics, which should not be part of a source-level model (because they are not caused by the behavior of the application). The basic heuristic employed to distinguish between these two cases is the observation that the scale of network events is hardly ever above a few hundred milliseconds. Going back to the example in Figure 3.2, the only quiet time that could be safely assumed to be due to the application (in this case, due to the user) is the one between the second and third epochs. The 120-millisecond quiet time between the first and second epochs could easily be due to network effects (such as having the sending of the second request delayed by Nagle's algorithm [Nag84]), and therefore should not be part of the source-level behavior. Similarly, the two a-b-t diagrams shown so far have not depicted any time between the request and the response inside the same epoch. In general, web servers process requests so quickly that there is no need to incorporate intra-epoch quiet times in a model of the workload of a TCP connection. While this is by far the most common case, some applications do have long intra-epoch quiet times, and the a-b-t model can include these.
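The threshold heuristic described above can be illustrated with a minimal Python sketch. The cutoff value `NETWORK_SCALE_THRESHOLD_S` and the function name are assumptions made for illustration only: the text says just "a few hundred milliseconds", and the actual procedure is the subject of Section 3.3.

```python
# Hypothetical cutoff: the text only says "a few hundred milliseconds".
NETWORK_SCALE_THRESHOLD_S = 0.5


def source_level_quiet_time(observed_s, threshold_s=NETWORK_SCALE_THRESHOLD_S):
    """Keep a quiet time only if it is too long to be explained by network effects.

    Returns the observed duration when it exceeds the threshold, and 0.0
    otherwise (the gap is then treated as a possible network artifact).
    """
    return observed_s if observed_s > threshold_s else 0.0


# The two inter-epoch gaps of Figure 3.2, in seconds.
gaps = [0.12, 3.12]
kept = [source_level_quiet_time(g) for g in gaps]
print(kept)  # [0.0, 3.12]
```

With this assumed cutoff, only the second gap survives as source-level behavior, which matches the reading of Figure 3.2 given in the text.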

Formally, a sequential a-b-t connection vector has the form $C_i = (e_1, e_2, \dots, e_n)$ with $n \ge 1$ epoch tuples. An epoch tuple has the form $e_j = (a_j, ta_j, b_j, tb_j)$ where

$a_j$ is the size of the $j^{th}$ ADU sent from the connection initiator to the connection acceptor. $a_j$ will also be used to name the $j$ th ADU sent from the initiator to the acceptor.
$b_j$ is the size of the $j^{th}$ ADU sent in the opposite direction (and generally in response to the request made by $a_j$ ).
$ta_j$ is the duration of the quiet time between the arrival of the last segment of $a_j$ and the departure of the first segment of $b_j$ . $ta_j$ is defined from the point of view of the acceptor (often the server), but ultimately our estimate of the duration is based on the arrival times of segments at some monitoring point.
$tb_j$ is either the duration of the quiet time between $b_j$ and $a_{j+1}$ (for connections with at least $j+1$ epochs), or the quiet time between the last data segment (i.e., last segment with a payload) in the connection and the first control segment used to terminate the connection.

Note that $ta_j$ is a quiet time as seen from the acceptor side, while $tb_j$ is a quiet time as seen from the initiator side. The idea of these definitions is to capture the network-independent component of quiet times, without being concerned with the specific measurement method. In a persistent HTTP connection, $a$ 's would usually be associated with HTTP requests, $b$ 's with HTTP responses, $ta$ 's with processing times on the web server, and $tb$ 's with browser processing times and user think times. We can say that a quiet time $ta_j$ is ``caused'' by an ADU $a_j$ , and that a quiet time $tb_j$ is caused by an ADU $b_j$ . Both time components are defined as quiet times observed at one of the endpoints, and not at some point in the middle of the network where the packet header tracing takes place.
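As an illustration of these definitions, a connection vector could be held in a small Python structure like the one below. This is only a sketch: the names `Epoch`, `ConnectionVector`, and `make_connection_vector` are hypothetical and not part of the dissertation, and the choice of bytes and seconds as units follows the examples in this section.

```python
from typing import NamedTuple, Tuple


class Epoch(NamedTuple):
    """One epoch tuple e_j = (a_j, ta_j, b_j, tb_j)."""
    a: int      # size in bytes of the ADU sent from initiator to acceptor
    ta: float   # quiet time in seconds on the acceptor side, between a_j and b_j
    b: int      # size in bytes of the ADU sent from acceptor to initiator
    tb: float   # quiet time in seconds on the initiator side, after b_j


ConnectionVector = Tuple[Epoch, ...]


def make_connection_vector(*epochs: Epoch) -> ConnectionVector:
    """Build a sequential connection vector C_i = (e_1, ..., e_n) with n >= 1."""
    if len(epochs) < 1:
        raise ValueError("a connection vector needs at least one epoch")
    for e in epochs:
        if e.a < 0 or e.b < 0 or e.ta < 0 or e.tb < 0:
            raise ValueError("sizes and quiet times must be non-negative")
    return tuple(epochs)


# Hypothetical single-epoch connection: 100-byte request, 200-byte response,
# followed by 1.5 seconds of quiet time before the connection is closed.
cv = make_connection_vector(Epoch(a=100, ta=0.0, b=200, tb=1.5))
print(len(cv), cv[0].b)  # 1 200
```

An immutable tuple of named tuples keeps the order of epochs explicit, which matters because the model is sequential.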

As mentioned in the introduction, the name of the model comes from the three variable names used in this model, which are used to capture the essential source-level properties: data in the ``a'' direction, data in the ``b'' direction, and time ``t'' (non-directional, but associated with the processing of the preceding ADU, as discussed in Section 3.1.1). Using the notation of the a-b-t model, we can succinctly describe the HTTP connection in Figure 3.1 as a single-epoch connection vector of the form

$$((341, 0, 2555, 0))$$

where the first ADU, $a_1$ , has a size of 341 bytes, and the second ADU, $b_1$ , has a size of 2,555 bytes. In this example the time between the transmission of the two data units and the time between the end of $b_1$ and connection termination are considered too small to be included in the source level representation, so they are set to 0. Similarly, we can represent the persistent HTTP connection shown in Figure 3.2 as

$$((329, 0, 403, 0.12), (403, 0, 25821, 3.12), (356, 0, 1198, 15.3))$$

where quiet times are given in seconds. Notice that $tb_3$ is not zero for this connection, but a large number of seconds (in fact, probably larger than the duration of the rest of the activity in the connection!). Persistent connections are often left open in case the client decides to send a new HTTP request reusing the same TCP connection. As we will show in Section 3.5, this separation between the last ADU and connection termination is frequent enough to justify incorporating it in the model. Gaps between connection establishment and the sending of $a_1$ are almost nonexistent.
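To make the notation concrete, the two HTTP connection vectors above can be written as plain Python tuples and summarized. This is an illustrative sketch only: the variable names and the `summarize` helper are assumptions, not part of the dissertation.

```python
# Each epoch is (a_j, ta_j, b_j, tb_j): sizes in bytes, quiet times in seconds.
http10 = [(341, 0, 2555, 0)]                      # Figure 3.1
persistent = [(329, 0, 403, 0.12),                # Figure 3.2
              (403, 0, 25821, 3.12),
              (356, 0, 1198, 15.3)]


def summarize(vector):
    """Return epoch count, bytes per direction, and total quiet time."""
    return {
        "epochs": len(vector),
        "bytes_a": sum(a for a, _, _, _ in vector),
        "bytes_b": sum(b for _, _, b, _ in vector),
        "quiet_s": round(sum(ta + tb for _, ta, _, tb in vector), 2),
    }


print(summarize(http10))
print(summarize(persistent))
```

For the persistent connection, the summary reports 3 epochs, 1,088 bytes in the ``a'' direction and 27,422 bytes in the ``b'' direction, with $tb_3$ accounting for most of the quiet time.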

**Figure 3.3:** An a-b-t diagram illustrating an SMTP connection.

As another example, the connection in Figure 3.3 illustrates a sample sequence of data units exchanged by two SMTP servers. The first server (labeled ``sender'') previously received an email from an email client, and uses the TCP connection in the diagram to contact the destination SMTP server (i.e., the server for the domain of the destination email address). In this example, most data units are small and correspond to application-level (SMTP) control messages (e.g., the host info message, the initial HELO message, etc.) rather than application objects. The actual email message of 22,568 bytes was carried in ADU $a_6$ . The a-b-t connection vector for this connection is

$$((0, 0, 93, 0), (32, 0, 191, 0), (77, 0, 59, 0), (75, 0, 38, 0), (6, 0, 50, 0), (22568, 0, 44, 0)).$$

Note that this TCP connection illustrates a variation of the client/server design in which the server sends a first ADU identifying itself without any prior request from the client. This pattern of exchange is specified by the SMTP protocol wherein servers identify themselves to clients right after connection establishment. Since $b_1$ is not preceded by any ADU sent from the connection initiator to the connection acceptor, the vector has $a_1=0$ (we sometimes refer to this phenomenon as a ``half-epoch'').

This last example illustrates an important characteristic of TCP workloads that is often ignored in traffic generation experiments. TCP connections do not simply carry files (and requests for files), but are often driven by more complicated interactions that impact TCP performance. An epoch where $a_j > 0$ and $b_j > 0$ requires at least one segment to carry $a_j$ from the connection initiator to the acceptor, and at least another segment to carry $b_j$ in the opposite direction. The minimum duration of an epoch is therefore one round-trip time (which is precisely defined as the time to send a segment from the initiator to the acceptor plus the time to send a segment from the acceptor back to the initiator). This means that the number of epochs imposes a minimum duration and a minimum number of segments for a TCP connection. The connection in Figure 3.3 needs 4 round-trip times to complete the ``negotiation'' that occurs during epochs 2 to 5, even if the ADUs involved are rather small. The actual email message in ADU $a_6$ is transferred in only 2 round-trip times. This is because $a_6$ fits in 16 segments, and it is sent during TCP's slow start. Thus the first round-trip time is used to send 6 segments, and the second round-trip time is used to send the remaining 10 segments. The duration of this connection is therefore dominated by the control messages, and not by the size of the email. In particular, this is true despite the fact that the email message is much larger than the combined size of the control messages. If the application protocol (i.e., SMTP) were modified to somehow carry control messages and the email content in ADU $a_2$ , then the entire connection would last only 4 round-trip times instead of 6, and would require fewer segments. In our experience, it is common to find connections in which the number of control messages is orders of magnitude larger than the number of ADUs from files or other dynamically-generated content. Clearly, epoch structure has an impact on the performance (more precisely, on the duration) of TCP connections and should therefore be modeled accurately.
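The lower bound implied by the epoch structure can be sketched in a few lines of Python. The helper names and the 100-millisecond round-trip time are hypothetical, chosen only for illustration. The bound counts one round-trip time per epoch with $a_j > 0$ and $b_j > 0$; it deliberately ignores slow start, so it does not account for the extra round-trip time needed by the large email ADU $a_6$.

```python
# SMTP connection vector of Figure 3.3: (a_j, ta_j, b_j, tb_j) per epoch.
smtp = [(0, 0, 93, 0), (32, 0, 191, 0), (77, 0, 59, 0),
        (75, 0, 38, 0), (6, 0, 50, 0), (22568, 0, 44, 0)]


def full_epochs(vector):
    """Count epochs in which both a_j and b_j carry data."""
    return sum(1 for a, _, b, _ in vector if a > 0 and b > 0)


def min_duration_s(vector, rtt_s):
    """Lower bound on connection duration: one round-trip time per full epoch."""
    return full_epochs(vector) * rtt_s


print(full_epochs(smtp))  # 5
# With a hypothetical 100 ms round-trip time, the bound is about half a second,
# regardless of how small the control messages are.
print(min_duration_s(smtp, rtt_s=0.1))
```

The first epoch is excluded because $a_1 = 0$, so only epochs 2 to 6 contribute a full round-trip time each.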

**Figure 3.4:** Three a-b-t diagrams representing three different types of NNTP interactions.

Application protocols can be rather complicated, supporting a wide range of interactions between the two endpoints. Most of them assume a client/server model of interaction and hence can be cast into the sequential a-b-t model. For example, Figure 3.4 shows three types of interactions that are supported by the Network News Transfer Protocol (NNTP) [KL86,Bar00]. The first a-b-t diagram exhibits the straightforward behavior of an NNTP reader (i.e., a client for reading newsgroup postings) posting a new article. The two endpoints exchange a few control messages in the first three epochs, and then the client uploads the content of the article in ADU $a_4$ .

The second connection shows an NNTP reader using a TCP connection to first check whether the server knows about any new articles in two newsgroups (unc.support and unc.test). After that, the reader requests an overview of those messages (using XOVER). The server replies with the subjects of the new articles and some other information. Finally, after 5.02 seconds of inactivity, the reader requests the content of one of the new articles. This relatively long time suggests that the user of the NNTP reader waited some time before actually requesting the reader to display the content of a new article.

The way NNTP servers interact is illustrated in the third connection. One of the peers will ask the other about new newsgroups and articles. This typically involves hundreds or even thousands of ADUs sent in each direction. The connection shown here has only a small subset of the ADUs observed in one of these connections between NNTP peers. Here the initiator peer asked for new groups first, and then for new articles. One article was sent from the initiator to the acceptor, and another one in the opposite direction.

These examples provide a good illustration of the complexity of modeling applications one by one, and they provide further evidence supporting the claim that our abstract source-level model is widely applicable. In general, the use of a multi-epoch model is essential to accurately describe how applications drive TCP connections.

Incorporating Quiet Times into Source-Level Modeling

Unlike ADUs, which flow from the initiator to the acceptor or vice versa, quiet times are not associated with any particular direction of a TCP connection. However, we have chosen to use two types of quiet times in our sequential a-b-t model. This choice is motivated by the intended meaning of quiet time, and by the difference between the duration of the quiet times observed at different points in the connection's path. When we were developing the model, we initially considered quiet times independent of the endpoint causing them. They were simply ``connection quiet times''. In practice, quiet times in sequential connections are associated with source-level behavior in only one of the endpoints. For example, a ``user think time'' in an HTTP connection is associated with a quiet time on the initiator side (which is waiting for the user action), while a server processing delay in a Telnet connection is associated with the acceptor side (which is waiting for a result). In every case, one endpoint is quiet for some period before sending new data, and the other endpoint remains quiet, waiting for these new data to arrive. Having two types of quiet times, $ta$ and $tb$ , makes it possible to annotate the side of the connection that is the source of the quiet time.

The second reason for the use of two types of quiet times is that the duration of the quiet time depends on the point at which the quiet time is measured. The endpoint that is not the source of the quiet time will observe a quiet time that depends on the network and not only on the source-level behavior of the other endpoint. This is because the new ADU which defines the end of the quiet time needs some time to reach its destination. In the example in Figure 3.2, the quiet time between $a_1$ and $b_1$ observed by the server endpoint is very small (only the time needed to retrieve the requested URL). However, this quiet time is longer when observed by the client, since it is the time between the last socket write of $a_1$ and the first socket read of $b_1$ . It includes the server processing time, and at least one full round-trip time. Ideally, we would like to measure this quiet time $ta_1$ on the server side, in order to characterize source-level behavior in a completely network-independent manner. Similarly, we would like to measure $tb_1$ on the client side. In summary, source-level quiet times are non-directional, in the sense that they do not travel in one direction or the other, but they are associated with one of the endpoints, which is the source of the quiet time.

Beyond Client/Server Applications

**Figure 3.5:** An a-b-t diagram illustrating a server push from a webcam using a persistent HTTP connection.

Not all applications follow the strict pattern of requests and responses that characterizes traditional client/server applications. For example, HTTP is commonly used for server push operations, in which the server periodically refreshes the state of the client without any prior request. Figure 3.5 illustrates this behavior using a TCP connection where a web browser first requests a webcam URL (UNC's ``Pitcam'' in this example), and the web server responds with a sequence of image frames separated by small quiet times. The browser renders each frame as soon as it is received, creating a continuous movie. Each frame can be considered an individual ADU, so this connection does not follow the basic request/response sequence of previous examples. The notation provided by the sequential a-b-t model can still be used to represent this source-level behavior using the connection vector $(e_1, e_2, e_3, e_4, e_5)$ where $e_1 = (392, 0.041, 97939, 0),$ $e_2 = (0, 0.057, 97942, 0),$ $e_3 = (0, 0.035, 97820, 0),$ $e_4 = (0, 0.054, 97820, 0),$ and $e_5 = (0, 0.037, 98019, 0).$ While this connection has no natural epochs in the request/response sense, we can describe the connection by assigning each frame to a separate $b_j$ , and each quiet time between frames to a $ta_j$ (since the connection vector is intended to capture a quiet time on the server side).
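The mapping from a server push exchange to a sequential connection vector can be sketched as follows. The function `server_push_vector` and its argument layout are assumptions made for illustration: only the first epoch carries the request size, every frame becomes a $b_j$, and the server-side quiet time preceding each frame becomes the corresponding $ta_j$.

```python
def server_push_vector(request_size, frames):
    """Build a connection vector for one request followed by pushed frames.

    frames is a list of (quiet_before_s, frame_size) pairs, in sending order.
    Each epoch is returned as (a_j, ta_j, b_j, tb_j).
    """
    vector = []
    for j, (quiet_s, size) in enumerate(frames):
        a = request_size if j == 0 else 0   # only the first epoch has a request
        vector.append((a, quiet_s, size, 0))
    return vector


# The webcam connection of Figure 3.5.
pitcam = server_push_vector(392, [(0.041, 97939), (0.057, 97942),
                                  (0.035, 97820), (0.054, 97820),
                                  (0.037, 98019)])
print(pitcam[0])  # (392, 0.041, 97939, 0)
print(pitcam[1])  # (0, 0.057, 97942, 0)
```

Every epoch after the first has $a_j = 0$, which is how the sequential notation expresses data flowing in only one direction.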

**Figure 3.6:** An a-b-t diagram illustrating Icecast audio streaming in a TCP connection.

The same type of server push behavior is found in streaming applications. A TCP connection carrying Icecast traffic (from ibiblio.org) is shown in Figure 3.6. Icecast is a popular audio streaming application that follows the same pattern of ADUs discussed in the previous paragraph, and can be described using the same type of connection vector. Each $b_j$ is associated with an MPEG audio frame. Note that the sizes of the ADUs and the durations of the quiet times between them are highly variable, unlike the example in Figure 3.5. Perhaps surprisingly, TCP is widely used for carrying streaming traffic today, despite its inability to perform the typical trade-off between loss recovery and delay in multimedia applications. Streaming over TCP has two significant benefits:

Streaming traffic can use TCP port numbers associated with web traffic and therefore overcome firewalls that block other port numbers. This is important for web sites that deliver web pages and multimedia streams, since it guarantees that the user will be able to download the multimedia content.
Most clients experience such low loss rates that TCP's loss recovery mechanisms have an insignificant impact on the timing of the stream. The common use of stream buffering prior to the beginning of the playback further reduces the impact of loss recovery.

**Figure 3.7:** Three a-b-t diagrams of connections taking part in the interaction between an FTP client and an FTP server.

The interaction between the two endpoints of a client/server application does not generally require more than one TCP connection to be opened between the two endpoints. As we have seen, some applications use a new connection for each request/response exchange, while others make use of multi-epoch connections (e.g., persistent connections in HTTP/1.1). Handling more than one TCP connection can have some performance benefits, but it does complicate the implementation of the applications (e.g., it may require using concurrent programming techniques). However, some applications do interact using several TCP connections and this creates interdependencies between ADUs. For example, Figure 3.7 illustrates an FTP session between an FTP client program and an FTP server in which three connections are used. The connection in the top row is the ``FTP control'' connection used by the client to first identify itself (with username and password), then list the contents of a directory, and then retrieve a large file. The actual directory listing and the file are received using separate ``FTP data'' connections (established by the client) with a single ADU $b_1$ . The figure illustrates how the start of the data connections depends on the use of some ADUs in the control connection (i.e., the directory listing LIST does not occur until after the RETR ADUs has been received), and how the control connection does not send the 226 Complete ADU until the data connections have completed.

While the sequential a-b-t model can accurately describe the source-level properties of these three connections, the model cannot capture the interdependency between the connections. The FTP example in Figure 3.7 shows three connections with a strong dependency. The two FTP data connections necessarily followed a 150 Opening operation in the FTP control connection. Our current model cannot express this kind of dependency between connections or between the ADUs of more than one connection. It would be possible to develop a more sophisticated model capable of describing these types of dependencies, but it seems very difficult to populate such a model from traces in an accurate manner without knowledge of application semantics. As an alternative, the traffic generation approach proposed in this dissertation carefully reproduces relative differences in connection start times, which tends to preserve temporal dependencies between connections. Our experimental results also suggest that the impact of interconnection dependencies is negligible, at least for our collection of Internet traces.

Doctoral Dissertation: Generation and Validation of Empirically-Derived TCP Application Workloads
© 2006 Félix Hernández-Campos