

The Concurrent a-b-t Model

Figure 3.8: An a-b-t diagram illustrating an NNTP connection in ``streaming mode'', which exhibits data exchange concurrency.
\includegraphics[width=6.1in]{fig/abt-diagram/nntp-conc.eps}

Figure 3.9: An a-b-t diagram illustrating the interaction between two BitTorrent peers.
\includegraphics[width=6.1in]{fig/abt-diagram/bittorrent.eps}

In the sequential model we have considered so far, application data flows either from the client to the server or from the server to the client at any given time. However, some TCP connections are not driven by this traditional style of client/server interaction; some applications send data from both endpoints of the connection at the same time. Figure 3.8 shows an NNTP connection between two NNTP peers (servers) that use NNTP's ``streaming mode''. As shown in the diagram, ADUs \(b_5\) and \(b_6\) are sent from the connection acceptor to the connection initiator while ADU \(a_6\) is being sent in the opposite direction. ADUs \(b_5\) and \(b_6\) carry 438 responses, with which the acceptor NNTP peer tells the initiator that it does not want articles id3 and id4, while ADU \(a_6\) carries article id2 in the opposite direction. There is no causal dependency between these ADUs, which makes it possible for the two endpoints to send data independently. This connection is therefore said to exhibit data exchange concurrency, in the sense that one or more pairs of ADUs are exchanged simultaneously. In contrast, the connections illustrated in the previous figures exchanged data units in a sequential fashion. A fundamental difference between these two communication patterns is that each sequential request/response exchange (i.e., each epoch) always takes a minimum of one round-trip time, whereas data exchange concurrency makes it possible to send and receive more than one ADU per round-trip time, which can increase throughput substantially. In the figure, the initiator NNTP peer can issue CHECK requests more quickly because it does not wait for the corresponding responses, each of which would take a minimum of one full round-trip time to arrive.
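To make the round-trip cost concrete, the following back-of-the-envelope sketch compares the minimum duration of the two styles of exchange. This is a minimal illustration in Python; the function names and the example numbers are ours, not taken from the measurements in this dissertation.

\begin{verbatim}
def min_duration_sequential(num_epochs: int, rtt: float) -> float:
    """Lower bound for a sequential connection: each request/response
    epoch costs at least one full round-trip time."""
    return num_epochs * rtt

def min_duration_concurrent(num_adus: int, rtt: float,
                            send_time: float) -> float:
    """Lower bound for a concurrent connection: ADUs are sent
    back-to-back, so the connection pays roughly one round-trip time
    plus the time needed to serialize the ADUs onto the link."""
    return rtt + num_adus * send_time

# Example: 10 request/response exchanges over a 100 ms path.
print(min_duration_sequential(10, 0.100))         # 1.0 seconds
print(min_duration_concurrent(10, 0.100, 0.002))  # 0.12 seconds
\end{verbatim}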

Another example of concurrent data exchange is shown in Figure 3.9, where two BitTorrent peers [Coh03] exchange pieces of a large file that both peers are trying to download. The BitTorrent protocol supports the backlogging of requests (i.e., pieces k and m of the file are requested before the download of the preceding piece is completed) and the simultaneous exchange of file pieces (i.e., the transmission of pieces k and l coexists with the transmission of piece m). As discussed above, this type of behavior helps to avoid quiet times in BitTorrent connections, thereby increasing average throughput. Furthermore, this example illustrates a type of application in which both endpoints act as client and server (each endpoint both requests and serves file pieces).
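The effect of request backlogging can be sketched as a loop that keeps a fixed number of piece requests outstanding, so the remote peer always has data to send. The peer object, its request/receive methods, and the backlog size below are hypothetical, not part of the BitTorrent specification.

\begin{verbatim}
def download_pieces(peer, pieces, backlog=4):
    """Keep up to `backlog` piece requests outstanding (hypothetical
    peer API); the sender never idles waiting for the next request."""
    queue = list(pieces)
    pending = []
    while queue or pending:
        # Top up the backlog before blocking on any incoming data.
        while queue and len(pending) < backlog:
            piece = queue.pop(0)
            peer.request(piece)     # non-blocking request
            pending.append(piece)
        done = peer.receive()       # blocks until one piece arrives
        pending.remove(done)
\end{verbatim}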

Application designers make use of data concurrency for two primary purposes: to keep the pipe full, avoiding the round-trip-time cost that sequential request/response exchanges impose and thereby increasing throughput; and to support naturally concurrent communication, in which the two endpoints produce data independently of each other.

Examples of protocols that attempt to keep the pipe full are the pipelining mode in HTTP, the streaming mode in NNTP, the Rsync protocol for file system synchronization, and the BitTorrent protocol for file-sharing. Examples of protocols/applications that support natural concurrency are instant messaging and Gnutella (in which the search messages are simply forwarded to other peers without any response message). Since BitTorrent supports client/server exchanges in both directions, and these exchanges are independent of each other, we can say that BitTorrent also supports a form of natural concurrency.

For data-concurrent connections, we use a different version of our a-b-t model in which the two directions of the connection are modeled independently by a pair \((\alpha, \beta)\) of connection vectors of the form

\begin{displaymath}\alpha = ((a_1, ta_1), (a_2, ta_2), \dots, (a_{n_a}, ta_{n_a}))\end{displaymath}

and

\begin{displaymath}\beta = ((b_1, tb_1), (b_2, tb_2), \dots, (b_{n_b}, tb_{n_b}))\end{displaymath}
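A direct transcription of this pair of vectors into code could look as follows. This is a minimal sketch in Python with names of our own choosing; nothing here is prescribed by the model itself.

\begin{verbatim}
from dataclasses import dataclass
from typing import List, Tuple

# One (size, quiet time) pair: an ADU of `size` bytes followed by a
# quiet time in seconds before the next ADU in the same direction.
ADU = Tuple[int, float]

@dataclass
class ConcurrentConnectionVector:
    """Concurrent a-b-t model: two independent per-direction vectors."""
    alpha: List[ADU]  # initiator to acceptor: ((a_1, ta_1), ...)
    beta: List[ADU]   # acceptor to initiator: ((b_1, tb_1), ...)
\end{verbatim}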

Depending on the nature of the concurrent connection, this model may or may not be a simplification. If the two sides of the connection are truly independent, the model is accurate. If some dependency exists, it is not reflected in our characterization (e.g., the fact that request \(a_i\) necessarily preceded response \(b_j\) is lost). Our current data acquisition techniques cannot distinguish these two cases, and we doubt that any technique can distinguish them accurately. In any case, the two independent vectors of our concurrent a-b-t model provide enough detail to capture the two uses of concurrent data exchange in a manner relevant for traffic generation. In the case of pipelined requests, one side of the connection mostly carries large ADUs with little or no quiet time between them (i.e., backlogged responses). The exact timing at which the requests arrive in the opposite direction is irrelevant as long as there is always an ADU carrying a response ready to be sent. The very purpose of the concurrency is to decouple the two directions, avoiding the one round-trip time per request/response pair that sequential connections must incur. There is, therefore, substantial independence in concurrent connections of this type, which supports the use of a model like the one we propose. In the case of connections that are ``naturally'' concurrent, the two sides are accurately described by two separate connection vectors.
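For traffic generation, this independence means that each direction can be replayed on its own schedule. The following is a minimal sketch of the initiator endpoint, assuming an already-connected TCP socket; the acceptor would run the mirror image using \(\beta\). The function names are illustrative.

\begin{verbatim}
import socket
import threading
import time

def replay_direction(sock: socket.socket, adus) -> None:
    """Send each ADU as a block of bytes, then honor the recorded
    quiet time before sending the next ADU in this direction."""
    for size, quiet in adus:
        sock.sendall(b"\x00" * size)
        time.sleep(quiet)

def drain(sock: socket.socket) -> None:
    """Consume whatever the other endpoint sends; its timing is
    governed entirely by the other endpoint's own vector."""
    while sock.recv(65536):
        pass

def replay_initiator(sock: socket.socket, alpha) -> None:
    # The two threads never synchronize with each other, mirroring the
    # model's assumption that the two directions are independent.
    sender = threading.Thread(target=replay_direction, args=(sock, alpha))
    receiver = threading.Thread(target=drain, args=(sock,))
    sender.start(); receiver.start()
    sender.join(); receiver.join()
\end{verbatim}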


