HTTP simulation

Introduction

This page describes how to use the HTTP simulation package (THTTP). The idea is to simulate HTTP 1.0 (2) traffic by simulating a number of browsers requesting service from a number of servers. For instance, we may simulate 3000 browsers actively requesting service from 5 servers.

To model HTTP traffic realistically, the package uses an implementation of the HTTP traffic model described in (1).

Note that this is a practical introduction to running the experiments.

The THTTP simulation tool comes in two parts: a server side, which represents the "webserver" (THTTPD), and a client side, which represents a number of browsers (THTTP). In the following we first look at THTTPD and then at THTTP. After the simulation package, a couple of tools that support running experiments are described.

THTTPD

THTTPD simulates the server side of the HTTP traffic. It simply replies to any request from a browser by returning a number of bytes, following the HTTP model.

THTTPD takes the following command line options:

[-d] (print debugging messages)
[-s stop_time] (stop after this many seconds) [We usually run experiments for 45 minutes. The simulator has a startup period before it reaches a stable state, so the first 15-20 minutes of each experiment should be ignored. Give the server side 1 minute more than the client side, to ensure that the servers do not finish before the clients.]
[-c #connections] (maximum connections) [default=100]
[-r #read_bytes] (maximum bytes per read) [default=4320]
[-w #write_bytes] (maximum bytes per write) [default=4320]
[-p listen_port] (port listening for connects) [default=4320]
[-rs random_seed] (use a specific random seed) [Choosing a specific random seed fixes the simulator's behavior, which should make it possible to repeat a run. If no seed is specified, the simulator chooses a random one.]
[-ld log_display] (logging interval sec.) [usually set to 1 sec.]

Log files

Each line of the THTTPD log file summarizes the events of the last logging interval. Each line has the following format:

Timestamp

Elapsed time (ms)

Connections - number of finished replies

Bytes Out

Bytes In

Response time - the time it took to process a request

A sample log clipping:

Tue Oct 20 11:05:45 1998     366000         34     426263      12997       3416
Tue Oct 20 11:05:46 1998     367000         12     188012       4496        800
Tue Oct 20 11:05:47 1998     368000         14     112616       4693       3313
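
For post-processing it can help to turn these lines into records. The following is only a minimal sketch in Python, not part of the THTTP distribution: the record and function names are made up for illustration, and it assumes that the last five whitespace-separated columns are the numeric fields listed above and that everything before them is the timestamp.

```python
from datetime import datetime
from typing import NamedTuple


class IntervalRecord(NamedTuple):
    """One THTTPD log line (field names are illustrative)."""
    timestamp: datetime
    elapsed_ms: int
    connections: int
    bytes_out: int
    bytes_in: int
    response_time: int


def parse_log_line(line: str) -> IntervalRecord:
    tokens = line.split()
    # Assumption: the last five tokens are the numeric columns and the
    # tokens before them form the timestamp.
    stamp = " ".join(tokens[:-5])
    numbers = [int(token) for token in tokens[-5:]]
    when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return IntervalRecord(when, *numbers)


SAMPLE = """\
Tue Oct 20 11:05:45 1998     366000         34     426263      12997       3416
Tue Oct 20 11:05:46 1998     367000         12     188012       4496        800
Tue Oct 20 11:05:47 1998     368000         14     112616       4693       3313
"""

records = [parse_log_line(line) for line in SAMPLE.splitlines() if line.strip()]
total_bytes_out = sum(record.bytes_out for record in records)
print(len(records), "intervals,", total_bytes_out, "bytes out")
```

Splitting from the right keeps the parser independent of how many spaces separate the columns.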

THTTP

The client side of the simulation. Each instance of THTTP simulates a number of browsers. Each browser has a state machine that describes whether it is thinking, requesting, or on its way to the next server. A polling loop iterates over the browsers, doing non-blocking communication, thereby simulating concurrently running browsers.
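
To illustrate the idea of one state machine per browser driven by a single polling loop, here is a toy sketch in Python. It is not the THTTP implementation: there are no sockets, the transition order (thinking, requesting, moving on, thinking again) and the fixed durations are assumptions chosen for the example, and all names are hypothetical. In the real simulator the timings come from the traffic model.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    THINKING = "thinking"
    REQUESTING = "requesting"
    MOVING = "on its way to the next server"


# Assumed transition order and fixed durations, measured in polling ticks.
NEXT_STATE = {
    State.THINKING: State.REQUESTING,
    State.REQUESTING: State.MOVING,
    State.MOVING: State.THINKING,
}
DURATION = {State.THINKING: 3, State.REQUESTING: 2, State.MOVING: 1}


@dataclass
class Browser:
    state: State = State.THINKING
    ticks_left: int = DURATION[State.THINKING]
    requests_done: int = 0

    def step(self) -> None:
        """Advance one polling tick; this never blocks."""
        self.ticks_left -= 1
        if self.ticks_left > 0:
            return
        if self.state is State.REQUESTING:
            self.requests_done += 1
        self.state = NEXT_STATE[self.state]
        self.ticks_left = DURATION[self.state]


def run(browsers, ticks):
    """The polling loop: visit every browser once per tick."""
    for _ in range(ticks):
        for browser in browsers:
            browser.step()


# Three browsers that start at different points of their thinking period.
browsers = [Browser(ticks_left=n) for n in (1, 2, 3)]
run(browsers, ticks=12)
for browser in browsers:
    print(browser.state.value, browser.requests_done)
```

Because no browser ever blocks, one process can keep many browsers in different states at the same time, which is the point of the polling design.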

THTTP takes the following command line options:

[-d] (print debugging messages)
[-f file_name] (server configuration file) [see next section]
[-s stop_time] (stop after this many seconds) [use 45 minutes, i.e. 2700 seconds]
[-b #browsers] (number of active browsers) [between 0 and 1300; more will overload the host and perturb the experiment]
[-c #connections] (connections per browser) [default=2]
[-r #read_bytes] (maximum bytes per read) [default=4320]
[-w #write_bytes] (maximum bytes per write) [default=4230]
[-rt round_trip] (maximum round trip time ms.) [default=10 secs]
[-rs random_seed] (use a specific random seed) [see under THTTPD]
[-ld log_display] (logging interval sec.) [usually set to 1 sec.]
[-es] (extended statistics) [extends the statistics to include per-page statistics]
[-dlr filename] (detailed logging of per-connection rsp times) [logs the response time of each request, as opposed to averaging over the logging interval]
[-dlp filename] (detailed logging of per-page rsp times) [as above, but logging per page instead of per request]
[-dlrp filename] (detailed logging of both page and connection rsp times) [combination of the previous two]
[-rf filename] (reads a binary logfile written by the -dl options)

Server Configuration file

This file tells the client which hosts run servers, and on which port. An example file is:

host:howard138;port:6789
host:lovey138;port:6789
host:speedy138;port:6789
host:brain138;port:6789
host:petunia138;port:6789

Note: there must not be a newline after the last entry, or THTTP will break!
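
Since a trailing newline is easy to introduce by accident, it may be safer to generate the file. The sketch below (Python, with hypothetical function names) renders entries in the host:...;port:... format shown above and joins them so that no newline follows the last entry. The small parser is only there to check the result.

```python
def format_server_config(servers):
    """Render (host, port) pairs, one entry per line, with no trailing newline."""
    return "\n".join(f"host:{host};port:{port}" for host, port in servers)


def parse_server_config(text):
    """Read the entries back into (host, port) pairs."""
    servers = []
    for line in text.splitlines():
        fields = dict(part.split(":", 1) for part in line.split(";"))
        servers.append((fields["host"], int(fields["port"])))
    return servers


servers = [("howard138", 6789), ("lovey138", 6789), ("speedy138", 6789)]
config_text = format_server_config(servers)
print(repr(config_text))
```

When saving the text, write it with a plain file write rather than a print-style call, since the latter usually appends a newline.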

Logging

THTTP has three logging modes, which can be combined: default logging, extended logging, and detailed logging.

Default logging prints a single line for each logging interval, in the following format:

Timestamp Elapsed time (ms) connections bytes out bytes in cumulative conn. rsp time (ms)

Example:

Mon Aug 3 22:32:36 1998     263000         26      10472     674864       7489
Mon Aug 3 22:32:37 1998     264000         44      18969     554220      11866
Mon Aug 3 22:32:38 1998     265000         31       9979     588430      15724
Mon Aug 3 22:32:39 1998     266000         33       9679     115978       4910
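
Because the startup period should be ignored, a log is normally filtered on the elapsed time before anything is computed from it. The following Python sketch shows one way to do that. It assumes that the cumulative response time column is the sum of the response times of the connections in that interval, so that dividing by the connection count gives a mean; the function name and the 20-minute cut-off constant are illustrative choices.

```python
WARMUP_MS = 20 * 60 * 1000  # drop the first 20 minutes (upper end of 15-20 min)


def steady_state_mean_response(lines, warmup_ms=WARMUP_MS):
    """Mean connection response time (ms) over intervals after the warm-up.

    Returns None when no connections remain after filtering.
    """
    total_rsp_ms = 0
    total_connections = 0
    for line in lines:
        tokens = line.split()
        if len(tokens) < 6:
            continue  # not a default logging line
        elapsed_ms, connections, _out, _in, rsp_ms = map(int, tokens[-5:])
        if elapsed_ms < warmup_ms:
            continue  # still inside the startup period
        total_rsp_ms += rsp_ms
        total_connections += connections
    if total_connections == 0:
        return None
    return total_rsp_ms / total_connections


SAMPLE = """\
Mon Aug 3 22:32:36 1998     263000         26      10472     674864       7489
Mon Aug 3 22:32:37 1998     264000         44      18969     554220      11866
Mon Aug 3 22:32:38 1998     265000         31       9979     588430      15724
Mon Aug 3 22:32:39 1998     266000         33       9679     115978       4910
""".splitlines()

# The sample is only a few minutes into a run, so a small cut-off is used here.
print(steady_state_mean_response(SAMPLE, warmup_ms=265000))
```

With the default cut-off the sample above yields no data at all, which is the intended behaviour for lines logged during startup.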

Extended logging: prints a default logging line as well as a line with cumulative statistics for each server. Furthermore, these statistics include page statistics instead of only connection response times. The lines have the following formats:

STAT:

Timestamp Elapsed time connections pages bytes out bytes in cumulative conn. rsp time cumulative page rsp time

SERVER:

Server name connections cumulative conn. rsp time pages cumulative page rsp time

Example:
STAT: Wed Sep 30 17:11:00 1998, 142081, 10, 8, 3992, 76562, 11121, 11977
SERVER:howard138,3,3134,2,2454
SERVER:lovey138,5,6272,4,7805
SERVER:speedy138,1,1318,1,1319
SERVER:brain138,1,397,1,399
SERVER:petunia138,0,0,0,0
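
The extended output is comma separated, which makes it straightforward to process. As a rough sketch (Python, hypothetical names, field order taken from the format lines above), the block below parses one STAT line with its SERVER lines and checks that the per-server counts add up to the totals on the STAT line.

```python
STAT_FIELDS = ("elapsed", "connections", "pages", "bytes_out",
               "bytes_in", "conn_rsp", "page_rsp")


def parse_extended(lines):
    """Return (stat, servers) for one logging interval."""
    stat, servers = None, {}
    for line in lines:
        if line.startswith("STAT:"):
            timestamp, *numbers = [f.strip() for f in line[len("STAT:"):].split(",")]
            stat = {"timestamp": timestamp}
            stat.update(zip(STAT_FIELDS, map(int, numbers)))
        elif line.startswith("SERVER:"):
            name, conns, conn_rsp, pages, page_rsp = line[len("SERVER:"):].split(",")
            servers[name.strip()] = {
                "connections": int(conns), "conn_rsp": int(conn_rsp),
                "pages": int(pages), "page_rsp": int(page_rsp),
            }
    return stat, servers


SAMPLE = """\
STAT: Wed Sep 30 17:11:00 1998, 142081, 10, 8, 3992, 76562, 11121, 11977
SERVER:howard138,3,3134,2,2454
SERVER:lovey138,5,6272,4,7805
SERVER:speedy138,1,1318,1,1319
SERVER:brain138,1,397,1,399
SERVER:petunia138,0,0,0,0
""".splitlines()

stat, servers = parse_extended(SAMPLE)
# Mean connection response time per server, skipping idle servers.
mean_conn_rsp = {name: s["conn_rsp"] / s["connections"]
                 for name, s in servers.items() if s["connections"]}
print(mean_conn_rsp)
```

In the example the SERVER lines sum exactly to the STAT totals, which makes a handy sanity check when processing logs from a new run.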

Detailed logging: during detailed logging, every connection or page request is logged as a single entry. Because of the amount of data captured, the data is dumped to a binary file (1-2MB). Note that this logging mechanism may use more memory than is available as primary memory.

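THTTP provides functionality to read the binary log files with the -rf option. Each line of a detailed log is one entry. The format is shown here for a req entry; as the example shows, page entries use the same layout.

req entry:

Timestamp (gettimeofday) servername time (ms)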

Example:
version: 1
servers: howard138, lovey138, speedy138, brain138, petunia138
mean per connection response time: 1098.153785
mean per page response time: 4551.405574
req: 908980000.888406 lovey138        416
req: 908980003.733097 brain138        306
page: 908982697.583750 speedy138       243
page: 908982697.856280 lovey138        2123
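
The text produced when reading a detailed log can be processed further, for example to recompute means for a subset of the entries. The Python sketch below is an illustration with hypothetical names. It assumes that entries start with "req:" or "page:" as in the example and treats every other line as header information to be skipped.

```python
from statistics import mean


def parse_detailed(lines):
    """Return (kind, timestamp, server, time_ms) tuples for req and page entries."""
    entries = []
    for line in lines:
        kind, _, rest = line.strip().partition(":")
        if kind not in ("req", "page"):
            continue  # header lines such as version, servers and the means
        stamp, server, time_ms = rest.split()
        entries.append((kind, float(stamp), server, int(time_ms)))
    return entries


SAMPLE = """\
version: 1
servers: howard138, lovey138, speedy138, brain138, petunia138
mean per connection response time: 1098.153785
mean per page response time: 4551.405574
req: 908980000.888406 lovey138        416
req: 908980003.733097 brain138        306
page: 908982697.583750 speedy138       243
page: 908982697.856280 lovey138        2123
""".splitlines()

entries = parse_detailed(SAMPLE)
req_mean = mean(ms for kind, _, _, ms in entries if kind == "req")
page_mean = mean(ms for kind, _, _, ms in entries if kind == "page")
print(req_mean, page_mean)
```

The means computed here differ from those in the header because the example is only a short clipping of a full log.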

Hints on setting up an experiment

Delaybox settings: To avoid synchronization problems we use the delaybox. This also makes the simulation more "realistic", since it simulates machines at different distances from each other. Always check the delaybox settings before running an experiment.
Automation: The THTTP experiments involve a large number of parameters. My personal experience is that it is very difficult to be consistent when running many experiments, so it is a good idea to automate.
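
One simple form of automation is to derive both command lines from a single set of parameters, so that the stop times and seeds cannot drift apart. The following Python sketch only builds argument lists using the options documented above. The executable names thttpd and thttp, the file name servers.conf, and the helper names are assumptions for the example, so adjust them to the local installation.

```python
RUN_SECONDS = 45 * 60       # 45-minute experiment
SERVER_EXTRA_SECONDS = 60   # servers run 1 minute longer than the clients
MAX_BROWSERS = 1300         # more than this overloads the client host


def thttpd_args(port, seed=None, run_seconds=RUN_SECONDS):
    """Argument list for one server ('thttpd' is an assumed executable name)."""
    args = ["thttpd", "-s", str(run_seconds + SERVER_EXTRA_SECONDS),
            "-p", str(port), "-ld", "1"]
    if seed is not None:
        args += ["-rs", str(seed)]
    return args


def thttp_args(config_file, browsers, seed=None, run_seconds=RUN_SECONDS):
    """Argument list for one client ('thttp' is an assumed executable name)."""
    if not 0 <= browsers <= MAX_BROWSERS:
        raise ValueError(f"browsers must be between 0 and {MAX_BROWSERS}")
    args = ["thttp", "-f", config_file, "-s", str(run_seconds),
            "-b", str(browsers), "-ld", "1"]
    if seed is not None:
        args += ["-rs", str(seed)]
    return args


server_cmd = thttpd_args(port=6789, seed=42)
client_cmd = thttp_args("servers.conf", browsers=600, seed=42)
print(" ".join(server_cmd))
print(" ".join(client_cmd))
```

Keeping the one-minute server margin and the browser limit in one place means they are applied the same way in every run.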

Tools

During my work with the HTTP simulator I have written a couple of tools that might be of interest.

All of these tools are currently used on the Dirt Research Network and will require some modification in order to function on the class network.

None of the scripts require root access.

All these tools expect that each experiment has the following directory structure:

/some-dir-which-is-accessible-from-all-hosts-in-the-network/ ... /expname
expname/plots # plots
expname/plots/tmp # temporary space for plot producing scripts.

Testnet

The idea of testnet is to have a script that quickly and automatically tests the network configuration. It is written in perl and is quite simple. As its argument testnet takes a configuration file:

Example configuration file:
delay floyd134 brain138 0 5
delay floyd134 howard138 210 5
route thelmalou134 152.2.134.5 192.168.137.4 howard138
window brain138 16384
window howard138 16384

Each entry describes a test to run. A delay test consists of a source host, a destination host, a delay, and an allowed jitter. A delay test is done by performing 3 consecutive pings and taking the average.

A route test consists of the list of hosts which should occur during a trace route. In this case from thelmalou134 to howard138.

The window test checks the size of the windows of a given host. See the sample configuration file for further instructions.
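
To make the three entry types concrete, here is a small Python sketch that parses such a configuration and evaluates a delay test. It is not the testnet script itself. The pass criterion (the average of the pings must lie within the allowed jitter of the configured delay), the reading of a route entry as a source host followed by the expected hosts, and all names are assumptions for the example. No pings are sent; the measured values are passed in.

```python
def parse_testnet_config(text):
    """Turn configuration lines into a list of test descriptions."""
    tests = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, args = fields[0], fields[1:]
        if kind == "delay":
            src, dst, delay, jitter = args
            tests.append({"kind": kind, "src": src, "dst": dst,
                          "delay": int(delay), "jitter": int(jitter)})
        elif kind == "route":
            # Assumed layout: source host, then the hosts expected in the trace.
            tests.append({"kind": kind, "src": args[0], "hops": args[1:]})
        elif kind == "window":
            tests.append({"kind": kind, "host": args[0], "size": int(args[1])})
    return tests


def delay_ok(test, ping_times):
    """Assumed criterion: the average ping is within the jitter of the delay."""
    average = sum(ping_times) / len(ping_times)
    return abs(average - test["delay"]) <= test["jitter"]


CONFIG = """\
delay floyd134 brain138 0 5
delay floyd134 howard138 210 5
route thelmalou134 152.2.134.5 192.168.137.4 howard138
window brain138 16384
window howard138 16384
"""

tests = parse_testnet_config(CONFIG)
howard_delay = tests[1]
print(delay_ok(howard_delay, [208, 212, 213]))  # three hypothetical ping times
```

The ping times must be in the same unit as the delay and jitter given in the configuration file.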

Files in distribution: testnet and testnetconfig

Runtest

This script runs an experiment described in a configuration file. It is not fully generic, but close. It is written in perl and has a couple of supporting shell scripts (bash).

The test starts the simulator and the associated logging. All logging is done to local disk space. When the experiment is done, the logs are copied to central storage through NFS.

Look at the sample configuration file for further instructions.

Files in distribution: runtest, localclientrun, localserverrun, testconfig, killsomething
Currently uses testnet.

DL cdf

Produces a CDF from an experiment with different delay settings. Given a set of detailed (dl) log files and a description of the delay settings, this script produces a CDF over the response times.

Files in distribution: dl_cdf, delaytable.pl

Sample: a CDF over response times with different delay settings.
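
For reference, an empirical CDF over a list of response times can be computed as in the Python sketch below. This is not the dl_cdf script, only an illustration of the calculation with a hypothetical function name. With several delay settings, the same function would be applied once per setting to get one curve each.

```python
def empirical_cdf(samples):
    """Return (value, fraction of samples <= value) pairs in ascending order."""
    ordered = sorted(samples)
    count = len(ordered)
    points = {}
    for index, value in enumerate(ordered, start=1):
        points[value] = index / count  # later duplicates overwrite earlier ones
    return list(points.items())


# Response times (ms) taken from the detailed log example earlier on this page.
response_times_ms = [416, 306, 243, 2123]
cdf = empirical_cdf(response_times_ms)
for value, fraction in cdf:
    print(value, fraction)
```

Each pair can be plotted directly as a step function with response time on the x axis and cumulative fraction on the y axis.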

Distribution

/usr/src/thttp.tgz on bennett

Comments and Help

If you have problems and need help with some of the tools described on this page, you can contact me by email:

mixxel@cs.unc.edu

Literature

1) Bruce A. Mah: An Empirical Model of HTTP Network Traffic.
2) RFC 1945: HTTP/1.0 (not RFC 2324:-)