This chapter is the longest in this dissertation. It presents the results of
20 source-level trace replay experiments using 130 plots and 10 tables.
The conclusions are not always straightforward or consistent across traces,
so it is difficult to form a coherent picture by simply going through the entire
body of results. In this section, we summarize our results so far
to make the rest of the chapter easier to follow.
Our summary takes the form of a list of 18 observations, which report both
findings that were consistent across Leipzig-II and UNC 1 PM and findings that were not.
Observations on Byte Throughput
From the analysis of the plots of the time series of byte throughput, their marginal
distributions and wavelet spectra, we can make the following observations:
B.1
Both full and collapsed-epochs replays provide a reasonable approximation of the
original 1-minute time series of byte throughput and the body of its 10-millisecond
marginal.
Replays do not track every spike in the original time series, but the similarity is
remarkable.
The replays achieve a very close approximation of the Leipzig-II time series, but
are slightly below the UNC 1 PM time series.
For both traces, the approximation of the body of the original marginal is somewhat
better for the inbound direction than for the outbound one. This observation is not
explained by traffic volume asymmetry, since the inbound direction was the dominant
direction in terms of byte volume only in the case of Leipzig-II.
B.2
Lossless replays sometimes show substantially more spikes of 1-minute byte throughput
above the original trace than lossy replays. This is clear for UNC 1 PM but not for
Leipzig-II. At the finer scales studied by the marginal distributions, we find that the
tails of the lossless replays are substantially heavier than those of the lossy replays.
However, they are not consistently above the tails of the original distributions.
In contrast, the results for every trace show that the bodies of the lossless replays are
wider than the bodies of the lossy replays. This reveals higher burstiness in the lossless
replays, in the sense that they have a higher probability of bins with byte throughput far
from the mean (i.e., a larger number of 10-millisecond intervals with
rather low or rather high byte throughput).
B.3
Collapsed-epochs replays show somewhat more bursty 1-minute time series, and
track the changes in the shape of the original time series less closely. The extra
burstiness may not appear very substantial in the plots, but given the coarse scale,
it may have a large impact on experiments sensitive to prolonged byte throughput spikes.
We do not find a corresponding phenomenon for the marginal distributions, where
collapsed-epochs replays are generally close to the full replays (except for the
outbound direction of UNC 1 PM). Together with observation B.5,
this shows that the extra burstiness of the collapsed-epochs replays manifests itself
in the auto-correlation structure of the byte throughput process, rather than in the set
of byte throughputs observed throughout the replays.
B.4
Full replays provide a close approximation of the scaling region (octaves 6 to 15)
of the wavelet spectra of the original traces. This does not necessarily translate into
similarly good approximations of the estimated Hurst parameters. Only the lossy replays
are within confidence intervals for Leipzig-II, while only the lossless ones are within
confidence intervals for UNC 1 PM.
B.5
Collapsed-epochs replays tend to show slightly more energy in the scaling region.
This is true for the four spectra from lossless replays and for the two spectra from
the lossy replay of Leipzig-II. However, the energy of the original scaling region is well
approximated by the lossy collapsed-epochs replay for the outbound direction of UNC 1 PM.
This higher energy in the wavelet spectrum plot does not necessarily translate into
higher estimates of the Hurst parameters.
B.6
Neither full nor collapsed-epochs replays consistently match the
spectra at the finer scales (octaves 1 to 5). We find higher or slightly higher energy
levels for the replays of Leipzig-II, similar levels for the replays of the inbound
direction of UNC 1 PM and lower levels for the outbound direction of UNC 1 PM.
B.7
By construction, the most detailed replay is the lossy full replay, so we expect it
to achieve the best approximation of the original trace. This was always true for
1-minute time series, the body of the marginal distribution and the scaling region of
the wavelet spectrum. However, it was not consistently true for the tail of the marginal
distribution, the energy of the wavelet spectrum at fine scales, and the estimated Hurst
parameter.
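
Observations B.4 to B.7 compare energy per octave and the Hurst parameter estimated from it. As a rough illustration of that relationship, the following sketch computes a Haar wavelet energy spectrum and fits a slope over a chosen octave range. The Haar wavelet, the unweighted least-squares fit, and the function names `haar_energies` and `hurst_from_energies` are assumptions made for this example; the estimator used for the results above is not specified in this section and may differ (for instance, by weighting octaves or reporting confidence intervals). The octave range is a parameter, so a scaling region such as octaves 6 to 15 would be passed as `first_octave=6, last_octave=15` on a sufficiently long series.

```python
import math
import random


def haar_energies(series, max_octave):
    """Mean squared Haar detail coefficient for octaves 1..max_octave."""
    approx = list(series)
    energies = []
    scale = math.sqrt(2.0)
    for _ in range(max_octave):
        pairs = len(approx) // 2
        if pairs == 0:
            break  # series too short for further octaves
        # Orthonormal Haar step: differences are details, sums are approximations.
        details = [(approx[2 * k] - approx[2 * k + 1]) / scale for k in range(pairs)]
        approx = [(approx[2 * k] + approx[2 * k + 1]) / scale for k in range(pairs)]
        energies.append(sum(d * d for d in details) / pairs)
    return energies


def hurst_from_energies(energies, first_octave, last_octave):
    """Unweighted fit of log2(energy) against octave; H = (slope + 1) / 2."""
    xs = list(range(first_octave, last_octave + 1))
    ys = [math.log2(energies[j - 1]) for j in xs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return (num / den + 1.0) / 2.0


# Usage on synthetic white noise (not trace data): the spectrum should be
# roughly flat, so the estimate should sit near H = 0.5.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]
h_noise = hurst_from_energies(haar_energies(noise, 8), 1, 8)
print(f"estimated H for white noise: {h_noise:.2f}")
```

Because the slope is fitted over only a few octaves, two spectra that look alike in a plot can still yield different Hurst estimates, which is consistent with the gap between observations on spectra and on Hurst parameters noted above.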
Observations on Packet Throughput
We can make the following observations regarding packet throughput:
P.1
Full replays achieve a close approximation of the original 1-minute time series
of packet throughput, remaining between 2% and 8% below the original for most
of the time series. Collapsed-epochs replays result in a substantially worse approximation,
remaining between 20% and 30% below the original for most of the time series. This
difference is also present in the bodies of the 10-millisecond marginal distributions.
In the best case for full replays, the median of the marginal distribution is equal
to the original median (inbound direction of the UNC 1 PM lossy replay). In the
worst case, the median is 7% below the original (inbound direction
of the Leipzig-II lossy replay). Collapsed-epochs replays show medians of the
marginal distributions that are 20% (UNC 1 PM inbound) and 25% (Leipzig-II outbound)
below the original median.
P.2
Incorporating losses into the replays increases packet throughput,
reducing the distance to the original time series.
While this effect is small for Leipzig-II, it is rather significant for UNC 1 PM inbound.
In addition, lossless replays sometimes show more artificial spikes in the 1-minute
time series plot than the lossy ones (e.g., UNC 1 PM outbound). This phenomenon seems
less prominent for packet throughput than for byte throughput (see observation B.2).
P.3
Unlike the byte throughput case, the tails of the packet throughput marginals from
the replays are never significantly heavier than the original tails. Lossless
replays provide the best approximations of the original tails, and these are excellent in
some cases (Leipzig-II inbound and UNC 1 PM inbound). Lossy replays show lighter tails than
lossless replays, and therefore significantly worse approximations of the original
tails. We can also observe that the tails of the collapsed-epochs replays are consistently
lighter than those of the full replays. However, the impact of detailed modeling on the
tails of the marginals is less prominent than the impact of incorporating losses.
P.4
Full replays and lossy collapsed-epochs replays provide good approximations
of the original wavelet spectra, while the lossless collapsed-epochs replays show somewhat
higher energy. In general, we can say that the best approximation is achieved
by the lossless full replay. As in the case of byte throughput, Hurst parameter
estimates offer a different picture. Only the estimates for the lossy full replay are
within confidence intervals of the original estimates for Leipzig-II, while the estimates
for both lossless and lossy full replays are within confidence intervals for UNC 1 PM.
P.5
Replays do not consistently reproduce the energy levels at the finest
scales of the original time series of packet arrivals. We find minor differences
for Leipzig-II and UNC 1 PM inbound, and substantially larger ones for UNC 1 PM outbound.
Collapsed-epochs replays are significantly worse than full replays only for UNC 1 PM.
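
The percentages quoted in P.1 compare medians of 10-millisecond packet throughput marginals. A minimal sketch of that comparison follows, assuming packet arrival timestamps are available as integer microseconds; the function names `packets_per_bin` and `median_shortfall_percent` and the input layout are hypothetical and chosen only for illustration.

```python
import statistics


def packets_per_bin(timestamps_us, duration_us, bin_us=10_000):
    """Count packets in consecutive bins (default width: 10 milliseconds)."""
    n_bins = duration_us // bin_us
    counts = [0] * n_bins
    for t in timestamps_us:
        idx = t // bin_us
        if 0 <= idx < n_bins:  # ignore arrivals outside the measured interval
            counts[idx] += 1
    return counts


def median_shortfall_percent(original_counts, replay_counts):
    """Percentage by which the replay median lies below the original median."""
    orig = statistics.median(original_counts)
    rep = statistics.median(replay_counts)
    return 100.0 * (orig - rep) / orig


# Toy arrival times in microseconds (not trace data).
toy_counts = packets_per_bin([0, 5_000, 10_000, 25_000], duration_us=30_000)
print(toy_counts)
```

Integer timestamps are used so that bin assignment does not depend on floating-point rounding at bin boundaries. A positive result means the replay median is below the original, and a result of zero means the two medians coincide.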
Observations on Active Connections
Regarding active connections, we can make the following observations that
hold true for both Leipzig-II and UNC 1 PM:
C.1
The number of active connections in the original trace and in the
full replays is very similar.
C.2
The lossy full replay provides the best approximation of the
active connection time series, being within 1% of the original time series.
There is no difference for UNC 1 PM.
C.3
The number of active connections in collapsed-epochs replays is
several times smaller than in the original (around 3 times smaller for both Leipzig-II
and UNC 1 PM).
C.4
Adding losses to the replays substantially increases the average number of
connections. This increase is of the same magnitude for both full and collapsed-epochs
replays.
C.5
Full replays track the features of the original time series
very closely. The only difference between lossless and lossy replays is a slowly varying
offset. This suggests that losses have a homogeneous impact, lengthening the lifetimes of
a stable number of connections throughout the traces.
C.6
Unlike full replays, collapsed-epochs replays do not track the features of the
original time series. However, the magnitude of this effect pales in comparison to
the much smaller number of active connections.
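
To make the quantity behind C.1 to C.6 concrete, the sketch below counts active connections at a set of sample times from per-connection (start, end) intervals. It assumes a connection is active at time t when start <= t < end; this definition, the function names `active_connections` and `lengthen`, and the toy intervals are assumptions for illustration rather than the measurement procedure used for the results above.

```python
from bisect import bisect_right


def active_connections(intervals, sample_times):
    """Number of connections with start <= t < end, for each sample time t."""
    starts = sorted(s for s, _ in intervals)
    ends = sorted(e for _, e in intervals)
    # Connections started by t minus connections already ended by t.
    return [bisect_right(starts, t) - bisect_right(ends, t) for t in sample_times]


def lengthen(intervals, extra):
    """Extend every connection lifetime by the same amount (a crude stand-in for loss)."""
    return [(s, e + extra) for s, e in intervals]


# Toy intervals in arbitrary time units (not trace data).
toy = [(0, 5), (2, 8), (6, 9)]
times = [1, 3, 7]
base = active_connections(toy, times)
longer = active_connections(lengthen(toy, 3), times)
print(base, longer)
```

Lengthening lifetimes can only keep or raise the count at any sample time, which matches the direction of the effect described in C.4 and C.5.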