Another Heuristic to compute AB_est - Exponential Moving Averaging

1. Description

raw data: gap(1) gap(2) gap(3) gap(4) ... gap(N)
pruned: for n=1, prune_gap(1) = gap(1)
        for n>1, prune_gap(n) = alpha * prune_gap(n-1) + (1-alpha) * gap(n)
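
The recurrence above can be sketched in a few lines of Python. This is only an illustration: the function name ema_smooth and the sample gap values are made up for the example and are not taken from the actual code.

```python
def ema_smooth(gaps, alpha):
    """Return the EMA-smoothed copy of `gaps`.

    The first sample passes through unchanged; every later sample is
    alpha * previous_smoothed + (1 - alpha) * current_raw.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    smoothed = []
    for gap in gaps:
        if not smoothed:
            smoothed.append(float(gap))  # n = 1: prune_gap(1) = gap(1)
        else:
            smoothed.append(alpha * smoothed[-1] + (1.0 - alpha) * gap)
    return smoothed


# Hypothetical gap values with a single spike at the third sample.
example_gaps = [10.0, 10.0, 30.0, 10.0]
light = ema_smooth(example_gaps, 0.5)    # spike is halved
heavy = ema_smooth(example_gaps, 0.875)  # spike is almost flattened
print(light)
print(heavy)
```

A larger alpha gives more weight to the history, so a single spike in the raw gaps is damped more strongly.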

2. Effect of EMA on gaps

3. Effect of EMA on AB_ests

One example of over-estimation:

In this pstream, when valid excursion=2, alpha=0.5 gives abest=688Mbps; alpha=0.5 gives abest=673Mbps, but alpha=0.875 gives abest=952Mbps. The reason is that alpha=0.875 over-smooths the gaps. In the raw data, the sendgap of the 26th probe increases suddenly, which corresponds to a larger cum_size. After the moving average is applied, that burst at the 26th probe disappears: the smoothed sendgap is much smaller than the raw one, which leads to a larger probing rate (= size/sendgap).

However, this problem is easy to fix: "Excursions must be detected using smoothed gaps, but raw data must be used when computing rate."
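
A minimal sketch of why the rate should come from the raw sendgap, assuming sendgaps in microseconds, a fixed probe size in bytes, and a hypothetical probe index already chosen by excursion detection on the smoothed gaps. The excursion detector itself is not shown, and all names and numbers here are illustrative only.

```python
def ema_smooth(gaps, alpha):
    """EMA: first sample unchanged, then alpha * prev + (1 - alpha) * raw."""
    smoothed = []
    for gap in gaps:
        if not smoothed:
            smoothed.append(float(gap))
        else:
            smoothed.append(alpha * smoothed[-1] + (1.0 - alpha) * gap)
    return smoothed


def rate_mbps(size_bytes, sendgap_us):
    """Probing rate = size / sendgap; bits per microsecond equals Mbps."""
    return size_bytes * 8.0 / sendgap_us


# Hypothetical raw sendgaps (us): the last probe shows a sudden increase.
raw_sendgaps = [16.0, 16.0, 16.0, 32.0]
smoothed_sendgaps = ema_smooth(raw_sendgaps, 0.875)

# Assumption: excursion detection on the smoothed gaps selected this probe.
selected = 3
size_bytes = 1500  # assumed probe size

rate_from_smoothed = rate_mbps(size_bytes, smoothed_sendgaps[selected])
rate_from_raw = rate_mbps(size_bytes, raw_sendgaps[selected])

print(f"rate from smoothed sendgap: {rate_from_smoothed:.1f} Mbps")
print(f"rate from raw sendgap:      {rate_from_raw:.1f} Mbps")
```

In this toy input the smoothed sendgap at the selected probe is 18 us instead of 32 us, so the rate computed from it is much higher than the rate the probe was actually sent at.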

4. After fixing the over-estimation problem described above: using raw data to calculate AB_est

If the raw data (before EMA) is used to calculate AB_est, all lines shift to the right. The raw sendgaps at the end of each pstream are usually smaller than the smoothed sendgaps, which leads to higher AB estimates. But this removes the long tail seen in the previous plot, avoiding unexpected over-estimation caused by a sudden increase in sendgaps.

5. Observation:

  • EMA fails to produce a perfect match between AB_est and the actual ab. But over-estimation and under-estimation are both random events, so if rapid under-estimates ab more than half of the time, it will not overload the network.
  • EMA has a better averaging effect than the arithmetic averaging method, and it requires much less change to the code. So EMA is preferred over arithmetic averaging.