Rapid_v0.9.2 with UDP Cross Traffic

1. Description:

One rapid flow runs from rapid22 to rapid1, lasting 5s. A 300Mbps UDP cross-traffic flow from rapid6 to rapid3 is active from 2s to 5s.

2. Configuration:

Default window size, sysctl configuration, and probe array size.

3. Experiment Results:

(1) Baseline throughput for a single rapid flow from rapid22 to rapid1: 938Mbps
(2) Throughput with 300Mbps UDP cross traffic:

[  4]  0.0- 0.5 sec  6.71 MBytes   113 Mbits/sec
[  4]  0.5- 1.0 sec  39.4 MBytes   661 Mbits/sec
[  4]  1.0- 1.5 sec  53.5 MBytes   898 Mbits/sec
[  4]  1.5- 2.0 sec  55.8 MBytes   937 Mbits/sec
[  4]  2.0- 2.5 sec  47.9 MBytes   804 Mbits/sec
[  4]  2.5- 3.0 sec  41.5 MBytes   695 Mbits/sec
[  4]  3.0- 3.5 sec  15.9 MBytes   266 Mbits/sec
[  4]  3.5- 4.0 sec  11.5 MBytes   194 Mbits/sec
[  4]  4.0- 4.5 sec  12.7 MBytes   214 Mbits/sec
[  4]  4.5- 5.0 sec  19.7 MBytes   331 Mbits/sec
[  4]  5.0- 5.5 sec  16.5 MBytes   277 Mbits/sec
[  4]  5.5- 6.0 sec  22.7 MBytes   380 Mbits/sec
[  4]  6.0- 6.5 sec  54.6 MBytes   916 Mbits/sec
[  4]  6.5- 7.0 sec  51.6 MBytes   866 Mbits/sec
[  4]  0.0- 7.4 sec   488 MBytes   552 Mbits/sec


(3) Router and netstat statistics:
router 192.168.134.1: dropped 171 segments; 76 of them were dropped from the UDP connection.
receiver smithfd-rapid1: 0 packets rejected because of timestamp
                         1 segment retransmitted
sender smithfd-rapid22:  3174 segments retransmitted
                         2609 fast retransmits
                         167 forward retransmits
                         395 retransmits in slow start
                         3 sack retransmits failed
According to the dag trace, 2607 segments were retransmitted once and 237 segments twice.


(4) Gap analysis:
  • The AB_estimate calculation and the send-rate calculation are correct according to the current algorithm.
    AB_est for each probe stream when there is cross traffic:

    Zoomed-in view of one spike in the graph above:
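The AB_est behavior discussed in this section comes from an excursion-based estimator. As a hedged sketch in the PathChirp style (not the actual Rapid_v0.9.2 kernel code; `min_excursion` and the run-length rule are illustrative assumptions), the receiver looks for a sustained run of increasing queuing delays across the probe stream's increasing per-packet rates, and falls back to the top rate when no valid excursion is found:

```python
# Hedged sketch of excursion-based available-bandwidth estimation
# (PathChirp-style; NOT the Rapid_v0.9.2 kernel implementation).

def ab_estimate(rates, delays, min_excursion=3):
    """rates[i]: instantaneous send rate of probe i (bps).
    delays[i]: measured one-way queuing delay of probe i (s).
    Returns the rate at which a sustained delay excursion begins,
    or the top probe rate when no valid excursion is found -- the
    failure mode where AB_est keeps climbing toward link speed."""
    start, run = None, 0
    for i in range(1, len(delays)):
        if delays[i] > delays[i - 1]:       # delay still increasing
            if start is None:
                start = i - 1               # excursion began here
            run += 1
            if run >= min_excursion:        # long enough to be valid
                return rates[start]
        else:                               # excursion broken, reset
            start, run = None, 0
    return rates[-1]                        # no valid excursion seen
```

With noisy, alternating delays (the extremely-large/extremely-small gap pattern described later in this section), no run survives long enough, so the estimate jumps to the highest probe rate.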

  • The kernel log at the sender side records the following PROBE_STREAM status changes:
    PROBE_STREAM_OFF 2.866253165
    PROBE_STREAM_OFF 3.155911446
    PROBE_STREAM_ON 3.161326308
    PROBE_STREAM_OFF 3.451278799
    PROBE_STREAM_OFF 3.744746267
    PROBE_STREAM_ON 3.747780347
    PROBE_STREAM_OFF 4.099698261
    PROBE_STREAM_ON 4.207626702
    PROBE_STREAM_OFF 4.207631801
    PROBE_STREAM_ON 4.217001599
    PROBE_STREAM_OFF 4.868148635
    PROBE_STREAM_ON 4.971362896
    PROBE_STREAM_OFF 5.141323434
    PROBE_STREAM_ON 5.231682733
    PROBE_STREAM_OFF 5.357811917
    PROBE_STREAM_OFF 5.591578455
    PROBE_STREAM_ON 5.599032695
    
    The total time spent in the PROBE_STREAM_OFF state amounts to 2.3s while the UDP cross traffic is active.
  • Send gaps for some segments from the dag trace when there is UDP cross traffic:


  • Some ideas on the problem:
    I. The plots above show that the send gaps fail to exhibit the exponentially increasing pattern when there is cross traffic on the router.
    II. Among the send gaps, some are extremely large and some are extremely small, so they fail to form a valid excursion. As a result, AB_estimation keeps increasing until it reaches link speed. From the abest analysis, only a few probe streams have a chance to see excursions.
    III. AB_est and the send rate keep increasing until they hit link speed, which overloads the link. So when there is UDP cross traffic, rapid repeatedly enters loss-recovery mode because it over-estimates link capacity. This degrades throughput even further.
    IV. What happens if we send segments at a rate higher than link capacity? Segments scheduled with gaps smaller than 12us (one serialization time on a 1Gbps link) go out back to back, i.e. with gaps of 12us. Within that probe stream this does not affect abest, and if the minimum send rate of the next probe stream is low enough, it is fine. But if the minimum send rate of the next probe stream is high enough, its first few segments are also sent back to back, because the segments at the end of the previous probe stream were scheduled late. The exponentially increasing send gaps are thus destroyed, producing the send-gap pattern we saw above. With 700Mbps UDP cross traffic, the throughput is similar but no worse.

    The plot above shows abest for one experiment with 700Mbps UDP cross traffic; there the AB_est stays close to the actual available bandwidth.
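The clamping effect in idea IV can be made concrete with a small sketch. Assuming 1500-byte probe packets and a geometric rate spread within a probe stream (illustrative parameters, not the Rapid_v0.9.2 defaults), any target gap below the link's serialization time degenerates to back-to-back transmission:

```python
# Hedged sketch of idea IV: target probe gaps below one serialization
# time (~12 us for 1500-byte packets on a 1 Gbps link) cannot be
# realized on the wire, so the exponential gap pattern is destroyed.

PKT_BITS = 1500 * 8           # probe packet size in bits (assumed MTU)
LINK_BPS = 1_000_000_000      # 1 Gbps bottleneck link

def scheduled_gaps(min_rate_bps, n, spread=2.0):
    """Target inter-packet gaps (s) for an n-packet probe stream whose
    per-packet rate grows geometrically from min_rate_bps up to
    min_rate_bps * spread (illustrative stream shape)."""
    return [PKT_BITS / (min_rate_bps * spread ** (i / (n - 1)))
            for i in range(n)]

def actual_gaps(gaps):
    """The wire cannot space packets closer than one serialization
    time, so smaller target gaps are clamped (sent back to back)."""
    serialization = PKT_BITS / LINK_BPS    # 12 us on a 1 Gbps link
    return [max(g, serialization) for g in gaps]
```

For example, a stream whose per-packet rate sweeps from 800Mbps to 1.6Gbps has its later gaps clamped at 12us while its earlier gaps are honored, flattening the tail of the intended exponential pattern exactly as described above.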