Single Rapid Flow with ON/OFF UDP Cross Traffic

1. Description:

Machines: rapid flow from smithfd-linux1 to smithfd-linux2; UDP flow from smithfd-rapid3 to smithfd-rapid6
Rapid Version: rapid_v0.9
Network Paths: the switched path in netlab
TCP Configuration: All default
List of Experiments:
  • single rapid flow for 60s with 50Mbps UDP cross traffic during 10s to 20s, 30s to 40s and 50s to 60s
  • single rapid flow for 60s with 100Mbps UDP cross traffic during 10s to 20s, 30s to 40s and 50s to 60s
  • single rapid flow for 60s with 300Mbps UDP cross traffic during 10s to 20s, 30s to 40s and 50s to 60s
  • single rapid flow for 60s with 500Mbps UDP cross traffic during 10s to 20s, 30s to 40s and 50s to 60s
  • single rapid flow for 60s with 800Mbps UDP cross traffic during 10s to 20s, 30s to 40s and 50s to 60s
All of the experiments above were repeated with Cubic in place of rapid for comparison.
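The ON/OFF schedule shared by all the runs above can be expressed as a small helper (a sketch for reference only; the actual traffic generator used in the experiments is not shown in this report):

```python
# UDP cross traffic is ON during 10-20s, 30-40s and 50-60s of each 60s run.
ON_PERIODS = [(10, 20), (30, 40), (50, 60)]

def udp_on(t):
    """True if the UDP cross traffic is ON at time t (seconds into the run)."""
    return any(start <= t < end for start, end in ON_PERIODS)

print([t for t in range(0, 60, 5) if udp_on(t)])  # [10, 15, 30, 35, 50, 55]
```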

2. Experimental Results:

  • Baseline: throughput of UDP cross traffic alone
    UDP Throughput during ON periods:
    requested throughput   actual throughput
    50M                    51M
    100M                   101M
    300M                   301M
    500M                   510M
    800M                   655M
    
  • Throughput of the single cubic flow with cross traffic
  • Throughput of the single rapid flow with cross traffic

    Clearly, during the ON periods, the rapid flow achieved much lower throughput than the cubic flow.

3. Study of the "Packets Rejects Because of Timestamps" Problem

    Netstat statistics were retrieved at the receiver side. They showed that, for the rapid flows, there were thousands of "packets rejects because of timestamp" events during the experiment, and these events occurred only when the UDP cross traffic was ON. Here are the statistics gathered for some of the rapid experiments:
    udp=50Mbps                 udp=500Mbps                udp=800Mbps
    23716 retransmits          29407 retransmits          7892 retransmits
    415 dropped by router      2017 dropped by router     1282 dropped by router
    22240 rejects due to ts    19694 rejects due to ts    5389 rejects due to ts
    
    Here are two conjectures for the low throughputs and the huge number of segment rejects at the receiver side:
  • Conjecture 1: the manipulation of timestamp echoes at the sender side violates RFC rules
  • Conjecture 2: the timestamp values of retransmitted packets at the sender side violate RFC rules because of the Qdisc module we implemented

    For Conjecture 1, we ran another experiment with the same settings and logged the timestamp values and echoes at both end hosts. The logs failed to expose any mismatch between timestamp values and their echoes.

    For Conjecture 2, a dag trace was taken between the two switches, which offered clues as to why there are so many packet rejects. From the dag trace, we found that many segments were transmitted twice, and the first attempt was rejected by the receiver. One common aspect of the rejected segments is that they carry a timestamp smaller than the highest timestamp already used in that connection. For example:

    1342187058.274410 IP 192.168.135.5.43287 > 192.168.134.9.5678: Flags [.], seq 482147033:482148481, ack 2987578558, win 229, options [nop,nop,TS val 68581250 ecr 48164441], length 1448
    1342187058.274422 IP 192.168.135.5.43287 > 192.168.134.9.5678: Flags [.], seq 482148481:482149929, ack 2987578558, win 229, options [nop,nop,TS val 68581250 ecr 48164441], length 1448
    1342187058.274446 IP 192.168.135.5.43287 > 192.168.134.9.5678: Flags [.], seq 482361337:482362785, ack 2987578558, win 229, options [nop,nop,TS val 68581238 ecr 48164429], length 1448
    1342187058.274483 IP 192.168.135.5.43287 > 192.168.134.9.5678: Flags [.], seq 482364233:482365681, ack 2987578558, win 229, options [nop,nop,TS val 68581238 ecr 48164429], length 1448
    
    The segments with seqno = 482362785 and 482365681 use a smaller timestamp than the preceding segments, and those preceding segments were retransmissions triggered by earlier packet losses. The PAWS mechanism in TCP requires non-decreasing timestamps on data segments, which explains why the segments with smaller timestamp values are rejected by the receiver.
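The receiver-side behavior can be illustrated with a minimal PAWS-style check: a segment whose timestamp is older than the most recently accepted one is dropped. This is a simplified sketch (timestamp wrap-around handling from the RFC is omitted), applied to the four segments from the dag trace above:

```python
def paws_filter(segments):
    """segments: list of (seqno, ts_val) in arrival order.
    Returns (accepted, rejected) lists of seqnos."""
    ts_recent = 0
    accepted, rejected = [], []
    for seqno, ts_val in segments:
        if ts_val < ts_recent:       # stale timestamp -> PAWS reject
            rejected.append(seqno)
        else:                        # accept and remember the newest timestamp
            ts_recent = ts_val
            accepted.append(seqno)
    return accepted, rejected

# The four segments from the trace (starting seqno, TS val):
trace = [
    (482147033, 68581250),
    (482148481, 68581250),
    (482361337, 68581238),   # older timestamp than the segments already seen
    (482364233, 68581238),
]
acc, rej = paws_filter(trace)
print(rej)   # the two segments carrying TS val 68581238 are rejected
```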

4. Conclusion:

    To sum up, the whole story is: the Qdisc implementation keeps a separate priority queue for retransmitted segments. Whenever a segment is to be dequeued, the retransmission queue is checked first and drained completely before any ordinary data segments are sent. As a result, retransmitted segments can carry newer timestamp values than data segments that were queued earlier, and those data segments are then rejected at the receiver side.
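The dequeue policy described above can be sketched as follows (the names are illustrative, not taken from the actual Qdisc code): a later-stamped retransmission overtakes earlier-stamped data segments on the wire, producing exactly the decreasing-timestamp pattern seen in the dag trace.

```python
from collections import deque

def dequeue_all(retx_q, data_q):
    """Drain the retransmission queue first, then the data queue,
    mirroring the priority rule in the Qdisc sketch above."""
    wire_order = []
    while retx_q:
        wire_order.append(retx_q.popleft())
    while data_q:
        wire_order.append(data_q.popleft())
    return wire_order

# Two data segments are queued first (older timestamp 100); a retransmission
# generated later (newer timestamp 112) then jumps ahead of them:
data_q = deque([("data-1", 100), ("data-2", 100)])   # (label, TS val)
retx_q = deque([("retx-1", 112)])
order = dequeue_all(retx_q, data_q)
print(order)
# retx-1 (TS 112) reaches the receiver before data-1/data-2 (TS 100), so the
# receiver's PAWS check sees decreasing timestamps and rejects the data segments.
```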