experiences in design and implementation of a high performance transport protocol yunhong gu, xinwei...
Post on 28-Dec-2015
213 Views
Preview:
TRANSCRIPT
Experiences in Design and Implementation of a High Performance Transport Protocol
Yunhong Gu, Xinwei Hong, and Robert L. Grossman
National Center for Data Mining
Outline
• TCP’s inefficiency in grid applications
• UDT
• Design issues
• Implementations issues
• Conclusion and future work
TCP and AIMD
• TCP has been very successful in the Internet– AIMD (Additive Increase Multiplicative
Decrease)
• Fair: max-min fairness• Stable: globally asynchronously stable• But, inefficient and not scalable
– In grid networks (with high bandwidth-delay product)
• RTT bias
Efficiency of TCP
1 Gb/s link, 200ms RTT, between Tokyo and Chicago
28 minutesOn 10 Gb/s link, 200ms RTT, it will take 4 hours 43 minutes to recover from a single loss.TCP’s throughput model:
It needs extremely low loss rate on high bandwidth-delay product networks.
pRTT
S
2
3
Fairness of TCP
100ms1 Gb/s
1ms1Gb/s
Merge two real-time data streams
From Chicago 1 to Chicago 2: 800Mbps
From Amsterdam to Chicago 2: 80Mbps
The throughput is limited by the slowest stream!
Amsterdam
Chicago 2
Chicago 1
UDT – UDP-based Data Transfer Protocol
• Application level transport protocol built above UDP
• Reliable data delivery
• End-to-end approach
• Bi-directional
• General transport API; not a (file transfer) tool.
• Open source
UDT Architecture
DATA
ACK
ACK2
NAK
Sender
Recver
Sender
Recver
Pkt. Scheduling Timer
ACK Timer
NAK Timer
Retransmission Timer
Rate Control Timer
Sender
UDT – Objectives
• Goals– Easy to install and use– Efficient for bulk data transfer– Fair– Friendly to TCP
• Non-goals– TCP replacement– Messaging service
Design Issues
• Reliability/Acknowledging
• Congestion/Flow Control
• Performance evaluation– Efficiency– Fairness and friendliness– Stability
Reliability/Acknowledging
• Acknowledging is expensive– Packet processing at end hosts and routers– Buffer processing
• Timer-based selective acknowledgement– Send acknowledgement per constant time (if
there are packets to be acknowledged)
• Explicit negative acknowledgement
Congestion Control
• AIMD with decreasing increases
• Increase formula
• Decrease– 1/9
• Control interval is constant – SYN = 0.01 second
SYNSx xCL 11500
10)( ))(log(
UDT Algorithm
C (Mbps) L - C (Mbps) Increment (pkts/SYN)
[0, 9000) (1000, 10000] 10
[9000, 9900) (100, 1000] 1
[9900, 9990) (10, 100] 0.1
[9990, 9999) (1, 10] 0.01
[9999, 9999.9) (0.1, 1] 0.001
9999.9+ <0.1 0.00067
L = 10 Gbps, S = 1500 bytes
UDT: Efficiency and Fairness Characteristics
• Takes 7.5 seconds to reach 90% of the link capacity, independent of BDP
• Satisfies max-min fairness if all the flows have the same end-to-end link capacity– Otherwise, any flow will obtain at least half of
its fair share
• Does not take more bandwidth than concurrent TCP flow as long as
6/10822 SYNLRTT
Efficiency
0 10 20 30 40 50 60 70 80 90 1000
200
400
600
800
1000
Time (s)
Th
rou
gh
pu
t (M
bp
s)
to Chicago, 1Gbps, 0.04msto Canarie, OC-12, 16msto Amsterdam, 1Gbps, 110ms
• UDT bandwidth utilization– 960Mb/s on 1Gb/s– 580Mb/s on OC-12 (622Mb/s)
0 10 20 30 40 50 60 70 80 90 100
0
200
400
600
0 10 20 30 40 50 60 70 80 90 100320
322
324
326
328
330
Time (s)
Th
rou
gh
pu
t (M
bp
s)
Fairness
• Fair bandwidth sharing between networks with different RTTs and bottleneck capacities – 330 Mb/s each for the 3 flows from Chicago to Chicago
Local via 1Gb/s, Amsterdam via 1Gb/s and Ottawa via 622Mb/s
Fairness
10-2
10-1
100
101
102
103
0.8
0.85
0.9
0.95
1
RTT (ms)
Fa
irn
ess
Ind
ex
UDTTCP
• Fairness index– Simulation: Jain’s Fairness Index for 10 UDT
and TCP flows over 100Mb/s link with different RTTs
RTT Fairness
100
101
102
103
0.9
0.92
0.94
0.96
0.98
1
RTT (ms)
RT
T F
airn
ess
• Fairness index of TCP flows with different RTTs– 2 flows, one has 1ms RTT, the other varies
from 1ms to 1000ms
Fairness and Friendliness
50 TCP flows and 4 UDT flows between SARA and StarLight
Realtime snapshot of the throughput
The 4 UDT flows have similar performance and leave enough space for TCP flows
TCP Friendliness
0 1 2 3 4 5 6 7 8 9 1020
30
40
50
60
70
80
Number of UDT flows
TC
P T
hro
ug
hp
ut (
Mb
ps)
• Impact on short life TCP flows– 500 1MB TCP flows with 1-10 bulk UDT
flows, over 1Gb/s link between Chicago and Amsterdam
Stability
10-2
10-1
100
101
102
103
0
0.2
0.4
0.6
0.8
RTT (ms)
Sta
bili
ty In
de
x
UDTTCP
• Stability index of UDT and TCP– Stability: average standard deviation of throughout per
unit time– 10 UDT flows and 10 TCP flows with different RTTs
Implementations Issues
• Efficiency and CPU utilization
• Loss information processing
• Memory management
• API
• Conformance
Efficiency and CPU utilization
• Efficiency = Mbps/MHz
• Maximize throughput– Use CPU time as little as possible, so that CPU
won’t be used up before network bottleneck is reached
– Remove CPU burst, which can cause packet loss: even distribution of processing
• Minimize CPU utilization
Loss Processing
• On high BDP networks, the number of lost packets can be very large during a loss event
• Access to the loss information may take long time• Acknowledge may take several packets
0 10 20 30 40 50 60 70 80 90 1000
1000
2000
3000
Loss Events
Nu
mb
er
of L
oss
Pa
cke
ts
Loss Processing
• UDT loss processing– Most loss are continuous– Record loss event other than lost packets– Access time is almost constant
0 10 20 30 40 50 60 70 80 90 1000
2
4
6
8
Loss Events
Acc
ess
Tim
e (
us)
Memory Processing
• Memory copy avoidance• Overlapped IO• Data scattering/gathering• Speculation of next packet
Protocol Buffer Protocol Buffer
User Buffer
Data
New Data
API
• Socket-like API
• Support overlapped IO
• File transfer API– sendfile/recvfile
• Thread safe
• Performance monitoring
API - Example
UDTSOCKET client = UDT::socket(AF_INET, SOCK_STREAM, 0);
UDT::connect(client, (sockaddr*)&serv_addr, sizeof(serv_addr));
If (UDTERROR == UDT::send(client, data, size, 0)){ //error processing}
int client = socket(AF_INET, SOCK_STREAM, 0);
connect(client, (sockaddr*)&serv_addr, sizeof(serv_addr));
If (-1 == send(client, data, size, 0)){ //error processing}
Implementation Efficiency
0 5 10 15 20 25 30 35 40 45 500
10
20
30
40
50
60
Sample Event
CP
U U
sag
e (
%)
udt sendingudt receivingtcp sendingtcp receiving
• CPU usage of UDT and TCP– UDT takes about 10% more CPU than TCP– More code optimizations are still on going
Conclusion
• TCP is not suitable for distributed data intensive applications over grid networks
• We introduced a new application level protocol named UDT, to overcome the shortcomings of TCP
• We explained the design rationale and implementations details in this paper
Future Work
• Bandwidth Estimation
• CPU utilization– Self-clocking– Code optimization
• Theoretical work
References
• More details can be found in our paper.
• UDT specification– Draft-gg-udt-01.txt
• Congestion control– Paper on Gridnets '04 workshop
• UDT open source project– http://udt.sf.net
top related