DataTAG CERN Sep 2002, R. Hughes-Jones, Manchester
European Topology: NRNs & Geant
[Map: European topology showing SuperJANET4, CERN, UvA, Manc, SURFnet, and RAL]
Gigabit Throughput on the Production WAN
Manc - RAL: 570 Mbit/s, 91% of the 622 Mbit/s access link between SuperJANET4 and RAL (1472-byte frames, ~21 µs per frame)
Manc - UvA (SARA): 750 Mbit/s, over SuperJANET4 + Geant + SURFnet
Manc - CERN: 460 Mbit/s (the CERN PC had a 32-bit PCI bus)
[Figure: UDP Man-CERN Gig, 19 May 02 - received wire rate (Mbit/s) vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
[Figure: UDP Man-RAL Gig, 21 Apr 02 - received wire rate (Mbit/s) vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
[Figure: UDP Man-UvA Gig, 19 May 02 - received wire rate (Mbit/s) vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
Gigabit TCP Throughput on the Production WAN
Throughput vs TCP buffer size
TCP window sizes in Mbytes calculated from RTT*bandwidth
[Figure: received user data rate (Mbit/s) vs TCP buffer size (bytes) for Man-UvA and Man-CERN]
Link       | RTT (ms) | TCP window for 1 Gbit/s (Mbytes) | TCP window for measured UDP BW (Mbytes)
Man - Ams  | 14.5     | 1.8                              | 1.36 (UDP BW 750 Mbit/s)
Man - CERN | 21.4     | 2.68                             | 1.23 (UDP BW 460 Mbit/s)
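The window sizes in the table are just the bandwidth-delay product, window = RTT x bandwidth. A minimal sketch of the arithmetic (link names and rates are taken from the table; the helper name is mine):

```python
# Bandwidth-delay product: the TCP window needed to keep a path full.
def tcp_window_mbytes(rtt_ms: float, bw_mbit: float) -> float:
    """Window (Mbytes) = RTT (s) * bandwidth (bytes/s)."""
    return rtt_ms / 1000.0 * (bw_mbit * 1e6 / 8) / 1e6

# RTTs in ms and measured UDP bandwidths in Mbit/s, from the table.
for link, rtt, udp_bw in [("Man-Ams", 14.5, 750), ("Man-CERN", 21.4, 460)]:
    print(f"{link}: {tcp_window_mbytes(rtt, 1000):.2f} MB for 1 Gbit/s, "
          f"{tcp_window_mbytes(rtt, udp_bw):.2f} MB for {udp_bw} Mbit/s")
```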
Gigabit TCP on the Production WAN: Man-CERN
Throughput vs n-streams
With the default buffer size, slope ~25 Mbit/s per stream up to 9 streams, then 15 Mbit/s per stream
With larger buffers, the rate of increase per stream is larger
Plateaus at about 7 streams, giving a total throughput of ~400 Mbit/s
[Figure: received throughput (Mbit/s) vs number of TCP streams for buffer sizes 65536, 524288, 1048576, 1600512, 2097152, 3145728, and 4194304 bytes]
UDP Throughput: SLAC - Man
[Figure: UDP SLAC - Man, 31 May 02 - received wire rate (Mbit/s) vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
[Figure: UDP SLAC - Man, 31 May 02 - % packet loss vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
SLAC - Manc: 470 Mbit/s, 75% of the 622 Mbit/s access link; SuperJANET4 peers with ESnet at 622 Mbit/s in New York
Gigabit TCP Throughput Man-SLAC
Throughput vs n-streams
Much less than for the European links
Buffer required: RTT * BW (622 Mbit/s) = ~14 Mbytes
With buffers larger than the default, the rate of increase is ~5.4 Mbit/s per stream
No plateau
Consistent with Iperf
Why do we need so many streams?
[Figure: TCP Man-SLAC - received throughput (Mbit/s) vs number of TCP streams for buffer sizes 1048576, 1600512, 2097152, 3145728, and 4194304 bytes]
Les Cottrell, SLAC
iGrid2002 Radio Astronomy data movement (1)
Arrival times
Slope corresponds to > 2 Gbit/s - not physical!
1.2 ms steps every 79 packets
Buffer required: ~120 kbytes
Average slope: 560 Mbit/s - agrees with (bytes received)/(time taken)
[Figure: UDP 1472 bytes, -w20, iGrid-Man, 23 Sep 02 - receive time (µs, left axis) and one-way time (µs, right axis) vs packet number (800-1000); linear fit y = 6.1149x + 3E+06]
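The rate implied by a fitted slope follows from the frame's full on-the-wire size. A sketch, assuming standard Ethernet framing overhead (UDP and IP headers, Ethernet header, CRC, preamble, and interframe gap) on top of the 1472-byte payload:

```python
# Convert the fitted arrival-time slope (us per packet) into a wire rate.
# On-the-wire size for a 1472-byte UDP payload with standard Ethernet framing:
# 1472 payload + 8 UDP + 20 IP + 14 Eth header + 4 CRC + 8 preamble + 12 IFG.
WIRE_BYTES = 1472 + 8 + 20 + 14 + 4 + 8 + 12   # = 1538 bytes

def wire_rate_mbit(slope_us_per_packet: float) -> float:
    # bits per microsecond is numerically Mbit/s
    return WIRE_BYTES * 8 / slope_us_per_packet

print(wire_rate_mbit(6.1149))   # ~2012 Mbit/s - above Gigabit line rate, not physical
```

The same conversion applied to the 99.988 µs/packet fit on the next slide gives ~123 Mbit/s, matching the quoted value.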
iGrid2002 Radio Astronomy data movement (2)
Arrival times
Slope corresponds to 123 Mbit/s - agrees!
1-way delay flat
Suggests that the interface/driver are being clever with the interrupts!
[Figure: UDP 1472 bytes, -w100, iGrid-Man, 23 Sep 02 - receive time (µs, left axis) and one-way time (µs, right axis) vs packet number (800-1000); linear fit y = 99.988x + 3E+06]
iGrid2002 UDP Throughput: Intel Pro/1000
[Figure: UDP Man-iGrid, 23 Sep 02 - received wire rate (Mbit/s) vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
Max throughput 700 Mbit/s
Loss only when at wire rate
Loss not due to user-kernel copies
Receiving CPU load ~15% at 1472 bytes
Motherboard: SuperMicro P4DP6; Chipset: Intel E7500 (Plumas); CPU: dual Xeon Prestonia (2 CPU/die), 2.2 GHz; Slot 4: PCI, 64 bit, 66 MHz; RedHat 7.2, Kernel 2.4.18
[Figure: UDP Man-iGrid, 23 Sep 02 - % packet loss vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
[Figure: UDP Man-iGrid, 23 Sep 02 - receive kernel CPU % vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
Work on End Systems: PCI: SysKonnect SK-9843
Motherboard: SuperMicro 370DLE; Chipset: ServerWorks III LE; CPU: PIII 800 MHz; PCI: 64 bit, 66 MHz; RedHat 7.1, Kernel 2.4.14
SK301: 1400 bytes sent, wait 20 µs
SK303: 1400 bytes sent, wait 10 µs - frames are back-to-back; can drive at line speed; cannot go any faster!
[Figure: traces showing Gigabit Ethernet frames back to back]
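The back-to-back behaviour follows from the arithmetic: at 1 Gbit/s a 1400-byte frame occupies the wire for longer than the requested 10 µs gap. A sketch, assuming standard Ethernet framing overhead on top of the UDP payload:

```python
# Wire time of a UDP frame on Gigabit Ethernet (1 ns per bit).
ETH_OVERHEAD = 8 + 20 + 14 + 4 + 8 + 12   # UDP + IP + Eth hdr + CRC + preamble + IFG

def wire_time_us(payload_bytes: int) -> float:
    return (payload_bytes + ETH_OVERHEAD) * 8 / 1000.0   # bits / (1000 Mbit/s)

t = wire_time_us(1400)   # ~11.7 us per 1400-byte frame
print(t, t > 10)         # the 10 us wait is shorter, so frames go back-to-back
```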
PCI: Intel Pro/1000
Motherboard: SuperMicro 370DLE; Chipset: ServerWorks III LE; CPU: PIII 800 MHz; PCI: 64 bit, 66 MHz; RedHat 7.1, Kernel 2.4.14
IT66M212: 1400 bytes sent, wait 11 µs - ~4.7 µs on the send PCI bus; PCI bus ~45% occupancy; ~3.25 µs on PCI for the data receive
IT66M212: 1400 bytes sent, wait 11 µs - packets lost; action of a pause packet?
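The ~45% occupancy figure can be reproduced from the numbers on the slide; a sketch (the 8-bytes-per-cycle ideal assumes a 64-bit bus at 66 MHz):

```python
# PCI bus occupancy on the sender: ~4.7 us of measured PCI activity per
# 1400-byte frame, one frame every ~11 us (1400 bytes sent, wait 11 us).
send_pci_us = 4.7         # measured time on the send PCI bus per frame
frame_interval_us = 11.0  # transmit spacing

occupancy = send_pci_us / frame_interval_us
print(f"{occupancy:.0%}")   # ~43%, in line with the ~45% quoted on the slide

# For comparison, the raw 64-bit 66 MHz PCI transfer time for 1400 bytes:
ideal_us = 1400 / (8 * 66e6) * 1e6   # 8 bytes per clock at 66 MHz
print(round(ideal_us, 2))            # ~2.65 us - the rest is bus overhead
```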
Packet Loss: Where?
Intel Pro/1000 on 370DLE, 1472-byte packets - expected loss in the transmitter! (counters from /proc/net/snmp)
[Diagram: sender and receiver protocol stacks (UDPmon / UDP / IP / Eth drv / HW) connected through a Gig switch, annotated with the packet counts N Gen, N Transmit, N Lost, InDiscards, and N Received]
No loss at the switch, but a pause packet was seen at the sender
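The discard counters in the diagram come from /proc/net/snmp. A sketch of reading them; the sample text below is illustrative and abbreviated, not real measurement data (on Linux you would read the file itself):

```python
# /proc/net/snmp lists each protocol as a header line of counter names
# followed by a line of values; InDiscards lives on the Ip: lines.
sample = """\
Ip: Forwarding DefaultTTL InReceives InHdrErrors InDiscards InDelivers
Ip: 2 64 1000000 0 1523 998477
"""

def snmp_counters(text: str, proto: str = "Ip") -> dict:
    rows = [line.split()[1:] for line in text.splitlines()
            if line.startswith(proto + ":")]
    return dict(zip(rows[0], (int(v) for v in rows[1])))

counters = snmp_counters(sample)
print(counters["InDiscards"], "packets discarded in the receiving IP layer")
```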
High Speed TCP
Gareth & Yee implemented the mods to TCP congestion avoidance from Sally Floyd's 2002 draft RFC. Interest in exchanging stacks:
Les Cottrell, SLAC
Bill Allcock, Argonne
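The Floyd draft (later RFC 3649) makes the AIMD parameters window-dependent: the per-RTT increase a(w) grows and the decrease factor b(w) shrinks with the current window, so large windows recover faster. A sketch of the response function using the draft's example parameters (Low_Window 38, High_Window 83000, High_Decrease 0.1); below Low_Window it reduces to standard TCP:

```python
import math

# HighSpeed TCP response function (Floyd's draft, later RFC 3649).
LOW_W, HIGH_W = 38.0, 83000.0   # window range (packets) over which HSTCP applies
HIGH_B = 0.1                    # decrease factor at HIGH_W (standard TCP uses 0.5)

def b(w: float) -> float:
    """Multiplicative-decrease factor at window w."""
    if w <= LOW_W:
        return 0.5              # standard TCP below Low_Window
    frac = (math.log(w) - math.log(LOW_W)) / (math.log(HIGH_W) - math.log(LOW_W))
    return (HIGH_B - 0.5) * frac + 0.5

def a(w: float) -> float:
    """Additive increase (packets per RTT) at window w."""
    if w <= LOW_W:
        return 1.0              # standard TCP: +1 packet per RTT
    p = (0.12 / w) ** (1 / 0.835)   # invert the response function w = 0.12 / p^0.835
    return w * w * p * 2 * b(w) / (2 - b(w))

for w in (38, 1000, 83000):
    print(w, round(a(w), 1), round(b(w), 2))
```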
UDP Throughput: Intel Pro/1000 on B2B P4DP6
[Figure: UDP, Intel Pro/1000 on P4DP6, 64-bit 66 MHz PCI slot 4 - received wire rate (Mbit/s) vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
Max throughput 950 Mbit/s
Some throughput drop for packets > 1000 bytes
Loss is NIC dependent; loss not due to user-kernel copies
Traced to discards in the receiving IP layer???
Motherboard: SuperMicro P4DP6; Chipset: Intel E7500 (Plumas); CPU: dual Xeon Prestonia (2 CPU/die), 2.2 GHz; Slot 4: PCI, 64 bit, 66 MHz; RedHat 7.2, Kernel 2.4.14
[Figure: UDP, Intel Pro/1000 on P4DP6, 64-bit 66 MHz PCI slot 4 - % packet loss vs transmit time per frame (µs) for packet sizes 50-1472 bytes]
Interrupt Coalescence: Latency
Intel Pro/1000 on 370DLE, 800 MHz CPU
[Figure: latency (µs) vs packet size (bytes) for coalescence settings coal0_0, coal5_0, coal10_0, coal20_0, coal40_0, coal64_0, coal100_0]
Interrupt Coalescence: Throughput
Intel Pro 1000 on 370DLE
[Figure: throughput with 1472-byte packets - received wire rate (Mbit/s) vs delay between transmit packets (µs) for coalescence settings coa5, coa10, coa20, coa40, coa64, coa100]
[Figure: throughput with 1000-byte packets - received wire rate (Mbit/s) vs delay between transmit packets (µs) for the same settings]
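A simple way to see the trade-off in these plots: if the coaN settings are read as a coalescence timer in µs (an assumption about the driver parameter), the NIC delays the receive interrupt by up to that long, adding latency for single packets but cutting the interrupt rate at high packet rates. An illustrative model, not the driver's actual algorithm:

```python
# Illustrative model of interrupt coalescence: the NIC waits up to
# `coal_us` after a frame arrives before raising the interrupt, so
# several frames are handled per interrupt at high rates.
def added_latency_us(coal_us: float) -> float:
    return coal_us                       # worst-case extra delay for a lone packet

def interrupts_per_sec(pkt_rate_pps: float, coal_us: float) -> float:
    pkts_per_irq = max(1.0, pkt_rate_pps * coal_us / 1e6)
    return pkt_rate_pps / pkts_per_irq

# e.g. ~81k pps (1472-byte frames at Gigabit line rate) with a 64 us timer:
print(interrupts_per_sec(81274, 64))     # 15625 interrupts/s instead of 81274
```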