The Design and Demonstration of the UltraLight Network Testbed

http://ultralight.caltech.edu

Presented by Xun Su [email protected]

GridNets 2006, Oct 2nd, 2006

[Figure: ESnet Monthly Accepted Traffic Through May 2005. Horizontal axis: month, Feb 1990 to Apr 2005; vertical axis: TByte/Month, 0 to 600.]

Long Term Trends in Network Traffic Volumes: 300-1000X/10Yrs

SLAC Traffic ~400 Mbps; Growth in Steps (ESNet Limit): ~ 10X/4 Years.

Summer ‘05: 2x10 Gbps links: one for production, one for R&D

Projected: ~2 Terabits/s by ~2014

W. Johnston

L. Cottrell

Progress in Steps

10 Gbit/s

[Figure: ESnet Accepted Traffic 1990 – 2005. Vertical axis: terabytes per month, 100 to 600.]

Exponential Growth: Avg. +82%/Year for the Last 15 Years
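The growth figures quoted above can be cross-checked with simple compound-growth arithmetic. The sketch below is only an illustration; the function names are hypothetical and not part of any UltraLight tool. Under that arithmetic, +82% per year compounds to roughly 400X over ten years, which sits inside the quoted 300-1000X/10Yrs range, and 10X per 4 years corresponds to roughly 78% per year.

```python
import math


def growth_factor(annual_rate: float, years: float) -> float:
    """Total multiplicative growth after compounding annual_rate for `years` years."""
    return (1.0 + annual_rate) ** years


def annual_rate_from_factor(factor: float, years: float) -> float:
    """Compound annual growth rate equivalent to a total `factor` over `years` years."""
    return factor ** (1.0 / years) - 1.0


def doubling_time_years(annual_rate: float) -> float:
    """Years needed for traffic to double at a constant compound annual rate."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


if __name__ == "__main__":
    # ESnet: average +82% per year, as quoted on the slide.
    print(f"ESnet growth over 10 years: {growth_factor(0.82, 10):.0f}X")
    print(f"ESnet doubling time: {doubling_time_years(0.82):.2f} years")
    # SLAC: ~10X every 4 years, as quoted on the slide.
    print(f"SLAC equivalent annual rate: {annual_rate_from_factor(10, 4):.0%}")
```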

Motivation

Provide the network advances required to enable petabyte-scale analysis of globally distributed data.

Current Grid-based infrastructures provide massive computing and storage resources, but are limited by their treatment of the network as an external, passive, and largely unmanaged resource.

The mission of UltraLight is to:

Develop and deploy prototype global services which broaden existing Grid computing systems by promoting the network as an actively managed component.

Integrate and test UltraLight in Grid-based physics production and analysis systems currently under development in ATLAS and CMS.

Engineer and operate a trans- and intercontinental optical network testbed for the broader community.

UltraLight Backbone

The UltraLight testbed is a non-standard core network with dynamic links and varying bandwidth inter-connecting our nodes.

The core of UltraLight is dynamically evolving as a function of available resources on other backbones such as NLR, HOPI, Abilene and ESnet.

The main resources for UltraLight:

US LHCnet (IP, L2VPN, CCC)

Abilene (IP, L2VPN)

ESnet (IP, L2VPN)

UltraScienceNet (L2)

Cisco Research Wave (10 Gb Ethernet over NLR)

NLR Layer 3 Service

HOPI NLR waves (Ethernet; provisioned on demand)

UltraLight nodes: Caltech, SLAC, FNAL, UF, UM, StarLight, CENIC PoP at LA, CERN, Seattle

UltraLight topology: point of presence

GOAL: Determine an effective mix of bandwidth-management techniques for this application-space, particularly:

Best-effort and “scavenger” using “effective” protocols

MPLS with QOS-enabled packet switching

Dedicated paths provisioned with TL1 commands, GMPLS

PLAN: Develop, Test the most cost-effective integrated combination of network technologies on our unique testbed:

Exercise UltraLight applications on NLR, Abilene and campus networks, as well as LHCNet, and our international partners

Deploy and systematically study ultrascale protocol stacks (such as FAST) addressing issues of performance & fairness

Use MPLS/QoS and other forms of BW management, to optimize end-to-end performance among a set of virtualized disk servers

Address “end-to-end” issues, including monitoring and end-hosts

UltraLight Network Engineering

UltraLight: Effective Protocols

The protocols used to reliably move data are a critical component of Physics “end-to-end” use of the network.

TCP is the most widely used protocol for reliable data transport, but is becoming ever more ineffective for higher and higher bandwidth-delay networks.

UltraLight is exploring extensions to TCP (HSTCP, Westwood+, HTCP, FAST, MaxNet) designed to maintain fair-sharing of networks and, at the same time, to allow efficient, effective use of these networks.
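To illustrate why standard TCP struggles on high bandwidth-delay paths, the sketch below applies the usual idealized model of Reno congestion avoidance: after a single loss the window is halved and then grows by one segment per round trip. The 100 ms round-trip time and 1460-byte segment size are assumptions chosen for illustration, not measurements from the testbed, and the function names are hypothetical. Under these assumptions a single loss on a 10 Gbps path takes on the order of an hour to recover from.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bps * rtt_s / 8.0


def window_segments(bandwidth_bps: float, rtt_s: float, mss_bytes: int = 1460) -> float:
    """Congestion window, in segments, needed to fill the path."""
    return bdp_bytes(bandwidth_bps, rtt_s) / mss_bytes


def reno_recovery_seconds(bandwidth_bps: float, rtt_s: float, mss_bytes: int = 1460) -> float:
    """Idealized time for Reno to grow from W/2 back to W at one segment per RTT."""
    full_window = window_segments(bandwidth_bps, rtt_s, mss_bytes)
    return (full_window / 2.0) * rtt_s


if __name__ == "__main__":
    bandwidth = 10e9  # 10 Gbps wave
    rtt = 0.100       # assumed 100 ms round-trip time (illustrative only)
    print(f"BDP: {bdp_bytes(bandwidth, rtt) / 1e6:.0f} MB in flight")
    print(f"Window: {window_segments(bandwidth, rtt):.0f} segments")
    print(f"Recovery after one loss: {reno_recovery_seconds(bandwidth, rtt) / 60:.0f} minutes")
```

The model ignores slow start, timeouts and competing flows, so it should be read as an order-of-magnitude argument rather than a prediction.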

[Figure: FAST compared with other protocols]

Gigabit WAN: 5x higher utilization, small delay (FAST: 95%, Reno: 19%)

Random packet loss: 10x higher throughput, resilient to random loss

FAST Protocol Comparisons

Optical Path Developments

Emerging “light path” technologies are arriving:

They can extend and augment existing grid computing infrastructures, currently focused on CPU/storage, to include the network as an integral Grid component.

Those technologies seem to be the most effective way to offer network resource provisioning on-demand between end-systems.

We are developing a multi-agent system for secure light path provisioning based on dynamic discovery of the topology in distributed networks (VINCI).

We are working to further develop this distributed agent system and to provide integrated network services capable of efficiently using and coordinating shared, hybrid networks, improving the performance and throughput for data intensive grid applications. This includes services able to dynamically configure routers and to aggregate local traffic on dynamically created optical connections.

GMPLS Optical Path Provisioning

Collaboration efforts between UltraLight and EnLIGHTened Computing.

Interconnecting Calient switches across the US for the purpose of a unified GMPLS control plane.

Control Plane: IPv4 connectivity between sites for control messages

Data Plane:

Cisco Research wave: between LA and StarLight

EnLIGHTened wave: between StarLight and MCNC Raleigh

LONI wave: between StarLight and LSU Baton Rouge over LONI DWDM.

GMPLS Optical Path Network Diagram

Realtime end-to-end Network monitoring is essential for UltraLight. We need to understand our network infrastructure and track its performance both historically and in real-time to enable the network as a managed robust component of our infrastructure.

Caltech’s MonALISA: http://monalisa.cern.ch

SLAC’s IEPM: http://www-iepm.slac.stanford.edu/bw/

We have a new effort to push monitoring to the “ends” of the network: the hosts involved in providing services or user workstations.

Monitoring for UltraLight

MonALISA UltraLight Repository

The UL repository: http://monalisa-ul.caltech.edu:8080/

The Functionality of the VINCI System

[Figure: VINCI architecture. MonALISA services with ML Agents at Site A, Site B and Site C, connected through ML proxy services. Agents operate at Layer 3 (routers), Layer 2 (Ethernet LAN-PHY or WAN-PHY) and Layer 1 (DWDM fiber).]

SC|05 Global Lambdas for Particle Physics

We previewed the global-scale data analysis of the LHC Era

Using a realistic mixture of streams: organized transfer of multi-TB event datasets, plus numerous smaller flows of physics data that absorb the remaining capacity

We used twenty-two [*] 10 Gbps waves to carry bidirectional traffic between Fermilab, Caltech, SLAC, BNL, CERN and other partner Grid sites including: Michigan, Florida, Manchester, Rio de Janeiro (UERJ) and Sao Paulo (UNESP) in Brazil, Korea (KNU), and Japan (KEK)

The analysis software suites are based on the Grid-enabled UltraLight Analysis Environment (UAE) developed at Caltech and Florida, as well as the bbcp and Xrootd applications from SLAC, and dcache/SRM from FNAL

Monitored by Caltech’s MonALISA global monitoring and control system

[*] 15 at the Caltech/CACR Booth and 7 at the FNAL/SLAC Booth

Switch and Server Interconnections at the Caltech Booth

15 10G Waves

64 10G Switch Ports: 2 Fully Populated Cisco 6509Es

43 Neterion 10 GbE NICs

70 nodes with 280 Cores

200 SATA Disks

40 Gbps (20 HBAs) to StorCloud

Thursday - Sunday

Monitoring NLR, Abilene/HOPI, LHCNet, USNet, TeraGrid, PWave, SCInet, Gloriad, JGN2, WHREN, other Int’l R&E Nets, and 14000+ Grid Nodes at 250 Sites (250k Parameters) Simultaneously

I. Legrand

HEP at SC2005: Global Lambdas for Particle Physics

RESULTS

151 Gbps peak, 100+ Gbps of throughput sustained for hours: 475 Terabytes of physics data transported in < 24 hours

131 Gbps measured by SCInet BWC team on 17 of our waves

Sustained rate of 100+ Gbps translates to > 1 Petabyte per day

Linux kernel optimized for TCP-based protocols, including Caltech’s FAST

Surpassing our previous SC2004 BWC Record of 101 Gbps

Global Lambdas for Particle Physics: Caltech/CACR and FNAL/SLAC Booths

Above 100 Gbps for Hours

475 TBytes Transported in < 24 Hours

Sustained Peak Projects to > 1 Petabyte Per Day
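The throughput figures above can be checked with straightforward unit conversion. The sketch below is only an illustration, assuming decimal units (1 Gbps = 10^9 bits/s, 1 TB = 10^12 bytes, 1 PB = 10^15 bytes); the function names are hypothetical. A sustained 100 Gbps works out to about 1.08 PB per day, and 475 TB moved in 24 hours corresponds to an average of about 44 Gbps, which is a lower bound on the average rate since the transfer took less than 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(rate_gbps: float) -> float:
    """Bytes moved in one day at a constant rate given in Gbps (decimal units)."""
    return rate_gbps * 1e9 / 8.0 * SECONDS_PER_DAY


def average_rate_gbps(total_bytes: float, seconds: float) -> float:
    """Average rate in Gbps needed to move total_bytes in the given time."""
    return total_bytes * 8.0 / seconds / 1e9


if __name__ == "__main__":
    # Sustained 100 Gbps, as reported for the SC05 demonstration.
    print(f"100 Gbps for a day: {bytes_per_day(100) / 1e15:.2f} PB")
    # 475 TB in under 24 hours; using the full 24 hours gives a lower bound.
    rate = average_rate_gbps(475e12, SECONDS_PER_DAY)
    print(f"475 TB in 24 h: at least {rate:.1f} Gbps on average")
```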

It was the first time: a struggle for the equipment and the team

We will stabilize, package and more widely deploy these methods and tools in 2006

SC05 BWC Lessons Learned

Take-aways from this Marathon exercise:

An optimized Linux kernel (2.6.12 + FAST-TCP + NFSv4) for data transport; after 7 full kernel-build cycles in 4 days

Scaling up SRM/gridftp to near 10 Gbps per wave, using Fermilab’s production clusters

A newly optimized application-level copy program, bbcp, that matches the performance of iperf under some conditions

Extensions of SLAC’s Xrootd, an optimized low-latency file access application for clusters, across the wide area

Understanding of the limits of 10 Gbps-capable computer systems, network switches and interfaces under stress

Thank You
