Networking Research Overview
Micah Beck
Assoc. Prof., Computer Science
Director, LoCI Laboratory
University of Tennessee
SciDAC PI Mtg 24 March 2004
SciDAC Networking Research Projects: Goals
• Goal: Phase I
  – Develop data movement tools and infrastructures to support real-time data-intensive SciDAC applications
  – Develop advanced network tools that enable SciDAC applications to efficiently measure, predict, and diagnose end-to-end performance (2 projects)
  – Develop and deploy cyber security tools to support group collaborations in grid infrastructures
• Goal: Phase II
  – Deploy the advanced tools developed in Phase I in production infrastructures to support network-intensive SciDAC projects
Logistical Networking: Tools, Applications & Architecture
Micah Beck, Jack Dongarra, James S. Plank
University of Tennessee
Rich Wolski
University of California, Santa Barbara
http://loci.cs.utk.edu/scidac
Project Thrusts
• Dongarra: Application Development Tools/Environments
  – NetSolve/GridSolve
• Wolski: Network Monitoring/Prediction
  – Network Weather Service
• Beck & Plank: Logistical Networking Infrastructure, Middleware & Support
  – Internet Backplane Protocol
  – Logistical Runtime System
Internet Backplane Protocol
• Overlay intermediate node providing services based on enriched resources
  – Storage: file system, RAM, disk
  – Transfer: TCP (std, compressed), UDP (SABUL, mcast), SAN/WAN
  – Processing: primitive operations (alpha)
• 100s of IBP depots deployed worldwide
• 1.4 alpha release: persistent sockets; optional authentication, usage logging
Logistical Networking Tools
• Logistical Runtime System (LoRS)
  – E2E Services: Fault tolerance (Reed-Solomon), encryption (AES), compression, high perf. data movement strategies
  – Library, command line, GUI, Web tools
  – Ported to all compute platforms (Cray OS problems)
• Logistical Backbone (L-Bone)
  – depot monitoring, resource discovery
• Logistical Distribution Network (LoDN)
  – directory services, content distribution
  – Java Web Start delivery of tools
SciDAC Application Impact
• Terascale Supernova Initiative (A. Mezzacappa, ORNL; J. Blondin, NCSU; D. Swesty, SUNY Stony Brook)
  – Five 1.6TB depots deployed at TSI sites
• Energy Fusion Research (S. Klasky, PPPL)
  – Depots deployed on PPPL cluster nodes
• Dataset transfers: O(1TB) @ 1-400 Mb/s
  – Simulations at NERSC and ORNL
  – Control/viz at ORNL, NCSU, Stony Brook, PPPL
  – Transfers span ESnet, Abilene
• CS/Physics collaboration, science getting done!
TSI Site Deployment: ORNL, NCSU, SUNY Stony Brook, NERSC, UCSD
SciDAC Technology Impact
• Spanning heterogeneous networks
  – Ultrascale (10 Gbps) wide area transfers require specialized systems
  – Optically switched networks (e.g., DOE Science UltraNet) do not peer with IP
• Serving scalable communities
  – Staging and caching at intermediate nodes
  – Processing data “in transit”
• Common services on distributed data: transfer, processing, storage
Transit Networking Architecture
[Figure: layered architecture diagram. Labels: Physical, link, Local, Network, IP, Transit, Transport, Application, common interface]
INCITE: Edge-based Traffic Processing for High-Performance Networks
R. Baraniuk, E. Knightly, R. Nowak, R. Riedi
Rice University
L. Cottrell, J. Navratil, W. Mathews
SLAC
W. Feng, M. Gardner
LANL
Web site: incite.rice.edu
INCITE Project
• InterNet Control and Inference from The Edge
• On-line tools to characterize and map host and network performance as a function of time, space, application, protocol, and service
INCITE Thrusts and Tools
Thrust 1: Multiscale traffic analysis and modeling techniques
  » wavelet, multifractal, connection-level models
Thrust 2: Inference and control algorithms for network paths, links, and routers
  » end-to-end path probing and modeling
  » network tomography and topology discovery
  » advanced high-speed protocols
Thrust 3: Data collection tools
  » active measurement infrastructure
  » passive application-layer measurement
pathChirp
• Goal
  – estimate instantaneous available bandwidth (ABW) on an end-to-end network path
• Basic probing paradigm
  – stream packets at some rate
    • rate < ABW: no queuing delay
    • rate > ABW: queuing delay builds up
• Until now: tradeoff
  – high accuracy has required high volume probing (inefficient)
• Unique to pathChirp
  – variable rate probe packet train (exponentially spaced chirp)
  – 10x more efficient than competing techniques
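The chirp idea can be illustrated with a short sketch. It assumes a spread factor `gamma` by which successive inter-packet gaps shrink, a fixed probe size, and a toy rule that reports the first probing rate after which queuing delay keeps rising. The function names and the rule are illustrative assumptions, not pathChirp's actual estimator.

```python
def chirp_gaps(first_gap_s, gamma, n_packets):
    """Inter-packet gaps for one chirp: each gap is the previous one divided by gamma."""
    return [first_gap_s / gamma ** k for k in range(n_packets - 1)]


def chirp_rates(gaps_s, packet_bits):
    """Instantaneous probing rate (bits/s) associated with each gap."""
    return [packet_bits / g for g in gaps_s]


def estimate_abw(rates_bps, queuing_delays_s):
    """Toy estimate (assumption): the rate at the first packet from which
    queuing delay increases strictly until the end of the chirp.
    queuing_delays_s holds one delay per packet, i.e. len(rates_bps) + 1 values.
    """
    for k in range(len(rates_bps)):
        tail = queuing_delays_s[k:]
        if all(later > earlier for earlier, later in zip(tail, tail[1:])):
            return rates_bps[k]
    # Delay never built up: ABW is at least the highest rate probed.
    return rates_bps[-1]


if __name__ == "__main__":
    gaps = chirp_gaps(first_gap_s=0.008, gamma=2.0, n_packets=5)
    rates = chirp_rates(gaps, packet_bits=8000)
    delays = [0.0, 0.0, 0.0, 0.001, 0.003]  # hypothetical measurements
    print(estimate_abw(rates, delays))
```

A single chirp sweeps a wide range of rates with few packets, which is where the efficiency gain over constant-rate probing comes from.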
Network Tomography
From end-to-end measurements…
… infer internal topology and delay/loss characteristics
TCP - Low Priority
• TCP alone: 745.5 Kb/s
• TCP plus TCP-LP: TCP 739.5 Kb/s, TCP-LP 109.5 Kb/s
• TCP-LP is invisible to TCP
• Goal
  – utilize excess bandwidth in a non-intrusive fashion
• Methodology
  – sender-side modification of TCP: delay-based approach
• Applications
  – bulk data transfers
  – available bandwidth monitoring
  – P2P file sharing
• High-speed TCP-LP
  – TCP-LP + HSTCP
  – implementation
    • Linux-2.4.22-web100
  – experiments
    • Stanford - Ann Arbor
    • Stanford - Gainesville
[Figure: topology with routers R1 and R2, TCP and TCP-LP flows, cross-traffic, and link capacity C = 1.5 Mb/s]
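The delay-based approach can be sketched as a sender-side threshold test on one-way delay. This is a minimal sketch, assuming an early-congestion test of the form smoothed delay > d_min + delta * (d_max - d_min). The class name, the parameter values, and the simplified window reaction (halve on indication, otherwise additive increase) are illustrative assumptions, not the Linux implementation.

```python
class LowPriorityWindow:
    """Toy congestion window that backs off when one-way delay rises."""

    def __init__(self, delta=0.15, gain=0.125):
        self.delta = delta          # assumed threshold fraction
        self.gain = gain            # assumed smoothing weight
        self.d_min = float("inf")   # smallest one-way delay seen
        self.d_max = 0.0            # largest one-way delay seen
        self.smoothed = None        # exponentially smoothed delay
        self.cwnd = 1.0             # congestion window, in packets

    def on_delay_sample(self, one_way_delay_s):
        """Update state with one delay sample; return True on early congestion."""
        self.d_min = min(self.d_min, one_way_delay_s)
        self.d_max = max(self.d_max, one_way_delay_s)
        if self.smoothed is None:
            self.smoothed = one_way_delay_s
        else:
            self.smoothed += self.gain * (one_way_delay_s - self.smoothed)

        threshold = self.d_min + self.delta * (self.d_max - self.d_min)
        if self.d_max > self.d_min and self.smoothed > threshold:
            # Delay is building up: yield to other traffic before losses occur.
            self.cwnd = max(1.0, self.cwnd / 2)
            return True
        # No sign of queuing: probe for excess bandwidth.
        self.cwnd += 1.0 / self.cwnd
        return False
```

Because the flow reacts to rising delay instead of waiting for loss, it gives way before a loss-based TCP flow sharing the bottleneck notices any congestion, which is consistent with the "invisible to TCP" throughput figures above.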
Changes in network topology (BGP) can result in dramatic changes in performance
Snapshot of traceroute summary table
Samples of traceroute trees generated from the table
ABwE measurement one/minute for 24 hours Thursday 9 October 9:00am to Friday 10 October 9:01am
Drop in performance (from original path SLAC-CENIC-Caltech to SLAC-ESnet-LosNettos (100 Mbps)-Caltech)
Back to original path
Changes detected by IEPM-Iperf and ABwE
ESnet-LosNettos segment in the path (100 Mbits/s)
[Figure: axis labels "Hour", "Remote host", and "Mbits/s". Legend: Dynamic BW capacity (DBC); Cross-traffic (XT); Available BW = (DBC-XT)]
Note:
1. Caltech misrouted via Los-Nettos 100 Mbps commercial net 14:00-17:00
2. ESnet/GEANT working on routes from 2:00 to 14:00
Los-Nettos (100Mbps)
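The relation Available BW = DBC - XT can be applied per sample, and a route change like the one described shows up as a sustained drop. The sketch below is a hedged illustration only: the function names, the five-sample window, the 50% drop ratio, and the sample values are hypothetical, not IEPM or ABwE code or data.

```python
from statistics import median


def available_bw(dbc_mbps, xt_mbps):
    """Available bandwidth = dynamic bandwidth capacity minus cross-traffic."""
    return max(0.0, dbc_mbps - xt_mbps)


def find_drops(samples_mbps, window=5, ratio=0.5):
    """Indices where a sample falls below ratio * median of the previous window."""
    drops = []
    for i in range(window, len(samples_mbps)):
        baseline = median(samples_mbps[i - window:i])
        if samples_mbps[i] < ratio * baseline:
            drops.append(i)
    return drops


if __name__ == "__main__":
    # Hypothetical once-per-minute samples: path moves to a 100 Mbps segment.
    dbc = [900.0] * 6 + [100.0] * 3
    xt = [100.0] * 6 + [20.0] * 3
    abw = [available_bw(d, x) for d, x in zip(dbc, xt)]
    print(find_drops(abw))
```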
Crossing the Application/Network Divide
[Figure: protocol stack (Application, TCP, IP, Data Link, Network). Labels: Send data over network; Segmentation; Fragmentation; Flow & Congestion Control; Checksums]
• Implications to the application?
• Insights for high-performance network protocols?
Network monitors focus here.
TICKET and MAGNET+MUSE
TICKET: Traffic Information-Collecting Kernel with Exact Timing
MAGNeT: Monitor for Application-Generated Network Traffic
MUSE: MAGNET User-Space Environment
[Figure: protocol stack (Application, TCP, IP, Data Link, Network) with labels Send data over network, Segmentation, Fragmentation, Flow & Congestion Control, Checksums, annotated with MUSE, MAGNET, and "TICKET: tcpdump++"]
For more information, go to www.lanl.gov/radiant/pubs.html
MAGNeT MAGNET Monitoring Apparatus for General kerNel-Event Tracing
(at nanoscale granularity)
• Why not extend monitoring to kernel events in general? Software Oscilloscope for Clusters and Grids
  – Debugging
    • e.g., identified Linux OS bug in the scheduler for SMPs.
    • Can be used to deploy, debug, and monitor the DOE UltraNet (UltraScienceNet), e.g., dynamic provisioning.
  – Performance Optimization
    • Improved performance of 10GigE adapters by 300%. Can improve end-to-end performance of DOE UltraNet.
  – Monitoring Grid Applications
    • Integrated MAGNET with SciDAC’s PERC TAU and SciDAC’s PERC SvPablo/Autopilot.*
  – Adaptive Resource-Aware Applications
• SciDAC Deployment: PERC, Supernova Science Ctr, Transit Network Fabric + Terascale Supernova Initiative + Fusion Energy (emerging), and Earth Systems Grid II (emerging).
* For more information, see M. Gardner, W. Deng, T. Markham, C. Mendes, W. Feng, and D. Reed, “A High-Fidelity Software Oscilloscope for Globus,” GlobusWorld 2004, Jan. 2004.
Bandwidth estimation: measurement methodologies and applications
k claffy (CAIDA), Constantinos Dovrolis (Georgia Tech)
Project goals
• Develop estimation techniques and public-domain tools for the estimation of end-to-end:
  1. Network capacity (bottleneck bandwidth)
  2. Available bandwidth (residual capacity)
• Focus 1: non-intrusive, fast, and accurate techniques
• Focus 2: high-bandwidth paths (up to 1Gbps)
• Compare and validate different tools in reproducible and realistic net conditions
• Apply bandwidth estimation in transport and overlay routing problems
• Disseminate research results at conferences and in journals
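The two quantities being estimated can be pinned down with a small worked example. It uses the common per-link formulation (capacity is the minimum link capacity; available bandwidth is the minimum unused capacity over the links), which is assumed here for illustration. The function names and link values are hypothetical.

```python
def path_capacity(link_capacities_mbps):
    """End-to-end capacity: the smallest link capacity on the path."""
    return min(link_capacities_mbps)


def path_available_bw(link_capacities_mbps, utilizations):
    """End-to-end available bandwidth: the smallest residual capacity on the path."""
    return min(c * (1.0 - u) for c, u in zip(link_capacities_mbps, utilizations))


if __name__ == "__main__":
    capacities = [1000.0, 100.0, 622.0]   # hypothetical three-hop path, Mb/s
    utilizations = [0.95, 0.20, 0.50]     # hypothetical average load per link
    print(path_capacity(capacities))                    # set by the 100 Mb/s link
    print(path_available_bw(capacities, utilizations))  # set by the busy 1000 Mb/s link
```

In this example the link that limits capacity and the link that limits available bandwidth are different hops, which is why the two metrics need separate estimation techniques.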
Main accomplishments
• Pathrate: capacity estimation tool
  – Based on packet pairs and trains
  – Publication: Transactions on Networking, to appear in 2004, and Infocom 2001
• Pathload: available bandwidth estimation tool
  – Based on self-loading periodic streams
  – Publications at ACM SIGCOMM 2002 and PAM 2002
• Both tools are available at: www.pathrate.org
  – About 200 downloads per month (and increasing)
• Able to measure up to 1Gbps paths, even in the presence of interrupt coalescence
  – See publication at PAM 2004
• 1st Bandwidth Estimation workshop at CAIDA, Dec’03
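Packet-pair capacity estimation rests on a simple relation: two back-to-back packets of size L leave the narrowest link separated by a dispersion of L / C, so C can be recovered as L / dispersion. The sketch below takes the median of per-pair estimates as a simplification to damp cross-traffic noise. It is not Pathrate's statistical analysis, and the function name and sample values are hypothetical.

```python
from statistics import median


def packet_pair_capacity_bps(packet_bytes, dispersions_s):
    """Estimate capacity (bits/s) from receiver-side packet-pair dispersions."""
    estimates = [packet_bytes * 8 / d for d in dispersions_s if d > 0]
    if not estimates:
        raise ValueError("need at least one positive dispersion")
    # Median is a simplification; cross traffic can stretch or compress a pair.
    return median(estimates)


if __name__ == "__main__":
    # Hypothetical dispersions for 1500-byte pairs, in seconds.
    dispersions = [0.00012, 0.00012, 0.00015, 0.00012, 0.00010]
    print(packet_pair_capacity_bps(1500, dispersions))
```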
Main accomplishments (cont’d)
• Created testbed at CAIDA with several high-bw routers and switches and realistic cross traffic
  – Tested all existing open-source bandwidth estimation tools
  – Showed that, although several such tools exist, very few are accurate and consistent
• Developed estimation technique for passive capacity estimation
  – See publications at IMC 2003 and PAM 2004
• Showed that per-hop capacity estimation tools (pathchar-like) are not accurate in the presence of layer-2 switches
  – See publication at Infocom 2003
• Created ANEMOS, a distributed system for automated on-line monitoring of many network paths
  – See publication at PAM 2003
Ongoing work
• Created SOBAS, an automatic socket buffer sizing technique based on available bandwidth estimation
  – Basic idea: limit TCP window based on available bandwidth before the connection causes losses
  – Does not require changes in TCP
• Develop estimation technique for the variation range of available bandwidth in different time scales
  – Variation range is crucial for some applications, including overlay routing
• Evaluate the predictability of available bandwidth process in Internet traffic
  – How far in the future can we predict the avail-bw with a given accuracy?
• Use of bandwidth estimation in overlay network routing and in UltraScienceNet dynamic optical circuit bandwidth provisioning
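The socket buffer sizing idea comes down to a bandwidth-delay product: a window of available bandwidth times RTT keeps the path busy without overfilling the bottleneck queue. The sketch below assumes that an available bandwidth estimate and an RTT are already known. The function names are hypothetical, and this shows only the sizing arithmetic applied through the standard socket buffer options, not the SOBAS algorithm itself.

```python
import socket


def buffer_bytes(avail_bw_bps, rtt_s):
    """Socket buffer size (bytes) matching available bandwidth x RTT."""
    return max(1, int(avail_bw_bps * rtt_s / 8))


def apply_buffer_size(sock, size_bytes):
    """Request the buffer size on a socket; return what the OS actually granted.

    Set this before connecting. The kernel may clamp or adjust the request
    (Linux, for example, doubles it to leave room for bookkeeping).
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    size = buffer_bytes(avail_bw_bps=100e6, rtt_s=0.080)  # hypothetical path
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print(size, apply_buffer_size(s, size))
```

Because the limit is applied through the socket buffer, TCP itself is unchanged, which matches the "does not require changes in TCP" point above.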
Security and Policy for Group Collaboration
http://www.mcs.anl.gov/dsl/scidac/security/
• PIs:
  – Steven Tuecke (ANL)
  – Carl Kesselman (USC/ISI)
  – Miron Livny (U. Wisconsin)
• Technologies involved:
  – Globus Toolkit
  – Condor
Problem
• Scalable, fine-grain policy management for large, dynamic collaborations:
  – Large number of individually managed resources, each with own policies
  – Large number of users
  – Users and resources in different domains
  – Community policies on use of resources
Goals of this Project
• Design, develop and standardize tools for maintaining structure of a collaboration
  – Take into account collaboration policy, user privileges, site policies, resource policies, etc.
• Improve significantly the integration of local security environments
  – E.g., Kerberos
• Instantiate our research results into a framework that makes them usable by a wide range of collaborative tools
  – Globus Toolkit, Condor
• Work within standards community to socialize and standardize our approaches
  – GGF, IETF, OASIS
Our Process
[Figure: process cycle. Steps: Engage with communities; Design and develop solutions; Integrate into community software; Standardize solutions for greater acceptance; Evaluate and guide emerging standards; Get feedback]
Delivered Solutions
• Fine-grained Policy R&D:
  – Community Authorization Service
  – Dynamic Policy Reconciliation
• Site Security Integration:
  – KCA/Kx509
  – Authorization Callouts
• Grid Security Usability:
  – SimpleCA / Online CA / MiniCA
  – Online Credential Repository
Standards and Implementations
• X.509 Proxy Certificates
• GSSAPI extensions
• Policy work: SAML, XACML
• Policy Reconciliation