An Architectural Approach to Managing Data in Transit. Micah Beck, Director & Associate Professor, Logistical Computing and Internetworking Lab, Computer Science Department, University of Tennessee. DOE Data Management Workshop, 3/17/2004.


Page 1: An Architectural Approach to Managing Data in Transit

An Architectural Approach to Managing Data in Transit

Micah Beck

Director & Associate Professor

Logistical Computing and Internetworking Lab

Computer Science Department

University of Tennessee

DOE Data Management Workshop 3/17/2004

Page 2: An Architectural Approach to Managing Data in Transit

“Data in Transit”

» After being generated by an instrument or supercomputer

» Not stored in a permanent archive

» Serving the diverse purposes of a community of users and applications

» Being transferred, processed and stored to meet changing and unanticipated needs

• Visualization

• Data Mining

• Collaboration

• Distributed Computing

Page 3: An Architectural Approach to Managing Data in Transit

Interoperability via a Common Interface

» Span heterogeneous physical resources, operating systems, local management schemes

» Serve changing and unexpected application requirements; enable application autonomy

» We measure success in terms of infrastructure deployment scalability

• In networks and distributed systems, this means number, distribution, global reach, spanning administrative domains…

• The Internet is the gold standard of infrastructure deployment scalability

Page 4: An Architectural Approach to Managing Data in Transit

Layering as An Architectural Approach

» Abstractions at each layer can hide differences at lower layers

» Exposed approaches avoid creating overly complex mechanisms at lower layers

» The E2E Principle: Attributes of lower layers implemented on shared infrastructure enable deployment scalability

• Generality: Serve diverse application needs, model diverse lower layer resources

• Weak semantics: Don’t give too much away at one time!
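One way to picture weak semantics combined with the end-to-end principle is the minimal Python sketch below. It is not taken from the talk: `BestEffortStore`, `reliable_put` and `reliable_get` are hypothetical names. The shared lower layer promises almost nothing (a write may silently vanish), and the end points recover reliability themselves through replication and checksums.

```python
import hashlib
import random


class BestEffortStore:
    """Hypothetical shared lower layer: keeps bytes, but may lose any write."""

    def __init__(self, loss_rate, rng):
        self.loss_rate = loss_rate  # probability that a write is dropped
        self.rng = rng
        self.blocks = {}

    def put(self, key, data):
        # Weak semantics: no delivery guarantee and no error on loss.
        if self.rng.random() >= self.loss_rate:
            self.blocks[key] = data

    def get(self, key):
        # Returns None when the block was never stored or was lost.
        return self.blocks.get(key)


def reliable_put(stores, key, data):
    """End-point logic: replicate to every store and keep a checksum."""
    for store in stores:
        store.put(key, data)
    return hashlib.sha256(data).hexdigest()


def reliable_get(stores, key, checksum):
    """End-point logic: return the first replica whose checksum matches."""
    for store in stores:
        data = store.get(key)
        if data is not None and hashlib.sha256(data).hexdigest() == checksum:
            return data
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    stores = [BestEffortStore(0.5, rng) for _ in range(4)]
    digest = reliable_put(stores, "frame-1", b"simulation output")
    print(reliable_get(stores, "frame-1", digest))
```

Because the lower layer stays this simple, it can model many kinds of resources; applications that need stronger guarantees pay for them only at the edges.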

Page 5: An Architectural Approach to Managing Data in Transit

The IP Network Stack

Application

Transport

Network: common interface (IP)

Link

Physical

Page 6: An Architectural Approach to Managing Data in Transit

IP’s Failure of Scalability

» Today, IP is failing as a common interface

» The design of IP is out of date

• Application communities are more diverse

• Link layer technologies violate IP assumptions

» Application communities are defining their own common interfaces for general resource sharing, deploying their own infrastructure (e.g. the Grid)

» Some networking communities have abandoned interoperability at the network layer between widely divergent link layer technologies (e.g. optical switching & IP)

Page 7: An Architectural Approach to Managing Data in Transit

The Transit Layer: A New Location for Interoperability

» Expand the link layer to a local layer to model transfer, storage and processing resources

» Insert a new transit layer between the local and network layers to implement a common interface to diverse technologies at the local layer

» Adopt a highly general common interface at the transit layer, providing a uniform view of all of the resources of the network node

» Build diverse network services on top of this common interface to model diverse application requirements

» “Locating Interoperability in the Network Stack”, Micah Beck & Terry Moore, UT-CS-04-520, Univ. of TN CS Dept Tech Rpt
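The idea of one common interface over diverse local resources can be sketched as follows. This is an illustrative Python sketch only: `TransitNode`, `MemoryNode`, `DiskNode` and `stage_and_transform` are hypothetical names, not an interface defined in the talk or the cited report. Two different local technologies sit behind the same store, transfer and process operations, and a service is written against the interface alone.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class TransitNode(ABC):
    """Hypothetical common interface to a node's local resources."""

    @abstractmethod
    def store(self, name, data):
        """Storage: keep bytes under a name."""

    @abstractmethod
    def load(self, name):
        """Storage: read back the bytes kept under a name."""

    def process(self, name, operation):
        """Processing: apply an operation to stored data in place."""
        self.store(name, operation(self.load(name)))

    def transfer(self, name, destination):
        """Transfer: copy stored data to another node."""
        destination.store(name, self.load(name))


class MemoryNode(TransitNode):
    """Local layer backed by RAM."""

    def __init__(self):
        self._blocks = {}

    def store(self, name, data):
        self._blocks[name] = data

    def load(self, name):
        return self._blocks[name]


class DiskNode(TransitNode):
    """Local layer backed by files in a directory."""

    def __init__(self, directory):
        self._dir = directory

    def store(self, name, data):
        with open(os.path.join(self._dir, name), "wb") as f:
            f.write(data)

    def load(self, name):
        with open(os.path.join(self._dir, name), "rb") as f:
            return f.read()


def stage_and_transform(source, target, name):
    """A service that depends only on the common interface."""
    source.transfer(name, target)
    target.process(name, bytes.upper)
    return target.load(name)


if __name__ == "__main__":
    mem = MemoryNode()
    mem.store("block", b"data in transit")
    with tempfile.TemporaryDirectory() as d:
        print(stage_and_transform(mem, DiskNode(d), "block"))
```

The service function never learns which backend it is talking to, which is the property the transit layer is meant to provide across the whole network.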

Page 8: An Architectural Approach to Managing Data in Transit

The Transit Network Stack

Application

Transport

Network

Transit: common interface

Local: transfer, storage, processing

Physical

Page 9: An Architectural Approach to Managing Data in Transit

Transit Networking: A Unified View

“… memory locations … are just wires turned sideways in time”

Dan Hillis, 1982, Why Computer Science is No Good

Page 10: An Architectural Approach to Managing Data in Transit

Logistical Networking: An Overlay Implementation of the Transit Layer

» Logistical Networking is an overlay implementation of transit layer functionality built on top of the IP network

» The Internet Backplane Protocol is the common transit layer interface for Logistical Networking

» Network nodes are IBP “depots” that run as user-level processes and communicate using TCP/IP as well as other link and network layer protocols

» Depots also serve storage and processing resources to Logistical Networking clients
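To make the depot idea concrete, here is a toy in-memory model in Python. It is a sketch under stated assumptions, not the IBP client API: `ToyDepot` and its methods are hypothetical, and the choice of time-limited allocations addressed by unguessable capability strings is an assumption about the style of service a depot offers rather than something stated on these slides.

```python
import secrets
import time


class ToyDepot:
    """Toy model of a depot: bounded storage handed out as timed leases."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity      # total bytes the depot will lease out
        self.clock = clock            # injectable clock, handy for testing
        self.allocations = {}         # capability -> allocation record

    def _expire(self):
        # Drop every allocation whose lease has run out.
        now = self.clock()
        expired = [c for c, a in self.allocations.items() if a["expires"] <= now]
        for cap in expired:
            del self.allocations[cap]

    def allocate(self, size, duration):
        """Lease `size` bytes for `duration` seconds; None if the depot is full."""
        self._expire()
        used = sum(a["size"] for a in self.allocations.values())
        if used + size > self.capacity:
            return None
        cap = secrets.token_hex(16)   # unguessable handle for this allocation
        self.allocations[cap] = {
            "size": size,
            "expires": self.clock() + duration,
            "data": bytearray(),
        }
        return cap

    def store(self, cap, data):
        """Append bytes to an allocation; False if expired or over size."""
        self._expire()
        alloc = self.allocations.get(cap)
        if alloc is None or len(alloc["data"]) + len(data) > alloc["size"]:
            return False
        alloc["data"] += data
        return True

    def load(self, cap):
        """Read an allocation's bytes; None once the lease has expired."""
        self._expire()
        alloc = self.allocations.get(cap)
        return bytes(alloc["data"]) if alloc is not None else None


if __name__ == "__main__":
    depot = ToyDepot(capacity=1024)
    cap = depot.allocate(64, duration=30)
    depot.store(cap, b"hello depot")
    print(depot.load(cap))
```

Leases keep the depot's promise small: it never commits storage indefinitely, so a shared node can serve many clients without central coordination.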

Page 11: An Architectural Approach to Managing Data in Transit

LN Tools and Deployment

» The Logistical Runtime System (LoRS) is a set of tools based on IBP that enable users to take advantage of the resources of IBP depots

» Logistical Distribution Network (LoDN) is a data directory, monitoring and management system

» The Logistical Backbone (L-Bone) is a resource discovery service and global experimental IBP testbed

• Over 35 TB of storage available

• Over 300 depots in 21 countries

• Leverages the resources of PlanetLab

» Additional depots deployed at ORNL & NERSC

Page 12: An Architectural Approach to Managing Data in Transit

L-Bone: August 2003 (20TB)

Page 13: An Architectural Approach to Managing Data in Transit

Example LN Applications

» Astrophysics: Terascale Supernova Initiative (A. Mezzacappa, ORNL; J. Blondin, NCSU)

• Management of massive datasets

» Fusion Energy Research (S. Klasky, PPPL)

• Streaming of simulation data during generation

» Viewset-Based Visualization

• Prestaging & caching of distant data

» Content Distribution

• Heroic data distribution problems (Linux ISOs)

» Multimedia Networking

• Creation, management & delivery of high-value content

Page 14: An Architectural Approach to Managing Data in Transit

LN Futures and Directions

» Storage

• Implementation of file system services

• Moving data through firewalls at line speed

• QoS in highly controlled environments

» Networking

• Interoperability at ultrascale

• Advanced services (e.g. multicast)

» Computation

• Offloading visualization to IBP depots

• Developing sets of operations to support application communities

Page 15: An Architectural Approach to Managing Data in Transit

Thank you!

[email protected]

http://loci.cs.utk.edu