Next Generation Cloud Computing: Advanced Services,
Architecture, and Technologies
Joe Mambretti, Director,
International Center for Advanced Internet Research (www.icair.org)
Northwestern University
Director, Metropolitan Research and Education Network (www.mren.org)
Partner, StarLight/STAR TAP, PI-OMNINet (www.icair.org/omninet)
Technische Universität Carolo-Wilhelmina zu Braunschweig
Braunschweig, Germany
July 1-3, 2009
• Creation and Early Implementation of Advanced Networking Technologies - The Next Generation Internet, All Optical Networks, Terascale Networks, Networks for Petascale Science
• Advanced Applications, Middleware, Large-Scale Infrastructure, NG Optical Networks and Testbeds, Public Policy Studies and Forums Related to NG Networks
• Three Major Areas of Activity: a) Basic Research b) Design and Implementation of Prototypes c) Operations of Specialized Communication Facilities (e.g., StarLight)
Accelerating Leading Edge Innovation
and Enhanced Global Communications
through Advanced Internet Technologies,
in Partnership with the Global Community
Introduction to iCAIR:
Invisible Nodes, Elements, Hierarchical, Centrally Controlled, Fairly Static
Traditional Provider Services: Invisible, Static Resources, Centralized Management, Highly Layered
Limited Services, Functionality, Flexibility
Distributed Programmable Resources, Dynamic Services, Visible & Accessible Resources, Integrated As Required, Non-Layered
Unlimited Services, Functionality, Flexibility
Paradigm Shift – Ubiquitous Services Based on Large Scale
Distributed Facility vs Isolated Services Based on Separate
Component Resources
A Next Generation Architecture: Distributed Facility
Enabling Many Types of Networks/Services
[Figure: environments hosted on the distributed facility: VO, International Gaming Fabric, Control Plane, TransLight, Real Org, Real Org1, Real Org2, Sensors, Intelligent Power Grid Control, Large Scale System Control, Global App, Financial Org, Gov Agency, RFIDNet, Bio Org, Lab. Networks shown: Commodity Internet, SensorNet, FinancialNet, HPCNet, MediaGridNet, R&DNet, RFIDNet, BioNet, PrivNet, GovNet1, MedNet.]
Cloud Context (1)
• In General, Clouds Are A Means To Support Large Scale Computing
and Data Capabilities for Distributed On-Demand Resources Using
Data Networks (WANs)
• There Are Many Different Types of Clouds
• Some Are Oriented Toward Services (e.g., Web 2.0 Based)
• Some Are Oriented Toward Resources
– For Example, On-Demand Computing Instances Using
Infrastructure As A Service Techniques (IaaS, Amazon EC2, S3,
etc., Eucalyptus)
• Some Provide Large Scale On-Demand Computing Capacity
(GFS/MapReduce/Bigtable, Hadoop, Sector, etc.)
• Some Support Public Services Provided By Global Corporations,
Some Support Private Enterprises Through External Resources,
Some Support Private Organizations Through Internal Resources –
and There Are Many Variations and Hybrids
Cloud Context (2)
• To Date, Clouds Have Been Successful
• However, Current Clouds Have Limitations
• For Example, They Do Not Necessarily Provide
Optimal Performance
• Also, Current Clouds Are Oriented Toward
Providing Support For Many Billions of Small
Data Over Commodity “Best Effort” Routed
Networks
• Current Clouds Do Not Provide Optimal Support for
Large Capacity Data Flows and Extremely Large
Amounts of Individual Data Components
• They Have Not Been Integrated Into Next
Generation Networking Capabilities
• They Do Not Handle Specialized Data Well – e.g.,
Digital Media
Cloud Context (3)
• If Clouds Are Successful, Why Improve Them
Despite Limitations?
• They Are Successful for Today’s Consumer and
Enterprise Services
• However, They Will Not Meet Future Challenges
Using Current Architectures and Technologies
• Illustration – And Also Motivation For Improving
the State-of-the-Art – Large Scale Science
• Why Large Scale Science?
• Large Scale Science Provides a Looking Glass Into
the Future
A Scientific Perspective
• Scientific Research Requires The Resolution of Extremely Complex Problems
• Scientific Research Requires The Design And Creation Of Specialized Tools
• Increasingly, These Tools Are Being Created Using Digital Technologies
• Because of the Complexity and Scale of Major Scientific Problems, Many Areas of Research Encounter Technical Barriers Years Before They Are Recognized By Other Domains
• Technical Solutions That Are Created Later Migrate To Wider Communities
• Can Large Scale Science Use Clouds? Not Until the Limitations Described Earlier Are Addressed
Motivation: Data-Intensive Science &
Engineering-e-Science Community Resources
ATLAS
Sloan Digital Sky
Survey
LHC
ALMA
Magnetic Fusion Energy
Source: DOE
Source: DOE
New Sources
Of Power
Spallation Neutron Source (SNS) at ORNL
Source: DOE
USGS Images 10,000 Times
More Data than Landsat7
Shane DeGross, Telesis
USGS
Landsat7 Imagery
100 Foot Resolution
Draped on elevation data
New USGS Aerial Imagery
At 6-inch Resolution
Source: EVL, UIC
Today’s Aerial Imaging is >500,000
Times More Detailed than Landsat7
30 meter pixels
4 centimeter pixels
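The ">500,000 times" figure is consistent with the two pixel sizes quoted above if "more detailed" is read as pixels per unit of ground area. That reading is an assumption, not something the slide states. A quick check under that assumption:

```python
# Pixel edge lengths from the slide, expressed in centimeters.
landsat7_pixel_cm = 3000  # 30 meters
aerial_pixel_cm = 4       # 4 centimeters

# Assumption: "detail" is counted per unit of ground area,
# so the gain is the square of the linear ratio.
linear_ratio = landsat7_pixel_cm // aerial_pixel_cm
area_ratio = linear_ratio ** 2

print(linear_ratio, area_ratio)  # 750 562500
```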
Shane DeGross
Laurie Cooper, SDSU
SDSU Campus
Source: Eric Frost, SDSU
iGrid 2005
UCSD
4K Digital Media
Ultra High Definition Digital Communications
• NTT, Japan
NTT’s digital communications using
SHD transmits extra-high-quality,
digital, full-color, full motion images.
4k pixels horizontal, 2k vertical
4 * HDTV – 24 * DVD
www.onlab.ntt.co.jp/en/mn/shd
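The "4 * HDTV – 24 * DVD" comparison can be reproduced from pixel counts. The slide gives only "4k pixels horizontal, 2k vertical", so the three frame sizes below are assumptions, chosen as one common set for which both ratios come out exactly:

```python
# Assumed frame sizes (width, height); none of these are stated on the slide.
shd = (3840, 2160)   # assumed 4K frame
hdtv = (1920, 1080)  # assumed HDTV frame
dvd = (720, 480)     # assumed NTSC DVD frame

def pixels(frame):
    """Total pixel count of a (width, height) frame."""
    width, height = frame
    return width * height

print(pixels(shd) // pixels(hdtv), pixels(shd) // pixels(dvd))  # 4 24
```

With a 4096-pixel-wide frame the ratios would be slightly larger, so the slide's figures are best read as approximate.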
• Center for Computation and Technology,
Louisiana State University (LSU), USA
• Northwestern University
• MCNC, USA
• NCSA, USA
• Lawrence Berkeley National Laboratory,
USA
• Masaryk University/CESNET, Czech
Republic
• Zuse Institute Berlin, Germany
• Vrije Universiteit, NL
www.cct.lsu.edu/Visualization/iGrid2005
http://sitola.fi.muni.cz/sitola/igrid/
• Interactive visualization coupled with
computing resources and data storage
archives over optical networks enhances
the study of complex problems, such as
the modeling of black holes and other
sources of gravitational waves.
• HD video teleconferencing is used to
stream the generated images in real time
from Baton Rouge to Brno and other
locations
High-Performance Digital Media For Interactive Remote Visualization (2006)
OptIPuter JuxtaView Software for Viewing
High Resolution BioImages on Tiled Displays
30 Million Pixel Display
NCMIR Lab UCSD
Source: David Lee, Jason Leigh, EVL, UIC
Components Comprising Environment
• Overall Architecture
• Compute Nodes
• Storage Performance
• Network Architecture, Protocols,
Performance and Technology
Note: Ultra High Performance Networks
Can Make Remote Data Appear To Be Local
• Proof of Concept – Large Scale
Testbed/Prototype
• Integration With Emerging Technologies, e.g.,
Massive Multicore, FPGAs, Customized
Integrated Components, etc.
Architecture (1)
Storage Compute
Data
Super-Computer Model:
• Expensive
• IO is a bottleneck
Alternative Model:
• Inexpensive
• Parallel data IO
• Examples: Hadoop, Sphere/Sector
Source: NCDM, UIC
Architecture (2)
Parallel/Distributed Programming With MPI, etc.:
• Flexible and Powerful
• BUT Very Complicated
• No Data Locality
Sector/Sphere Model:
• Very Simple to Apply UDF to All Data in Parallel
• Exploits Data Locality
• Limited to Certain Data Parallel Applications
Source: NCDM, UIC
What is Sector/Sphere?
• Sector/Sphere = Wide Area Cloud Providing
On-Demand Computing Capacity
• Sector: Distributed Storage System
• Sphere: Run-time Middleware Applies User
Defined Functions (UDF) to Sector Datasets.
• Open Source Software, GPL/LGPL, written in
C++.
• Initiated 2006, Current Version 1.19
• http://sector.sf.net
Source: NCDM, UIC
Sector
• Sector: Provides Long Term Persistent Storage to
Large Datasets Managed as Distributed Indexed Files.
• File Segments Are Placed Throughout Distributed
Storage Managed by Sector.
• Sector Generally Replicates Data To
– Ensure Longevity,
– Decrease the Latency When Retrieving It,
– Provide Opportunities for Parallelism.
• Sector is Designed to Take Advantage of Wide Area
High Performance Networks When Available.
• Sector Can Address Issues Of Extremely Large Data Sets,
Including Very Large Scale Science Data Sets
Source: NCDM, UIC
Sphere
• Sphere: Designed To Execute User Defined Functions
(UDF) In Parallel Using a Stream Processing Pattern for
The Data That Is Managed By Sector
• UDFs Are Applied To Every Data Record In a Data Set
Managed by Sector
• Each Data Segment Is Processed Independently
Providing a Natural Parallelism
• The Sector/Sphere Design Results in Allowing Data
To Be Frequently Processed in Place Without
Moving It
• If Data Must be Moved, It Can Be Transported Over High
Performance Channels With High Performance
Protocols
Source: NCDM, UIC
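A minimal sketch of the pattern described above, written in Python rather than Sphere's C++ and not using any Sector/Sphere API. The segment layout, the `udf` function, and the thread pool standing in for per-node workers are all illustrative assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical dataset: each inner list is one segment of records,
# as if each segment were stored on a different node.
segments = [
    [3, 1, 2],
    [10, 20],
    [7],
]

def udf(record):
    """Example user-defined function: square one record."""
    return record * record

def process_segment(segment):
    """Apply the UDF to every record of one segment, independently of the others."""
    return [udf(record) for record in segment]

def run(segments, workers=3):
    # Each segment is handed to a worker; results come back in segment order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_segment, segments))

results = run(segments)
print(results)  # [[9, 1, 4], [100, 400], [49]]
```

Because no segment depends on another, the work can be scheduled wherever each segment already resides, which is the sense in which data is processed in place.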
Comparing Hadoop and Sector
                     Hadoop                     Sector
Storage Cloud        Block-based file system    File-based
Programming Model    MapReduce                  MapReduce & UDF
Protocol             TCP                        UDP-based protocol (UDT)
Replication          At time of writing         Periodically
Security             Within 6 months            Security (HIPAA)
Language             Java                       C++
Proof of Concept: Large Scale
Testbed/Prototype
• Theories Must Be Proven in Practice Using Real Facilities
• Questions: Can This Concept Scale?
Will It Work At Extreme Scales?
Can High Performance Be Achieved?
• Lab Modeling and Simulation Cannot Substitute for
Real Empirical Studies
• Experimental Research Testing Is Required
– Using Real World Large Scale Facilities
• Distributed Environments and Infrastructure
• With National Science Foundation Funding, A Research
Consortium (NCDM, iCAIR, et al.) Has Created:
– An International Scale TeraFlow Testbed Using NLR and
the Global Lambda Integrated Facility (GLIF)
– A National Scale Open Cloud Testbed, Based on the NLR
Source: NLR
Teraflow 1 & 2 Testbed
Teraflow Network is Built Over the National Lambda Rail and GLIF
[Map labels: Seoul, Asia, EU, GLIF]
StarLight – “By Researchers For Researchers”
Abbott Hall, Northwestern University’s Chicago downtown campus
View from StarLight
StarLight is an experimental optical infrastructure and proving ground for network services optimized for high-performance applications
GE+2.5+10GE Exchange
Soon: Multiple 10GEs Over Optics – World’s “Largest” 10GE Exchange
First of a Kind Enabling Interoperability At L1, L2, L3
iCAIR: Founding Partner of the Global Lambda Integrated Facility
Available Advanced Customizable Network Resources
Visualization courtesy of Bob Patterson, NCSA; data compilation by Maxine Brown, UIC.
www.glif.is
GLIF is a consortium of institutions, organizations, consortia and country
National Research & Education Networks who voluntarily share optical
networking resources and expertise to develop the Global LambdaGrid for the
advancement of scientific collaboration and discovery.
Enables the Creation of Distributed Virtual Environments
Open Cloud Testbed – 2009
6 Locations
• 8 racks
• 256 Nodes
• 1024 Cores
• 10+ Gb/s
32
MREN
CENIC Dragon
Hadoop
Sector/Sphere
Thrift
Eucalyptus
C-Wave
Example: Sorting a TeraByte
• Data is Split Into Multiple Small Files,
Scattered On All Nodes
• Stage 1: On Each Node, an SPE (Sphere Processing
Engine) Scans Local Files, Sends Each Record To a
“Bucket File” On a Remote Node
According To The Key, So That The
Buckets Are Ordered Relative To Each Other.
• Stage 2: On Each Destination Node, an
SPE Sorts All Data Inside Each Bucket.
Source: NCDM, UIC
TeraSort Using Sector & Data-Parallel UDFs
on Open Cloud Testbed
[Figure: each binary record is 100 bytes: a 10-byte key and a 90-byte value. Stage 1: hash on the first 10 bits of the key (values 0-1023) into Bucket-0, Bucket-1, ..., Bucket-1023. Stage 2: sort each bucket on its local node.]
Source: NCDM, UIC
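The two stages can be sketched on a single machine as follows. This is an illustration in Python, not Sector/Sphere code: in-memory lists stand in for bucket files on remote nodes, and the function names and random test records are assumptions made for the example.

```python
import os

NUM_BUCKETS = 1024  # 2**10: one bucket per value of the first 10 key bits

def bucket_of(record):
    """Bucket index = first 10 bits of the key (top bits of the first two bytes)."""
    return ((record[0] << 8) | record[1]) >> 6

def stage1(records):
    """Scatter records into buckets by key prefix (stands in for sending to remote nodes)."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for record in records:
        buckets[bucket_of(record)].append(record)
    return buckets

def stage2(buckets):
    """Sort each bucket independently by its 10-byte key, as each destination node would."""
    return [sorted(bucket, key=lambda r: r[:10]) for bucket in buckets]

# Hypothetical test data: 100-byte records (10-byte key + 90-byte value).
records = [os.urandom(100) for _ in range(5000)]
sorted_buckets = stage2(stage1(records))

# Reading the buckets in index order yields a globally sorted sequence,
# because a smaller 10-bit prefix always means a smaller key.
output = [record for bucket in sorted_buckets for record in bucket]
```

Since the bucket index is taken from the leading bits of the key, no merge step is needed after Stage 2: bucket order alone gives the global order.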
Performance Results: TeraSort on Open
Cloud Testbed
Sites                            Data Size   Sphere   Hadoop (3 replicas)   Hadoop (1 replica)
UIC                              300GB       1265     2889                  2252
UIC + StarLight                  600GB       1361     2896                  2617
UIC + StarLight + Calit2         900GB       1430     4341                  3069
UIC + StarLight + Calit2 + JHU   1200GB      1526     6675                  3702
Run time: seconds
Sector v1.16 vs Hadoop 0.17
Source: NCDM, UIC
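To make the comparison easier to read, the run times above can be turned into speedup ratios (Hadoop time divided by Sphere time). The numbers are copied from the table; the ratio is a derived illustration, not a metric reported on the slide.

```python
# (sites, data size in GB, Sphere s, Hadoop 3-replica s, Hadoop 1-replica s)
results = [
    ("UIC", 300, 1265, 2889, 2252),
    ("UIC + StarLight", 600, 1361, 2896, 2617),
    ("UIC + StarLight + Calit2", 900, 1430, 4341, 3069),
    ("UIC + StarLight + Calit2 + JHU", 1200, 1526, 6675, 3702),
]

# Speedup of Sphere over each Hadoop configuration, per row.
speedup_3_replicas = [hadoop3 / sphere for _, _, sphere, hadoop3, _ in results]
speedup_1_replica = [hadoop1 / sphere for _, _, sphere, _, hadoop1 in results]

for (sites, size_gb, *_), s3, s1 in zip(results, speedup_3_replicas, speedup_1_replica):
    print(f"{sites} ({size_gb} GB): {s3:.2f}x vs 3 replicas, {s1:.2f}x vs 1 replica")
```

On these figures the ratio grows as sites are added, from a little over 2x to a little over 4x against the three-replica configuration, while the Sphere run time itself rises only modestly as the data size quadruples.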
Terasort on Open Cloud Testbed
Source: NCDM, UIC
Testbed Demonstrations With National Science Foundation
at the Annual Conference of
The American Association for the Advancement of Science
February 2009
Using An Optical Fiber Extension from StarLight/GLIF
NSF Director
Former NSF Director NCDM Director
Future: On-Going Expansion
• Expansion Using More Resources
• Additional Enhancements
• Integration With Emerging Technologies
• Expansion Across National Fabrics
• Expansion Across International Fabrics,
Using GLIF, StarLight
• Additional Communities
For More Information
• iCAIR: www.icair.org
• NCDM: www.ncdm.uic.edu
• Open Cloud Testbed:
www.opencloudconsortium.org
• Sector: sector.sourceforge.net