1a-1.1
Introduction to Grid Computing
ITCS 4146/5146, UNC-Charlotte, B. Wilkinson, 2008 Aug 27, 2008
1a-1.2
"The grid virtualizes heterogeneous geographically disperse resources" from "Introduction to Grid Computing with Globus," IBM Redbooks
Grid Computing
• Using geographically distributed and interconnected computers together for computing and for resource sharing.
“Grid”
• Common practice to use the word Grid as a proper noun (i.e. G is capitalized), although it does not refer to one universal Grid.
• There are many Grid infrastructures.
• We have set up one for this course.
• You will learn how that was done and the technicalities in the course.
1a-1.3
1a-1.4
Need to harness computers
Original driving force behind Grid computing same as behind the early development of networks that became the Internet:
– Connecting computers at distributed sites for high performance computing.
1a-1.5
However, Grid computing is about collaborating and resource sharing as much as it is about high performance computing.
1a-1.6
Virtual Organizations
Grid computing offers the potential of virtual organizations:
– groups of people, both geographically and organizationally distributed, working together on a problem, sharing computers AND other resources such as databases and experimental equipment.
Different organizations can supply resources and personnel.
Concept has many benefits, including:
•Problems that could not be solved previously for humanity because of limited computing resources can now be tackled.
Examples
• Understanding the human genome
• Searching for new drugs
• …
Continued.
1a-1.7
• Users can have access to far greater computing resources and expertise than available locally.
• Inter-disciplinary teams can be formed across different institutions and organizations to tackle problems that require expertise of multiple disciplines.
• Specialized localized experimental equipment can be accessed remotely and collectively.
Continued.
1a-1.8
• Large collective databases can be created to hold vast amounts of data.
• Unused compute cycles can be harnessed at remote sites, achieving more efficient use of computers.
• Business processes can be re-implemented using Grid technology for dramatic cost saving.
1a-1.9
Crosses multiple administrative domains.
• Another hallmark of larger Grid computing projects.
• Resources being shared owned either by members of virtual organization or donated by others.
• Introduces difficult technical and socio-political challenges.
• Requires true collaboration.
1a-1.10
• Some key features we regard as indicative of Grid computing:
– Shared multi-owner computing resources
– Uses Grid computing software, with security and cross-management mechanisms in place
– Tools to bring together geographically distributed computers owned by others.
1a-1.11
1a-1.12
Shared Resources
Can share much more than just computers:
• Storage
• Sensors for experiments at particular sites
• Application Software
• Databases
• Network capacity, …
1a-1.13
Interconnections and Protocols
Focus now on:
• using standard Internet protocols and technology, e.g. HTTP, SOAP, web services.
1a-1.14
History
• Began in mid 1990’s with experiments using computers at geographically dispersed sites.
• Seminal experiment – “I-way” experiment at 1995 Supercomputing conference (SC’95), using 17 sites across US running:
– 60+ applications.
– Existing networks (10 networks).
1a-1.15
Applications
• Originally e-Science applications
– Computationally intensive
• Traditional high performance computing addressing large problems
• Not necessarily one big problem but a problem that has to be solved repeatedly with different parameters.
– Data intensive
• Computational but emphasis on large amounts of data to store and process
– Experimental collaborative projects
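A problem that is solved repeatedly with different parameters is often called a parameter sweep. The following is only a minimal sketch of that pattern in Python, not Grid middleware: `simulate` is a hypothetical stand-in for a real application, the parameter names are invented for illustration, and local threads stand in for the remote Grid resources that would run each job.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(temperature, pressure):
    """Hypothetical stand-in for one run of a real application."""
    return temperature * pressure


def parameter_sweep(temperatures, pressures, workers=4):
    """Run simulate() once per parameter combination.

    Each run is independent of the others, which is what makes this
    kind of workload easy to spread across many machines.
    """
    combos = list(product(temperatures, pressures))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda combo: simulate(*combo), combos))
    return dict(zip(combos, results))


if __name__ == "__main__":
    for params, value in parameter_sweep([300, 350], [1, 2]).items():
        print(params, value)
```

Because no run depends on another, the runs could in principle be handed to a job scheduler and placed on whatever resources are free.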
1a-1.16
• Now also e-Business applications
– To improve business models and practices.
– Sharing corporate computing resources and databases
– On-demand Grid computing … indirectly led to cloud computing.
Grid Computing versus Cluster Computing
• Important not to think of Grid computing simply as a large cluster, because the potential and challenges are different.
UNC-C cluster computing course ITCS 4145/5145.
UNC-C Grid computing course ITCS 4146/5146.
• Courses on Grid computing and on cluster computing are quite different.
1a-1.17
Cluster computing course
• One learns about:
– Message passing programming using tools such as MPI, and
– Shared memory programming using threads and OpenMP, given that most computers in a cluster today are multi-core shared memory systems.
– Parallel algorithms (lots)
• Network security is not a big issue.
– Usually an ssh connection to front node of cluster is sufficient.
– User logging onto a single compute resource.
• Computers connected together locally under one administrative domain
1a-1.18
Grid computing course
• Learn about running jobs on remote machines, scheduling jobs and distributed workflow
• Learn in detail underlying Grid infrastructure
• How Internet technologies applied to Grid computing
• Grid computing software and standards
• Security is an issue.
1a-1.19
Grid Computing versus Cluster Computing
• Of course, there are things in common
• Both courses hands-on with programming experiences.
• Both use multiple computers
• Both require job scheduler to place jobs.
1a-1.20
Cloud computing
• Lot of hype on Cloud computing at the moment.
• Business model in which services provided on servers that can be accessed through Internet.
• Lineage of cloud computing can be traced back to on-demand Grid computing in the early 2000’s.
1a-1.21
1a.22
Fig 1.2 Cloud computing using virtualized resources
• Common thread between Grid computing and cloud computing is use of Internet to access resources.
• Cloud computing driven by widespread access that Internet provides and Internet technologies.
• However cloud computing quite distinct from original purpose of Grid computing.
1a-1.23
Grid Computing versus Cloud Computing
• Whereas Grid computing focuses on collaborative and distributed shared resources,
Cloud computing concentrates upon placing services for users to pay to use.
• Technology for cloud computing emphasizes:
– use of software as a service (SaaS)
– virtualization (process of separating particular user’s software environment from underlying hardware).
1a-1.24
Ian Foster’s checklist
Ian Foster credited for development of Grid computing.
Sometimes called father of Grid computing
Proposed simple checklist of aspects that are common to most true Grids:
•No centralized Control
•Standard open protocols
•Non-trivial quality of service (QoS)
1a-1.25
1a-1.26
Computational Grid Applications
• Biomedical research
• Industrial research
• Engineering research
• Studies in Physics and Chemistry
• …
1a-1.27
Sample Grid Computing Projects
• Enterprise Grids – Grid formed within an organization for collaboration
– Still might cross administrative domains of departments and requires departments to share their resources
– Example: campus Grids
1a-1.28
1a.29
Example
University of Virginia Campus Grid
• Partner Grids -- Grids between collaborative organizations
• This makes most use of potential of Grid computing and collaboration
1a-1.30
NSF Network for Earthquake Engineering Simulation (NEES)
Transform our ability to carry out research vital to reducing vulnerability to catastrophic earthquakes
from I. Foster
Environment/Earth
1a-1.32
SCOOP Project
Southeastern Coastal Ocean Observing and Prediction Program
http://scoop.sura.org/
• Integrating data from regional observing systems for real time coastal forecasts in SE
• Coastal modelers with computer scientists to couple models, provide data solutions, deploy ensembles of models on the Grid, assemble real time results with GIS technologies.
From: "Urgent Computing for Hurricane Forecasts," Gabrielle Allen, Urgent Computing Workshop, Argonne National Laboratory, April 25th to 26th, 2007 http://scoop.sura.org/documents/UrgentComputing_April2007.pdf
SCOOP Prototype Distributed Laboratory
Funded by ONR & NOAA
2005/2006 SCOOP Implementation Team:
• Bedford Institute of Oceanography
• Virginia Institute of Marine Science
• University of Alabama, Huntsville
• Texas A&M
• Renaissance Computing Institute
• University of North Carolina
• University of Florida
• Louisiana State University
• Gulf of Maine Ocean Observing System
• MCNC
• Southeastern Universities Research Association
• External Resources, e.g. SURAgrid regional grid infrastructure, www.sura.org/suragrid
From: "Designing a Collaborative Cyberinfrastructure for Event-Driven Coastal Modeling," Dr. Philip Bogden, Supercomputing 2006, Nov 2006, Tampa, FL.
1a-1.34
www.earthsystemgrid.org
DOE Earth System Grid
Goal
Address technical obstacles to sharing and analysis of high-volume data from advanced earth system models
1a.35
Earth System Grid II http://www.csm.ornl.gov/Highlights/esg.html
1a.36
http://www.ediamond.ox.ac.uk/
Medicine/Biology
Project period: 2002-2005
1a-1.37
http://www.openmolgrid.org/
Project period: 2002-2005…
1a-1.38
Large Hadron Collider experimental facility for complex particle experiments at CERN
(European Center for Nuclear Research, near Geneva Switzerland).
Physics
CERN LHC Computing Grid (LCG)
Started in 2002. Expected operational 2008
1a-1.39
http://public.web.cern.ch/public/en/LHC/LHC-en.html
1a.40
CERN LHC Computing Grid (LCG)
LCG depends on two major science grid infrastructures ….
• EGEE - Enabling Grids for E-Science
• OSG - US Open Science Grid
From: LCG Overview - May 2007 - Les Robertson, http://lcg.web.cern.ch/LCG/dissemination.html
1a-1.42
Grid computing infrastructure projects
Not tied to one specific application
1a-1.43
Grid networks for collaborative grid computing projects
Grids have been set up at local level, national level, and international level throughout the world, to promote Grid computing
Grid Networks
1a-1.44
TeraGrid
Funded by NSF in 2001 initially to link five supercomputer centers. Hubs established at Chicago and Los Angeles. Each of the five centers connected to one hub:
• Argonne National Laboratory (ANL) (Chicago hub)
• National Center for Supercomputing Applications (NCSA) (Chicago hub)
• Pittsburgh Supercomputing Center (PSC) (Chicago hub)
• San Diego Supercomputer Center (SDSC) (LA hub)
• Caltech (LA hub)
1a-1.45
Hubs at Chicago and Los Angeles interconnected using 40 Gigabit/sec optical backplane network.
Five centers, each connected to one hub using 30 Gigabit/sec connections.
State-of-the-art optical lines could reach 10 Gigabit/sec in the early 2000s.
Four lines used to achieve 40 Gigabit/sec.
Three lines used to achieve 30 Gigabit/sec.
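The link figures above follow from simple multiplication of the per-line rate. A small sketch of that arithmetic in Python, assuming ideal aggregation with no protocol overhead (an assumption of this sketch, not something the slides state); the function names are illustrative only:

```python
LINE_RATE_GBPS = 10  # per-line optical rate quoted for the early 2000s


def aggregate_gbps(num_lines, line_rate_gbps=LINE_RATE_GBPS):
    """Total capacity of num_lines parallel lines, ignoring overhead."""
    return num_lines * line_rate_gbps


def lines_needed(target_gbps, line_rate_gbps=LINE_RATE_GBPS):
    """Smallest number of lines whose combined rate reaches target_gbps."""
    return -(-target_gbps // line_rate_gbps)  # ceiling division on integers


if __name__ == "__main__":
    print("hub-to-hub backplane:", aggregate_gbps(4), "Gigabit/sec")
    print("center-to-hub link:", aggregate_gbps(3), "Gigabit/sec")
```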
1a-1.46
TeraGrid circa 2004
TeraGrid was further funded by NSF for period 2005-2010.
Has developed into a platform for a wide range of Grid applications and is described as:
“the world’s largest, most comprehensive distributed cyberinfrastructure for open scientific research.”
http://www.teragrid.org/about/
1a-1.47
1a-1.48
TeraGrid as of 2008
Open Science Grid (OSG)
Started around 2005, received $30 million funding from NSF and DOE in 2006:
• Boston University
• Brookhaven National Laboratory
• California Institute of Technology
• Columbia University
• Cornell University
• Fermi National Accelerator Laboratory
• Indiana University
• Lawrence Berkeley National Laboratory
• Stanford Linear Accelerator Center
• University of California, San Diego
• University of Chicago
• University of Florida
• University of Iowa
• University of North Carolina/RENCI
• University of Wisconsin-Madison
1a-1.49
1a.50
Current status July 2008
1a.51
SURAGrid as of 2008
Southeastern Universities Research Association
1a-1.52
National Grids
Many countries have embraced Grid computing and set up Grid computing infrastructure:
• UK e-Science grid
• Grid-Ireland
• NorduGrid
• DutchGrid
• PIONIER grid (Poland)
• ACI grid (France)
• Japanese grid
• etc.
1a-1.53
UK e-Science Grid
Early 2000’s
UK National Grid Service
• Follow-up from UK e-Science Grid
• Founded in 2004 to provide distributed access to computational and database resources, with four core sites:
– Universities of Manchester, Oxford and Leeds, and Rutherford Appleton Laboratory
• By 2008, it had grown to 16 sites.
• Access free to any academic with a legitimate need.
1a-1.54
Multi-national Grids
• 2000-2005, several efforts to create Grids that spanned across many countries.
1a.55
Multi-national Grid example
ApGrid
• A partnership in Asia Pacific region involving:
– Australia, Canada, China, Hong Kong, India, Japan, Malaysia, New Zealand, Philippines, Singapore, South Korea, Taiwan, Thailand, USA, and Vietnam.
1a.56
European centered multi-national Grids
• Several initiatives, funded by European programs, for European countries to collaborate in forming Grid-like infrastructures to share compute resources.
1a.57
European centered multi-national Grid Example
DEISA (Distributed European Infrastructure for Supercomputing Applications)
DEISA-1 project from 2004 - 2008.
DEISA-2 started in 2008, to extend to 2011
1a.58
DEISA (Distributed European Infrastructure for Supercomputing Applications)
As of 2008
1a.59
DEISA-2 partners
• Barcelona Supercomputing Centre Spain (BSC)
• Consorzio Interuniversitario per il Calcolo Automatico Italy (CINECA)
• Finnish Information Technology Centre for Science Finland (CSC)
• University of Edinburgh and CCLRC UK (EPCC)
• European Centre for Medium-Range Weather Forecasts UK (ECMWF)
• Research Centre Juelich Germany (FZJ)
• High Performance Computing Centre Stuttgart Germany (HLRS)
• Institut du Développement et des Ressources en Informatique Scientifique - CNRS France (IDRIS)
• Leibniz Rechenzentrum Munich Germany (LRZ)
• Rechenzentrum Garching of the Max Planck Society Germany (RZG)
• Dutch National High Performance Computing Netherlands (SARA)
• Kungliga Tekniska Högskolan Sweden (KTH)
• Swiss National Supercomputing Centre Switzerland (CSCS)
• Joint Supercomputer Center of the Russian Academy of Sciences Russia (JSCC)
1a.60
Vision of a single universal international Grid such as the Internet/World Wide Web
May never be achieved though.
More likely: Grids will connect to other Grids but will maintain their identity.
1a.61
Uses the teleconferencing facilities of NCREN and clusters at various sites across North Carolina
1a.62
Our Grid computing course
1a.63
Our Grid Computing Course
• Uses the teleconferencing facilities of NCREN
• Broadcast on NCREN network across North Carolina.
• Uses clusters at various participating sites
• Relies heavily on faculty at participating sites
• First offered in 2004 (8 sites). Again in Fall 2005 (12 sites), Spring 2007 (3 sites), and Fall 2008 (5 sites)
WCU teleclassroom
15 participating sites in total, 2004-2008
1a.64
1a.65
Close to home: Basis of our course
Every state has its own network structure for the Internet
1a.66
Fall 2005 Course grid structure
[Figure: sites MCNC, UNC-W, UNC-A, NCSU, WCU, UNC-C and ASU, with seven "CA" labels; one element annotated "Backup facility, not actually used"]
1a.67
Questions
1a.68
There will be multiple-choice quizzes in the course (on-line through Blackboard).
Quiz
Question: What is a virtual organization?
(a) An imaginary company.
(b) A web-based organization.
(c) A group of people geographically distributed that come together from different organizations to work on a Grid project.
(d) A group of people that come together to work on a virtual reality Grid project.
Question: What is meant by the term cloud computing?
(a) Atmospheric Computing
(b) Computing using geographically distributed computers
(c) A facility providing services and software applications
(d) A secure CIA computing facility
1a.69
Question: In addition to computers, which of the following resources can be shared on a Grid?
(a) Storage
(b) Application Software
(c) Specialized equipment (such as sensors)
(d) Databases
(e) All of the above
1a.70
Questions
Grid Computing is using ______________
______________ and interconnected
computers together for computing and
resource ______________.
Questions
The original driving force behind Grid
Computing was ______________
______________ ______________.
Questions
However, Grid Computing is more about
______________ and ______________
______________ than it is about high
performance computing.
Questions
Another important component of Grid
Computing is ______________
______________, groups of people, both
geographically and organizationally
distributed, working together on a problem,
sharing computers AND other resources.
Questions
Other models of computing that are similar
but different to Grid Computing are
______________ Computing and
______________ Computing.
Questions
Ian Foster's checklist for determining if
a grid is a Grid:
a)______________________
b)______________________
c)______________________