Grids, utility computing and a perspective on the future of IT infrastructure
Washington Area CTO Forum, March 31, 2006
Nirav Kapadia [email protected]
© Nirav Kapadia 2
Outline
  Characterizing computing grids
    Grids as intended versus what we see today
    Common types of grids today
  Putting computing grids to work
    Types of problems addressed by today’s grids
    Operational considerations in deploying a grid
  A perspective on the future of IT infrastructure
    Cost pressures and technology commoditization
    Grid and utility computing: the technology enablers
Grids came about from a need for large scale, collaborative computing
  Scale is measured in terms of users, nodes, organizations, geography, and heterogeneity
    A grid in the strict sense of the word involves a large number of heterogeneous, shared resources
  Collaboration is measured in terms of resource sharing and interoperability
    A key characteristic is the ability to manage across organizational boundaries
Systems for large scale, collaborative computing must meet key criteria
  Group A
    Scalable with users and resources
    Support for heterogeneity
  Group B
    Support for interoperability
    Scalable with geographical distances
  Group C
    Fully distributed (federated) architecture
    Ability to compartmentalize along organizational boundaries
  [Figure side labels: "Strict definition of computing grid"; "Broad definition of computing grid"]
Many commercial grid solutions only meet the broad definition of a grid
  Cluster management systems
    Typically harness clusters of dedicated servers
    Examples include Platform LSF, Sun Grid Engine
  CPU-scavenging “master-slave” applications
    Typically take advantage of idle desktop cycles
    Examples include SETI@Home, distributed.net
Many commercial grid solutions only meet the broad definition of a grid
  Application-specific, custom-built grids
    Typically built around a key business function
    Examples include Acxiom, Oracle offerings
Today, solutions that meet the strict definition of a grid have to be “built”
  Grid solutions based on the Globus toolkit
    Several vendors have Globus based offerings
    Univa Corp is commercializing Globus
  Other grid solutions in academia and research
    Most are custom-built and target a specific problem
    Typically not appropriate for commercial use (today)
Key takeaways
  A grid is a distributed computing system that enables large scale, collaborative computing
    Scalable across a large number of diverse and geographically dispersed resources
  Many commercial “grid solutions” of today do not meet the strict definition of a grid
    Limited ability to manage policies and resources across administrative boundaries
Even today’s grids can benefit users with large scale computing needs
  High throughput computing (HTC)
    Many independent (non-communicating) tasks
    Large problems that break up into manageable, independent tasks
  High performance computing (HPC)
    Large problem that is not decomposable into manageable, independent tasks
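The high throughput pattern can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the thread pool stands in for grid nodes, prime counting stands in for a real workload, and the names `split_into_tasks`, `count_primes`, and `run_high_throughput` are hypothetical, not part of any grid product.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tasks(lo, hi, chunk):
    """Break the range [lo, hi) into independent sub-ranges."""
    return [(start, min(start + chunk, hi)) for start in range(lo, hi, chunk)]


def count_primes(bounds):
    """One independent task: count primes in [start, stop).

    The task needs nothing from any other task, which is what makes
    the overall problem a high throughput workload.
    """
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_high_throughput(lo, hi, chunk=1000, workers=4):
    """Farm the tasks out to a pool (a stand-in for grid nodes) and combine."""
    tasks = split_into_tasks(lo, hi, chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, tasks))
```

Because the tasks never communicate, they can run in any order on any node. A high performance workload lacks that property, so it cannot be farmed out this simply.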
High throughput computing is common in business environments
  Large, legacy applications are best served by cluster management systems
    Compute-intensive apps are preferable, but a mix of compute- and data-intensive apps is manageable
  Customizable apps that work on small slices of data work well with CPU-scavenging grids
    Apps must be compute-intensive and preferably run within a sandbox
High performance computing is seen more in targeted environments
  Applications involving multiple, communicating tasks typically require custom designed grid environments
    Examples include Oracle grid offering and some test beds built with Globus
    Other examples include distributed computing platforms such as PVM and MPI
So… you’re ready to deploy a grid computing environment…
  As with any other technology, there are several operational considerations…
    Resources on the grid – dedicated or shared?
    Access management – who needs access to what?
    Data management – how does data get to the grid?
    Security model employed by the grid
Resources on the grid – should they be dedicated or shared?
  Cluster Mgmt Systems
    Cluster management systems work best with dedicated resources
    Condor – from the U of Wisconsin – is a notable exception, but not commercially available
  CPU-scavenging grids
    As the name implies, resources are shared – and typically involve desktops
    A custom screen saver is the most common vehicle for running the grid application
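The CPU-scavenging arrangement can be sketched as a master that hands out work units and a desktop worker that computes only while the machine is idle. This is a minimal illustration under stated assumptions: `Master`, `scavenge`, and the `idle_samples` sequence are hypothetical names, and the sequence stands in for polling whether the screen saver is active.

```python
from collections import deque


class Master:
    """Holds a queue of independent work units and collects results."""

    def __init__(self, work_units):
        self.pending = deque(work_units)
        self.results = {}

    def next_unit(self):
        """Hand out the next work unit, or None when the queue is empty."""
        return self.pending.popleft() if self.pending else None

    def report(self, unit, result):
        """Record the result a worker computed for a work unit."""
        self.results[unit] = result


def scavenge(master, idle_samples, compute):
    """Process one work unit per idle sample; do nothing while the desktop is busy.

    idle_samples is a hypothetical stand-in for polling the desktop:
    True means the owner is away and the screen saver would be running.
    """
    for idle in idle_samples:
        if not idle:
            continue  # the owner is using the desktop, so do no grid work
        unit = master.next_unit()
        if unit is None:
            break  # no work left on the master
        master.report(unit, compute(unit))
```

For example, with work units `[1, 2, 3]`, idle samples `[True, False, True, False]`, and a squaring function, the worker completes units 1 and 2 and leaves unit 3 queued for another desktop.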
Access management – who needs (gets) access to what?
  Cluster Mgmt Systems
    Option #1: jobs run in a guest account
      Shared access across jobs
    Option #2: accounts for everyone on all machines
      Homogeneous uid pool highly recommended
      Logins typically disabled
  CPU-scavenging grids
    Option #1: jobs run with user’s privileges
      If downloaded by user
    Option #2: jobs run in guest account
      If set up by administrator
    No direct remote user access to desktop
Data management – how does data get to the apps?
  Cluster Mgmt Systems
    Transfer user specified files via ftp, scp, etc
    File staging for large data
    On demand file transfer (system call traps)
    Shared file systems
  CPU-scavenging grids
    Data embedded within application or retrieved via HTTP/Java call-backs
    Limited data, typically no files
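File staging, one of the cluster-side options listed above, can be sketched as three steps: stage in, run, stage out. This is a sketch under stated assumptions: a local copy with `shutil` stands in for the ftp or scp transfer to a remote node, and `run_with_staging` and its parameters are hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_staging(input_files, job, output_dir):
    """Stage inputs into a scratch directory, run the job there, stage outputs back."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as scratch:
        scratch = Path(scratch)
        # Stage in: copy each user-specified file to the execution site.
        staged = {Path(f).name for f in input_files}
        for f in input_files:
            shutil.copy(f, scratch / Path(f).name)
        # Run: the job sees only the scratch directory, not the user's files.
        job(scratch)
        # Stage out: return every file the job created, skipping the staged inputs.
        results = []
        for p in sorted(scratch.iterdir()):
            if p.is_file() and p.name not in staged:
                shutil.copy(p, output_dir / p.name)
                results.append(output_dir / p.name)
        return results
```

The scratch directory is removed when the job finishes, so the execution site keeps no copy of the data. A shared file system avoids the copies entirely, at the cost of tighter coupling between the user's machine and the compute nodes.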
Security model – user accountability is key today
  [Figure labels: Access management (capability control); Opportunities for subversion; Basic system and kernel safeguards]
  [Figure labels: Run Time Environment; Application Executable; Application Generation; Application Users]
  [Figure labels: Unchanged Binaries; Object Code Modifications; Source Code Modifications; Custom Applications]
  [Systems shown: Ideal Grid; Unix; LSF, PBS, SGE; Globus; Condor; Java, PCCs; distributed.net, SETI@Home, etc]
Key takeaways
  Today’s commercially available grid solutions primarily target high throughput computing
    Cluster management systems and CPU-scavenging grids are the most common
  Carefully consider the policy implications of grids in terms of access and data management
    More of a concern for grids that span subnets or firewalls
Even as grids take hold, the IT landscape is changing rapidly…
  Technology is rapidly being commoditized
  Businesses are more willing and able to shop for IT services
  In-house IT infrastructure is increasingly seen as complex and rigid
  © Harvard Business Review
IT infrastructure is already a commodity from a business view
  Outsourcing is pervasive, and standards-based open systems are increasingly common
    Cost pressures will continue driving businesses to streamline IT infrastructure
  More often than not, customized in-house IT systems stand out for their cost and complexity
    Common off-the-shelf solutions provide more value in the absence of direct competitive advantage
In time, economics will drive IT infrastructure out of the enterprise
  The technology enablers for this paradigm exist today, but are still nascent
    (True) grids offer a way to manage computing resources across organizational boundaries
    Utility computing solutions bring together grids, data center automation, and virtualization
The technology implications of these changes are enormous
  Computing infrastructure needs to become transparent to end users
    Users only interact with applications and data
  Policy management needs to be decoupled from system management
    Cannot assume users can be held accountable
  Components of computing systems need to be less tightly coupled
    CPU, OS, data, apps may all be in different, remote locations
A utility computing test bed at Purdue showcases this paradigm
  Operating since 1995; now a joint development effort between Purdue and U of Florida
    By 2001, allowed 3,000+ users from 30 countries to run ~100 applications in a utility environment
  Extensively validated: ~400,000 runs (by 2001); highly peaked usage profile
  Powers online simulations in the nanoHUB.org portal for the nanotechnology community
nanoHUB.org – remote access to simulators and compute power
  [Figure: nanoHUB infrastructure. Labels: nanoHUB.org Web site; Remote desktop (VNC); Internet; Physical Machine; Virtual Machine; Condor-G; Globus; Cluster; TeraGrid; NMI Cluster]
  Real users and real usage: >10,687 users
  Slide courtesy of Gerhard Klimeck, Network for Computational Nanotechnology
Inside nanoHUB.org
  [Figure labels: Web Portal; PUNCH Virtual Machine; Utility Services; Local Services; Application Repositories; Data Vaults; CPU Farms; OS Repositories]
  Custom computing environment assembled in real time
In conclusion…
  Today’s commercially available grids provide a valuable but narrow service
    More efficient computing in a closed environment; limited support for cross-organizational sharing
  In time, grid and utility computing technologies will move IT infrastructure out of the enterprise
    Virtualization and data center automation products are visible precursors
Questions? Comments?
  Email: [email protected]