cm0256 pervasive computinggoodale/teaching/cardiff/2006-7/... · 2007. 3. 20. · transport,...
TRANSCRIPT
Lecture OutlineIn this lecture we look at:
Distributed computing techniques/middleware from 50,000 feet P2P
Examples Gnutella Video Hype, round 1 cooltown
Grid Computing Background, tools Video Hype, round 2 IBM
Web Services Background benefits of service
SOAs Convergence of all Technologies? Video Hype, round 3 JISC
Back to the Dawn of P2P• Peer to Peer (P2P) - originally used to describe the communication of two peers.
• The internet started as peer to peer system e.g. ARPANET • goal - to share computing resources around the USA using different networks• UCLA, SRI, Utah and Santa Barabara• all had equal status – P2P
• From late 1960’s until 1994, machines were assumed to be switch on, connected and had an IP address assigned
• Then, invention of Mosaic and WWW led to a different type of user….
dial-up modemsIP addresses changing
unpredictable
The 1990's Client-Server Internet
Late 1990’s Naptser, then Gnutella 2000 – the new P2P started
The Brains Behind Gnutella
QuickTime and aTIFF (Uncompressed) decompressorare needed to see this picture.
And now I am writing songs and eating toxic waste …
We are Geeky and Rich !
What is Gnutella?
Gnutella is a protocol for distributed search
• peer-to-peer comms• decentralized model• No third party lookup
?
?
?
Searching a Gnutella Network: Broadcasting
Searching in Gnutella involves broadcasting a Query message to all connected peers. Each connected peer will send it to their connected peers (say 3) and so on. Typically, this search will run 7 hops. If the number of connected peers, c=3 and the hops i.e. TTL=7 then the total number of peers searched (in a fully connected network) will be:S = c + c2 +c3 + ….. ch = 3 + 9 + 27 + 81 + 243 + 729 + 2187 = 3279 Nodes
3-D Cayley Tree
Searching a Gnutella Network: All nodes
Scaling Gnutella - Social Networks• Stanley Milgram (Harvard professor) – 1967 social networking experiment• How many ‘social hops’ would it take for messages to traverse through the US population (200 million)
• Posted 160 letters randomly chosen people in Omaha, Nebraska
Boston
Omaha
• Asked them to try to pass these letters to a stockbroker working in Boston, Massachusetts
• Rules:• use intermediacies whom they know on a first name basis• chosen intelligently• make a note at each hop
• 42 letters made it !!
• Average of 5.5 hops
• Demonstrated the ‘small world effect’
Proved that the social network of the United States is indeed connected with a path-length (number of hops) of around 6 – The 6 degrees of separation !
Does this mean that it takes 6 hops to traverse 200 million people??
Lessons Learned from Milgrim’s Experiment
• Social circles are highly clustered
• A few members have wide-ranging connections• these form a bridge between far-flung social clusters • this bridging plays a critical role in bringing the network closer together
For example • A quarter of all letters passed through a local storekeeper• A half were mediated by just 3 people
Lessons Learned • These people acted as gateways or hubs between the source and the wider world• A small number of bridges dramatically reduces the number of hops
Centralized + Decentralized
• New Wave of P2P• Clip2 Gnutella Reflector
(next)• FastTrack
– KaZaA– Morpheus– Skype– Jxta
• Like Social Networks perhaps ?
The figure below is a view of the topology of a Gnutella network as shown on the LimeWire web site, the popular Gnutella file-sharing client. Notice how the power-law or centralized-decentralized structure is demonstrated.
The Gnutella Network Today
So, What is New?
Direct communication between two nodes on the Internet acting as both client and server is not novel.
P2P addresses issues that have arisen through the nature and rapid growth of the Internet.
Nodes that were not part of the fabric of an Internet application, now can constitute the majority.
It is this ability to reach the edges is what is new about P2P.
Modern Peer to Peer
P2P is a class of applications that takes advantage of resources e.g. storage, cycles, content, human presence, available at the edges of the Internet – Clay Shirky
What is an P2P application?
Computers/devices “at the edges of the internet” are those:
• Operating within transient environments - computers come and go frequently
• They can be behind a firewall or NAT systems
• Have to operate outside of DNS
• Often have to deal with differing transport protocols, devices and operating systems
Maybe P2P is not a definition…
Oram writes:
“In fact, definitional inadequacies aside, peertopeer isn’t really a set of technologies as much as it is a set of problems.”
QuickTime and aYUV420 codec decompressorare needed to see this picture.
Cooltown:the future??
Grid Computing
Grid Computing
Grid takes its name from analogy with electrical power Grid:
– electricity on demand via wall socket
– source unknown but reliable
– transparency and resilience are keys to its success
The Grid dream is to allow users to tap into resources off the internet as easily as electrical power can be drawn from a wall socket
- imagine …
Get To Know Your Grid
Foster I, Kesselman C and Tuecke S, (2001) The Anatomy of the Grid: Enabling Scalable Virtual Organizations
“The Grid is flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
The concept of Virtual Organisations
QuickTime and aTIFF (Uncompressed) decompressorare needed to see this picture.
QuickTime and aTIFF (Uncompressed) decompressor
are needed to see this picture.QuickTime and a
TIFF (Uncompressed) decompressorare needed to see this picture.
Globus Tools GT2
Recap: consists of four elements:
Resource Management: to allocate resources provided by a Grid - GRAM
Data Management: involves accessing and managing data – GridFTP, GASS
Information Services: to provide information about Grid services - MDS
And of course:
Security: to provide authentication, delegation and authorization
Grid computing by IBM
Macintosh HD:Users:scmijt:Documents:WorkDoc:ComScJob:DistributedSystems:Lectures:PervasiveLecture:gridintroToOGSA.mov
ServicesA common Language ?
Web Services, P2P services, SOA
and WSRF
What is a Service?
“A network-enabled entity that provides some capability” (Foster/Kesselman)
A service is a view of some resource, usually a software asset. Implementation detail is hidden behind the service interface.
Some commentators also stress that this interface should be programming language independent.
P2P applications normally work at the service level - adverts & queries - file sharing service, phone service, Jxta, etc.
Web Services are seen as the standardized answer.
A Simple Web Service
A minimal web service has three components:
The Service: a software component which is capable of processing an XML document. Transport, programming language etc is not important
The Document: the XML document that contains all of the application-specific information.
The Address: describes the protocol binding (e.g. TCP or HTTP) & network address that can be used to access the service. also called an endpoint or port-reference
In practice, also need an envelope (e.g. SOAP)… adds things like security/routing info without altering the
content of the message
Envelope(SOAP)
XML
XML
Web Service Interface(WSDL)
Envelope(SOAP)
SOAPServer
XML XML Java
Soap Client
XML
XML
XML
XML
Soap Client
Java
Java
Example when Client and Service implementations are Java
Web Server - Tomcat/AxisOr P2P ? Why not …
Service Oriented Architecture
The Service Oriented Architecture (SOA)
The dynamic discovery and binding of services is often described in terms of the publish, find, bind triad of operations software components should be dynamically
discoverable clear separation of the capabilities and implementation possible to quickly assemble impromptu computing
communities with minimal coordinated planning efforts, installation technicalities or human intervention.
Web Services are seen as the current optimum SOA - combination of SOA qualities plus XML message passing and stateless
This model has been adopted by Grid computing
Middlew
are(G
lobus)Users/Clients
Resources
Globus V.4
Single Sign-on
Internet Routing
VO VOGSI X. 509
Mutual Authentication
GRAM
GridFTP
MDS
MDSMDS
Web Services (OGSA)
VO
VO VO
QuickTime and aSorenson Video 3 decompressorare needed to see this picture.
Summary Grid Computing
Standardized approach Strong trusted toolbase/ security
P2P Scalability Decentralized (fault tolerance, searching) Useful File Algorithms: Bittorrent, file swarming
Web Services Standardised interfacelevel tools Loosely Coupled XML + language of your choice Easy Development
SOAs The Glue ties things together Grid computing tools exposed as services P2P Algorithms work at the services level
End of Lecture