Introduction to P2P
A short introduction to P2P computing.
Introduction to P2P systems
Davide Carboni © 2005-2006
License: Attribution-ShareAlike 2.5
You are free:
- to copy, distribute, display, and perform the work
- to make derivative works
- to make commercial use of the work
Under the following conditions:
- Attribution. You must give the original author credit.
- Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a licence identical to this one.
For any reuse or distribution, you must make clear to others the licence terms of this work. Any of these conditions can be waived if you get permission from the copyright holder. Your fair use and other rights are in no way affected by the above.
This is a human-readable summary of the Legal Code (the full licence).
P2P is about sharing resources
- Your CPU time
- Your bandwidth
- Your disk space
What is P2P
From Wikipedia: A peer-to-peer computer network is a network that relies on the computing power and bandwidth of the participants in the network rather than concentrating it in a relatively low number of servers.
P2P and GRID
From Wikipedia
Grid computing […] performs higher throughput computing by taking advantage of many networked computers to model a virtual computer architecture.
Topology Comparison
[Figure: client/server, GRID and P2P topologies compared; node labels: server, client, client=server]
Overlay
[Figure: an overlay network linking hosts at Crs4.it, an Australian ISP, and mobile phones in cell xyz]
Three main issues in P2P systems
- Bootstrapping
- Index/Lookup (query)
- Delivery of large objects (in case of file sharing)
A la Napster
Query / Query Hits
GET <file>
Copyright issues with Napster
Napster claimed that the law allows people to share music with friends.
The court rejected this argument and Napster was shut down.
Gnutella Overlay
[Figure: an overlay of Gnutella nodes, including a Requestor and a Responder]
Gnutella Messages
| Byte | Description |
|---|---|
| 0-15 | GUID |
| 16 | Message type: ping, pong, push, query, queryhit |
| 17 | TTL |
| 18 | Hops |
| 19-22 | Payload length |
| 23 onward | Payload (payload length bytes) |
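
The 23-byte header above maps naturally onto a fixed-size binary struct. The following is a minimal sketch, not a reference implementation: the little-endian byte order for the payload length and the numeric type codes (0x00 ping, 0x01 pong, 0x40 push, 0x80 query, 0x81 queryhit) are assumptions that should be checked against the protocol specification, and the names `build_header` and `parse_header` are purely illustrative.

```python
import struct
import uuid

# 16-byte GUID, 1-byte message type, 1-byte TTL, 1-byte hops, 4-byte payload length.
# "<" selects little-endian with no padding (byte order is an assumption here).
HEADER = struct.Struct("<16sBBBI")

# Assumed type codes; verify against the specification before relying on them.
PAYLOAD_TYPES = {0x00: "ping", 0x01: "pong", 0x40: "push", 0x80: "query", 0x81: "queryhit"}


def build_header(guid, payload_type, ttl, hops, payload_length):
    """Pack the five header fields into 23 bytes."""
    return HEADER.pack(guid, payload_type, ttl, hops, payload_length)


def parse_header(data):
    """Decode the first 23 bytes of a message into a dictionary."""
    guid, ptype, ttl, hops, length = HEADER.unpack_from(data)
    return {
        "guid": guid.hex(),
        "type": PAYLOAD_TYPES.get(ptype, "unknown"),
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }


raw = build_header(uuid.uuid4().bytes, 0x80, 7, 0, 12)
print(HEADER.size, parse_header(raw))
```

The payload itself would start at offset `HEADER.size` and run for `payload_length` bytes.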
Gnutella messages
- ping: discover hosts on the network
- pong: reply to ping
- query: search for a file
- query hit: reply to query
- push: download request for firewalled servents
Ref. http://rfc-gnutella.sourceforge.net/developer/stable/index.html
Gnutella: PING
Requestor
PING
Gnutella: PONG
Requestor
PONG
Gnutella: QUERY
Requestor
QUERY
Gnutella: QUERY-HITS
[Figure: QUERY-HITS messages from Responder 1 and Responder 2 travel back through the overlay (nodes A, B, C, D) to the Requestor]
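
The QUERY and QUERY-HITS exchange can be pictured as a flood limited by the TTL field. The sketch below is a simplified simulation under stated assumptions: the topology, the file names and the function `flood_query` are hypothetical, a single shared `seen` set stands in for the per-node duplicate detection by GUID, and hits are simply collected rather than routed back hop by hop.

```python
from collections import deque


def flood_query(overlay, files, start, keyword, ttl):
    """Return {responder: hops} for nodes holding `keyword` within `ttl` hops of `start`."""
    seen = {start}                      # stands in for GUID-based duplicate suppression
    queue = deque([(start, ttl, 0)])    # (node, remaining TTL, hops travelled)
    hits = {}
    while queue:
        node, ttl_left, hops = queue.popleft()
        if node != start and keyword in files.get(node, ()):
            hits[node] = hops           # this node would answer with a QUERY-HITS
        if ttl_left == 0:
            continue                    # TTL exhausted: do not forward any further
        for neighbour in overlay[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, ttl_left - 1, hops + 1))
    return hits


# Hypothetical overlay loosely based on the node names in the figure.
overlay = {
    "Requestor": ["A"],
    "A": ["Requestor", "B", "C"],
    "B": ["A", "Responder 1"],
    "C": ["A", "D"],
    "D": ["C", "Responder 2"],
    "Responder 1": ["B"],
    "Responder 2": ["D"],
}
files = {"Responder 1": {"song.mp3"}, "Responder 2": {"song.mp3"}}

print(flood_query(overlay, files, "Requestor", "song.mp3", ttl=3))  # {'Responder 1': 3}
print(flood_query(overlay, files, "Requestor", "song.mp3", ttl=4))  # {'Responder 1': 3, 'Responder 2': 4}
```

With a TTL of 3 the query never reaches Responder 2, which illustrates how the TTL bounds both the traffic and the visible part of the network.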
Gnutella: GET the file
[Figure: the Requestor sends GET file HTTP/1.1 directly to Responder 1, which returns the file]
Gnutella, behind firewalls
Requestor Responder
GET file
Gnutella, behind firewalls (2)
[Figure: the Requestor sends a PUSH message that travels through the overlay (nodes A, B, C, D) to the firewalled Responder]
Gnutella, behind firewalls (3)
Requestor
Responder
FILE
Bootstrapping in Gnutella
- X-Try
- Ping/Pong
- Storing addresses from QueryHit messages
- GWebCache
Open issues in Gnutella
- Latency
- Scalability
- Vulnerability
- Privacy
- Security
Is Gnutella obsolete?
- Alive and kicking
- Version 0.6 of the protocol avoids pure flooding and uses smart routing based on Ultrapeers
- More than 2 million users, with 500,000 nodes always up
Popularity of P2P Networks (measured by Slyck.com)
Latest statistics, taken 2006-02-26 22:14:12:
- eDonkey2K: 3,474,261 users
- FastTrack: 2,609,688 users
- Gnutella: 2,219,539 users
- Overnet: 578,521 users
- MP2P: 252,893 users
- Filetopia: 4,806 users
Hub (Gnutella2 et al.)
Hub Web
Hub Requirements
- More than 100 sockets
- CPU and RAM for servicing the network
- Uptime (more than 2 hours)
- Broadband (also for upload)
- Able to receive inbound TCP and/or UDP (IP in the global address space, no NAT)
Hub Tasks
- Keep up-to-date information about other hubs
- Manage routing tables to route messages efficiently
- Manage filters for query messages
- Monitor their own resources
Query Hash Table
- A QHT provides enough information to know that a particular node (and possibly its descendants) will not be able to provide any matching objects for a given query.
- Such queries can be discarded confidently.
- Neighbours know what their neighbours do not have, but cannot say for sure what they do have.
What is Hashing
From Wikipedia: A hash function or hash algorithm is a function for examining the input data and producing an output hash value. The process of computing such a value is known as hashing. The process of hashing has the property that two different inputs are unlikely to hash to the same hash value.
What is Hashing (2)
For an N-bit hash value, two different inputs collide with a probability of about 2^(-N).
Query Hash Table
[Figure: a table of 2^N one-bit slots, indexed 0, 1, 2, ..., 2^N - 1]
0 <= Hash(word) < 2^N
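
A query hash table of this shape can be sketched as a fixed array of flags indexed by a word hash. This is an illustrative toy, not the wire format of any protocol: the class name `QueryHashTable`, the use of SHA-1 as the word hash, the table size of 2^16 slots and the convention that 1 means "a word may be present" are all assumptions made for readability.

```python
import hashlib


class QueryHashTable:
    def __init__(self, bits=16):
        self.bits = bits
        # One byte per slot for clarity; 0 means no shared word hashes to this slot.
        self.table = bytearray(1 << bits)

    def _slot(self, word):
        """Map a word to a slot index in the range 0 .. 2**bits - 1."""
        digest = hashlib.sha1(word.lower().encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % (1 << self.bits)

    def add(self, word):
        self.table[self._slot(word)] = 1

    def might_have(self, word):
        # False is definitive; True may be a hash collision (a false positive).
        return self.table[self._slot(word)] == 1


qht = QueryHashTable()
for word in ("kademlia", "paper", "pdf"):
    qht.add(word)

print(qht.might_have("kademlia"))  # True
print(qht.might_have("napster"))
```

A negative answer lets a hub drop the query for that neighbour, while a positive answer only means the neighbour is worth asking.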
Query Filtering
Preparing the query:
- Consider all text content in the query, including generic search text and metadata search text if it is present.
- Tokenize quoted phrases into words, ignoring the phrase at this level.
Deciding whether to forward it:
- If any of the lookups based on URNs found a hit, send the query packet.
- If at least two thirds of the lookups based on words found a hit, send the query packet.
- Otherwise, drop the packet.
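
The forwarding rule can be written as a small decision function. The sketch below assumes that the table lookups are available as two callables, `has_urn` and `has_word`; these names, the `tokenize` helper and the choice to drop a query with no words and no URN hit are illustrative assumptions rather than part of the protocol description.

```python
import re


def tokenize(text):
    """Split query text into lowercase words; quoted phrases are treated as plain words."""
    return re.findall(r"\w+", text.lower())


def should_forward(query_text, urns, has_urn, has_word):
    """Apply the filter: any URN hit, or at least two thirds of the word lookups hit."""
    if any(has_urn(urn) for urn in urns):
        return True
    words = tokenize(query_text)
    if not words:
        return False  # assumption: nothing to match on, so drop
    hits = sum(1 for word in words if has_word(word))
    return 3 * hits >= 2 * len(words)  # integer form of hits / len(words) >= 2/3


known_words = {"kademlia", "xor", "metric"}
known_urns = {"urn:sha1:EXAMPLE"}  # placeholder URN, not a real hash

print(should_forward('"kademlia xor" paper', [], known_urns.__contains__, known_words.__contains__))   # True
print(should_forward("kademlia chord pastry", [], known_urns.__contains__, known_words.__contains__))  # False
print(should_forward("", ["urn:sha1:EXAMPLE"], known_urns.__contains__, known_words.__contains__))     # True
```

Comparing `3 * hits` with `2 * len(words)` avoids floating-point rounding at the two-thirds boundary.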
Distributed hashtables
Distributed Hashtables
- Main feature: a key is mapped onto a node of the network.
- Several proposals: Chord, Pastry and Kademlia.
- Lookup(key) reaches the right node in O(log N) hops, where N is the number of nodes.
Possible applications of DHT
- DNS
- Content lookup
- Web search engine
DNS over DHT (1)
Problem: how to register a name for an IP address
- Assign a name to your machine, for example ‘mymachine’.
- Check whether this name is available using the DHT operation get(‘mymachine’).
- If the result is null, register the name and the IP with the DHT operation put(‘mymachine’, 212.22..).
DNS over DHT (2)
Problem: how to resolve a name to an IP address
- Use the DHT operation get(hostname).
- The result, if not null, is the IP address you are looking for.
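
Registration and resolution both reduce to the two DHT primitives get and put. The sketch below uses an in-memory dictionary as a stand-in for the DHT, so it shows only the control flow: `ToyDHT`, `register` and `resolve` are hypothetical names, and the addresses come from the 192.0.2.0/24 documentation range rather than from the slides.

```python
class ToyDHT:
    """In-memory stand-in; a real DHT would route get/put to the node owning hash(key)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)  # None plays the role of "null"

    def put(self, key, value):
        self._store[key] = value


def register(dht, name, ip):
    """Bind name to ip if the name is still free; return True on success."""
    if dht.get(name) is not None:
        return False
    dht.put(name, ip)
    return True


def resolve(dht, name):
    """Return the IP bound to name, or None if the name is unknown."""
    return dht.get(name)


dht = ToyDHT()
print(register(dht, "mymachine", "192.0.2.10"))  # True
print(register(dht, "mymachine", "192.0.2.99"))  # False
print(resolve(dht, "mymachine"))                 # 192.0.2.10
print(resolve(dht, "unknown"))                   # None
```

Note that the check followed by the put is not atomic, so in a real network two machines could claim the same name at the same time; handling that race is outside the scope of this sketch.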
Content indexing/lookup on DHT
- A content item has a set of metadata (e.g. author, editor, genre, …).
- Build a separate DHT-based index for each metadata field.
- For example, in the index for author: put(‘john’, http://host/dir/content.avi)
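
One index per metadata field might be sketched as follows. The multi-valued store is an assumption (several items can share the same author), as are the names `MultiValueDHT` and `publish` and the second URL; a dictionary again stands in for the real DHT.

```python
from collections import defaultdict


class MultiValueDHT:
    """Toy stand-in for a DHT whose keys can hold several values."""

    def __init__(self):
        self._store = defaultdict(set)

    def put(self, key, value):
        self._store[key].add(value)

    def get(self, key):
        return sorted(self._store.get(key, ()))


# One independent index per metadata field.
indexes = {field: MultiValueDHT() for field in ("author", "editor", "genre")}


def publish(url, metadata):
    """Insert the content URL into the index of every metadata field it carries."""
    for field, value in metadata.items():
        indexes[field].put(value.lower(), url)


publish("http://host/dir/content.avi", {"author": "John", "genre": "documentary"})
publish("http://host/dir/other.avi", {"author": "John", "genre": "comedy"})  # hypothetical second item

print(indexes["author"].get("john"))   # ['http://host/dir/content.avi', 'http://host/dir/other.avi']
print(indexes["genre"].get("comedy"))  # ['http://host/dir/other.avi']
```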
How DHT works
- In a DHT each node has a node ID which belongs to a set S (for instance, the set of bit strings of length 160).
- Keys must also be hashed into the same set S (hash(key) belongs to S).
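
Mapping keys and nodes into the same 160-bit space can be illustrated with SHA-1, whose digest is exactly 160 bits long. The choice of SHA-1 and of random node IDs is an assumption made for this sketch; the function names are illustrative.

```python
import hashlib
import os

ID_BITS = 160


def key_to_id(key):
    """Hash an arbitrary string key into the set S of 160-bit integers."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


def random_node_id():
    """Pick a node ID uniformly at random from the same set S."""
    return int.from_bytes(os.urandom(ID_BITS // 8), "big")


key_id = key_to_id("mymachine")
node_id = random_node_id()
print(f"{key_id:040x}")
print(0 <= key_id < 2 ** ID_BITS, 0 <= node_id < 2 ** ID_BITS)  # True True
```

Because both kinds of identifier live in the same set, "the node responsible for a key" can be defined purely in terms of closeness between two IDs.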
Web crawlers and DHT
- Assume a network of nodes in a DHT.
- Assume each node also runs a crawler.
- For each word in a Web page, the crawler performs Put(word, URL).
- In this way a distributed index of the Web is built [1].
Web search and DHT
- When the user types a keyword ‘foo’, look it up in the DHT with Get(‘foo’).
- The DHT returns the list of URLs indexed under ‘foo’.
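
Crawling and searching together form an inverted index keyed by word. In the sketch below a local dictionary stands in for the DHT, and the URLs, page texts and the `crawl` helper are made up for illustration.

```python
import re
from collections import defaultdict

index = defaultdict(set)  # stands in for the DHT: word -> set of URLs


def put(word, url):
    index[word].add(url)


def get(word):
    return sorted(index.get(word, ()))


def crawl(url, page_text):
    """Index a page: one Put(word, URL) per distinct word found in the text."""
    for word in set(re.findall(r"[a-z0-9]+", page_text.lower())):
        put(word, url)


crawl("http://example.org/a", "Foo bar baz")
crawl("http://example.org/b", "foo and more foo")

print(get("foo"))      # ['http://example.org/a', 'http://example.org/b']
print(get("bar"))      # ['http://example.org/a']
print(get("missing"))  # []
```

In a real deployment each Put and Get would travel to the node responsible for hash(word), so popular words would concentrate load on a few nodes.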
Kademlia
- S = [00...0 - 11...1], the set of 160-bit strings
- Each node has a node ID in S
- For each key, hash(key) is in S
Kademlia distance
- Given x, y in S, define the distance d(x,y) = xor(x,y), with the result read as an integer.
- d has the following properties:
  - d(x,y) = d(y,x)
  - d(x,x) = 0
  - d(x,y) + d(y,z) >= d(x,z)
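
The three properties can be checked exhaustively on a small ID space. The sketch below uses 4-bit IDs instead of 160-bit ones purely so that every combination can be enumerated; the function name `distance` is illustrative.

```python
from itertools import product


def distance(x, y):
    """XOR metric: the bitwise XOR of two IDs, read as an integer."""
    return x ^ y


ids = range(16)  # all 4-bit IDs, small enough for an exhaustive check

# Symmetry, identity and the triangle inequality, checked over every combination.
assert all(distance(x, y) == distance(y, x) for x, y in product(ids, repeat=2))
assert all(distance(x, x) == 0 for x in ids)
assert all(distance(x, y) + distance(y, z) >= distance(x, z)
           for x, y, z in product(ids, repeat=3))

print(distance(0b0101, 0b0110))  # 3
```

The triangle inequality follows from xor(x,z) = xor(xor(x,y), xor(y,z)) together with the fact that the XOR of two non-negative integers never exceeds their sum.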
k-Buckets in kademlia
- Each node stores an array of lists: list[i], i = 0, 1, ..., 159
- list[i] stores up to k tuples: (IP, port, ID)
- list[i] stores the tuples whose ID satisfies 2^i <= d(this, ID) < 2^(i+1)
- list[i] is kept in LRS (least-recently seen) order
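
The bucket index i for a contact is the position of the highest set bit of the XOR distance, which Python exposes through `int.bit_length`. The routing table below is a simplified sketch: the class and method names are hypothetical, the example uses 4-bit IDs with k = 2 and documentation-range addresses, and a full bucket simply ignores newcomers, whereas Kademlia first pings the least-recently seen contact before deciding.

```python
def bucket_index(own_id, other_id):
    """Return i such that 2**i <= d(own_id, other_id) < 2**(i + 1)."""
    d = own_id ^ other_id
    if d == 0:
        raise ValueError("a node does not store itself")
    return d.bit_length() - 1


class RoutingTable:
    def __init__(self, own_id, id_bits=160, k=20):
        self.own_id = own_id
        self.k = k
        self.buckets = [[] for _ in range(id_bits)]  # list[i] for i = 0 .. id_bits - 1

    def seen(self, contact):
        """Record activity from contact = (ip, port, node_id)."""
        bucket = self.buckets[bucket_index(self.own_id, contact[2])]
        if contact in bucket:
            bucket.remove(contact)   # move to the tail: most recently seen
            bucket.append(contact)
        elif len(bucket) < self.k:
            bucket.append(contact)
        # else: bucket full; simplified here by ignoring the new contact


table = RoutingTable(own_id=0b0101, id_bits=4, k=2)
table.seen(("192.0.2.1", 4000, 0b0100))  # distance 1  -> list[0]
table.seen(("192.0.2.2", 4000, 0b1101))  # distance 8  -> list[3]
table.seen(("192.0.2.3", 4000, 0b1111))  # distance 10 -> list[3]
table.seen(("192.0.2.2", 4000, 0b1101))  # seen again: moves to the tail of list[3]
table.seen(("192.0.2.4", 4000, 0b1100))  # distance 9  -> list[3] is full, ignored

print([contact[2] for contact in table.buckets[3]])  # [15, 13]
```

The head of each list is therefore the least-recently seen contact and the tail the most recently seen one.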
Tree for nodes in kademlia
[Figure: binary tree of node IDs with branches labelled 0 and 1, showing the ID 0101]
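k-Buckets in kademlia (2)
- For small values of i, list[i] has few elements.
- For larger values of i, list[i] is likely to contain more elements, because the distance range from 2^i to 2^(i+1) covers 2^i IDs and so doubles with each step of i.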
Operations in kademlia
- PING (IP, port)
- STORE (key, value)
- FIND_VALUE (key)
- FIND_NODE (ID)
Lookup in Kademlia: FIND_NODE(hash(key))
- Compute D = xor(this, hash(key)).
- Find α tuples in the list[i] that covers D (e.g. α = 3).
- Send FIND_NODE(hash(key)) to those α nodes.
- Each reply carries other node addresses; repeat FIND_NODE(hash(key)) on them.
- Stop when no new addresses are received.
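
The iterative lookup can be simulated on a tiny network. Everything in this sketch is a simplification chosen for illustration: IDs are 4-bit integers, each node keeps a flat contact list instead of k-buckets, queries run one after another instead of in parallel, and the network itself, the names `find_node` and `lookup`, and the value k = 3 are hypothetical.

```python
ALPHA = 3  # number of nodes queried per round, as in the steps above
K = 3      # number of contacts returned per reply (assumed small for this example)


def find_node(network, node, target, k=K):
    """What `node` would answer to FIND_NODE(target): its k known contacts closest to target."""
    return sorted(network[node], key=lambda n: n ^ target)[:k]


def lookup(network, start, target, alpha=ALPHA, k=K):
    """Iteratively query the closest unqueried contacts until nothing new is learned."""
    shortlist = set(find_node(network, start, target, k))
    queried = {start}
    while True:
        candidates = sorted(shortlist - queried, key=lambda n: n ^ target)[:alpha]
        if not candidates:
            break
        learned = set()
        for node in candidates:
            queried.add(node)
            learned.update(find_node(network, node, target, k))
        learned.discard(start)
        if learned <= shortlist:
            break  # no new addresses were received
        shortlist |= learned
    return sorted(shortlist, key=lambda n: n ^ target)[:k]


# Hypothetical network: node ID -> IDs of the contacts it knows.
network = {
    1: [4, 9],
    4: [1, 6, 12],
    6: [4, 15],
    9: [1, 12, 14],
    12: [4, 9, 14, 15],
    14: [9, 12, 15],
    15: [6, 12, 14],
}

print(lookup(network, start=1, target=13))  # [12, 15, 14]
```

Starting from node 1, which knows only nodes 4 and 9, three rounds are enough to reach node 12, the node whose ID is closest to the target 13 under the XOR metric.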
Nodes Joining and Leaving
Whenever one node asks another for its contacts, the called node stores the contact information of the caller.
When a node joins the network it takes some of the contacts of an arbitrary node and uses them as its own.
It then performs a lookup for its own ID. This results in other nodes being called, which makes them aware of the new node's existence.
Nodes Joining and Leaving (2)
- A new node may have become the closest node to certain keys.
- The previous closest nodes will replicate the appropriate key/value pairs to the new node.
- Ignoring replication, the cost of a node joining is only O(log n) messages.
Range Query in DHT (1)
- A DHT maps a key onto a node.
- It is easy to look up a value given a key.
- It is hard to look up the values for a range of keys.
- Example 1: look up all tuples with ‘aaaa’ < key < ‘bbbb’
- Example 2: look up all tuples with ’39,88’ < lat < ’39,94’
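
The difficulty comes from hashing: keys that are adjacent in their natural order are hashed to unrelated IDs, so a range of keys does not correspond to a range of nodes. The short sketch below prints the leading hex digits of the IDs of four consecutive keys; SHA-1 is an assumed hash function and the keys are made up.

```python
import hashlib


def key_to_id(key):
    """Hash a string key into a 160-bit integer ID (SHA-1 is an assumption)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


keys = ["aaaa", "aaab", "aaac", "aaad"]  # consecutive keys in lexicographic order
ids = [key_to_id(key) for key in keys]

for key, key_id in zip(keys, ids):
    # Show only the first 8 hex digits; they typically bear no relation to the key order.
    print(key, f"{key_id:040x}"[:8])
```

Answering the range queries above with plain get would therefore mean one lookup per possible key in the range, which is why range queries need an extra layer built on top of the DHT.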
References (1)
- Napster Timeline: http://www.cnn.tv/SPECIALS/2001/napster/timeline.html
- The Gnutella Developer Forum: http://www.the-gdf.org/wiki/index.php?title=Main_Page
- History of Gnutella in ‘Gnutella’: http://ntrg.cs.tcd.ie/undergrad/4ba2.02-03/p5.html
- Slyck.com
- DHT Links: http://www.etse.urv.es/~cpairot/dhts.html
References (2)
- YACY (DHT Web search/index): http://www.yacy.net/yacy/
- Kademlia: A Peer-to-peer Information System Based on the XOR Metric (paper)
- Khashmir, Kademlia in Python: http://khashmir.sourceforge.net/
- A Case Study in Building Layered DHT Applications (paper on range query/DHT): http://www.placelab.org/publications/pubs/IRS-TR-05-001.pdf