
ALMA MATER STUDIORUM – UNIVERSITA’ DI BOLOGNA 

Introduction to Peer-to-Peer Systems

Ozalp Babaoglu

© Babaoglu Complex Systems

Introduction

■ Peer-to-peer (P2P) systems have become extremely popular and contribute to vast amounts of Internet traffic
■ P2P basic definition:
  ■ A P2P system is a distributed collection of peer nodes
  ■ Each node is both a server and a client:
    ▴ May provide services to other peers
    ▴ May consume services from other peers
  ■ Very different from the client-server model

P2P History: 1969 - 1990

■ The origins:
  ■ In the beginning, all nodes in Arpanet/Internet were peers
  ■ Each node was capable of:
    ▴ Performing routing (locate machines)
    ▴ Accepting ftp connections (file sharing)
    ▴ Accepting telnet connections (distributed computation)

P2P History: 1999 - today

■ The advent of Napster:
  ■ Jan 1999: the first version of Napster was released by Shawn Fanning, a student at Northeastern University
  ■ July 1999: Napster Inc. founded
  ■ Feb 2001: Napster closed down

Napster

[Figure: Napster Clients 1 to 4 and "Your Computer" connected to the Napster Central Index Server. Your Computer sends "Query: song.mp3" to the server, receives the answer "Client 2", and then performs "Copy: song.mp3" directly from Client 2.]

- Client/Server search
- P2P download
- Napster is not “pure P2P”
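The split between a client/server search and a P2P download can be sketched in a few lines. This is only an illustration, not the Napster protocol: the class names `CentralIndex` and `Peer`, their methods, and the peer names are hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for the central index server."""

    def __init__(self):
        self._holders = defaultdict(set)  # file name -> names of peers holding it

    def register(self, peer_name, file_names):
        for name in file_names:
            self._holders[name].add(peer_name)

    def query(self, file_name):
        # The server only answers "who has it"; it never stores the file.
        return sorted(self._holders.get(file_name, ()))


class Peer:
    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)  # file name -> content

    def download_from(self, other, file_name):
        # P2P step: the bytes travel between peers, not through the index.
        self.files[file_name] = other.files[file_name]


index = CentralIndex()
peers = {
    "client2": Peer("client2", {"song.mp3": b"audio bytes"}),
    "me": Peer("me", {}),
}
for peer in peers.values():
    index.register(peer.name, peer.files)

holders = index.query("song.mp3")                          # client/server search
peers["me"].download_from(peers[holders[0]], "song.mp3")   # P2P download
```

If the index object disappears, no peer can find a file any more even though every copy is still online, which is the sense in which such a design is not "pure P2P".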

Client/Server vs. Peer-to-Peer

Client/Server:

• Servers well connected to the “core” of the Internet
• Servers carry out critical tasks
• Clients only talk to servers

Peer-to-Peer:

• Nodes located at the “periphery of the Internet”
• Tasks distributed across all nodes
• Clients talk to other clients

Example – Video sharing

Client-Server: YouTube

Advantages

• Client can disconnect after upload
• Uploader needs little bandwidth
• Other users can find the file easily (just use search on server webpage)

Disadvantages

• Server may not accept file or remove it later (according to content policy)
• Whole system depends on the server (what if shut down like Napster?)
• Server storage and bandwidth are expensive!

[Figure: client-server diagram with one uploader and five downloaders]

Example – Video sharing

Peer-to-peer: BitTorrent

Advantages

• Does not depend on a central server
• Bandwidth shared across nodes (downloaders also act as uploaders)
• High scalability, low cost

Disadvantages

• Seeder must remain on-line to guarantee file availability
• Content is more difficult to find (downloaders must find .torrent file)
• Freeloaders cheat in order to download without uploading

[Figure: peer-to-peer diagram with one seeder and five downloaders]

P2P vs. client-server

Client-server

Asymmetric: clients and servers carry out different tasks

Global knowledge: servers have a global view of the network

Centralization: communications and management are centralized

Single point of failure: a server failure brings down the system

Limited scalability: servers easily overloaded

Expensive: server storage and bandwidth capacity is not cheap

Peer-to-peer

Symmetric: no distinction between nodes; they are peers

Local knowledge: nodes only know a small set of other nodes

Decentralization: no global knowledge, only local interactions

Robustness: several nodes may fail with little or no impact

High scalability: high aggregate capacity, load distribution

Low-cost: storage and bandwidth are contributed by users

P2P and Overlay Networks

■ Peer-to-Peer systems are usually structured as “overlays”
  ■ Logical structures built on top of a physical routed communication infrastructure (IP) that creates the illusion of a completely-connected graph
  ■ Links based on logical “knows” relationships rather than physical connectivity

Overlay networks

[Figure] Physical network: “who has a communication link to whom”

[Figure] Logical network: “who can communicate with whom”

[Figure] Overlay network (ring): “who knows whom”

[Figure] Overlay network (binary tree): “who knows whom”
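A "who knows whom" overlay is simply a neighbor set per node, independent of the physical links underneath. The sketch below builds the two shapes named above, assuming nodes are numbered 0 to n-1 and that the binary tree uses a heap-style numbering; both function names are hypothetical.

```python
def ring_overlay(n):
    """Each node knows its predecessor and successor on a ring of n nodes."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def binary_tree_overlay(n):
    """Node i knows its parent (i - 1) // 2 and its children (heap numbering)."""
    knows = {i: set() for i in range(n)}
    for i in range(1, n):
        parent = (i - 1) // 2
        knows[i].add(parent)
        knows[parent].add(i)
    return knows


ring = ring_overlay(6)
tree = binary_tree_overlay(7)
```

In the ring every node has exactly two neighbors, while in the tree the root and the leaves have fewer neighbors than the interior nodes, so the same set of machines can carry very different logical structures.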

P2P Environment

■ Completely decentralized control with limited local states
■ High latency, low bandwidth communication
■ Churn
  ■ Nodes may disconnect temporarily
  ■ New nodes are continuously joining the system, while others leave permanently
■ Security
  ■ P2P clients run on machines under the total control of their owners
  ■ Malicious users may try to bring down the system
■ Selfishness
  ■ Users may run hacked clients in order to avoid contributing resources

Why P2P?

■ Decentralization enables deployment of applications that are:
  ■ Highly available
  ■ Fault-tolerant
  ■ Self-organizing
  ■ Scalable
  ■ Difficult or impossible to shut down
■ This results in a “grassroots” approach and “democratization” of the Internet

P2P Problems

■ Overlay construction and maintenance
  ■ e.g., random, two-level, ring, etc.
■ Data location
  ■ locate a given data object among a large number of nodes
■ Data dissemination
  ■ propagate data in an efficient and robust manner
■ Global reasoning with local information
  ■ maintain local views with small per-node state
■ Tolerance to churn
  ■ maintain system invariants (e.g., topology, data location, data availability) despite node arrivals and departures

P2P Applications

■ Sharing of content:
  ■ File sharing, content delivery networks
  ■ Gnutella, eMule, Akamai
■ Sharing of storage:
  ■ Distributed file systems
■ Sharing of CPU time:
  ■ Parallel computing, Grid
  ■ SETI@home, Folding@home, FightAIDS@Home (typically not pure P2P)

P2P Topologies

■ Unstructured
■ Structured
  ■ Centralized
  ■ Hierarchical
■ Hybrid

Evaluating topologies

■ Manageability
  ■ How hard is it to keep working?
■ Information coherence
  ■ How authoritative is info?
■ Extensibility
  ■ How easy is it to grow?
■ Fault tolerance
  ■ How well can it handle failures?
■ Censorship
  ■ How hard is it to shut down?

Unstructured

■ Manageable
  ■ Difficult, many owners
■ Coherent
  ■ Difficult, unreliable peers
■ Extensible
  ■ Anyone can join in
■ Fault tolerance
  ■ Redundancy
■ Censorship
  ■ Difficult to shut down

Structured: Centralized

■ Manageable
  ■ System is all in one place
■ Coherent
  ■ Information is centralized
■ Extensible
  ■ No
■ Fault tolerance
  ■ Single point of failure
■ Censorship
  ■ Easy to shut down

Structured: Hierarchical

■ Manageable
  ■ Chain of authority
■ Coherent
  ■ Cache consistency
■ Extensible
  ■ Add more leaves, rebalance
■ Fault tolerance
  ■ Root is vulnerable
■ Censorship
  ■ Just shut down the root

Hybrid: Superpeers

■ Manageable
  ■ Same as decentralized
■ Coherent
  ■ Better than decentralized
■ Extensible
  ■ Anyone can join in
■ Fault tolerance
  ■ Redundancy
■ Censorship
  ■ Difficult to shut down

Some Common Topologies

■ Flat unstructured: a node can connect to any other node
  ■ only constraint: maximum degree $d_{max}$
  ■ fast join procedure
  ■ good for data dissemination, bad for location
■ Two-level unstructured: nodes connect to a superpeer
  ■ superpeers form a small overlay
  ■ used for indexing and forwarding
  ■ high load on superpeer
■ Flat structured: constraints based on node ids
  ■ allows for efficient data location
  ■ constraints require long join and leave procedures

Data location in unstructured networks: Flooding

■ Problem: find the set of nodes S that store a copy of object O
■ Flooding: forward the search message to all neighbors (first Gnutella protocol)
  ■ A search message contains either keywords or an object id
  ■ Advantages:
    ▴ simplicity
    ▴ no topology constraints
  ■ Disadvantages:
    ▴ high network overhead (huge traffic generated by each search request)
    ▴ flooding is stopped by a TTL (time-to-live) counter, which produces a search horizon
    ▴ only applicable to a small number of nodes

Data location in unstructured networks: Flooding

■ Flooding (cont.)
  ■ Flooding in a flat unstructured network:

[Figure: flat unstructured overlay with the search horizon for TTL = 2 and an object labeled obj]

Objects that lie outside of the horizon are not found
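The effect of the TTL on what a search can find is easy to reproduce. The following is a minimal sketch of TTL-limited flooding, not the Gnutella wire protocol; the function name `flood_search`, the message counter, and the six-node line overlay are assumptions made for the example.

```python
from collections import deque


def flood_search(neighbors, holders, start, ttl):
    """Flood from start; return (holders found within ttl hops, messages sent)."""
    found = set()
    seen = {start}
    queue = deque([(start, ttl)])
    messages = 0
    while queue:
        node, remaining = queue.popleft()
        if node in holders:
            found.add(node)
        if remaining == 0:
            continue  # TTL expired: this node is on the search horizon
        for nxt in neighbors[node]:
            messages += 1  # every forwarded copy costs network traffic
            if nxt not in seen:  # duplicates are received but not re-forwarded
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return found, messages


# Hypothetical overlay: a line 0 - 1 - 2 - 3 - 4 - 5; nodes 2 and 5 store the object.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
holders = {2, 5}
near, near_msgs = flood_search(neighbors, holders, start=0, ttl=2)
far, far_msgs = flood_search(neighbors, holders, start=0, ttl=5)
```

With TTL = 2 the copy on node 5 lies beyond the horizon and is missed; raising the TTL finds it, at the price of more messages.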

Data location in unstructured networks: Superpeers

■ Two-level overlay: use superpeers to track the locations of an object [Gnutella 2, BitTorrent]
  ■ Each node connects to a superpeer and advertises the list of objects it stores
  ■ Search requests are sent to the superpeer, which forwards them to other superpeers
  ■ Advantages:
    ▴ highly scalable
  ■ Disadvantages:
    ▴ superpeers must be reliable, powerful and well connected to the Internet (expensive)
    ▴ superpeers must maintain large state
    ▴ the system relies on a small number of superpeers

Data location in unstructured networks: Superpeers

■ Two-Level Overlay (cont.)

[Figure: two-level overlay with arrows labeled request and response, and an object labeled obj]

A two-level overlay is a partially centralized system

In some systems, superpeers may be disconnected (e.g., BitTorrent)
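The advertise-then-search flow of a two-level overlay can be sketched as follows. This is an illustrative model only: the `Superpeer` class, its methods, and the peer and object names are hypothetical, and real systems differ in how far a request is forwarded.

```python
class Superpeer:
    def __init__(self, name):
        self.name = name
        self.index = {}    # object id -> set of ordinary peers advertising it
        self.others = []   # other superpeers in the small top-level overlay

    def advertise(self, peer, objects):
        """An ordinary peer reports the list of objects it stores."""
        for obj in objects:
            self.index.setdefault(obj, set()).add(peer)

    def local_lookup(self, obj):
        return set(self.index.get(obj, ()))

    def search(self, obj):
        """Answer from the local index, then forward once to the other superpeers."""
        result = self.local_lookup(obj)
        for sp in self.others:
            result |= sp.local_lookup(obj)
        return result


sp_a, sp_b = Superpeer("A"), Superpeer("B")
sp_a.others, sp_b.others = [sp_b], [sp_a]
sp_a.advertise("peer1", ["x.mp3"])
sp_b.advertise("peer7", ["x.mp3", "y.mp3"])
hits = sp_a.search("x.mp3")
```

All index state sits in the superpeers, which is why they need to be reliable and well provisioned; clearing `others` models the case in which superpeers do not forward to one another.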

Data location in structured networks: Key-Based Routing

■ Structured networks: use a routing algorithm that implements Key-Based Routing (KBR) [Chord, Pastry, Overnet, Kad, eMule]
■ KBR (also known as Distributed Hash Tables) works as follows:
  ■ nodes are (randomly) assigned unique node identifiers (nodeId)
  ■ given a key k, the node whose nodeId is numerically closest to k among all nodes in the network is known as the root of key k
  ■ given a key k, a KBR algorithm can route a message to the root of k in a small number of hops, usually O(log n)
  ■ the location of an object with id objectId is tracked by the root of k = objectId
  ■ thus, one can find the location of an object by routing a message to the root of k = objectId and querying the root for the location of the object
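The "root of a key" rule and the way the root tracks object locations can be sketched without any routing at all. The code below assumes a 5-bit circular identifier space, measures closeness around the circle, and breaks ties toward the smaller nodeId; these three choices, the node ids, and the names `root_of`, `publish`, and `locate` are assumptions of this sketch, not part of any specific KBR system. The multi-hop O(log n) routing is deliberately not modeled.

```python
def root_of(key, node_ids, bits=5):
    """Return the nodeId numerically closest to key on a 2**bits identifier circle."""
    size = 1 << bits

    def circular_distance(a, b):
        d = abs(a - b) % size
        return min(d, size - d)

    # Ties go to the smaller nodeId (an arbitrary choice for this sketch).
    return min(node_ids, key=lambda n: (circular_distance(n, key), n))


node_ids = [1, 4, 9, 11, 14, 18, 20, 21, 28]     # hypothetical nodeIds
tracker = {n: {} for n in node_ids}              # per-node map: objectId -> holders


def publish(object_id, holder):
    """The root of k = objectId records where the object is stored."""
    tracker[root_of(object_id, node_ids)].setdefault(object_id, set()).add(holder)


def locate(object_id):
    """Ask the root of k = objectId for the object's location."""
    return tracker[root_of(object_id, node_ids)].get(object_id, set())


publish(16, "peerX")
```

Because `publish` and `locate` compute the same root from the same key, a searcher needs no prior knowledge of which node stores the object.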

Structured overlay network: Chord

Basics

Each peer is assigned a unique m-bit identifier id.
Every peer is assumed to store data contained in a file.
Each file has a unique m-bit key k.
Peer with smallest identifier $id \ge k$ is responsible for storing file with key k.
succ(k): The peer (i.e., node) with the smallest identifier $p \ge k$.

Note
All arithmetic is done modulo $M = 2^m$. In other words, if $x = k \cdot M + y$, then $x \bmod M = y$.


Example

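[Figure: Chord ring with identifiers 0 to 31. Labels in the figure:
- Peer 9 stores files with keys 5, 6, 7, 8, 9
- Peer 1 stores files with keys 29, 30, 31, 0, 1
- Peer 20 stores file with key 21 (as printed; under the succ rule key 21 belongs to peer 21, which appears as a peer in the finger tables that follow)]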

Efficient lookups

Partial view = finger table

Each node p maintains a finger table $FT_p[]$ with at most m entries:

$FT_p[i] = succ(p + 2^{i-1})$

Note: $FT_p[i]$ points to the first node succeeding p by at least $2^{i-1}$.

To look up a key k, node p forwards the request to node with index j satisfying

$q = FT_p[j] \le k < FT_p[j+1]$

If $p < k \le FT_p[1]$, the request is also forwarded to $FT_p[1]$
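A finger table follows mechanically from the formula $FT_p[i] = succ(p + 2^{i-1})$. This sketch assumes m = 5 and the peer set of the example ring; the function names are illustrative, and the returned Python list is 0-based, so list index 0 holds $FT_p[1]$.

```python
from bisect import bisect_left


def succ(k, peers, m):
    """Smallest peer identifier >= k, modulo 2**m (peers must be sorted)."""
    k %= 1 << m
    return peers[bisect_left(peers, k) % len(peers)]


def finger_table(p, peers, m):
    """Entries FT_p[1..m]; list index 0 corresponds to FT_p[1]."""
    return [succ(p + (1 << (i - 1)), peers, m) for i in range(1, m + 1)]


peers = [1, 4, 9, 11, 14, 18, 20, 21, 28]
ft4 = finger_table(4, peers, m=5)
ft28 = finger_table(28, peers, m=5)
```

Each node stores only m entries, yet the last entry reaches halfway around the ring, which is what lets a lookup shrink the remaining distance quickly.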

Example finger tables

[Figure: Chord ring with identifiers 0 to 31; each peer is annotated with its finger table, where entry i is $succ(p + 2^{i-1})$]

Peer p   FT_p[1]  FT_p[2]  FT_p[3]  FT_p[4]  FT_p[5]
1        4        4        9        9        18
4        9        9        9        14       20
9        11       11       14       18       28
11       14       14       18       20       28
14       18       18       18       28       1
18       20       20       28       28       4
20       21       28       28       28       4
21       28       28       28       1        9
28       1        1        1        4        14

Example lookup: 15@4

[Figure: Chord ring with identifiers 0 to 31 and the finger tables of the previous example]

1. $FT_4[4] \le 15 < FT_4[5] \Rightarrow 4 \to 14$
2. $p = 14 < 15 < FT_p[1] \Rightarrow 14 \to 18$


Example lookup: 22@4

[Figure: Chord ring with identifiers 0 to 31 and the finger tables of the previous example]

1. $FT_4[5] \le 22 \Rightarrow 4 \to 20$
2. $FT_{20}[1] \le 22 < FT_{20}[2] \Rightarrow 20 \to 21$
3. $p = 21 < 22 < FT_{21}[1] \Rightarrow 21 \to 28$


Example lookup: 18@20

[Figure: Chord ring with identifiers 0 to 31 and the finger tables of the previous example]

1. $p = 20 \not< 18 < FT_p[1] \not\Rightarrow 20 \to 21$
2. $FT_{20}[5] < 18 \Rightarrow 20 \to 4$
3. $FT_4[4] \le 18 < FT_4[5] \Rightarrow 4 \to 14$
4. $p = 14 < 18 \le FT_p[1] \Rightarrow 14 \to 18$
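The three lookup examples can be replayed with a short routine. This is a sketch of the forwarding rule as stated in these slides, assuming m = 5 and the example peer set; comparisons are made on clockwise distance from the current peer so that they stay meaningful across the wrap from 31 to 0. The names `dist` and `lookup` are illustrative, and finger tables are recomputed from global knowledge instead of being stored per node.

```python
from bisect import bisect_left

M = 5
SIZE = 1 << M
PEERS = [1, 4, 9, 11, 14, 18, 20, 21, 28]


def succ(k):
    return PEERS[bisect_left(PEERS, k % SIZE) % len(PEERS)]


def finger_table(p):
    return [succ(p + (1 << i)) for i in range(M)]   # list index i holds FT_p[i+1]


def dist(a, b):
    """Clockwise distance from a to b on the identifier circle."""
    return (b - a) % SIZE


def lookup(start, k):
    """Return the peers visited while resolving key k, starting at peer start."""
    path, p = [start], start
    while succ(k) != p:
        ft = finger_table(p)
        if 0 < dist(p, k) <= dist(p, ft[0]):
            nxt = ft[0]                      # k lies between p and FT_p[1]
        else:
            # farthest finger that does not pass k: FT_p[j] <= k < FT_p[j+1]
            nxt = max((q for q in ft if dist(p, q) <= dist(p, k)),
                      key=lambda q: dist(p, q))
        path.append(nxt)
        p = nxt
    return path


path_15 = lookup(4, 15)
path_22 = lookup(4, 22)
path_18 = lookup(20, 18)
```

The resulting paths match the hops listed in the examples: 4, 14, 18 for key 15; 4, 20, 21, 28 for key 22; and 20, 4, 14, 18 for key 18 starting at peer 20.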


The Chord graph

Essence
Each peer p is represented by a vertex; if $FT_p[i] = j$, add arc $\langle \overrightarrow{p, j} \rangle$, but keep the directed graph strict.

[Figure: the resulting Chord graph on peers 1, 4, 9, 11, 14, 18, 20, 21, 28]
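Building the Chord graph from the finger tables takes only a set comprehension. The sketch assumes m = 5 and the example peer set; using a set of (p, j) pairs is one simple way to keep the digraph strict, since repeated finger entries collapse into a single arc and self-loops are filtered out.

```python
from bisect import bisect_left

M = 5
PEERS = [1, 4, 9, 11, 14, 18, 20, 21, 28]


def succ(k):
    return PEERS[bisect_left(PEERS, k % (1 << M)) % len(PEERS)]


# One arc (p, j) per distinct finger target j of peer p.
arcs = {(p, succ(p + (1 << i))) for p in PEERS for i in range(M)}
arcs = {(p, j) for (p, j) in arcs if p != j}   # no self-loops in a strict digraph

outdegree = {p: sum(1 for (a, _) in arcs if a == p) for p in PEERS}
indegree = {p: sum(1 for (_, b) in arcs if b == p) for p in PEERS}
```

In this small ring the outdegrees stay within a narrow band (3 or 4) while the indegrees vary widely: peer 28 follows a long gap in the ring and is pointed to by six peers, whereas peer 21 is pointed to by one.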

Chord: path lengths

Observation
With $d_n(i, j) = \min\{|i - j|, n - |i - j|\}$, we can see that every peer is joined with another peer at distance $\frac{1}{2}n, \frac{1}{4}n, \frac{1}{8}n, \ldots, 1$.

[Figure: average path length (y-axis, 5.5 to 7.5) against network size (x-axis, 2 to 20, × 1000)]
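The observation is easy to check numerically. The sketch below assumes the idealized case in which every identifier on the ring is occupied, so that $FT_p[i] = (p + 2^{i-1}) \bmod n$; the function names are illustrative.

```python
def ring_distance(i, j, n):
    """d_n(i, j) = min(|i - j|, n - |i - j|)."""
    return min(abs(i - j), n - abs(i - j))


def finger_distances(p, m):
    """Ring distances from p to its m fingers when all n = 2**m ids are occupied."""
    n = 1 << m
    fingers = [(p + (1 << (i - 1))) % n for i in range(1, m + 1)]
    return sorted((ring_distance(p, q, n) for q in fingers), reverse=True)


d0 = finger_distances(0, 5)
d20 = finger_distances(20, 5)
```

For n = 32 both peers see the distances 16, 8, 4, 2, 1, that is n/2, n/4, n/8 and so on down to 1, regardless of where the peer sits on the ring.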

Chord: degree distribution

[Figure: two histograms of the Chord graph. Left: occurrences (y-axis, 100 to 600) against indegree (x-axis, 20 to 100). Right: occurrences (y-axis, 500 to 3500) against outdegree (x-axis, 11 to 17).]

Chord: clustering coefficient

[Figure: clustering coefficient (y-axis, 0.09 to 0.12) against network size (x-axis, 1 to 20)]

Note
CC is computed over the undirected Chord graph; x-axis shows the number of nodes (× 1000).

Data location in structured networks: Key-Based Routing

■ Structured networks
  ■ Advantages:
    ▴ completely decentralized (no need for superpeers)
    ▴ routing algorithm achieves low hop count for large network sizes
  ■ Disadvantages:
    ▴ each object must be tracked by a different node
    ▴ objects are tracked by unreliable nodes (which may disconnect)
    ▴ keyword-based searches are more difficult to implement than with superpeers (because objects are located by their objectId)
    ▴ the overlay must be structured according to a given topology in order to achieve a low hop count
    ▴ routing tables must be updated every time a node joins or leaves the overlay

Effects of Churn

■ Churn can have several effects on a P2P system:
  ■ data objects may become unavailable if all replicas disconnect
  ■ routing tables may become inconsistent (e.g., entries may point to disconnected nodes)
  ■ the overlay may become partitioned if many nodes suddenly disconnect

Churn - Preventing Partitions

■ A naïve approach to preventing partitions is to increase the average node degree
■ Ring partitions can be avoided by keeping a list of successor nodes
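How a successor list protects a ring can be sketched as follows. The list length r, the node ids, and the names `successor_lists` and `next_alive` are assumptions of this example; a real protocol would also need failure detection and periodic repair of the lists.

```python
def successor_lists(ring, r):
    """Give every node the r nodes that follow it on the sorted ring."""
    n = len(ring)
    return {node: [ring[(i + k) % n] for k in range(1, r + 1)]
            for i, node in enumerate(ring)}


def next_alive(node, succ_lists, alive):
    """First live successor of node, or None if its whole list has failed."""
    for candidate in succ_lists[node]:
        if candidate in alive:
            return candidate
    return None


ring = [1, 4, 9, 11, 14, 18, 20, 21, 28]
alive = set(ring) - {9, 11}          # two adjacent nodes fail at the same time

with_list = next_alive(4, successor_lists(ring, r=3), alive)
single = next_alive(4, successor_lists(ring, r=1), alive)
```

With a single successor pointer, node 4 is cut off as soon as node 9 fails; with a list of three it skips the two failed nodes and reconnects to node 14. A ring with lists of length r breaks only when r consecutive nodes fail together.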

Churn Tolerance

■ Node arrivals and departures must not disrupt the normal behavior of the system
  ■ system invariants must be maintained
    ▴ connected overlay (i.e., no partitions), low average path length
    ▴ data objects accessible from anywhere in the network
  ■ two types of churn tolerance:
    ▴ dynamic recovery: ability to react to changes in the overlay to maintain system invariants (e.g., heal partitions)
    ▴ static resilience: ability to continue operating correctly before adaptation occurs (e.g., route messages through alternate paths)

Security

■ Security in P2P systems is hard to enforce:
  ■ Users have full control of their computers
  ■ Modified clients may not follow the standard protocol
  ■ Data may be corrupted
  ■ Private data stored on remote computers may be disclosed

Security - Weak identities

■ The user may leave the system and rejoin it with a new identity (different user id)
■ If an attack is detected, the attacker can re-enter the system with a new id
■ An attacker may create a large number of false identities (Sybil attack)

[Figure: node A and nodes S1 to S6]

Example of Sybil attack:

- Nodes S1 to S6 are actually 6 instances of the P2P client running on the same machine
- The attacker can intercept all traffic coming from or going to node A

Security - Strong identities

■ The user cannot change its identity
■ Solution: use a centralized, trusted Certification Authority (CA)
  ■ Each new user must obtain an identity certificate
  ■ The certificate is digitally signed by the CA, whose public key is known by all users
  ■ A certificate cannot be forged (forging one requires the CA’s private key)
  ■ To prove his identity, a user signs a message with his private key, and attaches the corresponding certificate signed by the CA
■ Strong identities prevent Sybil attacks
■ If an attacker is caught, it cannot easily rejoin the system

Security - Weak vs. Strong identities

■ Strong identities require a centralized CA
  ■ New nodes must contact the CA before joining the network:
    ▴ The CA response may be slow
    ▴ If the CA is unavailable, new nodes cannot join
  ■ The security of the system depends on the CA:
    ▴ The CA must correctly verify the identity of the requester
    ▴ The CA’s private key must be secret
■ Many P2P systems use weak identities
  ■ IP addresses already give some identity information
  ■ Some systems ensure anonymity