Distributed Hash Tables David Tam Patrick Pang


Post on 31-Mar-2015


Page 1: Distributed Hash Tables David Tam Patrick Pang Presentation Outline What is DHT (Distributed Hash Table)? Why DHTs? Applications How lookup works? Alternatives

Distributed Hash Tables

David Tam

Patrick Pang


Presentation Outline

• What is DHT (Distributed Hash Table)?

• Why DHTs?

• Applications

• How lookup works?

• Alternatives to DHTs

• Performance – Routing

• Performance – Load Balancing

• Security – Routing Attack

• Security – Inconsistent Behaviour

• Comparison to Other Facilities

• Current Research Projects

• Conclusion


What is DHT?

(Figure adopted from Frans Kaashoek: a distributed application calls put(key, data) and get(key), which returns data, on the distributed hash table; the table itself is spread over many nodes.)

A DHT provides the information lookup service for P2P applications.

• Nodes are uniformly distributed across the key space
• Nodes form an overlay network
• Nodes maintain a list of neighbours in a routing table
• The overlay is decoupled from the physical network topology
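To make the put/get interface concrete, here is a minimal single-process sketch. It is not taken from the slides: the class name `ToyDHT`, the 16-bit identifier space, the use of SHA-1 for identifiers, and the rule that a key lives on the first node at or after its identifier are all assumptions chosen for illustration.

```python
import hashlib
from bisect import bisect_left

BITS = 16  # assumed identifier size for this toy example


def ident(name: str) -> int:
    """Hash a node name or key into the shared identifier space."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** BITS)


class ToyDHT:
    """All 'nodes' live in one process; a real DHT would route between machines."""

    def __init__(self, node_names):
        self.ring = sorted(ident(n) for n in node_names)
        self.store = {nid: {} for nid in self.ring}

    def owner(self, key: str) -> int:
        # First node identifier at or after the key's identifier, wrapping around.
        i = bisect_left(self.ring, ident(key))
        return self.ring[i % len(self.ring)]

    def put(self, key: str, data) -> None:
        self.store[self.owner(key)][key] = data

    def get(self, key: str):
        return self.store[self.owner(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c"])
dht.put("song.mp3", b"some bytes")
print(dht.get("song.mp3"))
```

Because node names and keys are hashed into the same space, nodes end up spread roughly uniformly over it, which is the first property listed above.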


Why DHTs?

Why Do We Need DHTs?

• Simplifies the development of large-scale distributed applications
• Better security and robustness
• Simple API
• Exploits P2P resources

Why Middleware?

• Simplifies the development of large-scale distributed applications
• Better security and robustness
• Simple API


Applications

• Anything that requires a hash table
• Databases, FSes, storage, archival
• Web serving, caching
• Content distribution
• Query & indexing
• Naming systems
• Communication primitives
• Chat services
• Application-layer multi-casting
• Event notification services
• Publish/subscribe systems?


How lookup works?

Example: Chord [Stoica et al.]

(Figure: Chord ring with identifier positions 0 to 15.)

Finger Table for Node 2

start   interval   succ.
3       [3,4)      5
4       [4,6)      5
6       [6,10)     7
10      [10,2)     10
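The finger tables in this example can be regenerated with a few lines of code. The sketch below assumes a 4-bit identifier space (16 ring positions) and assumes the live nodes are 1, 2, 5, 7, 10, 12, 14 and 15; that node set is inferred from the succ. columns of the tables in the slides, not stated there explicitly.

```python
M = 4                 # identifier bits, so the ring has 2**M = 16 positions
SIZE = 2 ** M
NODES = [1, 2, 5, 7, 10, 12, 14, 15]   # assumption: inferred from the succ. columns


def successor(ident):
    """First live node at or clockwise after ident."""
    return min(NODES, key=lambda n: (n - ident) % SIZE)


def finger_table(node):
    """Rows of (start, (interval_lo, interval_hi), succ) for one node."""
    rows = []
    for k in range(M):
        start = (node + 2 ** k) % SIZE
        end = (node + 2 ** (k + 1)) % SIZE   # the last interval ends at the node itself
        rows.append((start, (start, end), successor(start)))
    return rows


for start, (lo, hi), succ in finger_table(2):
    print(f"{start:>2}  [{lo},{hi})  {succ}")
```

Each finger k starts at node + 2^k, so the table has only M rows even though the ring has 2^M positions.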


How lookup works?

Example: Chord

(Figure: Chord ring with identifier positions 0 to 15.)

Finger Table for Node 10

start   interval   succ.
11      [11,12)    12
12      [12,14)    12
14      [14,2)     14
2       [2,10)     2



How lookup works?

Example: Chord

(Figure: Chord ring with identifier positions 0 to 15.)

Finger Table for Node 14

start   interval   succ.
15      [15,0)     15
0       [0,2)      1
2       [2,6)      2
6       [6,14)     7



How lookup works?

Example: Chord

(Figure: Chord ring with identifier positions 0 to 15.)

Now Node 2 can retrieve the information for key 0 from Node 1.
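The whole lookup from the preceding slides (Node 2 asks Node 10, then Node 14, and ends at Node 1) can be traced in code. This is a sketch of the interval-based forwarding the finger tables suggest, not the exact pseudocode from the Chord paper. The node set is the same inferred assumption as before, and the ownership test uses a global view of the ring, which a real node would replace with knowledge of its predecessor.

```python
M = 4
SIZE = 2 ** M
NODES = [1, 2, 5, 7, 10, 12, 14, 15]   # assumption: inferred from the finger tables


def successor(ident):
    """First live node at or clockwise after ident."""
    return min(NODES, key=lambda n: (n - ident) % SIZE)


def lookup(start_node, key):
    """Return the list of nodes visited while locating the owner of key."""
    node, path = start_node, [start_node]
    while successor(key) != node:            # stop once we stand on the owner
        d = (key - node) % SIZE              # clockwise distance still to cover
        k = d.bit_length() - 1               # finger whose interval contains key
        node = successor((node + 2 ** k) % SIZE)
        path.append(node)
    return path


print(lookup(2, 0))
```

Every hop either lands on the owner or at least halves the remaining clockwise distance, which is why the path stays short.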


Alternatives to DHTs

• Distributed file system
• Centralized lookup
• P2P flooding queries

(Figures adopted from Frans Kaashoek: clients and servers connected through the Internet, and two overlay diagrams in which a lookup travels from a Start node to a Target node N4 among nodes N1, N2, N3, ...; the second diagram also shows a DB.)


Performance -- Lookup

Purpose -- to locate a target node

• Each step, try to get closer to locating the target node
    • Ask a closer neighbour
• Performance & scalability are tied directly to the lookup algorithm

2 Aspects to Scalability

• Size of routing table -- O(log N)
• Lookup path length -- O(log N)

2 Aspects to Performance

• Path latency
• Lookup path length (# hops)

3 Techniques

• Proximity lookup
• Proximity neighbour selection
• Geographic layout
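The logarithmic path length can be checked empirically on a simulated ring. The sketch below is an assumption-laden experiment, not a measurement from the slides: it uses a 12-bit identifier space, 256 randomly placed nodes, Chord-style interval forwarding, and a fixed random seed. It only checks that no sampled lookup needs more hops than there are identifier bits.

```python
import random
from bisect import bisect_left

M = 12                 # assumed identifier bits for the experiment
SIZE = 2 ** M


def make_successor(nodes):
    """Build a successor(ident) function over a sorted ring of node ids."""
    ring = sorted(nodes)

    def successor(ident):
        return ring[bisect_left(ring, ident % SIZE) % len(ring)]

    return successor


def hops(start, key, successor):
    """Count forwarding steps from start until the owner of key is reached."""
    node, count = start, 0
    while successor(key) != node:
        d = (key - node) % SIZE
        node = successor(node + 2 ** (d.bit_length() - 1))
        count += 1
    return count


random.seed(1)
nodes = random.sample(range(SIZE), 256)
successor = make_successor(nodes)
worst = max(hops(n, k, successor) for n in nodes[:32] for k in range(0, SIZE, 64))
print("worst-case hops in sample:", worst, "with", M, "finger entries per node")
```

Hop count is only one of the two performance aspects listed above; path latency depends on how far apart consecutive hops are in the physical network, which this simulation ignores.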


Performance -- Load Balancing

Issues

• Hot-spots
    • Content
    • Lookup
• Heterogeneous nodes & paths
• System flux

Solution

• Replication is the key
    • Also good for fault-tolerance
    • Cache lookup answers backwards along path
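Both ideas in the solution can be sketched in a few lines. The placement rule used here (the owner plus the next nodes clockwise) is an assumption chosen for illustration; the slides do not say where replicas go. The ring and the path 2, 10, 14, 1 are reused from the Chord example, with the node set again inferred from its finger tables.

```python
SIZE = 16
NODES = [1, 2, 5, 7, 10, 12, 14, 15]   # assumption: ring from the Chord example


def replica_nodes(key, copies):
    """Owner of key plus the next copies-1 nodes clockwise (assumed placement)."""
    clockwise = sorted(NODES, key=lambda n: (n - key) % SIZE)
    return clockwise[:copies]


def cache_along_path(caches, path, key, value):
    """Walk back from the owner and leave a copy of the answer at each earlier hop."""
    for node in reversed(path[:-1]):          # path[-1] is the owner, which already has it
        caches.setdefault(node, {})[key] = value


caches = {}
cache_along_path(caches, [2, 10, 14, 1], key=0, value="data-for-key-0")
print(replica_nodes(0, 3), sorted(caches))
```

After this, a later query for key 0 that passes through node 10 or node 14 can be answered there, which takes lookup load off the owner of a hot key.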


Security – Incorrect Lookup (1)

• When asked for the “next hop”, give a wrong answer

(Figure: Chord ring with identifier positions 0 to 15.)

Finger Table for Node 2

start   interval   succ.
3       [3,4)      5
4       [4,6)      5
6       [6,10)     7
10      [10,2)     10

Node 2 to Node 10: Please tell me how to reach key 0 ....


Security – Incorrect Lookup (2)

• When asked for the “next hop”, give a wrong answer

(Figure: Chord ring with identifier positions 0 to 15.)

Finger Table for Node 10

start   interval   succ.
11      [11,12)    12
12      [12,14)    12
14      [14,2)     14
2       [2,10)     2

Node 2 to Node 10: Please tell me how to reach key 0 ....

Node 10 answers: ask Node 14


Security – Incorrect Lookup (3)

• When asked for the “next hop”, give a wrong answer

(Figure: Chord ring with identifier positions 0 to 15.)

Finger Table for Node 14

start   interval   succ.
15      [15,0)     15
0       [0,2)      1
2       [2,6)      2
6       [6,14)     7

Node 2 to Node 14: Please tell me how to reach key 0 ....

Node 14 answers: ask Node 10


Security – Incorrect Lookup (4)

Solution [Sit and Morris]:

• “Define verifiable system invariant”
• “Allow the querier to observe lookup progress”

Our idea of how this can be implemented:

• Concretely, use an integral, monotonically decreasing quantity to implement the idea of “progress”.
• The concept of a “monotonically decreasing quantity” has been used in program construction to guarantee total correctness. [Parnas]
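One way to read this idea: in an iterative lookup, the querier tracks the clockwise distance from the current node to the key and rejects any referral that does not strictly reduce it. The sketch below is our illustration under assumptions, not code from Sit and Morris. The node set is inferred from the finger tables, `lying_reply` models Node 14 answering "ask Node 10" as in the attack slides, and the final ownership claim is accepted without verification to keep the example short.

```python
M = 4
SIZE = 2 ** M
NODES = [1, 2, 5, 7, 10, 12, 14, 15]   # assumption: inferred from the finger tables


def dist(a, b):
    """Clockwise distance from a to b on the ring."""
    return (b - a) % SIZE


def successor(ident):
    return min(NODES, key=lambda n: dist(ident, n))


def honest_reply(node, key):
    """Answer ('owner', n) if n holds the key, else ('next', n) for the next hop."""
    if successor(key) == node:
        return ("owner", node)
    k = dist(node, key).bit_length() - 1     # finger whose interval contains key
    start = (node + 2 ** k) % SIZE
    succ = successor(start)
    if dist(start, key) <= dist(start, succ):
        return ("owner", succ)
    return ("next", succ)


def lying_reply(node, key):
    """Hypothetical attacker: Node 14 sends the querier back to Node 10."""
    return ("next", 10) if node == 14 else honest_reply(node, key)


def verified_lookup(start, key, reply):
    node, path = start, [start]
    while True:
        kind, target = reply(node, key)
        if kind == "owner":                  # not verified in this sketch
            return path if target == node else path + [target]
        if dist(target, key) >= dist(node, key):
            raise ValueError(f"node {node} made no progress: referred to {target}")
        node = target
        path.append(node)


print(verified_lookup(2, 0, honest_reply))
```

Because the distance is a non-negative integer that must shrink on every referral, the loop cannot run forever: either the lookup ends or a misbehaving node is caught.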


Security – Inconsistent Behaviour

• Inconsistent Behaviour, i.e., lie intelligibly

• Sybil attack [Kaashoek]

Solution 1: public key solution


Solution 2: Byzantine Protocol

Byzantine Generals Problem:

How to find out the traitors among the Generals? [Lamport]


(Figure: the Commander sends “attack” to both Lieutenant 1 and Lieutenant 2; one lieutenant reports “he said ‘retreat’”.)


(Figure: the Commander sends “attack” to one lieutenant and “retreat” to the other; one lieutenant reports “he said ‘retreat’”.)
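The difficulty with three generals is that a loyal lieutenant sees exactly the same messages whether the commander or the other lieutenant is the traitor. The sketch below assumes the standard setup from Lamport et al.: Lieutenant 1 receives “attack” from the commander and “he said ‘retreat’” from Lieutenant 2 in both cases. The function name and the rule that a traitorous Lieutenant 2 always relays “retreat” are illustrative assumptions.

```python
def lieutenant1_view(order_to_l1, order_to_l2, l2_is_traitor):
    """Messages Lieutenant 1 receives: the commander's order and Lieutenant 2's relay."""
    # Assumption: a traitorous Lieutenant 2 always claims the order was "retreat".
    relayed = "retreat" if l2_is_traitor else order_to_l2
    return (order_to_l1, relayed)


# Case 1: loyal commander orders "attack" to both, Lieutenant 2 lies.
traitor_is_lieutenant = lieutenant1_view("attack", "attack", l2_is_traitor=True)

# Case 2: traitorous commander sends different orders, Lieutenant 2 relays honestly.
traitor_is_commander = lieutenant1_view("attack", "retreat", l2_is_traitor=False)

print(traitor_is_lieutenant == traitor_is_commander)
```

Since the two views are identical, Lieutenant 1 cannot tell from these messages alone who the traitor is.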


Comparison to Other Facilities

Facility               Abstraction   Ease of Use/Programming   Scalability   Load-Balance
DHT                    high          high                      high          yes
Centralized Lookup     medium        medium                    low           no
P2P flooding queries   medium        high                      low           no
Distributed FS         low           medium                    medium        no

Facility               Fault-Tolerance   Self-Organizing   Administration
DHT                    high              yes               low
Centralized Lookup     low               no                medium
P2P flooding queries   depends           yes               low
Distributed FS         medium            no                high


Research Projects

Iris -- security & fault-tolerance -- US Gov’t
Chord -- circular key space
Pastry -- circular key space
Tapestry -- hypercube space
CAN -- n-dimensional key space
Kelips -- n-dimensional key space
DDS -- middleware platform for internet service construction
    -- cluster-based
    -- incremental scalability


Summary

• Good middleware platform

• Exploits P2P networks

• An exciting new research area


References

• Lamport, Leslie et al. The Byzantine Generals Problem

• Sit, Emil and Morris, Robert. Security Considerations for Peer-to-Peer Distributed Hash Tables

• Kaashoek, Frans. Distributed Hash Tables – Building large-scale, robust distributed applications

• Stoica, Ion et al. Chord: A scalable peer-to-peer lookup service for Internet applications

• Parnas, D. L. Connecting Theory to Practice: Software Engineering Programme