BitTorrent Tech Talks: DHT

DESCRIPTION
As part of BitTorrent's Tech Talks series, Arvid Norberg explains how the BitTorrent Distributed Hash Table works.

TRANSCRIPT
TECH TALKS: BitTorrent DHT | Arvid Norberg
BitTorrent, Inc / 2013 Tech Talks
INTRODUCTION
DHT = DISTRIBUTED HASH TABLE
(It’s like a hash table, where the payload is divided up across many nodes.)
The DHT is primarily used to introduce peers to each other.
[Diagram: a PEER asks the DHT “Who else is on swarm X?” and gets back peer addresses: 172.4.12.71, 19.73.53.83, 217.13.98.220, ...]
protocol <what do nodes say to each other?>
topology <how are nodes organized?>
routing <how are target nodes found?>
routing table <representation of routing info>
traversal algorithm <how is the search implemented?>
PROTOCOL
ping <are you still around?>
announce_peer <hey, i’m on this swarm!>
get_peers <who else is on this swarm?, or who can I ask?>
find_node <who is close to this ID?, or who can I ask?>
ping and announce_peer are one-off messages; get_peers and find_node are recursive.
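On the wire, each of these queries is a bencoded dictionary sent over UDP (the KRPC protocol from BEP 5). As a sketch, a minimal bencoder producing a ping query; the transaction ID "aa" and the all-"A" node ID are placeholders, not real values.

```python
# Minimal bencoder (ints, byte strings, lists, dicts) -- just enough to
# serialize a KRPC query.
def bencode(x) -> bytes:
    if isinstance(x, int):
        return b"i%de" % x
    if isinstance(x, bytes):
        return b"%d:%s" % (len(x), x)
    if isinstance(x, list):
        return b"l" + b"".join(bencode(i) for i in x) + b"e"
    if isinstance(x, dict):  # keys are byte strings, serialized in sorted order
        return b"d" + b"".join(bencode(k) + bencode(x[k]) for k in sorted(x)) + b"e"
    raise TypeError(type(x))

# A ping query: "t" = transaction ID, "y" = "q" (query), "a" = arguments.
ping = {b"t": b"aa", b"y": b"q", b"q": b"ping", b"a": {b"id": b"A" * 20}}
print(bencode(ping))
# b'd1:ad2:id20:AAAAAAAAAAAAAAAAAAAAe1:q4:ping1:t2:aa1:y1:qe'
```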
[Diagram: the PEER sends get_peers(x) to a DHT node and gets back nodes: <...>]
[Diagram: the peer repeats get_peers(x) against each batch of returned nodes, getting back nodes: <...> every time]
The lookup terminates when we receive values (i.e. peers) and no better nodes.
Each message includes the sender's own node ID. This enables nodes to learn about new nodes as part of the protocol chatter.
bootstrap <find_node(self)>
refresh buckets <find_node(bucket target)>
announce <get_peers(ih) + announce_peer(ih)>
Spoof protection:
get_peers responds with a write-token.
The write-token is typically a MAC of:
- source (IP, port)
- target info-hash
- a local secret (which may expire in tens of minutes)
announce_peer requires a valid write-token to insert the node in the peer list.
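A sketch of that scheme, assuming HMAC-SHA1 as the MAC and a single in-memory secret; the exact field layout and digest are illustrative, not prescribed by the talk.

```python
import hashlib
import hmac
import os

# Illustrative write-token: MAC over (IP, port) and the info-hash, keyed
# with a local secret. A real node rotates the secret every few minutes
# and also accepts tokens minted with the previous secret.
SECRET = os.urandom(16)

def make_token(ip: str, port: int, info_hash: bytes) -> bytes:
    msg = f"{ip}:{port}".encode() + info_hash
    return hmac.new(SECRET, msg, hashlib.sha1).digest()

def check_token(ip: str, port: int, info_hash: bytes, token: bytes) -> bool:
    return hmac.compare_digest(make_token(ip, port, info_hash), token)
```

A node only honors announce_peer if check_token passes for the announcing (IP, port), which proves the announcer can actually receive packets at the address it wants inserted into the peer list.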
TOPOLOGY
This section describes how nodes can respond to recursive lookup queries like this, with nodes closer and closer to the target.*
*Spoiler alert: it has to do with the topology.
The DHT is made up of all BitTorrent peers, across all swarms.
Each node has a self-assigned address, or node ID.
[Diagram: the same peers, each labeled with a 160-bit ID such as 71690...29736 or caa39...9087a]
Consider all nodes lined up in the node-ID space, from 0 to 2^160...
...all nodes appear evenly distributed in this space.
IDs are 160 bits (20 bytes) long. Keys in the hash table (info-hashes) are also 160 bits.
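That is no coincidence: an info-hash is the SHA-1 digest of the torrent's info dictionary, and SHA-1 outputs exactly 160 bits.

```python
import hashlib

# The real input would be the bencoded info dictionary of a torrent;
# this payload is just a stand-in.
digest = hashlib.sha1(b"example info dictionary").digest()
print(len(digest), "bytes =", len(digest) * 8, "bits")  # 20 bytes = 160 bits
```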
Nodes whose ID is close to an info-hash are responsible for storing information about it.
ROUTING
It is impractical for every node to know about every other node.
There are millions of nodes.
Nodes come and go constantly.
Every node specializes in knowing about all nodes close to itself.
The farther from itself, the more sparse its knowledge of nodes becomes.
The routing table orders nodes based on their distance from oneself.
[Diagram: node-ID space from 0 to 2^160, with self marked and distance measured outward in both directions]
[Diagram: the same nodes re-plotted by distance from self]
This is simplified; the Euclidean distance is not actually what's used.
The Euclidean distance has the problem that it “folds” the space, and nodes are no longer uniformly distributed (in the distance space).
The XOR distance metric keeps distances uniformly distributed; it doesn't “fold” the space the way the Euclidean distance does.
The XOR distance metric is: d(a,b) = a ⊕ b
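Interpreting IDs as integers, the metric is a single XOR. Two properties worth noting: a node is at distance 0 only from itself, and for a fixed a, the map b → d(a,b) is a bijection, which is why the distance space stays as uniformly populated as the ID space.

```python
def distance(a: int, b: int) -> int:
    """XOR distance between two node IDs."""
    return a ^ b

# Small 4-bit IDs for illustration.
a, b, c = 0b1010, 0b1001, 0b0110
assert distance(a, a) == 0               # identity
assert distance(a, b) == distance(b, a)  # symmetry
# XOR composes exactly: d(a, b) ^ d(b, c) == d(a, c)
assert distance(a, b) ^ distance(b, c) == distance(a, c)
```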
The distance space is divided up into buckets. Each bucket holds no more than 8 nodes.
The space covered by a bucket is half as big as the previous one: you know about more nodes close to you.
[Diagram: bucket 0, bucket 1, bucket 2, ... each covering half the span of the previous]
For every hop in a recursive lookup, the node's distance to the target is cut in half.
Lookup complexity: O(log n)
This illustration is also simplified; the XOR distance metric will make you jump back and forth a bit, but cutting your distance in half every hop still holds.
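The O(log n) bound can be sanity-checked numerically: with n nodes spread uniformly, the node nearest a target sits at distance about 2^160/n, and halving the distance every hop reaches it in about log2(n) hops. The network size below is made up for illustration.

```python
import math

n = 10_000_000            # hypothetical network size
distance = 2 ** 160       # worst-case starting distance
nearest = 2 ** 160 // n   # expected distance to the node nearest the target

hops = 0
while distance > nearest:  # each hop halves the remaining distance
    distance //= 2
    hops += 1

print(hops, math.ceil(math.log2(n)))  # both around log2(n)
```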
ROUTING TABLE
The XOR distance metric applied to the routing table just counts the length of the common bit-prefix.
OUR NODE ID:    10101110100010101001110...
OTHER NODE ID:  10101011010010110100101...
shared bit prefix: 5 bits, so the node belongs in bucket 5
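The shared-prefix length falls straight out of the XOR distance: its highest set bit marks the first bit where the two IDs differ. A sketch, with the slide's example truncated to 8 bits for readability:

```python
def bucket_index(our_id: int, other_id: int, bits: int = 160) -> int:
    """Bucket = length of the bit-prefix shared with our own ID.

    The XOR distance's highest set bit is the first differing bit,
    so the shared prefix is everything above it.
    """
    d = our_id ^ other_id
    if d == 0:
        raise ValueError("a node does not go in its own routing table")
    return bits - d.bit_length()

# The slide's example, truncated to 8 bits:
print(bucket_index(0b10101110, 0b10101011, bits=8))  # bucket 5
```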
A 160-bit space can be cut in half 160 times, so there is a max of 160 buckets.
[Diagram: the space from 0 to 2^160 divided into bucket 0 (no shared prefix), bucket 1 (prefix = 0), bucket 2 (prefix = 00), ...]
View of the routing table in node ID space (instead of distance space).
[Diagram: the node-ID space split at bit 0, bit 1, bit 2, bit 3 of self's ID, forming bucket 0, bucket 1, bucket 2, ..., converging on self]
A naïve routing table implementation would be an array of 160 buckets....
[Diagram: an array of 160 buckets: bucket 0, bucket 1, ..., bucket 5, ...]
Not very efficient, since the majority of buckets will be empty.
A typical routing table starts with only bucket 0.
When the 9th node is added, the bucket is split into bucket 0 and bucket 1, with the nodes moved to their respective bucket.
Only the highest numbered bucket is ever split.
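That growth rule can be sketched as a flat list of buckets in which only the last one ever splits. This is a simplified illustration under the assumptions above; a real table pings and evicts stale nodes rather than dropping newcomers.

```python
K = 8          # max nodes per bucket
ID_BITS = 160  # node IDs are 160 bits

def prefix_len(a: int, b: int, bits: int = ID_BITS) -> int:
    """Length of the bit-prefix shared by two IDs."""
    d = a ^ b
    return bits if d == 0 else bits - d.bit_length()

class RoutingTable:
    def __init__(self, self_id: int):
        self.self_id = self_id
        self.buckets = [[]]  # starts with only bucket 0

    def insert(self, node_id: int) -> None:
        last = len(self.buckets) - 1
        # anything too far away to have its own bucket lands in the last one
        idx = min(prefix_len(self.self_id, node_id), last)
        bucket = self.buckets[idx]
        if node_id in bucket:
            return  # already known
        if len(bucket) < K:
            bucket.append(node_id)
        elif idx == last and last < ID_BITS - 1:
            # 9th node in the last bucket: split it and redistribute
            self.buckets.append([])
            nodes = bucket + [node_id]
            bucket.clear()
            for n in nodes:
                self.insert(n)
        # else: full, non-splittable bucket -- this sketch just drops the node
```

Usage: create `RoutingTable(self_id)` once and call `insert(node_id)` for every node ID learned from protocol chatter.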
[Diagram, step by step: nodes fill bucket 0 one at a time; when the 9th node arrives, the bucket is split into bucket 0 and bucket 1, and the nodes are arranged into the correct bucket]
TRAVERSAL ALGORITHM
A deeper look at the get_peers and announce_peer queries.
Pick known nodes out of the routing table, close to the target we're looking up.
Sort by distance to target.
[Diagram: a column of (IP, node ID) entries, sorted closer to the target toward the top]
Send requests to 3 (or so) at a time.
[Diagram: get_peers(ih) is sent to the three entries closest to the target; each is marked as queried]
[Diagram: the three queried nodes each respond with nodes(...)]
Nodes are inserted in sorted order. Nodes we already have are ignored. Nodes that don't respond are marked as stale.
[Diagram: the sorted list has grown with newly learned (IP, node ID) entries; unresponsive ones stay marked stale]
Keep 3 requests outstanding at all times.
[Diagram: as responses come back, get_peers(ih) goes out to the next-closest unqueried entries]
Terminating condition: the top 8 nodes have all been queried (and responded).
Send announce_peer to the top 8 nodes.
[Diagram: announce_peer(ih) is sent to each of the top 8 nodes]
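Putting the traversal together, a compact sketch of the algorithm above. It is shown synchronously for clarity, and `query` is a stand-in for the get_peers round-trip; in a real client the announce_peer step would then go to the returned top 8.

```python
ALPHA, K = 3, 8  # requests in flight, and size of the final result set

def lookup(target: int, start_nodes, query):
    """Keep a distance-sorted candidate list, query the closest
    unqueried entries, stop once the top K have all been queried and
    responded. query(node) returns a list of node IDs, or None on
    timeout. A real client keeps ALPHA requests in flight concurrently.
    """
    results = sorted(set(start_nodes), key=lambda n: n ^ target)
    queried, stale = set(), set()
    while True:
        live = [n for n in results if n not in stale]
        top = live[:K]
        pending = [n for n in top if n not in queried]
        if not pending:
            return top  # the terminating condition from the slides
        for node in pending[:ALPHA]:
            queried.add(node)
            reply = query(node)
            if reply is None:
                stale.add(node)  # unresponsive: mark stale
            else:  # insert new nodes in sorted order, ignoring known ones
                results = sorted(set(results) | set(reply),
                                 key=lambda n: n ^ target)

# Toy network, made up for illustration: node i knows two smaller nodes,
# so every hop moves closer to target 0 (node 0 itself never responds).
network = {i: [max(0, i - 2), i // 2] for i in range(1, 64)}
closest = lookup(0, [50, 60, 63], network.get)
```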