How Much Anonymity does Network Latency Leak?

Paper by: Nicholas Hopper, Eugene Vasserman, Eric Chan-Tin

Presented by: Dan Czerniewski

October 3, 2011

• The goal of every anonymous communication scheme is to let users communicate while concealing who communicates with whom.

The Idea

• Chaum proposed sending messages through a “Mix server” that mixes together messages from several senders before forwarding them to their destinations, concealing the relationship between sender and receiver.

• There are many schemes, yet all rely on “mixing” relays.

High-latency

• Like Mixmaster and Mixminion

• Deliver messages at a significant delay

• Schemes implement countermeasures that increase delay or consume bandwidth

– Pool mixing (increases delay)

– Cover traffic (consumes bandwidth)

Low-Latency

• Like Tor, I2P, AN.ON, Crowds, Anonymizer.com, and commercial proxy aggregators

• Allow anonymous use of application services

– Remote login and web browsing

• Reduced anonymity guarantees

– Security against a “local” adversary

• Malicious servers acting as local adversaries can observe the network latency of a connection made over a Tor circuit

• 3 experiments that measure the extent to which this information leakage compromises the anonymity of clients using a low-latency anonymity scheme

– Analysis of noise-free anonymity leakage

– A passive linkability attack

– An active client-identification attack

Quick overview of Tor

• Low-latency, bandwidth-efficient, anonymizing layer for TCP streams

• With use of at least 3 nodes, no node knows the identities of both communicating parties

Latency without Noise

• Assume the anonymity circuit imposes no delay at all

• The only difference between connecting to a server normally and over the anonymity service is that, in the latter, the client’s IP address is missing

• Best possible case for an attack based solely on Round-Trip Time (RTT) information.

• Analyzed on MIT King Data Set and PlanetLab systems
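
The noise-free case can be illustrated with a minimal sketch. It assumes a uniform prior over candidate clients and a known RTT from each candidate to the server; the host names, RTT values, and the `precision` parameter are made up for illustration and are not taken from the King or PlanetLab data.

```python
import math

def consistent_candidates(candidate_rtts, observed_rtt, precision):
    """Return the hosts whose known RTT lies within `precision` ms of the observation."""
    return [host for host, rtt in candidate_rtts.items()
            if abs(rtt - observed_rtt) <= precision]

def bits_leaked(n_candidates, n_consistent):
    """Reduction in entropy (bits) when a uniform set of n_candidates shrinks to n_consistent."""
    return math.log2(n_candidates / n_consistent)

# Hypothetical RTTs (ms) from eight candidate clients to the observing server.
rtts = {f"host{i}": 10.0 * i for i in range(1, 9)}

remaining = consistent_candidates(rtts, observed_rtt=35.0, precision=5.0)
leak = bits_leaked(len(rtts), len(remaining))
print(remaining, leak)
```

With these illustrative numbers the observation narrows eight candidates down to two, which under the uniform-prior assumption corresponds to two bits of leaked anonymity.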

Circuit Linking via Latency

• 2 colluding servers both accept connections from the same exit node

• The 2 servers try to determine whether they are communicating with different clients or the same client.

Are they the same?

• The servers have to calculate the RTT between each node in the connection

• The servers then have to calculate the “queueing” time for each node

• Add everything together

• If the results for the two connections are equivalent, this indicates probabilistically that the clients are the same

• If not, they come from different distributions

Attack by the server

• Sends HTML with 1000 separate tags printing empty images.

• Causes browser to make 1000 separate connections to server.

• The number of concurrent connections varies, but with up to 24 concurrent connections, fetching all 1000 images requires about 42 “rounds” of connections.
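
The page and the round count can be sketched as follows. This is only an illustration: the `/probe.gif` URL, the `id` query parameter, and the function names are hypothetical, and the slides do not say how the actual page was built.

```python
import math

def build_probe_page(n_images=1000, base_url="/probe.gif"):
    """Build an HTML page with n_images zero-sized image tags, each with a unique URL."""
    tags = [f'<img src="{base_url}?id={i}" width="0" height="0" alt="">'
            for i in range(n_images)]
    return "<html><body>\n" + "\n".join(tags) + "\n</body></html>"

def rounds_needed(n_requests, concurrent_connections):
    """Number of sequential rounds if at most `concurrent_connections` requests run at once."""
    return math.ceil(n_requests / concurrent_connections)

page = build_probe_page()
rounds = rounds_needed(1000, 24)
print(page.count("<img"), rounds)
```

Since 1000 / 24 is roughly 41.7, the final partial round brings the count to 42, matching the figure on the slide.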

2 different tests

• Comparison of means

– Confidence interval for the mean of each sample population with traversal at fixed time

• Kolmogorov-Smirnov

– Computes the largest difference between the empirical cumulative distributions of two sample sets

• Receiver Operating Characteristic Curves

– Each point corresponds to the true positive and false positive rates for one setting of the rejection thresholds
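
A minimal sketch of these tests, using only the standard library. The normal-approximation factor `z=1.96`, the "intervals overlap" decision rule, and the function names are assumptions made for illustration; the slides do not specify how the thresholds were chosen.

```python
import math
import statistics

def mean_confidence_interval(samples, z=1.96):
    """Approximate confidence interval for the mean (normal approximation, assumed z)."""
    m = statistics.mean(samples)
    half = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return m - half, m + half

def means_compatible(a, b):
    """Treat two RTT samples as 'same client' if their mean intervals overlap."""
    lo_a, hi_a = mean_confidence_interval(a)
    lo_b, hi_b = mean_confidence_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def roc_points(same_pair_distances, diff_pair_distances, thresholds):
    """One (false positive rate, true positive rate) point per acceptance threshold."""
    points = []
    for t in thresholds:
        tpr = sum(d <= t for d in same_pair_distances) / len(same_pair_distances)
        fpr = sum(d <= t for d in diff_pair_distances) / len(diff_pair_distances)
        points.append((fpr, tpr))
    return points

near = [100.0, 101.0, 99.0, 100.0, 102.0]   # made-up RTT samples (ms)
far = [150.0, 151.0, 149.0, 150.0, 152.0]
print(means_compatible(near, far), ks_statistic(near, far))
```

Here a "positive" means the two servers declare the connections to come from the same client; sweeping the threshold over the KS distance traces out the ROC curve.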

Client Location Via Latency

• Adversary is three logical entities: A_server, A_client and A_tor

• Goal is to identify V’s network latency

The Attack

• Basic Idea

– Calculate the RTT between V and E

• 3 steps

– Measuring first hop latency

– Estimating candidate RTTs

– Eliminating Candidates

Measuring first hop latency

• A_server and A_tor can determine the circuit order after several iterations

• A_client connects using the same circuit and calculates the RTT

• With this information, we can estimate the RTT from V to E
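
One way to read this step is as a subtraction of RTTs. The sketch below is an assumption about how the measurements combine, not the paper's exact procedure: it assumes the adversary's client can also measure its own direct RTT to E, and it uses the minimum of each sample set to suppress queueing noise. All names and numbers are hypothetical.

```python
def estimate_first_hop_rtt(victim_circuit_rtts, adversary_circuit_rtts, adversary_to_entry_rtts):
    """Estimate RTT(V, E) in ms.

    victim_circuit_rtts:     end-to-end RTTs seen for the victim's connection
    adversary_circuit_rtts:  end-to-end RTTs seen when the adversary's client uses the same circuit
    adversary_to_entry_rtts: direct RTTs between the adversary's client and E
    """
    # Assumption: the minimum sample is the closest to the noise-free RTT.
    victim_total = min(victim_circuit_rtts)
    adversary_total = min(adversary_circuit_rtts)
    adversary_first_hop = min(adversary_to_entry_rtts)
    # RTT contributed by the circuit from E onward, with the adversary's own first hop removed.
    circuit_only = adversary_total - adversary_first_hop
    return victim_total - circuit_only

estimate = estimate_first_hop_rtt([250.0, 260.0, 245.0], [230.0, 228.0, 240.0], [30.0, 32.0])
print(estimate)
```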

Estimating candidate RTTs

• Need a method to obtain the RTT between two hosts without explicit cooperation of either

• Vivaldi embedding algorithm

– Ease of implementation

– Disadvantage: in order to be accurate without cooperation, several nodes must be used for the service
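
For intuition, a single Vivaldi coordinate update can be sketched as below. This is the basic fixed-timestep form of the algorithm; the `delta` value, the 2-D coordinates, and the function names are assumptions for illustration, and the slides do not say which Vivaldi variant or parameters were used.

```python
import math
import random

def vivaldi_update(xi, xj, measured_rtt, delta=0.25):
    """Move node i's coordinate so its distance to node j better matches the measured RTT."""
    diff = [a - b for a, b in zip(xi, xj)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        # Coincident coordinates: pick an arbitrary direction to separate them.
        diff = [random.uniform(-1.0, 1.0) for _ in xi]
        dist = math.sqrt(sum(d * d for d in diff)) or 1.0
    unit = [d / dist for d in diff]
    error = measured_rtt - dist  # positive: the nodes are too close in the embedding
    return [a + delta * error * u for a, u in zip(xi, unit)]

def predicted_rtt(xi, xj):
    """RTT predicted by the embedding: Euclidean distance between coordinates."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

# Node i starts 10 units from node j, but the measured RTT is 20 ms.
x_i, x_j = [0.0, 0.0], [10.0, 0.0]
for _ in range(50):
    x_i = vivaldi_update(x_i, x_j, measured_rtt=20.0)
print(predicted_rtt(x_i, x_j))
```

Once enough nodes have been embedded this way, the RTT between two hosts can be estimated from their coordinates alone, without asking either host to measure it.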

Eliminating Candidates

• Check all candidates to see if their RTTs are consistent with the estimated RTT between V and E

• Accepting means the candidate’s RTT is within 85% of the estimated RTT between V and E
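
The elimination step can be sketched as a simple filter. The slide does not define "within 85%" precisely, so the relative tolerance `rel_tol` below is left as an explicit parameter; the value used in the example is only one possible reading, and the candidate names and RTTs are made up.

```python
def eliminate_candidates(candidate_rtts_to_entry, estimated_rtt, rel_tol):
    """Keep candidates whose RTT to E is within rel_tol (relative) of the estimate.

    rel_tol is an assumption: mapping the slide's "85%" figure onto it is not
    specified in the source.
    """
    return [host for host, rtt in candidate_rtts_to_entry.items()
            if abs(rtt - estimated_rtt) <= rel_tol * estimated_rtt]

# Hypothetical predicted RTTs (ms) from each candidate to E.
candidates = {"a": 45.0, "b": 52.0, "c": 90.0, "d": 12.0}
survivors = eliminate_candidates(candidates, estimated_rtt=47.0, rel_tol=0.15)
print(survivors)
```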

Limitations

• Limited data on conditional information gain

• Client location attack assumes that a user repeatedly accesses a server from the same network location

Mitigation

• Onion routing minimizes the success probability of Murdoch-Danezis’ clogging attack

• Adding sufficient delays to make the RTT and timing characteristics of Tor servers independent of the underlying network topology
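
As a rough illustration of the delay idea, a relay could pad every response up to a fixed target so that the observed timing no longer reflects the real path. This is a hypothetical sketch, not Tor's behavior; `target_ms` and the function name are assumptions.

```python
def padding_delay(elapsed_ms, target_ms):
    """Extra delay (ms) to add so that the total time reaches target_ms; never negative."""
    return max(0.0, target_ms - elapsed_ms)

# Two paths with different real latencies look identical after padding to 300 ms.
fast_path, slow_path = 80.0, 210.0
observed = [t + padding_delay(t, target_ms=300.0) for t in (fast_path, slow_path)]
print(observed)
```

The padding only hides the difference while the target exceeds the real latency, which is why the delays have to be sufficiently large.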

Thank you!

• Questions?
