Determining the Geographic Location of Internet Hosts Venkata N. Padmanabhan Microsoft Research Lakshminarayanan Subramanian University of California at Berkeley SIGMETRICS 2001


Page 1

Determining the Geographic Location of Internet Hosts

Venkata N. Padmanabhan

Microsoft Research

Lakshminarayanan Subramanian

University of California at Berkeley

SIGMETRICS 2001

Page 2

Background

Location-aware services are relevant in the Internet context too
  targeted advertising
  event notification
  territorial rights management

Existing approaches
  user input: burdensome, error-prone
  whois: manual updates; host may not be at the registered location

Goal: estimate location based on client IP address
  challenging problem because an IP address does not inherently indicate location

Page 3

IP2Geo

Multi-pronged approach that exploits various “properties” of the Internet
  DNS names of router interfaces often indicate location
  Network delay tends to correlate with geographic distance
  Hosts that are aggregated for the purposes of Internet routing also tend to be clustered geographically

GeoTrack
  determine location of closest router with recognizable DNS name

GeoPing
  use delay measurements to triangulate location

GeoCluster
  extrapolate partial IP-to-location mapping information using cluster information derived from BGP routing data
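
The GeoTrack idea can be illustrated with a minimal sketch. It assumes the router path toward the target is already available as a list of DNS names (for example from a traceroute), and that location codes appear as dot- or hyphen-separated tokens. The code table, the hostnames, and the function names are hypothetical and are not taken from the slides.

```python
import re

# Hypothetical table of location codes that may appear in router names.
CITY_CODES = {
    "sea": "Seattle, WA",
    "sfo": "San Francisco, CA",
    "chi": "Chicago, IL",
    "nyc": "New York, NY",
}

def location_from_name(hostname):
    """Return the location of the first token that is a known code, else None."""
    for token in re.split(r"[.\-]", hostname.lower()):
        token = token.rstrip("0123456789")  # "nyc3" -> "nyc"
        if token in CITY_CODES:
            return CITY_CODES[token]
    return None

def geotrack(path):
    """path: DNS names ordered from the probe toward the target (None = no name).
    Walk back from the target and return the location of the closest
    router whose name is recognizable, or None if there is none."""
    for hostname in reversed(path):
        if hostname is None:
            continue
        location = location_from_name(hostname)
        if location is not None:
            return location
    return None

# Made-up path for illustration only.
example_path = [
    "gw1.sea.example.net",
    "core2-chi.example.net",
    "edge7.nyc3.example.net",
    None,                    # hop without a DNS name
    "host42.example.org",    # target; its name carries no location code
]
print(geotrack(example_path))
```

The estimate is only as good as the last recognizable router: if the final hops have no usable names, the result may be far from the host.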

Page 4

GeoPing

Delay-based triangulation is conceptually simple
  delay → distance
  distance from 3 or more non-collinear points → location

But there are practical difficulties
  network path may be circuitous
  transmission and queuing delays may corrupt delay estimate
  one-way delay is hard to measure

GeoPing
  delay is measured from several distributed probes
  minimum delay among several samples is picked
  Nearest Neighbor in Delay Space (NNDS) algorithm
    construct a delay map containing (delay vector, location) tuples
    given a delay vector, search through the delay map for closest match
    location corresponding to the closest match is our location estimate
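
The NNDS steps can be sketched in a few lines. The slide does not say how "closest match" is measured, so Euclidean distance between delay vectors is an assumption here; the delay values, locations, and function names are made up for illustration.

```python
import math

def min_delays(samples_per_probe):
    """Keep the minimum of several delay samples (ms) for each probe."""
    return [min(samples) for samples in samples_per_probe]

def delay_distance(u, v):
    """Euclidean distance between two delay vectors (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nnds(delay_map, target_vector):
    """delay_map: list of (delay vector, location) tuples.
    Return the location whose delay vector is nearest to target_vector."""
    _, location = min(
        delay_map, key=lambda entry: delay_distance(entry[0], target_vector)
    )
    return location

# Hypothetical delay map built from hosts at known locations (3 probes).
delay_map = [
    ((10.0, 45.0, 80.0), "Seattle"),
    ((48.0, 12.0, 40.0), "Chicago"),
    ((82.0, 38.0, 9.0), "New York"),
]

# Several delay samples per probe for the target host.
samples = [[55.0, 47.0, 60.0], [15.0, 14.0, 30.0], [41.0, 52.0, 43.0]]
target = min_delays(samples)
print(nnds(delay_map, target))
```

Taking the minimum sample per probe is meant to discount queuing delay, which only ever adds to the measurement.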

Page 5

Validation of Delay-based Approach

[Figure: cumulative probability (0 to 1) versus geographic distance (0 to 5000 kilometers), with one curve for each delay range: 5-15 ms, 25-35 ms, 65-75 ms]

Delay tends to increase with geographic distance

Page 6

Impact of the Number of Probes

[Figure: error distance (0 to 1800 km) versus number of probes (0 to 15); legend: 25th, 50th, 75]

Highest accuracy when 7-9 probes are used

Page 7

GeoCluster

Basic idea
  divide up the space of IP addresses into clusters using BGP prefixes
  use partial IP-to-location mapping data to infer location of each cluster
  given target IP address, find matching cluster via longest-prefix match
  location of the matching cluster is our estimate of host location
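
The lookup step can be sketched with Python's standard `ipaddress` module. The cluster table below is hypothetical (documentation address ranges with made-up locations), and the linear scan is chosen for brevity; a real table of BGP prefixes would more likely sit in a trie.

```python
import ipaddress

# Hypothetical cluster table: prefix -> location inferred for that cluster.
CLUSTERS = {
    ipaddress.ip_network("198.51.100.0/24"): "Seattle",
    ipaddress.ip_network("198.51.100.128/25"): "Portland",
    ipaddress.ip_network("203.0.113.0/24"): "Chicago",
}

def locate(ip_string, clusters=CLUSTERS):
    """Return the location of the longest prefix containing the address,
    or None if no cluster matches."""
    address = ipaddress.ip_address(ip_string)
    matches = [net for net in clusters if address in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)  # longest prefix wins
    return clusters[best]

print(locate("198.51.100.200"))  # covered by both the /24 and the /25
print(locate("198.51.100.5"))    # covered by the /24 only
print(locate("192.0.2.1"))       # no matching cluster
```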

Issues
  partial IP-to-location mapping information may not be entirely accurate
  BGP prefixes might not correspond to geographic clusters

Sub-clustering algorithm
  use partial IP-to-location mapping information to test whether a BGP prefix is likely to correspond to a geographic cluster
  if the test is negative, divide the prefix into two and recursively apply the test to each half
  in the end we are only left with geographically clustered prefixes
  dispersion offers an indication of the accuracy of a location estimate
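
The recursive split can be sketched as follows. The slide does not define the clustering test, so the one used here is an assumption: a prefix passes if it has at least `min_points` known addresses and at least a fraction `min_agreement` of them share one location. Those parameters, the function name, and the sample data are hypothetical.

```python
import ipaddress
from collections import Counter

def sub_cluster(network, known, min_points=3, min_agreement=0.7):
    """Return {network: location} for geographically clustered sub-prefixes.
    network: an ipaddress network object (for example one BGP prefix).
    known: dict of IP string -> location (the partial mapping data).
    The test based on min_points and min_agreement is an assumed stand-in."""
    locations = [loc for ip, loc in known.items()
                 if ipaddress.ip_address(ip) in network]
    if len(locations) < min_points:
        return {}                       # too little data to judge this prefix
    location, count = Counter(locations).most_common(1)[0]
    if count / len(locations) >= min_agreement:
        return {network: location}      # test positive: keep as one cluster
    if network.prefixlen == network.max_prefixlen:
        return {}                       # single address, cannot split further
    result = {}
    for half in network.subnets(prefixlen_diff=1):  # divide the prefix into two
        result.update(sub_cluster(half, known, min_points, min_agreement))
    return result

# Made-up partial IP-to-location data for one prefix.
known = {
    "198.51.100.10": "Seattle",
    "198.51.100.20": "Seattle",
    "198.51.100.30": "Seattle",
    "198.51.100.130": "Portland",
    "198.51.100.140": "Portland",
    "198.51.100.150": "Portland",
}
clusters = sub_cluster(ipaddress.ip_network("198.51.100.0/24"), known)
for net, loc in clusters.items():
    print(net, loc)
```

In this example the /24 fails the test because its known addresses split evenly between two locations, while each /25 half passes. Prefixes with too few known addresses are dropped instead of guessed at.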

Page 8

Performance of IP2Geo

[Figure: cumulative probability (0 to 1) versus error distance (0 to 4000 kilometers), with one curve each for GeoTrack, GeoPing, and GeoCluster]

Median error: GeoCluster: 28 km, GeoTrack: 102 km, GeoPing: 382 km
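
For readers who want to compute a comparable metric, here is a minimal sketch of error distance and its median. It assumes error distance means the great-circle distance between the estimated and the actual location, which the slides do not state; the haversine formula, the Earth radius constant, and the function names are choices made for this sketch.

```python
import math
import statistics

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def error_distance_km(estimate, actual):
    """Great-circle (haversine) distance in km between two
    (latitude, longitude) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*estimate, *actual))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def median_error_km(pairs):
    """pairs: iterable of (estimated location, actual location)."""
    return statistics.median(error_distance_km(e, a) for e, a in pairs)

# Made-up (estimate, actual) pairs along the equator.
pairs = [
    ((0.0, 0.0), (0.0, 0.0)),
    ((0.0, 0.0), (0.0, 1.0)),
    ((0.0, 0.0), (0.0, 2.0)),
]
print(round(median_error_km(pairs), 1))
```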

Page 9

Summary

IP2Geo combines several techniques that leverage different sources of information
  GeoTrack: DNS names
  GeoPing: network delay
  GeoCluster: address aggregates used for routing

Median error varies between 20 and 400 km

Even a 30% success rate is useful, especially since we can tell when the estimate is likely to be accurate

Forthcoming paper at SIGCOMM 2001

For more information visit: http://www.research.microsoft.com/~padmanab/