On the Impact of Clustering on Measurement Reduction
May 14th, 2009
http://inl.info.ucl.ac.be
D. Saucez, B. Donnet, O. Bonaventure
Thanks to P. François
Université catholique de Louvain
Measurements to Improve netapps/service performance
Bandwidth?
Delay?
Loss?
3
Scalability issues with large-scale measurements
4
How to reduce the measurement overhead?
Limit the number of measured destinations: Clustering
Limit the number of measuring sources: Collaboration
5
Limit the number of measured destinations
Group destinations into Clusters
6
Clustering techniques
Geographic Clustering
Group nodes by city
n-agnostic clustering [1]
group nodes by /n prefix
AS Clustering [2]
group nodes by Autonomous System
BGP Clustering [3]
group nodes by longest match BGP prefix
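A minimal sketch of n-agnostic clustering, assuming IPv4 addresses given as strings. The function name `n_agnostic_clusters` and the sample addresses are illustrative, not taken from the talk.

```python
import ipaddress
from collections import defaultdict

def n_agnostic_clusters(addresses, n):
    """Group IPv4 addresses by their enclosing /n prefix (hypothetical helper)."""
    clusters = defaultdict(list)
    for addr in addresses:
        # strict=False masks the host bits, so 4.43.50.2/24 becomes 4.43.50.0/24
        prefix = ipaddress.ip_network(f"{addr}/{n}", strict=False)
        clusters[str(prefix)].append(addr)
    return dict(clusters)

# Illustrative input: two nodes share a /24, the third is alone in its /24
clusters = n_agnostic_clusters(["4.43.50.2", "4.43.50.77", "4.150.50.2"], 24)
print(clusters)
```

With one measurement per cluster, the three destinations above would need two measurements instead of three.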
[1] Szymaniak, M. et al., Practical large-scale latency estimation. Computer Networks, 2008
[2] Krishnamurthy, B., Wang, J., Topology modeling via cluster graphs. ACM SIGCOMM Workshop on Internet Measurement (IMW), 2001
[3] Krishnamurthy, B., Wang, J., On network-aware clustering of web clients. ACM SIGCOMM, 2000
7
How does clustering impact accuracy?
8
Evaluation setup
Maxmind + Routeviews
1 month of traceroute traces (Archipelago)
Two monitors:
san-us (San Diego, US)
bcn-es (Barcelona, ES)*
9
RTT error (bcn-es)
Geographic, AS n-agnostic, BGP
15% with more than 100% error
10% with more than 200% error
90% with less than 50% error
50% with less than 10% error
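The slides do not spell out how the RTT error is computed. The sketch below assumes a relative error: the RTT measured towards one cluster representative is used as the estimate for every node of the cluster and compared with each node's own RTT. The function names and all RTT values are hypothetical.

```python
def relative_rtt_error(estimated_ms, actual_ms):
    """Relative error of an RTT estimate (assumed definition, not from the slides)."""
    return abs(estimated_ms - actual_ms) / actual_ms

def fraction_below(errors, threshold):
    """Share of nodes whose error is strictly below the threshold."""
    return sum(1 for e in errors if e < threshold) / len(errors)

# Hypothetical cluster: the representative's RTT stands in for all members
representative_rtt = 40.0
member_rtts = [38.0, 40.0, 50.0, 80.0]
errors = [relative_rtt_error(representative_rtt, rtt) for rtt in member_rtts]

print(fraction_below(errors, 0.10))  # share of nodes with less than 10% error
print(fraction_below(errors, 0.50))  # share of nodes with less than 50% error
```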
10
Clustering reduces the number of measured destinations without losing too much accuracy...
... can we also reduce the number of measurement sources?
11
Limit the number of measuring sources
Make measurement sources collaborate
12
Collaboration fundamentals
Popular destinations are measured by several nodes
Popularity of d: number of nodes measuring d
Different collaboration approaches:
Centralized authority/measurement source
Distributed measurements (ICS)
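The popularity of a destination can be sketched as a count of distinct sources, assuming the input is a list of (source, destination) pairs such as those extracted from flow records. The names `popularity`, `flows`, `s1` and `s2` are illustrative.

```python
from collections import defaultdict

def popularity(flows):
    """Map each destination to the number of distinct sources measuring it."""
    sources_per_dest = defaultdict(set)
    for src, dst in flows:
        sources_per_dest[dst].add(src)
    return {dst: len(srcs) for dst, srcs in sources_per_dest.items()}

# Hypothetical (source, destination) pairs; the repeated pair counts once
flows = [
    ("s1", "4.43.50.2"),
    ("s2", "4.43.50.2"),
    ("s1", "4.43.50.2"),
    ("s2", "4.150.50.2"),
]
pop = popularity(flows)
print(pop)
```

Destinations with a popularity of 2 or more are the ones for which collaboration can save measurements.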
13
How much reduction can we obtain?
14
When can we observe measurement reduction?
Clustering reduces measurements if a cluster C covers at least two measured destinations
Collaboration reduces measurements if at least two topologically close sources have to measure the same destination
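The two conditions above can be illustrated by counting measurements under each scheme. This is a simplified sketch under stated assumptions: destinations are clustered by /24 prefix, and all sources are treated as one group of topologically close, collaborating nodes. Function names and sample data are hypothetical.

```python
import ipaddress

def cluster_of(addr, n=24):
    """Cluster identifier of an address: its /n prefix (assumed clustering scheme)."""
    return str(ipaddress.ip_network(f"{addr}/{n}", strict=False))

def measurement_counts(flows, n=24):
    """Count the measurements needed for a set of (source, destination) pairs."""
    pairs = set(flows)
    return {
        # one measurement per (source, destination)
        "none": len(pairs),
        # one measurement per (source, cluster)
        "clustering": len({(src, cluster_of(dst, n)) for src, dst in pairs}),
        # one measurement per destination, shared by all sources
        "collaboration": len({dst for _, dst in pairs}),
        # one measurement per cluster, shared by all sources
        "both": len({cluster_of(dst, n) for _, dst in pairs}),
    }

flows = [
    ("s1", "4.43.50.2"),
    ("s1", "4.43.50.77"),
    ("s2", "4.43.50.2"),
    ("s2", "4.150.50.2"),
]
counts = measurement_counts(flows)
gain_both = 1 - counts["both"] / counts["none"]
print(counts, gain_both)
```

In this toy input, clustering helps because one cluster covers two destinations of s1, and collaboration helps because s1 and s2 both target 4.43.50.2.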
15
Evaluation setup
Campus traffic UCL, 1 link to Belnet @1Gbps
1 month of full NetFlow traces
7.45 TB of filtered outgoing traffic
10K sources, 36M destinations
16
Will collaboration help?
74% of the destinations are contacted by only 1 source
Some destinations are contacted by 1K+ sources!
A few percent are contacted by 10+ sources
17
Will clustering help?
At least 45% of the clusters cover more than 10 nodes
[Figure: number of destinations (log scale, 1E+4 to 1E+8) for #dest, 24-agn, BGP, Geo, AS]
18
Conclusion
Clustering/Collaboration to reduce measurement overhead
Reduction/accuracy tradeoff
Simple yet efficient techniques tend to preserve accuracy
19
Questions?
http://inl.info.ucl.ac.be
20
Backup
21
Combine Clustering and Collaboration
22
Hop error (bcn-es)
0% with more than 50% error
10% with more than 50% error
The bigger the n, the smaller the error
Geographic, AS n-hybrid, n-agnostic, BGP
23
Error variation inside clusters
75th percentile
50th percentile
25th percentile
24
The reduction
Collaboration only: 40% gain
20-hyb only: 62% gain
20-hyb + Collaboration: 99% gain
Collaboration + Clustering is always better than clustering or collaboration alone
25
Are clustering and collaboration so different?
Let C be a cluster of nodes to measure
Let SC be the set of nodes measuring C
SC is a cluster
nodes in SC can collaborate
=> SC is the set of collaborating nodes
26
n-hybrid Clustering
BGP prefixes can be huge:
=> Group nodes by longest match BGP prefix down to a given length
[Figure: prefix tree rooted at 4.0.0.0/8, split into 4.0.0.0/9 and 4.128.0.0/9, with more specific prefixes 4.23.88/23, 4.43.50/24, ...; nodes A, B, C shown for addresses 4.43.50.2, 4.150.50.2, 4.200.50.2]
BGP clusters: A | B C
20-hybrid clusters: A | B | C (4.150.48.0/20, 4.200.48.0/20)
27
traceroute to 4.150.50.2 (4.150.50.2), 30 hops max, 40 byte packets
 1 192.168.1.1 (192.168.1.1) 3.535 ms 3.710 ms 3.967 ms
 2 c-69-180-16-1.hsd1.ga.comcast.net (69.180.16.1) 11.983 ms 13.665 ms 14.154 ms
 3 ge-2-1-ur01.a2atlanta.ga.atlanta.comcast.net (68.86.108.17) 17.101 ms 17.618 ms 18.499 ms
 4 te-9-1-ur02.a2atlanta.ga.atlanta.comcast.net (68.85.232.38) 17.983 ms 18.840 ms 19.282 ms
 5 te-9-3-ur01.b0atlanta.ga.atlanta.comcast.net (68.86.106.54) 20.043 ms 20.624 ms 21.441 ms
 6 po-4-ar01.b0atlanta.ga.atlanta.comcast.net (68.86.106.9) 21.963 ms 8.144 ms 12.080 ms
 7 pos-1-3-0-0-cr01.atlanta.ga.ibone.comcast.net (68.86.90.125) 14.802 ms 14.893 ms 15.513 ms
 8 te-9-1.car1.Atlanta2.Level3.net (4.71.252.29) 113.775 ms 113.945 ms 114.383 ms
 9 ae-62-51.ebr2.Atlanta2.Level3.net (4.68.103.29) 16.732 ms 17.245 ms 17.630 ms
10 ae-3.ebr2.Chicago1.Level3.net (4.69.132.73) 44.394 ms 45.461 ms 44.855 ms
11 ae-21-52.car1.Chicago1.Level3.net (4.68.101.34) 42.847 ms
   ae-21-54.car1.Chicago1.Level3.net (4.68.101.98) 41.702 ms
   ae-21-52.car1.Chicago1.Level3.net (4.68.101.34) 42.151 ms
...
traceroute to 4.200.50.2 (4.200.50.2), 30 hops max, 40 byte packets
 1 192.168.1.1 (192.168.1.1) 1.800 ms 2.745 ms 3.339 ms
 2 c-69-180-16-1.hsd1.ga.comcast.net (69.180.16.1) 11.581 ms 14.657 ms 15.170 ms
 3 ge-2-1-ur01.a2atlanta.ga.atlanta.comcast.net (68.86.108.17) 13.574 ms 17.884 ms 18.412 ms
 4 te-9-1-ur02.a2atlanta.ga.atlanta.comcast.net (68.85.232.38) 18.855 ms 19.299 ms 19.680 ms
 5 te-9-3-ur01.b0atlanta.ga.atlanta.comcast.net (68.86.106.54) 20.549 ms 21.048 ms 21.990 ms
 6 po-4-ar01.b0atlanta.ga.atlanta.comcast.net (68.86.106.9) 21.430 ms 7.738 ms 9.826 ms
 7 pos-1-4-0-0-cr01.atlanta.ga.ibone.comcast.net (68.86.90.121) 11.735 ms 12.293 ms 15.289 ms
 8 * * *
 9 ae-62-51.ebr2.Atlanta2.Level3.net (4.68.103.29) 25.935 ms 26.458 ms 26.833 ms
10 ae-63-60.ebr3.Atlanta2.Level3.net (4.69.138.4) 28.142 ms
   ae-73-70.ebr3.Atlanta2.Level3.net (4.69.138.20) 27.507 ms
   ae-63-60.ebr3.Atlanta2.Level3.net (4.69.138.4) 28.508 ms
11 ae-7.ebr3.Dallas1.Level3.net (4.69.134.21) 50.636 ms 49.957 ms *
12 ae-3.ebr2.LosAngeles1.Level3.net (4.69.132.77) 67.687 ms 61.311 ms 77.365 ms
13 ae-72-72.csw2.LosAngeles1.Level3.net (4.69.137.22) 75.953 ms
   ae-62-62.csw1.LosAngeles1.Level3.net (4.69.137.18) 68.112 ms 67.813 ms
14 ge-9-2.core1.LosAngeles1.Level3.net (4.68.102.167) 69.337 ms
   ge-5-2.core1.LosAngeles1.Level3.net (4.68.102.135) 68.195 ms
   ge-5-1.core1.LosAngeles1.Level3.net (4.68.102.71) 71.751 ms
...
Traceroute verdict*
28
N-hybrid example
4.0.0.0/8 4.0.0.0/9 4.128.0.0/9 4.20.90.56/29 4.21.103.0/24 4.224.56.0/24 4.23.112.0/24 4.23.113.0/24 4.23.114.0/24 4.23.88.0/23 4.23.88.0/24 4.23.89.0/24 4.23.92.0/22 4.23.92.0/23 4.23.94.0/23 4.36.118.0/24
4.38.0.0/20 4.38.0.0/21 4.38.8.0/21 4.43.50.0/23 4.43.50.0/24 4.43.51.0/24 4.67.104.0/21 4.67.96.0/20 4.67.96.0/21 4.78.22.0/23 4.78.56.0/23 4.79.181.0/24 4.79.201.0/26 4.79.22.0/23 4.79.248.0/24
Level 3: 4.0.0.0/8
4.43.50.2? BGP: 4.43.50.0/24, 20-hybrid: 4.43.50.0/24
4.150.50.2? BGP: 4.128.0.0/9, 20-hybrid: 4.150.48.0/20
4.200.50.2? BGP: 4.128.0.0/9, 20-hybrid: 4.200.48.0/20
BGP (Routeviews)
Natural follow-up, came for free → drawing
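The example above can be reproduced with a short sketch. It assumes that "down to a given length" means: take the longest matching BGP prefix, and if it is shorter than /n, fall back to the /n prefix of the address. The function name `n_hybrid_cluster` is illustrative, and `bgp_prefixes` is a small subset of the prefixes listed on the slide.

```python
import ipaddress

def n_hybrid_cluster(addr, bgp_prefixes, n):
    """Cluster of addr under n-hybrid clustering (assumed interpretation)."""
    ip = ipaddress.ip_address(addr)
    matches = [p for p in map(ipaddress.ip_network, bgp_prefixes) if ip in p]
    if not matches:
        return None  # address not covered by any announced prefix
    longest = max(matches, key=lambda p: p.prefixlen)
    if longest.prefixlen >= n:
        return str(longest)  # BGP prefix is already specific enough
    # BGP prefix too coarse: split it down to /n around the address
    return str(ipaddress.ip_network(f"{addr}/{n}", strict=False))

bgp_prefixes = ["4.0.0.0/8", "4.0.0.0/9", "4.128.0.0/9",
                "4.43.50.0/23", "4.43.50.0/24"]

for addr in ["4.43.50.2", "4.150.50.2", "4.200.50.2"]:
    print(addr, n_hybrid_cluster(addr, bgp_prefixes, 20))
# 4.43.50.2 4.43.50.0/24
# 4.150.50.2 4.150.48.0/20
# 4.200.50.2 4.200.48.0/20
```

Plain BGP clustering would put 4.150.50.2 and 4.200.50.2 in the same 4.128.0.0/9 cluster, although the two traceroutes end in different regions.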
29
References
[1] Xie et al., P4P: Provider Portal for Applications, in Proc. ACM SIGCOMM, 2008
[2] Aggarwal et al., Can ISPs and P2P systems co-operate for improved performance?, ACM SIGCOMM Computer Communications Review (CCR), 37(3):29–40, July 2007
[3] Saucez et al., Interdomain Traffic Engineering in a Locator/Identifier Separation Context, Internet Network Management Workshop 2008
[4] Dabek et al., Vivaldi: A Decentralized Network Coordinate System. ACM SIGCOMM, 2004
[5] Krishnamurthy, B., Wang, J., Topology modeling via cluster graphs. ACM SIGCOMM Workshop on Internet Measurement (IMW), 2001
[6] Szymaniak, M. et al., Practical large-scale latency estimation. Computer Networks, 2008
[7] Krishnamurthy, B., Wang, J., On network-aware clustering of web clients. ACM SIGCOMM, 2000