Magellan: A Tool for Unicast Fault Isolation Cengiz Alaettinoglu Packet Design LLC Ramesh Govindan Information Sciences Institute John Mehringer Information Sciences Institute


Magellan: A Tool for Unicast Fault Isolation

Cengiz Alaettinoglu, Packet Design LLC

Ramesh Govindan, Information Sciences Institute

John Mehringer, Information Sciences Institute

Motivation

Why can't I reach www.cnn.com? Why is the Internet soooo slow today? It was fine yesterday!

Goals

User's perspective: what is of interest to the user

Internet-wide routing monitoring, not just an AS

History of route changes, not just a snapshot

Fault diagnosis: link/router failure/repair

Challenges

Scaling: directed search by correlating destinations; shared learning

Automated heuristics for fault isolation: route change; location of link/router failure/repair; oscillations; others?

Data Collection

Select targets interesting to the user: tcpdump/libpcap; weighting/aging (not implemented)

Initial path to targets: traceroute

Monitoring paths: carefully constructed ICMP probes

Snapshot

Monitoring

Construct a routing graph. Nodes: routers. Links: (to, from, source, destination, hop, statistics...)

Probe each link: send two ICMP Echo Request packets to the destination

For ttl = hop - 1 and ttl = hop, verify the incident routers (to, from)
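
The graph and the per-link check on this slide can be sketched roughly as below. This is an illustration only, not Magellan's code: the names Link, links_from_path, probe_ttls and link_intact are hypothetical, the hop numbering (the "from" router answers at TTL hop - 1, the "to" router at TTL hop) is an assumption, and probe replies are passed in as a plain dictionary instead of being read from real ICMP packets.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    # One edge of the routing graph, with the fields listed on the slide.
    from_router: str
    to_router: str
    source: str
    destination: str
    hop: int  # assumption: the TTL at which to_router answers
    statistics: dict = field(default_factory=dict)


def links_from_path(source, destination, path):
    """Build links from a traceroute path, where path[i] answered at TTL i + 1."""
    links = []
    for i in range(1, len(path)):
        links.append(Link(path[i - 1], path[i], source, destination, hop=i + 1))
    return links


def probe_ttls(link):
    """TTLs of the two Echo Requests sent toward the link's destination."""
    return (link.hop - 1, link.hop)


def link_intact(link, replies):
    """replies maps TTL -> address of the router that answered that probe."""
    ttl_from, ttl_to = probe_ttls(link)
    return (replies.get(ttl_from) == link.from_router
            and replies.get(ttl_to) == link.to_router)


# Example addresses are made up for illustration.
path = ["10.0.0.1", "10.0.1.1", "192.0.2.7"]
links = links_from_path("10.0.0.5", "198.51.100.9", path)
```

With this layout, a test is positive when both expected routers answer, and negative when either answer is missing or comes from a different router.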

Scheduling Probes

Weighted round robin (WRR): schedule a probe for each link. Limits the rate of probe packets. Weights: some links are more important/interesting

Weight factors: distance to link; number of destinations using it; history of volatility

Exponentially averaged
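
A minimal sketch of such a scheduler follows. The slide names the weight factors but not how they are combined, so the formula in link_weight is purely an assumption (nearer, more shared and more volatile links get more probes), as are the smoothing factor alpha and the names ewma, link_weight, wrr_cycle and probe_times. The round robin itself is the classic interleaved form with integer weights.

```python
def ewma(previous, sample, alpha=0.25):
    """Exponentially weighted moving average; alpha is an assumed constant."""
    return alpha * sample + (1 - alpha) * previous


def link_weight(distance, n_destinations, volatility):
    """Hypothetical combination of the three factors into an integer weight."""
    raw = n_destinations * (1.0 + volatility) / max(distance, 1)
    return max(1, round(raw))


def wrr_cycle(weights):
    """One weighted round robin cycle: a link with weight w gets w probe slots."""
    remaining = dict(weights)
    order = []
    while any(remaining.values()):
        for name in weights:
            if remaining[name] > 0:
                order.append(name)
                remaining[name] -= 1
    return order


def probe_times(order, probes_per_second):
    """Space probes evenly so the total probe rate stays bounded."""
    interval = 1.0 / probes_per_second
    return [(i * interval, name) for i, name in enumerate(order)]


weights = {"a": 3, "b": 1, "c": 2}
order = wrr_cycle(weights)
schedule = probe_times(order, probes_per_second=2.0)
```

Here link "a" is probed three times per cycle and "b" once, while the fixed spacing between slots caps the overall packet rate regardless of how many links are monitored.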

Test Result

Positive: do nothing

Negative: determine new path

Incremental traceroute from the link, upstream and downstream

Determine cause: automatic, heuristics-based
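
One way to picture the incremental traceroute is sketched below, covering only the downstream direction. It assumes that once the new path rejoins a router from the old path, the rest of the old path is still valid; that is a simplifying assumption of this sketch, not something the slides state. The callable trace_hop (returning the router that answers at a given TTL, or None) and the function name are hypothetical.

```python
def incremental_traceroute(old_path, failed_hop, trace_hop, max_ttl=30):
    """Re-trace from the failed link until the new path rejoins the old one.

    old_path[i] is the router that answered at TTL i + 1.
    failed_hop is the TTL of the failed link's far-end router.
    """
    new_path = list(old_path[:failed_hop - 1])  # upstream routers are kept
    downstream = old_path[failed_hop - 1:]
    ttl = failed_hop
    while ttl <= max_ttl:
        router = trace_hop(ttl)
        if router is None:
            break  # no answer: stop with a partial path
        new_path.append(router)
        if router in downstream:
            # Rejoined the old path: reuse its remaining hops (assumption).
            idx = old_path.index(router, failed_hop - 1)
            new_path.extend(old_path[idx + 1:])
            break
        ttl += 1
    return new_path


old = ["r1", "r2", "r3", "r4", "dst"]
actual = ["r1", "r2", "rX", "rY", "r4", "dst"]


def fake_trace(ttl):
    # Stand-in for a single-TTL probe; a real tool would send a packet.
    return actual[ttl - 1] if ttl <= len(actual) else None


new = incremental_traceroute(old, failed_hop=3, trace_hop=fake_trace)
```

In this example only TTLs 3 to 5 are probed instead of the whole path, which is the point of tracing incrementally from the affected link.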

Active Fault Isolation

Link failure: probe the link using other destinations that use it; correlate results

Router failure: generalize on link failure

Oscillations: history of old routes; back and forth between a set of routes
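
The three heuristics might look roughly like this. The decision rules are assumptions made for illustration (for example, calling a link failed only when probes via every destination that uses it fail, and requiring at least two incident links before blaming a router); the slides do not give the exact thresholds, and all function names are hypothetical.

```python
def classify_link(results):
    """results maps destination -> True if the link probe via it succeeded."""
    if not results:
        return "unknown"
    if not any(results.values()):
        return "link failure"        # fails for every destination using it
    return "not a link failure"      # works for some: cause lies elsewhere


def router_failed(router, link_status):
    """link_status maps (from_router, to_router) -> 'failed' or 'ok'."""
    incident = [s for (a, b), s in link_status.items() if router in (a, b)]
    # Assumption: need at least two incident links, all of them failed.
    return len(incident) >= 2 and all(s == "failed" for s in incident)


def count_returns(history):
    """Count how often the route switches back to a previously seen route."""
    seen, returns, prev = set(), 0, None
    for route in history:
        if route != prev:
            if route in seen:
                returns += 1
            seen.add(route)
        prev = route
    return returns


def is_oscillating(history, min_returns=2):
    """Assumed rule: two or more returns to old routes count as oscillation."""
    return count_returns(history) >= min_returns


A, B = ("r1", "r2", "dst"), ("r1", "r3", "dst")
status = {("r1", "r2"): "failed", ("r2", "dst"): "failed", ("r1", "r3"): "ok"}
```

Correlating across destinations is what separates a broken link from a route change that affects a single destination.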

Magellan Components

Magellan Nam

Perl Script

Visualization: offline or real-time; great for debugging/tuning

Snapshot

Link or router failure I want the nam buttons, etc...

Effectiveness thru Measurement

Picked 500 popular web sites: Yahoo, msn, aol, cnn, ... (www.web100.com)

Monitored routes to these destinations for 7 days

Measurements

Number of link probes: 839694; probes per second: 1.39

Total failures: 2078 (router failures: 334; link failures: 951; unknown cause: 793)

Transients: number of oscillations: 541
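
As a quick consistency check on these figures: the probe rate follows from the probe count if monitoring ran continuously for the full 7 days mentioned on the previous slide (an assumption), and the three failure categories add up to the stated total. The variable names are only for illustration.

```python
SECONDS_PER_DAY = 86400

probes = 839694
days = 7  # monitoring period from the previous slide, assumed continuous

rate = probes / (days * SECONDS_PER_DAY)  # probes per second

failures = {"router": 334, "link": 951, "unknown": 793}
total = sum(failures.values())

# Share of each cause in percent, rounded to one decimal place.
shares = {cause: round(100 * n / total, 1) for cause, n in failures.items()}

print(round(rate, 2), total, shares)
```

The rate rounds to 1.39 probes per second and the categories sum to 2078, matching the slide. Link failures make up a little under half of all failures, and over a third have no identified cause.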

Number of Path Changes

Effect of Path Length

Dominant Path

Cumulative Dominant Path

Future work: Distributed Magellan

Magellan 1

Magellan 2

Weight to probe inversely proportional to ratio of distances

Shared learning

Related Work

Topology maps: router/AS-level interconnections; Mercator, skitter, AT&T; not all links are usable (routing policy/metrics)

Routing topology: effect of policy/metrics; Npd (Vern Paxson's work); focus is on measurement

Conclusions

Unicast fault isolation: user's perspective; automated heuristics; history of changes

http://www.isi.edu/scan