Internet Routing Instability
Three Papers Presented by Michael A. Smith

Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability." IEEE/ACM Transactions on Networking, 6(5):515-528, 1998.

Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Origins of Internet Routing Instability." IEEE INFOCOM 1999.

Craig Labovitz, Abha Ahuja, Farnam Jahanian, "Experimental Study of Internet Stability and Backbone Failures." FTCS 1999.




Spring 2006    Internet Routing Instability    2 of 50

Background

- Events
  - NSFNet backbone ended in April ’95
  - Evident network degradation:
    - bandwidth shortages
    - lack of router switching capacity
- “Death of Internet is Imminent” reported by popular press
- Routing Instability (“route flaps”)
  - Informally defined as: “the rapid change of network reachability and topology information”


The Internet Backbone

- 12 large ISPs, tier one
- 4000-6000 tier two providers
- Large public exchange points are considered the “core” of the Internet.
- Backbone service providers must maintain a complete map, or default-free routing table.
- The Internet is divided into different regions of administrative control called autonomous systems (AS’s).
- Most AS’s exchange routing information through the Border Gateway Protocol (BGP).


Routing Instability

- Origins
  - Router configuration errors
  - Transient physical and data link problems
  - Software bugs
- Effects
  - Poorer end-to-end network performance
  - Degradation of overall efficiency of the Internet infrastructure


Route Flaps

- Result in a large number of routing updates passed to core Internet exchange point routers.
- Network instability spreads from router to router and propagates throughout the network.
- Effects on Internet infrastructure:
  - Increased packet loss
  - Delays in time for network convergence
  - Resource overhead (CPU, memory, etc.)


BGP

- An incremental protocol
- Does not flood the intra-domain network with topological information or link state entries (unlike IGRP and OSPF)
- Sends update information only upon changes in topology or policy
- Uses TCP as underlying transport mechanism (as opposed to reliability through datagram service)
- As a path vector routing protocol, it limits the distribution of reachability information.


Routing on the Backbone

- path: sequence of intermediate AS’s between source and destination routers that form a directed route for packets to travel
- Router configuration files allow the stipulation of routing policies which may:
  - specify the filtering of specific routes
  - modify path attributes before sharing
- Policy decisions can be made based on:
  - announcement of routes from peers
  - attributes of announced routes (such as MED’s)
- After each router makes a new local decision on the best route to a destination, it sends that route to its peers.
- As the route propagates, each AS appends its unique number to the route’s ASPATH, which, in conjunction with the prefix, provides a specific handle for transit.
- The ASPATH mechanism allows a router to detect and prevent routing loops.
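The loop check described in the last bullet can be sketched in a few lines. This is a minimal illustration, not router code: the function names `accept_route` and `propagate` and the AS numbers are assumptions made up for the example.

```python
from typing import List, Optional


def accept_route(local_as: int, as_path: List[int]) -> bool:
    """Reject any route whose ASPATH already contains our own AS number."""
    return local_as not in as_path


def propagate(local_as: int, as_path: List[int]) -> Optional[List[int]]:
    """Return the ASPATH to advertise onward, or None if the route loops."""
    if not accept_route(local_as, as_path):
        return None
    # The slide says each AS "appends" its number; this sketch puts it at
    # the front of the list. The position does not affect loop detection.
    return [local_as] + as_path


print(propagate(65001, [65002, 65003]))         # loop-free: path grows by one AS
print(propagate(65002, [65001, 65002, 65003]))  # our AS already present: rejected
```

A route that comes back to an AS already on its path is dropped instead of re-advertised, which is how the ASPATH prevents loops without any global view of the topology.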


Routing Information in BGP

- Two forms:
  - Announcements
    - Indicate that a router has either learned a new network attachment or has made a policy decision to prefer a different route to a destination.
  - Withdrawals
    - Sent when a router decides that a network is no longer reachable
    - Paper distinguishes between:
      - Explicit – associated with actual withdrawal message
      - Implicit – existing route replaced by new route
- A BGP update may contain multiple announcements and withdrawals.
  - Ideally, routers should only generate routing updates for relatively infrequent policy changes and the addition of new physical networks.
- It’s been found that BGP’s ASPATH mechanism is not sufficient to ensure network convergence.
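The difference between explicit and implicit withdrawals is easiest to see against a per-peer routing table. The sketch below is an assumption-laden illustration: a route is reduced to its ASPATH, the table is a plain dictionary, and `apply_update` and its return labels are names invented for this example.

```python
from typing import Dict, Optional, Tuple

Route = Tuple[int, ...]  # a route is reduced to its ASPATH for this sketch


def apply_update(table: Dict[str, Route], prefix: str,
                 route: Optional[Route]) -> str:
    """Apply one announcement (route given) or withdrawal (route is None)."""
    old = table.get(prefix)
    if route is None:
        # A withdrawal message removes the route outright.
        table.pop(prefix, None)
        return "explicit withdrawal" if old is not None else "withdrawal of unknown prefix"
    table[prefix] = route
    if old is None:
        return "new announcement"
    # Announcing over an existing route replaces it without a withdrawal message.
    return "implicit withdrawal" if old != route else "duplicate announcement"


table: Dict[str, Route] = {}
print(apply_update(table, "192.0.2.0/24", (65001, 65002)))
print(apply_update(table, "192.0.2.0/24", (65003, 65002)))
print(apply_update(table, "192.0.2.0/24", None))
```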


Methodology of Studies

- Geographically diverse exchange points.
- Although the route servers do not forward network traffic, the route servers do peer with over 90% of the service providers at each exchange point.


Route Tracker Architecture

- Developed on Sun workstations
- Uses MRT and IPMA toolkits to analyze BGP updates


“Internet Routing Instability”

- Monitored BGP updates generated by five service provider backbone routers at the major U.S. public exchange points over a period of nine months.
- Paper distinguishes three types of updates:
  - forwarding instability – may reflect legitimate topological changes and affects the paths on which data will be forwarded
  - routing policy fluctuation – reflects changes in routing policy information that do not affect forwarding paths
  - pathological – updates are redundant BGP information that reflect neither routing nor forwarding instability
- Instability is defined as:
  - an instance of either forwarding instability or policy fluctuation
- Data reflects the stability of inter-domain Internet routing, or changes in topology or policy among AS’s
- “Intra-domain routing instability is not explicitly measured and is only indirectly observed through BGP information exchanged with a domain’s peer.”


Results of Study

- The number of BGP updates exchanged per day in the Internet core is one or more orders of magnitude larger than expected.
- Routing information is dominated by pathological, or redundant, updates, which may not reflect changes in routing policy or topology.
- Instability and redundant updates exhibit a specific periodicity of 30 and 60 seconds.
- Instability and redundant updates show a surprising correlation to network usage and exhibit corresponding daily and weekly cyclic trends.
- Instability is not dominated by a small set of autonomous systems or routes.


Results of Study (2)

- Instability and redundant updates exhibit both strong high and low frequency components. Much of the high frequency instability is pathological.
- Discounting the contribution of redundant updates, the majority (over 80%) of Internet routes exhibit a high degree of stability.
- This work has led to specific architectural and protocol changes in commercial Internet routers through collaboration with vendors.


Methodology of Study (2)

- 12 Gb of data starting in January ’96
- Uses several tools from XYZ toolkit
- Focuses on largest exchange, Mae-East
- Data verification against BGP backbone logs from a number of large service providers


More Background

- Problems of network topology fluctuation (non-convergence):
  - packets get dropped
  - packets delivered out of order
- Internet routers of the day were based on a route caching architecture.
  - Each interface card maintains a routing table cache of destination and next-hop lookups
  - If found, then switch on CPU-independent “fast-path.”
- Sustained levels of instability increase the probability of a packet encountering a cache miss, which leads to:
  - increased load on CPU
  - increased switching latency
  - dropped or lost packets
  - queuing delay, preventing timely routing of Keep-Alive packets
- It should be noted that new generations of routers that do not require caching and are able to maintain the full routing table in memory do not exhibit the same pathological loss under heavy routing updates.
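A toy model can make the cache-miss argument above concrete. It is only a sketch under simplifying assumptions: lookups are exact-match on a destination string (real routers do longest-prefix matching), and the class `CachingLineCard` and its counters are invented for the illustration.

```python
from typing import Dict, Optional


class CachingLineCard:
    """Toy interface card: a small route cache in front of the CPU's full table."""

    def __init__(self, full_table: Dict[str, str]) -> None:
        self.full_table = full_table        # complete table, consulted by the CPU
        self.cache: Dict[str, str] = {}     # destination -> next hop, on the card
        self.fast_path = 0                  # lookups answered from the cache
        self.slow_path = 0                  # lookups that needed the CPU

    def forward(self, dest: str) -> Optional[str]:
        if dest in self.cache:
            self.fast_path += 1
            return self.cache[dest]
        self.slow_path += 1                 # cache miss: CPU performs the lookup
        next_hop = self.full_table.get(dest)
        if next_hop is not None:
            self.cache[dest] = next_hop
        return next_hop

    def routing_update(self, dest: str, next_hop: str) -> None:
        self.full_table[dest] = next_hop
        self.cache.pop(dest, None)          # the cached entry is now stale


card = CachingLineCard({"10.0.0.0/8": "peer-a"})
for _ in range(3):
    card.forward("10.0.0.0/8")
card.routing_update("10.0.0.0/8", "peer-b")  # one flap invalidates the entry
card.forward("10.0.0.0/8")
print(card.fast_path, card.slow_path)
```

Every routing update forces the next packet for that destination back onto the CPU path, so a steady stream of updates keeps the CPU busy with lookups.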


Route Flap Storms

- A failed router can instigate a “route flap storm.”
  - This pathological oscillation causes overloaded routers to be marked as unreachable since the required interval of Keep-Alive transmissions is not met.
  - Peers of the failed router find alternative paths for destinations previously reachable and transmit updates.
  - After the failed router recovers, it will re-initiate BGP peering sessions with peers, transmit large state dumps, and cause more routers to fail.
- “Route Flap Storms” in 1996 caused extended outages for several million network customers.
- Newer generations of routers provide a mechanism for giving BGP and Keep-Alive messages higher priority.


Battling Routing Instability

- Route Aggregation (Supernetting):
  - combines a number of smaller IP prefixes into a single, less specific route announcement.
  - reduces overall number of networks visible on the core Internet
  - fails in multi-homing (when end-sites have redundant connections to the Internet via multiple service providers).
- In 1996, more than 25% (and growing) of prefixes were multi-homed and therefore non-aggregatable.
- Deployment of route dampening algorithms
  - “hold-down” updates that exceed certain parameters (e.g. quota of updates per hour)
  - can introduce artificial connectivity problems as “legitimate” announcements are delayed.
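The hold-down idea can be sketched as a per-prefix quota over a sliding window. This is deliberately the simplest reading of the slide ("quota of updates per hour"), not the exponentially decayed penalty scheme that deployed route flap damping uses; the class name `QuotaDampener` and its parameters are assumptions for the example.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class QuotaDampener:
    """Hold down a prefix once it exceeds max_updates within window_s seconds."""

    def __init__(self, max_updates: int, window_s: float) -> None:
        self.max_updates = max_updates
        self.window_s = window_s
        self.history: Dict[str, Deque[float]] = defaultdict(deque)

    def accept(self, prefix: str, now: float) -> bool:
        """Return True to propagate the update, False to hold it down."""
        times = self.history[prefix]
        while times and now - times[0] >= self.window_s:
            times.popleft()                 # forget updates outside the window
        times.append(now)                   # held-down updates still count
        return len(times) <= self.max_updates


damp = QuotaDampener(max_updates=3, window_s=3600)
print([damp.accept("192.0.2.0/24", t) for t in (0, 10, 20, 30, 40)])
print(damp.accept("192.0.2.0/24", 3700))    # window has emptied, accepted again
```

The fourth and fifth updates are suppressed even if one of them happens to be the "legitimate" final state, which is the artificial connectivity problem the slide mentions.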


Problems

- The Internet continues to exhibit high levels of routing instability despite the increased emphasis on aggregation and route dampening.
- Internet topology is growing increasingly less hierarchical with the addition of new exchange points and peering relationships.
- The behavior and dynamics of Internet routing stability had gone mostly without formal study prior to the publication of the paper. Little was known!


Observations

- Disproportionalism:
  - 42,000 Internet prefixes
  - 1300 Autonomous Systems
  - 1500 Unique ASPATHs
  - 3-6 million routing updates per day
  - 125 updates per network per day
- At times, 100 prefix announcements per sec.
  - Once exceeded 30 million updates per day, monitor crashed!
- This is a problem for all but the most high-end of commercial routers, and even they exhibit problems.
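A quick back-of-the-envelope check relates the figures on this slide. The inputs are the slide's own numbers; the helper name `updates_per_prefix` is made up, and treating "network" as "prefix" is an assumption.

```python
def updates_per_prefix(updates_per_day: int, prefixes: int) -> float:
    """Average number of daily updates per prefix."""
    return updates_per_day / prefixes


PREFIXES = 42_000
low = updates_per_prefix(3_000_000, PREFIXES)
high = updates_per_prefix(6_000_000, PREFIXES)
print(round(low), round(high))      # range implied by 3-6 million updates per day

# A burst of 100 announcements per second, if it lasted a whole day:
print(100 * 86_400)
```

The quoted 125 updates per network per day sits inside the range implied by 3-6 million updates over 42,000 prefixes, while a sustained 100 announcements per second would already exceed the top of that daily range.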


Classification of BGP Updates

- WADiff – A route is explicitly withdrawn as it becomes unreachable and it is later replaced with an alternative route to the same destination; forwarding instability.
- AADiff – A route is implicitly withdrawn and replaced by an alternative route as the original route becomes unreachable, or a preferred alternative path becomes available; forwarding instability.
- WADup – A route is explicitly withdrawn and then re-announced as reachable. This may reflect transient topological (link or router) failure, or it may represent a pathological oscillation; forwarding instability or pathological behavior (see next slide)
- All considered to be instability


Classification of Pathological Behavior (Redundant Updates)

- AADup – A route is implicitly withdrawn and replaced with a duplicate of the original route (a router should only send an update for a change in topology).
- WWDup – The repeated transmission of BGP withdrawals for a prefix that is currently unreachable.
- All considered to be pathological instability.
- Pathological updates may have a minimal impact on the performance of the Internet.
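The five update classes can be expressed as a small state machine over the updates one peer sends for one prefix. The sketch below is an illustrative reading of the definitions, not the authors' tooling: a route is reduced to its ASPATH, and the class `UpdateClassifier` and the labels "announcement" and "withdrawal" for unclassified events are assumptions.

```python
from typing import Dict, Optional, Tuple

Route = Tuple[int, ...]  # a route is reduced to its ASPATH for this sketch


class UpdateClassifier:
    """Label successive updates for a prefix as seen from a single peer."""

    def __init__(self) -> None:
        # prefix -> (last event kind "A" or "W", most recently announced route)
        self.state: Dict[str, Tuple[str, Optional[Route]]] = {}

    def classify(self, prefix: str, route: Optional[Route]) -> str:
        prev = self.state.get(prefix)
        if route is None:                                  # a withdrawal
            last_route = prev[1] if prev else None
            self.state[prefix] = ("W", last_route)
            if prev is None or prev[0] == "W":
                return "WWDup"                             # prefix was not reachable anyway
            return "withdrawal"
        self.state[prefix] = ("A", route)                  # an announcement
        if prev is None or prev[1] is None:
            return "announcement"                          # nothing to compare against
        suffix = "Dup" if prev[1] == route else "Diff"
        return prev[0] + "A" + suffix                      # WADup, WADiff, AADup or AADiff


clf = UpdateClassifier()
events = [(1, 2), (1, 2), (3, 2), None, None, (3, 2), None, (4, 2)]
print([clf.classify("192.0.2.0/24", e) for e in events])
```

Treating a withdrawal for a never-announced prefix as WWDup is a choice made here because the prefix is unreachable via that peer at the time.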


Expected Instability

- Problems affecting aggregation into supernets:
  - Multi-homing
  - initial lack of hierarchical IP address space allocation
  - reluctance to renumber IP addresses
- Result: Large number of globally visible addresses
- Each globally visible address is reachable by one or more paths.
- You would expect Internet instability to be proportional to the total number of available paths to all globally visible network addresses or aggregates
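Aggregation into supernets, and why gaps in the address space defeat it, can be shown with Python's standard `ipaddress` module. The helper name `aggregate` and the example prefixes are assumptions chosen for the illustration.

```python
import ipaddress
from typing import List


def aggregate(prefixes: List[str]) -> List[str]:
    """Collapse contiguous prefixes into the fewest covering announcements."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]


# Four contiguous /24s collapse into one less specific /22 announcement.
print(aggregate(["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]))

# With one block missing (for example, allocated elsewhere), two
# announcements remain globally visible instead of one.
print(aggregate(["10.1.0.0/24", "10.1.1.0/24", "10.1.3.0/24"]))
```

Non-hierarchical allocation and prefixes that must stay individually visible both leave holes of this kind, so the number of globally visible prefixes stays high.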


Mae-East Routing Updates

- Most WWDup withdrawals are transmitted by routers belonging to AS’s that never previously announced reachability for the withdrawn prefixes.
- On average, 500,000 – 6 million pathological withdrawals per day


Update Totals per ISP on a Given Day

- Many of the exchange point routers withdraw an order of magnitude more routes than they announce during a given day.
- Provider I shows the disproportionate effect that a single service provider can have on the global routing mesh.


More Observations

- Guess what:
  - There is a strong causal relationship between the manufacturer of the router used by an ISP and the ISP’s exhibited level of pathological BGP behavior.
- Routing updates have a regular, specific periodicity, usually either 30 or 60 seconds.
- The persistence of instability is the duration of time that routing information fluctuates before it stabilizes.
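One simple way to look for the 30- and 60-second periodicity mentioned above is to bin the gaps between successive updates for the same prefix. This is a hedged sketch rather than the papers' analysis code; the function name `interarrival_histogram`, the bin width, and the sample timestamps are all assumptions.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def interarrival_histogram(timestamps: Iterable[float],
                           bin_s: int = 30) -> List[Tuple[int, int]]:
    """Count gaps between successive updates, rounded to the nearest bin_s seconds."""
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    bins = Counter(int(round(gap / bin_s)) * bin_s for gap in gaps)
    return sorted(bins.items())


# Hypothetical update times (seconds) for one prefix from one peer.
print(interarrival_histogram([0, 30, 60, 120, 150, 210]))
```

Peaks at the 30- and 60-second bins would point at timer-driven behavior rather than independent topology changes.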


Origins of Routing Pathologies

- Some pathological withdrawals can be attributed to implementation decisions
  - time-space trade-off in not maintaining state of advertisements
  - stateless BGP = O(N*U) updates
  - Presentation of results led a router vendor to update its software to keep partial state
- Stateless BGP contributes an insignificant number of updates and does not account for oscillating behavior of WWDup and AADup updates.
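The O(N*U) figure can be illustrated by counting withdrawal messages with and without per-peer advertisement state. The slide does not define N and U; this sketch assumes N is the number of peers and U the number of withdrawals to relay, and `withdrawals_sent` is a name invented for the example.

```python
from typing import Dict, List, Set


def withdrawals_sent(peers_announced: Dict[str, Set[str]],
                     withdrawn: List[str], stateless: bool) -> int:
    """Count withdrawal messages sent for a batch of withdrawn prefixes."""
    sent = 0
    for prefix in withdrawn:
        for announced in peers_announced.values():
            # A stateless speaker cannot tell whether it ever announced the
            # prefix to this peer, so it withdraws from every peer regardless.
            if stateless or prefix in announced:
                sent += 1
    return sent


peers = {"peer1": {"192.0.2.0/24"}, "peer2": set(), "peer3": set()}
batch = ["192.0.2.0/24", "198.51.100.0/24"]
print(withdrawals_sent(peers, batch, stateless=True))   # N * U messages
print(withdrawals_sent(peers, batch, stateless=False))  # only where announced
```

Keeping even partial state about what was advertised lets the router skip withdrawals that the peer would see as WWDup.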


Origins of Routing Pathologies (2)

- Single-homed, stateless peer routers should result in at most O(N) updates, but instead:
  - It seemed that each legitimate withdrawal induces some type of short-lived pathological network oscillation
  - Persistence of these updates is between 1 and 5 minutes
- Periodic routing instability may be caused by:
  - inadvertent synchronization on update transmission
  - improper configuration of interaction between IGP and BGP (conversion is lossy)
- Internet Routing Instability still remains poorly understood


Forwarding Instability

Instability Density

Black squares are above a particular threshold (mean of detrended data): 345 updates in March, 770 in September


Forwarding Instability (2)

A week of raw forwarding

Little instability over the weekend


Forwarding Instability (3)

- Time series analyses, FFT and MEM spectral estimation, validate results.
- Routing instability corresponds closely to trends in Internet bandwidth usage and packet loss (intuitively obvious?)
- Rigorous justification of network usage equating to routing instability is problematic due to the size and heterogeneity of the Internet.
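As a rough illustration of how spectral analysis exposes a daily cycle, the sketch below runs a naive discrete Fourier transform (not an optimized FFT, and not MEM) over a synthetic week of hourly update counts. The data is fabricated purely for the example, detrending is reduced to subtracting the mean, and `dominant_period` is a made-up helper name.

```python
import cmath
import math
from typing import List


def dominant_period(samples: List[float]) -> float:
    """Return the period (in samples) of the strongest frequency component."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]      # crude detrending: remove the mean
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):              # skip k = 0, the constant term
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return n / best_k


# Synthetic week of hourly counts with a pure 24-hour cycle.
hourly = [100 + 50 * math.cos(2 * math.pi * t / 24) for t in range(7 * 24)]
print(dominant_period(hourly))
```

On real update logs one would expect additional peaks, for example a weekly component, alongside the daily one.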


Fine-grained Instability Stats.

- No single AS consistently dominates the instability statistics.
- There is not a correlation between the size (# routes responsible for in table) of an AS and its proportion of the instability statistics.
- A small set of paths or prefixes does not dominate the instability statistics; instability is evenly distributed across routes

Page 32: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Fine-grained Instability Stats. (2)

Internet routing tables are dominated by 6-8 ISPs

Over the course of the month, their share of the default-free routing tables did not change significantly

Page 33: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Fine-grained Instability Stats. (3)

Internet routing tables are dominated by 6-8 ISPs

Over the course of the month, their share of the default-free routing tables did not change significantly

Page 34: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Fine-grained Instability Stats. (4)

80-100% of the daily instability is contributed by Prefix + AS pairs announced fewer than 50 times.

(a) ISP A announced seven routes between 630 and 650 times with no withdrawals
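
The statistic on this slide can be sketched as follows: count how often each Prefix + AS pair appears in one day's updates, then measure the share of updates that comes from pairs seen fewer than 50 times. This is only an illustrative sketch; the function name `share_below_threshold` and the sample day are hypothetical, and the prefixes and AS numbers are placeholders.

```python
from collections import Counter


def share_below_threshold(updates, threshold=50):
    """Fraction of one day's updates contributed by (prefix, AS) pairs
    that appear fewer than `threshold` times."""
    counts = Counter(updates)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    quiet = sum(c for c in counts.values() if c < threshold)
    return quiet / total


# Hypothetical day: 200 pairs announced 3 times each, plus one pair
# announced 640 times (loosely modeled on the "ISP A" outlier).
day = [(f"10.{i}.0.0/16", 64500 + i) for i in range(200)] * 3
day += [("192.0.2.0/24", 64999)] * 640
print(round(share_below_threshold(day), 3))
```

In this made-up day a single noisy pair pulls the share down to roughly half, which is the kind of outlier the ISP A curve represents.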

Page 35: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Fine-grained Instability Stats. (5)

80-100% of the daily instability is contributed by Prefix + AS pairs announced fewer than 50 times.

(c) ISP A announced seven routes between 630 and 650 times with no withdrawals

Page 36: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Fine-grained Instability Stats. (6)

(a) 20-90% of AADiff events are contributed by routes that changed 10 times or fewer

No single route consistently dominates the instability measured.

Some days, a single Prefix+AS pair contributes substantially (40%); this accounts for the lowest curve in (a) (ISP A)

WADiff climbs to a plateau about 95% faster than other three categories.

WADiff has the fewest Prefix+AS pairs that dominate their days.

Comforting, since categories probably best represent topological instability

Investigation on prefix alone provided similar results.

Page 37: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Temporal Properties of Instability Statistics

Update frequency distributions for instability events at Prefix+AS level

Update frequency is the inverse of the inter-arrival time between routing updates; higher frequency corresponds to a short inter-arrival time

Other work has been able to capture the lower frequencies through both routing table snapshots and end-to-end techniques

Page 38: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Temporal Properties of Instability Statistics (2)

Histogram distribution captured in 30 second and 1 minute bins

You would expect a Poisson distribution reflecting exogenous events, such as power outages, fiber cuts, and other natural and human events.

The 30 second periodicity suggests a widespread, systematic influence in its origin.
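
A minimal sketch of the binning described here, assuming update timestamps in seconds for a single Prefix+AS pair. The helper names and the sample timestamps are hypothetical. For purely exogenous (Poisson) arrivals the gaps would follow a smoothly decaying exponential distribution; counts piling up in the 30 second and 60 second bins are the signature the slide points to.

```python
from collections import Counter


def interarrival_histogram(timestamps, bin_seconds=30):
    """Bin the gaps between consecutive updates.

    Returns {bin_start_seconds: count}.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    histogram = Counter(int(gap // bin_seconds) * bin_seconds for gap in gaps)
    return dict(histogram)


def update_frequency(gap_seconds):
    """Update frequency in Hz: the inverse of the inter-arrival time."""
    return 1.0 / gap_seconds


# Hypothetical timestamps (seconds) whose gaps cluster near multiples of 30.
stamps = [0, 30, 61, 90, 150, 181, 210]
print(interarrival_histogram(stamps))
```

Passing `bin_seconds=60` gives the one minute bins mentioned on the slide.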

Page 39: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Temporal Properties of Instability Statistics (3)

Histogram distribution captured in 30 second and 1 minute bins

You would expect a Poisson distribution reflecting exogenous events, such as power outages, fiber cuts, and other natural and human events.

The 30 second periodicity suggests a widespread, systematic influence in its origin.

Page 40: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Conclusions

Routing instability can have a significant deleterious impact on Internet infrastructure

The majority (99%) of routing information is pathological and may not reflect real network topological changes.

Instability is well distributed across ASes and prefix space.

Instability and redundant routing information exhibit a strong periodicity (of unknown origin).

Page 41: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Conclusions (2)

Proportion of Internet Routes affected by routing updates

Page 42: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Conclusions (3)

Current trends in the evolution of the Internet may have a significant impact on routing instability and the future performance of the network.

25% of networks are multi-homed and the growth rate is about linear

Proliferation of exchange points is leading to a less hierarchical Internet.

This research helps characterize the effect of added topological complexity since the end of the NSFNet backbone.

Page 43: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

“Origins of Internet Routing Instability”

28 months gathering data from more than 40 commercial routers, switches, and Unix-based PC routers

Also collected IBGP information at the state of Michigan’s public Internet backbone, MichNet

Maintains that routing instability remains well distributed across prefix and AS space but that instability is not related to prefix length.

Since previous paper’s work, the volume of inter-domain routing messages in the Internet core has decreased by an order of magnitude.

Page 44: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Research Pays Off

Number of BGP updates almost doubled in 28 months

Number of announcements per day eventually (finally) surpassed the number of withdrawals at Mae East.

On average, across the backbone, exchange point routers generated only half as many withdrawals as announcements

Page 45: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

New Routing Update Categories

We still have AADiff, AADup, and WWDup, but we add:

Tup and Tdown – fluctuation in the reachability for a given prefix. An announced route is withdrawn and transitions down, or a currently unreachable prefix is announced as reachable and transitions up
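
The categories above can be sketched as a small classifier over the update stream from one peer. Tup and Tdown follow the definitions on this slide. The rules used here for AADup (announcement repeating the current attributes), AADiff (announcement with different attributes), and WWDup (withdrawal of an already unreachable prefix) are assumptions inferred from the category names, and the function name, tuple layout, and sample stream are all hypothetical.

```python
def classify_updates(updates):
    """Label each update received from a single peer.

    `updates` is a list of (prefix, kind, attrs) tuples, where kind is
    "A" (announcement) or "W" (withdrawal) and attrs is any hashable
    stand-in for the route attributes (None for withdrawals).
    Prefixes not yet seen are treated as unreachable (an assumption).
    """
    current = {}  # prefix -> attrs of the currently announced route
    labels = []
    for prefix, kind, attrs in updates:
        reachable = prefix in current
        if kind == "A":
            if not reachable:
                labels.append("Tup")      # unreachable -> reachable
            elif current[prefix] == attrs:
                labels.append("AADup")    # duplicate announcement
            else:
                labels.append("AADiff")   # replaced by a different route
            current[prefix] = attrs
        elif kind == "W":
            if reachable:
                labels.append("Tdown")    # reachable -> unreachable
                del current[prefix]
            else:
                labels.append("WWDup")    # repeated withdrawal
        else:
            raise ValueError(f"unknown update kind: {kind!r}")
    return labels


stream = [
    ("192.0.2.0/24", "A", ("AS64500 AS64501", 10)),
    ("192.0.2.0/24", "A", ("AS64500 AS64501", 10)),
    ("192.0.2.0/24", "A", ("AS64500 AS64502", 10)),
    ("192.0.2.0/24", "W", None),
    ("192.0.2.0/24", "W", None),
    ("192.0.2.0/24", "A", ("AS64500 AS64501", 10)),
]
print(classify_updates(stream))
```

Keeping state per prefix is what makes the labels meaningful: the same announcement is a Tup, an AADup, or an AADiff depending on what the peer said last.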

Page 46: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Breakdown of BGP Updates

Tup roughly equal to Tdown, connection recovery (good!)

Fluctuation in prefix reachability accounts for over 40% of all non-WWDup BGP traffic

After January ’98, AADup comprised the largest category of updates.

Page 47: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Analysis of AADiffs

90% of MED oscillations involve only two large ISPs, product of their specific routing policies.

Page 48: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Dynamically Mapped MED

AS2 always wants traffic flowing from AS3 to AS1 to take the shortest path through its network, so instead of setting the MED value via static configuration rules, AS2 dynamically maps the IGP distance between R5 and R3, and between R5 and R4 to the MED attribute value associated with route advertisements from routers R3 and R4 to AS1.

AS2 influences AS1 who wants to reach Network A. AS1 will prefer the route via R4.
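
The mapping can be sketched in a few lines, under stated assumptions: the IGP distances are invented, the function names are hypothetical, and AS1 is assumed to compare routes that are equal in every respect except MED. The second half shows why this policy can generate update traffic: when an internal distance in AS2 changes, the advertised MED changes with it, and AS1's preferred exit can flip without any inter-domain topology change.

```python
def advertise(igp_distance_to_exit):
    """AS2 side: copy each border router's IGP distance into the MED it advertises."""
    return {router: distance for router, distance in igp_distance_to_exit.items()}


def preferred_exit(med_by_router):
    """AS1 side: among otherwise equal routes from AS2, prefer the lowest MED."""
    return min(med_by_router, key=med_by_router.get)


# Hypothetical IGP distances inside AS2 from R5 to border routers R3 and R4.
igp = {"R3": 20, "R4": 5}
meds = advertise(igp)
print(preferred_exit(meds))

# An internal change in AS2 raises the distance to R4. The MED advertised
# by R4 changes, so AS1 receives a new announcement for the same prefix.
igp["R4"] = 40
new_meds = advertise(igp)
changed = [router for router in new_meds if new_meds[router] != meds[router]]
print(changed, preferred_exit(new_meds))
```

Each re-advertisement with a different MED is an announcement that replaces an earlier one with different attributes, which is consistent with MED oscillations appearing under the AADiff analysis.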

Page 49: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

More Results

Page 50: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

Conclusions

Improvement

Routing update messages reduced by an order of magnitude

Suppressed pathological withdrawals

Instability is still well distributed across AS and prefix space

More bugs in router software led to anomalies

Page 51: Internet Routing Instability Three Papers Presented by Michael A. Smith Craig Labovitz, G. Robert Malan, Farnam Jahanian, "Internet Routing Instability."

“Experimental Study of Internet Stability and Wide-Area Backbone Failures”

Conclusions

Internet has proven remarkably robust.

A small number of routes contribute to overall unavailability.

40% of routes exhibit multiple failures

Outages lasting longer than two hours usually represent long-term outages requiring significant engineering effort for repair

BGP failures must stem from non-hardware/software sources, probably TCP characteristics.