Identifying problematic inter-domain routing issues

Olaf Maennel, Anja Feldmann
Saarland University, Saarbrücken, Germany

Upload: chesna

Post on 06-Jan-2016



TRANSCRIPT

Page 1: Identifying problematic inter-domain routing issues

Olaf Maennel, Anja Feldmann
Saarland University, Saarbrücken, Germany

Page 2: Motivation

• BGP scalability?!!

• BGP convergence times???

• A lot of open questions that need understanding!

• What really happens in the Internet?

Page 3: TOOL: “Character”

• Data munching
  • automatic processing of raw data
  • providing an intermediate level

• Characterizing BGP updates
  • identification of update events

Page 4: TOOL: “Character”

[Diagram boxes, in order: RAW-DATA; FileFinder package; your function (or "Check" functions); results]

Page 5: route change events

• Identification of routing updates
  – type of changes, flapping, session resets, …

• Processing of updates in the context of
  – related (same prefix)
  – surrounding (near in time)

• How “character” works
  – Input: table dump1 – all updates – table dump2

Page 6: Output: route_btoa

1011363829|A|195.66.224.112|3549| 80.96.15.0/24|3549 3300 702 8708|
1011387198|W|195.66.224.112|3549| 80.96.15.0/24| |
1011387339|A|195.66.224.112|3549| 80.96.15.0/24|3549 701 702 8708|
1011387369|A|195.66.224.112|3549| 80.96.15.0/24|3549 3300 702 8708|
1010976980|W|195.66.224.112|3549|80.96.150.0/24| |
1010977007|A|195.66.224.112|3549|80.96.150.0/24|3549 209 1755 15471|

Annotations: Timestamp; Updated Prefix; AS Path

• All updates like Merit’s "route_btoa -m"
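The records above are pipe-separated, so they are easy to load for further processing. The following is a minimal sketch, not part of “Character”: the slide labels only the timestamp, prefix and AS path columns, so reading the third and fourth fields as peer address and peer AS is an assumption, and the names `Update` and `parse_update` are made up for illustration.

```python
from typing import NamedTuple


class Update(NamedTuple):
    timestamp: int   # Unix time in seconds
    kind: str        # "A" (announcement) or "W" (withdrawal)
    peer_ip: str     # assumed meaning of field 3
    peer_as: int     # assumed meaning of field 4
    prefix: str
    as_path: tuple   # AS numbers, origin last; empty for withdrawals


def parse_update(line: str) -> Update:
    """Parse one pipe-separated record in the format shown on the slide."""
    fields = [field.strip() for field in line.strip().split("|")]
    timestamp, kind, peer_ip, peer_as, prefix, path = fields[:6]
    as_path = tuple(int(asn) for asn in path.split())
    return Update(int(timestamp), kind, peer_ip, int(peer_as), prefix, as_path)


sample = "1011363829|A|195.66.224.112|3549| 80.96.15.0/24|3549 3300 702 8708|"
update = parse_update(sample)
print(update.prefix, update.as_path)
```

A withdrawal record has an empty path field, which this sketch maps to an empty tuple.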

Page 7: Example data sets

• RIPE’s RRC00: Jan 14, 2002 01:00 – Jan 20, 2002 01:10

Page 8: Output: route_btoa

1011363829|A|195.66.224.112|3549| 80.96.15.0/24|3549 3300 702 8708|
1011387198|W|195.66.224.112|3549| 80.96.15.0/24| |
1011387339|A|195.66.224.112|3549| 80.96.15.0/24|3549 701 702 8708|
1011387369|A|195.66.224.112|3549| 80.96.15.0/24|3549 3300 702 8708|
1010976980|W|195.66.224.112|3549|80.96.150.0/24| |
1010977007|A|195.66.224.112|3549|80.96.150.0/24|3549 209 1755 15471|

Annotations: Timestamp; Updated Prefix; AS Path

• Classification of each update is appended:

Page 9: Output: What has changed?

|:|24.|199  |AA-DIFF|ASPath-way Community|3549|3320->3300|8708|origin |
|:|25.|23369|AW-DIFF| | | | | |
|:|26.|141  |WA-DIFF|ASPath-way Community|3549|3300->701 |702 |transit|
|:|27.|30   |AA-DIFF|ASPath-way Community|3549|701->3300 |702 |transit|
|:|1. |-1   |AW-DIFF| | | | | |
|:|2. |27   |WA-DIFF|ASPath-way Community|3549|3300->209 |1755|transit|

Annotations: #update; time since last update; change to last update; What has changed?
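The first columns of this output can be reproduced from the raw records. The sketch below is an illustration, not the tool's code: it numbers the updates per prefix, computes the time since the previous update of the same prefix, and pairs the previous update type with the current one. The slide's labels carry a "-DIFF" suffix whose exact meaning is not spelled out, so the sketch emits only the pair (for example "AW"). The `initial_state` parameter is a hypothetical stand-in for the state found in the first table dump.

```python
def annotate(records, initial_state="A"):
    """Return (update number, seconds since last update, type pair) per record.

    records: iterable of (timestamp, kind, prefix) tuples in file order.
    initial_state: assumed state of a prefix before its first update.
    """
    last = {}    # prefix -> (timestamp, kind) of the previous update
    count = {}   # prefix -> number of updates seen so far
    out = []
    for timestamp, kind, prefix in records:
        count[prefix] = count.get(prefix, 0) + 1
        if prefix in last:
            prev_time, prev_kind = last[prefix]
            delta = timestamp - prev_time
        else:
            # No earlier update for this prefix: the slide prints -1 here.
            prev_kind, delta = initial_state, -1
        out.append((count[prefix], delta, prev_kind + kind))
        last[prefix] = (timestamp, kind)
    return out


records = [
    (1011363829, "A", "80.96.15.0/24"),
    (1011387198, "W", "80.96.15.0/24"),
    (1011387339, "A", "80.96.15.0/24"),
    (1011387369, "A", "80.96.15.0/24"),
    (1010976980, "W", "80.96.150.0/24"),
    (1010977007, "A", "80.96.150.0/24"),
]
for row in annotate(records):
    print(row)
```

Because the excerpt starts in the middle of the series for 80.96.15.0/24, the sketch numbers those updates 1 to 4 rather than 24 to 27, and its first row has no predecessor; the time differences of the later rows (23369, 141, 30 and 27 seconds) agree with the slide.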

Page 10: Type of changes

[Figure]

Page 11: Output: AS Path changes

|:|24.|199  |AA-DIFF|ASPath-way Community|3549|3320->3300|8708|origin |
|:|25.|23369|AW-DIFF| | | | | |
|:|26.|141  |WA-DIFF|ASPath-way Community|3549|3300->701 |702 |transit|
|:|27.|30   |AA-DIFF|ASPath-way Community|3549|701->3300 |702 |transit|
|:|1. |-1   |AW-DIFF| | | | | |
|:|2. |27   |WA-DIFF|ASPath-way Community|3549|3300->209 |1755|transit|

Annotations: last ‘stable’ AS; from where to where?; rejoining AS
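One way to derive the three annotated columns is to compare the old and new AS path from both ends. This is a sketch inferred from the sample rows, not the tool's documented algorithm: the last ‘stable’ AS is taken as the end of the common leading part, the "from -> to" pair as the first differing AS on each path, and the rejoining AS as the start of the common trailing part. Labelling the change "origin" when only the origin AS is shared at the end, and "transit" otherwise, is likewise an assumption.

```python
def as_path_change(old, new):
    """Compare two AS paths (sequences of AS numbers, origin last).

    Returns (last_stable_as, first_old_as, first_new_as, rejoining_as, where),
    or None if the paths are identical.
    """
    old, new = list(old), list(new)
    if old == new:
        return None
    shortest = min(len(old), len(new))
    # Common leading part (closest to the observation point).
    head = 0
    while head < shortest and old[head] == new[head]:
        head += 1
    # Common trailing part (closest to the origin), not overlapping the head.
    tail = 0
    while tail < shortest - head and old[-1 - tail] == new[-1 - tail]:
        tail += 1
    last_stable = old[head - 1] if head else None
    first_old = old[head] if head < len(old) - tail else None
    first_new = new[head] if head < len(new) - tail else None
    rejoining = old[len(old) - tail] if tail else None
    where = "origin" if tail == 1 else "transit" if tail > 1 else None
    return last_stable, first_old, first_new, rejoining, where


print(as_path_change([3549, 3300, 702, 8708], [3549, 701, 702, 8708]))
```

For the paths of updates 26 and 27 this yields last stable AS 3549, a change between 3300 and 701, and rejoining AS 702, as in the rows above.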

Page 12: Output: Old AS Path

3549__95%_ 3320__47%_ 5483_*15%* 8708__78%_| 2 |0. |22.|#8|flapping|
3549__95%_ 3300__65%_ 702__61%_ 8708_**3%*| 5 |3. |20.|#6| |
3549__95%_ 3300__65%_ 702__63%_ 8708__36%_| 5 |21.|21.|#1| |
3549__95%_ 701__66%_ 702__64%_ 8708__53%_| 3 |0. |24.|#9| |
3549__96%_ 3300__67%_ 1755__54%_ 15471_*21%*| * |* |* |* | |
3549__96%_ 3300__67%_ 1755__54%_ 15471__33%_| * |* |* |* | |

Annotations: AS on the “old” path; percentage of prefixes still reachable

Page 13: Sets of updates for a prefix with same attributes

[Figure labels:]
1. new change
2. duplicate
3. flapping
4. reconvergence
>4 n-way change

Page 14: Output: “n-way flapping”

| 2 |0. |22.|#8|flapping|208326|85% |<-    |     | (8708)__72%_ 5483
| 5 |3. |20.|#6|        |      |8%  |-1    |     | (8708)__79%_ 702
| 5 |21.|21.|#1|        |      |8%  |-2    |     | (8708)__78%_ 702
| 3 |0. |24.|#9|        |      |8%  |flap-3|23540| (8708)__78%_ 702
| * |*  |*  |* |        |      |100%|      |     |(15471)**95%* 1755
| * |*  |*  |* |        |      |100%|      |     |(15471)**95%* 1755

Annotations: distance to last equal update; first and last occurrence in update series; flapping; reconvergence; time to last flap; percentage of other prefixes by the originating AS identified as flapping
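The "distance to last equal update" and the first and last occurrence columns can be sketched as a lookup over the earlier updates of the same prefix. The code below is an illustration under assumptions: two updates count as equal when their AS paths match (the tool may compare more attributes), a withdrawal is represented by an empty tuple, and the name `equal_update_stats` is made up.

```python
def equal_update_stats(updates):
    """Relate each update of one prefix to earlier equal updates.

    updates: list of hashable attribute values (here AS path tuples, with ()
    standing for a withdrawal), in arrival order. Update numbers start at 1.
    Returns one (distance, first_seen, last_seen, earlier_count) tuple per
    update, or None where the attributes have not been seen before.
    """
    seen = {}    # attributes -> (first number, last number, count so far)
    result = []
    for number, attrs in enumerate(updates, start=1):
        if attrs in seen:
            first, last, count = seen[attrs]
            result.append((number - last, first, last, count))
            seen[attrs] = (first, number, count + 1)
        else:
            result.append(None)
            seen[attrs] = (number, number, 1)
    return result


path_a = (3549, 3300, 702, 8708)
path_b = (3549, 701, 702, 8708)
series = [path_a, (), path_b, path_a]   # updates 24 to 27 from the slides
print(equal_update_stats(series))
```

The last update repeats the path seen three updates earlier, so its distance is 3, which matches the distance shown for update 27 on the slide. The first-occurrence and count columns on the slide differ because they cover the whole series, not just this excerpt.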

Page 15: Categorization of changes

[Figure]

Page 16: Probability distribution of distance between flaps

[Figure]

Page 17: Time between equal updates

[Figure]

Page 18: Session resets

• Peering connection breakdown: a whole table must be exchanged

• Update storms are propagated through the Internet…

• How big is the problem?

Page 19: Output: possible session resets

 (8708)__72%_ 5483**66%* 3320**28%* 3549___0%_| 2 |3320 5483| |
 (8708)__79%_ 702___5%_ 3300___3%_ 3549___0%_| | | |
 (8708)__78%_ 702___5%_ 3300___3%_ 3549___0%_| | |peak|
 (8708)__78%_ 702___5%_ 701___1%_ 3549___0%_| | |peak|
(15471)**95%* 1755___0%_ 3549___0%_ 3300___0%_| 1 |15471 | |
(15471)**95%* 1755___0%_ 3549___0%_ 3300___0%_| 1 |15471 | |

Annotations: AS number; percentage of updated prefixes vs. all prefixes associated with an AS
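The percentage next to each AS can be read as the share of that AS's prefixes that were updated in some time interval; a share close to 100% hints at a session reset near that AS. The sketch below illustrates the ratio only. The prefixes are made-up placeholders, and the 50% threshold and the function names are assumptions, since the slides state neither the interval nor the cut-off that “Character” uses.

```python
def updated_share(prefixes_by_as, updated_prefixes):
    """Percentage of each AS's prefixes that were updated in an interval.

    prefixes_by_as: dict mapping AS number -> set of prefixes associated
    with that AS (for example taken from a table dump).
    updated_prefixes: set of prefixes updated in the interval of interest.
    """
    share = {}
    for asn, prefixes in prefixes_by_as.items():
        if prefixes:
            hit = len(prefixes & updated_prefixes)
            share[asn] = 100.0 * hit / len(prefixes)
    return share


def reset_candidates(share, threshold=50.0):
    """ASs whose updated share reaches the assumed threshold (in percent)."""
    return sorted(asn for asn, pct in share.items() if pct >= threshold)


# Made-up example: AS 8708 is associated with 4 prefixes, AS 3549 with 10.
table = {
    8708: {f"10.0.{i}.0/24" for i in range(4)},
    3549: {f"10.0.{i}.0/24" for i in range(10)},
}
updated = {"10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"}
share = updated_share(table, updated)
print(share, reset_candidates(share))
```

In this toy example three of the four prefixes of AS 8708 were updated (75%) but only three of ten for AS 3549 (30%), so only AS 8708 is reported as a candidate.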

Page 20: Identification of session resets

[Figure; annotation: All prefixes updated]

Page 21: Output: possible session resets

 (8708)__72%_ 5483**66%* 3320**28%* 3549___0%_| 2 |3320 5483| |
 (8708)__79%_ 702___5%_ 3300___3%_ 3549___0%_| | | |
 (8708)__78%_ 702___5%_ 3300___3%_ 3549___0%_| | |peak|
 (8708)__78%_ 702___5%_ 701___1%_ 3549___0%_| | |peak|
(15471)**95%* 1755___0%_ 3549___0%_ 3300___0%_| 1 |15471 | |
(15471)**95%* 1755___0%_ 3549___0%_ 3300___0%_| 1 |15471 | |

Annotations: number of ASs involved; ASs involved

Page 22: Updates due to session resets

[Figure]

Page 23: Duration of session resets

[Figure]

Page 24: Output: Classification

|2|3320 5483|    | 7.0|instable        |...
| |         |    | 5.9|instable        |...
| |         |peak|16.2|instable        |...
| |         |peak|16.2|re-stable change|...
|1|15471    |    | 1.3|instable        |...
|1|15471    |    | 1.4|instable        |...

Annotations: further changes?; update rate per second; peak identification; further suggestions?!

Page 25: Update burst

• Like packet flows

• A burst consists of several updates
  – same prefix
  – short time window
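The burst definition above (same prefix, short time window) resembles timeout-based flow grouping. The sketch below is a minimal illustration under that reading: an update joins the current burst of its prefix if it arrives within `max_gap` seconds of the previous update, otherwise it starts a new burst. The slides do not state the window used, so `max_gap` and the function name are assumptions.

```python
def group_bursts(updates, max_gap=60):
    """Group updates into bursts per prefix.

    updates: iterable of (timestamp, prefix) pairs.
    max_gap: assumed maximum silence in seconds inside one burst.
    Returns a list of (prefix, start, end, number_of_updates) tuples,
    ordered by start time.
    """
    open_bursts = {}   # prefix -> [start, end, count] of the current burst
    finished = []
    for timestamp, prefix in sorted(updates):
        burst = open_bursts.get(prefix)
        if burst is not None and timestamp - burst[1] <= max_gap:
            burst[1] = timestamp
            burst[2] += 1
        else:
            if burst is not None:
                finished.append((prefix, *burst))
            open_bursts[prefix] = [timestamp, timestamp, 1]
    finished.extend((prefix, *burst) for prefix, burst in open_bursts.items())
    return sorted(finished, key=lambda b: (b[1], b[0]))


updates = [
    (1011363829, "80.96.15.0/24"),
    (1011387198, "80.96.15.0/24"),
    (1011387339, "80.96.15.0/24"),
    (1011387369, "80.96.15.0/24"),
    (1010976980, "80.96.150.0/24"),
    (1010977007, "80.96.150.0/24"),
]
for prefix, start, end, count in group_bursts(updates, max_gap=300):
    print(prefix, end - start, count)
```

With a 300-second gap, the sample timestamps from the route_btoa output split into a single isolated update and a three-update burst for 80.96.15.0/24, plus a two-update burst for 80.96.150.0/24. Burst duration is `end - start` and the burst size is the update count.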

Page 26: Burst duration

[Figure]

Page 27: Updates in burst

[Figure]

Page 28: Output Character

• Classification of updates

• Statistical information

• Missing updates / verification

Page 29: Ongoing work

• RTG – a realistic Routing Table (and update) Generator

• generation of tables and updates with ‘real-world’ characteristics

• Use RTG to benchmark router performance

Page 30: Conclusion

Thank you!

If you are interested, please visit our website:

http://www.net.uni-sb.de/~olafm