
Consensus Routing: The Internet as a Distributed System

2009. 2. 26

John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, and Thomas Anderson

Presented by John P. John

Modified by Moonyoung Chung

Contents

Introduction
Motivation and Goals
Consensus Routing
– Stable Mode
– Transient Mode
Evaluation
Conclusions

NSDI '08

Internet Routing

A goal of the Internet is global reachability

But BGP fails to achieve this goal
– Physical paths exist, but not BGP paths
– 10-15% of BGP updates cause loops and blackholes
– 90% of all packet losses on the Internet are due to loops

BGP

Opaque policy routing
– Preferred routes visible to neighbors
– Underlying policies not visible and under local control

Mechanism:
– Autonomous Systems (ASes) send their preferred path to neighbors
– If an AS receives a new path, it starts using it right away
– It forwards the path to neighbors after some delay
– The path eventually propagates to all ASes

Example

[Figure: five ASes (1 to 5), with AS5 as the destination. Routes to AS5 held by AS2: 4-5, 3-4-5, 1-5. Routes to AS5 held by AS3: 4-5, 2-4-5.]

BGP link failure

[Figure: the example topology with the routes to AS5 held by AS2 (4-5, 3-4-5, 1-5), AS3 (4-5, 2-4-5), and AS4 (4-5).]

Link 4-5 fails. AS4 withdraws the path from its upstream ASes.

BGP link failure

[Figure: the example topology with the routes to AS5 held by AS2 (4-5, 3-4-5, 1-5) and AS3 (4-5, 2-4-5).]

AS2 and AS3 pick their next best paths.

Routing loop is formed!
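The loop on this slide can be reproduced with a small sketch. It assumes the route preferences shown in the example (AS2: 4-5, 3-4-5, 1-5; AS3: 4-5, 2-4-5) and models the withdrawal as "routes whose first hop is AS4 are no longer usable". The names `best_route` and `find_loop` are illustrative and do not come from the paper.

```python
# Candidate AS paths to destination AS5, in preference order (from the example slide).
ROUTES = {
    2: [(4, 5), (3, 4, 5), (1, 5)],
    3: [(4, 5), (2, 4, 5)],
}


def best_route(asn, withdrawn_first_hops):
    """Return the most preferred route whose first hop has not withdrawn it."""
    for path in ROUTES[asn]:
        if path[0] not in withdrawn_first_hops:
            return path
    return None


def find_loop(next_hop, start):
    """Follow next hops from `start`; return the cycle if a node repeats, else None."""
    seen = []
    node = start
    while node in next_hop:
        if node in seen:
            return seen[seen.index(node):]
        seen.append(node)
        node = next_hop[node]
    return None


# AS4 withdraws its route after link 4-5 fails. AS2 and AS3 each fall back
# to a stale path through the other before learning that it is gone too.
withdrawn = {4}
next_hop = {asn: best_route(asn, withdrawn)[0] for asn in ROUTES}
loop = find_loop(next_hop, 2)
print(next_hop, loop)
```

Each AS acts on its own partial view, so both choices are locally reasonable; the loop only shows up when the two forwarding decisions are combined.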

BGP policy change

AS4 wants all traffic destined for AS5 to come through AS6.

[Figure: six ASes (1, 2, 3, 4, 6, and the destination AS5) with their route tables to AS5; the routes shown include 4-5, 3-4-5, 2-4-5, 6-4-5, and 1-5.]

AS4 withdraws the path from AS2 and AS3.

BGP policy change

[Figure: six ASes (1, 2, 3, 4, 6, and the destination AS5) with their route tables to AS5; the routes shown include 4-5, 3-4-5, 2-4-5, 6-4-5, and 1-5.]

AS2 and AS3 pick their next best paths.

Routing loop is formed!

Lack of Consistency

The underlying cause of all these problems is inconsistent global state
– Link failures
– Traffic engineering
– Scheduled maintenance
– Link coming up

Protocol behavior is complex and unpredictable

No indicator of when the system has converged to a consistent state

Motivation and Goal

Goal:
– Networks that have high availability

Insight:
– Consistency is the key

Consensus Routing

Lesson from distributed system design:
– Decouple safety and liveness

Safety: Forwarding tables are always consistent and policy compliant, based on a consistent view of global state

Liveness: Routing system adapts to failures quickly and maintains high availability

Safety: Stable Mode

Problem: Inconsistent state

Solution:
– Apply updates only after they have reached all dependent ASes
– Apply updates synchronously across ASes

Stable Mode

Consistent view of global state
– Stable Forwarding Table (SFT)

At the kth epoch:
1. Update log
2. Distributed snapshot
3. Frontier computation
4. SFT computation
5. View change

Update log

[Figure: example topology with ASes 1 to 6.]

ASes compute and forward routes as before, but don’t apply them to the forwarding table

Distributed Snapshot

[Figure: example topology with ASes 1 to 6.]

Periodically, a distributed snapshot is taken

Some node(s) call for the (k+1)th distributed snapshot

Updates in transit or being processed are marked incomplete

1. Run BGP, but don’t apply the updates
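A minimal sketch of the per-AS bookkeeping behind the update log and the snapshot, assuming every update carries a unique identifier and that an AS can tell which updates it is still processing. The class `ASLog` and its methods are hypothetical names. Detecting updates that are in transit between ASes needs a real distributed snapshot protocol and is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class ASLog:
    """Per-AS update log: routes are computed and logged, but not applied to forwarding."""
    asn: int
    log: list = field(default_factory=list)        # update ids in arrival order
    in_progress: set = field(default_factory=set)  # received but not yet fully processed

    def receive(self, update_id):
        self.log.append(update_id)
        self.in_progress.add(update_id)

    def finish(self, update_id):
        # Assumption: processing ends once the update has been handled and forwarded on.
        self.in_progress.discard(update_id)

    def snapshot(self):
        """Report the saved sequence of updates and the ones that are still incomplete."""
        return {"asn": self.asn, "updates": list(self.log),
                "incomplete": set(self.in_progress)}


as3 = ASLog(asn=3)
for uid in ("u1", "u2", "u3"):
    as3.receive(uid)
as3.finish("u1")
as3.finish("u2")
report = as3.snapshot()
print(report)
```

The snapshot copies the log and the in-progress set, so the AS can keep receiving updates for the next epoch without disturbing the report it has already produced.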

Frontier Computation: Aggregation

[Figure: example topology with ASes 1 to 6 and the consolidators.]

ASes send a snapshot report to the consolidators:
1. the saved sequence of updates
2. the set of incomplete updates

* frontier: the most recent complete update at each AS

1. Run BGP, but don’t apply the updates
2. Distributed snapshot

Frontier Computation: Consensus

[Figure: example topology with ASes 1 to 6 and the consolidators.]

Consolidators run a consensus algorithm to agree on the set of incomplete updates

1. Run BGP, but don’t apply the updates
2. Distributed snapshot
3. Send info to consolidators
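The goal of this step is that every consolidator ends up with the same set of incomplete updates. The sketch below only illustrates that outcome: it replaces the consensus algorithm with a simple union of all proposals, which is an assumption made for brevity and tolerates no failures. The functions `merge_reports` and `agree`, and the update identifiers, are hypothetical.

```python
def merge_reports(reports):
    """Union the incomplete sets from the snapshot reports one consolidator received."""
    incomplete = set()
    for report in reports:
        incomplete |= report["incomplete"]
    return incomplete


def agree(proposals):
    """Stand-in for consensus: every consolidator adopts the union of all proposals."""
    agreed = frozenset().union(*proposals)
    return [agreed for _ in proposals]


# Two hypothetical consolidators, each hearing from a different subset of ASes.
seen_by_a = [{"asn": 2, "incomplete": {"u7"}}, {"asn": 3, "incomplete": set()}]
seen_by_b = [{"asn": 6, "incomplete": {"u7", "u9"}}]
proposals = [merge_reports(seen_by_a), merge_reports(seen_by_b)]
decisions = agree(proposals)
print(sorted(decisions[0]))
```

A real deployment would need a fault-tolerant agreement protocol in place of `agree`, since consolidators can be slow or fail partway through an epoch.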

Frontier Computation: Flood

[Figure: example topology with ASes 1 to 6 and the consolidators.]

Consolidators flood the incomplete set to all the ASes

1. Run BGP, but don’t apply the updates
2. Distributed snapshot
3. Send info to consolidators
4. Consensus

SFT Computation & View Change

[Figure: example topology with ASes 1 to 6.]

Apply completed updates

Versioning, garbage collection

Details and proof of consistency in the paper

1. Run BGP, but don’t apply the updates
2. Distributed snapshot
3. Send info to consolidators
4. Consensus
5. Flood
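Once the incomplete set has been flooded, each AS can build its next SFT locally. The sketch below is a simplification under stated assumptions: an update is a hypothetical `(update id, destination, next hop)` tuple, a next hop of `None` stands for a withdrawal, and the latest complete update for a destination wins. Real route selection would rerun the BGP decision process over the complete updates.

```python
def compute_sft(old_sft, log, incomplete):
    """Build the next SFT by applying only the completed updates, in log order."""
    sft = dict(old_sft)  # copy, so the epoch-k table stays usable until the view change
    for update_id, dest, hop in log:
        if update_id in incomplete:
            continue  # stays in the log and is reconsidered in a later epoch
        if hop is None:
            sft.pop(dest, None)  # withdrawal
        else:
            sft[dest] = hop      # announcement
    return sft


# Hypothetical log at one AS: (update id, destination AS, next-hop AS).
log = [("u1", 5, 4), ("u2", 7, 6), ("u3", 5, 2)]
sft_k = {5: 1}
sft_next = compute_sft(sft_k, log, incomplete={"u3"})
print(sft_next)
```

Because every AS filters with the same agreed incomplete set, all ASes skip the same updates, which is what keeps the resulting tables mutually consistent.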

Mechanism

Other details in the paper:
– Transition between epochs
– Slow/unresponsive ASes
– Failed ASes
– Reintegration of failed ASes
– Provable safety and liveness properties

Transient Mode: Liveness

Problem: Upon link failure, need to wait until the new path reaches everyone

Solution: Dynamically re-route around the failed link
– Use existing techniques
  • Pre-computed backup paths
  • Deflection
  • Detour routing
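The switch between the two modes can be sketched as a per-packet forwarding decision. This is an illustration under assumptions, not the paper's mechanism: the function `forward`, its parameters, and the order in which the fallbacks are tried are all hypothetical, and detour routing through a tunnel is left out.

```python
def forward(dest, sft, backup, neighbors, failed_links, me):
    """Pick a next hop: the stable route if its link is up, else a transient alternative."""
    def link_up(neighbor):
        return frozenset((me, neighbor)) not in failed_links

    stable = sft.get(dest)
    if stable is not None and link_up(stable):
        return "stable", stable
    # Transient mode, option 1: a pre-computed backup next hop.
    alt = backup.get(dest)
    if alt is not None and link_up(alt):
        return "transient", alt
    # Option 2: deflect to any neighbor whose link is still up.
    for neighbor in neighbors:
        if neighbor != stable and link_up(neighbor):
            return "transient", neighbor
    return "drop", None


# AS4 forwards to AS5 directly in its SFT; link 4-5 has just failed.
failed = {frozenset((4, 5))}
result = forward(dest=5, sft={5: 5}, backup={5: 6}, neighbors=[2, 3, 5, 6],
                 failed_links=failed, me=4)
print(result)
```

The SFT itself is left untouched during the failure; only the handling of individual packets changes until a later epoch installs a new stable route.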


Routing Deflection

[Figure: packet from source S toward destination D through ASes 1, 2, and 3.]

Deflect packet to neighbor

Traverse a different route

Backtracking

[Figure: packet from source S toward destination D through ASes 1 to 4, with the backtracking step marked.]

Detour Routing

[Figure: source S, ASes 3, 4, and 5, destination D, a tunnel, and a Tier 1 AS labeled B.]

B is responsible for forwarding packets

Backup routes

Pre-computed failover paths

e.g. R-BGP, a scheme for pre-computing backup routes to each destination

BGP

[Figure: connectivity over time, on a scale from completely unreachable to global reachability. Annotations: link failure (or other BGP event); BGP converges to alternate path.]

Consensus Routing

[Figure: two plots of connectivity over time, each on a scale from completely unreachable to global reachability. Annotations: link failure (or other BGP event); switch to transient routing; snapshot.]

Evaluation

In the talk, answer the following:
– How does consensus routing affect connectivity?
– What is the traffic overhead?

Methodology
– Extensive simulations on realistic Internet-scale topologies
– An implemented XORP prototype
– Experiments on PlanetLab

Methodology

[Figure: example topology with ASes 1 to 5.]

Fail each access link of each multi-homed stub AS

See what fraction of ASes are temporarily disconnected until convergence

23,390 ASes, 46,095 links

9,100 multi-homed stub ASes

Connectivity

Consensus routing maintains complete connectivity in over 99% of the cases

BGP maintains complete connectivity in < 40% of the failure cases

Overhead

The entire update is not sent, only the identifiers of the updates

[Figure: overhead plot.]

Conclusions

BGP’s transient problems are due to inconsistent global state

Consensus routing enables consistent routing state with opaque policies
– Key technique: separation of safety and liveness

We can have an Internet that has high availability!