interdomain routing jennifer rexford advanced computer networks tuesdays/thursdays 1:30pm-2:50pm

47
Interdomain Routing Interdomain Routing Jennifer Rexford Jennifer Rexford Advanced Computer Networks Advanced Computer Networks http://www.cs.princeton.edu/courses/archive/ http://www.cs.princeton.edu/courses/archive/ fall08/cos561/ fall08/cos561/ Tuesdays/Thursdays 1:30pm-2:50pm Tuesdays/Thursdays 1:30pm-2:50pm

Post on 21-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Interdomain RoutingInterdomain Routing

Jennifer RexfordJennifer Rexford

Advanced Computer NetworksAdvanced Computer Networkshttp://www.cs.princeton.edu/courses/archive/fall08/http://www.cs.princeton.edu/courses/archive/fall08/

cos561/cos561/Tuesdays/Thursdays 1:30pm-2:50pmTuesdays/Thursdays 1:30pm-2:50pm

Outline

• Interdomain routing– Autonomous Systems (ASes)

• Path-vector routing– Loop detection and convergence– Flexible path selection

• Business relationships– Customer-provider and peer-peer– Hierarchy from tier-1 ASes to stubs

• Border Gateway Protocol (BGP)– Announcements and withdrawals– Import and export policies

• Discussion of Paxson96 and Labovitz97

Interdomain Routing: Between Networks

• AS-level topology– Nodes are Autonomous Systems (ASes)– Destinations are prefixes (e.g., 12.0.0.0/8)– Edges are links and business relationships

1

2

34

5

67

Client Web server

AS Numbers (ASNs)

ASNs are 16 bit values.64512 through 65535 are “private”

• Level 3: 1 • MIT: 3• Harvard: 11• Yale: 29• Princeton: 88• AT&T: 7018, 6341, 5074, … • UUNET: 701, 702, 284, 12199, …• Sprint: 1239, 1240, 6211, 6242, …• …

ASNs represent units of routing policy

Currently around 30,000 in use.

Challenges for Interdomain Routing

• Scale– Prefixes: 250,000, and growing– ASes: 30,000, and growing– Routers: at least in the millions…

• Privacy– ASes don’t want to divulge internal topologies– … or their business relationships with

neighbors• Policy

– No Internet-wide notion of a link cost metric– Need control over where you send traffic– … and who can send traffic through you

Policy-Based Path-Vector Routing

Path-Vector Routing

• Extension of distance-vector routing– Support flexible routing policies– Reduce convergence time (avoid count-to-

infinity)• Key idea: advertise the entire path

– Distance vector: send distance metric per dest d

– Path vector: send the entire path for each dest d

32 1

d

“d: path (2,1)” “d: path (1)”

data traffic data traffic

Faster Loop Detection

• Node can easily detect a loop– Look for its own node identifier in the path– E.g., node 1 sees itself in the path “3, 2, 1”

• Node can simply discard paths with loops– E.g., node 1 simply discards the advertisement

32 1

“d: path (2,1)” “d: path (1)”

“d: path (3,2,1)”

Flexible Policies

• Each node can apply local policies– Path selection: Which path to use?– Path export: Whether to advertise the path?

• Examples– Node 2 may prefer the path “2, 3, 1” over

“2, 1”– Node 1 may not let node 3 hear the path “1,

2”2 3

1

2 3

1

Business Relationships

Business Relationships

• Neighboring ASes have business contracts– How much traffic to carry– Which destinations to reach– How much money to pay

• Common business relationships– Customer-provider

• E.g., Princeton is a customer of USLEC• E.g., MIT is a customer of Level3

– Peer-peer• E.g., UUNET is a peer of Sprint• E.g., Harvard is a peer of Harvard Business School

Customer-Provider Relationship

• Customer needs to be reachable from everyone– Provider tells all neighbors how to reach the customer

• Customer does not want to provide transit service– Customer does not let its providers route through it

d

d

provider

customer

customer

provider

Traffic to the customer Traffic from the customer

announcements

traffic

Multi-Homing: Two or More Providers

• Motivations for multi-homing– Extra reliability, survive single ISP failure– Financial leverage through competition– Better performance by selecting better path– Gaming the 95th-percentile billing model

Provider 1 Provider 2

Princeton Example

• Internet: customer of USLEC and Patriot• Research universities/labs: customer of

Internet2• Local non-profits: provider for several non-

profits

Patriot USLEC Internet2

Peer-Peer Relationship

• Peers exchange traffic between customers – AS exports only customer routes to a peer

– AS exports a peer’s routes only to its customers

– Often the relationship is settlement-free (i.e., no $$$)

peerpeer

Traffic to/from the peer and its customers

d

announcements

traffic

AS Structure: Tier-1 Providers

• Tier-1 provider– Has no upstream provider of its own– Typically has a national or international backbone

• Top of the Internet hierarchy of ~10 ASes– AOL, AT&T, Global Crossing, Level3, UUNET, NTT,

Qwest, SAVVIS (formerly Cable & Wireless), and Sprint

– Full peer-peer connections between tier-1 providers

AS Structure: Other ASes

• Other providers– Provide transit service to downstream

customers– … but, need at least one provider of their

own– Typically have national or regional scope– Includes several thousand ASes

• Stub ASes– Do not provide transit service to others– Connect to one or more upstream providers– Includes vast majority (e.g., 85-90%) of the

ASes

Border Gateway Protocol

04/19/23

• Prefix-based path-vector protocol

• Policy-based routing based on AS Paths

• Evolved during the past 18 years• 1989 : BGP-1 [RFC 1105], replacement for EGP

• 1990 : BGP-2 [RFC 1163]

• 1991 : BGP-3 [RFC 1267]

• 1995 : BGP-4 [RFC 1771], support for CIDR

• 2006 : BGP-4 [RFC 4271], update

Border Gateway Protocol

04/19/23

BGP Operations

Establish session on TCP port 179

Exchange all active routes

Exchange incremental updates

AS1

AS2

While connection is ALIVE exchangeroute UPDATE messages

BGP session

Incremental Protocol

• A node learns multiple paths to destination– Stores all of the routes in a routing table– Applies policy to select a single active route– … and may advertise the route to its

neighbors

• Incremental updates– Announcement

• Upon selecting a new active route, add node id to path

• … and (optionally) advertise to each neighbor

– Withdrawal• If the active route is no longer available• … send a withdrawal message to the neighbors

04/19/23

ASPATH Attribute

AS7018128.112.0.0/16AS Path = 88

AS 1239Sprint

AS 1755Ebone

AT&T

AS 3549Global Crossing

128.112.0.0/16AS Path = 7018 88

128.112.0.0/16AS Path = 3549 7018 88

AS 88

128.112.0.0/16

Princeton

Prefix Originated

AS 12654RIPE NCCRIS project

AS 1129Global Access

128.112.0.0/16AS Path = 7018 88

128.112.0.0/16AS Path = 1239 7018 88

128.112.0.0/16AS Path = 1129 1755 1239 7018 88

128.112.0.0/16AS Path = 1755 1239 7018 88

BGP Policy: Applying Policy to Routes

• Import policy– Filter unwanted routes from neighbor

• E.g. prefix that your customer doesn’t own

– Manipulate attributes to influence path selection• E.g., assign local preference to favored routes

• Export policy– Filter routes you don’t want to tell your

neighbor• E.g., don’t tell a peer a route learned from other

peer

– Manipulate attributes to control what they see• E.g., make a path look artificially longer than it is

04/19/23

BGP Policy: Influencing Decisions

Best Route Selection

Apply Import Policies

Best Route Table

Apply Export Policies

Install forwardingEntries for bestRoutes.

ReceiveBGPUpdates

BestRoutes

TransmitBGP Updates

Apply Policy =filter routes & tweak attributes

Based onAttributeValues

IP Forwarding Table

Apply Policy =filter routes & tweak attributes

Open ended programming.Constrained only by vendor configuration language

Import Policy: Local Preference

• Favor one path over another– Override the influence of AS path length– Apply local policies to prefer a path

• Example: prefer customer over peer

AT&T Sprint

Yale

Tier-2

Tier-3

Local-pref = 100

Local-pref = 90

Import Policy: Filtering

• Discard some route announcements– Detect configuration mistakes and attacks

• Examples on session to a customer– Discard route if prefix not owned by the customer– Discard route that contains other large ISP in AS

path

Patriot

Princeton

USLEC

128.112.0.0/16

Export Policy: Filtering

• Discard some route announcements– Limit propagation of routing information

• Examples– Don’t announce routes from one peer to

another

AT&T SprintUUNET

Export Policy: Filtering

• Discard some route announcements– Limit propagation of routing information

• Examples– Don’t announce routes for network-

management hosts or the underlying routers themselves

USLEC

Princeton

network operator

Export Policy: Attribute Manipulation

• Modify attributes of the active route– To influence the way other ASes behave

• Example: AS prepending– Artificially inflate the AS path length seen by others– To convince some ASes to send traffic another way

Patriot

Princeton

USLEC

128.112.0.0/16

Sprint

88 88 88

BGP Policy Configuration

• Routing policy languages are vendor-specific– Not part of the BGP protocol specification– Different languages for Cisco, Juniper, etc.

• Still, all languages have some key features– Policy as a list of clauses– Each clause matches on route attributes– … and either discards or modifies matching routes

• Configuration often done by human operators– Implementing the policies of their AS– Biz relationships, traffic engineering, security, …

Measuring Internet Routing

Motivations for Measuring the Routing System

• Characterizing the Internet– Internet path properties– Demands on Internet routers– Routing convergence

• Improving Internet health– Protocol design problems– Protocol implementation problems– Configuration errors or attacks

• Operating a network– Detecting and diagnosing routing problems– Traffic shifts, routing attacks, flaky

equipment, …

Techniques for Measuring Internet Routing

• Active probing– Inject probes along path through the data plane– E.g., using traceroute

• Passive route monitoring– Capture control-plane messages between routers– E.g., using tcpdump or a software router– E.g., dumping the routing table on a router

• Injecting network events– Cause failure/recovery at planned time and place– E.g., BGP route beacon, or planned maintenance

Internet Routing is Hard to Measure

• Nobody knows the Internet topology– No central registry of the AS-level graph– Little public information about intra-AS

topologies• Deploying monitoring infrastructure is hard

– Forwarding: active probes of end-to-end paths– Routing: passive monitoring of routing messages

• Many measurement challenges– Network conditions vary by location– Network conditions change over time– One-way measurements are hard to collect– Controlled experiments are hard to do

Two Papers for Today

• Both early measurement studies of routing– Initially appeared at SIGCOMM’96 and ’97– Both won the “best student paper” award

• And recently won the SIGCOMM “test of time” award!

– Early glimpses into the health of Internet routing

– Early wave of papers on Internet measurement• Differences in emphasis

– Paxson96: end-to-end active probing to measure the characteristics of the data plane

– Labovitz97: passive monitoring of BGP update messages from several ISPs to characterize (in)stability of the interdomain routing system

Backup Slides Summarizing Paxson96 and Labovitz97

Active Measurement: Traceroute

• Time-To-Live field in IP packet header– Source sends a packet with a TTL of n– Each router along the path decrements the

TTL– “TTL exceeded” sent when TTL reaches 0

• Traceroute tool exploits this TTL behavior

source destination

TTL=1

Time exceeded

TTL=2

Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message

Paxson Study: Forwarding Loops

• Forwarding loop– Packet returns to same router multiple times

• May cause traceroute to show a loop– If loop lasted long enough– So many packets traverse the loopy path

• Traceroute may reveal false loops– Path change that leads to a longer path– Causing later probe packets to hit same nodes

• Heuristic solution– Require traceroute to return same path 3

times

Paxson Study: Causes of Loops

• Transient vs. persistent– Transient: routing-protocol convergence– Persistent: likely configuration problem

• Challenges– Appropriate time boundary between the

two?– What about flaky equipment going up and

down?– Determining the cause of persistent loops?

• Causes of persistent loops– E.g., misconfiguration

0.0.0.0/012.1.2.0/24

Paxson Study: Path Fluttering

• Rapid changes between paths– Multiple paths between a pair of hosts– Load balancing policies inside the network

• Packet-based load balancing– Round-robin or random– Multiple paths for packets in a single flow

• Flow-based load balancing– Hash of some fields in the packet header– E.g., IP addresses, port numbers, etc.– To keep packets in a flow on one path

Paxson Study: Routing Stability

• Route prevalence– Likelihood of observing a particular route– Relatively easy to measure with sound

sampling– Poisson arrivals see time averages (PASTA)– Most host pairs have a dominant route

• Route persistence– How long a route endures before a change– Much harder to measure through active probes– Look for cases of multiple observations– Typical host pair has path persistence of a

week

Paxson Study: Route Asymmetry

• Hot-potato routing • Other causes– Asymmetric link

weights in intradomain routing

– Cold-potato routing, where AS requests traffic enter at particular place

• Consequences– Lots of asymmetry– One-way delay is not

necessarily half of the round-trip timeCustomer A

Customer B

multiplepeeringpoints

Provider A

Provider B

Early-exit routing

Labovitz Study: Surprising Statistics

• BGP is an incremental protocol– In theory, no update messages in steady state

• Two kinds of update messages– Announcement: advertising a new route– Withdrawal: withdrawing an old route

• Study saw an alarming number of updates– At the time, Internet had around 45,000 prefixes– Routers were exchanging 3-6 million updates/day– Sometimes as high as 30 million in a day

• Placing a very high load on the routers

Labovitz Study: Classifying Update Messages

• Analyze update messages – For each (prefix, peer) tuple– Classify the kinds of routing changes

• Forwarding instability– WADiff: explicit withdraw, replaced by alternate– AADiff: implict withdraw, replaced by alternate

• Pathological– WADup: explicit withdraw, and then reanounced– AADup: duplicate announcement– WWDup: duplicate withdrawal

Labovitz Study: Duplicate Withdrawals

• Time-space trade-off in router implementation– Common system building technique– Trade one resource for another– Can have surprising side effects

• The gory details– Ideally, you should not send a withdrawal if

you never sent a neighbor a corresponding announcement

– Requires remembering what update message you sent to each neighbor

– Easier to just send everyone a withdrawal when your route goes away

Labovitz Study: Practical Impact

• “Stateless BGP” is compliant with the standard– But, it forces other routers to handle more

load– So that you don’t have to maintain state– Arguably very unfair, and bad for global

Internet

• One router vendor was largely at fault– Router vendor modified its implementation– ISPs then deployed the updated software

Labovitz Study: Still Hard to Diagnose Problems

• Despite having very detailed view into BGP– Some pathologies were very hard to diagnose

• Possible causes– Flaky equipment– Synchronization of BGP timers– Interaction between BGP and intradomain

routing– Policy oscillation

• Topics addressed more in subsequent work