network layer routing - andrzej dudaduda.imag.fr/m1/routing.pdf · goal: determine “good” path...

Post on 01-Oct-2020

8 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

1

Computer Networking

Network Layer routing

http://duda.imag.fr

Prof. Andrzej Duda duda@imag.fr

2

Network Layer

Chapter goals: §  understand principles of routing protocols

§  internal protocols §  distance vector (RIP §  link state (OSPF)

§  external routing §  distance path (BGP)

3

Routing and Packet forwarding §  Packet forwarding - data plane

§  forward every packet to the next hop §  in real time

§  Routing - control plane §  computation of routing tables or data structures for unicast and

multicast §  normally only between routers §  non-real time: latency up to several minutes §  two level hierarchy

§  internal routing inside an administrative domain (called autonomous system - AS)

§  external routing between AS (administrative domains or ISPs) §  uses routing protocols such as

§  internal: RIP, OSPF, EIGRP §  external: BGP

4

Autonomous systems

host

subnetwork

autonomous system

border router

internal router

switch (bridge)

interconnection layer 2

interconnection layer 3

VLAN

5

Interconnection of AS

NAP, MAE, GIX, IXP

subnetworks

border router

autonomous system

6

Routing

Graph abstraction for routing algorithms:

§  graph nodes are routers §  graph edges are

physical links §  link cost: delay, $ cost,

or congestion level

Goal: determine “good” path (sequence of routers) thru

network from source to dest.

Routing protocol

A

E D

C B

F 2

2 1 3

1

1 2

5 3

5

§  “good” path: §  typically means minimum

cost path §  other def’s possible

7

Distance vector §  Dynamic routing based on distributed estimation of

the distance to the destination §  uses the distributed algorithm by Bellman-Ford (dynamic

programming) §  each router receives aggregated information from its

neighbors §  estimates the local cost to its neighbors §  computes the best routes §  no global network states

§  Distance §  number of hops §  delay

8

Bellman-Ford algorithm §  Bellman-Ford algorithm

§  node i knows cost c(i,k) to its immediate neighbours (+∞ for most values of k)

§  distance D(i,n) is given by: D(i,n) = mink (c(i,k) + D(k,n)) §  in the worst case, convergence after N-1 iterations

§  Distributed Bellman-Ford algorithm §  initially: D(i,n) = 0 if i directly connected to n and D(i,n) =

+∞ otherwise §  node i receives from neighbour k latest values of D(k,n) for

all n (distance vector) §  node i computes the best estimates

D(i,n) = mink (c(i,k) + D(k,n))

9

Bellman-Ford algorithm

c(i,m)

c(i,1) D(1,n)

c(i,k) D(k,n)

D(m,n)

i n

1

k

m

10

Example of Bellman-Ford

A B

G H

K J I

A I H K Table of J

A 0 24 20 21 8 A B 12 36 31 28 20 A G 18 31 6 31 18 H H 17 20 0 19 12 H I 21 0 14 22 10 I J 9 11 7 10 0 - K 24 22 22 0 6 K J: 8 10 12 6

computation of G : 18+8=26, 31+10=41, 6+12=18, 6+31=37 → choice of 18, H

11

Distance vector example §  Simple network

§  routers connected by links §  destinations = subnetworks connected to routers §  symmetric links §  cost = number of hops

12

Initialization

dest link cost

A local 0

A

l1 A B

l6 D E

l4 l3 C l5

l2

dest link cost

B local 0

B

dest link cost

C local 0

C

dest link cost

D local 0

D

dest link cost

E local 0

E

13

Distance vector announcement

dest link cost

A local 0

A

l1 A B

l6 D E

l4 l3 C l5

l2

dest link cost

B local 0 A l1 1

B

dest link cost

C local 0

C

dest link cost

D local 0 A l3 1

D

dest link cost

E local 0

E

from A: dest cost A 0

from A: dest cost A 0

14

Distance vector announcement

dest link cost

A local 0 B l1 1

A

l1 A B

l6 D E

l4 l3 C l5

l2

dest link cost

B local 0 A l1 1

B

dest link cost

C local 0 A l2 2 B l2 1

C

dest link cost

D local 0 A l3 1

D

dest link cost

E local 0 A l4 2 B l4 1

E

from B: dest cost A 1 B 0

from B: dest cost B 0 A 1

from B: dest cost A 1 B 0

15

Distance vector announcement

dest link cost

A local 0 B l1 1 D l3 1

A

l1 A B

l6 D E

l4 l3 C l5

l2

dest link cost

B local 0 A l1 1

B

dest link cost

C local 0 A l2 2 B l2 1

C

dest link cost

D local 0 A l3 1

D

dest link cost

E local 0 A l4 2 B l4 1 D l6 1

E

from D: dest cost D 0 A 1

from D: dest cost D 0 A 1

16

Final

dest link cost

A local 0 B l1 1 D l3 1 C l1 2 E l1 2

A

l1 A B

l6 D E

l4 l3 C l5

l2

dest link cost

B local 0 A l1 1 C l2 1 E l4 1 D l1 2

B

dest link cost

C local 0 A l2 2 B l2 1 D l5 2 E l5 1

C

dest link cost

D local 0 A l3 1 B l3 2 C l6 2 E l6 1

D

dest link cost

E local 0 A l4 2 B l4 1 D l6 1 C l5 1

E

17

Link failure

dest link cost

A local 0 B l1 ∞ D l3 1 C l1 ∞ E l1 ∞

A

A B

l6 D E

l4 l3 C l5

l2

dest link cost

B local 0 A l1 ∞ C l2 1 E l4 1 D l1 ∞

B

dest link cost

C local 0 A l2 ∞ B l2 1 D l5 2 E l5 1

C

dest link cost

D local 0 A l3 1 B l3 ∞ C l6 2 E l6 1

D

dest link cost

E local 0 A l4 ∞ B l4 1 D l6 1 C l5 1

E

from A: dest cost A 0 B,C,E ∞ D 1

18

Link failure

dest link cost

A local 0 B l1 ∞ D l3 1 C l1 ∞ E l1 ∞

A

A B

l6 D E

l4 l3 C l5

l2

dest link cost

B local 0 A l1 ∞ C l2 1 E l4 1 D l1 ∞

B

dest link cost

C local 0 A l2 ∞ B l2 1 D l5 2 E l5 1

C

dest link cost

D local 0 A l3 1 B l6 2 C l6 2 E l6 1

D

dest link cost

E local 0 A l4 ∞ B l4 1 D l6 1 C l5 1

E

from B: dest cost E 0 A ∞ B,D,C 1

19

Final state after failure

dest link cost

A local 0 B l3 3 D l3 1 C l3 3 E l3 2

A

A B

l6 D E

l4 l3 C l5

l2

dest link cost

B local 0 A l4 3 C l2 1 E l4 1 D l4 2

B

dest link cost

C local 0 A l5 3 B l2 1 D l5 2 E l5 1

C

dest link cost

D local 0 A l3 1 B l6 2 C l6 2 E l6 1

D

dest link cost

E local 0 A l6 2 B l4 1 D l6 1 C l5 1

E

20

Equal link costs - link failures

dest link cost

A local 0 B l3 3 D l3 1 C l3 3 E l3 2

A

A B

D E

l4 l3 C l5

l2

dest link cost

B local 0 A l4 3 C l2 1 E l4 1 D l4 2

B

dest link cost

C local 0 A l5 3 B l2 1 D l5 2 E l5 1

C

dest link cost

D local 0 A l3 1 B l6 ∞ C l6 ∞ E l6 ∞

D

dest link cost

E local 0 A l6 2 B l4 1 D l6 1 C l5 1

E

21

Counting to infinity

dest link cost

A local 0 B l3 3 D l3 1 C l3 3 E l3 2

A

A

D

l3

dest link cost

D local 0 A l3 1 B l3 4 C l3 4 E l3 3

D

from A: dest cost A 0 B,C 3 D 1 E 2

§  Loop between A and D §  Exchange of routes, costs increase

by 2 each cycle §  Convergence to a stable state

§  ∞ = large number §  e.g. RIP: ∞ = 16

22

Split horizon

§  Minimize the effects of bouncing and counting to infinity

§  Rule §  if A routes packets to X via B, it does not announce this

route to B

23

Example of split horizon

dest link cost

A local 0 B l3 3 D l3 1 C l3 3 E l3 2

A

A B

D E

l4 l3 C l5

l2

dest link cost

B local 0 A l4 3 C l2 1 E l4 1 D l4 2

B

dest link cost

C local 0 A l5 3 B l2 1 D l5 2 E l5 1

C

dest link cost

D local 0 A l3 1 B l6 ∞ C l6 ∞ E l6 ∞

D

dest link cost

E local 0 A l6 2 B l4 1 D l6 1 C l5 1

E

24

Split horizon

dest link cost

A local 0 B l3 3 D l3 1 C l3 3 E l3 2

A

A

D

l3

dest link cost

D local 0 A l3 1 B l6 ∞ C l6 ∞ E l6 ∞

D

from A: dest cost A 0

§  Split horizon cuts the process of counting to infinity

25

Split horizon

dest link cost

A local 0 B l3 ∞ D l3 1 C l3 ∞ E l3 ∞

A

A

D

l3

dest link cost

D local 0 A l3 1 B l6 ∞ C l6 ∞ E l6 ∞

D

from D: dest cost D 0 B,C,E ∞

§  Split horizon cuts the process of counting to infinity

26

Split horizon may fail

B

E

l4 C l5

l2

dest link cost

B local 0 A l4 ∞ C l2 1 E l4 1 D l4 ∞

B

dest link cost

C local 0 A l5 3 B l2 1 D l5 2 E l5 1

C

dest link cost

E local 0 A l6 ∞ B l4 1 D l6 ∞ C l5 1

E

from E: dest cost E 0 A ∞ C 1 D ∞

27

Split horizon may fail

B

E

l4 C l5

l2

dest link cost

B local 0 A l2 4 C l2 1 E l4 1 D l2 3

B

dest link cost

C local 0 A l5 3 B l2 1 D l5 2 E l5 1

C

dest link cost

E local 0 A l6 ∞ B l4 1 D l6 ∞ C l5 1

E

from C: dest cost C 0 A 3 D 2 E 1

from C: dest cost C 0 B 1

28

Split horizon may fail

B

E

l4 C l5

l2

dest link cost

B local 0 A l2 4 C l2 1 E l4 1 D l2 3

B

dest link cost

C local 0 A l5 3 B l2 1 D l5 2 E l5 1

C

dest link cost

E local 0 A l4 5 B l4 1 D l4 4 C l5 1

E from B: dest cost A 4 B 0 C 1 D 3

29

RIP v1 §  Distance vector protocol §  Metric - hops §  Network span limited to 15

§  ∞ = 16

§  Split horizon §  Destination network identified by IP address

§  no prefix/subnet information - derived from address class

§  Encapsulated as UDP packets, port 520 §  Largely implemented (routed on Unix) §  Broadcast every 30 seconds or when update detected §  Route not announced during 3 minutes

§  cost becomes ∞

30

Message format

address family zero

IP address zero

zero

0 31

command version zero

metric

§  May be repeated 25 times §  Command

§  REQUEST - 1 (sent at boot to initialize) §  RESPONSE - 2 (broadcast each 30 sec)

31

Missing netmask

B

C

D

A

E F

10.0.0.0 (255.0.0.0)

10.0.0.0 (255.0.0.0)

10.1.0.0 255.255.0.0

10.2.0.0 255.255.0.0

§  A and E can forward to 10.0.0.0 §  Packet to 10.2.0.1 can go through F or B

§  if sent to B, it goes through A and C

§  If link C-D broken, no route to destination

packet to 10.2.0.1

32

RIP v2 (RFC 2453)

§  Subnetworks §  take into account CIDR prefixes and netmasks

§  Authentication §  Multicast

§  224.0.0.9 mapped to MAC 01-00-5E-00-00-09 §  on LAN only, no need for IGMP

33

Message format

address family route tag

IP address netmask

next router

0 31

command version unused

metric

§  Command, version unchanged §  One address family - authentication §  Next router

§  used at the border of different routing domains (e.g. RIP and OSPF)

§  Route tag §  for external routes (used by BGP)

34

Announcing netmasks

B

C

D

A

E F

10.1.0.0 (255.255.0.0)

10.2.0.0 (255.255.0.0)

10.1.0.0 255.255.0.0

10.2.0.0 255.255.0.0

§  E can forward to 10.2.0.0 §  Packet to 10.2.0.1 can go through F

packet to 10.2.0.1

35

Routing domains

B C

D

A

E F

§  Different routing domains §  e.g. routers under different administrations that run different

routing protocols (RIP, OSPF)

§  If A wants to send a packet to F, it goes through D and E

§  When announcing F, D adds E as next router

domain 1 (OSPF)

domain 2 (RIP)

36

Simple authentication

xFFFF authentication type = 2

password on 16 bytes

0 31

command version unused

§  Configuration of gated (/etc/gated.conf) rip yes { interface all version 2 multicast authentication simple "qptszwmz" }

37

MD5 authentication

xFFFF authentication type = 3

zero

0 31

command version unused

packet length key Id

xFFFF x01

auth. length increasing sequence no.

zero

route info

seal

38

MD5 authentication

§  Seal §  MD5 digest on the message using a shared secret §  sequence number avoids replay attacks

§  Configuration of gated (/etc/gated.conf) rip yes { interface all version 2 multicast authentication md5 "qptszwmz" }

39

Link State Routing §  Principles

§  estimate metrics with neighbors §  bandwidth, delay, cost (fixed by administrator)

§  build a packet with the metrics of all neighbors §  flood to all routers §  compute the shortest path to all destinations (Dijkstra) §  update if modification of topology

§  Used in OSPF (Open Shortest Path First) and PNNI (ATM routing protocol)

40

Topology Database Synchronization

§  Neighbouring nodes synchronize before starting any relationship §  Hello protocol; keep alive §  initial synchronization of database §  description of all links (no information yet)

§  Once synchronized, a node accepts link state advertisements §  contain a sequence number, stored with record in the

database §  only messages with new sequence number are accepted §  accepted messages are flooded to all neighbours §  sequence number prevents anomalies (loops or blackholes)

41

Example network

n1

A

B

n6

D E

n4

n3

C

n5 n2

F

n7

§  Each router knows directly connected networks

42

Initial routing tables

net type

n1 Ether n2 P-to-P

A

n1

A

B

n6

D E

n4

n3

C

n5 n2

F

n7

net type

n6 Ether n5 P-to-P

D net type

n6 Ether n7 Ether

E

net type

n1 Ether n7 Ether

F

net type

n1 Ether n4 P-to-P n5 P-to-P

C

net type

n3 Ether n2 P-to-P n4 P-to-P

B

43

Flooding

net cost

n1 10 n2 100

A

net cost

n6 10 n5 100

D net cost

n6 10 n7 10

E net cost

n1 10 n7 10

F

net cost

n1 10 n4 100 n5 100

C

net cost

n3 10 n2 100 n4 100

B

§  The local metric information is flooded to all routers §  After convergence, all routers have the same

information

44

Topology graph §  Arrows to nets with a given metric

§  except P-to-P, stub, and external networks

§  From nets to routers, metric = 0

A

B

C

D

F

E

100

10 10

10

10

100 100

n1

100 n6

n7

n3

100

10

100

10

10

10

0

0

0

0

0 0

external network

54

0

stub network

point to point link

broadcast network

external network

45

Simplified graph

§  Only arrows with metrics between routers §  Execute the SPF (Shortest Path First - Dijkstra)

algorithm on the graph

A

B

C

D

F

E

100

10

10

10

10

10

100 100

46

SPF at A

1.  Initialization 1.  PATH variable: router A (the best path to destination) 2.  TENT variable: empty (tentative paths)

2.  For each router N in PATH 1.  for each neighbor M of N

1.  c(A, M) = c(A, N) + c(N, M) 2.  if M is not in PATH nor in TENT with a better cost, insert M

with direction N in TENT

3.  If TENT is empty, end. Otherwise take the entry with the best cost from TENT, insert it into PATH and go to 2.

At the end PATH contains the tree of best paths to all destinations

47

Executing SPF

100 10

A

B C F

10

PATH TENT consider a router

Before: TENT: A-B(100), A-C(10), A-F(10) PATH: A After: TENT: A-B(100), A-F(10) PATH: A-C(10)

48

Executing SPF

100 10

A

B C F

10

20 110

A B D 110

F 20

Before: TENT: A-B(100), A-F(10), C-D(110) PATH: A-C(10) After: TENT: A-B(100), C-D(110) PATH: A-C(10), A-F(10)

49

Executing SPF

100 10

A

B C F

10

D 110

20

A

C

E

20

20

Before: TENT: A-B(100), C-D(110), F-E(20) PATH: A-C(10), A-F(10) After: TENT: A-B(100), C-D(110) PATH: A-C(10), A-F(10), F-E(20)

50

Executing SPF

100 10

A

B C F

10

D 110

20

E

F 30

D 30

Before: TENT: A-B(100), C-D(110) TENT: A-B(100), E-D(30) PATH: A-C(10), A-F(10), F-E(20) After: TENT: A-B(100) PATH: A-C(10), A-F(10), F-E(20), E-D(30)

51

Executing SPF

100 10

A

B C F

10

20

E

D 30

130 40

C E

Before: TENT: A-B(100) PATH: A-C(10), A-F(10), F-E(20), E-D(30) After: TENT: PATH: A-C(10), A-F(10), F-E(20), E-D(30) A-B(100)

52

Executing SPF

100 10

A

B C F

10

20

E

D 30

200 200

A C

Before: TENT: PATH: A-C(10), A-F(10), F-E(20), E-D(30) After: TENT: PATH: A-C(10), A-F(10), F-E(20), E-D(30), A-B(100)

53

Result

100 10

A

B C F

10

§  Tree of best paths to all destinations

20

E

D 30

dest next cost

B direct 100 C direct 100 D F 30 E F 20 F direct 10

A

54

Routing table of A

net next

n1 direct n2 direct n3 B n4 C n5 C n6 F n7 F

A

n1

A

B

n6

D E

n4

n3

C

n5 n2

F

n7

55

Towards OSPF §  OSPF (Open Shortest Path First)

§  Link State protocol §  Link State information: LSA (Link State Advertisement) §  different sub-protocols: Hello, Database Description, Link

State flooding

§  It allows to §  separate hosts and routers §  consider different types of networks

§  broadcast (Ethernet), NBMA (ATM, X.25), point-to-point (PPP)

§  divide large networks into several areas §  independent route computing in each area

56

Separate hosts and routers

§  Link should be described in the DB §  link between a router and each host,

but LANs in most cases: advertize the link to the "stub network"

§  link of the form of a broadcast network (Ethernet)

§  IP address of the subnetwork (stub network)

§  e.g. n3 identified by 128.88.38/24

§  link to a neighbor router §  IP address of the neighbor router

§  e.g. n2 identified by 176.44.23.254 §  no IP address assigned to the interface

–  interface index

A

B

n3

n2

57

Designated routers

§  Number of neighbors §  if n routers, n(n-1)/2 neighbors

§  Election of a designated router on a LAN §  n-1 neighbors §  flooding

§  advertise to 224.0.0.6 (all designated routers) §  flooded to 224.0.0.5 (all routers)

§  back-up designated router §  listens to advertisements, but does not flood §  failure of the designated router detected by Hello

–  back-up becomes designated router

l1

A B D C A B

D C

A B

D C

A B

D C

58

Virtual networks

§  LAN represented as a virtual network §  less entries in the DB §  real cost to n1, zero to routers

n1

A B D C

A C

F

n1

F

10

0

10

10

10

0 0

0

59

Divide large networks §  Why divide large networks? §  Cost of computing routing tables

§  update when topology changes §  SPF algorithm

§  n routers, k links §  complexity O(n*k)

§  size of DB, update messages grows with the network size

§  Limit the scope of updates and computational overhead §  divide the network into several areas §  independent route computing in each area §  inject aggregated information on routes into other areas

60

Hierarchical Routing §  A large OSPF domain can be configured into areas

§  one backbone area (area 0) §  non backbone areas (areas numbered other than 0)

§  All inter-area traffic goes through area 0 §  strict hierarchy

§  Inside one area: link state routing as seen earlier §  one topology database per area

area 0

B1 X3

X1

X4 A1

area 2 area 1

X1

X3 X4 B2 A2

61

Principles §  Routing method used in the higher level:

§  distance vector §  no problem with loops - one backbone area

§  Mapping of higher level nodes to lower level nodes §  area border routers (inter-area routers) belong to two areas

§  Inter-level routing information §  summary link state advertisements (LSA) from other areas

are injected into the local topology databases

62

Example

§  Assume networks n1 and n2 become visible at time 0. Show the topology databases at all routers

area 0

B1 X4

X1

X3 A1

area 2 area 1

X2

X6 X5 B2 A2

n1

n2

10

10

10

6 6

6 6

6

6

10

10

10

63

Solution area 0

B1 X4

X1

X3 A1

area 2 area 1

X2

X6 X5 B2 A2

n1

n2

10

10

10

6 6

6 6

6

6

n1

n2

area 2 topology database

area 0 topology database

n1, m=10 n2, m=16

n1, m=16 n2, m=10

n1, m=28 n2, m=22

n1, m=22 n2, m=16

10

10

10

area 1 topology database

64

Explanations

§  All routers in area 2 propagate the existence of n1 and n2, directly attached to B1 (resp. B2).

§  Area border routers X4 and X6 belong to area 2, thus they can compute their distances to n1 and n2

§  Area border routers X4 and X6 inject their distances to n1 and n2 into the area 0 topology database (item 3 of the principle). The corresponding summary LSA is propagated to all routers of area 0.

§  All routers in area 0 can now compute their distance to n1 and n2, using their distances to X4 and X6, and using the principle of distance vector (item 1 of the principle).

65

Comments §  Distance vector computation causes none of the RIP

problems §  strict hierarchy: no loop between areas

§  External and summary LSA for all reachable networks are present in all topology databases of all areas §  most LSAs are external §  can be avoided in configuring some areas as terminal: use default entry to the backbone

§  Area partitions require specific support

66

Classification of routers

§  Internal routers §  a router with all directly connected networks belonging to

the same area

§  Area border routers §  attached to multiple areas §  condense LSA of their attached areas for distribution to the

backbone

§  Backbone routers §  a router that has an interface to the backbone area

§  AS boundary routers §  exchange routing information with routers belonging to

other AS

67

Classification of routers

area 0

B1 X4

X1

X3 A1

area 2 area 1

X2

X6 X5 B2 A2

n1

n2

10

10

10

6 6

6 6

6

6

10

10

10

AS-boundary router

backbone router

area border router internal router

68

OSPF protocol §  On top of IP (protocol type = 89) §  Multicast

§  224.0.0.5 - all routers of a link §  224.0.0.6 - all designated and backup routers

§  Sub-protocols §  Hello to identify neighbors, elect a designated and a

backup router §  Database description to diffuse the topology between

adjacent routers §  Link State to request, update, and ack the information on

a link (LSA - Link State Advertisement)

§  LSA §  tagged with the router Id and checksum §  5 different types

69

Convergence §  Route timeout after 1 hour

§  LS Update every 30 min.

§  Detect a failure §  40 sec (dead interval)

§  Smallest interval to recompute SPF §  30 sec (Dijkstra interval)

§  Reconfiguration time §  70 sec.

§  Proposals §  Hello each 100 ms §  SPF immediately

70

OSPF vs. RIP §  much more complex, but presents many advantages

§  no count to infinity §  no limit on the number of hops (OSPF topologies limited by

Network and Router LSA size (max 64KB) to O(5000) links) §  less signaling traffic (LS Update every 30 min) §  advanced metric §  large networks - hierarchical routing

§  most of the traffic when change in topology §  but periodic Hello messages §  in RIP: periodic routing information traffic

§  drawback §  difficult to configure

71

Interconnection of AS

NAP, MAE, GIX, IXP

subnetworks

border router

autonomous system

72

Interconnection of AS

§  Border routers §  interconnect AS

§  NAP or GIX, or IXP §  exchange of traffic - peering

§  Route construction §  based on the path through a series of AS §  based on administrative policies §  routing tables: aggregation of entries §  works if no loops and at least one route - external routing

protocols, e.g. BGP

73

Border Gateway Protocol

§  Path vector §  knowledge of the global state

§  path: sequence of AS with attributes §  loop detection: AS appears twice §  global optimization

§  Policy routing §  define what routes are accepted, chosen, and advertised

74

§  AS border router - BGP speaker §  peer-to peer relation with another AS border router §  connected communication (on top of a TCP connection)

Autonomous System C

C2

C1

C4 C3

IGRP

B2

B1 B4

B3

A2

A1

A4

A3

Autonomous System A (ex: IMAG) Autonomous System B

BGP

BGP

OSPF

Autonomous System D

BGP

BGP

D2

D3

D1

D5

D4 OSPF

area 0

area 2 area 1

BGP - Hierarchical routing

75

What does BGP do? §  BGP is a routing protocol between AS. It is

used to establish routes from one router in one AS to any network prefix in the world

§  There are two levels in BGP: §  Inter-domain: one AS is a virtual node in the higher

layer §  Intra-domain: distribution of routes inside one AS

§  The method of routing is §  Path vector §  With policy

§  A route advertisement from B to A for a destination prefix is an agreement by B that it will forward packets sent via A destined for any destination in the prefix.

advertisement C B:n1,n2

A

B

C

packet to n2

n1, n2

76

A

B

C

E n1, n2

A:n1,n2

A:n1,n2

C A:n1,n2 C:n3

B A:n1,n2 B:n5

D

D C A:n1,n2 D C: n3 D: n4

dest AS path

n1 B A n2 B A n3 D C n4 D n5 B

BGP table in E n5

n3

n4

Path Vector routing

§  AS maintains a table of best paths known so far §  Table updated using local rules §  Suitable when

§  no global meaning for costs can be assumed (heterogeneous environments) §  global topology is fairly stable

77

Border Routers, E-BGP and I-BGP §  E-BGP: BGP runs on border routers = “BGP speakers”

belonging to one AS only §  two border routers per boundary (OSPF - one per area boundary)

§  I-BGP: BGP speakers talks to each other inside the AS using “Internal-BGP” §  full mesh called the “BGP mesh” §  I-BGP is the same as E-BGP except for one rule: routes learned from a

neighbour in the mesh are not repeated inside the mesh

D1 D2

D4 D5

D3

A B

G H

C D

E F

X:n1 X:n1

A->C: D1,X: n1 C->E: D1,X: n1 C->D: D1,X: n1 C->F: D1,X: n1 E->G: D3,D1,X: n1

E-BGP

E-BGP

I-BGP

78

Policy Routing

§  Mainly 3 types of relations depending on money flows §  customer: EPFL is customer of Switch. EPFL pays Switch §  provider: Switch is provider for EPFL; Switch is paid by

EPFL §  peer: EPFL and CERN are peers: costs of interconnection is

shared

§  Type of relation is negotiated in bilateral agreements there is no architecture rule, just business

79

Goal of Policy Routing §  Example:

§  ISP3 - ISP2 is transatlantic link, cost shared between ISP2 and ISP 3

§  ISP 3 - ISP 1 is a local, inexpensive link §  Ci is customer of ISPi, ISPs are peers

§  It is advantageous for ISP3 to send traffic to n2 via ISP1

§  ISP1 does not agree to carry traffic from C3 to C2 §  ISP1 offers a “transit service” to C1 and

a “non-transit” service to ISP 2 and ISP3

§  The goal of “policy routing” is to support this and other similar requirements

ISP 1

ISP 3 ISP 2

C1

C2 C3

n2

provider

customer peers

80

How does Policy Routing Work ? §  Implemented by rules followed by BGP speakers

§  refuse to import or announce some routes §  modify the attributes that control which route is

preferred (see later)

§  Example §  ISP 1 announces to ISP 3 all networks of C1 §  ISP 1 announces to C1 all routes it has learnt from

ISP3 and ISP2 §  ISP2 announces “ISP2 n2” to ISP3 and ISP1;

assume that ISP1 announces “ISP1 ISP2 n2” to ISP3.

§  ISP 3 has two routes to n2: “ISP2 n2” and “ISP1 ISP2 n2”; assume that ISP3 prefers “ISP1 ISP2 n2”

§  packets from n3 to n2 are routed via ISP1 – undesired

§  solution: ISP 1 announces to ISP3 only routes to ISP1’s customers (not “ISP1 ISP2 n2”)

ISP 1

ISP 3 ISP 2

C1

C2 C3

n2 n3

provider

customer peers

81

Typical Policy Routing Rules §  Provider (ISP1) to customer (C1)

§  announce all routes learnt from other ISPs §  import only routes that belong to C1

example: import from IMAG only one route 129.88/16

§  Customer (C1) to Provider (ISP1) §  announce all routes that belong to C1 §  import all routes

§  Peers (ISP1 to ISP3) §  announce only routes to all customers of ISP1 §  import only routes to ISP3’s customer §  these routes are defined as part of peering

agreement

§  The rules are defined by every AS and implemented in all BGP speakers in one AS

ISP 1

ISP 3 ISP 2

C1

C2 C3

n2 n3

provider

customer peers

82

BGP (Border Gateway Protocol)

§  BGP-4, RFC 1771 §  AS border router - BGP speaker

§  peer-to peer relation with another AS border router §  connected communication

§  on top of a TCP connection, port 179 (vs. datagram (RIP, OSPF))

§  external connections (E-BGP) §  with border routers of different AS

§  internal connections (I-BGP) §  with border routers of the same AS

§  BGP only transmits modifications

83

BGP principles

§  Establish BGP session §  Update

§  list of destinations reachable via each router §  path attributes such as degree of preference for a particular

route

AS x AS y

1.1.1.1 2.2.2.2

n1 n2 n3

n4

n1,n2

n3,n4

84

BGP principles

§  n1 no longer reachable §  Incremental update

§  withdraw n1

AS x AS y

1.1.1.1 2.2.2.2

n2 n3 n4

withdraw n1

85

Summary §  The network layer transports packets from a sending

host to the receiver host §  Internet network layer

§  connectionless §  best-effort

§  Main components: §  addressing §  packet forwarding §  routing protocols and routers

§  Hierarchical routing protocols §  internal (RIP, OSPF, EIGRP) §  external (BGP)

86

top related