
The Bryant Advantage CCNP ROUTE Study Guide

Chris Bryant, CCIE #12933 -- www.thebryantadvantage.com


BGP Overview

Introduction To BGP

When To Use BGP

When Not To Use BGP

The Peering Process

Advertising Routes In BGP

The ORIGIN Attribute

The AS_PATH Attribute

The NEXT_HOP Attribute

The Multi-Exit Discriminator (MED)

The LOCAL_PREF Attribute

The Weight Attribute

The Atomic Aggregate, Aggregate, And Other Attributes

Route Aggregation

Resetting Peer Connections

The Rule Of Synchronization

The BGP Full Mesh Dilemma

Route Reflectors

Clusters

Refreshing BGP Routes

Prefix Lists

Peer Groups

Confederations

Communities

Using Loopbacks To Form Peer Relationships

ISP Concerns With BGP

The Perils Of Redistribution

BGP Message Types

Regular Expressions

Private AS Numbers

This n' That

Introduction To BGP

BGP is like nothing you’ve studied to this point. BGP is an external routing protocol used primarily by Internet Service Providers (ISPs).

Unless you work for an ISP today or in the future, you may have little or no prior exposure to BGP. Understanding BGP is a great addition to your skill set - and you have to know the basics well to pass the CCNP ROUTE exam.

Note that I said “the basics”. BGP is a very complex protocol, and when you pursue your CCIE, you’ll see what I’m talking about. As with all things Cisco, though, when broken down into smaller pieces, BGP becomes quite understandable.

BGP defined:

An Internet protocol that enables groups of routers (called autonomous systems) to share routing information so that efficient, loop-free routes can be established. BGP is commonly used within and between Internet Service Providers (ISPs).

There are a couple of terms in there that apply to the protocols you’ve mastered so far in your studies. The term “autonomous system” applies to EIGRP as well as BGP -- you’ll be indicating a BGP AS in your configurations just as you did with EIGRP.

And just like EIGRP, "autonomous system" simply refers to a group of routers that is managed by a single administrative body.

An AS will use Exterior Gateway Protocols (EGP) to exchange updates

with other ASes. As you'll soon see, BGP is one such EGP!

Interior Gateway Protocols such as OSPF and EIGRP have their place as well, and that place is inside an AS. Routes learned via BGP can be

redistributed into an IGP, and vice versa - but you have to be extremely

careful in doing so.

Extremely careful. More on that later.

BGP shares some characteristics with some routing protocols you’ve

already studied:

BGP supports VLSM and summarization.

BGP sends full updates when two routers initially become neighbors and sends only partial updates after that.

BGP creates and maintains neighbor relationships before exchanging routes, and keepalives are sent to keep these relationships alive.

You’ll hear BGP referred to as a path-vector protocol. As opposed to distance-vector protocols that exchange relatively simple information

about available routes, BGP routers will exchange extensive information

about networks to allow the routers to make more intelligent routing

decisions.

This additional BGP path information comes in the form of attributes, and

these path attributes are contained in the updates sent by BGP routers. Attributes themselves are broken up into two classes, well-known and

optional.

BGP also keeps a routing table separate from the IP routing table.

As with any set of design requirements, it's almost impossible to come

up with a strict set of rules as to when to use and not to use BGP.

Having said that, here are some general Cisco best practices with BGP.

When Should BGP Be Used?

Some circumstances under which BGP should be used:

If your company is connecting to more than one AS or ISP,

decisions on which links to use can be made by using BGP path

attributes.

If the routing policy of your organization and your ISP are different,

path attributes can again be helpful.

If your company is an ISP to begin with, traffic from other

autonomous systems will use your AS as a transit domain, so BGP

will be needed.

The first and third reasons listed are the major reasons organizations run

BGP. In short, if your AS has more than one connection to other ASes, or other ASes are using your AS as a transit area, BGP is practically a

necessity.

When Should BGP Not Be Used?

Some general guidelines on when not to use BGP:

When there is a single connection to the Internet or to another autonomous system.

When no redundant link to the Internet is present.

When you don’t really care what path is used to reach a route in another AS.

When router resources (memory and CPU) are a concern.

When there is a low-bandwidth connection between autonomous systems.

If any of these circumstances exist, static and default routing may be a better choice.

The BGP Peering Process

BGP is connection-oriented: it runs over TCP, which gives it reliable delivery. A TCP connection between two BGP speakers is established on port 179 before any routing information is exchanged.

As with EIGRP and OSPF, keepalive messages are sent by the BGP speakers to keep this relationship alive.

Hint: Make sure your ACLs do not block TCP port 179.

Once the connection is established, the BGP speakers exchange

routes and synchronize their tables. After this initial exchange, a BGP speaker will only send further updates upon a change in the network

topology.

The IGP protocols that use Autonomous Systems, IGRP and EIGRP,

require prospective neighbors to be in the same AS. This is not true with

BGP. Routers can be in different Autonomous Systems and still exchange routes.

A BGP peer that is in the same AS as the local router is an Internal BGP

(iBGP) peer, where a BGP peer in another AS is an External BGP (eBGP)

peer. That little "i" or "e" makes a big difference when it comes to

advertising routes and other BGP behaviors - so watch that letter!

A sample iBGP configuration (same AS):

router bgp 100
 neighbor 10.1.1.2 remote-as 100

A sample eBGP configuration (different AS):

router bgp 100
 neighbor 10.1.1.2 remote-as 200

Cisco recommends that eBGP peers be directly connected. iBGP peers are not required to be directly connected and generally aren't.
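The iBGP/eBGP distinction comes down to one comparison of AS numbers. Here is a minimal Python sketch of that comparison; the function name session_type is made up for illustration and is not part of any Cisco tool.

```python
def session_type(local_as: int, remote_as: int) -> str:
    """Classify a peering from the local AS and the neighbor's remote-as value."""
    return "iBGP" if local_as == remote_as else "eBGP"


# The two sample configurations above, both entered under "router bgp 100".
print(session_type(100, 100))  # iBGP
print(session_type(100, 200))  # eBGP
```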

Before we get too deep into BGP theory, let’s get a configuration

started. You’ll use the router bgp command to configure a router as a

BGP speaker. Right after that, the neighbor command will be used to

identify this BGP speaker’s potential neighbors.

(The terms "peer" and "neighbor" are interchangeable in BGP, but it's

the neighbor statement that is used to statically define neighbors. BGP is not capable of discovering neighbors dynamically.)

Remember what I mentioned about BGP being a complex protocol? Take

a look at all the possible options for the neighbor command:

R1(config)#router bgp 100
R1(config-router)#neighbor 172.12.123.3 ?
  activate                 Enable the Address Family for this Neighbor
  advertise-map            specify route-map for conditional advertisement
  advertisement-interval   Minimum interval between sending BGP routing updates
  allowas-in               Accept as-path with my AS present in it
  default-originate        Originate default route to this neighbor
  description              Neighbor specific description
  disable-connected-check  One-hop away EBGP peer using loopback address
  distribute-list          Filter updates to/from this neighbor
  ebgp-multihop            Allow EBGP neighbors not on directly connected networks
  filter-list              Establish BGP filters
  local-as                 Specify a local-as number
  maximum-prefix           Maximum number of prefix accept from this peer
  next-hop-self            Disable the next hop calculation for this neighbor
  next-hop-unchanged       Propagate the iBGP paths's next hop unchanged for this neighbor
  password                 Set a password
  peer-group               Member of the peer-group
  prefix-list              Filter updates to/from this neighbor
  remote-as                Specify a BGP neighbor
  remove-private-AS        Remove private AS number from outbound updates
  route-map                Apply route map to neighbor
  route-reflector-client   Configure a neighbor as Route Reflector client
  send-community           Send Community attribute to this neighbor
  shutdown                 Administratively shut down this neighbor
  soft-reconfiguration     Per neighbor soft reconfiguration
  timers                   BGP per neighbor timers
  translate-update         Translate Update to MBGP format
  unsuppress-map           Route-map to selectively unsuppress suppressed routes
  update-source            Source of routing updates
  version                  Set the BGP version to match a neighbor
  weight                   Set default weight for routes from this neighbor

Do not panic! You don’t have to know every single one of these to pass

the CCNP ROUTE exam. I’m just showing them to you to reinforce the

fact that BGP is a whole new world!

And the key to learning what every one of those commands does?

Mastering one at a time.

Let’s start with the basics and configure R1 and R3 as eBGP peers. We'll

place R1 into AS 100 and R3 into AS 200. The routers are on the 172.12.123.0 /24 network.

R1(config-router)#neighbor 172.12.123.3
% Incomplete command.

R1(config-router)#neighbor 172.12.123.3 remote-as 200

While almost all of the neighbor options are just that -- optional -- you do have to specify the BGP AS of the remote router. BGP has no mechanism

to dynamically discover neighbors. Remember, BGP speakers do not

have to be in the same AS to become peers.

To verify that the remote BGP speaker has become a peer, run show ip

bgp neighbor.

R1#show ip bgp neighbor
BGP neighbor is 172.12.123.3, remote AS 200, external link
  BGP version 4, remote router ID 0.0.0.0
  BGP state = Active
  Last read 00:01:39, hold time is 180, keepalive interval is 60 seconds
  Received 0 messages, 0 notifications, 0 in queue
  Sent 0 messages, 0 notifications, 0 in queue
  Route refresh request: received 0, sent 0
  Default minimum time between advertisement runs is 30 seconds

The output here can be a little misleading the first time you read it. The first line shows that 172.12.123.3 is a BGP neighbor, is located in AS 200, and is an external link, meaning the neighbor is in a different AS.

The BGP state line shows the state as Active. This sounds great, but it actually means that a BGP peer connection does not yet exist with the prospective neighbor. Before we continue with this example, let’s look at the different BGP states:

Idle is the initial state of a BGP connection. The BGP speaker is waiting for a start event, generally the configuration of a new neighbor or the reset of an existing peering. When the start event occurs, the router initiates a TCP connection to the peer and BGP moves to the next state.

There's nothing wrong with this state, but we don't want to stay there. If you note a connection has gone to idle and stayed there, check two

things:

The IP address in the neighbor statement (this is usually the issue).

While we're at it, make sure you have a neighbor statement for that

remote router.

Make sure your local router knows how to get to that same address.

Connect is the next state. In this state, a TCP connection request has been sent but a response has not yet been received. If the TCP

connection completes, BGP will move to the OpenSent stage; if the

connection does not complete, BGP goes to Active.

Active indicates that the BGP speaker is still trying to create a peer relationship with the remote router. The TCP connection to the address in the neighbor statement has not completed, so no BGP Open message has been exchanged yet; the local router keeps listening for a connection from the peer and retrying its own.

As with Idle, there's nothing wrong with this state - unless your

connection stays there. If the connection goes Active and stays there, it's really a mirror image of the Idle issue we spoke of earlier, so...

Check the remote router's neighbor statement.

Be sure the remote router has a route back to the local router's peering address, so its replies can return. (An Open message is acknowledged with a keepalive; OpenConfirm is a state, not a packet.)

And my personal favorite - make sure your AS numbers are correct, especially if the connection is flapping between Idle and Active.

OpenSent indicates that the BGP speaker has sent an Open message to the peer and is waiting to receive one in return. When the peer's Open message arrives, BGP will determine whether the peer is in the same AS (iBGP) or a different AS (eBGP).

In OpenConfirm state, the BGP speaker is waiting for a keepalive

message. If one is received, the state moves to Established, and the

neighbor relationship is complete. It is in the Established state that update packets are actually exchanged.
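To see how the states fit together, here is a simplified Python sketch of the state machine, following the state names in RFC 4271. The event names are invented for this example, timers are ignored, and any unexpected event simply drops the session back to Idle, so treat it as a study aid rather than a description of how IOS implements BGP.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state. Event names are hypothetical labels.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,   # Open message is sent
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,    # Open message is sent
    (State.ACTIVE, "connect_retry_expired"): State.CONNECT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Look up the transition; anything unexpected falls back to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events, state: State = State.IDLE) -> State:
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = next_state(state, event)
    return state


happy_path = ["start", "tcp_established", "open_received", "keepalive_received"]
print(run(happy_path).value)               # Established
print(run(["start", "tcp_failed"]).value)  # Active
```

The second call mirrors R1 in our example: the TCP connection to R3 cannot complete because R3 has no BGP configuration yet, so R1 sits in Active.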

So even though the show ip bgp neighbor output indicated that this is an

Active neighbor relationship, that’s not as good as it sounds. Of course,

the reason the peer relationship hasn’t been established is that we

haven’t configured R3 yet!

R3(config)#router bgp 200
R3(config-router)#neighbor 172.12.123.1 remote-as 100

Verify the peer establishment with show ip bgp neighbor:

R3#show ip bgp neighbor
BGP neighbor is 172.12.123.1, remote AS 100, external link
  BGP version 4, remote router ID 172.12.123.1
  BGP state = Established, up for 00:01:18
  Last read 00:00:17, hold time is 180, keepalive interval is 60 seconds
  Neighbor capabilities:
    Route refresh: advertised and received(old & new)
    Address family IPv4 Unicast: advertised and received
  Received 5 messages, 0 notifications, 0 in queue
  Sent 5 messages, 0 notifications, 0 in queue
  Route refresh request: received 0, sent 0
  Default minimum time between advertisement runs is 30 seconds

Local host: 172.12.123.3, Local port: 179 (BGP uses TCP Port 179)
Foreign host: 172.12.123.1, Foreign port: 11007

The peer relationship between R1 and R3 has been established.

Another handy command to view BGP peer information is show ip bgp

summary. While most of the information in this command deals with the

local router, a BGP peer summary table is shown at the very end of the command output. If you just want to see if peer relationships are in

place and how long they've been up, I find this command to be more

helpful than the show ip bgp neighbor command.

R1#show ip bgp summary
BGP router identifier 172.12.123.1, local AS number 100
BGP table version is 2, main routing table version 2
1 network entries and 1 paths using 133 bytes of memory
1 BGP path attribute entries using 60 bytes of memory
1 BGP AS-PATH entries using 24 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP activity 1/1 prefixes, 1/0 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
172.12.123.2    4   100      83      84        2    0    0 01:19:03        0
172.12.123.3    4   200     104     103        2    0    0 01:39:24        1

BGP peers do not have to be in the same AS, and whether they are in the

same or a different AS determines whether they become iBGP or eBGP

peers.

That may not sound like a big deal, but it is.

In the following configuration, R1 has two peers, one sharing AS 100 and

the other in AS 200. (Also note what happens if you use the local router’s own IP address as a BGP peer address!)

R1(config)#router bgp 100
R1(config-router)#neighbor 172.12.123.3 remote-as 200
R1(config-router)#neighbor 172.12.123.1 remote-as 100
% Cannot configure the local system as neighbor
R1(config-router)#neighbor 172.12.123.2 remote-as 100

R1#show ip bgp neighbor
BGP neighbor is 172.12.123.2, remote AS 100, internal link
  BGP version 4, remote router ID 172.12.123.2

BGP neighbor is 172.12.123.3, remote AS 200, external link
  BGP version 4, remote router ID 172.12.123.3

Using Loopback Addresses To Create eBGP Adjacencies

When you were introduced to loopback interfaces in your CCNA studies, your first question was likely "Why do we create imaginary interfaces on

Cisco routers?"

The frustrating thing for both teacher and student in the CCNA is that

you're shown how to create those interfaces, but not really given many

reasons why. Here's one excellent reason why - and a classic BGP "gotcha" you must be aware of.

Using loopback addresses for BGP adjacencies allows us to keep those adjacencies even if physical interfaces go down for any reason.

(Illustration: using loopbacks for BGP peering)

Sounds good, right? Now here's that "gotcha":

Loopback interfaces are not considered directly connected even if

they share a common subnet.

The ebgp-multihop command is necessary to configure eBGP peering

relationships where the addresses used to form the adjacency are not on

the same segment. You'll also need the update-source loopback

command when loopbacks are used to create eBGP adjacencies.

Static routes can also play a role in eBGP adjacencies. If you use loopback addresses for eBGP adjacencies, you may also need to

configure a static route on each router that points to the remote router's

loopback. After all, for this config to work, the router needs to know

how to get to the address used in the neighbor command.
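The three requirements just described (ebgp-multihop, update-source, and a route to the peer's loopback) make a handy checklist. Here is a small Python sketch of that checklist; the class and function names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoopbackPeering:
    """What has been configured for one eBGP neighbor reached via its loopback."""
    ebgp_multihop: bool           # neighbor x.x.x.x ebgp-multihop
    update_source_loopback: bool  # neighbor x.x.x.x update-source loopbackN
    route_to_peer_loopback: bool  # routing table has a route to the peer's loopback


def missing_pieces(peering: LoopbackPeering) -> List[str]:
    """Return the requirements from the text that are not yet satisfied."""
    missing = []
    if not peering.ebgp_multihop:
        missing.append("ebgp-multihop")
    if not peering.update_source_loopback:
        missing.append("update-source")
    if not peering.route_to_peer_loopback:
        missing.append("route to the peer's loopback")
    return missing


# Multihop and update-source are configured, but no route to the loopback exists.
print(missing_pieces(LoopbackPeering(True, True, False)))
```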

Let's drive all of these concepts home by creating an adjacency between R1 and R3 using their respective loopback addresses as shown in the

following config.

R1(config)#int loopback1
R1(config-if)#ip address 1.1.1.1 255.255.255.255
R1(config-if)#router bgp 100
R1(config-router)#no auto
R1(config-router)#no synch
R1(config-router)#neighbor 3.3.3.3 remote-as 200
R1(config-router)#neighbor 3.3.3.3 ebgp-multihop 2
R1(config-router)#neighbor 3.3.3.3 update-source loopback1

R3(config)#int loopback1
R3(config-if)#ip address 3.3.3.3 255.255.255.255
R3(config-if)#router bgp 200
R3(config-router)#no auto
R3(config-router)#no synch
R3(config-router)#neighbor 1.1.1.1 remote-as 100
R3(config-router)#neighbor 1.1.1.1 ebgp-multihop 2
R3(config-router)#neighbor 1.1.1.1 update-source loopback1

The neighbor statements look good, the ebgp-multihop command is in

place, and the update-source command is as well. But is the adjacency

in place?

R3#show ip bgp summ
BGP router identifier 3.3.3.3, local AS number 200
BGP table version is 1, main routing table version 1

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
1.1.1.1         4   100       0       0        0    0    0 never    Active

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
3.3.3.3         4   200       0       0        0    0    0 never    Active

Both routers show an adjacency state of Active, and we could wait a long, long time and they'd still be shown as Active. Just as with EIGRP,

Active is not good when it lasts.

The issue is that neither router has a route to the loopback address the remote router is using to form the adjacency.

Or in this case, not forming it.

By configuring static routes on each router that point to the remote router's loopback address, the BGP adjacency will form. We'll use two

host static routes here to get the job done.

R3(config)#ip route 1.1.1.1 255.255.255.255 serial1

R1(config)#ip route 3.3.3.3 255.255.255.255 serial1

The adjacencies come up just a few seconds after these static routes are configured. Note that although the desired state of each neighbor relationship is Established, that word doesn't actually appear where we saw Active just a few minutes ago, at the far right of the show ip bgp summary output. Once the session is established, the State/PfxRcd column shows the number of prefixes received from that neighbor instead.


Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
1.1.1.1         4   100       8       8        1    0    0 00:00:08        0

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
3.3.3.3         4   200       8      10        1    0    0 00:00:44        0
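Because the last column holds either a state name or a prefix count, a script reading this output has to treat a number as "Established". Here is a minimal Python sketch of that rule; it assumes the ten-column layout shown above, and the function name is made up for this example.

```python
def parse_summary_row(row: str) -> dict:
    """Split one neighbor row of 'show ip bgp summary' into named fields."""
    fields = row.split()
    # The last column is a state such as "Active" or "Idle (Admin)", or a count.
    state_or_prefixes = " ".join(fields[9:])
    established = state_or_prefixes.isdigit()
    return {
        "neighbor": fields[0],
        "remote_as": int(fields[2]),
        "up_down": fields[8],
        "established": established,
        "prefixes_received": int(state_or_prefixes) if established else None,
        "state": "Established" if established else state_or_prefixes,
    }


before = parse_summary_row("1.1.1.1 4 100 0 0 0 0 0 never Active")
after = parse_summary_row("1.1.1.1 4 100 8 8 1 0 0 00:00:08 0")
print(before["state"], after["state"], after["prefixes_received"])
# Active Established 0
```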

No route to the address in the neighbor statement = no neighbor!

The route can come from a directly connected subnet, an IGP, or a static route - and don't forget about a simple default route.

Naturally, those static routes have to stay there; if they're removed, the

adjacencies will time out. To demonstrate, I removed the static route

from R3:

R3(config)#no ip route 1.1.1.1 255.255.255.255 serial1

A few minutes later, the result is a lost adjacency.

R3#
00:22:24: %BGP-5-ADJCHANGE: neighbor 1.1.1.1 Down BGP Notification sent
R3#
00:22:24: %BGP-3-NOTIFICATION: sent to neighbor 1.1.1.1 4/0 (hold time expired) 0 bytes
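The adjacency does not drop the instant the route disappears; it drops when the hold timer runs out without a keepalive or update from the peer. Here is a rough Python sketch of that timing, using the 180-second hold time and 60-second keepalive interval from the show ip bgp neighbor output earlier. The function names are invented for this example.

```python
HOLD_TIME = 180  # seconds, from the show ip bgp neighbor output
KEEPALIVE = 60   # seconds, from the same output


def hold_timer_expired(seconds_since_last_message: float,
                       hold_time: float = HOLD_TIME) -> bool:
    """True once nothing has been heard from the peer for the full hold time."""
    return seconds_since_last_message >= hold_time


def keepalives_missed(seconds_since_last_message: float,
                      keepalive: float = KEEPALIVE) -> int:
    """How many keepalive intervals have passed without hearing from the peer."""
    return int(seconds_since_last_message // keepalive)


for elapsed in (30, 90, 180):
    print(elapsed, keepalives_missed(elapsed), hold_timer_expired(elapsed))
```

With these defaults, three keepalive intervals pass before the "hold time expired" notification is sent.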

In short, there are three main reasons why BGP peerings fail to form or

are torn down after they're built.

The AS number is incorrectly identified in the config. If you do this, trust me, you're not the first and you won't be the last! :)

A peering has been configured for an eBGP router that is not

directly connected, and the ebgp-multihop option has been

omitted.

An ACL is blocking TCP port 179. Opening that port right back up will allow the adjacencies to re-form, but you will have some anxious moments in the meantime!
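If you suspect the third cause, a quick TCP test toward port 179 can help narrow things down. Here is a minimal Python sketch using the standard socket module. It only tells you whether the machine running the script can open a TCP connection to the peer, which is not the same path the router itself uses, and a BGP speaker may refuse connections from addresses it has no neighbor statement for. The function name is made up for this example.

```python
import socket


def tcp_port_open(host: str, port: int = 179, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out or unreachable all count as "not open".
        return False


# Example call against one of the lab addresses used in this chapter.
print(tcp_port_open("172.12.123.3", timeout=1.0))
```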

Advertising Routes In BGP

We use the network command in BGP, but not quite the same way we

did with RIP, EIGRP, and OSPF. It will look the same, but the BGP

network command identifies the networks that will be advertised by BGP,

where the network command with IGPs identifies the interfaces that will be enabled with that protocol.

The network specified in the BGP network command must be an exact match for a network contained in the IP routing table, and that includes

the mask.

A real-world note here (and it couldn't hurt on the exam) -- using the

mask in the network statement is not required, but I highly recommend

you use it. If you're called on to troubleshoot a BGP configuration and it's missing the masks on the network statements, that could well be the

issue. Use the masks or you'll end up only with the classful networks.

Here, we’ll advertise R3’s loopback (3.3.3.3 /32) in BGP.

R3(config)#router bgp 200
R3(config-router)#network 3.3.3.3 mask 255.255.255.255

R1 quickly sees the route:

R1#show ip bgp
BGP table version is 2, local router ID is 172.12.123.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*>i3.3.3.3/32       172.12.123.3             0    100      0 i

For the route to be usable, you must see that asterisk. The best route is indicated with a combination of an asterisk and the ">" symbol -- that

means "valid and best".

Let's use another network to illustrate what happens if the mask is just a

bit off...

R3(config)#int loopback33
R3(config-if)#ip address 33.33.33.33 255.255.255.0

R3(config)#router bgp 200
R3(config-router)#network 33.33.33.33 mask 255.255.255.255

Does R1 see the route?


   Network          Next Hop            Metric LocPrf Weight Path
*>i3.3.3.3/32       172.12.123.3             0    100      0 i

Nope! Due to the mismatched mask, R3 doesn't even see the route in

its own BGP table!


   Network          Next Hop            Metric LocPrf Weight Path
*> 3.3.3.3/32       0.0.0.0                  0         32768 i

The BGP network mask must match the IP routing table's mask exactly

in order for the route to be successfully advertised via BGP. The loopback was configured with a /24 mask, but the BGP network

command specified a /32 mask. Here's how the route looks in the

IP routing table:

     33.0.0.0/24 is subnetted, 1 subnets
C       33.33.33.0 is directly connected, Loopback33

Once we change the BGP network statement to reflect a /24 mask, the route will appear in R3's BGP table and be successfully advertised to R1

via BGP. We'll first remove the erroneous network statement and then

enter the correct one.

R3(config)#router bgp 200
R3(config-router)#no network 33.33.33.33 mask 255.255.255.255
R3(config-router)#network 33.33.33.0 mask 255.255.255.0

R3#show ip bgp
BGP table version is 3, local router ID is 33.33.33.33
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*> 3.3.3.3/32       0.0.0.0                  0         32768 i
*> 33.33.33.0/24    0.0.0.0                  0         32768 i


   Network          Next Hop            Metric LocPrf Weight Path
*>i3.3.3.3/32       172.12.123.3             0    100      0 i
*>i33.33.33.0/24    172.12.123.3             0    100      0 i
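The exact-match rule is easy to model. Here is a minimal Python sketch using the standard ipaddress module; the function names are made up for this illustration, the routing table is reduced to a plain list of prefixes, and the classful fallback covers only class A, B, and C addresses.

```python
import ipaddress
from typing import Iterable, Optional


def classful_prefix_length(address: str) -> int:
    """Default mask length for a class A, B or C address (others not handled)."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:
        return 8
    if first_octet < 192:
        return 16
    return 24


def network_statement_matches(network: str, mask: Optional[str],
                              routing_table: Iterable[str]) -> bool:
    """True if 'network <network> mask <mask>' exactly matches a table entry."""
    suffix = mask if mask is not None else str(classful_prefix_length(network))
    try:
        wanted = ipaddress.ip_network(f"{network}/{suffix}")
    except ValueError:
        # Host bits are set for this mask, so it cannot equal any table prefix.
        return False
    return wanted in {ipaddress.ip_network(route) for route in routing_table}


r3_routes = ["3.3.3.3/32", "33.33.33.0/24"]
print(network_statement_matches("33.33.33.33", "255.255.255.255", r3_routes))  # False
print(network_statement_matches("33.33.33.0", "255.255.255.0", r3_routes))     # True
```

Passing None for the mask models a network statement without a mask: the classful /8 for 33.0.0.0 is not in the table either, so nothing is advertised.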

BGP Path Attributes

There are two classes of BGP Path Attributes, well-known and optional.

To truly understand BGP, you need to know exactly what these attributes

are and how they affect BGP.

You must master the application and use of these attributes to pass the

CCNP ROUTE exam.

Using the network we’ve built to this point, we will now examine these attributes, how to view them, and their impact on BGP path selection.

Here are the two categories of well-known attributes, both mandatory and discretionary:

Well-known mandatory: AS_PATH, origin, next-hop

Well-known discretionary: local preference, atomic aggregate

There are also optional attributes, both transitive and non-transitive.

Optional transitive: aggregator, community

Optional non-transitive: MED (multi-exit discriminator)

Those three mandatory attributes - AS_PATH, origin, and next-hop - will appear in all BGP update messages sent to neighbors. Every BGP speaker must recognize all well-known attributes, mandatory and discretionary alike; only the mandatory ones have to be present in every update.

The optional attributes can be a bit of a pain for BGP operation, since not

every BGP speaker is going to understand all optional attributes. The

difference between "optional transitive" and "optional non-transitive" comes into play here.

A BGP path carrying an unrecognized transitive optional attribute will be

accepted; if this path is advertised to other routers, the Partial bit will be

set and the attribute advertised to the neighboring router.

Basically, marking an attribute as partial is the equivalent of the advertising router saying "I didn't understand this attribute, but here it is anyway."

An unrecognized non-transitive optional attribute will not be passed on to other BGP speakers.
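The handling of unrecognized optional attributes can be summed up in a few lines. Here is a minimal Python sketch of the rules just described; the class, its field names, and the attribute name used in the example are hypothetical and do not mirror the on-the-wire encoding.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PathAttribute:
    name: str
    optional: bool         # False means well-known
    transitive: bool
    partial: bool = False  # set when a router passed it on without understanding it


def handle_unrecognized(attr: PathAttribute) -> Optional[PathAttribute]:
    """Decide what to do with an attribute the local router does not recognize.

    Returns the attribute to advertise onward, or None if it is dropped.
    """
    if not attr.optional:
        raise ValueError("well-known attributes must be recognized by every speaker")
    if attr.transitive:
        return replace(attr, partial=True)  # pass it along, flagged as partial
    return None                             # optional non-transitive: do not pass on


unknown_transitive = PathAttribute("some-new-attribute", optional=True, transitive=True)
unknown_local = PathAttribute("some-new-attribute", optional=True, transitive=False)
print(handle_unrecognized(unknown_transitive).partial)  # True
print(handle_unrecognized(unknown_local))               # None
```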

The Origin Attribute

The source of the routing update itself can be viewed with show ip bgp.


   Network          Next Hop            Metric LocPrf Weight Path
*> 3.3.3.3/32       172.12.123.3             0             0 200 i

There are three possibilities for the Origin code:

"i" -- path originated from an IGP via the network command

"e" -- path originated from an Exterior Gateway Protocol (EGP)

"?" -- Actual origin unclear; learned via route redistribution.

Those are shown in order of most preferred to least preferred, from top

to bottom.

The AS_PATH Attribute

This attribute shows the autonomous systems along the path to the destination network, including the AS the destination network resides

in. The shortest AS path is the preferred path.

The AS_PATH attribute helps to prevent routing loops; if a router

receives an update and sees its own AS number in the path to a

destination, that route will be discarded.

In this example, the only AS shown in the path is the AS the network resides in, AS 200.


   Network          Next Hop            Metric LocPrf Weight Path
*> 3.3.3.3/32       172.12.123.3             0             0 200 i
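The loop-prevention check described above is simple enough to sketch in Python. The function names are invented for this example, and the AS_PATH is modeled as a plain list of AS numbers with the most recently added AS first.

```python
from typing import List


def accept_update(local_as: int, as_path: List[int]) -> bool:
    """Discard any route whose AS_PATH already contains our own AS number."""
    return local_as not in as_path


def advertise_to_ebgp_peer(local_as: int, as_path: List[int]) -> List[int]:
    """A router prepends its own AS when advertising a route to an eBGP peer."""
    return [local_as] + as_path


print(accept_update(100, [200]))            # True: AS 100 is not in the path
print(accept_update(100, [300, 100, 200]))  # False: the route already passed through AS 100
print(advertise_to_ebgp_peer(300, [400]))   # [300, 400]
```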

To see a longer AS_PATH attribute, we’ll add a few extra routers and some additional autonomous systems. Every router will be advertising its loopback address into BGP, and every router’s loopback uses its own number in each octet (R1’s loopback is 1.1.1.1, etc.). Just for fun, we'll build multiple BGP peerings between the same two routers; in production networks, we most likely would not do that.

The BGP configurations of the routers:

R1(config)#router bgp 100
R1(config-router)#neighbor 10.1.1.5 remote-as 500
R1(config-router)#neighbor 172.12.123.2 remote-as 100
R1(config-router)#neighbor 172.12.123.3 remote-as 300
R1(config-router)#network 1.1.1.1 mask 255.255.255.255

R2(config)#router bgp 100
R2(config-router)#neighbor 172.12.123.1 remote-as 100
R2(config-router)#neighbor 172.12.123.3 remote-as 300
R2(config-router)#neighbor 172.12.234.3 remote-as 300
R2(config-router)#neighbor 172.12.234.4 remote-as 400
R2(config-router)#network 2.2.2.2 mask 255.255.255.255

R3(config)#router bgp 300
R3(config-router)#neighbor 172.12.123.1 remote-as 100
R3(config-router)#neighbor 172.12.123.2 remote-as 100
R3(config-router)#neighbor 172.12.234.2 remote-as 100
R3(config-router)#neighbor 172.12.234.4 remote-as 400
R3(config-router)#neighbor 172.12.34.4 remote-as 400
R3(config-router)#network 3.3.3.3 mask 255.255.255.255

R4(config)#router bgp 400
R4(config-router)#neighbor 172.12.234.3 remote-as 300
R4(config-router)#neighbor 172.12.234.2 remote-as 100
R4(config-router)#neighbor 172.12.34.3 remote-as 300
R4(config-router)#network 4.4.4.4 mask 255.255.255.255

R5(config)#router bgp 500
R5(config-router)#neighbor 10.1.1.1 remote-as 100
R5(config-router)#network 5.5.5.5 mask 255.255.255.255

Here are the peerings:

R1: eBGP to R5, iBGP to R2, eBGP to R3.

R2: eBGP to R4, eBGP to R3, iBGP to R1.

R3: eBGP to R1, eBGP to R2 via the Serial network, eBGP to R2 via the Ethernet segment, eBGP to R4 via the Ethernet segment, eBGP to R4 via the Serial interface.

R4: eBGP to R3 via the Ethernet segment, eBGP to R3 via the Serial connection, eBGP to R2 via the Ethernet segment.

R5: eBGP to R1 via the Ethernet segment.

R1’s BGP table has at least one entry for every loopback in the network,

and multiple paths for most of them.

The “>” symbol indicates the best path, and therefore the path that will be used. From top to bottom, here's how BGP selects a best path

between multiple valid paths:

Highest weight (Cisco-proprietary attribute)

Highest local preference (1st if non-Cisco routers are involved)

Locally originated path preferred

Shortest AS_PATH

Best origin code (i, then e, then ?)

Lowest MED

eBGP path over iBGP path

Lowest IGP metric to the BGP next-hop

Oldest path

Path from the BGP router with the lowest BGP RID

You really need to know this order to master BGP for the workplace and

for your CCNP ROUTE and CCIE exams.
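One way to memorize the order is to think of it as a sort key: compare the first criterion, and only move to the next one on a tie. Here is a Python sketch along those lines. It is a study aid, not how IOS implements the decision process: the class and field names are invented, MED is compared across all paths (by default it is only compared between paths from the same neighboring AS), and the "oldest path" step is applied to every path.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List, Tuple

ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}  # most preferred first


@dataclass
class Path:
    next_hop: str
    as_path: Tuple[int, ...]
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    origin: str = "i"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0
    age_seconds: int = 0
    router_id: str = "0.0.0.0"


def preference_key(p: Path) -> tuple:
    """Smaller tuple = more preferred; fields follow the list above in order."""
    return (
        -p.weight,                         # highest weight
        -p.local_pref,                     # highest local preference
        0 if p.locally_originated else 1,  # locally originated preferred
        len(p.as_path),                    # shortest AS_PATH
        ORIGIN_RANK[p.origin],             # i, then e, then ?
        p.med,                             # lowest MED
        0 if p.ebgp else 1,                # eBGP over iBGP
        p.igp_metric,                      # lowest IGP metric to next hop
        -p.age_seconds,                    # oldest path
        int(IPv4Address(p.router_id)),     # lowest BGP RID
    )


def best_path(paths: List[Path]) -> Path:
    return min(paths, key=preference_key)


# Two candidate paths to the same prefix: longer AS_PATH via eBGP, shorter via iBGP.
via_r3 = Path("172.12.123.3", (300, 400), ebgp=True, router_id="3.3.3.3")
via_r2 = Path("172.12.123.2", (400,), ebgp=False, router_id="2.2.2.2")
print(best_path([via_r3, via_r2]).next_hop)  # 172.12.123.2
```

The shorter AS_PATH wins here because weight, local preference, and local origination all tie, so the eBGP-over-iBGP step is never reached.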

Let’s look at the BGP table from R1 again.

Again, the “>” indicates the path that will be used to reach that particular network. For more detailed information on any particular

path, use the show ip bgp command followed by the destination.

Before we use that command, though, did you notice that there seems

to be something odd with R1’s path selection for the network 3.3.3.3 and

4.4.4.4? Let’s take a look at the paths to 3.3.3.3 first.

BGP has identified both paths as being valid and loop-free, as indicated by the asterisk. The “>” indicating the best path is next to the path with the next-hop of 172.12.123.3. The first criterion for BGP best path selection is weight, and both paths have a weight of 0.

The next criterion is local preference. If the path with the next-hop of 172.12.234.3 has a local preference of 100, and the other path a local preference of zero, why is the path with the lower local preference being selected by BGP?

Before we answer that, let's look at R1's paths for 4.4.4.4:

There are two valid loop-free paths to 4.4.4.4, so BGP must choose the best path. The weights are the same, but again the local preferences

seem to favor the next-hop of 172.12.234.4. Even if the local prefs were

the same, the AS_PATH of the path with the next-hop of 172.12.234.4 is shorter than the other path. Then why is the path with the next-hop of

172.12.123.3 being selected?

Learn the following command – it will serve you well in the exam room

and on the job!

R1#show ip bgp 3.3.3.3
BGP routing table entry for 3.3.3.3/32, version 3
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  10.1.1.5 172.12.123.2
  300
    172.12.123.3 from 172.12.123.3 (3.3.3.3)
      Origin IGP, metric 0, localpref 100, valid, external, best
  300
    172.12.234.3 (inaccessible) from 172.12.123.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, internal

R1#show ip bgp 4.4.4.4
BGP routing table entry for 4.4.4.4/32, version 7
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  10.1.1.5 172.12.123.2
  300 400
    172.12.123.3 from 172.12.123.3 (3.3.3.3)
      Origin IGP, localpref 100, valid, external, best
  400
    172.12.234.4 (inaccessible) from 172.12.123.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, internal

The show ip bgp <network_number> command shows us that the paths with a next-hop IP address on the 172.12.234.0 network are valid but flagged as inaccessible, and that all paths involved have a local pref of 100.

Never trust the local prefs you see in the basic show ip bgp command if

something looks strange – run this more network-specific version of the command.

Two of the routes can't be used, though, because R1 has no IP

connectivity to any host on the 172.12.234.0 segment.

R1#ping 172.12.234.3

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.12.234.3, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

R1#ping 172.12.234.4

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.12.234.4, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

If a router cannot reach the address listed as the BGP next-hop, the path

cannot be used.
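If it helps to see that rule as logic, here is a minimal Python sketch of the reachability check. It is only a model, not how IOS implements it, and the contents of R1's routing table are an assumption (just the two segments R1 appears to be attached to in the output above).

```python
import ipaddress

def next_hop_reachable(next_hop, routing_table):
    """Return True if any prefix in routing_table covers next_hop."""
    address = ipaddress.ip_address(next_hop)
    return any(address in ipaddress.ip_network(prefix) for prefix in routing_table)

# Assumed routing table for R1: only its directly connected segments.
r1_routes = ["172.12.123.0/24", "10.1.1.0/24"]

print(next_hop_reachable("172.12.234.3", r1_routes))  # False
print(next_hop_reachable("172.12.123.2", r1_routes))  # True
```

A path whose next hop fails this kind of lookup stays in the BGP table but is never a candidate for best path.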

To get around this rule, we can use the neighbor next-hop-self command on

R2. This will force R2 to announce itself as the next hop of all paths

advertised to the specified neighbor, in this case R1.

R2(config)#router bgp 100
R2(config-router)#neighbor 172.12.123.1 next-hop-self

R1’s BGP table now shows 172.12.123.2 as the next hop for all paths that formerly had 172.12.234.3 or .4 for that value.

Since R1 can reach 172.12.123.2, these paths can now be used by BGP.

The route to 4.4.4.4 now has a next-hop of 172.12.123.2.

The best route to 3.3.3.3 still has a next-hop of 172.12.123.3, but the

next-hop address for the other route to 3.3.3.3 is now 172.12.123.2.

Note in the show ip bgp x.x.x.x command output below that there is no inaccessible comment and the next-hop IP addresses have changed

there as well.

R1#show ip bgp 3.3.3.3
BGP routing table entry for 3.3.3.3/32, version 3
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  10.1.1.5 172.12.123.2
  300
    172.12.123.3 from 172.12.123.3 (3.3.3.3)
      Origin IGP, metric 0, localpref 100, valid, external, best
  300
    172.12.123.2 from 172.12.123.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, internal

R1#show ip bgp 4.4.4.4
BGP routing table entry for 4.4.4.4/32, version 8
Paths: (2 available, best #2, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  10.1.1.5 172.12.123.3
  300 400
    172.12.123.3 from 172.12.123.3 (3.3.3.3)
      Origin IGP, localpref 100, valid, external
  400
    172.12.123.2 from 172.12.123.2 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, internal, best

For the network 4.4.4.4, the path now in use is the one with the next

hop of 172.12.123.2, since its AS_PATH is shorter than the other valid

path. The path for the destination 3.3.3.3 is still the path with the next hop of 172.12.123.3.

Since the weights and local prefs are the same, none of the routes originated on R1, the AS_PATH length is the same, the origin code is the same (IGP), and the MED is the same, the next criterion in line is eBGP routes being used over iBGP routes. The path with the next hop of 172.12.123.3 is an eBGP route, whereas the path with the next hop of 172.12.123.2 is an iBGP route.

I know that list of BGP best-path selection criteria is long, but sometimes

it really does come down to the eighth or ninth tiebreaker. It's

important to know this list for any real-world job involving BGP - but it won't hurt on exam day, either.

The Next-Hop Attribute

We just spent quite a bit of time examining this attribute, but let’s look

at the rules for determining the default next-hop.

When a BGP speaker sends a route to an eBGP neighbor, the next-hop address is set to the IP address of the transmitting interface on the advertising router. In this example, with R3 advertising its loopback address to its eBGP neighbor R1, the next-hop will be the IP address of R3's serial interface.

Makes sense, right? Right!

Now here's the interesting part....

Regarding iBGP routes, for routes originated outside the AS, the next-

hop address will still be the source address of the router in the remote

AS that originally sent the route advertisement.

When the BGP route arrives at R2, the next-hop address is still that of

R3 -- and when there's no full mesh involved, that can lead to trouble.

If R2 does not have an entry in its routing table for the R1-R3 serial network, R1 should announce itself to R2 as the BGP next hop. Otherwise, as you saw in the previous example, the route will sit in R2's BGP table marked as inaccessible and can't be used.

The Multi-Exit Discriminator (MED)

The MED is an optional attribute that comes in handy when there are multiple entrance paths to an AS. The remote AS sets MED values to tell

the other AS which path to use.

The MED is passed between the two autonomous systems, but the value

is not passed to any other ASes. The path with the lowest MED is the

preferred path.

Here, R3 has two possible entry points into AS 100, and therefore two

paths to R4. For varying reasons (one of the paths has greater bandwidth available, one of the paths involves a particularly slow or fast

router), you may want to influence R3's path selection from R1 and R2.

By sending a MED of 100 from R2 and a MED of 200 from R1, you are

actually telling R3 that the path into AS 100 via R2 is more desirable than

the path via R1.

When you write route-maps to set the MED, there is no "set MED"

option. Instead, you are setting the metric value.

R1(config)#route-map SET_MED permit 10
R1(config-route-map)#match ip address 1
R1(config-route-map)#set metric 200

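To set the MED on the routes this router redistributes into BGP, use the default-metric command in the BGP config. As the IOS Help text shows, it applies to redistributed routes rather than to every route the router sends.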

R1(config)#router bgp 100
R1(config-router)#?
Router configuration commands:
  address-family       Enter Address Family command mode
  aggregate-address    Configure BGP aggregate entries
  auto-summary         Enable automatic network number summarization
  bgp                  BGP specific commands
  default              Set a command to its defaults
  default-information  Control distribution of default information
  default-metric       Set metric of redistributed routes

To enable the comparison of the MEDs of routes received from multiple

autonomous systems, use the bgp always-compare-med command.

R3(config)#router bgp 200
R3(config-router)#bgp ?
  always-compare-med  Allow comparing MED from different neighbors
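To tie the MED rules together, here's a small Python sketch of the comparison. It is a simplified model under stated assumptions: the Path class and prefer_by_med function are made-up names, and real BGP makes the same-AS check between pairs of paths rather than across the whole set.

```python
from dataclasses import dataclass

@dataclass
class Path:
    via: str          # router the path enters the neighboring AS through
    neighbor_as: int  # AS of the neighbor that sent the path
    med: int

def prefer_by_med(paths, always_compare_med=False):
    """Return the lowest-MED path, or None when the MEDs are not comparable."""
    same_as = len({p.neighbor_as for p in paths}) == 1
    if not (same_as or always_compare_med):
        return None  # MED step is skipped; later tiebreakers decide
    return min(paths, key=lambda p: p.med)

# R3's two entry points into AS 100, using the MED values from the example.
paths = [Path("R1", 100, 200), Path("R2", 100, 100)]
print(prefer_by_med(paths).via)  # R2
```

With paths from two different autonomous systems, the function returns None unless always_compare_med is set, which mirrors what the bgp always-compare-med command changes.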

The Local Preference Attribute (LOCAL_PREF)

LOCAL_PREF is a well-known attribute that is also used when multiple

paths between autonomous systems exist. The LOCAL_PREF attribute is

just that… local. Routers within the local AS are told what path to use to

exit that AS.

The local preference value is passed only among iBGP peers, and this

value never leaves the local AS. In the following network, there are two exit paths for routers in AS 100 to reach AS 200. The LOCAL_PREF

attribute will be set in AS 100, and it will not leave that AS. The

LOCAL_PREF attribute indicates to the routers in AS 100 what path should be taken to AS 200. The path with the highest LOCAL_PREF is

chosen.

Changing The Local Preference Attribute

Both R1 and R2 have two paths to the 172.12.34.0/24 network.

Examining their BGP tables reveals that R1 will use R3 as a next-hop to

reach this network, and R2 will use R4 to reach it.

If we wanted R2 to use R3 as a next-hop instead, the most efficient way to do so is to change the local preference value, shown as “LocPrf” in the

BGP table.

When the local preference of a path is changed, all routers in the AS will

learn about it. Always run show ip bgp followed by the network number

when you want to examine local preferences:

R2#show ip bgp 172.12.34.0
BGP routing table entry for 172.12.34.0/24, version 4
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  10.1.1.1
  34
    10.1.1.4 from 10.1.1.4 (4.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, external, best
  34
    10.1.1.3 from 10.1.1.1 (172.12.123.1)
      Origin IGP, metric 0, localpref 100, valid, internal

Since both routes have a local preference of 100, the local preference for the path with the next-hop of 10.1.1.3 will have to be changed to a

higher value. There are two approaches for this.

The first is to change the default local preference for a router as a whole,

which means that every update the router sends out to other devices in

the same AS will carry this new local preference value. Here, we’ll double the default local preference on R1.

R1(config)#router bgp 12
R1(config-router)#bgp default ?
  ipv4-unicast      Activate ipv4-unicast for a peer by default
  local-preference  local preference (higher=more preferred)
  route-target      Control behavior based on Route-Target attributes

R1(config-router)#bgp default local-preference 200

Keep using IOS Help; you never know what you may learn! The router is

even reminding us that a higher local preference is preferred. (I wouldn't expect the exam to remind you, though.)

Let’s take a look at R2’s BGP table now:

Now the path with the next-hop of 10.1.1.3 is preferred, due to the

higher local preference.

R2#show ip bgp 172.12.34.0
BGP routing table entry for 172.12.34.0/24, version 5
Paths: (2 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  10.1.1.4
  34
    10.1.1.3 from 10.1.1.1 (172.12.123.1)
      Origin IGP, metric 0, localpref 200, valid, internal, best
  34
    10.1.1.4 from 10.1.1.4 (4.4.4.4)
      Origin IGP, metric 0, localpref 100, valid, external

You can also assign new local preferences to individual prefixes with a

route map. Route maps allow you to change attribute values, or assign

attributes in the first place, on a route-by-route basis rather than the “all-or-nothing” approach other methods offer.

For the CCNP ROUTE exam and especially for real-world BGP routers that

can contain hundreds of routes, I'd become very comfortable with route

maps.

We will now remove the bgp default local-preference command from R1,

and add another segment connecting R3 and R4. This segment,

210.1.1.0 /24, will also be advertised into BGP on R3 and R4.

R1(config)#router bgp 12
R1(config-router)#no bgp default local-preference 200

R3(config)#router bgp 34
R3(config-router)#network 210.1.1.0 mask 255.255.255.0

R4(config)#router bgp 34
R4(config-router)#network 210.1.1.0 mask 255.255.255.0

Let's take a look at the BGP tables of R1 and R2 and see what next-hop

address each router is preferring for each of our two BGP paths.

If we use the bgp default local-preference command here, it will affect

both paths. What if we needed R2 to use 10.1.1.3 as the next hop for

data traveling to 172.12.34.0/24, but to continue using 10.1.1.4 as the next hop for 210.1.1.0/24?

You see the weakness of the "default" approach. Setting a default local preference somewhere in AS 12 won't give us what we need, but

configuring a route map will.

The prefixes that need the higher local preference first need to be

identified by an access-list. I know you know this – but don’t forget that

access lists use wildcard masks!

R1(config)#access-list 18 permit 172.12.34.0 0.0.0.255
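If wildcard masks still trip you up, this short Python sketch shows the arithmetic. The function names are my own for illustration; the point is simply that a wildcard mask is the subnet mask with every bit flipped, and that a 0 bit in the wildcard means "this bit must match".

```python
import ipaddress

def wildcard_mask(netmask):
    """Invert each octet of a dotted-decimal subnet mask."""
    return ".".join(str(255 - int(octet)) for octet in netmask.split("."))

def acl_matches(address, acl_base, wildcard):
    """True if address equals acl_base in every bit the wildcard sets to 0."""
    addr = int(ipaddress.IPv4Address(address))
    base = int(ipaddress.IPv4Address(acl_base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care) == (base & care)

print(wildcard_mask("255.255.255.0"))                          # 0.0.0.255
print(acl_matches("172.12.34.0", "172.12.34.0", "0.0.0.255"))  # True
print(acl_matches("210.1.1.0", "172.12.34.0", "0.0.0.255"))    # False
```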

This ACL will match only this particular prefix, with all others being denied by the implicit deny. We're not denying traffic with this config, though - we're identifying the routes that should have their local preference doubled.

The following route map will assign a local preference of 200 to all routes matching access-list 18, with all other routes unaffected.

R1(config)#route-map PREFER_R3_FOR_172 permit 10
R1(config-route-map)#match ip address 18
R1(config-route-map)#set local-pref 200
R1(config-route-map)#route-map PREFER_R3_FOR_172 permit 20
R1(config-route-map)#set local-pref 100

The route map will be applied to all routes coming in from R3 via the neighbor command.

The word “in” at the end of the command indicates the direction of the updates that will be affected by the route map.

R1(config)#router bgp 12
R1(config-router)#neighbor 10.1.1.3 route-map PREFER_R3_FOR_172 ?
  in   Apply map to incoming routes
  out  Apply map to outbound routes
R1(config-router)#neighbor 10.1.1.3 route-map PREFER_R3_FOR_172 in

After clearing R1’s TCP connections, R2 now has this BGP table:

R2 still uses the next hop 10.1.1.4 to reach 210.1.1.0/24, but now uses 10.1.1.3 for the next hop to reach 172.12.34.0/24 due to the higher

local preference.
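Here is a rough Python model of what that route map does and why R2 ends up with a different next hop for each prefix. The function names are hypothetical, and the tie case assumes (as in the earlier output) that R2 falls back to its own eBGP path through R4 when the local preferences are equal.

```python
import ipaddress

# The range that access-list 18 permits.
PREFERRED = ipaddress.ip_network("172.12.34.0/24")

def local_pref_from_r3(prefix):
    """Mimic PREFER_R3_FOR_172: clause 10 sets 200 on a match, clause 20 sets 100."""
    network = ipaddress.ip_network(prefix)
    return 200 if network.network_address in PREFERRED else 100

def r2_next_hop(prefix):
    """Next hop R2 picks once R1 passes along the paths it learned from R3."""
    via_r3 = local_pref_from_r3(prefix)  # iBGP path relayed by R1
    via_r4 = 100                         # R2's own eBGP path, default local pref
    if via_r3 > via_r4:
        return "10.1.1.3"
    return "10.1.1.4"  # tie on local pref: the eBGP path wins further down the list

print(r2_next_hop("172.12.34.0/24"))  # 10.1.1.3
print(r2_next_hop("210.1.1.0/24"))    # 10.1.1.4
```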

IOS Help shows us that route maps can be used to set almost any BGP

attribute:

R1(config-route-map)#set ?
  as-path           Prepend string for a BGP AS-path attribute
  automatic-tag     Automatically compute TAG value
  comm-list         set BGP community list (for deletion)
  community         BGP community attribute
  dampening         Set BGP route flap dampening parameters
  default           Set default information
  extcommunity      BGP extended community attribute
  interface         Output interface
  ip                IP specific information
  level             Where to import route
  local-preference  BGP local preference path attribute
  metric            Metric value for destination routing protocol
  metric-type       Type of metric for destination routing protocol
  origin            BGP origin code
  tag               Tag value for destination routing protocol
  weight            BGP weight for routing table

Whatever value you need to change in BGP, a route map is most likely

the best way to do it, both on the exam and in real life.

Third-Party Next-Hop

On occasion, you may see a next-hop address that you don't expect, particularly in a situation like the next diagram.

R1, R2, and R3 share a broadcast segment, in this case an Ethernet

segment. R1 has an eBGP peering with R2, and R2 has an iBGP peering with R3. (Always double-check your network documentation or exam

exhibit; never assume a full mesh.)

R3 will advertise its loopback network, 3.3.3.3/32, to R2 via iBGP. R2

will then advertise the route to R1.

R1:

router bgp 1500
 neighbor 100.1.1.2 remote-as 2000

interface Ethernet0
 ip address 100.1.1.1 255.255.255.0

R2:

router bgp 2000
 neighbor 100.1.1.1 remote-as 1500
 neighbor 100.1.1.3 remote-as 2000

R3:

router bgp 2000
 network 3.3.3.3 mask 255.255.255.255
 neighbor 100.1.1.2 remote-as 2000


As expected, R2's BGP table shows 100.1.1.3 as the next-hop address to

reach 3.3.3.3 /32.

There is no peering between R1 and R3, but R1 should get this route

from R2. Since this is an eBGP peering, the route is expected to have a

next-hop address of 100.1.1.2..... right?

Wrong! :)

The next-hop is 100.1.1.3. This is due to third-party next-hop, and

outside of RFCs, you don't hear much about this rule. A BGP speaker is

allowed to advertise the IP address of an internal peer as the next-hop

address IF the external peer receiving the route has a subnet in common with the internal peer.

Howzat for an "if"?

Since R1 and R3 share a subnet, R2 is allowed to send the IP address of

the internal peer, 100.1.1.3, as the next-hop address.

This built-in feature is designed to bring about the most accurate routing

possible, and in this example it does just that. R2 is advertising the route to an external peer, but R2 also knows that the external peer (R1)

shares a subnet with the internal peer (R3). R2 then advertises the

route with a next-hop of 100.1.1.3, resulting in R1 having the most

direct path to 3.3.3.3/32.
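The decision R2 makes can be sketched in a few lines of Python. This is a simplified model, not IOS behavior verbatim; ebgp_next_hop is a made-up name, and the /24 prefix length comes from the interface configs above.

```python
import ipaddress

def ebgp_next_hop(own_address, current_next_hop, peer_address, prefix_len):
    """Pick the next hop to send to an eBGP peer.

    Keeps current_next_hop when it sits on the subnet shared with the peer;
    otherwise the advertising router sets itself as the next hop.
    """
    shared = ipaddress.ip_interface(f"{own_address}/{prefix_len}").network
    same_segment = (ipaddress.ip_address(peer_address) in shared
                    and ipaddress.ip_address(current_next_hop) in shared)
    return current_next_hop if same_segment else own_address

# R2 (100.1.1.2) advertising R3's route to R1 (100.1.1.1).
print(ebgp_next_hop("100.1.1.2", "100.1.1.3", "100.1.1.1", 24))  # 100.1.1.3
```

If R3 lived on some other subnet that R1 does not share, the same function would return R2's own address, which is the behavior you'd normally expect on an eBGP peering.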

The Weight Attribute

Weight is the first value considered in BGP path selection among multiple paths.

There are three other major points to remember about this BGP

attribute:

Cisco-proprietary value

locally significant to a router

is never advertised to other routers

The path with the largest weight is preferred. The default weight for a

route originated on the local router is 32768, and it's zero for all other routes.

Adjusting The Weight Attribute

The R1-R2-R3 network is 172.12.123.0 /24, and the R2-R3-R4 segment is 10.1.1.0 /24. All final octets are the router's number. There is no

iBGP peering between R2 and R3. R4 is advertising its loopback address

of 4.4.4.4/32 into BGP.

R1(config)#router bgp 123
R1(config-router)#neighbor 172.12.123.2 remote-as 123
R1(config-router)#neighbor 172.12.123.3 remote-as 123

R2(config)#router bgp 123
R2(config-router)#neighbor 172.12.123.1 remote-as 123
R2(config-router)#neighbor 10.1.1.4 remote-as 4
R2(config-router)#neighbor 172.12.123.1 next-hop-self

R3(config)#router bgp 123
R3(config-router)#neighbor 172.12.123.1 remote-as 123
R3(config-router)#neighbor 10.1.1.4 remote-as 4
R3(config-router)#neighbor 172.12.123.1 next-hop-self

R4(config)#router bgp 4
R4(config-router)#neighbor 10.1.1.2 remote-as 123
R4(config-router)#neighbor 10.1.1.3 remote-as 123
R4(config-router)#network 4.4.4.4 mask 255.255.255.255

R1 has two paths to 4.4.4.4/32. The path with the next hop 172.12.123.2 is in use, as indicated by the “>” symbol that marks the best route.

That particular path was chosen by the BGP route selection process, and

just to review, here’s that process again:

Highest weight

Highest local pref

Locally originated path (next hop of 0.0.0.0 in show ip bgp)

Shortest AS_PATH

Lowest origin code ( i, e, ?)

Lowest MED (if remote AS is same for all routes)

External BGP over Internal BGP

Lowest IGP metric to next-hop address

Oldest route

Lowest BGP RID

Lowest neighbor IP address (if there's a tie here, you have a

problem!)
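One way to get this list into your head is to treat it as a sort key. The Python sketch below is a study aid under stated assumptions, not the IOS algorithm: the Path class and its field names are invented for illustration, the "oldest route" step is left out, the MED is compared unconditionally, and the sample paths are R1's two paths to 4.4.4.4 and to 3.3.3.3 from the earlier next-hop-self example.

```python
from dataclasses import dataclass
import ipaddress

ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}  # lower is better

@dataclass
class Path:
    next_hop: str
    as_path: tuple
    ebgp: bool
    router_id: str
    neighbor: str
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    origin: str = "i"
    med: int = 0
    igp_metric: int = 0

def sort_key(p):
    """Tuple ordered like the checklist; the smallest tuple is the best path."""
    return (
        -p.weight,                 # highest weight
        -p.local_pref,             # highest local pref
        not p.locally_originated,  # locally originated first
        len(p.as_path),            # shortest AS_PATH
        ORIGIN_RANK[p.origin],     # lowest origin code
        p.med,                     # lowest MED
        not p.ebgp,                # eBGP over iBGP
        p.igp_metric,              # lowest IGP metric to next hop
        int(ipaddress.ip_address(p.router_id)),  # lowest BGP RID
        int(ipaddress.ip_address(p.neighbor)),   # lowest neighbor address
    )

def best_path(paths):
    return min(paths, key=sort_key)

to_4444 = [
    Path("172.12.123.3", (300, 400), ebgp=True, router_id="3.3.3.3", neighbor="172.12.123.3"),
    Path("172.12.123.2", (400,), ebgp=False, router_id="2.2.2.2", neighbor="172.12.123.2"),
]
to_3333 = [
    Path("172.12.123.3", (300,), ebgp=True, router_id="3.3.3.3", neighbor="172.12.123.3"),
    Path("172.12.123.2", (300,), ebgp=False, router_id="2.2.2.2", neighbor="172.12.123.2"),
]
print(best_path(to_4444).next_hop)  # 172.12.123.2 (shorter AS_PATH)
print(best_path(to_3333).next_hop)  # 172.12.123.3 (eBGP over iBGP)
```

Because weight sits first in the tuple, giving either path a weight above zero overrides everything further down the list.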

We went all the way down to the final tiebreaker in this scenario, because all of the preceding criteria were the same. If you’re in an all-Cisco environment, it makes sense to change the weight of a route to make it the preferred route, since that is the first criterion checked.

The weight for both routes is 0, so we'll use the neighbor command to

set the weight for all routes learned from 172.12.123.3 to 200.

The weight for the route with a next-hop of 172.12.123.3 is now 200, making it the preferred path. The weight attribute is Cisco-proprietary, so in a multi-vendor environment we'd change the local pref to get a similar result.

This change to a route’s weight is locally significant only -- R1 will not advertise this route with a weight of 200.

The Atomic Aggregate Attribute

You should get 20 exam points just for saying that fast.

But you won't, so let's take a quick look...

When BGP paths are aggregated, this well-known attribute is attached to the aggregate by the router that performed the aggregation. It gives notice to downstream routers that more-specific BGP routing information was lost at the point of aggregation.

That's all fine - but what's "aggregation"? It's just another term for summarization. You'll perform this summarization the same way you did

for EIGRP and OSPF routes - but BGP has an interesting default that

those two protocols do not have. Stay tuned.

The Aggregator Attribute

This optional attribute gives the BGP Router ID and AS number of the router that performed the aggregation. It does not list the AS numbers that the aggregated routes passed through; that list is carried in the AS_PATH when the as-set option is used with the aggregate-address command.

The Community Attribute

This attribute allows us to logically group routes that should be treated the same way, making them members of a community. Creating BGP communities can save you a lot of work, as you'll see later in this section.

And who doesn't like less work?

The Originator ID and Cluster ID Attributes

Both these optional attributes can be put into effect when route

reflectors are used. We’ll examine these attributes during the route

reflector discussion.

BGP Route Aggregation (Not AggreVation)

In your CCNA studies, you learned how to perform manual route summarization in both RIPv2 and EIGRP. BGP Route Aggregation works

much the same way – the routes to be summarized, or aggregated,

should be written out in binary and the common bits identified. These common bits yield both the aggregate route and the subnet mask, and

we need both of those to get the desired result.

BGP route aggregation gives us choices that RIPv2 and EIGRP did not.

You’ll remember that when manual summarization was configured with

those two protocols, the interface would send out only the summary

route and mask. With BGP, we can send out only the aggregate route and mask, or the aggregate route along with the more-specific routes.

The following network will be used to illustrate route aggregation.

On R5, additional loopback addresses have been configured: 16.1.1.1, 17.1.1.1, 18.1.1.1, and 19.1.1.1, all with /8 masks. They will now be

advertised via BGP.

R5(config-if)#router bgp 500
R5(config-router)#network 16.0.0.0 mask 255.0.0.0
R5(config-router)#network 17.0.0.0 mask 255.0.0.0
R5(config-router)#network 18.0.0.0 mask 255.0.0.0
R5(config-router)#network 19.0.0.0 mask 255.0.0.0

The downstream router, R1, has all four of these routes in its BGP table.

This is fine, but we know it’s a good idea to keep all routing tables as

concise as possible while also keeping them complete. We can

aggregate these four routes and advertise them as one aggregate route.

First, it’s time for our old friend binary math! Since these networks all

have “0” for the last three octets, we’ll only convert the first octet here.

The common bits are the first six bits of each octet.

16 00010000
17 00010001
18 00010010
19 00010011

Working from left to right, we see that the four networks have the first

six bits in common. The value of the first six bits is 16, and the first six bits will be the bits that are set to “1” in the aggregate mask. This

binary string of 11111100 yields a mask of 252.0.0.0.
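You should be able to do that math by hand on exam day, but it never hurts to check your work. Here's a short Python sketch using the standard ipaddress module; note that collapse_addresses only merges networks that exactly fill a larger block, which these four happen to do.

```python
import ipaddress

# The four loopback networks advertised by R5.
networks = [ipaddress.ip_network(f"{octet}.0.0.0/8") for octet in (16, 17, 18, 19)]

# Print the first octet of each network in binary, as in the table above.
for net in networks:
    first_octet = int(net.network_address) >> 24
    print(f"{first_octet:>3}  {first_octet:08b}")

aggregate = list(ipaddress.collapse_addresses(networks))
print(aggregate)             # [IPv4Network('16.0.0.0/6')]
print(aggregate[0].netmask)  # 252.0.0.0
```

The /6 prefix length is the same "first six bits" you found by lining up the binary strings.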

We'll inject the aggregate route into BGP via the aggregate-address

command.

R1 sees the aggregate and places it into its BGP table. Note that by

default, the more-specific routes are not removed from the BGP table.

With EIGRP, RIP, and OSPF summarization, those routes were gone.

To suppress the advertisement of the more-specific routes, use the

summary-only option with the aggregate-address command.

It's common to have an "oh, yeah, now I remember that" moment

(OYNIRTM) at that point in your config. If that happens to you, I recommend you remove the first aggregate-address command before

writing the one with the summary-only option.

There’s one more option with the aggregate-address command you

should know about.

Actually, there are several other options, but one more big one for

the CCNP ROUTE exam. You can learn the others when you go after your CCIE!

R5(config-router)#aggregate-address 16.0.0.0 252.0.0.0 summary-only ?
  advertise-map  Set condition to advertise attribute
  as-set         Generate AS set path information
  attribute-map  Set attributes of aggregate
  route-map      Set parameters of aggregate
  summary-only   Filter more specific routes from updates
  suppress-map   Conditionally filter more specific routes from updates

If you use the as-set option, the AS_PATH advertised for the aggregate will include every AS that was traveled by the more-specific paths being aggregated. Cisco recommends that you do not use this option when a great number of paths are being aggregated, since the aggregate may be removed, updated, and replaced as AS-path reachability changes.

Why aggregate routes in the first place? For the same reason we did so

with other protocols – route aggregation lessens the load on router resources by making the routing tables smaller while still being complete

and accurate.

T-shooting hint: If your aggregate route isn't being advertised, be sure

your BGP table actually has the routes being summarized.

Resetting and Clearing The BGP Peer Connections

Sometimes you’ll find it necessary to reset the TCP session between BGP speakers. Not all changes require this. For example, the route aggregation we just performed required no such reset. There is a “hard reset” and a “soft reset”. The clear ip bgp * command performs a hard reset, where the TCP session itself is reset:

R1#clear ip bgp *
R1#
09:18:36: %BGP-5-ADJCHANGE: neighbor 10.1.1.5 Down User reset
09:18:36: %BGP-5-ADJCHANGE: neighbor 172.12.123.2 Down User reset
09:18:36: %BGP-5-ADJCHANGE: neighbor 172.12.123.3 Down User reset

With this command, the BGP sessions themselves are reset and the neighbor adjacencies are lost. The adjacencies you see here came back

within 20 – 40 seconds, but BGP reachability was lost during that time.

To refresh the routes without tearing down the sessions, use the soft option, as shown here:

R1#clear ip bgp * soft

Internal BGP: Synchronization, Full Meshes, and Route Reflectors

We know all about eBGP and iBGP at this point.

Now we need to learn the important operational differences between the

two. These are vital to success on the ROUTE exam and working with BGP in real-world networks. Here are some basic rules and guidelines in

working with iBGP networks...

iBGP neighbors do not have to be directly connected. The

connection between iBGP routers is on TCP port 179.

It’s common practice to use a remote router’s loopback address in

the neighbor statement, rather than the closest physical address.

This allows us to keep our BGP adjacencies in situations where losing the physical address would result in losing that adjacency.

iBGP routers do not send every route to every neighbor. An iBGP router will advertise a route to its iBGP neighbors only if the route was learned from an eBGP peer or was created by the transmitting router itself: via the network command, by static route redistribution, by IGP route redistribution, or because the advertised route is a connected route in the first place.

This means that when an iBGP speaker learns about a route from an iBGP peer, the only kind of BGP router that route can then be advertised to is an eBGP router. iBGP routers do not advertise routes received from one iBGP neighbor to other iBGP neighbors.
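That rule boils down to a single condition, sketched here in Python. The function name and the three source labels are my own shorthand, and the sketch ignores route reflectors and any filtering you might configure.

```python
def may_advertise(learned_from, send_to):
    """learned_from: 'local', 'ebgp' or 'ibgp'; send_to: 'ebgp' or 'ibgp'."""
    return not (learned_from == "ibgp" and send_to == "ibgp")

# Print every combination of where the route came from and where it would go.
for source in ("local", "ebgp", "ibgp"):
    for target in ("ebgp", "ibgp"):
        print(f"{source:>5} -> {target}: {may_advertise(source, target)}")
```

The only combination that comes back False is an iBGP-learned route headed for another iBGP peer.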

In theory, this would mean that every AS would have to be fully meshed

in order for routes to be properly advertised. In the real world, this would create a great deal of overhead.

Thankfully, this is unnecessary overhead, because BGP gives us a way

around having to create such a logical nightmare. Before we take a look

at this solution, let’s examine BGP’s rule of synchronization.

The BGP rule of synchronization only matters when an AS is going to

serve as a transit area, and if there are non-BGP speakers in the transit

area.

In the illustration, AS 200 is serving as a transit area between AS 100

and AS 500. The issue is that the only iBGP neighbor relationship is between R2 and R4. This is a logical relationship only; when R4 wants to

send data to 200.20.0.0, it has to physically go through R3. Since R3 is

not running BGP, it can’t possibly know about this network, so R3 will

drop packets destined for 200.20.0.0.

Without the synchronization rule, R4 would advertise a path to 200.20.0.0 over its eBGP connection to R5. Of course, R5’s packets

destined for this network would be dropped at R3 as well.

The BGP Rule Of Synchronization states that a transit AS will not

advertise a route until every router in the transit AS has the route in its

IGP routing table.

R4 will not send an advertisement for network 200.20.0.0 to R5 until R4

hears an advertisement for that network from R3 via an IGP; that indicates that the non-BGP speaking R3 has a route for that network.
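As a rough model of R4's decision, consider this Python sketch. Everything specific in it is an assumption for illustration: the function name, the /24 mask on 200.20.0.0 (the text does not give one), and the contents of R4's IGP table.

```python
import ipaddress

def may_advertise_to_ebgp(prefix, igp_routes, synchronization=True):
    """With synchronization on, an iBGP-learned prefix must also be in the IGP table."""
    if not synchronization:
        return True
    wanted = ipaddress.ip_network(prefix)
    return any(wanted == ipaddress.ip_network(route) for route in igp_routes)

# Hypothetical IGP table on R4 before R3 has learned the route.
r4_igp = ["10.1.1.0/24"]
print(may_advertise_to_ebgp("200.20.0.0/24", r4_igp))  # False

# Once the IGP carries the prefix, R4 may advertise it to R5.
r4_igp.append("200.20.0.0/24")
print(may_advertise_to_ebgp("200.20.0.0/24", r4_igp))  # True
```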

BGP Synchronization's major benefit is that packets that can’t possibly reach the desired remote network will not even be sent, reducing both

the amount of unnecessary traffic and the unnecessary strain on router

resources. After all, why send those packets if they can't reach the destination, anyway?

BGP Synchronization is turned off in many deployments, though, and as of IOS version 12.2(8) it's turned off by default. There are three

scenarios under which it's safe to turn synchronization off:

1. If all the routers in the AS are running BGP.

2. If a full mesh exists in the AS.

3. If the AS is not a transit AS to begin with.

To do so, simply run the BGP command no synchronization.

R1(config)#router bgp 100 R1(config-router)#no synchronization

The Problem With BGP Full Mesh Deployments

BGP’s rule of Split Horizon is much different than the Split Horizon rules

you learned in your CCNA studies.

BGP Split Horizon states that one iBGP peer can’t learn about a path

from one iBGP peer and then advertise it to another iBGP peer. Therefore, we would need a logical full mesh among all iBGP speakers in

an autonomous system.

You know how we see very few full meshes in Frame Relay? There's a

reason - and it's the same reason we don't see many BGP full meshes.

Any full-mesh deployment of BGP is going to have a large cost on the

router’s resources (memory, CPU). A full mesh is going to require a

large number of TCP connections, and the more routers you have, the more connections you’ll need.

Take an AS with 20 routers. The formula for determining the number of

connections needed for a full mesh is:

x (x – 1) / 2, with “x” being the number of routers

This formula for 20 routers: 20 (20 – 1) / 2. That’s 20 x 19, which is 380, divided by 2, which is 190. BGP requires 190 separate TCP

connections for a 20-router AS!
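To see how quickly that number grows, here's the same formula as a tiny Python function (the function name is just for illustration):

```python
def full_mesh_sessions(routers):
    """Number of sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2

for n in (4, 20, 50):
    print(n, full_mesh_sessions(n))
# 4 6
# 20 190
# 50 1225
```

Going from 20 routers to 50 multiplies the router count by 2.5 but the session count by more than 6.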

Add this to the administrative nightmare you’ll have in creating this full

mesh, along with the additional configurations that will be needed when

routers are added or removed from the AS, and you’ve got quite a labor-intensive situation.

Three good reasons to avoid full-mesh iBGP deployments:

An unnecessarily large number of TCP sessions are created.

These sessions use a lot of bandwidth.

You're going to spend a lot of time configuring all these peer

connections, and sooner or later, you're going to miss one (especially in a large AS). Then you get to spend even more time

troubleshooting your network!

Luckily, there’s a way around the BGP Split Horizon rule – route reflectors.

Route Reflectors

BGP route reflectors are the exception to the BGP Split Horizon rule. A router configured as a BGP route reflector can take a route learned from

one iBGP peer and advertise it to another iBGP peer.

The iBGP peers that will be sending routes to the route reflector are

referred to as clients. When one client sends a route to the route

reflector, the RR does just that – it reflects the route to the other clients.

To the clients, this is a totally transparent process. The clients don’t even know they are clients, and they require no additional configuration.

All clients must peer with the RR. Clients will not have a peer relationship with other clients. This allows us to have BGP work with a partial mesh

rather than a full mesh.

Remember how we would need 190 separate TCP connections in a 20-router AS? If you have a single router act as an RR in the same 20-router AS, we’d need the RR to have a peering with each of the clients, and each of the other 19 BGP speakers (clients) would have a single BGP peer relationship back to the RR. Since each peering is one TCP connection shared by the two routers at either end, this would result in only 19 total TCP connections being needed (38 neighbor statements in all: 19 on the RR and one on each client).

That’s a huge reduction in the overhead caused by all those TCP

connections, not to mention the hours of configuration and troubleshooting you'll save.

A BGP speaker that has a peer relationship to an RR does not have to be a client; these speakers are called nonclients. Nonclients do have to have a TCP connection to the RR and to every other nonclient in the AS.
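Here is a small Python sketch of how a route reflector decides where to send a route learned from one of its iBGP peers. The client-to-client case comes from the text above; the handling of nonclients follows the standard route reflection rules, and the function name and the nonclient R5 are hypothetical.

```python
def reflect_targets(sender, peers):
    """peers maps each iBGP peer of the RR to 'client' or 'nonclient'.

    A route from a client goes to every other peer; a route from a
    nonclient goes to clients only.
    """
    from_client = peers[sender] == "client"
    return sorted(
        name for name, role in peers.items()
        if name != sender and (from_client or role == "client")
    )

# Assumed peer table for an RR with two clients and one nonclient.
peers = {"R2": "client", "R3": "client", "R5": "nonclient"}
print(reflect_targets("R2", peers))  # ['R3', 'R5']
print(reflect_targets("R5", peers))  # ['R2', 'R3']
```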

Let's take a look at how the use of route reflectors impacts a

network. The following BGP peer relationships are in place and are

indicated with dotted lines. Synchronization has been disabled. All interface IP addresses end with the router's number.

• R1 / R2 / R3 are on a frame network, 172.12.123.0 /24.
• R2 / R3 / R4 are on an ethernet segment, 10.1.1.0 /24.
• Each router has a loopback with its own number for each octet (1.1.1.1, etc.).

Peers:

• R1: Peering with R2 and R3.
• R2: Peering with R1.
• R3: Peering with R1 and R4.
• R4: Peering with R3.

R4 is in AS 4, and will advertise its loopback (4.4.4.4 /32) into BGP. R3

has R4’s loopback in its BGP table:

What about the other routers in AS 1235? Will they have this route in

their BGP tables? Let’s first look at R3’s iBGP peer, R1:

The route is there... but there is no ">" next to the route, so this is not a

"valid and best" route.

Here's a good three-step t-shooting process for BGP - and for just about

anything else in Ciscoworld:

What is the problem?

If we don't immediately know what the issue is, what command will

show us what the problem is?

Once we've identified the issue, how can we solve it?

The problem: The next-hop address for this route, 10.1.1.4, is

unreachable from R1. Never assume IP connectivity!

The command that verifies this: show ip bgp 4.4.4.4.

The solutions: Use dynamic or static routing to get a route to the next hop, 10.1.1.4, into R1's IP routing table, or configure next-hop-self on R3. Let's get some practice with next-hop-self:

R3(config)#router bgp 1235
R3(config-router)#neighbor 172.12.123.1 next-hop-self

R3#clear ip bgp * soft

The result is a next-hop address that R1 can reach, so the BGP route is now valid and best.
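Here's a minimal Python sketch of the check R1 is making, assuming a toy routing table held as a list of prefixes. The table contents and the function name are made up for illustration and are not router output. The idea is simply that a BGP path can only be used if its next hop falls inside a route the router already has.

```python
import ipaddress

def next_hop_reachable(next_hop, routing_table):
    """Return True if next_hop falls inside any prefix in routing_table."""
    hop = ipaddress.ip_address(next_hop)
    return any(hop in ipaddress.ip_network(prefix) for prefix in routing_table)

# Hypothetical view of R1's routing table: it knows the frame segment and
# its own loopback, but nothing about the 10.1.1.0/24 ethernet segment.
r1_routes = ["172.12.123.0/24", "1.1.1.1/32"]

# Before next-hop-self, R3 passes along R4's address as the next hop.
before = next_hop_reachable("10.1.1.4", r1_routes)

# After next-hop-self, R3 advertises its own address on the shared segment.
after = next_hop_reachable("172.12.123.3", r1_routes)

print(before, after)
```

The first check fails and the second succeeds, which is the difference between a route that just sits in the BGP table and one marked valid and best.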

What about R1’s iBGP peer, R2?

R2#show ip bgp

R2#

When you run a show command on a Cisco router and are immediately

back at the enable prompt, that means there is nothing to show you. R2

does not have the route in its BGP table due to BGP’s Split Horizon rule. R1 learned about the route from an iBGP peer, and therefore cannot

advertise that route to other iBGP peers.

The same thing happens if R2 advertises its loopback to R1. R1 can put the route in its BGP table, but cannot advertise the route to its other iBGP peer, R3.

R2(config)#router bgp 1235
R2(config-router)#network 2.2.2.2 mask 255.255.255.255

R1 will see the new route as valid and best....

... but will be unable to advertise it to R3.

R3 doesn’t have the route to R2's network, since it was learned by R1 via

an iBGP peer (R2) and can’t be advertised to another iBGP peer (R3).
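The BGP Split Horizon rule described above boils down to one comparison. This is a rough Python sketch of it, with the peer-type labels chosen just for this example:

```python
def can_advertise(learned_via, send_to):
    """Apply the iBGP split horizon rule.

    learned_via: "ibgp", "ebgp", or "local" (for example, a network statement)
    send_to:     "ibgp" or "ebgp"
    """
    # A route learned from one iBGP peer is never passed to another iBGP peer.
    if learned_via == "ibgp" and send_to == "ibgp":
        return False
    return True

# R1 learned 2.2.2.2/32 from iBGP peer R2 and wants to tell iBGP peer R3.
r1_to_r3 = can_advertise("ibgp", "ibgp")

# R3 learned 4.4.4.4/32 from eBGP peer R4 and wants to tell iBGP peer R1.
r3_to_r1 = can_advertise("ebgp", "ibgp")

print(r1_to_r3, r3_to_r1)
```

The sketch ignores route reflectors on purpose; it only models the default behavior that causes the missing routes in this lab.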

There are two solutions to this issue. The first is to create a full mesh in

AS 1235. Using the formula mentioned earlier, this solution would

require 4 x (4-1) /2 connections, or 6 separate TCP connections.

This solution requires more of a router’s resources, and will take

additional time to configure and possibly troubleshoot -- and it's a horribly non-scalable solution.

We always have to plan for future growth, and the more growth we have

with a full mesh, the more administrative and logical overhead we have.
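To see how quickly the full mesh grows, here's a short Python sketch of the formula. The second function is an assumption for comparison only: it counts sessions when one router acts as the RR and every other router is its client, counting each TCP session once.

```python
def full_mesh_sessions(n):
    """iBGP sessions needed for a full mesh of n routers: n(n-1)/2."""
    return n * (n - 1) // 2

def single_rr_sessions(n):
    """Sessions needed when one router is the RR and the other n-1 are clients."""
    return n - 1

for routers in (4, 20):
    print(routers, full_mesh_sessions(routers), single_rr_sessions(routers))
```

For the four routers in AS 1235 the gap is small, but at 20 routers the full mesh is already ten times the size of the single-RR design.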

A much more scalable solution is to configure R1 as a route reflector.

R2 and R3 will be the route reflector clients. These routers will require no additional configuration.

R1 will identify these two neighbors as route reflector clients, allowing R1

to advertise routes learned via iBGP peers to other iBGP peers.

R1(config)#router bgp 1235
R1(config-router)#neighbor 172.12.123.2 route-reflector-client
00:34:00: %BGP-5-ADJCHANGE: neighbor 172.12.123.2 Down RR client config change
R1(config-router)#neighbor 172.12.123.3 route-reflector-client
00:34:12: %BGP-5-ADJCHANGE: neighbor 172.12.123.3 Down RR client config change
00:34:27: %BGP-5-ADJCHANGE: neighbor 172.12.123.2 Up
00:34:38: %BGP-5-ADJCHANGE: neighbor 172.12.123.3 Up

The results here may be different than those you’ve seen elsewhere.

Configuring a BGP peer as a route reflector client will bring down the peer connection. As you can see from the timestamps, they were only down

for 25 to 30 seconds, but it’s an important point to remember. Especially

on production networks! :)

Let’s look at the BGP tables of the route reflector clients after the

adjacency reforms.

The route reflector is working perfectly.

Route reflectors serve two major purposes. First, they reduce the

number of TCP connections needed in an iBGP deployment. Just as importantly, route reflectors allow us to get around the rule of BGP Split

Horizon – because unlike other protocols you studied to get your CCNA,

you can’t turn BGP Split Horizon off at the interface level.

So if BGP Split Horizon is there to prevent routing loops, why don’t we

have routing loops form when using route reflectors and effectively disabling Split Horizon? We’re going to answer that in just a moment.

First, let's do a little verification.

To verify that a router is seen as a route reflector client, run show ip bgp

neighbor x.x.x.x. This is an excellent command for overall BGP

troubleshooting. This is a verbose command to say the least, but there's

some great information here.

Below you can see that 172.12.123.2 is seen as a route reflector client. I'm only showing you about half of this command's output since the

second half is more for Cisco TAC (Technical Assistance Center) calls, but

at the bottom of this output you can see the number of adjacency resets

and the reason for the last one. Pretty cool!

R1#show ip bgp neighbor 172.12.123.2
BGP neighbor is 172.12.123.2, remote AS 123, internal link
  BGP version 4, remote router ID 2.2.2.2
  BGP state = Established, up for 00:00:41
  Last read 00:00:41, hold time is 180, keepalive interval is 60 seconds
  Neighbor capabilities:
    Route refresh: advertised
    Address family IPv4 Unicast: advertised and received
  Received 881 messages, 0 notifications, 0 in queue
  Sent 890 messages, 0 notifications, 0 in queue
  Route refresh request: received 0, sent 0
  Default minimum time between advertisement runs is 5 seconds

 For address family: IPv4 Unicast
  BGP table version 6, neighbor version 6
  Index 1, Offset 0, Mask 0x2
  Route-Reflector Client
  NEXT_HOP is always this router
  Outgoing update prefix filter list is NO16THROUGH19
  0 accepted prefixes consume 0 bytes
  Prefix advertised 17, suppressed 0, withdrawn 0
  Number of NLRIs in the update sent: max 5, min 0

  Connections established 4; dropped 3
  Last reset 00:01:09, due to RR client config change

Clusters And The Originator-ID Attribute

BGP Clusters are a combination of route reflectors and clients that are

sharing information. Note that I said “reflectors”, not “reflector”. There can be more than one route reflector in a cluster. When deciding on the

routers that will be the route reflectors in a cluster, you should consider

both the peering relationships in place (and the ones that would need to

be added to make the route reflector work) and the impact on router resources that being an RR creates.

Make sure the routers that will serve as the route reflectors in your

network possess the resources to get the job done.

If BGP Split Horizon is intended to stop routing loops, why is Split

Horizon not an issue with clusters? Because the Originator-ID identifies

the router that originated the path. This attribute is set by the route reflector and effectively eliminates the chance of a routing loop. If the

router that originated the route receives the route in an update, the

update will be discarded.
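A rough Python sketch of that discard rule follows. The update is modeled as a plain dictionary, and R3's router ID of 3.3.3.3 is an assumption made for the example (R2's router ID of 2.2.2.2 appears in the neighbor output above).

```python
def process_reflected_update(local_router_id, update):
    """Return the update if it should be kept, or None if it must be discarded."""
    if update.get("originator_id") == local_router_id:
        # We originated this route ourselves; accepting it back could loop.
        return None
    return update

# R2 originated 2.2.2.2/32, so the route reflector stamps R2's router ID on it.
update = {"prefix": "2.2.2.2/32", "originator_id": "2.2.2.2"}

kept_by_r3 = process_reflected_update("3.3.3.3", update)
kept_by_r2 = process_reflected_update("2.2.2.2", update)

print(kept_by_r3, kept_by_r2)
```

R3 keeps the route, while R2 throws away its own route if it ever comes back around.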

Where Do Route Reflectors Send Routes?

Route reflectors have three possible types of peers – clients, nonclients,

and eBGP peers. How a route reflector handles the update depends on the device that sent the update:

Updates from RR clients are sent to all client and nonclient peers.

Updates from eBGP peers are sent to all client and nonclient peers.

Updates from nonclient peers are sent to all clients in the cluster.
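The three rules above can be sketched in a few lines of Python. The peer names R7 and R9 are hypothetical, and leaving the sender out of the result is an assumption of this sketch rather than something stated in the rules.

```python
def reflect_to(source_type, sender, clients, nonclients):
    """Return the iBGP peers a route reflector forwards an update to.

    source_type: "client", "nonclient", or "ebgp"
    sender:      name of the peer the update came from
    """
    if source_type in ("client", "ebgp"):
        targets = clients + nonclients
    elif source_type == "nonclient":
        targets = clients
    else:
        raise ValueError(f"unknown source type: {source_type}")
    # Assumption of this sketch: the update is not sent back to its sender.
    return [peer for peer in targets if peer != sender]

clients = ["R2", "R3"]
nonclients = ["R7"]  # hypothetical nonclient peer

from_client = reflect_to("client", "R2", clients, nonclients)
from_nonclient = reflect_to("nonclient", "R7", clients, nonclients)
from_ebgp = reflect_to("ebgp", "R9", clients, nonclients)

print(from_client, from_nonclient, from_ebgp)
```

The only case where anything is held back is the nonclient one: an update from a nonclient goes to clients only.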

Prefix Lists

Once you’ve got the basic BGP configuration up and running, it’s time to

fine-tune the routes being advertised...

... or maybe the routes that you don't want advertised.

BGP gives us several tools with which to control the flow of network advertisements, and the first of these is the prefix list.

Cisco states several reasons for the use of prefix lists, among them support for incremental updates, their high flexibility, and the fact that writing BGP prefix lists is much easier than writing access-lists that filter BGP updates. (Trust me, they’re right.)

The major reason for using BGP prefix lists is that filtering BGP with prefix lists is much faster and more efficient than other methods.

Why? BGP tables can be huge, and since prefix lists are going to match only on the prefix of the address, the entire process is much faster than

using ACLs.

It’s also easy to go back and insert lines in the middle of a pre-existing

prefix list, which is great when you've written a 20-line list and suddenly

have the need to put a line at position 12.

Before we look at the actual configuration, let’s look at the theory of how

a BGP prefix-list operates. It’s quite similar to an ACL. First, if a route is expressly permitted, it’s used; if it’s denied, it’s not used. (Makes

sense!)

Also lurking at the bottom of every prefix list is our old friend, the

implicit deny. The implicit deny here works the same as it does in an

ACL. Remember that if a prefix is not expressly permitted, it’s implicitly

denied, and any explicit deny statements do NOT override the implicit

deny.

Prefix lists work from top to bottom, just like ACLs, and when a match is found, the list stops running. Prefix list statements are all numbered, with the lowest numbers at the top, so the matching line with the smallest sequence number is the one that takes effect.

Even if you don’t actually number the statements as you write the prefix

list, they’re numbered by default – each line you write is numbered with

the sequence number incrementing by 5 for every line you write. This makes it easy for you to go back and add lines that you might have

forgotten to put in, or when the need arises later to add lines.

To see prefix lists in action, we'll use this network setup:

The R1/R2/R3 network is our old friend 172.12.123.0/24, and the R1-R5

segment is 15.1.1.0/24. Dotted lines indicate BGP peers.

In this example, R5 has four additional loopbacks that will be advertised

into BGP in addition to 5.5.5.5/32.

interface Loopback16
 ip address 16.1.1.1 255.0.0.0
!
interface Loopback17
 ip address 17.1.1.1 255.0.0.0
!
interface Loopback18
 ip address 18.1.1.1 255.0.0.0
!
interface Loopback19
 ip address 19.1.1.1 255.0.0.0
!

R5(config)#router bgp 5
R5(config-router)#network 5.5.5.5 mask 255.255.255.255
R5(config-router)#network 16.0.0.0 mask 255.0.0.0
R5(config-router)#network 17.0.0.0 mask 255.0.0.0
R5(config-router)#network 18.0.0.0 mask 255.0.0.0
R5(config-router)#network 19.0.0.0 mask 255.0.0.0

The downstream routers R1, R2, and R3 all see the routes. Will they be

valid and best on all routers?

The unreachable next-hop address rears its ugly head again, as neither R2 nor R3 has a route for 15.1.1.5. We'll remedy that with the appropriate next-hop-self commands on R1.

R1(config)#router bgp 123
R1(config-router)#neighbor 172.12.123.2 next-hop-self
R1(config-router)#neighbor 172.12.123.3 next-hop-self

Let's verify that command's effect by checking the BGP tables on R2 and

R3.

Now both R2 and R3 have all five routes in their BGP tables, and they

are "valid and best".

Sometimes, though, you don't want every router in a network to have every available route.

Let’s say that you want R1 to know about all five networks, but R2 and R3 should not. We do want R2 and R3 to keep the route to 5.5.5.5/32,

though. A prefix list written on R1 and applied to neighbors R2 and R3

will do this. Let’s write and examine the prefix list first:

R1(config)#ip prefix-list NO16THROUGH19 deny 16.0.0.0/8
R1(config)#ip prefix-list NO16THROUGH19 deny 17.0.0.0/8
R1(config)#ip prefix-list NO16THROUGH19 deny 18.0.0.0/8
R1(config)#ip prefix-list NO16THROUGH19 deny 19.0.0.0/8
R1(config)#ip prefix-list NO16THROUGH19 permit 0.0.0.0/0 le 32

Don’t forget your up arrow when writing prefix lists. That will save you a lot of typing. Also, give your prefix list an intuitive name so that those who follow behind you can tell what the purpose of the prefix list is.

That also helps you remember why you wrote it in the first place!

That last line looks a little strange, doesn’t it? This is the prefix list

equivalent of an ACL’s “permit any” statement. Remember, the four

explicit deny statements do NOT override the unseen implicit deny. The

only way to avoid the implicit deny is to write an explicit statement that permits all prefixes.

Before we apply the prefix list, let's use IOS Help to illustrate what "le"

means.

R3(config)#ip prefix-list NO16THROUGH19 permit 0.0.0.0/0 ?
  ge  Minimum prefix length to be matched
  le  Maximum prefix length to be matched
  <cr>

R3(config)#ip prefix-list NO16THROUGH19 permit 0.0.0.0/0 le 32

"le" means "less than or equal to"; "ge" means "greater than or equal

to".

Now to apply this prefix list to the neighbors R2 and R3.

R1(config)#router bgp 123
R1(config-router)#neighbor 172.12.123.2 prefix-list NO16THROUGH19 out
R1(config-router)#neighbor 172.12.123.3 prefix-list NO16THROUGH19 out

After resetting the connections to R2 and R3, those two routers no longer see the networks 16 – 19.0.0.0/8, but still see the route for

5.5.5.5 /32.

As with ACLs, you’ve got a few options when it comes to viewing prefix

lists and their contents. The basic command is show ip prefix-list.

R1#show ip prefix-list
ip prefix-list NO16THROUGH19: 5 entries
   seq 5 deny 16.0.0.0/8
   seq 10 deny 17.0.0.0/8
   seq 15 deny 18.0.0.0/8
   seq 20 deny 19.0.0.0/8
   seq 25 permit 0.0.0.0/0 le 32

Notice that the first line of the prefix list was numbered “5”, and each

line increments by five, even though we entered no sequence numbers

while writing the list. These numbers do make it very easy to go back and add lines exactly where you want them.

Let’s say that after writing this list and applying it, you realize you want the network 16.1.0.0 /16 to be allowed while denying all other networks

with the prefix 16.0.0.0/8. Using the sequence numbers, we can add

such a line so that it is read before the line that denies all networks with the prefix 16.0.0.0/8.

R1(config)#ip prefix-list NO16THROUGH19 ?
  deny         Specify packets to reject
  description  Prefix-list specific descriptin
  permit       Specify packets to forward
  seq          sequence number of an entry

R1(config)#ip prefix-list NO16THROUGH19 seq 2 permit 16.1.0.0/16

R1#show ip prefix-list
ip prefix-list NO16THROUGH19: 6 entries
   seq 2 permit 16.1.0.0/16
   seq 5 deny 16.0.0.0/8
   seq 10 deny 17.0.0.0/8
   seq 15 deny 18.0.0.0/8
   seq 20 deny 19.0.0.0/8
   seq 25 permit 0.0.0.0/0 le 32

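The line we added with the sequence number “2” was put just where we wanted it - at the top of the prefix list. In this order, an update for the network 16.1.0.0/16 is permitted by line 2 before the list ever reaches the deny for 16.0.0.0/8. (Keep in mind that a line written without "ge" or "le" matches only that exact prefix and mask length, so "deny 16.0.0.0/8" by itself filters the /8 route and not its longer subnets.)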
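To tie the sequence numbers, the "ge" and "le" options, and the implicit deny together, here's a small Python sketch of how a prefix list is evaluated. It's a simplified model under my own assumptions (entries are stored as tuples, and the function names are made up), not IOS code.

```python
import ipaddress

def entry_matches(route, prefix, ge=None, le=None):
    """True if route (e.g. "16.1.0.0/16") matches one prefix-list entry."""
    route_net = ipaddress.ip_network(route)
    entry_net = ipaddress.ip_network(prefix)
    # The route must fall inside the entry's prefix.
    if not route_net.subnet_of(entry_net):
        return False
    # Without ge/le, the mask length must match exactly.
    if ge is None and le is None:
        return route_net.prefixlen == entry_net.prefixlen
    low = ge if ge is not None else entry_net.prefixlen
    high = le if le is not None else 32
    return low <= route_net.prefixlen <= high

def evaluate(prefix_list, route):
    """Return (seq, action) of the first match, lowest sequence number first."""
    for seq, action, prefix, ge, le in sorted(prefix_list, key=lambda e: e[0]):
        if entry_matches(route, prefix, ge, le):
            return seq, action
    return None, "deny"  # the implicit deny at the bottom of every list

NO16THROUGH19 = [
    (5, "deny", "16.0.0.0/8", None, None),
    (10, "deny", "17.0.0.0/8", None, None),
    (15, "deny", "18.0.0.0/8", None, None),
    (20, "deny", "19.0.0.0/8", None, None),
    (25, "permit", "0.0.0.0/0", None, 32),
    (2, "permit", "16.1.0.0/16", None, None),  # added later with "seq 2"
]

for route in ("16.0.0.0/8", "5.5.5.5/32", "16.1.0.0/16"):
    print(route, evaluate(NO16THROUGH19, route))
```

Even though the "seq 2" entry was appended last, sorting by sequence number puts it at the top, just as the router does.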

Peer Groups

BGP Peer Groups help to lower the impact of routing on the router’s

resources, as well as lowering the amount of actual configuration needed for multiple peerings in BGP.

Anything that lessens both our workload and the CPU workload is fine

with me! This is a very powerful concept and you'll definitely see this

anywhere you work with BGP.

Peer group members inherit the settings applied to the peer group,

which is really the whole point of creating peer groups.

R1 will peer with R2, R3, and R5. R1 will have the same outbound policy for all three routers. This allows the configuration of a BGP Peer Group.

(Peer group members can have separate inbound policies.)

In the config below, the second line names the peer group, the third line

identifies the AS number, and the fourth line applies the same route-

map to all members of this peer group. Finally, the members of the peer group are identified with neighbor statements.

R1(config)#router bgp 1235
R1(config-router)#neighbor AS1235GROUP peer-group
R1(config-router)#neighbor AS1235GROUP remote-as 1235
R1(config-router)#neighbor AS1235GROUP route-map AS_POLICY out
R1(config-router)#neighbor 2.2.2.2 peer-group AS1235GROUP
R1(config-router)#neighbor 3.3.3.3 peer-group AS1235GROUP
R1(config-router)#neighbor 5.5.5.5 peer-group AS1235GROUP

As you add neighbors in AS1235, you only have to type one line per new

neighbor - the neighbor command followed by the IP address of the neighbor used for the peer relationship and the name of the peer group.

Note the direction of the route-map shown above - it's outbound. To

repeat, peer group members are required to share the same outbound

policies. They can share the same inbound policies, but they don't have

to.

Peer group names are locally significant only - the name of the group isn't passed to other routers. This means you can reuse the name

throughout the network, but I'd be careful about that - it can get a little

confusing to the network admins. Peer groups take a little getting used

to, but they're a very efficient way of configuring routers.

Not to mention saving you a lot of typing! :)
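If it helps to picture the inheritance, here's a loose Python sketch of it. The dictionary keys and the inbound route-map name R3_IN are invented for illustration; the point is only that members pick up the group's settings and may add inbound policy of their own, but not a different outbound policy.

```python
def effective_config(group_settings, neighbor_settings=None):
    """Combine inherited peer-group settings with a neighbor's own inbound settings."""
    neighbor_settings = neighbor_settings or {}
    # Members must share the group's outbound policy, so only inbound extras are allowed.
    for key in neighbor_settings:
        if key.endswith("_out"):
            raise ValueError(f"{key} must come from the peer group")
    merged = dict(group_settings)
    merged.update(neighbor_settings)
    return merged

group = {"remote_as": 1235, "route_map_out": "AS_POLICY"}

r2 = effective_config(group)
r3 = effective_config(group, {"route_map_in": "R3_IN"})  # R3_IN is a made-up name

print(r2)
print(r3)
```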

BGP Confederations

We'll take BGP logical groups to another level by creating BGP Confederations.

BGP Confederations are a logical grouping of autonomous systems that appear to outside BGP speakers as a single AS.

The internal AS numbers are not known to any BGP speaker outside the

Confederation. Using BGP Confederations also limits the number of iBGP peer connections - just as with route reflectors, a full mesh is not

needed. In the following example, R9 is totally unaware that there is a

confederation, and knows only of the existence of AS 321. R9 has no idea that AS 321 actually contains three other autonomous systems.

R1's configuration will look like this:

R1(config)#router bgp 123
R1(config-router)#bgp confederation identifier 321
(assigns number 321 to the confederation; this will be the AS number seen by R9)
R1(config-router)#bgp confederation peers 7 671
(identifies the other AS numbers that are part of the confederation)

R1(config-router)#neighbor 9.9.9.9 remote-as 9
R1(config-router)#neighbor 2.2.2.2 remote-as 123
R1(config-router)#neighbor 3.3.3.3 remote-as 123
R1(config-router)#neighbor 5.5.5.5 remote-as 7
R1(config-router)#neighbor 6.6.6.6 remote-as 671

R9's neighbor statement for R1 will refer to AS 321, the confederation number.

R9(config)#router bgp 9
R9(config-router)#neighbor 1.1.1.1 remote-as 321
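A quick Python sketch of which AS number each neighbor uses for R1, based on the configuration above. The function name is my own, and the model is deliberately simple: anyone inside the confederation sees R1's real AS, and anyone outside sees only the confederation identifier.

```python
def as_seen_by(neighbor_as, local_as, confed_id, confed_peers):
    """AS number a neighbor should use in its remote-as statement for this router."""
    inside = neighbor_as == local_as or neighbor_as in confed_peers
    return local_as if inside else confed_id

# Values taken from R1's configuration.
local_as = 123
confed_id = 321
confed_peers = {7, 671}

seen_by_r9 = as_seen_by(9, local_as, confed_id, confed_peers)    # outside the confederation
seen_by_r5 = as_seen_by(7, local_as, confed_id, confed_peers)    # member AS 7
seen_by_r2 = as_seen_by(123, local_as, confed_id, confed_peers)  # same AS as R1

print(seen_by_r9, seen_by_r5, seen_by_r2)
```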

Communities

BGP communities allow us to tag a route or group of routes with a common value that will follow it throughout the rest of the network.

(A good way to remember this is the simple phrase "Communities equal

consistency.") Communities are transitive optional attributes. Some common community values:

NO-EXPORT: Marking a route with this community attribute prevents it from being advertised to an eBGP peer.

NO-ADVERTISE: Taking the previous community one step further, this community attribute prevents the route from being advertised to ANY

other router.

The available communities change often, with new ones added, so I

recommend you check Cisco's website for the available communities for

your IOS. You'll have to master them to become a CCIE.
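The two communities described above can be modeled with a short Python sketch. The numeric values are the well-known ones reserved for NO_EXPORT and NO_ADVERTISE; the function and the custom community in the example are illustrative only.

```python
# Well-known community values.
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02

def may_advertise(communities, peer_type):
    """Decide whether a route may be sent to a peer ("ibgp" or "ebgp")."""
    if NO_ADVERTISE in communities:
        return False  # never passed on to any peer
    if NO_EXPORT in communities and peer_type == "ebgp":
        return False  # stays inside the local AS
    return True

tagged_no_export = {NO_EXPORT}
tagged_no_advertise = {NO_ADVERTISE}
untagged = {(1235 << 16) | 100}  # an ordinary 1235:100 community, made up for this example

print(may_advertise(tagged_no_export, "ibgp"), may_advertise(tagged_no_export, "ebgp"))
print(may_advertise(tagged_no_advertise, "ibgp"), may_advertise(untagged, "ebgp"))
```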

Internet Connections And BGP

Four little words, so much potential for trouble. Working with BGP can

become quite a complex endeavor, and trying to tell you everything

about BGP and internet connectivity here is, well, impossible. We’re going to take a few minutes here and look at some basic design

guidelines and some introductory terminology.

The first term is multihoming. This is a BGP configuration where multiple

connections to the internet exist. This allows for load balancing as well

as redundancy – you don’t want to have internet connectivity cut off if one path goes down. Single points of failure are never good, but can be

positively crippling with BGP.

From the ISP’s point of view, there are three ways to handle sending

routes to the BGP AS:

Send default routes only into the AS. (Low resource usage - uses

the least memory of these three options.)

Send default routes and selected more-specific routes into the AS.

Send all routes into the AS. (High resource usage - uses the most memory of these three options.)

If the ISP sends only default routes into the AS, the non-BGP speakers in the AS will naturally use the path with the best metric to reach external

destinations. With the other two choices, BGP will generally use the

AS_PATH value to decide how routers in the AS should reach external destinations. The ISP has to walk a line between having more-specific

routing tables and overtaxing router resources.

Communications Between Your Router And ISP

Having more than one connection to an ISP, or having connections to

multiple ISPs, is great for redundancy but can be tough on the

router. Hopefully you've got a brand-new top-of-the-line router for R6

here, but that isn't always the case. The amount of CPU and memory on this router is especially critical, and can impact the type of routes you

should be receiving from your ISP.

If R6 has plenty of memory and CPU (and yes, "plenty" is an arbitrary

term), you should be okay getting specific routes from the ISPs. If

memory and CPU are a concern, you should consider receiving only a default route from the ISPs. Receiving only default routes causes the

least stress on your router resources.

You can opt for a combination of default and more-specific routes, but in

the real world, you've usually got a router that can handle the load of

specific routes or a router that can only handle default routes.

IGP < > BGP Redistribution

Warning: Don’t ever, ever, ever perform redistribution between IGP and

BGP unless you really know what you’re doing. And I mean really know

what you’re doing. That’s what practice labs are for!

Route redistribution does not have to be bidirectional. You can redistribute

RIP routes into an EIGRP AS without taking the EIGRP routes and placing them into RIP. For all practical purposes, route redistribution is not

dynamic; it must be configured. The exception is when EIGRP and IGRP

are running on the same router and are also using the same AS number.

What’s all this got to do with BGP, you ask? At times, it may be

necessary for you to place IGP routes into the BGP routing table. There are three ways to do this: the network command, redistribution of static

routes, and redistribution of dynamically learned IGP routes.

Cisco recommends you avoid the last choice whenever possible, and so

do I. That form of redistribution can easily lead to routing loops. The

network command is generally your best bet.

We have the ability to redistribute BGP routes into an IGP, but there is

rarely good reason to do so. The basic reason this is usually a bad idea is simple; the Internet has a LOT of routes, many more routes than your

network is going to be equipped to handle. A full Internet BGP routing table holds hundreds of thousands of routes.

Another danger to avoid – routes learned via an IGP in one AS should never be redistributed to other autonomous systems via BGP. You’re

begging for a routing loop.

Private AS Numbers

BGP allows you quite a bit of range when it comes to selecting an AS number:

R1(config)#router bgp ?
  <1-65535>  Autonomous system number

Just as there are private IP addresses, there are private AS numbers.

The AS numbers 64512 - 65535 are considered "private" AS numbers

and just as private IP addresses should not be advertised to external networks, neither should private AS numbers.

Public or private, you can't assign AS number zero with BGP, just as you

couldn't with EIGRP.
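A tiny Python sketch of those ranges, using exactly the numbers given in this section (the valid range from the IOS help output and the private range of 64512 - 65535). The function name is made up.

```python
def classify_as(asn):
    """Classify a 2-byte AS number as "public" or "private"."""
    if not 1 <= asn <= 65535:
        raise ValueError(f"AS number {asn} is outside the range 1-65535")
    return "private" if 64512 <= asn <= 65535 else "public"

print(classify_as(1235), classify_as(64512), classify_as(65535))
```

Passing zero raises an error, which mirrors the router refusing AS number zero.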

show ip bgp neighbor vs. show ip bgp summary

For OSPF and EIGRP, the show ip ospf neighbor and show ip eigrp neighbor commands are the way to check on adjacencies. For BGP, while

the show ip bgp neighbor command will certainly give you information on

the router's BGP neighbors, it may well be too much information. Here's

the output of show ip bgp neighbor on a BGP speaker that has only one neighbor!

R3#show ip bgp neighbor
BGP neighbor is 172.12.23.2, remote AS 23, internal link
  BGP version 4, remote router ID 172.12.23.2
  BGP state = Established, up for 00:01:24
  Last read 00:00:23, hold time is 180, keepalive interval is 60 seconds
  Neighbor capabilities:
    Route refresh: advertised and received(new)
    Address family IPv4 Unicast: advertised and received
  Received 5 messages, 0 notifications, 0 in queue
  Sent 5 messages, 0 notifications, 0 in queue
  Route refresh request: received 0, sent 0
  Default minimum time between advertisement runs is 5 seconds

 For address family: IPv4 Unicast
  BGP table version 1, neighbor version 1
  Index 1, Offset 0, Mask 0x2
  0 accepted prefixes consume 0 bytes
  Prefix advertised 0, suppressed 0, withdrawn 0
  Number of NLRIs in the update sent: max 0, min 0

  Connections established 1; dropped 0
  Last reset never
Connection state is ESTAB, I/O status: 1, unread input bytes: 0
Local host: 172.12.23.3, Local port: 11000
Foreign host: 172.12.23.2, Foreign port: 179

Enqueued packets for retransmit: 0, input: 0 mis-ordered: 0 (0 bytes)

Event Timers (current time is 0x732670):
Timer          Starts    Wakeups    Next
Retrans             6          0     0x0
TimeWait            0          0     0x0
AckHold             5          1     0x0
SendWnd             0          0     0x0
KeepAlive           0          0     0x0
GiveUp              0          0     0x0
PmtuAger            0          0     0x0
DeadWait            0          0     0x0

iss: 3768420242  snduna: 3768420364  sndnxt: 3768420364  sndwnd: 16263
irs: 671210739  rcvnxt: 671210861  rcvwnd: 16263  delrcvwnd: 121

SRTT: 165 ms, RTTO: 1172 ms, RTV: 1007 ms, KRTT: 0 ms
minRTT: 8 ms, maxRTT: 300 ms, ACK hold: 200 ms
Flags: higher precedence, nagle

Datagrams (max data segment is 1460 bytes):
Rcvd: 8 (out of order: 0), with data: 5, total data bytes: 121
Sent: 8 (retransmit: 0), with data: 5, total data bytes: 121

That's a lot of information! To get a brief summary of BGP neighbor

status, use... you guessed it ... show ip bgp summary!

R3#show ip bgp summary
BGP router identifier 5.5.5.5, local AS number 23
BGP table version is 1, main routing table version 1

Neighbor        V    AS  MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down   State/PfxRcd
172.12.23.2     4    23        6        6       1    0     0  00:02:56  0

There's no "right" or "wrong" way to view BGP neighbors. It all depends on how much information you need!

A Little Of This 'n' That

BGP Message Types, The Peering Process, And The BGP RID

Once the TCP connection is complete, the Open message is the first one to go out. If the values in the Open sent by "Router A" are acceptable to "Router B", then a keepalive is returned by "B" and the BGP connection can then be built.

The Open message contains the BGP RID that we've seen in a couple of

show commands, and the rules for the BGP RID are (thankfully) the

same as they are for the OSPF RID.
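As a refresher, here's a short Python sketch of the RID selection order carried over from OSPF: a manually configured RID wins, then the highest loopback address, then the highest address on any other active interface. The interface addresses in the example are assumptions based on the labs in this section.

```python
import ipaddress

def select_router_id(configured=None, loopbacks=(), interfaces=()):
    """Pick a router ID: configured value, else highest loopback, else highest interface."""
    if configured:
        return configured
    candidates = loopbacks or interfaces
    if not candidates:
        raise ValueError("no IP addresses available for a router ID")
    # Compare as addresses, not strings, so 172.12.123.1 beats 15.1.1.1.
    return str(max(ipaddress.ip_address(addr) for addr in candidates))

# Assumed addressing for R1: one loopback plus its two physical segments.
r1_loopbacks = ["1.1.1.1"]
r1_interfaces = ["172.12.123.1", "15.1.1.1"]

default_rid = select_router_id(loopbacks=r1_loopbacks, interfaces=r1_interfaces)
no_loopback_rid = select_router_id(interfaces=r1_interfaces)
hardcoded_rid = select_router_id("11.11.11.11", r1_loopbacks, r1_interfaces)

print(default_rid, no_loopback_rid, hardcoded_rid)
```

Note that a loopback wins even when a physical interface has a numerically higher address.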

You can hardcode the BGP RID as well, with the bgp router-id command.

R1(config)#router bgp 1235
R1(config-router)#bgp ?
  always-compare-med      Allow comparing MED from different neighbors
  bestpath                Change the default bestpath selection
  client-to-client        Configure client to client route reflection
  cluster-id              Configure Route-Reflector Cluster-id
  confederation           AS confederation parameters
  dampening               Enable route-flap dampening
  default                 Configure BGP defaults
  deterministic-med       Pick the best-MED path among paths advertised f
                          the neighboring AS
  fast-external-fallover  Immediately reset session if a link to a direct
                          connected external peer goes down
  log-neighbor-changes    Log neighbor up/down and reset reason
  redistribute-internal   Allow redistribution of iBGP into IGPs (dangero
  router-id               Override configured router identifier
  scan-time               Configure background scanner interval

R1(config-router)#bgp router-id ?
  A.B.C.D  Manually configured router identifier

R1(config-router)#bgp router-id 11.11.11.11
R1(config-router)#^Z
R1#show ip bgp
19:50:28: %BGP-5-ADJCHANGE: neighbor 15.1.1.5 Down Router ID changed
19:50:28: %BGP-5-ADJCHANGE: neighbor 172.12.123.2 Down Router ID changed
19:50:28: %BGP-5-ADJCHANGE: neighbor 172.12.123.3 Down Router ID changed
19:50:28: %SYS-5-CONFIG_I: Configured from console by console

Oh, yeah -- your adjacencies will come down when you do that.

show ip bgp verifies the change (table removed from output)

R1#show ip bgp
BGP table version is 6, local router ID is 11.11.11.1

Back to the packet types...

The BGP Update message is unlike a RIP or EIGRP update, which simply lists multiple routes. A BGP Update describes one path and one path only: it carries a single set of path attributes, and it can list more than one prefix only when every prefix shares exactly those attributes. Having seen BGP in action, you know there can be much more information to carry about a BGP route than a RIP or EIGRP route.

A couple of times during the course, we saw a BGP Notification message - that's sent when an error is detected, and the connection is closed right after it.

BGP keepalives are sent every 60 seconds by default; the BGP default hold time is 180 seconds.

Watch your iBGP vs. eBGP neighbors. If you're looking at a potential

eBGP neighbor and that neighbor isn't directly connected, you need a static route pointing to that neighbor and the ebgp-multihop command.

In some cases with synchronization on, you can use a static route to

null0 - the "bit bucket" - to allow a BGP route to be used. It's doubtful

that'll appear on the CCNP ROUTE exam, but I mention it to let you know

that a static route to null0 does not help with eBGP neighbor

relationships.

With iBGP neighbors, since they're in the same autonomous system, it's likely that the route to the neighbor exists via an IGP. If not, you can

use a static route there as well. The key is that an IGP will not be

running between ASes, so with eBGP neighbors we have only the static route - not dynamically learned routes.

We saw the result of clear ip bgp * -- that's a hard BGP reset and it

brings the adjacencies down. We go to a lot of trouble to build those suckers, so let's not do that unless absolutely necessary.

R1#clear ip bgp * ?
  in     Soft reconfig inbound update
  ipv4   Address family
  out    Soft reconfig outbound update
  soft   Soft reconfig
  vpnv4  Address family
  <cr>

Running the soft option shown above is the same as running out -- both result in a soft outbound reset.

Now if you're like me - and I mean no insult by that - you'd wonder why the "soft" option by itself doesn't perform both an inbound and outbound

update.

Simply put, the outbound update is easy on the router memory, and the

inbound update is a memory hog.

The soft inbound reset is fine for updating the BGP tables without tearing

the adjacencies down, but it's still a bit of a memory hog.

We have a relatively new method of performing this reset that's even

easier on everyone involved - and you may have seen it mentioned in the rather verbose output of show ip bgp neighbor:

R1#show ip bgp neighbors
BGP neighbor is 15.1.1.5, remote AS 1235, internal link
 Member of peer-group POLICYOUT for session parameters
  BGP version 4, remote router ID 19.1.1.1
  BGP state = Established, up for 00:01:26
  Last read 00:00:25, hold time is 180, keepalive interval is 60 seconds
  Neighbor capabilities:
    Route refresh: advertised and received(new)

If your routers show some message involving route refresh, you can run

it with the clear ip bgp in command. The actual words "route refresh" aren't mentioned in the command.

R1#clear ip bgp * ?
  in     Soft reconfig inbound update
  ipv4   Address family
  out    Soft reconfig outbound update
  soft   Soft reconfig
  vpnv4  Address family
  <cr>

R1#clear ip bgp * in ?
  <cr>

We've run a lot of show commands in this section, but not much

debugging. Let me show you a few basic debugs...

R1#debug ip bgp ?
  A.B.C.D     BGP neighbor address
  dampening   BGP dampening
  events      BGP events
  in          BGP Inbound information
  keepalives  BGP keepalives
  out         BGP Outbound information
  updates     BGP updates
  vpnv4       VPNv4 NLRI information
  <cr>

R1#debug ip bgp keepalives
BGP keepalives debugging is on
R1#
20:30:48: BGP: 172.12.123.3 sending KEEPALIVE (io)
20:30:48: BGP: 172.12.123.3 KEEPALIVE rcvd
20:30:49: BGP: 172.12.123.2 sending KEEPALIVE (io)

R1#debug ip bgp events
BGP events debugging is on

R1#clear ip bgp * soft
R1#
20:32:12: BGP(0): 1 updates (average = 56, maximum = 56)
20:32:12: BGP(0): 15.1.1.5 updates replicated for neighbors: 172.12.123.2 172.12.123.3
20:32:12: BGP: Import timer expired. Walking from 1 to 1
R1#
R1#clear ip bgp * in
R1#
20:32:27: BGP: Import timer expired. Walking from 1 to 1
R1#
R1#
R1#clear ip bgp *
R1#
20:32:34: BGP: reset all neighbors due to User reset
20:32:34: BGP: 15.1.1.5 reset due to User reset
20:32:34: %BGP-5-ADJCHANGE: neighbor 15.1.1.5 Down User reset
20:32:34: BGP: 172.12.123.2 reset due to User reset
20:32:34: %BGP-5-ADJCHANGE: neighbor 172.12.123.2 Down User reset
20:32:34: BGP: 172.12.123.3 reset due to User reset
R1#
20:32:34: %BGP-5-ADJCHANGE: neighbor 172.12.123.3 Down User reset
R1#
20:32:42: BGP: Import timer expired. Walking from 1 to 1
R1#u all
All possible debugging has been turned off

Copyright © 2011 The Bryant Advantage. All Rights Reserved.
