understanding convergence in mpls vpn … · · 2017-02-061 understanding convergence in mpls vpn...
TRANSCRIPT
1
UNDERSTANDING CONVERGENCE IN MPLS VPN NETWORKS
Mukhtiar A. Shaikh ([email protected])Moiz Moizuddin ([email protected])
222
Agenda
• Introduction• Convergence Definition• Expected (Theoretical) Convergence• Test Methodology• Day in the Life of VPN Routing Update• Observed Up Convergence • Observed Down Convergence • Design Considerations• Summary
333
Importance of Convergence in L3VPN-Based Networks
• Convergence in the traditional overlay Layer 2 VPNs is pretty fast
• In the traditional Layer 2 VPN Frame or ATM-based networks, Service provider network is not a factor for Layer 3 convergence
• Customers are now moving to VPN services based on Layer 3 infrastructure (aka RFC 2547 based VPNs)
• It is necessary to understand the factors which impacts the L3VPN convergence and how it can be improved
• Convergence varies depending on the network size, PE-CE protocol, redundancy options, etc.
• Default convergence in the MPLS VPN networks could be very high in the order of 60+ secs…but not always ☺☺☺☺
444
Agenda
• Introduction• Convergence Definition• Expected (Theoretical) Convergence• Test Methodology• Day in the Life of VPN Routing Update• Observed Up Convergence • Observed Down Convergence • Design Considerations• Summary
555
Remote_CE
Convergence DefinitionWhat Is Convergence in MPLS VPN Networks?
• Convergence is the time it takes for the data traffic from the remote CE to reach the local CE after a topology change has occurred in the network
Local_CE
RR
Local_PE Remote_PE
666
What Is Up Convergence??
• Up convergence in L3VPN environment can be defined as the time it takes for traffic to be restored between VPN sites when:
A new prefix is advertised and propagated from a local CE to the remote CE, orA new site comes up
Local_CE Local-PEP
Routes Advertised
TrafficRestored
MPLS Core
Routes Advertised on the Primary link
RR
RR
P
Remote-CERemote-PE1
Remote-PE2
Routes advertised on the backup link
Backup link
• Up Convergence is applicable in cases where there is a backup link which comes up only after the primary goes down
• Or If we are using some sort of conditional advertisement• Up convergence can be loosely defined as route advertisement from
CE to CE
Px
777
What Is Down Convergence??
Local_CE Local-PEP
Routes Advertised RoutesWithdrawn
TrafficRestored
MPLS Core
Routes Received =Up Convergence
RR
RR
P
Remote-CERemote-PE1
Remote-PE2
Routes Withdrawn orReroute via Alternate Path = Down Convergence
• Down convergence can be defined as how fast the traffic is rerouted on an alternate path due to failure either in the
•SP network•Customer network •(Primary) PE-CE link
• Down convergence can be loosely defined as withdrawal of best path
888
What Are Convergence Points in a MPLS VPN Network??
• Overall VPN convergence is the sum of individual convergence points
Local_CE
Local_PE
RR
Remote_PE
Remote_CE
Advertisement of Routes to
MP-BGP Peers=5 sec
T4 T4
T1
T2
T3
Receipt of Local Routes into BGP
Table on the PE Router
T5
Receipt of Advertised Routes into BGP Table on
the PE Router
T6
T7
T8Processing of
Incoming Updates by the CE Router
Import of LocalRouting
Information into the
Corresponding Ingress VRF
Routing Table
PE Router Receivesa Routing
Updatefrom a CE
Router=30sec
Advertisement ofRoutes to CE
Routers=30 sec
Import of Newly Received
Routes into Local VRF’s
Routing Table=15sec
RIBRIB FIB
LCLC--HWHWFIBFIB
LCLCFIBFIB
VrfTable
MP-BGP TableMPMP--BGP BGP TableTable
MP-BGP TableMPMP--BGP BGP TableTable
RIBRIB FIB
LCLC--HWHWFIBFIB
LCLCFIBFIB
VrfTable
999
Summary (Theoretical Convergence)• Two sets of timers; first set consists of T1, T4, T6 and T7; second set comprises of T2, T3,
T5 and T8• First set mainly responsible for the slower convergence unless aggressively tweaked
down• Theoretically sums up to ~ 85 seconds [30 (T1)+5*2 (T4)+15(T6)+30 (T7)]• Once different timers are tuned, convergence mainly depends on T6; min T6=5 secs• Assuming ~“x” secs for T2, T3, T5 and T8 collectively
Remote_PE
Remote_CELocal_CE
RRLocal_PE
T8
T3
T2
T5T4 T4
T1T7
T6
~5+x Seconds
~5+x Seconds~5+x Seconds~5+x Seconds
Max Conv. Time (Timers Tweaked Scan=5, Adv=0)
~85+x SecondsRIP
~25+x SecondsEIGRP~25+x SecondsOSPF~85+x SecondsBGP
Max Conv. Time (Default Settings)PE-CE Protocol
101010
Agenda
• Introduction• Convergence Definition• Expected (Theoretical) Convergence• Test Methodology• Day in the Life of VPN Routing Update• Observed Up Convergence • Observed Down Convergence • Design Considerations• Summary
111111
Test Methodology
• Total number of PEs used = 100
Total number of vrfs created = 1000 vrfs
250 BGP sessions250 RIP instances20 OSPF sessions250 EIGRPRemaining sessions (~230) configured with static routing
• Same RD was used for each VPN
• 100 routes per vrf in steady state
• One additional test vrf with 1000 prefixes
• Total number of VPN routes =1000*100=100k*2
• 2-RRConvergence measured for the test vrf
• Testing done with reasonably large MPLS VPN network• Test tool was used for simulating the VPN sites, generating the VPN
routing information and sending traffic to the VPN prefixes
121212
Test Cases Carried Out…
• Test case I—Default timers BGP import scanner = 15Advertisement interval = 30 (EBGP) and 5 (IBGP)
• Test case III—Tweak BGP import scanner
BGP import scanner = 5router bgp 65001Address-family vpnv4
bgp import scan 5 Advertisement interval = default
• Test case II—Tweak BGP advertisement interval
BGP import scanner = 15Advertisement interval = 0
router bgp 65001Address-family vpnv4
neighbor a.b.c.d advertisement-interval 0
• Test case IV—Tweak BGP advertisement and import Scanner timers
BGP import scanner = 5 Advertisement Interval = 0
router bgp 65001Address-family vpnv4
bgp import scan 5 neighbor a.b.c.d advertisement- interval 0
131313
Agenda
• Introduction• Convergence Definition• Expected (Theoretical) Convergence• Test Methodology• Day in the Life of VPN Routing Update• Observed Up Convergence • Observed Down Convergence • Design Considerations• Summary
141414
Day in the Life of a VPN Update
• This section shows route propagation for a network using EIGRP as the PE-CE routing protocol
• Other routing protocols would exhibit similar behavior with few exceptions (explained later on)
• Default BGP timers are used; adv-interval = 5s, import scanner = 15s
• Up Convergence discussed as part of this case study
Remote_CELocal_CE
RRLocal_PE Remote_PE
EIGRP EIGRP
151515
Day in the Life of a VPN Update (Cont.)
• Routes are first installed in the VRF routing table• If PE-CE protocol is non-BGP (in this case EIGRP), additional time (can be negligible for
smaller number of prefixes) is needed to redistribute these routes into MP-BGP and generating VPNV4 prefixes
• Not all the prefixes are installed in the routing table at the same time as updating RT, FIB/LFIB takes some time
Remote_CE
RR
Local_PE Remote_PE
RIBRIB FIB
LCLC--HWHWFIBFIB
LCLCFIBFIB
Jan 2 10:35:01.341 PST: RT(vpn1): add 10.0.0.0/24 via 192.168.1.54, eigrp metric [170/2560002816]
VRFVRFBGPBGP
First Batch
Tic…Tic…5sec
Local_CE
After Receiving the First EIGRP Update, within 100 Msec (Approx) in This Example First Batch of IBGP Updates are Advertised to RR
Jan 2 10:35:01.469 PST: BGP(2): 192.168.10.12 send UPDATE(prepend, chgflags: 0x820) 100:1:10.0.0.0/24
• Once update is sent with some prefixes (First batch) to RR, the iBGP adv-int (5 secs)kicks in on local PE
(Adv-Interval)
161616
• When first batch is received on RR, routes are immediately sent to the remote PE
Day in the Life of a VPN Update (Cont.)
Remote_CELocal_CE
RR
Local_PE Remote_PE
VRFVRFBGPBGP
First Batch
Tic…Tic…5 Sec
First Batch
BGPBGP
VRFVRFBGPBGP
Tic…Tic…15sec
*Jan 2 10:35:01.996: BGP(2): 192.168.1.11 rcvd 100:1:10.0.0.0/24*Jan 2 10:35:02.000: BGP(2): 192.168.1.11 send UPDATE (format) 100:1:10.0.0.0/24
Remote_PE#Show ip bgp vpnv4 all 10.0.0.0BGP routing table entry for 100:1:10.0.0.0/24, version 14Paths: (1 available, best #1, no table)Flag: 0x820Not advertised to any peer65501192.168.1.11 (metric 30) from 192.168.10.12 (192.168.10.12)Origin incomplete, metric 0, localpref 100, valid, internal, bestExtended Community: RT:100:1
Jan 2 10:35:03.165 PST: RT(vpn1): add 10.3.132.0/24 via 192.168.1.54, eigrp metric [170/2560002816]
• In the mean time, more EIGRP prefixes are received from the CE, processed and installed in the VRF table on local PE router
• But these prefixes are subjected to the adv-interval and have to wait for a total of 5 secs before they are sent to RR
Tic…Tic…5Sec
(Import Scanner)(Adv-Interval)
171717
Day in the Life of a VPN Update (Cont.)
• After the import scanner timer expires, remote PE installs the routes in the VRF table; FIB/LFIB gets updated both on the RP and LCs
• Remote PE advertises the prefixes towards CE
Remote_CELocal_CE
RR
Local_PE Remote_PE
VRFVRFBGPBGP
1st Batch VRFVRFBGPBGP
BGPBGP
1st Batch
Jan 2 10:35:03.522 PST: BGP: ... start import cfg version = 0Jan 2 10:35:03.986 PST: RT(vpn1): add 10.0.0.0/24 via 192.168.1.11, bgp metric [200/2560002816]
Compare with the Timestamp 10:35:02.000When RR Sent the Update
Remote_PE# Show ip bgp vpnv4 all 10.0.0.0
BGP routing table entry for 100:1:10.0.0.0./24, version 22
Paths: (1 available, best #1, table vpna)
Flag: 0x820
Not advertised to any peer
65501
192.168.1.11 (metric 30) from 192.168.10.12 (192.168.10.12)
Origin incomplete, metric 0, localpref 100, valid, internal, best
Extended Community: RT:100:1
Routes Once Received at Remote CE are Processed and Installed in the RT ImmediatelyJan 2 10:35:04.042 PST: RT: add 10.0.0.0/24 via 193.1.1.1, eigrp metric [170/2560005376
RIBRIB FIB
LCLC--HWHWFIBFIB
LCLCFIBFIB
X Scan Timer Expires
181818
• Advertisement_interval expires on the local PE router and as a result it announces the second batch of routes to RR
Day in the Life of a VPN Update (Cont.)
Remote_CELocal_CE
RR
Local_PE Remote_PE
VRFVRFBGPBGP
2nd Batch VRFVRFBGPBGP
XAdv-Int Expires
2nd Batch
RR Receives the Second Batch of Updates*Jan 2 10:35:07.910: BGP(2): 192.168.1.11 rcvd 100:1:10.0.100.0/24
Tic…Tic…15 Sec
Remote PE Receives the Second Batch of UpdatesJan 2 10:35:08.238 PST: BGP(2): 192.168.10.12 rcvd UPDATE w/ attr: nexthop 192.168.1.11
• Not all the updates could be processed before we suspend the process; advertisement-interval kicks in again (5s)
• But routes don’t get installed in the routing table but wait for up to 15 secs;(reason for spike in the graph discussed later)
BGPBGP
(Import Scanner)
191919
Day in the Life of a VPN Update (Cont.)
• While remote PE is waiting for import-scan to expire, a third batch of updates is received• But again these updates are subjected to the scan-interval and wait for import-scanner
(up to 15s) before they are installed in the VRF Routing Table on remote PE.• Remaining prefixes get installed in routing table and are advertised to the remote CE
RR
Local_PE Remote_PE
VRFVRFBGPBGP
3rd Batch VRFVRFBGPBGP
3rd Batch
Local_CE Remote_CE
Jan 2 10:35:13.538 PST: BGP(2): 192.168.10.12 rcvd UPDATE w/ attr: nexthop 192.168.1.11
Jan 2 10:35:20.526 PST: BGP: Import walker start version 4472514, end version 4473513Jan 2 10:35:21.762 PST: RT(vpn1): add 10.3.132.0/24 via 192.168.1.11, bgp metric [200/2560002816]
Remote CE Receives All the Prefixes
Jan 2 10:35:22.266 PST: RT: add 10.0.100.0/24 via 193.1.1.1, eigrp metric [170/2560005376
Jan 2 10:35:22.998 PST: RT: add 10.3.132.0/24 via 193.1.1.1, eigrp metric [170/2560005376]
RR Receives the Third Batch of Updates*Jan 2 10:35:13.380 : BGP(2): 192.168.1.11 rcvd 100:1:10.3.132.0/24
Tic…Tic…15 SecBGPBGPX
Adv-Int Expires
(Import Scanner)
202020
Agenda
• Introduction• Convergence Definition• Expected (Theoretical) Convergence• Test Methodology• Day in the Life of VPN Routing Update• Observed Up Convergence • Observed Down Convergence • Design Considerations• Summary
212121
PE-CE=EIGRP/OSPF/BGP: Test Case I: BGP Import-Scan/Adv-Interval=Default (Cont.)
BGP Scan = 15, Adv = Default
Tim
e in
Sec
s
Tim
e in
Sec
sTi
me
in S
ecs
EIGRP = 15, Adv = Default
OSPF PECE—Import scan = 15 Adv = Default
Prefix Number Prefix Number
Prefix Number
• Diagrams show the average, maximum, minimum, and median values for all prefixes measured across 100 iterations
• Maximum convergence for each protocol is pretty close to the expected results
• Average/median close to 30 secs for BGP and little over 20 secs for other protocols
• Minimum convergence ranges from <1 sec for the first prefix to over 10 seconds for the last prefix
• The difference is not linear but on the average, a jump of 10 Secs between the convergence of first and the last (1000th) prefix is seen
~25+x SecondsEIGRP~25+x SecondsOSPF~85+x SecondsBGP
Max Conv. Time (Default Settings)PE-CE Protocol
Ref: Theoretical
222222
What Are Those Jumps??
• Straight lines indicate that all prefixes converged almost at the same time
• Jumps indicate that some prefixes converged before others• Jumps are because router is either waiting for advertisement interval
or bgp import scanner interval or waiting for both timers to expire
BGP (Default Timers)
Prefix Number
232323
BGP Scan = 5 Adv = 0 EIGRP Scan = 5 Adv = 0
OSPF Scan = 5 Adv = 0
Tim
e in
Sec
sTi
me
in S
ecs
Prefix Number Prefix Number
Prefix NumberTi
me
in S
ecs
PE-CE=EIGRP/OSPF/BGP: Test Case IV: Import-Scan=5, Adv-Interval=0
• The last scenario offers the best convergence times• The max is pretty close to 10 secs while average is ~5+
seconds
~5+x Seconds~5+x Seconds~5+x Seconds
Max Conv. Time (Timers Tweaked Scan=5, Adv=0)
~25+x SecondsEIGRP~25+x SecondsOSPF~85+x SecondsBGP
Max Conv. Time (Default Settings)PE-CE Protocol
Ref: Theoretical
242424
Summary Observed Up Convergence
• Most of the results are within the max theoretical limits• Important observation is that cumulative convergence is not
necessarily the simple addition of timers• Especially there can multiple occurrences of T1, T4, T6, or T7 before
all the prefixes have converged• Tweaked timers improve convergence
Remote_PE
Remote_CELocal_CE
RRLocal_PE
T8
T3
T2
T5T4 T4
T1T7
T6
252525
Agenda
• Introduction• Convergence Definition• Expected (Theoretical) Convergence• Test Methodology• Day in the Life of VPN Routing Update• Observed Up Convergence • Observed Down Convergence • Design Considerations• Summary
262626
Failure Scenarios
• PE Failure• RR Failure• CE (Link/Node) Failure• Failure in the MPLS Core
~ 60 SecsMPLS Core (Link/Node Failure)
~ 60 Secs3
~ 5 Secs4
CE (node/link) Failure
~ 15Secs2RR Failure
~ 65 Secs1Primary PE-Failure
Expected Max ConvergenceFailure Scenario
1. Assuming PE, RR Did Not Send a Notification to RR or to PE
2. Assuming Dual RRs and different RD case
3. Assuming CE Did Not Send a Notification to the PE Router
4. Assuming the PE-CE Link Failure Was Immediately Detected by the PE Router
Later Slides Will Explain How We Got theMaximum Value
272727
PE Router Failure Scenario
• In this case we measure how long it takes Remote_PE to select Local_PE1 as the bestpath for prefix N2 when Local_PE2 goes down (provided PE2 was preferred path)
• When Local_PE2 goes down it takes BGP scan time (default 60s) for RR to detect that next-hop for N2 is gone down
• If Local_PE2 doesn’t crash but rather reloads then it may send a BGP notification to RRs to close the BGP session
• In this case RR would send an immediate withdraw to Remote_PE
VPN-ASITE #1
Remote_CERemote_PE
P-1
RR-1 RR-2
Local_PE_1Local CE_1- VPN-A
SITE #2N2
Local_PE_2
Local_CE2 R2
R1
Traffic Flow
282828
Core Link Failure Scenario
• RR1 is reflecting PE2 as the bestpath and RR2 is reflecting PE1 as the best path; Remote_PE chooses the path from RR2 (i.e. PE1) as the bestpath
• The BGP session between the RRs and PE-1 may not go down for 3 minutes (default holdtimer assumed)
• When next-hop inaccessibility is detected by the BGP scanner process (runs every 60 secs), remote PE would switch over to the alternate path
Core Link FailureTi
me
in S
econ
ds
Prefix Number
VPN-ASITE #1
Remote_CERemote_PE
P-1
RR-1 RR-2
Local_PE_1Local CE_1- VPN-A
SITE #2N2
Local_PE_2
Local_CE2 R2
R1
Traffic Flow
292929
Agenda
• Introduction• Convergence Definition• Expected (Theoretical) Convergence• Test Methodology• Day in the Life of VPN Routing Update• Observed Up Convergence • Observed Down Convergence• Design Considerations• Summary
303030
Design Considerations
• Down convergence could be improved by using the NHT (next-hop tracking) feature.
• Risk of instabilities caused by BGP/routing churns as result of lowering to minimum values
Careful setting of the advertisement interval both for both IBGP and EBGP sessions is needed
Keeping the advertisement interval to 1 sec both for the IBGP and EBGP could prevent the unnecessary churn and at the same time could improve the convergence significantly
313131
Design Considerations
• Fast IGP timers help improving the overall convergence in case of failure in the SP core
• Conditionally advertise only PE and P loopback addresses to reduce the number of prefix+label rewrites in the core failure event
• Use of default BGP behavior (bgp fast-fall-over) • Use interface dampening and route dampening for
customer links/sessions to prevent the churns
323232
Agenda
• Introduction• Convergence Definition• Expected (Theoretical) Convergence• Test Methodology• Day in the Life of VPN Routing Update• Observed Up Convergence • Observed Down Convergence • Design Considerations• Summary
333333
Summary
• Possible to get less than 5s convergence for small number of prefixes
• Maximum up convergence could be reduced down to ~5 secs when both the advertisement and import scanner are lowered to their min possible values
• While BGP is little slower to react, no major difference in the convergence across various PE-CE protocols once BGP timers tweaked
• For large number of prefixes convergence may happen in multiple batches
343434
Q AND AQ AND A