designing multi-tenant data centers using evpn
TRANSCRIPT
DesigningMulti-TenantDataCentersusingEVPN-IRB
NeerajMalhotra,PrincipalEngineer,[email protected]
Objectives
ArchitectureObjectives– EvolvingDCRequirements
• Operationalsimplicityviauniformcontrol,dataplaneacrossL2,L3,DC,WAN• FlexibleworkloadplacementandmobilitywithinDCandacrossDCs• EfficientbandwidthutilizationwithinDC– nofloodandlearn,ECMP• Trafficengineering- trafficsteering,ECMP,FRR• HorizontalScaling• Multi-tenancywithL2andL3VPNinDC• InterworkingwithLegacyL3VPN/L2VPNWAN
ADCnetworkfabricmust.....
Leaf-1 Leaf-2 Leaf-3
Spine Spine
Leaf-4
VM
BD-1 BD-1
VM
BD-2BD-2
VMVM
Leaf-5
VM
BD-2
.....beseamlessandactlikeasingleswitch/router
Leaf-x Leaf-x+1
VM VMVM VM
WhynotVPLS?
Why not use VPLS in DC?
Simply not designed for DC use-case
L2 Only No All-ActiveRedundancy
No per-flowECMP
Load-balancing
Flood and Learn MAC learning
IsSub-optimal
WhatistheSolution?
FabricSolutionComponents
BGP-EVPNOverlay
EVPNIRBDCFabric
VMMobilityandAny-castL3GW
OverlayDistributed
IRB
MPLSorIPUnderlay
IPorMPLSUnderlay
Underlayvs.Overlay
Underlay=Transport
PhysicalNetworkIP,MPLS/SRTransport
TrafficSteering,ECMP,FRR,.....
Overlay =VPN(L2+L3)
ControlPlane– EVPNDataPlane– MPLS,VXLAN,.....
PolicyDriven
OverlayControlPlane– BGPEVPN
BGPEVPN– EVI
VMVM VMVMVM
EVI20EVI10
EVIextendedoverBGP-EVPNFabrictoalltheLeafsbelongingtotheEVI
Leafsthatdon’tbelongtoaspecificEVIwillnothaveMAC-VRFforthatEVI,providingefficientscalability
EVI: AnEVPNinstanceextendsLayer2betweentheLeafs
Leaf
Spine
BGPEVPN– HostConnectivityOptions,ESI
• EthernetSegmentIdentifier(ESI)‘0’
• NoDFelection
SingleHomeDevice(SHD)Multi-home(MHD)All-Active(Per-
Flow)LB
VM VM
ESI-0 ESI-0 ESI-1 ESI-1
• IdenticalESIonLeafs
• PerVLANDFelection
VMSinglehomedhostMulti-homingwithLinkBundling
Leaf
Spine
BGPEVPN– MACandIPLearning• MAC/IPaddressesareadvertisedalongwithL2andL3VPNencap (MPLSlabelorVNID)torestofLeafsviaMAC+IPRT-2
• IPPrefixroutesareadvertisedviaBGPEVPNviaRT-5
Leaf
Spine
DataPlane,ARP,NDlearningfromthehosts
VMVM VMVM
RR RR
EVPNRouteType2carriesMACandIPreachabilitywithL2+L3VPNencapsulation,L2+L3RTs
RD
EthernetSegmentIdentifier
EthernetTagID
MACAddressLength
MACAddress
IPAddressLength
IPAddress
MPLSLabel1
MPLSLabel2
BGPEVPN– LoadBalancingviaAliasingChallenge:Howtoload-balancetraffictowardsamulti-homeddeviceacrossmultipleLeafswhenMACaddressesarelearntbyonlyasingleLeaf?
RD
EthernetSegmentIdentifier
EthernetTagID
MPLSVPNLabel
EVPNRouteType1advertisesESIreachabilityper-EVItoenableMACECMPwithoutanexplicitMACrouteadvertisement
BGPEVPN– FastConvergenceviaMass-WithdrawChallenge:HowtoinformotherLeafsofafailureaffectingmanyMACaddressesquicklywhilethecontrol-planere-converges?
RD
EthernetSegmentIdentifier
EthernetTagID=ALLFF
MPLSLabel
EVPNRouteType1also advertisesESIreachabilitygloballyforALLEVIstoenableMACindependentconvergenceonESIfailure
BGPEVPN– Multi-destinationtrafficChallenge:HowtodistributeBUMtrafficacrossanEVPNinstance?
RD
EthernetTagID
IPAddressLength
OriginatingRouter’sIPadd.
EVPNRouteType3+PMSIATTR.InclusiveMulticastroutewithaPMSIattributesignalsparticipationinanEVPN’sfloodlist
VMVM
Leaf-3Leaf-1 Leaf-4Leaf-2
Flags
TunnelType
BUMVPNLabel
TunnelID/TEPIP
BGPEVPN- DesignatedForwarder(DF)Challenge:Howtopreventduplicatecopiesoffloodedtrafficfrombeingdeliveredtoamulti-homedEthernetSegment?
RD
EthernetSegmentIdentifier
IPAddressLength
OriginatingRouter’sIPadd.
EVPNRouteType4enablesESIdiscoveryandDFelection
BGPEVPN- SplitHorizonGroupFiltering
Leaf-2
Spine
VMVM
ESI-1
Echo!
Challenge:Howtopreventfloodedtrafficfromechoingbacktoamulti-homedEthernetSegment?
BUMLabel
SHLabel
0x01
Flags
Reserved
ESIMPLSLabel
0x06
Per- ESISHGLabelEXT-COMMwithEVPNRT-1enablesSHGfilteringtocutpotentialloopsbacktosameESI
Leaf-1
VM
VMMobility– MAC+IPChallenge:HowtodetectthecorrectlocationofMACafterthemovementofhostfromoneEthernetSegmenttoanotheralsocalled“MACmove”?
19
VMVM
IP-1MAC-1
Leaf-3Leaf-1
MAC IP ESI Seq. Next-Hop
MAC-1 IP-1 0 0 Leaf-1
Hostmove
Leaf-4Leaf-2
SequencenumberandNext-Hopvaluewillbechangedafterthehostmove
0x00
Reserved
SequenceNumber
0x06
MobilityEXT-COMMwithEVPNRT-2carriesMAC+IProutesequencenumber toenableMACmobility
VMVM
IP-1MAC-1
Leaf-3Leaf-1
MAC IP ESI Seq. Next-Hop
MAC-1 IP-1 0 1 Leaf-3
Leaf-4
ESI-1
Leaf-2SequencenumberisincrementedandNext-hopischangedtoLeaf-3
VMMobility,continued
OverlayIntegratedRoutingandBridging(IRB)
Howdowedointer-subnetrouting?
OverlayRoutingArchitectures
• CentralizedRouting• DistributedRouting– AsymmetricIRB• DistributedRouting– SymmetricIRB
Leaf-1 Leaf-2 Leaf-3
Spine Spine
Leaf-4
VM
VLAN-1 VLAN-1
VM
VLAN-2 VLAN-2
VM VM
Leaf-5
VM
VLAN-2
Bridging ontheleaf
CentralizedRouting
• east<->westroutedtraffictraversestocentralizedL3gateways• Scalebottleneck:
• CentralizedhavefullARP/MACstateintheDC• CentralizedGWneedstohostallDCsubnets
IRB-1GWMAC
IRB-2GWMAC
IRB-1GWMAC
IRB-2GWMAC
CentralizedRoutingontheSpine
Bridging ontheleaf
L3
L2
DistributedRouting– AsymmetricIRB
• Egresssubnetisalwayslocal• Inter-subnetpacketsrouteddirectlytodestinationVM’sDMAC• Scalebottleneck:
• Allegresssubnetsneedstobepresentoningressleaf• IngressleafmaintainsARP/NDstateeveryegressleaf
Leaf-2 Leaf-3
Spine Spine
Leaf-4
VM
VLAN-1 VLAN-1
VM
VLAN-2 VLAN-2
VM VM
Leaf-5
VM
VLAN-2
RoutedandBridgedtoremoteVM
IRB-1GWMAC
IRB-2GWMAC
IRB-1GWMAC
IRB-2GWMAC
IPorMPLSTransport(underlayrouting)
Bridging tolocalVMMAC
IRB-2GWMAC
VLAN-2
IRB-2GWMAC
VRF
L3
L2
Leaf-2 Leaf-3
Spine Spine
Leaf-4
VM
VLAN-1 VLAN-1
VM
VLAN-2 VLAN-2
VM VM
Leaf-5
VM
VLAN-2
Routedtoremoteleaf
DistributedRouting– SymmetricIRB
IRB-1GWMAC
IRB-2GWMAC
IRB-1GWMAC
IRB-2GWMAC
IPorMPLSTransport(underlayrouting)
RoutedtolocalVM
IRB-2GWMAC
VRF
• RemoteVMIPisinstalledlikeaVPNIProuterecursivelyoverremoteleafnext-hop• Noadjacenciestoremotehostsevenifthesubnetislocal• Subnetdoesnotneedtobelocaloningressleafunlesstherearelocalhosts
L3
L2
OverlayDistributedAny-cast GW
Howdowelethostsmove?
Leaf-2 Leaf-3
Spine Spine
Leaf-4
VM
VLAN-1 VLAN-1
VM
VLAN-2 VLAN-2
VM VM
Leaf-5
VM
VLAN-2
Any-castGWIPandMACforsubnet-a
SymmetricIRB– DistributedAny-cast GW
• Any-castGWIPandAny-castGWMACconfiguredonALLleafswithlocalsubnet• Essentially,SubnetGWisdistributedacrossALLleafswithlocalsubnet
GWIP-aGWMAC-a
GWIP-bGWMAC-
b
GWIP-aGWMAC-a
GWIP-bGWMAC-
b
GWIP-bGWMAC-
b
VLAN-2
GWIP-bGWMAC-
b
VRF
VM
Any-castGWIPandMACforsubnet-a
Any-castGWIPandMACforsubnet-b
Any-castGWIPandMACforsubnet-b
Any-castGWIPandMACforsubnet-b
ControlandDataPlaneCallFlow
Puttingitalltogether.....
Leaf-2 Leaf-3
Spine Spine
Leaf-4
VM-a
VLAN-1 VLAN-1
VM
VLAN-2 VLAN-2
VM VM-b
Leaf-5
VM
VLAN-2
HostLearning- ARPREQUESTcontd.
1. IPpacketdestinedtoVM-btriggersARPforVM-bonLeaf-1fromany-castGWIP-bandany-castGWMAC-b2. ARPtoVM-bfloodedtoallremoteleafswhereVLAN-bisstretched(viaEVPNRT-3enabledIR)3. LeafsfloodonlocalBDports
GWIP-aGWMAC-a
GWIP-bGWMAC-
b
GWIP-aGWMAC-a
GWIP-bGWMAC-
b
GWIP-bGWMAC-
b
VLAN-2
GWIP-bGWMAC-
b
VRF
VM
DIP:VM-b
DIP:VM-b
ARP:VM-b
ARP:VM-b ARP:VM-b ARP:VM-b
ARP:VM-b
RT-2:VM-a
Leaf-2 Leaf-3
Spine-RR Spine
Leaf-4
VM-a
VLAN-1 VLAN-1
VM
VLAN-2 VLAN-2
VM VM-b
Leaf-5
VM
VLAN-2
HostLearning– ARPREPLY,MAC+IPRT-2
GWIP-aGWMAC-a
GWIP-bGWMAC-
b
GWIP-aGWMAC-a
GWIP-bGWMAC-
b
GWIP-bGWMAC-
b
VLAN-2
GWIP-bGWMAC-
b
VRF
VM
ARP:VM-b
ARPREPLY:VM-bVM-b-MACGWMAC-b
ARP:VM-b
• ARPREPLYtoGWMAC-bconsumedonLeaf-4andinstalledinARPtable
• EVPNMAC+IPRT-2advertisedtoremoteleafsviaRR
EVPNRT-2
RD:Leaf-4:
IVM-b--MAC
VM-b-IP
L23VPN LABEL/VNI
L2 VPNLABEL/VNI
NH-Leaf-4
L3-RT, L2-RT
VM-bMACReachabilityinstalledinMAC-VRFacrossremoteleafsVM-bIPReachabilityinstalledinIP-VRFacrossremoteleafsasBGPL3VPNrouteindependentofsubnet
beinglocalornot
Leaf-2 Leaf-3
Spine Spine
Leaf-4
VM-a
VLAN-1 VLAN-1
VM
VLAN-2 VLAN-2
VM VM-b
Leaf-5
VM
VLAN-2
Routedtoremoteleaf
IPVRF-a:IP-b/32->Leaf-4,L3VPNLabel
Inter-subnettraffictoVM-b
IRB-1GWMAC
IRB-2GWMAC
IRB-1GWMAC
IRB-2GWMAC
IPorMPLSTransport(underlayrouting)
RoutedtolocalVM-b
IPVRF-a:IP-b/32->BVIARPadjacency
IRB-2GWMAC
VLAN-2
IRB-2GWMAC
VRF-a
VM
Leaf-2 Leaf-3
Spine Spine
Leaf-4
VM-a
VLAN-1 VLAN-1
VM
VLAN-2 VLAN-2
VM VM-b
Leaf-5
VM
VLAN-2
Bridgedtoremoteleafnext-hop
MAC-VRF:MAC-b->Leaf-4,L2VPNLabel
Intra-subnettraffictoVM-b
IRB-1GWMAC
IRB-2GWMAC
IRB-1GWMAC
IRB-2GWMAC
IPorMPLSTransport(underlayrouting)
IRB-2GWMAC
VLAN-2
IRB-2GWMAC
VRF
VM
BridgedtolocalVM-bMAC
MAC-VRF:MAC-b->BE1.1
Summary
• Unifiedcontrol,dataplaneacrossL2,L3,DC,WAN• FlexibleworkloadplacementandmobilityacrossL2Overlay• Optimalbandwidthutilization– nofloodandlearn,ECMPinoverlay,underlay• TrafficengineeringwithMPLSfabric- trafficsteering,ECMP,FRR• HorizontalScalingwithdistributedsymmetricIRB• Multi-tenancywithL2andL3VPN• InterworkingwithLegacyL3VPN/L2VPNWAN
ThankYou