nfv sdn summit march 2014 d1 07 kireeti_kompella native mpls fabric
TRANSCRIPT
Copyright 2014 Juniper Networks
A NATIVE MPLS FABRIC
MPLS World Congress 2014
CTO, R&D
Kireeti Kompella
Copyright 2014 Juniper Networks
PROBLEM STATEMENT (DC)
Overlays are all the rage in the data center … § except that we’ve been doing overlays/underlays with MPLS pretty
much since 1997
The DC overlays start at the host (server) … § which requires true “plug-and-play” operation
To have an MPLS underlay network, the host must be part of the underlay … § this talk is about making that easy and plug-and-play
Copyright 2014 Juniper Networks
PROBLEM STATEMENT (ACCESS)
Many have asked that MPLS start at the access node (DSLAM, OLT, cell-site gateway) § “Seamless MPLS” has suggested the use of LDP “Downstream on
Demand” (DoD) for this § However, there haven’t been many implementations of LDP DoD
from access node vendors
Maybe the time has come for a different approach/protocol for this functionality, one that is easier to implement
Copyright 2014 Juniper Networks
OVERLAYS AND UNDERLAYS: NEW?
~64K end
points core: no endpoint
state
VCs Single VP
ATM overlay (two level)
O(10^4) VPNs,
O(10^7) addresses
PE P
core: no VPN or VPN address
state
LSP MPLS overlay (multi-level)
edge: lots of state
edge: lots of state
Copyright 2014 Juniper Networks
OVERLAY/UNDERLAY DATA PLANE
Commodity chips implemented the MPLS data plane about a decade ago Now, some have implemented just one of a largish crop of new overlay encapsulations; another is close to shipping § Each has its own way of identifying tenants, doing load balancing,
and of course, its own encap/decap implementations § And, as we speak, there is yet another proposal for an encap
Copyright 2014 Juniper Networks
MPLS has a very sophisticated, robust, scalable and interoperable control plane § Various types of hierarchy are supported § {BGP, T-LDP} [overlay] over {LDP, RSVP-TE} [underlay] § The overlay control plane is for service state § The underlay control plane is for traffic management
The new overlay encapsulations don’t have well-specified, interoperable control planes for either the overlay or the underlay § There is a recent proposal to extend BGP for an overlay (EVPN/
IPVPN) over VXLAN and perhaps NVGRE
OVERLAY/UNDERLAY CONTROL PLANES
Copyright 2014 Juniper Networks
WHY IP UNDERLAYS?
The theory is, IP is simple, plug-and-play, well understood § and all forwarding chips can do IP § also, network interface cards on servers can do TCP offload
Furthermore, IP is ubiquitous § Outside the DC, it is all IP § Upstream from the access node, it is again all IP
Copyright 2014 Juniper Networks
WHY IP UNDERLAYS?
The theory is, IP is simple, plug-and-play, well understood § and all forwarding chips can do IP § also, network interface cards on servers can do TCP offload
Furthermore, IP is ubiquitous § Outside the DC, it is all IP § Upstream from the access node, it is again all IP
The reality is, IP encapsulations are pretty heavyweight § VXLAN and NVGRE forwarding is relatively new § forwarding hasn’t got all the features yet § these encapsulations add significant complexity to forwarding • the above are true whether forwarding is in software or in hardware
Furthermore, while IP is ubiquitous, MPLS is very common too, especially in metro, core and inter-DC networks § To support WAN multi-tenancy, it is very common to use MPLS § Alternately, one can use IPsec, but that is a very different use case,
and requires very different forwarding support
^ not
…
Copyright 2014 Juniper Networks
The theory is, MPLS is complex, hard to provision, hard to understand, hard to troubleshoot § and few people seem to remember if forwarding chips can do MPLS § there is a concern about TCP offload if MPLS is used
Furthermore, MPLS features are not needed for DCs (?) § After all, who would need sophisticated load balancing in a DC? § Who would need traffic engineering? § Fast reroute?
WHY NOT MPLS UNDERLAYS?
Copyright 2014 Juniper Networks
The theory is, MPLS is complex, hard to provision, hard to understand, hard to troubleshoot § and few people seem to remember if forwarding chips can do MPLS § there is a concern about performance if MPLS is used
Furthermore, MPLS features are not needed for DCs (?) § After all, who would need sophisticated load balancing in a DC? § Who would need traffic engineering? § Fast reroute?
Furthermore, MPLS features are indeed needed in DCs § Among them, sophisticated load balancing with entropy labels § Traffic engineering to take care of “mice” and “elephant” flows § and fast reroute – DC applications are real-time and critical
WHY NOT MPLS UNDERLAYS?
The reality is, all networking is complex today. We need to make network provisioning and troubleshooting easier § while focusing on the features we want or even need § Excellent performance is possible with MPLS!
Copyright 2014 Juniper Networks
WHAT TO DO ABOUT OVERLAYS?
Having a proliferation of overlay technologies means significant work for forwarding planes (both software and hardware), which in turn means increased cost § most of all, a dilution of effort that serves no one § a repetition of features and a slow “catch-up” game § a host of interoperability issues – yet more features to implement
Focusing solely on encapsulations loses the plot – we have to look at the whole picture § provisioning, control plane, data plane, troubleshooting
Do it once, do it right!
Copyright 2014 Juniper Networks
CAN THE MPLS CONTROL PLANE (IN SOME CASES) BE TOO SOPHISTICATED?
Can’t have a flat IGP with so many hosts LDP DoD with static routing is a possibility, but not ideal Absolutely has to be plug-and-play – new hosts are added at a high rate
1000s of nodes
VMs
VMs
hosts
ToRs
100000s
VMs
10^7s
Copyright 2014 Juniper Networks
PROXY ARP RECAP
IP1 IP2
1) Hey, give me a hardware address (of type Ethernet) that I can use to reach IP2
3) You can use MAC1
MAC1 H2 T1 H1 T2 …
2) T1 can reach IP2
FIB To reach IP2, use MAC1
Copyright 2014 Juniper Networks
LABELED ARP (1)
IP1 IP2 … H2
T1 has a label (L2) to reach IP2
T1 H1 T2 Label L2 to reach
IP2
LDP Label L3 to reach
IP2
LDP
T2 leaks its hosts’ routes (here IP2) into LDP (with label L3)
LFIB L3: pop & send to H2
Copyright 2014 Juniper Networks
LABELED ARP (2)
IP1 IP2
1) Hey, give me a hardware address (of type MPLSoEthernet)
that I can use to reach IP2
3) You can use MAC1:L1
MAC1
LFIB L1 à L2
H2 T1 H1 T2 L2 L3 L1
Functionality is very much like LDP DoD.
However, ARP code is plug-and-play and
ubiquitous
…
2) T1 can reach IP2 via MPLS, so it allocates L1 for H1 to reach IP2
Note: new h/w type means that this code
can coexist with “normal” ARP code
LFIB L3: pop & send to H2
Copyright 2014 Juniper Networks
POINTS TO NOTE
ARP is ubiquitous – present on 10s of billions of devices L-ARP can coexist peacefully with existing Ethernet ARP § A host can choose to do E-ARP or L-ARP by setting the hardware
type in the ARP request
There is an IETF draft describing L-ARP in detail § The goal is to make this an open standard for anyone to implement
We have prototype L-ARP code for Linux servers § The goal is to make this open source for anyone to use
No, sorry, we don’t have an iOS/Android app (yet!) § But just ask :)
Copyright 2014 Juniper Networks
USE CASE 1: EGRESS PEERING TRAFFIC ENGINEERING
Content Server
ISP1
Data Center
Peering link to ISP1 Intra DC
Network
ISP2
Internet
Direct traffic to prefix1 to ISP1
Direct traffic to prefix2 to ISP2
IP1
IP2
RIB: Prefix1: nh IP1 Prefix2: nh IP2
Peering link to ISP2
Use MPLS underlay for traffic steering
CS L-ARPs for IP1/IP2 to get required labels
Bonus: DC switches carry “a few” MPLS LSPs rather than full
Internet routing
Copyright 2014 Juniper Networks
USE CASE 2: MPLS UNDERLAY FOR DCs (WITH VRFs/E-VPNs FOR OVERLAY)
IP1 IP2 … H2 T1 H1 T2
RR
VM2
BGP: to reach VM2, use VL2 with
IP2 as the nexthop
VM1
1) VM1 wants to talk
to VM2 (same VPN).
LDP LDP L-ARP L-ARP
2) BGP says to reach VM2, go to IP2 3) H1 resolves IP2 using L-ARP
4) Then, packets from VM1 to VM2 are encapsulated with outer label from L-ARP and inner label from BGP
Copyright 2014 Juniper Networks
CONCLUSION
A proliferation of encapsulations hurts the industry § Encapsulations need a number of features in the data plane –
encap/decap, load balancing, offload, hierarchy, FRR § Overlays need a control plane for services and tunnels
MPLS is a powerful data plane and has a great control plane – it is robust, scalable, and very widely used § It just needs to be zero-config, plug-and-play, rack-and-stack, which
is the proposal here See the demo in the booth! § We have prototype L-ARP client code for Linux servers and L-ARP
server code for Junos