15-744: Computer Networking
L-7 Software Forwarding
Software-Based Routers
• Motivation
• Enabling innovation in networking research
• Software data planes
• Readings:
  • OpenFlow: Enabling Innovation in Campus Networks
  • The Click Modular Router
• Optional reading
  • RouteBricks: Exploiting Parallelism To Scale Software Routers
Active Networking Recap
• Network API exposes capabilities
  • Processing, queues, storage
• Custom code/functions run on each packet
• E.g., conventional IP is best effort, dst based
  • When could this be insufficient?
Two models of active networks
• “Capsule”
  • Packet carries code!
• Programmable router
  • Operator installs modules on router
• Pros/cons?
Criticisms
• Too far removed from conventional networks
  • Upgrade/deployability?
• Capsule was considered insecure
• No killer apps (continues to be a problem)
• Performance?
Three logical stages (more hindsight)
• Active networking era
  • Case for “programmable” network devices
• “Separation” of control vs data era
  • Specifically about routing etc
• OpenFlow/Network OS era
Network Management
Traffic Engineering, Performance, Security, Compliance, Resilience
Problem: Toolbox is bad!
Why: Toolbox is implicit in routers!
Motivation: Management is complex, expensive, fragile
Need: Direct control, expressive policy, network-wide views
Solution
• Separate out the “data” and the “control”
• Open interface between control/data planes
• Logically centralized views
  • Simplifies optimization/policy management
  • Network-wide visibility
Today: OpenFlow
[Figure: a Controller above per-switch Config blocks; OpenFlow is the interface between them.]
Next Lecture: ONIX
[Figure: the same picture, highlighting the controller, e.g., ONIX, NOX, …]
OpenFlow: Motivation
• The Internet is a “success disaster”
  • Many successful applications
  • Critical for economy as a whole
  • Too huge a vested infrastructure
  • Vendors loath to change anything
• Fear in community: “ossification”
  • New ideas cannot get deployed
Driving questions
• Get our own operators comfortable with running network experiments
• Isolate experimental traffic from production traffic
• What is the functionality that enables innovation?
Rejected alternatives
• Get vendors to support
• Use PC/Linux based network elements
• Existing research prototypes for programmable elements
Their Path
• “Pragmatic compromise”
• Sacrifice generality for:
  • Performance
  • Cost
  • Vendor “buy-in”
Three Basic Features in OpenFlow
• Flow Table
• Secure Channel
• OpenFlow Protocol
[Figure: a controller talks to the switch's flow table over a secure channel using the OpenFlow protocol.]
Flow Table Actions
• Forward on specific port/interface
• Forward to controller (encapsulated)
• Drop
• Forward via the switch's legacy (normal) processing pipeline
• Future support: counters, modifiers
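The actions above can be sketched as a tiny match/action table. This is an illustrative sketch only, not the OpenFlow wire protocol: the names `FlowEntry`, `FlowTable`, and `handle`, the dict-based packet representation, and the choice to send table misses to the controller are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlowEntry:
    match: dict                     # header field -> required value; missing fields are wildcards
    action: str                     # "forward", "controller", "drop", or "normal" (legacy pipeline)
    out_port: Optional[int] = None  # only used by "forward"
    packets: int = 0                # simple per-entry counter


class FlowTable:
    def __init__(self):
        self.entries = []

    def add(self, entry):
        self.entries.append(entry)

    def lookup(self, packet):
        """Return the first entry whose match fields all equal the packet's."""
        for entry in self.entries:
            if all(packet.get(k) == v for k, v in entry.match.items()):
                entry.packets += 1
                return entry
        return None


def handle(table, packet):
    """Map a packet (a dict of header fields) to an (action, port) decision."""
    entry = table.lookup(packet)
    if entry is None:
        # Assumption for this sketch: a table miss is sent to the controller.
        return ("controller", None)
    if entry.action == "forward":
        return ("forward", entry.out_port)
    return (entry.action, None)
```

Entries are checked in insertion order here; a hardware TCAM would do the wildcard match in a single lookup.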
What is nice
• Fits well with the TCAM abstraction
• Most vendors already have this
• They can just expose this without exposing internals
Example Apps
• Ethane
• Amy’s own OSPF
• VLAN
• VoIP for Mobile
• Support for non-IP
Driving questions: Did it achieve this?
• Get operators comfortable with running experiments?
• Isolate experimental traffic from production traffic?
• What is the functionality that can enable innovation?
Click overview
• Modular architecture
  • Router = composition of modules
  • Router = data flow graph
• An element is the basic unit of processing
• Three key components of each element:
  • Ports
  • Configuration
  • Method interfaces
Simple Tee Element
Two types of “connections”
• Push
  • Source element has finished processing
  • Sends it downstream
  • E.g., FromDevice
• Pull
  • Destination is ready to process
  • Initiates packet transfer
  • E.g., ToDevice
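A rough sketch of the two connection types, written in Python rather than Click's own C++ and configuration language. The class names mirror Click element names (`FromDevice`, `Counter`, `Queue`, `ToDevice`), but the method names, the default capacity, and the drop-on-full behavior are simplifying assumptions for illustration.

```python
from collections import deque


class Queue:
    """Push input, pull output: the boundary between a push path and a pull path."""

    def __init__(self, capacity=4):
        self.buf = deque()
        self.capacity = capacity
        self.drops = 0

    def push(self, packet):
        # Upstream initiates the transfer; drop if the queue is full.
        if len(self.buf) >= self.capacity:
            self.drops += 1
        else:
            self.buf.append(packet)

    def pull(self):
        # Downstream initiates the transfer; None means "nothing available".
        return self.buf.popleft() if self.buf else None


class Counter:
    """A push element: count the packet, then hand it downstream."""

    def __init__(self, downstream):
        self.count = 0
        self.downstream = downstream

    def push(self, packet):
        self.count += 1
        self.downstream.push(packet)


class FromDevice:
    """Source of a push path: pushes each received packet downstream."""

    def __init__(self, downstream):
        self.downstream = downstream

    def receive(self, packet):
        self.downstream.push(packet)


class ToDevice:
    """Sink of a pull path: pulls a packet only when the device can transmit."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.sent = []

    def tx_ready(self):
        packet = self.upstream.pull()
        if packet is not None:
            self.sent.append(packet)
        return packet
```

Wiring these as FromDevice -> Counter -> Queue -> ToDevice gives a push path into the queue and a pull path out of it, so the queue is where packets wait when the output is slower than the input.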
“Flow” of processing
Click Config File
Click Elements
Other elements
• Packet Classification
• Scheduling
• Queueing
• Routing
• What you write…
Idea: Polling
• Under heavy load, disable the network card’s interrupts
• Use polling instead
  • Ask if there is more work once you’ve done the first batch
• The Click paper we read does pure polling
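A minimal sketch of the interrupt-then-poll idea, assuming a toy `Nic` object with a receive ring and an interrupt-enable flag. None of these names come from a real driver API, and `budget` (the batch size per poll) is a made-up parameter.

```python
from collections import deque


class Nic:
    """Toy NIC: a receive ring plus an interrupt-enable flag."""

    def __init__(self):
        self.rx_ring = deque()
        self.interrupts_enabled = True
        self.interrupts_raised = 0

    def arrive(self, packet):
        """A packet arrives; returns True if this raised an interrupt."""
        self.rx_ring.append(packet)
        if self.interrupts_enabled:
            self.interrupts_raised += 1
            return True
        return False

    def poll(self, budget):
        """Take up to `budget` packets off the ring without any interrupt."""
        batch = []
        while self.rx_ring and len(batch) < budget:
            batch.append(self.rx_ring.popleft())
        return batch


def service(nic, handle, budget=8):
    """Run on the first interrupt: mask interrupts, then poll until idle."""
    nic.interrupts_enabled = False
    processed = 0
    while True:
        batch = nic.poll(budget)
        if not batch:
            break  # no more work: fall back to interrupts
        for packet in batch:
            handle(packet)
        processed += len(batch)
    nic.interrupts_enabled = True
    return processed
```

Under heavy load the loop keeps finding work, so the per-packet interrupt cost is paid once per burst instead of once per packet. A pure-polling design would never re-enable interrupts at all.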
Takeaways
• Click is a flexible modular router
• Shows that software on x86 can get pretty good performance
• Extensible/modular
• Widely used in academia/research
  • Play with it!
Building routers
• Fast
• Programmable
  • custom statistics
  • filtering
  • packet transformation
  • …
RouteBricks slides: Katerina Argyraki, 2009
Why programmable routers
• New ISP services
  • intrusion detection, application acceleration
• Simpler network monitoring
  • measure link latency, track down traffic
• New protocols
  • IP traceback, Trajectory Sampling, …
Enable flexible, extensible networks
Today: fast or programmable
• Fast “hardware” routers
  • throughput: Tbps
  • little programmability
• Programmable “software” routers
  • processing by general-purpose CPUs
  • throughput < 10 Gbps
RouteBricks
• A router out of off-the-shelf PCs
  • familiar programming environment
  • large-volume manufacturing
• Can we build a Tbps router out of PCs?
Router = packet processing + switching
• N: number of external router ports
• R: external line rate
[Figure: a router with N external ports, each running at line rate R.]
A hardware router
• Processing at rate ~R per linecard
• Switching at rate N × R by switch fabric
[Figure: N linecards at rate R on each side of a switch fabric.]
RouteBricks
• Processing at rate ~R per server
• Switching at rate ~R per server
[Figure: N servers at rate R joined by a commodity interconnect.]
Outline
• Interconnect
• Server optimizations
• Performance
Requirements
• Internal link rates < R
• Per-server processing rate: c × R
• Per-server fanout: constant
[Figure: N servers with external rate R joined by a commodity interconnect.]
A naive solution
• N external links of capacity R
• N² internal links of capacity R
[Figure: full mesh of N servers; every internal link runs at rate R.]
Valiant load balancing (VLB)
• N external links of capacity R
• N² internal links of capacity 2R/N
• Per-server processing rate: 3R
• With uniform traffic: 2R
[Figure: each server spreads its incoming traffic at R/N to every server, which forwards it at R/N to the output server.]
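The arithmetic behind these numbers can be checked with a short sketch. The function names and the dictionary layout are invented for the example; the formulas are the ones on the slides (R/N per internal link in each of the two VLB phases, and one R each for ingress, intermediate, and egress processing).

```python
from fractions import Fraction


def naive_requirements(n, r):
    """Full mesh with direct routing: any one internal link may carry a full R."""
    return {"internal_links": n * n, "link_capacity": Fraction(r)}


def vlb_requirements(n, r):
    """Full mesh with Valiant load balancing (two-phase routing)."""
    r = Fraction(r)
    phase1 = r / n  # input server spreads its R evenly over n intermediates
    phase2 = r / n  # each intermediate sends at most R/N toward any one output
    return {
        "internal_links": n * n,
        "link_capacity": phase1 + phase2,  # 2R/N
        "per_server_rate": 3 * r,          # ingress + intermediate + egress
        "per_server_rate_uniform": 2 * r,  # uniform traffic: phase 1 can be skipped
    }


if __name__ == "__main__":
    # Hypothetical example: 8 servers with 10 Gbps external ports.
    print(naive_requirements(8, 10))
    print(vlb_requirements(8, 10))
```

For example, with N = 8 and R = 10 Gbps, each internal link needs 2.5 Gbps under VLB instead of 10 Gbps in the naive mesh, at the cost of up to 30 Gbps of processing per server.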
Per-server fanout?
• Increase server capacity
• Add intermediate nodes
  • k-degree n-stage butterfly
Our solution: combination
• Assign max external ports per server
• Full mesh, if possible
• Extra servers, otherwise
• Valiant load balancing + full mesh, or k-ary n-fly
Recap
• Per-server processing rate: 2R – 3R
Setup: NUMA architecture
• Nehalem architecture, QuickPath interconnect
• CPUs: 2 × [2.8 GHz, 4 cores, 8 MB L3 cache]
• NICs: 2 × Intel XFSR 2 × 10 Gbps
• kernel-mode Click
[Figure: two CPU sockets (cores with local memory) connected to an I/O hub and the NIC ports.]
Single-server performance
• First try: 1.3 Gbps
Problem #1: book-keeping
• Managing packet descriptors
  • moving between NIC and memory
  • updating descriptor rings
• Solution: batch packet operations
  • NIC batches multiple packet descriptors
  • CPU polls for multiple packets
Single-server performance
• First try: 1.3 Gbps
• With batching: 3 Gbps
Problem #2: queue access
• Rule #1: 1 core per port
• Rule #2: 1 core per packet
[Figure: cores, ports, and per-port queues.]
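One way to picture the two rules with multi-queue NICs: give every port one receive queue per core, and steer each flow to a fixed queue so that a single core handles a packet from start to finish. The sketch below assumes a CRC-based flow hash as the steering function; that choice, and the names `queue_for`, `Port`, and `core_loop`, are illustrative assumptions rather than the RouteBricks implementation.

```python
import zlib


def queue_for(flow, num_queues):
    """Pick a queue index from a flow tuple using a deterministic hash."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_queues


class Port:
    """Toy multi-queue port: one receive queue per core."""

    def __init__(self, num_cores):
        self.rx_queues = [[] for _ in range(num_cores)]

    def receive(self, flow, payload):
        index = queue_for(flow, len(self.rx_queues))
        self.rx_queues[index].append((flow, payload))


def core_loop(core_id, ports):
    """Core `core_id` drains only its own queue on every port (no sharing, no locks)."""
    handled = []
    for port in ports:
        queue = port.rx_queues[core_id]
        while queue:
            handled.append(queue.pop(0))
    return handled
```

Because no queue is ever touched by two cores, there is no lock contention on the queues, and each packet stays on the core that first picked it up.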
Single-server performance
• First try: 1.3 Gbps
• With batching: 3 Gbps
• With multiple queues: 9.7 Gbps
Recap
• State-of-the-art hardware
  • NUMA architecture, multi-queue NICs
• Modified NIC driver
  • batching
• Careful queue-to-core allocation
  • one core per queue, per packet
Effect of application
• Throughput heavily depends on workload.
Summary
• Vision of active networking
• Separating data plane and control plane
• Building software routers by starting with:
  • closed, commercial routers vs.
  • commodity PCs
• Pros and cons?
Next Lecture
• Software-Defined Networking
• Readings:
  • 4D: Read in full
  • Onix: Read intro
  • Ethane: Optional reading