Post on 22-Dec-2015

Page 1: Insights Into RouterVM’s Flexibility and Performance Mel Tsai mtsai@eecs.berkeley.edu

Insights Into RouterVM’s Flexibility and Performance

Mel Tsai
mtsai@eecs.berkeley.edu


Outline

  Network Appliance Convergence
  Brief Overview of RouterVM & GPFs
  GPF Flexibility
  GPF Performance
  Demo


New Requirements in the Enterprise

[Figure: enterprise network diagram. Labeled elements: ISP, Edge Router, Firewall / VPN, Server Load Balancer, IP Storage Gateway, Intrusion Detection, Content Cache, Link Compressor, several Switches, Server Blades, SAN, Client Workstations. Labeled link speeds: 40 Mbps, 200 Mbps, 1 Gbps (most internal links), 2.5 Gbps, 2.5 - 10 Gbps, and an offsite link of 1-2.5 Gbps.]


Network Appliance Convergence

Recent strong trend towards cascading multiple functions into one appliance

  NetScaler, F5, Redline, Tasman, Inkra

The hardware is coming… We are slowly reaching the point where we can do almost anything to packet flows at line rate

But how do you manage multiple devices/functions in your network?
What about configurability and ease-of-deployment?
Can end-users or administrators program the device?
What about the user interface?


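RouterVM Overview

RouterVM turns the concept of a “packet filter” into a high-level, programmable building-block for network appliance applications

Traditional Filter:

  FILTER 19 SETUP

  NAME     - example            (classification parameter)
  SIP      - any                (classification parameter)
  SMASK    - 255.255.255.255    (classification parameter)
  DIP      - 10.0.0.0           (classification parameter)
  DMASK    - 255.255.255.0      (classification parameter)
  PROTO    - tcp,udp            (classification parameter)
  SRC PORT - any                (classification parameter)
  DST PORT - 80                 (classification parameter)
  VLAN     - default            (classification parameter)
  ACTION   - drop               (action)

RouterVM Generalized Packet Filter (type L7): [figure not preserved in transcript]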
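
As a rough illustration of the traditional filter setup shown above, the classification step can be sketched in Python. The field names mirror the setup form, but the `Packet` and `TraditionalFilter` classes, the `decide` helper, and the mask semantics (the mask selects which address bits must match, and `any` matches everything) are assumptions made for this sketch, not RouterVM's implementation. The VLAN field is omitted for brevity.

```python
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass
class Packet:
    src_ip: str
    dst_ip: str
    proto: str
    src_port: int
    dst_port: int


@dataclass
class TraditionalFilter:
    """Hypothetical model of the 'FILTER 19 SETUP' form (VLAN omitted)."""
    name: str
    sip: str          # "any" or a dotted-quad address
    smask: str
    dip: str
    dmask: str
    protos: tuple     # e.g. ("tcp", "udp")
    src_port: object  # "any" or an int
    dst_port: object  # "any" or an int
    action: str       # e.g. "drop" or "allow"

    @staticmethod
    def _ip_matches(rule_ip, mask, pkt_ip):
        # Assumed semantics: compare only the bits selected by the mask.
        if rule_ip == "any":
            return True
        m = int(IPv4Address(mask))
        return (int(IPv4Address(rule_ip)) & m) == (int(IPv4Address(pkt_ip)) & m)

    @staticmethod
    def _port_matches(rule_port, pkt_port):
        return rule_port == "any" or rule_port == pkt_port

    def matches(self, pkt):
        return (self._ip_matches(self.sip, self.smask, pkt.src_ip)
                and self._ip_matches(self.dip, self.dmask, pkt.dst_ip)
                and pkt.proto in self.protos
                and self._port_matches(self.src_port, pkt.src_port)
                and self._port_matches(self.dst_port, pkt.dst_port))


def decide(flt, pkt, default="allow"):
    """Return the filter's action on a match, otherwise the default action."""
    return flt.action if flt.matches(pkt) else default


# The values from the setup form on the slide.
example = TraditionalFilter(
    name="example", sip="any", smask="255.255.255.255",
    dip="10.0.0.0", dmask="255.255.255.0",
    protos=("tcp", "udp"), src_port="any", dst_port=80, action="drop",
)
```

With these values, any TCP or UDP packet to port 80 on a 10.0.0.x address is dropped and everything else falls through to the default. A traditional filter stops here: a fixed set of header fields and one fixed action.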


RouterVM HTTP Switch Example


Trade-offs for GPF Flexibility

                                        Greater flexibility    Less flexibility
  # of classification fields            more                   fewer
  # of actions                          more                   fewer
  # of programmatic elements            more                   fewer
  # of packet tagging options           more                   fewer
  Classification depth                  deeper                 shallower
  # of control flow options             more                   fewer
  Extent and variety of per-flow state  more                   fewer

Greater flexibility: more difficult to use …and generally higher performance?

Less flexibility: easier to use …and generally lower performance?


Trade-offs for GPF Flexibility

                                        Greater flexibility    Less flexibility
  # of classification fields            more                   fewer
  # of actions                          more                   fewer
  # of programmatic elements            more                   fewer
  # of packet tagging options           more                   fewer
  Classification depth                  deeper                 shallower
  # of control flow options             more                   fewer
  Extent and variety of per-flow state  more                   fewer

Greater flexibility: more difficult to use …and higher performance?

Less flexibility: easier to use …and lower performance?

Where is the sweet spot? Depends on the application and usage scenario!


Trade-offs for GPF Flexibility

                                        Greater flexibility    Less flexibility
  # of classification fields            more                   fewer
  # of actions                          more                   fewer
  # of programmatic elements            more                   fewer
  # of packet tagging options           more                   fewer
  Classification depth                  deeper                 shallower
  # of control flow options             more                   fewer
  Extent and variety of per-flow state  more                   fewer

Greater flexibility: (somewhat) more difficult to use …and higher performance?

Less flexibility: easier to use …and lower performance?

In addition, a complexity-hiding intelligent interface and the use of smart defaults can shift the sweet spot towards greater flexibility, without decreasing ease of use.


How many GPF types are enough?

Not a simple question, since the number of applications and usage scenarios supported by a library of GPFs is not equal to the number of available GPFs

By virtue of a common set of available actions, any GPF can support the following features:

  Programmatic decision making (“if dest_ip == 127.0.0.0 then drop;”)
  Server load balancing (“loadbalance table SLB_Table;”)
  Packet field rewriting (“rewrite dest_ip 192.168.0.1;”)
  Packet duplication (“copy;”)
  QoS (“ratelimit 1 Mbps;”)
  Packet logging (“log intrusion_log.txt;”)
  Network address translation (“nat dir=forward, table=NAT_table;”)
  Server health monitoring (“if 192.168.0.5 is alive”)
  …and others

In practice, actions serve to multiply the base-level functionality of a given GPF to a much higher level than suggested by its name

“A server load-balancing, bandwidth throttling, health monitoring, and statistics-gathering ‘L7 filter’”
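
To illustrate how a shared action set multiplies what a single filter can do, here is a minimal Python sketch. The action names echo the examples on this slide, but the `Gpf` class, the dict-based packet, and the callable-based action interface are all hypothetical; this is not RouterVM's actual design or syntax.

```python
class Gpf:
    """Hypothetical filter: one classification predicate plus ordered actions."""

    def __init__(self, name, classify, actions):
        self.name = name
        self.classify = classify   # packet -> bool
        self.actions = actions     # each action: packet -> packet or None

    def process(self, pkt):
        """Return the (possibly rewritten) packet, or None if it was dropped."""
        if not self.classify(pkt):
            return pkt             # not our traffic: pass through untouched
        for action in self.actions:
            pkt = action(pkt)
            if pkt is None:        # an action dropped the packet
                return None
        return pkt


def rewrite(field, value):
    """Action factory: overwrite one packet field."""
    def action(pkt):
        return {**pkt, field: value}
    return action


def log(sink):
    """Action factory: append a copy of the packet to a list-like log."""
    def action(pkt):
        sink.append(dict(pkt))
        return pkt
    return action


def drop_if(predicate):
    """Action factory: programmatic decision that may drop the packet."""
    def action(pkt):
        return None if predicate(pkt) else pkt
    return action


intrusion_log = []
http_filter = Gpf(
    "http_filter",
    classify=lambda p: p["dest_port"] == 80,
    actions=[
        drop_if(lambda p: p["dest_ip"] == "127.0.0.0"),
        log(intrusion_log),
        rewrite("dest_ip", "192.168.0.1"),
    ],
)
```

The same three actions could be attached to any other filter type, which is the sense in which actions multiply a filter's base functionality beyond what its name suggests.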


Planned/Implemented GPF Library for RouterVM .NET

Basic Filter
  Simple L2-L4 header classifications
  Any RouterVM actions

L7 Filter
  Adds regular expressions & ADU reconstruction

NAT Filter
  Adds a few more capabilities beyond the simple NAT action that is available to all GPFs

Content Caching
  Builds on the L7 filter functionality

WAN Link Compression
  Relatively simple to specify, but requires lots of computation

IP-to-FC Gateway
  Requires its own table format & processing

XML Preprocessing
  Not very well documented, and difficulty is unknown…


GPF Flexibility by OSI Layer

…As expected, GPF flexibility at the application layers starts to depend heavily on the breadth of the GPF library and the availability of GPFs for specific applications


GPF Performance: Basic Filters

Performance of filters has been measured on RouterVM for .NET using Win32 performance counters

  Accurate to roughly 0.5 microseconds
  Measured on an Athlon XP 2000 system, Win2k

A basic filter with simple actions (no payload processing) requires roughly 3000 CPU cycles to perform its processing

  This is mostly independent of packet size
  Results in ~284 Mbps for 64-byte packets, 6.7 Gbps for 1500-byte packets (theoretically of course)

If the average packet size is ~240 bytes, a packet stream can traverse 10 basic filters and still maintain 100 Mbps

…Keep in mind, this is with no optimization (yet)!
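
The throughput figures above follow from simple arithmetic, sketched here in Python. The 3000-cycle cost comes from the slide; the clock rate of about 1.667 GHz is an assumption (the nominal clock of an Athlon XP 2000+), and the function and constant names are made up for this example. The model ignores everything except per-filter CPU cost, so the results are theoretical upper bounds.

```python
ASSUMED_CLOCK_HZ = 1_667_000_000  # assumption: nominal Athlon XP 2000+ clock


def throughput_bps(packet_bytes, cycles_per_filter=3000, n_filters=1,
                   clock_hz=ASSUMED_CLOCK_HZ):
    """Theoretical bits/sec when each packet costs a fixed number of cycles."""
    packets_per_sec = clock_hz / (cycles_per_filter * n_filters)
    return packets_per_sec * packet_bytes * 8


if __name__ == "__main__":
    for size, n in [(64, 1), (1500, 1), (240, 10)]:
        mbps = throughput_bps(size, n_filters=n) / 1e6
        print(f"{size}-byte packets through {n} filter(s): about {mbps:.1f} Mbps")
```

Because the per-packet cost is roughly constant, throughput scales linearly with packet size and inversely with the number of filters traversed, which is why ten filters at ~240-byte packets still land just above 100 Mbps.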


GPF Performance: Complex Filters

What about complex L7 filters that search packet payloads with regular expressions?

Benchmark setup… Let’s hand-craft a packet stream of 256-byte packets:

  L2-L4 Headers | “Retreat” | 25 bytes of char ‘X’ | “Retreat” | 25 bytes of char ‘X’ | “Retreat” | Padding with ‘X’

Create three different L7 filters, which search for three different patterns:

  ^Retreat
  ^Retreat.*Retreat
  ^Retreat.*Retreat.*Retreat

Although this is instructive, the setup is a little artificial

  We’re searching every bit of every packet payload, whereas a real L7 filter would stop when it identifies a flow matching the expression
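
A comparable experiment can be sketched in Python. This uses Python's `re` engine rather than .NET's, so absolute timings are not comparable to the RouterVM measurements; only the relative cost of the three patterns is of interest. The 54-byte header length (Ethernet + IPv4 + TCP without options) is an assumption, since the slide only says "L2-L4 Headers", and the function names are invented for the example.

```python
import re
import timeit

PACKET_LEN = 256
HEADER_LEN = 54  # assumption: 14 (Ethernet) + 20 (IPv4) + 20 (TCP), no options

PATTERNS = [
    rb"^Retreat",
    rb"^Retreat.*Retreat",
    rb"^Retreat.*Retreat.*Retreat",
]


def build_payload():
    """Payload laid out as on the slide, padded with 'X' to fill the packet."""
    body = b"Retreat" + b"X" * 25 + b"Retreat" + b"X" * 25 + b"Retreat"
    padding = b"X" * (PACKET_LEN - HEADER_LEN - len(body))
    return body + padding


def time_patterns(payload, repeats=10000):
    """Return average microseconds per search for each pattern."""
    results = {}
    for pat in PATTERNS:
        compiled = re.compile(pat)
        if compiled.search(payload) is None:
            raise ValueError(f"pattern {pat!r} should match the test payload")
        seconds = timeit.timeit(lambda: compiled.search(payload), number=repeats)
        results[pat.decode()] = seconds / repeats * 1e6
    return results


if __name__ == "__main__":
    for pat, usec in time_patterns(build_payload()).items():
        print(f"{pat:30s} {usec:8.3f} usec/search")
```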


GPF Performance: Complex Filters

L2-L4 Headers “Retreat” 25 bytes of char ‘X’ “Retreat” 25 bytes of char ‘X’ “Retreat” Padding with ‘X’


GPF Performance: Complex Filters

L2-L4 Headers “Retreat” 25 bytes of char ‘X’ “Retreat” 25 bytes of char ‘X’ “Retreat” Padding with ‘X’

Lesson: try to use start-of-buffer indicators ^ and avoid *’s…

Many apps can be identified with simple start-of-buffer expressions

.NET Regex also involves payload copying, which might be avoidable
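
As a small illustration of identifying applications with start-of-buffer expressions, the following Python sketch anchors every pattern at the first payload byte, so the engine never scans deeper than the signature itself. The three signatures are examples chosen for this sketch (HTTP request methods, the SSH identification string, SMTP greetings) and are not taken from RouterVM; real classifiers would need a more complete set.

```python
import re

# Illustrative start-of-buffer signatures (assumed, not from RouterVM).
SIGNATURES = {
    "http": re.compile(rb"^(GET|POST|HEAD|PUT|DELETE|OPTIONS) "),
    "ssh": re.compile(rb"^SSH-"),
    "smtp": re.compile(rb"^(HELO|EHLO) "),
}


def identify(payload):
    """Return the first protocol whose signature matches at the buffer start."""
    for name, pattern in SIGNATURES.items():
        if pattern.match(payload):
            return name
    return None
```

For example, `identify(b"GET /index.html HTTP/1.1\r\n")` returns `"http"`, while a payload that merely contains `GET` somewhere in the middle is not matched.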


Thread Optimization

The choice of thread boundaries, thread scheduling, and packet FIFO implementations has a tremendous impact on overall performance

My current choice of four threads per module/port is too many…

  Too difficult to optimally schedule the CPU, and overall performance is at least 10X slower than should be possible
  Also, threads waste a lot of time waiting for locks on the packet FIFOs, which also can be avoided by reducing the # of threads
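
One possible way to reduce the thread count is a run-to-completion design: a single worker per port dequeues a packet once and then calls every filter as an ordinary function, so the packet crosses only one locked FIFO. The Python sketch below is an assumption about how that could look, not the author's stated redesign; `run_to_completion` and `process_all` are hypothetical names, and filters are modeled as functions that return the packet or `None` to drop it.

```python
import queue
import threading


def run_to_completion(in_q, filters, out):
    """Single worker: one locked dequeue per packet, then in-line filter calls."""
    while True:
        pkt = in_q.get()
        if pkt is None:          # sentinel: shut the worker down
            break
        for f in filters:
            pkt = f(pkt)
            if pkt is None:      # a filter dropped the packet
                break
        else:
            out.append(pkt)      # packet survived every filter


def process_all(packets, filters):
    """Feed packets to one worker thread and collect the survivors in order."""
    in_q = queue.Queue()
    out = []
    worker = threading.Thread(target=run_to_completion, args=(in_q, filters, out))
    worker.start()
    for p in packets:
        in_q.put(p)
    in_q.put(None)
    worker.join()
    return out
```

Compared with a thread per pipeline stage, this trades parallelism within a port for far fewer lock acquisitions and context switches per packet.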


Performance Conclusions

RouterVM for .NET is just one possible implementation of RouterVM, and is only a demonstration of functionality, not performance

Many other performance aspects haven’t been mentioned, such as maintaining shared tables and per-flow state.

  …Left for future presentations

Porting RouterVM to higher-performance parallel hardware should drastically increase performance

  RouterVM’s 3000 cycles per packet per basic filter using .NET would be a terrible result for a network processor!

Dedicated search hardware is severely needed…

  It is trivial to come up with regular expression searches that require 200,000+ cycles per packet using .NET’s regular expression engine
  Other regular expression libraries may be faster, but a software-only approach will rarely be good enough for high-performance datacenter apps


Backup


Comments on GPF Flexibility

We can show that GPFs are flexible by examining the following GPF properties:

Classification capabilities
  Header fields only vs. headers + payloads
  Stateless classifications vs. stateful, individual packets vs. specific flows
  Simple field searches vs. complex general search expressions
  Layer support: L1 through L7

Action capabilities
  Packet handling (allow, drop, packet generation/copying)
  Packet rewriting (header field rewrites, truncation, header stripping/adding, checksum recalculations)
  Control flow (filter jump/skip via tags, messaging to downstream filters & RouterVM elements such as the routing engine)
  QoS support (e.g. rate limiting, WFQ, etc.)
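
Header field rewrites generally require the affected checksums to be recomputed. As a sketch of that step, the following Python computes the standard Internet checksum (RFC 1071) and uses it when rewriting the destination address of an IPv4 header. The function names are illustrative, the header is assumed to be exactly 20 bytes (no IP options), and nothing here is taken from RouterVM's code.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def rewrite_dest_ip(header: bytes, new_dst: bytes) -> bytes:
    """Replace the destination address of a 20-byte IPv4 header, fix checksum."""
    hdr = bytearray(header)
    hdr[16:20] = new_dst                      # destination address field
    hdr[10:12] = b"\x00\x00"                  # zero the checksum before summing
    struct.pack_into("!H", hdr, 10, internet_checksum(bytes(hdr)))
    return bytes(hdr)
```

A complete rewrite action would also have to update the TCP or UDP checksum, since those cover a pseudo-header that includes the IP addresses.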


Comments on GPF Flexibility (cont)

Maintaining shared state and GPF interaction
  Efficient state sharing mechanism through tables or message passing
  Maintaining per-flow state within a filter, and between filters
  Mass storage capability (e.g. for content caching)

Computational Power
  Simple, low-latency computations vs. complex, high-latency computations (e.g. NIDS, in-network antivirus scanning)

Specification Flexibility

Specific Application Support
  Storage, XML, Wireless, etc.
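
To make the idea of per-flow state shared through a table more concrete, here is a minimal Python sketch. It assumes flows are identified by the usual 5-tuple and that filters share a single in-memory table; the `FlowKey` and `FlowTable` names and the counters kept per flow are hypothetical and do not describe RouterVM's table format.

```python
from collections import namedtuple

# Assumed flow identity: the classic 5-tuple.
FlowKey = namedtuple("FlowKey", "src_ip dst_ip proto src_port dst_port")


class FlowTable:
    """Hypothetical shared table holding a small state record per flow."""

    def __init__(self):
        self._flows = {}

    def update(self, key, nbytes):
        """Record one packet of nbytes for the flow and return its state."""
        state = self._flows.setdefault(key, {"packets": 0, "bytes": 0})
        state["packets"] += 1
        state["bytes"] += nbytes
        return state

    def get(self, key):
        """Look up a flow's state; returns None for an unknown flow."""
        return self._flows.get(key)


# One filter writes to the table; a downstream filter could read the same entry.
table = FlowTable()
key = FlowKey("10.0.0.1", "10.0.0.2", "tcp", 40000, 80)
table.update(key, 256)
table.update(key, 256)
```

A real implementation would also need entry expiry and, with multiple threads, some form of synchronization, both of which are left out here.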