Architectural Impact of Stateful Networking Applications
Javier Verdú, Jorge García, Mario Nemirovsky, Mateo Valero
The 1st Symposium on Architectures for Networking and Communications Systems (ANCS - I)
Princeton, New Jersey, USA, October 26-28, 2005



Page 2: Trends of Internet

Significant growth of Internet traffic
• Consequent increase in traffic aggregation
  – Low packet/flow temporal locality
End-point routers and appliances execute stateful apps
• Upper-layer packet processing
  – Larger workloads per packet
Facing new security issues
• Attack methods keep improving
  – Need to spread the knowledge further than a single packet

Page 3: Stateful Application Model

[Figure: granularity levels at which an application may keep state: Holding, Company, Department, User, Application, Flow, Packet. A "State Lifetime" axis runs from - to + across the levels.]

Page 4: Research Limitations on Stateful Apps

Pool of benchmark suites for network processors
• CommBench
• NetBench
• NpBench
• NPForum
Lack of stateful benchmarks
• Most of them are stateless benchmarks
Creating new benchmarks
• Reliability?
  – State size
  – State management

Page 5: Talk Outline

• Introduction
• Network Traffic Properties
• Description of Environment
• Architectural Impact Analysis
• Summary


Page 8: Network Traffic Properties

Traffic Aggregation Level
• Unique flow rate in a given window
Intra-Flow Temporal Distribution
• How are the packets exchanged?
Inter-Flow Temporal Distribution
• Packet rate between packets of the same flow
[Figures: each property is illustrated by a pair of contrasting traffic patterns ("vs")]
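
As a rough illustration of the traffic aggregation level, the sketch below computes the share of distinct flows among the packets of each fixed-size window. The `Packet` tuple, the direction-independent `flow_key`, and the use of a packet-count window are assumptions made for this example; the talk does not state how the metric was actually computed.

```python
from collections import namedtuple

# Hypothetical packet record: only the 5-tuple fields are needed here.
Packet = namedtuple("Packet", "src dst sport dport proto")


def flow_key(pkt):
    """Return a key that is the same for both directions of a connection."""
    a = (pkt.src, pkt.sport)
    b = (pkt.dst, pkt.dport)
    return (min(a, b), max(a, b), pkt.proto)


def unique_flow_rate(packets, window):
    """For each full window of `window` packets, return distinct flows / window."""
    rates = []
    for start in range(0, len(packets) - window + 1, window):
        chunk = packets[start:start + window]
        flows = {flow_key(p) for p in chunk}
        rates.append(len(flows) / window)
    return rates


if __name__ == "__main__":
    one_flow = [Packet("10.0.0.1", "10.0.0.2", 1000, 80, "tcp")] * 4
    many_flows = [Packet("10.0.0.1", "10.0.0.2", 1001 + i, 80, "tcp") for i in range(4)]
    print(unique_flow_rate(one_flow + many_flows, window=4))
```

A window dominated by a single connection gives a value close to `1 / window`, while a highly aggregated trace approaches 1.0.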

Page 9: Benchmark Selection (I)

Snort is tuned with four different configurations
• Stream4
  – Prevents Stick/Snot attacks
• Flow-Portscan
  – Detects portscanning attacks
• SfPortscan
  – Detects a variety of portscanning attacks
• Merged Engines
  – The combination of the above engines
Argus is a monitoring/billing benchmark
• Currently it is not included in any benchmark suite
• Open source application
  – http://www.qosient.com
• Equivalent to the commercial tool Cisco NetFlow

Page 10: Benchmark Selection (& II)

Stateless applications, by definition, keep no flow state
The state size may vary widely between applications
State management may also differ considerably
[Figure: bar chart of flow state in bytes (y-axis 0 to 3000) per benchmark: Merged Engines, Stream4, Flow-Portscan, SfPortscan, Argus, Any Stateless App]
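
To make the stateless/stateful distinction concrete, here is a minimal sketch contrasting a check that looks at one packet in isolation with a tracker that must remember something across packets. It is a toy example, not how Snort's Flow-Portscan or SfPortscan engines work: the dictionary-based packets, the `stateless_check` rule, the `PortscanTracker` class and its threshold are all hypothetical.

```python
from collections import defaultdict


def stateless_check(pkt):
    """Decide from the packet alone; nothing is remembered between packets."""
    return pkt["dport"] == 23  # hypothetical rule: flag telnet traffic


class PortscanTracker:
    """Toy stateful check: remember which ports each source has touched."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.ports_by_source = defaultdict(set)  # state kept across packets

    def observe(self, pkt):
        """Record the packet and report whether its source looks like a scanner."""
        ports = self.ports_by_source[pkt["src"]]
        ports.add(pkt["dport"])
        return len(ports) >= self.threshold


if __name__ == "__main__":
    tracker = PortscanTracker(threshold=3)
    for port in (80, 81, 82):
        print(tracker.observe({"src": "10.0.0.9", "dport": port}))
```

The state held in `ports_by_source` grows with the number of active sources, which is the kind of per-flow memory footprint the chart on this slide compares across benchmarks.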

Page 11: Evaluation Methodology

Instrumented binary code: ATOM
Trace-driven simulation: modified version of the SMTSim simulator
Simulation length
• Warm-up period
  – 10K packets
• Processing period
  – 50K packets
• Packet selection for the flow lifetime studies
Towards analysis of actual application behavior
• The baseline is an ample configuration
  – ROB size: 256 entries (no significant improvements with larger ROBs)
  – Physical registers: 192 int, 192 FP (no stress due to lack of registers)
  – Perceptron branch predictor (the most powerful configuration)
  – 64KB I$, 64KB DL1$, 2MB L2$ (no significant improvements with larger caches)
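
The warm-up/processing split can be sketched as follows. Only the two packet counts come from the slide; the `split_trace` helper and the idea of representing the trace as a Python iterable are assumptions for illustration and say nothing about how the modified SMTSim actually consumes its input.

```python
from itertools import islice

WARMUP_PACKETS = 10_000     # from the slide: warm-up period
MEASURED_PACKETS = 50_000   # from the slide: processing period


def split_trace(packets, warmup=WARMUP_PACKETS, measured=MEASURED_PACKETS):
    """Return (warm-up packets, measured packets) taken in order from the trace."""
    it = iter(packets)
    warm = list(islice(it, warmup))    # used only to warm caches and predictors
    meas = list(islice(it, measured))  # statistics would be collected on these
    return warm, meas


if __name__ == "__main__":
    warm, meas = split_trace(range(100_000))
    print(len(warm), len(meas), meas[0])
```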

Page 12: Architectural Impact Analysis

• Computational Complexity
• Available Parallelism
• Impact of Bottlenecks
• Branch Prediction
• Data Cache Behavior

Page 13: Computational Complexity (I)

There are no significant differences among benchmarks
• Roughly 35% to 45% of the instructions are memory accesses
• Argus is more memory intensive
[Figure: stacked bar chart of instructions per packet (y-axis 0 to 8000) per benchmark (Merged Engines, Stream4, Flow-Portscan, SfPortscan, Argus), broken down into Integer Computation, Uncond. Branch, Cond. Branch, Load, Store]

Page 14: Computational Complexity (& II)

The instruction mix is similar across all the packets
• Some applications generate the heaviest workload in the first packets
• Other applications show an almost constant workload
[Figure: instructions (y-axis 0 to 12000) versus the n-th packet of the flow lifetime (1, 2, 3, …, n-3, n-2, n-1, n), grouped into Connecting, Data Transferring and Closing phases, for Merged Engines, Stream4, Flow-Portscan, SfPortscan and Argus]

Page 15: Available Parallelism

Processor configuration modified to avoid any constraint
The ILP is independent of the app category
• It is inherent to the application itself
The evaluated apps show low ILP: ~3.7 IPC
[Figure: bar chart of IPC (y-axis 0 to 10) per benchmark: Merged Engines, Stream4, Flow-Portscan, SfPortscan, Argus, NpBench (Control Plane), NpBench (Data Plane); off-scale values are annotated (~4200) and (~45)]

Page 16: Impact of Bottlenecks

Stateful apps show much lower performance
• Roughly 0.6 IPC on average
The importance of the packet processing
• Constant vs concentrated workload
Memory impact
• 3x to 19x speedup
[Figure: bar chart of IPC (y-axis 0 to 4) per benchmark (Merged Engines, Stream4, Flow-Portscan, SfPortscan, Argus) for four configurations: Baseline, Perfect Branch, Perfect Mem, Perfect Mem & Perfect Branch]

Page 17: Branch Prediction (I)

High branch prediction accuracy on average
But we have two branch categories
• Flow independent: similar among packets -> easy to predict
• Flow dependent: flow related -> sensitive to traffic properties
[Figure: bar chart of branch prediction hit rate (y-axis 90% to 100%) per benchmark: Merged Engines, Stream4, Flow-Portscan, SfPortscan, Argus]

Page 18: Branch Prediction (& II)

A single active connection
• Higher accuracy and no variations among n-th packets
High traffic aggregation level
• Lower accuracy and variations among n-th packets
Negative aliasing due to flow-dependent branches
• Most of our applications hide this effect due to concentrated workload
[Figure: two panels, "No traffic aggregation level" and "High traffic aggregation level", each plotting branch prediction hit rate (y-axis 86% to 100%) versus the n-th packet of the flow lifetime (1, 2, 3, …, n-3, n-2, n-1, n; phases Connecting, Data Transferring, Closing) for Merged Engines, Stream4, Flow-Portscan, SfPortscan and Argus]
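
The effect of interleaving on a flow-dependent branch can be shown with a deliberately simple model. The sketch below uses a single 2-bit saturating counter rather than the perceptron predictor of the evaluated baseline, and the branch outcome sequences are invented for illustration: one sequence stands for a single active connection, the other for two flows whose packets alternate and drive the same branch in opposite directions.

```python
def predictor_accuracy(outcomes, counter=1):
    """Hit rate of one 2-bit saturating counter over a list of taken/not-taken outcomes.

    Counter values 0-1 predict not taken, 2-3 predict taken.
    """
    hits = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        hits += predicted_taken == taken
        # Move the counter towards the actual outcome, saturating at 0 and 3.
        counter = min(3, counter + 1) if taken else max(0, counter - 1)
    return hits / len(outcomes)


if __name__ == "__main__":
    single_flow = [True] * 8              # one connection: the branch always goes the same way
    two_flows_mixed = [True, False] * 4   # packets of two flows alternate
    print(predictor_accuracy(single_flow))
    print(predictor_accuracy(two_flows_mixed))
```

With one flow the counter mispredicts only while it trains; with alternating flows the shared counter is pulled back and forth and never settles, which is the negative aliasing the slide refers to.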

Page 19: Data Cache Behavior (I)

Stateful apps need only a small DL1$ to reach a steady miss rate
• Taking advantage of flow-independent memory references
Almost 100% of DL2$ accesses are misses
• It is unable to keep the state of the active flows
Larger flow states emphasize the impact of network properties
• Reaching a higher steady state even with low traffic aggregation
• The intra-flow distribution may be more helpful
[Figure: miss rate (y-axis 0% to 25%) versus DL1$ size (4, 8, 16, 32, 64, 128, 256, 512, 1024 KB) for Merged Engines, Stream4, Flow-Portscan, SfPortscan and Argus]
[Figure: miss rate (y-axis 0% to 10%) versus L2$ size (1024, 2048, 4096, 8192 KB) for Merged Engines, Stream4, Flow-Portscan, SfPortscan and Argus]
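
Why a cache can miss on almost every flow-state access is easy to reproduce in a toy model. The sketch below treats the cache as a fully associative LRU store holding one entry per flow, which is a strong simplification of a real L2; the `flow_state_miss_rate` helper and the round-robin access patterns are assumptions for illustration only.

```python
from collections import OrderedDict


def flow_state_miss_rate(accesses, capacity):
    """Miss rate of an LRU store with `capacity` flow-state entries."""
    cache = OrderedDict()
    misses = 0
    for flow in accesses:
        if flow in cache:
            cache.move_to_end(flow)        # hit: mark as most recently used
        else:
            misses += 1
            cache[flow] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used flow
    return misses / len(accesses)


if __name__ == "__main__":
    fits = [f for _ in range(10) for f in range(4)]       # 4 active flows
    too_many = [f for _ in range(10) for f in range(5)]   # 5 active flows
    print(flow_state_miss_rate(fits, capacity=4))
    print(flow_state_miss_rate(too_many, capacity=4))
```

As soon as the set of active flows exceeds what the store can hold, each flow's state is evicted before its next packet arrives and every access misses.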

Page 20: Data Cache Behavior (& II)

Negative memory effects are concentrated in the first packets
• Constant-workload applications show a similar miss rate for every packet
Extra misses due to data structure maintenance
• Merged Engines: from 1.5% to 5% on average
[Figure: total data L2 miss rate (y-axis 0.0% to 3.0%) versus the n-th packet of the flow lifetime (1, 2, 3, …, n-3, n-2, n-1, n; phases Connecting, Data Transferring, Closing) for Merged Engines, Stream4, Flow-Portscan, SfPortscan and Argus]

Page 21: Summary (I)

We present the architectural impact of stateful networking applications
• An important new type of application
The behavior along the packets of a TCP connection
• Constant workload for the packets of a connection
• Workload concentrated in the first packets of a connection
Analysis of network traffic properties
• Branch prediction and data cache are sensitive to them

Page 22: Summary (& II)

Reduced IPC on average
• L2$ is unable to maintain the required states of active flows
• Branch prediction may also be improved once the memory bottleneck is solved
Other stateful applications may present different valuable results, but…
• The critical bottlenecks may be even more stressed
Our concern is…
• To have more sample applications to evaluate
• To analyse the apps in a more realistic environment
  – Running a number of applications simultaneously

Page 23: Questions...

Page 24: Traffic Traces

Filtered traffic trace
• Bidirectional TCP connections
Generating synthetic traffic traces
• Mixing different traffic traces
  – Based on microsecond-timestamp sorting
• We assume a set of traces from links with the same bandwidth
  – In our case: MRA link
• Avoiding the aliasing of IP addresses among aggregated traces
  – The traces are originally sanitized
The resulting traffic trace shows roughly 1Gbps and 170K active flows
• Obtained from the original OC12 MRA link (622Mbps)
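
The trace-mixing step can be sketched as a timestamp-ordered merge. This is only one plausible reading of the slide: the `(timestamp, src, dst)` packet tuples, the `remap` and `merge_traces` helpers, and the choice to avoid address aliasing by tagging each address with its trace index are assumptions, not the authors' tooling. Each input trace is assumed to be already sorted by timestamp.

```python
import heapq


def remap(trace_id, packets):
    """Tag addresses with the trace index so equal sanitized IPs do not collide."""
    for ts, src, dst in packets:
        yield ts, (trace_id, src), (trace_id, dst)


def merge_traces(traces):
    """Merge timestamp-sorted traces into one trace ordered by timestamp."""
    tagged = [remap(i, trace) for i, trace in enumerate(traces)]
    return list(heapq.merge(*tagged, key=lambda pkt: pkt[0]))


if __name__ == "__main__":
    trace_a = [(1, "10.0.0.1", "10.0.0.2"), (4, "10.0.0.2", "10.0.0.1")]
    trace_b = [(2, "10.0.0.1", "10.0.0.2"), (3, "10.0.0.2", "10.0.0.1")]
    for pkt in merge_traces([trace_a, trace_b]):
        print(pkt)
```

Because both sample traces reuse the same sanitized addresses, the tagging keeps their connections distinct in the merged output, so the aggregated trace contains more active flows than either input.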