Page 1: Gigabit Rate Packet Pattern-Matching Using TCAM

Gigabit Rate Packet Pattern-Matching Using TCAM

Fang Yu and Randy H. Katz, UC Berkeley

T. V. Lakshman, Bell Laboratories, Lucent Technologies

Page 2: Gigabit Rate Packet Pattern-Matching Using TCAM

Motivation

Numerous malicious probes and worms
End-host-based solutions are not sufficient
  It is hard for all end users to apply patches quickly
  Worms can contaminate millions of hosts within hours
Network-based solution: network intrusion detection systems (NIDS)
  Perform packet scanning for complicated worm patterns in the network
  Stop worms from reaching end hosts
  Easy to manage for network administrators

Page 3: Gigabit Rate Packet Pattern-Matching Using TCAM

Pattern Matching for NIDS

Thousands of complicated patterns
  Patterns have variable lengths
  Patterns with correlation: “abc” followed by “cde” within 3 bytes
  Patterns with negation: “user” not followed by “|0a|” within 50 bytes
Require packet payload scanning
  Not supported by most current network devices, which support packet header processing only

Page 4: Gigabit Rate Packet Pattern-Matching Using TCAM

Current Pattern Matching Schemes

Software-based solutions
  Speed is slow
FPGA solutions
  Build a large DFA or NFA for all patterns
  Build a KMP-based search engine for each pattern
Bloom filters
  One Bloom filter for each pattern length
  Not scalable when pattern lengths vary dramatically

Page 5: Gigabit Rate Packet Pattern-Matching Using TCAM

Ternary-CAM (TCAM)

Fully associative memory
  Compares the input string with all the entries in parallel
  If multiple entries match, reports the index of the first match
Each cell takes one of three logic states: ‘0’, ‘1’, and ‘?’ (don’t care)
Current TCAM technology
  Fast match time: 4 ns
  Size: 1-2 MB
  Width configurable, e.g. 1024 entries * 1024 bytes wide, or 2048 entries * 512 bytes wide
[Figure: a TCAM drawn as an array of cells, one entry per row, with the row length labeled as the width. The input 192.128.101.100 is compared against the entries 168.100.???.???, 192.128.???.???, and 192.128.101.???; the match is marked at 192.128.101.???.]
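The lookup behavior described on this slide can be sketched in software. This is only an illustrative model, not how the hardware works internally: it compares sequentially rather than in parallel, treats each character as one ternary cell, and the entry order in the usage note is an assumption (most specific entry first). The names `ternary_match` and `tcam_lookup` are hypothetical.

```python
def ternary_match(entry, key):
    """True if every cell of `entry` equals the key cell or is the don't-care '?'."""
    return len(entry) == len(key) and all(
        cell == "?" or cell == k for cell, k in zip(entry, key)
    )


def tcam_lookup(entries, key):
    """Return the index of the first matching entry, or None on a miss.

    A real TCAM compares all entries in parallel; this loop only models
    the 'report the first match' priority rule.
    """
    for index, entry in enumerate(entries):
        if ternary_match(entry, key):
            return index
    return None


# Assumed entry order: most specific prefix first.
ENTRIES = ["192.128.101.???", "192.128.???.???", "168.100.???.???"]
```

With this ordering, `tcam_lookup(ENTRIES, "192.128.101.100")` returns index 0, the `192.128.101.???` entry, even though `192.128.???.???` also matches.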

Page 6: Gigabit Rate Packet Pattern-Matching Using TCAM

Pattern Matching with TCAM

Put all the patterns into the TCAM
  Assume every pattern is no longer than the TCAM width
  If a pattern is shorter than the TCAM width, pad it with ‘?’
  Order the patterns by length, longest first
  When the matching entry is ABC, report a match of both pattern ABC and pattern AB
Shift the input one byte at a time
[Figure: input A B C D E F scanned against the TCAM entries C D E F, A B C ?, and A B ? ?. The first window matches entry A B C ?; the input is then shifted by one byte and looked up again.]
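The scan loop on this slide can be sketched as follows. It is a simplified model under stated assumptions: patterns are plain strings that never contain a literal `?`, the input is padded with NUL bytes so the last windows are full width, and `build_tcam` and `scan` are hypothetical helper names.

```python
def build_tcam(patterns, width):
    """Pad each pattern with '?' to the TCAM width, longest pattern first."""
    ordered = sorted(patterns, key=len, reverse=True)
    return [p.ljust(width, "?") for p in ordered]


def scan(entries, data, width):
    """Slide a `width`-byte window over `data` one byte at a time.

    Returns (position, entry) for the first matching entry at each position,
    mirroring the TCAM's first-match rule. Shorter patterns that are prefixes
    of the reported entry are implied by the hit.
    """
    padded = data + "\0" * (width - 1)  # assumption: NUL never appears in a pattern
    hits = []
    for pos in range(len(data)):
        window = padded[pos:pos + width]
        for entry in entries:
            if all(c == "?" or c == x for c, x in zip(entry, window)):
                hits.append((pos, entry))
                break
    return hits
```

For the slide's example, `scan(build_tcam(["AB", "CDEF", "ABC"], 4), "ABCDEF", 4)` reports entry `ABC?` at position 0 (which also implies pattern AB) and entry `CDEF` at position 2.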

Page 7: Gigabit Rate Packet Pattern-Matching Using TCAM

Analysis

Scan speed: 4 ns per TCAM lookup, shifting one byte at a time
  8 bits / 4 ns = 2 Gbps worst-case scan rate
Limitation: requires every pattern to be no longer than the TCAM width
  Set the TCAM width >= the longest pattern’s length
  Pad all shorter patterns to the TCAM width
  Wastes TCAM resources
Can we set a smaller TCAM width and cut long patterns into smaller patterns?
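The worst-case scan rate quoted above follows directly from the lookup time, since each lookup advances the input by one byte:

```latex
\[
\text{scan rate}
  = \frac{8\ \text{bits}}{4\ \text{ns}}
  = \frac{8\ \text{bits}}{4 \times 10^{-9}\ \text{s}}
  = 2 \times 10^{9}\ \text{bits/s}
  = 2\ \text{Gbps}
\]
```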

Page 8: Gigabit Rate Packet Pattern-Matching Using TCAM

Long Patterns

Cut long patterns into smaller patterns
  TCAM width w = 4 bytes
  DEFGABCDL is split into DEFG, ABCD, and L
  Problem: short partial patterns such as L cause many TCAM hits
Pad the last partial pattern with the tail of the second-to-last partial pattern
  DEFGABCDL is split into DEFG, ABCD, and BCDL
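The splitting rule can be sketched in a few lines. This is a minimal illustration, assuming patterns are plain strings; `split_pattern` is a hypothetical name. Taking the last `w` bytes of the pattern as the final piece is equivalent to padding the short remainder with the tail of the preceding piece.

```python
def split_pattern(pattern, w):
    """Split `pattern` into w-byte partial patterns.

    If the length is not a multiple of w, the final piece is the last w bytes
    of the pattern, so it overlaps the tail of the previous piece instead of
    being a short fragment.
    """
    if len(pattern) <= w:
        return [pattern]  # fits in one TCAM entry (padded with '?' elsewhere)
    whole = len(pattern) - len(pattern) % w
    parts = [pattern[i:i + w] for i in range(0, whole, w)]
    if len(pattern) % w:
        parts.append(pattern[-w:])
    return parts
```

For example, `split_pattern("DEFGABCDL", 4)` gives `DEFG`, `ABCD`, `BCDL`, matching the slide.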

Page 9: Gigabit Rate Packet Pattern-Matching Using TCAM

Concatenate Partial Patterns into Long Patterns

Patterns: ABCDABCD, DEFGABCDL, DEFGDEF, DEF
Input: D E F G A B C D L
TCAM entries (width 4 bytes): D E F G, A B C D, B C D L, G D E F, D E F ?

Matching Table

Prefix Index    Suffix Index   Distance   Matched Long Pattern Index
1 (ABCD)        1 (ABCD)       4          ABCDABCD
2 (DEFG)        1 (ABCD)       4          3 (DEFGABCD)
2 (DEFG)        3 (GDEF)       3          DEFGDEF
3 (DEFGABCD)    1 (ABCD)       4          ABCDABCD
3 (DEFGABCD)    2 (BCDL)       1          DEFGABCDL

Partial Hit List (PHL) as the input is scanned

Window     Position   Matched entry
D E F G    1          2 (DEFG)
A B C D    5          3 (DEFGABCD)
B C D L    6          combines with the PHL entry at distance 1 to report DEFGABCDL
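The interaction between TCAM hits, the partial hit list, and the matching table can be sketched as below. This is a simplified model, not the authors' implementation: prefixes are identified by their strings instead of numeric indexes, every partial pattern is compared against the window instead of taking only the first TCAM match, positions are 0-based, and `split_with_offsets`, `build_tables`, and `scan` are hypothetical names.

```python
def split_with_offsets(pattern, w):
    """(offset, chunk) pairs for a pattern longer than w; the last chunk is tail-padded."""
    whole = len(pattern) - len(pattern) % w
    chunks = [(i, pattern[i:i + w]) for i in range(0, whole, w)]
    if len(pattern) % w:
        chunks.append((len(pattern) - w, pattern[-w:]))
    return chunks


def build_tables(patterns, w):
    """Matching table: (prefix, suffix chunk, distance) -> longer prefix."""
    short = [p for p in patterns if len(p) <= w]
    long_patterns = {p for p in patterns if len(p) > w}
    first_chunks, table = set(), {}
    for p in long_patterns:
        chunks = split_with_offsets(p, w)
        prev_off, prefix = chunks[0]
        first_chunks.add(prefix)
        for off, chunk in chunks[1:]:
            combined = p[:off + w]
            table[(prefix, chunk, off - prev_off)] = combined
            prev_off, prefix = off, combined
    return short, long_patterns, first_chunks, table


def scan(data, patterns, w):
    """Return (start position, pattern) for every pattern found in `data`."""
    short, long_patterns, first_chunks, table = build_tables(patterns, w)
    phl, matches = [], []  # PHL items: (position of last chunk, prefix so far)
    for pos in range(len(data)):
        window = data[pos:pos + w]
        matches += [(pos, s) for s in short if window.startswith(s)]
        grown = []
        for hit_pos, prefix in phl:
            # One matching-table lookup per PHL item.
            combined = table.get((prefix, window, pos - hit_pos))
            if combined is not None:
                grown.append((pos, combined))
                if combined in long_patterns:
                    matches.append((pos + w - len(combined), combined))
        if window in first_chunks:
            grown.append((pos, window))
        # Drop items too old to combine with any later chunk.
        phl = [(hp, pre) for hp, pre in phl if pos - hp < w] + grown
    return matches
```

Scanning the slide's input `DEFGABCDL` with the four patterns and `w = 4` reports DEF and DEFGABCDL, both starting at position 0. Because this sketch keeps ABCD in the PHL alongside DEFGABCD, it does not need the table row that derives ABCDABCD from prefix DEFGABCD.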

Page 10: Gigabit Rate Packet Pattern-Matching Using TCAM

Correlated Patterns

One pattern after another
  E.g. “ABCD” followed by “DEF” within 10 bytes
  The matching result of “ABCD” has to stay in the PHL for 10 positions
[Figure: input A B C D A D E F G scanned against the TCAM entries A B C D, A A B ?, D E F G, D E F ?, and A ? ? ?. After A B C D matches, pattern D E F is found later in the input.]
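A correlated rule can be checked by remembering recent matches of the first pattern, much as the PHL does. The sketch below uses plain string comparison in place of TCAM lookups, and it assumes the window is measured from the end of the first pattern to the start of the second; the slides do not say which convention is used. `find_correlated` is a hypothetical name.

```python
def find_correlated(data, first, second, window):
    """Find `first` followed by `second` within `window` bytes.

    Returns (start of first, start of second) pairs. The gap is measured from
    the end of `first` to the start of `second` (an assumption).
    """
    results = []
    pending = []  # end positions of `first` matches still inside the window
    for pos in range(len(data)):
        pending = [end for end in pending if pos - end <= window]
        if data.startswith(second, pos):
            results.extend(
                (end - len(first), pos) for end in pending if pos >= end
            )
        if data.startswith(first, pos):
            pending.append(pos + len(first))
    return results
```

On the slide's input, `find_correlated("ABCDADEFG", "ABCD", "DEF", 10)` finds ABCD at position 0 followed by DEF at position 5. The `pending` list is the state that has to be kept for the length of the window.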

Page 11: Gigabit Rate Packet Pattern-Matching Using TCAM

Matching Process

TCAM reports a miss
  No extra memory lookup
TCAM reports a hit
  If it is a partial pattern, then for every item in the PHL: one memory lookup into the matching table to see whether it generates a valid pattern
Associate hit rate: $\sum_i (m_i/w - 1) / (2^8)^w$
PHL size: $w \cdot \sum_i (m_i/w - 1) / (2^8)^w$
Examples based on statistical analysis
  n = 2000, m_i = 200 bytes, w = 4 bytes: associate hit rate is 2.2e-5, PHL size is 8.8e-5
  w = 8 bytes: associate hit rate is 2.6e-15, PHL size is 2.08e-14

Page 12: Gigabit Rate Packet Pattern-Matching Using TCAM

Malicious Attack?

Window: distance between two correlated patterns
  After matching a pattern, what is the probability of matching another pattern j positions later, within the window?
When j = 1, the probability is:
  $1 - \frac{((2^8)^{m-1})!}{((2^8)^{m-1} - n)! \cdot ((2^8)^{m-1})^n}$
  E.g., for n = 1000 and m = 4, it is 0.029
When j increases, the probability increases. If j = m, the probability = 1
Worst-case PHL size is at least: window size / m
[Figure: input A B C D A A G G. After pattern A B C D matches, pattern B C D A matches one position later (j = 1), and pattern A A G G matches m positions later (j = m).]
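The j = 1 figure can be checked numerically. The sketch below assumes the probability has the birthday-problem form, with N = (2^8)^(m-1) possible (m-1)-byte strings and n patterns, which reproduces the quoted 0.029. It evaluates N! / ((N-n)! N^n) as the product of (1 - k/N) for k = 0..n-1, computed in logarithms so that no huge factorials are needed. `prob_shifted_match` is a hypothetical name.

```python
import math


def prob_shifted_match(n, m):
    """1 - N! / ((N - n)! * N**n), with N = (2**8) ** (m - 1).

    Uses the identity N! / ((N - n)! * N**n) = prod_{k=0}^{n-1} (1 - k/N).
    """
    big_n = (2 ** 8) ** (m - 1)
    log_product = sum(math.log1p(-k / big_n) for k in range(n))
    return 1 - math.exp(log_product)
```

For n = 1000 and m = 4 this evaluates to a little over 0.029, in line with the slide.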

Page 13: Gigabit Rate Packet Pattern-Matching Using TCAM

Simulation Results on ClamAV

ClamAV virus signature database
  Version 0.15, which contains simple patterns only
  1768 patterns, varying from 6 bytes to 2189 bytes
[Figure: histogram of the number of patterns (0 to 400) versus pattern length in bytes (log scale, 1 to 10000).]

Page 14: Gigabit Rate Packet Pattern-Matching Using TCAM

Effect of TCAM Width

Total TCAM space
  Increases as w increases, because of padding
  TCAM space consumed: $w \cdot \sum_i \lceil m_i / w \rceil$
Mapping table size
  Decreases as w increases, because of fewer partial patterns
[Figure: TCAM space consumed (KB) and memory space for the mapping table (MB), both on log scales, versus TCAM width in bytes (4 to 1024).]
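The padding effect can be illustrated with a small calculation. It assumes each pattern of length m_i occupies ceil(m_i / w) entries of w bytes each; that reading of the slide's formula, and the name `tcam_space_bytes`, are assumptions.

```python
import math


def tcam_space_bytes(lengths, w):
    """Bytes of TCAM used when every pattern is stored in w-byte entries."""
    return w * sum(math.ceil(m / w) for m in lengths)
```

For pattern lengths 9, 7, 3, and 8 bytes (27 bytes of actual pattern data), a width of 4 uses 32 bytes of TCAM while a width of 8 uses 40, so the padding overhead grows with w.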

Page 15: Gigabit Rate Packet Pattern-Matching Using TCAM

PHL Size on Real Data

For each packet, record the average and maximum PHL size
  Avg: mean of the average PHL size over all packets
  AvgMax: mean of the maximum PHL sizes
  Max: maximum PHL size over all packets

TCAM Width   MIT Dump                      Berkeley Dump
(bytes)      Avg       AvgMax    Max       Avg       AvgMax    Max
4            0.042     0.27      4         0.03      0.48      4
8            4.8e-6    5.6e-4    8         1.e-6     1.9e-5    7
16           0         0         0         4.3e-7    5.8e-6    3

Page 16: Gigabit Rate Packet Pattern-Matching Using TCAM

Simulation Results on Snort

SNORT system (v2.1.2) has 1991 rules
  1039 simple patterns
  527 correlated patterns
  Up to 7 sub-patterns
Set the TCAM width to 128 bytes
  Patterns fit into a TCAM size of 295 KB

Window   MIT Dump                      Berkeley Dump
Size     Avg       AvgMax    Max       Avg       AvgMax    Max
20       0.5523    2.7683    8         0.4702    1.5765    12
40       0.9881    3.5376    14        0.6500    1.8661    18
60       1.3151    3.9960    14        0.7313    1.9652    23
80       1.5491    4.2158    16        0.7587    2.0373    24
100      1.6867    4.3485    18        0.7661    2.0740    25
120      1.7725    4.4475    18        0.7669    2.0768    25
140      1.8308    4.5722    19        0.7669    2.0768    25
160      1.8800    4.6643    19        0.7669    2.0768    25
180      1.9244    4.7386    19        0.7669    2.0768    25
200      1.9662    4.8079    20        0.7669    2.0768    25

Page 17: Gigabit Rate Packet Pattern-Matching Using TCAM

Conclusions

High-speed pattern matching is essential for building effective defenses against viruses
Multiple pattern matching with TCAM
  Achieves multi-gigabit rates
  Searches for thousands, or tens of thousands, of patterns in parallel
  Supports long patterns, correlated patterns, and also patterns with negation and wildcards
  Can be extended to support higher rates with larger TCAMs

Page 18: Gigabit Rate Packet Pattern-Matching Using TCAM

Backup Slides

Page 19: Gigabit Rate Packet Pattern-Matching Using TCAM

Long Patterns

What if a pattern is longer than the width of the TCAM?
  Split it into multiple partial patterns
  For example, TCAM width k = 4

Pattern index   Pattern content
1               ABCD ABCD
2               DEFG ABCD L
3               DEFG DEF
4               DEF

TCAM entries (4 bytes wide): D E F G, A B C D, D E F ?, L ? ? ?
Short partial patterns such as L ? ? ? cause many TCAM hits

Page 20: Gigabit Rate Packet Pattern-Matching Using TCAM

Statistical Analysis

Assume a random input string and independent patterns
  Number of patterns: n
  Pattern size: m_i bytes for pattern i
  TCAM width: w
  Total entries for partial items in the TCAM: $\sum_i (m_i/w - 1)$
  Associate hit rate is $\sum_i (m_i/w - 1) / (2^8)^w$
  Ignoring the dependency between neighboring positions, PHL size is $w \cdot \sum_i (m_i/w - 1) / (2^8)^w$
Example
  n = 2000, m_i = 200 bytes, w = 4 bytes: associate hit rate is 2.2e-5, PHL size is 8.8e-5
  w = 8 bytes: associate hit rate is 2.6e-15, PHL size is 2.08e-14
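The example figures can be reproduced from the model's definitions. The sketch below assumes the number of partial entries is the sum over patterns of (m_i / w - 1), the associate hit rate is that count divided by (2^8)^w, and the PHL size is w times the hit rate. The function names are hypothetical.

```python
def partial_entries(lengths, w):
    """Total partial-pattern entries: sum of (m_i / w - 1) over all patterns."""
    return sum(m / w - 1 for m in lengths)


def associate_hit_rate(lengths, w):
    """Chance that a random w-byte window hits one of the partial entries."""
    return partial_entries(lengths, w) / (2 ** 8) ** w


def expected_phl_size(lengths, w):
    """A hit stays in the PHL for w positions, ignoring neighbor dependencies."""
    return w * associate_hit_rate(lengths, w)
```

With 2000 patterns of 200 bytes each, `associate_hit_rate` gives about 2.28e-5 for w = 4 and about 2.6e-15 for w = 8, and `expected_phl_size` gives about 2.08e-14 for w = 8. These roughly agree with the slide's figures; the w = 4 PHL size comes out near 9.1e-5 because the slide's 8.8e-5 is four times the rounded rate 2.2e-5.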

Page 21: Gigabit Rate Packet Pattern-Matching Using TCAM

Synthesized “Worst-case” Packets

Four sets of synthesized data
  1, 10, and 100 randomly inserted virus patterns per packet
[Figure: three plots versus TCAM width (16 to 1024 bytes), each with series for 1, 10, and 100 patterns per packet: max partial hit list size (0 to 20), average PHL size (0 to 0.3), and AvgMax PHL size (0 to 5).]

Page 22: Gigabit Rate Packet Pattern-Matching Using TCAM

Memory Lookup Process

TCAM reports a miss
  No extra memory lookup
  Memory lookup process is idle
TCAM reports a hit
  One memory lookup in the combined pattern table
  Lookups in the matching table if the PHL is not empty
[Figure: timeline of TCAM lookups at input positions 1 to 10, each marked hit or miss, above a memory lookup timeline that alternates between performing memory lookups (after hits) and idle (during misses).]

Page 23: Gigabit Rate Packet Pattern-Matching Using TCAM

Effects of Memory Ratio on Scan Rate

Scan ratio
  Total scanning time (including memory lookups) vs. the time spent on TCAM lookups only
  E.g., scan ratio = 2 means total scanning rate = TCAM access rate / 2
Memory ratio
  Ratio of SRAM to TCAM access times
[Figure: scan ratio (1 to 1.9) versus % of packets (0.6 to 1), with one curve per memory ratio (0.2, 0.4, 0.6, 0.8, 1).]
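The relation between scan ratio and effective throughput is a one-line calculation. This sketch assumes the 4 ns lookup time and one-byte shift quoted earlier in the deck; `effective_scan_rate_gbps` is a hypothetical name.

```python
def effective_scan_rate_gbps(tcam_lookup_ns, scan_ratio, bits_per_lookup=8):
    """Overall scan rate once memory lookups are included.

    bits / ns is numerically equal to Gbps, so no unit conversion is needed.
    """
    tcam_rate_gbps = bits_per_lookup / tcam_lookup_ns
    return tcam_rate_gbps / scan_ratio
```

With a 4 ns lookup, a scan ratio of 1 keeps the full 2 Gbps, and a scan ratio of 2 halves it to 1 Gbps.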