Gigabit Rate Packet Pattern-Matching Using TCAM

Reviewer: Jing Lu Gigabit Rate Packet Pattern-Matching Using TCAM Fang Yu, Randy H. Katz T. V. Lakshman UC Berkeley Bell Labs, Lucent ICNP’2004



Page 1: Gigabit Rate Packet Pattern-Matching Using TCAM

Reviewer: Jing Lu

Gigabit Rate Packet Pattern-Matching Using TCAM

Fang Yu, Randy H. Katz (UC Berkeley)
T. V. Lakshman (Bell Labs, Lucent)

ICNP’2004

Page 2: Gigabit Rate Packet Pattern-Matching Using TCAM

Motivation
• Malicious probes and worms spread
• Solutions:
  • End-host based
    • Anti-virus software, security patches
    • Ineffective and costly
  • Network based
    • Network Intrusion Detection Systems (NIDS)
    • Payload processing for thousands of complicated content patterns at line speed
    • Fast and scalable multi-pattern matching schemes are highly needed

Page 3: Gigabit Rate Packet Pattern-Matching Using TCAM

Current Pattern Matching Schemes
• Software-based solutions
  • Low speed
• FPGA-based solutions
  • Do not scale well in terms of space or overall latency for a large number of patterns
• Bloom filters
  • Able to handle thousands of patterns
  • Build a Bloom filter for each possible pattern length
  • Hard to handle hundreds of possible pattern lengths

Page 4: Gigabit Rate Packet Pattern-Matching Using TCAM

Problem Definition
• Pattern matching problem
  Given: a set of k patterns {P1, P2, …, Pk}, k >= 1, and a packet of length n
  Goal: find all the matching patterns in the packet
• Simple patterns:
  • Deterministic form: each byte is one specific value of the 256 possible values
  • Non-deterministic form:
    • Case-insensitive alphabet
    • Wildcard byte (*)
• Composite patterns:
  • Negation (!)
  • Correlated patterns

Page 5: Gigabit Rate Packet Pattern-Matching Using TCAM

TCAM

• Three logic states: ‘0’, ‘1’, ‘?’
• Given an input string, TCAM reports the lowest index match if there are multiple matches
• 4 ns lookup time
• Single-chip density ~ 2 MB
• Width of each entry is configurable
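The lookup behaviour listed on this slide can be sketched in software. This is only an illustrative model, not the hardware or the authors' code: each entry is assumed to be stored as a (value, care-mask) pair, and the names `make_entry` and `tcam_lookup` are hypothetical. Indices start at 1 to match the slides.

```python
from typing import List, Optional, Tuple

# Assumed representation: (value, care_mask). A care_mask bit of 1 means
# "compare this bit"; a bit of 0 models the ternary '?' state.
Entry = Tuple[int, int]


def make_entry(pattern: str) -> Entry:
    """Build an entry from a string made of '0', '1' and '?'."""
    value = 0
    care_mask = 0
    for ch in pattern:
        value <<= 1
        care_mask <<= 1
        if ch == "1":
            value |= 1
            care_mask |= 1
        elif ch == "0":
            care_mask |= 1
        elif ch != "?":
            raise ValueError(f"unexpected symbol {ch!r}")
    return value, care_mask


def tcam_lookup(entries: List[Entry], key: int) -> Optional[int]:
    """Return the lowest 1-based index whose entry matches key, else None."""
    for index, (value, care_mask) in enumerate(entries, start=1):
        if key & care_mask == value:
            return index
    return None


entries = [make_entry("10??"), make_entry("1???"), make_entry("0000")]
```

Here the key `0b1011` matches both the first and the second entry, and the model reports index 1, mirroring the lowest-index rule. A real TCAM compares all entries in parallel; the loop only reproduces the result, not the 4 ns timing.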

Page 6: Gigabit Rate Packet Pattern-Matching Using TCAM

Simple Pattern Matching Using TCAM

• Short patterns: length <= TCAM width w
  • Pad with ‘?’ if shorter than w
  • Organize patterns by length in descending order
  • Input packet shifts one byte at a time
• Throughput: 2 Gbps

[Figure: the input A B C D E F is scanned against a TCAM holding the entries C D E F, A B C ?, and A B ? ?; the entry A B C ? is marked as the match.]
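The short-pattern scheme can be sketched as a byte-level simulation. The sketch assumes byte-granularity wildcards written as "?", pads the tail of the packet with a NUL byte that only "?" can match, and uses hypothetical helper names (`build_tcam`, `lookup`, `scan`, `throughput_gbps`). The throughput helper reproduces the arithmetic behind the 2 Gbps figure: one byte (8 bits) consumed per 4 ns lookup.

```python
def build_tcam(patterns, w):
    """Sort patterns longest first and pad each to width w with '?'."""
    ordered = sorted(patterns, key=len, reverse=True)
    return [p.ljust(w, "?") for p in ordered]


def lookup(tcam, window):
    """Return the lowest 1-based index of a matching entry, or None."""
    for index, entry in enumerate(tcam, start=1):
        if all(e == "?" or e == c for e, c in zip(entry, window)):
            return index
    return None


def scan(packet, tcam, w):
    """Shift the window one byte at a time and collect (offset, entry) hits."""
    hits = []
    for offset in range(len(packet)):
        # Pad the last windows so that only '?' positions can match the tail.
        window = packet[offset:offset + w].ljust(w, "\0")
        index = lookup(tcam, window)
        if index is not None:
            hits.append((offset, tcam[index - 1]))
    return hits


def throughput_gbps(lookup_ns, bytes_per_lookup=1):
    """Bits scanned per nanosecond, which equals Gbps."""
    return bytes_per_lookup * 8 / lookup_ns


tcam = build_tcam(["CDEF", "AB", "ABC"], w=4)
hits = scan("ABCDEF", tcam, w=4)
```

With the longest patterns placed first, the window A B C D reports A B C ? rather than A B ? ?, because the TCAM returns the lowest matching index. Shorter patterns implied by a longer hit would have to be recorded elsewhere, for example in an SRAM table.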

Page 7: Gigabit Rate Packet Pattern-Matching Using TCAM

Simple Pattern Matching Using TCAM

• Long patterns: length > TCAM width w
  • Divide each long pattern into multiple short patterns
    • Prefix pattern: first w bytes
    • Suffix patterns: each subsequent group of w bytes. If the last suffix pattern is shorter than w bytes, pad it at the front with the preceding bytes.
  • Example: DEFGABCDL (w = 4)
    • Prefix pattern: DEFG
    • Suffix patterns: ABCD, BCDL
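The splitting rule can be written down in a few lines. This is a sketch under the rule as stated on the slide; the function name `split_long_pattern` is hypothetical.

```python
def split_long_pattern(pattern, w):
    """Split a pattern longer than w into a prefix and w-byte suffix patterns."""
    if len(pattern) <= w:
        raise ValueError("pattern fits in one TCAM entry; no split needed")
    prefix = pattern[:w]
    suffixes = []
    start = w
    while start < len(pattern):
        end = start + w
        if end <= len(pattern):
            suffixes.append(pattern[start:end])
        else:
            # Last piece is short: pad it at the front with the preceding
            # bytes, i.e. take the final w bytes of the whole pattern.
            suffixes.append(pattern[-w:])
        start = end
    return prefix, suffixes
```

For the example above, `split_long_pattern("DEFGABCDL", 4)` yields the prefix DEFG and the suffixes ABCD and BCDL.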

Page 8: Gigabit Rate Packet Pattern-Matching Using TCAM

Patterns in TCAM

Pattern Index   Pattern Contents   Prefix patterns   Suffix patterns
1               ABCDABCD           ABCD              ABCD
2               DEFGABCDL          DEFG              ABCD, BCDL
3               DEFGDEF            DEFG              GDEF
4               DEF

TCAM Index   TCAM Entry
1            A B C D
2            D E F G
3            B C D L
4            G D E F
5            D E F ?

Page 9: Gigabit Rate Packet Pattern-Matching Using TCAM

Data Structures in SRAM

• Combined Pattern Table

Pattern Index in TCAM   Simple Pattern Index   Prefix Index   Suffix Index
1                       -1                     1              1
2                       4                      2              -1
3                       -1                     -1             2
4                       -1                     -1             3
5                       4                      -1             -1

TCAM Index   TCAM Entry
1            A B C D
2            D E F G
3            B C D L
4            G D E F
5            D E F ?

Pattern Index   Pattern Contents   Prefix patterns              Suffix patterns
1               ABCDABCD           ABCD (1)                     ABCD (1)
2               DEFGABCDL          DEFG (2), DEFGABCD (3)       ABCD (1), BCDL (2)
3               DEFGDEF            DEFG (2)                     GDEF (3)
4               DEF

Page 10: Gigabit Rate Packet Pattern-Matching Using TCAM

Data Structures in SRAM

Pattern Index   Pattern Contents   Prefix patterns              Suffix patterns
1               ABCDABCD           ABCD (1)                     ABCD (1)
2               DEFGABCDL          DEFG (2), DEFGABCD (3)       ABCD (1), BCDL (2)
3               DEFGDEF            DEFG (2)                     GDEF (3)
4               DEF

• Matching Table

Prefix Index   Suffix Index   Distance   Matched Long Pattern Index
1              1              4          1
2              1              4          3*
2              3              3          3
3              1              4          1
3              2              1          2

• Partial Hit List (PHL)
  • Generated during matching process

Page 11: Gigabit Rate Packet Pattern-Matching Using TCAM

Algorithm for Long Pattern Matching

[Figure: animated walkthrough of the input D E F G A B C D L scanned against the five TCAM entries, using the Matching Table and the Combined Pattern Table. The frames show the windows D E F G, A B C D, and B C D L. The Partial Hit List (PHL) holds (Position 1, Prefix Index 2) after the first window and (Position 5, Prefix Index 3) after the second.]
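The walkthrough on this slide can be replayed with a small simulation. It is a sketch under several assumptions that the slides do not state explicitly: the starred result 3* is read as "this hit forms the longer prefix with index 3 (DEFGABCD)" rather than a finished pattern, a newly formed longer prefix replaces the plain prefix at the same position (as the PHL frames suggest), and PHL records are dropped once they are older than the largest distance in the matching table. All names are hypothetical; positions are 1-based as on the slides.

```python
# TCAM entries in priority order (index 1 first), width w = 4.
TCAM = ["ABCD", "DEFG", "BCDL", "GDEF", "DEF?"]

# Combined pattern table:
# TCAM index -> (simple pattern index, prefix index, suffix index); -1 = none.
PATTERN_TABLE = {
    1: (-1, 1, 1),
    2: (4, 2, -1),
    3: (-1, -1, 2),
    4: (-1, -1, 3),
    5: (4, -1, -1),
}

# Matching table: (prefix index, suffix index, distance) -> (result, starred).
# starred=True is the assumed meaning of "3*": result is a longer prefix.
MATCHING_TABLE = {
    (1, 1, 4): (1, False),
    (2, 1, 4): (3, True),
    (2, 3, 3): (3, False),
    (3, 1, 4): (1, False),
    (3, 2, 1): (2, False),
}


def tcam_lookup(window):
    """Lowest 1-based index of an entry matching the window, or None."""
    for index, entry in enumerate(TCAM, start=1):
        if all(e == "?" or e == c for e, c in zip(entry, window)):
            return index
    return None


def scan(packet, w=4):
    """Return a list of (position, kind, pattern index) matches."""
    phl = []      # partial hit list: (position, prefix index)
    matches = []
    max_dist = max(dist for _, _, dist in MATCHING_TABLE)
    for pos in range(1, len(packet) + 1):
        window = packet[pos - 1:pos - 1 + w].ljust(w, "\0")
        hit = tcam_lookup(window)
        if hit is None:
            continue
        simple, prefix, suffix = PATTERN_TABLE[hit]
        if simple != -1:
            matches.append((pos, "simple", simple))
        new_prefix = prefix
        if suffix != -1:
            for p_pos, p_idx in phl:
                row = MATCHING_TABLE.get((p_idx, suffix, pos - p_pos))
                if row is None:
                    continue
                result, starred = row
                if starred:
                    new_prefix = result   # longer prefix formed at this position
                else:
                    matches.append((pos, "long", result))
        # Drop records that can no longer satisfy any distance in the table.
        phl = [(p, i) for p, i in phl if pos - p < max_dist]
        if new_prefix != -1:
            phl.append((pos, new_prefix))
    return matches
```

For the input DEFGABCDL this reports the simple pattern 4 (DEF) at position 1 and the long pattern 2 (DEFGABCDL) when the window B C D L is reached at position 6. The PHL passes through (1, 2) and then (5, 3), as in the frames.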

Page 12: Gigabit Rate Packet Pattern-Matching Using TCAM

Composite Pattern Matching
• Correlated Patterns
  • Partial hit records for sub-patterns are kept in the PHL because the distance between two sub-patterns can be larger than w
  • Example: content: “user”; content: “root”; within 20
    • prefix: user; suffix: root; distance: 4-20
    • 17 entries in matching table
• Patterns with negations
  • Usually part of a correlated pattern
• Patterns with wildcards
  • The distance between an upper-case character and its corresponding lower-case character is 32.
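Two of the points on this slide lend themselves to a short sketch. The first helper expands a correlated pattern into one matching-table row per allowed distance, which gives the 17 rows for distances 4 to 20. The second shows why the offset of 32 matters: 32 is a single bit (0x20) in ASCII, so one ‘?’ bit can cover both cases of a letter. The index values and function names are hypothetical, and the (value, care-mask) representation is an assumption of this sketch.

```python
def correlated_entries(prefix_idx, suffix_idx, min_dist, max_dist, result):
    """One matching-table row per allowed distance between the sub-patterns."""
    return {
        (prefix_idx, suffix_idx, dist): result
        for dist in range(min_dist, max_dist + 1)
    }


CASE_BIT = 0x20  # ord('a') - ord('A') == 32


def case_insensitive_entry(letter):
    """(value, care_mask) for one byte matching both cases of an ASCII letter."""
    if not (letter.isascii() and letter.isalpha()):
        raise ValueError("only ASCII letters differ by exactly the 0x20 bit")
    care_mask = 0xFF & ~CASE_BIT      # the case bit becomes '?'
    return ord(letter) & care_mask, care_mask


def byte_matches(entry, byte):
    value, care_mask = entry
    return byte & care_mask == value


# Hypothetical indices: prefix 1 = "user", suffix 2 = "root", result 7.
rows = correlated_entries(1, 2, 4, 20, 7)
entry_u = case_insensitive_entry("u")
```

The mask trick only holds for letters; other byte pairs that differ by 0x20, such as "@" and "`", would be conflated, which is why the helper rejects non-letters.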

Page 13: Gigabit Rate Packet Pattern-Matching Using TCAM

Analysis

• What is the impact of TCAM width on the scheme?

[Equations: TCAM Size, Matching Table Size, TCAM Hit Rate, and PHL Size, each given as an expression in m_i and w]

* k patterns, m_i bytes each, TCAM width w, and random input stream
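As a rough illustration of how the width w enters these quantities, the sketch below computes two simple bounds. They are derived here from the splitting rule and a uniform-random-input assumption; they are not claimed to be the expressions on the slide. A pattern of m bytes needs at most ceil(m / w) entries of w bytes, and a fully specified w-byte entry matches a random window with probability 2^(-8w). Function names are hypothetical.

```python
import math


def tcam_bytes_upper_bound(lengths, w):
    """Upper bound on TCAM bytes: ceil(m / w) entries of w bytes per pattern.

    Shared prefix or suffix entries are not deduplicated, so the real
    usage can be lower.
    """
    return sum(math.ceil(m / w) * w for m in lengths)


def random_hit_bound(num_entries, w):
    """Union bound on the chance that a random w-byte window hits any entry.

    Assumes entries without wildcards; entries containing '?' match more often.
    """
    return min(1.0, num_entries / 2 ** (8 * w))
```

For the four example patterns (8, 9, 7 and 3 bytes) with w = 4 the bound is 32 bytes, whereas the example TCAM holds only 5 entries (20 bytes) because ABCD and DEFG are shared.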

Page 14: Gigabit Rate Packet Pattern-Matching Using TCAM

Analysis
• What is the impact of memory lookups on system scan rate?
  • Two kinds of memory lookups can be pipelined
  • With small TCAM hit rate and PHL size, overall scan time is dominated by TCAM lookup time

[Figure: timeline of TCAM lookups at positions 1 to 10, each marked hit or miss, with the memory lookup timeline below it showing periods of performing memory lookups and idle periods.]
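The pipelining argument can be explored with a toy timing model. It makes assumptions that are not from the slides: the TCAM never stalls, the SRAM serves lookups one at a time in arrival order, and the lookups triggered by a hit become ready when that TCAM lookup finishes. The default times and the function names are hypothetical.

```python
def scan_time_ns(sram_lookups, tcam_ns=4.0, sram_ns=4.0):
    """Total time to scan a packet.

    sram_lookups[i] is the number of SRAM lookups triggered by the TCAM
    lookup at position i + 1 (0 for a miss).
    """
    sram_free_at = 0.0
    for position, count in enumerate(sram_lookups, start=1):
        tcam_done = position * tcam_ns
        if count:
            # SRAM work starts once the hit is known and the SRAM is free.
            start = max(tcam_done, sram_free_at)
            sram_free_at = start + count * sram_ns
    return max(len(sram_lookups) * tcam_ns, sram_free_at)


def scan_ratio(sram_lookups, tcam_ns=4.0, sram_ns=4.0):
    """Total scan time divided by total TCAM lookup time."""
    total_tcam = len(sram_lookups) * tcam_ns
    return scan_time_ns(sram_lookups, tcam_ns, sram_ns) / total_tcam
```

With sparse hits, such as `[1, 0, 0, 0, 1, 0, 0, 0, 0, 0]`, the SRAM work hides behind later TCAM lookups and the ratio stays at 1. With three lookups per position at every position, the SRAM backlog dominates the scan time.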

Page 15: Gigabit Rate Packet Pattern-Matching Using TCAM

Malicious Attacks?
• Correlated patterns can cause problems
  • Distance between sub-patterns can be larger than w: larger PHL size, backlogged memory lookups, lower scan rate
  • Sub-patterns can be short: higher hit rate, larger PHL size, lower scan rate
• The probability of matching two patterns 1 byte apart is very small, but packing sub-patterns consecutively to form a long packet can create a large PHL
• Limit max distance between sub-patterns

Page 16: Gigabit Rate Packet Pattern-Matching Using TCAM

Simulation Results
• Rule sets:
  • ClamAV (v0.15) virus signature database
    • 1768 simple patterns
    • Average pattern length = 55 bytes
    • Pattern length: 6 ~ 2189 bytes
  • SNORT (v2.1.2)
    • 1039 simple patterns, 527 correlated patterns
    • Mostly 10 ~ 100 bytes, some 1 ~ 4 bytes long
• Packet traces:
  • Real: MIT trace (1M), Berkeley trace (6M)
  • Synthetic: randomly insert patterns in packet payload

Page 17: Gigabit Rate Packet Pattern-Matching Using TCAM

ClamAV Pattern Set

• w = 128 bytes
• TCAM = 240 KB
• SRAM < 10 MB

[Figure: TCAM space (KB) and matching table size (MB), both on logarithmic axes, plotted against TCAM width from 4 to 1024 bytes. Legend: TCAM Spaces Consumed; Memory Space for Mapping Table.]

Page 18: Gigabit Rate Packet Pattern-Matching Using TCAM

ClamAV Pattern Set

PHL size for ClamAV pattern set with real traces

• Avg PHL: Mean of average PHL size over all packets
• AvgMax PHL: Mean of maximum PHL size over all packets
• Max: Maximum PHL size in all packets
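The three statistics can be computed as follows. The sketch assumes that the PHL size is sampled at each scan position of every packet and that every packet has at least one sample; the function name and the sample data are made up for illustration.

```python
from statistics import mean


def phl_stats(per_packet_sizes):
    """Return (Avg PHL, AvgMax PHL, Max) from per-packet PHL size samples."""
    avg = mean(mean(sizes) for sizes in per_packet_sizes)
    avg_max = mean(max(sizes) for sizes in per_packet_sizes)
    overall_max = max(max(sizes) for sizes in per_packet_sizes)
    return avg, avg_max, overall_max


# Made-up samples for three packets.
avg, avg_max, overall_max = phl_stats([[0, 1, 2], [0, 0], [4]])
```

Avg PHL averages within each packet first and then across packets, so short and long packets carry equal weight.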

Page 19: Gigabit Rate Packet Pattern-Matching Using TCAM

ClamAV Pattern Set

PHL size for ClamAV pattern set with synthetic traces

[Figure: two plots against TCAM width (16 to 1024 bytes). Left: Average PHL Size (0 to 0.3). Right: AvgMax PHL Size (0 to 5). Each plot has curves for 1, 10, and 100 patterns per packet.]

• SRAM lookup can keep up with the TCAM lookup
• Scan rate = 2 Gbps

Page 20: Gigabit Rate Packet Pattern-Matching Using TCAM

SNORT Pattern Set

PHL size for SNORT pattern set with real traces

Window Size   MIT Dump                      Berkeley Dump
              Avg      AvgMax   Max         Avg      AvgMax   Max
20            0.5523   2.7683   8           0.4702   1.5765   12
40            0.9881   3.5376   14          0.6500   1.8661   18
60            1.3151   3.9960   14          0.7313   1.9652   23
80            1.5491   4.2158   16          0.7587   2.0373   24
100           1.6867   4.3485   18          0.7661   2.0740   25
120           1.7725   4.4475   18          0.7669   2.0768   25
140           1.8308   4.5722   19          0.7669   2.0768   25
160           1.8800   4.6643   19          0.7669   2.0768   25
180           1.9244   4.7386   19          0.7669   2.0768   25
200           1.9662   4.8079   20          0.7669   2.0768   25

• w = 128, TCAM size = 295 KB

Page 21: Gigabit Rate Packet Pattern-Matching Using TCAM

SNORT Pattern Set
• Scan Ratio = Total scan time / Total TCAM lookup time
• Memory Ratio = SRAM access time / TCAM access time

[Figure: Effects of memory ratio on scan ratio. Scan Ratio (1 to 1.9) plotted against % of Packets (0.6 to 1), with one curve per Memory Ratio (0.2, 0.4, 0.6, 0.8, 1).]

• Scan rate > 1 Gbps
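The link between scan ratio and scan rate is simple arithmetic, sketched here under the assumption that the TCAM alone sustains one byte per 4 ns lookup (2 Gbps) and that the effective rate is this peak divided by the scan ratio. The function name is hypothetical, and 1.9 is taken only as the largest tick on the plot's scan-ratio axis, not as a measured value.

```python
def effective_rate_gbps(scan_ratio, lookup_ns=4.0, bytes_per_lookup=1):
    """Peak TCAM rate (bits per ns = Gbps) divided by the scan ratio."""
    peak_gbps = bytes_per_lookup * 8 / lookup_ns
    return peak_gbps / scan_ratio


worst_case = effective_rate_gbps(1.9)   # about 1.05 Gbps
```

Under these assumptions any scan ratio below 2 keeps the effective rate above 1 Gbps, which is consistent with the bullet on this slide.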

Page 22: Gigabit Rate Packet Pattern-Matching Using TCAM

Conclusion
• A simple multi-pattern matching algorithm using TCAM
• Supports thousands of patterns with variable lengths
• Supports long patterns, correlated patterns, and patterns with negation and wildcards
• Achieves multi-gigabit rate on ClamAV and SNORT pattern sets