![Page 1: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/1.jpg)
Efficient Signature Matching with Multiple Alphabet Compression Tables
Shijin Kong, Randy Smith, Cristian Estan
Presented at SecureComm 2008, Istanbul, Turkey
![Page 2: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/2.jpg)
Signature Matching
Signature matching is a core component of network devices
Operation (ideal): for a set of signatures, match all relevant signatures in a single pass over the payload
Many constraints
- Evolving, complex signatures
- Wirespeed operation
- Limited memory
- Active adversary
![Page 3: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/3.jpg)
Regular Expressions and DFAs
Regular expressions are the standard for writing signatures
- Buffer overflow: /^RETR\s[^\n]{100}/
- Format string attack: /^SITE\s+EXEC[^\n]*%[^\n]*%/
DFAs are used to match signatures against input
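To make the two signatures concrete, here is a minimal sketch that runs them with Python's `re` module. It only illustrates what the patterns accept: `re` is a backtracking engine, not the DFA matcher discussed in this talk, and the function name and sample payloads are made up for the example.

```python
import re

# The two signatures from the slide, written as byte-string patterns.
BUFFER_OVERFLOW = re.compile(rb"^RETR\s[^\n]{100}")
FORMAT_STRING = re.compile(rb"^SITE\s+EXEC[^\n]*%[^\n]*%")


def matching_signatures(payload: bytes) -> list:
    """Return the names of the signatures that match the payload (hypothetical helper)."""
    hits = []
    if BUFFER_OVERFLOW.search(payload):
        hits.append("buffer overflow")
    if FORMAT_STRING.search(payload):
        hits.append("format string")
    return hits


# Sample payloads invented for illustration.
print(matching_signatures(b"RETR " + b"A" * 100))  # RETR with a 100-byte argument
print(matching_signatures(b"SITE  EXEC %x %x"))    # two '%' after SITE EXEC
print(matching_signatures(b"RETR readme.txt\n"))   # short, benign RETR
```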
![Page 4: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/4.jpg)
DFA Operation
[Figure: transition tables for State 0 through State 3, eight entries each: 12 12 12 4 2 8 2 8; 25 25 25 41 41 41 5 5; 12 12 12 4 4 4 8 8; 25 25 25 6 41 5 41 5. Worked lookup: crt_state=1, input_byte=1 gives next_state=12.]
if accept(next_state)
    alert
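The per-byte loop on this slide can be sketched as follows. The three-state DFA is a hypothetical toy (it accepts once the bytes `ab` have been seen) and is not taken from the slides; only the names `crt_state`, `input_byte` and `next_state` follow the figure.

```python
def build_example_dfa():
    """Hypothetical 3-state DFA over bytes that accepts once 'ab' has been seen."""
    table = [[0] * 256 for _ in range(3)]  # one 256-entry row per state
    table[0][ord("a")] = 1                 # saw 'a'
    table[1][ord("a")] = 1                 # 'aa' still ends in 'a'
    table[1][ord("b")] = 2                 # saw 'ab'
    table[2] = [2] * 256                   # accepting state is absorbing
    return table, {2}


def match(table, accepting, payload: bytes) -> bool:
    """One table lookup per input byte; return True as soon as an accepting state is hit."""
    crt_state = 0
    for input_byte in payload:
        next_state = table[crt_state][input_byte]
        if next_state in accepting:
            return True  # this is where the slide's "alert" would fire
        crt_state = next_state
    return False


table, accepting = build_example_dfa()
print(match(table, accepting, b"xxaab"))  # True
print(match(table, accepting, b"ba"))     # False
```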
![Page 5: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/5.jpg)
Matching with DFAs
Advantages
- Fast: minimal per-byte processing
- Composable: combine many DFAs into one
Disadvantages
- States are heavyweight (1 KB each!)
- State-space explosion occurs when DFAs are combined
Memory is exhausted with only a few DFAs!
Workaround: many DFAs matched in parallel
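A rough back-of-the-envelope sketch of where the 1 KB per state comes from. It assumes one transition per possible input byte and a 4-byte next-state entry; the entry width is an assumption, as the slide only gives the total.

```python
ALPHABET_SIZE = 256   # one transition per possible input byte
BYTES_PER_ENTRY = 4   # assumption: each next-state entry is a 4-byte integer


def state_bytes() -> int:
    """Size of one uncompressed state's transition row."""
    return ALPHABET_SIZE * BYTES_PER_ENTRY


def dfa_bytes(num_states: int) -> int:
    """Size of the full transition table for a DFA with num_states states."""
    return num_states * state_bytes()


print(state_bytes())              # 1024
print(dfa_bytes(4000) / 2 ** 20)  # 3.90625 (MiB for a 4000-state DFA)
```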
![Page 6: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/6.jpg)
Key: Reduce memory usage
[Figure: the four-state transition tables from the DFA Operation slide.]
- Reduce size of transition tables
- Reduce number of states
Strategy: aggressively reduce memory footprint, keep exec time low
![Page 7: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/7.jpg)
Main Contribution
Multiple Alphabet Compression Tables
- Lightweight, applicable to hardware or software
- Compatible with other techniques
- Worst case = average case
Results (in software)
- 4x to 70x memory reduction
- 35% to 85% execution time increase
![Page 8: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/8.jpg)
Outline
- Introduction
- Alphabet Compression Tables
- Interacting with D2FAs
- Experimental Results
![Page 9: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/9.jpg)
Alphabet Compression: core observation
[Figure: the four-state transition tables from the DFA Operation slide.]
Some input symbols are equivalent; the transitions on those symbols at any state are identical.
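A minimal sketch of this observation: two symbols are equivalent when their columns of the transition table are identical. The four rows are reconstructed from the figure on the DFA Operation slide; which row belongs to which state number is an assumption, and `symbol_classes` is a hypothetical helper name.

```python
def symbol_classes(rows):
    """Group input symbols whose transitions agree in every state.

    rows[s][c] is the next state from state s on symbol c.
    Returns one list of symbols per equivalence class.
    """
    classes = {}
    for symbol, column in enumerate(zip(*rows)):
        classes.setdefault(column, []).append(symbol)
    return list(classes.values())


# Rows reconstructed from the slide's figure (state numbering assumed).
ROWS = [
    [12, 12, 12, 4, 4, 4, 8, 8],
    [12, 12, 12, 4, 2, 8, 2, 8],
    [25, 25, 25, 41, 41, 41, 5, 5],
    [25, 25, 25, 6, 41, 5, 41, 5],
]

print(symbol_classes(ROWS))  # [[0, 1, 2], [3], [4], [5], [6], [7]]
```

Symbols 0, 1 and 2 behave identically in all four states, so one column can stand in for all three.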
![Page 10: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/10.jpg)
Alphabet Compression Tables
[Figure: an alphabet compression table (entries 0 0 0 1 2 3 4 5) maps each input byte to an index, and each state's transition table shrinks from eight entries to six. Worked lookup: crt_state=1, input_byte=1 gives index=0, then next_state=12.]
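The two-step lookup can be sketched like this: build the alphabet compression table (ACT) from identical columns, keep one column per class, then index through the ACT at match time. The rows are reconstructed from the figure with an assumed state numbering, and `build_act` and `next_state` are hypothetical names, not the paper's.

```python
def build_act(rows):
    """Return (act, compressed): act maps symbol -> index, compressed[s][index] -> next state."""
    index_of = {}
    act = [index_of.setdefault(column, len(index_of)) for column in zip(*rows)]
    compressed = [[None] * len(index_of) for _ in rows]
    for symbol, index in enumerate(act):
        for state, row in enumerate(rows):
            compressed[state][index] = row[symbol]
    return act, compressed


def next_state(act, compressed, crt_state, input_byte):
    """Two lookups per byte: first the ACT, then the compressed transition row."""
    index = act[input_byte]
    return compressed[crt_state][index]


# Rows reconstructed from the slide's figure (state numbering assumed).
ROWS = [
    [12, 12, 12, 4, 4, 4, 8, 8],
    [12, 12, 12, 4, 2, 8, 2, 8],
    [25, 25, 25, 41, 41, 41, 5, 5],
    [25, 25, 25, 6, 41, 5, 41, 5],
]

act, compressed = build_act(ROWS)
print(act)                                # [0, 0, 0, 1, 2, 3, 4, 5]
print(compressed[1])                      # [12, 4, 2, 8, 2, 8]
print(next_state(act, compressed, 1, 1))  # 12
```

The extra ACT lookup is the run-time price for storing six entries per state instead of eight.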
![Page 11: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/11.jpg)
Even further compression…
[Figure: the compressed transition tables and alphabet compression table from the previous slide.]
![Page 12: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/12.jpg)
Even further compression…
[Figure: the compressed transition tables and alphabet compression table from the previous slide.]
![Page 13: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/13.jpg)
Even further compression…
[Figure: the compressed transition tables and alphabet compression table from the previous slide.]
![Page 14: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/14.jpg)
Multiple ACTs
[Figure: States 0 through 3 with two alphabet compression tables. ACT 0 (0 0 0 1 1 1 2 2) compresses two of the states to three entries each; ACT 1 (0 0 0 1 2 3 2 3) compresses the other two to four entries each.]
How do we know which ACT to use with which state?
![Page 15: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/15.jpg)
Multiple ACTs
[Figure: each transition entry now stores a (next_act, next_state) pair, for example (0, 25), (1, 41), (0, 5). Worked lookup: crt_act=1, crt_state=1, input_byte=1 gives index=0, then next_state=12 and next_act=0.]
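A sketch of the lookup with multiple ACTs, in which every transition carries the ACT to use in the next state. The table contents are reconstructed from the figure; the assignment of tables to state numbers is an assumption chosen to agree with the slide's worked lookup, and `step` is a hypothetical name.

```python
# Two alphabet compression tables, as shown in the figure.
ACTS = [
    [0, 0, 0, 1, 1, 1, 2, 2],  # ACT 0
    [0, 0, 0, 1, 2, 3, 2, 3],  # ACT 1
]

# Compressed rows of (next_act, next_state) pairs; state numbering is assumed.
TABLES = {
    0: [(0, 12), (1, 4), (0, 8)],          # indexed through ACT 0
    1: [(0, 12), (1, 4), (0, 2), (0, 8)],  # indexed through ACT 1
    2: [(0, 25), (1, 41), (0, 5)],         # indexed through ACT 0
    3: [(0, 25), (1, 6), (1, 41), (0, 5)], # indexed through ACT 1
}


def step(crt_act, crt_state, input_byte):
    """Return (next_act, next_state) for one input byte."""
    index = ACTS[crt_act][input_byte]
    return TABLES[crt_state][index]


next_act, next_state = step(crt_act=1, crt_state=1, input_byte=1)
print(next_act, next_state)  # 0 12
```

Because the current ACT travels with the current state, the matcher never needs a separate state-to-ACT lookup.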
![Page 16: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/16.jpg)
Constructing Multiple ACTs
Partition states appropriately, for example:
{S1, S2, S3, …, Sn} → { {S1, S8,}, {S2, S3, S9,}, … }
Construct a single ACT for each group of states
See algorithm in paper
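Given a partition, building one ACT per group can be sketched as below. The partition is supplied by hand here; choosing it is the job of the paper's algorithm, which this sketch does not attempt. The rows are reconstructed from the earlier figures with an assumed state numbering, and all function names are hypothetical.

```python
def build_act(rows):
    """Map each symbol to the index of its column among the given rows."""
    index_of = {}
    return [index_of.setdefault(column, len(index_of)) for column in zip(*rows)]


def compress_row(row, act):
    """Keep one entry per ACT index."""
    small = [None] * (max(act) + 1)
    for symbol, index in enumerate(act):
        small[index] = row[symbol]
    return small


def build_group_acts(rows, groups):
    """Build one ACT per group; return (acts, tables) with tables[s] = (act_id, compressed row)."""
    acts, tables = [], {}
    for act_id, group in enumerate(groups):
        act = build_act([rows[s] for s in group])
        acts.append(act)
        for s in group:
            tables[s] = (act_id, compress_row(rows[s], act))
    return acts, tables


# Rows reconstructed from the slides' figure (state numbering assumed).
ROWS = {
    0: [12, 12, 12, 4, 4, 4, 8, 8],
    1: [12, 12, 12, 4, 2, 8, 2, 8],
    2: [25, 25, 25, 41, 41, 41, 5, 5],
    3: [25, 25, 25, 6, 41, 5, 41, 5],
}

acts, tables = build_group_acts(ROWS, [[0, 2], [1, 3]])  # hand-picked partition
print(acts[0], acts[1])
print(sum(len(row) for _, row in tables.values()))  # 14 stored entries instead of 32
```

With this partition the two ACTs come out as 0 0 0 1 1 1 2 2 and 0 0 0 1 2 3 2 3, the same values as in the Multiple ACTs figure.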
![Page 17: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/17.jpg)
Partitioning States for ACTs
Input: number of ACTs to use m, DFA D
Output: a partition of states into m subsets
Use greedy, heuristic approach:
    States = set of all states in D;
    while (m > 1) {
        Subset = GetEquivClassPartition(States);
        AddToResult(Subset);
        States = States - Subset;
        m--;
    }
    AddToResult(States);  // the remaining states form the last subset
    return Result;
![Page 18: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/18.jpg)
How many Compression Tables?
[Figure: plots of average transitions per state and average execution time.]
Eight ACTs is enough
![Page 19: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/19.jpg)
Outline
- Introduction
- Alphabet Compression Tables
- Interacting with D2FAs
- Experimental Results
![Page 20: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/20.jpg)
ACTs and D2FAs
Two kinds of redundancy
- Symbols have identical behavior for large subsets of states
  Compress with (multiple) ACTs
[Figure: states S1 to S4 with edges labeled "a,b,c" (three times), "d", and "e".]
![Page 21: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/21.jpg)
ACTs and D2FAs
Two kinds of redundancy
- Symbols have identical behavior for large subsets of states
  Compress with (multiple) ACTs
- Symbols at many states lead to common next states
  Compress with D2FAs
[Figure: states S1 to S4 with edges labeled "a,b,c" (three times), "d", and "e"; a second diagram shows states S1 and S2 with edges labeled "fe", "c", and "h".]
![Page 22: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/22.jpg)
D2FAs: core observation
[Figure: the four-state transition tables from the DFA Operation slide.]
For many pairs of states, the transitions for most characters are identical!
Idea: store only one copy
Kumar et al. (SIGCOMM 2006); Kumar et al. (ANCS 2006); Becchi et al. (ANCS 2007)
![Page 23: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/23.jpg)
D2FAs: core observation
[Figure: the four-state transition tables from the DFA Operation slide.]
For many pairs of states, the transitions for most characters are identical!
Idea: store only one copy
Kumar et al. (SIGCOMM 2006); Kumar et al. (ANCS 2006); Becchi et al. (ANCS 2007)
![Page 24: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/24.jpg)
D2FAs
[Figure: two states keep their full eight-entry tables (12 12 12 4 4 4 8 8 and 25 25 25 41 41 41 5 5); the other two keep only the entries that differ (2 8 2 and 6 5 41) plus a default transition. Worked lookup: crt_state=1, input_byte=1 gives next_state=12.]
Issue: good compression, potentially heavy run-time cost
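A minimal sketch of a lookup with default transitions, which also shows where the run-time cost comes from: a byte the current state does not store costs one extra hop per default transition followed. The data is reconstructed from the figure with an assumed state numbering, and the dictionary layout and names are illustrative, not the representation used by Kumar et al.

```python
# Each state stores only its labeled transitions plus an optional default state.
D2FA = {
    0: {"labeled": dict(enumerate([12, 12, 12, 4, 4, 4, 8, 8])), "default": None},
    1: {"labeled": {4: 2, 5: 8, 6: 2}, "default": 0},
    2: {"labeled": dict(enumerate([25, 25, 25, 41, 41, 41, 5, 5])), "default": None},
    3: {"labeled": {3: 6, 5: 5, 6: 41}, "default": 2},
}


def next_state(crt_state, input_byte):
    """Return (next_state, hops), following default transitions until the byte is stored."""
    state, hops = crt_state, 0
    while input_byte not in D2FA[state]["labeled"]:
        state = D2FA[state]["default"]  # states without a default store every byte
        hops += 1
    return D2FA[state]["labeled"][input_byte], hops


print(next_state(1, 1))  # (12, 1): not stored in state 1, found via its default
print(next_state(1, 4))  # (2, 0): stored directly
stored = sum(len(s["labeled"]) for s in D2FA.values())
print(stored)            # 22 stored transitions instead of 32
```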
![Page 25: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/25.jpg)
ACTs and D2FAs Together
Combine ACTs and D2FAs to address both kinds of redundancy
Procedure:
1. Apply D2FA compression to DFAs
2. Apply multiple ACT compression to D2FA results
Only slight modification to ACT construction
- Add "not handled here" symbol
- Deal with default transitions
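One plausible reading of the combined lookup, sketched under stated assumptions: a compressed entry may hold a "not handled here" marker, in which case the matcher follows the state's default transition and repeats the lookup with the default target's ACT. The values are reconstructed from the figures on the following slides; the state numbering, the `NOT_HANDLED` sentinel and the storage layout are assumptions, not the paper's implementation.

```python
NOT_HANDLED = None  # stand-in for the slide's "not handled here" symbol

ACTS = [
    [0, 0, 0, 1, 1, 1, 2, 2],  # ACT 0: for states that keep a full table
    [0, 0, 0, 1, 2, 3, 4, 0],  # ACT 1: index 0 groups bytes no state in the group stores
]

# Compressed rows of (next_act, next_state) pairs; state numbering is assumed.
ROWS = {
    0: [(0, 12), (1, 4), (0, 8)],
    1: [NOT_HANDLED, NOT_HANDLED, (0, 2), (0, 8), (0, 2)],
    2: [(0, 25), (1, 41), (0, 5)],
    3: [NOT_HANDLED, (1, 6), NOT_HANDLED, (0, 5), (1, 41)],
}

# Default transitions: state -> (ACT of target, target state).
DEFAULTS = {1: (0, 0), 3: (0, 2)}


def step(crt_act, crt_state, input_byte):
    """Return (next_act, next_state), following defaults on "not handled here"."""
    act, state = crt_act, crt_state
    while True:
        index = ACTS[act][input_byte]
        entry = ROWS[state][index]
        if entry is not NOT_HANDLED:
            return entry
        act, state = DEFAULTS[state]


print(step(crt_act=1, crt_state=1, input_byte=1))  # (0, 12), after one default hop
print(step(crt_act=1, crt_state=3, input_byte=3))  # (1, 6), stored directly
```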
![Page 26: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/26.jpg)
ACTs + D2FAs
[Figure: the D2FA from the D2FAs slide: two full transition tables, two reduced tables (2 8 2 and 6 5 41), and default transitions.]
![Page 27: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/27.jpg)
ACTs + D2FAs
[Figure: ACT 0 (0 0 0 1 1 1 2 2) compresses the two full tables to three entries each (12 4 8 and 25 41 5); the reduced tables 2 8 2 and 6 5 41 are unchanged.]
![Page 28: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/28.jpg)
ACTs + D2FAs
[Figure: ACT 1 (0 0 0 1 2 3 4 0) is added alongside ACT 0 (0 0 0 1 1 1 2 2), and every entry stores a (next_act, next_state) pair, for example (0, 12), (1, 4), (0, 8).]
![Page 29: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/29.jpg)
ACTs + D2FAs
[Figure: same tables as the previous slide. Worked lookup: crt_act=1, crt_state=1, input=1 gives index=0; a second lookup, also with index=0, gives next_state=12 and next_act=0.]
![Page 30: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/30.jpg)
Outline
- Introduction
- Alphabet Compression Tables
- Interacting with D2FAs
- Experimental Results
![Page 31: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/31.jpg)
Experimental Setup
1550 HTTP, SMTP, FTP signatures
- Grouped by protocol and rule set (Snort or Cisco)
DFA Set Splitting (Yu, 2006) to cluster DFAs
- Provide memory bound a priori
- Heuristically combine into as few DFAs as possible
Experiment environment
- 10 GB traces, run on 3.0 GHz P4
- Exec time measured with cycle-accurate counters
![Page 32: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/32.jpg)
Memory vs Time
[Figure: memory vs. time plot with points labeled "4000 States, 114 DFAs", "8000 States, 82 DFAs", "16000 States, 45 DFAs", and "ideal".]
![Page 33: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/33.jpg)
Memory vs Time
[Figure: two panels, "Snort HTTP" and "Cisco IPS HTTP".]
Lowest mem, highest exec!
![Page 34: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/34.jpg)
Memory vs Time
[Figure: two panels, "Snort SMTP" and "Cisco IPS SMTP".]
Lowest mem, highest exec!
![Page 35: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/35.jpg)
Conclusion
Multiple alphabet compression tables
- Lightweight
- Applicable to hardware or software platforms
- Compatible with other techniques
Provides better time vs. space performance
- 4x to 70x memory reduction
- 35% to 85% execution time increase
Best technique is a function of time and memory limits
- ACTs add superior design points
![Page 36: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/36.jpg)
Efficient Signature Matching with Multiple Alphabet Compression Tables
Thank you
![Page 37: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/37.jpg)
![Page 38: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/38.jpg)
Regular Expressions and DFAs
Regular expressions are the standard for writing signatures
- Buffer overflow: /^RETR\s[^\n]{100}/
- Format string attack: /^SITE\s+EXEC[^\n]*%[^\n]*%/
DFAs are used to match signatures against input
[Figure: a packet (header C0 A8 64 01, payload "site exec % retr %") is run through a DFA with states S1, S2, and S3 and edges labeled r, e, t, s, i, \n, [^\n], [^\nrs], and %; the result is "Match sig 2!".]
![Page 39: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/39.jpg)
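Memory Usage

| Signature set | DFA | ACT | D2FA | ACT + D2FA |
|---|---|---|---|---|
| Snort HTTP | 74 | 8.1 | 8.8 | 4.3 |
| Snort SMTP | 98 | 5.7 | 42 | 3.4 |
| Snort FTP | 94 | 4.9 | 9.2 | 3.9 |

| Signature set | DFA | ACT | D2FA | ACT + D2FA |
|---|---|---|---|---|
| Cisco HTTP | 116 | 30 | 4.7 | 17 |
| Cisco SMTP | 110 | 29 | 3.0 | 18 |
| Cisco FTP | 83 | 5.1 | 1.7 | 1.9 |

All results reported in megabytes (MB)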
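To read the table as reduction factors, each compressed size can be divided into the plain DFA size. The sketch below does that arithmetic with the numbers copied from the table as printed, column order included; the helper name is hypothetical.

```python
# Memory in MB, copied from the table above as printed.
MEMORY_MB = {
    "Snort HTTP": {"DFA": 74, "ACT": 8.1, "D2FA": 8.8, "ACT + D2FA": 4.3},
    "Snort SMTP": {"DFA": 98, "ACT": 5.7, "D2FA": 42, "ACT + D2FA": 3.4},
    "Snort FTP": {"DFA": 94, "ACT": 4.9, "D2FA": 9.2, "ACT + D2FA": 3.9},
    "Cisco HTTP": {"DFA": 116, "ACT": 30, "D2FA": 4.7, "ACT + D2FA": 17},
    "Cisco SMTP": {"DFA": 110, "ACT": 29, "D2FA": 3.0, "ACT + D2FA": 18},
    "Cisco FTP": {"DFA": 83, "ACT": 5.1, "D2FA": 1.7, "ACT + D2FA": 1.9},
}


def reduction_factors(row):
    """Return DFA size divided by each compressed size."""
    base = row["DFA"]
    return {name: base / mb for name, mb in row.items() if name != "DFA"}


for name, row in MEMORY_MB.items():
    factors = ", ".join(f"{k} {v:.1f}x" for k, v in reduction_factors(row).items())
    print(f"{name}: {factors}")
```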
![Page 40: Efficient Signature Matching with Multiple Alphabet Compression](https://reader031.vdocuments.mx/reader031/viewer/2022030323/58a2e83b1a28abd1778b8d33/html5/thumbnails/40.jpg)
Memory vs Time
[Figure: two panels, "Snort FTP" and "Cisco IPS FTP".]