© 2013
Liu Yang
ALL RIGHTS RESERVED
NEW PATTERN MATCHING ALGORITHMS FOR NETWORK SECURITY APPLICATIONS
BY LIU YANG
A dissertation submitted to the
Graduate School—New Brunswick
Rutgers, The State University of New Jersey
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
Graduate Program in Computer Science
Written under the direction of
Vinod Ganapathy
and approved by
New Brunswick, New Jersey
May, 2013
ABSTRACT OF THE DISSERTATION
New Pattern Matching Algorithms for Network Security
Applications
by Liu Yang
Dissertation Director: Vinod Ganapathy
Modern network security applications, such as network-based intrusion detection sys-
tems (NIDS) and firewalls, routinely employ deep packet inspection to identify malicious
traffic. In deep packet inspection, the contents of network packets are matched against
patterns of malicious traffic to identify attack-carrying packets. The pattern matching
algorithms employed for deep packet inspection must satisfy two requirements. First,
the algorithms must be fast. Network security applications are often implemented as
middleboxes that reside on high-speed Gbps links, and the algorithms are expected to
perform at such speeds. Second, the algorithms must be space-efficient. The middleboxes
that perform pattern matching are often implemented as hardware components that
employ fast but expensive SRAM technology to ensure good performance.
Unfortunately, existing pattern matching algorithms suffer from a fundamental time-
space tradeoff. The large majority of patterns are regular expressions, and there are
three prior approaches for matching such patterns: deterministic finite automata
(DFAs), non-deterministic finite automata (NFAs), and recursive backtracking-based
approaches. DFAs are fast to operate, but are space-inefficient. NFAs are space effi-
cient, but are slow to operate. Recursive backtracking is fast for benign packets but is
vulnerable to attack-carrying packets that can induce algorithmic complexity attacks.
This dissertation proposes novel algorithms for time- and space-efficient pattern
matching that also resist known algorithmic complexity attacks. It presents three con-
tributions. First, it introduces NFA-OBDDs, a new data structure that allows time-
and space-efficient matching of regular expressions. Second, it presents an extension to
NFA-OBDDs that allows them to model submatch extraction, an important feature in
real-world patterns used by network security applications. Finally, it presents a tech-
nique to efficiently match a non-regular pattern language: regular expressions extended
with back-references. This dissertation presents experimental results demonstrating that
the new algorithms can beat the performance of existing, widely-deployed algorithms
(such as Google’s RE2 and PCRE) by several orders of magnitude.
Acknowledgements
I would like to express my thanks to:
• my advisor Prof. Vinod Ganapathy, for his insightful advice and support, for
providing me an excellent research environment at Rutgers University.
• Prof. Liviu Iftode, for his co-advising of my research work in mobile security.
• Dr. Markus Jakobsson, for being my first research navigator in computer security
and privacy, and for providing me with valuable advice on my life and studies in the USA.
• Dr. Pratyusa Manadhata, Dr. William Horne, Dr. Prasad Rao, Dr. Randy
Smith, Rezwana Karim, Nader Boushehrinejadmoradi, Pallab Roy, and all my
collaborators for contributing their expertise in my research work.
• Prof. Jie Hu, for her kind help and encouragement during my studies in the USA.
• my colleagues and other members of the Disco-Lab in the Computer Science
Department at Rutgers University, for their helpful discussions and feedback on my
research projects and presentations.
• my friends over the years, for sharing my happiness and concerns.
Thank you so much.
My biggest source of strength and motivation originates from my parents. My
progress in studies and career is due to their endless love, unconditional support, sac-
rifice, and encouragement. My success is their success. Without their affection and
guidance, I would not be able to receive my degrees.
Funding. My graduate studies were funded by NSF grants CNS-0831268, CNS-
0915394, CNS-0931992, CNS-0952128 and CNS-1117711. The Cloud and Security
Group at HP Laboratories (Princeton, NJ) and the Security and Cryptography Group
at Microsoft Research (Redmond, WA) employed me as a research intern in Summer
2011 and Summer 2012, respectively. Their support is gratefully acknowledged.
Dedication
I dedicate this dissertation to my wonderful family. Particularly to my wife, Weiwei,
who has provided me many years of support and understanding to my research work,
and to our lovely son Andy, who is the joy of our life. I must thank my mother-in-law,
who has helped us so much in baby-sitting, and my father-in-law, who has supported
us both financially and emotionally.
Table of Contents
Abstract
Acknowledgements
Dedication
List of Tables
List of Figures
1. Introduction
    1.1. Pattern Matching in Network Security Applications
    1.2. Existing Approaches for Pattern Matching
        1.2.1. Finite Automata
            Non-deterministic Finite Automata
            Deterministic Finite Automata
        1.2.2. Pattern Matching Algorithms
            DFA-based Matching
            Thompson’s NFA-based Matching
            Recursive Backtracking-based Matching
    1.3. Challenges in Pattern Matching
        1.3.1. State Blow-up of DFAs
        1.3.2. Growth of Pattern Sets
        1.3.3. Slow Operation of NFAs
        1.3.4. Algorithmic Complexity Attacks
    1.4. Summary of Contributions
    1.5. Contributors to the Dissertation
    1.6. Dissertation Organization
2. Background: Ordered Binary Decision Diagrams
    2.1. Definition of OBDD
    2.2. Operations in OBDDs
    2.3. Representing Relations and Sets
3. Improving NFA-based Pattern Matching using OBDDs
    3.1. Introduction
    3.2. Representing and Operating NFAs and NFA-OBDDs
        3.2.1. NFA Operation using Boolean Function Manipulation
        3.2.2. NFA-OBDDs
    3.3. Experimental Apparatus and Data Sets
        3.3.1. Signature Sets and Network Traffic
        3.3.2. Experimental Setup
    3.4. Experimental Evaluation
        3.4.1. NFA-OBDDs: Construction and Performance
        3.4.2. Comparison with NFAs
        3.4.3. Comparison with the PCRE Package
        3.4.4. Comparison with DFA Variants
            Multiple DFAs
            Hybrid Finite Automata
        3.4.5. Deconstructing NFA-OBDD Performance
        3.4.6. Impact of Variable Ordering on NFA-OBDD Performance
    3.5. Matching Multiple Input Symbols
        3.5.1. Adapting to the Streaming Model
        3.5.2. Reducing Space Consumption using Alphabet Compression
        3.5.3. Performance of k-stride NFA-OBDDs
    3.6. Related Work
    3.7. Summary
4. Fast Submatch Extraction using OBDDs
    4.1. Introduction
    4.2. Design and Implementation
        4.2.1. Solution Overview
        4.2.2. Tagging NFAs for Submatch
        4.2.3. Operations on Tagged NFAs
            Match Test
            Submatch Extraction
        4.2.4. Boolean Function Representation
            Match Test
            Submatch Extraction
        4.2.5. Submatch-OBDD
        4.2.6. Implementation
    4.3. Evaluation
        4.3.1. Data Sets
            Snort-2009
            Snort-2012
            Firewall-504
        4.3.2. Experimental Setup
        4.3.3. Performance Results
            Snort-2009
            Snort-2012
            Firewall-504
        4.3.4. Discussion
    4.4. Related Work
    4.5. Summary
5. A New Algorithm for Patterns with Back References
    5.1. Introduction
    5.2. Design of Algorithm
        5.2.1. Pattern Compilation
        5.2.2. Execution
            Frontier Derivation
            Acceptance Checking
    5.3. Implementation
    5.4. Evaluation
        5.4.1. Data Sets
            Patho-01
            Snort-46
        5.4.2. Performance
    5.5. Related Work
    5.6. Summary
6. Conclusion and Future Directions
    6.1. Conclusions
        6.1.1. NFA-OBDD
        6.1.2. Submatch-OBDD
        6.1.3. NFA-backref
    6.2. Future Directions
        6.2.1. Hardware-based NFA-OBDD and Submatch-OBDD
        6.2.2. Security in Software Defined Networking
            Security Applications in SDN
            Digital Forensics in SDN
            Configuration Validation in SDN
References
List of Tables
4.1. Transition table of an example tagged NFA
4.2. Boolean encoding of transitions in Table 4.1
4.3. Execution time and memory consumption for the Snort-2009 data set
4.4. Execution time and memory consumption for the Snort-2012 data set
4.5. Execution time and memory consumption for the Firewall data set
5.1. Transition table of an example tagged NFA
List of Figures
1.1. An example signature from Snort 2012
1.2. A simplified network-based intrusion detection system
1.3. An example NFA
1.4. A DFA equivalent to the NFA in Figure 1.3
1.5. An NFA-based pattern matching example
1.6. An example of using backtracking to do match test
1.7. The time-space tradeoff of different pattern matching approaches
1.8. An example of DFA combination
1.9. An example of DFA state blow-up
1.10. The growth trend of Snort signatures
1.11. An example path tree traversed by the recursive backtracking algorithm
2.1. An example of a Boolean formula and OBDDs
2.2. An example of Apply and Restrict operations in OBDDs
3.1. NFA for (0|1)∗1
3.2. Components of our software-based implementation of NFA-OBDDs
3.3. Statistics of the HTTP traces used in our experiments
3.4. Statistics of the FTP traces used in our experiments
3.5. NFA-OBDD construction results
3.6. Performance data of different implementations
3.7. Raw performance numbers for the charts shown in Figure 3.6
3.8. Fraction of time spent performing OBDD operations
3.9. Impact of OBDD variable ordering on the performance of NFA-OBDDs
3.10. 2-stride NFA for Figure 3.1
3.11. The NFA in Figure 3.10 adapted for streaming
3.12. Memory versus throughput for 1-stride and 2-stride NFA-OBDDs
4.1. Basic elements of tagged NFAs
4.2. The union, concatenation, and closure constructs of tagged NFAs
4.3. An example tagged ε-NFA
4.4. An example ε-free tagged NFA
4.5. Example of frontier derivation in a tagged NFA
4.6. The ordered binary decision diagram of a Boolean function
5.1. The tagged NFA constructed from “(a*)aa(a*)”
5.2. The toolchain of our back reference implementation
5.3. Performance of different implementations for the Patho-01 data set
5.4. Performance of different implementations for the Snort-46 pattern set
Chapter 1
Introduction
Network security applications, e.g., network-based intrusion detection systems (NIDS),
employ pattern matching to identify data of interest. An ideal pattern matching algo-
rithm in network security applications must satisfy two requirements: time efficiency
and space efficiency. Since network security applications are often deployed over
high-speed network links, time efficiency requires the time spent processing each byte of
data to be small enough to keep up with Gbps packet processing speeds. Space efficiency
requires that the representation of a large number of patterns be small enough to fit into
the main memory of a system. In this dissertation, we first show that existing approaches
for pattern matching suffer from a time-space tradeoff. We then present several pattern
matching algorithms that overcome the weaknesses of existing solutions. Our experi-
mental results have shown that the new algorithms outperform existing approaches by
one to three orders of magnitude while avoiding memory blow-up and resisting known
algorithmic attacks.
In this chapter, we first briefly describe how network security applications employ
pattern matching to identify data of interest in Section 1.1, followed by the description
of existing pattern matching algorithms in Section 1.2. We describe the main chal-
lenges of pattern matching in Section 1.3 and summarize the main contributions of this
dissertation to address those challenges in Section 1.4. We list the contributors to this
dissertation in Section 1.5. Finally, we describe the organization of this dissertation in
Section 1.6.
1.1 Pattern Matching in Network Security Applications
Modern computer networks rely on intrusion detection systems (IDS) for security. Intru-
sion detection systems can be broadly categorized as host-based IDS [92] and network-
based IDS (NIDS). NIDS can be further categorized into anomaly-based detection sys-
tems [91, 89, 63] and signature-based detection systems [93, 61, 17, 5]. Signature-based
NIDS employ patterns to describe the features of malicious data (called signatures).
Figure 1.1 shows an example signature from Snort [66, 5], a commercial signature-
based NIDS¹. The “tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS” part
in the signature specifies that this signature applies to TCP packets coming from an
outside network address with any port number to an HTTP server at the HTTP ports.
The pcre part shows a pattern that is used to match the payload of a packet. In
particular, pattern
"/username=[^&\x3b\r\n]{255}/si"
will match network packets whose contents include a string of the form “username=”
followed by 255 characters that are not ‘&’, ‘;’, carriage return, or line feed.
alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS (msg:"...,
pcre:"/username=[^&\x3b\r\n]{255}/si"; metadata:service http; ...
classtype:web-application-attack; sid:2702; rev:9;)
Figure 1.1: An example signature from Snort 2012.
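As a rough illustration of what the payload pattern in Figure 1.1 accepts, the following sketch tests it with Python's `re` module rather than with Snort or PCRE itself. The helper name `is_suspicious` and the sample payloads are hypothetical, and mapping the `/s` and `/i` modifiers to `re.DOTALL` and `re.IGNORECASE` is an assumption that holds for this simple pattern.

```python
import re

# Payload pattern copied from the signature in Figure 1.1. The PCRE modifiers
# /s and /i are approximated by re.DOTALL and re.IGNORECASE.
USERNAME_OVERFLOW = re.compile(
    r"username=[^&\x3b\r\n]{255}", re.DOTALL | re.IGNORECASE
)


def is_suspicious(payload: str) -> bool:
    """Return True if the payload contains an over-long username field."""
    return USERNAME_OVERFLOW.search(payload) is not None


# Hypothetical payloads: 255 allowed characters match, 254 followed by '&' do not.
long_field = "username=" + "A" * 255
short_field = "username=" + "A" * 254 + "&next=1"
print(is_suspicious(long_field))   # True
print(is_suspicious(short_field))  # False
```

The excluded characters are ‘&’, ‘;’ (written as `\x3b`), carriage return, and line feed, so a field that is broken by any of them before reaching 255 characters does not match.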
Figure 1.2 shows how a NIDS performs deep packet inspection by matching incoming
packets against known patterns of malicious traffic. In this figure, the pattern set
contains one toy pattern "evil". Packets that contain “evil” in their payloads are
considered malicious and trigger alerts. All other packets, which do
not contain “evil” in their payloads, are considered innocent and allowed to pass the
NIDS. Patterns in real systems are often more complex, like the one shown in Figure 1.1.
¹ For simplicity, we use the term NIDS to refer to a signature-based NIDS in the following descriptions.
Figure 1.2: A simplified network-based intrusion detection system.
1.2 Existing Approaches for Pattern Matching
In the past, attack patterns were keywords that could be efficiently matched using
string matching algorithms such as KMP [44], Boyer-Moore [16], Wu-Manber [96], and
Aho-Corasick [7]. The increasing complexity of attacks has led the research
community to investigate more expressive pattern representations, which require the
full power of regular expressions together with extended features such as submatch
extraction and back references. Existing approaches for pattern matching can be broadly
categorized as finite automaton-based and recursive backtracking-based techniques.
1.2.1 Finite Automata
The majority of patterns used in network security applications are regular expressions
(that is, they describe regular languages). Finite automata are natural representations
for regular expressions. It is known that regular expressions, deterministic finite au-
tomata (DFAs), and non-deterministic finite automata (NFAs) are equivalent in terms
of expressive power. Therefore, regular expression matching can be performed by op-
erating the corresponding NFAs or DFAs. Tools such as GNU grep [4], Awk [3], and
Tcl [59] implement regular expression matching using NFA and DFA-based approaches.
Figure 1.3: An NFA constructed from regular expression “a*aa”.
Non-deterministic Finite Automata
An NFA can be represented using a 5-tuple: (Q, Σ, ∆, q0, Fin), where Q is a finite set
of states, Σ is a finite set of input symbols (the alphabet), ∆: Q × (Σ ∪ {ε}) → 2^Q is a
transition function, q0 ∈ Q is a start state, and Fin ⊆ Q is a set of accepting (or final)
states. The transition function ∆(s, i) = T describes the set of all states t ∈ T such
that there is a transition labeled i from s to t. Note that ∆ can also be expressed as a
relation δ ⊆ Q × Σ × Q, so that (s, i, t) ∈ δ for all t ∈ T such that ∆(s, i) = T.
Given a regular expression, we can use Thompson’s algorithm [84] to construct
an ε-NFA that recognizes the same language as the given regular expression. An
ε-NFA can be converted to an ε-free-NFA using the ε-closure algorithm. For example,
Figure 1.3 shows an ε-free-NFA constructed from the regular expression “a*aa”, where the
start state is numbered 1 and the final state is numbered 3. For simplicity, we
use the term NFA to refer to an ε-free-NFA in the following descriptions.
Deterministic Finite Automata
An NFA (Q, Σ, ∆, q0, Fin) can be converted to a DFA that recognizes the same
language using the subset construction algorithm [38]. The states of the corresponding
DFA are subsets of Q, and its initial state is {q0}. For a state S of the DFA and an
input symbol i, the transition function is defined as T(S, i) = ∪{∆(q, i) | q ∈ S}.
In other words, the transition function maps a state S (a subset of Q)
and an input symbol i to the set of all states that can be reached by an i-transition from a
state in S. A state S of the DFA is an accept state if and only if at least one member
of S is an accept state of the corresponding NFA. Figure 1.4 shows a DFA constructed
from the example NFA in Figure 1.3 using the subset construction algorithm. The start
state of the DFA is {1}, and the accept state is {1, 2, 3}.
Figure 1.4: The DFA recognizing the same language as the NFA in Figure 1.3.
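The subset construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the dissertation: the dictionary encoding of the example NFA for “a*aa” and the names `NFA_DELTA` and `subset_construction` are assumptions made for the example, and only states reachable from the start state are generated.

```python
from collections import deque

# Hypothetical encoding of the example NFA for "a*aa":
# (state, symbol) maps to the set of successor states.
NFA_DELTA = {(1, "a"): {1, 2}, (2, "a"): {3}}
NFA_START = 1
NFA_FINAL = {3}
ALPHABET = {"a"}


def subset_construction(delta, start, final, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    start_set = frozenset({start})
    transitions = {}
    accepting = set()
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        # A DFA state accepts if it contains at least one NFA accept state.
        if current & final:
            accepting.add(current)
        for symbol in sorted(alphabet):
            # T(S, i) is the union of the NFA successors of every q in S.
            target = frozenset(
                t for q in current for t in delta.get((q, symbol), ())
            )
            if not target:
                continue  # no successor: the DFA has no transition here
            transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start_set, transitions, accepting


dfa_start, dfa_transitions, dfa_accepting = subset_construction(
    NFA_DELTA, NFA_START, NFA_FINAL, ALPHABET
)
```

For the example NFA, the construction yields the three DFA states {1}, {1, 2}, and {1, 2, 3}, with {1, 2, 3} as the only accept state, which agrees with Figure 1.4.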
1.2.2 Pattern Matching Algorithms
DFA-based Matching
An important feature of DFAs is that, at any state, each possible input symbol leads to
at most one new state. A DFA-based matching algorithm is described by Algorithm 1.
The input of the algorithm is a string str and a DFA with start state S0 and set of
final states F. At line 1, the algorithm assigns the start state S0 as the current state. The
loop from line 2 to line 8 performs the match test. For the ith input symbol, the transition
function T returns the next state by looking up the transition relation of the DFA. If
the returned next state is null, the algorithm returns false, which means that the
input string does not match the DFA. If the input symbol is the last character of the
input string, line 6 checks whether the next state is in the final state
set. If so, the algorithm returns true, which means that the input string is matched by
the DFA. Line 8 makes the next state the current state, and the loop continues.
Finally, the algorithm returns false if the DFA is not in a final state after consuming
the last symbol of the input string.
Algorithm: DFA-MATCH-TEST
Input: An input string str to be matched, and a DFA with start state S0 and final state set F.
Output: true or false
1  current_state = S0;
2  for i = 1 to strlen(str) do
3      next_state = T(current_state, str[i]);
4      if next_state is null then
5          return false;
6      if i == strlen(str) and next_state ∈ F then
7          return true;
8      current_state = next_state;
9  return false;
Algorithm 1: DFA-based matching algorithm.
For the example DFA in Figure 1.4 with input string “aaaa”, the match test proceeds
as follows according to Algorithm 1. Starting from the start state {1}, with the first
input symbol ‘a’, the next state of the DFA is {1, 2}. With {1, 2}
as the current state and the second input symbol ‘a’, the next state is {1, 2, 3}. With
{1, 2, 3} as the current state and the third input symbol ‘a’, the next state of the DFA is still
{1, 2, 3}. After consuming the fourth input symbol ‘a’, the DFA stays at state {1, 2, 3},
which is an accept state. Therefore, the input string “aaaa” is matched by the DFA.
For an input string of length n, the running time of Algorithm 1 is O(n) since for
each input symbol, at most one transition lookup is performed. Thus, a DFA-based
matching algorithm is time efficient.
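A minimal Python sketch of Algorithm 1 for the DFA of Figure 1.4 is given below. The table encoding and the names `DFA_TRANSITIONS` and `dfa_match` are illustrative assumptions. One deliberate difference from the pseudocode is that the acceptance check is done once after the loop, which gives the same answer for non-empty inputs; for the empty string this sketch reports whether the start state is accepting, whereas Algorithm 1 returns false.

```python
# Hypothetical table encoding of the DFA in Figure 1.4. Each DFA state is a
# frozenset of NFA states; a missing table entry plays the role of "null".
S1, S12, S123 = frozenset({1}), frozenset({1, 2}), frozenset({1, 2, 3})
DFA_START = S1
DFA_ACCEPTING = {S123}
DFA_TRANSITIONS = {(S1, "a"): S12, (S12, "a"): S123, (S123, "a"): S123}


def dfa_match(text, start=DFA_START, transitions=DFA_TRANSITIONS,
              accepting=DFA_ACCEPTING):
    """Return True if the DFA is in an accepting state after reading text."""
    state = start
    for symbol in text:
        # One table lookup per input symbol, hence O(n) time overall.
        state = transitions.get((state, symbol))
        if state is None:
            return False
    return state in accepting


print(dfa_match("aaaa"))  # True
print(dfa_match("a"))     # False
```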
Thompson’s NFA-based Matching
Unlike a DFA, an NFA at any state can have multiple choices for the next state
after reading an input symbol. An NFA matches an input string if there is a way for it
to read the input string and reach a final state at the end of the string. An
NFA-based match test is described in Algorithm 2. Line 1 initializes the current set of
states to the start state. The loop from line 2 to line 8 processes the transitions for
each input symbol. Given a set of current states, the next set of states (also called
the frontier) is the union of the states that are reachable from any state s in the
current set, as shown in the loop at lines 4 and 5. After consuming the last input
symbol, line 6 checks whether the next set of states contains a final state. If so, the
algorithm returns true, which means that the input string is matched by the NFA. For an
NFA that has m states, the size of the frontier (current states) is O(m). The running
time of Algorithm 2 is therefore O(m × n) for a string of length n, because each of the n
input symbols may require a transition lookup for every state in the frontier. Compared
with DFA-based matching, NFA-based matching is slower.
Algorithm: Thompson’s algorithm
Input: An input string str to be matched, and an NFA (Q, Σ, ∆, q0, Fin).
Output: true or false
1  current_states = {q0};
2  for i = 1 to strlen(str) do
3      next_states = ∅;
4      foreach s ∈ current_states do
5          next_states = next_states ∪ δ(s, str[i]);
6      if (i == strlen(str)) and (next_states ∩ Fin ≠ ∅) then
7          return true;
8      current_states = next_states;
9  return false;
Algorithm 2: Thompson’s algorithm for NFA-based match test.
For the example NFA in Figure 1.3 with input string “aaaa”, NFA-based matching
proceeds as follows according to Algorithm 2. Starting from the start
state set {1} with the first input symbol ‘a’, the next set of states is {1, 2}. With
{1, 2} as the current set of states and the second input symbol ‘a’, the next frontier
is {1, 2, 3}. The third and fourth input symbols are processed in the same way, and
the frontier at the end is {1, 2, 3}. Since state 3 is an accept state,
the input string “aaaa” is matched by the NFA. Figure 1.5 demonstrates the process
of matching “aaaa” with the example NFA in Figure 1.3. Observe that
processing each input symbol involves multiple transition lookups.
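The frontier computation of Algorithm 2 can be sketched in Python as follows. The dictionary encoding of the example NFA and the function names `frontiers` and `nfa_match` are assumptions made for illustration. The sketch records every frontier so that the walk-through above can be reproduced, and it checks acceptance after the loop instead of inside it.

```python
# Hypothetical encoding of the example NFA for "a*aa" (Figure 1.3).
NFA_DELTA = {(1, "a"): {1, 2}, (2, "a"): {3}}
NFA_START = 1
NFA_FINAL = {3}


def frontiers(text, delta=NFA_DELTA, start=NFA_START):
    """Return the list of frontiers, starting with the initial state set."""
    current = {start}
    trace = [set(current)]
    for symbol in text:
        next_states = set()
        # Up to m lookups per symbol, one for each state in the frontier.
        for s in current:
            next_states |= delta.get((s, symbol), set())
        current = next_states
        trace.append(set(current))
    return trace


def nfa_match(text, delta=NFA_DELTA, start=NFA_START, final=NFA_FINAL):
    """Return True if the last frontier contains an accept state."""
    return bool(frontiers(text, delta, start)[-1] & final)


print(frontiers("aaaa"))  # [{1}, {1, 2}, {1, 2, 3}, {1, 2, 3}, {1, 2, 3}]
print(nfa_match("aaaa"))  # True
```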
Figure 1.5: Matching input string aaaa with the example NFA in Figure 1.3 using Thompson’s algorithm.
Recursive Backtracking-based Matching
Another way to simulate an NFA is recursive backtracking, which is shown in
Algorithm 3 and Algorithm 4. Line 1 of Algorithm 3 initializes a Boolean variable
matched to false. Line 2 calls BT-MATCH to perform a recursive backtracking-based
match test on the input string str. BT-MATCH may change the value of the matched
variable during its execution. Algorithm 4 is the core of the match test. It
operates in a depth-first manner. For a current state s and the ith input symbol,
the algorithm processes all states in δ(s, str[i]) depth first (lines 2 to
6). Lines 3 to 5 perform an acceptance check if the current symbol is the last character
of str. If a state t ∈ δ(s, str[i]) is an accept state, then true is assigned to
matched and the procedure returns; otherwise it recursively calls BT-MATCH,
passing t and i+1 as the new parameters. A recursive backtracking algorithm may have to
scan an input string multiple times before it finds a match. Tools like PCRE and the
regular expression libraries of high-level languages such as Java, Perl, and Python
implement pattern matching using recursive backtracking.
Algorithm: BT-MATCH-TEST
Input: an input string str and an NFA (Q, Σ, ∆, q0, Fin).
Output: true or false
1  matched ← false;
2  BT-MATCH(str, q0, 0);
3  return matched;
Algorithm 3: A recursive backtracking-based matching algorithm.
Algorithm: BT-MATCH(str, s, i)
Input: str is a string to be matched, s is a current state, and i is the offset of an input symbol
1  if i < strlen(str) then
2      foreach t ∈ δ(s, str[i]) do
3          if (i == strlen(str) − 1) and (t ∈ Fin) then
4              matched ← true;
5              return;
6          BT-MATCH(str, t, i + 1);
Algorithm 4: The body of recursive backtracking-based matching algorithm.
For the example NFA in Figure 1.3 with input string “aaaa”, Figure 1.6 shows
how a backtracking approach finds a match after trying three paths. The backtracking
algorithm first tries the path 1 -> 1 -> 1 -> 1 -> 1 and fails. It then backtracks one
step, tries the path 1 -> 1 -> 1 -> 1 -> 2, and fails again. After that, it backtracks
two steps, tries 1 -> 1 -> 1 -> 2 -> 3, and succeeds (state 3 is an accept state).
In this example, the last input character was scanned three times, and the second-to-last
character was scanned twice, before the algorithm found that the input string matches
the pattern.
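The backtracking search can be sketched in Python as below. It is a simplified variant of Algorithms 3 and 4 and not a transcription: it returns a Boolean instead of setting a shared matched variable, it stops at the first accepting path, and it counts the transitions it follows. The names `bt_match` and `NFA_DELTA` are hypothetical, and successors are stored in ordered lists so that the exploration order is the one used in the walk-through above.

```python
# Hypothetical encoding of the example NFA for "a*aa"; lists fix the order in
# which successor states are tried.
NFA_DELTA = {(1, "a"): [1, 2], (2, "a"): [3]}
NFA_START = 1
NFA_FINAL = {3}


def bt_match(text, delta=NFA_DELTA, start=NFA_START, final=NFA_FINAL):
    """Depth-first match test; returns (matched, transitions_followed)."""
    steps = 0

    def explore(state, i):
        nonlocal steps
        if i == len(text):
            return state in final
        for t in delta.get((state, text[i]), []):
            steps += 1  # one transition followed, i.e. one character scan
            if explore(t, i + 1):
                return True
        return False  # dead end: the caller backtracks to its next choice

    return explore(start, 0), steps


print(bt_match("aaaa"))  # (True, 7)
```

For “aaaa” the search follows seven transitions: one each for the first two characters, two for the third, and three for the last, matching the counts given in the text. A DFA would perform four lookups on the same input.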
Figure 1.6: An example of using backtracking to do match test.
1.3 Challenges in Pattern Matching
Ideally, pattern matching in network security applications should satisfy two
requirements: time efficiency and space efficiency. Since network security applications
are often deployed over high-speed network links, time efficiency requires the time spent
processing each byte of data to be small enough to keep up with Gbps packet processing
speeds. Space efficiency requires that the representation of a large number of patterns be
small enough to fit into the main memory of a system. In this section, we show that pattern
matching in practice suffers from a time-space tradeoff. DFA-based approaches are fast,
but suffer from state blow-up due to the subset construction mechanism. NFA-based
approaches are space efficient, but are slow in operation because an NFA can be in
multiple states simultaneously at any instant. Recursive backtracking-based approaches
are fast in general, but are vulnerable to algorithmic complexity attacks [75]. The
performance of a NIDS implemented using recursive backtracking can be slowed down by
several orders of magnitude under such attacks [75]. The time-space
tradeoff of the different approaches is shown in Figure 1.7, where the x-axis denotes
space and the y-axis denotes the time spent processing a byte of data. An ideal solution
should be close to the origin of the coordinate system. As we can see, none of the
existing approaches can be considered ideal. The goal of this dissertation is to propose
new algorithms that are close to the ideal solution.
1.3.1 State Blow-up of DFAs
From Section 1.2.2, we know that for an input string of length n, the running time
of a DFA-based matching algorithm is O(n). Thus, DFA-based algorithms are time
efficient, making DFAs attractive candidates for pattern matching. However, DFA-
based algorithms have a potential disadvantage: state blow-up. Real-world pattern
matching often needs to match an input stream against a set of patterns. A DFA-based
solution then requires constructing one DFA that is combined from the individual DFAs
of all patterns in the pattern set. In this way, the combined DFA recognizes a language
that is the union of languages described by the individual patterns in the pattern set.
Figure 1.7: The time-space tradeoff of different pattern matching approaches.
According to the subset construction mechanism, combining two DFAs can result in
a multiplicative increase in the number of states. That is, if the sizes of two DFAs
are m and n, then the size of the combined DFA is O(m × n). Smith et al. [77]
formally characterize this blowup using the notion of ambiguity. Figure 1.8 shows an
example of DFA combination. The left side shows two DFAs of patterns “.*ab.*cd”
and “.*ef.*gh”. The right side shows a DFA combined from the two DFAs on the
left. The combined DFA recognizes any input string that matches either “.*ab.*cd”
or “.*ef.*gh”. The individual DFAs have five states each, while the combined DFA
has sixteen states. The problem will be more pronounced if hundreds of patterns are
combined into one DFA. As a result, DFA representations for large sets of regular
expressions often consume several gigabytes of memory, and do not fit within the main
memory of most NIDS.
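To make the multiplicative growth concrete, the following Python sketch builds the two five-state DFAs by hand and explores their product. It assumes that each DFA accepts at the position where the second keyword ends (so the accept state is not absorbing) and uses a reduced alphabet in which "x" stands for every other byte. Both choices are ours, as are the names two_keyword_dfa and reachable_product, and the automata may differ in detail from those drawn in Figure 1.8.

```python
from collections import deque


def two_keyword_dfa(k1, k2):
    """Step function of a 5-state DFA for '.*k1.*k2' (two-letter keywords).

    States: 0 = nothing seen, 1 = first letter of k1 seen, 2 = k1 seen,
    3 = first letter of k2 seen after k1, 4 = k2 just completed (accepting).
    """
    def step(state, ch):
        if state == 0:
            return 1 if ch == k1[0] else 0
        if state == 1:
            if ch == k1[1]:
                return 2
            return 1 if ch == k1[0] else 0
        # States 2, 3 and 4: k1 has already been seen.
        if ch == k2[0]:
            return 3
        if state == 3 and ch == k2[1]:
            return 4
        return 2
    return step


def reachable_product(step_a, step_b, alphabet):
    """Breadth-first exploration of the product automaton from (0, 0)."""
    start = (0, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        for ch in alphabet:
            nxt = (step_a(a, ch), step_b(b, ch))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


ALPHABET = "abcdefghx"  # 'x' stands in for any byte outside a..h
combined = reachable_product(two_keyword_dfa("ab", "cd"),
                             two_keyword_dfa("ef", "gh"), ALPHABET)
print(len(combined))
```

Under these assumptions the exploration reaches 16 of the 25 possible state pairs, which agrees with the size quoted for the combined DFA.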
State blow-up can also happen for a single pattern that has certain constructs. For
example, if a pattern contains both the wildcard “.*” and a quantifier “{n}”
at different locations, the size of the resulting DFA can be exponential in the value of
the quantifier. Figure 1.9 shows the DFA of an example pattern “.*1[0|1]{3}”. This
pattern matches any binary string whose fourth symbol from the end is 1. The
DFA has 16 states. In fact, for patterns of this type with quantifier n, the size of the corresponding
DFA is $O(2^n)$ [38]. In our experiments, we have encountered patterns from the Snort rule
set [78] where the DFA of a single pattern consumes more than 2.5GB of memory.
Figure 1.8: An example of DFA combination resulting in a multiplicative increase in the number of states (Picture courtesy: [76]).
Figure 1.9: The DFA of pattern “.*1[0|1]{3}”. The size of this DFA is exponential in the value of the quantifier 3.
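The exponential growth can be checked with a small subset-construction sketch in Python. It assumes the usual NFA for this family of patterns: state 0 loops on both symbols and moves to state 1 on "1", states 1 to n advance on either symbol, and state n+1 accepts. The function name dfa_states is hypothetical.

```python
def dfa_states(n):
    """Reachable DFA states (NFA state subsets) for the pattern .*1[01]{n}."""
    start = frozenset({0})
    seen = {start}
    work = [start]
    while work:
        subset = work.pop()
        for ch in "01":
            nxt = {0}                      # state 0 loops on every symbol
            if ch == "1":
                nxt.add(1)                 # a '1' may be the marked symbol
            # States 1..n advance by one on any symbol; n+1 has no successor.
            nxt.update(s + 1 for s in subset if 1 <= s <= n)
            nxt = frozenset(nxt)
            if nxt not in seen:
                seen.add(nxt)
                work.append(nxt)
    return seen


for n in (1, 2, 3, 4):
    print(n, len(dfa_states(n)))
```

For n = 3 the construction reaches 16 subsets, and in general 2^(n+1), because each subset records which of the last n+1 symbols were 1.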
1.3.2 Growth of Pattern Sets
The increasing diversity of network attacks has led to a quick growth in the number
of attack signatures used by NIDS. A common challenge faced by NIDS is how to
adapt to the quick growth of signature databases. Figure 1.10 shows the trend of
growth of the Snort rule set from 2005 to 2012. The number
of signatures increased roughly sevenfold over these eight years. This upward
trend will likely accelerate in the future as NIDS vendors begin to employ automated
and semi-automated methods for signature generation [42, 58, 19, 73]. Space-efficiency
mandates that the size of the representation should grow proportionally (e.g., linearly)
with the number of attack signatures.
Figure 1.10: The number of signatures in Snort increases by seven times in eight years. (x-axis: Year, 2005 to 2012; y-axis: Number of signatures, 0 to 30000.)
1.3.3 Slow Operation of NFAs
Unlike DFAs, combining NFAs leads only to an additive increase in the
number of states [38]. For example, to combine two NFAs, we only need to create a
new start state and a new accept state, and then add two ε-transitions connecting the
new start state with the old start states and two ε-transitions connecting the old accept
states with the new accept state. However, an NFA-based matching algorithm is often
time inefficient. For an NFA with m states, the time complexity to match an input
string of length n is O(m × n) (see the description in Section 1.2.2). The main reason
for this inefficiency is that each frontier update (getting the next
set of states for an input symbol) can require O(m) transition lookups. If we can find
an approach to reduce the cost of frontier updates, the time efficiency of NFA-based
matching can be improved. In this dissertation, we will propose techniques to achieve
that.
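The cost of frontier updates can be seen in a minimal Python sketch of frontier-based NFA simulation. The automaton used here (an ε-free NFA for “.*1[0|1]{3}”) and the names nfa_match, DELTA and ACCEPTING are illustrative assumptions. The sketch counts one transition lookup per frontier state per input symbol, which is where the O(m × n) bound comes from.

```python
# Assumed ε-free NFA for .*1[01]{3}: state 0 loops, states 1..3 advance,
# state 4 accepts.
DELTA = {
    (0, "0"): {0}, (0, "1"): {0, 1},
    (1, "0"): {2}, (1, "1"): {2},
    (2, "0"): {3}, (2, "1"): {3},
    (3, "0"): {4}, (3, "1"): {4},
}
ACCEPTING = {4}


def nfa_match(delta, start, accepting, text):
    """Simulate an NFA by maintaining its frontier; also count lookups."""
    frontier = {start}
    lookups = 0
    for ch in text:
        nxt = set()
        for state in frontier:             # up to m states per input symbol
            lookups += 1
            nxt |= delta.get((state, ch), set())
        frontier = nxt
    return bool(frontier & accepting), lookups


print(nfa_match(DELTA, 0, ACCEPTING, "1000"))
print(nfa_match(DELTA, 0, ACCEPTING, "1111"))
```

The number of lookups never exceeds m × n, and it approaches that bound when the frontier stays large, as it does for inputs consisting mostly of 1s.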
1.3.4 Algorithmic Complexity Attacks
As described in Section 1.2.2, recursive backtracking has been widely adopted for pat-
tern matching implementation. There are two main reasons behind this. The first rea-
son is that recursive backtracking is fast in general and easy to implement. The second
reason is that patterns used in practice often contain features that cannot be described
by regular languages. One such important feature is the back reference. Recursive
backtracking is the de facto implementation of back references. However, a recursive
backtracking matching algorithm can be very slow in certain cases, as is shown by an
example below.
Figure 1.11 shows the process of using a recursive backtracking algorithm to match the
pattern “host.*com.*uuid=.*wv=.*cargo” against the following string, which has 45
characters:
"hostcomhostcomhostcomuuid=uuid=uuid=wv=wv=wv="
We denote the five parts separated by “.*” in the pattern by P1, P2, P3, P4, and
P5 respectively, i.e., P1=“host”, P2=“com”, etc. A number on an edge between two
nodes in the figure denotes an offset where a subexpression Pi(i = 1 . . . 5) is matched
in the input string. For example, the leftmost edge between P1 and P2 is labeled by 3,
which means that “host” is matched by the input string at offset 3. The above pattern
is matched by an input string if and only if P1, P2, P3, P4, and P5 are sequentially
matched by the input string. It can be observed that a backtracking approach needs
to try 45 paths for the input string before it can claim that the example pattern is
not matched by the example input string. In general, for a pattern that has k parts
separated by wildcards “.*”, the running time of a backtracking algorithm
can be close to $O(n^k)$ [75], where n is the length of the input string. Deliberately
triggering this behavior, in which a backtracking algorithm exhaustively tries all
execution paths for an input string, is called an algorithmic complexity attack. Researchers have demonstrated that the
throughput of a NIDS employing recursive backtracking for pattern matching can be
slowed down by several orders of magnitude [75]. Therefore, recursive backtracking is
not a good choice for pattern matching.
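The combinatorial nature of this search can be illustrated with a short Python sketch that counts the ways in which the leading parts of the pattern can be placed, in order, in the example input. This is a simplified model and not a measurement of any particular engine. The number of paths an implementation such as PCRE actually explores depends on its search order and on what is counted as a path, so the count produced here need not coincide with the figure. The name count_placements is hypothetical.

```python
def count_placements(parts, text, start=0):
    """Count the ordered, non-overlapping placements of `parts` in `text`."""
    if not parts:
        return 1
    head, rest = parts[0], parts[1:]
    total = 0
    pos = text.find(head, start)
    while pos != -1:
        # Fix this occurrence of `head`, then place the remaining parts after it.
        total += count_placements(rest, text, pos + len(head))
        pos = text.find(head, pos + 1)
    return total


TEXT = "hostcomhostcomhostcomuuid=uuid=uuid=wv=wv=wv="
PARTS = ["host", "com", "uuid=", "wv=", "cargo"]

# Every placement of the first four parts is a dead end that a backtracking
# matcher may have to visit, because "cargo" never occurs in the input.
print(count_placements(PARTS[:4], TEXT))
print(count_placements(PARTS, TEXT))
```

Repeating each keyword more often in the input multiplies the number of candidate placements, which is the source of the polynomial blow-up described above.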
Figure 1.11: An example path tree traversed by the recursive backtracking algorithm. (Nodes are the subexpressions P1 to P5. Edge labels are input offsets: 3, 10, 17 between P1 and P2; 6, 13, 20 between P2 and P3; 25, 30, 35 between P3 and P4; 38, 41, 44 between P4 and P5.)
1.4 Summary of Contributions
This dissertation focuses on addressing the main challenges described in Section 1.3.
The problem statement of this dissertation is: Pattern matching algorithms employed
by network security applications typically demonstrate a time-space tradeoff: NFA-
based matching is slow but memory-efficient, while DFA-based matching is fast but
memory-intensive. Many practical implementations avoid the tradeoff by eliding the
use of NFAs and DFAs, but are vulnerable to algorithmic complexity attacks when
scanning malicious network traffic.
The thesis statement of this dissertation is:
Using ordered binary decision diagrams, it is possible to design pattern match-
ing algorithms that are up to three orders of magnitude faster than tradi-
tional NFA-based pattern matching algorithms, retain the memory-efficiency
of NFAs, and are immune to known algorithmic complexity attacks.
In this dissertation, we propose several new pattern matching algorithms for network
security applications. In particular, we make three contributions.
First, we investigate patterns that can be described by regular languages, i.e., regular
expressions. We propose NFA-OBDD, a time and space efficient data structure for reg-
ular expression matching. We evaluate the performance of NFA-OBDD using real-world
patterns and network traffic traces. Our experimental results show that NFA-OBDD
has three benefits: (1) It is three orders of magnitude faster than a traditional NFA
implementation, while retaining the space efficiency of NFAs. (2) NFA-OBDD is faster
than, or at least competitive with, PCRE, a widely used pattern matching tool implemented
using recursive backtracking. (3) The time efficiency of NFA-OBDD is comparable to
that of MDFA, a variant of the DFA-based pattern matching approach, while NFA-OBDD
consumes much less memory than MDFA.
Second, we investigate a more general case, regular expressions extended with sub-
match extraction, which is an important feature in real-world patterns used by network
security applications. We propose an extension of NFA-OBDD to model submatch ex-
traction. Our experimental results show that our submatch extraction approach (called
Submatch-OBDD) is one order of magnitude faster than PCRE and Google’s RE2 (a
pattern matching tool that supports regular expression matching and submatch extrac-
tion).
Third, we study an even more general case, patterns containing back references,
which describe non-regular languages. We propose NFA-Backref, an efficient pattern
matching approach for patterns with back references. Our experimental results show that
NFA-Backref resists known algorithmic complexity attacks, and outperforms PCRE by
at least three orders of magnitude for certain types of patterns. For benign patterns,
NFA-Backref is one order of magnitude slower than PCRE.
1.5 Contributors to the Dissertation
This section lists the co-authors of the papers from which the material in this
dissertation is drawn. The NFA-OBDD model in Chapter 3 is joint work with my advisor,
Professor Vinod Ganapathy, my colleague Rezwana Karim, and Randy Smith from the
University of Wisconsin-Madison. Vinod Ganapathy motivated the NFA-OBDD model
and directed the project. Rezwana Karim contributed by implementing the one-gram
NFA-OBDD construction and execution. Randy Smith contributed by working on the
regular expression parsing. The Submatch-OBDD presented in Chapter 4 is joint work
with Vinod Ganapathy and researchers from HP Laboratories: Pratyusa
Manadhata, William Horne, and Prasad Rao. Vinod Ganapathy contributed by validating
the correctness of the Submatch-OBDD model. Pratyusa Manadhata contributed
by discussing the correctness of the submatch extraction algorithm and partially
designing the experimental evaluation. William Horne contributed by proposing a tagging
approach for capturing groups. Prasad Rao contributed by writing tools to generate
synthetic traces. The NFA-Backref algorithm presented in Chapter 5 is joint work
with Vinod Ganapathy and Pratyusa Manadhata. Vinod Ganapathy contributed by
brainstorming and discussing ideas for pattern matching with back references. Pratyusa
Manadhata contributed by validating the correctness of the new back reference algo-
rithm.
1.6 Dissertation Organization
This dissertation is organized as follows. Chapter 2 describes the background of or-
dered binary decision diagrams (OBDDs). Chapter 3 presents NFA-OBDD, a time and
space efficient pattern matching approach for regular expressions. Chapter 4 presents
Submatch-OBDD, an extension of NFA-OBDD to model submatch extraction. Chap-
ter 5 presents an efficient algorithm for patterns containing back references. Finally,
Chapter 6 concludes the thesis and discusses directions of future study.
Chapter 2
Background: Ordered Binary Decision Diagrams
An ordered binary decision diagram (OBDD) is a data structure that can represent
arbitrary Boolean formulae. OBDDs transform Boolean function manipulation into
efficient graph transformations, and have found wide use in a number of application
domains. For example, OBDDs are used extensively by model checkers to improve
the efficiency of state-space exploration algorithms [21]. OBDDs and their variants
have also been used in the analysis and design of intrusion detection systems and
firewalls [23, 37, 104, 103, 34, 33].
2.1 Definition of OBDD
Formally, an OBDD represents a Boolean function f(x1, x2, . . . , xn) as a rooted, di-
rected acyclic graph (DAG) that has two kinds of nodes: non-terminals and up to two
terminals, which are labeled 0 and 1. Terminal nodes do not have outgoing edges. Each
non-terminal node v is associated with a label var(v) ∈ {x1, x2, . . ., xn}, and has two
successors low(v) and high(v). The edges to these successors are labeled 0 and 1,
respectively. An OBDD is ordered in the sense that node labels are associated with a
total order <. Node labels along all paths in the OBDD from the root to the terminal
nodes follow this total order. An OBDD must also satisfy two additional properties:
• there are no two non-terminal nodes u and v such that var(u) = var(v), low(u)
= low(v), and high(u) = high(v); and
• there is no non-terminal u with low(u) = high(u).
In his seminal article, Bryant [20] introduced algorithms to construct OBDDs for
Boolean formulae and showed that for a given total order of the variables of a Boolean
formula, the OBDD representation of that formula is canonical, i.e., for a given ordering,
two OBDDs for a Boolean formula are isomorphic.

x  i  y  f(x, i, y)
0  0  0  1
0  0  1  0
0  1  0  1
0  1  1  1
1  0  0  1
1  0  1  0
1  1  0  0
1  1  1  1
(a) A Boolean function f(x, i, y).
(b) OBDD(f) with x < i < y.
(c) OBDD(f) with i < x < y.
(d) OBDD(I(i)), the identity function.
Figure 2.1: An example of a Boolean formula and OBDDs with different variable orderings. Solid edges are labeled 1, dotted edges are labeled 0.

Figure 2.1(b) depicts an example of
an OBDD for the Boolean formula f(x, i, y) shown in Figure 2.1(a). In this figure, the
variable ordering is x < i < y. To evaluate the Boolean formula for a given variable
assignment, say {x ← 1, i ← 0, y ← 1}, it suffices to traverse the appropriately
labeled edges from the root to the terminal nodes; in this case f(1, 0, 1) evaluates to 0.
Figure 2.1(c) depicts the OBDD for f with the variable ordering i < x < y. Although not
evident from this example, the size of OBDDs is sensitive to the total order imposed
on the Boolean variables; it is NP-hard to choose a total order that yields the most
compact OBDD for a Boolean function [20].
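The definition and the evaluation procedure can be illustrated with a minimal Python sketch that builds a reduced OBDD for the function of Figure 2.1(a) by Shannon expansion. The class TinyOBDD and its methods are hypothetical names of ours, and the construction enumerates the whole truth table, so it is only suitable for small examples. It is not Bryant's algorithm. The two reduction rules of Section 2.1 are enforced in mk.

```python
TABLE = {
    (0, 0, 0): 1, (0, 0, 1): 0, (0, 1, 0): 1, (0, 1, 1): 1,
    (1, 0, 0): 1, (1, 0, 1): 0, (1, 1, 0): 0, (1, 1, 1): 1,
}


def f(x, i, y):
    return TABLE[(x, i, y)]


class TinyOBDD:
    """Reduced OBDD; node ids 0 and 1 are the terminals."""

    def __init__(self, order):
        self.order = list(order)
        self.nodes = {}    # id -> (level, low, high)
        self.unique = {}   # (level, low, high) -> id

    def mk(self, level, low, high):
        if low == high:                  # no node with low(u) = high(u)
            return low
        key = (level, low, high)
        if key not in self.unique:       # no two nodes with equal (var, low, high)
            node_id = len(self.nodes) + 2
            self.nodes[node_id] = key
            self.unique[key] = node_id
        return self.unique[key]

    def build(self, func, level=0, env=()):
        if level == len(self.order):
            return int(bool(func(**dict(zip(self.order, env)))))
        low = self.build(func, level + 1, env + (0,))
        high = self.build(func, level + 1, env + (1,))
        return self.mk(level, low, high)

    def evaluate(self, node, **assignment):
        while node > 1:                  # follow edges until a terminal
            level, low, high = self.nodes[node]
            node = high if assignment[self.order[level]] else low
        return node


bdd_xiy = TinyOBDD(["x", "i", "y"])
root_xiy = bdd_xiy.build(f)
bdd_ixy = TinyOBDD(["i", "x", "y"])
root_ixy = bdd_ixy.build(f)
print(bdd_xiy.evaluate(root_xiy, x=1, i=0, y=1))
print(len(bdd_xiy.nodes), len(bdd_ixy.nodes))
```

In this sketch the ordering x < i < y yields five non-terminal nodes and i < x < y yields four, matching the node counts visible in Figures 2.1(b) and 2.1(c).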
An OBDD representation of a Boolean formula offers several advantages. First,
OBDDs are often more compact than other representations of Boolean formulae, such
as decision trees, conjunctive normal form (CNF) and disjunctive normal form (DNF).
Intuitively, this is because an OBDD captures and eliminates redundant nodes in the
decision tree representation of a Boolean formula. Second, OBDDs allow properties
of Boolean functions to be checked efficiently. For example, to determine whether
a Boolean function is satisfiable (or unsatisfiable), it suffices to check whether the
terminal node labeled 1 (respectively, 0) is reachable from the root node. Because
OBDD construction and manipulation algorithms eliminate nodes that are unreachable
from the root, checking (un)satisfiability is a constant-time operation.
(a) Apply(∧, OBDD(f), OBDD(I(i))).
(b) Restrict(OBDD(f), i ← 1).
Figure 2.2: Result of the Apply and Restrict operations on the OBDD in Figure 2.1(b).
2.2 Operations in OBDDs
OBDDs allow Boolean functions to be manipulated efficiently. Bryant [20] describes two
operations, Apply and Restrict, which allow OBDDs to be combined and modified
with a number of Boolean operators. These two operations are implemented as a
series of graph transformations and reductions to the input OBDDs, and have efficient
implementations; their time complexity is polynomial in the size of the input OBDDs.
We describe Apply and Restrict informally below, and refer the reader to Bryant’s
article [20] for details of these algorithms.
Apply allows binary Boolean operators, such as ∧ and ∨, to be applied to a pair of
OBDDs. The two input OBDDs, OBDD(f) and OBDD(g), must have the same variable
ordering. Apply(<op>, OBDD(f), OBDD(g)) computes OBDD(f <op> g), which
has the same variable ordering as the input OBDDs. Figure 2.2(a) presents the OBDD
obtained by combining the OBDD in Figure 2.1(b) with OBDD(I(i)) (Figure 2.1(d)),
where I is the identity function. Intuitively, Apply is implemented as a simple recursive
algorithm that processes the DAG representing the OBDD in layers, with each recursive
step processing a subgraph of the previous step, and finally reducing the resulting DAG
so that it satisfies the properties of an OBDD (e.g., deleting unreachable nodes, and
suitably merging nodes).
The Restrict operation is unary, and produces as output an OBDD in which
the values of some of the variables of the input OBDD have been fixed to a certain
value. That is, Restrict(OBDD(f), x←k) = OBDD(f |(x←k)), where f |(x←k) denotes
that x is assigned the value k in f . In this case, the output OBDD does not have
any nodes with the label x. Figure 2.2(b) shows the OBDD obtained as the output
of Restrict(OBDD(f), i ← 1), where OBDD(f) is the OBDD of Figure 2.1(b). In-
tuitively, the Restrict operation is implemented by eliminating the nodes labeled i,
suitably redirecting edges from i’s predecessors to point to i’s successors and removing
unreachable nodes.
Finally, Apply and Restrict can be used to implement existential quantification,
which is used in a key way in the operation of NFA-OBDDs, as described in Section 3.2.
In particular, ∃xi.f(x1, . . . , xn) = f(x1,. . . , xn)|(xi ← 0) ∨ f(x1,. . . , xn)|(xi ← 1). There-
fore, we have: OBDD(∃ xi.f(x1, . . . , xn)) = Apply(∨, Restrict(OBDD(f), xi ← 1),
Restrict(OBDD(f), xi ← 0)). Note that OBDD(∃ xi.f(x1, . . . , xn)) will not have a
node labeled xi.
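The three operations can be sketched in Python as follows. This is a simplified illustration under our own assumptions, not Bryant's implementation or the code used in this dissertation: the class MiniBDD and its method names are hypothetical, Apply memoizes only within a single call, and the input function is built by enumerating its truth table.

```python
TABLE = {
    (0, 0, 0): 1, (0, 0, 1): 0, (0, 1, 0): 1, (0, 1, 1): 1,
    (1, 0, 0): 1, (1, 0, 1): 0, (1, 1, 0): 0, (1, 1, 1): 1,
}


class MiniBDD:
    """Reduced OBDD manager; node ids 0 and 1 are the terminals."""

    def __init__(self, order):
        self.order = list(order)
        self.nodes, self.unique = {}, {}

    def mk(self, level, low, high):
        if low == high:
            return low
        key = (level, low, high)
        if key not in self.unique:
            self.unique[key] = len(self.nodes) + 2
            self.nodes[self.unique[key]] = key
        return self.unique[key]

    def level(self, u):
        return self.nodes[u][0] if u > 1 else len(self.order)

    def build(self, func, level=0, env=()):
        if level == len(self.order):
            return int(bool(func(**dict(zip(self.order, env)))))
        return self.mk(level, self.build(func, level + 1, env + (0,)),
                       self.build(func, level + 1, env + (1,)))

    def cofactors(self, u, level):
        if self.level(u) == level:
            return self.nodes[u][1], self.nodes[u][2]
        return u, u                      # u does not test this variable

    def apply(self, op, u, v, memo=None):
        memo = {} if memo is None else memo
        if u <= 1 and v <= 1:
            return int(bool(op(u, v)))
        if (u, v) not in memo:
            top = min(self.level(u), self.level(v))
            u0, u1 = self.cofactors(u, top)
            v0, v1 = self.cofactors(v, top)
            memo[(u, v)] = self.mk(top, self.apply(op, u0, v0, memo),
                                   self.apply(op, u1, v1, memo))
        return memo[(u, v)]

    def restrict(self, u, name, value):
        target = self.order.index(name)
        if self.level(u) > target:       # below the variable, or a terminal
            return u
        level, low, high = self.nodes[u]
        if level == target:
            return high if value else low
        return self.mk(level, self.restrict(low, name, value),
                       self.restrict(high, name, value))

    def exists(self, u, name):
        return self.apply(lambda a, b: a or b,
                          self.restrict(u, name, 0), self.restrict(u, name, 1))

    def evaluate(self, u, **assignment):
        while u > 1:
            level, low, high = self.nodes[u]
            u = high if assignment[self.order[level]] else low
        return u


bdd = MiniBDD(["x", "i", "y"])
f = bdd.build(lambda x, i, y: TABLE[(x, i, y)])
ident_i = bdd.build(lambda x, i, y: i)
conj = bdd.apply(lambda a, b: a and b, f, ident_i)   # Apply(∧, f, I(i))
restricted = bdd.restrict(f, "i", 1)                 # Restrict(f, i ← 1)
quantified = bdd.exists(f, "i")                      # ∃i. f
```

For the function of Figure 2.1(a), the quantified result collapses to the terminal 1, since for every choice of x and y some value of i makes f true.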
2.3 Representing Relations and Sets
OBDDs can be used to represent relations of arbitrary arity. If R is an n-ary rela-
tion over the domain {0, 1}, then we define its characteristic function fR as follows:
fR(x1, . . . , xn) = 1 if and only if R(x1, . . . , xn). For example, the characteristic
function of the 3-ary relation R = {(1, 0, 1), (1, 1, 0)} is $f_R(x_1, x_2, x_3) = (x_1 \wedge \bar{x}_2 \wedge x_3) \vee (x_1 \wedge x_2 \wedge \bar{x}_3)$.
$f_R$ is a Boolean function and can therefore be expressed using an OBDD.
An n-ary relation Q over an arbitrary domain D can be similarly expressed using
OBDDs by bit-blasting each of its elements. That is, if the domain D has m elements,
we map each of its elements uniquely to bit-strings containing $\lceil \lg m \rceil$ bits (call this
mapping φ). We then define a new relation R(φ(x1), . . . , φ(xn)) = Q(x1, . . . , xn). R
is an $n \times \lceil \lg m \rceil$-ary relation over {0, 1}, and can be converted into an OBDD using its
characteristic function.
A set of elements over an arbitrary domain D can also be expressed as an OBDD
because sets are unary relations, i.e., if S is a set of elements over a domain D, then
we can define a relation RS such that RS(s) = 1 if and only if s ∈ S. Operations
on sets can then be expressed as Boolean operations and performed on the OBDDs
representing these sets. For example, S ⊆ T can be implemented as OBDD(S) → OBDD(T)
(logical implication), while isEmpty(S ∩ T) is equivalent to checking whether
OBDD(S) ∧ OBDD(T) is unsatisfiable. The conversion of relations and sets into OBDDs is
used in a key way in the construction and operation of NFA-OBDDs, which we describe
next.
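The encoding of sets as Boolean functions can be sketched in Python without any OBDD machinery. In this illustration, characteristic functions are plain Python predicates over bit tuples, and the checks enumerate all assignments, whereas an OBDD package would answer them symbolically. All names (bit_width, make_encoder, characteristic, is_subset, is_disjoint) and the sample domain are hypothetical.

```python
from itertools import product


def bit_width(m):
    """Bits needed to give m elements distinct codes, i.e. ceil(lg m)."""
    return (m - 1).bit_length()


def make_encoder(domain):
    """The mapping phi: each element of the domain to a unique bit tuple."""
    width = bit_width(len(domain))
    codes = {element: tuple((index >> b) & 1 for b in reversed(range(width)))
             for index, element in enumerate(domain)}
    return codes, width


def characteristic(subset, codes):
    """Characteristic function of a set, as a predicate over bit tuples."""
    members = {codes[element] for element in subset}
    return lambda bits: bits in members


def is_subset(f_s, f_t, width):
    # S ⊆ T holds when f_S → f_T is true under every assignment.
    return all((not f_s(bits)) or f_t(bits)
               for bits in product((0, 1), repeat=width))


def is_disjoint(f_s, f_t, width):
    # S ∩ T is empty when f_S ∧ f_T has no satisfying assignment.
    return not any(f_s(bits) and f_t(bits)
                   for bits in product((0, 1), repeat=width))


CODES, WIDTH = make_encoder(["A", "B", "C", "D", "E"])
f_s = characteristic({"A", "B"}, CODES)
f_t = characteristic({"A", "B", "C"}, CODES)
f_u = characteristic({"D"}, CODES)
print(WIDTH, is_subset(f_s, f_t, WIDTH), is_disjoint(f_s, f_u, WIDTH))
```

With five elements the encoding uses three bits, so three of the eight bit patterns are unused and belong to no set.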
Chapter 3
Improving NFA-based Pattern Matching using OBDDs
3.1 Introduction
Deep packet inspection allows network intrusion detection systems (NIDS) to accu-
rately identify malicious traffic by matching the contents of network packets against
attack signatures. In the past, attack signatures were keywords that could efficiently
be matched using string matching algorithms [7, 16, 44, 96, 55, 9, 10, 86, 90, 30]. How-
ever, the increasing complexity of network attacks has led the research community to
investigate richer signature representations (e.g., [80, 101, 88]), many of which require
the full power of regular expressions. Because NIDS are often deployed over high-speed
network links, algorithms to match such rich signatures must also be efficient enough to
provide high-throughput intrusion detection on large volumes of network traffic. This
problem has spurred much recent research, and in particular has led to the investiga-
tion of new representations of regular expressions that allow for efficient inspection of
network traffic (e.g., [102, 46, 76, 30, 13, 11]).
As we described in Chapter 1, to be useful for deep packet inspection in a NIDS,
any representation of regular expressions must satisfy two key requirements: time-
efficiency and space-efficiency. Finite automata are a natural representation for regular
expressions, but offer a tradeoff between time- and space-efficiency. This time/space
tradeoff has motivated much recent research, primarily with a focus on improving the
space-efficiency of DFAs. These include heuristics to compress DFA transition tables
(e.g., [46, 13]), techniques to combine regular expressions into multiple DFAs [102], and
variable extended finite automata (XFAs) [76], which offer compact DFA representa-
tions and guarantee an additive increase in states when signatures are combined, pro-
vided that the regular expressions satisfy certain conditions. These techniques trade
time for space, and though the resulting representations fit in main memory, their
matching algorithms are slower than those for traditional DFAs.
In this chapter, we take an alternative approach and instead focus on improving
the time-efficiency of NFAs. NFAs are not currently in common use for deep packet
inspection, and understandably so—their performance can be several orders of magni-
tude slower than DFAs. Nevertheless, NFAs offer a number of advantages over DFAs,
and we believe that further research on improving their time-efficiency can make them
a viable alternative to DFAs. Our position is supported in part by these observations:
• NFAs are more compact than DFAs. Determinizing an NFA involves a subset
construction algorithm, which can result in a DFA with exponentially more states
than an equivalent NFA [38].
• NFA combination is space-efficient. As pointed out in Chapter 1, combining
two NFAs only results in an additive increase in number of states. This feature
of NFAs is particularly important, given that the diversity of network attacks has
pushed NIDS vendors to deploy an ever increasing number of signatures.
• NFAs can readily be parallelized. An NFA may contain multiple outgoing
transitions for a single input symbol from each state, all of which must be followed
when that input symbol is encountered. An NFA simulator can easily parallelize
these operations as shown in prior work [71, 25].
Motivated by these advantages, we develop a new approach to improve the time-
efficiency of NFAs. Our core insight is that a technique to efficiently apply an NFA’s
transition relation to a set of states can greatly improve the time-efficiency of NFAs.
Such a technique would apply the transition relation to all states in the frontier in a
single operation to produce a new frontier. We develop an approach that uses ordered
binary decision diagrams [20] (OBDDs) to implement such a technique. Our use of
OBDDs to process NFA frontiers is inspired by symbolic model checking, where the use
of OBDDs allows the verification of systems that contain an astronomical number of
states [21].
To evaluate the feasibility of our approach, we constructed NFAs in software using
HTTP and FTP signatures from Snort. We operated these NFAs using OBDDs and
evaluated their time-efficiency and space-efficiency using traces of real HTTP and FTP
traffic. Our experiments showed that NFAs that use OBDDs (NFA-OBDDs) outperform
traditional NFAs by approximately three orders of magnitude. Our experiments also
showed that NFA-OBDDs retain the space-efficiency of NFAs. In contrast, our machine
ran out of memory when trying to construct DFAs (or their variants) from our signature
sets.
In addition to improving the time-efficiency of NFAs, our approach has a number of
advantages. First, construction of NFA-OBDDs from regular expressions is fully auto-
mated and does not change signature semantics. In contrast, prior work on improved
signature representations has required manual analysis of regular expressions (e.g., to
identify and eliminate ambiguity [77]) or requires the semantics of signatures to be
modified (e.g., [102]). Second, it uniformly handles all regular expressions. Prior tech-
niques, especially those that convert regular expressions into DFAs (or variants), often
require manual intervention when regular expressions have certain kinds of constructs
(e.g., counters; see [76, 102]). Last, NFA-OBDDs may be amenable to a hardware im-
plementation. Both NFAs (e.g., [71, 25, 32]) and OBDDs [74, 104] have individually
been implemented in hardware. It may be possible to combine ideas from prior work
to construct NFA-OBDDs in hardware.
Our main contributions in this chapter are as follows:
• Design of NFA-OBDDs. We develop a novel technique that uses OBDDs to
improve the time-efficiency of NFAs (Section 3.2). We also describe how NFA-
OBDDs can be used to improve the time and space-efficiency of NFA-based multi-
byte matching (Section 3.5).
• Comprehensive evaluation using Snort signatures. We evaluated NFA-
OBDDs using Snort’s HTTP and FTP signature sets and observed a speedup
of about three orders of magnitude over traditional NFAs. We also compared
the performance of NFA-OBDDs against a variety of automata implementations,
including the PCRE package and a variant of DFAs (Section 3.4).
The main benefit of NFA-OBDDs is in improving the performance (i.e., time and
space-efficiency) of deep packet inspection by NIDS, independent of its effectiveness
at detecting attacks. We acknowledge that matching network traffic against regular
expressions is no longer sufficient to detect a large fraction of attacks, and that addi-
tional security mechanisms and advanced forms of signatures (e.g., vulnerability signa-
tures [88, 19]) are necessary. Nevertheless, real deployments use layered defenses, and
NIDS will remain a cornerstone of network security for the foreseeable future. Advanced
signature matching techniques also employ regular expression matching (e.g., see [69])
and we expect that NFA-OBDDs will benefit them as well.
The rest of this chapter is organized as follows: Section 3.2 describes the construction
and operation of NFA-OBDDs; Section 3.3 describes the experimental setup and data
sets used in our evaluation, while Section 3.4 compares the performance of NFA-OBDDs
against other techniques to match regular expressions. Section 3.5 extends NFA-OBDDs
to multi-stride automata and presents experimental evaluation of multi-stride NFA-
OBDDs. We discuss related work in Section 3.6 and conclude in Section 3.7.
3.2 Representing and Operating NFAs and NFA-OBDDs
As in Chapter 1, we represent an NFA using a 5-tuple: (Q, Σ, ∆, q0, Fin),
where Q is a finite set of states, Σ is a finite set of input symbols (the alphabet),
$\Delta: Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$ is a transition function, q0 ∈ Q is a start state, and Fin ⊆ Q is a
set of accepting (or final) states. The transition function ∆(s, i) = T describes the set
of all states t ∈ T such that there is a transition labeled i from s to t. Note that ∆ can
also be expressed as a relation δ ⊆ Q × Σ × Q, so that (s, i, t) ∈ δ for all t ∈ T such that
∆(s, i) = T.
An NFA may have multiple outgoing transitions with the same input symbol from
each state. Hence, it maintains a frontier F of states that it can be in at each step
during execution. The frontier is initially the singleton set {q0} but may include any
subset of Q during the operation of the NFA. For each symbol in the input string, the
NFA must process all of the states in F and find a new set of states by applying the
transition relation.
While non-determinism leads to frontiers of size O(|Q|) in NFAs, it also makes them
space-efficient in two ways. First, NFAs for certain regular expressions are exponentially
smaller than the corresponding DFAs. For example, an NFA for $(0|1)^*1(0|1)^n$ has
$O(n)$ states, while the corresponding DFA has $O(2^n)$ states [38]. Second, and perhaps
more significantly from the perspective of NIDS, NFAs can be combined space-efficiently
while DFAs often cannot. To combine a pair of NFAs, NFA1 and NFA2, it suffices to
create a new state qnew, add ε transitions from qnew to the start states of NFA1 and
NFA2, and designate qnew to be the start state of the combined NFA. This leads to an
NFA with O(|Q1| + |Q2|) states. In contrast, combining two DFAs, DFA1 and DFA2,
can sometimes result in a multiplicative increase in the number of states because the
combined DFA must have a state corresponding to s × t for each pair of states s and
t in DFA1 and DFA2, respectively. The number of states in the DFA can possibly be
reduced using minimization, but this does not always help.
3.2.1 NFA Operation using Boolean Function Manipulation
We now describe how the process of applying an NFA’s transition relation to a frontier of
states can be expressed as a sequence of Boolean function manipulations. NFA-OBDDs
implement Boolean functions and operate on them using OBDDs. For the discussion
below and in the rest of this chapter, we assume NFAs in which ε transitions have been
eliminated (using standard techniques [38]). This is mainly for ease of exposition; NFAs
with ε transitions can also be expressed using NFA-OBDDs. Note that ε elimination
may increase the total number of transitions in the NFA, but does not increase the
number of states.
We define four Boolean functions for an NFA (Q, Σ, δ, q0, Fin). These functions use
three vectors of Boolean variables: $\vec{x}$, $\vec{y}$, and $\vec{i}$. The vectors $\vec{x}$ and $\vec{y}$ are used to denote
states in Q, and therefore contain $\lceil \lg |Q| \rceil$ variables each. The vector $\vec{i}$ denotes symbols
in Σ, and contains $\lceil \lg |\Sigma| \rceil$ variables. As an example, for the NFA in Figure 3.1, these
vectors contain one Boolean variable each; we denote them as x, y, and i.
Figure 3.1: NFA for $(0|1)^*1$.
• $\mathcal{T}(\vec{x}, \vec{i}, \vec{y})$ denotes the NFA’s transition relation δ. Recall that δ is a set of triples
(s, i, t), such that there is a transition labeled i from state s to state t. It can
therefore be represented as a Boolean function as described in Section 2.3. For
example, consider the NFA in Figure 3.1. Using 0 to denote state A and 1 to
denote state B, $\mathcal{T}(x, i, y)$ is the function shown in Figure 2.1(a).
• $\mathcal{I}_\sigma(\vec{i})$ is defined for each σ ∈ Σ, and denotes a Boolean representation of that
symbol. For the NFA in Figure 3.1, $\mathcal{I}_0(i) = \bar{i}$ (i.e., i = 0) and $\mathcal{I}_1(i) = i$.
• $\mathcal{F}(\vec{x})$ denotes the current set of frontier states of the NFA. It is thus a Boolean
representation of the set F at any instant during the operation of the NFA. For
our running example, if F = {A}, $\mathcal{F}(x) = \bar{x}$, while if F = {A, B}, then $\mathcal{F}(x) = x \vee \bar{x}$.
• $\mathcal{A}(\vec{x})$ is a Boolean representation of Fin, and denotes the accepting states. In
Figure 3.1, $\mathcal{A}(x) = x$.
Note that T (~x, ~i, ~y), Iσ(~i) and A(~x) can be computed automatically from any
representation of NFAs. The initial frontier F = {q0} can also be represented as a
Boolean formula.
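The four functions can be made concrete with a small sketch for the running example. It assumes the usual two-state NFA for (0|1)∗1 (self-loops on A for 0 and 1, and a 1-labeled transition from A to B), with A encoded as 0 and B as 1; the names DELTA, characteristic and truth_table are illustrative and do not come from the implementation described in this chapter.

```python
from itertools import product

# Assumed transitions of the (0|1)*1 NFA: (source, symbol, target) triples,
# with state A encoded as 0 and state B encoded as 1.
DELTA = {(0, 0, 0), (0, 1, 0), (0, 1, 1)}
ACCEPTING = {1}  # Fin = {B}


def T(x, i, y):
    """Transition relation: 1 iff (x, i, y) is a transition in DELTA."""
    return int((x, i, y) in DELTA)


def I(sigma):
    """Return I_sigma, which is 1 exactly when i encodes the symbol sigma."""
    return lambda i: int(i == sigma)


def characteristic(states):
    """Boolean function that is 1 exactly on the encodings in `states`."""
    return lambda x: int(x in states)


A = characteristic(ACCEPTING)   # accepting states
F_init = characteristic({0})    # initial frontier {q0} = {A}


def truth_table(f, nvars):
    """List the assignments (tuples of bits) on which f evaluates to 1."""
    return [bits for bits in product((0, 1), repeat=nvars) if f(*bits)]


if __name__ == "__main__":
    print(truth_table(T, 3))
    print(truth_table(I(0), 1), truth_table(A, 1))
```

Here each Boolean function is a plain Python predicate; an NFA-OBDD stores the same functions as OBDDs instead.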
Suppose that the frontier at some instant during the operation of the NFA is F(~x),
and that the next symbol in the input is σ. The following Boolean formula, G(~y),
symbolically denotes the new frontier of states in the NFA after σ has been processed.
G(~y) = ∃ ~x.∃ ~i.[T (~x,~i, ~y) ∧ Iσ(~i) ∧ F(~x)]
To see why G(~y) is the new frontier, consider the truth table of the Boolean function
T (~x, ~i, ~y). By construction, this function evaluates to 1 only for those values of ~x,
~i, and ~y for which (~x, ~i, ~y) is a transition in the automaton. Similarly, the function
F(~x) evaluates to 1 only for the values of ~x that denote states in the current frontier of
the NFA. Thus, the conjunction of T (~x, ~i, ~y) with F(~x) and Iσ(~i) only “selects” those
rows in the truth table of T (~x, ~i, ~y) that correspond to the outgoing transitions from
states in the frontier labeled with the symbol σ. However, the resulting conjunction is a
Boolean formula in ~x,~i and ~y. To find the new frontier of states, we are only interested
in the values of ~y (i.e., the target states of the transitions) for which the conjunction
has a satisfying assignment. We achieve this by existentially quantifying ~x and ~i to
obtain G(~y). To express the new frontier in terms of the Boolean variables in ~x, we
rename the variables in ~y with their counterparts in ~x.
We illustrate this idea using the example in Figure 3.1. Suppose that the current
frontier of the NFA is F = {A, B}, and that the next input symbol is a 0, which
causes the new frontier to become {A}. In this case, T (x, i, y) is the function shown
in Figure 2.1(a), I0(i) = ¬i and F(x) = x ∨ ¬x. We have T (x, i, y) ∧ I0(i) ∧ F(x) =
(¬x ∧ ¬i ∧ ¬y). Existentially quantifying x and i from the result of this conjunction, we get
G(y) = ¬y. Renaming the variable y to x, we get F(x) = ¬x, which is a Boolean formula
that denotes {A}, the new frontier.
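The frontier computation can be checked by brute force on small examples. The sketch below enumerates all assignments instead of using OBDDs, so it only illustrates the semantics of G(~y); next_frontier and the tuple-based encoding are assumptions made for this illustration, and the transition set is the assumed two-state NFA for (0|1)∗1.

```python
from itertools import product


def next_frontier(T, I_sigma, F, n_state_bits, n_symbol_bits):
    """Return the satisfying assignments of
    G(y) = exists x, i . T(x, i, y) and I_sigma(i) and F(x),
    as a set of bit tuples (the renaming of y to x is implicit)."""
    new_frontier = set()
    for x in product((0, 1), repeat=n_state_bits):
        if not F(x):
            continue
        for i in product((0, 1), repeat=n_symbol_bits):
            if not I_sigma(i):
                continue
            for y in product((0, 1), repeat=n_state_bits):
                if T(x, i, y):
                    new_frontier.add(y)  # keep y, "forgetting" x and i
    return new_frontier


# Running example: A -> (0,), B -> (1,); symbols 0 and 1 encode as themselves.
DELTA = {((0,), (0,), (0,)), ((0,), (1,), (0,)), ((0,), (1,), (1,))}
T = lambda x, i, y: (x, i, y) in DELTA
I0 = lambda i: i == (0,)
F_AB = lambda x: True  # F(x) = x or not x, i.e. the frontier {A, B}

G = next_frontier(T, I0, F_AB, 1, 1)
print(G)  # {(0,)}, i.e. the frontier {A}
```

The triple loop visits every row of the truth table of T, which is exponential in the number of variables; the point of NFA-OBDDs is to perform the same computation on a compact OBDD representation.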
To determine whether the NFA accepts an input string, it suffices to check that
F ∩ Fin ≠ ∅. Using the Boolean notation, this translates to checking whether F(~x)
∧ A(~x) has a satisfying assignment. In the example above with F = {A}, F(x) = ¬x
and A(x) = x, so F(x) ∧ A(x) is unsatisfiable and the NFA is not in an accepting configuration. Recall that checking
satisfiability of a Boolean function is an O(1) operation if the function is represented
as an OBDD.
3.2.2 NFA-OBDDs
The main idea behind NFA-OBDDs is to represent and manipulate the Boolean func-
tions discussed above using OBDDs. Formally, an NFA-OBDD for an NFA (Q, Σ, δ, q0,
Fin) is a 7-tuple (~x, ~i, ~y, OBDD(T ), {OBDD(Iσ) | σ ∈ Σ}, OBDD(Fq0), OBDD(A)),
where ~x, ~i, ~y are vectors of Boolean variables, and T , Iσ, and A are the Boolean for-
mulae discussed in Section 3.2.1. Fq0 denotes the Boolean function that denotes the
frontier {q0}. For each input symbol σ, the NFA-OBDD obtains a new frontier as
discussed earlier. The main difference is that the Boolean operations are performed as
operations on OBDDs.
The use of OBDDs allows NFA-OBDDs to be more time-efficient than NFAs. In
an NFA, the transition table must be consulted for each state in the frontier, lead-
ing to O(|δ| × |F |) operations per input symbol. In contrast, the complexity of
OBDD operations to obtain a new frontier is approximately O(sizeof(OBDD(T )) ×
sizeof(OBDD(F))). This is because the complexity of obtaining a new frontier is
dominated by the cost of an Apply operation on OBDD(T ) and OBDD(F), which
costs O(sizeof(OBDD(T )) × sizeof(OBDD(F))) [20]. Since OBDDs are a compact
representation of the frontier F and the transition relation δ, NFA-OBDDs are more
time-efficient than NFAs. The improved performance of NFA-OBDDs is particularly
pronounced when the transition table of the NFA is sparse or the NFA has large fron-
tiers, because OBDDs can effectively remove redundancy in the representations of δ
and F .
The reason that NFA-OBDDs retain the space-efficiency of NFAs is that NFA-
OBDDs can be combined using the same algorithms that are used to combine NFAs.
Although the use of OBDDs may lead NFA-OBDDs to consume more memory than
NFAs, our experiments show that the increase is marginal. In particular, the cost is
dominated by OBDD(T ), which has a total of 2 × ⌈lg |Q|⌉ + ⌈lg |Σ|⌉ Boolean variables.
Even in the worst case, this OBDD consumes only O(|Q|² × |Σ|) space, which is comparable
to the worst-case memory consumption of the transition table of a traditional
NFA. However, in practice, the memory consumption of NFA-OBDDs is much smaller
than this asymptotic limit.
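As a quick arithmetic check of the variable count, the following sketch evaluates 2 × ⌈lg |Q|⌉ + ⌈lg |Σ|⌉ for a given NFA size. The function name is illustrative; the state counts used in the example calls are the ones reported later in this chapter for the Snort signature sets, with a 256-symbol alphabet.

```python
def obdd_t_variables(num_states, alphabet_size):
    """Number of Boolean variables in OBDD(T): two state vectors plus one
    symbol vector. ceil(lg n) is computed exactly as (n - 1).bit_length()."""
    state_bits = (num_states - 1).bit_length()
    symbol_bits = (alphabet_size - 1).bit_length()
    return 2 * state_bits + symbol_bits


if __name__ == "__main__":
    print(obdd_t_variables(2, 2))          # running example: x, i, y
    print(obdd_t_variables(159_734, 256))  # NFA with 159,734 states
    print(obdd_t_variables(26_536, 256))   # NFA with 26,536 states
```

Using integer bit lengths avoids floating-point rounding in the logarithm for large state counts.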
3.3 Experimental Apparatus and Data Sets
We evaluated the feasibility of our approach using a software-based implementation of
NFA-OBDDs. As depicted in Figure 3.2, the experimental apparatus consists of two
offline components and an online component.
Figure 3.2: Components of our software-based implementation of NFA-OBDDs.
The offline components are executed once for each set of regular expressions, and
consist of re2nfa and nfa2obdd. The re2nfa component accepts a set of regular expres-
sions as input, and produces an ε-free NFA as output. To do so, it first constructs NFAs
for each of the regular expressions using Thompson’s construction [84, 38], combines
these NFAs into a single NFA, and eliminates ε transitions. The nfa2obdd component
analyzes this NFA to determine the number of Boolean variables needed (i.e., the sizes
of the ~x, ~i and ~y vectors), and constructs OBDD(T ), OBDD(A), OBDD(Iσ) for each
σ ∈ Σ, and OBDD(Fq0).
As discussed in Chapter 2, the size of an OBDD for a Boolean formula is sensitive
to the total order imposed on its variables. Variable ordering also impacts the structure
of OBDDs, and therefore the performance of NFA-OBDDs. We empirically determined
that an ordering of variables of the form ~i < ~x < ~y yields high-performance NFA-
OBDDs. Our implementation of nfa2obdd therefore uses this ordering for ~i, ~x and ~y.
Within each vector, nfa2obdd orders variables in increasing order from most significant
bit to least significant bit. Section 3.4.6 presents a detailed evaluation of the impact of
variable ordering on the performance of NFA-OBDDs.
The online component, exec_nfaobdd, begins execution by reading these OBDDs into
memory and then processes a stream of network packets. It matches the contents of these
network packets against the regular expressions using the NFA-OBDD. To manipulate
OBDDs and produce a new frontier for each input symbol processed, this component
interfaces with Cudd, a popular C++-based OBDD library [79]. It checks whether each
frontier F produced during the operation of the NFA-OBDD contains an accepting
state. If so, it emits a warning with the offset of the character in the input stream that
triggered a match, as well as the regular expression(s) that matched the input.1 Note
that in a NIDS setting, it is important to check whether the frontier F obtained after
processing each input symbol contains an accepting state (rather than after processing
the entire input string, which is the traditional operating model for finite automata).
This is because any byte in the network input may cause a transition in the NFA that
triggers a match with a regular expression. We call this the streaming model because
the NFA continuously processes input symbols from a network stream. This model is
equivalent to using regular expressions to find all matching substrings within a string,
as the characters in the string are presented to the matching algorithm one at a time.
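The streaming model can be sketched as a loop that checks for acceptance after every input symbol and reports the offset of each match. The sketch below represents the frontier as an integer bitmask rather than an OBDD, so that the test `frontier & accepting` mirrors the conjunction F(~x) ∧ A(~x); stream_matches and step are hypothetical names, and the automaton is the assumed two-state NFA for (0|1)∗1.

```python
def stream_matches(data, step, initial, accepting):
    """Yield the offset of every input symbol after which the frontier
    contains at least one accepting state."""
    frontier = initial
    for offset, symbol in enumerate(data):
        frontier = step(frontier, symbol)
        if frontier & accepting:  # non-empty intersection with Fin
            yield offset


# Bit 0 stands for state A, bit 1 for state B.
STATE_A, STATE_B = 0b01, 0b10


def step(frontier, symbol):
    """One transition of the (0|1)*1 NFA on a bitmask frontier."""
    new_frontier = 0
    if frontier & STATE_A:
        new_frontier |= STATE_A       # A loops on both 0 and 1
        if symbol == "1":
            new_frontier |= STATE_B   # A --1--> B
    return new_frontier               # B has no outgoing transitions


if __name__ == "__main__":
    print(list(stream_matches("0110", step, STATE_A, STATE_B)))  # [1, 2]
```

Each reported offset is the end position of a substring matching the pattern, which is what a per-symbol acceptance check provides in a NIDS setting.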
3.3.1 Signature Sets and Network Traffic
Signature Sets
We evaluated our implementation of NFA-OBDDs with three sets of regular expressions
(we have made these signatures available for download [98]). The first set was obtained
from the authors of the XFA paper [76], and contains 1503 regular expressions that
were synthesized from the March 2007 snapshot of the Snort HTTP signature set. The
second and third sets, numbering 2612 and 98 regular expressions, were obtained from
the October 2009 snapshot of the Snort HTTP and FTP signature sets, respectively.
About 50% of these regular expressions were taken from the uricontent fields of the
signatures, while the rest were extracted from the pcre fields. Although extracting just
pcre fields from individual Snort rules only captures a portion of the corresponding
rules, it suffices for our experiments as our primary goal is to evaluate the performance
of NFA-OBDDs against other regular-expression based techniques. All three sets of reg-
ular expressions include client-side and server-side signatures. For all sets, we excluded
Snort signatures that contained constructs such as back-references and subroutines
(which are allowed by the Perl-compatible regular expression (PCRE) package),
as these constructs are not regular and therefore cannot be implemented in NFA-based
models. In all, we excluded 1837 HTTP and 41 FTP signatures due to non-regular
constructs.

1 Multiple regular expressions may trigger a match on an input symbol; these regular expressions can be identified using the set of states that appear in the conjunction F(~x) ∧ A(~x).

           Number of HTTP commands                 Matches triggered: Total number (# Distinct sigs.)
Trace      GET         POST      HEAD      PUT       HTTP/1503           HTTP/2612
Rutgers    653,670     137,737   3,504     1,576     1,816,410 (47)      17,107,588 (120)
DARPA      1,333,469   36,386    450,480   126,824   37,952,078 (121)    190,662,579 (205)

Figure 3.3: Statistics characterizing various aspects of the HTTP traces used in our experiments. The “Matches triggered” columns show the total number of signature matches triggered by the traces as well as the number of distinct signatures that matched.
HTTP traffic
We evaluated the performance of HTTP signatures by feeding two sets of HTTP traffic
traces to exec_nfaobdd:
(1) Rutgers traces. We recorded HTTP traffic at the Web server of the Rutgers Com-
puter Science Department for a one week period in August 2009. This traffic was
collected using tcpdump, and includes whole packets of port 80 traffic from the Web
server. The traffic observed during this period consisted largely of Web traffic typi-
cally observed at an academic department’s main Web server; most of the traffic was
to view and query Web pages hosted by the department. Overall, this week-long trace
contained connections from 18,618 distinct source IP addresses. It contained a total
of 1.24GB worth of data, with the payloads in the network packets ranging in size
from 1 byte to 1,460 bytes, with an average of 126 bytes (standard deviation of 271).
Figure 3.3 presents statistics that characterize various other aspects of the trace. The
total number of matches triggered shown in Figure 3.3 is not indicative of the number
of alerts produced by Snort because our signature sets only contain patterns from the
pcre and uricontent fields of the Snort rules. The number of matches is large because
signatures contain patterns that are common in HTTP packets.
(2) DARPA traces. We used publicly available traces from the 1999 DARPA intrusion
detection evaluation data sets [48]. Privacy concerns preclude us from releasing the
network traces collected in our department. We therefore report experimental results
with the DARPA traces to ensure that our experiments can be repeated independently
by other researchers.

Command               CWD      LIST    MDTM    MKD     PASS     PORT     PWD
Number of Instances   62,561   3,098   613     89      14,701   232      453

Command               QUIT     RETR    SIZE    STOR    TYPE     USER     -
Number of Instances   12,244   7,676   1,110   1,401   12,201   14,834   -

Figure 3.4: Statistics showing the number of commands observed in the FTP traces used in our experiments.

We acknowledge that the DARPA traces are no longer in popular use for intrusion
detection research. Indeed, researchers have even argued that they are inadequate for
the purpose for which they were originally developed (to test the effectiveness of intrusion
detection systems at detecting attacks; e.g., see [81, 51]). Nevertheless, they suffice as an
independent data point for our experiments as our goal is to measure the performance
of regular expression matching, and not to test their effectiveness at detecting real
attacks.
We used traces from weeks two, four and five of the DARPA data set (only the
traffic from these weeks contains actual instances of attacks). These traces contain
about 11.7GB worth of data, and contain connections from 8,331 distinct source IP
addresses. The payloads in the network packets ranged in size from 2 bytes to 1,460
bytes, with an average size of 351 bytes (standard deviation of 576).
FTP traffic
We evaluated the FTP signatures using two traces of live FTP traffic (from the com-
mand channel), obtained over a two week period in March 2010 from our department’s
FTP server; these FTP traces contained 19.4MB and 24.7MB worth of data. The traffic
consisted of FTP requests to fetch and update technical reports hosted by our depart-
ment. We observed traffic from 528 distinct source IP addresses during this period.
Statistics on various FTP commands observed during this period appear in Figure 3.4
(commands that were not observed are not reported). This traffic triggered 9,656 and
15,976 matches in the FTP/98 signature set, corresponding to matches on 6 and 5 dis-
tinct signatures, respectively. The payload sizes of packets ranged from 2 to 402 bytes
with an average of 40 bytes (standard deviation of 44).
Since our primary goal is to study the performance of NFA-OBDDs, we assume that
the HTTP and FTP traces have been processed using standard NIDS operations, such
as defragmentation and normalization. We fed these traces, which were in tcpdump
format, to exec_nfaobdd.
3.3.2 Experimental Setup
All our experiments were performed on an Intel Core2 Duo E7500 machine running Linux 2.6.27
at 2.93GHz with 2GB of memory (our programs are single-threaded, however,
and used only one of the available cores). We used the Linux /proc file system to
measure the memory consumption of nfa2obdd and the Cudd_ReadMemoryInUse utility
to obtain the memory consumption of exec_nfaobdd. We instrumented both these
programs to report their execution time using processor performance counters. We
report the performance of exec_nfaobdd as the number of CPU cycles to process each
byte of network traffic (cycles/byte); fewer cycles/byte imply greater
time-efficiency. All our implementations were in C++; we used the GNU g++ compiler
suite (v4.3.2) with the O6 optimization level to produce the executables used for
experimentation.
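To relate the cycles/byte metric to a data rate, the following back-of-the-envelope sketch divides the clock rate by the per-byte cost. It assumes a constant 2.93GHz clock and a single core doing nothing but matching, so the resulting figures are rough estimates and not measurements from this work; the function names are illustrative.

```python
CLOCK_HZ = 2.93e9  # clock rate of the test machine described above


def cycles_per_byte(total_cycles, total_bytes):
    """Average number of CPU cycles spent per byte of input."""
    return total_cycles / total_bytes


def throughput_bytes_per_sec(cpb, clock_hz=CLOCK_HZ):
    """Approximate sustained throughput implied by a cycles/byte figure."""
    return clock_hz / cpb


if __name__ == "__main__":
    # Hypothetical value chosen for round arithmetic.
    print(throughput_bytes_per_sec(2930))
```

The conversion makes it easier to compare a software prototype's cost per byte against a target link speed.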
3.4 Experimental Evaluation
This section reports the performance of NFA-OBDDs, and compares them against the
performance of NFAs, the PCRE package, which is a popular library for regular ex-
pression matching, and variants of DFAs. Our experiments show that NFA-OBDDs:
1. outperform traditional NFAs by up to three orders of magnitude while retaining
their space-efficiency (Section 3.4.2);
2. outperform or are competitive in performance with the PCRE package (Sec-
tion 3.4.3);
3. are competitive in performance with variants of DFAs while being drastically less
memory-intensive (Section 3.4.4).

                                     Size of the input NFA       |OBDD(T )|   Construction
Signature Set         #Reg. Exps.    #States    #Transitions     #Nodes       Time/Memory
HTTP (March 2007)     1503           159,734    3,986,769        659,981      305sec/176MB
HTTP (October 2009)   2612           239,890    5,833,911        989,236      453sec/176MB
FTP (October 2009)    98             26,536     5,927,465        69,619       246sec/134MB

Figure 3.5: NFA-OBDD construction results.

We also present a detailed performance breakdown of NFA-OBDDs in terms of
OBDD operations (Section 3.4.5) and the impact of OBDD variable ordering on NFA-
OBDD performance (Section 3.4.6).
3.4.1 NFA-OBDDs: Construction and Performance
We used nfa2obdd to construct NFA-OBDDs from ε-free NFAs of the regular expression
sets. Figure 3.5 presents statistics on the sizes of the input NFAs, the size of the largest
of the four OBDDs in the NFA-OBDD (OBDD(T )), and the time taken and memory
consumed by nfa2obdd. For the NFA-OBDDs corresponding to the HTTP signature
sets, the vectors ~x and ~y had 18 Boolean variables each, while the vector ~i had 8
Boolean variables to denote the 256 possible ASCII characters. For the NFA-OBDD
corresponding to the FTP signature set, the vectors ~x and ~y had 15 Boolean variables
each. We also tried to determinize these NFAs to produce DFAs, but the determinizer
ran out of memory in all three cases.
Figure 3.6 depicts the performance of NFA-OBDDs. Figure 3.6(a) and Figure 3.6(b)
show the performance for the Rutgers and DARPA HTTP traces, while Figure 3.6(c)
shows the performance for both FTP traces. Figure 3.7 presents the raw throughput and
memory consumption of NFA-OBDDs observed for each signature set. The throughput
and memory consumption of NFA-OBDDs vary slightly across different traces for each
signature set. This difference was most pronounced for the HTTP/2612 signature set,
where the Rutgers trace was processed almost 1.8× faster than the DARPA trace. The
variance in performance can be attributed to the sizes and shapes of OBDD(F) (the
OBDD of the NFA’s frontier) observed during execution.

[Figure 3.6 consists of three scatter plots: (a) HTTP/1503 signature set, (b) HTTP/2612 signature set, (c) FTP/98 signature set. x-axis: Processing time (cycles/byte), log scale; y-axis: Memory usage (MB). Series: NFA, NFA-OBDD, MDFA (MDFA-2604-sigs in (b), MDFA-95-sigs in (c)), PCRE.]

Figure 3.6: Comparing memory versus processing time of NFA-OBDDs, traditional NFAs, the PCRE package, and different MDFAs for the Snort HTTP and FTP signature sets. The x-axis is in log-scale. Note that Figure 3.6(b) and (c) only report the performance of MDFAs with 2604 and 95 regular expressions, respectively.

3.4.2 Comparison with NFAs
We compared the performance of NFA-OBDDs with an implementation of NFAs that
uses Thompson’s algorithm. This algorithm maintains a frontier F , and operates as
follows: for each state s in the frontier F , fetch the set of targets Ts of the transitions
labeled σ (the input symbol), and compute the new frontier as F′ = ⋃_{s ∈ F} Ts. The
performance and memory consumption of our NFA implementation (as well as those of the PCRE
package and DFA variants in Section 3.4.3 and Section 3.4.4) were relatively stable across
all the traces for each signature set. Figure 3.6 therefore reports only the averages across
these traces.
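The frontier update described above can be sketched with explicit sets. This is a minimal illustration, not the NFA implementation that was measured; the dictionary layout of delta (mapping a (state, symbol) pair to its set of targets) is an assumption, and the example automaton is the assumed two-state NFA for (0|1)∗1.

```python
def thompson_step(frontier, symbol, delta):
    """Compute F' as the union, over each state s in the frontier,
    of the targets T_s of the transitions from s labeled `symbol`."""
    new_frontier = set()
    for s in frontier:
        new_frontier |= delta.get((s, symbol), set())
    return new_frontier


# Assumed transitions of the (0|1)*1 NFA.
DELTA = {
    ("A", "0"): {"A"},
    ("A", "1"): {"A", "B"},
}

if __name__ == "__main__":
    frontier = {"A"}
    for symbol in "011":
        frontier = thompson_step(frontier, symbol, DELTA)
    print(sorted(frontier))  # ['A', 'B']
```

The loop performs one table lookup per frontier state for every input symbol, which is why large frontiers make this approach slow.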

Signature Set    Processing time                        Memory
NFA-OBDDs
  HTTP/1503      6,844–7,582 cycles/byte                58MB
  HTTP/2612      22,968–41,588 cycles/byte              61MB
  FTP/98         5,095 cycles/byte                      8MB
NFAs
  HTTP/1503      1.3 × 10^7 cycles/byte                 53MB
  HTTP/2612      2.1 × 10^7 cycles/byte                 73MB
  FTP/98         5.6 × 10^5 cycles/byte                 29MB
PCRE
  HTTP/1503      2.1 × 10^5–6.2 × 10^5 cycles/byte      3.6MB
  HTTP/2612      1.3 × 10^7–2.8 × 10^7 cycles/byte      3.9MB
  FTP/98         2,210–6,185 cycles/byte                5.9–6.2MB
MDFA (partial signature sets in Figure 3.6(b) and (c))
  HTTP/1503      1,000–15,951 cycles/byte               71–232MB
  HTTP/2604      15,891–49,296 cycles/byte              335–426MB
  FTP/95         1,160–1,386 cycles/byte                54–82MB

Figure 3.7: Raw performance numbers for the charts shown in Figure 3.6.

As Figure 3.6 shows, NFA-OBDDs outperform NFAs for all three sets of signatures
by approximately three orders of magnitude for the HTTP signatures, and two orders
of magnitude for the FTP signatures. In Figure 3.6(a), for example, NFA-OBDDs
are approximately 1600×–1800× faster than NFAs while consuming almost the same
amount of memory. The difference in the performance gap between NFA-OBDDs and
NFAs for the HTTP and FTP signatures can be attributed to the number and structure
of these signatures. As discussed in Section 3.2.2, the benefits of NFA-OBDDs are more
pronounced if larger frontiers are to be processed. Since there are a larger number of
HTTP signatures, the frontiers for the corresponding NFAs are larger. As a result,
NFA-OBDDs are much faster than the corresponding NFAs for HTTP signatures than
for FTP signatures. Nevertheless, these results clearly demonstrate that OBDDs can
improve the time-efficiency of NFAs without compromising their space-efficiency.
3.4.3 Comparison with the PCRE Package
We compared the performance of NFA-OBDDs with that of the PCRE package, which
is a popular library for regular expression matching implemented by recursive back-
tracking. Figure 3.6 reports three numbers for the performance of the PCRE package,
corresponding to different values of configuration parameters of the package (these
parameters determine whether PCRE must process input in the ASCII or Unicode
formats, and whether the matching algorithm must terminate after finding the first
matching substring or all matching substrings). In both Figure 3.6(a) and (b), NFA-
OBDDs outperform the PCRE package. The throughput of NFA-OBDDs is about
an order of magnitude better than the fastest configuration of the PCRE package for
the set HTTP/1503. The difference in performance is more pronounced for the set
HTTP/2612, where NFA-OBDDs outperform the most time-efficient PCRE configura-
tion by approximately 300×–500×. The poorer throughput of the PCRE package for
the second set of signatures is likely because the backtracking algorithm that it employs
degrades in performance as the number of paths to be explored in the NFA increases.
However, in both cases, the PCRE package is more space-efficient than NFA-OBDDs,
and consumes about 4MB memory.
For the FTP signatures (Figure 3.6(c)), NFA-OBDDs are about 2.5× slower than the
fastest PCRE configuration. However, unlike NFA-OBDDs which report all substrings
of an input packet that match signatures, this PCRE configuration only reports the
first matching substring. The performance of the PCRE configurations that report all
matching substrings is comparable to that of NFA-OBDDs.
Note that in all cases, the PCRE package outperforms our NFA implementation,
which uses Thompson’s algorithm [84] to parse input strings. Despite this gap in performance,
Cox [28] shows that Thompson’s algorithm performs more consistently than the
backtracking approach employed by PCRE. For example, the backtracking approach
is vulnerable to algorithmic complexity attacks, where a maliciously-crafted input can
trigger the worst-case performance of the algorithm [75].
3.4.4 Comparison with DFA Variants
Multiple DFAs
We compared the performance of NFA-OBDDs with a variant of DFAs, called multiple
DFAs (MDFAs), produced by set-splitting [102]. We were unable to compare the per-
formance of NFA-OBDDs against DFAs because DFA construction ran out of memory.
However, prior work [77] estimates that DFAs may offer throughputs of about 50 cy-
cles/byte. An MDFA is a collection of DFAs representing a set of regular expressions.
Each DFA represents a disjoint subset of the regular expressions. To match an input
string against an MDFA, each constituent DFA is executed against the input string to
determine whether there is a match. MDFAs are more compact than DFAs because
they result in a less than multiplicative increase in the number of states. However, MDFAs
are also slower than DFAs because all the constituent DFAs must be
matched against the input string. An MDFA that has a larger number of constituent
DFAs will be more compact, but will also have lower time-efficiency than an MDFA
with fewer DFAs.
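The matching procedure for an MDFA can be sketched as a loop over its constituent DFAs. The representation below (a tuple of start state, accepting set, and transition table) and the two toy DFAs are assumptions made for illustration; they are not produced by the set-splitting algorithm of Yu et al.

```python
def run_dfa(dfa, data):
    """Run one DFA over the whole input and report whether it accepts."""
    start, accepting, table = dfa
    state = start
    for symbol in data:
        state = table[(state, symbol)]
    return state in accepting


def mdfa_match(dfas, data):
    """Run every constituent DFA; return the indices of those that accept."""
    return [k for k, dfa in enumerate(dfas) if run_dfa(dfa, data)]


# Toy DFA 0: binary strings that end in 1.
ENDS_IN_1 = (0, {1}, {(0, "0"): 0, (0, "1"): 1, (1, "0"): 0, (1, "1"): 1})
# Toy DFA 1: binary strings that contain 00.
CONTAINS_00 = (0, {2}, {(0, "0"): 1, (0, "1"): 0, (1, "0"): 2, (1, "1"): 0,
                        (2, "0"): 2, (2, "1"): 2})

if __name__ == "__main__":
    print(mdfa_match([ENDS_IN_1, CONTAINS_00], "1001"))  # [0, 1]
```

With k constituent DFAs, every input symbol costs k table lookups, which is the time penalty that an MDFA pays for its smaller total state count.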
Using Yu et al.’s algorithms [102], we produced several MDFAs by combining the
Snort signatures in several ways, each with different space/time utilization. Each point
in Figure 3.6 denotes the performance of one MDFA (again, averaged over all the input
traces), which in turn consists of a collection of DFAs, as described above.
Producing MDFAs for the HTTP/2612 and FTP/98 signature sets was more chal-
lenging, primarily because these sets contained several structurally-complex regular
expressions that were difficult to determinize efficiently. For example, they contained
several signatures with large counters (i.e., sequences of repeating patterns) often used
in combination with the choice (i.e., re1|re2) operator. Our determinizer frequently ran
out of memory when attempting to construct MDFAs for such regular expressions. As
an example, consider the following regular expression in HTTP/2612:
/.*\x2FCSuserCGI\x2Eexe\x3FLogout\x2B[^\s]{96}/i
Our determinizer consumed 1.6GB of memory for this regular expression alone, before
aborting. Producing a DFA for such regular expressions may require more sophisticated
techniques, such as on-the-fly determinization [80] that are not currently implemented
in our prototype. We therefore decided to exclude problematic regular expressions,
and constructed MDFAs with the remaining ones (2604 for HTTP/2612 and 95 for
FTP/98). Note that the MDFAs for these smaller sets of regular expressions may be
more time-efficient and much more space-efficient than corresponding MDFAs for the
entire set of regular expressions.
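The cost of determinizing counters that follow a .* prefix can be illustrated on a much simpler pattern. The sketch below applies the subset construction to an NFA for .*a.{n} over the alphabet {a, b} and counts the reachable DFA states; it is a simplified analog chosen for this illustration and does not reproduce our determinizer or the exact state count for the signature above.

```python
def count_dfa_states(n):
    """Count the subset-construction states reachable for an NFA of
    .*a.{n} over {a, b}. NFA states: 0 (loops on a and b), 1..n+1
    (counter chain entered from 0 on 'a'); state n+1 is accepting."""

    def step(subset, symbol):
        nxt = set()
        if 0 in subset:
            nxt.add(0)              # .* self-loop
            if symbol == "a":
                nxt.add(1)          # start a new counter instance
        # every active counter instance advances on any symbol
        nxt.update(s + 1 for s in subset if 1 <= s <= n)
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    worklist = [start]
    while worklist:
        current = worklist.pop()
        for symbol in "ab":
            nxt = step(current, symbol)
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(5):
        print(n, count_dfa_states(n))  # 2, 4, 8, 16, 32 states
```

The number of DFA states doubles with each increment of the counter bound in this analog, because the DFA must remember which of the last n + 1 symbols could have started a match, whereas the NFA needs only n + 2 states.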
Figure 3.6 shows that in many cases NFA-OBDDs can provide throughputs com-
parable to those offered by MDFAs while utilizing much less memory. For example,
the fastest MDFA in Figure 3.6(b) (constructed for a subset of 2604 signatures) offered
about 50% more throughput than NFA-OBDDs, but consumed 7× more memory. The
remaining MDFAs for this signature set had throughputs comparable to those of NFA-
OBDDs, but consumed 270MB more memory than NFA-OBDDs. The performance
gap between NFA-OBDDs and MDFAs was largest for the FTP signature set, where the
MDFAs (for a subset of 95 signatures) were about 4× faster than the NFA-OBDD;
however, the MDFAs consumed 46MB-74MB more memory.
These results are significant for two reasons. First, conventional wisdom has long
held that traditional NFAs operate much slower than their deterministic counterparts.
This is also supported by our experiments, which show that NFAs
are three to four orders of magnitude slower than MDFAs. However, our results
show that OBDDs can drastically improve the performance of NFAs and even make
them competitive with MDFAs, which are a deterministic variant of finite automata. We
believe that further enhancements to improve the time-efficiency of NFA-OBDDs can
make them operate even faster than MDFAs (e.g., by relaxing the OBDD data struc-
ture, and thereby eliminating several graph operations in the Apply and Restrict
operations).
Second, NFA-OBDDs were produced automatically from regular expressions. In
contrast, processing the set of regular expressions to produce compact yet performant
MDFAs is a non-trivial exercise, often requiring time-consuming partitioning heuristics
to be applied [102]. Some of the partitioning heuristics described by Yu et al. also re-
quire modifications to the set of regular expressions, thereby changing their semantics.
Our own experience in attempting to construct MDFAs for HTTP/2612 and FTP/98
shows that this process is often challenging, especially if the regular expressions con-
tain complex structural patterns. In contrast, NFA-OBDDs can be constructed in a
straightforward manner from regular expressions, including those with counters and
other complex structural patterns, and are yet competitive in performance and more
compact than MDFAs.

Operation          Fraction
AndAbstract        50%
And                39%
Map                4%
Acceptance check   7%

Figure 3.8: Fraction of time spent performing OBDD operations.

Hybrid Finite Automata
Finally, we also attempted to compare the performance of NFA-OBDDs with a variant
of DFAs, called hybrid finite automata (HFA) [12]. HFAs are a hybrid of NFAs and
DFAs, and are constructed by interrupting the determinization algorithm when it en-
counters structurally-complex patterns (e.g., large counters and .* patterns) that are
known to cause memory blowups when determinized. We used Becchi and Crowley’s
implementation [12] in our experiments, but found that it ran out of memory when
trying to construct HFAs from our signature sets. For example, the HFA construc-
tion process exhausted the available memory on our machine after processing just 106
regular expressions in the HTTP/1503 set. It may be possible to construct a collec-
tion of HFAs in a manner akin to MDFAs, but we did not consider this design in our
experiments.
3.4.5 Deconstructing NFA-OBDD Performance
We further analyzed the performance of NFA-OBDDs to understand the time consump-
tion of each OBDD operation. The results of this analysis can motivate techniques to
optimize OBDD packages to further improve the time-efficiency of NFA-OBDDs. The
results reported in this section are based upon the HTTP/1503 signature set; the results
with the other signature sets were similar.
Figure 3.8 shows the fraction of time that exec_nfaobdd spends performing various
OBDD operations as it processes a single input symbol. These include the operations
needed to compute a new frontier and those needed to check if the frontier contains an
accepting configuration.
As discussed earlier, exec_nfaobdd uses the Cudd package to manipulate OBDDs.
Although Cudd implements the OBDD operations described in Chapter 2, it also imple-
ments composite operations that combine multiple Boolean operations; the composite
operations are often more efficient than performing the individual operations separately.
AndAbstract is one such operation, which allows two OBDDs to be combined using
an And operation followed by an existential quantification. AndAbstract takes a list
of Boolean variables to be quantified, and performs the OBDD transformations needed
to eliminate all these variables. The Map operation allows variables in an OBDD to be
renamed, e.g., it can be used to rename the ~y variables in G(~y) to ~x variables instead.
We implemented the Boolean operations required to obtain a new frontier (described
in Section 3.2.1) using one set of And, AndAbstract and Map operations. Each
AndAbstract step existentially quantifies 26 Boolean variables (the ~x and ~i variables).
To check whether a frontier should be accepted, we used another And operation to
combine OBDD(F) and OBDD(A); the cost of an acceptance check appears in the last
row of Figure 3.8.
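The roles of the three operations can be sketched with a toy stand-in for OBDDs that stores a Boolean function as its set of satisfying assignments. The class BoolFn and the helper functions are hypothetical and do not call Cudd; in particular, which operands are paired in the And step versus the AndAbstract step is an assumption of this sketch.

```python
class BoolFn:
    """Toy stand-in for an OBDD: a Boolean function stored as the set of
    satisfying assignments (rows) over a tuple of named variables."""

    def __init__(self, variables, rows):
        self.variables = tuple(variables)
        self.rows = {tuple(r) for r in rows}


def bf_and(f, g):
    """Conjunction: rows of f and g that agree on their shared variables."""
    variables = f.variables + tuple(v for v in g.variables
                                    if v not in f.variables)
    rows = set()
    for rf in f.rows:
        env_f = dict(zip(f.variables, rf))
        for rg in g.rows:
            env_g = dict(zip(g.variables, rg))
            if all(env_f[v] == env_g[v] for v in env_f.keys() & env_g.keys()):
                env = {**env_f, **env_g}
                rows.add(tuple(env[v] for v in variables))
    return BoolFn(variables, rows)


def and_abstract(f, g, quantified):
    """Conjunction followed by existential quantification of `quantified`."""
    h = bf_and(f, g)
    keep = [v for v in h.variables if v not in quantified]
    idx = [h.variables.index(v) for v in keep]
    return BoolFn(keep, {tuple(r[k] for k in idx) for r in h.rows})


def rename(f, mapping):
    """Rename variables (the role played by Map), e.g. y -> x."""
    return BoolFn([mapping.get(v, v) for v in f.variables], f.rows)


# Running example: assumed (0|1)*1 NFA, frontier {A, B}, input symbol 0.
T = BoolFn(("x", "i", "y"), {(0, 0, 0), (0, 1, 0), (0, 1, 1)})
I0 = BoolFn(("i",), {(0,)})
F = BoolFn(("x",), {(0,), (1,)})

selected = bf_and(I0, F)                          # And
G = and_abstract(T, selected, {"x", "i"})         # AndAbstract
new_F = rename(G, {"y": "x"})                     # Map
print(new_F.variables, new_F.rows)  # ('x',) {(0,)}
```

Fusing the conjunction with the quantification avoids materializing the intermediate function over all of ~x, ~i and ~y as a separate result, which is the motivation for composite operations in an OBDD package.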
Figure 3.8 shows that the cost of processing an input symbol is dominated by the
cost of the AndAbstract and And operations to compute a new frontier. This is
because the sizes of the OBDDs to be combined for frontier computation are bigger
than the OBDDs that must be combined to check acceptance. Moreover, computing
new frontiers involves several applications of Apply and Restrict, as opposed to an
acceptance check, which requires only one Apply, thereby causing frontier computation
to dominate the cost of processing an input symbol.
These results suggest that an OBDD implementation that optimizes the AndAb-
stract and And operations (or a relaxed variant of OBDDs that allows for more
efficient AndAbstract and And operations) can further improve the performance of
NFA-OBDDs.
3.4.6 Impact of Variable Ordering on NFA-OBDD Performance
As mentioned in Chapter 2, the size of an OBDD is sensitive to the total order imposed
on its variables. Bryant [20] showed that it is NP-hard to determine whether a particular
variable ordering minimizes the size of an OBDD for a Boolean function. Variable order
also impacts the structure of OBDDs, and in our experience, the order of the variables
in the vectors ~i, ~x and ~y influences the performance of NFA-OBDDs.

[Figure 3.9 consists of three scatter plots: (a) HTTP/1503 signature set, (b) HTTP/2612 signature set, (c) FTP/98 signature set. x-axis: Execution speed (cycles/byte), log scale; y-axis: Memory usage (MB). Series: i<x<y, x<i<y, y<x<i, i<Interleave(xy), i<Interleave(yx), Interleave(xy)<i, Interleave(yx)<i.]

Figure 3.9: Impact of OBDD variable ordering on the performance of NFA-OBDDs.

We experimented with various total orders to empirically determine their impact
on the size and throughput of NFA-OBDDs before settling on one of the total orders
that yielded the best performance (~i < ~x < ~y) for the numbers reported earlier in this
section.
Figure 3.9 compares the performance of NFA-OBDDs constructed using seven total
orders (all four constituent OBDDs of each NFA-OBDD use the same total order):
1. ~i < ~x < ~y: the variables within each vector are arranged in increasing order from
most significant bit (MSB) to least significant bit (LSB);
2. ~x < ~i < ~y: in-vector order is similar to the case above;
3. ~y < ~x < ~i: in-vector order is similar to the case above;
4. ~i < Interleave[~x < ~y]: variables in the vector ~i appear before the variables in ~x
and ~y. The variables in ~x and ~y appear interleaved, with a variable in ~x appearing
before the corresponding variable in ~y. The variables in each vector increase from
MSB to LSB;
5. ~i < Interleave[~y < ~x]: similar to the case above, except that variables in ~y
appear before their counterparts in ~x;
6. Interleave[~x < ~y] < ~i: as above, except that the interleaved variables of ~x and
~y appear in the total order before the variables of ~i;
7. Interleave[~y < ~x] < ~i: as above, with the variables in ~y preceding their
counterparts in ~x.
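The seven orders listed above can be written down concretely. The following Python sketch is illustrative only: the function names `interleave` and `variable_orders` and the variable labels `i0`, `x0`, `y0` are assumptions made for this example (index 0 stands for the most significant bit), and the roles given to the three vectors in the comments are likewise assumed. It is not the code used to build the NFA-OBDDs.

```python
def interleave(first, second):
    """Alternate the variables of two equally long vectors, `first` leading."""
    order = []
    for a, b in zip(first, second):
        order.extend([a, b])
    return order


def variable_orders(input_bits, state_bits):
    """Return the seven total orders as lists, earliest variable first."""
    i = [f"i{n}" for n in range(input_bits)]  # input-symbol variables (assumed role)
    x = [f"x{n}" for n in range(state_bits)]  # current-state variables (assumed role)
    y = [f"y{n}" for n in range(state_bits)]  # next-state variables (assumed role)
    return {
        "i < x < y": i + x + y,
        "x < i < y": x + i + y,
        "y < x < i": y + x + i,
        "i < Interleave[x < y]": i + interleave(x, y),
        "i < Interleave[y < x]": i + interleave(y, x),
        "Interleave[x < y] < i": interleave(x, y) + i,
        "Interleave[y < x] < i": interleave(y, x) + i,
    }


orders = variable_orders(input_bits=2, state_bits=2)
print(orders["i < Interleave[x < y]"])
```

Each list could then be handed to an OBDD package as its variable order; how that is done depends on the package and is not shown here.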
The above variable orders are only a tiny fraction of the set of possible total orders,
which is exponential in the number of variables. However, they provide insight into
which orders empirically provide high-performance NFA-OBDDs. We considered
NFA-OBDDs for HTTP/1503, HTTP/2612 and FTP/98 to determine whether the performance
of NFA-OBDDs for different signature sets is sensitive to the order imposed on the
variables. As before, we used exec nfaobdd to feed network traces to these NFA-OBDDs.
For these experiments, we only used the network traces collected at Rutgers.
Figure 3.9 presents the results of these experiments, showing both the throughput
and overall memory consumption of exec nfaobdd. It shows that the total order ~i <
~x < ~y performs consistently well across all three signature sets, but consumes more
memory than the most compact implementation. The total order ~i < Interleave[~x
< ~y] also provides competitive performance for the NFA-OBDDs of all three signature
sets. However, the performance gap between the best and the worst total orders (~y < ~x
< ~i) is almost four orders of magnitude. We used the order ~i < ~x < ~y for the experiments
reported earlier in this section, though we could have used ~i < Interleave[~x < ~y] as
well, with a slightly smaller memory consumption for the FTP/98 NFA-OBDD.
This figure also shows that there is no direct correlation between the size and per-
formance of exec nfaobdd for different variable orders. Although the time complexity of
algorithms such as Apply and Restrict asymptotically depends on the size of their
input OBDDs, factors such as the structure of the OBDDs also affect the number of
graph operations, and therefore the performance of the corresponding NFA-OBDDs.
These experiments lead us to conclude that the performance of NFA-OBDDs is
indeed sensitive to the total order imposed on its variables. The vast search space of
total orders diminishes hopes of a tractable algorithm to identify the total order that
would yield the best-performing NFA-OBDD for a given signature set. Nevertheless,
in practice, experiments with a few total orders (such as the ones in Figure 3.9) can
help empirically determine high-performance NFA-OBDDs. Future work could develop
heuristics that leverage the structure of the regular expressions in the input signature
set to determine “good” total orders.
3.5 Matching Multiple Input Symbols
The preceding sections assumed that only one input symbol is processed in each
step. However, there is growing interest in developing techniques for multi-byte matching,
i.e., matching multiple input symbols in one step. Prior work has shown that multi-byte
matching can improve the throughput of NFAs [18, 14]. In this section, we present one
such technique, k-stride NFAs [18], and show that OBDDs can further improve the
performance of k-stride NFAs.
A k-stride NFA matches k symbols of the input in a single step. Given a traditional
(i.e., 1-stride) ε-free NFA (Q, Σ, ∆, q0, F), a k-stride NFA is a 5-tuple (Q, Σ^k, Γ, q0, F),
whose input symbols are k-grams, i.e., elements of Σ^k. The set of states and accepting
states of the k-stride NFA are the same as those for the 1-stride NFA. Intuitively, the
transition function Γ of the k-stride NFA is computed as a k-step closure of ∆, i.e.,
(s, σ1σ2...σk, t) ∈ Γ if and only if the state t is reachable from state s in the original
NFA via transitions labeled σ1, σ2, ..., σk. The algorithm to compute Γ from ∆ must
also consider cases where the length of the input string is not a multiple of k. This is
achieved by padding the input string with a new “do-not-care” symbol, and introducing
this symbol in the labels of selected transitions. We refer the interested reader to prior
work [18, 14] for a detailed description of the construction.
Figure 3.10: 2-stride NFA for Figure 3.1.
Figure 3.10 presents an example of a 2-stride NFA corresponding to the NFA in
Figure 3.1. The do-not-care symbol is denoted by a “•”. Thus, for instance, an input
string 101 would be padded with • to become 101•. The 2-stride NFA processes digrams
in each step. Thus, the first step would result in a transition from state A to itself
(because of the transition labeled 10), followed by a transition from A to B when it
reads the second digram 1•, thereby accepting the input string.
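The k-step closure and the padding step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construction of [18, 14]: the 1-stride NFA in `DELTA` is a guess at the two-state automaton the running example describes (A loops on 0 and 1, A moves to the accepting state B on 1), the names `k_stride` and `run` are hypothetical, and padded transitions are simply added for every path shorter than k.

```python
PAD = "•"  # the do-not-care symbol

# Assumed 1-stride NFA: transitions are (source, symbol, target) triples.
DELTA = {("A", "0", "A"), ("A", "1", "A"), ("A", "1", "B")}


def k_stride(delta, k, pad=PAD):
    """Return the k-stride transitions as a set of (source, k-gram, target)."""
    states = {s for s, _, _ in delta} | {t for _, _, t in delta}
    paths = {(s, "", s) for s in states}  # paths of length 0
    gamma = set()
    for length in range(1, k + 1):
        # Extend every known path by one transition of the 1-stride NFA.
        paths = {(s, word + sym, t)
                 for (s, word, mid) in paths
                 for (src, sym, t) in delta if src == mid}
        # Paths shorter than k are padded so they can consume the last k-gram.
        gamma |= {(s, word + pad * (k - length), t) for (s, word, t) in paths}
    return gamma


def run(gamma, start, accepting, text, k, pad=PAD):
    """Match a whole (padded) input string, k symbols per step."""
    text += pad * (-len(text) % k)
    frontier = {start}
    for pos in range(0, len(text), k):
        gram = text[pos:pos + k]
        frontier = {t for (s, g, t) in gamma if s in frontier and g == gram}
    return bool(frontier & accepting)


GAMMA = k_stride(DELTA, 2)
print(run(GAMMA, "A", {"B"}, "101", 2))
```

With this assumed automaton, the input 101 is padded to 101• and consumed as the digrams 10 and 1•, mirroring the walk-through in the text.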
A k-stride NFA (Q, Σ^k, Γ, q0, F) can readily be converted into a k-stride NFA-OBDD
using the same approach described in Section 3.2. The main difference is that
the input alphabet is Σ^k (plus a new symbol “•”); the vector ~i would therefore contain
k times as many Boolean variables. However, two additional details must be addressed
when applying k-stride NFAs (and the corresponding NFA-OBDDs) to the problem of
matching traffic patterns in a NIDS, namely, (i) adapting k-stride NFAs to work in the
streaming model; and (ii) reducing the space consumption of k-stride NFAs. These are
discussed next.
3.5.1 Adapting to the Streaming Model
When operating a 1-stride NFA to process a stream of inputs, the frontier of states must
be checked after processing each input symbol to determine whether the input triggered
a match. However, this technique does not suffice for k-stride NFAs in the streaming
model. To see why, consider how the NFAs in Figure 3.1 and Figure 3.10 would process
the input 10. The 1-stride NFA would trigger an alert after the first symbol has been
processed (because the frontier F = {A, B} contains an accepting state). In contrast,
the 2-stride NFA would process the entire input 10 in one step, resulting in the frontier
F = {A}, which does not contain the accepting state. Therefore, for k-stride NFAs, it
does not suffice to simply check F to determine acceptance.
Figure 3.11: The NFA in Figure 3.10 adapted for streaming.
To address this problem, the algorithm to convert a 1-stride NFA into a k-stride
NFA must “remember” that an accepting state of the 1-stride NFA was encountered
when computing the k-step closure of the transition relation ∆ of the 1-stride NFA. One
way to compute k-stride NFAs that achieve this goal is by adding a new accepting state
to the k-stride NFA. This algorithm adds incoming transitions to the new accepting
state suitably from other states in the k-stride NFA to “remember” that a substring
of the input k-gram would have triggered a match in the 1-stride NFA. The resulting
k-stride NFA (Q*, Σ^k, Γ*, q0, F*) has the same semantics as the 1-stride NFA in the
streaming model (i.e., they both accept the same set of strings), but adds a state to the
1-stride NFA. We refer the reader to prior work [18, 14] for details on this algorithm.^2
This k-stride NFA can be converted into an NFA-OBDD using the same technique
presented in Section 3.2, and operated in the same way.
Figure 3.11 presents an example of this approach applied to the 2-stride NFA in
Figure 3.10. The new transition from state A to accepting state C on input 10 uses the
state C to remember that the substring 1 of 10 triggered a match in the corresponding
1-stride NFA. The state C will therefore be in the frontier when an input 10 is processed
at state A, thereby triggering a match when the OBDD operation to check acceptance
is performed. However, there are no outgoing transitions from C. Thus, this state
is removed from the frontier when the next digram is processed by the 2-stride NFA
(unless that digram also triggers acceptance).
^2 This algorithm can also be adapted easily to identify the regular expression that matched the input.
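One way to picture the extra accepting state is the following Python sketch. It is an illustration under assumptions rather than the algorithm of [18, 14]: the 1-stride NFA in `DELTA` is a guess at the running example (A loops on 0 and 1 and moves to the accepting state B on 1), the names `reach` and `streaming_k_stride` are hypothetical, and padding is left out for brevity. A transition into the new state C is added whenever a proper prefix of the k-gram already reaches an accepting state.

```python
from itertools import product

# Assumed 1-stride NFA: transitions are (source, symbol, target) triples.
DELTA = {("A", "0", "A"), ("A", "1", "A"), ("A", "1", "B")}


def reach(delta, sources, word):
    """States reachable from `sources` after reading `word` symbol by symbol."""
    frontier = set(sources)
    for sym in word:
        frontier = {t for (s, a, t) in delta if s in frontier and a == sym}
    return frontier


def streaming_k_stride(delta, accepting, alphabet, k, sink="C"):
    """Build k-stride transitions plus a sink state that remembers early matches."""
    states = {s for s, _, _ in delta} | {t for _, _, t in delta}
    gamma = set()
    for s in states:
        for gram in product(alphabet, repeat=k):
            word = "".join(gram)
            # Ordinary k-step closure.
            for t in reach(delta, {s}, word):
                gamma.add((s, word, t))
            # A match after 1..k-1 symbols would otherwise go unnoticed.
            if any(reach(delta, {s}, word[:j]) & accepting for j in range(1, k)):
                gamma.add((s, word, sink))
    return gamma, accepting | {sink}


GAMMA, ACCEPTING = streaming_k_stride(DELTA, {"B"}, "01", 2)
frontier = {t for (s, g, t) in GAMMA if s == "A" and g == "10"}
print(sorted(frontier), bool(frontier & ACCEPTING))
```

Because the sink has no outgoing transitions, it drops out of the frontier on the next k-gram unless that k-gram adds it again.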
3.5.2 Reducing Space Consumption using Alphabet Compression
The transition table of a k-stride NFA can have O(|Q| × |Σ|^k) entries, each of which
can be of size O(|Q|) (to store the set of “next” states), which
can result in a memory utilization of O(|Q|^2 × |Σ|^k). However, this asymptotic limit is
rarely reached in practice, and transition tables encountered in practice are generally
sparse. In particular, several symbols of the alphabet Σ^k may label exactly the same
set of transitions. That is, if two input symbols σ1 and σ2 satisfy Γ(s, σ1) = Γ(s, σ2)
for every state s ∈ Q, then the symbols σ1 and σ2 can potentially be merged
into an equivalence class. This idea is called alphabet compression [40, 8, 18, 13, 45].
The output of an alphabet compression algorithm is a partition of Σ^k into equivalence
classes. Each equivalence class is assigned a symbol, thereby yielding a new
alphabet E with fewer elements than Σ^k. An alphabet compression algorithm also
outputs an encoding function m : Σ^k → E that translates elements in Σ^k to elements in
E. In the above example, m(σ1) = m(σ2). The transitions of an alphabet-compressed
NFA would also be appropriately relabeled to use symbols from E instead. Similarly,
symbols in the input would also have to be appropriately translated using m before
they are passed to the NFA for matching. An alphabet-compressed NFA can also be
converted into an NFA-OBDD using the same techniques described in Section 3.2, and
operated in the same way.
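The idea can be sketched as follows. This Python fragment is a simple illustration and not Brodie et al.'s algorithm [18]: it groups symbols whose sets of (source, target) pairs coincide, and the names `compress_alphabet`, `GAMMA` and `ALPHABET`, as well as the small transition relation, are assumptions made for the example.

```python
from collections import defaultdict


def compress_alphabet(gamma, alphabet):
    """Return the encoding m and the relabeled transitions.

    gamma is a set of (source, symbol, target) triples. Two symbols fall into
    the same equivalence class when they label the same (source, target) pairs.
    """
    signature = defaultdict(set)
    for s, sym, t in gamma:
        signature[sym].add((s, t))

    classes = defaultdict(list)
    for sym in alphabet:
        classes[frozenset(signature.get(sym, ()))].append(sym)

    # Number the classes; the numbers play the role of the new alphabet E.
    m = {}
    for class_id, members in enumerate(classes.values()):
        for sym in members:
            m[sym] = class_id

    compressed = {(s, m[sym], t) for s, sym, t in gamma}
    return m, compressed


# Hypothetical 2-stride transitions over the digrams of {0, 1}.
GAMMA = {("A", "00", "A"), ("A", "10", "A"),
         ("A", "01", "A"), ("A", "01", "B"),
         ("A", "11", "A"), ("A", "11", "B")}
ALPHABET = ["00", "01", "10", "11"]

m, compressed = compress_alphabet(GAMMA, ALPHABET)
print(m, len(compressed))
```

Input k-grams would be translated with `m` before being handed to the compressed automaton, as the text describes.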
We implemented the alphabet compression algorithm described by Brodie et al. [18]
for 2-stride NFAs and empirically found that alphabet compression reduces the
memory consumption of 2-stride NFAs. However, this alphabet compression algorithm
itself is quite resource-intensive because it operates on the transition relation of the
entire (2-stride) NFA, thereby causing the algorithm to exhaust the available memory
on our machine. For example, we found that Brodie et al.’s algorithm [18] frequently
ran out of memory when processing 2-stride NFAs obtained by combining more than
200 regular expressions from the HTTP/1503 and HTTP/2612 signature sets.
We developed a scheme (Algorithm 5) that applies Brodie et al.’s algorithm to
smaller NFAs (thereby limiting the algorithm’s memory consumption), and merges the
results to obtain a compressed alphabet for the combination of the smaller NFAs. Our
scheme is based upon the following fact: if two symbols σ1 and σ2 appear in the same
equivalence class of the compressed alphabet of each of two NFAs (say, NFA_X and
NFA_Y), then they will appear in the same equivalence class of the NFA (say NFA_Z)
obtained by merging the set of states and transitions of NFA_X and NFA_Y. Algorithm 5
uses this observation to combine the compressed alphabets X and Y of NFA_X and NFA_Y
and produce the compressed alphabet Z of NFA_Z. It proceeds by combining X and
Y into a set Z, and iteratively refining Z (in line 7) so that if any two symbols σ1 and
σ2 appear in the same equivalence class in the output set Z, then they also appear in
the same equivalence classes in both X and Y.

Algorithm: Combine Compressed Alphabet (X, Y)
Input: X = {X1, ..., Xp} and Y = {Y1, ..., Yq}, the compressed alphabets of NFA_X and NFA_Y
Output: Z, the compressed alphabet of NFA_Z = NFA_X ∪ NFA_Y
 1  Z = X ∪ Y;
 2  Z′ = ∅;
 3  foreach (A ∈ Z) do
 4      split = false;
 5      foreach (B ∈ Z such that B ≠ A) do
 6          if (A ∩ B ≠ ∅) then
 7              Z′ = Z′ ∪ (A ∩ B) ∪ (A − B) ∪ (B − A);
 8              split = true;
 9      if (split == false) then Z′ = Z′ ∪ A;
10  if (Z ≠ Z′) then
11      Z = Z′;
12      goto line 2;
13  return Z;
Algorithm 5: Combining the compressed alphabets of two NFAs.
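The property that Algorithm 5 aims for can also be computed directly. The Python sketch below is not a line-by-line transcription of Algorithm 5: it assumes that X and Y are partitions of the same alphabet and groups symbols by the pair of classes they belong to, which gives the coarsest partition in which two symbols share a class only if they share a class in both X and Y. The function name `combine_compressed_alphabets` is hypothetical.

```python
from collections import defaultdict


def combine_compressed_alphabets(X, Y):
    """Common refinement of two partitions X and Y of the same alphabet."""
    class_in_x = {sym: idx for idx, cls in enumerate(X) for sym in cls}
    class_in_y = {sym: idx for idx, cls in enumerate(Y) for sym in cls}

    merged = defaultdict(set)
    for sym in class_in_x:
        # Symbols stay together only if both partitions keep them together.
        merged[(class_in_x[sym], class_in_y[sym])].add(sym)
    return [frozenset(cls) for cls in merged.values()]


X = [{"a", "b", "c"}, {"d"}]
Y = [{"a", "b"}, {"c", "d"}]
Z = combine_compressed_alphabets(X, Y)
print(sorted(sorted(cls) for cls in Z))
```

Here a and b remain in one class, while c and d are split apart because X separates them.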
Our experiments confirm the scalability of Algorithm 5. For example, we were
able to use this algorithm to compress the alphabet of the 2-stride NFA representing
the 2604 signatures from the HTTP/2612 set. We did so by first splitting these
signatures into 61 smaller subsets, applying Brodie et al.’s alphabet compression to
the 2-stride NFAs representing these subsets, and combining the compressed alphabets
pairwise using Algorithm 5. The size of the compressed alphabet of the 2-stride NFA
was 11,119. In contrast, Brodie et al.’s algorithm ran out of memory when processing
the 2-stride NFA representing the set of 2604 signatures in its entirety.
We excluded eight signatures from the HTTP/2612 signature set because alphabet
compression ran out of memory for these eight signatures. These eight signatures
contained complex structural patterns that caused Brodie et al.’s compression
algorithm [18] to run out of memory for 2-stride NFAs representing each of these
signatures (so Algorithm 5 was never invoked). Further research is needed to develop
alphabet compression algorithms that can handle such complex signatures.
3.5.3 Performance of k-stride NFA-OBDDs
To evaluate the performance of k-stride NFAs and k-stride NFA-OBDDs, we used a
toolchain similar to the one discussed in Section 3.3, but additionally applied alphabet
compression. Although our implementation accepts k as an input parameter, we have
only conducted experiments for k = 2 because our alphabet compression algorithm ran
out of memory for larger values of k.
The setup that we used for the experiments reported below is identical to that
described in Section 3.3. However, we only used two sets of Snort signatures in our
measurements: (1) HTTP/2604, a subset of 2604 HTTP signatures from HTTP/2612
and (2) FTP/95, a subset of 95 FTP signatures from FTP/98 (we omitted three
signatures because Brodie et al.’s compression algorithm [18] ran out of memory
for 2-stride NFAs representing each of these signatures).
Figure 3.12(a) presents the size of the 1-stride and 2-stride NFA-OBDDs, and the
size of the compressed alphabet. In each case, the alphabet compression algorithm
took over a day to complete, and consumed about 1.6GB memory. Figure 3.12(b) and
(c) compare the performance of 1-stride NFA-OBDDs with the performance of 2-stride
NFA-OBDDs (using the traces described in Section 3.3). As expected, these figures
show that matching multiple bytes in the input stream improves the performance of
NFA-OBDDs, roughly doubling the throughput in each case. In fact, the 2-stride
NFA-OBDD of the HTTP/2612 signature set more than doubled (2.26×) the throughput of
the 1-stride NFA-OBDD on the DARPA trace.

Signature Set   #States   #Transitions in NFA (1-stride)   #Transitions in NFA (2-stride)   #Alphabet Symbols
HTTP/2604       237,972   5,567,317                        136,212,770                      11,119
FTP/95          15,266    3,361,065                        5,136,420                        848
(a) 2-stride NFA-OBDD construction results

[Panels (b) HTTP/2604 signature set and (c) FTP/95 signature set plot memory usage (MB)
against processing time (cycles/byte) for the 1-stride and 2-stride NFA-OBDDs; panel (c)
also plots the 1-stride and 2-stride NFAs.]
Figure 3.12: Memory versus throughput for 1-stride and 2-stride NFA-OBDDs. Figure 3.12(c)
also shows the performance of the corresponding 1-stride and 2-stride NFAs.
These experiments also demonstrate that the use of OBDDs allows 2-stride NFA-
OBDDs to be more space-efficient than NFAs. While we were able to create and operate
a 2-stride NFA-OBDD for the HTTP/2604 signature set, the 2-stride NFA for this
signature set exhausted the memory available on our machine. We were able to create
a 2-stride NFA for the FTP/95 signature set; Figure 3.12(c) depicts the performance
of both the 1-stride and 2-stride NFAs for this signature set. As this figure shows, the
2-stride NFA-OBDD uses about two orders of magnitude less memory than the
2-stride NFA, and is also about two orders of magnitude faster. These
results lead us to conclude that 2-stride NFA-OBDDs are drastically more efficient in
time and space than 2-stride NFAs.
3.6 Related Work
Early NIDS exclusively employed strings as attack signatures. String-based signatures
are space-efficient, because their size grows linearly with the number of signatures. They
are also time-efficient, with matching algorithms that take O(1) time per input symbol
(e.g., Aho-Corasick [7]).
They are ideally suited for wire-speed intrusion detection, and have been implemented
both in software and hardware [31, 57, 68, 43, 83, 26, 49, 82, 83, 86, 87, 72]. However,
prior work has shown that string-based signatures can easily be evaded by malware
using polymorphism, metamorphism and other mutations [36, 41, 65, 70]. The research
community has therefore been investigating sophisticated signature schemes, such as
session signatures [67, 80, 101] and vulnerability signatures [88, 19], that require the full
power of regular expressions. This, in turn, has spurred the research community
to develop improved algorithms for regular expression matching, and NIDS vendors
to increasingly deploy products that use regular expressions
(e.g., Tipping Point [2], LSI Corporation [1] and Cisco [24]).
DFAs provide high-speed matching, but DFAs for large signature sets often consume
gigabytes of memory. Researchers have therefore investigated techniques to improve
the space-efficiency of DFAs. These include, for example, techniques to determinize
on-the-fly [80]; MDFAs, which combine signatures into multiple DFAs (as discussed
in Section 3.4) [102]; D2FAs [46], which reduce the memory footprint of DFAs via
edge compression; and XFAs [76, 77], which extend DFAs with scratch memory to
store auxiliary variables, such as bitmaps and counters, and associate transitions with
instructions to manipulate these variables. Some DFA variants (e.g., [46, 77, 18, 52])
also admit efficient hardware implementations.
These techniques use the time-efficiency of DFAs as a starting point, and seek to
reduce their memory footprint. In contrast, our work uses the space-efficiency of NFAs
as a starting point, and seeks to improve their time-efficiency. We believe that both
approaches are orthogonal and may be synergistic. For example, it may be possible to
use OBDDs to also improve the time-efficiency of MDFAs.
Our approach also provides advantages over several prior DFA-based techniques.
First, it produces NFA-OBDDs from regular expressions in a fully automated way.
This is in contrast to XFAs [76], which required a manual step of annotating regular
expressions. Second, our approach does not modify the semantics of regular expressions,
i.e., the NFA-OBDDs produced using the approach described in Section 3.2 accept the
same set of strings as the regular expressions that they were constructed from. MDFAs,
in contrast, employ heuristics that relax the semantics of regular expressions to improve
the space-efficiency of the resulting automata [102]. Last, because these techniques
operate with DFAs, they may sometimes encounter regular expressions that are hard
to determinize. For example, Smith et al. [76, Section 6.2] present a regular expression
from the Snort data set for which the XFA construction algorithm runs out of memory.
In contrast, our technique operates with NFAs and therefore does not encounter such
cases.
Research on NFAs for intrusion detection has typically focused on exploiting par-
allelism to improve performance [39, 54, 25, 71]. NFA operation can be parallelized
in many ways. For example, a separate thread could be used to simulate each state
in an NFA’s frontier. Alternatively, a set of regular expressions can be represented as a
collection of NFAs, which can then be operated in parallel. FPGAs have been used to
exploit this parallelism to yield high-performance NFA-based intrusion detection sys-
tems [39, 54, 25, 71].
Although not explored in this chapter, OBDDs can potentially improve NFA per-
formance in parallel execution environments as well. For example, consider a NIDS
that performs signature matching by operating a collection of NFAs in parallel. The
performance of this NIDS can be improved by converting it to use a collection of NFA-
OBDDs instead; in this case, OBDDs improve the performance of each NFA, thereby
increasing the throughput of the NIDS as a whole. Finally, NFA-OBDDs may also
admit a hardware implementation. Prior work has developed techniques to implement
OBDDs in CAMs [104] and FPGAs [74]. Such an implementation of NFA-OBDDs can
be used to improve the performance of hardware-based NFAs as well.
3.7 Summary
Many recent algorithms for regular expression matching have focused on improving
the space-efficiency of DFAs. This chapter sought to take an alternative viewpoint,
and aimed to improve the time-efficiency of NFAs. To that end, we developed NFA-
OBDDs, a representation of regular expressions in which OBDDs are used to operate
NFAs. Our prototype software-based implementation with Snort signatures showed
that NFA-OBDDs can outperform NFAs by almost three orders of magnitude. We also
showed how OBDDs can enhance the performance of NFAs that match multiple input
symbols in a single step.
In summary, the main contribution of this chapter is in showing that the use of
OBDDs drastically improves NFA performance and brings them within the realm of
feasible use in intrusion detection systems. In the light of this contribution and the
space-efficiency of NFAs, we conclude with a call for further research on the use of
NFAs to represent signatures.
Chapter 4
Fast Submatch Extraction using OBDDs
4.1 Introduction
Pattern languages commonly used by NIDS are regular expressions extended with
other features. One of the important features is the capturing group. A capturing
group is a syntax used in modern regular expression implementations to specify
a subexpression of a regular expression. Given a string that matches the regular
expression, submatch extraction is the process of extracting the substrings
corresponding to those subexpressions. In the Snort 2012 rule set [78], more than 10% of
the pcre fields of the HTTP rules contain capturing groups. When a pattern containing
a capturing group matches an input string, submatch extraction can identify
parts of the input that are of interest to security administrators for analysis.
For a regular expression like “username=(.*),hostname=(.*)” with an input string
“username=Bob,hostname=Foo”, submatch extraction yields the two substrings
“Bob” and “Foo” specified by the two capturing groups (the subexpressions wrapped
by the two pairs of parentheses).
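The example can be reproduced with any regular expression library that supports capturing groups. The snippet below uses Python's standard `re` module purely as an illustration of what submatch extraction returns; `re` is a backtracking engine and is not the technique developed in this chapter.

```python
import re

# Two capturing groups, one for the user name and one for the host name.
pattern = re.compile(r"username=(.*),hostname=(.*)")

match = pattern.fullmatch("username=Bob,hostname=Foo")
submatches = match.groups()
print(submatches)  # ('Bob', 'Foo')
```

`groups()` returns one substring per capturing group, numbered in the order of their opening parentheses.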
An important network security application that makes extensive use of submatch
extraction is Security Information and Event Management (SIEM) [94], which provides
real-time analysis of security alerts generated by network hardware and applications.
SIEM systems often collect data from a variety of hardware and software sensors, and
must therefore normalize this data into a common format by extracting common fields
from various data sources. SIEM systems use submatch extraction during data nor-
malization and alert reporting. In a typical SIEM system, more than 90% of regular
expressions used for data normalization contain capturing groups.
In both SIEM systems and NIDS, scalability of pattern matching and submatch
extraction is key. NIDS are often deployed over high-speed network links, which require
that algorithms for pattern matching and submatch extraction be efficient enough to provide
high-throughput intrusion detection on large volumes of network traffic. Similarly, a
typical SIEM system collects logs from hundreds of devices and applications, and must
process terabytes of logs every day in enterprise networks.
There is plenty of prior work on making pattern matching for regular expressions
time-efficient [12, 97, 22, 25, 52, 18, 46, 39, 71, 50] and space-efficient [102, 13, 12, 80, 76,
77, 53, 60]. However, most of these works only considered regular expressions containing
no capturing groups, i.e., they did not support submatch extraction. Existing solutions
for submatch extraction are based on non-deterministic finite automata (NFAs) [47,
29] or recursive backtracking [62]. While NFAs are space-efficient and can extract
submatches with a compact memory footprint, they are not time-efficient because they
maintain a frontier, i.e., a set of states in which an NFA can be at any instant, that
can contain O(n) states, where n is the NFA’s number of states. This leads to an
O(n) operation time for the NFA for each input symbol. Google’s RE2 package [29]
uses a combination of DFAs and NFAs to improve the time efficiency of submatch
extraction. RE2 constructs DFAs on demand (determinization on the fly), uses the
DFAs to locate a pattern’s overall match location in an input string, and then uses an
NFA-based method to extract submatches. The time-efficiency of DFAs, however, often
comes at the cost of state blow-up. RE2 can be very slow when the DFA construction
fills up the limited state cache; it has to empty the state cache and restart the DFA
construction process. Moreover, the actual submatch extraction of RE2 is performed
using an NFA-based method, which is space-efficient, but not time-efficient. Tools such
as PCRE and the regex libraries in Java, Perl, and Python use recursive backtracking
for regular expression matching. The execution time of backtracking, however, can be
exponential for certain types of regular expressions [28]; NIDS that employ backtracking
suffer from algorithmic complexity attacks [75].
In this chapter, we present a novel approach to perform submatch extraction for reg-
ular expression-like pattern languages. Our approach is an extension of the NFA-OBDD
model [97, 99] described in Chapter 3. While both works employ the ordered binary
decision diagram (OBDD) data structure, the NFA-OBDD approach did not consider
the submatch construct, making it inapplicable to the 90% of regular expressions in a
typical SIEM system. We extend the NFA-OBDD model in two ways: (1) we propose
an approach to annotate capturing groups in regular expressions, and (2) present a
new approach to perform submatch extraction. To demonstrate the feasibility of our
approach, we evaluated it using patterns extracted from the Snort NIDS [5]
and a commercial SIEM product. Our experiments show that our approach achieves
its best performance when patterns are combined. In the best case, our approach is
faster than RE2 and PCRE by one to two orders of magnitude. In particular, we make
the following contributions:
• We propose a new approach to tag capturing groups in a regular expression,
and extend Thompson’s NFA construction approach [38] to convert a regular
expression with capturing groups to a tagged-NFA.
• We present a novel and time-efficient technique (henceforth called Submatch-
OBDD) to perform submatch extraction for regular expression-like pattern lan-
guages.
• We evaluated our approach’s time efficiency and space efficiency by matching
the patterns from the Snort system [5] and a commercial SIEM system with
network traces, synthetic traces, and enterprise event logs, and then compared
our performance with two popular regular expression engines: RE2 and PCRE.
The remainder of the chapter is organized as follows. We present our design and
implementation of Submatch-OBDD in Section 4.2, followed by our experimental eval-
uation in Section 4.3. We discuss related work in Section 4.4 and summarize Submatch-
OBDD in Section 4.5.
4.2 Design and Implementation
We first give an overview of our approach before describing the technical details.
4.2.1 Solution Overview
A key observation underlying our approach is that adding a capturing group to a
regular expression does not change the language defined by the regular expression. It
is known that every language defined by a regular expression is also defined by a finite
automaton [38]. However, traditional automata do not support capturing groups. We
present an approach to annotate capturing groups in regular expressions and extend
Thompson’s approach to convert a regular expression with capturing groups to an
NFA-like machine where transitions within capturing groups are tagged. We then present a
novel approach to do submatch extraction using the tagged-NFAs. To improve the time
efficiency of submatch extraction, we represent tagged-NFAs with symbolic Boolean
functions, and manipulate the Boolean functions using ordered binary decision diagrams
(OBDDs).
4.2.2 Tagging NFAs for Submatch
The syntax of regular expressions with capturing groups on an alphabet Σ is
E ::= ε ∪ a ∪ EE ∪ E|E ∪ E* ∪ (E) ∪ [E]
where a stands for an element of Σ, and ε denotes the empty string. We
use square brackets [, ] to group terms in a regular expression that are not capturing
groups, because the usual parentheses (, ) are reserved for marking capturing groups.
If X and Y are sets of strings we use XY to denote {xy : x ∈ X, y ∈ Y}, and X|Y to
denote X ∪ Y. We use E* to denote the closure of E under concatenation.
We use tags to distinguish the capturing groups within a regular expression. Given a
regular expression containing c capturing groups, we assign tags t1, t2, ..., tc to the
capturing groups in the order of their left parentheses as E is read from left to right. We
denote the set of tags by T = {t1, t2, ..., tc}. We use tag(E) to refer to the resulting tagged
regular expression. For example, if E = ((a*)|b)(ab|b) then tag(E) = ((a*)_{t2}|b)_{t1}(ab|b)_{t3}.
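The numbering rule can be checked with a short Python sketch. It is an illustration under simplifying assumptions: the function name `tag_groups` is hypothetical, the input uses only the syntax defined above (parentheses always denote capturing groups, square brackets never do), and escape sequences are not handled.

```python
def tag_groups(regex):
    """Map each tag index to the capturing group it labels.

    Tags are numbered by the position of the group's left parenthesis,
    reading the expression from left to right.
    """
    groups = {}
    open_groups = []  # stack of (tag index, position of "(")
    next_tag = 1
    for pos, ch in enumerate(regex):
        if ch == "(":
            open_groups.append((next_tag, pos))
            next_tag += 1
        elif ch == ")":
            tag, start = open_groups.pop()
            groups[tag] = regex[start:pos + 1]
    return groups


tags = tag_groups("((a*)|b)(ab|b)")
print(tags)
```

For E = ((a*)|b)(ab|b) this assigns t1 to the outer group ((a*)|b), t2 to the nested group (a*), and t3 to (ab|b), in agreement with tag(E) above.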
The language L(F) for a tagged regular expression F = tag(E) is a set of tagged
strings, defined by L(ε) = {ε}, L(a) = {a}, L(F1F2) = L(F1) · L(F2), L(F1|F2) =
L(F1) ∪ L(F2), L(F*) = L(F)*, L([F]) = L(F), and L((F)_t) = {α_t : α ∈ L(F)}, where
( )_t denotes a capturing group with tag t and α_t denotes the string α tagged with t. A
string α is tagged by t if and only if each character in α is tagged by t. Substrings of α
may be tagged by other tags. Since capturing groups can be nested, a character can be
tagged by multiple tags. An example of a tagged string for a tagged regular expression
is: a b_{t1} b_{t1} b_{t1} ∈ L(a(b*)_{t1}).
Definition A valid assignment of submatches for a string α that matches regular
expression E is a map sub : {t1, t2, ..., tc} → Σ* such that there exists β ∈ L(tag(E))
satisfying the following:
(i) β|_Σ = α, where β|_Σ represents the projection of characters in β onto their
corresponding values of Σ;
(ii) if ti occurs in β then sub(ti) is the last consecutive sequence of characters that
are assigned tag ti;
(iii) if ti does not occur in β, then sub(ti) = null.
For example, consider the regular expression [(a|c)(b|d)]* with input string abcd. A
valid submatch assignment satisfying the above conditions is sub(t1) = c, sub(t2) = d.
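Conditions (ii) and (iii) can be applied mechanically once a tagged string β is known. The following Python sketch does exactly that and nothing more: the representation of β as a list of (character, set of tag indices) pairs and the function name `submatches` are assumptions of this example, and finding β in the first place is the job of the matching algorithm, which is not shown here.

```python
def submatches(beta, num_tags):
    """Apply conditions (ii) and (iii) of the definition to a tagged string.

    beta is a list of (character, set of tag indices) pairs.
    """
    sub = {}
    for tag in range(1, num_tags + 1):
        positions = [idx for idx, (_, tags) in enumerate(beta) if tag in tags]
        if not positions:
            sub[tag] = None  # condition (iii): the tag does not occur
            continue
        # Condition (ii): walk left from the last tagged character while the
        # preceding characters carry the same tag.
        end = positions[-1]
        start = end
        while start > 0 and tag in beta[start - 1][1]:
            start -= 1
        sub[tag] = "".join(ch for ch, _ in beta[start:end + 1])
    return sub


# Tagged string for abcd under [(a|c)(b|d)]*: a and c carry t1, b and d carry t2.
beta = [("a", {1}), ("b", {2}), ("c", {1}), ("d", {2})]
print(submatches(beta, 2))
```

For this β the projection onto Σ is abcd, and the result is sub(t1) = c and sub(t2) = d, as in the example above.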
It is well known that a regular expression can be converted to an ε-NFA that defines
the same language using Thompson’s approach [38]. An ε-NFA can be reduced to an ε-
free NFA through an ε-closure mechanism [38]. In this chapter, we extend Thompson’s
algorithm in a way such that it can convert a regular expression containing capturing
groups to a tagged ε-NFA defining the same language. A tagged ε-NFA can be described
by a 7-tuple A = (Q, Σ, T, δ, γ, S, F), where Q is a finite set of states, Σ is a finite set
of input symbols, T is a finite set of tags, each of which represents a capturing group, S is
a set of start states, F is a set of accept states, δ is the transition function, and γ is a
tag output function γ : Q × Σ × Q → 2^T, which associates each transition with a tag
set (that can be empty).
A tagged NFA is constructed starting from the three base cases shown in Figure 4.1.
Figure 4.1(a) is the NFA of expression ε, Figure 4.1(b) handles the empty regular
expression, and Figure 4.1(c) gives the NFA of a single symbol a with a set of tags
τ ∈ 2^T corresponding to the capturing groups associated with the illustrated transition.
More complex tagged NFAs can be constructed using the union, concatenation, and
closure constructs, by combining smaller tagged NFAs as shown in Figure 4.2. A tagged
NFA constructed using the above approach contains ε transitions. Such a tagged NFA
can be converted to an ε-free NFA in a manner akin to the standard ε-closure algorithm
for NFAs. We denote the corresponding ε-free tagged NFA as
A1 = (Q1, Σ, T, δ1, γ1, S1, F1), where the components of A1 are defined in a manner
akin to those of A (the tagged ε-NFA).

Figure 4.1: Constructing tagged NFAs for (a) NFA of ε; (b) NFA of an empty regular
expression; (c) NFA of a symbol a wrapped by capturing groups denoted by τ.

Figure 4.2: The (a) union R|S, (b) concatenation RS, and (c) closure R∗ constructs of
tagged NFA construction from a regular expression.
Example Consider an example regular expression “(a*)aa”. Figure 4.3 shows the
tagged ε-NFA, where the capturing group is tagged by t1. Figure 4.4 shows the
corresponding ε-free tagged NFA.

Figure 4.3: The tagged ε-NFA of “(a*)aa”, where the transition associated with the
leftmost character a is tagged by t1 because “a*” is within a capturing group.

Figure 4.4: The tagged ε-free NFA of “(a*)aa” after ε-elimination, where state numbers
1, 2, and 3 are obtained by renaming and merging states 2, 5, 7, and 8 in the ε-NFA
during ε-closure calculation.
4.2.3 Operations on Tagged NFAs
The transition function δ1 and tag output function γ1 of a tagged ε-free NFA can be
represented by a four-column table denoted by ∆(x, i, y, τ), which is a set of quadruples
(x, i, y, τ) such that there is a transition labeled by input symbol i from state x to state
y with a set of output tags τ . Table 4.1 shows the tagged transition table of the example
NFA in Figure 4.4, where each tagged transition is represented by a row in the table.
∆(x, i, y, τ) allows us to perform two key operations on tagged NFAs — match test and
submatch extraction, where the match test checks whether an input string is accepted
by a tagged NFA; if so, the submatch extraction procedure returns a valid assignment
of submatches of the input string.
x    i    y    τ
1    a    1    {t1}
1    a    2    ∅
2    a    3    ∅
Table 4.1: Transition table of the tagged NFA in Figure 4.4.

Frontier sequence for input a a a a: {1}, {1, 2}, {1, 2, 3}, {1, 2, 3}, {1, 2, 3}
Figure 4.5: Example of frontier derivation for the tagged NFA in Figure 4.4 with input
string “aaaa”. The dark circles in each column stand for the frontier states after
consuming an input symbol. A light gray circle means that a state is not in the frontier
state set. An arrow between two circles represents a transition. An arrow is labeled by
a submatch tag if the denoted transition is within a capturing group.

Match Test
Testing whether an input string matches a regular expression with capturing groups can
be done by operating its tagged NFA. The process is similar to operating a traditional
NFA, except that we need to do bookkeeping to be used for submatch extraction. The
match test of a tagged NFA for a given input string a1a2 . . . al ∈ Σ∗ is performed by
consuming one input symbol at a time, and modifying the frontier of active states
appropriately using the transition function δ1. As we modify the frontier, we also
record the transitions that the tagged-NFA makes by recording quadruples that store
the states traversed by each transition, as well as the tags corresponding to those
transitions. We denote these sets of transitions by ∆1, ∆2, . . . , ∆l, where each ∆i
is a set of quadruples of the form (x, i, y, τ) corresponding to a source state, an input
symbol, a target state, and the corresponding tag set.
After the last input symbol al is consumed, we check whether any state in the
frontier set belongs to accept states F1. If so, the input string a1a2 . . . al is accepted by
the tagged NFA A1, i.e., the input string matches the regular expression defined by A1.
Example Consider the example regular expression “(a*)aa” in Figure 4.4, whose
tagged transitions ∆(x, i, y, τ) are shown in Table 4.1. For convenience, we denote the
three quadruples in Table 4.1 by row1, row2, and row3. Let us use “aaaa” as an input
string. For the ith input symbol, we use Xi to denote the current frontier set and Yi
to denote the next frontier set after the symbol is consumed. Starting from the first input
symbol a and start states S1 = {1}, we have ∆1 = {row1, row2} and Y1 = {1, 2}. Renaming
Y1 to X2 and repeating the frontier derivation, we obtain
∆2 = {row1, row2, row3}, X3 = {1, 2, 3}, ∆3 = {row1, row2, row3}, X4 = {1, 2, 3}, and
∆4 = {row1, row2, row3}, X5 = {1, 2, 3}. Figure 4.5 visualizes how the frontier set
evolves after consuming each input symbol during the match test. An arrow between
two nodes denotes a transition. If an arrow is tagged, the transition is
associated with one or more submatch tags, e.g., the {t1} above the arrow between
states 1 and 1 indicates that this transition is within a capturing group.
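
The match test can be sketched in a few lines, assuming the transition table is stored as a plain list of quadruples. The names `match_test`, `DELTA`, `ROW1` and so on are illustrative and do not come from the dissertation's toolchain. The function returns the acceptance verdict, the sequence of frontiers, and the recorded sets ∆1, . . . , ∆l.

```python
# Table 4.1: quadruples (x, i, y, tau) of the epsilon-free tagged NFA for "(a*)aa".
ROW1 = (1, "a", 1, frozenset({"t1"}))
ROW2 = (1, "a", 2, frozenset())
ROW3 = (2, "a", 3, frozenset())
DELTA = [ROW1, ROW2, ROW3]
START, ACCEPT = {1}, {3}

def match_test(delta, start, accept, text):
    """Run the frontier derivation; return (accepted, frontiers, recorded Delta_i)."""
    frontier = set(start)
    frontiers = [set(frontier)]
    recorded = []
    for symbol in text:
        # Transitions leaving the current frontier on this symbol (the set Delta_i).
        taken = [q for q in delta if q[0] in frontier and q[1] == symbol]
        recorded.append(taken)
        frontier = {q[2] for q in taken}          # target states form the next frontier
        frontiers.append(set(frontier))
    return bool(frontier & set(accept)), frontiers, recorded

accepted, frontiers, recorded = match_test(DELTA, START, ACCEPT, "aaaa")
```

On the input “aaaa” this reproduces the frontiers {1}, {1, 2}, {1, 2, 3}, {1, 2, 3}, {1, 2, 3} of the example, and the string is accepted because state 3 is in the final frontier.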
Submatch Extraction
If an input string is accepted by a regular expression that has capturing groups, the
submatches of the input string need to be extracted. Recall that the NFA match
test process described above actually considers all possible branches (transitions) when
consuming each input symbol. If the input string is accepted by the tagged NFA, then
there exists at least one path from a start state to an accept state, where edges of
the path denote transitions between states and are sequentially associated with the
individual symbols of the input string. An edge may be associated with one or more
submatch tags, or no tag at all. For example, the bold arrows in Figure 4.5 show a path
from start state 1 to the accept state 3. Such a path allows us to perform submatch
extraction.
In fact, any path from a start state to an accept state during a match test on a tagged
NFA generates a valid assignment of submatches. A review of the match test process
helps us understand why: since a path from a start state to an accept state is
a sequence of valid operations of a tagged NFA on an input string, the
assignment of submatch tags to each input symbol is also valid. Collecting, for each tag,
the last consecutive sequence of symbols associated with that tag, as required by the
conditions of the definition in Section 4.2.2, yields a valid assignment of submatches.
Path Finding Assume an input string a1a2 . . . al is accepted by a tagged NFA A1,
and qf is an accept state after consuming the last symbol al. We present a backward
traversal approach to find a path that allows for submatch extraction. Starting from
one of the accept states qf , with the last input symbol al, perform a lookup on ∆l for
quadruples (x, i, y, τ) such that y = qf and i = al. Pick any quadruple (ql, al, qf , τ) that
satisfies this condition, then ql is a previous state that leads the automaton to qf with
the last input symbol al, and τ is the corresponding submatch tags associated with al.
We note that τ can be empty. Using ql, with input symbol al−1, perform a lookup on
∆l−1 for quadruples (x, i, y, τ) such that y = ql and i = al−1. Such quadruples will
allow us to find a previous state of ql with input symbol al−1, along with submatch
tags associated with al−1 if there are any. Continue this process for al−2, . . . , a1.
Finally, we reach a start state q1. Then q1, q2, . . . , ql, qf is a valid traversal path for
input string a1a2 . . . al. During the backward path finding, each symbol in a1a2 . . . al is
assigned zero or more submatch tags. Submatches of an accepted input string can
be extracted by scanning the input string and collecting the last consecutive sequence
of symbols associated with the same submatch tags. Given a regular expression and a
matching string, there might exist multiple paths from a start state to an accept state.
Thus, there might exist multiple ways to assign valid submatches.
Example Figure 4.5 shows a traversal path of input string “aaaa” on the tagged NFA
shown in Figure 4.4. The path is marked by bold arrows. Along this path, we can
see that the first two symbols of “aaaa” are associated with tag t1, and the last two
symbols have no submatch tag. Thus, the submatch of “aaaa” for regular expression
“(a*)aa” is the substring of the first two symbols, i.e., “aa”.
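
The backward traversal can be sketched as follows, again assuming list-based transition sets. The helper names `record_transitions`, `find_path`, and `extract` are hypothetical. "Pick any quadruple" is realized here by taking the first matching one, which is one of several valid choices.

```python
DELTA = [(1, "a", 1, frozenset({"t1"})), (1, "a", 2, frozenset()), (2, "a", 3, frozenset())]

def record_transitions(delta, start, text):
    """Forward pass: return the final frontier and the recorded sets Delta_1..Delta_l."""
    frontier, recorded = set(start), []
    for symbol in text:
        taken = [q for q in delta if q[0] in frontier and q[1] == symbol]
        recorded.append(taken)
        frontier = {q[2] for q in taken}
    return frontier, recorded

def find_path(recorded, text, accept_state):
    """Backward pass: return the states q1..qf and the tag set assigned to each symbol."""
    state, states, tags = accept_state, [accept_state], []
    for symbol, taken in zip(reversed(text), reversed(recorded)):
        # Pick any recorded quadruple whose target is the current state.
        x, _, _, tau = next(q for q in taken if q[2] == state and q[1] == symbol)
        states.append(x)
        tags.append(tau)
        state = x
    return states[::-1], tags[::-1]

def extract(text, tags, all_tags):
    """Collect, for each tag, the last consecutive run of symbols carrying it."""
    sub, prev = {t: None for t in all_tags}, frozenset()
    for ch, tau in zip(text, tags):
        for t in tau:
            sub[t] = sub[t] + ch if t in prev else ch
        prev = tau
    return sub

text = "aaaa"
frontier, recorded = record_transitions(DELTA, {1}, text)
path, tags = find_path(recorded, text, 3)
sub = extract(text, tags, ["t1"])
```

For “aaaa” the recovered path is 1, 1, 1, 2, 3, the first two symbols carry t1, and sub(t1) is “aa”, matching the example above.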
The match test and submatch extraction algorithms described in Section 4.2.3 are
space efficient since the construction is based on NFA. However, they are not time
efficient. During the match test, the number of states in a frontier is O(|Q1|) (the size
of NFA). To derive the next frontier, states in the current frontier need to be processed
one by one. Thus, the number of lookups at the transition table during the match test
for an input string of length l is O(|Q1| × l). Similarly, the number of table lookups
performed during submatch extraction can be estimated as O(|Q1| × l). If we can find
an approach that allows us to derive frontiers (in match test) and previous states (in
submatch extraction) more efficiently, then the time efficiency of the algorithms can be
improved. Fortunately, we already have the data structures to do that. Our approach
is to represent tagged NFAs as Boolean functions, to perform match testing and submatch
extraction with operations on these functions (Section 4.2.4), and to manipulate the
functions using ordered binary decision diagrams (OBDDs) (Section 4.2.5).
4.2.4 Boolean Function Representation
For convenience, we discuss tagged NFAs in which ε transitions have been eliminated.
The Boolean function of a tagged NFA A1 = (Q1, Σ, T, δ1, γ1, S1, F1) uses four vectors
of Boolean variables, x, y, i, and t. Vectors x and y are used to denote states in Q1,
and they contain ⌈lg |Q1|⌉ Boolean variables each. Vector i is used to denote symbols in
Σ and it contains ⌈lg |Σ|⌉ Boolean variables. Vector t is used to denote submatch tags
and it contains ⌈|T|⌉ Boolean variables. We construct the following Boolean functions
for the tagged NFA A1.
• ∆(x, i,y, t) denotes the tagged transition table of A1. It is a disjunction of all
tagged transition relations (x, i, y, t). As an example, the Boolean encoding of
transition relations in Table 4.1 is shown in Table 4.2, where states are encoded
by two bits, input symbol is encoded by one bit (since there is only one symbol
‘a’), and submatch tags are encoded by one bit. Specifically, states 1, 2, and 3
are encoded as 01, 10, and 11; symbol ‘a’ is encoded as 1; and submatch tag t1
is encoded as 1. The fifth column of Table 4.2 lists the function values for each
set of Boolean encodings. The function value of Boolean encodings for tagged
transitions is 1. The Boolean encoding in Table 4.2 can be symbolically translated to
    ∆(x, i, y, t) = (¬x1 ∧ x2 ∧ i ∧ ¬y1 ∧ y2 ∧ t1)
                  ∨ (¬x1 ∧ x2 ∧ i ∧ y1 ∧ ¬y2 ∧ ¬t1)
                  ∨ (x1 ∧ ¬x2 ∧ i ∧ y1 ∧ y2 ∧ ¬t1)
where ¬ denotes negation. If we rename variables i, y1, y2, and t1 to x3, x4, x5, and x6
respectively, ∆(x, i, y, t) can be represented by the OBDD shown in Figure 4.6.

x     i    y     t    ∆(x, i, y, t)
01    1    01    1    1
01    1    10    0    1
10    1    11    0    1
Table 4.2: Boolean encoding of transitions in Table 4.1.

• Iσ(i) stands for the Boolean representation of symbols in Σ. As an example,
symbol ‘a’ in Table 4.1 can be symbolically represented by Ia(i) = i.
• F(x) is a Boolean function representing frontier states. In the tagged NFA shown
in Figure 4.4, consider frontier {1} with input symbol ‘a’: the new frontier has two
states {1, 2}, which can be symbolically represented by F(x) = (¬x1 ∧ x2) ∨ (x1 ∧ ¬x2).
• ∆F (x,i,y,t) is used to represent the intermediate transitions for frontier F(x)
during a match test process.
• A(x) is used to define the Boolean representation of accept states of a tagged
NFA. For the tagged NFA shown in Figure 4.4, the set of accept states is {3}; thus,
A(x) = x1 ∧ x2.
The Boolean functions described above can be automatically computed for any
tagged NFA. We next describe how to perform the match test and submatch extraction
described in Section 4.2.3 using these Boolean functions.
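
As a hedged illustration of the encoding step, the sketch below rebuilds Table 4.2 from Table 4.1 using the bit assignments stated above (states 1, 2, 3 as 01, 10, 11; symbol ‘a’ as 1; tag t1 as 1). The function names are illustrative. It also checks that the table-driven function agrees with the symbolic disjunction of three conjunctions on all 64 assignments.

```python
from itertools import product

STATE_BITS = {1: (0, 1), 2: (1, 0), 3: (1, 1)}      # states 1, 2, 3 -> 01, 10, 11
SYMBOL_BIT = {"a": 1}
# Table 4.1 rows; the last field says whether the transition carries tag t1.
TABLE_4_1 = [(1, "a", 1, True), (1, "a", 2, False), (2, "a", 3, False)]

def encode(row):
    """Encode one quadruple as the bit vector (x1, x2, i, y1, y2, t1)."""
    x, i, y, tagged = row
    return STATE_BITS[x] + (SYMBOL_BIT[i],) + STATE_BITS[y] + (int(tagged),)

MINTERMS = {encode(r) for r in TABLE_4_1}

def delta(x1, x2, i, y1, y2, t1):
    """Delta(x, i, y, t): 1 exactly on the encoded tagged transitions."""
    return int((x1, x2, i, y1, y2, t1) in MINTERMS)

def delta_formula(x1, x2, i, y1, y2, t1):
    """The same function written as a disjunction of three conjunctions."""
    return int(bool(
        (not x1 and x2 and i and not y1 and y2 and t1)
        or (not x1 and x2 and i and y1 and not y2 and not t1)
        or (x1 and not x2 and i and y1 and y2 and not t1)
    ))

truth_rows = [bits for bits in product((0, 1), repeat=6) if delta(*bits)]
agree = all(delta(*b) == delta_formula(*b) for b in product((0, 1), repeat=6))
```

Here `truth_rows` lists exactly the three rows of Table 4.2, and `agree` confirms that the formula and the table define the same function.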
Match Test
The match test process is similar to that described in Chapter 3, except that we do
book-keeping here to be used for submatch extraction. Suppose the frontier of a tagged
NFA is F(x) at some instant of frontier derivation, and the next input symbol is σ.
Then the next frontier states can be computed using the following Boolean operations:
    G(y) = ∃x · ∃i · ∃t · [∆F(x, i, y, t)]                  (4.1)
where
    ∆F(x, i, y, t) = F(x) ∧ Iσ(i) ∧ ∆(x, i, y, t)           (4.2)

Figure 4.6: The ordered binary decision diagram of the Boolean function
f(x1, x2, x3, x4, x5, x6) = (¬x1 ∧ x2 ∧ x3 ∧ ¬x4 ∧ x5 ∧ x6) ∨ (¬x1 ∧ x2 ∧ x3 ∧ x4 ∧ ¬x5 ∧ ¬x6)
∨ (x1 ∧ ¬x2 ∧ x3 ∧ x4 ∧ x5 ∧ ¬x6) with ordering x1 ≺ x2 ≺ x3 ≺ x4 ≺ x5 ≺ x6.

We now explain why Equation (4.1) produces the new frontier states. Recall that
∆(x, i, y, t) is the disjunction of the tagged transitions of an NFA. The conjunction of
∆(x, i, y, t) with F(x) and Iσ(i) on the right side of Equation (4.2) selects the rows
in the truth table of ∆(x, i, y, t) that correspond to outgoing transitions from the states
in the current frontier F(x) labeled with symbol σ. These transitions are denoted by
∆F(x, i, y, t), which is a function of x, i, y, and t. The new frontier states are the target
states of the selected transitions and are only associated with y. To extract the new
frontier states, we existentially quantify x, i, and t using the existential quantification
operator introduced in Chapter 2. We rename y to x to express the new frontier states
in terms of x.
Consider the tagged NFA in Figure 4.4. Suppose the current frontier is {1} and the
next input symbol is ‘a’. Then
    F(x) ∧ Ia(i) ∧ ∆(x, i, y, t) = (¬x1 ∧ x2 ∧ i ∧ ¬y1 ∧ y2 ∧ t1)
                                 ∨ (¬x1 ∧ x2 ∧ i ∧ y1 ∧ ¬y2 ∧ ¬t1)
Applying existential quantification of x, i, and t to the above disjunction, we obtain
(¬y1 ∧ y2) ∨ (y1 ∧ ¬y2), which is the symbolic Boolean representation of the new frontier
states {1, 2}.
To check whether the automaton is in an accept state, simply check the satisfiability
of the conjunction of F(x) and A(x). Rename the above example frontier
(¬y1 ∧ y2) ∨ (y1 ∧ ¬y2) to (¬x1 ∧ x2) ∨ (x1 ∧ ¬x2) and take its conjunction with
A(x) = x1 ∧ x2. The result is not satisfiable; thus, the automaton is not in an accept state.
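
The following sketch mimics Equations (4.1) and (4.2) without an OBDD library. As a stand-in, a Boolean function is represented by its set of satisfying assignments, so conjunction becomes set intersection and existential quantification becomes projection. This representation and the helper names `where`, `exists`, and `step` are assumptions made for illustration; an OBDD package would perform the same operations on a compressed graph.

```python
from itertools import product

VARS = ("x1", "x2", "i", "y1", "y2", "t1")
ALL = set(product((0, 1), repeat=len(VARS)))

def where(**fixed):
    """Satisfying assignments of a conjunction of literals, e.g. where(x1=0, x2=1)."""
    return {a for a in ALL if all(a[VARS.index(v)] == b for v, b in fixed.items())}

def exists(f, names):
    """Existentially quantify `names` by projecting each assignment onto the other variables."""
    keep = [k for k, v in enumerate(VARS) if v not in names]
    return {tuple(a[k] for k in keep) for a in f}

# Delta(x, i, y, t) from Table 4.2, one conjunction per tagged transition.
DELTA = (where(x1=0, x2=1, i=1, y1=0, y2=1, t1=1)
         | where(x1=0, x2=1, i=1, y1=1, y2=0, t1=0)
         | where(x1=1, x2=0, i=1, y1=1, y2=1, t1=0))
I_A = where(i=1)                  # I_a(i) = i
ACCEPT = {(1, 1)}                 # A(x) = x1 AND x2, stored as (x1, x2) pairs

def step(frontier, symbol_fn):
    """One frontier derivation; `frontier` is a set of (x1, x2) state encodings."""
    f = {a for a in ALL if (a[0], a[1]) in frontier}     # F(x) over all six variables
    delta_f = f & symbol_fn & DELTA                      # Equation (4.2)
    g = exists(delta_f, {"x1", "x2", "i", "t1"})         # Equation (4.1), over (y1, y2)
    return g, delta_f                                    # reading g as x renames y to x

frontier, delta_f = step({(0, 1)}, I_A)
in_accept_state = bool(frontier & ACCEPT)
```

Starting from frontier {1} (encoded 01), the result is {01, 10}, that is, states {1, 2}, and the intersection with the accept set is empty, as in the worked example.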
Submatch Extraction
Now, we discuss how to extract submatches using Boolean function operations. The
process starts from the last symbol and one of the states where the input string is
accepted. For convenience, we call the current state of a backward path finding a
reverse frontier, which contains only one state because we are only interested in finding
one path. Suppose at an instant of the path finding the reverse frontier representation
is Fr(y), and the previous input symbol is σ. A previous state that leads the automaton
to Fr(y) can be derived from the following Boolean function:
∆r(x, i,y, t) = Fr(y) ∧ Iσ(i) ∧∆F (x, i,y, t) (4.3)
where ∆F(x, i, y, t) denotes the intermediate tagged transitions corresponding to
symbol σ during the match test process. The conjunction on the right side of Equation (4.3)
selects the tagged transitions (labeled by σ) from ∆F(x, i, y, t) whose target state is
Fr(y). The previous states are associated with x in ∆r(x, i, y, t). Since we are only
interested in one path, we simply pick one row in the truth table of ∆r(x, i, y, t) to find
one previous state of Fr(y). If we denote the picked row as PickOne(∆r(x, i, y, t)), a
previous state G(x) of Fr(y) can be derived by
    G(x) = ∃y · ∃i · ∃t · H(x, i, y, t)                     (4.4)
    H(x, i, y, t) = PickOne(∆r(x, i, y, t))                 (4.5)
To obtain the submatch tags τ(t) associated with σ, we existentially quantify x, i, and y
on H(x, i, y, t):
    τ(t) = ∃x · ∃i · ∃y · H(x, i, y, t)                     (4.6)
Consider the example in Figure 4.5. After consuming the fourth input symbol of
“aaaa”, the automaton accepts and
    ∆F(x, i, y, t) = (¬x1 ∧ x2 ∧ i ∧ ¬y1 ∧ y2 ∧ t1)
                   ∨ (¬x1 ∧ x2 ∧ i ∧ y1 ∧ ¬y2 ∧ ¬t1)
                   ∨ (x1 ∧ ¬x2 ∧ i ∧ y1 ∧ y2 ∧ ¬t1)
Starting from the accept state 3 (Fr(y) = y1 ∧ y2) and the last symbol ‘a’ (Ia(i) = i),
the conjunction in Equation (4.3) gives ∆r(x, i, y, t) = (x1 ∧ ¬x2 ∧ i ∧ y1 ∧ y2 ∧ ¬t1),
which has only one tagged transition. Performing the existential quantification of
Equations (4.4) and (4.5), we obtain the Boolean representation of a previous state
as x1 ∧ ¬x2, which translates to state 2. Performing the existential quantification of
Equation (4.6), we get τ(t) = ¬t1, which means that no tag is associated with the fourth
symbol ‘a’. Applying the same approach to the 3rd, 2nd, and 1st symbols, we obtain
a path from state 1 to state 3, where the 1st and 2nd symbols ‘a’ are assigned submatch
tag t1. Thus, the submatch of “aaaa” for “(a*)aa” is sub(t1) = “aa”.
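
Equations (4.3) to (4.6) can be sketched with the same stand-in representation, in which a Boolean function is the set of its satisfying assignments. This is an illustrative assumption, not the OBDD implementation; the names `where`, `exists`, `pick_one`, and `step_back` are hypothetical, and `pick_one` takes the smallest assignment only to make the choice deterministic.

```python
from itertools import product

VARS = ("x1", "x2", "i", "y1", "y2", "t1")
ALL = set(product((0, 1), repeat=len(VARS)))

def where(**fixed):
    """Satisfying assignments of a conjunction of literals."""
    return {a for a in ALL if all(a[VARS.index(v)] == b for v, b in fixed.items())}

def exists(f, names):
    """Existential quantification as projection onto the remaining variables."""
    keep = [k for k, v in enumerate(VARS) if v not in names]
    return {tuple(a[k] for k in keep) for a in f}

ROW1 = where(x1=0, x2=1, i=1, y1=0, y2=1, t1=1)
ROW2 = where(x1=0, x2=1, i=1, y1=1, y2=0, t1=0)
ROW3 = where(x1=1, x2=0, i=1, y1=1, y2=1, t1=0)
# Intermediate transitions Delta_F recorded for the four symbols of "aaaa".
RECORDED = [ROW1 | ROW2, ROW1 | ROW2 | ROW3, ROW1 | ROW2 | ROW3, ROW1 | ROW2 | ROW3]
I_A = where(i=1)

def pick_one(f):
    """PickOne: keep a single satisfying assignment (any one would do)."""
    return {min(f)}

def step_back(fr_y, symbol_fn, delta_f):
    """Return a previous state (x1, x2) and the tag bits (t1,) for the symbol."""
    delta_r = where(y1=fr_y[0], y2=fr_y[1]) & symbol_fn & delta_f    # Equation (4.3)
    h = pick_one(delta_r)                                            # Equation (4.5)
    (prev_state,) = exists(h, {"i", "y1", "y2", "t1"})               # Equation (4.4)
    (tag,) = exists(h, {"x1", "x2", "i", "y1", "y2"})                # Equation (4.6)
    return prev_state, tag

first_prev, first_tag = step_back((1, 1), I_A, RECORDED[-1])

state, tags = (1, 1), []                     # start from accept state 3 (encoded 11)
for delta_f in reversed(RECORDED):
    state, tag = step_back(state, I_A, delta_f)
    tags.append(tag[0])
tags.reverse()
# t1 tags a single run in this example, so joining the tagged symbols gives sub(t1).
submatch = "".join(ch for ch, t in zip("aaaa", tags) if t)
```

The first backward step yields state 2 (encoded 10) with t1 = 0, and the full walk ends in state 1 with tag bits 1, 1, 0, 0, so the submatch is “aa”.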
A submatch assignment obtained by our approach is not necessarily the leftmost,
longest submatch, which is required by POSIX. However, POSIX does not have a notion
of “greedy” and “reluctant” closures, which give some control over the length of the
submatch. Thus, POSIX is incomplete. Widely used regular expression libraries, such as
Java’s and PCRE, have behaviors that are not POSIX compliant.
4.2.5 Submatch-OBDD
To improve the efficiency of the match test and submatch extraction, we represent
and manipulate the Boolean functions defined in Section 4.2.4 using OBDDs. We call our
model Submatch-OBDD. A Submatch-OBDD for a tagged NFA
A1 = (Q1, Σ, T, δ1, γ1, S1, F1) is a 5-tuple [OBDD(∆(x, i, y, t)), {OBDD(Iσ) | ∀σ ∈ Σ},
{OBDD(Tt) | ∀t ∈ T}, OBDD(FS1), OBDD(A)], where ∆(x, i, y, t) is the Boolean
representation of tagged transitions, Iσ is the Boolean representation of a symbol σ ∈ Σ,
Tt is the Boolean representation of a tag t ∈ T, FS1 is the Boolean representation of
start states, and A is the Boolean representation of accept states F1.
To understand why OBDDs can improve the time-efficiency of tagged NFA oper-
ations, consider frontier derivation on a tagged NFA. To derive a new set of frontier
states, the tagged transition table must be retrieved for each state in the current fron-
tier F , leading to O(|F|) operations for each input symbol. On the other hand, the
time-complexity of using OBDDs to derive the next frontier is determined by the two
conjunctions and one existential quantification in Equations (4.1) and (4.2). When
the frontier set F is large, the cost of doing the two conjunctions and one existential
quantification is often smaller than doing |F| lookups on the transition table. Using
the same method, we can calculate that the time complexity of submatch extraction is
the same as that of the match test process. For a tagged NFA with n states, the size of
the frontier set |F| is O(n). Thus, the cost to process an input string of l bytes by our
approach is between O(l) and O(nl). In other words, the time complexity of
Submatch-OBDD lies between that of a pure DFA approach and that of a pure NFA approach.
The space efficiency of Submatch-OBDD is comparable to tagged NFAs. The space
cost of a Submatch-OBDD is dominated by OBDD(∆(x, i, y, t)), which needs a total
of 2 × ⌈lg |Q1|⌉ + ⌈lg |Σ|⌉ + ⌈|T|⌉ Boolean variables. In the worst case, the size of the
OBDD is O(|Q1|^2 × |Σ| × 2^|T|), which is comparable to the size of the transition table
of a tagged NFA. We note that the OBDDs of intermediate transitions ∆F(x, i, y, t) for
all input symbols also take some space, mainly depending on the size of the input string.
We will show in Section 4.3 that such a cost is not a concern in practice.
4.2.6 Implementation
We implemented Submatch-OBDD as a toolchain in C++. The toolchain has two offline
components, Re2Tnfa and Tnfa2Obdd, and one online component, PatternMatch.
Re2Tnfa accepts patterns as input and outputs tagged NFAs that define the
same languages as the input patterns. Tnfa2Obdd generates the tagged NFAs’ OBDD
representations. PatternMatch performs the match test and submatch extraction on an
input stream using the OBDD representations. Our implementation interfaces with the
popular CUDD library [79] for OBDD construction and manipulation.
In comparison, both PCRE and RE2 are implemented in C++. PCRE uses a recursive
backtracking approach. RE2 uses a combination of DFAs and NFAs for submatch
extraction: Given a pattern and an input string, RE2 constructs and uses backward
and forward DFAs to locate the pattern’s overall match in the input string. It then uses
NFA-based approaches to find submatches in the overall match. For memory efficiency,
RE2 does not construct entire DFAs. It creates DFA states on demand (determinization
on the fly) and stores them in a cache of limited size; when the cache fills up, RE2
empties the cache and restarts the DFA construction process.
4.3 Evaluation
We evaluated the performance of our Submatch-OBDD implementation using patterns
used in real systems. We measured Submatch-OBDD’s time efficiency and space effi-
ciency by matching the patterns with network traces, synthetic traces, and enterprise
event logs, and then compared our performance with two popular regular expression
engines: RE2 and PCRE. Our findings suggest that Submatch-OBDD achieves its best
performance when patterns are combined. In the best case, Submatch-OBDD is faster
than RE2 and PCRE by one to two orders of magnitude. All the performance numbers
of Submatch-OBDD reported in this section were obtained using the variable
ordering i ≺ x ≺ y ≺ t.
4.3.1 Data Sets
We used the following three sets of patterns and trace files to evaluate the performance
of our approach:
Snort-2009
We extracted 115 patterns from a Snort 2009 HTTP rule set of 3078 patterns. All
patterns were extracted from the pcre fields of the rules. Since our focus is submatch
extraction, we excluded patterns containing no capturing groups and patterns contain-
ing back references as patterns with back references cannot be represented by regular
languages. Each extracted pattern contains one to six capturing groups.
We used two network traces and one synthetic trace to evaluate the performance of
our approach on the Snort-2009 pattern set.
• The first web trace was 1.2 GB of network traffic collected using tcpdump from our
department’s web server. The average packet size of this trace is 126 bytes with
a standard deviation of 271. The second web trace was 1.3 GB of network traffic
collected by crawling URLs that appeared on Twitter using a Python script and
recording the full-length packets using tcpdump. The average packet size of the
second trace is 1202 bytes with a standard deviation of 472.
• We also created a synthetic trace to observe how different implementations per-
form under the backtracking algorithmic complexity attack [75]. By reviewing
the 115 patterns of the Snort-2009 pattern set, we found that several of them are
vulnerable to the backtracking algorithmic complexity attack if a regular expres-
sion engine is implemented by backtracking, e.g., PCRE. We then crafted a 1MB
trace that can exploit the backtracking behavior of a backtracking-based pattern
matching engine. The average line length of the trace is 311 bytes with a standard
deviation of 5.
Snort-2012
We also evaluated our approach with the latest rules from the Snort system. We ex-
tracted 403 patterns (regular expressions with capturing groups) from a snapshot of
the Snort-2012 HTTP rule set containing 3990 rules. All patterns were extracted from
the pcre fields of the rules. As with the Snort-2009 patterns, we excluded patterns
containing back references as they cannot be represented by regular languages.
Patterns containing no capturing group were also excluded as our focus was on submatch
extraction. Each extracted pattern has one to ten capturing groups.
We used two web traces and one synthetic trace to evaluate the performance of
different approaches on this pattern set. The two web traces are the same as those
used in the Snort-2009 pattern set evaluation. The synthetic trace was created after
reviewing the 403 patterns: We found that several of the 403 patterns are vulnerable
to backtracking algorithmic attacks. We then crafted a 1MB trace that can exploit
the backtracking behavior of a backtracking-based pattern matching engine and evalu-
ated its effects on Submatch-OBDD, RE2, and PCRE. The average line length of this
synthetic trace is 689 bytes with a standard deviation of 41.
Firewall-504
We also obtained a set of 504 patterns used by a commercial SIEM system C to nor-
malize logs generated by a commercial firewall, F . For commercial reasons, we do not
disclose the names of the SIEM system and the firewall. Each pattern in the set has
1-22 capturing groups. We collected 87 MB of firewall logs generated by F in an
enterprise setting and measured our performance on the logs. The logs consist of 1.01
million lines of text and the average line size is 87 bytes with a standard deviation of 51.
We did not create a synthetic trace for this pattern set as firewall logs cannot easily be
controlled by an attacker.
4.3.2 Experimental Setup
We conducted our experiments on an Intel Core2 Duo E7500 Linux-2.6.3 machine run-
ning at 2.93 GHz with 2 GB of RAM. We measure the time efficiency of different
approaches in the average number of CPU cycles needed to process one byte of a trace
file. We only measure pattern matching and submatch extraction time, and exclude
pattern compilation time. Similarly, we measure memory efficiency in megabytes (MB)
of RAM used during pattern matching and submatch extraction.
We measure the performance of each approach on a pattern set in two configurations.
In one configuration, Conf.S, we match each pattern with the input stream sequentially.
For example, we match each pattern in the Snort-2009 set with each packet in the
network traces. Combining all patterns of a pattern set into one single pattern, however,
allows us to match each packet with all patterns in one pass. This configuration, Conf.C,
is also useful in the log normalization process of a SIEM system. The system can match
an event log with all rules in one pass and extract all fields of interest instead of matching
the logs with each rule sequentially.
Given a pattern set with n patterns and an input trace of M bytes, we measured
performance of an approach in the following two configurations.
• Conf.S (Sequential): We compile each pattern individually and then match the
compiled patterns with the trace sequentially. If the ith pattern’s execution time
for the M-byte trace is ti cycles, then the time efficiency of an approach on the
pattern set is (t1 + ··· + tn)/M cycles/byte.
• Conf.C (Combination): We combine the n patterns into one pattern
using the Union operation. We compile the combined pattern and match
it with the input trace. If the combined pattern’s execution time for the M-byte
trace is t cycles, then an approach’s time efficiency on the pattern set is
t/M cycles/byte. When an input string matches a specific pattern in the
combined pattern, Submatch-OBDD emits the submatches, as well as the pattern
that matches the input string.
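
The two metrics can be written out as a small sketch. The function names and the cycle counts below are hypothetical and serve only to show the arithmetic; they are not measurements from the evaluation.

```python
def conf_s_cycles_per_byte(per_pattern_cycles, trace_bytes):
    """Conf.S: (t1 + ... + tn) / M, summing the cycles of n separate passes."""
    return sum(per_pattern_cycles) / trace_bytes

def conf_c_cycles_per_byte(combined_cycles, trace_bytes):
    """Conf.C: t / M for a single pass with the combined pattern."""
    return combined_cycles / trace_bytes

# Hypothetical example: three patterns on a 1,000,000-byte trace.
M = 1_000_000
conf_s = conf_s_cycles_per_byte([4_000_000, 6_000_000, 10_000_000], M)
conf_c = conf_c_cycles_per_byte(5_000_000, M)
```

In Conf.S the cost adds up over the n passes, so an approach whose combined pattern stays cheap per byte benefits more from Conf.C as n grows.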
4.3.3 Performance Results
Snort-2009
Table 4.3 shows the execution times (cycles/byte) and memory consumption of RE2,
PCRE, and Submatch-OBDD for the Snort-2009 pattern set on the web traces and
synthetic trace. We have the following observations:
• Submatch-OBDD achieves its ideal performance in Conf.C, i.e., when patterns
are combined together for pattern matching and submatch extraction.
• Submatch-OBDD is the fastest approach among the three. For the web traces,
Submatch-OBDD’s best performance (in Conf.C) is an order of magnitude faster
than the other approaches’ best performance (in Conf.S).
• PCRE suffers from backtracking algorithmic complexity attacks, while Submatch-OBDD
and RE2 do not. With the web traces, the best time efficiency of PCRE
was 3.67 × 10^4 cycles/byte. However, PCRE was slowed down by two orders of
magnitude when the synthetic trace was used, as shown in Table 4.3(b). The reason is
that the synthetic trace caused PCRE to backtrack heavily for some
patterns.
• In Conf.C, the memory consumption of Submatch-OBDD and RE2 is comparable,
while PCRE consumes the least memory. We do not report the memory
requirements in Conf.S as the three approaches use very little memory for simple
patterns.
We note that in Conf.S, RE2 is faster than Submatch-OBDD. This is because many
patterns did not fill up the DFA state cache and hence did not trigger the DFA re-
construction process. In the case of simple patterns, the cost of OBDD operations,
e.g., frontier derivation and existential quantification, is higher than the cost of several
lookups on NFA transition table because the frontier size is often very small. Thus,
Submatch-OBDD performs slower than RE2 in such situations. The cost of OBDD
operations will be paid off when the frontier size of a tagged-NFA is large.

Method    Conf.S Exec-time    Conf.C Exec-time    Conf.C Memory (MB)
RE2       2.31 × 10^4         1.21 × 10^5         7.3
PCRE      3.67 × 10^4         1.13 × 10^6         1.2
OBDD      8.76 × 10^4         3.63 × 10^3         9.4
(a) Performance numbers with the web traces

Method    Conf.S Exec-time    Conf.C Exec-time    Conf.C Memory (MB)
RE2       8.20 × 10^4         2.22 × 10^5         7.6
PCRE      1.44 × 10^6         1.40 × 10^6         1.0
OBDD      2.12 × 10^5         2.20 × 10^4         7.0
(b) Performance numbers with the synthetic trace

Table 4.3: Execution time (cycles/byte) and memory consumption for the Snort-2009
data set with (a) the web traces and (b) the synthetic trace. In both traces,
Submatch-OBDD’s best execution time (Conf.C) is much shorter than RE2’s and PCRE’s best
execution times (Conf.S).

We recommend that Submatch-OBDD be used in cases where a group of patterns
is combined. The performance boost of Submatch-OBDD is due to
redundancy elimination: the OBDD representation eliminates the redundancy in the
Boolean representation of tagged NFAs.
Snort-2012
Table 4.4 shows the performance of RE2, PCRE, and Submatch-OBDD on the 403
patterns from the Snort-2012 rule set. We have the following observations:
• Submatch-OBDD achieves its ideal time efficiency in Conf.C, i.e., when patterns
are combined together for matching test and submatch extraction.
• For the web traces, Submatch-OBDD is faster than RE2 but slower than PCRE.
For the synthetic trace, Submatch-OBDD is faster than both RE2 and
PCRE.
• As in the Snort-2009 data set, PCRE suffers from the backtracking algorithmic
attack performed by the synthetic trace. PCRE’s time efficiency under the
synthetic trace is two to three orders of magnitude worse than under the web traces.

Method    Conf.S Exec-time    Conf.C Exec-time    Conf.C Memory (MB)
RE2       4.79 × 10^4         2.09 × 10^6         15.0
PCRE      7.70 × 10^4         2.69 × 10^3         1.0
OBDD      3.83 × 10^5         1.08 × 10^4         6.3
(a) Performance numbers with the web traces

Method    Conf.S Exec-time    Conf.C Exec-time    Conf.C Memory (MB)
RE2       2.92 × 10^5         8.21 × 10^6         15.0
PCRE      1.47 × 10^6         7.64 × 10^5         1.0
OBDD      4.70 × 10^5         1.10 × 10^5         15.3
(b) Performance numbers with the synthetic trace

Table 4.4: Execution time (cycles/byte) and memory consumption for the Snort-2012
data set with (a) the web traces and (b) the synthetic trace.

• In Conf.C, the memory consumption of Submatch-OBDD and RE2 is comparable.
Although we observed that PCRE performed better for the web traces in Table 4.4,
we still do not recommend PCRE as the pattern matching engine for a network
intrusion detection system (NIDS). The main reason is that it is easy for attackers
to craft network traffic that performs backtracking algorithmic attacks on PCRE, as was
shown by Smith et al. in [75]. Our experimental results in Table 4.3 and Table 4.4
also demonstrate that PCRE can easily be slowed down by hundreds of times with
carefully crafted synthetic traces.
Firewall-504
Table 4.5 shows the three approaches’ performance on the Firewall-504 data set.
Submatch-OBDD is the fastest approach on this data set. In Conf.C, Submatch-OBDD
is orders of magnitude faster than RE2 and PCRE. Also, Submatch-OBDD’s best
performance (in Conf.C) is 62% faster than RE2’s best performance (in Conf.S).
In memory usage, PCRE is the most compact, and Submatch-OBDD consumes more
memory than RE2 (30.0 MB versus 21.0 MB).
Method   Conf.S Exec-time   Conf.C Exec-time   Conf.C Memory (MB)
RE2      2.04 × 10^5        2.20 × 10^7        21.0
PCRE     6.88 × 10^5        1.60 × 10^6        1.1
OBDD     6.31 × 10^5        1.25 × 10^5        30.0

Table 4.5: Execution time (cycles/byte) and memory consumption for the Firewall-504 data set.
4.3.4 Discussion
During our evaluation, we found a small number of regular expressions from the
Snort 2009 and 2012 rule sets that can cause either PCRE or RE2 to perform poorly.
For example, suppose we use PCRE to match
.*\x2F[^\s]*\.(dat|xml)\?[^\s]*v=[^\s]*t=[^\s]*c=
against the input string “/;/;/;.dat?;.dat?;.dat?;v=;v=;v=;t=;t=;t=;c”. PCRE
then performs O(3 × 3 × 3 × 3) backtracking evaluations before eventually concluding
that the string does not match the pattern. By the same count, the evaluation time
of PCRE grows rapidly, as the fourth power of the number of repetitions, when we
increase the number of repetitions of “/;”, “.dat?”, “v=;”, and “t=;” in the input
string. We observed that when these substrings were repeated 20 times, the
execution time of PCRE for this regular expression was on the order of 10^6
cycles/byte. Details on how to create pathological traces that exploit the
backtracking behavior of PCRE can be found in [75].
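The construction of such an input can be sketched in a few lines of Python. This is only an illustration: the helper name `pathological_input` is ours, and Python's `re` module (itself a backtracking engine) stands in for PCRE, so timings would not reproduce the numbers reported above.

```python
import re

# The pattern quoted in the text; Python's `re` accepts this syntax unchanged.
PATTERN = re.compile(r".*\x2F[^\s]*\.(dat|xml)\?[^\s]*v=[^\s]*t=[^\s]*c=")

def pathological_input(k: int) -> str:
    """Repeat each of the four substrings k times and end with a lone 'c'.

    The trailing 'c' is never followed by '=', so the pattern cannot match and
    a backtracking matcher has to try the combinations of the repeated parts.
    """
    return "/;" * k + ".dat?;" * k + "v=;" * k + "t=;" * k + "c"

print(pathological_input(3))
print(PATTERN.search(pathological_input(3)))
```

With k = 3 the helper produces exactly the input string quoted above, and the search reports no match.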
RE2 can perform poorly when the DFA states of a regular expression blow up. The
blow-up causes RE2’s limited state cache to fill quickly, so RE2 has to empty the
cache and restart the DFA construction. In our experiments, we observed an
individual regular expression from Snort-2009 for which RE2 is an order of
magnitude slower than Submatch-OBDD, which does not suffer from state blow-up
because it is an NFA-based approach. We also found eight patterns from the SIEM
system that cause RE2 to blow up in its DFA construction. For these patterns, RE2
is likewise an order of magnitude slower than Submatch-OBDD. For commercial
reasons, we do not disclose these patterns in the dissertation.
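To give a feel for DFA state blow-up, the following Python sketch counts the DFA states produced by subset construction for the textbook pattern family .*a.{n} over the alphabet {a, b}. This family is our own illustrative choice, not one of the patterns referred to above, and the function name is hypothetical.

```python
def count_dfa_states(n: int) -> int:
    """Count DFA states reachable by subset construction for .*a.{n} over {a, b}.

    NFA states: 0 loops on every symbol, 0 --a--> 1, and i --any--> i + 1 for
    1 <= i <= n; state n + 1 is accepting.
    """
    def step(states: frozenset, symbol: str) -> frozenset:
        nxt = {0}                       # state 0 is always kept by its self-loop
        for s in states:
            if s == 0 and symbol == "a":
                nxt.add(1)
            elif 1 <= s <= n:
                nxt.add(s + 1)
        return frozenset(nxt)

    start = frozenset({0})
    seen, worklist = {start}, [start]
    while worklist:
        current = worklist.pop()
        for symbol in "ab":
            nxt = step(current, symbol)
            if nxt not in seen:
                seen.add(nxt)
                worklist.append(nxt)
    return len(seen)

for n in (3, 7, 11):
    print(n, count_dfa_states(n))
```

The NFA has only n + 2 states, while the number of DFA states doubles with each increment of n, which is the kind of growth that can exhaust a bounded state cache.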
Please note that RE2 and PCRE are mature and popular engines and their code
bases are heavily optimized. We have not devoted significant time to try to optimize
Submatch-OBDD. We believe that Submatch-OBDD’s performance can be further im-
proved with better optimization.
4.4 Related Work
Regular expressions are extensively used to construct attack signatures in NIDS and
to process event logs in SIEM systems. Finite automata are natural representations
for regular expressions, and they demonstrate a time-space tradeoff in pattern match-
ing. Many techniques have been proposed to improve DFAs’ space efficiency: com-
pression [13], determinization on-the-fly [80], building multiple DFAs (MDFA) from
a group of signatures [102], extending DFAs with scratch memory (XFAs) [76, 77],
and constructing DFA variants with hardware implementations [18, 46, 52]. Similarly,
many techniques have been proposed to improve NFAs’ time efficiency: hardware based
parallelism [25, 52, 22, 39, 71] and software based speedup [97, 99]. Hybrid finite
automata [12] combine the benefits of NFAs and DFAs.
Submatch extraction, however, has not received much attention from the research
community. Pike implemented a submatch extraction approach in the sam text ed-
itor [64] using a straightforward modification of Thompson’s NFA simulation [84].
Google’s RE2 tool also uses the modified NFA simulation approach. Laurikari proposed
TNFA, an NFA-based approach for submatch extraction, where an NFA is augmented
with tags to represent capturing groups [47]. Our approach also uses tags, but we asso-
ciate tags with non-ε transitions whereas TNFA associates tags with ε transitions. We
use OBDDs to represent and operate on tagged NFAs to achieve time efficiency.
PCRE and the regular expression libraries in Java, Perl, and Python implement
pattern matching and submatch extraction using recursive backtracking, where an
input string may be scanned multiple times before a match is found. The backtracking
approach has exponential worst-case running time [28]. These tools use
backtracking to efficiently handle back references, a non-regular construct that improves
the pattern language’s expressive power. In contrast, our Submatch-OBDD approach
is an NFA-based technique and does not suffer from exponential running time.
Google’s RE2 is an open source automata based pattern matching tool that sup-
ports submatch extraction [29]. RE2 employs a DFA approach to test whether an
input string matches a pattern. If a pattern contains capturing groups, RE2 uses a
DFA approach to find the pattern’s overall match in an input string and then runs an
NFA approach to extract the submatches in the overall match. Similar to RE2, our
Submatch-OBDD is NFA-based. We, however, use OBDDs to perform NFA operations
and hence improve time efficiency. Submatch-OBDD performs better than RE2 when
patterns are combined. Neither RE2 nor Submatch-OBDD supports back references.
We present an efficient matching algorithm for patterns containing back
references in Chapter 5.
The NFA-OBDD model [97, 99] is the most relevant work to Submatch-OBDD. A
commonality between NFA-OBDD and our Submatch-OBDD is the use of “implicit
state enumeration” by means of OBDDs [27, 85]. NFA-OBDD, however, does not
support submatch extraction.
4.5 Summary
In this chapter, we present Submatch-OBDD, which allows fast submatch extraction in
regular expression-like pattern matching. We propose a new approach to tag capturing
groups in a regular expression, and extend Thompson’s NFA construction approach to
support regular expressions with capturing groups. We present a novel technique to
perform submatch extraction. Our use of OBDDs improves the time efficiency of match
test and submatch extraction. We evaluated our Submatch-OBDD implementation us-
ing patterns used in the Snort NIDS and a commercial SIEM system. Our experiments
on real network traces, synthetic traces, and enterprise event logs show that Submatch-
OBDD achieves its ideal performance when patterns are combined. In the best case,
our approach is faster than RE2 and PCRE by one to two orders of magnitude.
Chapter 5
A New Algorithm for Patterns with Back References
5.1 Introduction
Regular language-based patterns have limited expressive power and cannot describe
some features that appear in network packet payloads. Aside from submatch
extraction, back references are another important feature provided by many pattern
matching tools, e.g., PCRE and the regular expression libraries of Java, Perl, and
Python. Back references are used to identify repeated strings in an input string.
Patterns containing back references describe non-regular languages [29].
To be more specific, patterns with back references are more expressive than regular
languages. For example, suppose we want to match a pair of XML tags and the text
in between. It will be hard to represent this pattern if we are only allowed to use
regular expressions (regular languages) because tags in an XML file may be unknown
beforehand. In this case, using a back reference can easily describe the pattern. For
example, “<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>” can be used to match a pair of
XML tags and the text in between, where the first capturing group (subexpression
within the pair of parentheses) is used to capture an XML tag, and the “\1” denotes
that the captured tag will be reused at the end (before the ‘>’ symbol) of the pattern.
A pattern can have multiple back references, each referring to a different
capturing group. Multiple back references are named sequentially by a ‘\’ followed
by different numbers. For example, three back references can be named “\1”, “\2”,
and “\3”. One back reference can also appear multiple times in a pattern, e.g.,
“([a-c])x\1x\1”. Back references are also employed by modern NIDS to represent
attack signatures. For example, the HTTP rule set of Snort 2012 has 167 patterns
containing back references [78].
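The two example patterns above can be tried with any engine that supports back references. The sketch below uses Python's `re` module purely for illustration; the test strings are our own.

```python
import re

# The XML-tag pattern from the text; \1 must repeat whatever group 1 captured.
xml_pair = re.compile(r"<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>")

print(bool(xml_pair.search("<B>bold text</B>")))   # closing tag equals the opening tag
print(bool(xml_pair.search("<B>bold text</I>")))   # closing tag differs

# One back reference used twice: the same captured letter must recur.
repeated = re.compile(r"([a-c])x\1x\1")
print(bool(repeated.fullmatch("bxbxb")))
print(bool(repeated.fullmatch("bxbxc")))
```

In both cases the string is accepted only when the text matched by the back reference is identical to the text captured earlier, which is what a plain regular expression cannot express when the captured text is not known in advance.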
Since patterns containing back references describe non-regular languages, they
cannot be represented by NFAs or DFAs. Thus, prior approaches based on NFAs or
DFAs cannot be applied to back references. In fact, very little work has been done
on patterns containing back references. As pointed out by Cox in [28], “No one knows
how to implement pattern with back references efficiently, though no one can prove
that it’s impossible either”. Specifically, the matching problem for patterns with
back references is NP-complete [6]. The de facto algorithm for back references is
recursive backtracking. However, recursive backtracking is vulnerable to
algorithmic complexity attacks. For example, the throughput of PCRE quickly
decreases to nearly zero megabytes per second for patterns of the form
“(a?{n})a{n}\1” (n = 5, 10, 15, 20, 25, 30) with input strings of the form a^n
(i.e., ‘a’ repeated n times). In fact, PCRE fails to return correct results for
n ≥ 25.
Can we find an approach that can address back references but resist known algo-
rithmic complexity attacks? In this chapter, we explore the answer to this question.
We present a novel approach to implement pattern matching with back references.
The basic idea of our approach is to convert a back reference problem to a conditional
submatch problem, and represent a conditional submatch problem using an NFA-like
machine. We evaluate the performance of our approach using both synthetic patterns
and patterns from real-world NIDS. Our experimental results show that our approach
resists known algorithmic complexity attacks and is faster than PCRE by three orders
of magnitude for certain types of patterns. For general patterns, our approach is one
order of magnitude slower than PCRE.
The remainder of this chapter is organized as follows. Section 5.2 presents the
design of our algorithm for patterns with back references, and Section 5.3
describes our implementation. Section 5.4 presents the experimental evaluation
of our approach, and Section 5.5 discusses the related work. Section 5.6
summarizes our contribution.
5.2 Design of Algorithm
The basic idea of our approach is to convert a back reference problem to a conditional
submatch problem, and to represent the conditional submatch problem using an
NFA-like machine. Our approach includes two phases: compilation and execution.
During the compilation phase, patterns with back references are compiled to tagged
NFAs subject to some constraints. During the execution phase, pattern matching is
performed by operating the tagged NFAs generated in the compilation phase on input
strings.
5.2.1 Pattern Compilation
We introduce a relax-plus-constrain approach to tackle back references. Relax refers
to re-writing a regular expression with back references into a regular expression that
contains only capturing groups. During the re-writing, each back reference is replaced
by a copy of the capturing group that it refers to. As a result, a back reference and
the capturing group it refers to become a pair of capturing groups in the re-written
regular expression. Constrain refers to making the re-written pattern equivalent to the
original pattern: we add a constraint to the accept condition requiring that the
submatches returned by the capturing group pair be equal. For example, pattern
“(a*)aa\1” can be re-written as “(a*)aa(a*)” with the constraint “$1=$2”, where “$1”
and “$2” denote the submatches captured by the first and second capturing groups.
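The relax step can be sketched as a small source-to-source rewrite. The Python function below is a minimal illustration under simplifying assumptions of our own: capturing groups are not nested, parentheses appear only as group delimiters, and back references use a single digit. The name `relax` and the constraint encoding as pairs of group numbers are hypothetical.

```python
def relax(pattern: str):
    """Replace each back reference \\k by a copy of capturing group k.

    Returns the re-written pattern and a list of (i, j) pairs meaning that
    submatch $i must equal submatch $j in the re-written pattern.
    """
    groups = {}          # original group number -> (group text, new group number)
    out, constraints = [], []
    original_count = new_count = 0
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern) and pattern[i + 1].isdigit():
            text, referred = groups[int(pattern[i + 1])]
            new_count += 1               # the copy becomes a new capturing group
            out.append(text)
            constraints.append((referred, new_count))
            i += 2
        elif c == "\\":
            out.append(pattern[i:i + 2])  # ordinary escape, copied unchanged
            i += 2
        elif c == "(":
            j = pattern.index(")", i)     # assumes no nesting
            original_count += 1
            new_count += 1
            groups[original_count] = (pattern[i:j + 1], new_count)
            out.append(pattern[i:j + 1])
            i = j + 1
        else:
            out.append(c)
            i += 1
    return "".join(out), constraints

print(relax(r"(a*)aa\1"))
print(relax(r"([a-c])x\1x\1"))
```

For the example in the text, the function returns the pattern “(a*)aa(a*)” together with the single constraint pair (1, 2), i.e., “$1=$2”.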
Once a pattern with back references is converted to a pattern of conditional submatch
extraction, we can construct an NFA-like machine to represent the converted pattern
using a Thompson-like construction. The construction process is similar to that of
the tagged NFA described in Chapter 4. As we will see, the difference lies in how the
tagged NFA is operated. Recall that we add an equal-substrings constraint to a
re-written pattern in order to make it equivalent to its original pattern. Thus,
we need a mechanism to maintain the substrings matched by the capturing groups in a
re-written pattern. To do so, a tagged NFA needs a data structure for bookkeeping
of captured substrings. Our data structure associates each state with a pair of
substrings (multiple pairs of substrings are needed if there
Figure 5.1: The tagged NFA constructed from “(a*)aa(a*)”.
are multiple back references). For transitions within a capturing group, we append the
corresponding input symbols to the captured substrings. For transitions that are not
within any capturing group, we simply carry over the captured substrings from state to
state. When a tagged NFA reaches a final state, we check whether the two captured
substrings in a pair are equal. If so, the input string is matched by the tagged NFA.
More formally, we denote a tagged NFA as a 6-tuple (Q, Σ, T, δ, q0, Fin), where Q is
the state set, Σ is the alphabet, T is a tag set, q0 is the start state, Fin is a set of
final states, and δ is a transition function δ : Q × Σ* × Σ* × ··· → Q × Σ* × Σ* × ···
that maps a current state with the captured substrings to a next state with updated
captured substrings.
Let us demonstrate the approach using the example pattern “(a*)aa(a*)”. After
adding tags, the pattern is denoted as “(a∗)t1aa(a∗)t2”, where t1 and t2 are used to label
the two capturing groups. Figure 5.1 shows a tagged NFA constructed from pattern
“(a∗)t1aa(a∗)t2” such that “$1=$2”. It can be observed that transition from state 1 to
itself with input symbol ‘a’ is within the first capturing group, and the transition from
state 3 to itself with input symbol ‘a’ is within the second capturing group. All other
transitions are not in any capturing group.
Similar to traditional NFAs, we can use a transition table to represent the transitions
of a tagged NFA. Instead of three columns, the transition table of a tagged NFA
has five columns: the first three columns are the same as those in a traditional
transition table, the fourth column denotes the tag associated with each transition, and
the fifth column specifies the actions used for maintaining the substrings matched by
capturing groups. In our design, we have three types of actions: new, update, and carry
Current state (x)   Input symbol (i)   Next state (y)   Tag (t)   Action
1                   a                  1                t1        new(t1) or update(t1)
1                   a                  2                φ         carry over(t1)
2                   a                  3                φ         carry over(t1)
3                   a                  3                t2        new(t2) or update(t2)

Table 5.1: Transition table of the tagged NFA in Figure 5.1.
over, where new and update actions are associated with transitions within capturing
groups, and a carry over action is associated with transitions not in any capturing group.
A new action denotes creating a new captured substring using a current input symbol,
and an update action denotes updating a substring by appending an input symbol to
the end of the substring. A carry over action denotes that captured substrings are
copied over from a current state to a next state. Table 5.1 shows the transition table
of the tagged NFA in Figure 5.1.
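One way to hold such a five-column table in memory is sketched below in Python. The layout (a dictionary keyed by state and input symbol) and the three helper functions are our own illustrative choices, not the data structures of the actual implementation; captured substrings are modeled as plain strings.

```python
from collections import defaultdict

# Rows of Table 5.1 as (current state, input symbol, next state, tag, action).
# A tag of None stands for φ, i.e., a transition outside every capturing group.
ROWS = [
    (1, "a", 1, "t1", "new or update"),
    (1, "a", 2, None, "carry over"),
    (2, "a", 3, None, "carry over"),
    (3, "a", 3, "t2", "new or update"),
]

# Index the rows by (current state, input symbol) for the table lookup.
TABLE = defaultdict(list)
for current, symbol, target, tag, action in ROWS:
    TABLE[(current, symbol)].append((target, tag, action))

def new(symbol):
    """Create a captured substring from the current input symbol."""
    return symbol

def update(substring, symbol):
    """Append the current input symbol to a captured substring."""
    return substring + symbol

def carry_over(substrings):
    """Copy the captured substrings unchanged to the next state."""
    return tuple(substrings)

print(TABLE[(1, "a")])
print(update(new("a"), "a"), carry_over(["a", ""]))
```

Looking up state 1 with symbol ‘a’ returns both outgoing transitions, which reflects the nondeterminism of the machine.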
5.2.2 Execution
Frontier Derivation
To allow for the maintenance of captured substrings, we denote an element in a frontier
set by a tuple (x, substr1, substr2, . . . ), where x is a state number, and substr1 and
substr2 are substrings matched by the capturing groups. In general, if there are k back
references, we need a (2k+1)-tuple to represent a frontier element. During a match test
of an input string, the frontier set is initially a singleton set {(q0, “”, “”, ...)} (where
“” denotes that no substring has been captured yet) but may include multiple elements
during the operation of a tagged NFA. For each symbol in the input string, we must
process all elements in a frontier set and find a new set of elements by applying the
transition functions represented by the transition table. Applying a transition function
to a frontier element (s, substr1, substr2, . . . ) and an input symbol includes two steps.
The first step is a table lookup, i.e., given a state s and symbol I(i), retrieve all states
that are reachable from s with symbol I(i). The second step is to apply one or more
actions on the captured substrings associated with state s. In particular, if a transition
is a start of a capturing group, then a new action is applied; if a transition is within
a capturing group, then an update action is applied; and if a transition is not within
any capturing group, then a carry over action is applied by simply copying the
captured substrings (if any) from the current state to the next state. For a pattern
with one back reference, the above frontier derivation process can be expressed by the
following Boolean formula:
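\[
\begin{aligned}
G(y, s) ={} & F_0\bigl(\exists x \cdot \exists i \cdot \exists t \cdot (t = \phi \wedge \Delta_F(x, s, i, y, t))\bigr) \\
{}\vee{} & F_1\bigl(\exists x \cdot \exists i \cdot \exists t \cdot (t = t_1 \wedge \Delta_F(x, s, i, y, t))\bigr) \\
{}\vee{} & F_2\bigl(\exists x \cdot \exists i \cdot \exists t \cdot (t = t_2 \wedge \Delta_F(x, s, i, y, t))\bigr)
\end{aligned}
\]
where
\[
\Delta_F(x, s, i, y, t) = F(x, s) \wedge I_\sigma(i) \wedge \Delta(x, i, y, t) \tag{5.1}
\]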
F(x, s) denotes the current frontier set (s denotes captured substrings), Iσ(i)
denotes an input symbol, and ∆(x, i, y, t) denotes the transition relations of the
tagged NFA. The conjunction in Equation 5.1 selects the rows of the transition table
∆(x, i, y, t) that correspond to outgoing transitions, labeled with symbol σ, from the
states in the current frontier set F(x, s). In G(y, s), the term
t = φ ∧ ∆F(x, s, i, y, t) selects transitions that are not in any capturing group,
t = t1 ∧ ∆F(x, s, i, y, t) selects transitions that are labeled by t1 (first capturing
group), and t = t2 ∧ ∆F(x, s, i, y, t) selects transitions labeled by t2 (second
capturing group). Function F0 denotes a carry over action; function F1 denotes
applying a new or update action to substrings captured by the first capturing group;
and function F2 denotes applying a new or update action to substrings captured by the
second capturing group. Renaming y to x in G(y, s) gives us the new frontier set
G(x, s). The frontier derivation formulae for patterns with multiple back references
are similar, except that more tags ti (i = 1, 2, . . . ) are involved.
Example Consider the example tagged NFA in Figure 5.1 with input string “aaaa”.
Initially, the frontier set is the singleton set {(1, “”, “”)}. For the first input
symbol ‘a’, the transition table in Table 5.1 shows that the next state can be state 1
or state 2. The fifth column of the transition table indicates that the transition from
state 1 to 1 is associated with a new(t1) action, which means we need to create a new
substring for the first capturing group using the current input symbol ‘a’. The transition
from state 1 to 2 is associated with a carry over(t1) action. Since no substring has been
captured in (1, “”, “”), nothing needs to be copied from state 1 to state 2. As a result,
the new frontier set has two elements, i.e., {(1, “a”, “”), (2, “”, “”)}.
Renaming {(1, “a”, “”), (2, “”, “”)} as the current frontier set, with the second
input symbol ‘a’, we obtain the next frontier set {(1, “aa”, “”), (2, “a”, “”),
(3, “”, “”)}. We process the third and fourth input symbols in the same way.
After processing the fourth input symbol ‘a’, the frontier set is {(1, “aaaa”, “”), (2,
“aaa”, “”), (3, “aa”, “”), (3, “a”, “a”), (3, “”, “aa”)}.
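The example can be replayed with an explicit, set-based version of frontier derivation. The Python sketch below is an illustration only: it enumerates frontier elements one by one and models captured substrings as strings, and the function name `derive_frontier` is ours.

```python
# Rows of Table 5.1 as (current state, input symbol, next state, tag number);
# None stands for φ, 1 for t1, and 2 for t2.
TRANSITIONS = [
    (1, "a", 1, 1),
    (1, "a", 2, None),
    (2, "a", 3, None),
    (3, "a", 3, 2),
]

def derive_frontier(frontier, symbol):
    """One frontier-derivation step over a set of (state, substr1, substr2)."""
    next_frontier = set()
    for state, substr1, substr2 in frontier:
        for current, label, target, tag in TRANSITIONS:
            if current != state or label != symbol:
                continue                      # step 1: table lookup
            if tag == 1:                      # step 2: new/update for t1
                next_frontier.add((target, substr1 + symbol, substr2))
            elif tag == 2:                    # new/update for t2
                next_frontier.add((target, substr1, substr2 + symbol))
            else:                             # carry over
                next_frontier.add((target, substr1, substr2))
    return next_frontier

frontier = {(1, "", "")}
for symbol in "aaaa":
    frontier = derive_frontier(frontier, symbol)
print(sorted(frontier))
```

After the four input symbols, the set contains the same five elements as listed in the example.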
Acceptance Checking
The accept condition of a tagged NFA is: at the end of the input string, there exists a
tuple (x, substr1, substr2, . . . ) in the frontier set such that x ∈ Fin is a final state
and substr1 equals substr2 (for patterns with one back reference). If there are k
different back references, we need k pairs of captured substrings, where the two
substrings in each pair are equal. For the example tagged NFA with input string “aaaa”,
there is one element in the frontier set, namely (3, “a”, “a”), that satisfies the
acceptance condition after the fourth input symbol ‘a’ is processed. Therefore, the
input string “aaaa” is accepted by the tagged NFA, which means the input string matches
pattern (a*)aa\1.
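The acceptance check for one back reference amounts to a single pass over the final frontier set. The Python sketch below assumes, as the example implies, that state 3 is the only final state of the tagged NFA in Figure 5.1; the function name `accepts` is ours.

```python
FINAL_STATES = {3}   # final state of the example tagged NFA, as implied by the text

def accepts(frontier, final_states):
    """True if some element is in a final state with equal captured substrings."""
    return any(state in final_states and substr1 == substr2
               for state, substr1, substr2 in frontier)

# Frontier sets from the example: after three and after four symbols of "aaaa".
frontier_after_aaa = {(1, "aaa", ""), (2, "aa", ""), (3, "a", ""), (3, "", "a")}
frontier_after_aaaa = {
    (1, "aaaa", ""), (2, "aaa", ""), (3, "aa", ""), (3, "a", "a"), (3, "", "aa"),
}

print(accepts(frontier_after_aaa, FINAL_STATES))
print(accepts(frontier_after_aaaa, FINAL_STATES))
```

The input “aaa” is rejected because no element in state 3 has equal substrings, whereas “aaaa” is accepted through the element (3, “a”, “a”).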
Remarks: We note that our approach for back references can be employed for
submatch extraction as well. In that case, no constraint needs to be added to the
tagged NFA. One great benefit is that this approach performs pattern matching and
submatch extraction by scanning the input string in a single pass.
5.3 Implementation
We designed and built a toolchain to evaluate the performance of our back reference
approach. The implementation is written in C++. The toolchain has two components:
a compilation component and an execution component, as shown in Figure 5.2. The
Figure 5.2: The toolchain of our back reference implementation.
compilation component reads patterns with back references and compiles them into the
tagged NFAs described in Section 5.2.1. The execution component loads the compiled
tagged NFAs and matches them against a stream of input strings. In our implementation,
captured substrings are represented by their starting and ending offsets in the input
strings. In this way, substrings do not have to be copied from state to state.
Each substring is represented using only two numbers, which saves space and avoids
the overhead of string copy operations.
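The offset representation can be sketched as follows. This Python fragment is only an illustration of the idea: the half-open (start, end) convention, the use of None for a substring that has not been captured yet, and the function names are our assumptions, not details of the C++ implementation.

```python
def extend(span, position):
    """new/update on an offset pair: open a span at position, or grow it by one."""
    if span is None:
        return (position, position + 1)   # new: the span covers one symbol
    return (span[0], position + 1)        # update: only the end offset moves

def captured_equal(text, span1, span2):
    """Compare two captured substrings given as (start, end) offsets into text."""
    (start1, end1), (start2, end2) = span1, span2
    if end1 - start1 != end2 - start2:
        return False                      # different lengths can never be equal
    return text[start1:end1] == text[start2:end2]

text = "aaaa"
first = extend(None, 0)                   # group 1 captures text[0:1]
second = extend(None, 3)                  # group 2 captures text[3:4]
print(first, second, captured_equal(text, first, second))
```

A frontier element then carries only small integers per capturing group, and the input itself is consulted only when the equality constraint is checked.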
We created two implementations for evaluation. In the first one, named
NFA-backref, pattern matching is performed by operating the compiled tagged NFAs.
In the second implementation, dubbed OBDD-backref, tagged NFAs are represented
by OBDDs and pattern matching is performed by manipulating the OBDD data
structure of the tagged NFAs.
5.4 Evaluation
5.4.1 Data Sets
We evaluate the performance of different implementations using the following two data
sets:
Patho-01
Patterns in this data set have the form (a?{n})a{n}\1, where ? is the zero-or-one
quantifier. Such a pattern matches a string that starts with an optional ‘a’ repeated
n times, followed by n occurrences of ‘a’, followed by the substring captured by the
first capturing group. We evaluated the pattern for n = 5, 10, 15, 20, 25, 30. For each
pattern, we use an input string of the form a^n, i.e., ‘a’ repeated n times, which
is matched by the pattern with the same value of n.
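The data set is easy to regenerate. The Python sketch below builds the pattern and its input for a given n; the helper names are ours. Because Python's `re` module, to our knowledge, rejects a quantifier applied directly to `a?`, the optional ‘a’ is wrapped in a non-capturing group, which is an adaptation of the pattern syntax rather than the form used in our experiments.

```python
import re

def patho01_pattern(n: int) -> str:
    # (?:a?){n} repeats the optional 'a' n times inside capturing group 1.
    return rf"((?:a?){{{n}}})a{{{n}}}\1"

def patho01_input(n: int) -> str:
    return "a" * n

for n in (5, 10):
    match = re.fullmatch(patho01_pattern(n), patho01_input(n))
    print(n, patho01_pattern(n), match is not None, repr(match.group(1)))
```

Only small values of n are used here, since a backtracking engine needs rapidly growing time on this family. The only successful match leaves the first capturing group empty, so that all n input characters are consumed by a{n}.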
Snort-46
The second data set includes 46 patterns containing back references from the Snort
2012 HTTP signature set. We use two input traces to evaluate this pattern set. The
first trace, which we call the benign trace, was generated using a string generator
that we created. Given a set of patterns and a user-specified match percentage p, the
string generator produces a trace file in which p percent of the strings are matched
by at least one pattern in the pattern set. The benign trace used in our evaluation is
5MB in size. The second trace was manually crafted after we carefully reviewed the 46
patterns. We found that at least one of these patterns suffers from the algorithmic
complexity attack if the pattern matching engine is implemented by recursive
backtracking. We thus manually created a 100KB pathological trace using the approach
described in Section 1.3 to evaluate how different implementations perform under an
algorithmic complexity attack.
5.4.2 Performance
We measure the time efficiency of the different implementations as the number of CPU
cycles spent processing each byte of the input trace (cycles/byte). We evaluate the
performance of three implementations, NFA-backref, OBDD-backref, and PCRE, using the
data sets described in Section 5.4.1. We did not measure the space efficiency of the
implementations, since both the NFA-based approach and recursive backtracking are
space efficient, as presented in Chapter 3 and Chapter 4.
Figure 5.3 shows the execution time of different implementations for the Patho-01
data set. The x-axis denotes the value of n in pattern (a?{n})a{n}\1, and the y-axis
denotes the execution time in cycles/byte. It can be observed that PCRE is
the slowest implementation as n increases from 5 to 30. NFA-backref is the fastest
[Plot: execution time (cycles/byte, logarithmic axis) versus n = 5, 10, 15, 20, 25, 30 for PCRE, OBDD-backref, and NFA-backref.]
Figure 5.3: Performance of different implementations for the Patho-01 data set. It can be observed that NFA-backref resists the algorithmic complexity attack and is at least three orders of magnitude faster than PCRE.
implementation and is faster than PCRE by at least three orders of magnitude. The
performance of OBDD-backref lies between that of NFA-backref and PCRE. This
indicates that the OBDD data structure does not help for patterns with back
references. The main reason is that representing the new, update, and carry over
actions of the frontier derivation using OBDDs is costly, while such operations are
not needed in the NFA-OBDD and Submatch-OBDD models.
As shown in Figure 5.3, PCRE performs extremely slowly for this pattern set. This is
mainly because PCRE performs exhaustive recursive backtracking when matching
an input string a^n (i.e., ‘a’ repeated n times) against pattern (a?{n})a{n}\1. During
recursive backtracking, the first matching path that PCRE tries is to match the n
characters of ‘a’ with the (a?{n}) part of the pattern. This path fails because there
are no characters left to match the remaining part a{n}\1. PCRE then backtracks one
step, uses n − 1 characters to match the (a?{n}) part, and fails again. Continuing
in this way, it needs to traverse O(2^n) paths before it finally succeeds by using zero ‘a’
[Plots: execution time (cycles/byte, logarithmic axis) for PCRE, OBDD-backref, and NFA-backref. (a) Performance of benign trace. (b) Performance of pathological trace.]
Figure 5.4: Performance of different implementations for the Snort-46 pattern set. NFA-backref outperforms PCRE by three orders of magnitude when the pathological trace is used as input.
to match the (a?{n}) part, n characters of ‘a’ to match a{n}, and zero ‘a’ to match
the back reference part \1. As the value of n gets large, the number of traversal paths
increases exponentially, which causes PCRE to abort the backtracking process when
the stack grows too large. In our experiments, we observed that PCRE failed to
give correct matching results when n ≥ 25, while our implementation always returned
correct results for all input traces. The failure of PCRE for n ≥ 25 is mainly
because PCRE aborts the recursive backtracking when the size of the recursion
stack exceeds a threshold.
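The exponential number of paths can be reproduced with a toy backtracking search. The Python sketch below is a simplified model written for this illustration, not PCRE's algorithm: it tries, greedily, every way of letting each of the n optional ‘a’s consume a character or match the empty string, and counts how many complete choices for the (a?{n}) part are tried before one succeeds.

```python
def count_attempts(n: int):
    """Toy backtracking search for (a?{n})a{n}\\1 on the input 'a' * n.

    Returns (matched, attempts), where attempts counts the complete ways of
    matching the (a?{n}) part that were tried before the search stopped.
    """
    text = "a" * n
    attempts = 0

    def match_optional(pos: int, remaining: int) -> bool:
        nonlocal attempts
        if remaining == 0:
            attempts += 1
            captured = text[:pos]
            # The rest of the input must be a{n} followed by the back reference.
            return text[pos:] == "a" * n + captured
        # Greedy choice first: let this a? consume one 'a' ...
        if pos < len(text) and match_optional(pos + 1, remaining - 1):
            return True
        # ... then backtrack and let it match the empty string.
        return match_optional(pos, remaining - 1)

    matched = match_optional(0, n)
    return matched, attempts

for n in (5, 10, 15):
    print(n, count_attempts(n))
```

In this model the only successful choice, in which every optional ‘a’ matches the empty string, is the last one tried, so the count is exactly 2^n.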
Figure 5.4 shows the execution time of different implementations for the Snort-46
data set. Figure 5.4a shows the performance for the benign trace, and Figure 5.4b shows
the performance for the pathological trace. It can be observed that PCRE is about 10
times faster than NFA-backref when the benign trace is used as input. However,
NFA-backref is three orders of magnitude faster than PCRE for the pathological
trace. The low performance of PCRE in Figure 5.4b arises because the pathological
trace triggers exhaustive recursive backtracking in PCRE during pattern matching.
OBDD-backref is the slowest of the three implementations in both cases. This
is mainly due to the expensive OBDD operations associated with the new, update,
and carry over actions in the frontier derivation.
Both Figure 5.3 and Figure 5.4 show that our implementation NFA-backref is immune
to the algorithmic complexity attack. Under pathological traces, our NFA-backref
implementation outperforms PCRE by orders of magnitude. Although NFA-backref is
slower than PCRE for benign traces, we argue that NFA-backref is the better
implementation because network security tools, e.g., NIDS, are often exposed to attack
traffic, in which an attacker may deliberately craft pathological network contents to
mount a DoS attack against a recursive backtracking-based pattern matching engine.
Thus, we believe that our approach is better suited to processing hostile network
traffic.
5.5 Related Work
Pattern matching in practice demonstrates a time/space tradeoff. DFA-based
approaches are time efficient, but suffer from state blow-up. NFA-based approaches are
space efficient, but are slow in operation. Recursive backtracking-based approaches are
fast in general, but can be orders of magnitude slower under an algorithmic
complexity attack. The time/space tradeoff has spurred a lot of recent research,
primarily focused on patterns that can be described by regular languages (regular
expressions). Many researchers have aimed at reducing the memory footprints of
DFA-based approaches [80, 102, 46, 76, 77], and some have worked on improving the time
efficiency of NFA-based approaches with hardware [39, 54, 25, 71] or software
solutions [97, 99].
Patterns used in real-world security tools are often regular expressions extended
with some features. One important feature, submatch extraction, is discussed
in Chapter 4. Another, back references, is discussed in this chapter. Up
to now, not much work has been done on submatch extraction and back references.
Existing approaches to submatch extraction include Google’s RE2 [29], Horne et al.’s
DFA-based algorithm [35], Laurikari’s tagged NFA approach [47], and our
Submatch-OBDD model [100] (presented in Chapter 4). However, RE2 does not support
back references. Recursive backtracking is the de facto approach to implementing back
references and has been adopted by tools such as PCRE and regular expression libraries in
some high-level languages such as Java, Python, and Perl [28]. As we have shown, a
recursive backtracking-based implementation suffers from algorithmic complexity attacks.
Becchi and Crowley proposed to model a back reference problem with an automaton-like
machine [15]. Their approach constructs a special state for each back reference instance.
Substrings are recorded in a back reference state and are matched in a consuming way.
Their approach works when there is only one back reference instance for a capturing
group but fails when there are multiple back reference instances for the same
capturing group. Also, it is not clear how their approach performs because
no execution time was reported in their paper. Namjoshi and Narlikar presented an
automaton-based back reference approach [56] similar to [15]. Our approach differs
from [15] and [56] in that we do not construct special states or input symbols for
back references. Instead, we treat all the states in an NFA-like machine in the same
manner, and add constraints to the acceptance condition of the constructed tagged
NFA. We also showed that our approach is immune to known algorithmic complexity
attacks.
5.6 Summary
In this chapter, we presented a new matching algorithm for patterns with back
references. Our approach converts a back reference problem into a conditional
submatch extraction problem and then constructs NFA-like machines to represent
pattern matching with back references. We built a toolchain and evaluated the
performance of our approach using both a synthetic data set and a data set from
real-world NIDS. Our experimental results show that our implementation, NFA-backref,
is immune to known algorithmic complexity attacks. In particular, under pathological
traces NFA-backref is at least three orders of magnitude faster than PCRE, a recursive
backtracking-based pattern matching engine. Under benign traffic, NFA-backref is one
order of magnitude slower than PCRE. We believe that our approach is better suited to
network security tools because such tools are often exposed to hostile network traffic
that has the potential to abuse a recursive backtracking-based pattern matching engine.
We also believe that the performance of NFA-backref can be further improved with
better code optimization.
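The idea of turning a back reference into a submatch constraint can be sketched as
follows. This is a minimal illustration under stated assumptions, not the tagged NFA
construction or the NFA-backref implementation evaluated in this chapter: the machine
for the example pattern ([ab]+)c\1 is built by hand, the back reference is relaxed to
a second copy of the group, and equality of the two captured substrings is checked only
at acceptance. The names matches, DELTA, and match_backref are hypothetical, and the
sketch performs whole-string matching only.

```python
from typing import List, Optional, Set, Tuple

# (source state, accepted characters, target state, capture group extended or None)
Transition = Tuple[int, str, int, Optional[int]]
# A configuration is an NFA state plus the substrings captured so far.
Config = Tuple[int, Tuple[str, ...]]


def matches(text: str, delta: List[Transition], start: int, finals: Set[int],
            n_groups: int, equal_groups: List[Tuple[int, int]]) -> bool:
    """Whole-string match; equality constraints are checked at acceptance."""
    frontier: Set[Config] = {(start, ("",) * n_groups)}
    for ch in text:
        next_frontier: Set[Config] = set()
        for state, caps in frontier:
            for src, chars, dst, group in delta:
                if src != state or ch not in chars:
                    continue
                new_caps = list(caps)
                if group is not None:
                    new_caps[group] += ch  # record the consumed character
                next_frontier.add((dst, tuple(new_caps)))
        frontier = next_frontier
    # Accept only if a final state is reached AND the constrained groups agree.
    return any(
        state in finals and all(caps[i] == caps[j] for i, j in equal_groups)
        for state, caps in frontier
    )


# Hand-built machine for ([ab]+)c\1, relaxed to ([ab]+)c([ab]+)
# with the constraint "group 0 == group 1" (an illustrative assumption).
DELTA: List[Transition] = [
    (0, "ab", 1, 0), (1, "ab", 1, 0),   # first capturing group
    (1, "c", 2, None),                  # separator, not captured
    (2, "ab", 3, 1), (3, "ab", 3, 1),   # relaxed copy standing in for \1
]


def match_backref(text: str) -> bool:
    return matches(text, DELTA, start=0, finals={3}, n_groups=2,
                   equal_groups=[(0, 1)])
```

For example, match_backref("abcab") holds because both captured substrings are "ab",
whereas match_backref("abcba") does not. Every configuration in the frontier is
advanced once per input symbol, so no input position is revisited.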
Chapter 6
Conclusion and Future Directions
6.1 Conclusions
Pattern matching algorithms in network security applications exhibit a time/space
tradeoff. In this dissertation, we presented several new pattern matching techniques for
network security applications. We have shown that it is possible to design a pattern
matching engine that is orders of magnitude faster than NFA-based pattern matching
algorithms, is immune to known algorithmic complexity attacks, and retains the space
efficiency of NFAs. To this end, we have developed three techniques: NFA-OBDD,
Submatch-OBDD, and NFA-backref.
6.1.1 NFA-OBDD
Our first contribution is NFA-OBDD, which is designed to improve the time efficiency
of NFA-based regular expression (regular language) matching. Our design employs
symbolic Boolean functions to describe the NFA representation of regular expressions.
We represent and manipulate these Boolean functions using OBDDs, which can effectively
remove redundancy in the representation of transition relations and sets of states.
The use of OBDDs allows us to apply an NFA transition relation to all states in a
frontier set in a single operation in order to produce the new frontier set.
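The frontier-set update described above can be sketched as follows. This is a minimal
illustration only: Python sets stand in for the OBDD-encoded Boolean functions, so the
sketch shows the image computation over the whole frontier but none of the compression
that OBDDs provide. The example pattern (a|b)*ab, the state numbering, and the names
step and accepts are assumptions made for illustration.

```python
from typing import FrozenSet, Set, Tuple

# One element of the transition relation: (source state, input symbol, target state)
Transition = Tuple[int, str, int]


def step(frontier: FrozenSet[int], symbol: str,
         delta: Set[Transition]) -> FrozenSet[int]:
    """Image of the whole frontier under the transition relation for one symbol."""
    return frozenset(y for (x, i, y) in delta if i == symbol and x in frontier)


def accepts(text: str, delta: Set[Transition], start: Set[int],
            finals: Set[int]) -> bool:
    """Whole-string NFA simulation that keeps only the current frontier."""
    frontier = frozenset(start)
    for ch in text:
        frontier = step(frontier, ch, delta)
        if not frontier:          # no live state left, so no match is possible
            return False
    return bool(frontier & finals)


# Hand-built NFA for (a|b)*ab: state 0 loops, 0 -a-> 1, 1 -b-> 2 (accepting).
DELTA: Set[Transition] = {(0, "a", 0), (0, "b", 0), (0, "a", 1), (1, "b", 2)}
START: Set[int] = {0}
FINALS: Set[int] = {2}
```

With these definitions, accepts("abab", DELTA, START, FINALS) holds and
accepts("aba", DELTA, START, FINALS) does not. In an OBDD-based design the call to
step would correspond to conjunction and existential quantification over Boolean
functions rather than to an explicit loop over transitions.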
We evaluated the performance of NFA-OBDD using real-world patterns and network
traces. Our experimental results show that NFA-OBDD outperforms a traditional NFA
implementation by three orders of magnitude while still retaining the space efficiency
of NFAs. NFA-OBDD is competitive with MDFA (a DFA variant) in terms of time efficiency
but consumes much less memory than MDFA. It outperforms or is competitive with PCRE.
6.1.2 Submatch-OBDD
Our second contribution is to extend NFA-OBDD to model submatch extraction, an
important feature in real-world patterns used by network security applications. We
evaluate our submatch extraction approach (Submatch-OBDD) using patterns in an
NIDS and a commercial SIEM system. Our experiments using real network traces,
synthetic traces, and security event logs have shown that Submatch-OBDD outperforms
RE2 and PCRE by one to two orders of magnitude, while retaining the space efficiency
of NFAs.
6.1.3 NFA-backref
The third contribution of this dissertation is a new algorithm for patterns with back
references, which can describe non-regular languages. We propose to convert a back
reference problem to a conditional submatch extraction problem. We then construct
tagged NFAs to represent patterns with submatch constraints. We evaluated our approach
using two implementations: NFA-backref and OBDD-backref. Our experimental results show
that NFA-backref is immune to known algorithmic complexity attacks and outperforms PCRE
by at least three orders of magnitude under pathological traces. For benign traces,
PCRE is an order of magnitude faster than NFA-backref. Nevertheless, we believe that
NFA-backref is a better choice than a recursive backtracking-based implementation,
because a NIDS is often exposed to hostile network traffic that may contain pathological
content crafted to abuse a recursive backtracking-based pattern matching engine.
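The gap between the two matching strategies on pathological input can be illustrated
with a toy experiment. The sketch below is not PCRE and not the implementation evaluated
in this dissertation: it hand-codes a recursive backtracking matcher and a frontier-based
NFA simulation for one fixed pattern, ^(a|aa)*b$, chosen here as an assumption because
inputs consisting only of the letter a force the backtracking matcher to try every way
of splitting the input. The function names are hypothetical.

```python
from typing import Set, Tuple


def backtracking_calls(text: str) -> int:
    """Count recursive calls made while matching ^(a|aa)*b$ by backtracking."""
    calls = 0

    def match(pos: int) -> bool:
        nonlocal calls
        calls += 1
        # Alternative 1: consume one 'a' and iterate the star again.
        if text.startswith("a", pos) and match(pos + 1):
            return True
        # Alternative 2: consume 'aa' and iterate the star again.
        if text.startswith("aa", pos) and match(pos + 2):
            return True
        # Leave the star: the remainder must be exactly 'b'.
        return text[pos:] == "b"

    match(0)
    return calls


def frontier_visits(text: str) -> int:
    """Count state visits while simulating an NFA for the same pattern."""
    # State 0 loops on 'a'; 0 -a-> 1 -a-> 0 covers 'aa'; 0 -b-> 2 accepts.
    delta: Set[Tuple[int, str, int]] = {
        (0, "a", 0), (0, "a", 1), (1, "a", 0), (0, "b", 2),
    }
    frontier = {0}
    visits = 0
    for ch in text:
        visits += len(frontier)
        frontier = {dst for (src, sym, dst) in delta
                    if sym == ch and src in frontier}
    return visits
```

On an input of twenty a characters with no trailing b, the backtracking matcher makes
tens of thousands of recursive calls, while the frontier simulation visits at most two
states per input symbol. The number of calls grows exponentially with the input length
in the first case and linearly in the second.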
6.2 Future Directions
There are several directions that can be explored in the future.
6.2.1 Hardware-based NFA-OBDD and Submatch-OBDD
NIDS vendors are increasingly deploying hardware-based deep packet inspection
products. While this dissertation explored the potential of NFA-OBDD and
Submatch-OBDD using a software-based implementation, a hardware-based solution
would be required to provide raw matching speeds approaching multiple gigabits per
second. Although OBDDs [104, 74] and NFAs [71, 25, 32] have each been individually
implemented in hardware (such as FPGAs and CAMs), further research is needed to
investigate the possibility of hardware-based NFA-OBDD and Submatch-OBDD
implementations. The key challenge in implementing NFA-OBDD and Submatch-OBDD
in hardware is to devise techniques that would allow OBDDs to be modified within
hardware, e.g., to allow OBDD(F) to be modified efficiently as each input symbol is
processed. It is also possible to implement our back reference approach, NFA-backref,
in hardware. The key challenge there is how to represent the actions
used for maintaining the captured substrings during frontier derivation.
6.2.2 Security in Software Defined Networking
Software Defined Networking (SDN) [95] has emerged as an important technique for the
next generation of data centers and cloud computing. SDN allows the identity and flow
control logic of communication entities to be decoupled from the basic topology-based
forwarding, bridging, and routing. The emergence of SDN creates many opportunities
and challenges for the network and software communities. There are several promising
directions in the security of SDN.
Security Applications in SDN
Traditional network security applications, e.g., NIDS and firewalls, work by inspecting
network traffic according to the communication protocols, network interfaces, and
communication ports. In SDN, communication entities are no longer bound to network
interfaces as in a traditional network infrastructure. Existing NIDS and firewall
solutions do not fit SDN. Some open questions that can be explored are: “How will
network security applications, e.g., NIDS and firewalls, be affected under the SDN
structure?”, and “How can we design efficient NIDS that are capable of scanning network
entities according to their logical grouping and are resilient to VM migration?”.
Digital Forensics in SDN
The emergence of SDN will pose new challenges to digital forensics. For example,
an attacker can set up a malicious VM in a virtualized network on the cloud. Since
SDN techniques decouple the identities of communication entities from the physical
network interfaces they are attached to, it might be more difficult to collect evidence of
malicious behavior and trace back the attackers. One problem worth studying is how to
link the physical resources of attackers to their logical identities.
Configuration Validation in SDN
Configuration is the “glue” for logically integrating and setting up network components
so that they satisfy end-to-end requirements in terms of security, connectivity,
performance, and reliability. A real infrastructure can have hundreds of components,
where each component can have a couple of thousand configuration commands. Manual
configuration is known to be error prone. Researchers have developed tools and
techniques to validate whether a given configuration satisfies specified requirements.
The emergence of SDN will add new challenges to network configuration since the control
plane is separated from the data plane and is implemented as software. In the future,
several problems can be explored: “How to construct a configuration acquisition system
to extract configuration information from components in SDN?”, “How to create a
requirement library that captures the best practices and design patterns of end-to-end
requirements in SDN?”, and “How to develop an evaluation tool that is capable of
efficiently evaluating requirements and suggesting alternative configurations for
unsatisfied or changed requirements?”.
References
[1] Tarari regex content processor. http://www.tarari.com, 2010.
[2] Tipping point. http://www.tippingpoint.com, 2010.
[3] Gawk. http://www.gnu.org/software/gawk/manual/gawk.html, Last retrieved in March 2013.
[4] Grep. http://www.gnu.org/software/grep/, Last retrieved in March 2013.
[5] Snort. http://www.snort.org/, 2013.
[6] A. V. Aho. Algorithms for finding patterns in strings. In Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity (A), pages 255–300. 1990.
[7] A. V. Aho and M. J. Corasick. Efficient string matching: An aid to bibliographic search. Comm. ACM, 18(6):333–340, 1975.
[8] A. V. Aho and S. C. Johnson. Optimal code generation for expression trees. In ACM Symp. on Theory of Computing, New York, NY, USA, 1975. ACM.
[9] B. S. Baker. Parameterized pattern matching by Boyer-Moore type algorithms. In Proc. Sixth Annual ACM-SIAM Symp. on Discrete Algorithms, pages 541–550, Jan 1995.
[10] B. S. Baker. Parameterized pattern matching: Algorithms and applications. J. Comput. Syst. Sci., 52(1):28–42, Feb 1996.
[11] M. Becchi and S. Cadambi. Memory-efficient regular expression search using state merging. In Proceedings of IEEE Infocom, 2007.
[12] M. Becchi and P. Crowley. A hybrid finite automaton for practical deep packet inspection. In Intl. Conf. on emerging Networking EXperiments and Technologies, 2007.
[13] M. Becchi and P. Crowley. An improved algorithm to accelerate regular expression evaluation. In Proceedings of the 3rd ACM/IEEE Symposium on Architecture for networking and communications systems, ANCS ’07, pages 145–154, New York, NY, USA, 2007. ACM.
[14] M. Becchi and P. Crowley. Efficient regular expression evaluation: Theory to practice. In Intl. Conf. on Architectures for Networking and Communication Systems, pages 50–59. ACM, 2008.
[15] M. Becchi and P. Crowley. Extending finite automata to efficiently match Perl-compatible regular expressions. In Proceedings of the 2008 ACM CoNEXT Conference, CoNEXT ’08, pages 25:1–25:12, New York, NY, USA, 2008. ACM.
[16] R. S. Boyer and J. S. Moore. A fast string searching algorithm. Communications of the ACM, 20(10):62–72, 1977.
[17] Bro. The bro network security monitor. http://bro.org/, Last retrieved in March 2013.
[18] B. C. Brodie, D. E. Taylor, and R. K. Cytron. A scalable architecture for high-throughput regular-expression pattern matching. In Intl. Symp. Computer Architecture, pages 191–202. IEEE Computer Society, 2006.
[19] D. Brumley, J. Newsome, D. Song, H. Wang, and S. Jha. Towards automatic generation of vulnerability-based signatures. In IEEE Symposium on Security and Privacy, May 2006.
[20] R. E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, 35(8):677–691, 1986.
[21] J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and J. Hwang. Symbolic model checking: 10^20 states and beyond. In Symposium on Logic in Computer Science, pages 401–424. IEEE Computer Society, 1990.
[22] D. Chasaki and T. Wolf. Fast regular expression matching in hardware using NFA-BDD combination. In Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ANCS ’10, pages 12:1–12:2, New York, NY, USA, 2010. ACM.
[23] M. Christiansen and E. Fleury. An MTIDD based firewall. Telecommunication Systems, 27(2–4), October 2004.
[24] Cisco. IOS terminal services configuration guide. http://tinyurl.com/2eouvq.
[25] C. R. Clark and D. E. Schimmel. Scalable pattern matching for high-speed networks. In IEEE Symp. on Field-Programmable Custom Computing Machines, pages 249–257. IEEE Computer Society, 2004.
[26] B. Commentz-Walter. A string matching algorithm fast on the average. In Proc. Intl. Colloquium on Automata, Languages, and Programming, pages 118–132, 1979.
[27] O. Coudert, C. Berthet, and J. C. Madre. Verification of synchronous sequential machines based on symbolic execution. In Proceedings of the international workshop on Automatic verification methods for finite state systems, pages 365–373, New York, NY, USA, 1990. Springer-Verlag New York, Inc.
[28] R. Cox. Regular expression matching can be simple and fast (but is slow in Java, Perl, PHP, Python, Ruby, ...), 2007. http://swtch.com/~rsc/regexp/regexp1.html.
[29] R. Cox. Implementing regular expressions. http://swtch.com/~rsc/regexp/, Last retrieved in August 2011.
[30] S. Dharmapurikar, P. Krishnamurthy, T. S. Sproull, and J. W. Lockwood. Deep packet inspection using parallel Bloom filters. IEEE Micro, 24(1):52–61, 2004.
[31] S. Dharmapurikar and J. W. Lockwood. Fast and scalable pattern matching for network intrusion detection systems. Jour. on Selected Areas in Comm., 24(10):1781–1792, 2006.
[32] R. W. Floyd and J. D. Ullman. The compilation of regular expressions into integrated circuits. Journal of the ACM, 29(3), July 1982.
[33] M. G. Gouda and X. Liu. Firewall design: Consistency, completeness and compactness. In Intl. Conf. on Distributed Computing Systems, Mar 2004.
[34] J. D. Guttman and A. L. Herzog. Rigorous automated network security management. Intl. Jour. of Information Security, 4(1–2), 2004.
[35] S. Haber, W. Horne, P. Manadhata, M. Mowbray, and P. Rao. Efficient submatch extraction for practical regular expression. In The 7th International Conference on Language and Automata Theory and Applications, Bilbao, Spain, April 2013.
[36] M. Handley, V. Paxson, and C. Kreibich. Network intrusion detection: Evasion, traffic normalization, and end-to-end protocol semantics. In Usenix Security, pages 9–9. USENIX, 2001.
[37] S. Hazelhurst, A. Fatti, and A. Henwood. Binary decision diagram representations of firewall and router access lists. Technical report, University of Witwatersrand, Johannesburg, South Africa, 1998.
[38] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation, Third Edition. Addison-Wesley, 2007.
[39] B. L. Hutchings, R. Franklin, and D. Carver. Assisting network intrusion detection with reconfigurable hardware. In Annual Symp. on Field-Programmable Custom Computing Machines, pages 111–120. IEEE Computer Society, 2002.
[40] S. Johnson. Yacc – yet another compiler compiler. Computing Science Tech. Rep. 32, AT&T Bell Labs, 1975.
[41] M. Jordan. Dealing with metamorphism. Virus Bulletin Weekly, 2002.
[42] H. Kim and B. Karp. Autograph: Toward automated, distributed worm signature detection. In USENIX Security Symposium, pages 271–286, 2004.
[43] S. Kim and Y. Kim. A fast multiple string-pattern matching algorithm. In AoM/IAoM Conf. on Computer Science, August 1999.
[44] D. E. Knuth, J. Morris, and V. R. Pratt. Fast pattern matching in strings. SIAM Journal of Computing, 6(2):323–350, 1977.
[45] S. Kong, R. Smith, and C. Estan. Efficient signature matching with multiple alphabet compression tables. In Intl. Conf. on Security and Privacy in Communication Networks, 2008.
[46] S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, and J. Turner. Algorithms to accelerate multiple regular expressions matching for deep packet inspection. In ACM SIGCOMM Conference, pages 339–350. ACM, 2006.
[47] V. Laurikari. NFAs with tagged transitions, their conversion to deterministic automata and application to regular expressions. In SPIRE’00, September 2000.
[48] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das. The 1999 DARPA off-line intrusion detection evaluation. Computer Networks, 34(4):579–595, October 2000.
[49] R. Liu, N. Huang, C. Chen, and C. Kao. A fast string-matching algorithm for network processor-based intrusion detection system. Trans. on Embedded Computing Sys., 3(3):614–633, 2004.
[50] T. Liu, Y. Sun, A. X. Liu, L. Guo, and B. Fang. A prefiltering approach to regular expression matching for network security systems. In Proceedings of the 10th international conference on Applied Cryptography and Network Security, ACNS’12, pages 363–380, Berlin, Heidelberg, 2012. Springer-Verlag.
[51] J. McHugh. Testing intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln laboratories. ACM Transactions on Information and System Security, 3(4):262–294, November 2000.
[52] C. Meiners, J. Patel, E. Norige, E. Torng, and A. X. Liu. Fast regular expression matching using small TCAMs for network intrusion detection and prevention systems. In 19th USENIX Security Symposium, August 2010.
[53] C. R. Meiners, E. Norige, A. X. Liu, and E. Torng. Flowsifter: A counting automata approach to layer 7 field extraction for deep flow inspection. In A. G. Greenberg and K. Sohraby, editors, INFOCOM, pages 1746–1754. IEEE, 2012.
[54] A. Mitra, W. Najjar, and L. Bhuyan. Compiling PCRE to FPGA for accelerating Snort IDS. In Symp. on Arch. for Networking and Comm. Systems, pages 127–136. ACM, 2007.
[55] R. Muth and U. Manber. Approximate multiple string search. In D. S. Hirschberg and E. W. Myers, editors, Annual Symp. on Combinatorial Pattern Matching, number 1075, pages 75–86, Laguna Beach, CA, 1996. Springer-Verlag, Berlin.
[56] K. Namjoshi and G. Narlikar. Robust and fast pattern matching for intrusion detection. In Proceedings of the 29th conference on Information communications, INFOCOM’10, pages 740–748, Piscataway, NJ, USA, 2010. IEEE Press.
[57] G. Navarro and M. Raffinot. Fast and flexible string matching by combining bit-parallelism and suffix automata. J. Exp. Algorithmics, 5:4, 2000.
[58] J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for polymorphic worms. In IEEE Symposium on Security and Privacy, pages 226–241, Washington, DC, USA, 2005. IEEE Computer Society.
[59] J. Ousterhout. Tcl programming language. http://www.tcl.tk/, Last retrieved in March 2013.
[60] J. Patel, A. Liu, and E. Torng. Bypassing space explosion in regular expression matching for network intrusion detection and prevention systems. In Proceedings of the 19th Annual Network and Distributed System Security Symposium, San Diego, California, February 2012.
[61] V. Paxson. Bro: a system for detecting network intruders in real-time. Comput. Netw., 31(23-24):2435–2463, Dec. 1999.
[62] PCRE. The Perl compatible regular expression library. http://www.pcre.org.
[63] R. Perdisci, D. Ariu, P. Fogla, G. Giacinto, and W. Lee. Mcpad: A multiple classifier system for accurate payload-based anomaly detection. Comput. Netw., 53(6):864–881, Apr. 2009.
[64] R. Pike. The text editor sam. Softw. Pract. Exper., 17:813–845, November 1987.
[65] T. Ptacek and T. Newsham. Insertion, evasion and denial of service: Eluding network intrusion detection. http://insecure.org/stf/secnet_ids/secnet_ids.html.
[66] M. Roesch. Snort - lightweight intrusion detection for networks. In USENIX Conf. on System Administration, pages 229–238. USENIX, 1999.
[67] S. Rubin, S. Jha, and B. Miller. Language-based generation and evaluation of NIDS signatures. In Symposium on Security and Privacy, Oakland, California, May 2005.
[68] L. Salmela, J. Tarhio, and J. Kytojoki. Multipattern string matching with q-grams. Jour. of Experimental Algorithmics, 11:1.1, 2006.
[69] N. Schear, D. R. Albrecht, and N. Borisov. High-speed matching of vulnerability signatures. In Lippman, R., Kirda, E., Trachtenberg, A., (eds.) RAID 2008, volume 5230 of Lecture Notes in Computer Science, pages 155–174. Springer, 2008.
[70] U. Shankar and V. Paxson. Active mapping: Resisting NIDS evasion without altering traffic. In Symp. on Security and Privacy, pages 44–61. IEEE Computer Society, 2003.
[71] R. Sidhu and V. Prasanna. Fast regular expression matching using FPGAs. In Symp. on Field-Programmable Custom Computing Machines, pages 227–238. IEEE Computer Society, 2001.
[72] R. P. S. Sidhu, A. Mei, and V. K. Prasanna. String matching on multicontext FPGAs using self-reconfiguration. In Proceedings of the 1999 ACM/SIGDA seventh international symposium on Field programmable gate arrays, FPGA ’99, pages 217–226, New York, NY, USA, 1999. ACM.
[73] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated worm fingerprinting. In USENIX/ACM Symposium on Operating System Design and Implementation, pages 45–60, 2004.
[74] R. Sinnappan and S. Hazelhurst. A reconfigurable approach to packet filtering. In Brebner, G., and Woods, R., (eds.) FPL 2001, volume 2147 of Lecture Notes in Computer Science, pages 638–642. Springer, 2001.
[75] R. Smith, C. Estan, and S. Jha. Backtracking algorithmic complexity attacks against a NIDS. In Annual Computer Security Applications Conf., pages 89–98. IEEE Computer Society, 2006.
[76] R. Smith, C. Estan, and S. Jha. XFA: Faster signature matching with extended automata. In Symp. on Security and Privacy, pages 187–201. IEEE Computer Society, 2008.
[77] R. Smith, C. Estan, S. Jha, and S. Kong. Deflating the Big Bang: Fast and scalable deep packet inspection with extended finite automata. In SIGCOMM Conference, pages 207–218. ACM, 2008.
[78] Snort. Download snort rules. http://www.snort.org/snort-rules/, Last retrieved in March 2013.
[79] F. Somenzi. CUDD: CU decision diagram package, release 2.4.2. Department of Electrical, Computer, and Energy Engineering, University of Colorado at Boulder. http://vlsi.colorado.edu/~fabio/CUDD.
[80] R. Sommer and V. Paxson. Enhancing byte-level network intrusion detection signatures with context. In Conf. on Computer and Comm. Security, pages 262–271. ACM, 2003.
[81] R. Sommer and V. Paxson. Outside the closed world: On using machine learning for network intrusion detection. In Symp. on Security and Privacy. IEEE Computer Society, 2010.
[82] I. Sourdis and D. Pnevmatikatos. Fast, large-scale string match for a 10Gbps FPGA-based network intrusion detection system. In Cheung, P., Constantinides, G., Sousa, J., (eds.) FPL 2003, volume 2778 of Lecture Notes in Computer Science, pages 880–889. Springer, 2003.
[83] L. Tan and T. Sherwood. A high throughput string matching architecture for intrusion detection and prevention. In Intl. Symp. Computer Architecture, pages 112–122. IEEE Computer Society, 2005.
[84] K. Thompson. Programming techniques: Regular expression search algorithm. Commun. ACM, 11:419–422, June 1968.
[85] H. J. Touati, H. Savoj, B. Lin, R. K. Brayton, and A. Sangiovanni-Vincentelli. Implicit state enumeration of finite state machines using BDD’s. In IEEE International Conference on Computer-Aided Design, pages 130–133, Santa Clara, CA, 1990. IEEE.
[86] N. Tuck, T. Sherwood, B. Calder, and G. Varghese. Deterministic memory-efficient string matching algorithms for intrusion detection. In IEEE INFOCOM, pages 333–340. IEEE Computer Society, 2004.
[87] G. Vasiliadis, S. Antonatos, M. Polychronakis, E. P. Markatos, and S. Ioannidis. Gnort: High performance network intrusion detection using graphics processors. In Lippman, R., Kirda, E., Trachtenberg, A., (eds.) RAID 2008, volume 5230 of Lecture Notes in Computer Science, pages 116–134. Springer, 2008.
[88] H. J. Wang, C. Guo, D. R. Simon, and A. Zugenmaier. Shield: Vulnerability-driven network filters for preventing known vulnerability exploits. In ACM SIGCOMM, August 2004.
[89] K. Wang, G. Cretu, and S. J. Stolfo. Anomalous payload-based worm detection and signature generation. In Proceedings of the 8th international conference on Recent Advances in Intrusion Detection, RAID’05, pages 227–246, Berlin, Heidelberg, 2006. Springer-Verlag.
[90] B. W. Watson. The performance of single and multiple keyword pattern matching algorithms. In Third South American Workshop on String Processing, Recife, Brazil, August 1996.
[91] Wikipedia. Anomaly-based intrusion detection system. http://en.wikipedia.org/wiki/Anomaly-based_intrusion_detection_system, Last retrieved in March 2013.
[92] Wikipedia. Host-based intrusion detection system. http://en.wikipedia.org/wiki/Host-based_intrusion_detection_system, 2013.
[93] Wikipedia. Intrusion detection system. http://en.wikipedia.org/wiki/Intrusion_detection_system, 2013.
[94] Wikipedia. Security information and event management. http://en.wikipedia.org/wiki/Security_information_and_event_management, 2013.
[95] Wikipedia. Software-defined networking. http://en.wikipedia.org/wiki/Software-defined_networking, 2013.
[96] S. Wu and U. Manber. A fast algorithm for multi-pattern searching. TR 94-17, Department of Computer Science, University of Arizona, 1994.
[97] L. Yang, R. Karim, V. Ganapathy, and R. Smith. Improving NFA-based signature matching using ordered binary decision diagrams. In RAID’10, volume 6307 of Lecture Notes in Computer Science (LNCS), pages 58–78, Ottawa, Canada, September 2010. Springer.
[98] L. Yang, R. Karim, V. Ganapathy, and R. Smith. Signatures referenced in Section 3.4 and Section 3.5, 2010. Available at http://www.cs.rutgers.edu/~vinodg/papers/raid2010.
[99] L. Yang, R. Karim, V. Ganapathy, and R. Smith. Fast, memory-efficient regular expression matching with NFA-OBDDs. Computer Networks, 55(15):3376–3393, October 2011.
[100] L. Yang, P. Manadhata, W. Horne, P. Rao, and V. Ganapathy. Fast submatch extraction using OBDDs. In Proceedings of the eighth ACM/IEEE symposium on Architectures for networking and communications systems, ANCS ’12, pages 163–174, New York, NY, USA, 2012. ACM.
[101] V. Yegneswaran, J. T. Giffin, P. Barford, and S. Jha. An architecture for generating semantics-aware signatures. In USENIX Security Symposium, Baltimore, Maryland, Aug. 2005.
[102] F. Yu, Z. Chen, Y. Diao, T. V. Lakshman, and R. H. Katz. Fast and memory-efficient regular expression matching for deep packet inspection. In ACM/IEEE Symp. on Arch. for Networking and Comm. Systems, pages 93–102, 2006.
[103] L. Yuan, J. Mai, Z. Su, H. Chen, C. Chuah, and P. Mohapatra. FIREMAN: A toolkit for firewall modeling and analysis. In Symp. on Security and Privacy, May 2006.
[104] S. Yusuf and W. Luk. Bitwise optimized CAM for network intrusion detection systems. In Intl. Conf. on Field Prog. Logic and Applications, pages 444–449. IEEE Press, 2005.