String Matching of Bit Parallel Suffix Automata
Suffix Automata
Based on a Directed Acyclic Word Graph (DAWG)
Groups equivalent factors (strings with the same set of end positions) so that suffixes are easy to compare
Nondeterministic suffix automaton
Deterministic suffix automaton, obtained by subset construction
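As a rough illustration of the subset construction mentioned above, the following Python sketch determinizes the nondeterministic suffix automaton of a string. The names `suffix_dfa` and `read` and the sample string are assumptions made for this example; they do not come from the slides.

```python
def suffix_dfa(s):
    """Determinize the nondeterministic suffix automaton of s.

    NFA states are 0..m; state i moves to i+1 on s[i], and the initial
    state has epsilon moves to every state. A DFA state is therefore a
    set of NFA states, i.e. a set of end positions in s.
    """
    m = len(s)
    start = frozenset(range(m + 1))      # epsilon closure of the initial state
    trans, seen, todo = {}, {start}, [start]
    while todo:
        state = todo.pop()
        for c in set(s):
            nxt = frozenset(i + 1 for i in state if i < m and s[i] == c)
            if nxt:                      # an empty set means "not a factor"
                trans[(state, c)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return start, trans


def read(dfa, x):
    """Return the DFA state reached by x, or None if x is not a factor."""
    state, trans = dfa
    for c in x:
        state = trans.get((state, c))
        if state is None:
            return None
    return state


dfa = suffix_dfa("baabbaa")
print(sorted(read(dfa, "aa")))   # [3, 7]: the end positions of "aa"
print(read(dfa, "abab"))         # None: "abab" is not a factor
```

Each deterministic state is a set of end positions, so two factors that end at exactly the same positions lead to the same state.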
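Suffix Automata Search
Also called Backward DAWG Matching (BDM)
Build an automaton that recognizes every factor x of the pattern p
endpos(x): the set of all pattern positions where an occurrence of x ends. Example: pattern = baabbaa, endpos(aa) = {3, 7}
The window slides over the text from left to right; inside the window, characters are read from right to left
On failing to match a factor, shift the window
Safe shift: when the text read in the window is not a factor of the pattern, the window can be shifted without missing an occurrence
Window size = pattern length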
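The endpos definition above can be checked with a small brute-force sketch. The helper name `endpos` mirrors the slide notation, but the implementation is only an illustration and is not how a suffix automaton computes these sets.

```python
def endpos(pattern, x):
    """1-based positions in pattern where an occurrence of x ends."""
    k = len(x)
    return {i + k for i in range(len(pattern) - k + 1) if pattern[i:i + k] == x}


print(sorted(endpos("baabbaa", "aa")))   # [3, 7], as in the slide example
print(endpos("baabbaa", "abab"))         # set(): not a factor of the pattern
```

An empty result means the string is not a factor of the pattern, which is the situation in which the window is shifted.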
BDM Algorithm
Build the automaton
Test whether the final state has been reached
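A minimal sketch of the BDM search loop, assuming the automaton is obtained by a naive subset construction on the reversed pattern (real implementations build the DAWG in linear time). The names `suffix_dfa` and `bdm_search` are hypothetical and chosen for this example only.

```python
def suffix_dfa(s):
    """Naive subset construction; each state is a set of end positions in s."""
    m = len(s)
    start = frozenset(range(m + 1))
    trans, seen, todo = {}, {start}, [start]
    while todo:
        state = todo.pop()
        for c in set(s):
            nxt = frozenset(i + 1 for i in state if i < m and s[i] == c)
            if nxt:
                trans[(state, c)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return start, trans


def bdm_search(text, pattern):
    """Return the 0-based start positions of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or n < m:
        return []
    start, trans = suffix_dfa(pattern[::-1])   # automaton of the reversed pattern
    found, pos = [], 0
    while pos <= n - m:
        j, last, state = m, m, start
        while j > 0:
            # Read the window from right to left.
            state = trans.get((state, text[pos + j - 1]))
            if state is None:                  # not a factor: stop reading
                break
            j -= 1
            if m in state:                     # final state: a prefix of the pattern
                if j > 0:
                    last = j                   # remember where that prefix starts
                else:
                    found.append(pos)          # the whole window was recognized
        pos += last                            # safe shift
    return found


print(bdm_search("abbabaabbaab", "aabbaab"))   # [5]
```

The variable `last` plays the same role as in the slides: it records the window position of the most recent pattern prefix, and the window is shifted to it.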
Suffix Automata Search Example
1. Build the deterministic suffix automaton of the reversed pattern
2. Use endpos(x) to find a factor
3. On failing to find a factor, do a safe shift
1. T = [abbaba a]bbaab. a is a factor of p^r and a reverse prefix of p (here p = aabbaab). last = 6
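[Figure: deterministic suffix automaton of p^r = baabbaa, repeated on each step of the example. States are labeled with their endpos sets: 01234567 (initial), 145, 26, 2367, 37, 4, 5, 6, 7; transitions are labeled a or b.]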
2. T = [abbab aa]bbaab. aa is a factor of p^r and a reverse prefix of p. last = 5
3. T = [abba baa]bbaab. aab is a factor of p^r.
4. T = [abb abaa]bbaab. We fail to recognize the next a, so we shift the window to last. We search again at position T = abbab[aabbaab]. last = 7
5. T = abbab[aabbaa b]. b is a factor of p^r.
6. T = abbab[aabba ab]. ba is a factor of p^r.
7. T = abbab[aabb aab]. baa is a factor of p^r and a reverse prefix of p. last = 4
8. T = abbab[aab baab]. baab is a factor of p^r.
9. T = abbab[aa bbaab]. baabb is a factor of p^r.
10. T = abbab[a abbaab]. baabba is a factor of p^r.
11. T = abbab[aabbaab]. We recognize the word aabbaab and report an occurrence.
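The eleven steps of this example can be replayed with a short tracing sketch. To keep it small it assumes a brute-force set of factors in place of the automaton; the function name `bdm_trace` and the event labels are made up for this illustration.

```python
def bdm_trace(text, pattern):
    """Replay a BDM-style search; return (window start, suffix read, outcome, last)."""
    m, n = len(pattern), len(text)
    # Brute-force factor set (assumption: stands in for the suffix automaton).
    factors = {pattern[i:j] for i in range(m + 1) for j in range(i, m + 1)}
    events, pos = [], 0
    while pos <= n - m:
        j, last = m, m
        while j > 0:
            u = text[pos + j - 1:pos + m]      # window suffix read so far
            if u not in factors:
                events.append((pos, u, "fail", last))
                break
            j -= 1
            if pattern.startswith(u):
                if j > 0:
                    last = j
                    events.append((pos, u, "prefix", last))
                else:
                    events.append((pos, u, "match", last))
            else:
                events.append((pos, u, "factor", last))
        pos += last
    return events


for event in bdm_trace("abbabaabbaab", "aabbaab"):
    print(event)
```

The trace lists the window suffix in text order, whereas the slides quote the same characters in the order they are read (reversed), which is why they speak of factors of p^r.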
BNDM Algorithm
Backward Nondeterministic DAWG Matching (BNDM)
Handles character classes, multiple patterns, and matching with errors
Uses bit parallelism; combines Shift-Or and BDM
20% to 25% faster than BDM, 10% to 40% faster than BM
Update function
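A minimal Python sketch of the bit-parallel search, written from the general description of BNDM rather than from the slide's own listing. The names `bndm_search`, `B`, `D`, and `last` follow common usage but should be read as assumptions; Python integers are unbounded, so the explicit mask stands in for the machine-word limit of a C implementation.

```python
def bndm_search(text, pattern):
    """Return the 0-based start positions of pattern in text (BNDM sketch)."""
    m, n = len(pattern), len(text)
    if m == 0 or n < m:
        return []
    mask = (1 << m) - 1                  # keeps D to m bits
    high = 1 << (m - 1)                  # set when the suffix read is a pattern prefix
    B = {}
    for i, c in enumerate(pattern):      # bit m-1-i of B[c] marks pattern[i] == c
        B[c] = B.get(c, 0) | (1 << (m - 1 - i))

    found, pos = [], 0
    while pos <= n - m:
        j, last, D = m, m, mask          # all NFA states active
        while D:
            D &= B.get(text[pos + j - 1], 0)   # read one character, right to left
            j -= 1
            if D & high:
                if j > 0:
                    last = j             # remember where a pattern prefix starts
                else:
                    found.append(pos)    # the whole window matched
            D = (D << 1) & mask          # advance every active state
        pos += last                      # safe shift
    return found


print(bndm_search("abbabaabbaab", "aabbaab"))   # [5]
```

The inner loop simulates the nondeterministic suffix automaton of the reversed pattern directly in the bits of `D`, so no deterministic automaton has to be built.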
BNDM Algorithm
[Figure: slide content not captured in the transcript]
BNDM Example
[Figure: two slides, content not captured in the transcript]
BNDM Further Improvements
Handle long patterns
  Partition the pattern p into subpatterns p_i
  Build an array of D and B, and process each part with the basic algorithm
  If p_i is found, then process p_i+1, and so on
Handle classes
  Only the B table is modified
  Set the i-th bit of B[c] for every character c that belongs to the class at the i-th pattern position
Multiple patterns
  Two methods:
  Interleave the patterns and shift by r bits on each D update
  Simply concatenate the patterns and shift by 1 bit, with the modified update D = (D << 1) & (1^(m-1) 0)^r, where the mask is r blocks of m-1 ones followed by a zero
  Here r is the number of patterns
Approximate matching
  Use Wu's method
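To illustrate the class extension, the sketch below changes only how the B table is filled; the search loop is the same basic BNDM loop. Representing the pattern as a list of strings, one string of allowed characters per position, is an assumption made for this example, as is the name `bndm_class_search`.

```python
def bndm_class_search(text, classes):
    """BNDM with character classes; classes[i] holds the characters allowed at position i."""
    m, n = len(classes), len(text)
    if m == 0 or n < m:
        return []
    mask = (1 << m) - 1
    high = 1 << (m - 1)
    B = {}
    for i, allowed in enumerate(classes):
        for c in allowed:                # set the bit of position i for every class member
            B[c] = B.get(c, 0) | (1 << (m - 1 - i))

    found, pos = [], 0
    while pos <= n - m:
        j, last, D = m, m, mask
        while D:
            D &= B.get(text[pos + j - 1], 0)
            j -= 1
            if D & high:
                if j > 0:
                    last = j
                else:
                    found.append(pos)
            D = (D << 1) & mask
        pos += last
    return found


# Pattern a[bc]b: matches "acb" at position 1 and "abb" at position 5.
print(bndm_class_search("xacbxabb", ["a", "bc", "b"]))   # [1, 5]
```

Because only the preprocessing differs, classes add no cost to the search phase in this sketch.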
Performance Comparison
Times are given in 1/100 of a second per megabyte
References
Gonzalo Navarro and Mathieu Raffinot. A Bit-Parallel Approach to Suffix Automata: Fast Extended String Matching. In M. Farach (editor), Proc. CPM'98, LNCS 1448, pages 14-33, 1998.
Gonzalo Navarro and Mathieu Raffinot. Fast and Flexible String Matching by Combining Bit-Parallelism and Suffix Automata. 1998.
Reverse Pattern?