kmp pattern search explained simple
TRANSCRIPT
![Page 1: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/1.jpg)
KMP Pattern Search: Quick Overview
Created by Arjun SK, arjunsk.com
![Page 2: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/2.jpg)
What is Pattern Searching?
o Suppose you are reading a text document.
o You want to search for a word.
o You press CTRL + F and search for that word.
o The word processor scans the document and shows the positions where the word occurs.

In other words, the word (the pattern) is searched for inside the text document.
![Page 3: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/3.jpg)
Implementation
![Page 4: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/4.jpg)
Naïve Approach
The naïve approach is to check whether the pattern matches the text at every possible position in the text.
P = Pattern (word) of length m
T = Text (document) of length n
The naïve string matching algorithm takes O((n-m+1)m) time in the worst case.
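The naïve approach can be sketched in a few lines of Python. The function name `naive_search` is a hypothetical name chosen for illustration; it is not from the slides.

```python
def naive_search(text, pattern):
    """Return every index in text where pattern starts (naive method)."""
    n, m = len(text), len(pattern)
    matches = []
    # Try each of the n - m + 1 possible alignments of the pattern.
    for shift in range(n - m + 1):
        # Each alignment compares up to m characters.
        if text[shift:shift + m] == pattern:
            matches.append(shift)
    return matches


print(naive_search("abcdabcaaabcbab", "abc"))  # [0, 4, 9]
```

The outer loop runs n-m+1 times and each comparison touches up to m characters, which is where the O((n-m+1)m) bound comes from.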
![Page 5: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/5.jpg)
Basic Idea of KMP
Text:    a b c d a b c a a a b c b a b
Pattern: a b c d a b c d

The first seven characters match; the mismatch occurs at the last pattern character (d against a).
We can find the next position for comparison by looking at the pattern.

Text:    a b c d a b c a a a b c b a b
Pattern:         a b c d a b c d
![Page 6: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/6.jpg)
KMP (Knuth-Morris-Pratt String Matching Algorithm)
Why KMP?
It is the best-known linear-time algorithm for exact pattern matching.
How is it implemented?
o We find repeated prefixes within the search pattern itself.
o When a comparison fails after a partial match, we can skip to the next occurrence of a pattern prefix.
o In this way, we skip redundant comparisons.
![Page 7: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/7.jpg)
Pre-processing
Let’s say we’re matching the pattern “abababca” against the text “bacbababaabcbab”.
Here’s our prefix match table, i.e. prefix-table[i]:
index  0 1 2 3 4 5 6 7
char   a b a b a b c a
value  0 0 1 2 3 4 0 1

value 1 (index 2): matching prefix, i.e. a
value 2 (index 3): matching prefix, i.e. ab
value 3 (index 4): matching prefix, i.e. aba
value 4 (index 5): matching prefix, i.e. abab
value 0 (index 6): no matching prefix
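One common way to compute such a table is sketched below. The function name `build_prefix_table` is an assumption made for illustration; the slides do not show how their table was built.

```python
def build_prefix_table(pattern):
    """table[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of pattern[:i + 1]."""
    table = [0] * len(pattern)
    k = 0  # length of the prefix matched so far
    for i in range(1, len(pattern)):
        # On a mismatch, fall back to the next shorter matching prefix.
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


print(build_prefix_table("abababca"))  # [0, 0, 1, 2, 3, 4, 0, 1]
```

The output reproduces the value row of the table above.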
![Page 8: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/8.jpg)
Pre-processing - cont.
• partial_match_length = length of the matched part of the pattern in a step.
• prefix-table = pre-processed prefix table
• If a partial match of length partial_match_length > 0 ends in a mismatch, we may skip ahead
partial_match_length - prefix-table[ partial_match_length - 1 ] characters.
// This skips over the prefix of the pattern that has already been compared and matched.
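The skip rule can be written as a small helper. This is a sketch: `shift_after_mismatch` is a hypothetical name, and treating a partial match length of 0 as a shift of one character is an assumption that follows the usual convention.

```python
def shift_after_mismatch(prefix_table, partial_match_length):
    """Return how many characters to move the pattern after a mismatch."""
    if partial_match_length == 0:
        # Nothing matched, so just move on by one character (assumed convention).
        return 1
    return partial_match_length - prefix_table[partial_match_length - 1]


# Prefix table for the pattern "abababca", taken from the slide above.
table = [0, 0, 1, 2, 3, 4, 0, 1]
print(shift_after_mismatch(table, 5))  # 5 - table[4] = 5 - 3 = 2
print(shift_after_mismatch(table, 1))  # 1 - table[0] = 1 - 0 = 1
```

A shift of 1 is the same move the naïve approach makes, so nothing is skipped in that case.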
![Page 9: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/9.jpg)
Searching
Text:    b a c b a b a b a a b c b a b
Pattern: a b a b a b c a

Text:    b a c b a b a b a a b c b a b
Pattern:   a b a b a b c a

This is a partial match length of 1. The value at prefix-table[partial_match_length - 1] (or prefix-table[0]) is 0, so we don’t get to skip ahead any.
![Page 10: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/10.jpg)
Text:    b a c b a b a b a a b c b a b
Pattern:     a b a b a b c a

Text:    b a c b a b a b a a b c b a b
Pattern:       a b a b a b c a
![Page 11: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/11.jpg)
Text:    b a c b a b a b a a b c b a b
Pattern:         a b a b a b c a

In the naïve approach, we shift right by one and compare again:

Step 1
Text:    b a c b a b a b a a b c b a b
Pattern:           a b a b a b c a

Step 2
Text:    b a c b a b a b a a b c b a b
Pattern:             a b a b a b c a
![Page 12: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/12.jpg)
But in the KMP approach, we can directly skip Step 1.

Text:    b a c b a b a b a a b c b a b
Pattern:         a b a b a b c a

This is a partial match length of 5. The value at prefix-table[partial_match_length - 1] (or prefix-table[4]) is 3.
That means we get to skip ahead partial_match_length - prefix-table[partial_match_length - 1] (or 5 - table[4] = 5 - 3 = 2) characters:

Text:    b a c b a b a b a a b c b a b
Pattern:         X X a b a b a b c a

We skip the alignment that starts at “b”. The pattern now starts at the next “aba”, i.e. the prefix that is already known to match.
![Page 13: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/13.jpg)
In KMP we can directly skip comparing “ab”.

Text:    b a c b a b a b a a b c b a b
Pattern:             a b a b a b c a

This is a partial match length of 3. The value at prefix-table[partial_match_length - 1] (or prefix-table[2]) is 1.
That means we get to skip ahead partial_match_length - prefix-table[partial_match_length - 1] (or 3 - table[2] = 3 - 1 = 2) characters:

Text:    b a c b a b a b a a b c b a b
Pattern:             X X a b a b a b c a

We skip the alignment that starts at “b”. The pattern now starts at the next “a”, i.e. the prefix match.
At this point the pattern is longer than the remaining text, so there is no match.
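The walkthrough on these slides can be replayed with a shift-based sketch of the search. The names `kmp_trace` and `build_prefix_table` are hypothetical, and recording each alignment with its partial match length is an addition made purely for illustration.

```python
def build_prefix_table(pattern):
    """table[i] = longest proper prefix of pattern[:i + 1] that is also its suffix."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_trace(text, pattern):
    """Return (matches, steps); each step is (alignment, partial_match_length)."""
    n, m = len(text), len(pattern)
    table = build_prefix_table(pattern)
    matches, steps = [], []
    shift = 0    # current alignment of the pattern against the text
    matched = 0  # characters already known to match at this alignment
    while m > 0 and shift <= n - m:
        # Extend the partial match as far as it goes.
        while matched < m and text[shift + matched] == pattern[matched]:
            matched += 1
        steps.append((shift, matched))
        if matched == m:
            matches.append(shift)
        if matched == 0:
            shift += 1  # nothing matched: move on by one character
        else:
            # Skip ahead by partial_match_length - table[partial_match_length - 1]
            # and keep the prefix that is already known to match.
            shift += matched - table[matched - 1]
            matched = table[matched - 1]
    return matches, steps


matches, steps = kmp_trace("bacbababaabcbab", "abababca")
print(matches)  # []
print(steps)    # [(0, 0), (1, 1), (2, 0), (3, 0), (4, 5), (6, 3)]
```

The last two steps are the partial matches of length 5 and 3 discussed above. After the second skip the alignment would be 8, which leaves fewer than 8 text characters, so the loop stops.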
![Page 14: Kmp pattern search explained simple](https://reader036.vdocuments.mx/reader036/viewer/2022062310/58e73f231a28ab8f028b5a73/html5/thumbnails/14.jpg)
Complexity
O(m) - to compute the prefix function values.
O(n) - to compare the pattern to the text.
Total run time: O(n + m).
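To make the O(n) search bound concrete, the sketch below counts character comparisons in an index-based variant of the search. The function names and the counter are assumptions added for illustration. Every loop iteration either advances in the text or falls back in the pattern, so the count cannot exceed 2n.

```python
def build_prefix_table(pattern):
    """table[i] = longest proper prefix of pattern[:i + 1] that is also its suffix."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_count_comparisons(text, pattern):
    """Return (matches, comparisons) for an index-based KMP scan of text."""
    table = build_prefix_table(pattern)
    matches, comparisons = [], 0
    i = j = 0  # i indexes the text, j indexes the pattern
    while pattern and i < len(text):
        comparisons += 1
        if text[i] == pattern[j]:
            i += 1
            j += 1
            if j == len(pattern):
                matches.append(i - j)
                j = table[j - 1]  # keep looking for further matches
        elif j > 0:
            j = table[j - 1]      # fall back without moving in the text
        else:
            i += 1
    return matches, comparisons


text = "a" * 20 + "b"
matches, comparisons = kmp_count_comparisons(text, "aaab")
print(matches)                       # [17]
print(comparisons <= 2 * len(text))  # True
```

The text never moves backwards, which is the property that keeps the search linear even on repetitive input like this one.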