Yesha Gupta Plagiarism detection

Upload: crystal-baker

Post on 14-Jan-2016


Page 1: Yesha Gupta Plagiarism detection. String Matching Algorithms:  KMP  LCSS  Rabin-Karp fingerprints an algorithm of choice for multiple pattern search

Yesha Gupta

Plagiarism detection

Page 2:

String Matching Algorithms:

KMP
LCSS
Rabin-Karp fingerprints
• an algorithm of choice for multiple pattern search
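As an illustration of the first algorithm in the list, here is a minimal KMP sketch. It is not the implementation used for the tests in these slides; the function names `build_failure` and `kmp_search` are chosen here for illustration only.

```python
def build_failure(pattern):
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    failure = build_failure(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches
```

The text is scanned once and never re-read, so a single search costs time linear in the combined length of text and pattern. The pattern may be of any length.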

Page 3:

Testing text file information:

21 lines
Each line (treated as a pattern) is of a different length
Max line size: 370
Minimum line size: 85

Page 4:

LCSS performed very slowly.
Rabin-Karp performed better than KMP.

Why? Efficient use of hashing techniques.
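The hashing idea can be sketched as a rolling hash: the hash of each text window is derived from the previous one in constant time, and characters are compared only when the hashes agree. This is a minimal sketch, not the code behind the results above; `BASE`, `MOD` and `rabin_karp_search` are assumed names and values.

```python
BASE = 256             # assumed radix for the polynomial hash
MOD = 1_000_000_007    # assumed large prime modulus


def rabin_karp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    high = pow(BASE, m - 1, MOD)  # weight of the window's leading character
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * BASE + ord(pattern[i])) % MOD
        w_hash = (w_hash * BASE + ord(text[i])) % MOD
    matches = []
    for start in range(n - m + 1):
        # Compare characters only on a hash hit, to rule out collisions.
        if w_hash == p_hash and text[start:start + m] == pattern:
            matches.append(start)
        if start + m < n:
            # Roll the window: drop text[start], append text[start + m].
            w_hash = ((w_hash - ord(text[start]) * high) * BASE
                      + ord(text[start + m])) % MOD
    return matches
```

Note that the window length is tied to the pattern length `m`, which matters once several patterns are searched in a single pass.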

Page 5:

KMP generated optimum output. Rabin-Karp did not.

Why? Because Rabin-Karp works with fixed-length patterns in a text, while the lines in this test differ in length.
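One plausible reading of this result, sketched below, is a multiple-pattern Rabin-Karp search that slides a single window of length `m` over the text. The slides do not show the actual implementation, so this is an assumption; `multi_pattern_search`, `poly_hash`, `BASE` and `MOD` are hypothetical names. Any pattern whose length differs from `m` cannot be matched by that window and is reported as skipped.

```python
from collections import defaultdict

BASE = 256             # assumed radix for the polynomial hash
MOD = 1_000_000_007    # assumed large prime modulus


def poly_hash(s):
    """Polynomial hash of a whole string."""
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h


def multi_pattern_search(text, patterns, m):
    """Search for all patterns of length exactly m in one pass.

    Returns (found, skipped): found maps each matched pattern to its
    start indices; skipped lists patterns whose length is not m.
    """
    by_hash = defaultdict(set)
    skipped = []
    for p in patterns:
        if len(p) == m:
            by_hash[poly_hash(p)].add(p)
        else:
            skipped.append(p)  # a length-m window can never equal it
    found = {}
    if m == 0 or len(text) < m:
        return found, skipped
    high = pow(BASE, m - 1, MOD)
    w_hash = poly_hash(text[:m])
    for start in range(len(text) - m + 1):
        if w_hash in by_hash:
            window = text[start:start + m]
            if window in by_hash[w_hash]:  # verify to rule out collisions
                found.setdefault(window, []).append(start)
        if start + m < len(text):
            w_hash = ((w_hash - ord(text[start]) * high) * BASE
                      + ord(text[start + m])) % MOD
    return found, skipped
```

With `text = "the cat sat on the mat"`, `patterns = ["cat", "mat", "sat", "the cat"]` and `m = 3`, the three-character patterns are found and `"the cat"` ends up in `skipped`. When every pattern has the same length, nothing is skipped and the output agrees with a per-pattern search such as KMP.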

Page 6:

Testing text file information:

21 lines
Each line (treated as a pattern) is of the same length

Page 7:

The results of Rabin-Karp and KMP are the same.

Why? Each pattern has the same length.

Page 8:

The execution time of Rabin-Karp is slightly better than that of KMP.