4/14/20151 regular expressions and automata lecture 3: regular expressions and automata husni...

92
06/12/22 1 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

Upload: estevan-sawin

Post on 14-Dec-2015

246 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 1

REGULAR EXPRESSIONS AND AUTOMATA

Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA

Husni Al-Muhtaseb

Page 2: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 2

الرحيم الرحمن الله بسمICS 482: Natural Language

ProcessingLecture 3: REGULAR EXPRESSIONS

AND AUTOMATA

Husni Al-Muhtaseb

Page 3: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

NLP Credits and Acknowledgment

These slides were adapted from presentations of the Authors of the

bookSPEECH and LANGUAGE PROCESSING:

An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition

and some modifications from presentations found in the WEB by

several scholars including the following

Page 4: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

NLP Credits and AcknowledgmentIf your name is missing please contact me

muhtasebAt

Kfupm.Edu.sa

Page 5: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

NLP Credits and AcknowledgmentHusni Al-Muhtaseb

James MartinJim MartinDan JurafskySandiway FongSong young inPaula MatuszekMary-Angela PapalaskariDick Crouch Tracy KinL. Venkata

SubramaniamMartin Volk Bruce R. MaximJan HajičSrinath SrinivasaSimeon NtafosPaolo PirjanianRicardo VilaltaTom Lenaerts

Heshaam Feili Björn GambäckChristian Korthals Thomas G.

DietterichDevika

SubramanianDuminda

Wijesekera Lee McCluskey David J. KriegmanKathleen McKeownMichael J. CiaraldiDavid FinkelMin-Yen KanAndreas Geyer-

Schulz Franz J. KurfessTim FininNadjet BouayadKathy McCoyHans Uszkoreit Azadeh Maghsoodi

Khurshid Ahmad

Staffan Larsson

Robert Wilensky

Feiyu Xu

Jakub Piskorski

Rohini Srihari

Mark Sanderson

Andrew Elks

Marc Davis

Ray Larson

Jimmy Lin

Marti Hearst

Andrew McCallum

Nick KushmerickMark CravenChia-Hui ChangDiana MaynardJames Allan

Martha Palmerjulia hirschbergElaine RichChristof Monz Bonnie J. DorrNizar HabashMassimo PoesioDavid Goss-GrubbsThomas K HarrisJohn HutchinsAlexandros PotamianosMike RosnerLatifa Al-Sulaiti Giorgio Satta Jerry R. HobbsChristopher ManningHinrich SchützeAlexander GelbukhGina-Anne Levow Guitao GaoQing MaZeynep Altan

Page 6: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 6

Agenda: REGULAR EXPRESSIONS AND AUTOMATA

• Why to study it?– Talk to ALICE

• Regular expressions

• Finite State Automata

• Assignments

Page 7: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 7

NLP Example: Chat with Alice• http://www.pandorabots.com/pandora/talk

?botid=f5d922d97e345aa1&skin=custom_input

• A.L.I.C.E. (Artificial Linguistic Internet Computer Entity) is an award-winning free natural language artificial intelligence chat robot. The software used to create A.L.I.C.E. is available as free ("open source") Alicebot and AIML software.

• http://www.alicebot.org/about.html

Page 8: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 8

NLP Representations

• State Machines– FSAs: Finite State Automata– FSTs: Finite State Transducers– HMMs: Hidden Markov Model– ATNs: Augmented Transition Network– RTNs: Recursive Transition Network

Page 9: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 9

NLP Representations

• Rule Systems: – CFGs: Context Free Grammar– Unification Grammars– Probabilistic CFGs

• Logic-based Formalisms– 1st Order Predicate Calculus– Temporal and other Higher Order Logics

• Models of Uncertainty– Bayesian Probability Theory

Page 10: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 10

NLP Algorithms

• Most are transducers: accept or reject input, and construct new structure from input– State space search

• To manage the problem of making choices during processing when we lack the information needed to make the right choice

– Dynamic programming• To avoid having to redo work during the course

of a state-space search

Page 11: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 11

State Space Search

• States represent pairings of partially processed inputs with partially constructed answers

• Goals are exhausted inputs paired with complete answers that satisfy some criteria

• The spaces are normally too large to exhaustively explore

Page 12: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 12

Dynamic Programming

• Don’t do the same work over and over

• Avoid this by building and making use of solutions to sub-problems that must be invariant across all parts of the space

Page 13: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 13

Regular Expressions and Text Searching

• Regular expression (RE): A formula (in a special language) for specifying a set of strings

• String: A sequence of alphanumeric characters (letters, numbers, spaces, tabs, and punctuation)

Page 14: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 14

Regular Expression Patterns

• Regular Expression can be considered as a pattern to specify text search strings to search a corpus of texts

• What is Corpus?

• For text search purpose: use Perl syntax

• Show the exact part of the string in a line that first matches a Regular Expression pattern

Page 15: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 15

Regular Expression Patterns

RE String matched

/woodchucks/ “interesting links to woodchucks and lemurs”

/a/ “Sarah Ali stopped by Mona’s”

/Ali says,/ “My gift please,” Ali says,”

/book/ “all our pretty books”

/!/ “Leave him behind!” said Sami

Page 16: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 16

RE Match

/[wW]oodchuck/ Woodchuck or woodchuck

/[abc]/ “a”, “b”, or “c”

/[0123456789]/ Any digit

Page 17: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 17

RE Description

/a*/ Zero or more a’s

/a+/ One or more a’s

/a?/ Zero or one a’s

/cat|dog/ ‘cat’ or ‘dog’

/^cat$/ A line containing only ‘cat’

/\bun\B/ Beginnings of longer strings starts by ‘un’

Page 18: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 18

Example

• Find all instances of the word “the” in a text.– /the/

•What About ‘The’

– /[tT]he/•What about ‘Theater”, ‘Another’

– /\b[tT]he\b/

Page 19: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 19

Sidebar: Errors

• The process we just went through was based on two fixing kinds of errors– Matching strings that we should not have

matched (there, then, other)• False positives

– Not matching things that we should have matched (The)

• False negatives

Page 20: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 20

Sidebar: Errors

• Reducing the error rate for an application often involves two efforts – Increasing accuracy (minimizing false

positives)– Increasing coverage (minimizing false

negatives)

Page 21: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 21

Regular expressions• Basic regular expression patterns• Perl-based syntax (slightly different from

other notations for regular expressions)• Disjunctions [abc]• Ranges [A-Z]• Negations [^Ss]• Optional characters ? and *• Wild cards .• Anchors ^ and $, also \b and \B• Disjunction, grouping, and precedence |

Page 22: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 22

Page 23: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 23

Page 24: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 24

Page 25: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 25

Writing correct expressions

• Exercise: write a Perl regular expression to match the English article “the”:

/the/ missed ‘The’

included ‘the’ in ‘others’/[tT]he/

/\b[tT]he\b/ Missed ‘the25’ ‘the_’

/[^a-zA-Z][tT]he[^a-zA-Z]/Missed ‘The’ at the beginning of a line

/(^|[^a-zA-Z])[tT]he[^a-zA-Z]/

Page 26: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 26

A more complex example

• Exercise: Write a regular expression that will match “any PC with more than 500MHz and 32 Gb of disk space for less than $1000”:

Page 27: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 27

Example• Price

– /$[0-9]+/ # whole dollars – /$[0-9]+\.[0-9][0-9]/ # dollars and cents – /$[0-9]+(\.[0-9][0-9])?/ #cents optional – /\b$[0-9]+(\.[0-9][0-9])?\b/ #word boundaries

• Specifications for processor speed – /\b[0-9]+ *(MHz|[Mm]egahertz|Ghz|[Gg]igahertz)\b/

• Memory size – /\b[0-9]+ *(Mb|[Mm]egabytes?)\b/ – /\b[0-9](\.[0-9]+) *(Gb|[Gg]igabytes?)\b/

• Vendors – /\b(Win95|WIN98|WINNT|WINXP *(NT|95|98|2000|XP)?)\b/

– /\b(Mac|Macintosh|Apple)\b/

Page 28: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 28

Advanced Operators

Underscore: Correct figure 2.6

Page 29: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 29

Page 30: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 30

Page 31: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 31

Assignment: Try regular expressions in MS WORD in both

Arabic & English

Page 32: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 32

baa! baaa! baaaa! baaaaa! ...

Finite State Automata

• FSAs recognize the regular languages represented by regular expressions– SheepTalk: /baa+!/

• Directed graph with labeled nodes and arc transitions

•Five states: q0 the start state, q4 the final state, 5 transitions

q0 q4q1 q2 q3

b a aa !

Page 33: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 33

Formally

• FSA is a 5-tuple consisting of– Q: set of states {q0,q1,q2,q3,q4} : an alphabet of symbols {a,b,!}

– q0: A start state

– F: a set of final states in Q {q4} (q,i): a transition function mapping Q x

to Q

q0 q4q1 q2 q3

b a

a

a !

Page 34: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 34

• FSA recognizes (accepts) strings of a regular language– baa!– baaa!– baaaa!– …

• Tape Input: a rejected input

a b a ! b

q0 q4q1 q2 q3b a

aa !

Page 35: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 35

State Transition Table for SheepTalk

StateInput

b a !

0 1 Ø Ø

1 Ø 2 Ø

2 Ø 3 Ø

3 Ø 3 4

4 Ø Ø Ø

baa! baaa! baaaa! baaaaa! ...

q0 q4q1 q2 q3

b aa

a !

Page 36: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 36

Non-Deterministic FSAs for SheepTalk

q0 q4q1 q2 q3

b a a a !

q0 q4q1 q2 q3

b a a !

Page 37: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 37

• A language is a set of strings

• String: A sequence of letters

– Examples: “cat”, “dog”, “house”, …

– Defined over an alphabet:

Languages

zcba ,,,,

Page 38: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 38

Alphabets and Strings

• We will use small alphabets:

• Strings

abbaw

bbbaaav

abu

ba,

baaabbbaaba

baba

abba

ab

a

Page 39: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 39

Finite AutomatonInput

String

Output

String

FiniteAutomaton

Page 40: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 40

Finite Accepter

• Input

“Accept” or“Reject”

String

FiniteAutomaton

Output

Page 41: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 41

Transition Graph

initialstate

final state“accept”state

transition

abba -Finite Accepter

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

Page 42: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 42

Initial Configuration

1q 2q 3q 4qa b b a

5q

a a bb

ba,

Input Stringa b b a

ba,

0q

Page 43: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 43

Reading the Input

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b b a

ba,

Page 44: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 44

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b b a

ba,

Page 45: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 45

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b b a

ba,

Page 46: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 46

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b b a

ba,

Page 47: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 47

0q 1q 2q 3q 4qa b b a

Output: “accept”

5q

a a bb

ba,

a b b a

ba,

Page 48: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 48

Rejection

1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b a

ba,

0q

Page 49: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 49

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b a

ba,

Page 50: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 50

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b a

ba,

Page 51: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 51

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b a

ba,

Page 52: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 52

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

Output:“reject”

a b a

ba,

Page 53: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 53

Another Example

a

b ba,

ba,

0q 1q 2q

a ba

Page 54: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 54

a

b ba,

ba,

0q 1q 2q

a ba

Page 55: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 55

a

b ba,

ba,

0q 1q 2q

a ba

Page 56: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 56

a

b ba,

ba,

0q 1q 2q

a ba

Page 57: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 57

a

b ba,

ba,

0q 1q 2q

a ba

Output: “accept”

Page 58: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 58

Rejection

a

b ba,

ba,

0q 1q 2q

ab b

Page 59: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 59

a

b ba,

ba,

0q 1q 2q

ab b

Page 60: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 60

a

b ba,

ba,

0q 1q 2q

ab b

Page 61: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 61

a

b ba,

ba,

0q 1q 2q

ab b

Page 62: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 62

a

b ba,

ba,

0q 1q 2q

ab b

Output: “reject”

Page 63: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 63

Formalities

• Deterministic Finite Accepter (DFA) FqQM ,,,, 0

Q

0q

F

: set of states

: input alphabet

: transition function

: initial state

: set of final states

Page 64: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 64

About Alphabets

• Alphabets means we need a finite set of symbols in the input.

• These symbols can and will stand for bigger objects that can have internal structure.

Page 65: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 65

Input Aplhabet

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

ba,

Page 66: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 66

Set of States

Q

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

543210 ,,,,, qqqqqqQ

ba,

Page 67: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 67

Initial State

0q

1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

0q

Page 68: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 68

Set of Final States

F

0q 1q 2q 3qa b b a

5q

a a bb

ba,

4qF

ba,

4q

Page 69: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 69

Transition Function

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

QQ :

ba,

Page 70: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 70

10 , qaq

2q 3q 4qa b b a

5q

a a bb

ba,

ba,

0q 1q

Page 71: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 71

50 , qbq

1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

0q

Page 72: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 72

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

32 , qbq

Page 73: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 73

Transition Function

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

a b

0q

1q

2q

3q

4q

5q

1q 5q

5q 2q

2q 3q

4q 5q

ba,5q5q5q5q

Page 74: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 74

Extended Transition Function(Reads the entire string)

*

QQ *:*

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

Page 75: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 75

20 ,* qabq

3q 4qa b b a

5q

a a bb

ba,

ba,

0q 1q 2q

Page 76: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 76

40 ,* qabbaq

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

Page 77: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 77

50 ,* qabbbaaq

1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

0q

Page 78: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 78

50 ,* qabbbaaq

1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

0q

Observation: There is a walk from to with label

0q 5qabbbaa

Page 79: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 79

Example

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

abbaML M

accept

Page 80: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 80

Another Example

0q 1q 2q 3q 4qa b b a

5q

a a bb

ba,

ba,

abbaabML ,, M

acceptacceptaccept

Page 81: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 81

More Examples

a

b ba,

ba,

0q 1q 2q

}0:{ nbaML n

accept trap state

Page 82: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 82

ML = { all substrings with prefix }ab

a b

ba,

0q 1q 2q

accept

ba,3q

ab

Page 83: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 83

ML = { all strings without substring }001

0 00 001

1

0

1

10

0 1,0

Page 84: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 84

Regular Languages

• A language is regular if there is

• a DFA such that

• All regular languages form a language family

LM MLL

Page 85: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 85

Example• The language

• is regular:

*,: bawawaL

a

b

ba,

a

b

ba

0q 2q 3q

4q

Page 86: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 86

Finite State Automata

• Regular expressions can be viewed as a textual way of specifying the structure of finite-state automata.

Page 87: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 87

More Formally

• You can specify an FSA by enumerating the following things.– The set of states: Q– A finite alphabet: Σ– A start state– A set of accept/final states– A transition function that maps QxΣ to Q

Page 88: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 88

Dollars and Cents

Page 89: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 89

Assignment 2 - Part 1

• A windows-based version of Python interpreter is available at the supplementary material section of the course website. Please download the interpreter and practice it. Use the help, tutorials and available documentation to investigate the possibility of using Arabic text. summarize your findings.

Page 90: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 90

Assignment 2 - Part 2

• Practice search in Ms Word using regular expressions (Wildcards) for both Arabic and English. Submit at least 5 nontrivial examples.

Page 91: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 91

Assignment 2 - Part 3

• You have been asked to participate in writing an exam about chapter 2 of the textbook. Write one question to check student understanding of chapter two material. Include the answer in your submission.

Page 92: 4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb

04/18/23 92

Thank you

الله ورحمة عليكم السالم