Regular Language & Expressions

Upload: cleopatra-matthews

Post on 24-Dec-2015


TRANSCRIPT

Regular Language

A regular language is one that a finite state machine (FSM) will accept.

• ‘Alphabet’: {a, b}

• ‘Rules’: a(a | b)*

Example strings: {“a”, “aa”, “ab”, “aab”, “abb”, ...}

Note:

| - OR

* - zero or more instances
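
As a rough illustration of what "accepted by a finite state machine" means, the rule a(a | b)* can be sketched as a two-state machine. The state names, the transition table and the function name below are assumptions made for this sketch; they do not come from the slides.

```python
# Hypothetical table-driven finite state machine for the rule a(a | b)*.
# State names ("start", "accept") are illustrative only.
TRANSITIONS = {
    ("start", "a"): "accept",   # the string must begin with an a
    ("accept", "a"): "accept",  # after that, any mix of a and b is allowed
    ("accept", "b"): "accept",
}


def accepts(text: str) -> bool:
    """Return True if the machine finishes in the accepting state."""
    state = "start"
    for symbol in text:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition defined for this symbol: reject
            return False
    return state == "accept"


for s in ["a", "aa", "ab", "aab", "abb", "", "b", "ba"]:
    print(repr(s), accepts(s))
```

The machine only ever remembers which of its two states it is in, which is what makes the language regular.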

Non-regular Language

A language is non-regular if no finite state machine can accept it: recognising it would need an unlimited (infinite) number of states.
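
A standard textbook example (not taken from the slides) is the language of some number of a’s followed by exactly the same number of b’s. Checking it means counting without any upper limit, which a fixed set of states cannot do. The sketch below uses an ordinary integer counter in place of the missing states; the function name is an assumption.

```python
def is_anbn(text: str) -> bool:
    """Check for n a's followed by exactly n b's (n may be zero)."""
    count = 0  # unbounded counter: the memory a finite state machine lacks
    i = 0
    # Count the leading a's.
    while i < len(text) and text[i] == "a":
        count += 1
        i += 1
    # Cancel one a for every b that follows.
    while i < len(text) and text[i] == "b":
        count -= 1
        i += 1
    # Valid only if the whole string was consumed and the counts balance.
    return i == len(text) and count == 0


for s in ["", "ab", "aabb", "aab", "abab", "ba"]:
    print(repr(s), is_anbn(s))
```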

Regular expressions

A basic and important computing task is searching and manipulating strings.

Example:

• To search for the word ‘cat’ in a large section of text (e.g. “catching a cold”)

• To search for a specific pattern in a person’s DNA

(Pattern matching)
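
Both searches can be tried with Python's standard re module. This is only a sketch: the DNA string and the pattern TAC are made up for illustration.

```python
import re

# Search for the word 'cat' inside a longer piece of text.
text = "catching a cold"
match = re.search(r"cat", text)
print(match.start(), match.group())

# Search for every occurrence of a pattern in a (made-up) DNA sequence.
dna = "GATTACAGATTACA"
positions = [m.start() for m in re.finditer(r"TAC", dna)]
print(positions)
```

Note that ‘cat’ is found inside “catching”: a plain pattern matches anywhere in the text, not only whole words.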

Regular expressions

Sometimes a string needs to be checked against a set of rules to verify that it is valid.

Example:

• Checking an email address

- One or more lowercase letters followed by the @ symbol

- One or more lowercase letters followed by the . symbol

- One or more lowercase letters followed by the . symbol

- One or more lowercase letters followed by .co then .uk
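
Read literally, the four rules above translate into a single pattern. The sketch below assumes that literal reading; the names EMAIL_RULE and is_valid_email and the sample addresses are invented for illustration, and real email validation is considerably more involved.

```python
import re

# Literal translation of the four rules: letters, @, letters, dot,
# letters, dot, letters, then ".co" and ".uk".
# The dot is escaped (\.) because a bare dot means "any character".
EMAIL_RULE = re.compile(r"[a-z]+@[a-z]+\.[a-z]+\.[a-z]+\.co\.uk")


def is_valid_email(address: str) -> bool:
    """Return True if the whole address follows the rules."""
    return EMAIL_RULE.fullmatch(address) is not None


print(is_valid_email("jo@mail.school.org.co.uk"))  # follows every rule
print(is_valid_email("Jo@mail.school.org.co.uk"))  # uppercase letter
print(is_valid_email("jo@mail.co.uk"))             # too few parts
```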


Regular expressions notation

The notation below represents a regular expression, regex or pattern.

• ‘Alphabet’: {a, b}

• ‘Rules’: a(a | b)*

Example strings: {“a”, “aa”, “ab”, “aab”, “abb”, ...}

(This describes an infinite set, without listing all the members of the set)
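
To see how one short pattern stands for an endless set, the sketch below lists only the members up to a chosen length. The function name members and the length limit are assumptions for this example; the spaces in the slide notation are for readability and are left out of the Python pattern.

```python
import re
from itertools import product

RULE = re.compile(r"a(a|b)*")


def members(max_length: int) -> list:
    """List every string over {a, b} up to max_length that fits the rule."""
    found = []
    for length in range(1, max_length + 1):
        for letters in product("ab", repeat=length):
            candidate = "".join(letters)
            if RULE.fullmatch(candidate):
                found.append(candidate)
    return found


print(members(3))
```

Raising the limit keeps producing more members, yet the pattern itself never changes.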

Regular expressions notation

Example:

• ‘Alphabet’: {a - z}

• ‘Strings’: {“michel”, “michael”, “michell”}

What rule represents the strings above?

mich(e | ae | el)l
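
The rule can be checked against the three strings with Python's re module. This is a quick sketch; the variable names and the extra test word “michal” are assumptions.

```python
import re

# The spaces in the slide notation are for readability only.
RULE = re.compile(r"mich(e|ae|el)l")

names = ["michel", "michael", "michell", "michal"]
results = {name: RULE.fullmatch(name) is not None for name in names}
print(results)
```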

Regular expressions notation

Here are some regular expressions defined over the alphabet {a, b}:

a is a regular expression that matches a string consisting of just a

b is a regular expression that matches a string consisting of just b

ab is a regular expression that matches a string consisting of the symbol a followed by the symbol b

a* is a regular expression that matches a string consisting of zero or more a’s

a+ is a regular expression that matches a string consisting of one or more a’s

Regular expressions notation

Here are some regular expressions defined over the alphabet {a, b}:

abb? is a regular expression that matches the string ab or the string abb. The symbol ‘?’ indicates zero or one of the preceding element.

a | b is a regular expression that matches a string consisting of the symbol a or consisting of the symbol b.
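
Each piece of notation listed above can be tried out directly. The helper name matches and the sample strings are assumptions for this sketch; re.fullmatch is used so that the whole string has to fit the pattern.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


examples = {
    "a": ["a", "aa"],
    "b": ["b", "a"],
    "ab": ["ab", "ba"],
    "a*": ["", "a", "aaa"],
    "a+": ["", "a", "aaa"],
    "abb?": ["ab", "abb", "abbb"],
    "a|b": ["a", "b", "ab"],
}

for pattern, strings in examples.items():
    for s in strings:
        print(pattern, repr(s), matches(pattern, s))
```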

Regular expression

Now we will take a look at some examples.

Regular expression

Examples of strings:

abc defines the language with one string, “abc”

abc | bac defines the language with two strings, “abc” and “bac”

a+ defines the language with the strings “a”, “aa”, “aaa”, “aaaa”, ...

ab* defines the language with the strings “a”, “ab”, “abb”, “abbb”, “abbbb”, ...

(ac)* defines the language with the strings “” (the empty string), “ac”, “acac”, “acacac”, “acacacac”, ...
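
A short sketch to confirm which strings belong to each of these languages. The helper name in_language is an assumption; the spaces around | in the slide notation are dropped in the Python patterns.

```python
import re


def in_language(pattern: str, text: str) -> bool:
    """Return True if text is a member of the language the pattern defines."""
    return re.fullmatch(pattern, text) is not None


print(in_language("abc", "abc"))       # the only member
print(in_language("abc|bac", "bac"))   # one of two members
print(in_language("a+", ""))           # a+ needs at least one a
print(in_language("ab*", "a"))         # zero b's is allowed
print(in_language("(ac)*", ""))        # zero repeats gives the empty string
print(in_language("(ac)*", "aca"))     # ac must repeat as a whole unit
```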

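Regular expression

Examples of strings:

a*ca*ca defines the language of strings made up of any number of a’s, then a c, then any number of a’s, then a c, then a single a. Every string contains exactly two c’s and ends in “ca”.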

(a | c)* defines the language that describes any possible combination of a and c, including the empty string
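
The two patterns above can be compared on a few sample strings. This is a sketch only; the helper name in_language and the sample strings are assumptions.

```python
import re


def in_language(pattern: str, text: str) -> bool:
    """Return True if text is a member of the language the pattern defines."""
    return re.fullmatch(pattern, text) is not None


for s in ["cca", "aacaca", "acac", "ca"]:
    print("a*ca*ca", repr(s), in_language("a*ca*ca", s))

for s in ["", "a", "caca", "ab"]:
    print("(a|c)*", repr(s), in_language("(a|c)*", s))
```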

Regular expression

Meta characters

• Vertical bar (pipe character): |

• Question mark: ?

• Asterisk (star): *

• Plus sign: +

• Both round brackets: ( )

• Both square brackets: [ ]

• The backslash character: \

Regular expression

Meta characters

• Caret: ^

• Dollar sign: $

• Period or dot: .

• Hyphen: -
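
Because these characters have a special meaning, matching one of them literally normally requires a backslash in front of it. The sketch below uses Python's re module to show the difference for the dot, the caret and the dollar sign; the sample strings are made up.

```python
import re

# A bare dot matches any single character; an escaped dot matches only ".".
dot_any = re.fullmatch(r"a.c", "abc") is not None
dot_literal = re.fullmatch(r"a\.c", "abc") is not None
dot_escaped = re.fullmatch(r"a\.c", "a.c") is not None

# re.escape adds the backslashes automatically.
escaped_pattern = re.escape("a.c")

# Caret anchors a match to the start of the text, dollar to the end.
starts_with_cat = re.search(r"^cat", "catching") is not None
ends_with_cat = re.search(r"cat$", "catching") is not None

print(dot_any, dot_literal, dot_escaped)
print(escaped_pattern)
print(starts_with_cat, ends_with_cat)
```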

Regular expression

‘Alternatives’

A vertical bar represents alternatives:

a | b represents a or b.

Searching through the words in a paragraph, this would match any word that contains an a or a b, for example:

• ban

• bed
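
A small sketch of that search, using a made-up sentence. re.search reports a hit when the pattern occurs anywhere inside a word.

```python
import re

# Made-up sentence for illustration.
paragraph = "the ban on the old bed is cut"

# Keep every word that contains an a or a b.
hits = [word for word in paragraph.split() if re.search(r"a|b", word)]
print(hits)
```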

Regular expression

‘Character Class’

An alternative way of expressing alternation uses square brackets [ ] (e.g. [ab] means a or b).

The regular expression b[ae]d matches:

• bed

• bad

Here [ae] acts as a list of alternatives, so b[ae]d is equivalent to b(a | e)d.
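
The character class and the equivalent alternation can be compared side by side. The word list below is an assumption made for this sketch.

```python
import re

words = ["bed", "bad", "bid", "bead"]

# Character class: exactly one character, either a or e, between b and d.
with_class = [w for w in words if re.fullmatch(r"b[ae]d", w)]

# The same language written with a vertical bar.
with_bar = [w for w in words if re.fullmatch(r"b(a|e)d", w)]

print(with_class)
print(with_bar)
```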