Regular Language & Expressions
Regular Language
A regular language is one that a finite state machine (FSM) will accept.
• ‘Alphabet’: {a, b}
• ‘Rules’: {a(a | b)*}
Example strings: {“a”, “aa”, “ab”, “aab”, “abb”, ...}
Note:
| - OR
* - zero or more instances
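As a rough illustration of "a finite state machine will accept", the sketch below simulates one possible machine for a(a | b)*. The state names and the `accepts` helper are assumptions made for this example, not part of the original notes.

```python
# Hypothetical three-state machine for a(a|b)*:
#   "start"  -> nothing read yet
#   "accept" -> the string began with 'a'
#   "reject" -> the string began with 'b' (a dead state)
TRANSITIONS = {
    ("start", "a"): "accept",
    ("start", "b"): "reject",
    ("accept", "a"): "accept",
    ("accept", "b"): "accept",
    ("reject", "a"): "reject",
    ("reject", "b"): "reject",
}


def accepts(text: str) -> bool:
    """Return True if the machine finishes in the accepting state."""
    state = "start"
    for symbol in text:
        if symbol not in ("a", "b"):
            return False  # symbol is outside the alphabet {a, b}
        state = TRANSITIONS[(state, symbol)]
    return state == "accept"


for s in ["a", "aa", "ab", "aab", "abb", "", "b", "ba"]:
    print(repr(s), accepts(s))
```

The machine needs only a fixed, finite set of states however long the input is, which is what makes the language regular.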
Non-regular Language
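A language is non-regular if no finite state machine can accept it: recognising it would need an unlimited (infinite) number of states.
For example, the language of n a’s followed by exactly n b’s is non-regular, because the machine would have to count arbitrarily many a’s.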
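One standard example of a non-regular language is "n a's followed by exactly n b's". The sketch below checks it with a counter that can grow without limit, which is exactly the kind of memory a finite state machine does not have. The function name `is_anbn` is an assumption made for this example.

```python
def is_anbn(text: str) -> bool:
    """Check whether text is n a's followed by exactly n b's (n >= 0)."""
    count = 0  # unbounded counter: a finite state machine has no equivalent
    i = 0
    # Count the leading a's.
    while i < len(text) and text[i] == "a":
        count += 1
        i += 1
    # Cancel one a for every b that follows.
    while i < len(text) and text[i] == "b":
        count -= 1
        i += 1
    # Valid only if the whole string was consumed and the counts balanced.
    return i == len(text) and count == 0


for s in ["", "ab", "aabb", "aab", "abb", "ba"]:
    print(repr(s), is_anbn(s))
```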
Regular expressions
A basic and important computing task is searching and manipulating strings.
Example:
• To search for the word ‘cat’ in a large section of text (e.g. “catching a cold”)
• To search for a specific pattern in a person’s DNA
(Pattern matching)
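Both searches can be sketched with Python's standard `re` module. The DNA sequence and the pattern "TAC" below are made-up values used only for illustration.

```python
import re

# Search for the word 'cat' inside a longer piece of text.
text = "catching a cold"
match = re.search("cat", text)
cat_position = match.start() if match else None

# Find every position of a short pattern in a (made-up) DNA sequence.
dna = "GATTACAGATTACA"
tac_positions = [m.start() for m in re.finditer("TAC", dna)]

print(cat_position)
print(tac_positions)
```

Note that 'cat' is found inside "catching": a plain pattern matches anywhere in the text, not only whole words.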
Regular expressions
Sometimes a string needs to be checked against a set of rules to verify that it is valid.
Example:
• Checking an email address
- One or more lowercase letters followed by the @ symbol
- One or more lowercase letters followed by the . symbol
- One or more lowercase letters followed by .co then .uk
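The rules above could be turned into a pattern along the following lines. This is one possible reading only (each distinct rule applied once, in order), and the sample addresses are invented for the example; real email validation is considerably more involved.

```python
import re

# Assumed reading of the rules:
#   letters @ letters . letters .co .uk
EMAIL = re.compile(r"[a-z]+@[a-z]+\.[a-z]+\.co\.uk")


def is_valid(address: str) -> bool:
    """True if the whole address follows the assumed rules."""
    return EMAIL.fullmatch(address) is not None


for address in ["jo@mail.school.co.uk", "Jo@mail.school.co.uk", "jo@school.com"]:
    print(address, is_valid(address))
```

The dot is written as `\.` because an unescaped dot is a meta character that matches any character.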
Regular expressions notation
The notation below represents a regular expression, also called a regex or pattern.
• ‘Alphabet’: {a, b}
• ‘Rules’: {a(a | b)*}
Example strings: {“a”, “aa”, “ab”, “aab”, “abb”, ...}
(This describes an infinite set, without listing all the members of the set)
Regular expressions notation
Example:
• ‘Alphabet’: {a - z}
• ‘Strings’: {“michel”, “michael”, “michell”}
What rule represents the strings above?
{mich(e | ae | el)l}
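The rule can be checked against the three names with a short script. The pattern is written in lowercase to stay within the alphabet {a - z}; the two extra test strings are invented for contrast.

```python
import re

pattern = re.compile(r"mich(e|ae|el)l")

names = ["michel", "michael", "michell", "michl", "michaell"]
results = {name: pattern.fullmatch(name) is not None for name in names}

for name, ok in results.items():
    print(name, ok)
```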
Regular expressions notation
Here are some regular expressions defined over the alphabet {a, b}:
a is a regular expression that matches a string consisting of just a
b is a regular expression that matches a string consisting of just b
ab is a regular expression that matches a string consisting of the symbol a followed by the symbol b
a* is a regular expression that matches a string consisting of zero or more a’s
a+ is a regular expression that matches a string consisting of one or more a’s
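These five expressions can be tried out directly. The table of test strings and the `matches` helper are assumptions made for this sketch; `re.fullmatch` is used so that the whole string has to match the pattern.

```python
import re

# Each pattern is paired with strings it should and should not match in full.
checks = {
    "a":  {"a": True, "b": False, "aa": False},
    "b":  {"b": True, "a": False},
    "ab": {"ab": True, "ba": False, "a": False},
    "a*": {"": True, "a": True, "aaa": True, "ab": False},
    "a+": {"": False, "a": True, "aaa": True},
}


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


for pattern, cases in checks.items():
    for text, expected in cases.items():
        print(f"{pattern!r} vs {text!r}: {matches(pattern, text)} (expected {expected})")
```

The only difference between a* and a+ is the empty string: a* accepts it, a+ does not.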
Regular expressions notation
Here are some more regular expressions defined over the alphabet {a, b}:
abb? is a regular expression that matches the string ab or the string abb. The symbol ‘?’ indicates zero or one occurrence of the preceding element.
a | b is a regular expression that matches a string consisting of the symbol a or consisting of the symbol b.
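A minimal sketch of the optional and alternation operators, using a small hypothetical helper `full` that requires the whole string to match:

```python
import re

optional_b = re.compile(r"abb?")  # 'ab' followed by zero or one more 'b'
either = re.compile(r"a|b")       # a single 'a' or a single 'b'


def full(regex: re.Pattern, text: str) -> bool:
    """True if the whole of text matches the compiled pattern."""
    return regex.fullmatch(text) is not None


for text in ["ab", "abb", "abbb", "a"]:
    print("abb?", repr(text), full(optional_b, text))

for text in ["a", "b", "ab", ""]:
    print("a|b", repr(text), full(either, text))
```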
Regular expression
Now we will take a look at some examples.
Regular expression
Examples of strings:
abc defines the language with one string, “abc”
abc | bac defines the language with two strings, “abc” and “bac”
a+ defines the language with the strings “a”, “aa”, “aaa”, “aaaa”, and so on
ab* defines the language with the strings “a”, “ab”, “abb”, “abbb”, “abbbb”, and so on
(ac)* defines the language with the strings “” (the empty string), “ac”, “acac”, “acacac”, “acacacac”, and so on
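One way to see which strings a pattern defines is to generate every string over the alphabet up to some length and keep the ones that match in full. The `language` helper below is a hypothetical brute-force sketch (requires Python 3.9 or later for the `list[str]` annotation), not an efficient method.

```python
import re
from itertools import product


def language(pattern: str, alphabet: str, max_len: int) -> list[str]:
    """List all strings over alphabet, up to max_len symbols, that match pattern."""
    regex = re.compile(pattern)
    found = []
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            candidate = "".join(letters)
            if regex.fullmatch(candidate):
                found.append(candidate)
    return found


print(language("abc|bac", "abc", 3))
print(language("a+", "a", 4))
print(language("ab*", "ab", 3))
print(language("(ac)*", "ac", 4))
```

Because * means zero or more, ab* includes the string "a" on its own, and (ac)* includes the empty string.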
Regular expression
Examples of strings:
a*ca*ca* defines the language of strings with any number of a’s but exactly two c’s
(a | c)* defines the language that describes any possible combination of a and c, including the empty string
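The two descriptions can be tested by brute force over short strings. This sketch assumes the first pattern is a*ca*ca*, with a final star so that trailing a's are optional; the helper `strings_over` is invented for the example.

```python
import re
from itertools import product

two_c = re.compile(r"a*ca*ca*")
any_ac = re.compile(r"(a|c)*")


def strings_over(alphabet: str, max_len: int):
    """Yield every string over alphabet with at most max_len symbols."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


# a*ca*ca* should accept a string exactly when it contains two c's.
agree = all(
    (two_c.fullmatch(s) is not None) == (s.count("c") == 2)
    for s in strings_over("ac", 6)
)

# (a|c)* should accept every string made only of a's and c's.
all_accepted = all(any_ac.fullmatch(s) is not None for s in strings_over("ac", 6))

print(agree, all_accepted)
```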
Regular expression
Meta characters
Vertical bar (pipe character) |
Question mark ?
Asterisk (star) *
Plus sign +
Both round brackets ( )
Both square brackets [ ]
The backslash character \
Regular expression
Meta characters
Caret ^
Dollar sign $
Period or dot .
Hyphen -
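Because meta characters have a special meaning inside a pattern, matching one literally needs a backslash in front of it. The sketch below shows this for the plus sign in Python's `re` module; the sample string "a+b" is invented for the example.

```python
import re

literal = "a+b"  # here '+' is meant as an ordinary character

# Unescaped, '+' means "one or more a", so the text "a+b" does not match.
as_pattern = re.fullmatch(r"a+b", literal)

# Escaped with a backslash, '\+' matches a literal plus sign.
escaped = re.fullmatch(r"a\+b", literal)

# re.escape adds the backslashes automatically.
auto_escaped = re.fullmatch(re.escape(literal), literal)

print(as_pattern is None, escaped is not None, auto_escaped is not None)
```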
Regular expression
‘Alternatives’
A vertical bar represents alternatives:
a | b represents a or b.
When searching through the words in a paragraph, this pattern might match either:
• ban
• bed
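A minimal sketch of that search: the paragraph below is made up, and each word is kept if it contains an a or a b anywhere.

```python
import re

paragraph = "the ban on the red bed in this room is over"

# Keep every word in which the pattern a|b is found at least once.
hits = [word for word in paragraph.split() if re.search(r"a|b", word)]

print(hits)
```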
Regular expression
‘Character Class’
An alternative way of expressing alternation uses square brackets [ ] (e.g. [ab] means a or b).
The regular expression b[ae]d matches:
• bed
• bad
Here [ae] acts as the list of alternatives for the middle character.
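The character class and the vertical-bar form can be compared side by side. The word list is invented for this sketch.

```python
import re

words = ["bed", "bad", "bid", "bd", "baed"]

# Character class: exactly one of 'a' or 'e' between 'b' and 'd'.
with_class = [w for w in words if re.fullmatch(r"b[ae]d", w)]

# The same rule written with a vertical bar.
with_bar = [w for w in words if re.fullmatch(r"b(a|e)d", w)]

print(with_class)
print(with_bar)
```

Both forms accept the same words; the character class is simply shorter when every alternative is a single character.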