introduction to cs theory lecture 3 – regular languages piotr faliszewski [email protected]

Introduction to CS Theory

Lecture 3 – Regular Languages

Piotr Faliszewski [email protected]

Previous Class

Discrete math: induction and recursive definitions.

Today: the quiz. 3 questions, 10 points each, 15 minutes to complete.

the quiz

Discrete Math Quiz

1. Show, using mathematical induction, that for every n ∈ N it holds that

   1 + ∑_{i=1}^{n} i·(i!) = (n+1)!     (recall that 0! = 1)

2. How many functions f: {0,1}^n → {0,1} are there? (Hint: There are 2^n strings of length n over {0,1}. Consider all possible sequences of inputs. How many ways of selecting the values of the function are there?)

3. Consider a language L defined as follows:
   1. ε ∈ L
   2. If x ∈ L then x01 ∈ L
   3. If x ∈ L then 10x ∈ L
   4. No string is in L unless it follows from rules 1, 2, 3.

   (a) Describe L in English.
   (b) How many strings of length 4, 5, and 6 are there in L?

Regular Languages

A language L over Σ is regular if it is built from {}, {ε}, and {a} for each a ∈ Σ

using the following operations: union, concatenation, Kleene’s star.

Examples:
  ({a} ∪ {b})*
  ({a}{b})*{a}
  ({a} ∪ {b})*{a}{a}({a} ∪ {b})* ∪ ({a} ∪ {b})*{b}{b}({a} ∪ {b})*

This notation is awful!

Regular Languages

A language L over Σ is regular if it is built from {}, {ε}, and {a} for each a ∈ Σ

using the following operations: union, concatenation, Kleene’s star.

Examples:
  {a,b}*
  {ab}*{a}
  {a,b}*{aa}{a,b}* ∪ {a,b}*{bb}{a,b}*

Better… but could be even simpler

We can perform some of the simple concatenations and unions.

Notation for Regular Expressions

A more convenient notation:
  Drop set parentheses
  Use + for union
  Use · for concatenation (or just drop it)
  Use (…)* for Kleene’s star

Examples:
  (a+b)*
  (ab)*a
  (a+b)*aa(a+b)* + (a+b)*bb(a+b)*

These notation conventions essentially define so-called regular expressions. See the book for a formal definition.

We often use two additional notational helpers: if r is a regular expression and i is a nonnegative integer, then we also write r^+ (short for rr*) and r^i (r concatenated with itself i times).
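The notation above can be tried out with Python's `re` module. This is only an informal sketch: the mapping in the comments (lecture `+` to `|`, r^+ to `r+`, r^i to `r{i}`) is an assumption about how to transcribe the lecture notation, and the helper name `in_language` is made up for this example.

```python
import re

# Lecture notation -> Python `re` syntax (an assumed, informal mapping):
#   union  r + s   ->  r|s
#   r^+            ->  r+      (one or more copies of r)
#   r^i            ->  r{i}    (exactly i copies of r)
# The lecture example (a+b)*aa(a+b)* + (a+b)*bb(a+b)* becomes:
DOUBLE_LETTER = re.compile(r"(a|b)*aa(a|b)*|(a|b)*bb(a|b)*")
AB_STAR_A = re.compile(r"(ab)*a")

def in_language(pattern, s):
    """True iff the WHOLE string s belongs to the language of the pattern."""
    return pattern.fullmatch(s) is not None

print(in_language(DOUBLE_LETTER, "abba"))   # True: contains bb
print(in_language(DOUBLE_LETTER, "abab"))   # False: neither aa nor bb
print(in_language(AB_STAR_A, "ababa"))      # True: (ab)(ab)a
```

`fullmatch` is used rather than `search` because a regular expression in the lecture's sense describes whole strings, not substrings.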

Regular Expression Fun!

Give regular expressions for the following languages!

Language of strings over {a,b} of even length

The language of strings over {a,b,c} in which all a’s precede all b’s and all b’s precede all c’s

The language of strings over {0,1} of length greater than 3

The language of strings of odd length over {a,b} that contain the substring bb

The language of strings over {0,1} that do not contain the substring 000.

Now… how about L = {a^i b^i | i ∈ N}? Is L regular? If so, by what regular expression?
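One way to experiment with candidate expressions is to compare them against a membership test on all short strings. The sketch below is a hypothetical helper (`first_disagreement` and `is_a_i_b_i` are names invented here), written with Python `re` syntax where `|` plays the role of the lecture's `+`. Agreement on short strings is only evidence, never a proof, but a disagreement is a definite counterexample.

```python
import re
from itertools import product

def first_disagreement(pattern, predicate, alphabet, max_len):
    """Return the first string of length <= max_len on which the regex and
    the predicate disagree, or None if they agree on every string tested."""
    regex = re.compile(pattern)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if (regex.fullmatch(s) is not None) != predicate(s):
                return s
    return None

def is_a_i_b_i(s):
    """Membership test for L = { a^i b^i | i in N }."""
    half = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * half + "b" * half

# Two tempting candidates for L, both wrong:
print(first_disagreement(r"a*b*", is_a_i_b_i, "ab", 6))    # a     (matched, not in L)
print(first_disagreement(r"(ab)*", is_a_i_b_i, "ab", 6))   # aabb  (in L, not matched)
```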

Properties of Regular Languages

For each language class it is natural to ask about its closure properties. Are regular languages closed under union, intersection, subtraction, complementation, and Kleene’s star?

It is easy to show that regular languages are closed under union and Kleene’s star, but what about the other cases?

[Figure: the languages L1 and L2 and their intersection L1 ∩ L2]

If L1 and L2 are both regular, then is L1 ∩ L2 regular?

Machines for Regular Languages?

Regular expressions:
  Give a description of the language
  Do not necessarily give a direct algorithm to recognize a language

… so what kinds of algorithms do we need to recognize regular languages?

Consider the following languages:
  Strings ending with 0
  Strings whose second to last character is 0
  Strings with an even number of 0s and 1s
  Strings ending in 1 and not containing 00

What algorithms work for them?
  How do we access the input?
  How much memory do we need?
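As a rough illustration of the questions above, here are two hand-written recognizers that read the input once, left to right, and keep only a constant amount of information. The function names are invented for this sketch, and "an even number of 0s and 1s" is read here as "an even number of 0s and an even number of 1s", which is an assumption about the intended meaning.

```python
def second_to_last_is_zero(s):
    """One left-to-right pass, remembering only the last two symbols seen."""
    prev, last = None, None
    for ch in s:
        prev, last = last, ch
    return prev == "0"

def even_zeros_and_ones(s):
    """One left-to-right pass, remembering only two parity bits."""
    even0, even1 = True, True
    for ch in s:
        if ch == "0":
            even0 = not even0
        elif ch == "1":
            even1 = not even1
    return even0 and even1

print(second_to_last_is_zero("1101"))  # True
print(even_zeros_and_ones("0110"))     # True
print(even_zeros_and_ones("011"))      # False: a single 0
```

In both cases the memory used does not grow with the length of the input, which is the kind of restriction that finite automata make precise.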

Finite Automata

Finite automata:
  The most restricted model of computation that we look at
  Input is read once, from left to right
  There is no memory, except for one register that contains the current state
  There is a fixed, finite number of states for a given FA

[Figure: input tape 0 1 0 0 1 0 1 0 0 0 0, divided into the part already read and the remaining part; the FA is in state q]

Can an FA accept the language of strings with an even number of 0s and 1s?

Transition Diagrams

How to describe an FA computation? What’s inside an FA?
  States
  One state is designated as an initial state
  Some states are designated as accepting states
  For every state we know to what other state to move based on the symbol that we scan

An FA accepts a given string x1x2…xn iff, starting in the initial state and following the transitions from state to state, it ends up in an accepting state after reading the whole string.
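The acceptance rule above can be sketched in a few lines. The two-state diagram below is a hypothetical one for "strings over {0,1} ending with 0"; it is not the diagram from the slides, and the state names `no` and `yes` are invented for this example.

```python
# Hypothetical diagram for "strings over {0,1} ending with 0".
# Each entry is one arrow: (current state, scanned symbol) -> next state.
TRANSITIONS = {
    ("no", "0"): "yes", ("no", "1"): "no",
    ("yes", "0"): "yes", ("yes", "1"): "no",
}
INITIAL = "no"
ACCEPTING = {"yes"}

def run(word):
    """Follow the arrows symbol by symbol; return the list of visited states."""
    state = INITIAL
    visited = [state]
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
        visited.append(state)
    return visited

def accepts(word):
    """Accept iff the state reached after the whole word is accepting."""
    return run(word)[-1] in ACCEPTING

print(run("0110"))      # ['no', 'yes', 'no', 'no', 'yes']
print(accepts("0110"))  # True
print(accepts("01"))    # False
```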


[Figure: transition diagram of an example FA; edge labels 0, 1, 1, 0, and 0,1]

Example:
  Accepts: 001, 111, 01
  Rejects: 00, 010

Finite Automata Examples

Give transition diagrams for the following languages L:

L is a language of strings that contain at least 3 a’s

L is a language of strings that contain aaba as a substring

L is a language of strings that do not contain aaba as a substring

L is a language of strings with an even number of a’s and an even number of b’s

Formal Definition of Finite Automata

An FA is a quintuple M = (Q, Σ, q0, A, δ) where
  Q is a finite set of states
  Σ is an alphabet of input symbols
  q0 is the initial state (q0 ∈ Q)
  A ⊆ Q is a set of accepting states
  δ is a total function from Q × Σ to Q: the transition function (often specified as a table)

Exercise: Express the FAs from the previous slide in terms of this formalism.

Note: FAs are sometimes called DFAs. (D = deterministic)
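A minimal sketch of the quintuple as a data structure, with a check of the side conditions (q0 ∈ Q, A ⊆ Q, δ total on Q × Σ). The class `FA`, the function `is_well_formed`, and the example machine (at least one a, over an assumed alphabet {a, b}) are all illustrative choices, not part of the lecture.

```python
from typing import NamedTuple

class FA(NamedTuple):
    states: frozenset      # Q
    alphabet: frozenset    # Sigma
    initial: str           # q0
    accepting: frozenset   # A
    delta: dict            # transition table: (state, symbol) -> state

def is_well_formed(m):
    """Check q0 in Q, A subset of Q, and that delta is a total function
    from Q x Sigma to Q (exactly one entry for every state/symbol pair)."""
    expected_keys = {(q, a) for q in m.states for a in m.alphabet}
    return (
        m.initial in m.states
        and m.accepting <= m.states
        and set(m.delta) == expected_keys
        and all(target in m.states for target in m.delta.values())
    )

# Hypothetical example: strings over {a, b} with at least one a.
M = FA(
    states=frozenset({"s0", "s1"}),
    alphabet=frozenset({"a", "b"}),
    initial="s0",
    accepting=frozenset({"s1"}),
    delta={("s0", "a"): "s1", ("s0", "b"): "s0",
           ("s1", "a"): "s1", ("s1", "b"): "s1"},
)
print(is_well_formed(M))                    # True
print(is_well_formed(M._replace(delta={}))) # False: delta is not total
```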

Formal Definition of Finite Automata

Let M = (Q, Σ, q0, A, δ) be an FA

Is the definition complete?
  It’s not enough to say what an FA is…
  … we have to define how it works!

The transition function:
  δ(q, a) = q’ says to what state q’ we move if in state q we see symbol a

We extend δ as follows:
  Let δ*(q, x) = q’ such that q’ is the state to which the FA goes if it starts in state q and reads string x

How can we define δ* formally?

Definition. We say that a finite automaton M = (Q, Σ, q0, A, δ) accepts a string x ∈ Σ* iff δ*(q0, x) ∈ A.

L(M) = set of strings accepted by M
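One standard way to define δ* is by recursion on the string: δ*(q, ε) = q, and δ*(q, ya) = δ(δ*(q, y), a) for y ∈ Σ* and a ∈ Σ. The sketch below mirrors that recursion directly; the tuple layout and the example machine for "even number of 0s" are assumptions made for illustration.

```python
def delta_star(delta, q, x):
    """Extended transition function, by recursion on the string x:
         delta*(q, "")    = q
         delta*(q, y + a) = delta(delta*(q, y), a)"""
    if x == "":
        return q
    y, a = x[:-1], x[-1]
    return delta[(delta_star(delta, q, y), a)]

def accepts(fa, x):
    """M accepts x iff delta*(q0, x) is in A."""
    delta, q0, accepting = fa
    return delta_star(delta, q0, x) in accepting

# Hypothetical FA over {0, 1}: state "e" = even number of 0s so far, "o" = odd.
EVEN_ZEROS = (
    {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
    "e",
    {"e"},
)
print(accepts(EVEN_ZEROS, "1001"))  # True: two 0s
print(accepts(EVEN_ZEROS, "10"))    # False: one 0
```

The recursive form is convenient for proofs by induction on the length of x; an implementation meant for long inputs would more likely use a loop.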

Kleene’s Theorem

Why study finite automata?
  We have defined finite automata and their languages…
  But how does that help us in the study of regular languages?

Kleene’s theorem!
  A language L is regular if and only if there is a finite automaton that accepts L

Consequences of Kleene’s theorem:
  We can prove facts about regular languages by studying either regular expressions or finite automata
  In particular, some closure properties of regular languages are easier to prove based on finite automata!
  We can hope to write a grep program that matches text against regular expressions!

Closure Properties Revisited

Using Kleene’s theorem we can now show that regular languages are closed under:
  Union
  Intersection
  Subtraction
  Complement

Proof technique:
  Given L1 and L2, take their corresponding FAs M1 and M2
  Run M1 and M2 in parallel and decide string membership based on their states

Focus on intersection

M1 = (Q1, Σ, q1, A1, δ1)

M2 = (Q2, Σ, q2, A2, δ2)

We construct M such that: L(M) = L(M1) ∩ L(M2)

M = (Q, Σ, q0, A, δ), where

Q = Q1 × Q2

q0 = (q1, q2)

A = {(p,q) | p ∈ A1 and q ∈ A2}

For all p ∈ Q1, q ∈ Q2, a ∈ Σ, set

δ((p,q),a) = (δ1(p,a), δ2(q,a))
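The construction above translates almost line by line into code. This is a sketch under assumed conventions: each machine is a tuple (states, initial, accepting, delta) with delta given as a dictionary, and the two input machines (even number of 0s, ends with 1) are hypothetical examples.

```python
from itertools import product

def intersect(m1, m2, alphabet):
    """Product construction for L(M1) ∩ L(M2)."""
    q1s, start1, acc1, d1 = m1
    q2s, start2, acc2, d2 = m2
    states = set(product(q1s, q2s))                       # Q  = Q1 x Q2
    initial = (start1, start2)                            # q0 = (q1, q2)
    accepting = {(p, q) for p in acc1 for q in acc2}      # A  = A1 x A2
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)])        # both machines step
             for (p, q) in states for a in alphabet}
    return states, initial, accepting, delta

def accepts(m, word):
    _, state, accepting, delta = m
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting

# Hypothetical inputs: M1 = even number of 0s, M2 = ends with 1.
M1 = ({"e", "o"}, "e", {"e"},
      {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"})
M2 = ({"n", "y"}, "n", {"y"},
      {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"})
M = intersect(M1, M2, "01")
print(accepts(M, "001"))  # True: two 0s and ends with 1
print(accepts(M, "01"))   # False: only one 0
```

Only the accepting set depends on the operation: requiring p ∈ A1 or q ∈ A2 gives union, and requiring p ∈ A1 and q ∉ A2 gives subtraction.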

Proving Languages Not Regular

We can use Kleene’s theorem to show that certain languages are not regular
  The so-called pumping lemma
  … but here we will directly use properties of FAs to show that a language is not regular

Consider

L = {a^i b^i | i ∈ N}

We can show that L is not regular via showing that no FA can possibly accept L.

A proof by contradiction.
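The slides do not spell the argument out; one common version uses the pigeonhole principle. If an FA has k states, then among the k+1 strings a^0, a^1, …, a^k two of them, say a^i and a^j with i < j, lead to the same state, so the FA treats a^i b^i and a^j b^i alike and must misclassify one of them. The sketch below turns this into a procedure that, for any given FA, produces a string on which it disagrees with L. The function name and the example machine are hypothetical.

```python
def misclassified_string(delta, q0, accepting, num_states):
    """For a total FA over an alphabet containing a and b, return a string
    on which the FA disagrees with L = { a^i b^i | i in N }."""
    seen = {}                 # state reached after a^i  ->  i
    state = q0
    for j in range(num_states + 1):    # a^0 .. a^k: k+1 strings, only k states
        if state in seen:
            i = seen[state]            # a^i and a^j reach the same state, i < j
            break
        seen[state] = j
        state = delta[(state, "a")]
    else:
        raise ValueError("num_states is smaller than the real number of states")
    for _ in range(i):                 # now read b^i from that shared state
        state = delta[(state, "b")]
    if state in accepting:
        return "a" * j + "b" * i       # accepted, but not in L because i != j
    return "a" * i + "b" * i           # rejected, but it is in L

# Hypothetical FA for a*b* (state "d" is a dead state).
DELTA = {("s0", "a"): "s0", ("s0", "b"): "s1",
         ("s1", "a"): "d",  ("s1", "b"): "s1",
         ("d", "a"): "d",   ("d", "b"): "d"}
print(misclassified_string(DELTA, "s0", {"s0", "s1"}, 3))  # a
```

Since such a string exists for every FA, no FA accepts exactly L, and by Kleene’s theorem L is not regular.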
