dfa minimization & equivalence to regular...

27
1 Spring 2016 CSCI 565 - Compiler Design Pedro Diniz [email protected] Lexical Analysis DFA Minimization & Equivalence to Regular Expressions Copyright 2016, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies of these materials for their personal use.

Upload: hoanghanh

Post on 04-Dec-2018

235 views

Category:

Documents


0 download

TRANSCRIPT

1

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

Lexical Analysis

DFA Minimization & Equivalence to Regular Expressions

Copyright 2016, Pedro C. Diniz, all rights reserved.Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies of these materials for their personal use.

2

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

DFA State Minimization• How to Reduce the Number of States of a DFA?

– Find unique minimum-state DFA (up to state names)– Need to recognize the same language

• Normalization– Assume every state has a transition on every symbol– If not, just add missing transitions to a dead state

• Key Idea– Find string w that distinguishes states s and t

• Algorithm– Start with accepting vs. non-accepting states partition of states– Refine state groups on all input sequences, i.e. by tracing all transitions– Until no refinement is possible

3

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

DFA State Minimization• Algorithm

– Start with accepting vs. non-accepting states partition of states– Refine state groups on all input sequences, i.e. by tracing all transitions– Until no refinement is possible

• Does this Terminate?– Refinement will end; in the limit 1 partition is 1 state

• What to do When Refinement Terminates?– Elect representative state for each partition– Merge edges– Remove unneeded states in each partition

4

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

Minimization Example

a

b

c1

1

0

d

e0

1

0

0

1s0

0,1

5

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

a

b

c

d

e

P1 P2

Minimization Example

6

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

• Label 0 does not split any partition!

a

b

c

d

e

P1 P2

00

0 0

0

Minimization Example

7

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

• Label 1 splits P1 and P2 partitions!

a

b

c

d

e

P1 P2

1

1 1

1

1

Minimization Example

8

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

• Label 1 splits P1 and P2 partitions!

a

b

c

d

e

P1 P2P3

1 1

1

1

1

Minimization Example

9

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

a

b

c

d

e

P1 P2P3

Minimization Example

10

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

• Label 0 does not splits any partition!

a

b

c

d

e

P1 P2P3

00

0 0

0

Minimization Example

11

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

• Label 1 does not splits any partition!

a

b

c

d

e

P1 P2P3

1

1 1

1

1

Minimization Example

12

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

• Elect Representative and Merge Edges

a

b

c

d

e

P1 P2P3

1 1

1

1

0,10

0 0

0

Minimization Example

13

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

• Elect Representative and Merge Edges

p3 p1 p20,1 1

0,10

s0

Minimization Example

14

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

DFA State Minimization: Algorithm

P ←{DF, {D - DF}}

while (P is still changing)

T ← Ø

for each set p ∈ P

T ← T ∪ Split(p)

P ← T

Split(S)

for each c ∈ ∑

if c splits S into s1 and s2

then return {s1 , s2}

return S

DFA = {D, ∑, d, s0, DF}

15

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

DFA to RE: Kleene Construction• Path Problem over the DFA

– Starting from state s1 (numbering of states is 1 … N - important)– Label all edges through all states to an accepting state– What to do with cycles in the DFA, as they are infinite paths?

• Kleene Construction– Iterate and merge path expressions for every pair of nodes i and j

not going through any node with label higher then k– Increase k up to N– In the end do the union of all path expressions that start at s1 and

end in a final state.

16

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

for i = 1 to Nfor j = 1 to N

R0ij = {a | δ(si,a) = sj}

if (i = j) thenR0

ij = R0ij | { ε }

for k = 1 to Nfor i = 1 to N

for j = 1 to NRk

ij = Rk-1ik (Rk-1

kk )* Rk-1kj | Rk-1

ij

L = |sj ∈ SF RN1j

i

j

k

Rk-1kk

Rk-1ik

Rk-1kj

Rk-1ij

DFA to RE: Kleene Construction

17

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

for i = 1 to Nfor j = 1 to N

R0ij = {a | δ(si,a) = sj}

if (i = j) thenR0

ij = R0ij | { ε }

for k = 1 to Nfor i = 1 to N

for j = 1 to NRk

ij = Rk-1ik (Rk-1

kk )* Rk-1kj | Rk-1

ij

L = |sj ∈ SF RN1j

i

j

k

Rk-1kk

Rk-1ik

Rk-1kj

Rk-1ij

Direct Path

DFA to RE: Kleene Construction

18

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

Indirect Path

for i = 1 to Nfor j = 1 to N

R0ij = {a | δ(si,a) = sj}

if (i = j) thenR0

ij = R0ij | { ε }

for k = 1 to Nfor i = 1 to N

for j = 1 to NRk

ij = Rk-1ik (Rk-1

kk )* Rk-1kj | Rk-1

ij

L = |sj ∈ SF RN1j

i

j

k

Rk-1kk

Rk-1ik

Rk-1kj

Rk-1ij

Direct Path

DFA to RE: Kleene Construction

19

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

DFA to RE: Example

S1 S2 S3r [0..9]

[0..9]

R012 = r

R023 = [0..9]

R033 = [0..9] | ε

20

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

S1 S2 S3r [0..9]

[0..9]

R012 = r

R023 = [0..9]

R033 = [0..9] | ε

R113 = R0

11 (R011)* R0

13 | R013 = nil

R133 = R0

31 (R011)* R0

13 | R013 = [0..9] | ε

R123 = R0

21 (R011)* R0

13 | R023 = [0..9]

R0kk = nil otherwise

DFA to RE: Example

21

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

S1 S2 S3r [0..9]

[0..9]

R012 = r

R023 = [0..9]

R033 = [0..9] | ε R2

13 = R112 (R1

22)* R123 | R1

13 = r . ε* [0..9]

R233 = R1

32 (R122)* R1

23 | R133 = [0..9] | ε

R233 = R1

32 (R122)* R1

23 | R133 = [0..9] | ε

R113 = R0

11 (R011)* R0

13 | R013 = nil

R133 = R0

31 (R011)* R0

13 | R013 = [0..9] | ε

R123 = R0

21 (R011)* R0

13 | R023 = [0..9]

R0kk = nil otherwise

DFA to RE: Example

22

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

R012 = r

R023 = [0..9]

R033 = [0..9] | ε

S1 S2 S3r [0..9]

[0..9]

R0kk = nil otherwise

R313 = R2

13 (R233)* R2

33 | R213

DFA to RE: Example

23

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

R012 = r

R023 = [0..9]

R033 = [0..9] | ε

S1 S2 S3r [0..9]

[0..9]

R0kk = nil otherwise

R313 = R2

13 (R233)* R2

33 | R213

= (r . ε* [0..9])([0..9]*)([0..9]) | r . ε* [0..9]= (r . [0..9]+) | r . [0..9]= (r . [0..9]+)

DFA to RE: Example

24

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

L(M) = R313 = r . [0..9]+

S1 S2 S3r [0..9]

[0..9]

DFA to RE: Example

25

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

DFA, NFA and REs

RE DFA

NFA SubsetConstruction

Kleene’sConstruction

Thompson’sConstruction

Code fora scanner

DFAMinimization

26

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

DFA, NFA and REs

RE DFA

NFA SubsetConstruction

Kleene’sConstruction

Thompson’sConstruction

DFAMinimization Code for

a scanner

Regular Expressions and FA are Equivalent

27

Spring 2016CSCI 565 - Compiler Design

Pedro [email protected]

Summary• DFA Minimization

– Find sequence w that discriminates states– Iterate until no possible refinement

• DFA to RE– Kleene construction– Combine Path Expression for an increasingly large set of states

• DFA and RE are Equivalent– Given one you can derive an equivalent representation in the other