ambiguity - liacs.leidenuniv.nl

29
Ambiguity From lecture 8: Definition A context-free grammar G is ambiguous, if for at least one x ∈ L(G ), x has more than one derivation tree (or, equivalently, more than one leftmost derivation). Otherwise: unambiguous [M] D 4.18 Automata Theory Context-Free Languages Derivation trees and ambiguity 257 / 285

Upload: others

Post on 04-May-2022

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Ambiguity - liacs.leidenuniv.nl

Ambiguity

From lecture 8:

Definition

A context-free grammar G is ambiguous, if for at least one x ∈ L(G ), xhas more than one derivation tree (or, equivalently, more than oneleftmost derivation).

Otherwise: unambiguous [M] D 4.18

Automata Theory Context-Free Languages Derivation trees and ambiguity 257 / 285

Page 2: Ambiguity - liacs.leidenuniv.nl

Ambiguous

Some cf languages are inherently ambiguous

Ambiguity is undecidable[M] Theorem 9.20

Automata Theory Context-Free Languages Derivation trees and ambiguity 258 / 285

Page 3: Ambiguity - liacs.leidenuniv.nl

Normalform

unwanted in CFG:– variables not used in successful derivations S β‡’βˆ— x ∈ Ξ£βˆ—

– A β†’ Ξ› A variable Ξ›-productions– A β†’ B A,B variables unit productions [chain rules]

Automata Theory Context-Free Languages Normalform 259 / 285

Page 4: Ambiguity - liacs.leidenuniv.nl

Normalform

unwanted in CFG:– variables not used in successful derivations S β‡’βˆ— x ∈ Ξ£βˆ—

– A β†’ Ξ› A variable Ξ›-productions– A β†’ B A,B variables unit productions [chain rules]

restricted CFG, with β€˜niceβ€˜ formChomsky normalform A β†’ BC , A β†’ Οƒ

Greibach normalform (⊠) A β†’ ΟƒB1 . . .Bk

Automata Theory Context-Free Languages Normalform 260 / 285

Page 5: Ambiguity - liacs.leidenuniv.nl

Useful etc.

CFG G = (V ,Ξ£, S ,P)

Definition

variable A is live if A β‡’βˆ— x for some x ∈ Ξ£βˆ—.

variable A is reachable if S β‡’βˆ— Ξ±AΞ² for some Ξ±,Ξ² ∈ (Ξ£ βˆͺ V βˆ—).

variable A is useful if there is a derivation of the form S β‡’βˆ— Ξ±AΞ² β‡’βˆ— x forsome string x ∈ Ξ£βˆ—.

useful implies live and reachable.For S β†’ AB | b and A β†’ a, variable A is live and reachable, not useful.[M] Exercise 4.51, 4.52, 4.53

Automata Theory Context-Free Languages Normalform 261 / 285

Page 6: Ambiguity - liacs.leidenuniv.nl

Recursion, and an algorithm

Live variables

Construction

– N0 = βˆ…

– Ni+1 = Ni βˆͺ { A ∈ V | A β†’ Ξ± in P , with Ξ± ∈ (Ni βˆͺ Ξ£)βˆ— }

N1 = { A ∈ V | A β†’ x in P , with x ∈ Ξ£βˆ— }

N0 βŠ† N1 βŠ† N2 βŠ† Β· Β· Β· βŠ† V

there exists a k such that Nk = Nk+1

A is live iff A βˆˆβ‹ƒ

i>0 Ni = Nk

(minimal) depth of derivation tree A β‡’βˆ— x

Automata Theory Context-Free Languages Normalform 262 / 285

Page 7: Ambiguity - liacs.leidenuniv.nl

Recursion, and an algorithm

Live variables

Construction

– N0 = βˆ…

– Ni+1 = Ni βˆͺ { A ∈ V | A β†’ Ξ± in P , with Ξ± ∈ (Ni βˆͺ Ξ£)βˆ— }

Exercise 4.53(c i).S β†’ ABC | BaB A β†’ aA | BaC | aaa

B β†’ bBb | a C β†’ CA | AC

Automata Theory Context-Free Languages Normalform 263 / 285

Page 8: Ambiguity - liacs.leidenuniv.nl

Algorithm, ctd.

Reachable variables

Construction

– N0 = {S}

– Ni+1 = Ni βˆͺ { A ∈ V | B β†’ Ξ±1AΞ±2 in P , with B ∈ Ni }

N0 βŠ† N1 βŠ† N2 βŠ† Β· Β· Β· βŠ† V

there exists a k such that Nk = Nk+1

A is reachable iff A βˆˆβ‹ƒ

i>0 Ni = Nk

(minimal) length of derivation S β‡’βˆ— Ξ±AΞ²

Automata Theory Context-Free Languages Normalform 264 / 285

Page 9: Ambiguity - liacs.leidenuniv.nl

Algorithm, ctd.

Reachable variables

Construction

– N0 = {S}

– Ni+1 = Ni βˆͺ { A ∈ V | B β†’ Ξ±1AΞ±2 in P , with B ∈ Ni }

N0 βŠ† N1 βŠ† N2 βŠ† Β· Β· Β· βŠ† V

there exists a k such that Nk = Nk+1

A is reachable iff A βˆˆβ‹ƒ

i>0 Ni = Nk

(minimal) length of derivation S β‡’βˆ— Ξ±AΞ²

– remove all non-live variables (and productions that contain them)– remove all unreachable variables (and their productions)

then all variables are useful

does not work the other way around . . .

Automata Theory Context-Free Languages Normalform 265 / 285

Page 10: Ambiguity - liacs.leidenuniv.nl

Algorithm, ctd.

Reachable variables

Construction

– N0 = {S}

– Ni+1 = Ni βˆͺ { A ∈ V | B β†’ Ξ±1AΞ±2 in P , with B ∈ Ni }

Exercise 4.53(c i)., ctdS β†’ BaB A β†’ aA | aaa B β†’ bBb | a

Automata Theory Context-Free Languages Normalform 266 / 285

Page 11: Ambiguity - liacs.leidenuniv.nl

Algorithm, ctd.

– remove all non-live variables (and productions that contain them)– remove all unreachable variables (and productions)

then all variables are useful

does not work the other way around . . .

Exercise 4.53(c i)., revisitedS β†’ ABC | BaB A β†’ aA | BaC | aaa

B β†’ bBb | a C β†’ CA | AC

Automata Theory Context-Free Languages Normalform 267 / 285

Page 12: Ambiguity - liacs.leidenuniv.nl

Removing Ξ›-productions

Idea:

Example

A β†’ BCDCB

B β†’ b | Ξ›

C β†’ c | Ξ›

D β†’ d

Automata Theory Context-Free Languages Normalform 268 / 285

Page 13: Ambiguity - liacs.leidenuniv.nl

Definition

variable A is nullable iff A β‡’βˆ— Ξ›

Theorem

– if A β†’ Ξ› then A is nullable

– if A β†’ B1B2 . . .Bk and all Bi are nullable, then A is nullable

[M] Def 4.26 / Exercise 4.48

Construction

– N0 = βˆ…

– Ni+1 = Ni βˆͺ { A ∈ V | A β†’ Ξ± in P , with Ξ± ∈ Nβˆ—

i }

N1 = { A ∈ V | A β†’ Ξ› in P }

N0 βŠ† N1 βŠ† N2 βŠ† Β· Β· Β· βŠ† V

there exists a k such that Nk = Nk+1

A is nullable iff A βˆˆβ‹ƒ

i>0 Ni = Nk

Automata Theory Context-Free Languages Normalform 269 / 285

Page 14: Ambiguity - liacs.leidenuniv.nl

Construction

– identify nullable variables– for every production A β†’ Ξ± add A β†’ Ξ²,

where Ξ² is obtained from Ξ± by removing one or more nullable variables– remove all Ξ›-productions (and all productions A β†’ A)

Grammar for { aibjck | i = j or i = k }

S β†’ TU | V

T β†’ aTb | Ξ›

U β†’ cU | Ξ›

V β†’ aVc | W

W β†’ bW | Ξ›

Automata Theory Context-Free Languages Normalform 270 / 285

Page 15: Ambiguity - liacs.leidenuniv.nl

Example nullable

Grammar for { aibjck | i = j or i = k }

S β†’ TU | V

T β†’ aTb | Ξ›

U β†’ cU | Ξ›

V β†’ aVc | W

W β†’ bW | Ξ›

N1 = {T ,U,W }, variables with Ξ› at right-hand side productions

N2 = {T ,U,W } βˆͺ {S ,V }, variables with {T ,U,W }βˆ— at rhs productions

N3 = N2 = {T ,U,W , S ,V }, all productions found, no new

Automata Theory Context-Free Languages Normalform 271 / 285

Page 16: Ambiguity - liacs.leidenuniv.nl

Example nullable, ctd

add all productions, where (any number of) nullable variables areremoved. . .

S β†’ TU | V

T β†’ aTb | Ξ›

U β†’ cU | Ξ›

V β†’ aVc | W

W β†’ bW | Ξ›

[M] Ex. 4.31

Automata Theory Context-Free Languages Normalform 272 / 285

Page 17: Ambiguity - liacs.leidenuniv.nl

Example nullable, ctd

add all productions, where (any number of) nullable variables are removedS β†’ TU | V S β†’ T | U | Ξ›

T β†’ aTb | Ξ› T β†’ ab

U β†’ cU | Ξ› U β†’ c

V β†’ aVc | W V β†’ ac | Ξ›

W β†’ bW | Ξ› W β†’ b

remove all Ξ›-productions. . .

[M] Ex. 4.31

Automata Theory Context-Free Languages Normalform 273 / 285

Page 18: Ambiguity - liacs.leidenuniv.nl

Example nullable, ctd

add all productions, where (any number of) nullable variables are removedS β†’ TU | V S β†’ T | U | Ξ›

T β†’ aTb | Ξ› T β†’ ab

U β†’ cU | Ξ› U β†’ c

V β†’ aVc | W V β†’ ac | Ξ›

W β†’ bW | Ξ› W β†’ b

remove all Ξ›-productionsS β†’ TU | V | T | U

T β†’ aTb | ab

U β†’ cU | c

V β†’ aVc | W | ac

W β†’ bW | b

[M] Ex. 4.31

Automata Theory Context-Free Languages Normalform 274 / 285

Page 19: Ambiguity - liacs.leidenuniv.nl

Removing Ξ›-productions

Theorem

For every CFG G there is CFG G1 without Ξ›-productions such that

L(G1) = L(G ) βˆ’ {Ξ›}.

Proof. . .[M] Thm 4.27

Automata Theory Context-Free Languages Normalform 275 / 285

Page 20: Ambiguity - liacs.leidenuniv.nl

Removing unit productions

Idea:

Example

A β†’ B | aCb

B β†’ C | Bb | Bc

C β†’ c | ABC

Automata Theory Context-Free Languages Normalform 276 / 285

Page 21: Ambiguity - liacs.leidenuniv.nl

Removing unit productions

Assume Ξ›-productions have been removed

Variable B is A-derivable, if– B 6= A, and– A β‡’βˆ— B (using only unit productions)

Construction

– N1 = { B ∈ V | B 6= A and A β†’ B in P }

– Ni+1 = Ni βˆͺ { C ∈ V | C 6= A and B β†’ C in P , with B ∈ Ni }

N1 βŠ† N2 βŠ† Β· Β· Β· βŠ† V

there exists a k such that Nk = Nk+1

B is A-derivable iff B βˆˆβ‹ƒ

i>0 Ni = Nk

Automata Theory Context-Free Languages Normalform 277 / 285

Page 22: Ambiguity - liacs.leidenuniv.nl

Removing unit productions

Construction

– for each A ∈ V , identify A-derivable variables– for every pair (A,B) where B is A-derivable,

and every production B β†’ Ξ± add A β†’ Ξ±

– remove all unit productions

Grammar for { aibjck | i = j or i = k }

S β†’ TU | V | T | U

T β†’ aTb | ab

U β†’ cU | c

V β†’ aVc | W | ac

W β†’ bW | b

Automata Theory Context-Free Languages Normalform 278 / 285

Page 23: Ambiguity - liacs.leidenuniv.nl

Example unit productions

S β†’ TU | V | T | U

T β†’ aTb | ab

U β†’ cU | c

V β†’ aVc | W | ac

W β†’ bW | b

S-derivable: {V ,T ,U}, {V ,T ,U,W } V -derivable: {W }

New productions:S β†’ aTb | ab S β†’ cU | c S β†’ aVc | W | ac S β†’ bW | b

V β†’ bW | b

Remove unit productions:S β†’ TU | aTb | ab | cU | c | aVc | ac | bW | b

T β†’ aTb | ab

U β†’ cU | c

V β†’ aVc | ac | bW | b

W β†’ bW | b

Automata Theory Context-Free Languages Normalform 279 / 285

Page 24: Ambiguity - liacs.leidenuniv.nl

Definition

CFG in Chomsky normal form

productions are of the form– A β†’ BC variables A,B ,C– A β†’ Οƒ variable A, terminal Οƒ

Theorem

For every CFG G there is CFG G1 in CNF such that L(G1) = L(G ) βˆ’ {Ξ›}.

[M] Def 4.29, Thm 4.30

Automata Theory Context-Free Languages Chomsky normalform 280 / 285

Page 25: Ambiguity - liacs.leidenuniv.nl

Construction ChNF

Construction

1 remove Ξ›-productions

2 remove unit productions

3 introduce variables for terminals Xσ → σ

4 split long productions

A β†’ aBabA

is replaced byXa β†’ a Xb β†’ b A β†’ XaBXaXbA

A β†’ ACBA

is replaced byA β†’ AY1 Y1 β†’ CY2 Y2 β†’ BA

Automata Theory Context-Free Languages Chomsky normalform 281 / 285

Page 26: Ambiguity - liacs.leidenuniv.nl

ChNF, example

Grammar for { aibjck | i = j or i = k }

S β†’ TU | V

T β†’ aTb | Ξ› U β†’ cU | Ξ›

V β†’ aVc | W W β†’ bW | Ξ›

After removing Ξ›-productions and unit productions, we obtain (see before)S β†’ TU | aTb | ab | cU | c | aVc | ac | bW | b

T β†’ aTb | ab U β†’ cU | c

V β†’ aVc | ac | bW | b W β†’ bW | b

Now introduce productions for the terminals. . .

Automata Theory Context-Free Languages Chomsky normalform 282 / 285

Page 27: Ambiguity - liacs.leidenuniv.nl

ChNF, example

Grammar for { aibjck | i = j or i = k }

S β†’ TU | V

T β†’ aTb | Ξ› U β†’ cU | Ξ›

V β†’ aVc | W W β†’ bW | Ξ›

After removing Ξ›-productions and unit productions, we obtain (see before)S β†’ TU | aTb | ab | cU | c | aVc | ac | bW | b

T β†’ aTb | ab U β†’ cU | c

V β†’ aVc | ac | bW | b W β†’ bW | b

Now introduce productions for the terminals:Xa β†’ a Xb β†’ b Xc β†’ c

S β†’ TU | XaTXb | XaXb | XcU | c | XaVXc | XaXc | XbW | b

T β†’ XaTXb | XaXb

U β†’ XcU | c

V β†’ XaVXc | XaXc | XbW | b

W β†’ XbW | b

Automata Theory Context-Free Languages Chomsky normalform 283 / 285

Page 28: Ambiguity - liacs.leidenuniv.nl

ChNF, example ctd.

Only a few productions that are too long:S β†’ XaTXb | XaVXc

T β†’ XaTXb

V β†’ XaVXc

Split these long productions. . .

Automata Theory Context-Free Languages Chomsky normalform 284 / 285

Page 29: Ambiguity - liacs.leidenuniv.nl

ChNF, example ctd.

Only a few productions that are too long:S β†’ XaTXb | XaVXc

T β†’ XaTXb

V β†’ XaVXc

Split these long productions:S β†’ XaY1 | XaY2

Y1 β†’ TXb Y2 β†’ VXc

T β†’ XaY1

V β†’ XaY2

Note that we can reuse Y1,Y2 for two productions

Automata Theory Context-Free Languages Chomsky normalform 285 / 285