Transformational grammars

Anastasia Berdnikova & Denis Miretskiy


Page 1: Transformational grammars

Transformational grammars

Anastasia Berdnikova & Denis Miretskiy

Page 2: Transformational grammars

Overview

- Transformational grammars – definition
- Regular grammars
- Context-free grammars
- Context-sensitive grammars
- Break
- Stochastic grammars
- Stochastic context-free grammars for sequence modelling

Page 3: Transformational grammars

Why transformational grammars?

The 3-dimensional folding of proteins and nucleic acids

Extensive physical interactions between residues

Chomsky hierarchy of transformational grammars [Chomsky 1956; 1959]

Application to molecular biology [Searls 1992; Dong & Searls 1994; Rosenblueth et al. 1996]

Page 4: Transformational grammars

Introduction

'Colourless green ideas sleep furiously.' Chomsky constructed finite formal machines – 'grammars'.

'Does the language contain this sentence?' is intractable; 'Can the grammar create this sentence?' can be answered.

Transformational grammars (TG) are sometimes called generative grammars.

Page 5: Transformational grammars

Definition

TG = ( {symbols}, {rewriting rules α → β, called productions} )

{symbols} = {nonterminals} ∪ {terminals}. α contains at least one nonterminal; β contains terminals and/or nonterminals.

Example: S → aS, S → bS, S → ε (written S → aS | bS | ε).
Derivation: S => aS => abS => abbS => abb.
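
How generation works can be sketched in a few lines of code (an illustration added here, not part of the slides); the rule table and function name are assumptions:

```python
import random

# A toy generator (illustrative assumption) for the grammar S -> aS | bS | e:
# rewrite nonterminals until only terminals remain.
rules = {"S": ["aS", "bS", ""]}

def generate(start="S"):
    out, stack = [], [start]
    while stack:
        sym = stack.pop()
        if sym in rules:                          # nonterminal: apply a production
            for ch in reversed(random.choice(rules[sym])):
                stack.append(ch)
        else:                                     # terminal: emit it
            out.append(sym)
    return "".join(out)

print(generate())  # e.g. 'abb', via S => aS => abS => abbS => abb
```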

Page 6: Transformational grammars

The Chomsky hierarchy

W – a nonterminal, a – a terminal, α and γ – strings of nonterminals and/or terminals including the null string, β – the same not including the null string.

regular grammars: W → aW or W → a
context-free grammars: W → β
context-sensitive grammars: α1Wα2 → α1βα2; AB → BA is also allowed
unrestricted (phrase structure) grammars: α1Wα2 → γ

Page 7: Transformational grammars

The Chomsky hierarchy

Page 8: Transformational grammars

Automata

Each grammar has a corresponding abstract computational device – an automaton.

Grammars are generative models; automata are parsers that accept or reject a given sequence.

- Automata are often easier to describe and understand than their equivalent grammars.
- Automata give a more concrete idea of how we might recognise a sequence using a formal grammar.

Page 9: Transformational grammars

Parser abstractions associated with the hierarchy of grammars

----------------------------------------------------------------------
Grammar                        Parsing automaton
----------------------------------------------------------------------
regular grammars               finite state automaton
context-free grammars          push-down automaton
context-sensitive grammars     linear bounded automaton
unrestricted grammars          Turing machine
----------------------------------------------------------------------

Page 10: Transformational grammars

Regular grammars

W → aW or W → a; sometimes allowed: W → ε.

RG generate sequences from left to right (or right to left: W → Wa or W → a).

RG cannot describe long-range correlations between the terminal symbols (they model 'primary sequence' only).

Page 11: Transformational grammars

An odd regular grammar

An example of a regular grammar that generates only those strings of a's and b's that contain an odd number of a's:

start from S,

S → aT | bS,

T → aS | bT | ε.
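
A minimal sketch (an addition, not from the slides) of the equivalent finite state automaton: the two nonterminals act as the two states, and only T may terminate.

```python
# S and T act as the states of the equivalent FSA; T means "an odd number of
# a's seen so far", and only T may terminate (T -> e).
def odd_a(string):
    state = "S"                          # start nonterminal / start state
    for ch in string:
        if ch not in "ab":
            return False                 # symbol not accepted: halt and reject
        if ch == "a":                    # S -> aT and T -> aS flip the state
            state = "T" if state == "S" else "S"
        # 'b' keeps the state: S -> bS, T -> bT
    return state == "T"                  # accept only if T -> e can apply

print(odd_a("ab"))    # True: one a
print(odd_a("abba"))  # False: two a's
```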

Page 12: Transformational grammars

Finite state automata

An FSA reads one symbol at a time from an input string. If the symbol is accepted, the automaton enters a new state; if the symbol is not accepted, the automaton halts and rejects the string. If the automaton reaches a final 'accepting' state, the input string has been successfully recognised and parsed by the automaton.

{states, state transitions} of the FSA correspond to {nonterminals, productions} of the equivalent grammar.

Page 13: Transformational grammars

FMR-1 triplet repeat region

Human FMR-1 mRNA sequence, fragment

. . . GCG CGG CGG CGG CGG CGG CGG CGG CGG

CGG CGG AGG CGG CGG CGG CGG CGG CGG CGG

CGG CGG AGG CGG CGG CGG CGG CGG CGG CGG

CGG CGG CTG . . .

[Figure: finite state automaton with states S, 1–8 and an end state ε; it emits g c g, loops over the triplet repeats (c or a, then g, g), and finishes with c t g.]

Page 14: Transformational grammars

Moore vs. Mealy machines

Finite automata that accept on transitions are called Mealy machines. Finite automata that accept on states are called Moore machines (like HMMs).

The two types of machines are interconvertible: S → gW1 in the Mealy machine corresponds to S → gŴ1, Ŵ1 → gW1 in the Moore machine.

Page 15: Transformational grammars

Deterministic vs. nondeterministic automata

In a deterministic finite automaton, no more than one accepting transition is possible for any state and any input symbol.

The FMR-1 automaton above is an example of a nondeterministic finite automaton.

Parsing with a deterministic finite state automaton is extremely efficient [BLAST].

Page 16: Transformational grammars

PROSITE patterns

RU1A_HUMAN  S R S L K M R G Q A F V I F K E V S S A T
SXLF_DROME  K L T G R P R G V A F V R Y N K R E E A Q
ROC_HUMAN   V G C S V H K G F A F V Q Y V N E R N A R
ELAV_DROME  G N D T Q T K G V G F I R F D K R E E A T

RNP-1 motif: [RK] – G – {EDRKHPCG} – [AGSCI] – [FY] – [LIVA] – x – [FYM].

A PROSITE pattern = pattern element - pattern element - ... - pattern element. In a pattern element, a letter indicates the single-letter amino-acid code; [] – any one of the enclosed residues can occur; {} – any residue except the enclosed ones can occur; x – any residue can occur at this position.

Page 17: Transformational grammars

A regular grammar for PROSITE patterns

S → rW1 | kW1

W1 → gW2

W2 → [afilmnqstvwy]W3

W3 → [agsci]W4

W4 → fW5 | yW5

W5 → lW6 | iW6 | vW6 | aW6

W6 → [acdefghiklmnpqrstvwy]W7

W7 → f | y | m

[ac]W means aW | cW
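
Since PROSITE patterns are regular grammars, the same motif can also be written as a regular expression; a small sketch (an addition, not from the slides):

```python
import re

# The RNP-1 motif as a regex: {EDRKHPCG} becomes a negated character class
# (here assumed to be applied only to amino-acid strings), x becomes '.'.
RNP1 = re.compile(r"[RK]G[^EDRKHPCG][AGSCI][FY][LIVA].[FYM]")

print(bool(RNP1.search("RGQAFVIF")))  # True: the RU1A_HUMAN match above
```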

Page 18: Transformational grammars

What a regular grammar can't do

RG cannot describe language L when:

- L contains all the strings of the form aa, bb, abba, baab, abaaba, etc. (a palindrome language);
- L contains all the strings of the form aa, abab, aabaab (a copy language).

Page 19: Transformational grammars

Regular language:    a b a a a b
Palindrome language: a a b b a a
Copy language:       a a b a a b

Palindrome and copy languages have correlations between distant positions.

Page 20: Transformational grammars

Context-free grammars

The reason: RNA secondary structure is a kind of palindrome language.

The context-free grammars (CFG) permit additional rules that allow the grammar to create nested, long-distance pairwise correlations between terminal symbols.

S → aSa | bSb | aa | bb

S => aSa => aaSaa => aabSbaa => aabaabaa
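
Membership in this palindrome language can be tested recursively, mirroring the grammar; a minimal sketch (an addition, not from the slides):

```python
# Recursive membership test for S -> aSa | bSb | aa | bb.
def in_language(s):
    if len(s) == 2:
        return s in ("aa", "bb")            # S -> aa | bb
    return (len(s) >= 4 and s[0] in "ab"    # S -> aSa | bSb
            and s[0] == s[-1] and in_language(s[1:-1]))

print(in_language("aabaabaa"))  # True: S => aSa => aaSaa => aabSbaa => aabaabaa
print(in_language("aabb"))      # False
```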

Page 21: Transformational grammars

A context-free grammar for an RNA stem loop

[Figure: three RNA stem loops; ● marks base pairs in the stem, x marks mismatches.]

seq 1: C A G G A A A C U G
seq 2: G C U G C A A A G C
seq 3: G C U G C A A C U G

S → aW1u | cW1g | gW1c | uW1a
W1 → aW2u | cW2g | gW2c | uW2a
W2 → aW3u | cW3g | gW3c | uW3a
W3 → gaaa | gcaa

Page 22: Transformational grammars

Parse trees

Root – the start nonterminal S; leaves – the terminal symbols in the sequence; internal nodes are nonterminals. The children of an internal node are given by one of its productions. Any subtree derives a contiguous segment of the sequence.

[Figure: parse trees for the sequences caggaaacug and ggugcaaacc, with the corresponding stem-loop structures.]

Page 23: Transformational grammars

Parse tree for a PROSITE pattern

Parse tree for the RNP-1 motif RGQAFVIF.

Regular grammars are linear special cases of context-free grammars. The parse tree for a regular grammar is a standard linear alignment of the grammar nonterminals onto the sequence terminals.

[Figure: linear parse tree S → W1 → W2 → … → W7 over the terminals r g q a f v i f.]

Page 24: Transformational grammars

Push-down automata

The parsing automaton for CFGs is called a push-down automaton.

A limited number of symbols are kept in a push-down stack.

A push-down automaton parses a sequence from left to right according to the algorithm on the next slide.

The stack is initialised by pushing the start nonterminal into it. The steps are iterated until no input symbols remain. If the stack is empty at the end, then the sequence has been successfully parsed.

Page 25: Transformational grammars

Algorithm: Parsing with a push-down automaton

Pop a symbol off the stack.

If the popped symbol is a nonterminal:
- Peek ahead in the input from the current position and choose a valid production for the nonterminal. If there is no valid production, terminate and reject the sequence.
- Push the right side of the chosen production onto the stack, rightmost symbols first.

If the popped symbol is a terminal:
- Compare it to the current symbol of the input. If it matches, move the automaton to the right on the input (the input symbol is accepted). If it does not match, terminate and reject the sequence.
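
A sketch of this algorithm (an addition, not from the slides) for the stem-loop grammar above; since W3 → gaaa | gcaa share their first symbol, the 'peek' there looks further ahead:

```python
PAIR = {"a": "u", "u": "a", "c": "g", "g": "c"}

def parse(seq):
    seq = seq.lower()
    stack = ["S"]                       # initialise with the start nonterminal
    i = 0
    while stack:
        sym = stack.pop()
        if sym in ("S", "W1", "W2"):    # nonterminal: peek, produce x W' x'
            if i >= len(seq) or seq[i] not in PAIR:
                return False            # no valid production: reject
            nxt = {"S": "W1", "W1": "W2", "W2": "W3"}[sym]
            # push the right side rightmost-first
            stack += [PAIR[seq[i]], nxt, seq[i]]
        elif sym == "W3":               # W3 -> gaaa | gcaa: longer peek
            if seq[i:i + 4] not in ("gaaa", "gcaa"):
                return False
            stack += list(reversed(seq[i:i + 4]))
        else:                           # terminal: must match the input
            if i >= len(seq) or seq[i] != sym:
                return False
            i += 1                      # accept; move right on the input
    return i == len(seq)                # empty stack and empty input: accept

print(parse("GCCGCAAGGC"))  # True: the worked example on the next slide
```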

Page 26: Transformational grammars

Parsing an RNA stem loop with a push-down automaton

Input string   Stack   Automaton operation on stack and input
GCCGCAAGGC     S       Pop S. Peek at input; produce S → g1c.
GCCGCAAGGC     g1c     Pop g. Accept g; move right on input.
GCCGCAAGGC     1c      Pop 1. Peek at input; produce 1 → c2g.
GCCGCAAGGC     c2gc    Pop c. Accept c; move right on input.
GCCGCAAGGC     2gc     Pop 2. Peek at input; produce 2 → c3g.
GCCGCAAGGC     c3ggc   Pop c. Accept c; move right on input.
(several acceptances)
GCCGCAAGGC     c       Pop c. Accept c; move right on input.
GCCGCAAGGC     -       Stack empty. Input string empty. Accept.

Page 27: Transformational grammars

Context-sensitive grammars

Copy language: cc, acca, agaccaga, etc.

initialisation:          S → CW
nonterminal generation:  W → AÂW | GĜW | C
nonterminal reordering:  ÂA → AÂ, ÂG → GÂ, ĜA → AĜ, ĜG → GĜ
terminal generation:     CA → aC, CG → gC, ÂC → Ca, ĜC → Cg
termination:             CC → cc

Example derivation: S => CW => CAÂW => CAÂC => aCÂC => aCCa => acca.

Page 28: Transformational grammars

Linear bounded automaton

A mechanism for working backwards through all possible derivations: either the start nonterminal is reached, or no valid derivation is found.

There is a finite number of possible derivations to examine, but that number is exponentially large. Abstractly, the automaton is a 'tape' of linear memory with a read/write head.

Page 29: Transformational grammars

NP problems and 'intractability'

Nondeterministic polynomial problems: there is no known polynomial-time algorithm for finding a solution, but a solution can be checked for correctness in polynomial time. [Context-sensitive grammar parsing.]

A subclass of NP problems: NP-complete problems. A polynomial-time algorithm that solves one NP-complete problem will solve all of them.

Page 30: Transformational grammars

Unrestricted grammars and Turing machines

Left and right sides of the production rules can be any combination of symbols.

The parsing automaton is a Turing machine. There is no general algorithm for determining whether a string has a valid derivation in less than infinite time.

Page 31: Transformational grammars

Stochastic grammars

A stochastic grammar model θ generates different strings x with probability P(x | θ).

Non-stochastic grammars either generate a string x or not.

For stochastic regular and context-free grammars: Σ_x P(x | θ) = 1.

Page 32: Transformational grammars

Example of stochastic grammar

For the production rule S → rW1 | kW1, a stochastic regular grammar might assign probabilities of 0.5 to the productions:

S → rW1 (0.5)
S → kW1 (0.5)

Page 33: Transformational grammars

Other probabilities

Exceptions can be admitted without grossly degrading a grammar. Exceptions should have low, but non-zero, probabilities:

S → rW1 (0.45)
S → kW1 (0.45)
S → nW1 (0.1)

Page 34: Transformational grammars

Stochastic context-sensitive or unrestricted grammars

Consider a context-sensitive grammar generating {aa, ab, ba, bb}:

S → aW (p1), S → bW (p2), bW → bb (p3), W → a (p4), W → b (p5)

In general, the string probabilities are {p1p4, p1p5, p2p4, p2(p3 + p5)}, with p1 + p2 = 1 and p3 + p4 + p5 = 1.

Page 35: Transformational grammars

Stochastic context-sensitive grammar

In fact, if p1 = p2 = 1/2 and p3 = p4 = p5 = 1/3, then Σ_x P(x | θ) = 5/6.

The sum of probabilities of all possible productions from any nonterminal is 1, and yet the string probabilities sum to 1 if and only if p1 = 0 or p3 = 0.
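
The 5/6 value can be checked directly with exact fractions; a small script (an addition, not from the slides):

```python
from fractions import Fraction as F

# Check the claim above: with p1 = p2 = 1/2 and p3 = p4 = p5 = 1/3,
# the four string probabilities sum to 5/6, not 1.
p1 = p2 = F(1, 2)
p3 = p4 = p5 = F(1, 3)
probs = {"aa": p1 * p4, "ab": p1 * p5, "ba": p2 * p4, "bb": p2 * (p3 + p5)}
print(sum(probs.values()))  # 5/6
```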

Page 36: Transformational grammars

Proper stochastic grammar

The previous grammar can be changed in this way:

S → aW (p1), S → bW (p2), bW → bb (p3), bW → ba (p4), aW → aa (p5), aW → ab (p6)

Now, with p1 + p2 = 1, p3 + p4 = 1 and p5 + p6 = 1, the string probabilities {p1p5, p1p6, p2p4, p2p3} sum to 1.

Page 37: Transformational grammars

Hidden Markov models and stochastic regular grammars

Any HMM state which makes N transitions to new states that each emit one of M symbols can also be modelled by a set of NM stochastic regular grammar productions.
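
As a small illustration (an addition, not from the slides; the state names and tables are assumptions), one state with N = 2 transitions and M = 2 symbols yields NM = 4 productions:

```python
# Convert one HMM state's transitions and emissions into grammar productions
# W_k -> a W_l with probability t[k][l] * e[l][a].
t = {"k": {"l1": 0.6, "l2": 0.4}}            # transition probabilities from k
e = {"l1": {"a": 0.9, "b": 0.1},             # emission probabilities per state
     "l2": {"a": 0.2, "b": 0.8}}

productions = {("Wk", sym, "W" + l): t["k"][l] * e[l][sym]
               for l in t["k"] for sym in e[l]}
for (lhs, sym, rhs), p in productions.items():
    print(f"{lhs} -> {sym} {rhs}  ({p:.2f})")   # 2 * 2 = 4 productions
```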

Page 38: Transformational grammars

Stochastic context-free grammars for sequence modelling

We can use stochastic context-free grammars for sequence modelling. To do so, we must solve these problems:

(i) Calculate an optimal alignment of a sequence to a parameterised stochastic grammar (the alignment problem).

Page 39: Transformational grammars

Other problems

(ii) Calculate the probability of a sequence given a parameterised stochastic grammar (the scoring problem).

(iii) Given a set of example sequences, estimate optimal probability parameters for an unparameterised stochastic grammar (the training problem).

Page 40: Transformational grammars

Normal forms for stochastic context-free grammars

Chomsky normal form: production rules must be of the form Wv → WyWz or Wv → a.

For example, the production rule S → aSa could be expanded to S → W1W2, W1 → a, W2 → SW1 in Chomsky normal form.

Page 41: Transformational grammars

The inside-outside algorithm for SCFGs

The inside-outside algorithm for SCFGs in Chomsky normal form is the natural counterpart of the forward-backward algorithm for HMMs.

The computational complexity of the inside-outside algorithm is substantially greater.

Page 42: Transformational grammars

The inside algorithm

Suppose we have a Chomsky normal form SCFG with M nonterminals W1, W2, …, WM, starting from W1.

The production rules are Wv → WyWz and Wv → a, with probability parameters tv(y,z) and ev(a) respectively.

Page 43: Transformational grammars

The inside algorithm

The algorithm calculates the probability α(i,j,v) of a parse subtree rooted at nonterminal Wv for the subsequence xi,…,xj, for all i, j and v.

The calculation requires an L × L × M three-dimensional dynamic programming matrix.

Page 44: Transformational grammars

Algorithm: Inside

Initialisation: for i = 1 to L, v = 1 to M:
    α(i,i,v) = ev(xi)

Iteration: for i = 1 to L-1, j = i+1 to L, v = 1 to M:
    α(i,j,v) = Σ_{y=1..M} Σ_{z=1..M} Σ_{k=i..j-1} α(i,k,y) α(k+1,j,z) tv(y,z)

Termination:
    P(x | θ) = α(1,L,1)
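
A sketch of the inside fill (an addition, with assumed data structures): t[v] maps (y, z) to tv(y,z), e[v] maps a terminal to ev(a), nonterminals are numbered 1..M with W1 the start, and the sequence is 0-indexed here rather than 1-indexed:

```python
def inside(x, M, t, e):
    L = len(x)
    a = [[[0.0] * (M + 1) for _ in range(L)] for _ in range(L)]
    for i in range(L):                          # initialisation: a(i,i,v) = ev(xi)
        for v in range(1, M + 1):
            a[i][i][v] = e[v].get(x[i], 0.0)
    for span in range(1, L):                    # iteration, shorter spans first
        for i in range(L - span):
            j = i + span
            for v in range(1, M + 1):
                total = 0.0
                for (y, z), p in t[v].items():  # Wv -> Wy Wz
                    for k in range(i, j):       # split point k
                        total += a[i][k][y] * a[k + 1][j][z] * p
                a[i][j][v] = total
    return a                                    # P(x | theta) = a[0][L-1][1]

# Toy grammar: W1 -> W2 W2 (prob 1), W2 -> a (prob 1), so P("aa") = 1.
a = inside("aa", 2, {1: {(2, 2): 1.0}, 2: {}}, {1: {}, 2: {"a": 1.0}})
print(a[0][-1][1])  # 1.0
```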

Page 45: Transformational grammars

Iteration step of the inside algorithm

Page 46: Transformational grammars

The outside algorithm

The algorithm calculates the probability β(i,j,v) of a complete parse tree rooted at the start nonterminal for the sequence x1,…,xL, excluding all parse subtrees for the subsequence xi,…,xj rooted at nonterminal Wv, for all i, j and v.

Page 47: Transformational grammars

The outside algorithm

The calculation requires an L × L × M three-dimensional dynamic programming matrix (like the inside algorithm).

Calculating β(i,j,v) requires the results α(i,j,v) from a previous inside calculation.

Page 48: Transformational grammars

Algorithm: Outside

Initialisation:
    β(1,L,1) = 1; β(1,L,v) = 0 for v = 2 to M.

Iteration: for i = 1 to L, j = L to i, v = 1 to M:
    β(i,j,v) = Σ_{y,z} Σ_{k=1..i-1} α(k,i-1,z) β(k,j,y) ty(z,v)
             + Σ_{y,z} Σ_{k=j+1..L} α(j+1,k,z) β(i,k,y) ty(v,z)

Termination:
    P(x | θ) = Σ_{v=1..M} β(i,i,v) ev(xi), for any i.
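
A companion sketch to the inside() function above (an addition, same assumed grammar encoding); e is not needed directly because emissions enter through the inside matrix and the termination step:

```python
def outside(x, M, t, a):
    L = len(x)
    b = [[[0.0] * (M + 1) for _ in range(L)] for _ in range(L)]
    b[0][L - 1][1] = 1.0                         # initialisation: beta(1,L,1) = 1
    for span in range(L - 2, -1, -1):            # longer spans first
        for i in range(L - span):
            j = i + span
            for v in range(1, M + 1):
                total = 0.0
                for y in range(1, M + 1):        # possible parent Wy
                    for (c1, c2), p in t[y].items():
                        if c2 == v:              # Wy -> Wc1 Wv; Wc1 derives x[k..i-1]
                            for k in range(i):
                                total += a[k][i - 1][c1] * b[k][j][y] * p
                        if c1 == v:              # Wy -> Wv Wc2; Wc2 derives x[j+1..k]
                            for k in range(j + 1, L):
                                total += a[j + 1][k][c2] * b[i][k][y] * p
                b[i][j][v] = total
    return b  # termination: P(x | theta) = sum_v b[i][i][v] * e[v][x[i]], any i
```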

Page 49: Transformational grammars

Iteration step of the outside algorithm

Page 50: Transformational grammars

Parameter re-estimation by expectation maximisation

The expected number of times nonterminal Wv is used:

    c(v) = (1/P(x | θ)) Σ_{i=1..L} Σ_{j=i..L} α(i,j,v) β(i,j,v)

The expected number of times production Wv → WyWz is used:

    c(v → yz) = (1/P(x | θ)) Σ_{i=1..L-1} Σ_{j=i+1..L} Σ_{k=i..j-1} β(i,j,v) α(i,k,y) α(k+1,j,z) tv(y,z)

Page 51: Transformational grammars

Parameter re-estimation by expectation maximisation

The re-estimation equation for the probabilities of the production rules Wv → WyWz is:

    t̂v(y,z) = c(v → yz) / c(v)

For production rules Wv → a:

    êv(a) = c(v → a) / c(v), where c(v → a) = (1/P(x | θ)) Σ_{i | xi = a} β(i,i,v) ev(a)
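
A small sketch (an addition, not from the slides) tying the count c(v) to the inside and outside matrices of the earlier sketches:

```python
# Expected usage count c(v) from the inside (a) and outside (b) matrices.
def expected_counts(x, M, a, b):
    L = len(x)
    P = a[0][L - 1][1]                    # P(x | theta) from the inside algorithm
    return {v: sum(a[i][j][v] * b[i][j][v]
                   for i in range(L) for j in range(i, L)) / P
            for v in range(1, M + 1)}
```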

Page 52: Transformational grammars

The CYK alignment algorithm

Initialisation: for i = 1 to L, v = 1 to M:
    γ(i,i,v) = log ev(xi); τ(i,i,v) = (0,0,0)

Iteration: for i = 1 to L-1, j = i+1 to L, v = 1 to M:
    γ(i,j,v) = max_{y,z} max_{k=i..j-1} { γ(i,k,y) + γ(k+1,j,z) + log tv(y,z) }
    τ(i,j,v) = argmax_{(y,z,k), k=i..j-1} { γ(i,k,y) + γ(k+1,j,z) + log tv(y,z) }

Termination:
    log P(x, π̂ | θ) = γ(1,L,1)

Page 53: Transformational grammars

CYK traceback

Initialisation: Push (1, L, 1) on the stack.

Iteration:
    Pop (i, j, v); (y, z, k) = τ(i,j,v).
    If τ(i,j,v) = (0,0,0), attach xi as a child of v;
    else attach y, z to the parse tree as children of v; push (k+1, j, z); push (i, k, y).
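
A sketch of CYK with traceback (an addition, with the same assumed grammar encoding as the inside() sketch; it assumes x is derivable):

```python
import math

def cyk(x, M, t, e):
    L, NEG = len(x), float("-inf")
    g = {}   # gamma(i,j,v): best log probability of deriving x[i..j] from Wv
    tb = {}  # tau(i,j,v): traceback triple (y, z, k)
    for i in range(L):
        for v in range(1, M + 1):
            p = e[v].get(x[i], 0.0)
            g[i, i, v] = math.log(p) if p > 0 else NEG
            tb[i, i, v] = (0, 0, 0)
    for span in range(1, L):
        for i in range(L - span):
            j = i + span
            for v in range(1, M + 1):
                g[i, j, v], tb[i, j, v] = NEG, (0, 0, 0)
                for (y, z), p in t[v].items():
                    if p <= 0:
                        continue
                    for k in range(i, j):
                        s = g[i, k, y] + g[k + 1, j, z] + math.log(p)
                        if s > g[i, j, v]:
                            g[i, j, v], tb[i, j, v] = s, (y, z, k)

    def trace(i, j, v):                  # rebuild the parse tree from tau
        y, z, k = tb[i, j, v]
        if (y, z, k) == (0, 0, 0):
            return (v, x[i])             # leaf: Wv -> xi
        return (v, trace(i, k, y), trace(k + 1, j, z))

    return g[0, L - 1, 1], trace(0, L - 1, 1)

# Toy grammar again: log P = 0.0, tree (1, (2, 'a'), (2, 'a')).
print(cyk("aa", 2, {1: {(2, 2): 1.0}, 2: {}}, {1: {}, 2: {"a": 1.0}}))
```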

Page 54: Transformational grammars

Summary

----------------------------------------------------------------------
Goal                        HMM algorithm        SCFG algorithm
----------------------------------------------------------------------
optimal alignment           Viterbi              CYK
P(x | θ)                    forward              inside
EM parameter estimation     forward-backward     inside-outside
memory complexity           O(LM)                O(L²M)
time complexity             O(LM²)               O(L³M³)
----------------------------------------------------------------------