LING 438/538 Computational Linguistics, Sandiway Fong, Lecture 14: 10/12
Post on 21-Dec-2015
Last Time
• morphology
  – words are composed of morphemes
  – morpheme: semantic unit, e.g. -ee in employee
  – inflectional: no change in category, e.g. V + -ed → V
  – derivational: category-changing, e.g. V + -able → A
• Porter Stemmer
  – normalization procedure
  – based on (manually determined) ad hoc rules
  – “measure” of a stem: m in [C](VC)^m[V]
  – output: “root” (not necessarily a word)
    • words that stem to the same root are considered “variants”
  – English orthography
    • an illustration of the gap that can occur between computation and linguistic theory
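As a reminder of what the Porter "measure" counts, here is a minimal Prolog sketch. It assumes the stem is given as a list of lowercase letters and treats only a, e, i, o, u as vowels, which is a simplification (the Porter stemmer also treats y as a vowel after a consonant). The predicate names vowel/1 and measure/2 are hypothetical and do not come from the lecture.

```prolog
% Simplifying assumption: y is always a consonant here.
vowel(a). vowel(e). vowel(i). vowel(o). vowel(u).

% measure(+Letters, -M): M is the number of VC sequences in the stem,
% counted as the number of places where a vowel is directly followed
% by a consonant.
measure([], 0).
measure([_], 0).
measure([A,B|T], M) :-
    measure([B|T], M0),
    (   vowel(A), \+ vowel(B)
    ->  M is M0 + 1
    ;   M = M0
    ).
```

For example, ?- atom_chars(troubles, Cs), measure(Cs, M). gives M = 2, and the same query for tree gives M = 0.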
Today’s Topic
• Finite State Transducers (FST) for morphological processing
– ... also Prolog implementation
Recall Finite State Automata (FSA)
• from lecture 8
  – (Q, s, f, Σ, δ)
    1. set of states (Q): {s,x,y}, must be a finite set
    2. start state (s): s
    3. end state(s) (f): y
    4. alphabet (Σ): {a, b}
    5. transition function δ, signature: character × state → state
       1. δ(a,s)=x
       2. δ(a,x)=x
       3. δ(b,x)=y
       4. δ(b,y)=y
  – [state diagram: s -a-> x, x -a-> x (loop), x -b-> y, y -b-> y (loop)]
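The five-tuple above can also be written down almost literally as Prolog facts and run with a small generic driver. This is only a sketch of a table-driven encoding; the names delta/3, start/1, final/1, accepts/1 and run/2 are hypothetical and are not the encoding used later in the lecture (which defines one predicate per state).

```prolog
% delta(Character, State, NextState): the four transitions listed above.
delta(a, s, x).
delta(a, x, x).
delta(b, x, y).
delta(b, y, y).

start(s).
final(y).

% accepts(+String): String (a list of characters) is accepted by the FSA.
accepts(String) :-
    start(Q),
    run(Q, String).

% run(+State, +Remaining): accept when the input is used up in a final state.
run(Q, []) :-
    final(Q).
run(Q, [C|Cs]) :-
    delta(C, Q, Q1),
    run(Q1, Cs).
```

With these facts, ?- accepts([a,a,b,b]). succeeds, while ?- accepts([a]). and ?- accepts([b]). fail.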
Modeling English Adjectives using FSA
– from section 3.2 of textbook
• examples
  – big, bigger, biggest, *unbig
  – cool, cooler, coolest, coolly
  – red, redder, reddest, *redly
  – clear, clearer, clearest, clearly, unclear, unclearly
  – happy, happier, happiest, happily
  – unhappy, unhappier, unhappiest, unhappily
  – real, *realer, *realest, unreal, really
• fsa (3.4)
• initial machine is overly simple
  – need more classes to make finer-grained distinctions, e.g. *unbig
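To see the overgeneration concretely, here is a small Prolog sketch of a machine of this simple kind: an optional un-, any adjective root, then an optional -er, -est or -ly. It works on morphemes rather than letters, so it assumes the word has already been segmented and ignores spelling changes such as big + er giving bigger. The state names a0 to a3 and the predicates adj_root/1 and suffix/1 are hypothetical.

```prolog
% One undifferentiated class of adjective roots (the source of the problem).
adj_root(big).   adj_root(cool).  adj_root(red).
adj_root(clear). adj_root(happy). adj_root(real).

suffix(er). suffix(est). suffix(ly).

% One predicate per state; the argument is a list of morphemes.
a0([un|Ms]) :- a1(Ms).        % un- prefix
a0(Ms)      :- a1(Ms).        % prefix is optional
a1([R|Ms])  :- adj_root(R), a2(Ms).
a2([]).                        % bare root is accepted
a2([S|Ms])  :- suffix(S), a3(Ms).
a3([]).                        % root plus one suffix is accepted
```

The query ?- a0([un,clear,ly]). succeeds as intended, but so does ?- a0([un,big])., which is exactly the *unbig case that motivates splitting the roots into classes.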
Modeling English Adjectives using FSA
• divide adjectives into classes
• examples
  – adj-root2: big, bigger, biggest, *unbig
  – adj-root2: cool, cooler, coolest, coolly
  – adj-root2: red, redder, reddest, *redly
  – adj-root1: clear, clearer, clearest, clearly, unclear, unclearly
  – adj-root1: happy, happier, happiest, happily
  – adj-root1: unhappy, unhappier, unhappiest, unhappily
  – adj-root1: real, *realer, *realest, unreal, really
• fsa (3.5)
However...
• examples
  – uncooler
    • Smoking uncool and getting uncooler.
    • google: 22,800 (2006), 10,900 (2005)
  – *realer
    • google: 3,500,000 (2006), 494,000 (2005)
  – *realest
    • google: 795,000 (2006), 415,000 (2005)
Modeling English Adjectives using FSA
• e.g. *unbig
  – google: 11,000 hits (2006)
• morphology is productive
• morphemes carry (compositional) meaning
• can be used for dramatic effect: unbig vs. small
The Mapping Problem
• To map between a surface form and the decomposition of a word into its components
  – e.g. root + (person/number/gender) and other features
• using spelling rules
• Example: (3.11)
• Notes:
  – ^ marks a morpheme boundary
  – # is the end-of-word marker
Stage 1: Lexical → Intermediate Levels
• example:
  – f o x +N +PL (lexical)
  – f o x ^s# (intermediate)
• lexical level:
  – uninflected “dictionary” level
• intermediate level:
  – replace abstract morphemes by concrete ones
• key
  – +N: noun
    • fox can also be a verb,
    • but fox +V cannot combine with +PL
  – +PL: (abstract) plural morpheme
    • realized in English as s (basic case)
  – boundary markers ^ and #
    • for use by the spelling rule machine (later)
Stage 1: Lexical → Intermediate Levels
• example:
  – f o x +N +PL (lexical)
  – f o x ^s# (intermediate)
• machine idea
  – character-by-character correspondences
  – f → f
  – o → o
  – x → x
  – +N → ε (ε = empty string)
  – +PL → ^s#
• use a Finite State Machine with input/output mapping
  – Finite State Transducer (FST)
Stage 1: Lexical → Intermediate Levels
• Example:
  – g o o s e +N +PL (lexical)
  – g e e s e # (intermediate)
• Example:
  – g o o s e +N +SG (lexical)
  – g o o s e # (intermediate)
• Example:
  – m o u s e +N +PL (lexical)
  – m i c e # (intermediate)
• Example:
  – s h e e p +N +PL (lexical)
  – s h e e p # (intermediate)
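The goose examples above can be sketched as a two-argument, one-predicate-per-state transducer in Prolog. This is only an illustrative sketch: the state names g0 to g6 and p2 to p6 are hypothetical, and the atoms '+N', '+SG', '+PL' and '#' are quoted to be safe across Prolog systems. The machine has to guess at the first o whether it is on the singular path (o:o) or the plural path (o:e); Prolog backtracking undoes a wrong guess.

```prolog
% First argument: lexical tape.  Second argument: intermediate tape.
g0([g|L1],[g|L2]) :- g1(L1,L2).

g1([o|L1],[o|L2]) :- g2(L1,L2).      % singular path, o:o
g1([o|L1],[e|L2]) :- p2(L1,L2).      % irregular plural path, o:e

% Singular: g o o s e +N +SG  <->  g o o s e #
g2([o|L1],[o|L2]) :- g3(L1,L2).
g3([s|L1],[s|L2]) :- g4(L1,L2).
g4([e|L1],[e|L2]) :- g5(L1,L2).
g5(['+N'|L1],L2)  :- g6(L1,L2).      % +N:ε
g6(['+SG'],['#']).                   % +SG:#, then end of input

% Plural: g o o s e +N +PL  <->  g e e s e #
p2([o|L1],[e|L2]) :- p3(L1,L2).
p3([s|L1],[s|L2]) :- p4(L1,L2).
p4([e|L1],[e|L2]) :- p5(L1,L2).
p5(['+N'|L1],L2)  :- p6(L1,L2).      % +N:ε
p6(['+PL'],['#']).                   % +PL:#, then end of input
```

The query ?- g0([g,o,o,s,e,'+N','+PL'],X). first tries the singular path, fails at '+PL', and then succeeds on the plural path with the geese form. Calling ?- g0(X,[g,o,o,s,e,'#']). runs the same machine in the other direction and recovers the +SG analysis.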
Extension to Finite State Transducers (FST)
• [Mealy machine extension to FSA]
  – (Q, s, f, Σ, δ)
    1. set of states (Q): {s,x,y}, must be a finite set
    2. start state (s): s
    3. end state(s) (f): y
    4. alphabet (Σ): pairs I:O
       – I = input alphabet, O = output alphabet
       – ε may be included in I and O
    5. transition function (or matrix) δ, signature: i/o pair × state → state
       1. δ(a:b,s)=x
       2. δ(a:b,x)=x
       3. δ(b:a,x)=y
       4. δ(b:ε,y)=y
  – [state diagram: s -a:b-> x, x -a:b-> x (loop), x -b:a-> y, y -b:ε-> y (loop)]
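The transducer definition above can be sketched as a table of arcs plus a generic driver. This is a hypothetical table-driven encoding (arc/4, final/1, transduce/2, run/3 and emit/3 are names made up for this sketch), and it assumes ε only appears on the output side, as in this example machine. The atom eps stands for ε.

```prolog
% arc(State, Input, Output, NextState): the four i/o transitions above.
arc(s, a, b,   x).
arc(x, a, b,   x).
arc(x, b, a,   y).
arc(y, b, eps, y).     % b:ε, consume b and write nothing

final(y).

% transduce(?In, ?Out): In and Out are related by the machine.
transduce(In, Out) :-
    run(s, In, Out).

run(Q, [], []) :-
    final(Q).
run(Q, [I|Is], Os) :-
    arc(Q, I, O, Q1),
    emit(O, Os, Os1),
    run(Q1, Is, Os1).

% emit(+Symbol, ?Out, ?RestOut): ε writes nothing, anything else is written.
emit(eps, Os, Os).
emit(O, [O|Os], Os) :-
    O \== eps.
```

For example, ?- transduce([a,a,b,b], X). gives X = [b,b,a]. Running it backwards with once(transduce(In, [b,a])) gives In = [a,b]; asking for more answers keeps appending b's, because the b:ε loop at y can apply any number of times.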
Finite State Automata (FSA)
• recall: one possible Prolog encoding strategy
  – define one predicate for each state
    • taking one argument (the input string)
    • consume input character
    • call next state with remaining input string
  – query
    • ?- s(L).
    • i.e. call start state s
Finite State Automata (FSA)
• from lecture 9
  – define one predicate for each state
    • take one argument (the input string), and consume input character
    • call next state with remaining input string
  – query
    • ?- s(L). i.e. call start state s
  – state s: (start state)
    • s([a|L]) :- x(L).
  – state x:
    • x([a|L]) :- x(L).
    • x([b|L]) :- y(L).
  – state y: (end state)
    • y([]).
    • y([b|L]) :- y(L).
  – [state diagram: s -a-> x, x -a-> x (loop), x -b-> y, y -b-> y (loop)]
• simple extension to FST: each predicate takes two arguments: input and output
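Collected into one file, the one-predicate-per-state FSA can be used both to check a given string and to list the strings it accepts. The helper strings_up_to/2 is a hypothetical addition for this sketch and assumes a Prolog with between/3 and findall/3 available (e.g. SWI-Prolog). It bounds the length of L before calling s(L), because with L completely unbound the first clause of x/1 keeps recursing before any answer is produced.

```prolog
% The FSA from the slide: one predicate per state.
s([a|L]) :- x(L).          % start state
x([a|L]) :- x(L).
x([b|L]) :- y(L).
y([]).                     % end state
y([b|L]) :- y(L).

% strings_up_to(+Max, -Strings): all accepted strings of length =< Max,
% shortest first.  Fixing the length first keeps the search finite.
strings_up_to(Max, Strings) :-
    findall(L,
            ( between(0, Max, N), length(L, N), s(L) ),
            Strings).
```

For example, ?- s([a,a,b]). succeeds, and ?- strings_up_to(3, S). gives S = [[a,b],[a,a,b],[a,b,b]].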
Stage 1: Lexical → Intermediate Levels
• example
  – s0([f|L1],[f|L2]) :- s1(L1,L2).
  – s0([c|L1],[c|L2]) :- s3(L1,L2).
  – s1([o|L1],[o|L2]) :- s2(L1,L2).
  – s2([x|L1],[x|L2]) :- s5(L1,L2).
  – s3([a|L1],[a|L2]) :- s4(L1,L2).
  – s4([t|L1],[t|L2]) :- s5(L1,L2).
  – s5(['+N'|L1],L2) :- s6(L1,L2).
  – s6(['+PL'|L1],[^,s,#|L2]) :- s7(L1,L2).
  – s7([],[]). % end state
Stage 1: Lexical → Intermediate Levels
• FST queries
  – lexical → intermediate
    • ?- s0([f,o,x,'+N','+PL'],X).
      – X = [f, o, x, ^, s, #]
  – intermediate → lexical
    • ?- s0(X,[f,o,x,^,s,#]).
      – X = [f, o, x, '+N', '+PL']
  – enumerator
    • ?- s0(X,Y).
      – X = [f, o, x, '+N', '+PL']
      – Y = [f, o, x, ^, s, #] ;
      – X = [c, a, t, '+N', '+PL']
      – Y = [c, a, t, ^, s, #] ;
      – No
• inversion of a transducer T: T⁻¹
  – switch input and output labels
  – in Prolog, simply change the call
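To make the inversion point concrete, here is the fox/cat machine as a loadable file together with a one-line wrapper that swaps the two tapes. The wrapper invert/3 is a hypothetical helper for this sketch, built on the standard call/3. The atoms '^' and '#' are quoted here only to be safe across Prolog systems; they denote the same atoms as the unquoted forms.

```prolog
% Stage 1 machine: lexical tape (first argument), intermediate tape (second).
s0([f|L1],[f|L2]) :- s1(L1,L2).
s0([c|L1],[c|L2]) :- s3(L1,L2).
s1([o|L1],[o|L2]) :- s2(L1,L2).
s2([x|L1],[x|L2]) :- s5(L1,L2).
s3([a|L1],[a|L2]) :- s4(L1,L2).
s4([t|L1],[t|L2]) :- s5(L1,L2).
s5(['+N'|L1],L2) :- s6(L1,L2).                   % +N:ε
s6(['+PL'|L1],['^',s,'#'|L2]) :- s7(L1,L2).      % +PL:^s#
s7([],[]).                                       % end state

% invert(+T, ?X, ?Y): the inverse of transducer T, obtained by calling T
% with its two arguments swapped.
invert(T, X, Y) :-
    call(T, Y, X).
```

The query ?- invert(s0, [f,o,x,'^',s,'#'], Lex). gives Lex = [f,o,x,'+N','+PL'], the same answer as calling s0 directly with the first argument unbound.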
The Mapping Problem
• Example: (3.11)
• (Context-Sensitive) Spelling Rule: (3.5)
  – ε → e / {x,s,z}^ __ s#
  – ε rewrites to letter e in left context x^ or s^ or z^ and right context s#
• i.e. insert e after the ^ when you see x^s# or s^s# or z^s#
• in particular, we have x^s# → x^es#
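Before turning the rule into a transducer, its effect can be sketched directly as a list rewrite in Prolog. This is not the FST of the lecture, just a plain recursive predicate that applies the e-insertion rule in the intermediate → surface direction and copies everything else unchanged. The names sibilant/1 and insert_e/2 are hypothetical, and the cut makes the sketch usable only with the first argument given.

```prolog
% Left-context letters of the rule.
sibilant(x). sibilant(s). sibilant(z).

% insert_e(+Intermediate, -Result): insert e after ^ in the context
% {x,s,z}^ __ s#, and pass all other symbols through unmodified.
insert_e([], []).
insert_e([C,'^',s,'#'|T], [C,'^',e,s,'#'|T2]) :-
    sibilant(C), !,
    insert_e(T, T2).
insert_e([X|T], [X|T2]) :-
    insert_e(T, T2).
```

For example, ?- insert_e([f,o,x,'^',s,'#'], Y). gives Y = [f,o,x,'^',e,s,'#'], while [c,a,t,'^',s,'#'] is returned unchanged because t is not in the left context.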
Stage 2: Intermediate → Surface Levels
• also can be implemented using a FST
• important!
  – machine is designed to pass input not matching the rule through unmodified (rather than fail)
• implements context-sensitive rule
  – q0 to q2: left context
  – q3 to q0: right context
Stage 2: Intermediate → Surface Levels
• Transition table for FST in 3.14
• Note:
  – other: (catch-all case) means pass any remaining symbol (other than those specified explicitly in the state) to the other side unchanged
  – #: # is never included in other
Stage 2: Intermediate → Surface Levels
• in Prolog (simplified)
  – with special treatment for “other”
  – q0([],[]). % final state
  – q0([^|L1],L2) :- !, q0(L1,L2). % ^:ε
  – q0([z|L1],[z|L2]) :- !, q1(L1,L2). % repeat for s,x
  – q0([#|L1],[#|L2]) :- !, q0(L1,L2).
  – q0([X|L1],[X|L2]) :- q0(L1,L2). % other
• ! is known as the “cut” predicate
  – it affects how Prolog searches
  – it means “cut” the search off
  – Prolog will not try any other compatible rule on backtracking
  – problematic for generation, e.g. ^:ε case
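The effect of the cut on the catch-all “other” clause can be seen in isolation with a tiny example. The two predicates below are hypothetical and exist only for this illustration: they classify a symbol, once without a cut and once with one.

```prolog
% Without a cut: the catch-all clause also matches z on backtracking.
kind_nocut(z, sibilant).
kind_nocut(_, other).

% With a cut: once the z clause has matched, Prolog will not try
% the catch-all clause for the same call.
kind_cut(z, sibilant) :- !.
kind_cut(_, other).
```

Collecting all answers shows the difference: ?- findall(K, kind_nocut(z,K), Ks). gives Ks = [sibilant, other], whereas ?- findall(K, kind_cut(z,K), Ks). gives Ks = [sibilant]. For a symbol such as t, both versions give only other.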
• backtrack points 1, 2, 3 (marked on the slide): other choices
Stage 2: Intermediate → Surface Levels
• problem for generation
  – ?- q0(X,[f,o,x,e,s,#]).
    • X = [^|L1]
    • ?- q0(L1,[f,o,x,e,s,#]).
      – L1 = [^|L1’]
      – ?- q0(L1’,[f,o,x,e,s,#]).
  – infinite loop
  – culprit: ^:ε case (morpheme boundary deletion)
  – can keep introducing ^^^^^^^... ad infinitum
  – requires more than finite state power to correct
q0([],[]). % final state
q0([^|L1],L2) :- !, q0(L1,L2). % ^:ε
q0([z|L1],[z|L2]) :- !, q1(L1,L2). % repeat for s,x
q0([#|L1],[#|L2]) :- !, q0(L1,L2).
q0([X|L1],[X|L2]) :- q0(L1,L2). % other
Stage 2: Intermediate → Surface Levels
• Other cases of ^:ε do not loop. Could eliminate just the loop case.
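A different workaround from eliminating the looping arc is to bound the length of the generated string, so that only finitely many ^ symbols can ever be introduced. The sketch below isolates the problem in a one-state machine that deletes ^ and copies everything else; del/2 and gen_bounded/3 are hypothetical names, and between/3 is assumed to be available (e.g. in SWI-Prolog). It is meant only to illustrate the idea and is not the machine from the slides.

```prolog
% del(?In, ?Out): Out is In with every ^ deleted.
del([], []).
del(['^'|L1], L2) :- del(L1, L2).                  % ^:ε
del([X|L1], [X|L2]) :- X \== '^', del(L1, L2).     % other

% gen_bounded(+Out, +Max, -In): generation direction, restricted to
% inputs of length at most Max.  Fixing the length of In before calling
% del/2 means the ^:ε clause can only fire a bounded number of times.
gen_bounded(Out, Max, In) :-
    between(0, Max, N),
    length(In, N),
    del(In, Out).
```

Calling ?- del(X, [a]). directly loops, since the ^:ε clause is tried first and can always apply. With the bound, ?- findall(In, gen_bounded([a], 2, In), Ins). terminates with Ins = [[a],[^,a],[a,^]].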
Stage 2: Intermediate → Surface Levels
• query (generation)
  – ?- q0(X,[c,a,t,s,#]).
    • X = [c, a, t, s, ^, #] ; (q0+ -> q1 -> q2 -> q0)
    • X = [c, a, t, s, #] ; (q0+ -> q1 -> q0)
    • No