12: constraints on strings - johns hopkins universityjason/325/pdfslides/12rational.pdf · 2011. 5....
TRANSCRIPT
5/13/11
1
600.325/425 Declarative Methods - J. Eisner 1
Constraints on Strings
600.325/425 Declarative Methods - J. Eisner 2
What’s a constraint, again?
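[Figure: a unary constraint on X, drawn over the values 0, 1, 2, 3, 4, 5, …; a binary constraint, drawn over the X–Y plane]
unary: a set of allowed values
binary: a set of allowed value pairs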
Infinite sets? Sure …
Infinite subsets of (pairs of)
integers, reals, … How about soft constraints?
600.325/425 Declarative Methods - J. Eisner 3
What’s a constraint on strings?
Hard constraint:
Does string S match pattern P? (Is it in the set?)
A description of a set of strings
Like a constraint … how?
S is a variable whose domain is set of all strings!
So P can be regarded as a unary constraint: let’s write P(S).
Soft constraint:
How well does string S fit pattern P?
A function mapping each string to a score / weight / cost.
Like a soft constraint …
600.325/425 Declarative Methods - J. Eisner 4
What is a pattern?
What operations would you expect for combining these string constraints?
If P is a pattern, then so is ~P
~P matches exactly the strings that P doesn’t
If P and Q are both patterns, then so is P & Q
If P and Q are both patterns, then so is P | Q
Wow, we can build up boolean formulas! Does this allow us to encode SAT? How?
600.325/425 Declarative Methods - J. Eisner 5
More about the relation to constraints
By building complicated patterns from simple ones, we are building up complicated constraints!
That is also allowed in ECLiPSe:
alldiff3(X,Y,Z) :- X #\= Y, Y #\= Z, X #\= Z.
between(X,Y,Z) :- X #< Y, Y #< Z. % either this
between(X,Y,Z) :- X #> Y, Y #> Z. % ... or this
Now we can use “alldiff3” and “between” as new constraints
Hang on, patterns are only unary constraints. Generalize?
between(X,Y,Z) :- (X #< Y, Y #< Z)
                  or (X #> Y, Y #> Z).
600.325/425 Declarative Methods - J. Eisner 6
What is a pattern?
Binary constraint (relation): What are all the possible translations of string S?
A description of a set of string pairs (S,T)
Like a binary constraint: let’s write P(S,T)
We can also do n-ary constraints more generally, but most current solvers don’t allow them
Fuzzy case: How strongly is string S related to each T? Which one is it most strongly related to?
Ok, so what’s new here? Why does it matter that they’re string variables?
600.325/425 Declarative Methods - J. Eisner 7
Some Pattern Operators
~ complementation ~P
& intersection P & Q
| union P | Q
concatenation PQ
* iteration (0 or more) P*
+ iteration (1 or more) P+
- difference P - Q
\ char complement \P (equiv. to ?-P)
Which of these can be treated as syntactic sugar? That is, which of these can we get rid of?
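One way to explore the syntactic-sugar question is to model each pattern as a finite set of strings. The sketch below is only an illustration: it assumes a two-letter alphabet and truncates the string domain at `MAX_LEN` so that complement stays finite, and every function name in it is hypothetical rather than part of any real regexp toolkit.

```python
from itertools import product

ALPHABET = "ab"   # assumption: a tiny alphabet for illustration
MAX_LEN = 4       # assumption: truncate the infinite domain of strings

# Every string over ALPHABET of length at most MAX_LEN.
UNIVERSE = {"".join(t) for n in range(MAX_LEN + 1)
            for t in product(ALPHABET, repeat=n)}

# Core operators.
def complement(p):
    return UNIVERSE - p

def intersect(p, q):
    return p & q

def concat(p, q):
    return {s + t for s in p for t in q if len(s + t) <= MAX_LEN}

def star(p):
    result, frontier = {""}, {""}
    while frontier:                       # stops because UNIVERSE is finite
        frontier = concat(frontier, p) - result
        result |= frontier
    return result

# Operators that can be defined from the core ones (syntactic sugar).
def union_via_de_morgan(p, q):            # P | Q  =  ~(~P & ~Q)
    return complement(intersect(complement(p), complement(q)))

def difference(p, q):                     # P - Q  =  P & ~Q
    return intersect(p, complement(q))

def plus(p):                              # P+     =  P P*
    return concat(p, star(p))

def char_complement(p):                   # \P     =  ? - P, where ? is any one character
    return difference(set(ALPHABET), p)
```

For example, `difference({"a", "ab"}, {"ab", "b"})` gives `{"a"}`, so difference needs no machinery beyond intersection and complement.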
600.325/425 Declarative Methods - J. Eisner 8
More Pattern Operators
.x. crossproduct P .x. Q
.o. composition P .o. Q
.u upper (input) language P.u “domain”
.l lower (output) language P.l “range”
600.325/425 Declarative Methods - J. Eisner 9
The language of “regular expressions”
A variable S has infinitely many possible values if its type is “string” or “real”
So to specify a constraint on S, not enuf to list possible values
Language for simple constraints on reals: linear equations
Language for simple constraints on strings: regular expressions
Regular expression language
You probably know the standard form of regular expressions
Standard regexp is a unary constraint (“X must match a*b(c|d)*”)
Basic operators: union “|”, concatenation, closure “*”
But the language has been extended in various ways:
soft constraints (specifies costs)
binary constraints (over pairs of string variables)
n-ary constraints (over n string variables)
600.325/425 Declarative Methods - J. Eisner 10
Regular expressions ↔ finite-state automata
1. Given a regexp that specifies a constraint, you can build an FSA that efficiently determines whether a given string satisfies the constraint.
2. Given an FSA, you can find an equivalent regexp.
So the “compiled” form of the little language can be converted back to the source form.
Conclusion: Anything you can do with regexps, you can do with FSAs, and vice-versa.
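As a small illustration of point 1, the sketch below hand-builds a deterministic FSA for the earlier example pattern a*b(c|d)* and checks that it agrees with Python's `re` module on all short strings. The state numbering and helper names are assumptions made for this example, not a general regexp compiler.

```python
import re
from itertools import product

# Hand-built DFA for a*b(c|d)*:
#   state 0 = still reading a's, state 1 = the b has been read.
TRANSITIONS = {(0, "a"): 0, (0, "b"): 1, (1, "c"): 1, (1, "d"): 1}
START, FINAL = 0, {1}

def dfa_accepts(s):
    """Run the DFA on s; a missing transition means rejection."""
    state = START
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in FINAL

PATTERN = re.compile(r"a*b(c|d)*")

def agree_up_to(max_len, alphabet="abcd"):
    """True iff the DFA and the regexp agree on every string of length <= max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            s = "".join(t)
            if dfa_accepts(s) != bool(PATTERN.fullmatch(s)):
                return False
    return True
```

Calling `agree_up_to(5)` compares the two on every string of length at most 5 over {a, b, c, d}.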
600.325/425 Declarative Methods - J. Eisner 11
Given a regular expression …
1. Make a parse tree for it
2. Build up the FSA from the bottom up
Example: (ab|c)*(bb*a)
Parse tree:
concat
  closure
    union
      concat
        a
        b
      c
  concat
    concat
      b
      closure
        b
    a
600.325/425 Declarative Methods - J. Eisner 12
Concatenation (of soft constraints)
example thanks to M. Mohri
600.325/425 Declarative Methods - J. Eisner 13
Union
example thanks to M. Mohri
600.325/425 Declarative Methods - J. Eisner 14
Union
example thanks to M. Mohri
[Figure: union of weighted machines; arc labels eps/0, eps/0.3, eps/0.8]
600.325/425 Declarative Methods - J. Eisner 15
Closure (also illustrates binary constraints)
example thanks to M. Mohri
600.325/425 Declarative Methods - J. Eisner 16
Complementation
M represents a constraint on strings
We’d like to represent ~M (i.e., a constraint that says that the string must not be accepted by M)
Just change M’s final states to non-final and vice-versa
Only works if every string takes you to exactly one state in M (final or non-final). So M must be both deterministic and complete. Any M can be put in this form.
example thanks to M. Mohri
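The recipe above (make M complete, then swap final and non-final states) can be sketched for an unweighted deterministic machine as follows. The example machine, the name `SINK`, and the function names are assumptions for illustration; the sketch also assumes no existing state is already called "sink".

```python
SINK = "sink"  # hypothetical name for the added dead state

def complete(transitions, states, alphabet):
    """Return a total transition table by routing missing arcs to a sink state."""
    total = dict(transitions)
    all_states = set(states) | {SINK}
    for q in all_states:
        for a in alphabet:
            total.setdefault((q, a), SINK)
    return total, all_states

def complement(transitions, states, alphabet, finals):
    """Complete the DFA, then swap final and non-final states."""
    total, all_states = complete(transitions, states, alphabet)
    return total, all_states - set(finals)

def accepts(total, start, finals, s):
    """Run a complete DFA: every string leads to exactly one state."""
    q = start
    for ch in s:
        q = total[(q, ch)]
    return q in finals

# Example machine M: accepts a*b over the alphabet {a, b}.
ALPHABET = "ab"
M_TRANS = {(0, "a"): 0, (0, "b"): 1}
M_STATES, M_START, M_FINALS = {0, 1}, 0, {1}

TOTAL, NOT_M_FINALS = complement(M_TRANS, M_STATES, ALPHABET, M_FINALS)
```

Here "abb" falls into the sink state, which is non-final in M and therefore final in ~M. Without the completion step, that string would be rejected by both machines.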
600.325/425 Declarative Methods - J. Eisner 17
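Intersection
[Figure: two weighted acceptors and their intersection]
Machine 1: states 0, 1, 2/0.8; arcs fat/0.5, pig/0.3, eats/0, sleeps/0.6
Machine 2: states 0, 1, 2/0.5; arcs fat/0.2, pig/0.4, eats/0.6, sleeps/1.3
Intersection: states 0,0; 0,1; 1,1; 2,0/0.8; 2,2/1.3; arcs fat/0.7, pig/0.7, eats/0.6, sleeps/1.9
example adapted from M. Mohri
600.325/425 Declarative Methods - J. Eisner 18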
Intersection
[Figure: Machine 1 (arcs fat/0.5, pig/0.3, eats/0, sleeps/0.6; final state 2/0.8), Machine 2 (arcs fat/0.2, pig/0.4, eats/0.6, sleeps/1.3; final state 2/0.5), and their intersection (arcs fat/0.7, pig/0.7, eats/0.6, sleeps/1.9; final states 2,0/0.8 and 2,2/1.3)]
Paths 0 → 0 → 1 → 2 and 0 → 1 → 1 → 0 both accept “fat pig eats”.
So must the new machine: along path 0,0 → 0,1 → 1,1 → 2,0.
example adapted from M. Mohri
600.325/425 Declarative Methods - J. Eisner 19
Intersection
[Figure: the same two machines; arcs fat/0.5 and fat/0.2 combine into fat/0.7, from 0,0 to 0,1]
Paths 0 → 0 and 0 → 1 both accept “fat”.
So must the new machine: along path 0,0 → 0,1.
600.325/425 Declarative Methods - J. Eisner 20
Intersection
[Figure: the same two machines; arcs pig/0.3 and pig/0.4 combine into pig/0.7, from 0,1 to 1,1]
Paths 0 → 1 and 1 → 1 both accept “pig”.
So must the new machine: along path 0,1 → 1,1.
600.325/425 Declarative Methods - J. Eisner 21
Intersection
[Figure: the same two machines; arcs sleeps/0.6 and sleeps/1.3 combine into sleeps/1.9, from 1,1 to 2,2; final weights 0.8 and 0.5 combine into 2,2/1.3]
Paths 1 → 2 and 1 → 2 both accept “sleeps”.
So must the new machine: along path 1,1 → 2,2.
600.325/425 Declarative Methods - J. Eisner 22
Intersection
[Figure: the same two machines; arcs eats/0 and eats/0.6 combine into eats/0.6, from 1,1 to 2,0. The finished intersection has arcs fat/0.7, pig/0.7, eats/0.6, sleeps/1.9 and final states 2,0/0.8 and 2,2/1.3]
600.325/425 Declarative Methods - J. Eisner 23
Intersection
Why is intersection guaranteed to terminate?
How big a machine might be produced by
intersection?
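The walkthrough on the preceding slides can be replayed in code. The sketch below is a reconstruction under assumptions: arc endpoints are inferred from the path descriptions ("0012", "0110", "12"), state 0 of the second machine is assumed to be final with weight 0 (so that 2,0/0.8 comes out), both machines are assumed deterministic, and weights are costs that add. It also hints at the two questions: only pairs of states are ever created, so the search must stop, with at most |Q1|·|Q2| states.

```python
from collections import deque

# arcs: {(state, word): (next_state, weight)}; finals: {state: final_weight}
M1 = {"start": 0,
      "arcs": {(0, "fat"): (0, 0.5), (0, "pig"): (1, 0.3),
               (1, "eats"): (2, 0.0), (1, "sleeps"): (2, 0.6)},
      "finals": {2: 0.8}}
M2 = {"start": 0,
      "arcs": {(0, "fat"): (1, 0.2), (1, "pig"): (1, 0.4),
               (1, "eats"): (0, 0.6), (1, "sleeps"): (2, 1.3)},
      "finals": {0: 0.0, 2: 0.5}}   # assumption: state 0 is final with weight 0

def intersect(m1, m2):
    """Cross-product construction, exploring only reachable state pairs."""
    start = (m1["start"], m2["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        q1, q2 = queue.popleft()
        if q1 in m1["finals"] and q2 in m2["finals"]:
            finals[(q1, q2)] = m1["finals"][q1] + m2["finals"][q2]
        for (src, word), (dst1, w1) in m1["arcs"].items():
            if src != q1 or (q2, word) not in m2["arcs"]:
                continue                      # both machines must read the word
            dst2, w2 = m2["arcs"][(q2, word)]
            arcs[((q1, q2), word)] = ((dst1, dst2), w1 + w2)
            if (dst1, dst2) not in seen:
                seen.add((dst1, dst2))
                queue.append((dst1, dst2))
    return {"start": start, "arcs": arcs, "finals": finals, "states": seen}

def cost(m, words):
    """Total cost of accepting the word sequence, or None if it is rejected."""
    q, total = m["start"], 0.0
    for w in words:
        if (q, w) not in m["arcs"]:
            return None
        q, weight = m["arcs"][(q, w)]
        total += weight
    return total + m["finals"][q] if q in m["finals"] else None

BOTH = intersect(M1, M2)
```

Under these assumptions `BOTH["states"]` is the five pairs shown on the slides, and the cost of "fat pig eats" in `BOTH` is the sum of its costs in the two original machines.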
600.325/425 Declarative Methods - J. Eisner 24
Given a regular expression …
1. Make a parse tree for it
2. Build up the FSA from the bottom up
Example: (ab|c)*(bb*a)
Parse tree:
concat
  closure
    union
      concat
        a
        b
      c
  concat
    concat
      b
      closure
        b
    a
600.325/425 Declarative Methods - J. Eisner 25
Given an FSA …
Find a regular expression describing all paths from initial state 1 to final state 5.
[Figure: FSA with states 1, 2, 3, 4, 5]
Paths from 1 to 5:
e12 ((e23 e33* e35) | e24 e45)
600.325/425 Declarative Methods - J. Eisner 26
Given an FSA …
Find a regular expression describing all paths from initial state 1 to final state 5.
[Figure: FSA with states 1, 2, 3, 4, 5]
Paths from 1 to 5 (before):
e12 ((e23 e33* e35) | e24 e45)
Paths from 1 to 5 (now):
e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)
600.325/425 Declarative Methods - J. Eisner 27
Given an FSA …
Find a regular expression describing all paths from initial state 1 to final state 5.
[Figure: FSA with states 1, 2, 3, 4, 5]
Paths from 1 to 5 (before):
e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)
Paths from 1 to 5 (now):
e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45))
    | (e24 (e43 e33* e34)* (e45 | e43 e35)))
600.325/425 Declarative Methods - J. Eisner 28
Given an FSA …
Find a regular expression describing all paths from initial state 1 to final state 5.
[Figure: FSA with states 1, 2, 3, 4, 5]
Paths from 1 to 5:
???
600.325/425 Declarative Methods - J. Eisner 29
Does there exist any path from
initial state 1 to final state 5?
Let’s do a simpler variant first …
[Figure: graph on vertices 1, 2, 3, 4, 5]
If there’s a way to get
from 1 to 3 and from
3 to 5, then there's a
way to get from 1 to 5.
slide thanks to R. Tamassia & M. Goodrich (modified)
More generally, transitive closure problem:
For each A, B, does there exist any path
from A to B?
600.325/425 Declarative Methods - J. Eisner 30
If there’s a way to get
from 1 to 3 and from
3 to 5, then there's a
way to get from 1 to 5.
Does there exist any path from
initial state 1 to final state 5?
Let’s do a simpler variant first …
Hmm … should I look for a 1 → 3 path first in hopes of using it to build a 1 → 5 path? Or vice-versa?
More generally, transitive closure problem:
For each A, B, does there exist any path from A to B?
[Figure: graph on vertices 1, 2, 3, 4, 5, plus two graphs drawn with vertex labels in the orders 1 2 3 5 and 1 2 5 3]
600.325/425 Declarative Methods - J. Eisner 31
If there’s a way to get
from 1 to 3 and from
3 to 5, then there's a
way to get from 1 to 5.
Let’s do a simpler variant first …
Hmm … should I look for a 1 → 3 path first in hopes of using it to build a 1 → 5 path? Or vice-versa?
[Figure: two graphs drawn with vertex labels in the orders 1 2 3 5 and 1 2 5 3]
Option #1: Gradually build up longer paths (length-1, length-2, length-3 …)
How do we deal with cycles?
Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.
Both have $O(n^3)$ runtime.
But option #2 allows more flexible handling of cycles. We’ll need that when we return to our FSA problem.
600.325/425 Declarative Methods - J. Eisner 32
If there’s a way to get
from 1 to 3 and from
3 to 5, then there's a
way to get from 1 to 5.
Floyd-Warshall transitive closure algorithm
Hmm … should I look for a 1 → 3 path first in hopes of using it to build a 1 → 5 path? Or vice-versa?
[Figure: graph drawn with vertex labels in the order 1 2 5 3]
Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.
What are the paths of order 0?
What are the paths of order 1?
What are the paths of order 2?
How big can a path’s order be?
What are the paths of order 5?
600.325/425 Declarative Methods - J. Eisner 33
If there’s a way to get
from 1 to 3 and from
3 to 5, then there's a
way to get from 1 to 5.
Floyd-Warshall transitive closure algorithm
Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.
Definition: $p^k_{ij}$ = true iff there is an i → j path of order at most k.
1. Define $p^0$: For each i, j, set $p^0_{ij}$ = true iff there is an i → j edge.
2. For k = 1, 2, …, n, define $p^k$:
[Figure: graph on vertices 1, 2, 3, 4, 5]
600.325/425 Declarative Methods - J. Eisner 34
If there’s a way to get
from 1 to 3 and from
3 to 5, then there's a
way to get from 1 to 5.
Floyd-Warshall transitive closure algorithm
Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.
Definition: $p^k_{ij}$ = true iff there is an i → j path of order at most k.
1. Define $p^0$: For each i, j, set $p^0_{ij}$ = true iff there is an i → j edge.
2. For k = 1, 2, …, n, define $p^k$:
   For each i, j, set $p^k_{ij} = p^{k-1}_{ij} \lor (p^{k-1}_{ik} \land p^{k-1}_{kj})$
3. Return $p^n$ (e.g., what is $p^n_{1n}$?)
[Figure: a path i → k → j. The i → k part uses only vertices numbered 1,…,k-1, and so does the k → j part. New: the combined path still uses only vertices numbered 1,…,k.]
parts of slide thanks to R. Tamassia & M. Goodrich
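The three steps above translate almost line for line into code. This is a minimal sketch: the function name is hypothetical, and the example edge list is an assumption read off the edge names (e12, e23, e33, …) used in the path expressions on the FSA slides, not the graph from the figure.

```python
def transitive_closure(n, edges):
    """Floyd-Warshall reachability on vertices 1..n.

    After the iteration for a given k, p[i][j] is True iff there is an
    i -> j path whose intermediate vertices are all numbered <= k.
    """
    # Step 1: p^0 has exactly the edges (index 0 is unused padding).
    p = [[False] * (n + 1) for _ in range(n + 1)]
    for i, j in edges:
        p[i][j] = True
    # Step 2: allow vertex k as an intermediate stop, for k = 1..n.
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                # Updating in place is safe: row k and column k do not
                # change during iteration k.
                p[i][j] = p[i][j] or (p[i][k] and p[k][j])
    # Step 3: p is now p^n.
    return p

# Assumed example graph (edge names taken from the path expressions).
EDGES = [(1, 2), (2, 3), (3, 3), (3, 5), (3, 4), (4, 5), (2, 4), (4, 3)]
REACH = transitive_closure(5, EDGES)
```

The three nested loops over n vertices give the $O(n^3)$ runtime mentioned earlier; `REACH[1][5]` answers the original question about a path from state 1 to state 5.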
600.325/425 Declarative Methods - J. Eisner 35
Floyd-Warshall Example
[Figure: example graph; surviving vertex labels v1, v2, v3, v4, v5, v6]
slide thanks to R. Tamassia & M. Goodrich (modified)
600.325/425 Declarative Methods - J. Eisner 36
Floyd-Warshall: k=1 (computes $p^1$ from $p^0$)
[Figure: the example graph at step k=1]
slide thanks to R. Tamassia & M. Goodrich (modified)
600.325/425 Declarative Methods - J. Eisner 37
Floyd-Warshall: k=2 (computes $p^2$ from $p^1$)
[Figure: the example graph at step k=2]
slide thanks to R. Tamassia & M. Goodrich (modified)
600.325/425 Declarative Methods - J. Eisner 38
Floyd-Warshall: k=3 (computes $p^3$ from $p^2$)
[Figure: the example graph at step k=3]
slide thanks to R. Tamassia & M. Goodrich (modified)
600.325/425 Declarative Methods - J. Eisner 39
Floyd-Warshall: k=4 (computes $p^4$ from $p^3$)
[Figure: the example graph at step k=4]
slide thanks to R. Tamassia & M. Goodrich (modified)
600.325/425 Declarative Methods - J. Eisner 40
Floyd-Warshall: k=5 (computes $p^5$ from $p^4$)
[Figure: the example graph at step k=5]
slide thanks to R. Tamassia & M. Goodrich (modified)
600.325/425 Declarative Methods - J. Eisner 41
Floyd-Warshall: k=6 (computes $p^6$ from $p^5$)
[Figure: the example graph at step k=6]
slide thanks to R. Tamassia & M. Goodrich (modified)
600.325/425 Declarative Methods - J. Eisner 42
Floyd-Warshall: k=7 (computes $p^7$ from $p^6$)
[Figure: the example graph at step k=7]
slide thanks to R. Tamassia & M. Goodrich (modified)
600.325/425 Declarative Methods - J. Eisner 43
Regular expression version (Kleene/Tarjan)
Find a regular expression describing all paths from initial state 1 to final state 5.
[Figure: FSA with states 1, 2, 3, 4, 5]
Paths from 1 to 5 (before):
e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)
Paths from 1 to 5 (now):
e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45))
    | (e24 (e43 e33* e34)* (e45 | e43 e35)))
600.325/425 Declarative Methods - J. Eisner 44
Regular expression version (Kleene/Tarjan)
Find a regular expression
describing all paths from
initial state 1 to final state 5.
[Figure: FSA with states 1, 2, 3, 4, 5]
Paths from 1 to 5:
???
600.325/425 Declarative Methods - J. Eisner 45
If there’s a way to get
from 1 to 3 and from
3 to 5, then there's a
way to get from 1 to 5.
Regular expression version (Kleene/Tarjan)
Definition: $p^k_{ij}$ = regular expression describing all i → j paths that have order at most k.
1. Define $p^0$: For each i, j, set $p^0_{ij} = e_{ij}$ if that edge exists, else ∅.
2. For k = 1, 2, …, n, define $p^k$:
   For each i, j, set $p^k_{ij} = p^{k-1}_{ij} \mid \left(p^{k-1}_{ik}\,(p^{k-1}_{kk})^{*}\,p^{k-1}_{kj}\right)$
   (a regexp using all three of union, concat, closure!)
3. Return $p^n$ (e.g., what is $p^n_{1n}$?)
[Figure: a path i → k → j. The i → k part uses only vertices numbered 1,…,k-1, and so does the k → j part. New: the combined path still uses only vertices numbered 1,…,k.]
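The same loop structure as the Boolean version works when the entries are regular expressions instead of truth values. The sketch below builds Python `re` syntax as strings and uses `None` for the empty set ∅. The single-letter edge labels are hypothetical stand-ins for e12, e23, and so on, chosen so that the result can be tested with `re.fullmatch`.

```python
import re

def union(r, s):
    """r | s, where None stands for the empty set of paths."""
    if r is None:
        return s
    if s is None:
        return r
    return f"(?:{r}|{s})"

def concat(*parts):
    """Concatenation; the empty set absorbs everything."""
    if any(part is None for part in parts):
        return None
    return "".join(parts)

def star(r):
    """Closure; the closure of the empty set matches only the empty string."""
    return "" if r is None else f"(?:{r})*"

def path_regex(n, edges, source, target):
    """Regexp for all source -> target paths on vertices 1..n.

    edges maps (i, j) to a literal label for that arc.
    """
    p = [[None] * (n + 1) for _ in range(n + 1)]      # p^0
    for (i, j), label in edges.items():
        p[i][j] = union(p[i][j], re.escape(label))
    for k in range(1, n + 1):                         # p^k from p^(k-1)
        loop = star(p[k][k])
        new = [row[:] for row in p]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                new[i][j] = union(p[i][j], concat(p[i][k], loop, p[k][j]))
        p = new
    return p[source][target]

# Hypothetical one-letter names for the arcs e12, e23, e33, e35, e34, e45, e24, e43.
EDGES = {(1, 2): "p", (2, 3): "q", (3, 3): "r", (3, 5): "s",
         (3, 4): "t", (4, 5): "u", (2, 4): "v", (4, 3): "w"}
PATHS_1_TO_5 = path_regex(5, EDGES, 1, 5)
```

A string such as "pvwtws" (arcs e24 e43 e34 e43 e35 after e12) matches `PATHS_1_TO_5`, while "pq" does not, because it stops at state 3. The generated expression is correct but not simplified, so it can be much longer than the hand-written answers.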
parts of slide thanks to R. Tamassia & M. Goodrich
600.325/425 Declarative Methods - J. Eisner 46
Paths from 1 to 5:
e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)
Regular expression version (Kleene/Tarjan)
What if the arcs have labels?
1 2 3
4
Paths from 1 to 5:
e12 ( (e23 (e33 | e34 e43 )* (e35 | e34 e45))
| (e24 (e43 e33* e34 )* (e45 | e43 e35)))
5 >
a
a b
c
b
aa
600.325/425 Declarative Methods - J. Eisner 47
Paths from 1 to 5:
e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)
Regular expression version (Kleene/Tarjan)
What if the arcs have labels?
Just substitute them in:
1 2 3
4
Paths from 1 to 5:
e12 ( (e23 (e33 | e34 e43 )* (e35 | e34 e45))
| (e24 (e43 e33* e34 )* (e45 | e43 e35)))
5 > a b
c
b
aa a
a b c a
b
b
a c aa
a
600.325/425 Declarative Methods - J. Eisner 48
Regular languages as points in a high-dimensional space
abc           →  abc
abc:2         →  2abc (weighted)
ab|ac         →  ab + ac
a(b|c)        →  ab + ac
a(b|(c:2))    →  ab + 2ac
ab*c          →  ac + abc + abbc + abbbc + …
a(b:2)*c      →  ac + 2abc + 4abbc + 8abbbc + …
Instead of dimensions $x^2$, $y^2$, $xy$, etc.,
every possible string is a dimension
and its coefficient is the coordinate (often 0)
600.325/425 Declarative Methods - J. Eisner 49
Suppose P, Q are two regular languages represented as these “formal power series.”
What is the sum P+Q? Union!
We double-count …
What is the product PQ? Concatenation!
What is the Hadamard product P ⊙ Q? (i.e., the dot product before you sum: x ⊙ y = ($x_1 y_1$, $x_2 y_2$, …)) Intersection!
What is 1/(1-P)? * closure!
Could we use these techniques to classify strings using kernel SVMs?
Regular languages as points in a high-dimensional space
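These operations can be tried out on finite series stored as dictionaries from strings to coefficients. The sketch below is illustrative only: the function names are hypothetical, and because 1/(1-P) is an infinite series, `closure` truncates it at a chosen string length and assumes P gives the empty string coefficient 0.

```python
from collections import defaultdict

def add(p, q):
    """P + Q: union, which double-counts strings that are in both."""
    out = defaultdict(int)
    for series in (p, q):
        for s, w in series.items():
            out[s] += w
    return dict(out)

def multiply(p, q):
    """PQ: concatenation (multiply coefficients, concatenate strings)."""
    out = defaultdict(int)
    for s, ws in p.items():
        for t, wt in q.items():
            out[s + t] += ws * wt
    return dict(out)

def hadamard(p, q):
    """Coordinate-wise product: intersection."""
    return {s: p[s] * q[s] for s in p if s in q}

def closure(p, max_len):
    """1/(1-P) = 1 + P + PP + ..., truncated to strings of length <= max_len."""
    if p.get("", 0) != 0:
        raise ValueError("assumption: P must not contain the empty string")
    result, power = {"": 1}, {"": 1}
    while power:
        power = {s: w for s, w in multiply(power, p).items() if len(s) <= max_len}
        result = add(result, power)
    return result

# a(b:2)*c, truncated so that b repeats at most three times
SERIES = multiply(multiply({"a": 1}, closure({"b": 2}, 3)), {"c": 1})
```

With these definitions `SERIES` has the coefficients 1, 2, 4, 8 on ac, abc, abbc, abbbc, matching the last row of the table of examples.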
600.325/425 Declarative Methods
- J. Eisner 50
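Function from strings to ...
              Acceptors (FSAs)     Transducers (FSTs)
Unweighted    {false, true}        strings
Weighted      numbers              (string, num) pairs
[Figure: one example machine per cell. Unweighted acceptor: arcs a, c, ε. Unweighted transducer: arcs a:x, c:z, ε:y. Weighted acceptor: arcs a/.5, c/.7, ε/.5, and weight .3. Weighted transducer: arcs a:x/.5, c:z/.7, ε:y/.5, and weight .3]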
600.325/425 Declarative Methods
- J. Eisner 51
Sample functions
              Acceptors (FSAs)                        Transducers (FSTs)
              {false, true}                           strings
Unweighted    Grammatical?                            Markup, Correction, Translation
              numbers                                 (string, num) pairs
Weighted      How grammatical? Better, how likely?    Good markups, Good corrections, Good translations
600.325/425 Declarative Methods
- J. Eisner 52
Sample data, encoded same way
              Acceptors (FSAs)                                        Transducers (FSTs)
              {false, true}                                           strings
Unweighted    Input string, Corpus, Dictionary                        Bilingual corpus, Bilingual lexicon, Database (WordNet)
              numbers                                                 (string, num) pairs
Weighted      Input lattice, Reweighted corpus, Weighted dictionary   Prob. bilingual lexicon, Weighted database
[Figure: example machines; surviving arc labels b a n a n a and a i d d]
600.325/425 Declarative Methods
- J. Eisner 53
Some Applications
Prediction, classification, generation of text
More generally, “filling in the blanks” (probabilistic reconstruction of hidden data)
Speech recognition
Machine translation, OCR, other noisy-channel models
Sequence alignment / Edit distance / Computational biology
Text normalization, segmentation, categorization
Information extraction
Stochastic phonology/morphology, including lexicon
Tagging, chunking, finite-state parsing
Syntactic transformations (smoothing PCFG rulesets)
600.325/425 Declarative Methods
- J. Eisner 54
Finite-state “programming”
Programming Langs                  Finite-State World
Function                           Function on strings
programmer → Source code           programmer → Regular expression (e.g., a?c*)
compiler → Object code             regexp compiler → Finite-state machine
optimizer → Better object code     determinization, minimization, pruning → Better object code
[Figure: finite-state machine for a?c*, with arcs labeled a and c]
600.325/425 Declarative Methods
- J. Eisner 55
Finite-state “programming”
Programming Langs                                               Finite-State World
Function composition                                            FST/WFST composition
Function inversion (available in Prolog)                        FST inversion
Higher-order functions ...                                      Finite-state operators ...
Small modular cooperating functions (structured programming)    Small modular regexps, combined via operators
600.325/425 Declarative Methods
- J. Eisner 56
Finite-state “programming”
Programming Langs Finite-State World
More features you wish other languages had!
600.325/425 Declarative Methods
- J. Eisner 57
Finite-State Operations
Projection GIVES YOU marginal distribution
p(x) = domain( p(x,y) )
p(y) = range( p(x,y) )
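To see why projection yields a marginal, it helps to treat a weighted transducer as a table of (input, output) pairs with probabilities. The sketch below uses a made-up joint distribution (only the a : b / 0.3 entry echoes the slide) and hypothetical function names; real projection operates on machines, not tables.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint distribution p(x, y), stored as {(x, y): probability}.
JOINT = {("a", "b"): Fraction(3, 10),
         ("a", "c"): Fraction(2, 10),
         ("d", "b"): Fraction(5, 10)}

def domain(joint):
    """Project onto the input side: p(x) = sum over y of p(x, y)."""
    out = defaultdict(Fraction)
    for (x, _y), w in joint.items():
        out[x] += w
    return dict(out)

def range_(joint):
    """Project onto the output side: p(y) = sum over x of p(x, y)."""
    out = defaultdict(Fraction)
    for (_x, y), w in joint.items():
        out[y] += w
    return dict(out)
```

Dropping one tape and adding the weights of the pairs that collapse together is exactly marginalization.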
[Figure: arc a : b / 0.3, shown twice]
600.325/425 Declarative Methods
- J. Eisner 58
Finite-State Operations
Probabilistic union GIVES YOU mixture model
p(x) +_{0.3} q(x) = 0.3 p(x) + 0.7 q(x)
[Figure: machines for p(x) and q(x), entered with weights 0.3 and 0.7]
600.325/425 Declarative Methods
- J. Eisner 59
Finite-State Operations
Probabilistic union GIVES YOU mixture model
p(x) +_λ q(x) = λ p(x) + (1-λ) q(x)
[Figure: machines for p(x) and q(x), entered with weights λ and 1-λ]
Learn the mixture parameter λ!
600.325/425 Declarative Methods
- J. Eisner 60
Finite-State Operations
Composition GIVES YOU chain rule
p(x|y) o p(y|z) = p(x|z)
p(x|y) o p(y|z) o z = p(x,z)
The most popular statistical FSM operation
Cross-product construction
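On finite tables, composition is a sum over the shared middle variable, which is where the chain rule comes from. The sketch below uses made-up conditional tables and a hypothetical `compose` function. It assumes, as the slide's first equation does, that x depends on z only through y.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical conditional tables, stored as {(left, right): probability}.
P_X_GIVEN_Y = {("x1", "y1"): Fraction(1, 2), ("x2", "y1"): Fraction(1, 2),
               ("x1", "y2"): Fraction(1)}
P_Y_GIVEN_Z = {("y1", "z1"): Fraction(1, 4), ("y2", "z1"): Fraction(3, 4)}

def compose(p_left, p_right):
    """(p_left o p_right)(x, z) = sum over y of p_left(x, y) * p_right(y, z)."""
    out = defaultdict(Fraction)
    for (x, y), w1 in p_left.items():
        for (y2, z), w2 in p_right.items():
            if y == y2:                   # the middle strings must match
                out[(x, z)] += w1 * w2
    return dict(out)

P_X_GIVEN_Z = compose(P_X_GIVEN_Y, P_Y_GIVEN_Z)
```

Matching the middle string of one machine against the other is what the cross-product construction does state by state.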
600.325/425 Declarative Methods
- J. Eisner 61
Finite-State Operations
Concatenation, probabilistic closure HANDLE unsegmented text
p(x) q(x)
p(x) *_{0.3}
[Figure: machine for p(x) followed by machine for q(x); closure of p(x), with weights 0.3 and 0.7]
Just glue together machines for the different segments, and let them figure out how to align with the text
600.325/425 Declarative Methods
- J. Eisner 62
Finite-State Operations
Directed replacement MODELS noise or postprocessing
p(x, noisy y) = p(x,y) o D
D = noise model defined by dir. replacement
Resulting machine compensates for noise or postprocessing
600.325/425 Declarative Methods
- J. Eisner 63
Finite-State Operations
Intersection GIVES YOU product models
e.g., exponential / maxent, perceptron, Naïve Bayes, …
p(x) & q(x) = p(x)*q(x)
pNB(y | x) ∝ p(A(x)|y) & p(B(x)|y) & p(y)
Cross-product construction (like composition)
Need a normalization op too: computes Σ_x f(x), the “pathsum” or “partition function”
600.325/425 Declarative Methods
- J. Eisner 64
Finite-State Operations
Conditionalization (new operation)
p(y | x) = condit( p(x,y) )
Construction: reciprocal(determinize(domain(p(x,y)))) o p(x,y)
(determinize: not possible for all weighted FSAs)
Resulting machine can be composed with other distributions: p(y | x) * q(x)
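On a finite table the construction reduces to dividing each joint probability by the marginal of its input. The sketch below mirrors the three named steps (domain, reciprocal, compose) with hypothetical function names and a made-up joint distribution. There is no determinize step because a table keyed by x already has one weight per x.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint distribution p(x, y), stored as {(x, y): probability}.
JOINT = {("a", "b"): Fraction(3, 10),
         ("a", "c"): Fraction(2, 10),
         ("d", "b"): Fraction(5, 10)}

def domain(joint):
    """p(x) = sum over y of p(x, y)."""
    out = defaultdict(Fraction)
    for (x, _y), w in joint.items():
        out[x] += w
    return dict(out)

def reciprocal(weights):
    """Replace every weight w by 1/w."""
    return {x: 1 / w for x, w in weights.items()}

def condit(joint):
    """p(y | x) = (1 / p(x)) * p(x, y)."""
    inverse_marginal = reciprocal(domain(joint))
    return {(x, y): inverse_marginal[x] * w for (x, y), w in joint.items()}

def reweight(conditional, q):
    """p(y | x) * q(x): compose the conditional with a new input distribution."""
    return {(x, y): w * q[x] for (x, y), w in conditional.items() if x in q}

CONDITIONAL = condit(JOINT)
NEW_JOINT = reweight(CONDITIONAL, {"a": Fraction(1, 4), "d": Fraction(3, 4)})
```

`NEW_JOINT` keeps the original behavior of y given x while swapping in a different distribution over x.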
600.325/425 Declarative Methods
- J. Eisner 65
Other Useful Finite-State
Constructions
Complete graphs YIELD n-gram models
Other graphs YIELD fancy language models (skips, caching, etc.)
Compilation from other formalism → FSM:
Wordlist (cf. trie), pronunciation dictionary ...
Speech hypothesis lattice
Decision tree (Sproat & Riley)
Weighted rewrite rules (Mohri & Sproat)
TBL or probabilistic TBL (Roche & Schabes)
PCFG (approximation!) (e.g., Mohri & Nederhof)
Optimality theory grammars (e.g., Eisner)
Logical description of set (Vaillette; Klarlund)
600.325/425 Declarative Methods
- J. Eisner 66
Regular Expression Calculus as a Programming Language
Programming Langs                  Finite-State World
Function                           Function on strings
programmer → Source code           programmer → Regular expression (e.g., a?c*)
compiler → Object code             regexp compiler → Finite-state machine
optimizer → Better object code     determinization, minimization, pruning → Better object code
[Figure: finite-state machine for a?c*, with arcs labeled a and c]
600.325/425 Declarative Methods
- J. Eisner 67
Regular Expression Calculus
as a Modelling Language
Oops! Statistical FSMs still done “in assembly language”!
Build machines by manipulating arcs and states
For training, get the weights by some exogenous procedure and patch them onto arcs
you may need extra training data for this
you may need to devise and implement a new variant of EM
Would rather build models declaratively
((a*_{.7} b) +_{.5} (a b*_{.6})) ∘ repl_{.9}((a:(b +_{.3} ε))*, L, R)
600.325/425 Declarative Methods
- J. Eisner 68
A Simple Example: Segmentation
tapirseatgrass → tapirs eat grass? tapir seat grass? tap irse at grass?
...
Strategy: build a finite-state model of p(spaced text, spaceless text)
Then maximize p(???, tapirseatgrass)
Start with a distribution p(English word) → a machine D (for dictionary)
Construct p(spaced text) → (D space)*_{0.99} D
Compose with p(spaceless | spaced) → ((¬space) + (space:ε))*
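The strategy can be imitated without any finite-state toolkit by a small dynamic program over split points. The sketch below is a stand-in, not the composition itself: the dictionary probabilities are made up, the 0.99 is charged once per additional word as in (D space)*_{0.99} D, the constant stopping probability is ignored, and the function name is hypothetical.

```python
import math

# Hypothetical toy dictionary D: p(English word).
D = {"tapir": 0.20, "tapirs": 0.15, "eat": 0.25, "seat": 0.05,
     "grass": 0.20, "tap": 0.10, "at": 0.05}
P_CONTINUE = 0.99   # probability of adding another word, as in (D space)*_{0.99} D

def best_segmentation(text, word_prob=D, p_continue=P_CONTINUE):
    """Return (log probability, words) for the best spacing of text, or None."""
    # best[i] describes the best segmentation of text[:i].
    best = [None] * (len(text) + 1)
    best[0] = (0.0, [])
    for end in range(1, len(text) + 1):
        for start in range(end):
            word = text[start:end]
            if best[start] is None or word not in word_prob:
                continue
            score = best[start][0] + math.log(word_prob[word])
            if start > 0:                 # every word after the first costs a "continue"
                score += math.log(p_continue)
            if best[end] is None or score > best[end][0]:
                best[end] = (score, best[start][1] + [word])
    return best[len(text)]

RESULT = best_segmentation("tapirseatgrass")
```

With these made-up numbers "tapirs eat grass" beats "tapir seat grass", and "tap irse at grass" is never considered because "irse" is not in D.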
600.325/425 Declarative Methods
- J. Eisner 69
train on spaced or spaceless text
Strategy: build a finite-state model of p(spaced text, spaceless text)
Then maximize p(???, tapirseatgrass)
Start with a distribution p(English word) → a machine D (for dictionary)
Construct p(spaced text) → (D space)*_{0.99} D
Compose with p(spaceless | spaced) → ((¬space) + (space:ε))*
A Simple Example: Segmentation
D should include novel words:
D = KnownWord +_{0.99} (Letter*_{0.85} Suffix)
Could improve to consider letter n-grams, morphology ...
Noisy channel could do more than just delete spaces: Vowel deletion (Semitic); OCR garbling (cl → d, ri → n, rn → m ...)
600.325/425 Declarative Methods - J. Eisner 70