Shake-and-Bake MT

Chris Brew, The Ohio State University
http://www.purl.org/NET/cbrew.htm

Shake-and-Bake MT, 795V, Autumn 2005

Ed Hovy’s vision

Automatic summarization and document understanding could be interesting

System should be able to distill information down to a few salient points.

Users are no longer overwhelmed by floods of irrelevant information.

NLP people would enjoy picking out the salient points and generating language that expresses these points

Ed Hovy’s plaint

To know which are the key points takes world knowledge

Summarization systems have no world knowledge and little ability to represent key points in robust ways.

What they can do is to identify sentences that seem to be important.

They select a series of sentences then smooth them back together

NLP people still seem to like this, but perhaps they should not.

Generating from logical form

John threw a large red ball
∃x. throw(j,x) ∧ large(x) ∧ red(x) ∧ ball(x)

John threw a red ball that is large
∃x. throw(j,x) ∧ red(x) ∧ ball(x) ∧ large(x)

John threw a large ball that is red
∃x. throw(j,x) ∧ large(x) ∧ ball(x) ∧ red(x)

Generating from logical form

Shieber, CL 20(1), 1993

The three terms are equivalent according to first-order logic, but we might not want them to be equivalent for purposes of generation.

Generation systems can be, and often are, broken down into a strategic component and a tactical component.

We might want the tactical component to be tied to a particular grammar, but the strategic component should know nothing of grammar.

∃x. throw(j,x) ∧ large(x) ∧ red(x) ∧ ball(x)

∃x. throw(j,x) ∧ red(x) ∧ ball(x) ∧ large(x)

∃x. throw(j,x) ∧ large(x) ∧ ball(x) ∧ red(x)
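
A minimal sketch of why these three forms count as the same meaning, assuming SWI-Prolog. The list encoding, the atom x for the bound variable, and the predicate names lf/2 and same_conjuncts/2 are illustrative assumptions, not part of the slides. It captures only reordering of conjuncts, which is far weaker than full first-order equivalence.

```prolog
% Hypothetical encoding: each logical form is the list of conjuncts under
% a single existential quantifier; the bound variable is the atom x.
lf(1, [throw(j,x), large(x), red(x), ball(x)]).
lf(2, [throw(j,x), red(x), ball(x), large(x)]).
lf(3, [throw(j,x), large(x), ball(x), red(x)]).

% Two forms agree up to reordering when their sorted conjunct lists are
% identical. msort/2 sorts without removing duplicates.
same_conjuncts(A, B) :-
    msort(A, Sorted),
    msort(B, Sorted).

% Succeeds when all three forms above are reorderings of one another.
all_equivalent :-
    lf(1, A), lf(2, B), lf(3, C),
    same_conjuncts(A, B),
    same_conjuncts(B, C).
```

The query `?- all_equivalent.` succeeds, yet a grammar would typically derive only one of the three orders, which is the gap the following slides discuss.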

Generating from logical form

We might want the generator to be tied to a particular grammar, but the strategic component should know nothing of grammar.

Thus, we may wish that the strategic component have the freedom to pass to the tactical component any LF that has the right meaning.

∃x. throw(j,x) ∧ large(x) ∧ red(x) ∧ ball(x)

∃x. throw(j,x) ∧ red(x) ∧ ball(x) ∧ large(x)

∃x. throw(j,x) ∧ large(x) ∧ ball(x) ∧ red(x)

Canonical logical forms

The grammar assigns a logical form to each string.

In general the one that it assigns is only one of the many that could express that meaning.

We call this the canonical logical form.

Shieber argues that the generator should be able to generate from non-canonical logical forms.
• Reason 1: Canonicality is a fact about the grammar, not the meanings.
• Reason 2: The reasoner would otherwise have to know details of how the generator wants to receive logical forms.

The problem

Regrettably, there are no effective procedures for mapping non-canonical logical forms to canonical ones.

One might, for example, use a weaker logic (e.g. propositional logic) that offers normal forms.

But canonicality (grammar’s view of equivalence) and normalisation (logic’s view of equivalence) do not necessarily coincide.

For example, all those sentences about large red balls…

Machine translation

Machine translation is a hard engineering problem

All-pairs translation between n languages seems to require O(n²) separate systems.

If n = 16, n² = 256, which is more than the European Community can afford.

[Figure: translation links among KR, JP, EN, FR and DE]

Equivalence of Logical Forms

TLE

Translationally equivalent expressions

Modularity

We need:
• A parser for each source language, delivering some kind of logical form
• A transfer mechanism for each language pair, converting source language logical forms into target language logical forms
• A generator for each target language, converting logical forms back into strings of target language words

Modularity ?!

Gotcha! To keep the transfer mechanism simple, it is tempting to mess with the parser, but what is convenient for one language pair will be inconvenient for the others.

Cleverness in the parser will provide opportunities for complexity in the transfer mechanism.

The generator is likely to offer the same deadly opportunities for extra cleverness as the parser

Interlingua

This sounds like an argument for an interlingua-based approach.
• Each language is responsible for mapping into and out of a powerful logical language.
• Adding a new language is just a matter of adding mapping and unmapping components for the interlingua.

But this fails because:

It is really hard to ensure that each language maps to identical logical forms for things that are intertranslatable.

Determining equivalence of logical forms without help is undecidable for any logical form language with adequate logical power. The only place you’re going to get help is from heuristics about the mapping between language pairs.
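
As a rough illustration of the scaling argument that makes the interlingua tempting before these objections are considered, here is a small sketch assuming SWI-Prolog. The predicate names are hypothetical. It counts one transfer module per ordered pair of distinct languages, and one mapping plus one unmapping component per language for the interlingua.

```prolog
% One transfer module for every ordered pair of distinct languages.
transfer_modules(N, M) :-
    M is N * (N - 1).

% One mapping and one unmapping component for every language.
interlingua_components(N, M) :-
    M is 2 * N.
```

For n = 16, `transfer_modules(16, M)` gives M = 240, close to the n² = 256 quoted earlier, while `interlingua_components(16, M)` gives M = 32.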

Lexicalism

The Shake-and-Bake solution is to keep excess cleverness in check by adopting strong constraints on the architecture of the system.

Fortunately, HPSG and similar formalisms adopt lexicalism:
• The only meaningful elements of a grammar are its lexical items.
• Signs are combined by rules that introduce no independent meaning, but simply equate variables in the logical forms of the combining signs.

The derivable logical forms of the grammar are constructed entirely from templates introduced by lexical items.

The Shake-and-Bake idea

The transfer mechanism simply states equivalences between multisets of lexical items:
• {pay, attention, to} ≡ {faire, attention, à}
• {take, a, walk} ≡ {faire, une, promenade}
• {as, as} ≡ {aussi, que} (as fast as possible ≡ aussi vite que possible)

The representation of a sentence is a bag of extensions of lexical items, called its base.

Two sentences are translation equivalent if the bases are equivalent bags, and they obey the same constraints on the relevant logical form variables.

The Shake-and-Bake idea

Second condition is needed to keep “John loves Mary” distinct from “Mary loves John”, even though the two may be correlated. Condition on grammar: you must be able to find the semantics in SourceSign and Skolemize the variables.

Bag equivalence: find a set of equivalence statements that use each extended sign in the source bag once. The resulting target bag is the input to generation.

Both equivalencing and generation are non-deterministic. The first try at a target language sentence representation may not be one that can be sewn together into a target language sentence.
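
The bag-equivalence step can be sketched as follows, assuming SWI-Prolog. The predicates equiv/2, transfer/2 and remove_all/3 are hypothetical names, and the sketch uses bare words where the real system uses extended signs, so the constraints on logical form variables are ignored here.

```prolog
% Hypothetical equivalence statements between multisets of lexical items.
equiv([as, as],         [aussi, que]).
equiv([take, a, walk],  [faire, une, promenade]).
equiv([fast],           [vite]).
equiv([possible],       [possible]).

% transfer(+SourceBag, -TargetBag): cover the source bag with equivalence
% statements so that each source item is used exactly once. The result is
% a bag: the order of the returned list carries no meaning.
transfer([], []).
transfer([First|Rest0], Target) :-
    equiv(Lhs, Rhs),
    selectchk(First, Lhs, LhsRest),      % the statement must consume First
    remove_all(LhsRest, Rest0, Rest),    % and its other items, once each
    transfer(Rest, Target0),
    append(Rhs, Target0, Target).

% remove_all(+Items, +Bag0, -Bag): delete one occurrence of each item.
remove_all([], Bag, Bag).
remove_all([X|Xs], Bag0, Bag) :-
    selectchk(X, Bag0, Bag1),
    remove_all(Xs, Bag1, Bag).
```

A query such as `?- transfer([as, fast, as, possible], T).` yields a target bag containing aussi, que, vite and possible. Putting those items in order is left entirely to the generator.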

Advantages of Shake-and-Bake

The ordering of items from the target language bag is entirely a matter for the target language grammar.

No constraints on grammar formalism, provided the semantic forms on each side are mappable. Nothing stops the English grammar writer from using HPSG, the French one TAG, the Japanese one JPSG and the German one YAHCDF.

Head-switching and argument switching just work:
• Jan zwemt graag ≡ John enjoys swimming
• John likes Mary ≡ Marie plaît à Jean

See Whitelock 92 for the details

Disadvantages

New algorithm development needed to handle bag-generation

The input to the task consists of the following elements:
• A set (B) of lexical signs having cardinality |B|.
• A grammar (G) against which to parse this input string.

A solution to the problem consists of:
• A parse of any sequence (S) such that S contains all the elements of B.

We are interested in understanding how hard this is.

Shift-reduce parsing

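% Base case: one sign left on the stack and no input remaining.
shiftreduce([Sign], Sign, [], []).

% Shift: move the next input item onto the stack.
shiftreduce(P0, Sign, [Next|Bag0], Bag) :-
    push(Next, P0, P),
    shiftreduce(P, Sign, Bag0, Bag).

% Reduce: combine the top two stack items with a grammar rule.
shiftreduce(P0, Sign, Bag0, Bag) :-
    pop(First, P0, P1),
    pop(Second, P1, P2),
    rule(Mom, First, Second),
    push(Mom, P2, P),
    shiftreduce(P, Sign, Bag0, Bag).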

Shake and Bake generation

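% Base case: one sign left on the stack and no input remaining.
shake_and_bake([Sign], Sign, [], []).

% Shift: move the next item of the bag onto the stack.
shake_and_bake(P0, Sign, [Next|Bag0], Bag) :-
    push(Next, P0, P),
    shake_and_bake(P, Sign, Bag0, Bag).

% Reduce: combine the top of the stack with ANY other stack item.
shake_and_bake(P0, Sign, Bag0, Bag) :-
    pop(First, P0, P1),
    delete(Second, P1, P2),
    unordered_rule(Mom, First, Second),
    push(Mom, P2, P),
    shake_and_bake(P, Sign, Bag0, Bag).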
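
To make the generator above runnable, the following sketch supplies the missing pieces under stated assumptions (SWI-Prolog). The stack is a plain list, delete_any/3 plays the role of delete/3 on the slide (renamed here only to avoid clashing with the library predicate of the same name), and the sign(Category, Words) representation, lex/2, generate/2 and the toy categorial lexicon are all illustrative rather than taken from the original system.

```prolog
% Stack operations on a list; delete_any/3 removes any one element.
push(X, S, [X|S]).
pop(X, [X|S], S).
delete_any(X, S0, S) :- select(X, S0, S).

% Toy lexicon: sign(Category, Words). Result/Arg seeks Arg to its right.
lex(the,    sign(np/n(_),               [the])).
lex(fierce, sign(n([])/n([1|_]),        [fierce])).
lex(little, sign(n([1])/n([1,1|_]),     [little])).
lex(brown,  sign(n([1,1])/n([1,1,1|_]), [brown])).
lex(cat,    sign(n(_),                  [cat])).

% Functor and argument may sit in either stack position; the words of
% the functor always precede the words of its argument.
unordered_rule(sign(Res, Ws), sign(Res/Arg, Ws1), sign(Arg, Ws2)) :-
    append(Ws1, Ws2, Ws).
unordered_rule(sign(Res, Ws), sign(Arg, Ws2), sign(Res/Arg, Ws1)) :-
    append(Ws1, Ws2, Ws).

shake_and_bake([Sign], Sign, [], []).
shake_and_bake(P0, Sign, [Next|Bag0], Bag) :-
    push(Next, P0, P),
    shake_and_bake(P, Sign, Bag0, Bag).
shake_and_bake(P0, Sign, Bag0, Bag) :-
    pop(First, P0, P1),
    delete_any(Second, P1, P2),
    unordered_rule(Mom, First, Second),
    push(Mom, P2, P),
    shake_and_bake(P, Sign, Bag0, Bag).

% generate(+BagOfWords, -Sentence): look up each word, then search for
% an np that uses every sign in the bag.
generate(Bag, Words) :-
    maplist(lex, Bag, Signs),
    shake_and_bake([], sign(np, Words), Signs, []).
```

Calling `?- generate([cat, brown, the, little, fierce], Ws).` finds Ws = [the, fierce, little, brown, cat] whatever order the bag is given in. The search may revisit the same answer on backtracking.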

NP-completeness

It’s intuitively obvious that the change from “pick the second top element of the stack” to “pick any of the elements in the stack” introduces extra indeterminacy.

In fact, it turns out that bag generation is equivalent to the STABLE MÉNAGE À TROIS problem, and therefore NP-complete and likely intractable (Brew 92).

So the ground shifts to answering:
• Can we find a sensible algorithm anyway?
• What properties of linguistic signs shall we exploit?

English adjective ordering

The fierce little brown cat
? The brown fierce little cat
? The brown little fierce cat
? The little brown fierce cat

For the sake of argument, let's pretend that the top one is the only grammatical ordering. I'm not committed to this belief.

Grammar

Item     Remainder   Active Part
the      np          n(_)
fierce   n([])       n([1|_])
little   n([1])      n([1,1|_])
brown    n([1,1])    n([1,1,1|_])
cat      n(_)        <none>

Connections

Node  Category       Lexical Item  Nesting
0     np             <dummy>       1
1     np             the           0
2     n(_)           the           1
3     n([])          fierce        0
4     n([1|_])       fierce        1
5     n([1])         little        0
6     n([1,1|_])     little        1
7     n([1,1])       brown         0
8     n([1,1,1|_])   brown         1
9     n([1,1,1])     cat           0

The search space

[Figure: graph over nodes 0 to 9]

Link together pairs that may stand in functor/argument relationships. We still don’t know which elements do stand in functor/argument relationships.
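
The candidate links can be computed from the Connections table, as in the sketch below (SWI-Prolog assumed). It reads nesting 0 as a category that a lexical item offers and nesting 1 as a category that it seeks; that reading, and the predicate names node/4 and link/2, are assumptions made for illustration.

```prolog
% node(Id, Category, LexicalItem, Nesting), transcribed from the table.
node(0, np,           dummy,  1).
node(1, np,           the,    0).
node(2, n(_),         the,    1).
node(3, n([]),        fierce, 0).
node(4, n([1|_]),     fierce, 1).
node(5, n([1]),       little, 0).
node(6, n([1,1|_]),   little, 1).
node(7, n([1,1]),     brown,  0).
node(8, n([1,1,1|_]), brown,  1).
node(9, n([1,1,1]),   cat,    0).

% link(From, To): the category offered at From could fill the slot
% sought at To. The two nodes must belong to different lexical items,
% and their categories must be unifiable (tested without binding).
link(From, To) :-
    node(From, Cat0, Item0, 0),
    node(To,   Cat1, Item1, 1),
    Item0 \== Item1,
    \+ Cat0 \= Cat1.
```

With this encoding, `?- findall(T, link(3, T), Ts).` gives Ts = [2], while node 9 can link to 2, 4, 6 and 8.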

Completing the graph

[Figure: graph over nodes 0 to 9, with functor/argument links added]

Add lines linking functor and argument categories. Now the task of finding a parse comes down to finding a Hamiltonian path through the graph.

Applying constraints

[Figure: graph over nodes 0 to 9, with forced links highlighted]

We can immediately see that node 3 must be connected to node 2, since there are no other links leading away from node 3. And so on.

Mopping up

Once these links have been established, we can delete alternative links which they preclude. This results in the deletion of the lines from node 9 to nodes 6, 4 and 2, and that of the line from 7 to 2.

The resulting system can once again be simplified by deleting the line from node 7 to node 4, yielding a unique circuit through the graph. This corresponds to the correct analysis of “the fierce little brown cat”.

In this example the constraints encoded in the graph are sufficient to drive the analysis to a unique conclusion, without further search, but this will not always happen. We need a combination of constraint propagation with a facility for making guesses when confronted with a choice of alternatives.
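
The propagation step can be sketched as a simple fixpoint loop, assuming SWI-Prolog. This is a simplified, one-directional version written for illustration, not the algorithm of the original system: it looks only for offering nodes with a single remaining candidate, so it deletes links in a different order from the walk-through above, and it does not detect empty candidate lists. The From-Candidates representation and the names propagate/2 and prune/4 are hypothetical.

```prolog
% propagate(+Cs0, -Cs): Cs0 is a list of From-Candidates pairs. While
% some node has exactly one candidate that another node still lists,
% commit to that link and remove the candidate everywhere else.
propagate(Cs0, Cs) :-
    member(From-[To], Cs0),          % a forced link
    member(Other-Tos, Cs0),
    Other \== From,
    memberchk(To, Tos),              % still offered to another node
    !,
    maplist(prune(From, To), Cs0, Cs1),
    propagate(Cs1, Cs).
propagate(Cs, Cs).                   % nothing left to prune

% prune(+From, +To, +Pair0, -Pair): keep From's own entry, and delete
% To from the candidates of every other node.
prune(From, _, From-Tos, From-Tos) :- !.
prune(_, To, Other-Tos0, Other-Tos) :-
    exclude(==(To), Tos0, Tos).
```

Starting from `[1-[0], 3-[2], 5-[2,4], 7-[2,4,6], 9-[2,4,6,8]]`, the loop ends with one candidate per node: 1-[0], 3-[2], 5-[4], 7-[6], 9-[8], which reads off as “the fierce little brown cat”. When the loop stops with a choice still open, search has to take over.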

Shake and Bake generation with constraints

% G is threaded through every call (presumably the constraint graph).
shake_and_bake([Sign], Sign, [], [], _).

shake_and_bake(P0, Sign, [Next|Bag0], Bag, G) :-
    push(Next, P0, P),
    shake_and_bake(P, Sign, Bag0, Bag, G).

shake_and_bake(P0, Sign, Bag0, Bag, G) :-
    pop(First, P0, P1, G),
    delete(Second, P1, P2, G),
    unordered_rule(Mom, First, Second, Info),
    update(Info, G),                 % propagate constraints after the reduction
    push(Mom, P2, P),
    shake_and_bake(P, Sign, Bag0, Bag, G).

Implementation notes

We combine the constraint propagation mechanism with Whitelock's original shift-reduce parser, propagating constraints after every reduction step. The parser has the role of systematically choosing between alternative reductions, while the constraint propagation mechanism fills in the consequences of a particular set of choices.

One of the elements in a reduction is taken from the top of the stack, while the other is taken from anywhere in the tail of the stack. This idea, due to Whitelock and Reape, ensures that the input is treated as a bag rather than a string.

Performance (number of reductions)

No.  Example                                                  Length  Whitelock  Constraint Propagation
1    a fox                                                    2       1          1
2    a yellow fox                                             3       3          2
3    a tame yellow fox                                        4       7          3
4    a big tame yellow fox                                    5       15         4
5    the cat likes a fox                                      5       6          6
6    the fierce cat likes a fox                               6       13         9
7    the fierce cat likes a tame fox                          7       27         19
8    the little brown cat likes a yellow fox                  8       55         16
9    the fierce little brown cat likes a yellow fox           9       111        20
10   the fierce little brown cat likes a tame yellow fox      10      223        25
11   the fierce little brown cat likes a big tame yellow fox  11      447        30
12   the little brown cat likes a big yellow fox              9       111        20

Conclusions

These preliminary results must obviously be interpreted with some caution, since the examples were specially constructed.

For grammars related to HPSG it seems probable that considerable benefit would be gained from adding a constraint propagation component to an unordered version of a head-corner parsing algorithm, as described by Van Noord [Van Noord, 1991].

Whatever the right basis for declarative MT is, it is likely to look something like this.

The constraint graph is a good place to put statistics.