Templates in Linguistics: Why Garbage Garbage?
Presented by: Hussein Ghaly
1- Garbage Disposal
Yaseen Ghaly, 3 years old:
• Papy laih zebala zebala? (Dad, why garbage garbage?) > Why are you carrying two garbage bags?
• Papy laih bang bang? (Dad, why bang bang?) > Why are you making this “bang bang/hammering” sound?
2- Making a Template
• Yaseen seems to be using this template:
• “Papy, laih X?” (Dad, why X?)
• Where X can be anything:
– Garbage Garbage
– Bang Bang
– Sleeping
– …
– 天空是绿色的 (“the sky is green”; foreign words/code switching)
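The template above can be sketched in a few lines of Python. This is only an illustration: the names `TEMPLATE` and `fill` are hypothetical, and the placeholder X is modeled as a plain string slot with no restriction on what goes into it.

```python
# Hypothetical sketch: a template is a string with one placeholder, X.
TEMPLATE = "Papy, laih {X}?"  # "Dad, why X?"


def fill(template: str, x: str) -> str:
    """Put any expression into the placeholder X."""
    return template.format(X=x)


# The fillers are the examples from the slide.
fillers = ["zebala zebala", "bang bang", "天空是绿色的"]
sentences = [fill(TEMPLATE, x) for x in fillers]

for sentence in sentences:
    print(sentence)
```

Nothing in the sketch checks what X is, which mirrors the observation that X "can be anything".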
3- To build a language
• How can linguistic expression grow from simple sentences into a language such as ours?
• Answer: Recursion
4- Recursively
• An example of recursion found by Salma Ghaly, 6 Years old.
Main Claim
• Language is built using simple (idiomatic) templates. The complexity comes from recursion.
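One way to picture the main claim is to let a placeholder be filled either by a word or by another filled template. The sketch below is an assumption-laden illustration, not the presenter's formalism: the class `Filled`, the function `realize`, and the example templates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Filled:
    """A template together with the fillers for its placeholders (hypothetical)."""
    template: str                          # e.g. "{X1} said that {X2}"
    slots: Dict[str, Union[str, "Filled"]]  # a filler is a string or another Filled


def realize(node: Union[str, Filled]) -> str:
    """Recursively turn nested templates into a flat sentence."""
    if isinstance(node, str):
        return node
    realized = {name: realize(filler) for name, filler in node.slots.items()}
    return node.template.format(**realized)


# Three simple templates, nested inside one another.
inner = Filled("{X1} miss {X2}", {"X1": "I", "X2": "Randa"})
middle = Filled("{X1} knows that {X2}", {"X1": "Salma", "X2": inner})
outer = Filled("{X1} said that {X2}", {"X1": "Yaseen", "X2": middle})

sentence = realize(outer)
print(sentence)
```

Each template stays simple; the complexity of the final sentence comes only from the nesting depth.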
Outline
• Starting Assumptions
• Learning Templates (Language Acquisition)
• Cross-Linguistic Template Linearity
• Selecting a Template (Semantic-Pragmatic Prompt)
• Extending a Template (Template Malleability)
• Applications of Templates (Information Extraction and Machine Translation)
Starting Assumptions - Syntax
• In the syntax literature, language is a lexicon of words plus a computational system that puts these words where they belong to form a grammatical sentence.
[Diagram: Lexicon, Computation System]
Starting Assumptions – Templates Framework
• The “lexicon”, which is stored in memory, is extended with a list of templates, also stored in memory.
• The computational system only manages what fills the placeholders within templates.
[Diagram: Word Lexicon, Template Lexicon, Computation System]
Learning Templates
• The “garbage garbage” example indicates:
– A child can intuitively form a template for plurals (one that applies in some human languages, such as Bahasa Malaysia, e.g. kanak-kanak = children).
– A child can put anything in the placeholder X within the sentence template “Dad, why X?”
• But these hypotheses would need further evidence from first language acquisition.
Template Linearity
• English
– I love you.
– I miss you.
– I need you.
• French
– Je t’aime.
– Tu me manques.
– J’ai besoin de toi.
Clearly, the linear order differs greatly between constructions in different languages.
This should prompt us to think about how these constructions are generated.
Semantic-Pragmatic Prompt
• An area of overlap between the reason, context, and information content of some sentence.
• Start with a list of arguments (X1: I, X2: You)
• I want to express [+feeling] [+positive] [+distance], therefore:
– In English, we invoke the template I miss X2.
– In French, we invoke the template X2 me manques (with some adjustments depending on pronouns, etc.)
• So I can utter the sentence after filling the template:
– I miss Randa.
– Randa me manque.
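The selection step can be sketched as a lookup keyed by language and feature set. Everything here is an assumption made for illustration: the dictionary `TEMPLATE_LEXICON`, the functions `select_template` and `utter`, and the plain-string feature names. The sketch also ignores the pronoun-dependent adjustments mentioned above, so the French template is stored in its third-person form.

```python
# Hypothetical template lexicon: (language, feature set) -> template.
# Agreement and pronoun adjustments (manque / manques) are NOT modeled.
TEMPLATE_LEXICON = {
    ("en", frozenset({"feeling", "positive", "distance"})): "I miss {X2}.",
    ("fr", frozenset({"feeling", "positive", "distance"})): "{X2} me manque.",
}


def select_template(language: str, features: set) -> str:
    """Pick the template invoked by a semantic-pragmatic prompt."""
    return TEMPLATE_LEXICON[(language, frozenset(features))]


def utter(language: str, features: set, **arguments: str) -> str:
    """Select a template and fill its placeholders with the arguments."""
    return select_template(language, features).format(**arguments)


prompt = {"feeling", "positive", "distance"}
english = utter("en", prompt, X2="Randa")
french = utter("fr", prompt, X2="Randa")

print(english)
print(french)
```

The same prompt and the same argument yield different linear orders because the order is stored in the template, not computed.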
Template Variability
• Almost everything can be said in an alternative way:
– Godzilla destroyed the city, which is unfortunate.
– It is unfortunate that Godzilla destroyed the city.
– The destruction of the city by Godzilla is unfortunate.
• So, there are different templates to express the relation between these four entities (being unfortunate, the destruction, Godzilla, the city). This again feeds into the argument of non-linearity of templates, this time within the same language.
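As a rough illustration, the three paraphrases can be written as three templates over the same arguments. The slot names `agent`, `patient`, and `evaluation` are labels chosen for this sketch and do not come from the slides.

```python
# Hypothetical sketch: several templates expressing the same relation.
ALTERNATIVES = [
    "{agent} destroyed {patient}, which is {evaluation}.",
    "It is {evaluation} that {agent} destroyed {patient}.",
    "The destruction of {patient} by {agent} is {evaluation}.",
]

arguments = {
    "agent": "Godzilla",
    "patient": "the city",
    "evaluation": "unfortunate",
}

# The same arguments land in different linear positions in each template.
paraphrases = [template.format(**arguments) for template in ALTERNATIVES]

for paraphrase in paraphrases:
    print(paraphrase)
```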
Template Malleability
• Meaning how easily the template can be reshaped. This includes the following:
– Tense malleability:
• John was eating fish.
• John has been eating fish.
– Synonym malleability:
• Sarah cannot tolerate this any more.
• Sarah cannot put up with this any more.
• The idea of malleability enables us to avoid accounting for hundreds of millions of combinations of basic templates.
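A minimal sketch of why malleability saves storage: instead of listing every tensed or reworded sentence as its own template, one base template is kept and reshaped from small tables of variants. The names `BASE`, `TENSE_FORMS`, `SYNONYMS`, and `reshape_tense` are hypothetical, and the tables hold only the examples from the slide.

```python
# Hypothetical sketch: one base template plus small tables of variants.
BASE = "{subject} {verb_group} {rest}."

TENSE_FORMS = {
    "past progressive": "was eating",
    "present perfect progressive": "has been eating",
}

SYNONYMS = ["tolerate", "put up with"]


def reshape_tense(tense: str) -> str:
    """Reshape the base template for a given tense."""
    return BASE.format(subject="John", verb_group=TENSE_FORMS[tense], rest="fish")


tense_variants = [reshape_tense(tense) for tense in TENSE_FORMS]
synonym_variants = [f"Sarah cannot {verb} this any more." for verb in SYNONYMS]

for sentence in tense_variants + synonym_variants:
    print(sentence)
```

With this layout, adding a tense or a synonym means adding one table entry rather than a new template for every sentence.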
Using Templates
• For information extraction (e.g. Banko and Etzioni 2008), where templates were used to extract (is-a) relationships between entities.
Using Templates in Machine Translation
• First suggested by Nagao (1984) under the name of Example-Based Machine Translation. He also indicated that this approach is relevant to Second Language Acquisition.
Using Templates in Machine Translation
• Current state-of-the-art Phrase-Based Statistical Machine Translation techniques use contiguous chunks (Koehn, 2010).
Using Templates in Machine Translation
• But using contiguous chunks misses many phrases where the word order differs between the two languages.
– Needs a lot of training data
• To compensate for this, a statistical reordering model is used.
– Can make the output unintelligible
Using Templates in Machine Translation
Chunk:
Michael assumes that he will stay in the house -> Michael geht davon aus, dass er im Haus bleibt
Subchunks:
Michael -> Michael
in the house -> im Haus
So by removing (stenciling) subchunks from the chunk we get a translation template:
X1 assumes that he will stay X2 -> X1 geht davon aus, dass er X2 bleibt
- preserves word order
- can apply to many sentences not seen before
- requires less training data
- can set restrictions on the type of placeholders (X1: NP, X2: PP)
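The stenciling step and the reuse of the resulting template can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: `stencil`, `apply_template`, and the small `phrase_table` (including the unseen pair "in the garden" -> "im Garten") are hypothetical, and the placeholder type restrictions (NP, PP) are not modeled.

```python
import re


def stencil(source, target, subchunks):
    """Replace aligned subchunks with numbered placeholders X1, X2, ..."""
    for index, (src_phrase, tgt_phrase) in enumerate(subchunks, start=1):
        slot = f"X{index}"
        source = source.replace(src_phrase, slot, 1)
        target = target.replace(tgt_phrase, slot, 1)
    return source, target


def apply_template(src_template, tgt_template, sentence, phrase_table):
    """Translate a new sentence that matches the source side of the template."""
    # Turn "X1 assumes ... X2" into a regex with one named group per slot.
    parts = re.split(r"(X\d+)", src_template)
    pattern = "".join(
        f"(?P<{part}>.+)" if re.fullmatch(r"X\d+", part) else re.escape(part)
        for part in parts
    )
    match = re.fullmatch(pattern, sentence)
    if match is None:
        return None  # the template does not apply to this sentence
    fillers = match.groupdict()
    # Fill the target side with the translated fillers; order comes from the template.
    return re.sub(r"X\d+", lambda m: phrase_table[fillers[m.group(0)]], tgt_template)


src_template, tgt_template = stencil(
    "Michael assumes that he will stay in the house",
    "Michael geht davon aus, dass er im Haus bleibt",
    [("Michael", "Michael"), ("in the house", "im Haus")],
)

# Hypothetical phrase table for a sentence not seen in the training chunk.
phrase_table = {"Peter": "Peter", "in the garden": "im Garten"}
translation = apply_template(
    src_template, tgt_template,
    "Peter assumes that he will stay in the garden", phrase_table,
)

print(src_template, "->", tgt_template)
print(translation)
```

Because the target word order is stored in the template itself, no separate reordering model is involved in this sketch.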
• Thank X1! (X1 = You)