Parsing with Context-Free Grammars for ASR
Julia Hirschberg
CS 4706
Slides with contributions from Owen Rambow, Kathy McKeown, Dan Jurafsky and James Martin
What is Syntax?
• Structure of language
• How words are arranged together and related to one another
• Goal of syntactic analysis: relate surface form (what someone says or writes) to underlying structure, to support semantic analysis (what the utterance or text means)
• Syntactic representation: typically a tree structure
Structure in Strings
• A set of words, or a lexicon: the a small nice big very boy girl sees likes
• Some 'good' (grammatical) sentences:
– the boy likes a girl
– the small girl likes the big girl
– a very small nice boy sees a very nice boy
• Some bad (ungrammatical) sentences:
– *the boy the girl
– *small boy likes nice girl
• Can we find a way of distinguishing between the two kinds of sequences?
• Can we identify similarities among grammatical subsequences?
One Version of Constituent Structure
• Lexicon: the a small nice big very boy girl sees likes
• Grammatical sentences:
– (the) boy (likes a girl)
– (the small) girl (likes the big girl)
– (a very small nice) boy (sees a very nice boy)
• Ungrammatical sentences:
– *(the) boy (the girl)
– *(small) boy (likes the nice girl)
Another Constituency Hypothesis
• Lexicon: the a small nice big very boy girl sees likes
• Grammatical sentences:
– (the boy) likes (a girl)
– (the small girl) likes (the big girl)
– (a very small nice boy) sees (a very nice boy)
• Ungrammatical sentences:
– *(the boy) (the girl)
– *(small boy) likes (the nice girl)
• Better: fewer types of constituents (the constituents shown in blue and red on the original slide are of the same type)
Even More Structures
• Lexicon: the a small nice big very boy girl sees likes
• Grammatical sentences:
– ((the) boy) likes ((a) girl)
– ((the) (small) girl) likes ((the) (big) girl)
– ((a) ((very) small) (nice) boy) sees ((a) ((very) nice) girl)
• Ungrammatical sentences:
– *((the) boy) ((the) girl)
– *((small) boy) likes ((the) (nice) girl)
From Substrings to Trees
• (((the) boy) likes ((a) girl))
[Figure: unlabeled tree for this bracketing, with leaves the, boy, likes, a, girl]
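The step from a bracketed substring to a tree can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `parse_brackets` and the choice of nested lists as the tree representation are assumptions.

```python
def parse_brackets(text):
    """Turn a bracketed string into nested lists; words stay as strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]  # stack[-1] is the constituent currently being built
    for tok in tokens:
        if tok == "(":
            stack.append([])           # open a new constituent
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            done = stack.pop()         # close it and attach to its parent
            stack[-1].append(done)
        else:
            stack[-1].append(tok)      # a word is a leaf
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


# The outermost list holds the top-level constituents; here there is one.
tree = parse_brackets("(((the) boy) likes ((a) girl))")[0]
print(tree)
```

Each pair of parentheses becomes one list, so the nesting of the lists mirrors the nesting of the constituents.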
How do we Label the Nodes?
• ( ((the) boy) likes ((a) girl) )
• Choose constituents so each one has one non-bracketed word: the head
• Group words by distribution of constituents they head (POS)
– Noun (N), verb (V), adjective (Adj), adverb (Adv), determiner (Det)
• Category of constituent: XP, where X is POS
– NP, S, AdjP, AdvP, DetP
Types of Nodes
• (((the/Det) boy/N) likes/V ((a/Det) girl/N))
• Phrase-structure tree: [S [NP [DetP the] boy] likes [NP [DetP a] girl]]
• Nonterminal symbols = constituents
• Terminal symbols = words
Determining Part-of-Speech
• A blue seat / a child seat: noun or adjective?
– Syntax:
  • a blue seat / a child seat
  • a very blue seat / *a very child seat
  • this seat is blue / *this seat is child
– Morphology:
  • bluer / *childer
– blue and child are not the same POS
– blue is Adj, child is Noun
Determining Part-of-Speech
– Preposition or particle?
  • A: he threw out the garbage
  • B: he threw the garbage out the door
  • A: he threw the garbage out
  • B: *he threw the garbage the door out
– The two out are not same POS
  • A is particle, B is Preposition
Constituency
• Some Noun phrases (NPs)
– A red dog on a blue tree
– A blue dog on a red tree
– Some big dogs and some little dogs
– A dog
– I
– Big dogs, little dogs, red dogs, blue dogs, yellow dogs, green dogs, black dogs, and white dogs
• How do we know these form a constituent?
NP Constituency
• NPs can all appear before a verb:
– Some big dogs and some little dogs are going around in cars…
– Big dogs, little dogs, red dogs, blue dogs, yellow dogs, green dogs, black dogs, and white dogs are all at a dog party!
– I do not
• But individual words can't always appear before verbs:
– *little are going…
– *blue are…
– *and are
• Must be able to state generalizations like:
– Noun phrases occur before verbs
PP Constituency
• Preposing and postposing:
– Under a tree is a yellow dog.
– A yellow dog is under a tree.
• But not:
– *Under, is a yellow dog a tree.
– *Under a is a yellow dog tree.
• Prepositional phrases notable for ambiguity in attachment
– I saw a man on a hill with a telescope.
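The attachment ambiguity can be made concrete by counting parses. The sketch below is illustrative only: the toy binary grammar, the lexicon, and the function `count_parses` are assumptions, not taken from the slides. It uses a CKY-style chart that stores, for each span and category, the number of distinct trees.

```python
from collections import defaultdict

# Hypothetical toy grammar in binary form: (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to a noun phrase
    ("PP", "P", "NP"),
]
# Hypothetical lexicon: one category per word.
LEXICON = {
    "i": "NP", "saw": "V", "a": "Det",
    "man": "N", "hill": "N", "telescope": "N",
    "on": "P", "with": "P",
}


def count_parses(sentence, root="S"):
    """Count distinct parse trees with a CKY-style chart."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    chart = defaultdict(int)  # (start, end, symbol) -> number of trees
    for i, word in enumerate(words):
        chart[i, i + 1, LEXICON[word]] += 1   # raises KeyError on unknown words
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    chart[start, end, parent] += (
                        chart[start, mid, left] * chart[mid, end, right]
                    )
    return chart[0, n, root]


print(count_parses("I saw a man on a hill with a telescope."))
```

Under this toy grammar the sentence has five parses, one for each way the two prepositional phrases can attach to the verb, to "a man", or to "a hill". With a single PP ("I saw a man on a hill") the count drops to two.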
Context-Free Grammars
• Defined in formal language theory
– Terminals: e.g. cat
– Non-terminal symbols: e.g. NP, VP
– Start symbol: e.g. S
– Rewriting rules: e.g. S → NP VP
• Start with start symbol, rewrite using rules, done when only terminals left
A Fragment of English
S → NP VP
VP → V PP
NP → DetP N
N → cat | mat
V → is
PP → Prep NP
Prep → on
DetP → the
Input: the cat is on the mat
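Because this fragment has no recursive rules, the language it generates is finite and can simply be enumerated. The following is a minimal sketch under that assumption; the dictionary encoding of the rules and the helper `expand` are illustrative names, not something the slides define.

```python
from itertools import product

# The fragment above; any symbol that is not a key is treated as a terminal.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "PP"]],
    "NP": [["DetP", "N"]],
    "N": [["cat"], ["mat"]],
    "V": [["is"]],
    "PP": [["Prep", "NP"]],
    "Prep": [["on"]],
    "DetP": [["the"]],
}


def expand(symbol):
    """Return every word sequence derivable from `symbol`.

    Only terminates for grammars without recursion, as is the case here.
    """
    if symbol not in GRAMMAR:
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each right-hand-side symbol.
        for combo in product(*(expand(part) for part in rhs)):
            results.append([word for seq in combo for word in seq])
    return results


LANGUAGE = {" ".join(words) for words in expand("S")}
print(sorted(LANGUAGE))
print("the cat is on the mat" in LANGUAGE)
```

The fragment generates exactly four sentences (cat or mat in each of the two noun positions), and the input sentence is one of them.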
Derivations in a CFG
Using the fragment grammar, start from S and rewrite until only terminals are left. The phrase-structure tree grows alongside the derivation:
• S
• NP VP
• DetP N VP
• the cat VP
• the cat V PP
• the cat is Prep NP
• the cat is on DetP N
• the cat is on the mat
• Resulting tree: [S [NP [DetP the] [N cat]] [VP [V is] [PP [Prep on] [NP [DetP the] [N mat]]]]]
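The link between a derivation and its tree can be sketched in code: given the finished tree, repeatedly expanding the leftmost nonterminal reproduces a derivation one rule at a time. The tuple encoding of the tree and the names `label` and `leftmost_derivation` are assumptions made for this illustration. Note that the slides sometimes apply two rules per step, so this listing has more lines than the slide sequence.

```python
# Parse tree for "the cat is on the mat": (label, child, child, ...);
# plain strings are words.
TREE = ("S",
        ("NP", ("DetP", "the"), ("N", "cat")),
        ("VP", ("V", "is"),
               ("PP", ("Prep", "on"),
                      ("NP", ("DetP", "the"), ("N", "mat")))))


def label(node):
    """A word labels itself; a subtree is labelled by its category."""
    return node if isinstance(node, str) else node[0]


def leftmost_derivation(tree):
    """List the sentential forms obtained by always expanding the
    leftmost nonterminal, following the given tree."""
    form = [tree]
    steps = [" ".join(label(n) for n in form)]
    while True:
        idx = next((i for i, n in enumerate(form)
                    if not isinstance(n, str)), None)
        if idx is None:                 # only terminals left: done
            return steps
        node = form[idx]
        form[idx:idx + 1] = node[1:]    # replace the node by its children
        steps.append(" ".join(label(n) for n in form))


for step in leftmost_derivation(TREE):
    print(step)
```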
A More Complicated Fragment of English
• S → NP VP
• S → VP
• VP → V PP
• VP → V NP
• VP → V
• NP → DetP NP
• NP → N NP
• NP → N
• PP → Prep NP
• N → cat | mat | food | bowl | Mary
• V → is | likes | sits
• Prep → on | in | under
• DetP → the | a
Inputs:
– Mary likes the cat bowl.
– The cat ate the tasty food.
– Hello. Nice talking to you.
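To see which of these inputs the larger fragment covers, one option is a small top-down parser with backtracking. This is a sketch, not the method used in the course: the functions `parses` and `parse_all` are hypothetical, and lower-casing the input and stripping periods (so the lexicon entry becomes "mary") is an assumption made to keep matching simple. The approach works here because no rule is left-recursive.

```python
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "VP": [["V", "PP"], ["V", "NP"], ["V"]],
    "NP": [["DetP", "NP"], ["N", "NP"], ["N"]],
    "PP": [["Prep", "NP"]],
    "N": [["cat"], ["mat"], ["food"], ["bowl"], ["mary"]],
    "V": [["is"], ["likes"], ["sits"]],
    "Prep": [["on"], ["in"], ["under"]],
    "DetP": [["the"], ["a"]],
}


def parses(symbol, words, start):
    """Yield (tree, end) for every way `symbol` can cover words[start:end]."""
    if symbol not in GRAMMAR:                      # terminal: must match a word
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in GRAMMAR[symbol]:
        partial = [((), start)]                    # (children so far, position)
        for part in rhs:
            partial = [
                (kids + (tree,), end)
                for kids, pos in partial
                for tree, end in parses(part, words, pos)
            ]
        for kids, end in partial:
            yield (symbol,) + kids, end


def parse_all(sentence):
    """Return all trees rooted in S that cover the whole sentence."""
    words = sentence.lower().replace(".", " ").split()
    return [tree for tree, end in parses("S", words, 0) if end == len(words)]


for text in ["Mary likes the cat bowl.",
             "The cat ate the tasty food.",
             "Hello. Nice talking to you."]:
    print(text, "->", len(parse_all(text)), "parse(s)")
```

Only the first input gets a parse. The other two contain words that are not in the lexicon (for example "ate" and "tasty"), which illustrates how quickly a hand-written grammar runs out of coverage.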
Pocket Sphinx Grammar Format
• Variables go in angle brackets, e.g. <city>
• Terminals must appear in your pronunciation dictionary (case sensitive)
• X Y is concatenation -- e.g., I WANT
• (X | Y) means X or Y -- e.g., (WANT|NEED)
• Square brackets mean optional -- e.g., [ON] FRIDAY
• * means that the expansion may be spoken zero or more times -- e.g. <digit>*
• + means one or more times -- e.g. <digit>+
Example
• <city> = BOSTON | NEWYORK | WASHINGTON | BALTIMORE;
• <time> = MORNING | EVENING;
• <day> = FRIDAY | MONDAY;
• public <query> = (((WHAT TRAINS LEAVE) | (WHAT TIME CAN I TRAVEL) | (IS THERE A TRAIN)) (FROM|TO) <city> [(FROM|TO) <city>] ON <day> [<time>]);
Hello. No. I want to go on Tuesday. When does the train leave?
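One way to check which utterances this grammar covers, without running a recognizer, is to hand-translate it into a regular expression. That is possible here because the grammar has no recursion. The sketch below is an assumption-laden illustration: the translation was done by hand, it does not use any Pocket Sphinx API, and the helper `accepts` is a hypothetical name.

```python
import re

# Hand translation of the grammar rules into regex fragments.
CITY = r"(?:BOSTON|NEWYORK|WASHINGTON|BALTIMORE)"
TIME = r"(?:MORNING|EVENING)"
DAY = r"(?:FRIDAY|MONDAY)"
OPENING = r"(?:WHAT TRAINS LEAVE|WHAT TIME CAN I TRAVEL|IS THERE A TRAIN)"

# [X] in the grammar becomes an optional group (?: X)? in the regex.
QUERY = re.compile(
    rf"{OPENING} (?:FROM|TO) {CITY}(?: (?:FROM|TO) {CITY})? ON {DAY}(?: {TIME})?"
)


def accepts(utterance):
    """True if the whole utterance matches the <query> rule."""
    cleaned = utterance.upper().replace(".", " ").replace("?", " ")
    words = " ".join(cleaned.split())   # normalise whitespace
    return QUERY.fullmatch(words) is not None


for text in ["What trains leave from Boston to NewYork on Friday morning",
             "Is there a train to Baltimore on Monday",
             "I want to go on Tuesday.",
             "When does the train leave?"]:
    print(text, "->", accepts(text))
```

The first two utterances are accepted. The last two, taken from the slide, are rejected: neither phrasing appears in the grammar, and TUESDAY is not listed under <day>.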
Next Class
• Language modeling for large vocabulary applications: Ngrams