November 17, 2000 TDT-2000 Workshop
Topic Tracking at Maryland: Lessons from the Johns Hopkins
Mandarin-English Information (MEI) Project
Gina-Anne Levow and Douglas W. Oard
Institute for Advanced Computer Studies
University of Maryland, College Park
Roadmap
• MEI Overview (6 weeks in 5 minutes)
• MEI Results
• Adapting MEI to TDT
• TDT Results
• Conclusions
The MEI Team
• Senior Members
– Helen Meng, Chinese University of Hong Kong
– Erika Grams, Advanced Analytic Tools
– Sanjeev Khudanpur, Johns Hopkins University
– Gina-Anne Levow, University of Maryland
– Douglas Oard, University of Maryland
– Patrick Schone, US Department of Defense
– Hsin-Min Wang, Academia Sinica, Taiwan
• Students
– Berlin Chen, National Taiwan University
– Wai-Kit Lo, Chinese University of Hong Kong
– Karen Tang, Princeton University
– Jianqiang Wang, University of Maryland
MEI: The Challenges
• Speech Recognition
– Tokenization
– Lexicon coverage
– Selection among alternatives
• Translation
– Tokenization
– Lexicon coverage
– Selection among alternatives
Different Problems
Term Granularity Options
Mandarin Words
Mandarin Syllables
Mandarin Characters
English Words
English Phrases
MEI Evaluation Collections
[Figure: timeline of the development and evaluation collections]
• Development Collection: TDT-2
• Evaluation Collection: TDT-3
• English text topic exemplars: Associated Press, New York Times
• Mandarin audio broadcast news: Voice of America
• Story counts shown: 2265 manually segmented stories; 3371 manually segmented stories
• Topic sets shown: 17 topics, variable number of exemplars; 56 topics, variable number of exemplars
• Timeline labels: Jan 98, Mar 98, Jun 98, Oct 98, Dec 98
[Figure: MEI system diagram]
• Box labels: English Exemplar (“President Bill Clinton and…”), Named Entity Tagging, Term Selection, Term Translation, Query Construction, Mandarin IR System, Ranked List, Evaluation, Mandarin Audio, Speech Recognition, Story Boundaries, Document Construction
• Resources: Bilingual Term List, Relevance Judgments (000100010000010100)
• Source labels: BBN, U Mass, LDC, Cornell, Dragon, LDC/CETA
• Measure: Mean Uninterpolated Average Precision
Query Translation
• Dictionary inversion for phrase translation
– “Wall Street” “best interests” “human rights”
• Lemmatize remaining words if necessary
– e.g. “televised” translates as “television”
• χ² filtering for query term selection
– Compared to an English background model
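The dictionary-inversion step can be sketched as follows. This is a minimal illustration, not the MEI implementation: it assumes the bilingual term list maps Mandarin entries to English glosses, inverts it, and then translates an English token sequence by greedy longest match so that phrases win over single words. The function names, the `max_len` parameter, and the toy entries are all hypothetical.

```python
from collections import defaultdict


def invert_term_list(zh_to_en):
    """Turn a Mandarin->English term list into an English->Mandarin lookup.

    Keys of the result are tuples of lowercased English tokens, so that
    multi-word glosses such as "Wall Street" can be matched as phrases.
    """
    en_to_zh = defaultdict(list)
    for zh, glosses in zh_to_en.items():
        for gloss in glosses:
            key = tuple(gloss.lower().split())
            if zh not in en_to_zh[key]:
                en_to_zh[key].append(zh)
    return dict(en_to_zh)


def translate_terms(tokens, en_to_zh, max_len=4):
    """Greedy longest-match translation of an English token sequence.

    Returns (english_span, [mandarin alternatives]) pairs; spans with no
    entry come back with an empty list.
    """
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in en_to_zh:
                out.append((" ".join(tokens[i:i + n]), en_to_zh[key]))
                i += n
                break
        else:
            # No entry of any length starts here (assumed handling).
            out.append((tokens[i], []))
            i += 1
    return out


# Toy term list; the entries are illustrative assumptions.
ZH_TO_EN = {
    "华尔街": ["Wall Street"],
    "人权": ["human rights"],
    "街": ["street"],
    "墙": ["wall"],
}
EN_TO_ZH = invert_term_list(ZH_TO_EN)
RESULT = translate_terms("Wall Street and human rights".split(), EN_TO_ZH)
```

A lemmatization fallback, as in the second bullet, would slot into the branch that currently records an empty translation list.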
Evaluation Measure
[Figure: precision-recall plot. x-axis: Recall (0.0 to 1.0); y-axis: Interpolated Precision (0.0 to 1.0)]
Able to characterize variation across exemplars!
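For reference, mean uninterpolated average precision can be computed as in the sketch below. The function names are assumptions, and the 0/1 string from the system diagram is borrowed purely as sample input (1 marks a relevant story at that rank).

```python
def average_precision(ranked_relevance, total_relevant=None):
    """Uninterpolated average precision for one ranked list.

    ranked_relevance: sequence of 0/1 flags in rank order.
    total_relevant: number of relevant stories in the collection; if
    omitted, the number of relevant stories found in the list is used.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    denom = total_relevant if total_relevant is not None else hits
    return precision_sum / denom if denom else 0.0


def mean_average_precision(ranked_lists):
    """Mean of per-query (here: per-exemplar or per-topic) average precision."""
    scores = [average_precision(r) for r in ranked_lists]
    return sum(scores) / len(scores) if scores else 0.0


# Sample input: relevant stories at ranks 4, 8, 14 and 16.
SAMPLE = [int(c) for c in "000100010000010100"]
SAMPLE_AP = average_precision(SAMPLE)
```

Because average precision is computed per ranked list before averaging, the spread of the per-exemplar values is available alongside the mean.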
Balanced Translation Works Well
• Pirkola’s structured queries
– Treat translation alternatives as synonyms
– Inquery #syn() operator
• Balanced translation
– Distribute probability mass over translation alternatives
– Inquery #sum() operator
[Figure: bar chart. x-axis: Strategy (Structured Queries, Balanced Translation); y-axis: Mean Average Precision (0 to 0.6)]
TDT-2, phrase-based translation, word-based retrieval
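The difference between the two strategies can be illustrated with a toy scorer. This sketch does not reproduce Inquery's belief functions; a plain tf·idf weight stands in for them, and the collection, the smoothing in `idf`, and all function names are assumptions. The point is only where the alternatives are combined: before weighting (synonym class) or after (average of separately weighted terms).

```python
import math

# Toy collection of tokenized documents (hypothetical).
DOCS = {
    "d1": ["a", "a", "b"],
    "d2": ["b", "c"],
    "d3": ["c", "c", "c"],
}


def tf(term, doc_id):
    return DOCS[doc_id].count(term)


def df(terms):
    """Number of documents containing at least one of the terms."""
    return sum(1 for toks in DOCS.values() if any(t in toks for t in terms))


def idf(doc_freq):
    # Smoothed idf; a stand-in, not Inquery's formula.
    return math.log((len(DOCS) + 1) / (doc_freq + 0.5))


def structured_weight(alternatives, doc_id):
    """Synonym-class treatment: pool tf and df over all alternatives."""
    class_tf = sum(tf(t, doc_id) for t in alternatives)
    return class_tf * idf(df(alternatives))


def balanced_weight(alternatives, doc_id):
    """Balanced treatment: weight each alternative alone, then average."""
    weights = [tf(t, doc_id) * idf(df([t])) for t in alternatives]
    return sum(weights) / len(weights)


STRUCTURED = structured_weight(["a", "c"], "d1")
BALANCED = balanced_weight(["a", "c"], "d1")
```

In this toy case the pooled document frequency of the synonym class covers every document, so the structured weight for d1 is small, while the balanced weight keeps the contribution of the rarer alternative at half strength.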
Phrase Translation Beats Words
• Phrases beat words
• Three sources
– Translation lexicon
– Named entities
– Numeric expressions
[Figure: bar chart. x-axis: Strategy (Words, Phrases, Phrases + NE/NUMEX); y-axis: Mean Average Precision (0 to 0.6)]
Condition: TDT-2, 12 exemplars, word-based retrieval
Character Bigram Indexing Wins
• Character bigrams are best
• Syllable bigrams do poorly
[Figure: bar chart. Bars: Words, Char, Syllable; y-axis: Mean Average Precision (0 to 0.6)]
TDT-2, single NYT exemplar, manual translation
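Overlapping character bigrams are simple to form, as the sketch below shows. The function name and the handling of whitespace and one-character strings are assumptions made for illustration.

```python
def char_bigrams(text):
    """Return the overlapping character bigrams of a string.

    Whitespace is dropped first, so bigrams also span what a word
    segmenter would have treated as word boundaries.
    """
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        # Assumed handling: keep a lone character as its own term.
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


BIGRAMS = char_bigrams("克林顿总统")
```

Applying the same function to both the translated query and the recognized documents gives matching index terms without requiring the two sides to agree on word segmentation.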
Untranslatable Terms
Term        Occurrences
suharto     97
netanyahu   88
starr       62
arafat      50
bjp         45
vajpayee    44
estrada     44
….
hsu         19
zemin       7

Terms   # (by token)   # (by type)
total   87,004         12,402
OOV     3,028          1,122
Cross-Language Phonetic Matching
• Small improvement
– Not statistically significant
• Character bigrams are best
– Form a unified index
• Character and syllable bigrams
– Translate words if possible
• Then form character bigrams
– Otherwise translate syllables
• Then form syllable bigrams
[Figure: bar chart. x-axis: Indexing Terms (Word, Char, Syllable); series: no CLPM, CLPM; y-axis: Mean Average Precision (0 to 0.6)]
TDT-2, phrase-based translation
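The word-then-syllable fallback can be sketched as below. This is an illustration under strong assumptions: two lookup tables stand in for the translation lexicon and for the phonetic matching step, and the "C:" / "S:" prefixes are a hypothetical way to keep character and syllable bigrams apart in one index. None of these names come from the MEI system.

```python
def adjacent_pairs(units, sep=""):
    """Join each adjacent pair of units; a single unit is returned as is."""
    units = list(units)
    if len(units) < 2:
        return units
    return [units[i] + sep + units[i + 1] for i in range(len(units) - 1)]


def index_terms(english_term, word_lexicon, syllable_map):
    """Index terms for one English query term in a unified index.

    word_lexicon: English term -> Mandarin translation (string).
    syllable_map: English term -> list of Mandarin syllables; a stand-in
    for cross-language phonetic matching.
    """
    if english_term in word_lexicon:
        # Translatable word: form character bigrams of the translation.
        return ["C:" + g for g in adjacent_pairs(word_lexicon[english_term])]
    if english_term in syllable_map:
        # Otherwise: form syllable bigrams of the phonetic rendering.
        return ["S:" + g for g in adjacent_pairs(syllable_map[english_term], sep="-")]
    return []


# Hypothetical lookup tables.
WORD_LEXICON = {"president": "总统"}
SYLLABLE_MAP = {"suharto": ["su", "ha", "tuo"]}

TRANSLATED = index_terms("president", WORD_LEXICON, SYLLABLE_MAP)
PHONETIC = index_terms("suharto", WORD_LEXICON, SYLLABLE_MAP)
```

Documents would need to be indexed with both kinds of bigram for the syllable terms to match anything.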
MEI: Comparing Collections
[Figure: bar chart. Bars: Words, Character Bigrams, Character Bigrams + CLPM; series: TDT2, TDT3; y-axis: Mean Average Precision (0.4 to 0.6)]
MEI Conclusions
• ASR: Words
• Translation: Phrases, Words, Lemmas, Syllables
• Indexing: Character Bigrams
TDT-2000: What’s New Since ’99?
• Key ideas from MEI:
– Dictionary inversion for phrase translation
– Balanced translation
– Post-translation resegmentation
• Adaptation to TDT:
– Exploit negative exemplars
– Improved Mandarin topic normalization
– Round-robin balanced translation
[Figure: TDT-2000 system diagram]
• Box labels: English Exemplars (“President Bill Clinton and…”), Term Selection, Term Translation, Query Construction, PRISE, Ranked List, Scores, Score Normalization, IDF Computation, Mandarin Audio, Speech Recognition, Story Boundaries, Document Construction
• Resources: Bilingual Term List, Training Epoch, TDT-2000
• Source labels: NIST, Dragon, LDC, LDC/CETA
Topic Tracking Improvements
• Improved χ² filtering for query term selection
– First compare to background model
– Augment by comparison to negative exemplars
• Mandarin topic normalization (unofficial)
– Language-specific strategy
• Mandarin: Best single training epoch score
• English: Average of exemplar scores
– Recomputed Mandarin source normalization
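The slide does not spell out how terms are compared with the background model. Assuming a Pearson chi-squared test on a 2x2 table (term vs. all other tokens, exemplars vs. background), term selection might look like the sketch below. The function names, the over-representation check, and the 3.84 threshold (the p = 0.05 critical value for one degree of freedom) are assumptions.

```python
def chi_squared(k_fg, n_fg, k_bg, n_bg):
    """Pearson chi-squared statistic for a 2x2 contingency table.

    k_fg / n_fg: term count and total token count in the exemplars.
    k_bg / n_bg: term count and total token count in the background.
    """
    a, b = k_fg, n_fg - k_fg
    c, d = k_bg, n_bg - k_bg
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_terms(fg_counts, bg_counts, threshold=3.84):
    """Keep terms that are significantly over-represented in the exemplars."""
    n_fg = sum(fg_counts.values())
    n_bg = sum(bg_counts.values())
    if n_fg == 0 or n_bg == 0:
        return []
    selected = []
    for term, k_fg in fg_counts.items():
        k_bg = bg_counts.get(term, 0)
        over_represented = k_fg / n_fg > k_bg / n_bg
        if over_represented and chi_squared(k_fg, n_fg, k_bg, n_bg) >= threshold:
            selected.append(term)
    return selected


# Hypothetical counts.
EXEMPLAR_COUNTS = {"suharto": 10, "the": 50, "said": 40}
BACKGROUND_COUNTS = {"suharto": 1, "the": 5000, "said": 4999}
SELECTED = select_terms(EXEMPLAR_COUNTS, BACKGROUND_COUNTS)
```

One way to bring in negative exemplars would be to run the same comparison a second time with their term counts as the background and keep only terms that pass both tests; that combination rule is a guess, not something the slides state.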
Effect of Negative Exemplars
[Figure: text-only DET plots, 1st 60 topics (self-scored). Panels: Mandarin Text, Nn=0 & Nn=2; English Text, Nn=0 & Nn=2]
Indexing Character Bigrams
[Figure: DET plot, Mandarin speech only, 1st 60 topics (unofficial renormalization). Curves: Character Bigrams, Words]
Round Robin 8-Best Translation
[Figure: DET plot, Mandarin text, 1st 60 topics (self-scored). Curves: TDT-1999 2-best translation; TDT-2000 round-robin 8 best]
Conclusions
• Top-8 round robin translation to Mandarin wins
– Slightly outperforms top-2 translation to English
• Query translation is more efficient
– Better suited to a stream of stories
• Match term extent to purpose
– ASR, translation, indexing
Closing Thoughts
• Thanks to Jon and LDC!
• Normalization limits our insight
– Need some way to see past it
• Availability of TDT-3 ground truth?