10417/10617 Intermediate Deep Learning: Fall 2019
Russ Salakhutdinov
Machine Learning [email protected]
https://deeplearning-cmu-10417.github.io/
Representation Learning for Reading Comprehension
• Disclaimer: Some of the material and slides for this lecture were borrowed from Bhuwan Dhingra’s talk.
Reading Comprehension
Query: “President-elect Barack Obama said Tuesday he was not aware of alleged corruption by X who was arrested on charges of trying to sell Obama’s senate seat.” Find X.
Document: “… arrested Illinois governor Rod Blagojevich and his chief of staff John Harris on corruption charges … included Blagojevich allegedly conspiring to sell or trade the senate seat left vacant by President-elect Barack Obama …”
Answer: Rod Blagojevich
Who-Did-What Dataset (Onishi, Wang, Bansal, Gimpel, McAllester, EMNLP, 2016)
Reading Comprehension
Ø $d$ is a document
Ø $q$ is a question over the contents of that document
Ø $a$ is the answer to this query
• The answer comes from a fixed vocabulary $\mathcal{A}$.
Ø $\mathcal{A}$ might consist of all tokens / spans of tokens in the document (Extractive Question Answering)
TASK: Given a document-query pair $(d, q)$, find the $a \in \mathcal{A}$ which answers $q$.
• Question Answering / Information Extraction
• Test for text representation models
– A better representation can help answer more questions
Approach -- Supervised Learning
Dataset: $\mathcal{D} = \{(d, q, a)\}_{i=1}^{N}$
Model (a neural network): $\Pr(c \mid d, q) = f_\theta(d, q, c) \quad \forall\, c \in \mathcal{A}$
Loss: $\mathcal{L}(\theta) = \sum_{(d, q, a) \in \mathcal{D}} -\log \Pr(a \mid d, q)$
Training: $\hat{\theta} = \arg\min_\theta \mathcal{L}(\theta)$
What architectural biases can we build into the model?
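The loss above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the model's scores $f_\theta(d, q, c)$ are unnormalised and turned into $\Pr(c \mid d, q)$ with a softmax over the candidates; the function names and toy scores are made up for the example.

```python
import numpy as np

def candidate_log_probs(scores):
    """Log-softmax over the scores f_theta(d, q, c) of all candidates c."""
    shifted = scores - np.max(scores)            # subtract the max for numerical stability
    return shifted - np.log(np.sum(np.exp(shifted)))

def nll_loss(all_scores, answer_indices):
    """L(theta): sum over examples of -log Pr(a | d, q)."""
    return -sum(candidate_log_probs(s)[a] for s, a in zip(all_scores, answer_indices))

# Toy dataset: two (d, q) pairs with three candidates each; the scores are made up.
toy_scores = [np.array([2.0, 0.5, -1.0]), np.zeros(3)]
toy_answers = [0, 2]
toy_loss = nll_loss(toy_scores, toy_answers)
```

Training then amounts to lowering this quantity with respect to the parameters that produce the scores, typically with a gradient-based optimiser.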
Architectural Bias
Designing the connectivity pattern of a Neural Network to reflect the nature of the problem being solved.
• CNNs, RNNs have architectural biases towards images / sequences
• For reading comprehension, what biases can we build to reflect linguistic phenomena?
– Alignment, Paraphrasing, Aggregation (Part 1)
– Coreference, Syntactic and Semantic Dependencies (Part 2)
Text Phenomena
(Query and document: the Who-Did-What example above; Onishi, Wang, Bansal, Gimpel, McAllester, EMNLP, 2016)
Alignment, Paraphrasing, Aggregation
Biases
Text phenomena: Alignment, Paraphrasing, Aggregation
• Word Vectors + RNNs to represent Document and Query
• Multiplicative Attention
• Multiple passes over the document + Pointer Sum Attention
Representing Document/Query
• As compositions of word vectors
arrested Illinois governor Rod Blagojevich
Things which help:
Ø Pretrained GloVe embeddings
Ø Random vectors for OOV tokens at test time.
• Better than trained “UNK” embedding.
Ø Character embeddings
* Dhingra, Liu, Salakhutdinov, Cohen (preprint, 2017)
** Yang, Dhingra, Yuan, Hu, Cohen, Salakhutdinov (ICLR, 2017)
• Bidirectional Gated Recurrent Units process the tokens from left to right and right to left
arrested Illinois governor Rod Blagojevich
$d_t = [\overrightarrow{h}_t ; \overleftarrow{h}_t]$
• Both document and query are represented as matrices
Document tokens: arrested Illinois governor Rod Blagojevich; query tokens: corruption by X who was
$D \in \mathbb{R}^{2h \times |D|}, \qquad Q \in \mathbb{R}^{2h \times |Q|}$
$h$: state size of each GRU
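A minimal NumPy sketch of how such a matrix can be built: a GRU is run left to right and a second one right to left, and the two states are stacked per token, giving $2h$ rows and one column per token. The standard GRU equations, the random parameter initialisation, and names such as `init_gru` and `bigru_encode` are assumptions made for illustration, not the lecture's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru(input_dim, h, rng):
    """Random GRU parameters; the names W*, U*, b* are illustrative."""
    p = {}
    for gate in ("r", "z", "h"):
        p["W" + gate] = rng.normal(scale=0.1, size=(h, input_dim))
        p["U" + gate] = rng.normal(scale=0.1, size=(h, h))
        p["b" + gate] = np.zeros(h)
    return p

def gru_states(X, p):
    """Run a GRU over the columns of X (input_dim x T); return an h x T matrix."""
    h_prev = np.zeros(p["br"].shape[0])
    states = []
    for x in X.T:                                  # one token embedding per step
        r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])
        z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])
        h_tilde = np.tanh(p["Wh"] @ x + r * (p["Uh"] @ h_prev) + p["bh"])
        h_prev = (1.0 - z) * h_prev + z * h_tilde
        states.append(h_prev)
    return np.stack(states, axis=1)

def bigru_encode(X, fwd_params, bwd_params):
    """Stack forward and backward states: column t is [h_t forward ; h_t backward]."""
    forward = gru_states(X, fwd_params)
    backward = gru_states(X[:, ::-1], bwd_params)[:, ::-1]   # re-align to token order
    return np.concatenate([forward, backward], axis=0)

rng = np.random.default_rng(0)
emb_dim, h = 8, 4
doc_embeddings = rng.normal(size=(emb_dim, 5))     # stand-in for 5 word vectors
fwd_params = init_gru(emb_dim, h, rng)
bwd_params = init_gru(emb_dim, h, rng)
D = bigru_encode(doc_embeddings, fwd_params, bwd_params)    # shape (2h, |D|)
```

The query matrix is obtained the same way from the query's word vectors, with its own pair of GRUs.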
Gated Attention Mechanism
• For each token in D, we form a token-specific representation of the query
Document token $d_i$ (e.g. Blagojevich); query tokens $q_j$: corruption by X who was
$\alpha_{ij} = \mathrm{softmax}(q_j^\top d_i), \qquad \tilde{q}_i = \sum_{j=1}^{|Q|} \alpha_{ij} q_j$
Dhingra et al (ACL 2017)
• Use element-wise multiplication to gate the interaction between document tokens and query
$\tilde{q}_i = \sum_{j=1}^{|Q|} \alpha_{ij} q_j, \qquad d'_i = d_i \odot \tilde{q}_i$
1. Find features in the query which match the contextual representation of the document
2. Gate the document representation by multiplying with these features
Dhingra et al (ACL 2017)
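The two steps can be written compactly in matrix form. The sketch below assumes the document and query are stored column-wise as $2h \times |D|$ and $2h \times |Q|$ matrices and that the softmax runs over the query positions $j$; `gated_attention` is an illustrative name.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def gated_attention(D, Q):
    """D: 2h x |D| document matrix, Q: 2h x |Q| query matrix.
    Returns the gated document (2h x |D|) and the attention weights (|D| x |Q|)."""
    alpha = softmax(D.T @ Q, axis=1)   # alpha[i, j]: weight of query token j for document token i
    Q_tilde = Q @ alpha.T              # column i is q~_i = sum_j alpha[i, j] * q_j
    return D * Q_tilde, alpha          # element-wise gate: d'_i = d_i * q~_i

rng = np.random.default_rng(0)
D = rng.normal(size=(6, 4))            # toy document with 4 tokens
Q = rng.normal(size=(6, 3))            # toy query with 3 tokens
D_gated, alpha = gated_attention(D, Q)
```

Each document token keeps its own dimensionality; the gate only rescales its features by how strongly the attended query features agree with them.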
Multi-Hop Architecture
• Perform several passes over the document
– Allow model to combine evidence from multiple sentences
[Figure: Gated Attention layers stacked between document and query; repeat for K layers]
Dhingra et al (ACL 2017)
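The repeated passes can be sketched as a loop in which each hop re-encodes the document, encodes the query, and applies gated attention, with the gated output feeding the next hop. In this sketch the per-layer bidirectional GRUs are replaced by hypothetical `tanh` projections (`make_toy_encoder`) to keep it short, and how the last layer is read out is left to the output model.

```python
import numpy as np

def gated_attention(D, Q):
    """One gated-attention layer: d'_i = d_i * q~_i (element-wise)."""
    scores = D.T @ Q
    alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)
    return D * (Q @ alpha.T)

def multi_hop(X, Y, doc_encoders, query_encoders):
    """X: document token features, Y: query token features (features x tokens).
    One (document encoder, query encoder) pair is used per hop."""
    for encode_doc, encode_query in zip(doc_encoders, query_encoders):
        D = encode_doc(X)
        Q = encode_query(Y)
        X = gated_attention(D, Q)      # gated document becomes the next hop's input
    return X

def make_toy_encoder(dim, rng):
    """Hypothetical stand-in for a bidirectional GRU: a fixed tanh projection."""
    W = rng.normal(scale=0.5, size=(dim, dim))
    return lambda M: np.tanh(W @ M)

rng = np.random.default_rng(0)
dim, K = 6, 3
X = rng.normal(size=(dim, 7))          # 7 document tokens
Y = rng.normal(size=(dim, 4))          # 4 query tokens
doc_encoders = [make_toy_encoder(dim, rng) for _ in range(K)]
query_encoders = [make_toy_encoder(dim, rng) for _ in range(K)]
X_final = multi_hop(X, Y, doc_encoders, query_encoders)
```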
Output Model
• Probability that a particular token in the document answers the query:
Ø Take an inner product between the query embedding and the output of the last layer:
$s_i = \dfrac{\exp \langle q^{(K)}, d_i^{(K)} \rangle}{\sum_{i'} \exp \langle q^{(K)}, d_{i'}^{(K)} \rangle}, \quad i = 1, \ldots, |D|$
• The probability of a particular candidate $c$ is then aggregated over all document tokens which appear in $c$, that is, over the set of positions where a token in $c$ appears in the document $d$ (Pointer Sum Attention of Kadlec et al., 2016).
• The candidate with maximum probability is selected as the predicted answer.
• Use cross-entropy loss between the predicted probabilities and the true answers.
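A small sketch of this output layer, under some simplifying assumptions: candidates are single tokens, at least one candidate occurs in the document, and the aggregated scores are renormalised over the candidate set. The toy states and the name `pointer_sum` are illustrative only.

```python
import numpy as np

def pointer_sum(doc_states, query_vec, doc_tokens, candidates):
    """doc_states: dim x |D| last-layer document matrix; query_vec: dim vector.
    Returns a dict candidate -> probability, normalised over the candidates."""
    logits = query_vec @ doc_states                 # inner product <q, d_i> for every position i
    s = np.exp(logits - logits.max())
    s = s / s.sum()                                 # s_i: softmax over document positions
    mass = {c: sum(s[i] for i, tok in enumerate(doc_tokens) if tok == c)
            for c in candidates}                    # add up s_i over the positions where c occurs
    total = sum(mass.values())
    return {c: float(m / total) for c, m in mass.items()}

doc_tokens = ["rod", "blagojevich", "and", "john", "harris", "blagojevich"]
candidates = ["blagojevich", "harris"]
doc_states = np.array([[0.0, 1.0, 0.0, 0.0, 0.0, 1.0],
                       [1.0, 0.0, 1.0, 1.0, 1.0, 0.0]])
query_vec = np.array([1.0, 0.0])

probs = pointer_sum(doc_states, query_vec, doc_tokens, candidates)
prediction = max(probs, key=probs.get)              # candidate with maximum probability
loss = -np.log(probs["blagojevich"])                # cross-entropy if the true answer is "blagojevich"
```

Summing over positions is what lets a candidate mentioned several times accumulate evidence from each mention.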
Results
Datasets: we studied 5 datasets.
[Table: results on the 5 datasets, with state-of-the-art entries marked]
Dhingra et al (ACL 2017)
Analysis of Attention
(Query and document: the Who-Did-What example above)
$\alpha_{ij} = \mathrm{softmax}(q_j^\top d_i), \qquad \tilde{q}_i = \sum_{j=1}^{|Q|} \alpha_{ij} q_j$
[Figure: attention between document token $d_i$ (Blagojevich) and the query tokens, shown for Layer 1 and Layer 2]
Dhingra et al (ACL 2017)
Summary so far
• Multiplicative attention for document and query alignment
• Multiple layers allow the model to focus on different salient aspects of the query
Code + Data: https://github.com/bdhingra/ga-reader
Words vs. Characters
• Word-level representations are good at learning the semantics of the tokens
• Character-level representations are more suitable for modeling sub-word morphologies (“cat” vs. “cats”)
Ø Word-level representations are obtained from a learned lookup table
Ø Character-level representations are usually obtained by applying an RNN or CNN
• Hybrid word-character models have been shown to be successful in various NLP tasks (Yang et al., 2016a; Miyamoto & Cho, 2016; Ling et al., 2015)
• A commonly used method is to concatenate these two representations
Fine-Grained Gating
• Fine-grained gating mechanism: a gating term combines the word-level representation and the character-level representation
• Additional features: named entity tags, part-of-speech tags, document frequency vectors, word look-up representations
Yang et al, ICLR 2017
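A minimal sketch of such a gate, assuming the form $g = \sigma(W_g v + b_g)$ and $h = g \odot c + (1 - g) \odot w$, where $v$ holds the additional features, $c$ is the character-level vector and $w$ the word-level vector. The slide's own equation is not reproduced in this transcript, so the exact parameterisation and the names `W_g`, `b_g`, `fine_grained_gate` should be read as assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fine_grained_gate(word_vec, char_vec, features, W_g, b_g):
    """Mix word-level and character-level vectors with a per-dimension gate.
    features: the additional features (e.g. NER / POS tag encodings) driving the gate."""
    g = sigmoid(W_g @ features + b_g)            # gate values in (0, 1), one per dimension
    mixed = g * char_vec + (1.0 - g) * word_vec  # high g -> character-level, low g -> word-level
    return mixed, g

rng = np.random.default_rng(0)
dim, n_features = 4, 3
word_vec, char_vec = rng.normal(size=dim), rng.normal(size=dim)
features = rng.normal(size=n_features)
W_g = rng.normal(size=(dim, n_features))
mixed, g = fine_grained_gate(word_vec, char_vec, features, W_g, np.zeros(dim))
```

Unlike plain concatenation, the gate lets each dimension of each token choose its own balance between the two representations.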
Children’s Book Test (CBT) Dataset
Words vs. Characters
• High gate values: character-level representations
• Low gate values: word-level representations.
Talk Roadmap
• Multiplicative and Fine-grained Attention
• Linguistic Knowledge as Explicit Memory for RNNs
Broad-Context Language Modeling
Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.” She gave me a quick nod and turned back to X
Task: Find X. (X can be any word in the vocabulary)
LAMBADA dataset, Paperno et al., ACL 2016
Passages are filtered such that humans can correctly answer when given the whole context but not when given only the last sentence.
Broad-Context Language Modeling (recast as Reading Comprehension)
Document: Her plain face broke into a huge smile when she saw Terry. “Terry!” she called out. She rushed to meet him and they embraced. “Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.”
Query: She gave me a quick nod and turned back to X. Find X. (Now X is assumed to be in the document)
Chu et al, EACL 2017
• Reading Comprehension approaches perform best on this task
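The recasting itself is a simple data transformation. The sketch below assumes the passage is already split into sentences, takes the last sentence as the query with its final word replaced by a placeholder, and uses a crude regex tokeniser to check whether the answer occurs in the document; the function name and the tokenisation are illustrative choices, not those of Chu et al.

```python
import re

def recast_as_reading_comprehension(passage_sentences, placeholder="X"):
    """Split a LAMBADA-style passage into (document, query, answer, answer_in_document)."""
    *context, target = passage_sentences
    target_tokens = target.split()
    answer = target_tokens[-1]                              # the word to be predicted
    query = " ".join(target_tokens[:-1] + [placeholder])    # last sentence with the word masked
    document = " ".join(context)
    answer_in_document = answer in re.findall(r"\w+", document)
    return document, query, answer, answer_in_document

passage = [
    "Her plain face broke into a huge smile when she saw Terry.",
    "“Terry!” she called out.",
    "She rushed to meet him and they embraced.",
    "“Hon, I want you to meet an old friend, Owen McKenna. Owen, please meet Emily.”",
    "She gave me a quick nod and turned back to Terry",
]
document, query, answer, answer_in_document = recast_as_reading_comprehension(passage)
```

Examples for which `answer_in_document` is false cannot be answered by a model that only points into the document, which is why the recast setting assumes X appears there.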
Text Phenomena
Document: the LAMBADA passage above
Query: She gave me a quick nod and turned back to X. Find X.
Syntactic Dependency: conj, nsubj, nmod
Semantic Dependency: A0: turner, A1: destination
Coreference: X = Terry
Architectural Bias
1. Dependent/Coreferent mentions may be far apart in the document
– RNNs (including GRU/LSTM) tend to “forget” long-term interactions
Use memory-augmented architecture.
2. Off-the-shelf NLP tools are available which extract these dependencies
– Example: Stanford CoreNLP
Use linguistic annotations to guide memory propagation in the network.
Linguistic Knowledge
mary got the football she went to the kitchen and left the ball there
CoreNLP / SENNA
1. Coreference
2. Syntactic Dependencies
3. Semantic Roles
+ Inverse relations
Representing Document/Query Graphs
• Both document and query are represented as matrices
$D \in \mathbb{R}^{2h \times |D|}, \qquad Q \in \mathbb{R}^{2h \times |Q|}$
[Figure: graph representations over the tokens of “mary got the football she went to the kitchen and left the ball there”]
Forward/Backward DAGs
• Topological order given by the sequence itself
[Figure: the annotated sentence “mary got the football she went to the kitchen and left the ball there” split into a Forward DAG and a Backward DAG]
* Peng et al (TACL, 2017) use similar decomposition for Relation Extraction
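One way to picture the decomposition: every labelled edge between token positions goes into the forward graph if it points to a later token and into the backward graph if it points to an earlier one, so both graphs are acyclic with the token order as a topological order. The sketch below also adds edges between neighbouring tokens to each graph; that choice, the edge labels, and the example coreference links are assumptions for illustration.

```python
def split_into_dags(num_tokens, edges):
    """edges: (source, target, label) triples over token positions.
    Returns (forward, backward) edge lists; each also receives the sequential
    edges between neighbouring tokens (an assumption of this sketch)."""
    forward = [(t, t + 1, "next") for t in range(num_tokens - 1)]
    backward = [(t + 1, t, "prev") for t in range(num_tokens - 1)]
    for source, target, label in edges:
        if source < target:
            forward.append((source, target, label))    # points to a later token
        elif source > target:
            backward.append((source, target, label))   # points to an earlier token
    return forward, backward

tokens = "mary got the football she went to the kitchen and left the ball there".split()
# Hypothetical annotations: "she" (4) corefers with "mary" (0); "ball" (12) with "football" (3).
edges = [(0, 4, "coref"), (4, 0, "coref-inv"), (3, 12, "coref"), (12, 3, "coref-inv")]
forward, backward = split_into_dags(len(tokens), edges)
```

A recurrent network can then process the forward graph left to right and the backward graph right to left, much like the two directions of a bidirectional RNN.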
Memory as Acyclic Graph Encoding (MAGE) RNN
[Figure: the sentence “mary got the football she went to the kitchen and left the ball there” with the update at the token “went”; labels: hidden-state parts $h_t^1, h_t^2, h_t^3$, memories $m_t^1, m_t^3$, tokens she, left, to, kitchen]
Update equations:
Memory: $m_t = m_t^1 \,\|\, m_t^2 \,\|\, \ldots \,\|\, m_t^{|E_f|}$
GRU update:
$r_t = \sigma(W_r x_t^s + U_r m_t + b_r)$
$z_t = \sigma(W_z x_t^s + U_z m_t + b_z)$
$\tilde{h}_t = \tanh(W_h x_t^s + r_t \odot U_h m_t + b_h)$
$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
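The update equations translate almost line by line into NumPy. In this sketch the per-edge-type memories are passed in as a list of vectors, since how they are gathered from earlier tokens depends on the graph; the parameter dictionary, the dimensions, and the all-zero parameters used in the example are assumptions chosen so the result is easy to check by hand.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mage_step(x_t, h_prev, memories, p):
    """One MAGE-style GRU step. memories: one vector per edge type."""
    m_t = np.concatenate(memories)                          # m_t = m_t^1 || ... || m_t^{|E_f|}
    r_t = sigmoid(p["Wr"] @ x_t + p["Ur"] @ m_t + p["br"])  # reset gate
    z_t = sigmoid(p["Wz"] @ x_t + p["Uz"] @ m_t + p["bz"])  # update gate
    h_tilde = np.tanh(p["Wh"] @ x_t + r_t * (p["Uh"] @ m_t) + p["bh"])
    return (1.0 - z_t) * h_prev + z_t * h_tilde

def zero_params(input_dim, hidden_dim, memory_dim):
    """All-zero parameters, used only to make the toy example checkable."""
    p = {}
    for gate in "rzh":
        p["W" + gate] = np.zeros((hidden_dim, input_dim))
        p["U" + gate] = np.zeros((hidden_dim, memory_dim))
        p["b" + gate] = np.zeros(hidden_dim)
    return p

input_dim, hidden_dim = 5, 6
memories = [np.ones(2), np.zeros(2), np.ones(2)]            # three edge types, 2 dims each
params = zero_params(input_dim, hidden_dim, memory_dim=6)
h_prev = np.arange(hidden_dim, dtype=float)
h_t = mage_step(np.ones(input_dim), h_prev, memories, params)
```

With all parameters at zero both gates equal 0.5 and the candidate state is zero, so `h_t` is simply half of `h_prev`.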
Performance
• LAMBADA dataset
– Only on questions where answer is in context
[Table: LAMBADA results, with the state-of-the-art entry marked]
Performance
• Facebook bAbI dataset
– Task 3 (1000 training examples)
Document: mary travelled to the kitchen. daniel got the milk. he went to the kitchen. john picked up the apple. mary went to the hallway. john moved to the bedroom. sandra journeyed to the kitchen. daniel went back to the bedroom. john went to the office. daniel put down the milk. he picked up the milk. sandra went to the bedroom. john dropped the apple. mary went back to the bathroom.
Query: where was the apple before the office?
Answer: bedroom
Only coreference annotations extracted for this dataset
Performance
• Facebook bAbI dataset
– Task 3
[Table: bAbI Task 3 results]
One-hot indicator feature for entity mention appended to input
Summary so far
• Linguistic annotations can help guide memory propagation in RNNs
– Better modeling of long-term dependencies
Annotator Noise
• Performance drops sharply with added noise in linguistic annotations
• Can we jointly train annotation and reading comprehension models?
– Can we learn task-specific knowledge?
“Common” Sense?
Query: “Who was the female member of the 1980’s pop music duo Eurythmics?”
Document: “Eurythmics were a British music duo consisting of members Annie Lennox and David A. Stewart”
Answer: Annie Lennox
TriviaQA Dataset (Joshi et al, ACL 2017)
• Where can we find such knowledge?
• Need more datasets for testing this aspect too.
Incorporating Prior Knowledge
Text: the LAMBADA passage above (“Her plain face broke into a huge smile when she saw Terry. … She gave me a quick nod and turned back to X”)
[Figure: a Recurrent Neural Network produces the text representation, with prior knowledge from the sources below]
• CoreNLP: Coreference, Dependency Parses
• Freebase: Entity relations
• WordNet: Word relations
Search-and-Read for open domain QA
• RC assumes the passage containing the answer is already known
• For real QA we need to search for the passage as well
Question: 7-Eleven stores were temporarily converted into Kwik E-marts to promote the release of what movie?
Corpus → Retrieval → Excerpts:
– In July 2007, 7-Eleven redesigned some stores to look like Kwik-E-Marts in select cities to promote The Simpsons Movie.
– Tie-in promotions were made with several companies, including 7-Eleven, which transformed selected stores into Kwik-E-Marts.
– 7-Eleven Becomes Kwik-E-Mart for Simpsons Movie Promotion
QUASAR Dataset (Dhingra, Mazaitis, Cohen, in submission)
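The search step can be illustrated with a deliberately simple retriever that ranks passages by how many distinct words they share with the question. This word-overlap ranking is a stand-in chosen for brevity, not the retrieval method used to build QUASAR; the corpus below reuses the excerpts from the slide plus one unrelated sentence as a distractor.

```python
import re

def tokenize(text):
    """Lower-cased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, corpus, top_k=2):
    """Rank passages by the number of distinct tokens shared with the question."""
    q = tokenize(question)
    ranked = sorted(corpus, key=lambda passage: len(q & tokenize(passage)), reverse=True)
    return ranked[:top_k]

question = ("7-Eleven stores were temporarily converted into Kwik E-marts "
            "to promote the release of what movie?")
passage_a = ("In July 2007, 7-Eleven redesigned some stores to look like Kwik-E-Marts "
             "in select cities to promote The Simpsons Movie.")
passage_b = ("Tie-in promotions were made with several companies, including 7-Eleven, "
             "which transformed selected stores into Kwik-E-Marts.")
passage_c = "7-Eleven Becomes Kwik-E-Mart for Simpsons Movie Promotion"
distractor = ("Eurythmics were a British music duo consisting of members "
              "Annie Lennox and David A. Stewart")
corpus = [distractor, passage_b, passage_a, passage_c]

excerpts = retrieve(question, corpus, top_k=2)   # passages handed to the reader
```

A reading comprehension model would then be applied to the retrieved excerpts, so its accuracy is bounded by whether the retriever surfaces a passage that contains the answer.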
Thank you