neural module networks for reasoning over textmausam/courses/col873/spring... · neural modules •...

Neural Module Networks for Reasoning Over Text
Nitish Gupta, Kevin Lin, Dan Roth, Sameer Singh & Matt Gardner
Presented by: Jigyasa Gupta



Neural Modules

• Introduced in the paper “Deep Compositional Question Answering with Neural Module Networks” by Jacob Andreas, Marcus Rohrbach, Trevor Darrell, and Dan Klein for the visual QA task

Slides on neural modules taken from Berthy Feng, a student at Princeton University

Motivation: Compositional Nature of VQA

Motivation: Combine Both Approaches

Modules

• Attention (Find)
• Re-Attention (Transform)
• Combination
• Classification (Describe)
• Measurement

DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

Dheeru Dua, Yizhong Wang, Pradeep Dasigi, Gabriel Stanovsky, Sameer Singh, and Matt Gardner

Neural Module Networks for Reasoning Over Text

• Use Neural Module Networks (NMNs) to answer compositional questions against a paragraph of text.
• Such questions require multiple steps of reasoning with discrete, symbolic operations (as shown in the DROP dataset).
• NMNs are interpretable, modular, and compositional.

Example

NMN components

• Modules: differentiable modules that perform reasoning over text and symbols in a probabilistic manner
• Contextual token representations: $Q \in \mathbb{R}^{n \times d}$ for the question and $P \in \mathbb{R}^{m \times d}$ for the paragraph, where n and m are the numbers of tokens in the question and paragraph and d is the embedding size (bidirectional GRU or pretrained BERT)
• Question parser: encoder-decoder model with attention that maps the question into an executable program
• Learning: the objective combines
  • the likelihood of the program under the question-parser model, p(z|q)
  • for any given program z, the likelihood of the gold answer, p(y∗|z)
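The two likelihoods in the learning bullet can be combined by marginalizing over candidate programs. The slide does not spell out the objective, so the sketch below assumes a small set of candidate programs (for example from the parser's beam) and sums p(z|q) · p(y∗|z) over them in log space. The function name and the toy probabilities are hypothetical.

```python
import math

def neg_log_marginal_likelihood(programs):
    """programs: list of (log p(z|q), log p(y*|z)) pairs, one per candidate program.

    Returns -log sum_z p(z|q) * p(y*|z), computed with a log-sum-exp for stability.
    """
    joint = [log_pz + log_py for log_pz, log_py in programs]
    peak = max(joint)
    log_marginal = peak + math.log(sum(math.exp(v - peak) for v in joint))
    return -log_marginal

# Two hypothetical candidate programs for one question.
candidates = [
    (math.log(0.7), math.log(0.9)),  # likely program that yields the gold answer
    (math.log(0.3), math.log(0.1)),  # unlikely program that rarely yields it
]
loss = neg_log_marginal_likelihood(candidates)  # -log(0.7*0.9 + 0.3*0.1)
```

Because both factors appear in one objective, gradients from the answer likelihood reach the parser as well as the modules, which is what makes the learning joint.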

NMN components

[Figure: architecture diagram. Labels: question embedding, paragraph embedding, question parser (encoder, decoder), program executor (z) with Modules 1-4, answer (y*), joint learning.]

Learning Challenges

• Question parser: free-form, real-world questions have diverse grammar and lexical variability.
• Program executor: no intermediate feedback is available for the modules, so errors get propagated.
• Joint learning: supervision is available only at the gold-answer level, which makes it difficult to learn the question parser and program executor jointly.

Modules

find(Q) → P
For question spans in the input, find similar spans in the passage.

• Compute a similarity matrix S between the question and paragraph token embeddings
• Normalize S to get an attention matrix
• Compute the expected paragraph attention

Input: question attention map
Output: paragraph attention map
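The three find steps can be sketched in plain Python. The learned scoring function that produces the similarity matrix is not shown on the slide, so the sketch takes raw scores S[i][j] as given. It assumes the normalization is a softmax over paragraph tokens for each question token, and the toy numbers are hypothetical.

```python
import math

def softmax(scores):
    """Normalize a list of raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def find(question_attention, similarity):
    """question_attention: length-n weights; similarity: n x m raw scores S[i][j]."""
    n, m = len(similarity), len(similarity[0])
    # Row-wise softmax: each question token attends over all paragraph tokens.
    attention = [softmax(row) for row in similarity]
    # Expected paragraph attention under the question attention.
    return [sum(question_attention[i] * attention[i][j] for i in range(n))
            for j in range(m)]

question_attention = [1.0, 0.0]           # all weight on the first question token
similarity = [[0.0, 0.0, 5.0],            # first question token matches paragraph token 2
              [5.0, 0.0, 0.0]]
paragraph_attention = find(question_attention, similarity)
```

Since the question attention and every row of the attention matrix sum to one, the output is again a distribution over paragraph tokens.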

find(Q) → P: Example

The question attention map is available from the encoder-decoder of the parser.

filter(Q, P) → P
Based on the question, select a subset of spans from the input.

• Compute a weighted sum of the question-token embeddings
• Compute a locally-normalized paragraph-token mask
• The output is a normalized masked input paragraph attention
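A minimal sketch of the filter steps, under stated assumptions: "locally normalized" is read as an independent sigmoid per paragraph token, and the per-token mask logits (which the real module would compute from the weighted question vector and each paragraph token) are passed in directly. All names and numbers are hypothetical.

```python
import math

def weighted_question_vector(question_attention, question_embeddings):
    """Attention-weighted sum of the question-token embeddings."""
    d = len(question_embeddings[0])
    return [sum(a * emb[k] for a, emb in zip(question_attention, question_embeddings))
            for k in range(d)]

def filter_attention(paragraph_attention, mask_logits):
    """Mask the input paragraph attention token by token, then renormalize."""
    # Assumption: each token gets its own sigmoid gate in (0, 1).
    mask = [1.0 / (1.0 + math.exp(-z)) for z in mask_logits]
    masked = [p * g for p, g in zip(paragraph_attention, mask)]
    total = sum(masked)
    return [v / total for v in masked] if total > 0 else masked

q_vec = weighted_question_vector([0.25, 0.75], [[1.0, 0.0], [0.0, 1.0]])
filtered = filter_attention([0.5, 0.5, 0.0], [10.0, -10.0, 0.0])
```

In the usage lines, the second token is attended in the input but suppressed by its mask value, so almost all of the output mass moves to the first token.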

filter(Q, P) → P: Example

relocate(Q, P) → P
Find the argument asked for in the question for the input paragraph spans.

• Compute a weighted sum of the question-token embeddings using the attention map
• Compute a paragraph-to-paragraph attention matrix R
• The output attention is a weighted sum of the rows of R, weighted by the input paragraph attention

find-num(P) → N and find-date(P) → D
Find the number(s)/date(s) associated with the input paragraph spans.

• Extract numbers and dates as a pre-processing step, e.g. [2, 2, 3, 4]
• Compute a token-to-number similarity matrix
• Compute an expected distribution over the number tokens
• Aggregate the probabilities for number tokens with the same value
• Example: {2, 3, 4} with N = [0.5, 0.3, 0.2]
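The aggregation step can be illustrated on the slide's own example. The per-token distribution T below is a hypothetical input chosen so that merging the two mentions of the value 2 gives the N shown above. Only the aggregation is sketched, not the learned similarity.

```python
from collections import defaultdict

def aggregate_by_value(number_tokens, token_probs):
    """Sum the probabilities of number tokens that carry the same value."""
    totals = defaultdict(float)
    for value, prob in zip(number_tokens, token_probs):
        totals[value] += prob
    return dict(sorted(totals.items()))

number_tokens = [2, 2, 3, 4]        # numbers extracted during pre-processing
T = [0.2, 0.3, 0.3, 0.2]            # hypothetical distribution over the four number tokens
N = aggregate_by_value(number_tokens, T)  # roughly {2: 0.5, 3: 0.3, 4: 0.2}
```

Aggregating by value means later modules such as comparisons can reason about the number 2 itself rather than about its individual mentions.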

find-num(P) → N: Example

count(P) → C
Count the number of input passage spans.

• count([0, 0, 0.3, 0.3, 0, 0.4]) = 2
• The module first scales the attention using the values [1, 2, 5, 10] to convert it into a matrix $P_{\text{scaled}} \in \mathbb{R}^{m \times 4}$
• Pretraining this module on synthetic data of attention and count values helps
• The passage attention is normalized and passages are typically 400-500 tokens long, so scaling the attention by values > 1 helps the model differentiate among small values
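The scaling step is small enough to show directly. This sketch covers only the conversion of a length-m attention vector into an m × 4 matrix. The layers that turn the scaled matrix into a count are not described on the slide and are omitted.

```python
SCALES = [1, 2, 5, 10]  # scaling values given on the slide

def scale_attention(attention):
    """Turn a length-m attention vector into an m x 4 matrix, one column per scale."""
    return [[a * s for s in SCALES] for a in attention]

attention = [0, 0, 0.3, 0.3, 0, 0.4]   # the slide's example input
scaled = scale_attention(attention)
```

With 400-500 tokens, a normalized attention assigns most tokens values near zero. The larger scales spread those small values apart before any downstream layer sees them.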

compare-num-lt(P1, P2) → P
Output the span associated with the smaller number.

• N1 = find_num(P1), N2 = find_num(P2)
• Compute two soft Boolean values, p(N1 < N2) and p(N2 < N1)
• Output a weighted sum of the input paragraph attentions
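One way to read the soft Boolean values is as the probability that a number drawn from N1 is smaller than an independently drawn number from N2. The sketch below assumes that reading and assumes both distributions range over the same list of values. The function names and toy inputs are hypothetical.

```python
def prob_less_than(values, dist_a, dist_b):
    """p(a < b) for independent draws a ~ dist_a and b ~ dist_b over `values`."""
    return sum(pa * pb
               for va, pa in zip(values, dist_a)
               for vb, pb in zip(values, dist_b)
               if va < vb)

def compare_num_lt(P1, P2, values, N1, N2):
    """Mix the two paragraph attentions by how likely each holds the smaller number."""
    p_first_smaller = prob_less_than(values, N1, N2)
    p_second_smaller = prob_less_than(values, N2, N1)
    output = [p_first_smaller * a + p_second_smaller * b for a, b in zip(P1, P2)]
    return output, p_first_smaller, p_second_smaller

values = [2, 3, 4]
N1, N2 = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]      # span 1 -> 2, span 2 -> 4
P1, P2 = [0.7, 0.3, 0.0], [0.0, 0.1, 0.9]      # hypothetical paragraph attentions
output, p12, p21 = compare_num_lt(P1, P2, values, N1, N2)
```

Here the first span is certainly associated with the smaller number, so the output reproduces P1. With less peaked distributions the output becomes a blend of both attentions.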

time-diff(P1, P2) → TD
Difference between the dates associated with the paragraph spans.

• The module internally calls the find-date module to get a date distribution for each of the two paragraph attentions, D1 and D2
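Given the two date distributions, a distribution over differences can be formed by enumerating all pairs. The slide does not show this step, so the sketch is an assumption. It treats the two draws as independent and represents dates as integer years for simplicity.

```python
from collections import defaultdict

def time_diff(dates, D1, D2):
    """Distribution over d1 - d2 for independent draws d1 ~ D1 and d2 ~ D2."""
    diff = defaultdict(float)
    for d1, p1 in zip(dates, D1):
        for d2, p2 in zip(dates, D2):
            diff[d1 - d2] += p1 * p2
    return dict(diff)

years = [1918, 1921]                 # hypothetical dates extracted from a paragraph
D1, D2 = [0.9, 0.1], [0.2, 0.8]      # hypothetical outputs of find-date
TD = time_diff(years, D1, D2)        # mass on -3, 0 and 3 years
```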

find-max-num(P) → P, find-min-num(P) → P
Select the span that is associated with the largest (find-max-num) or smallest (find-min-num) number.

• Compute an expected number-token distribution T using find-num
• Compute the expected probability that each number token is the one with the maximum value, $T^{\max} \in \mathbb{R}^{n_{\text{tokens}}}$
• Reweight the contribution from the i-th paragraph token to the j-th number token
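The slide does not say how the "is the maximum" probability is computed. One possible way to make it soft, shown here purely as an assumption, is an order statistic: draw n numbers independently from T and ask how likely each value is to be the largest draw. The sample count n = 3 and the requirement that values be distinct are both assumptions of this sketch, and the reweighting step is not shown.

```python
def max_distribution(values, probs, n=3):
    """Probability that each (distinct) value is the maximum of n independent draws."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    cdf = 0.0
    for k in order:
        below = cdf                 # p(draw < values[k])
        cdf += probs[k]             # p(draw <= values[k])
        # All n draws are <= this value, but not all are strictly below it.
        result[k] = cdf ** n - below ** n
    return result

values = [2, 3, 4]
T = [0.5, 0.3, 0.2]
T_max = max_distribution(values, T)   # mass shifts toward the larger values
```

The minimum case would follow by sorting in descending order. Compared with T, the result puts less mass on 2 and more on 4, which is the intended bias toward the largest number.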

span(P) → S
Identify a contiguous span from the attended tokens.

• Only appears as the outermost module in a program
• Outputs two probability distributions, $P_s$ and $P_e \in \mathbb{R}^m$, denoting the start and end of a span
• This module is implemented similarly to the count module

Auxiliary supervision

• An unsupervised auxiliary loss provides an inductive bias to the execution of the find-num, find-date, and relocate modules
• Heuristically obtained supervision is provided for the question program and intermediate module outputs for a subset of questions (5–10%)

Unsupervised auxiliary loss for IE

• The find-num, find-date, and relocate modules perform information extraction
• The objective increases the sum of the attention probabilities for output tokens that appear within a window of W = 10
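A sketch of such a window objective, assuming it is the negative log of the attention mass that each paragraph token places on output tokens within W positions of itself. The exact loss is not given on the slide. The function name, the epsilon for numerical safety, and the toy inputs are hypothetical.

```python
import math

def window_loss(attention, output_positions, window=10):
    """attention[i][j]: attention from paragraph token i to output token j.
    output_positions[j]: paragraph index of output token j (e.g. a number token)."""
    loss = 0.0
    for i, row in enumerate(attention):
        # Attention mass that token i places on output tokens near it.
        inside = sum(a for a, pos in zip(row, output_positions)
                     if abs(pos - i) <= window)
        loss -= math.log(inside + 1e-12)  # epsilon guards against log(0)
    return loss

positions = [1, 50]                                  # one nearby and one distant number
near = window_loss([[0.9, 0.1], [0.9, 0.1]], positions)
far = window_loss([[0.1, 0.9], [0.1, 0.9]], positions)
```

Minimizing this loss pushes attention toward nearby output tokens, which is the inductive bias the slide describes. In the usage lines, attending mostly to the nearby number gives a much lower loss than attending to the distant one.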

QuestionParseSupervision

• Heuristic patterns are used to get program and corresponding question-attention supervision for a subset of the training data (10%)

IntermediateModuleOutputSupervision

• Used for the find-num and find-date modules
• For a subset of the questions (5%)
• E.g. “how many yards was the longest/shortest touchdown?”
  • Identify all instances of the token “touchdown”
  • Assume the closest number to each should be an output of the find-num module
  • Supervise this as a multi-hot vector N∗ and use an auxiliary loss
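The touchdown heuristic can be sketched as follows. The tokenization, the function name, and the example sentence are hypothetical. The sketch only builds the multi-hot target N∗ and does not implement the auxiliary loss.

```python
def closest_number_supervision(tokens, keyword, number_positions):
    """Mark, for every occurrence of `keyword`, the nearest number token as a target."""
    multi_hot = [0] * len(number_positions)
    for i, token in enumerate(tokens):
        if token == keyword:
            nearest = min(range(len(number_positions)),
                          key=lambda j: abs(number_positions[j] - i))
            multi_hot[nearest] = 1
    return multi_hot

tokens = "a 30 yard field goal , a 5 yard touchdown".split()
number_positions = [1, 7]            # paragraph indices of the number tokens 30 and 5
N_star = closest_number_supervision(tokens, "touchdown", number_positions)
```

Only the number next to "touchdown" is marked, so the field-goal distance is not treated as a find-num target for a touchdown question.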

Dataset

• 20,000 questions for training/validation and 1,800 questions for testing (25% of DROP)
• Questions in the scope of the model were extracted automatically based on their first n-gram

RESULTS

RESULTS – Question Types

Effect of Auxiliary Supervision

Incorrect Program Predictions

• How many touchdown passes did Tom Brady throw in the season? Predicted: count(find)
  • The correct answer requires a simple lookup from the paragraph.
• Which happened last, failed assassination attempt on Lenin, or the Red Terror? Predicted: date-compare-gt(find, find)
  • The correct answer requires natural language inference about the order of events, not symbolic comparison between dates.
• Who caught the most touchdown passes? Predicted: relocate(find-max-num(find))
  • Requires nested counting, which is out of scope.

Future Work

• Design additional modules to handle questions such as:
  • How many languages each had less than 115,000 speakers in the population?
  • Which quarterback threw the most touchdown passes?
  • How many points did the packers fall behind during the game?
• Use the complete DROP dataset: in the current system, training the model on questions for which the modules cannot express the correct reasoning harms their ability to execute their intended operations
• Open up avenues for transfer learning, where modules can be independently trained using indirect or distant supervision from different tasks
• Combine black-box operations with the interpretable modules to capture more expressivity

Review Comments - Pros

• Interesting idea [Atishya, Rajas, Keshav, Siddhant, Lovish]
• Interpretable and modular [Atishya, Rajas, Siddhant, Lovish, Vipul]
• Better than BERT for symbolic reasoning [Keshav]
• Auxiliary loss formulation seems a very novel idea [Vipul]
• Question parser has a new role: parse to return a composition of modules [Pawan]

Review Comments - Cons

• Module descriptions are difficult to understand [Atishya, Siddhant]
• Auxiliary loss is not generalizable [Atishya, Rajas]
• Contribution of each module is not studied [Atishya, Rajas, Siddhant, Lovish, Pawan]
• Only 22% of the DROP dataset is used [Rajas, Keshav, Lovish]
• Compositional reasoning queries like “Who is the mother of PM of India?” are not handled [Keshav]
• An endless number of modules is required to achieve full reasoning capability [Vipul]

Review Comments - Extensions

• Study the contribution of each module [Atishya]
• Pre-train all the modules by collecting data using specific heuristics [Atishya, Rajas]
• RL framework to predict whether a given question can be sufficiently reasoned about [Rajas]
• Module to predict open predicates of the type PM(India, x) & Mother(x, y) [Keshav, Vipul]
• Train multipurpose modules (to predict "citizen of" and "president of" relationships) [Vipul]
• Combine an end-to-end neural system and an NMN [Keshav]
• Learn new modules from the dataset automatically; learn new SPARQL templates from data [Siddhant, Pawan]
• Curriculum learning [Siddhant]
• Meta-learning to automatically determine the modules [Lovish]