Post on 30-May-2020
PEAK: Pyramid Evaluation via Automated Knowledge Extraction
Qian Yang, Rebecca J. Passonneau, Gerard de Melo
PhD Candidate, Tsinghua University; Visiting Student, Columbia University
http://www.larayang.com/
Content
• Evaluating Summary Content
• Our Contribution
• How does PEAK work?
  – Semantic Content Analysis
  – Pyramid Induction
  – Automated Scoring
• Our Results
• Conclusion
Evaluating Summary Content
• Human assessors
  – Judge each summary individually
  – Very time-consuming and does not scale well
• ROUGE (Lin 2004)
  – Automatically compares n-grams with model summaries
  – Not reliable enough for individual summaries (Gillick 2011)
• Pyramid Method (Nenkova and Passonneau, 2004)
  – Semantic comparison, reliable for individual summaries
  – Has required manual annotation
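To make the n-gram comparison concrete, here is a minimal sketch of a clipped n-gram recall in the spirit of ROUGE-N. It is a simplification, not the official ROUGE implementation: whitespace tokenization, a single reference, and the names `ngrams` and `ngram_recall` are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(candidate, reference, n=2):
    """Clipped n-gram recall of `candidate` against one `reference` string."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Count each reference n-gram at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Toy sentences invented for this sketch.
reference = "energy cannot be created or destroyed"
candidate = "energy cannot be made or destroyed"
score = ngram_recall(candidate, reference, n=2)
print(score)
```

A single substituted word removes two of the five reference bigrams here, which hints at why surface overlap can be a noisy signal for an individual summary.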
Our Contribution
• No need for manually created pyramids
• Also good results on automatic assessment given a pyramid
Semantic Content Analysis
Source: http://www1.ccls.columbia.edu/~beck/pubs/2458_PassonneauEtAl.pdf
Figure 1: Sample SCU from Pyramid Annotation Guide: DUC 2006.
Weight: 4
• Example sentence: “The law of conservation of energy is the notion that energy can be transferred between objects but cannot be created or destroyed.”
• Open information extraction (Open IE) methods split such sentences and extract ⟨subject, predicate, object⟩ triples
• “These characteristics determine the properties of matter” yields the triple ⟨These characteristics, determine, the properties of matter⟩
• We use ClausIE (Del Corro and Gemulla 2013)
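As an illustration of the data structure involved, the sketch below represents a subject-predicate-object triple in Python. The `Triple` type and `render` helper are hypothetical names, and the triple is typed in by hand from the example above; this does not call ClausIE or reproduce its output format.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One <subject, predicate, object> triple (hypothetical container)."""
    subject: str
    predicate: str
    obj: str


def render(triple):
    """Format a triple in angle-bracket notation."""
    return f"<{triple.subject}, {triple.predicate}, {triple.obj}>"


# Copied by hand from the slide's example; not produced by running an Open IE tool.
triple = Triple("These characteristics", "determine", "the properties of matter")
rendered = render(triple)
print(rendered)
```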
Figure 2: Hypergraph to capture similarities between elements of triples, with salient nodes circled in red.
Similarity score: Align, Disambiguate and Walk (ADW) (Pilehvar, Jurgens, and Navigli 2013)
Pyramid Induction
Scoring – Pyramid Method
• Score a target summary against a pyramid
  – Annotators mark spans of text in the target summary that express an SCU
  – The SCU weights increment the raw score for the target summary.
• An Example
  – SCU Label: Plaid Cymru wants full independence
  – Target Summary: Plaid Cymru demands an independent Wales
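The raw score described above can be sketched as a sum of SCU weights. This is a minimal illustration, not PEAK's code: the function name `raw_pyramid_score`, the two extra SCU labels, and all three weights are hypothetical, and counting each matched SCU only once is an assumption of the sketch.

```python
def raw_pyramid_score(scu_weights, matched_scus):
    """Sum the weights of the distinct SCUs found in a target summary."""
    # Converting to a set means a repeated SCU contributes its weight only once.
    return sum(scu_weights[scu] for scu in set(matched_scus))


# Hypothetical pyramid: SCU label -> weight (weights invented for illustration).
scu_weights = {
    "Plaid Cymru wants full independence": 4,
    "hypothetical SCU B": 2,
    "hypothetical SCU C": 1,
}

# SCUs judged to be expressed by the target summary, e.g. because
# "Plaid Cymru demands an independent Wales" matches the first label.
matched = ["Plaid Cymru wants full independence", "hypothetical SCU C"]
score = raw_pyramid_score(scu_weights, matched)
print(score)
```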
Automated Scoring – PEAK
Dataset
• Student summary dataset from Perin et al. (2013) with 20 target summaries written by students
• Passonneau et al. (2013) had produced 5 reference model summaries and 2 manually created pyramids
Results
• Machine-Generated Summaries
  – Dataset: the 2006 Document Understanding Conference (DUC) administered by NIST (“DUC06”)
  – The Pearson correlation between PEAK’s scores and the manual ones is 0.7094.
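For reference, a Pearson correlation between automatic and manual scores can be computed as in the sketch below. The function is a plain standard-library implementation, and the score lists are made-up toy values chosen to be perfectly linear; they are not DUC06 data and do not reproduce the reported figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Toy values invented for this sketch (not taken from any dataset).
automatic_scores = [1.0, 2.0, 3.0, 4.0]
manual_scores = [2.0, 4.0, 6.0, 8.0]
r = pearson(automatic_scores, manual_scores)
print(r)
```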
Conclusion
• The first fully automatic version of the pyramid method
• Not only evaluates target summaries but also generates the pyramids automatically
• Experiments show that
  – Our SCUs are similar to those created by humans
  – The method for assessing target summaries automatically has a high correlation with human assessors
• Overall, our research shows great promise for automated scoring and assessment of manual or automated summaries, opening up the possibility of widespread use in the education domain and in information management.
The data and code are available at http://www.larayang.com/peak/.
Thank you!