Post on 30-May-2020

PEAK: Pyramid Evaluation via Automated Knowledge Extraction

Qian Yang, Rebecca J. Passonneau, Gerard de Melo

PhD Candidate, Tsinghua University
Visiting Student, Columbia University

http://www.larayang.com/

Content

• Evaluating Summary Content
• Our Contribution
• How does PEAK work?
  – Semantic Content Analysis
  – Pyramid Induction
  – Automated Scoring
• Our Results
• Conclusion

Evaluating Summary Content

• Human assessors
  – Judge each summary individually
  – Very time-consuming and does not scale well
• ROUGE (Lin 2004)
  – Automatically compares n-grams with model summaries
  – Not reliable enough for individual summaries (Gillick 2011)
• Pyramid Method (Nenkova and Passonneau, 2004)
  – Semantic comparison, reliable for individual summaries
  – Has required manual annotation
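To make the n-gram comparison concrete, here is a minimal sketch of a ROUGE-N style recall. It is a simplification for illustration, not the official ROUGE toolkit: the function names `ngrams` and `rouge_n_recall` are hypothetical, and tokenization is a plain lowercase whitespace split. The sentence pair is borrowed from the scoring example in this deck.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=2):
    """Clipped n-gram recall of a candidate against model summaries.

    Assumption: whitespace tokenization, no stemming or stopword removal.
    """
    cand_counts = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        # Counter & Counter keeps the minimum count per n-gram (clipping).
        matched += sum((cand_counts & ref_counts).values())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


model = ["Plaid Cymru wants full independence"]
target = "Plaid Cymru demands an independent Wales"
print(rouge_n_recall(target, model, n=1))  # 2 of 5 unigrams match
print(rouge_n_recall(target, model, n=2))  # 1 of 4 bigrams match
```

The two sentences say nearly the same thing, yet only the name "Plaid Cymru" overlaps, which is the kind of mismatch a semantic comparison is meant to handle.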

Our Contribution

• No need for manually created pyramids
• Also good results on automatic assessment given a pyramid

Semantic Content Analysis

Source: http://www1.ccls.columbia.edu/~beck/pubs/2458_PassonneauEtAl.pdf

Figure 1: Sample SCU from Pyramid Annotation Guide: DUC 2006.

Weight: 4

• “The law of conservation of energy is the notion that energy can be transferred between objects but cannot be created or destroyed.”
• Open information extraction (Open IE) methods split such sentences and extract ⟨subject, predicate, object⟩ triples
• “These characteristics determine the properties of matter” yields the triple ⟨These characteristics, determine, the properties of matter⟩
• We use ClausIE (Del Corro and Gemulla 2013)

Figure 2: Hypergraph to capture similarities between elements of triples, with salient nodes circled in red

Similarity score: Align, Disambiguate and Walk (ADW) (Pilehvar, Jurgens, and Navigli 2013)

Pyramid Induction

Scoring – Pyramid Method

• Score a target summary against a pyramid
  – Annotators mark spans of text in the target summary that express an SCU
  – The SCU weights increment the raw score for the target summary.
• An Example
  – SCU Label: Plaid Cymru wants full independence
  – Target Summary: Plaid Cymru demands an independent Wales
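The raw score described above can be sketched in a few lines. This is an illustration under stated assumptions, not PEAK's code: the function names are hypothetical, the SCU weights are made-up values, and the normalization step (dividing by the best score reachable with the same number of SCUs) follows the usual description of the pyramid method rather than anything stated on this slide.

```python
def raw_pyramid_score(scu_weights, matched_scus):
    """Sum the weights of the distinct SCUs expressed by a target summary."""
    return sum(scu_weights[scu] for scu in set(matched_scus))


def normalized_pyramid_score(scu_weights, matched_scus):
    """Raw score divided by the best score attainable with as many SCUs.

    Assumption: all weights are positive, so the denominator is non-zero
    whenever at least one SCU is matched.
    """
    distinct = set(matched_scus)
    if not distinct:
        return 0.0
    # The ideal summary with k SCUs picks the k heaviest SCUs in the pyramid.
    best = sum(sorted(scu_weights.values(), reverse=True)[:len(distinct)])
    return raw_pyramid_score(scu_weights, distinct) / best


# Hypothetical pyramid: labels map to illustrative weights.
pyramid = {
    "Plaid Cymru wants full independence": 4,
    "hypothetical SCU B": 2,
    "hypothetical SCU C": 1,
}
# Suppose the target summary was judged to express the first and third SCU.
matched = ["Plaid Cymru wants full independence", "hypothetical SCU C"]
print(raw_pyramid_score(pyramid, matched))         # 4 + 1
print(normalized_pyramid_score(pyramid, matched))  # (4 + 1) / (4 + 2)
```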

Automated Scoring – PEAK

Dataset

• Student summary dataset from Perin et al. (2013) with 20 target summaries written by students
• Passonneau et al. (2013) had produced 5 reference model summaries, and 2 manually created pyramids

Results

• Machine-Generated Summaries
  – Dataset: the 2006 Document Understanding Conference (DUC) administered by NIST (“DUC06”)
  – The Pearson correlation between PEAK’s scores and the manual ones is 0.7094.
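For readers who want to see how such an agreement figure is computed, the following is a minimal sketch of the Pearson correlation between two lists of summary scores. The function name `pearson` and the sample scores are hypothetical; they are not the DUC06 data and do not reproduce the reported value.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and standard deviations share the same 1/n factor,
    # so it cancels and is omitted.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)


# Hypothetical scores for five summaries (illustrative only).
automatic_scores = [0.42, 0.55, 0.31, 0.70, 0.48]
manual_scores = [0.40, 0.60, 0.35, 0.65, 0.52]
print(pearson(automatic_scores, manual_scores))
```

A value near 1 means the automatic scores rank and space the summaries much as the manual scores do.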

Conclusion

• The first fully automatic version of the pyramid method
• Not only evaluates target summaries but also generates the pyramids automatically
• Experiments show that
  – Our SCUs are similar to those created by humans
  – The method for assessing target summaries automatically has a high correlation with human assessors
• Overall, our research shows great promise for automated scoring and assessment of manual or automated summaries, opening up the possibility of wide-spread use in the education domain and in information management.

The data and code are available at http://www.larayang.com/peak/.

Thank you!
