PhD Research Proposal - Qualifying Exam

Extracting Temporal and Causal Relations between Events

Paramita

Under the supervision of Sara Tonelli

10 December 2013


DESCRIPTION

Understanding which events are mentioned in unstructured natural language texts, and which relations connect them, is a fundamental task for many applications in natural language processing (NLP), such as personalized news systems, question answering and summarization. A notably challenging problem related to event processing is recognizing the relations that hold between events, in particular temporal and causal relations. Knowledge about such relations is necessary to build event timelines from text, and could be useful for future event prediction, risk analysis and decision-making support. While there has been some research on temporal relations, causality between events has hardly been touched from an NLP perspective, even though it has a long-standing tradition in psychology and formal linguistics. We propose an annotation scheme covering different types of causality between events, techniques for extracting such relations, and an investigation into the connection between temporal and causal relations. The latter will be the focus of this thesis work, because causality clearly has a temporal constraint. We claim that injecting this precondition may be beneficial for the recognition of both temporal and causal relations.

TRANSCRIPT

Page 1: PhD Research Proposal - Qualifying Exam

Extracting Temporal and Causal Relations between Events

Paramita

Under the supervision of Sara Tonelli

10 December 2013

Page 2

Overview

• Introduction to Event Extraction
• Event Relation Extraction
  – Problem Statements
  – State-of-the-Art
• Research Goals and Plan
  – Preliminary Result

Page 3

Information Extraction

Typhoon Haiyan, one of the most powerful typhoons ever recorded, slammed into the Philippines on Friday, setting off landslides, knocking out power in one entire province and cutting communications in the country's central region of island provinces.

What? Typhoon Haiyan
Where? The Philippines
When? Friday

Natural Language Text: unstructured

Knowledge Base: structured

Page 4

“A thing that happens or takes place, especially one of importance” (Oxford Dictionary)

A Philippine volcano, dormant for six centuries, exploded last Monday. During the eruption, lava, rocks and red-hot ash are spewed onto surrounding villages. The explosion claimed at least 30 lives.

Event Extraction
What is an event?


Annotation frameworks for events:
• TimeML
• ACE

event: “something that happens/occurs or a state that holds true”

Events and temporal expressions:
• TimeML
• ACE


dormant • arg-time: six centuries
exploded • arg-time: last Monday

temporal links: dormant ↔ six centuries, exploded ↔ last Monday, dormant ↔ exploded

Page 5

TempEval-3 (2013)

• Shared task on temporal and event processing
• Automatic identification of temporal expressions, events, and temporal relations within a text annotated with TimeML

Task                                                            F1       Precision  Recall
Task A – Temporal Expression                                    90.30%   93.09%     87.68%
Task B – Event Extraction                                       81.05%   81.44%     80.67%
Task ABC – Temporal Awareness                                   30.98%   34.08%     28.40%
Task C1 – Temporal Relations (identification + classification)  36.26%   37.32%     35.25%
Task C2 – Temporal Relations (only classification)              56.45%   55.58%     57.35%

Low performance on temporal relation extraction!
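As a quick sanity check on the scores above: F1 is the harmonic mean of precision and recall, so the three columns of the table can be verified against each other in a few lines of Python (values copied from the table):

```python
# F1 is the harmonic mean of precision and recall.
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) per task, from the TempEval-3 table.
scores = {
    "Task A":   (93.09, 87.68, 90.30),
    "Task B":   (81.44, 80.67, 81.05),
    "Task ABC": (34.08, 28.40, 30.98),
    "Task C1":  (37.32, 35.25, 36.26),
    "Task C2":  (55.58, 57.35, 56.45),
}

# Each reported F1 matches the harmonic mean up to rounding.
for task, (p, r, reported) in scores.items():
    assert abs(f1(p, r) - reported) < 0.05, task
```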

Page 6

Overview

• Introduction to Automatic Event Extraction
• Event Relation Extraction
  – Problem Statements
  – State-of-the-Art
• Research Goals and Plan
  – Preliminary Result

Page 7

The Relationship between Events

Typhoon Haiyan struck the eastern Philippines on Friday, killing thousands of people.

• Temporal relations (e.g. BEFORE, IS_INCLUDED): creating event timelines, multi-document summarization
• Causal relations (e.g. CAUSE): predicting future events, risk analysis, decision-making support

Temporal constraint of causality: cause BEFORE effect

Page 8

Research Questions

“Given a text annotated with events and time expressions, how to automatically extract temporal relations and causal relations between them?”

“Given the temporal constraint of causality, how to utilize the interaction between temporal relations and causal relations for building an integrated extraction system for both types of relations?”

Page 9

Temporal Relation Types: TimeML

• Based on Allen’s interval algebra (James F. Allen, 1983): a calculus for temporal reasoning, capturing 13 relations between two intervals

Allen’s Relation    TimeML Relation
X < Y , Y > X       X BEFORE Y , Y AFTER X
X m Y , Y mi X      X IBEFORE Y , Y IAFTER X
X o Y , Y oi X      X overlaps with Y
X s Y , Y si X      X BEGINS Y , Y BEGUN_BY X
X d Y , Y di X      X DURING Y , Y DURING_INV X (X IS_INCLUDED Y , Y INCLUDES X)
X f Y , Y fi X      X ENDS Y , Y ENDED_BY X
X = Y               X SIMULTANEOUS Y , X IDENTITY Y

[Interval illustrations of each relation omitted]
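The interval semantics behind the table can be sketched directly in code. This is a minimal, illustrative mapping (not part of the proposal) from two (start, end) intervals with start < end to Allen's one-letter relation names:

```python
# Allen relation of interval x with respect to interval y.
# Intervals are (start, end) pairs with start < end.
def allen_relation(x, y):
    (xs, xe), (ys, ye) = x, y
    if xe < ys:  return "<"     # X BEFORE Y
    if ye < xs:  return ">"     # X AFTER Y
    if xe == ys: return "m"     # X IBEFORE Y (meets)
    if ye == xs: return "mi"    # X IAFTER Y (met by)
    if xs == ys and xe == ye:
        return "="              # SIMULTANEOUS
    if xs == ys: return "s" if xe < ye else "si"   # BEGINS / BEGUN_BY
    if xe == ye: return "f" if xs > ys else "fi"   # ENDS / ENDED_BY
    if ys < xs and xe < ye: return "d"   # X DURING Y (X IS_INCLUDED Y)
    if xs < ys and ye < xe: return "di"  # Y DURING X (X INCLUDES Y)
    return "o" if xs < ys else "oi"      # overlaps / overlapped by
```

For example, `allen_relation((1, 3), (3, 5))` returns `"m"`: the first interval meets the second, i.e. X IBEFORE Y in TimeML terms.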

Page 10

Expressing Temporal Order

• Temporal anchoring
  – John drove back home for 20 minutes.
• Explicit temporal connectives
  – John went shopping before he drove back home.
• Implicit (and ambiguous) temporal connectives
  – John arrived at home. He parked the car and saw his son waiting at the front door.

Page 11

Temporal Relation Extraction

• Common approach dividing the task:
  – Identifying the pairs of entities having a temporal link
    • Often a simplified, rule-based approach:
      – Main events of consecutive sentences
      – Pairs of events in the same sentence
      – An event and a time expression in the same sentence
      – An event and the document creation time
  – Determining the relation types
    • Often regarded as a classification problem with a supervised learning approach: given an ordered pair of entities (e1, e2), the classifier has to assign a certain label (temporal relation type)
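The rule-based candidate identification above can be sketched as follows. The representation is an illustrative assumption, not the actual system: sentences are lists of event ids, time expressions are grouped per sentence, and the first event of a sentence stands in for its "main event".

```python
from itertools import combinations

# Generate candidate entity pairs for temporal linking, following the
# four simplified rules on the slide. All names here are illustrative.
def candidate_pairs(sentences, times, dct="t0"):
    pairs = set()
    for i, events in enumerate(sentences):
        # pairs of events in the same sentence
        pairs.update(combinations(events, 2))
        # an event and a time expression in the same sentence
        pairs.update((e, t) for e in events for t in times.get(i, []))
        # an event and the document creation time
        pairs.update((e, dct) for e in events)
        # main events of consecutive sentences (assumed: first event)
        if i + 1 < len(sentences) and events and sentences[i + 1]:
            pairs.add((events[0], sentences[i + 1][0]))
    return pairs
```

Each returned pair would then be passed to the relation-type classifier described in the second step.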

Page 12

TempEval-3 (2013)

• Shared task on temporal and event processing
• Automatic identification of temporal expressions, events, and temporal relations within a text annotated with TimeML

Task                                                            F1       Precision  Recall
Task A – Temporal Expression                                    90.30%   93.09%     87.68%
Task B – Event Extraction                                       81.05%   81.44%     80.67%
Task ABC – Temporal Awareness                                   30.98%   34.08%     28.40%
Task C1 – Temporal Relations (identification + classification)  36.26%   37.32%     35.25%
Task C2 – Temporal Relations (only classification)              56.45%   55.58%     57.35%

Low performance on temporal relation extraction!

Page 13

Modelling Causality

• Counterfactual Model (Lewis, 1973)
  – “C is the cause of E iff it holds true that if C had not occurred, E would not have occurred”
• Probabilistic Contrast Model (Cheng & Novick, 1991)
  – “C is the cause of E if covariation is positive”: ΔP = P(E|C) − P(E|¬C), the probability of E in the presence of C minus the probability of E in the absence of C
• Dynamics Model (Wolff & Song, 2003)

                                  CAUSE   ENABLE   PREVENT
  Patient tendency for result     N       Y        Y
  Affector-patient concordance    N       Y        N
  Occurrence of result            Y       Y        N
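The probabilistic contrast model lends itself to a tiny worked example. Here the covariation ΔP = P(E|C) − P(E|¬C) is estimated from a list of invented (C, E) observations:

```python
# Estimate dP = P(E | C) - P(E | not C) from (c, e) observations,
# where c and e are 1/0 flags for "C occurred" and "E occurred".
# The observation list below is invented for illustration.
def delta_p(observations):
    with_c = [e for c, e in observations if c]
    without_c = [e for c, e in observations if not c]
    p_e_given_c = sum(with_c) / len(with_c)
    p_e_given_not_c = sum(without_c) / len(without_c)
    return p_e_given_c - p_e_given_not_c

# E occurs in 3/4 cases with C and only 1/4 cases without C,
# so dP = 0.75 - 0.25 = 0.5 > 0: positive covariation.
obs = [(1, 1), (1, 1), (1, 1), (1, 0),
       (0, 0), (0, 0), (0, 0), (0, 1)]
assert delta_p(obs) == 0.5
```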

Page 14

Causal Relations: Language Resources

• Penn Discourse Treebank (PDTB) 2.0
  – Focuses on encoding discourse relations
  – “It was approved when a test showed some positive results, officials said.” CONTINGENCY:Cause:reason
• PropBank
  – Annotates verbal propositions and their arguments
  – “Five countries remained on that so-called priority watch list because of an interim review[ARGM-CAU].”
• SemEval 2007 Task 4 “Classification of Semantic Relations between Nominals”
  – Contains nominal causal relations as a subset
  – “The period of tumor shrinkage[e1] after radiation therapy[e2] is often long and varied.” Cause-Effect(e2,e1) = "true"

Page 15

Causal Relations between Events: Language Resources (2)

• Bethard et al. (2008)
  – 1000 conjoined event pairs (with conjunctive and) are manually annotated with BEFORE, AFTER, CAUSE, or NO-REL relations
  – Built a classification model using SVM (697 training pairs)
  – Causal relation extraction evaluation: F-score 37.4%
• Do et al. (2011)
  – Detection of causality between verb-verb, verb-noun, and noun-noun triggered event pairs, using PMI (based on the probabilistic contrast model)
  – Causal relation extraction evaluation: F-score 46.9%
• Riaz & Girju (2013)
  – Identification of causal relations between verbal events (with conjunctives because and but, for causal and non-causal resp.)
  – Resulting in a knowledge base containing 3 classes of causal association: strongly causal, ambiguous, strongly non-causal
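The PMI-based association used by Do et al. can be illustrated with a toy computation. The formula below is standard pointwise mutual information over document-level counts, not necessarily their exact variant, and the counts are invented:

```python
import math

# PMI(x, y) = log( P(x, y) / (P(x) * P(y)) ), with probabilities
# estimated from document counts. Positive PMI means x and y co-occur
# more often than chance, a weak signal of an event association.
def pmi(pair_count, x_count, y_count, n_docs):
    p_xy = pair_count / n_docs
    p_x = x_count / n_docs
    p_y = y_count / n_docs
    return math.log(p_xy / (p_x * p_y))

# Invented counts: "earthquake" and "tsunami" each appear in 100 of
# 10,000 documents, and together in 50 -> PMI = log(50) > 0.
score = pmi(pair_count=50, x_count=100, y_count=100, n_docs=10000)
```

A full system would combine such association scores with other evidence, since high co-occurrence alone does not distinguish causal from merely correlated event pairs.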

Page 16

Causal Relation Extraction

• No standard benchmarking corpus for evaluating event causality extraction

• Causal relations in TimeML?
  – “The rains[e1] caused[e2] the flooding[e3].”
  – IDENTITY (e1, e2), BEFORE (e1, e3)

Page 17

Temporal and Causal: the Interaction

• Temporal constraint of causal relations: the cause happened BEFORE the effect
• Bethard et al. (2008), corpus analysis:
  – 32% of CAUSAL relations in the corpus did not have an underlying BEFORE relation
  – “The walls were shaking because of the earthquake."
• Rink et al. (2010) use temporal relations as a feature in a classification model for causal relations
  – Causal relation extraction evaluation: F-score 57.9%

Page 18

Overview

• Introduction to Automatic Event Extraction
• Event Relation Extraction
  – Problem Statements
  – State-of-the-Art
• Research Goals and Plan
  – Preliminary Result

Page 19

Research Objectives & Time Plan

1. Temporal Relation Extraction
   – Finding ways to improve the current state-of-the-art performance on temporal relation extraction: 1st year
2. Causal Relation Extraction
   – Creating a standard benchmarking corpus for evaluating causal relation extraction: 2nd year, 4 months
   – Building an automatic extraction system for event causality: 2nd year, 8 months
3. Integrated Event Relation Extraction
   – Utilizing the interaction between temporal and causal relations to build an integrated system for both types of relations: 3rd year, 8 months

Page 20

Temporal Relation Extraction Preliminary Result

• Temporal Relation Classification: “Given a pair of entities (event-event, event-timex or timex-timex*), the classifier has to assign a certain label (temporal relation type).”
  *) timex-timex pairs are excluded because there are too few of them in the dataset
  – Supervised classification approach
  – Support Vector Machines (SVM) algorithm
  – Feature engineering: event attributes, temporal signals, event duration, temporal connectives (disambiguation), etc.
  – Bootstrapping the training data: inverse relations and closure

• TempEval-3 task evaluation setup

*) Paper submitted to EACL 2014

System     F-score   Precision   Recall
TRelPro*   58.48%    58.80%      58.17%
UTTime     56.45%    55.58%      57.35%
NavyTime   46.83%    46.59%      47.07%
JU-CSE     34.77%    35.07%      34.48%
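The "inverse relations and closure" bootstrapping can be sketched as follows. The TLINK triples and the restriction to a few relation types are illustrative simplifications, not the actual TRelPro implementation:

```python
# Inverse counterparts of a few TimeML relation types (subset only).
INVERSE = {"BEFORE": "AFTER", "AFTER": "BEFORE",
           "INCLUDES": "IS_INCLUDED", "IS_INCLUDED": "INCLUDES",
           "SIMULTANEOUS": "SIMULTANEOUS"}

# Expand a set of (entity, relation, entity) TLINK triples by adding
# inverse relations, then the transitive closure of BEFORE.
def bootstrap(tlinks):
    links = set(tlinks)
    # inverse relations
    links |= {(b, INVERSE[rel], a) for a, rel, b in links if rel in INVERSE}
    # transitive closure: a BEFORE b and b BEFORE c implies a BEFORE c
    changed = True
    while changed:
        changed = False
        before = {(a, b) for a, rel, b in links if rel == "BEFORE"}
        for a, b in before:
            for c, d in before:
                if b == c and (a, "BEFORE", d) not in links:
                    links.add((a, "BEFORE", d))
                    links.add((d, "AFTER", a))
                    changed = True
    return links
```

Starting from {e1 BEFORE e2, e2 BEFORE e3}, this yields the inferred pairs e1 BEFORE e3 and e3 AFTER e1 as extra training instances.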

Page 21

Temporal Relation Extraction (2): Preliminary Result

• TempEval-3 test data annotated by TRelPro

Can it be improved by including causality as a feature?

Page 22

Causal Relation Extraction

• Create an annotation format for causal relations based on TimeML, in order to have a unified annotation scheme for both temporal and causal relations
  – Take the same definitions of events and time expressions
  – Introduce CLINK tags, in addition to the TimeML TLINK tags for temporal relations
• Map existing resources (e.g. PDTB, PropBank, SemEval-2007 Task 4 nominal causal corpus) to the newly created annotation scheme
• Build a causal relation extraction system
  – Consider a similar approach (and features) as for the temporal relation extraction system
  – New features relevant for causality extraction: causal signals/connectives, lexical information (WordNet, VerbOcean)

Page 23

Expressing Causality

• Affect verbs (affect, influence, determine, change)
  – Age influences cancer spread in mice.
• Link verbs (linked to, led to, depends on)
  – The earthquake was linked to a tsunami in Japan.
• Causal conjunctives
  – She fell because she sat on a broken chair.
  – John drank a lot of coffee. Consequently, he stayed awake all night. (conjunctive adverb)
  – I will go around the world if I win the lottery. (conditional)
  – She stopped the car when she saw the runaway goose. (temporal)
  – Ralph broke the car and his father went ballistic. (coordinating)
• Causal prepositions
  – He likely died because of a heart attack.
  – She was tired from running around all day.
• Periphrastic causative verbs
  – The earthquake prompts people to stay out of buildings. (CAUSE)
  – The pole restrains the tent from collapsing. (PREVENT)
  – The oxygen lets the fire get bigger. (ENABLE)

Ambiguous!
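A signal-based detector over markers like those above might start as a simple pattern matcher. The signal list here is a small illustrative subset; as the slide stresses, many of these markers are ambiguous, so a real system would still need to classify each match rather than treat it as causal outright:

```python
import re

# A few causal signal patterns (illustrative subset, not exhaustive).
# Multi-word signals come first so "because of" wins over "because".
CAUSAL_SIGNALS = [
    r"\bbecause of\b", r"\bbecause\b", r"\bconsequently\b",
    r"\bled to\b", r"\blinked to\b", r"\bas a result\b",
]
PATTERN = re.compile("|".join(CAUSAL_SIGNALS), re.IGNORECASE)

# Return the (lowercased) causal signals matched in a sentence.
def causal_signal_spans(sentence):
    return [m.group(0).lower() for m in PATTERN.finditer(sentence)]
```

For instance, `causal_signal_spans("He likely died because of a heart attack.")` flags the preposition "because of" as a candidate causal signal.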

Page 24

Integrated Temporal & Causal Relation System

[Pipeline diagram: temporal expression and event extraction feed a temporal relation classification component and an explicit causal relation classification component, whose outputs are combined in a joint temporal & causal relation classification step.]

Page 25

Thank you!

Paramita closes the presentation and the question-answering session starts. (annotated with CAUSE and BEGINS relations)

Page 27

Expressing Causality: Implicit

• Lexical causatives– John broke the clock.

• Resultatives– John hammered the metal flat.

• Implicit– Max switched off the light. The room became pitch dark.