Page 1:

Temporal Information Extraction
Inderjeet Mani
imani@mitre.org

Page 2:

Outline

• Introduction
• Linguistic Theories
• AI Theories
• Annotation Schemes
• Rule-based and machine-learning methods
• Challenges
• Links

Page 3:

Motivation: Question-Answering

• When is Ramadan this year?
• What was the largest U.S. military operation since Vietnam?
• Tell me the best time of the year to go cherry-picking.
• How often do you feed a pet gerbil?
• Is Gates currently CEO of Microsoft?
• Did the Enron merger with Dynegy take place?
• How long did the hostage situation in Beirut last?
• What is the current unemployment rate?
• How many Iraqi civilian casualties were there in the first week of the U.S. invasion of Iraq?
• Who was Secretary of Defense during the Gulf War?

Page 4:

Motivation: Coherent and Faithful Summaries

• Single-document sentence-extraction summarizers are plagued by dangling references
  – especially temporal ones
• Multi-document summarizers can be misled by the weakness of vocabulary-overlap methods
  – leads to inappropriate merging of distinct events

Example fragments with dangling temporal references:
..worked in recent summers..
..was the source of the virus last week..
..where Morris was a computer science undergraduate until June..
..whose virus program three years ago disrupted..

Page 5:

An Example Story

Feb. 18, 2004
Yesterday Holly was running a marathon when she twisted her ankle. David had pushed her.

[Timeline diagram: the run and the push fall during 02-17-2004; the twisting finishes the run; the push is before the twisting; 02-18-2004 is the document date.]

1. When did the running occur? Yesterday.
2. When did the twisting occur? Yesterday, during the running.
3. Did the pushing occur before the twisting? Yes.
4. Did Holly keep running after twisting her ankle? Probably not.

Page 6:

Temporal Information Extraction Problem

• Input: a natural language discourse
• Output: a representation of events and their temporal relations

Feb. 18, 2004
Yesterday Holly was running a marathon when she twisted her ankle. David had pushed her.

Page 7:

IE Methodology

[Diagram: annotation pipeline. A raw corpus is run through an initial tagger and corrected in an annotation editor, following annotation guidelines, to yield an annotated corpus. A machine learning program trained on the annotated corpus produces learned rules, which are then applied to new raw corpora to produce annotated corpora.]

Idea for temporal IE: make progress by focusing on a particular top-down slice (i.e., time), using its rich structure.

Page 8:

Theories

[Diagram: events and time, as expressed in language, are treated both by AI & logic and by formal linguistics.]

Page 9:

Linguistic Theories

• Events
  – Event structure (event subclasses and parts)
  – Tense (indicates location of event in time, via verb inflections, modals, auxiliaries, etc.)
  – Grammatical aspect (indicates whether event is ongoing, finished, completed)
• Time adverbials
• Relations between events and/or times
  – temporal relations
  – we will also need discourse relations

Page 10:

Tense

• All languages that have tense (in the semantic sense of locating events in time) can express location in time
• Location can be expressed relative to a deictic center that is the current 'moment' of speech ('speech time' or 'speech point')
  – e.g., tomorrow, yesterday, etc.
• Languages can also express temporal locations relative to a coordinate system
  – a calendar, e.g., 1991 (A.D.)
  – a cyclically occurring event, e.g., morning, spring
  – an arbitrary event, e.g., the day after he married her
• A language may have tense in the above semantic sense without expressing it using tense morphemes
  – Instead, aspectual morphemes and/or modals and auxiliaries may be used.

Page 11:

Mandarin Chinese

• Has semantic tense
• Lacks tense morphemes
• Instead, it uses 'aspect' markers to indicate whether an event is ongoing (-zai, -le), completed (-wan), terminated (-le, -guo), or in a result state (-zhe)
  – But aspect markers are often absent

我 看 电视
wo kan dianshi
'I watch / will watch / watched TV'*

*Example from Congmin Min, MS Thesis, Georgetown, 2005.

Page 12:

Burmese*

• No semantic tense, but all languages that lack semantic tense have a realis/irrealis distinction.
• Events that are ongoing or that were observed in the past are expressed by the sentence-final realis particles -te, -tha, -ta, and -hta.
• For unreal or hypothetical events (including future, present, and hypothetical past events), the sentence-final irrealis particles -me, -ma, and -hma are used.

*Comrie, B. Tense. Cambridge, 1985.

Page 13:

Tense as Anaphor: Reichenbach

• A formal method for representing tense, based on which one can locate events in time
• Tensed utterances introduce references to 3 'time points':
  – Speech Time: S
  – Event Time: E
  – Reference Time: R

[I had mailed the letter]E [when John came & told me the news]R, spoken at S: E < R < S

• Three temporal relations are defined on these time points: at, before, after
• 13 different relations are possible

N.B. the concept of 'time point' is an abstraction; it can map to an interval.

[Timeline: E, then R, then S.]

Page 14:

Reichenbachian Tense Analysis

• Tense is determined by the relation between R and S: R=S, R<S, R>S
• Aspect is determined by the relation between E and R: E=R, E<R, E>R
• The relation of E relative to S is not crucial
  – Represent R<S=E as E>R<S (and the unordered configurations E<R>S and E>R<S abbreviate the disjunctions below)
• Only 7 of the 13 relations are realized in English
  – 6 different forms, simple future being ambiguous
  – The progressive is no different from the simple tenses: I was eating a peach vs. I ate a peach

  Relation(s)           Reichenbach's name  English tense name  Example
  E<R<S                 Anterior past       Past perfect        I had slept
  E=R<S                 Simple past         Simple past         I slept
  R<E<S, R<S=E, R<S<E   Posterior past      --                  I would sleep
  E<S=R                 Anterior present    Present perfect     I have slept
  S=R=E                 Simple present      Simple present      I sleep
  S=R<E                 Posterior present   Simple future       I will sleep (Je vais dormir)
  S<E<R, S=E<R, E<S<R   Anterior future     Future perfect      I will have slept
  S<R=E                 Simple future       Simple future       I will sleep (Je dormirai)
  S<R<E                 Posterior future    --                  I shall be going to sleep
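The two determining relations (R vs. S for tense, E vs. R for aspect) can be written as a small lookup table. The following is a minimal Python sketch; the string encoding of the relations and the dictionary layout are illustrative choices, not part of Reichenbach's formulation.

```python
# Reichenbach tense names keyed by (R vs S, E vs R), following the
# table above. The relation strings are an assumption of this sketch.

REICHENBACH = {
    ("R<S", "E<R"): "anterior past",      # I had slept
    ("R<S", "E=R"): "simple past",        # I slept
    ("R<S", "E>R"): "posterior past",     # I would sleep
    ("R=S", "E<R"): "anterior present",   # I have slept
    ("R=S", "E=R"): "simple present",     # I sleep
    ("R=S", "E>R"): "posterior present",  # I will sleep
    ("R>S", "E<R"): "anterior future",    # I will have slept
    ("R>S", "E=R"): "simple future",      # I will sleep
    ("R>S", "E>R"): "posterior future",   # I shall be going to sleep
}

def tense_name(r_vs_s, e_vs_r):
    """Name the tense for a configuration of reference, event and speech time."""
    return REICHENBACH[(r_vs_s, e_vs_r)]
```

Note that the English ambiguity the slide mentions shows up here: the posterior-present and simple-future cells both surface as "I will sleep".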

Page 15:

Priorean Tense Logic

• Operators (for a formula φ):
  – Gφ: It is always going to be the case that φ.
  – Hφ: It always has been the case that φ.
  – Fφ: It will at some point in the future be the case that φ.
  – Pφ: It was at some point in the past the case that φ.

  F = ¬G¬, P = ¬H¬

• System Kt:
  (a) φ → HFφ: What is, has always been going to be;
  (b) φ → GPφ: What is, will always have been;
  (c) H(φ → ψ) → (Hφ → Hψ): Whatever always follows from what always has been, always has been;
  (d) G(φ → ψ) → (Gφ → Gψ): Whatever always follows from what always will be, always will be.

Page 16:

Tense as Operator: Prior

  Relation(s)           Reichenbach's name  Prior  English tense name  Example
  E<R<S                 Anterior past       PP     Past perfect        I had slept
  E=R<S                 Simple past         P      Simple past         I slept
  R<E<S, R<S=E, R<S<E   Posterior past      PF     --                  I would sleep
  E<S=R                 Anterior present    P      Present perfect     I have slept
  S=R=E                 Simple present      --     Simple present      I sleep
  S=R<E                 Posterior present   F      Simple future       I will sleep (Je vais dormir)
  S<E<R, S=E<R, E<S<R   Anterior future     FP     Future perfect      I will have slept
  S<R=E                 Simple future       F      Simple future       I will sleep (Je dormirai)
  S<R<E                 Posterior future    FF     --                  I shall be going to sleep

• Free iteration captures many more tenses, e.g., I would have slept: PFP
• But it also expresses many non-NL tenses, e.g., PPPP: [It was the case]^4 John had slept

Page 17:

Event Classes (Lexical Aspect)

• STATIVES: know, sit, be clever, be happy, killing, accident
  – can refer to the state itself (ingressive) John knows, or to entry into a state (inceptive) John realizes
  – *John is knowing Bill, *Know the answer, *What John did was know the answer
• ACTIVITIES: walk, run, talk, march, paint
  – if one occurs in period t, a part of it (the same activity) must occur for most sub-periods of t
  – X is Ving entails that X has Ved
  – John ran for an hour, *John ran in an hour
• ACCOMPLISHMENTS: build, cook, destroy
  – culminate (telic)
  – X is Ving does not entail that X has Ved
  – John booked a flight in an hour, John stopped building a house
• ACHIEVEMENTS: notice, win, blink, find, reach
  – instantaneous accomplishments
  – *John dies for an hour, *John wins for an hour, *John stopped reaching New York

                  Telic  Dynamic  Durative  E.g.
  Stative          -      -        +        know, have
  Activity         -      +        +        walk, paint
  Accomplishment   +      +        +        destroy, build
  Achievement      +      +        -        notice, win

Page 18:

Aspectual Composition

• Expressions of one class can be transformed into one of another class by combining with another expression.
  – e.g., an activity can be changed into an accomplishment by adding an adverbial phrase expressing temporal or spatial extent
  – I walked (activity)
  – I walked to the station / a mile / home (accomplishment)
  – I built my house (accomplishment)
  – I built my house for an hour (activity)
• Moens & Steedman (1988) implement aspectual composition in a transition network
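The class shifts above can be sketched as a tiny transition table. This is a simplification for illustration, not Moens & Steedman's full network; the class and modifier labels are invented for this sketch.

```python
# Aspectual composition as class transitions (after the idea in Moens &
# Steedman 1988). Only the two shifts illustrated on the slide are
# encoded; the label vocabulary is an assumption of this sketch.

TRANSITIONS = {
    # (aspectual class, modifier type) -> resulting class
    ("activity", "goal_phrase"): "accomplishment",    # walk -> walk to the station
    ("accomplishment", "for_adverbial"): "activity",  # build my house -> ... for an hour
}

def compose(aspect_class, modifier):
    """Return the aspectual class after combining with a modifier,
    or the original class if no transition applies."""
    return TRANSITIONS.get((aspect_class, modifier), aspect_class)
```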

Page 19:

Example: Classifying Question Verbs

• Androutsopoulos's (2002) NLITDB system allows users to pose temporal questions in English to an airport database that uses a temporal extension of SQL
• Verbs in single-clause questions with non-future meanings are treated as states
  – Does any tank contain oil?
• Some verbs may be ambiguous between a (habitual) state and an accomplishment
  – Which flight lands on runway 2?
  – Does flight BA737 land on runway 2 this afternoon?
• Activities are distinguished using the imperfective paradox:
  – Were any flights taxiing? implies that they taxied
  – Were any flights taxiing to gate 2? does not imply that they taxied
• So, taxi will be given
  – an activity verb sense, one that doesn't expect a destination argument, and
  – an accomplishment verb sense, one that expects a destination argument.

Page 20:

Grammatical Aspect

• Perfective: focus on the situation as a whole
  – John built a house
• Imperfective: focus on internal phases of the situation
  – John was building a house

             Perfective                               Imperfective
  English    verbal tense and aspect morphemes,       progressive verbal inflection -ing
             e.g., for present and past perfect
  French     tense (passé composé)                    tense (imparfait)
  Mandarin   morphemes -le and -guo                   progressive morpheme -zai and
                                                      resultative morpheme -zhe

Page 21:

Inferring Temporal Relations

1. Yesterday Holly was running a marathon when she twisted her ankle. (FINISHES) David had pushed her. (BEFORE)
2. I had mailed the letter when John came & told me the news. (AFTER)
3. Simpson made the call at 3. Later, he was spotted driving towards Westwood. (AFTER)
4. Max entered the room. Mary stood up / was seated on the desk. (AFTER / OVERLAP)
5. Max stood up. John greeted him. (AFTER)
6. Max fell. John pushed him. (BEFORE)
7. Boutros-Ghali Sunday opened a meeting in Nairobi of .... He arrived in Nairobi from South Africa. (BEFORE)
8. John bought Mary some flowers. He picked out three red roses. (DURING)

Page 22:

Linguistic Information Needed for Temporal IE

• Events
• Tense
• Aspect
• Time adverbials
• Explicit temporal signals (before, since, at, etc.)
• Discourse modeling
  – for disambiguation of time expressions based on context
  – for tracking sequences of events (tense/aspect shifts)
  – for computing discourse relations
• Commonsense knowledge
  – for inferring discourse relations
  – for inferring event durations

Page 23:

Narrative Ordering

• Temporal Discourse Interpretation Principle (Dowty 1979)
  – The reference time for the current sentence is a time consistent with its time adverbials, if any; otherwise it immediately follows the reference time of the previous sentence.
  – The overlap of statives is a pragmatic inference (hinting at a theory of defaults)
• A man entered the White Hart. He was wearing a black jacket. Bill served him a beer.
• Discourse Representation Theory (Kamp and Reyle 1993)
  – In successive past-tense sentences which lack temporal adverbials, events advance the narrative forward, while states do not.
  – Overlapping statives come out of semantic inference rules
• Neither theory explicitly represents discourse relations, though they are needed (e.g., 6-8 above)

Page 24:

Discourse Representation Theory (example)

A man entered the White Hart. He was wearing a black jacket. Bill served him a beer.

Rpt := {}
e1, t1, x, y: enter(e1, x, y), man(x), y = the White Hart, t1 < n, e1 ⊆ t1; Rpt := e1
e2, t2, x1, y1: PROG(wear(e2, x1, y1)), black-jacket(y1), x1 = x, t2 < n, e2 ○ t2, e1 ⊆ e2
e3, t3, x2, y2, z: serve(e3, x2, y2, z), beer(z), x2 = Bill, y2 = x, t3 < n, e3 ⊆ t3, e1 < e3; Rpt := e3
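The reference-point bookkeeping in the derivation above can be sketched procedurally: events advance the reference point, states overlap it. A minimal Python sketch, where the event/state labels and the relation vocabulary are illustrative assumptions, not DRT notation:

```python
# Sketch of the DRT-style narrative ordering rule: in successive
# past-tense sentences without time adverbials, events advance the
# reference point while states overlap it.

def order_narrative(clauses):
    """clauses: list of (description, 'event' | 'state') pairs.
    Returns (description, relation to current reference point) pairs."""
    ordered, rpt = [], None
    for desc, kind in clauses:
        if rpt is None:
            ordered.append((desc, "starts narrative"))
        elif kind == "event":
            ordered.append((desc, "after " + rpt))
        else:
            ordered.append((desc, "overlaps " + rpt))
        if kind == "event":
            rpt = desc  # only events move the reference point forward
    return ordered

# The White Hart example: enter (event), wear jacket (state), serve beer (event)
story = [("enter", "event"), ("wear jacket", "state"), ("serve beer", "event")]
```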

Page 25:

Overriding Defaults

• Lascarides and Asher (1993)*: temporal ordering is derived entirely from discourse relations (which link together DRSs, based on the SDRT formalism).
• Example
  – Max switched off the light. The room was pitch dark.
  – Default inference: OVERLAP
  – Use an inference rule: if the room is dark and the light was just switched off, the switching off caused the room to become dark.
  – Inference: AFTER
• Problem: requires large doses of world knowledge

*Lascarides & Asher 1993

Page 26:

Outline

• Introduction
• Linguistic Theories
• AI Theories
• Annotation Schemes
• Rule-based and machine-learning methods
• Challenges
• Links

Page 27:

Time and Events in Logic

[Diagram: the design space of logical ontologies. Events can be reduced to times, times can be reduced to events, or both can be taken as primitive; and time itself can be modeled with instants, with intervals, or with both.]

Page 28:

Instant Ontology

• Consider the event of John's reading the book
• Decompose it into an infinite set of infinitesimal instants
• Let T be a set of temporal instants
• Let < (BEFORE) be a temporal ordering relation between instants
• Properties: irreflexive, antisymmetric, transitive, and complete
  – Antisymmetric => time has only one direction of movement
  – Irreflexive and transitive => time is non-cyclical
  – Complete => < is a total ordering

Page 29:

Instants -- Problem Where Truth Values Change

• P = The race is on
• T-R = the time of running the race
• T-AR = the time after running the race
• T-R and T-AR have to meet somewhere
• If we choose instants, there is some instant x where T-R and T-AR meet
• Either P and not-P are both true at x, or there is a truth-value gap at x
• This is called the Divided Instant Problem (D.I.P.)

[Diagram: T-R, where P holds, and T-AR, where not-P holds, meet at instant x.]

Page 30:

Ordering Relations on Intervals

• Unlike instants, where we have only <, we can have at least 3 ordering relations on intervals– Precedence <: I1 < I2 iff t1 I1, t2 I2,

t1 < t2 (where < is defined over instants)– Temporal Overlap O: I1 O I2 iff I1 I2

– Temporal Inclusion : I1 I2 iff I1 I2
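The three relations can be checked directly once intervals are given endpoints. A minimal Python sketch, where representing intervals as closed numeric (start, end) pairs is an assumption made for illustration:

```python
# The three interval ordering relations, for intervals given as
# (start, end) pairs of numbers with start <= end.

def precedes(i1, i2):
    """I1 < I2: every instant of I1 is before every instant of I2."""
    return i1[1] < i2[0]

def overlaps(i1, i2):
    """I1 O I2: the intervals share at least one instant."""
    return i1[0] <= i2[1] and i2[0] <= i1[1]

def includes(i2, i1):
    """I1 is temporally included in I2: I1 is a subinterval of I2."""
    return i2[0] <= i1[0] and i1[1] <= i2[1]
```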

Page 31:

Instants versus Intervals

• Instants
  – We understand the idea of truth at an instant
  – In cases of continuous change, e.g., a tossed ball, we need a notion of a durationless event in order to explain the trajectory of the ball just before it falls
• Intervals
  – We often conceive of time as broken up in terms of events which have a certain duration, rather than as an (infinite) sequence of durationless instants
  – Many verbs do not describe instantaneous events, e.g., has read, ripened
  – Duration expressions like yesterday afternoon aren't construed as instants

Page 32:

Allen’s Interval-Based Ontology*

• Instants are banished
  – So it avoids the divided instant problem
• Short-duration intervals will be instant-like
• Uses 13 relations
  – The relations are mutually exclusive
• All 13 relations can be expressed using meet:
  ∀X∀Y [Before(X, Y) ↔ ∃Z [meet(X, Z) ∧ meet(Z, Y)]]

*James F. Allen, 'Towards a General Theory of Action and Time', Artificial Intelligence 23 (1984): 123-54.

Page 33:

Allen's 13 Temporal Relations

[Diagram of interval pairs A and B illustrating each relation:]
• A is BEFORE B / B is AFTER A (<, >)
• A MEETS B / B is MET by A (m, mi)
• A OVERLAPS B / B is OVERLAPPED by A (o, oi)
• A STARTS B / B is STARTED by A (s, si)
• A DURING B / B CONTAINS A (d, di)
• A FINISHES B / B is FINISHED by A (f, fi)
• A is EQUAL to B (=)
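Given intervals with concrete endpoints, the relation that holds can be computed by comparing endpoints. A minimal Python sketch; the (start, end) numeric encoding and the relation names' spelling are illustrative assumptions:

```python
# Name the Allen relation holding between two intervals, each given as
# a (start, end) pair with start < end. The 13 cases are mutually
# exclusive, so the checks can be ordered as below.

def allen_relation(a, b):
    """Return which of Allen's 13 relations holds between a and b."""
    a1, a2 = a
    b1, b2 = b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equal"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"
```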

Page 34:

[Excerpt of Allen's transitivity (composition) table: given A r1 B and B r2 C, each cell gives the set of relations possible between A and C. For example, < composed with < yields <; < composed with d yields {<, o, m, d, s}; and some cells, such as < composed with >, are fully unconstrained (?).]

Page 35:

Temporal Closure: Sputlink* in TANGO

*Verhagen (2005)
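Temporal closure of the SputLink kind can be sketched as repeated composition of known relations until nothing new is inferable. The following Python sketch encodes only three composition rules for illustration; a full closure engine uses Allen's complete transitivity table and handles disjunctive cells.

```python
# Toy temporal closure: repeatedly compose pairs of known relations
# using a (tiny, illustrative) composition table until a fixpoint.

COMPOSE = {
    # (r1, r2) -> composed relation, when the result is a single relation
    ("before", "before"): "before",
    ("during", "before"): "before",
    ("finishes", "before"): "before",
}

def close(relations):
    """relations: dict mapping (x, y) event pairs to a relation name.
    Returns a copy extended with every relation derivable by composition."""
    relations = dict(relations)
    changed = True
    while changed:
        changed = False
        for (x, y1), r1 in list(relations.items()):
            for (y2, z), r2 in list(relations.items()):
                if y1 == y2 and x != z and (x, z) not in relations:
                    inferred = COMPOSE.get((r1, r2))
                    if inferred:
                        relations[(x, z)] = inferred
                        changed = True
    return relations
```

For instance, from twist DURING run and run BEFORE some later event, the closure adds twist BEFORE that event.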

Page 36:

AI Reasoning about Events

• John gave a book to Mary

Situation Calculus:
  Holds(Have(John, book), t1)
  Holds(Have(Mary, book), t2)
  Holds(Have(Z, Y), Result(give(X, Y, Z), t))
  – the t_i are states
• Concurrent actions cannot be represented
• No duration of actions or delayed effects

Event Calculus:
  HoldsAt(Have(J, B), t1)
  HoldsAt(Have(M, B), t2)
  Terminates(e1, Have(J, B))
  Initiates(e1, Have(M, B))
  Happens(e, t) [t is a time point]
• Involves non-monotonic reasoning
• Handles the frame problem using circumscription
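The Event Calculus reading of the example can be sketched as a toy inference: a fluent holds at t if some earlier event initiated it and no intervening event terminated it. This Python sketch ignores circumscription and default reasoning, so it is a simplification of the calculus, not a full implementation; the dictionary encodings are assumptions made here.

```python
# Toy Event Calculus: HoldsAt(fluent, t) iff some event initiated the
# fluent before t and no fluent-terminating event occurred in between.

def holds_at(fluent, t, happens, initiates, terminates):
    """happens: {event: time}; initiates/terminates: {event: fluent}."""
    started = [happens[e] for e, f in initiates.items()
               if f == fluent and happens[e] < t]
    if not started:
        return False
    last_start = max(started)
    return not any(last_start <= happens[e] < t
                   for e, f in terminates.items() if f == fluent)

# John gives the book to Mary at t=2 (event names are illustrative)
happens = {"e1": 2}
initiates = {"e1": "have(Mary, book)"}
terminates = {"e1": "have(John, book)"}
```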

Page 37:

Temporal Question-Answering using IE + Event Calculus

• Mueller (2004)*: takes instantiated MUC terrorist event templates and represents the information in EC
• Adds commonsense knowledge about the terrorist domain
  – e.g., if a bomb explodes, it's no longer activated
• Commonsense knowledge includes frame axioms
  – e.g., if an object starts falling, then its height will be released from the commonsense law of inertia
• Example temporal questions
  – Was the car dealership damaged before the high-power bombs exploded? Ans: No.
    • Requires reasoning that the damage did not occur at all times t prior to the explosion
• Problem: requires large doses of world knowledge

*Mueller, Erik T. (2004). Understanding script-based stories using commonsense reasoning. Cognitive Systems Research, 5(4), 307-340.

Page 38:

Temporal Question Answering using IE + Temporal Databases

• In NLITDB, the semantic relation between a question event and the adverbial it combines with is inferred by a variety of inference rules.
• State + 'point' adverbial
  – Which flight was queueing for runway 2 at 5:00 pm?
    • The state is coerced to an achievement, viewed as holding at the time specified by the adverbial.
• Activity + point adverbial
  – Can mean that the activity holds at that time, or that the activity starts at that time, e.g., Which flight queued for runway 2 at 5:00 pm?
• An accomplishment may indicate inception or termination
  – Which flight taxied to gate 4 at 5:00 pm? can mean the taxiing starts or ends at 5 pm.

Page 39:

Outline

• Introduction
• Linguistic Theories
• AI Theories
• Annotation Schemes
• Rule-based and machine-learning methods
• Challenges
• Links

Page 40:

IE Methodology

[Diagram (repeated from Page 7): annotation pipeline. A raw corpus is run through an initial tagger and corrected in an annotation editor, following annotation guidelines, to yield an annotated corpus. A machine learning program trained on the annotated corpus produces learned rules, which are then applied to new raw corpora to produce annotated corpora.]

Page 41:

Events in NLP

• Topic: well-defined subject for searching
  – document- or collection-level
• Template: structure with slots for participant named entities
  – document-level
• Mention: linguistic expression that expresses an underlying event
  – phrase-level (verb/noun)

Page 42:

Event Characteristics

• Can have temporal and/or spatial locations
• Can have types
  – assassinations, bombings, joint ventures, etc.
• Can have members
• Can have parts
• Can have people and/or other objects as participants
• Can be hypothetical
• Can have not happened

Page 43:

MUC Event Templates

Wall Street Journal, 06/15/88: MAXICARE HEALTH PLANS INC and UNIVERSAL HEALTH SERVICES INC have dissolved a joint venture which provided health services.

Page 44:

ACE Event Templates

• Four additional attributes for each event mention
  – Polarity (it did or did not occur)
  – Tense (past, present, future)
  – Modality (real vs. hypothetical)
  – Genericity (specific vs. generic)
• Argument slots (4-7) specific to each event
  – e.g., the Trial-Hearing event has slots for the Defendant, Prosecutor, Adjudicator, Crime, Time, and Place.

  Type         Subtypes
  Life         Be-Born, Marry, Divorce, Injure, Die
  Movement     Transport
  Transaction  Transfer-Ownership, Transfer-Money
  Business     Start-Org, Merge-Org, Declare-Bankruptcy, End-Org
  Conflict     Attack, Demonstrate
  Contact      Meet, Phone-Write
  Personnel    Start-Position, End-Position, Nominate, Elect
  Justice      Arrest-Jail, Release-Parole, Trial-Hearing, Charge-Indict, Sue, Convict, Sentence, Fine, Execute, Extradite, Acquit, Appeal, Pardon

From Lisa Ferro @ MITRE

Page 45:

Mention-Level Events

• Event expressions:
  – tensed verbs: has left, was captured, will resign
  – stative adjectives: sunken, stalled, on board
  – event nominals: merger, Military Operation, war
• Dependencies between events and times:
  – Anchoring: John left on Monday.
  – Orderings: The party happened after midnight.
  – Embedding: John said Mary left.

Page 46:

TIMEX2 (TIDES/ACE) Annotation Scheme

Time points:
  <TIMEX2 VAL="2000-W42">the third week of October</TIMEX2>
Durations:
  <TIMEX2 VAL="PT30M">half an hour long</TIMEX2>
Indexicality:
  <TIMEX2 VAL="2000-10-04">tomorrow</TIMEX2>
  He wrapped up a <TIMEX2 VAL="PT3H" ANCHOR_DIR="WITHIN" ANCHOR_VAL="1999-07-15">three-hour</TIMEX2> meeting with the Iraqi president in Baghdad <TIMEX2 VAL="1999-07-15">today</TIMEX2>.
Sets:
  <TIMEX2 VAL="XXXX-WXX-2" SET="YES" PERIODICITY="F1W" GRANULARITY="G1D">every Tuesday</TIMEX2>
Fuzziness:
  <TIMEX2 VAL="1990-SU">Summer of 1990</TIMEX2>
  <TIMEX2 VAL="1999-07-15TMO">This morning</TIMEX2>
  <TIMEX2 VAL="2000-10-31TNI" MOD="START">early last night</TIMEX2>
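The indexicality examples above involve resolving a deictic expression against the document date. A minimal Python sketch of that normalization step; the three-word lexicon is an illustrative assumption, and real TIMEX2 normalizers handle far more than day offsets.

```python
# Sketch: resolving 'yesterday' / 'today' / 'tomorrow' to an ISO
# TIMEX2-style VAL string, anchored at the document's date.

from datetime import date, timedelta

OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}

def normalize(expression, doc_date):
    """Return the ISO date VAL for a deictic day expression."""
    delta = OFFSETS[expression.lower()]
    return (doc_date + timedelta(days=delta)).isoformat()
```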

Page 47:

TIMEX2 Inter-annotator Agreement

• Georgetown/MITRE (2001)
  – 193 English docs, .79 F extent, .86 F VAL
  – 5 annotators
  – Annotators deviate from guidelines and produce systematic errors (fatigue?)
    • several years ago: PXY instead of PAST_REF
    • all day: P1D instead of YYYY-MM-DD
• LDC (2004)
  – 49 English docs, .85 F extent, .80 F VAL
  – 19 Chinese docs, .83 extent
  – 2 annotators

Page 48:

Example of Annotator Difficulties (TERN 2004*)

*Time Expression Recognition and Normalization Competition (timex2.mitre.org)

Page 49:

TIMEX2 – A Mature Standard

• Extensively debugged
• Detailed guidelines for English and Chinese
• Evaluated for English, Arabic, Chinese, Korean, Spanish, French, Swedish, and Hindi
• Applied to news, scheduling dialogues, and other types of data
• Corpora available through ACE, MITRE

Page 50:

Temporal Relations in ACE

• Restricted to verbal events (verbs of scheduling, occurrence, aspect, etc.)
• The event and the timex must be in the same sentence
• Eight temporal relations
  – Within: The bombing occurred [during] the night.
  – Holds: They were meeting [all] night.
  – Starting, Ending: The talks [ended (on)] Monday.
  – Before, After: The initial briefs have to be filed [by] 4 p.m. Tuesday.
  – At-Beginning, At-End: Sharon met with Bill [at the start] of the three-day conference.

From Lisa Ferro @ MITRE

Page 51:

Outline

• Introduction
• Linguistic Theories
• AI Theories
• Annotation Schemes
• Rule-based and machine-learning methods
• Challenges
• Links

Page 52:

TimeML Annotation Scheme

• A proposed metadata standard for markup of events, their temporal anchoring, and how they are related to each other
• Marks up mention-level events, time expressions, and links between events (and between events and times)
• Developer: James Pustejovsky (& co.)

Page 53:

An Example Story

Feb. 18, 2004
Yesterday Holly was running a marathon when she twisted her ankle. David had pushed her.

[Timeline diagram: the run and the push fall during 02-17-2004; the twisting finishes the run; the push is before the twisting; 02-18-2004 is the document date.]

1. When did the running occur? Yesterday.
2. When did the twisting occur? Yesterday, during the running.
3. Did the pushing occur before the twisting? Yes.
4. Did Holly keep running after twisting her ankle? Probably not.

Page 54:

An Attested Story

AP-NR-08-15-90 1337EDT

Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi border, today sought peace on another front by promising to withdraw from Iranian territory and release soldiers captured during the Iran-Iraq war. Also today, King Hussein of Jordan arrived in Washington seeking to mediate the Persian Gulf crisis. President Bush on Tuesday said the United States may extend its naval quarantine to Jordan's Red Sea port of Aqaba to shut off Iraq's last unhindered trade route.

Past < Tuesday < Today < Indefinite Future
  Past: war, captured
  Tuesday: said
  Today: sought, arrived
  Indefinite future: withdraw, release, extend, quarantine

Page 55:

TimeML Events

AP-NR-08-15-90 1337EDT

Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi border, today sought peace on another front by promising to withdraw from Iranian territory and release soldiers captured during the Iran-Iraq war. Also today, King Hussein of Jordan arrived in Washington seeking to mediate the Persian Gulf crisis. President Bush on Tuesday said the United States may extend its naval quarantine to Jordan's Red Sea port of Aqaba to shut off Iraq's last unhindered trade route. In another mediation effort, the Soviet Union said today it had sent an envoy to the Middle East on a series of stops to include Baghdad. Soviet officials also said Soviet women, children and invalids would be allowed to leave Iraq.

Page 56:

TimeML Event Classes

• Occurrence: die, crash, build, merge, sell, take advantage of, ...
• State: be on board, kidnapped, recovering, love, ...
• Reporting: say, report, announce, ...
• I-Action: attempt, try, promise, offer, ...
• I-State: believe, intend, want, ...
• Aspectual: begin, start, finish, stop, continue, ...
• Perception: see, hear, watch, feel, ...

Page 57:

Temporal Anchoring Links

AP-NR-08-15-90 1337EDT

Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi border, today sought peace on another front by promising to withdraw from Iranian territory and release soldiers captured during the Iran-Iraq war. Also today, King Hussein of Jordan arrived in Washington seeking to mediate the Persian Gulf crisis. President Bush on Tuesday said the United States may extend its naval quarantine to Jordan's Red Sea port of Aqaba to shut off Iraq's last unhindered trade route. In another mediation effort, the Soviet Union said today it had sent an envoy to the Middle East on a series of stops to include Baghdad. Soviet officials also said Soviet women, children and invalids would be allowed to leave Iraq.

Page 58:

TLINK Types

Simultaneous (happening at the same time)
Identical (referring to the same event):
  John drove to Boston. During his drive he ate a donut.
Before the other:
  In six of the cases suspects have already been arrested.
Immediately before the other:
  All passengers died when the plane crashed into the mountain.
Including the other:
  John arrived in Boston last Thursday.
Exhaustively during the duration of the other:
  John taught for 20 minutes.
Beginning of the other:
  John was in the gym between 6:00 p.m. and 7:00 p.m.
Ending of the other:
  John was in the gym between 6:00 p.m. and 7:00 p.m.

Page 59:

TLINK Example

John taught 20 minutes every Monday.

John <EVENT eid="e1" class="OCCURRENCE">taught</EVENT>

<MAKEINSTANCE eiid="ei1" eventID="e1" pos="VERB" tense="PAST" aspect="NONE" polarity="POS"/>

<TIMEX3 tid="t1" type="DURATION" value="PT20M">20 minutes</TIMEX3>

<TIMEX3 tid="t2" type="SET" value="XXXX-WXX-1" quant="EVERY">every Monday</TIMEX3>

<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>

<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
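Markup of this kind can be read with the standard library. A small Python sketch that extracts the TLINKs from the fragments above; wrapping them in a dummy root is an assumption made so that ElementTree will parse the snippet, and the attribute names follow the example.

```python
# Sketch: pulling (source, target, relType) triples out of TLINK tags.

import xml.etree.ElementTree as ET

timeml = """<root>
<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
</root>"""

def tlink_pairs(xml_text):
    """Return (source, target, relType) triples for each TLINK element."""
    root = ET.fromstring(xml_text)
    triples = []
    for link in root.iter("TLINK"):
        # a TLINK's source is either a time or an event instance
        source = link.get("timeID") or link.get("eventInstanceID")
        triples.append((source, link.get("relatedToTime"), link.get("relType")))
    return triples
```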

Page 60:

Subordinated Links

AP-NR-08-15-90 1337EDT

Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi border, today sought peace on another front by promising to withdraw from Iranian territory and release soldiers captured during the Iran-Iraq war. Also today, King Hussein of Jordan arrived in Washington seeking to mediate the Persian Gulf crisis. President Bush on Tuesday said the United States may extend its naval quarantine to Jordan's Red Sea port of Aqaba to shut off Iraq's last unhindered trade route. In another mediation effort, the Soviet Union said today it had sent an envoy to the Middle East on a series of stops to include Baghdad. Soviet officials also said Soviet women, children and invalids would be allowed to leave Iraq.

Page 61:

SLINK TypesSLINK or Subordination Link is used for contexts introducing relations between two events, or an event and a signal, of the following sort: Modal: Relation introduced mostly by modal verbs (should, could, would, etc.) and events that introduce a reference to a possible world --mainly I_STATEs:

John should have bought some wine. Mary wanted John to buy some wine.

Factive: Certain verbs introduce an entailment (or presupposition) of the argument's veracity. They include forget in the tensed complement, regret, manage:

John forgot that he was in Boston last year. Mary regrets that she didn't marry John.

Counterfactive: The event introduces a presupposition about the non-veracity of its argument: forget (to), unable to (in past tense), prevent, cancel, avoid, decline, etc.

John forgot to buy some wine. John prevented the divorce.

Evidential: Evidential relations are introduced by REPORTING or PERCEPTION: John said he bought some wine. Mary saw John carrying only beer.

Negative evidential: Introduced by REPORTING (and PERCEPTION?) events conveying negative polarity:

John denied he bought only beer.

Negative: introduced only by negative particles (not, nor, neither, etc.), which are marked as SIGNALs with respect to the events they modify:

John didn't forget to buy some wine. John did not want to marry Mary.

Page 62

Aspectual Links

The U.S. military buildup in Saudi Arabia continued at fever pace, with Syrian troops now part of a multinational force camped out in the desert to guard the Saudi kingdom from any new threat by Iraq.

In a letter to President Hashemi Rafsanjani of Iran, read by a broadcaster over Baghdad radio, Saddam said he will begin withdrawing troops from Iranian territory a week from tomorrow and release Iranian prisoners of war.

Page 63

Towards TIMEX3

• Decompose more
  – Smaller tag extents compared to TIMEX2
    • <TIMEX2 ID="t28" VAL="2000-10-02">just days after another court dismissed other corruption charges against his father</TIMEX2>
  – N.B. extent marking was a source of inter-annotator disagreements in the ACE TERN 2004 evaluation
  – Avoid tag embedding
    <TIMEX2 VAL="1999-08-03">two weeks from <TIMEX2 VAL="1999-07-20">next Tuesday</TIMEX2></TIMEX2>
• Include temporal functions for delayed evaluation
  – Allow non-consuming tags
• Put relationships in Links

Page 64

TIMEX3 Annotation

Time Points

<TIMEX3 tid="t1" type="TIME" value="T24:00">midnight</TIMEX3>

<TIMEX3 tid="t2" type="DATE" value="2005-02-15" temporalFunction="TRUE" anchorTimeID="t0">tomorrow</TIMEX3>

Durations

<TIMEX3 tid="t6" type="DURATION" value="P2W" beginPoint="t61" endPoint="t62">two weeks</TIMEX3> from <TIMEX3 tid="t61" type="DATE" value="2003-06-07">June 7, 2003</TIMEX3>

<TIMEX3 tid="t62" type="DATE" value="2003-06-21" temporalFunction="true" anchorTimeID="t6"/>

Sets

<TIMEX3 tid="t1" type="SET" value="P1M" quant="EVERY" freq="P3D">three days every month</TIMEX3>

<TIMEX3 tid="t2" type="SET" value="P1M" freq="P2X">twice a month</TIMEX3>
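The duration example shows delayed evaluation at work: the endPoint tag t62 is non-consuming, and its value is computed from the anchor rather than read from the text. A sketch of that computation with Python's standard datetime; variable names mirror the tids above.

```python
from datetime import date, timedelta

begin = date(2003, 6, 7)          # t61: "June 7, 2003"
end = begin + timedelta(weeks=2)  # t62: "two weeks" (P2W) from t61

# end.isoformat() yields "2003-06-21", the value on the t62 tag
```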

Page 65

TimeML and DAML-Time Ontology*

• We ship[e1] 2 days[t1] after the purchase[e2]

• TimeML:

<TLINK eventInstanceID=e1 relatedToTime=t1 relType=BEGINS/>
<TLINK eventInstanceID=e1 relatedToEventInstance=e2 relType=AFTER/>

• DAML-OWL:

atTime(e1, t1) & atTime(e2, t2) & after(t1, t2) & timeBetween(T, t1, t2) & duration(T, *Days*) = 2

*Hobbs & Pustejovsky, in I. Mani et al., eds., The Language of Time

Page 66

Outline

• Introduction
• Linguistic Theories
• AI Theories
• Annotation Schemes
• Rule-based and machine-learning methods
• Challenges
• Links

Page 67

IE Methodology

[Diagram: Raw Corpus → Initial Tagger → Annotation Editor (guided by Annotation Guidelines) → Annotated Corpus → Machine Learning Program → Learned Rules; Rule Apply then runs the learned rules over a Raw Corpus to produce an Annotated Corpus.]

Page 68

Callisto Annotation Tool

Page 69

Tabular Annotation of Links

Page 70

TANGO Graphical Annotator

Page 71

Outline

• Introduction
• Linguistic Theories
• AI Theories
• Annotation Schemes
• Rule-based and machine-learning methods
• Challenges
• Links

Page 72

IE Methodology

[Diagram: Raw Corpus → Initial Tagger → Annotation Editor (guided by Annotation Guidelines) → Annotated Corpus → Machine Learning Program → Learned Rules; Rule Apply then runs the learned rules over a Raw Corpus to produce an Annotated Corpus.]

Page 73

Timex2/3 Extraction

• Accuracy
  – Best systems: TIMEX2: .95F extent, .8xF VAL (TERN* 2004 English)
  – GUTime: .85F extent, .82F VAL (TERN 2004 training data, English)
  – KTX: .87F extent, .86F VAL (100 Korean documents)

• Machine Learning
  – Tagging extent: easily trained
  – Normalizing values: harder to train

Page 74

TimeML Event Extraction

• Easier than MUC template events (those were .6F)
• Part-of-speech tagging to find verbs
• Lexical patterns to detect tense and lexical and grammatical aspect
• Syntactic rules to determine subordination relations
• Recognition and disambiguation of event nominals, e.g., war, building, construction, etc.
• Evita (Brandeis):
  – .80F on verbal events (overgenerates generic events, which weren't marked in TimeBank)
  – .64F on event nominals (WordNet-derived, disambiguated via SemCor training)

Page 75

TempEx in Qanda

Page 76

Extracting Temporal Relations Based on Tense Sequences

• Song & Cohen 1991: adopt a Reichenbachian tense representation
• Use rules for permissible tense sequences

– When the tense moves from simple present to simple past, the event time moves backward, and from simple present to simple future, it moves forward.

– When the tense of two successive sentences is the same, they argue that the event time moves forward, except for statives and unbounded processes, which keep the same time.

– When the tense moves from present perfect to simple past, or from present prospective (John is going to run) to simple future, the event time of the second sentence is less than or equal to the event time of the first sentence.

• Won't work in cases of discourse moves
• However, the rules incorrectly rule out, among others, present-tense to past-perfect transitions.

Song & Cohen: AAAI’91
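As an illustration (not Song & Cohen's actual implementation), the quoted tense-sequence rules can be encoded as a lookup table over tense transitions. The tense labels are assumptions for the sketch; +1 means the event time moves forward, -1 backward, 0 same or earlier.

```python
# Permissible tense-sequence rules as a transition table; entries mirror
# the rules quoted above. Tense labels here are illustrative.
TENSE_MOVES = {
    ("simple_present", "simple_past"): -1,    # event time moves backward
    ("simple_present", "simple_future"): +1,  # event time moves forward
    ("present_perfect", "simple_past"): 0,    # second event <= first
    ("present_prospective", "simple_future"): 0,
}

def event_time_move(prev_tense, cur_tense, stative=False):
    """Direction the event time moves across two successive sentences."""
    if prev_tense == cur_tense:
        # Same tense: time moves forward, except statives and unbounded
        # processes, which keep the same time.
        return 0 if stative else +1
    return TENSE_MOVES.get((prev_tense, cur_tense))
```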

Page 77

Extracting Temporal Relations by Heuristic Rule Weighting

• The approach assigns weights to different ordering possibilities based on the knowledge sources involved.

• Temporal adverbials and discourse cues are first tried; if neither are present, then default rules based on tense and aspect are used.

– Given a sentence describing past tense activity followed by one describing a past tense accomplishment or achievement, the second event can only occur just after the activity; it can’t precede, overlap, or be identical to it.

• If the ordering is still ambiguous at the end of this, semantic rules are used based on modeling the discourse in terms of threads.

– Assumes there is one ‘thread’ that the discourse is currently following.

• a. John went into the florist shop.

• b. He had promised Mary some flowers.

• c. She said she wouldn’t forgive him if he forgot.

• d. So he picked out three red roses.

• Each utterance is associated with exactly one of two threads: (i) going into the florist's shop, and (ii) interacting with Mary.

• Prefer an utterance to continue a current thread that has the same tense or is semantically related to it
  – (i) would be continued by d., based on tense

*Janet Hitzeman, Marc Moens, and Claire Grover, 'Algorithms for Analysing the Temporal Structure of Discourse', EACL 1995, 253–60.

Page 78

Heuristic Rules (Georgetown GTag)

• Uses 187 hand-coded rules
  – LHS: tests based on TimeML-related features and POS tags
  – RHS: TimeML TLINK classes (~13 Allen relations)

• Ordered into classes
  – R1&2: event anchored without a signal to a time in the same clause

– R3 (28): main clause event in 2 successive sentences

– R4: reporting verb and document time

– R5 (54): reporting verb and event in same sentence

– R6 (87): events in same sentence

– R7: timex linked to document time

• Rules can have confidence

ruleNum=6-6
If sameSentence=YES &&
   sentenceType=ANY &&
   conjBetweenEvents=YES &&
   arg1.class=EVENT && arg2.class=EVENT &&
   arg1.tense=PAST && arg2.tense=PAST &&
   arg1.aspect=NONE && arg2.aspect=NONE &&
   arg1.pos=VB && arg2.pos=VB &&
   arg1.firstVbEvent=ANY && arg2.firstVbEvent=ANY
then infer relation=BEFORE
Confidence = 1.0
Comment = "they traveled far and slept the night in a rustic inn"
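As a hypothetical re-encoding (not GTag's actual code), rule 6-6 amounts to a conjunction of feature tests; ANY-valued tests are simply omitted from the dictionary.

```python
# Rule 6-6 above as data: feature tests on the LHS, a TLINK relation and
# confidence on the RHS. Attribute names mirror the rule; the matcher is
# illustrative.
RULE_6_6 = {
    "sameSentence": "YES", "conjBetweenEvents": "YES",
    "arg1.class": "EVENT", "arg2.class": "EVENT",
    "arg1.tense": "PAST", "arg2.tense": "PAST",
    "arg1.aspect": "NONE", "arg2.aspect": "NONE",
    "arg1.pos": "VB", "arg2.pos": "VB",
}

def apply_rule(rule, features, relation="BEFORE", confidence=1.0):
    """Return (relation, confidence) if every test in the rule matches."""
    if all(features.get(k) == v for k, v in rule.items()):
        return relation, confidence
    return None
```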

Page 79

Using Web-Mined Rules

• Lexical relations (capturing causal and other relations, etc.)– kill => die (always)– push => fall (sometimes: Max fell. John pushed him.)

• Idea: leverage the distributions found in large corpora
• VerbOcean: a database from ISI containing lexical relations mined from Google searches

– E.g., X happens before Y, where X and Y are WordNet verbs highly associated in a corpus

• Converted to GUTenLink format
• Yields 4199 rules!

ruleNum=8-3991
If arg1.class=EVENT &&
   arg2.class=EVENT &&
   arg1.word=learn &&   # uses morph normalization
   arg2.word=forget
then infer relation=BEFORE
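Applying such mined rules is again a table lookup after morphological normalization. The entries below are illustrative: only the learn/forget pair appears in the rule above; the others echo the lexical relations mentioned earlier on this slide.

```python
# Web-mined verb-precedence rules as a lookup table. A reversed match
# yields the inverse relation.
VERB_PRECEDENCE = {
    ("learn", "forget"): "BEFORE",  # rule 8-3991 above
    ("push", "fall"): "BEFORE",     # push => fall (sometimes)
    ("kill", "die"): "BEFORE",      # kill => die (always)
}

def mined_relation(verb1, verb2):
    if (verb1, verb2) in VERB_PRECEDENCE:
        return VERB_PRECEDENCE[(verb1, verb2)]
    if (verb2, verb1) in VERB_PRECEDENCE:
        return "AFTER"
    return None  # no mined rule for this pair
```

As noted later in the deck, argument matching (e.g., the subject/object of push and fall) still has to be checked before such a rule can safely fire.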

Page 80

Outline

• Introduction
• Linguistic Theories
• AI Theories
• Annotation Schemes
• Rule-based and machine-learning methods
• Challenges
• Links

Page 81

IE Methodology

[Diagram: Raw Corpus → Initial Tagger → Annotation Editor (guided by Annotation Guidelines) → Annotated Corpus → Machine Learning Program → Learned Rules; Rule Apply then runs the learned rules over a Raw Corpus to produce an Annotated Corpus.]

Page 82

Related Machine Learning Work

• (Li et al. ACL’2004) obtained 78-88% accuracy on ordering within-sentence temporal relations in Chinese texts.

• (Mani et al., HLT’2003 short) obtained 80.2 F-measure training a decision tree on 2069 clauses in anchoring events to reference times that were inferred for each clause.

• (Lapata and Lascarides NAACL’2004) used found data to successfully learn which (possibly ambiguous) temporal markers connect a main and subordinate clause, without inferring underlying temporal relations.

Page 83

Car Sim: Text to Accident Simulation System*

• Carries out TimeML annotation of Swedish accident reports

• Builds an event ordering graph using machine learning, with separate decision trees for local and global TLINKS

• Generates, based on domain knowledge, a simulation of the accident

Anders Berglund. Extracting Temporal Information and Ordering Events for Swedish. MS Thesis. Lund University. 2004.

Page 84

Prior Machine Learning from TimeBank

• Mani (p.c., 2004):
  – TLINKs converted into feature vectors from TimeBank 1.0 tags
  – TLINK relType converted to feature-vector class label, after collapsing
  – Accuracy of C5.0.1 decision rules: .55F (majority class)

• Boguraev & Ando (IJCAI'2005):
  – Uses features based on local syntactic context (chunks and clause structure)
  – Trained a classifier for within-sentence TLINKs on TimeBank 1.1: .53F
• Bottom line: the TimeBank corpus doesn't provide enough data for training learners?

Page 85

Insight: TLINK Annotation (Humans)

• Inter-annotator reliability is ~.55F
  – But agreement on LINK labels: 77%
• So, the problem is largely which events to link
  – Within sentence, adjacent sentences, across the document?
  – Guidelines aren't that helpful
• Conclusion: global TLINKing is too fatiguing
  – 0.84 TLINKs/event in the corpus

Page 86

Temporal Reasoning to the Rescue

• Earlier experiments with SputLINK in TANGO (interactive, text-segmented closure) indicated that without closure, annotators cover 4% of all possible links.

• With closure, an annotator could cover about 65% of all possible links in a document.

• Of those links, 84% were derived by the algorithm

Initial Links: 36 (4%)
User Prompts: 109 (12%)
Derived Links: 775 (84%)

Page 87

IE Methodology

[Diagram: the IE pipeline as before (Raw Corpus → Initial Tagger → Annotation Editor → Annotated Corpus → Machine Learning Program → Learned Rules → Rule Apply), now with an Axioms component added.]

Page 88

Temporal Closure as an Oversampling Method

• Closing the Corpus (with 745 axioms)

– Number of TLINKs goes up > 11 times!

– BEFORE links go up from 3,170 Event-Event and 1,229 Event-Time TLINKs to 68,585 Event-Event and 18,665 Event-Time TLINKs

– Before closure: 0.84 TLINKs/event

– After closure: 9.49 TLINKs/event

12750 Events, 2114 Times

Relation | Event-Event | Event-Time
IBEFORE | 131 | 15
BEGINS | 160 | 112
ENDS | 208 | 159
SIMULTANEOUS | 1528 | 77
INCLUDES | 950 | 3001 (65.3%)
BEFORE | 3170 (51.6%) | 1229
TOTAL | 6147 | 4593

Corpus: 186 TimeBank 1.2.1 + 73 Opinion Corpus
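The expansion can be sketched for the BEFORE relation alone as a fixpoint computation. The real closure applies 745 axioms across all relation types; this toy version uses only the transitivity of BEFORE.

```python
# Saturate a set of (x, y) "x BEFORE y" links under transitivity.
def close_before(links):
    links = set(links)
    changed = True
    while changed:
        changed = False
        for a, b in list(links):
            for c, d in list(links):
                if b == c and (a, d) not in links:
                    links.add((a, d))
                    changed = True
    return links

chain = close_before({("e1", "e2"), ("e2", "e3"), ("e3", "e4")})
# 3 annotated links close to 6 training examples
```

On a chain of n events, n-1 annotated links close to n(n-1)/2, which is the source of the >11x growth reported above.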

Page 89

ML Results

• Features: each TLINK is a feature vector
  – For each event in the pair:
    • event-class: occurrence, state, reporting, i-action, i-state, aspectual, perception
    • aspect: progressive, perfective, progressive_perfective
    • modality: nominal
    • negation: nominal
    • string: string
    • tense: present, past, future
  – signal: string
  – shiftAspect: boolean
  – shiftTense: boolean
  – class: {SIMULTANEOUS, IBEFORE, BEFORE, BEGINS, ENDS, INCLUDES}

Link Labeling Accuracy

Method | ExE | ExT
ML | 62.5 | 63.43
GTag | 72.46 | 64.8
GTag+VerbOcean | 74.02 | 76.13
ML + CLOSURE | 93.1 | 88.25

Page 90

TLINK Extraction: Conclusion

• The annotated TimeML corpus provides insufficient examples for training machine learners
• Significant result:
  – Number of examples expanded 11 times by closure
  – Training learners on the expanded corpus yields excellent performance
  – Performance exceeds human intuitions, even when augmented with lexical rules
• Next steps:
  – Integrate GUTenLink+VerbOcean rules into the machine learning framework
  – Integrate with s2tlink and a2tlink
  – Feature engineering

Page 91

Challenges: Temporal Reasoning

• Temporal reasoning for IE has used qualitative temporal relations

• Trivial metric relations (distances in time) can be extracted from anchored durations and sorted time expressions

• But commonsense metric constraints are missing– Time(Haircut) << Time(fly Boston2Sydney)

• First steps:
  – Hobbs et al. at ACL'06
  – Mani & Wellner at ARTE'06 workshop

Page 92

Challenges: Integrating Reasoning and Learning

[Diagram: a Learner produces pairwise links (A < B, B < C); a Reasoner expands them to the globally consistent set A < B, B < C, A < C.]

• Reasoning expands the training data while keeping it globally consistent
• Expansion substantially improves classification!
• But classifier decisions at test time may not be globally consistent
• Need to integrate classification and reasoning!
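For BEFORE links alone, the missing consistency check reduces to cycle detection in the precedence graph. A sketch of the reasoning side:

```python
# Check that a set of pairwise "x BEFORE y" classifier decisions is
# globally consistent: a cycle (a < ... < a) is a contradiction.
def consistent(before_links):
    graph = {}
    for a, b in before_links:
        graph.setdefault(a, set()).add(b)

    # Depth-first search with a path stack ("grey" set) and a memo of
    # fully explored nodes ("black" set).
    def has_cycle(node, stack, done):
        if node in stack:
            return True
        if node in done:
            return False
        stack.add(node)
        found = any(has_cycle(n, stack, done) for n in graph.get(node, ()))
        stack.remove(node)
        done.add(node)
        return found

    done = set()
    return not any(has_cycle(n, set(), done) for n in list(graph))
```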

Page 93

Difficulties in Annotation

In an interview with Barbara Walters to be shown on ABC’s “Friday nights”, Shapiro said he tried on the gloves and realized they would never fit Simpson’s larger hands.

• BEFORE or MEET?
• More coarse-grained annotation may suffice

Page 94

Discourse Relations

• Lexical Rules from VerbOcean are still very sparse, even though they are less brittle

• But need to match arguments when applying lexical rules (e.g., subj/obj of push/fall)

• A discourse model should in fact be used

Page 95

Temporal Relations as Surrogates for Rhetorical Relations

• When E1 is left-sibling of E2 and E1 < E2, then typically, Narration(E1, E2)

• When E1 is right-sibling of E2 and E1 < E2, then typically Explanation(E2, E1)

• When E2 is a child node of E1, then typically Elaboration(E1, E2)

constraints: {Eb < Ec, Ec < Ea, Ea < Ed}

a. John went into the florist shop. b. He had promised Mary some flowers. c. She said she wouldn’t forgive him if he forgot. d. So he picked out three red roses.

[Discourse tree over a–d with Narration (Narr), Elaboration (Elab), and Explanation (Expl) edges.]

Page 96

TLINKS as a measure of fluency in Second Language Learning*

• Analyzed English oral and written proficiency samples elicited from 16 speakers of English:
  – 8 native speakers and 8 students in 'Advanced' courses in an Intensive English Program
  – Corpus includes 5,888 words elicited from subjects via a written narrative retelling task (Chaplin's Hard Times)

• On average, native speakers (NSs) use significantly fewer words to create TLinks (8.2/TLink vs. 10.1 for NNSs).

• The number of closed TLINKs for NSs far exceeds the number for NNSs (12,330 vs. 4,924).
  – This means NSs have, on average, longer chains of TLINKs

* Joint work with Jeff Connor-Linton at AAAL’05.

Page 97

Outline

• Introduction
• Linguistic Theories
• AI Theories
• Annotation Schemes
• Rule-based and machine-learning methods
• Challenges
• Links

Page 98

Corpora

• News (newswire and broadcast)
  – TimeML: TimeBank, AQUAINT Corpus (all English)
  – TIMEX2: TIDES and TERN English corpora, Korean corpus (200 docs), TERN Chinese and Arabic news data (extents only)

• Weblogs
  – TIMEX2 TERN corpus (English, Chinese, Arabic – the latter with extents only)

• Dialogues

– TIMEX2: 95 Spanish Enthusiast dialogs, and their translations

• Meetings

– TIMEX2 Spanish portions of UN Parallel corpus (23,000 words)

• Children’s Stories

– Reading comprehension exams from MITRE; Remedia: 120 stories, 20K words; CBC: 259 stories, 1/3 tagged, ~50K words

Page 99

Links

• TimeBank (April 17, 2006):
  – http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T08
• TimeML:
  – www.timeml.org
• TIMEX2/TERN ACE data (English, Chinese, Arabic):
  – timex2.mitre.org
• TIMEX2/3 tagger:
  – http://complingone.georgetown.edu/~linguist/GU_TIME_DOWNLOAD.HTML
• Korean and Spanish data: [email protected]
• Callisto: callisto.mitre.org

Page 100

References

1. Mani, I., Pustejovsky, J., and Gaizauskas, R. (eds.). (2005). The Language of Time: A Reader. Oxford University Press.
2. Mani, I., and Schiffman, B. (2004). Temporally Anchoring and Ordering Events in News. In Pustejovsky, J., and Gaizauskas, R. (eds.), Time and Event Recognition in Natural Language. John Benjamins, to appear.
3. Mani, I. (2004). Recent Developments in Temporal Information Extraction. In Nicolov, N., and Mitkov, R. (eds.), Proceedings of RANLP'03. John Benjamins, to appear.
4. Jang, S., Baldwin, J., and Mani, I. (2004). Automatic TIMEX2 Tagging of Korean News. In Mani, I., Pustejovsky, J., and Sundheim, B. (eds.), ACM Transactions on Asian Language Processing: Special Issue on Temporal Information Processing.
5. Mani, I., Schiffman, B., and Zhang, J. (2003). Inferring Temporal Ordering of Events in News. Short paper. In Proceedings of the Human Language Technology Conference (HLT-NAACL'03).
6. Ferro, L., Mani, I., Sundheim, B., and Wilson, G. (2001). TIDES Temporal Annotation Guidelines, Draft Version 1.02. MITRE Technical Report MTR 01W000004. McLean, Virginia: The MITRE Corporation.
7. Mani, I., and Wilson, G. (2000). Robust Temporal Processing of News. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL'2000), 69-76.