
Post on 20-Dec-2015


Event Ordering using TERSEO system

Research Group on Language Processing and Information Systems (gPLSI)

Estela Saquete Boró, Rafael Muñoz, Patricio Martinez-Barco

Departamento de Lenguajes y Sistemas Informáticos

NLDB 2004

Index

1. Introduction
2. Previous work
3. Description of the Event Ordering System
4. Application of Event Ordering in NLP tasks
5. System evaluation
6. Conclusions

Introduction

•Automatic processes to extract relevant information
•Event ordering using dates and time
–Identification of temporal expressions
–Resolution of temporal expressions
–Chronological order


•Example: “Today is July the 3rd (2003). Tomorrow is my birthday”
–Anaphoric expression: “Tomorrow”
–Antecedent: July the 3rd (2003)
–Referent: 07/04/2003
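The example above can be sketched in a few lines of Python. This is only an illustration, not the TERSEO implementation: the `RELATIVE_OFFSETS` table and the `resolve_relative` function are hypothetical names, and only day-level expressions are handled.

```python
from datetime import date, timedelta

# Hypothetical day offsets for a few anaphoric expressions.
RELATIVE_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}

def resolve_relative(expression: str, antecedent: date) -> date:
    """Return the date an anaphoric expression refers to, given its antecedent."""
    return antecedent + timedelta(days=RELATIVE_OFFSETS[expression.lower()])

antecedent = date(2003, 7, 3)                       # "July the 3rd (2003)"
referent = resolve_relative("Tomorrow", antecedent)
print(referent.strftime("%m/%d/%Y"))                # 07/04/2003
```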


Previous work

•Types of systems:
–Based on machine learning: a manually annotated corpus is needed to generate the system rules automatically (from percentage of appearance).
  –High precision in specific domains
  –Not very flexible; requires a large annotated corpus
–Based on knowledge: a pre-built knowledge base with rules for resolving temporal expressions.
  –Greater flexibility
•Our system is based on Spanish knowledge, but this knowledge is automatically extended through automatic acquisition of rules for new languages


Graphic representation

[Figure: system architecture. Units: Temporal Information Detection (Temporal Expression Detection, Temporal Signal Detection), Temporal Expression Coreference Resolution (with Dictionary and Date Estimation), Ordering Key Obtaining, Event Ordering. Data flow labels: Document, temporal expressions, temporal signals, T.E. tags, ordering keys, ordered text.]

Description of the Event Ordering system

•Detection of temporal information:
–Temporal Expression Detection Unit
–Temporal Signal Detection Unit
•Temporal expressions are resolved by the Temporal Expression Coreference Resolution unit, which generates the XML tags.
•The ordering key is obtained by the Ordering Key unit.
•With all this information, the Event Ordering Unit orders the text.

•Both detection units share a common pre-processing of texts: texts are tagged with lexical and morphological information by a POS tagger, and this information is the input of a temporal parser.
•The temporal parser is implemented using an ascending (bottom-up) technique and is based on a temporal grammar.

Temporal Expression Detection

•One of the main tasks involved in recognizing and resolving temporal expressions is classifying them. A taxonomy with two different classifications of temporal expressions has been established:
–Classification of the expression based on the kind of reference
–Classification by the representation of the temporal value of the expression

Taxonomy of TEs

•Classification of the expression based on the kind of reference:
–Explicit temporal expressions:
  –Complete dates with or without time expressions: 01/01/2003
  –Dates of events: Christmas
–Implicit temporal expressions:
  –Expressions that refer to the document date: yesterday
  –Expressions that refer to another date: a month later

•Classification by the representation of the temporal value of the expression:
–Concrete: returns a specific day and/or time.
–Period: returns a time interval.
–Fuzzy: returns an approximate time interval.
  –Fuzzy concrete: a day of the last week
  –Fuzzy period: some months before

Temporal Signal Detection

•Temporal signals:
–Relate the different events in texts
–Establish a chronological order between these events.
•Some examples of temporal signals:
–After
–Before
–During
–When
–Previously
–While
–At the time of
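As a rough illustration of signal detection, the listed signals can be looked up with a regular expression. This is a simplification: the system described here works on POS-tagged text with a temporal parser, whereas the sketch below matches plain strings, and the names `TEMPORAL_SIGNALS` and `find_temporal_signals` are hypothetical.

```python
import re

# Signals listed on the slide; the multi-word signal comes first so it is tried first.
TEMPORAL_SIGNALS = ["at the time of", "previously", "before", "during",
                    "after", "while", "when"]
SIGNAL_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(s) for s in TEMPORAL_SIGNALS) + r")\b",
    re.IGNORECASE,
)

def find_temporal_signals(text: str) -> list[str]:
    """Return the temporal signals found in text, in order of appearance."""
    return [match.group(1).lower() for match in SIGNAL_PATTERN.finditer(text)]

sentence = ("In December 1, the French bathyscaphe Nautilus arrives at the "
            "Galician coast, previously there were some cracks.")
print(find_temporal_signals(sentence))  # ['previously']
```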


Description of the Event Ordering system

•Temporal Expression Coreference Resolution:
–Anaphoric relation resolution based on a temporal model
–Tagging of temporal expressions

Anaphoric Relation Resolution

•Looking for antecedents.
–Two main candidates:
  •Newspaper's date (DateP)
  •Date named before in the text (DateAnt)
–Process:
  •By default, the newspaper's date is used as the base referent if it exists.
  •If a non-anaphoric TE is found, it is stored as DateAnt.


REFERENCE                                                  DICTIONARY ENTRY
‘ayer’ (yesterday)                                         DateP - 1
‘mañana’ (tomorrow)                                        DateP + 1
‘durante el mes siguiente’ (during the following month)    [DayI/Month(DateAnt)+1/Year(DateAnt) -- DayF/Month(DateAnt)+1/Year(DateAnt)]
‘un día antes’ (a day before)                              DateAnt - 1
‘días después’ (some days later)                           >>>>>DateAnt
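The dictionary entries above can be read as functions of DateP and DateAnt. The following sketch shows one way to encode them in Python; it is not the TERSEO dictionary format. The names `REFERENCE_DICTIONARY`, `following_month` and `resolve` are hypothetical, the year rollover for December is an added assumption, and the fuzzy entry ‘días después’ is left out.

```python
import calendar
from datetime import date, timedelta

def following_month(d: date) -> tuple[date, date]:
    """First and last day (DayI, DayF) of the month after d."""
    # Assumption: December rolls over to January of the next year.
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)

# Each entry maps an expression to a function of (DateP, DateAnt).
REFERENCE_DICTIONARY = {
    "ayer": lambda date_p, date_ant: date_p - timedelta(days=1),
    "mañana": lambda date_p, date_ant: date_p + timedelta(days=1),
    "un día antes": lambda date_p, date_ant: date_ant - timedelta(days=1),
    "durante el mes siguiente": lambda date_p, date_ant: following_month(date_ant),
}

def resolve(expression: str, date_p: date, date_ant: date):
    """Resolve an anaphoric expression against the two candidate antecedents."""
    return REFERENCE_DICTIONARY[expression](date_p, date_ant)

date_p = date(2003, 7, 3)     # newspaper's date (DateP), illustrative value
date_ant = date(2002, 12, 1)  # last non-anaphoric date (DateAnt), illustrative value
print(resolve("ayer", date_p, date_ant))                      # 2003-07-02
print(resolve("un día antes", date_p, date_ant))              # 2002-11-30
print(resolve("durante el mes siguiente", date_p, date_ant))  # January 2003 as a (start, end) pair
```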

Tagging of TEs

•Set of XML tags (eXtensible Markup Language). Goals:
–Showing the results of our system
–Standardising the date-time formats of Internet texts.

•Explicit dates:
<DATE_TIME ID="value"
           TYPE="value"
           VALDATE1="value"
           VALTIME1="value"
           VALDATE2="value"
           VALTIME2="value">
  Expression
</DATE_TIME>

•Implicit dates:
<DATE_TIME_REF ID="value"
               TYPE="value"
               VALDATE1="value"
               VALTIME1="value"
               VALDATE2="value"
               VALTIME2="value">
  Expression
</DATE_TIME_REF>
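A tag of this shape can be produced with Python's standard `xml.etree.ElementTree` module. The sketch below is only an illustration of the tag layout: the function name `tag_expression`, the ID value and the TYPE value "concrete" are assumptions, not values taken from the system.

```python
import xml.etree.ElementTree as ET

def tag_expression(text, tag, te_id, te_type,
                   valdate1="", valtime1="", valdate2="", valtime2=""):
    """Wrap a temporal expression in a DATE_TIME or DATE_TIME_REF element."""
    element = ET.Element(tag, {
        "ID": str(te_id),
        "TYPE": te_type,
        "VALDATE1": valdate1,
        "VALTIME1": valtime1,
        "VALDATE2": valdate2,
        "VALTIME2": valtime2,
    })
    element.text = text
    return ET.tostring(element, encoding="unicode")

# Illustrative call; ID and TYPE are made-up values.
tagged = tag_expression("in December 1", "DATE_TIME_REF", 1, "concrete",
                        valdate1="12/01/2002")
print(tagged)
```

Building the element through the library rather than by string concatenation keeps the opening and closing tag names in agreement.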


Ordering Keys Obtaining

•The study of the corpus revealed a set of temporal signals.
•Each temporal signal denotes a relationship between the dates of the events that it relates.
•Example: in EV1 S EV2, the signal S denotes a relationship between EV1 and EV2. Assuming that F1 is the date of EV1 and F2 the date of EV2, S establishes an order between EV1 and EV2.


SIGNAL       ORDERING KEY
After        F1 > F2
When         F1 = F2
Before       F1 < F2
During       F2i <= F1 <= F2f
Previously   F1 > F2
On/in        F1 = F2
While        F2i <= F1 <= F2f
For          F2i <= F1 <= F2f
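The signal table can be encoded as a lookup from signal to ordering key, plus a check that two dates respect that key. This is a minimal sketch with hypothetical names (`ORDERING_KEYS`, `satisfies`); the label "within" is shorthand introduced here for the interval key F2i <= F1 <= F2f.

```python
from datetime import date

# Ordering key per signal; "within" stands for F2i <= F1 <= F2f.
ORDERING_KEYS = {
    "after": ">", "previously": ">",
    "before": "<",
    "when": "=", "on": "=", "in": "=",
    "during": "within", "while": "within", "for": "within",
}

def satisfies(signal, f1, f2_start, f2_end=None):
    """Check whether F1 and F2 are consistent with the ordering key of signal."""
    key = ORDERING_KEYS[signal.lower()]
    if f2_end is None:
        f2_end = f2_start          # F2 is a single date, not an interval
    if key == ">":
        return f1 > f2_start
    if key == "<":
        return f1 < f2_start
    if key == "=":
        return f1 == f2_start
    return f2_start <= f1 <= f2_end

print(satisfies("previously", date(2002, 12, 1), date(2002, 11, 20)))              # True
print(satisfies("during", date(1967, 6, 7), date(1967, 6, 5), date(1967, 6, 10)))  # True
print(satisfies("before", date(1968, 1, 1), date(1964, 1, 1)))                     # False
```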

Event ordering method

•Building of a table with the complete information from the XML tags
–This table includes the columns ID, VALDATE1, VALTIME1, VALDATE2, VALTIME2 and VALORDER.
•Ordering rules:
–EV1 is previous to EV2 if the range associated with TE1 is prior to and does not overlap the range associated with TE2, or the ordering key is EV1<EV2
–EV1 is concurrent with EV2 if the range associated with TE1 overlaps the range associated with TE2, or the ordering key is EV1=EV2
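The two ordering rules can be sketched as a comparison of date ranges with an optional ordering key. This is an illustration under stated assumptions rather than the system's code: `DateRange` and `relation` are hypothetical names, times are ignored, and an explicit ordering key is assumed to take precedence over the ranges when both are available.

```python
from datetime import date
from typing import NamedTuple, Optional

class DateRange(NamedTuple):
    """Date range of a resolved TE (VALDATE1..VALDATE2); times are omitted."""
    start: date
    end: date

def relation(te1: DateRange, te2: DateRange, key: Optional[str] = None) -> str:
    """Return 'previous', 'concurrent' or 'later' for EV1 relative to EV2."""
    if key is not None:
        # Assumption: an explicit ordering key wins over the date ranges.
        return {"<": "previous", "=": "concurrent", ">": "later"}[key]
    if te1.end < te2.start:
        return "previous"      # TE1 is prior to TE2 and does not overlap it
    if te2.end < te1.start:
        return "later"
    return "concurrent"        # the ranges overlap

dec_1 = DateRange(date(2002, 12, 1), date(2002, 12, 1))
july = DateRange(date(2003, 7, 1), date(2003, 7, 31))
july_4 = DateRange(date(2003, 7, 4), date(2003, 7, 4))
print(relation(dec_1, dec_1, key=">"))  # later
print(relation(july_4, july))           # concurrent
print(relation(dec_1, july))            # previous
```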

System example

Text: “In December 1, the French bathyscaphe Nautilus arrives at the Galician coast, previously there were some cracks.”

Temporal information detection:
–Temporal expression: In December 1
–Temporal signal: previously

Temporal expression coreference resolution:
–T.E. tag: <DATE_TIME_REF VALDATE1="12/01/2002">in December 1</DATE_TIME_REF>

Ordering key obtaining:
–Ordering key: event 1 > event 2

Event ordering:
Order  Event                                                          Date
1      There were some cracks                                         <<< 12/01/2002
2      The French bathyscaphe Nautilus arrives at the Galician Coast  12/01/2002


Application of Event Ordering in NLP tasks

•Applied in different tasks:
–Summarization
–Question Answering
–Etc.
•Temporal Question Answering can help current QA systems answer complex questions. Complex questions consist of two or more events related by a temporal signal, which establishes the order between them.

Application in Question Answering

•Possible questions:
–When did Iraq invade Kuwait?
–When is the next New Hampshire Democratic primary?
–Which US ship was attacked by Israeli forces during the Six Day war in the sixties?
–Where did Bill Clinton study before going to Oxford University?


[Figure: Multilayered Question Answering Architecture. Labels: Interface; Complex Question; Complex Answer; Temporal Q.A. Processing; Script Q.A. Processing; Template Q.A. Processing; ...; Simple Questions; Simple Answers; General Purpose Question Answering System.]

Example of Application in Question Answering

•Question: “Where did Bill Clinton study before going to Oxford University?”
•First, the unit recognizes the temporal signal, which in this case is “before”.
•Second, the complex question is divided:
–Q1: Where did Bill Clinton study?
–Q2: When did Bill Clinton go to Oxford University?


•Answers to Q1:
–Georgetown University (1964-1968)
–Oxford University (1968-1970)
–Yale Law School (1970-1973)
•Answers to Q2:
–1968
•Only Georgetown University fulfills the temporal constraint, so it is the answer to the complex question.
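The filtering step can be sketched as follows. The data comes from the slide; the function name `answers_before` and the exact comparison are assumptions: at year granularity, a Q1 answer is kept here when its period starts earlier than the year returned for Q2.

```python
# (answer to Q1, start year, end year), as listed on the slide.
candidates = [
    ("Georgetown University", 1964, 1968),
    ("Oxford University", 1968, 1970),
    ("Yale Law School", 1970, 1973),
]
q2_answer = 1968  # answer to "When did Bill Clinton go to Oxford University?"

def answers_before(candidates, reference_year):
    """Keep the Q1 answers whose period starts before the reference year."""
    return [name for name, start, _end in candidates if start < reference_year]

print(answers_before(candidates, q2_answer))  # ['Georgetown University']
```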


System evaluation

•Spanish corpus: training (50 articles) and test (50 articles)
•Kappa factor: measures the agreement among a set of annotators when they make category judgments; k = 0.953
•Two measures:
–Precision: Num Successes / Num Treated Ref
–Recall: Num Successes / Num Real Ref


•Establishing a correct order between the events implies that the resolution is correct and the events are placed on a timeline. For this reason, the resolution of temporal expressions has been evaluated.

Events and their temporal expressions:
–EVENT 1: Jan. 1, 1967
–EVENT 2: a year later
–EVENT 3: two months before

Timeline: EV1 (01/01/1967), EV3 (10/01/1967), EV2 (01/01/1968)


SPANISH        TRAINING   TEST
Num. Art.      50         50
Real Ref.      238        199
Treated Ref.   201        156
Successes      170        138
Precision      84.58%     88.46%
Recall         71.43%     69.35%
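The percentages in the table follow directly from the counts, using precision = successes / treated references and recall = successes / real references. A short check in Python (the dictionary layout and variable names are only illustrative):

```python
# Counts taken from the evaluation table.
results = {
    "training": {"real": 238, "treated": 201, "successes": 170},
    "test": {"real": 199, "treated": 156, "successes": 138},
}

def precision(counts):
    """Num Successes / Num Treated Ref."""
    return counts["successes"] / counts["treated"]

def recall(counts):
    """Num Successes / Num Real Ref."""
    return counts["successes"] / counts["real"]

for name, counts in results.items():
    print(f"{name}: precision {precision(counts):.2%}, recall {recall(counts):.2%}")
# training: precision 84.58%, recall 71.43%
# test: precision 88.46%, recall 69.35%
```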


•Expressions like “el sábado hubo cinco accidentes” (on Saturday there were five accidents) need context information from the sentence containing the reference, in this case the tense of the sentence's verb. Our system does not use this information.
•There is no world knowledge database to resolve, for instance, “two days before the Iraqi war”. We do not have this information at present.


Conclusions

•Obtaining facts related to an event from a documentary database: chronology.
•System:
1. The title of each news item is linked to the date of the document
2. Recognition of temporal expressions: event sentences containing TEs
3. The module for treating TEs is applied
4. The ordering module tags the order of the events in the text

•Application in Temporal Question Answering: decomposition of complex temporal questions into simple ones.
•Future work:
–Cope with context information and world knowledge
–Multilingual evaluation of the system
