OUTSOURCING FRAMENET TO THE CROWD Marco Fossati, Claudio Giuliano, and Sara Tonelli

Upload: marco-fossati

Post on 04-Aug-2015


TRANSCRIPT

Page 1: Outsourcing FrameNet to the Crowd

OUTSOURCING FRAMENET TO THE CROWD

Marco Fossati, Claudio Giuliano, and Sara Tonelli

Page 2: Outsourcing FrameNet to the Crowd

Frame annotation

“A trolley was heaped with beer cans”

Frame evoked: Filling
Frame elements marked in the sentence: Goal, Theme

Lexical unit: to heap
Frame: Filling
Frame elements: Goal, Theme

Page 3: Outsourcing FrameNet to the Crowd

Frame annotation

Lexical unit: to heap

Frame 1: Filling (frame elements: Goal, Theme)
Frame 2: Placing (frame elements: Cause, Agent)

Page 4: Outsourcing FrameNet to the Crowd

CHALLENGE

Full frame annotation by the man in the street

Page 5: Outsourcing FrameNet to the Crowd

2-step methodology

“A trolley was heaped with beer cans”

1. Frame discrimination (word sense disambiguation)
   Which is the sense of heaped? Filling

2. FE recognition (semantic role assignment)
   The Theme is...
   Which is the Theme? with beer cans

Page 6: Outsourcing FrameNet to the Crowd

Critical issues in crowdsourcing

“The element Theme is generally an NP object”

??? Definition by experts for experts

Page 7: Outsourcing FrameNet to the Crowd

Critical issues in crowdsourcing

1. Frame discrimination

2. FE recognition

!!! Error propagation
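A minimal sketch of why error propagation hurts a 2-step workflow. It assumes that FE recognition can only be right when the frame chosen in step 1 is right, and that the two steps err independently; both are simplifying assumptions, and the accuracy values are hypothetical, not taken from the experiments.

```python
from math import prod

def pipeline_accuracy(step_accuracies):
    """Probability that every step is correct, assuming independent steps."""
    return prod(step_accuracies)

# Hypothetical per-step accuracies, for illustration only.
frame_discrimination = 0.9
fe_recognition = 0.9

combined = pipeline_accuracy([frame_discrimination, fe_recognition])
print(f"{combined:.2f}")  # lower than either step taken alone
```

Under these assumptions the chained result can never exceed the accuracy of the weakest step.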

Page 8: Outsourcing FrameNet to the Crowd

Alternative methodology

1-step workflow (bottom-up)

1. FE recognition

Frame emersion

Page 9: Outsourcing FrameNet to the Crowd

Implementation

“A trolley was heaped with beer cans”

Lexical unit: to heap

Answer options for each FE: A trolley / with beer cans / None

Filling: Theme? Goal?
Placing: Agent? Cause?
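The 1-step task on this slide could be sketched roughly as follows. The frame and FE names come from the slide. The function names, the sample answers, and the rule for picking the emerging frame (most FEs answered with something other than "None") are assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter

# FE inventory per candidate frame, as listed on the slide.
CANDIDATE_FRAMES = {
    "Filling": ["Theme", "Goal"],
    "Placing": ["Agent", "Cause"],
}

def build_questions(frames, chunks):
    """One multiple-choice question per (frame, FE); 'None' is always offered."""
    options = list(chunks) + ["None"]
    return [
        {"frame": frame, "fe": fe, "options": options}
        for frame, fes in frames.items()
        for fe in fes
    ]

def emerging_frame(answers):
    """Hypothetical heuristic: the frame with the most non-'None' FE answers."""
    filled = Counter()
    for (frame, _fe), chunk in answers.items():
        if chunk != "None":
            filled[frame] += 1
    return filled.most_common(1)[0][0] if filled else None

questions = build_questions(CANDIDATE_FRAMES, ["A trolley", "with beer cans"])

# Illustrative answers only; they are not data from the experiments.
answers = {
    ("Filling", "Theme"): "with beer cans",
    ("Filling", "Goal"): "A trolley",
    ("Placing", "Agent"): "None",
    ("Placing", "Cause"): "None",
}
print(emerging_frame(answers))
```

Because every question offers "None", a frame whose FEs do not fit the sentence collects only "None" answers, which is what lets the frame emerge from FE recognition without a separate frame discrimination step.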

Page 10: Outsourcing FrameNet to the Crowd

Manual simplification

a. Replace the FE name with the semantic type

b. Simplify complex syntax

c. Avoid variability

d. Reformulate technical concepts using common words


Page 11: Outsourcing FrameNet to the Crowd

Simplification impact

LU        Frame                           FE         Gain
to throw  Cause motion                    Theme      +44%
to throw  Cause motion                    Goal       +19%
to throw  Body movement                   Body part  +31%
to guide  Influence of event on cognizer  Cognizer   +25%
Average gain                                         +30%
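The average in the last row can be checked from the four per-FE gains. The short sketch below only reproduces that arithmetic; the dictionary layout is an assumption for illustration.

```python
# Per-FE gains in percentage points, copied from the table.
gains = {
    ("to throw", "Cause motion", "Theme"): 44,
    ("to throw", "Cause motion", "Goal"): 19,
    ("to throw", "Body movement", "Body part"): 31,
    ("to guide", "Influence of event on cognizer", "Cognizer"): 25,
}

average = sum(gains.values()) / len(gains)
print(average)         # 29.75
print(round(average))  # 30, the figure reported on the slide
```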

Page 12: Outsourcing FrameNet to the Crowd

Simplification examples

Frame: Cause motion
FE: Theme

Original: The element Theme is generally an NP object

Simplified: The Theme is the element that undergoes the motion

Page 13: Outsourcing FrameNet to the Crowd

Simplification examples

Frame: Body movement
FE: Body part

Original: With some verbs in this frame, the Body part involved in the action is specified by the meaning of the verb and cannot be expressed separately

Simplified: This element describes the Body part that is involved in the action

Page 14: Outsourcing FrameNet to the Crowd

EXPERIMENTS

with the CrowdFlower platform

Page 15: Outsourcing FrameNet to the Crowd

Settings

Lexical unit   Frames
to disappear   Ceasing to be, Departing
to guide       Cotheme, Influence of event on cognizer
to heap        Filling, Placing
to throw       Body movement, Cause motion

Judgments: 5
Cost per sentence: 1.83 $ cents

Page 16: Outsourcing FrameNet to the Crowd

2-STEP
Top-down standard annotation workflow

1. Frame discrimination: choose the right sense of a word

“A trolley was heaped with beer cans”
Which is the correct sense?
Options: Filling / Placing

2. FE recognition: find the participants in the event

“A trolley was heaped with beer cans”
Theme: The Theme is the object which changes location
Options: A trolley / with beer cans
Goal: The Goal is...

Simplified FE definitions

Page 17: Outsourcing FrameNet to the Crowd

1-STEP
Bottom-up workflow

“A trolley was heaped with beer cans”

Filling (correct frame)

Theme: The Theme is the object which changes location
Options: A trolley / with beer cans / None
Goal: The Goal is...

Page 18: Outsourcing FrameNet to the Crowd

1-STEP
Bottom-up workflow

“A trolley was heaped with beer cans”

Placing (wrong frame)

Agent: The Agent is the person that causes the Theme to move
Options: A trolley / with beer cans / None
Cause: The Cause is...

Page 19: Outsourcing FrameNet to the Crowd

Results

                             2-step  1-step
Majority vote accuracy       .687    .792
Execution time (h)           171     130
Cost per sentence ($ cents)  4.57    8.41

986 judgments collected so far
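The slides do not spell out how majority vote accuracy is computed. A common reading, sketched here as an assumption, is to take the most frequent worker answer per question and compare it with the gold answer. The questions, judgments, and gold labels below are made up for illustration.

```python
from collections import Counter

def majority_vote(judgments):
    """Most frequent answer for one question (ties go to the first one seen)."""
    return Counter(judgments).most_common(1)[0][0]

def majority_vote_accuracy(judgments_by_question, gold):
    """Share of questions whose majority answer matches the gold answer."""
    correct = sum(
        majority_vote(judgments) == gold[question]
        for question, judgments in judgments_by_question.items()
    )
    return correct / len(judgments_by_question)

# Hypothetical judgments, not data from the experiments.
judgments_by_question = {
    "q1": ["Filling", "Filling", "Placing"],
    "q2": ["Placing", "Placing", "Filling"],
}
gold = {"q1": "Filling", "q2": "Filling"}

accuracy = majority_vote_accuracy(judgments_by_question, gold)
print(accuracy)  # 0.5
```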

Page 20: Outsourcing FrameNet to the Crowd

Lessons learnt

Difficult gold = low agreement

Automatic task takedown = time increase

Contested gold is useful

Signal for tricky FE definitions

Negation and modality are problematic


Page 21: Outsourcing FrameNet to the Crowd

Negation

“On their way to the station she would not throw her coin into the Trevi Fountain”

Goal: the Goal is the place where the element ends up at the end of the motion

A worker said:

“If she WOULD NOT THROW her coin, it did NOT end up in the fountain. Therefore this answer is wrong. She still has the coin.”

Page 22: Outsourcing FrameNet to the Crowd

Conclusion


We can crowdsource frame annotation

Simplification of FE definitions

Bottom-up approach

Page 23: Outsourcing FrameNet to the Crowd

Research directions

Larger scale experiments

Get rid of FE definitions

Entity linking techniques

Semantic type information from structured knowledge bases


Page 25: Outsourcing FrameNet to the Crowd

CROWDCRAFTING.ORG

Free crowdsourcing platform

Our task

Page 26: Outsourcing FrameNet to the Crowd

FE recognition pilots

                     Original  Automatic  Simplified
Majority accuracy    .777      .666       .750
Untrusted judgments  99        222        36