semeval-2013 task 13 - stanford university · 2015-08-20 · semeval-2013 task 13: word sense...

77
SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses [email protected] [email protected] David Jurgens Dipartimento di Informatica Sapienza Universita di Roma Ioannis Klapaftis Search Technology Center Europe Microsoft

Upload: others

Post on 11-Apr-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

SemEval-2013 Task 13: Word Sense Induction for

Graded and Non-Graded Senses

[email protected]

[email protected]

David JurgensDipartimento di InformaticaSapienza Universita di Roma

Ioannis KlapaftisSearch Technology Center Europe

Microsoft

Page 2: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• Introduction

• Task Overview

• Data

• Evaluation

• Results

Page 3: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Which meaning of the word is being used?

John sat on the chair.1. a seat for one person, with a support for the back2. the position of professor3. the officer who presides at the meetings of an organization

Page 4: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Which meaning of the word is being used?

John sat on the chair.1. a seat for one person, with a support for the back2. the position of professor3. the officer who presides at the meetings of an organization

This is the problem of Word Sense Disambiguation (WSD)

Page 5: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

What are the meanings of a word?

It was dark outside

Her dress was a dark green

We didn’t ask what dark purpose the knife was for

It was too dark to see

I light candles when it gets dark

These are some dark glasses

The dark blue clashed with the yellow

The project was made with dark designs

Page 6: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

What are the meanings of a word?

It was dark outside

Her dress was a dark green

We didn’t ask what dark purpose the knife was for

It was too dark to see

I light candles when it gets dark

These are some dark glasses

The dark blue clashed with the yellow

The project was made with dark designs

This is the problem of Word Sense Induction (WSI)

Page 7: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• Introduction

• Task Overview

• Data

• Evaluation

• Results

Page 8: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Task 13 Overview

Induce senses

WSDsystem

Use WordNet

or

Annotate the sametext and measure the

similarity of annotations

Lexicographers

Page 9: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Why another WSD/WSI task?

Page 10: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Why another WSD/WSI task?

Application-based(Task 11)

Annotation-focused(this task)

Page 11: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluation is tied to Inter-Annotator Agreement (IAA)

Lexicographers

If lexicographers can’t agree on which meaning is present, WSD

systems will do no better.

Page 12: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Why might humans not agree?

Page 13: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

He struck them with full force.

Page 14: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

He struck them with full force.

strike#v#1“deliver a sharp blow”

He’s probably fighting so

Page 15: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

He struck them with full force.

strike#v#10 “produce by manipulating keys”

He’s clearly playing a piano!

Page 16: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

He struck them with full force.

strike#v#19 “form by stamping”

I thought he was minting coins the old fashioned way

Page 17: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

He struck them with full force.

• strike#v#1 “deliver a sharp blow”

• strike#v#10 “produce by manipulating keys”

• strike#v#19 “form by stamping”

Only one sense is correct, but contextual ambiguity makes it impossible to determine

which one.

Page 18: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

She handed the paper to her professor

Page 19: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• paper#n#1 - a material made of cellulose

• paper#n#2 - an essay or assignment

She handed the paper to her professor

Multiple, mutually-compatible meanings

Page 20: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• paper#n#1 - a material made of cellulose

• paper#n#2 - an essay or assignment

She handed the paper to her professor

a physical property

Multiple, mutually-compatible meanings

Page 21: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Multiple, mutually-compatible meanings

• paper#n#1 - a material made of cellulose

• paper#n#2 - an essay or assignment

She handed the paper to her professor

a physical property

a functional property

Page 22: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Parallel literal and metaphoric interpretations

• dark#a#1 – devoid of or deficient in light or brightness; shadowed or black

• dark#a#5 – secret

We commemorate our births from out of the dark centers of women

Page 23: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Annotators will use multiple senses if you let them

• Véronis (1998)

• Murray and Green (2004)

• Erk et al. (2009, 2012)

• Jurgens (2012)

• Passoneau et al. (2012)

• Navigli et al. (2013) - Task 12

• Korkontzelos et al. (2013) - Task 5

Page 24: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

New in Task 13: More Ambiguity!

Induce senses

WSDsystem

Use WordNet

or

Annotate the sametext and measure the

similarity of annotations

Lexicographers

Page 25: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Task 13 models explicitly annotating instances with...

• Ambiguity

• Non-exclusive property-based senses in the sense inventory

• Concurrent literal and metaphoric interpretations

Page 26: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Task 13 annotation has lexicographers and WSD systems

use multiple senses with weights

The student handed her paper to the professor

Page 27: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• paper%1:10:01:: – an essay

• paper%1:27:00:: – a material made of cellulose pulp

The student handed her paper to the professor

Definitely! 100%

Task 13 annotation has lexicographers and WSD systems

use multiple senses with weights

Page 28: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• paper%1:10:01:: – an essay

• paper%1:27:00:: – a material made of cellulose pulp

The student handed her paper to the professor

Definitely! 100%

Sort of? 30%

Task 13 annotation has lexicographers and WSD systems

use multiple senses with weights

Page 29: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Potential Applications

• Identifying “less bad” translations in ambiguous contexts

• Potentially preserve ambiguity across translations

• Detecting poetic or figurative usages

• Provide more accurate evaluations when WSD systems detect multiple senses

Page 30: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• Introduction

• Task Overview

• Data

• Evaluation

• Results

Page 31: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Task 13 Data

• Drawn from the Open ANC

• Both written and spoken

• 50 target lemmas

• 20 noun, 20 verb, 10 adjective

• 4,664 Instances total

Page 32: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Annotation Process

1 Use methods from Jurgens (2013) to get MTurk annotations

Page 33: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Annotation Process

Achieve high (> 0. 8) agreement

12

Use methods from Jurgens (2013) to get MTurk annotations

Page 34: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Annotation Process

Achieve high (> 0. 8) agreement

12

Use methods from Jurgens (2013) to get MTurk annotations

3 Analyze annotations and discover Turkersare agreeing but are also wrong

Page 35: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Annotation Process

Achieve high (> 0. 8) agreement

12

Use methods from Jurgens (2013) to get MTurk annotations

3 Analyze annotations and discover Turkersare agreeing but are also wrong

Annotate the data ourselves4

Page 36: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Annotation Setup

• Rate the applicability of each sense on a scale from one to five

• One indicates doesn’t apply

• Five is exactly applies

Page 37: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Multiple sense annotation rates

1 1.1 1.2 1.3 1.4

Senses Per Instance

Face-to-face

Telephone

Fiction

Journal

Letter

Non-fiction

Technical

Travel Guides

Spoken Written

Page 38: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• Introduction

• Task Overview

• Data

• Evaluation

• Results

Page 39: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Evaluating WSI and WSD Systems

Lexicographer Evaluation WSD Evaluation

Page 40: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSI Evaluations

It was dark outside

Her dress was a dark green

We didn’t ask what dark purpose the knife was for

Page 41: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSI Evaluations

It was dark outside

Her dress was a dark green

We didn’t ask what dark purpose the knife was for

It was too dark to seeI light candles when it gets dark

Dark nights and short days

Make it dark red

These are some dark glassesThe dark blue clashed with the yellow

The project was made with dark designs

He had that dark look in his eyes

Page 42: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSI Evaluations

It was dark outside

Her dress was a dark green

We didn’t ask what dark purpose the knife was for

It was too dark to seeI light candles when it gets dark

Dark nights and short days

Make it dark red

These are some dark glassesThe dark blue clashed with the yellow

The project was made with dark designs

He had that dark look in his eyes

Page 43: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSI Evaluations

The project was make with dark designs

Lexicographer

Page 44: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSI Evaluations

The project was make with dark designs

Lexicographer WSI System

Page 45: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSI Evaluations

The project was make with dark designs

Lexicographer WSI System

How similarare the

clusters ofusages?

Page 46: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

The complication of fuzzy clusters

Lexicographer WSI System

Page 47: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

The complication of fuzzy clusters

Lexicographer WSI System

Overlapping

Partialmembership

Page 48: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Evaluation 1: Fuzzy B-Cubed

Lexicographer WSI System

How similar are the clusters of this item

in both solutions?

Page 49: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Evaluation 1: Fuzzy Normalized Mutual Information

Lexicographer WSI System

How much information does this

cluster give us about the cluster(s) of its items in the other solution?

Page 50: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Why two measures?B-Cubed: performance

with the same sense distribution

NMI: performance independent of sense

distribution

Page 51: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

Page 52: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

Induce senses

WSDsystem

Use WordNet

or

Page 53: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

Induce senses

WSDsystem

Use WordNet

or

Learn a mapping function that converts an induced labeling to

a WordNet labeling

• 80% use to learn mapping

• 20% used for testing

• Used Jurgens (2012) method for mapping

Page 54: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

Which senses apply?

Which senses apply more?

How much does each sense apply?

123

Page 55: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

Which senses apply?1

Gold = {wn1, wn2 } Jaccard Index|Gold ∩ Test||Gold ∪ Test|Test = {wn1}

Page 56: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

Which senses apply more?2Gold = {wn1:0.5, wn2:1.0, wn3:0.9}

Test = {wn1:0.6, wn2:1.0,}

wn2 > wn3: > wn1

wn2 > wn1: > wn3

Kendall’s Tau Similaritywith positional weighting

Page 57: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

How much does each sense apply?3

Weighted NormalizedDiscounted Cumulative Gain

Page 58: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

• All measures are bounded in [0,1]

1

0.9

0.8

1

.8

.8

.7

Avg: 0.9 Avg: 0.825

Page 59: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Evaluations

• All measures are bounded in [0,1]

• Extend Recall to be average across all answers

1

0.9

0.8

1

.8

.8

.7

Avg: 0.9 Avg: 0.825Recall: 0.675 Recall: 0.825

Page 60: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

TeamsAI-KU (WSI)

Lexical Substitution+ Clustering

Page 61: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

TeamsAI-KU (WSI) Unimelb (WSI)

Lexical Substitution+ Clustering

Topic Modeling

Page 62: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

TeamsAI-KU (WSI)

UoS (WSI)

Unimelb (WSI)Lexical Substitution

+ ClusteringTopic Modeling

Graph Clustering

Page 63: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

TeamsAI-KU (WSI)

UoS (WSI)

Unimelb (WSI)

La Sapienza (WSD)

Lexical Substitution+ Clustering

Topic Modeling

Graph Clustering PageRank over WordNet graph

Page 64: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSI BaselinesOne cluster per instance

(1c1inst)One cluster

Page 65: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Baselines

•MFS - All instances labeled with MFS from SemCor

•Ranked Senses - All instances labeled with all senses, proportionally weighted by their frequency in SemCor

Page 66: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

• Introduction

• Task Overview

• Data

• Evaluation

• Results

Page 67: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSI Results

0

0.175

0.35

0.525

0.7

0 0.02 0.04 0.06 0.08

Fuzz

y B-

Cub

ed

Fuzzy NMI

One Cluster 1c1inst AI-KUAI-KU (add 1000) AI-KU (add 1000, remove 5) Unimelb (5p)Unimelb (50k) UoS (WN) UoS (Top)

Page 68: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Results

Detection

Ranking

Weighting

0 0.175 0.35 0.525 0.7

AI-KU (add+rem) Unimelb (5000k) UoS (Top) La Sapienza #2One cluster (WSI) 1c1inst (WSI) Semcor MFS Semcor Ranked

Unimelb (5000k)UoS (Top)

One Cluster

SemCor MFS

La Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1inst

Unimelb (5000k)UoS (Top)

One ClusterLa Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1instSemCor MFS

Unimelb (5000k)UoS (Top)

One Cluster

SemCor MFS

La Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1inst

Page 69: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Results

Detection

Ranking

Weighting

0 0.175 0.35 0.525 0.7

AI-KU (add+rem) Unimelb (5000k) UoS (Top) La Sapienza #2One cluster (WSI) 1c1inst (WSI) Semcor MFS Semcor Ranked

Unimelb (5000k)UoS (Top)

One Cluster

SemCor MFS

La Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1inst

Unimelb (5000k)UoS (Top)

One ClusterLa Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1instSemCor MFS

Unimelb (5000k)UoS (Top)

One Cluster

SemCor MFS

La Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1inst

Page 70: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Results

Detection

Ranking

Weighting

0 0.175 0.35 0.525 0.7

AI-KU (add+rem) Unimelb (5000k) UoS (Top) La Sapienza #2One cluster (WSI) 1c1inst (WSI) Semcor MFS Semcor Ranked

Unimelb (5000k)UoS (Top)

One Cluster

SemCor MFS

La Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1inst

Unimelb (5000k)UoS (Top)

One ClusterLa Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1instSemCor MFS

Unimelb (5000k)UoS (Top)

One Cluster

SemCor MFS

La Sapienza #2

SemCor Ranked

AI-KU (add+rem)

1c1inst

Page 71: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Issues with Evaluation

TestTrial100% 11%

Multi-sense Annotation Rate

Task 13 evaluation measures specificallydesigned for multiple senses

Page 72: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Evaluation #2

• Modify the WSI mapping procedure to only produce a single sense

• Modify WSD systems to retain only highest-weighted sense

Page 73: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

WSD Results for single-sense instances

0 0.175 0.35 0.525 0.7

0.477

0

0.569

0.217

0.6

0.605

0.641

F-1 Score

AI-KU

Unimelb (5000k)

UoS (Top)

La Sapienza (#2)

One Cluster

SemCor MFS

One Cluster Per Instance

Page 74: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Conclusions

• Multiple sense annotations offers a way to improve annotation by making ambiguity explicit

• WSI offer some hope for creating highly accurate semi-supervised systems

Page 75: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Future Work

• Embed this application in a task

• Task 11 extension with multiple labels?

• Have systems annotate why an instance needs multiple senses

• Build WSI sense mapping on an external tuning corpus

Page 76: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Summary

• All resources released on the Task 13 website: http://www.cs.york.ac.uk/semeval-2013/task13/

• All evaluation scoring and IAA code is released on Google code https://code.google.com/p/cluster-comparison-tools/

• Annotations (hopefully) being folded into MASC

Page 77: SemEval-2013 Task 13 - Stanford University · 2015-08-20 · SemEval-2013 Task 13: Word Sense Induction for Graded and Non-Graded Senses jurgens@di.uniroma1.it ioannisk@microsoft.com

Any questions?InqueryThe subject matter at issueA sentence of inqueryDoubtfulnessFormal proposals for actionMarriage proposals

SemEval-2013 Task 13: Word Sense Induction for

Graded and Non-Graded Senses

[email protected] [email protected]

David Jurgens Ioannis Klapaftisand