distributional semantic models for affective text …potam/talks/google_semsim_v7.pdf ·  ·...

81
Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects Distributional Semantic Models for Affective Text Analysis and Grammar Induction Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Upload: vumien

Post on 06-Mar-2018

224 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Distributional Semantic Models for AffectiveText Analysis and Grammar Induction

Alexandros Potamianos

School of ECE, National Technical Univ. of Athens, Greece

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 2: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

AcknowledgementsElias Iosif, Kelly Zervanou, Georgia Athanasopoulou: semantic similaritycomputation, semantic networks, semantic spacesNikos Malandrakis, Shri Narayanan (USC): affective models for text, dialogueand multimediaGiannis Klassinas, Georgia Athanasopoulou, Elias Iosif, Spyros Georgiladakis,Elissavet Palogiannidi: grammar induction for spoken dialogue systems

References[1] E. Iosif and A. Potamianos. 2010. “Unsupervised semantic similarity computation between terms using webdocuments”. IEEE Transactions on Knowledge and Data Engineering.[2] E. Iosif and A. Potamianos. 2013. “Similarity computation using semantic networks created from web-harvesteddata”. Natural Language Engineering.[3] N. Malandrakis, A. Potamianos, E. Iosif and S. Narayanan. 2013. “Distributional Semantic Models for AffectiveText Analysis”. IEEE Transactions on Audio, Speech and Language Processing.[4] K. Zervanou, E. Iosif and A. Potamianos. 2014. “Word Semantic Similarity for Morphologically Rich Languages”.In Proc. LREC.[5] N. Malandrakis et al. 2014. “Affective language model adaptation via corpus selection”. In Proc. ICASSP.[6] G. Athanasopoulou, E. Iosif and A. Potamianos. 2014. “Low-Dimensional Manifold Distributional SemanticModels”. Submitted to COLING.[7] S. Georgiladakis et al. 2014. “Fusion of knowledge-based and data-driven approaches to grammar induction".Submitted to Interspeech.

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 3: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Talk Outline

Motivation: Cognitive Semantic ModelsSemantic similarity estimation

Web data harvestingNetwork-based Distributional Semantic Models (DSMs)Hierarchical manifold DSMs

Semantic-affective models of textAffective lexica and semantic-affective mapsCompositional semantics and affectAffective model adaptation

PortDial and SpeDial project overviewGrammar inductionWeb data harvestingSemEval 2014 task on grammar induction

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 4: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

List of Open Questions

1 How are concepts, features/properties, categories, actionsrepresented?

2 How are concepts, properties, categories, actionscombined (compositionally)?

3 How are judgements (classification/recognition decisions)achieved?

4 How is learning and inference (especially induction)achieved?

Answers should fit evidence by psychology and neurocognition!

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 5: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Three Solutions

Symboliccognition is a Turing machinecomputation is symbol manipulationrule-based, deterministic (typically)

Associationism, especially, connectionism (ANNs)brain is a neural networkcomputation is activation/weight propagationexample-based, statistical, unstructured (typically)

Conceptualintermediate between symbolic and connectionistconcepts are represented as well-behaved (sub-)spacescomputation tools: similarity, operators, transformationshierarchical, semi-structured

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 6: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Properties of the Three Approaches

SymbolicGood for high-level cognitive computations (math)Poor generalization powerToo expensive and slow for most cognitive purposes

ConceptualExcellent generalization power (intuition, physics)Good for induction and learning; geometric properties(hierarchy, low dim., convex) guarantee quick convergenceProperties and actions defined as operators/translationsStill too slow for some survival-dependent decisions

Connectionist (machine learning)General-purpose, extremely fast and decently accurateComputational sort-cuts create cognitive biasesPoor generalizability power due to high dimensionality andlack of crisp semantic representation

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 7: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Representation Learning

Properties of a classifier with good generalizationproperties [Bengio et al 2013]:

Low-dimensionality/SparsenessDistributed representations/hierarchy

Depth and abstractionShared factors across tasks

Examples: auto-encoders, manifolds, deep neural nets ...How to induce these properties in your classifiers:

Include as regularization term in training classifier criterionInclude properties directly in classifier designGo deep and pray (dirty neural net tricks)

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 8: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Our Vision

Cognitively-motivated semantic modelsEmphasis on induction not classificationAssociations not probabilities/distanceMappings between layersHierarchical manifold models not metric spacesMultimodal not unimodalOther cognitive considerations ...

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 9: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Problem Definition

Semantic Similarity ComputationGiven a pair of words or terms (w

i

,wj

)Compute semantic similarity between them S(i , j)

Related tasksPhrase or sentence level semantic similarityStrength of associative relation between wordsAffective score (valence) of words and sentences

MotivationOrganizing principle of human cognitionBuilding block of machine learning in NLP/semantic webEntry point for the semantics of language

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 10: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

System 1 vs System 2

Using Kahneman’s (and others) formalism:System 1 (intuition): generates

– impressions, feelings, and inclinationsSystem 2 (reason): turns System 1 input into

– beliefs, attitudes, and intentions

Associative relations reside in System 1But where do semantic relations reside?

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 11: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Example

Example from vision: system 1 vs system 2

L" L"L"

L" L"

L" L"L"

T"

T"

T"

T"L"L"L"

L"L" L"

L"L"

L"

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 12: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Main approaches of lexical semantics

Word are associated with feature vectorscrisp, parsimonious representation of semantics

Distributional semantic models (DSMs)Semantic information extracted from word frequenciesEstimate co-occurence counts of word pairs or tripletsEstimate statistics of word context vectors

Semantic networksdiscovery of new relations via systematic co-variationrobust estimates – smoothing corpus statistics over networkrapid language acquisition

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 13: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Example of Semantic Network

Linked nodes: lexicalized senses and attributesInformative for semantic similarity computation

Computation of structural properties, e.g., cliques

t oma to

juicevege tab le

fruit

suga r

lemon

daiquiri

apple

z e s t

o r ange

t r e e

species

p lan t

flowering

milk

c r eam

flour

sal t

bu t t e rcorn

honey

na t ive sh rub

ga rden flower

seed

w a t e rsoilanimal

drinkgenu s

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 14: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Proposed semantic similarity two-tier system

Unifies the three approachesFuzzy vs explicit semantic relationsWord senses vs words vs conceptsA two tier system

An associative network backboneSemantic relations defined as operations on networkneighborhoods (cliques)

Consistent with system 1 vs system 2 viewFurthermore we believe that the

underlying network consists of word senses, andis a low dimensional semi-metric space

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 15: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Semantic Similarity Estimation by Machines

Resource-based, e.g., WordNetRequire expert knowledgeNot available for all languages

Corpus-basedDistributional semantic models (DSMs)Unstructured (unsupervised): no use of linguistic structureStructured: use of linguistic structurePattern-based, e.g., Hearst patterns

Mixed

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 16: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Semantic Sim. Computation: Sense Similarity

Maximum sense similarity assumption [Resnik, ’95]:Similarity of words equal to similarity of their closest sensesIf words are considered as sets of word senses, this is the“common sense” set distance

Given words w1, w2 with senses s1i

, s2j

S(w1,w2) = maxij

S(s1i

, s2j

)

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 17: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Semantic Sim. Computation: Attributional Similarity

Attributional similarity assumptionAttributes (features) reflect semantics

Item-Relation-Attribute, e.g., canary-color-yellowMain representation schemes

Hierarchical/CategoricalMainly taxonomic relations, e.g., IsA, PartOf

Distributed (networks)Open set of relations, e.g., Cause-Effect, etc

Similarity between wordsFunction of attribute similarityDefined wrt representation

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 18: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Types of Similarity Metrics

Co-occurrence-basedAssumption: co-occurrence implies relatednessCo-occurrence counts: web hits, corpus-basedExamples: Dice coef., point-wise mutual information, ...

Context-basedAssumption: context similarity implies relatedness(distributional hypothesis of meaning)Contextual features extracted from corpusExamples: Kullback-Leibler divergence, cosine similarity, ...

Network-based (proposed)Build lexical net using co-occurrence and/or context sim.Notion of semantic neighborhoodsAssumptions: neighborhoods capture word semantics

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 19: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Queries to Web Search Engines

Number of hitsDocument URLs (download)Document snippets

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 20: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Corpus Creation using Web Queries

Two types of web queriesAND, e.g., “money + bank”“... leading bank in India offering online money transfer ...”IND, e.g., “bank”“... downstream parallel to the banks of the river ...”

AND queriesPros: Similarity computation highly correlated (0.88) withhuman ratings [Iosif & Potamianos, ’10]

Cons: Quadratic query complexity wrt lexicon L

IND queriesPros: Linear query complexity wrt lexicon L

Cons: Sense ambiguity: moderate correlation (0.55)

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 21: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Semantic Similarity Estimation

Co-occurence based metricsFrom web: hits of IND, AND queriesFrom (web) corpus: co-occurence counts at the snippet orsentence levelMetrics: Dice, Jacard, Mutual Information, Google

Context-based metricsDownload a corpus of documents of snippets using INDqueriesConstruct lexical context vector for each word (window ±1)Cosine similarity using binary or log-weighted counts

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 22: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Enter semantic networks

Why do IND queries fail to achieve good performance?1 Word senses are often semantically diverse

co-occurrence acts as a semantic filter2 Word senses have poor coverage in IND queries

rare word senses of words not well-represented

Solution: use semantic networks1 Create a corpus for all words in lexicon (not just semantic

similarity pair)2 Use semantic neighborhoods for semantic cohesion

improved robustness3 Inverse frequency word-sense discovery

discover rare senses via co-occurrence with infrequentwords

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 23: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Corpus and Network Creation

GoalsLinear web query complexity for corpus creationNew similarity metrics with high performance

Proposed methodIND queries to aggregate data for large L ( ⇡ 9K )Create network and semantic neighborhoodsNeighborhood-based similarity metrics

AdvantagesNetwork: parsimonious representation of corpus statisticsSmooth distributionsRare words: well-representedEnable discovery of less frequent senses

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 24: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Lexical Network - Semantic Neighborhoods

Lexical NetworkUndirected graph G = (N,E)

Vertices N: words in lexicon L

Edges E : word similarities

Semantic NeighborhoodsFor word i create subgraph G

i

Select neighbors of i

Compute S(i , j), 8 j 2 L, i 6= j

Sort j according to S(i , j)Select |N

i

| top-ranked j

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 25: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Semantic Neighborhoods: Examples

Word Neighborsautomobile auto, truck, vehicle,

car, engine, bus, . . .car truck, vehicle, travel,

service, price, industry, . . .slave slavery, beggar, nationalism,

society, democracy, aristocracy, . . .journey trip, holiday, culture,

travel, discovery, quest, . . .

SynonymyTaxonomic: IsA, MeronymyAssociativeBroader semantics/pragmatics. . .

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 26: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Neighborhood-based Similarity Metrics: M

n

M

n

metric: maximum similarity of neighborhoods

Motivated by maximum sense similarity assumptionNeighbors are semantic features denoting sensesSimilarity of two closest senses

Select max. similarity: M

n

(“forest”, “fruit”) = 0.30

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 27: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Neighborhood-based Similarity Metrics: R

n

R

n

metric: correlation of neighborhood similarities

Motivated by attributional similarity assumptionNeighborhoods encode word attributes (or features)Similar words have co-varying sim. wrt their neighbors

Compute correlation r of neighborhood similaritiesr1((0.16...0.09), (0.10...0.01)), r2((0.002...0), (0.63...0.13))

Select max. correlation: R

n

(“forest”, “fruit”) = �0.04Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 28: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Neighborhood-based Similarity Metrics: metric E

✓=2n

E

✓=2n

metric : sum of squared neighborhood similarities

Motivation: middle road between M

n

and R

n

Accumulation of word–to–neighbor similaritiesNon-linear weighting of similarities via ✓ = 2

E

✓=2n

(“forest”, “fruit”)=p(0.102 + · · ·+ 0.012) + (0.0022 + · · ·+ 02) = 0.22

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 29: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Performance of net-based similarity metrics

Task: similarity judgment on noun pairsDataset: MC [Miller and Charles, 1998]

Evaluation metric: Pearson’s correlation wrt to humanratings

Dataset Neighbor Similarity Metricsselection computation M

n=100 R

n=100 E

✓=2n=100

MC co-occur. co-occur. 0.90 0.72 0.90MC co-occur. context 0.91 0.28 0.46MC context co-occur. 0.52 0.78 0.56MC context context 0.51 0.77 0.29

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 30: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Main findings

Network constructionCo-occurence metrics achieve high-recall for word sensesContext-based metrics achieve high-recall for attributes

Semantic similarity performanceCo-occurence a more robust feature that contextMax sense similarity assumption is valid and gives bestperformanceAttributional similarity assumption valid for certaincases/languages

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 31: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Performance of web-based similarity metrics

For MC datasetFeature Description Correlationcontext AND queries 0.88context IND queries 0.55context IND queries: network 0.90

Comparable to structured DSMs, WordNet-basedapproaches

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 32: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Extensions

Sentence level semantic similarity (SemEval 2012, 2013)Abstract vs concrete semantic networks (IWSC 2013)Grammar induction in PortDial/SpeDial projectsMorphologically rich languages (LREC 2014)

Network-based DSMs perform consistently well acrosslanguages

Network DSMs and language acquisition (BabyAffectproject)

Recognition vs generalization power (induction)

Manifold DSMsMultimodal (text and image) conceptual spaces

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 33: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Acquisition of lexical semantics

Assume a recently acquired word w

Num. of w ’s examples needed for “learning” w ’s similaritiesRelated to acquisition of lexical semantics

CompareSimple co-occurrence-based similarity metricNetwork-based similarity metric

Experiment28 noun pairs (Miller-Charles dataset)Remove one word from each pair from the networkCompute pair similaritiesEvaluation: correlation coef. wrt human similarity ratings

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 34: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Acquisition of lexical semantics

100

101

102

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Num. of examples for out!of!net words

Co

rrela

tio

n

Co!occurrence!based metric

Network!based metric

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 35: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Manifold DSMs

Cognitive semantic space is fragmented in domainsSparse encoding of relations in each domain (manifold)Low-dimensional subspaces with good geometricproperties

vs non-metric global semantic space

Semantic similarity operation is performed locally in eachsubspaceDecision fusion to reach semantic similarity score

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 36: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Manifold DSMs

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 37: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Sparse similarity matrices

Correlation performance on the WS363 task

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 38: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Effect of dimensionality

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 39: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Contributions

Proposed a language agnostic, unsupervised and scalablealgorithm for semantic similarity computation

No linguistic knowledge required, works from text corpusor using a web query engineShown to perform at least as well as resource-basedsemantic similarity computation algorithms, e.g.,WordNet-based methods

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 40: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Compositional Semantic-Affective Models of Text

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 41: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Motivation

Affective text labeling at the core of many multimediaapplications, e.g.,

Sentiment analysisSpoken dialogue systemsEmotion tracking of multimedia content

Affective lexicon is the main resource used to bootstrapaffective text labeling

Lexica are currently of limited scope and quality

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 42: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Goals and Contributions

Our goal: assigning continuous high-quality polarity ratings toany lexical unit

We present a method of expanding an affective lexicon,using web-based semantic similarityAssumption: semantic similarity implies affective similarity.The expanded lexica are accurate and broad in scope,e.g., they can contain proper nouns, multi-word terms

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 43: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Our lexicon expansion method

Extension of [Turney and Littman, ’02].Assumption: the valence of a word can be expressed as alinear combination of the valence ratings of seed wordsweighted by semantic similarity and trainable weights a

i

:

v̂(t) = a0 +NX

i=1

a

i

v(wi

) d(wi

, t), (1)

t : a word or n-gram (token) not in the affective lexiconw1...wN

: seed wordsv(.) : valence rating of a word or n-grama

i

: weight assigned to seed w

i

d(wi

, t) : semantic similarity between word w

i

and token t

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 44: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Semantic-Affective Mapping

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 45: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Givenan initial lexicon of K wordsa set of N < K seed words

we can use (1) to create a system of K linear equations withN + 1 unknown variables:

2

64

1 d(w1,w1)v(w1) · · · d(w1,wN

)v(wN

)...

......

...1 d(w

K

,w1)v(w1) · · · d(wK

,wN

)v(wN

)

3

75 ·

2

6664

a0a1...

a

N

3

7775=

2

6664

1v(w1)

...v(w

K

)

3

7775(2)

Solving with Least Mean Squares estimation provides theweights a

i

.

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 46: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Example, N = 10 seeds

Order w

i

v(wi

) a

i

v(wi

)⇥ a

i

1 mutilate -0.8 0.75 -0.602 intimate 0.65 3.74 2.433 poison -0.76 5.15 -3.914 bankrupt -0.75 5.94 -4.465 passion 0.76 4.77 3.636 misery -0.77 8.05 -6.207 joyful 0.81 6.4 5.188 optimism 0.49 7.14 3.509 loneliness -0.85 3.08 -2.62

10 orgasm 0.83 2.16 1.79- w0 (offset) 1 0.28 0.28

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 47: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Sentence Tagging

Simple combinations of word ratings:linear (average)

v1(s) =1N

NX

i=1

v(wi

)

weighted average

v2(s) =1

NPi=1

|v(wi

)|

NX

i=1

v(wi

)2 · sign(v(wi

))

max

v3(s) = maxi

(|v(wi

)|) · sign(v(wz

)), z = arg maxi

(|v(wi

)|)

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 48: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

N-gram Affective Models

Generalize method to n-grams

v

i

(s) = a0 + a1v

i

(unigram) + a2v

i

(bigram)

Starting from all 1-grams and 2-grams, select terms:1 Backoff: use overlapping bigrams as default, revert to

unigrams based on mutual information-based criterion2 Weighted interpolation: use all unigrams and bigrams as

default, reject bigrams based on criterion

In both cases unigrams and bigrams are given linearweights, trained using LMS

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 49: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Evaluation

ANEW Word Polarity Detection TaskAffective norms for English words (ANEW) corpus1.034 English words, continuous valence ratings

General Inquirer Word Polarity DetectionGeneral Inquirer words corpus3.607 English words, binary valence ratings

BAWLR Word Polarity Detection TaskBerlin affective word list reloaded (BAWLR) corpus2.902 German words, continuous valence ratings

SemEval 2007 Sentence Polarity DetectionSemEval 2007 News Headlines corpus1.000 English sentences, continuous valence ratingsANEW used for lexicon training250 sentence development set used for word fusion training

SemEval 2013, 2014: Twitter Sentiment AnalysisAlexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 50: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Experimental Procedure

Corpus selectionWeb corpus (web)Lexically balanced web corpus (14m, 116m)

Semantic DistanceCo-occurence based (G = google)Context-based using web snippets (S)

All experiments: training on ANEW seed words(cross-validation)

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 51: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Word Polarity Detection (ANEW)

2-class word classification accuracy (positive vs negative)

0 200 400 600 8000.6

0.65

0.7

0.75

0.8

0.85

0.9

Acc

ura

cy

number of seeds

baseline

S1/116m

S1/14mG/116mG/web

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 52: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Word Polarity Detection (GINQ)

2-class word classification accuracy (positive vs negative)

0 200 400 600 8000.6

0.65

0.7

0.75

0.8

0.85

0.9

Acc

ura

cy

number of seeds

baseline

S1/116m

S1/14mG/116mG/web

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 53: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Word Polarity Detection (BAWLR)

2-class word classification accuracy (positive vs negative)

0 200 400 600 8000.6

0.65

0.7

0.75

0.8

0.85

0.9

Acc

ura

cy

number of seeds

S1/170m

G/170m

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 54: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Sentence Polarity Detection (SemEval 2007)

2-class sentence classification accuracy (positive vs negative),using weighted interpolation

0 200 400 600 8000.5

0.55

0.6

0.65

0.7

0.75

0.8

number of seeds

Acc

ura

cy

plain averageweighted averagemin max

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 55: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Sentence Polarity Detection (SemEval 2007)

2-class sentence classification accuracy (positive vs negative),vs bigram rejection threshold

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 56: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Sentence Polarity Detection (SemEval 2007)

Using a compositional DSM model for AN pairs

2-class sentence classification accuracy

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 57: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

ChIMP Sentence Frustration/Politeness Detection

ChIMP Children Utterances corpus15.585 English sentences, Politeness/Frustration/NeutralratingsSoA results, binary accuracy P vs 0 / F vs O:

81% / 62.7% [Yildirim et al, ’05]

10-fold cross-validationANEW used for training/seeds to create word ratingsChiMP words added to ANEW with weight w , to adapt tothe taskSimilarity metric: Google semantic relatednessOnly content words taken into account

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 58: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Politeness: Sentence Fusion schemeClassification Accuracy avg w.avg maxBaseline: P vs O 0.70 0.69 0.54Adapt w = 1: P vs O 0.74 0.70 0.67Adapt w = 2: P vs O 0.77 0.74 0.71Adapt w = 1: P vs O 0.84 0.82 0.75Frustration: Sentence Fusion schemeClassification Accuracy avg w.avg maxBaseline: F vs O 0.53 0.62 0.66Adapt w = 1: F vs O 0.51 0.58 0.57Adapt w = 2: F vs O 0.49 0.53 0.53Adapt w = 1: F vs O 0.52 0.52 0.52

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 59: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Twitter Sentiment Analysis: Main Concept

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 60: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Twitter Sentiment Analysis: Semantic Adaptation

3-class sentence classification accuracy(positive-neutral-negative) [ICASSP 2014]

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 61: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

SemEval 2014: Twitter Sentiment Results Analysis

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 62: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Conclusions

Proposed a high-performing, robust, general-purpose andscalable algorithm for affective lexicon creation

Investigated linear and non-linear sentence level fusionschemes, showing good but task-dependent performanceInvestigated domain adaptation: semantic space vssemantic-affective mapping adaptationDemonstrated that distributional approach can generalizeto n-grams

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 63: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Linguistic Resources for Spoken Dialogue Systems:The PortDial and SpeDial projects

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 64: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Outline

1 PortDial project“Language Resources for Portable Multilingual SpokenDialogue Systems”2-year EU-funded project: currently in last quarterwww.portdial.eu

2 SpedDial project“Machine-Aided Methods for Spoken Dialogue SystemEnhancement and Customization for Call-CenterApplications”2-year EU-funded project: currently in first quarterwww.spedial.eu

3 SemEval’14-Task 2“Grammar Induction for Spoken Dialogue Systems”Evaluation period: until March 30http://alt.qcri.org/semeval2014/task2/

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 65: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Outline

GrammarsEssential unit of spoken dialogue systemsExpertise needed, time-consumingNeed for rapid porting

PortDial paradigmMachine-aided processHuman-in-the-loop

ProtDial approachesCorpora creation via web harvestingGrammar induction

Bottom-up: corpus-basedTop-down: ontology-basedFusion of bottom-up and top-down

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 66: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Web-harvested corpora

WWW search query, e.g., “depart from”&(“flight”|“travel”|..)Out-of-domain LM: perplexity ! grammaticality/spellingIn-domain LM: perplexity ! domain relevance

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 67: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Web-harvested corpora

Travel domain grammar83 low-level rules

E.g., <City> = (“New York”, “London”, ... )47 high-level rules

E.g., <ArrivalCity> = (“fly to <City>”, “arrive at <City>”, ...)Use of various corpora for inducing low-level rules

Corpus Precision Recall F-measureQ&A 0.52 0.40 0.45WoZ 0.41 0.33 0.37

Human-Human 0.42 0.32 0.36Human-System 0.41 0.34 0.37

Manually harvested 0.46 0.41 0.43Web-harvested 0.56 0.45 0.50

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 68: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Bottom-up Grammar Induction

Goal: induction of high-level rulesBased on the availability of low-level rules

Minimal set of examples (seeds) are providedAnalogous to the manual process of grammar developmentExamples are automatically augmented

Two sub-problems1 Extraction of fragments from corpus

Retain fragments with appropriate boundariesE.g., “to depart from <City> on” VS “depart from <City>”

2 Similarity between seeds and extracted fragmentsRetain semantically similar fragmentsE.g., “out of <City>” VS “to <City>”

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 69: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Bottom-up Grammar Induction

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 70: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Bottom-up Grammar Induction

1 Extraction of fragmentsBinary classification problem

Valid / non-valid fragmentsSeeds considered as valid fragments

Types of featuresLexical, e.g., frequency in corpus, num. of tokensSyntactic, e.g., fragment perplexity, PoS info.Semantic, e.g., similarity wrt to seeds

2 Similarity computationNon-compositional: fragments as entire chunks

Various well-known lexical metricsE.g., longest common sub-string similarity

Compositional: function of constituents’ semanticsRecent open research problemModels proposed for sentences, but not for phrases

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 71: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Bottom-up Grammar Induction

EvaluationTravel domainInput: n seeds for each rule

n < 5Output: m fragments suggested for each rule

m: user-definedAccuracy

Valid / non-valid fragments classification: 43%Suggestion of semantically similar fragments: 30%

However, in practiceSome non-valid fragments may be useful

Lengthier, e.g., “depart from <City> on”Human-in-the-loop idea

Post correctionsIterative process

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 72: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Bottom-up & Top-down Grammar Induction

Goal: Fuse different approaches for grammar inductionHigh-level rules

Approaches1 Bottom-up: corpus-based2 Top-down: based on ontology lexica

Bottom-up (BU)Relies on given seeds for each grammar ruleExtraction and suggestion of similar textual fragments

Top-down (TD)Ontology lexica: lexicalizations of ontological knowledgeRepresent domain semantics in ontological representationPossible lexicalizations are encoded as grammar rules

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 73: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

PortDial: Bottom-up & Top-down Grammar Induction

Three fusion approaches1 Early fusion

Rules of TD triggered to generate a corpus; input to BU2 Mid fusion

TD grammar rules given as seeds to BU3 Late fusion

Rules of TD and BU are combined (union)Evaluation: Travel domain

Approach Precision Recall F-measureBottom-Up (BU) 0.65 0.44 0.52Top-Down (TD) 0.81 0.18 0.30

Early fusion 0.64 0.44 0.52Mid fusion 0.56 0.54 0.55Late fusion 0.72 0.55 0.63

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 74: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

SpeDial Objectives

Devise machine-aided algorithms for spoken dialoguesystem enhancement and customization for call-centerapplicationsCreate a platform that supports cost-effective servicedoctoring for

Service enhancement: the developer starts from an existingapplication and tries to improve performance and usersatisfaction,Service customization: the developer addresses the specialneeds of a user population

Create and support a sustainable pool of developers thatwill be trained to use the platform

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 75: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

SpeDial: Multimodal Analytics for IVR

Affective analysis of dialoguesValence, arousal, mood, certain/uncertainAlso: gender, age, nativeness identification

Call-flow, discourse and cross-modal analyticsIdentify problematic and successful parts of the dialoguesIdentify dialogue hot-spots

Multilingual analyticsHow previous sub-tasks can be applied across multiplelanguages

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 76: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

SpeDial: Enhancement and Customization

Prompt and grammar enhancementSelect most appropriate prompts from the pool of promptsUse transcriptions to train/update statistical grammarsUpdate FSM grammars via grammar induction

Dialogue flow enhancementAdjustment of system policies for successful interactions

User modeling: prompt selection wrtAge, gender, age & gender

MultilingualitySMT & crowd-sourcing to improve on prompts & grammarsCorpus-based methods for statistical grammar trainingDirect translation of service grammars

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 77: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

SemEval’14: Task on Grammar Induction

SemEval workshopsVarious shared evaluations tasks of computationalsemantic analysis systemsSemEval’14: the 8th workshopCo-located with COLING’14, Dublin, Ireland, August 2014

SemEval’14 - Task 2“Grammar induction for spoken dialogue systems”Fosters the application of models of lexical semantics tospoken dialogue systemsOrganized by the consortium of PortDial project

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 78: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

SemEval’14: Task on Grammar Induction

Grammar rules distinguished into1 Low-level

Refer to basic concepts; comprised by lexical items onlyE.g., <City> = (“New York”, “London”, ... )

2 High-levelGrouping of semantically related textual fragmentsComposed of both lexical items and low-level rulesE.g., <ArrivalCity> = (“fly to <City>”, “arrive at <City>”, ...)

Parsing example1 “I want to fly to London”2 “I want to fly to <City>”3 “I want to <ArrivalCity>”

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 79: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

SemEval’14: Task on Grammar Induction

Sub-problems1 Induction of low-level rules

1 Well-investigated2 Also, use of resources, e.g., gazetteers

2 Induction of high-level rules1 Segmentation problem: identify candidate fragments2 Similarity problem: compute similarity between fragments

SemEval’14-Task 2Focus on high-level rules

Low-level rules are givenSegmentation problem simplified as:

Discriminate between valid / non-valid fragmentsMain focus: computation of similarity between fragments

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 80: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

SemEval’14: Task on Grammar Induction

Train dataList of fragments for each grammar ruleInstances of low-level rules: givenList of non-valid fragments

Test dataList of unknown fragmentsFor each unknown fragment:

1 Is it a valid fragment?2 If so, assign it to the most similar rule

Domain LanguageTravel EnglishTravel Greek

Tourism EnglishFinance English

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction

Page 81: Distributional Semantic Models for Affective Text …potam/talks/google_semsim_v7.pdf ·  · 2014-08-15Compositional semantics and affect ... (distributional hypothesis of meaning)

Intro Sem.Similarity Network DSMs Manifold DSMs Textual Affect Lexicon expansion Evaluation Projects

Thank you

Alexandros Potamianos School of ECE, National Technical Univ. of Athens, Greece

Distributional Semantic Models for Affective Text Analysis and Grammar Induction