Page 1: Extraction of Opinions on the Web - Uni Koblenz-Landau · Advanced topic 1: Opinion extraction with an interaction model ! Previous work used bracketing methods with local features

Extraction of Opinions on the Web

Richard Johansson

Presentation at the LK summer school August 31, 2011

Computer Science and Engineering Department University of Trento

Email: [email protected]

Funded by EU FP7: LivingKnowledge and EternalS

Personal Background

!   Defended doctoral dissertation in December 2008 at Lund University, Sweden

!   I now work as a postdoctoral researcher at the University of Trento, Italy

!   PhD work focused on NLP tasks such as syntactic parsing and shallow-semantic extraction

!   Postdoc work on the applications of these methods in areas such as opinion extraction

Overview

!   Introduction

!   Coarse-grained methods

!   Fine-grained methods

!   Resources

!   Advanced topics: recent research from LK

Introduction

!   Extraction of opinions expressed on the web is a task with many practical applications
    !   “give me all positive opinions expressed by Sarkozy last week”
    !   “what is the overall perception (positive/negative) on the New Start treaty?”

“Vaclav Klaus expressed his [disapproval] of the treaty while French President Sarkozy [supported] it.”

Direct applications

!   Consumer information
    !   Quickly surveying evaluations from other consumers
    !   Conversely, companies may survey what customers think

!   Social and political sciences
    !   Surveying popular opinion on contentious issues
    !   Track the development of opinion over time
    !   Measure the effect of some event on opinions

Indirect applications

!   Retrieval systems
    !   Given a topic, identify documents that express attitudes toward this topic

!   Question-answering systems
    !   Obvious: What does X think about Y?
    !   Also: Filtering out opinionated text before returning answers

A note on terminology

!   Opinion extraction/analysis/mining etc

!   Sentiment analysis/extraction

!   Subjectivity analysis/extraction

!   Etc etc etc

Coarse-grained Opinion Extraction

!   Classification of fairly large units of text (e.g. documents)

!   Examples:
    !   Distinguish editorials from “objective” news text
    !   Given a review (product, movie, restaurant, …), predict the number of stars

Lexicon-based Methods

!   Simplest solution: count “positive” and “negative” words listed in some lexicon
    !   The counts may also be weighted

!   Lexicons may be generic or domain-specific

!   Example (with SentiWordNet, first sense): “This movie is awful with really boring actors”
    !   awful: 0.875 negative
    !   really: 0.625 positive
    !   boring: 0.25 negative
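
The weighted counting can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the three scores are the ones quoted in the example above, while the dictionary layout, the sign convention (negative scores below zero) and the function names are assumptions made for the sketch.

```python
# Toy lexicon: the three scores come from the SentiWordNet example above;
# storing negative polarity as a negative number is an assumption.
LEXICON = {
    "awful": -0.875,
    "really": 0.625,
    "boring": -0.25,
}

def lexicon_score(text, lexicon=LEXICON):
    """Sum the signed lexicon scores of all known words in the text."""
    tokens = text.lower().split()
    return sum(lexicon.get(tok, 0.0) for tok in tokens)

def lexicon_label(text, lexicon=LEXICON):
    """Map the summed score to a coarse polarity label."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score == 0:
        return "neutral"
    return "negative"

sentence = "This movie is awful with really boring actors"
print(lexicon_score(sentence), lexicon_label(sentence))
```

Note that "really" pulls the total toward positive even though it only intensifies "boring" here, which is one reason purely lexicon-based counting is fragile.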

Classification using machine learning

!   Coarse-grained opinion extraction is a type of text categorization

!   Categorize the text
    !   As factual or opinionated
    !   As positive or negative (or the number of stars)

!   We may then obviously apply classical text categorization methods (Pang and Lee, 2002)

Classification using machine learning

!   Represent a document using a bag of words representation (i.e. a histogram)

!   Optionally, add extra features for words that appear in some lexicon

!   Apply some machine learning method to learn to separate the documents into classes (e.g. SVM, MaxEnt, Naïve Bayes, …)
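
As a rough sketch of this recipe, the following builds bag-of-words histograms and trains a small multinomial Naïve Bayes classifier with add-one smoothing. The four training sentences, the class names and the `NaiveBayes` class are invented for illustration; a real system would use a proper tokenizer, far more data, and possibly an SVM or MaxEnt learner instead.

```python
import math
from collections import Counter, defaultdict

def bag_of_words(text):
    """Represent a document as a histogram of lowercased tokens."""
    return Counter(text.lower().split())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(bag_of_words(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(count / n_docs)  # class prior
            for word, freq in bag_of_words(doc).items():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                p = (self.word_counts[label][word] + 1) / (total + len(self.vocab))
                score += freq * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented toy training set.
train_docs = [
    "a wonderful and moving film",
    "great acting and a great story",
    "an awful and boring film",
    "terrible acting and a dull story",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("a great and moving story"))
```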

But the context…

“The price is high – I saw many cheaper options elsewhere”

!   In practice, expressions of opinion are highly context-sensitive: Unigram (BOW or lexicon) models may run into difficulties

!   Possible solutions:
    !   Bigrams, trigrams, …
    !   Syntax-based representations
    !   Very large feature spaces: feature selection needed
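
A small sketch of the n-gram idea, assuming whitespace tokenization and a hypothetical helper `ngram_features`. It shows how a bigram such as "not so" becomes a feature of its own, and how quickly the feature count grows even for one short sentence.

```python
from collections import Counter

def ngram_features(text, max_n=2):
    """Count all n-grams of length 1..max_n in a whitespace-tokenized text."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats

unigrams = ngram_features("The software is not so easy to use", max_n=1)
bigrams = ngram_features("The software is not so easy to use", max_n=2)
print(len(unigrams), len(bigrams), bigrams["not so"])
```

With unigrams only, "easy" looks like positive evidence; the bigram features "not so" and "so easy" give a learner the chance to pick up the negation.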

Domain Adaptation

!   Problem: an opinion classifier trained on one collection (e.g. reviews of hotels) may not perform well on a collection from a different domain (e.g. reviews of cars)

!   We may apply domain adaptation methods (Blitzer et al., 2007, inter alia)

!   Similar methods may be applied for lexicon-based opinion classifiers (Jijkoun et al., 2010)

Structural Correspondence Learning (Blitzer et al., 2007)

!   Idea:
    !   Some pivot features generalize across domains (e.g. “good”, “awful”)
    !   Some features are completely domain-specific (“plastic”, “noisy”, “dark”)
    !   Find correlations between pivot and domain-specific features

!   Example experiment:
    !   DVD movies -> kitchen appliances
    !   Baseline 0.74, upper bound 0.88
    !   With domain adaptation: 0.81
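
The intuition behind the correlation step can be illustrated with a heavily simplified sketch that only counts, over unlabeled documents, how often each non-pivot word shares a document with each pivot. This is not the SCL algorithm itself, which trains linear predictors for the pivots and applies an SVD to their weights; the documents, the pivot set and the function name below are invented for illustration.

```python
from collections import Counter

PIVOTS = {"good", "awful"}  # assumed pivot features

def pivot_cooccurrence(docs, pivots=PIVOTS):
    """For every non-pivot word, count the documents it shares with each pivot."""
    counts = {}
    for doc in docs:
        tokens = set(doc.lower().split())
        present = tokens & pivots
        for word in tokens - pivots:
            counts.setdefault(word, Counter()).update(present)
    return counts

# Invented unlabeled documents from two domains.
unlabeled = [
    "good sturdy blender",
    "awful noisy blender",
    "awful noisy motor",
    "good dark thriller",
]
counts = pivot_cooccurrence(unlabeled)
print(counts["noisy"], counts["blender"])
```

Here "noisy" co-occurs only with the negative pivot, so it can be treated as negative evidence in the appliance domain even without labeled appliance reviews, while "blender" co-occurs with both pivots and carries no polarity signal.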

Fine-grained Opinion Extraction

!   We may want to pose more complex queries:
    !   “give me all positive opinions expressed by Sarkozy last week”
    !   “what is the overall perception (positive/negative) on the New Start treaty?”
    !   “what is good and what is bad about the new Canon camera?”

“Vaclav Klaus expressed his [disapproval] of the treaty while French President Sarkozy [supported] it.”

Common subtasks

!   Mark up opinion expressions in the text

!   Label expressions with polarity values

!   Find opinion holders for the opinions

!   Find the topics (targets) of the opinions

Opinion Expressions

!   An opinion expression is a piece of text that allows us to conclude that some entity has some opinion – a private state

!   The MPQA corpus (Wiebe et al., 2005) defines two main types of expressions:
    !   Direct-subjective: typically emotion, communication, and categorization verbs
    !   Expressive subjective: typically qualitative adjectives and “loaded language”

Examples of opinion expressions

!   I [love]DSE this [fantastic]ESE conference.

!   [However]ESE, it is becoming [rather fashionable]ESE to [exchange harsh words]DSE with each other [like kids]ESE.

!   The software is [not so easy]ESE to use.

Opinion Holders

!   For every opinion expression, there is an associated opinion holder.

!   Also annotated in the MPQA

!   Our system finds three types of holders:
    !   Explicitly mentioned holders in the same sentence
    !   The writer of the text
    !   Implicit holder, such as in passive sentences (“he was widely condemned”)

Examples of opinion holders

!   Explicitly mentioned holder: I [love]DSE this [fantastic]ESE conference.

!   Writer (red) and implicit (green): [However]ESE, it is becoming [rather fashionable]ESE to [exchange harsh words]DSE with each other [like kids]ESE.

Nested structure of opinion scopes

Sharon [insinuated]ESE+DSE that Arafat [hated]DSE Israel.

!   Writer: negative opinion on Sharon

!   Sharon: negative opinion on Arafat

!   Arafat: negative opinion on Israel

!   The MPQA corpus annotates the nested structure of opinion/holder scopes

!   Our system does not take the nesting into account

Opinion polarities

!   Every opinion expression has a polarity: positive, negative, or neutral (for non-evaluative opinions)

!   I [love] this [fantastic] conference.

!   [However], it is becoming [rather fashionable] to [exchange harsh words] with each other [like kids].

!   The software is [not so easy] to use.

Tagging Opinion Expressions

!   The obvious approach – which we used as a baseline – would be a standard sequence labeler with Viterbi decoding.

!   Sequence labeler using word, POS tag, and lemma features in a sliding window

!   Can also use prior polarity/intensity features derived from the MPQA subjectivity lexicon.

!   This was the approach taken by Breck et al. (2007)
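
Viterbi decoding for such a labeler can be sketched as follows, assuming a B/I/O tag set for expression boundaries. The local scores and the single transition penalty are invented; in a trained system they would come from the learned weights over the word, POS, lemma and lexicon features.

```python
TAGS = ["B", "I", "O"]  # begin / inside / outside an opinion expression

def viterbi(emissions, transitions, tags=TAGS):
    """Return the highest-scoring tag sequence.

    emissions: one dict per token mapping tag -> local score.
    transitions: dict mapping (previous_tag, tag) -> score.
    """
    # best[i][t] = score of the best path ending in tag t at position i
    best = [{t: emissions[0][t] for t in tags}]
    back = []
    for i in range(1, len(emissions)):
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[(p, t)])
            scores[t] = best[-1][prev] + transitions[(prev, t)] + emissions[i][t]
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]

def toy_emission(token):
    """Invented local scores standing in for a trained model."""
    if token == "not":
        return {"B": 2.0, "I": 0.0, "O": 1.0}
    if token in {"so", "easy"}:
        return {"B": 0.5, "I": 2.0, "O": 1.0}
    return {"B": 0.0, "I": 0.0, "O": 1.0}

transitions = {(p, t): 0.0 for p in TAGS for t in TAGS}
transitions[("O", "I")] = -100.0  # "I" may not directly follow "O"

tokens = "The software is not so easy to use".split()
tags = viterbi([toy_emission(tok) for tok in tokens], transitions)
print(list(zip(tokens, tags)))
```

With these toy scores the decoder brackets "not so easy" as one expression. Each decision only sees local scores and the previous tag, which is the limitation the interaction model later in the talk addresses.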

Extracting Opinion Holders

!   For opinion holder extraction, we trained a classifier based on techniques common in semantic role labeling

!   Applies to the noun phrases in a sentence

!   A separate classifier detects implicit and writer opinion holders

!   At prediction time, the opinion holder candidate with the maximal score is selected
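
The selection step amounts to an argmax over scored candidates. In the sketch below the candidate names and scores are invented, and pooling the noun-phrase candidates with the special writer and implicit candidates into a single argmax is an assumption about how the two classifiers are combined.

```python
def select_holder(candidates, score):
    """Return the candidate holder with the highest classifier score."""
    return max(candidates, key=score)

# Invented scores standing in for the output of the trained classifiers.
toy_scores = {
    "Vaclav Klaus": 1.7,   # noun phrase in the sentence
    "the treaty": -0.9,    # noun phrase in the sentence
    "WRITER": 0.2,         # special candidate: the writer of the text
    "IMPLICIT": -0.4,      # special candidate: implicit holder
}
holder = select_holder(toy_scores, toy_scores.get)
print(holder)
```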

Syntactic structure and semantic roles

!   We used the LTH syntactic/semantic parser to extract features (Johansson and Nugues, 2008)

!   Outputs dependency parse trees and semantic role structures

Classifying Expression Polarity

!   Given an opinion expression, assign a polarity label (Positive, Neutral, Negative)

!   SVM classifier with BOW representation of the expression and its context, lexicon features
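
A possible shape for the feature extraction is sketched below: bag-of-words features for the expression and for a small context window, plus prior-polarity features from a lexicon. The feature names, the window size and the three-word lexicon are assumptions made for the sketch; the resulting counts would then be passed to an SVM or another classifier.

```python
from collections import Counter

# Tiny invented prior-polarity lexicon (not the MPQA lexicon itself).
PRIOR = {"love": "positive", "fantastic": "positive", "harsh": "negative"}

def polarity_features(tokens, start, end, window=2, lexicon=PRIOR):
    """Features for the expression tokens[start:end] and its context."""
    feats = Counter()
    for tok in tokens[start:end]:
        word = tok.lower()
        feats["expr=" + word] += 1
        if word in lexicon:
            feats["expr_prior=" + lexicon[word]] += 1
    context = tokens[max(0, start - window):start] + tokens[end:end + window]
    for tok in context:
        word = tok.lower()
        feats["ctx=" + word] += 1
        if word in lexicon:
            feats["ctx_prior=" + lexicon[word]] += 1
    return feats

tokens = "I love this fantastic conference".split()
feats = polarity_features(tokens, 1, 2)  # the expression is "love"
print(sorted(feats.items()))
```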

Resources: Collections

!   Pang: Movie reviews (pos/neg)
    !   http://www.cs.cornell.edu/people/pabo/movie-review-data

!   Liu: Product features
    !   http://www.cs.uic.edu/~liub/FBS/CustomerReviewData.zip

!   Dredze: Multi-domain product reviews (pos/neg)
    !   http://www.cs.jhu.edu/~mdredze/datasets/sentiment

!   MPQA: Fine-grained annotation: expressions, holder, polarities, intensities, holder coreference
    !   http://www.cs.pitt.edu/mpqa/databaserelease

Resources: Lexicons

!   MPQA lexicon
    !   http://www.cs.pitt.edu/mpqa/lexiconrelease/collectinfo1.html

!   SentiWordNet
    !   http://sentiwordnet.isti.cnr.it

Advanced topic 1: Opinion extraction with an interaction model

!   Previous work used bracketing methods with local features and Viterbi decoding

!   In a sequence labeler using local features only, the model can’t take into account the interactions between opinion expressions

!   Opinions tend to be structurally close in the sentence, and occur in patterns, for instance
    !   Verb of categorization dominating evaluation: He denounced as a human rights violation …
    !   Discourse connections: Zürich is beautiful but its restaurants are expensive

Interaction (opinion holders)

!   For verbs of evaluation/categorization, opinion holder extraction is fairly easy (basically SRL)

!   They may help us find the holder of other opinions expressed in the sentence:
    !   He denounced as a human rights violation …
    !   This is a human rights violation …

!   Linguistic structure may be useful to determine whether two opinions have the same holder

Interaction (polarity)

!   The relation between opinion expressions may influence polarity:
    !   He denounced as a human rights violation …

!   Discourse relations are also important:
    !   Expansion: Zürich is beautiful and its restaurants are good
    !   Contrast: Zürich is beautiful but its restaurants are expensive

Learning the Interaction model

!   We need a new model based on interactions between opinions

!   We use a standard linear model:

!   We decompose the feature representation:

!   But: Exact inference in a model with interactions is intractable (can be reduced to weighted CSP)
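
A linear model over structured outputs is conventionally written as below, together with one plausible way to split the feature representation into a local part and an interaction part. The notation (weight vector w, feature function Φ, input sentence x, candidate analysis y) and the additive form of the split are assumptions, not formulas taken from the talk.

```latex
\hat{y} = \arg\max_{y \in \mathcal{Y}(x)} \; w \cdot \Phi(x, y),
\qquad
\Phi(x, y) = \Phi_{\text{local}}(x, y) + \Phi_{\text{interaction}}(x, y)
```

Under this reading, the local part can be maximized exactly with Viterbi decoding, while the interaction part couples distant decisions and is what makes exact search intractable.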

Approximate inference

!   Apply a standard Viterbi-based sequence labeler based on local context features but no structural interaction features.

!   Generate a small candidate set of size k.

!   Generate opinion holders/polarities for every proposed opinion expression.

!   Apply a reranker using interaction features – which can be arbitrarily complex – to pick the top candidate from the candidate set.
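
The reranking step can be sketched as follows. The k-best list, the base scores and the single hand-written interaction feature (a bonus for pairs of expressions that lie close together) are invented for illustration; the actual reranker uses a learned model over much richer interaction features, and adding the base score to the interaction score is an assumption.

```python
from itertools import combinations

def interaction_score(spans, max_gap=3, bonus=1.0):
    """Toy interaction feature: reward pairs of expressions that lie close together."""
    score = 0.0
    for (s1, e1), (s2, e2) in combinations(sorted(spans), 2):
        if max_gap >= s2 - e1:
            score += bonus
    return score

def rerank(kbest):
    """kbest: list of (spans, base_score) pairs from the sequence labeler."""
    return max(kbest, key=lambda c: c[1] + interaction_score(c[0]))

# Tokens: He(0) denounced(1) as(2) a(3) human(4) rights(5) violation(6).
# Spans are (start, end) with an exclusive end; invented k-best list, k = 3.
kbest = [
    ([(1, 2)], 4.0),          # only "denounced"
    ([(1, 2), (4, 7)], 3.6),  # "denounced" and "human rights violation"
    ([(4, 7)], 3.1),          # only "human rights violation"
]
best_spans, base = rerank(kbest)
print(best_spans, base)
```

The base model prefers the first candidate, but the interaction bonus lifts the second one to the top, which is the kind of correction the reranker is meant to make.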

Evaluation

!   (Johansson and Moschitti 2010a, 2010b, 2011)

Task                    System      F-measure
Opinion markup          Baseline    53.8
Opinion markup          Reranked    58.5
Holder identification   Baseline    50.8
Holder identification   Extended    54.2
Markup + polarity       Baseline    45.7
Markup + polarity       Extended    49.7

Advanced topic 2: Extraction of Feature Evaluations

!   Extraction of evaluations of product features (Hu and Liu, 2004)

“This player boasts a decent size and weight, a relatively-intuitive navigational system that categorizes based on id3 tags, and excellent sound”

size +2, weight +2, navigational system +2, sound +2

!   We used only the signs (positive/negative)
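
Reducing the annotations to signs is straightforward. The sketch below parses the annotation string exactly as it is written above; the helper name and the assumption that items are comma-separated "name ±strength" pairs are made for illustration and may not match the file format of the original dataset.

```python
import re

def parse_feature_signs(annotation):
    """Parse 'name +2, other -1' into {name: 'positive' | 'negative'}."""
    signs = {}
    for item in annotation.split(","):
        match = re.fullmatch(r"\s*(.+?)\s*([+-])\d+\s*", item)
        if match:
            name, sign = match.groups()
            signs[name] = "positive" if sign == "+" else "negative"
    return signs

signs = parse_feature_signs("size +2, weight +2, navigational system +2, sound +2")
print(signs)
```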

Extraction of Feature Evaluations

!   We built a system that used features derived from the MPQA-style opinion expressions

!   We compared with two baselines:
    !   Simple baseline using local features only
    !   Stronger baseline using sentiment lexicon

References

E. Breck, Y. Choi, C. Cardie. Identifying expressions of opinion in context. Proc. IJCAI 2007.

J. Blitzer, M. Dredze, F. Pereira. Biographies, Bollywood, Boom-boxes and Blenders: Domain adaptation for sentiment classification. Proc. ACL 2007.

Y. Choi, C. Cardie. Hierarchical sequential learning for extracting opinions and their attributes. Proc. ACL 2010.

M. Hu, B. Liu. Mining opinion features in customer reviews. Proc. AAAI 2004.

V. Jijkoun, M. de Rijke, W. Weerkamp. Generating focused topic-specific sentiment lexicons. Proc. ACL 2010.

R. Johansson, A. Moschitti. Syntactic and semantic structure for opinion expression detection. Proc. CoNLL 2010.

R. Johansson, A. Moschitti. Reranking models in fine-grained opinion analysis. Proc. Coling 2010.

R. Johansson, A. Moschitti. Extracting opinion expressions and their polarities – exploration of pipelines and joint models. Proc. ACL 2011.

R. Johansson, P. Nugues. Dependency-based syntactic–semantic analysis with PropBank and NomBank. Proc. CoNLL 2008.

B. Pang, L. Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. Proc. ACL 2004.

S. Somasundaran, G. Namata, J. Wiebe, L. Getoor. Supervised and unsupervised methods in employing discourse relations for improving opinion polarity classification. Proc. EMNLP 2009.

J. Wiebe, T. Wilson, C. Cardie. Annotating expressions of opinions and emotions in language. LRE, 39(2-3), 2005.

Acknowledgements

!   We have received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under the following grants:
    !   Grant 231126: LivingKnowledge – Facts, Opinions and Bias in Time
    !   Grant 247758: Trustworthy Eternal Systems via Evolving Software, Data and Knowledge (EternalS)

!   We would also like to thank Eric Breck and Yejin Choi for explaining their results and experimental setup.