Predicting the Semantic Orientation of Adjectives
Vasileios Hatzivassiloglou and Kathleen R. McKeown
Presenter: Gabriel Nicolae
Introduction
Orientation/polarity = direction of deviation from the norm
Nearly synonymous
simple vs. simplistic
Antonyms
hot vs. cold
Introduction (cont.)
In linguistic constructs such as conjunctions, the choices of arguments and connective are mutually constrained.
The tax proposal was
simple and well-received
simplistic but well-received
*simplistic and well-received (anomalous)
by the public.
Exceptions
Goals
Automatically identify antonyms
Distinguish near synonyms
How? By retrieving semantic orientation information, using indirect information collected from a large corpus
Why? Dictionaries and similar sources (thesauri, WordNet):
do not explicitly include semantic orientation information
lack links between antonyms and synonyms when these depend on the domain of the discourse
Overview of their approach
Correlation between indicators and semantic orientation
direct indicators: affixes (in-, un-)
mostly negative
exceptions: independent, unbiased
indirect indicators: conjunctions
conjoined adjectives usually are of the same orientation for most connectives
the situation is reversed for but
From corpus:
fair and legitimate
corrupt and brutal
vs. semantically anomalous:
fair and brutal
corrupt and legitimate
General algorithm
1. Extract conjunctions of adjectives and morphological relations
2. Label each pair of conjoined adjectives as having the same or different orientation, using a log-linear regression model
3. Separate adjectives into two subsets of different orientation using a clustering algorithm
4. The group with the higher average frequency is labeled as positive
Data collection
Corpus: the 21-million-word 1987 Wall Street Journal corpus
Training data: a set of adjectives with predetermined (hand-annotated) orientation labels (+ or -)
1,336 adjectives (657 +, 679 -)
The training set was validated by four other people
500 adjectives: 89.15% agreement
Test data: 15,048 conjunction tokens
9,296 distinct pairs of conjoined adjectives (types)
Data collection (cont.)
Each conjunction token is classified according to three variables:
conjunction used: and, or, but, either-or, neither-nor
type of modification: attributive, predicative, appositive, resultative
number of the modified noun: singular, plural
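The three classification variables can be pictured as fields of a small record, with counts tallied per adjective pair and category. This is only a sketch: the class name `ConjunctionToken`, the function `count_by_category`, and the field names are hypothetical and not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ConjunctionToken:
    """One observed conjunction of two adjectives (hypothetical record layout)."""
    adj1: str
    adj2: str
    conjunction: str   # and, or, but, either-or, neither-nor
    modification: str  # attributive, predicative, appositive, resultative
    number: str        # singular, plural


def count_by_category(tokens):
    """Count tokens per (unordered adjective pair, conjunction, modification, number)."""
    counts = Counter()
    for t in tokens:
        pair = frozenset((t.adj1, t.adj2))  # "fair and just" == "just and fair"
        counts[(pair, t.conjunction, t.modification, t.number)] += 1
    return counts


def distinct_pairs(tokens):
    """Set of distinct conjoined pairs (types), ignoring word order."""
    return {frozenset((t.adj1, t.adj2)) for t in tokens}
```

Counts of this kind are what a per-pair feature vector would be built from; tokens are occurrences, while distinct pairs are types.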
Validation of the conjunction hypothesis
Results
Their conjunction hypothesis is validated overall and for almost all individual cases
There are small differences in the behavior of conjunctions between linguistic environments (as represented by the three attributes)
Conjoined antonyms appear far more frequently than expected by chance in conjunctions other than but
Prediction of link type
Baseline 1: always guessing that a link is of the same orientation type => 77.84% accuracy
Baseline 2: Baseline 1 + but exhibits the opposite pattern => 80.82% accuracy
Morphological relationships:
Adjectives related in form almost always have different semantic orientations
Highly accurate (97.06%), but applies only to the 1,336 labeled adjectives (891,780 possible pairs)
E.g. adequate-inadequate, thoughtful-thoughtless
Baseline 1 + Morphology => 78.86% accuracy
Baseline 2 + Morphology => 81.75% accuracy
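The rule behind "Baseline 2 + Morphology" can be sketched as follows. The affix lists here are an assumption for illustration (only in-, un- and the -ful/-less pair suggested by the slide's examples); the paper's actual morphological analysis may differ, and the function names are hypothetical.

```python
# Assumed affix inventory, for illustration only.
NEGATIVE_PREFIXES = ("in", "un")
SUFFIX_PAIRS = (("ful", "less"),)


def morphologically_related(a, b):
    """True if one adjective looks like an affixed form of the other."""
    short, long_ = sorted((a, b), key=len)
    if any(long_ == prefix + short for prefix in NEGATIVE_PREFIXES):
        return True
    for s1, s2 in SUFFIX_PAIRS:
        for u, v in ((a, b), (b, a)):
            if u.endswith(s1) and v.endswith(s2) and u[:-len(s1)] == v[:-len(s2)]:
                return True
    return False


def predict_link(a, b, conjunction):
    """Baseline 2 + Morphology: 'same' unless morphology or 'but' says otherwise."""
    if morphologically_related(a, b):
        return "different"
    return "different" if conjunction == "but" else "same"
```

For example, `predict_link("adequate", "inadequate", "and")` gives `"different"` through the morphology rule, while `predict_link("fair", "legitimate", "and")` falls back to the baseline and gives `"same"`.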
Prediction of link type (cont.)
Log-linear regression model

$$y = \frac{e^{w^T x}}{1 + e^{w^T x}}$$

x: the vector of the observed counts in the various conjunction categories
w: the vector of weights to be learned
y: the response of the system
Using the method of iterative stepwise refinement, they selected 9 predictor variables from all 90 possible predictor variables.
Small improvement: 80.97% accuracy (82.05% accuracy using Morphology), but now each prediction is rated between 0 and 1
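The response of the log-linear model, y = e^{w^T x} / (1 + e^{w^T x}), can be computed with a few lines of code. This is a minimal sketch: the function name `link_response` is hypothetical, the weights are assumed to be already fitted, and reading y as the estimated probability of a same-orientation link is an assumption not stated on the slide.

```python
import math


def link_response(w, x):
    """Return y = e^(w.x) / (1 + e^(w.x)), always between 0 and 1.

    w: learned weights (assumed given); x: observed counts per conjunction category.
    """
    eta = sum(wi * xi for wi, xi in zip(w, x))
    # Two algebraically equivalent forms, chosen to avoid overflow in exp().
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)
```

Because the output is graded rather than binary, it can be turned into a dissimilarity value for the clustering step instead of a hard same/different decision.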
Clustering
Input: a graph of adjectives connected by dissimilarity links
Small dissimilarity value => same-orientation link
High dissimilarity value => different-orientation link
Method used: apply an iterative optimization procedure on each connected component, based on the exchange method, a non-hierarchical clustering algorithm
Idea: find the partition P such that the objective function Φ is minimized
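
$$\Phi(P) = \sum_{i=1}^{2} \left( \frac{1}{|C_i|} \sum_{x, y \in C_i,\ x \neq y} d(x, y) \right)$$

where C_1 and C_2 are the two clusters of P and d(x, y) is the dissimilarity between adjectives x and y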
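A minimal sketch of the exchange idea on one connected component: start from some two-way partition and keep moving the single adjective whose move most reduces Φ, until no move helps. Several details are assumptions made for illustration: the function names, the caller-supplied starting partition, the neutral default of 0.5 for pairs without a link, counting each unordered pair once (which only rescales Φ by a constant), and never emptying a cluster.

```python
from itertools import combinations


def phi(clusters, d, default=0.5):
    """Sum over clusters of (1/|C|) * total pairwise dissimilarity within C."""
    total = 0.0
    for c in clusters:
        if c:
            pairs = combinations(sorted(c), 2)
            total += sum(d.get(frozenset(p), default) for p in pairs) / len(c)
    return total


def exchange_cluster(items, d, init):
    """Greedy exchange: repeatedly apply the single move that lowers phi the most."""
    c1, c2 = set(init), set(items) - set(init)
    best = phi((c1, c2), d)
    while True:
        best_move = None
        for x in sorted(c1 | c2):
            src, dst = (c1, c2) if x in c1 else (c2, c1)
            if len(src) == 1:
                continue  # assumption: keep both clusters non-empty
            score = phi((src - {x}, dst | {x}), d)
            if score < best - 1e-12:
                best, best_move = score, x
        if best_move is None:
            return c1, c2
        if best_move in c1:
            c1.remove(best_move)
            c2.add(best_move)
        else:
            c2.remove(best_move)
            c1.add(best_move)
```

With low dissimilarity for fair-legitimate and corrupt-brutal and high dissimilarity for the mixed pairs, the procedure separates the four adjectives into {fair, legitimate} and {corrupt, brutal} even from a poor starting partition. Like any greedy local search, it can stop in a local minimum, so in practice several starting partitions would be tried.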
Labeling the clusters as + or -
In oppositions of gradable adjectives where one member is semantically unmarked, the unmarked member is the most frequent one about 81% of the time
Unmarked => positive orientation almost always
So, label as positive the group that has the highest average frequency of words.
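The labeling rule can be sketched directly: compute the average corpus frequency of each group and call the more frequent one positive. The function name `label_clusters` and the frequency table passed to it are hypothetical; the counts in any example are made up for illustration and are not figures from the paper.

```python
def label_clusters(c1, c2, freq):
    """Return (positive, negative): the group with the higher average frequency is positive.

    freq maps each adjective to its corpus frequency (assumed to be available).
    """
    def average_frequency(cluster):
        return sum(freq[w] for w in cluster) / len(cluster)

    if average_frequency(c1) >= average_frequency(c2):
        return c1, c2
    return c2, c1
```

Usage, with invented counts: `label_clusters({"fair", "legitimate"}, {"corrupt", "brutal"}, {"fair": 120, "legitimate": 80, "corrupt": 40, "brutal": 30})` returns the fair/legitimate group as positive.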
Graph connectivity and performance
They tested how graph connectivity affects the overall performance