Intelligent Database Systems Lab
Presenter : JIAN-REN CHEN
Authors : Ahmed Abbasi, Stephen France, Zhu Zhang,
and Hsinchun Chen
2011, IEEE TKDE
Selecting Attributes for Sentiment Classification Using Feature Relation Networks
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
Sentiment analysis has emerged as a method for mining opinions from text archives.
It is a challenging problem:
1. It requires the use of large quantities of linguistic features.
2. These heterogeneous n-gram categories must be integrated into a single feature set.
- noise, redundancy, and computational limitations
1) polarity 2) intensity: "I don't like you" vs. "I hate you"
n-gram (Markov model)
Weather: sunny, cloudy, rainy
美麗 ("beautiful") vs 美痢 (a homophone written with the wrong character)
"HAPAX" and "DIS" tags: "I hate Jim" is replaced with "I hate HAPAX"
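The tag replacement on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that HAPAX marks a word occurring exactly once in the corpus and DIS a word occurring exactly twice (the usual meaning of hapax and dis legomena), and the function name `tag_legomena` and the toy corpus are hypothetical.

```python
from collections import Counter

def tag_legomena(docs):
    """Replace once-occurring words with HAPAX and twice-occurring words with DIS.

    Assumption: frequencies are counted case-insensitively over the whole corpus.
    """
    tokenized = [doc.split() for doc in docs]
    counts = Counter(tok.lower() for doc in tokenized for tok in doc)
    tagged = []
    for doc in tokenized:
        out = []
        for tok in doc:
            n = counts[tok.lower()]
            if n == 1:
                out.append("HAPAX")
            elif n == 2:
                out.append("DIS")
            else:
                out.append(tok)
        tagged.append(" ".join(out))
    return tagged

# Toy corpus (hypothetical): "Jim" occurs once, "rain" occurs twice.
docs = ["I hate Jim", "I hate rain", "I hate Mondays", "rain again"]
tagged_docs = tag_legomena(docs)
print(tagged_docs[0])  # I hate HAPAX
```

Replacing rare words with a shared tag lets n-grams such as "I hate HAPAX" generalize across documents that each name a different rare target.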
Objectives
• Feature Relation Network (FRN) considers semantic information
and also leverages the syntactic relationships between n-gram
features.
- enhanced sentiment classification on extended sets of
heterogeneous n-gram features.
Methodology - Extended N-Gram Feature Set
Methodology - Subsumption Relations
A subsumes B (A → B)
Example: "I love chocolate"
unigrams: I, LOVE, CHOCOLATE
bigrams: I LOVE, LOVE CHOCOLATE
trigrams: I LOVE CHOCOLATE
What about the bigrams and trigrams?
It depends on their weight: they are kept only if their weight exceeds that of their more general lower-order counterparts by a threshold t.
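The subsumption rule can be sketched for the "I love chocolate" example. This is a simplified illustration under stated assumptions, not the FRN algorithm itself: the feature weights are invented, the helper names (`ngrams`, `subsumes`, `apply_subsumption`) are hypothetical, and subsumption is reduced to "the lower-order n-gram appears contiguously inside the higher-order one".

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def subsumes(a, b):
    """True if the lower-order n-gram a occurs contiguously inside the higher-order n-gram b."""
    a_tok, b_tok = a.split(), b.split()
    if len(a_tok) >= len(b_tok):
        return False
    span = len(a_tok)
    return any(b_tok[i:i + span] == a_tok for i in range(len(b_tok) - span + 1))

def apply_subsumption(weights, t):
    """Keep a feature only if its weight exceeds every subsuming feature's weight by more than t."""
    kept = {}
    for feat, w in weights.items():
        parent_weights = [wa for a, wa in weights.items() if subsumes(a, feat)]
        if all(w > wa + t for wa in parent_weights):
            kept[feat] = w
    return kept

tokens = ["I", "LOVE", "CHOCOLATE"]
bigrams = ngrams(tokens, 2)  # ['I LOVE', 'LOVE CHOCOLATE']

# Hypothetical weights (e.g. from some univariate score); not taken from the paper.
weights = {
    "I": 0.01, "LOVE": 0.60, "CHOCOLATE": 0.20,
    "I LOVE": 0.62, "LOVE CHOCOLATE": 0.75,
    "I LOVE CHOCOLATE": 0.76,
}
kept = apply_subsumption(weights, t=0.05)
print(sorted(kept))
```

With these made-up weights, "I LOVE" adds too little over "LOVE" and "I LOVE CHOCOLATE" adds too little over "LOVE CHOCOLATE", so both are dropped, while "LOVE CHOCOLATE" clears the threshold and is kept.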
Methodology - Parallel Relations
A parallel B (A - B)
POS tag: "ADMIRE_VP" → "like"
semantic class: "SYN-Affection" → "love"
If A and B have a correlation coefficient greater than some threshold p, one of the attributes is removed to avoid redundancy.
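The parallel-relation check can be sketched as a correlation filter. This is a minimal illustration under assumptions: the slide does not say which of the two attributes is removed, so the sketch keeps the higher-weighted one; the per-document occurrence counts, the weights, and the function names are hypothetical, and every feature is compared with every other rather than only with its parallel counterparts.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def apply_parallel(columns, weights, p):
    """Visit features from highest to lowest weight; drop one whose correlation
    with an already kept feature is greater than p."""
    kept = []
    for feat in sorted(columns, key=weights.get, reverse=True):
        if all(pearson(columns[feat], columns[k]) <= p for k in kept):
            kept.append(feat)
    return kept

# Hypothetical occurrence counts over five documents.
columns = {
    "like":      [1, 0, 2, 0, 1],
    "ADMIRE_VP": [1, 0, 2, 0, 1],  # fires on the same documents as "like"
    "hate":      [0, 1, 0, 2, 0],
}
weights = {"like": 0.5, "ADMIRE_VP": 0.4, "hate": 0.6}
kept_parallel = apply_parallel(columns, weights, p=0.90)
print(kept_parallel)  # ['hate', 'like']
```

Here "ADMIRE_VP" is removed because it is almost perfectly correlated with "like", which carries the higher (assumed) weight.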
Methodology - The Complete Network
Methodology - Incorporating Semantic Information
Experiments - Datasets
Experiments - FRN vs Univariate
Experiments - FRN vs Univariate (WithinOne)
Experiments - FRN vs Multivariate
Experiments - FRN vs Multivariate (WithinOne)
Experiments - FRN vs Hybrid
Experiments - FRN vs Hybrid (WithinOne)
Experiments - Ablation
Experiments - Parameters
t (0.0005, 0.005, 0.05, and 0.5)
p (0.80, 0.90, and 1.00)
Experiments - Average Runtimes
Conclusions
• FRN had significantly higher best accuracy and best
percentage within-one across three testbeds.
• The ablation and parameter testing results show that the
subsumption and parallel relation thresholds play an
important role.
Comments
• Advantages
- accurate, computationally efficient
• Disadvantage
- sensitive to the ablation and parameter settings
• Applications
- sentiment classification
- feature selection method