
Interpretable and Effective Opinion Spam Detection via Temporal Pattern Mining Across Websites

Yuan Yuan, Sihong Xie, Chun-Ta Lu, Jie Tang and Philip S. Yu

Tsinghua University, Lehigh University and University of Illinois at Chicago

December 7, 2016

Online reviews & spam

Reviews and ratings influence our decisions

Spam reviews are misleading (the review below was filtered by Yelp)

Yuan et al. (BigData 2016)

Multiple review sites

One business may have information on multiple sites

What if we combine information on different sites?


Basic idea: Bi-level framework


Main contributions

Proposed a novel spam detection framework using time series patterns defined over multiple data sources.

Performed in-depth studies to reveal a full picture of the defined patterns on two levels.

Showed, through quantitative (prediction) and qualitative (case study) results, that the framework can precisely identify and explain attacks that were not previously spotted.


Single website time series construction

Useful single-website patterns: Count of Reviews, Average Rating, Five-star Ratio, Low-rating Ratio, Average Sentiment, Highly Positive Sentiment Ratio, Negative Sentiment Ratio

e.g. Five-star Ratio:

$$FR_s(t) = \frac{\sum_{r_s :\, \mathrm{time}(r_s) \in \tau_t} \mathbb{1}[\mathrm{rating}(r_s) = 5] + \alpha\, FR_s}{CR_s(t) + \alpha}$$
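A minimal Python sketch of the smoothed Five-star Ratio, under stated assumptions: the window τ_t is taken to be the set of reviews posted at time step t, CR_s(t) is read as the number of reviews in that window, and the FR_s term weighted by α is read as a site-level five-star ratio acting as a prior. The names `five_star_ratio`, `alpha` and `prior_ratio` are hypothetical and do not come from the slides.

```python
def five_star_ratio(ratings, alpha, prior_ratio):
    """Smoothed five-star ratio for the ratings that fall in one time window.

    ratings     -- star ratings of the reviews posted in the window
    alpha       -- smoothing weight (hypothetical name); must be > 0 so that
                   an empty window does not divide by zero
    prior_ratio -- assumed site-level five-star ratio used as the prior
    """
    five_star_count = sum(1 for r in ratings if r == 5)
    review_count = len(ratings)  # plays the role of CR_s(t)
    return (five_star_count + alpha * prior_ratio) / (review_count + alpha)


# A window with two five-star reviews out of four, and an empty window.
print(five_star_ratio([5, 5, 3, 1], alpha=2.0, prior_ratio=0.5))
print(five_star_ratio([], alpha=2.0, prior_ratio=0.5))
```

With this reading, a window that contains no reviews falls back to the prior ratio instead of being undefined, which keeps the time series continuous.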


Algorithm: Single-site time series pattern detection

For each pair of segments, compute

$$d = \frac{\lambda}{\left(1/|k_1| + 1/|k_2|\right)\Delta t + \lambda}$$

If $d > \theta$, $k_1 > 0$ and $k_2 < 0$: a burst window is detected

If $d > \theta$, $k_1 < 0$ and $k_2 > 0$: a dive window is detected

Take the union of detected burst/dive windows

Each time window is classified into burst/dive/plateau
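The per-pair test in the algorithm can be sketched in Python as follows. This is an illustration only: it assumes that k1 and k2 are the slopes of two consecutive fitted line segments and that ∆t is the time spanned by the pair, neither of which is spelled out on the slide. The function name `classify_pair` and the default values for `lam` (λ) and `theta` (θ) are hypothetical.

```python
def classify_pair(k1, k2, dt, lam=1.0, theta=0.4):
    """Label one pair of segments as 'burst', 'dive' or 'plateau'.

    k1, k2 -- assumed slopes of the first and second segment
    dt     -- assumed time span covered by the pair
    lam    -- lambda in the score d (default chosen for illustration)
    theta  -- detection threshold (default chosen for illustration)
    """
    if k1 == 0 or k2 == 0:
        # A flat segment makes 1/|k| undefined; treat the pair as a plateau.
        return "plateau"
    d = lam / ((1.0 / abs(k1) + 1.0 / abs(k2)) * dt + lam)
    if d > theta:
        if k1 > 0 and k2 < 0:
            return "burst"  # sharp rise followed by a sharp fall
        if k1 < 0 and k2 > 0:
            return "dive"   # sharp fall followed by a sharp rise
    return "plateau"


print(classify_pair(2.0, -2.0, dt=1.0))
print(classify_pair(-2.0, 2.0, dt=1.0))
print(classify_pair(0.1, -0.1, dt=1.0))
```

Steeper slopes and shorter spans push d toward 1, so only sharp, short-lived reversals exceed the threshold; everything else stays a plateau.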


Cross-site time series pattern design and construction

detect single-site patterns in different sites

combine the simultaneous patterns

assumption: different cross-site patterns have different spam ratios (validated on the dataset)
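A short Python sketch of the combination step, assuming each site has already been labelled with one pattern per shared time window ('B' for burst, 'P' for plateau, 'D' for dive) and that a cross-site code is the Yelp letter followed by the Foursquare letter. The helper names `combine_patterns` and `spam_ratio_by_pattern`, and the toy counts in the usage example, are hypothetical.

```python
def combine_patterns(yelp_labels, foursquare_labels):
    """Pair per-window labels from two sites into codes such as 'BP'."""
    if len(yelp_labels) != len(foursquare_labels):
        raise ValueError("both sites need one label per shared time window")
    return [y + f for y, f in zip(yelp_labels, foursquare_labels)]


def spam_ratio_by_pattern(windows):
    """Aggregate the spam ratio per cross-site code.

    windows -- iterable of (code, n_reviews, n_spam) tuples, one per window
    """
    totals = {}
    for code, n_reviews, n_spam in windows:
        reviews, spam = totals.get(code, (0, 0))
        totals[code] = (reviews + n_reviews, spam + n_spam)
    return {code: spam / reviews
            for code, (reviews, spam) in totals.items() if reviews > 0}


codes = combine_patterns("BPD", "PPB")
print(codes)
# Toy counts, invented purely to exercise the function.
print(spam_ratio_by_pattern([("BP", 10, 3), ("PP", 100, 10), ("BP", 10, 1)]))
```

Comparing the per-code ratios is one way to check the assumption that different cross-site patterns carry different spam ratios.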


Data setup

Raw data

Foursquare: crawled 301,717 venues

Yelp: Yelp challenge dataset [1]

Matched by names and locations

95 businesses

Foursquare: 15,004 reviews, 12,147 reviewers

Yelp: 68,517 reviews, 31,092 reviewers

[1] http://www.yelp.com/dataset_challenge

Basic statistics of cross-site patterns

Table: Cross-site pattern statistics

                Yelp                                                             Foursquare
Pattern (Y-F)   #business  #review  #reviewer  #related reviews  filtered ratio  #business  #review  #reviewer
BB              7          181      179        19133             27.07%          9          89       83
BP              27         821      772        127427            26.31%          27         200      186
BD              8          295      290        41713             18.98%          9          122      114
PB              51         3795     3187       636679            13.68%          52         1154     1089
PP              95         59830    23509      9364943           11.99%          95         12152    9491
PD              33         3024     2589       548993            15.41%          34         1036     943
DB              4          76       76         10321             21.05%          6          79       74
DP              10         303      300        23822             48.18%          9          73       71
DD              4          192      190        21059             28.13%          6          99       96


Human evaluation

Three human annotators independently label the sampled reviews using 3 levels of suspiciousness (1: not suspicious, 2: likely suspicious and 3: very suspicious).

Table: Human annotation results

Patterns  # reviews  Avg Scores  Prec(>1)  Prec(>2)
B*        93         1.9785      0.9677    0.3871
BB        18         1.9074      0.8889    0.4444
BP        75         1.9956      0.9867    0.3733
PB        68         2.0098      0.8971    0.3824
PP        55         1.8606      0.9091    0.2909
PD        14         1.7857      0.7857    0.2857
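The slides do not define how the annotation columns are computed. One reading that is consistent with the reported numbers is that each review's score is the mean of its three annotator labels, and that Prec(> x) is the fraction of sampled reviews whose mean score exceeds x. The Python sketch below follows that assumption; the function names and the toy labels are hypothetical.

```python
def average_score(review_scores):
    """Mean over reviews of the per-review mean annotator score."""
    per_review = [sum(scores) / len(scores) for scores in review_scores]
    return sum(per_review) / len(per_review)


def precision_above(review_scores, threshold):
    """Fraction of reviews whose mean annotator score exceeds the threshold.

    The comparison uses integer sums to avoid floating-point edge cases.
    """
    hits = sum(1 for scores in review_scores
               if sum(scores) > threshold * len(scores))
    return hits / len(review_scores)


# Toy labels from three annotators for three reviews (invented for illustration).
labels = [(1, 1, 1), (2, 1, 1), (3, 3, 2)]
print(average_score(labels))
print(precision_above(labels, 1))
print(precision_above(labels, 2))
```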


Microscopic classification - Behavioral Features

Table: Microscopic behavioral features of reviewers and reviews, and their correlations with the ground truths

Feature  Corr.   Description
DC       +0.252  Proportion of days when a reviewer posts reviews on businesses in different cities.
DS       +0.230  Proportion of days when a reviewer posts reviews on businesses in different states.
MP       +0.183  Proportion of days when a reviewer posts 3 or more reviews.
LRR      -0.148  Proportion of reviews with 1 or 2 stars posted by a reviewer.
FRR      +0.121  Proportion of reviews with 5 stars posted by a reviewer.
RC       +0.086  Number of reviews posted by a reviewer.
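A minimal Python sketch of how several of these behavioral features could be computed for one reviewer. It assumes each review is a `(day, stars, city)` tuple and that "proportion of days" is taken over the days on which the reviewer posted at least one review; both are assumptions for illustration, as is the function name `behavioral_features`. DS would follow the same shape as DC with a state field.

```python
from collections import defaultdict


def behavioral_features(reviews):
    """Compute RC, FRR, LRR, MP and DC for ONE reviewer.

    reviews -- non-empty list of (day, stars, city) tuples
    """
    if not reviews:
        raise ValueError("a reviewer needs at least one review")
    cities_per_day = defaultdict(list)
    for day, _stars, city in reviews:
        cities_per_day[day].append(city)
    active_days = len(cities_per_day)  # assumed denominator for day-based features
    rc = len(reviews)
    return {
        "RC": rc,
        "FRR": sum(1 for _, stars, _ in reviews if stars == 5) / rc,
        "LRR": sum(1 for _, stars, _ in reviews if stars <= 2) / rc,
        "MP": sum(1 for c in cities_per_day.values() if len(c) >= 3) / active_days,
        "DC": sum(1 for c in cities_per_day.values() if len(set(c)) > 1) / active_days,
    }


# Toy reviewer history, invented for illustration.
history = [
    ("2016-01-01", 5, "Chicago"),
    ("2016-01-01", 5, "Phoenix"),
    ("2016-01-01", 1, "Phoenix"),
    ("2016-01-02", 4, "Chicago"),
]
print(behavioral_features(history))
```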


Microscopic classification - Textual Features

Table: Microscopic textual features of reviewers and reviews, and their correlations with the ground truths

Feature  Corr.   Description
LC       -0.010  Number of letters in a review.
CWR      +0.106  Proportion of ALL-CAPITAL words ("I" excluded).
CLR      +0.065  Proportion of capital letters.
1PP      -0.034  Proportion of first person pronouns.
2PP      +0.094  Proportion of second person pronouns.
EX       +0.032  Proportion of exclamation marks.
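The textual features can be sketched in Python as below. The slides do not give the denominators, so this sketch assumes word-level ratios for CWR, 1PP and 2PP, a letter-level ratio for CLR, and a character-level ratio for EX. The pronoun lists, the tokenizer and the function name `textual_features` are likewise hypothetical choices.

```python
import re

# Assumed pronoun lists; the slides do not enumerate them.
FIRST_PERSON = {"i", "me", "my", "mine", "myself",
                "we", "us", "our", "ours", "ourselves"}
SECOND_PERSON = {"you", "your", "yours", "yourself", "yourselves"}


def textual_features(text):
    """Compute LC, CWR, CLR, 1PP, 2PP and EX for one review text."""
    words = re.findall(r"[A-Za-z']+", text)      # simple ASCII tokenizer
    letters = [ch for ch in text if ch.isalpha()]
    n_words = max(len(words), 1)                  # guard against empty text
    n_letters = max(len(letters), 1)
    all_caps = [w for w in words if w.isupper() and w != "I"]
    return {
        "LC": len(letters),
        "CWR": len(all_caps) / n_words,
        "CLR": sum(1 for ch in letters if ch.isupper()) / n_letters,
        "1PP": sum(1 for w in words if w.lower() in FIRST_PERSON) / n_words,
        "2PP": sum(1 for w in words if w.lower() in SECOND_PERSON) / n_words,
        "EX": text.count("!") / max(len(text), 1),
    }


print(textual_features("I LOVE this place! You should go."))
```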


Classification - Results

Prior methods [Rayana et al 2015]

Figure: ROC curves (True Positive Rate vs. False Positive Rate) for B+T (AUC = 0.65), B (AUC = 0.67) and T (AUC = 0.55), with a random baseline

Figure: Precision-Recall curves (Precision vs. Recall) for B+T, B and T


Classification - Results

Linear regression

Figure: ROC curves (True Positive Rate vs. False Positive Rate) for B+T (AUC = 0.70), B (AUC = 0.68) and T (AUC = 0.60), with a random baseline

Figure: Precision-Recall curves (Precision vs. Recall) for B+T, B and T
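For readers unfamiliar with the AUC values quoted for the ROC curves, the sketch below computes ROC AUC in plain Python as the probability that a randomly chosen spam review receives a higher score than a randomly chosen non-spam review, with ties counted as one half. It is a generic illustration, not the evaluation code behind the slides; the function name `roc_auc` and the toy scores are hypothetical.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of positive and negative scores.

    scores -- classifier outputs, higher meaning more likely spam
    labels -- 1 for spam (positive), 0 for non-spam (negative)
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy scores, invented for illustration.
print(roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

An AUC of 0.5 corresponds to the random baseline drawn in the ROC plots, so the reported values measure how far each feature set lifts the ranking above chance.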


Case studies

Table: Case study: representative reviews (the codes under the site names indicate detected patterns)

Yelp (CR: P, AR: B, FR: B, LR: D)

Representative reviews:

(5 stars) ... really was awesome to be there. I don’t know why people are complaining, ...

(5 stars) Ignore the negative reviews... that part was fun in itself!

(5 stars) ... I don’t know why people are complaining, they don’t even have to have it opened, but they do. Enjoy it!

(5 stars) ... parking is FREE... they have items on display from $100,000 and more to magnets of the cast for $8.00...


Case studies

Table: Case study: representative reviews (the codes under the site names indicate detected patterns)

Foursquare (CR: B, AS: D, HPSR: P, NSR: B)

Representative reviews:

Waste of a trip!

They are way over priced on everything, including the refrancised items from the show.

Extremely overpriced, they got famous on TV and now screw everyone with high prices!

An exhilirating experience. I find going to dumps and almost getting murdered exhilirating.

Waste of time‼!


Conclusion

Motivation: Combine information across multiple sites

Proposed a bi-level framework: Macroscopic to Microscopic

Macroscopic: Single-site patterns, Cross-site patterns, Human annotation

Microscopic: Classifications (Prior models and Linear Regressions), Case studies

