IEEE/ACM ASONAM 2014, Beijing, China

Mention-anomaly-based Event Detection and Tracking in Twitter
Adrien Guille & Cécile Favre, ERIC Lab, University of Lyon 2, France
IEEE/ACM ASONAM 2014, Beijing, China, August 20, 2014

Page 1: IEEE/ACM ASONAM 2014, Beijing, China

Mention-anomaly-based Event Detection and Tracking in Twitter

Adrien Guille & Cécile Favre
ERIC Lab, University of Lyon 2, France

IEEE/ACM ASONAM 2014, Beijing, China
August 20, 2014

Page 2: IEEE/ACM ASONAM 2014, Beijing, China

A. Guille & C. Favre: Mention-Anomaly-Based Event Detection in Twitter

What is Twitter & why study it?

- Twitter: micro-blogging service, 140-character messages
- Ever-growing number of Twitter users
  - Pro: timely source of information
  - Con: information overload
- How can we use Twitter for automated event detection and tracking?

Page 3: IEEE/ACM ASONAM 2014, Beijing, China

Related Work

Idea: spot bursty patterns

- Term-weighting-based approaches
  - Peaky Topics [Shamma11], Trending Score [Benhardus13]
  - Possible ambiguity, lack of context
- Topic-modeling-based approaches
  - On-line LDA [Lau12], ET-LDA [Yuheng12]
  - Lack of scalability
- Clustering-based approaches
  - EDCoW [Weng11], TwEvent [Li12], ET [Parikh13]
  - Noisy event descriptions

Page 4: IEEE/ACM ASONAM 2014, Beijing, China

Issues & Proposal

- Shortcomings of existing methods
  - Event duration is a fixed parameter
  - Only the textual content of tweets is considered
- We propose a novel approach and method that
  - Dynamically estimates the duration of each event
  - Exploits the social aspect of tweet streams through mentions

Page 5: IEEE/ACM ASONAM 2014, Beijing, China

Proposed Method

Page 6: IEEE/ACM ASONAM 2014, Beijing, China

Problem Formulation

- Input
  - Corpus C containing N tweets, partitioned into n time-slices
  - Vocabularies V and V@
- Output
  - The k most impactful events
- Event: a bursty topic and a value Mag translating its magnitude of impact
- Bursty topic: a time interval I, a main term t, and a set S of weighted related terms
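The input and output described on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `Event`, `slice_index` and `count_per_slice`, the tweet representation (a timestamp in seconds plus raw text) and the whitespace tokenization are all assumptions made for the example. The sketch also assumes that the relevant per-word statistic is the number of tweets per time-slice that contain the word and at least one mention.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """An event: a bursty topic (I, t, S) plus its magnitude of impact Mag."""
    mag: float                                   # Mag, magnitude of impact
    interval: tuple                              # I = (a, b), inclusive slice indices
    main_term: str                               # t
    related: dict = field(default_factory=dict)  # S, related word -> weight


def slice_index(timestamp, start, slice_seconds):
    """Map a timestamp (in seconds) to the index of its time-slice."""
    return int((timestamp - start) // slice_seconds)


def count_per_slice(tweets, start, slice_seconds, n_slices):
    """Return (total, mention_counts) for a list of (timestamp, text) tweets.

    total[i]             : number of tweets in slice i
    mention_counts[w][i] : tweets in slice i containing word w and >= 1 mention
    """
    total = [0] * n_slices
    mention_counts = defaultdict(lambda: [0] * n_slices)
    for timestamp, text in tweets:
        i = slice_index(timestamp, start, slice_seconds)
        if not 0 <= i < n_slices:
            continue  # ignore tweets outside the observed period
        total[i] += 1
        tokens = text.lower().split()
        if any(tok.startswith("@") for tok in tokens):
            # Count each word once per tweet, skipping the mentions themselves.
            for word in {tok for tok in tokens if not tok.startswith("@")}:
                mention_counts[word][i] += 1
    return total, mention_counts


# Toy corpus (invented for illustration), two 30-minute slices.
tweets = [
    (0, "Quake felt downtown @alice"),
    (100, "quake again @bob @carol"),
    (2000, "lunch time"),
]
total, mention_counts = count_per_slice(
    tweets, start=0, slice_seconds=1800, n_slices=2
)
```

Here `total` holds the per-slice tweet counts and `mention_counts["quake"]` the per-slice counts for one word, which is the kind of series the first phase would analyse.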

Page 7: IEEE/ACM ASONAM 2014, Beijing, China

Overview of the proposed method

MABED: Mention-Anomaly-Based Event Detection

Two-phase flow
1. Analyse the mention frequency of each word in V@ to detect events (Mag, I, t, Ø)
2. Select related words and generate the final list of the k most impactful events while controlling redundancy

Page 8: IEEE/ACM ASONAM 2014, Beijing, China

Proposed Method

PHASE 1

Page 9: IEEE/ACM ASONAM 2014, Beijing, China

Detecting Events with Mention Anomaly

- Computing the anomaly at a point i for word t
  - Requires computing the expected volume of tweets containing at least one mention and t, at i
  - Normal distribution: [formula not preserved]
  - Expectation: [formula not preserved]
  - Anomaly: [formula not preserved]
- Measuring the magnitude of impact
  - Integrating anomaly: [formula not preserved]
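The formulas on this slide were images and did not survive the transcript. The block below is a plausible reconstruction from the slide's wording, offered as an assumption rather than a quotation. Notation introduced for the sketch: $N^i$ is the number of tweets in time-slice $i$, $N^i_{@t}$ the number of those that contain word $t$ and at least one mention, $N_{@t}$ the same count over the whole corpus of $N$ tweets, and $I = [a; b]$ a time interval.

```latex
\begin{align*}
P\left(N^i_{@t}\right) &\sim \mathcal{N}\left(N^i p_{@t},\; N^i p_{@t}\,(1 - p_{@t})\right),
  \qquad p_{@t} = \frac{N_{@t}}{N} \\
E[t \mid i] &= N^i \, p_{@t} \\
\mathrm{anomaly}(t, i) &= N^i_{@t} - E[t \mid i] \\
\mathrm{Mag}(t, I) &= \int_a^b \mathrm{anomaly}(t, i)\, di
  \;=\; \sum_{i=a}^{b} \mathrm{anomaly}(t, i)
\end{align*}
```

Under this reading, the anomaly is positive when a word appears in more mention-bearing tweets than its corpus-wide rate predicts, and the magnitude of impact is the anomaly accumulated over the interval.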

Page 10: IEEE/ACM ASONAM 2014, Beijing, China

Detecting Events with Mention Anomaly

- For each word t in V@
  - Solve a "Maximum Contiguous Subsequence Sum" type of problem: [formula not preserved]
- Eventually, each event is described by
  - A main word t
  - A period of time I
  - The magnitude of its impact Mag
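For a single word t, finding the interval with the largest accumulated anomaly is an instance of the maximum contiguous subsequence sum problem, which Kadane's algorithm solves in one linear pass. The sketch below is a minimal Python version under stated assumptions: the per-slice anomaly values are already available as a list (taken here to be observed minus expected counts, which is an assumption about the elided formula), and the function name `best_interval` is hypothetical.

```python
def best_interval(anomaly):
    """Return (mag, (a, b)) maximizing sum(anomaly[a:b + 1]) (Kadane's algorithm)."""
    if not anomaly:
        raise ValueError("anomaly series must not be empty")
    best_sum, best_a, best_b = anomaly[0], 0, 0
    run_sum, run_a = anomaly[0], 0
    for i in range(1, len(anomaly)):
        if run_sum < 0:
            # A negative running sum can only hurt: restart the interval at i.
            run_sum, run_a = anomaly[i], i
        else:
            run_sum += anomaly[i]
        if run_sum > best_sum:
            best_sum, best_a, best_b = run_sum, run_a, i
    return best_sum, (best_a, best_b)


# Toy anomaly series over seven time-slices (invented for illustration).
series = [-1.0, -0.5, 3.0, 4.0, -1.0, 2.0, -3.0]
mag, interval = best_interval(series)
print(mag, interval)  # 8.0 (2, 5)
```

The returned pair gives Mag and I for the word; running this for every word in V@ yields one candidate event (Mag, I, t, Ø) per word, and the short dip at slice 4 shows how an interval can absorb a brief lull when the following slices make up for it.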

Page 11: IEEE/ACM ASONAM 2014, Beijing, China

Detecting Events with Mention Anomaly

Example

Page 12: IEEE/ACM ASONAM 2014, Beijing, China

Proposed Method

PHASE 2

Page 13: IEEE/ACM ASONAM 2014, Beijing, China

Selecting Words Describing Events

- Identifying candidate words
  - Set of p words that co-occur the most with t during I
- Selecting the most relevant words
  - Measure the similarity between the frequency of each candidate word and that of the main word [Erdem12]
  - Apply a threshold θ
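The two steps above can be sketched in Python as follows. Note the substitution: the slide relies on the time-series correlation coefficient of [Erdem12], which is not reproduced here. The sketch swaps in the plain Pearson correlation as a stand-in and compares it directly to θ. The function names and the input layout (tweets as token lists, one frequency series per word restricted to I) are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def candidate_words(tweets_in_interval, main_word, p):
    """The p words that co-occur most often with main_word (tweets are token lists)."""
    counts = Counter()
    for tokens in tweets_in_interval:
        unique = set(tokens)
        if main_word in unique:
            counts.update(unique - {main_word})
    return [word for word, _ in counts.most_common(p)]


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_related(main_series, candidate_series, theta):
    """Keep the candidates whose similarity to the main word's series is >= theta."""
    weights = {w: pearson(main_series, s) for w, s in candidate_series.items()}
    return {w: score for w, score in weights.items() if score >= theta}


# Toy frequency series over five time-slices (invented for illustration).
main = [1, 5, 9, 4, 1]
candidates = {"earthquake": [0, 4, 8, 3, 0], "coffee": [3, 3, 2, 3, 3]}
related = select_related(main, candidates, theta=0.7)
```

In this toy run only "earthquake" survives the threshold, because its series rises and falls with the main word's, while "coffee" moves against it.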

Page 14: IEEE/ACM ASONAM 2014, Beijing, China

Selecting Words Describing Events

Example

Page 15: IEEE/ACM ASONAM 2014, Beijing, China

Generating the List of Top k Events

- Event graph & redundancy graph
- Detecting duplicated events
  - Connectivity of main terms in the event graph
  - Overlap between intervals, threshold σ
- Merging duplicated events
  - Identifying connected components in the redundancy graph
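The duplicate detection and merging steps can be sketched as below. Several details are assumptions, since the slide does not spell them out: two events are treated as duplicates when each main term appears among the other's related words and the overlap coefficient of their intervals exceeds σ, and the overlap coefficient is taken to be the shared length divided by the shorter interval's length. The dictionary layout and function names are hypothetical.

```python
def overlap(i1, i2):
    """Overlap coefficient of two inclusive intervals: |I1 ∩ I2| / min(|I1|, |I2|)."""
    shared = min(i1[1], i2[1]) - max(i1[0], i2[0]) + 1
    shortest = min(i1[1] - i1[0], i2[1] - i2[0]) + 1
    return max(shared, 0) / shortest


def are_duplicates(e1, e2, sigma):
    """Assumed test: main terms are mutually related and intervals overlap > sigma."""
    linked = e1["main"] in e2["related"] and e2["main"] in e1["related"]
    return linked and overlap(e1["interval"], e2["interval"]) > sigma


def merge_duplicates(events, sigma):
    """Return the connected components of the redundancy graph, as index lists."""
    n = len(events)
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if are_duplicates(events[i], events[j], sigma):
                neighbours[i].add(j)
                neighbours[j].add(i)
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first traversal collecting every event reachable from start.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        components.append(sorted(component))
    return components


# Toy events (invented for illustration); intervals are time-slice indices.
events = [
    {"main": "quake", "related": {"tokyo", "magnitude"}, "interval": (10, 20)},
    {"main": "tokyo", "related": {"quake", "tremor"}, "interval": (12, 22)},
    {"main": "concert", "related": {"tickets"}, "interval": (40, 50)},
]
print(merge_duplicates(events, sigma=0.5))  # [[0, 1], [2]]
```

Each component with more than one index stands for a group of events to be merged into a single entry of the final list. Raising σ makes the duplicate test stricter, so fewer events are merged.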

Page 16: IEEE/ACM ASONAM 2014, Beijing, China

Generating the List of Top k Events

Example

Page 17: IEEE/ACM ASONAM 2014, Beijing, China

Evaluation

Page 18: IEEE/ACM ASONAM 2014, Beijing, China

Experimental Setup

- Corpora
  - C(en): 1,437,126 tweets published in November 2009
  - C(fr): 2,086,136 tweets published in March 2012
- Baselines for comparison
  - Trending Score (TS) [Benhardus13] and ET [Parikh13]
  - α-MABED
- Parameter setting
  - (α-)MABED: 30-min time-slices, p=10, θ=0.7, σ=0.5
  - Trending Score, ET: 1-day time-slices

Page 19: IEEE/ACM ASONAM 2014, Beijing, China

Evaluation Metrics

- Manual annotation
  - Two human annotators judging the significance of the top 40 events detected by each method (κ = 0.72)
- Precision
  - Significant events / All detected events
- Recall
  - Distinct significant events / All detected events
- DERate [Li12]
  - Duplicated events / Significant events
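The three ratios defined on this slide can be computed from annotated detections as in the sketch below. The input format is an assumption: each detected event carries a significance flag and an identifier of the real-world event it refers to, so that duplicates share an identifier. Counting "duplicated events" as the number of significant detections beyond the first for each identifier is also an assumption, and the function name `evaluate` is hypothetical.

```python
def evaluate(labels):
    """Compute (precision, recall, derate) from annotated detections.

    labels: one (is_significant, event_id) pair per detected event.
    event_id identifies the real-world event, so duplicates share an id;
    it is ignored for non-significant detections.
    """
    if not labels:
        raise ValueError("at least one detected event is required")
    detected = len(labels)
    significant = [event_id for is_sig, event_id in labels if is_sig]
    distinct = len(set(significant))
    duplicated = len(significant) - distinct
    precision = len(significant) / detected
    recall = distinct / detected  # ratio as defined on the slide
    derate = duplicated / len(significant) if significant else 0.0
    return precision, recall, derate


# Toy annotation of five detections (invented for illustration).
labels = [(True, "A"), (True, "A"), (True, "B"), (False, None), (True, "C")]
precision, recall, derate = evaluate(labels)
print(precision, recall, derate)  # 0.8 0.6 0.25
```

With these definitions, a method that reports the same significant event several times keeps a high precision but is penalized on both recall and DERate.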

Page 20: IEEE/ACM ASONAM 2014, Beijing, China

Quantitative Evaluation

Performance of the five methods on the two corpora

Page 21: IEEE/ACM ASONAM 2014, Beijing, China

Quantitative Evaluation

Impact of σ on MABED

Page 22: IEEE/ACM ASONAM 2014, Beijing, China

Qualitative Evaluation

- Improved readability
  - Excerpt of the list of events detected in C(en) by MABED

Page 23: IEEE/ACM ASONAM 2014, Beijing, China

Qualitative Evaluation

- Improved temporal precision & reduced redundancy
- Importance of dynamically estimating event duration
  - Politics-related events tend to be discussed longer [Romero11]

Page 24: IEEE/ACM ASONAM 2014, Beijing, China

Implementation

- Included in the open-source social media data mining tool SONDY [Guille13]
- http://mediamining.univ-lyon2.fr/people/guille/mabed.php

Page 25: IEEE/ACM ASONAM 2014, Beijing, China

Time-oriented Interface

Page 26: IEEE/ACM ASONAM 2014, Beijing, China

Impact-oriented Interface

Page 27: IEEE/ACM ASONAM 2014, Beijing, China

Topic-oriented Interface

Page 28: IEEE/ACM ASONAM 2014, Beijing, China

Conclusion & Future Work

- We propose a novel approach and method for detecting events in Twitter
- Verified hypothesis
  - Considering mentions helps detect significant events
- Experimental results on two different datasets demonstrate the accuracy and robustness of the proposed method
- Future work
  - More features to model discussions between users

Page 29: IEEE/ACM ASONAM 2014, Beijing, China

References

[Shamma11] D. A. Shamma, L. Kennedy, and E. F. Churchill, “Peaks and persistence: modeling the shape of microblog conversations,” in CSCW, 2011

[Benhardus13] J. Benhardus and J. Kalita, “Streaming trend detection in twitter,” IJWBC, vol. 9, no. 1, 2013

[Lau12] J. H. Lau, N. Collier, and T. Baldwin, “On-line trend analysis with topic models: #twitter trends detection topic model online,” in COLING, 2012

[Yuheng12] H. Yuheng, J. Ajita, D. S. Dorée, and W. Fei, “What were the tweets about? topical associations between public events and twitter feeds,” in ICWSM, 2012

[Weng11] J. Weng and B.-S. Lee, “Event detection in twitter,” in ICWSM, 2011

[Li12] C. Li, A. Sun, and A. Datta, “Twevent: Segment-based event detection from tweets,” in CIKM, 2012

[Parikh13] R. Parikh and K. Karlapalem, “Et: events from tweets,” in companion WWW, 2013

[Erdem12] O. Erdem, E. Ceyhan, and Y. Varli, “A new correlation coefficient for bivariate time-series data,” in MAF, 2012

[Guille13] A. Guille, C. Favre, H. Hacid, and D. Zighed, “Sondy: An open source platform for social dynamics mining and analysis,” in SIGMOD, 2013