QUANTIFYING AND BURSTING THE ONLINE FILTER BUBBLE, KIRAN GARIMELLA, KCL, 13 FEB 2017

Upload: kiran-garimella

Post on 21-Feb-2017


TRANSCRIPT

Page 1: Quantifying and Bursting the Online Filter Bubble

1

QUANTIFYING AND BURSTING THE ONLINE FILTER BUBBLE

KIRAN GARIMELLA

KCL, 13 FEB 2017

Page 2: Quantifying and Bursting the Online Filter Bubble

2

HELLO!

Timeline: 2011, 2013, 2014

BACHELORS & MASTERS IN COMPUTER SCIENCE

RESEARCH ENGINEER

RESEARCH ASSOCIATE

PHD (ADVISOR: ARISTIDES GIONIS, EXPECTED: SEPT 2017)

Locations: BARCELONA; DOHA, QATAR; HYDERABAD, INDIA; HELSINKI, FINLAND

Page 3: Quantifying and Bursting the Online Filter Bubble

3

OVERVIEW
▸ Motivation
▸ Summary of the thesis
▸ Shallow dive into one sub-topic

Page 4: Quantifying and Bursting the Online Filter Bubble

4

SOCIAL MEDIA BUBBLE

Page 5: Quantifying and Bursting the Online Filter Bubble

5

FILTER BUBBLE

Page 6: Quantifying and Bursting the Online Filter Bubble

6

ECHO CHAMBERS

Page 7: Quantifying and Bursting the Online Filter Bubble

7

THE POLARIZATION CYCLE

USER HOMOPHILY

ALGORITHMIC

PERSONALIZATION

Increased Polarization

Page 8: Quantifying and Bursting the Online Filter Bubble

8

POLARIZATION - TWITTER

Page 9: Quantifying and Bursting the Online Filter Bubble

9

BLOGS

Page 10: Quantifying and Bursting the Online Filter Bubble

10

INSTAGRAM

Page 11: Quantifying and Bursting the Online Filter Bubble

11

US SENATE VOTES

Page 12: Quantifying and Bursting the Online Filter Bubble

12

HOW CAN WE DEAL WITH THE POLARIZATION ON SOCIAL MEDIA?

THIS THESIS

Page 13: Quantifying and Bursting the Online Filter Bubble

13

RESEARCH QUESTIONS
1. Identify polarized discussions on social media and quantify their severity.
2. Track evolution of polarized discussions and understand their properties.
3. Design ways to reduce the polarization.

Page 14: Quantifying and Bursting the Online Filter Bubble

14

RESEARCH QUESTION I

1. IDENTIFYING AND QUANTIFYING POLARIZED DISCUSSIONS
▸ Using different types of user interactions
▸ A. Retweet network
▸ B. Reply network

Page 15: Quantifying and Bursting the Online Filter Bubble

15

1 A. QUANTIFYING CONTROVERSY ON SOCIAL MEDIA [WSDM’16, CSCW’16]

IDENTIFYING AND QUANTIFYING POLARIZED DISCUSSIONS

Page 16: Quantifying and Bursting the Online Filter Bubble

16

QUANTIFYING CONTROVERSY
▸ In the wild
▸ Not necessarily political controversies
▸ Compare across controversies
▸ Language independent

Page 17: Quantifying and Bursting the Online Filter Bubble

17

NEED NOT BE POLITICAL …

Page 18: Quantifying and Bursting the Online Filter Bubble

18

NEED NOT BE POLITICAL …

Page 19: Quantifying and Bursting the Online Filter Bubble

19

COMPARING ACROSS CONTROVERSIES

Page 20: Quantifying and Bursting the Online Filter Bubble

20

SOLUTION
▸ Graph-based formulation
▸ Model conversations using a retweet graph
▸ Nodes: users, Edges: retweets
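A minimal sketch of the retweet graph described on this slide. The input format (pairs of retweeter and original author), the edge weights and the choice to drop self-retweets are assumptions made for illustration; the slide only states that nodes are users and edges are retweets.

```python
from collections import defaultdict


def build_retweet_graph(retweets):
    """Build a directed, weighted retweet graph.

    `retweets` is an iterable of (retweeter, original_author) pairs
    (an assumed input format). The result maps each retweeter to a dict
    {original_author: number_of_retweets}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for retweeter, author in retweets:
        if retweeter == author:
            continue  # assumption: self-retweets carry no endorsement signal
        graph[retweeter][author] += 1
    return {user: dict(targets) for user, targets in graph.items()}


def users(graph):
    """All users that appear as a retweeter or as a retweeted author."""
    found = set(graph)
    for targets in graph.values():
        found.update(targets)
    return found
```

For example, `build_retweet_graph([("alice", "bob"), ("alice", "bob")])` yields one edge from `alice` to `bob` with weight 2.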

Page 21: Quantifying and Bursting the Online Filter Bubble

21

EXAMPLE

Retweet graphs

Controversial: #beefban, #russia_march

Non-controversial: #sxsw, #germanwings

Page 22: Quantifying and Bursting the Online Filter Bubble

22

EXAMPLE

Controversial vs. non-controversial: retweet graphs and follow graphs

Page 23: Quantifying and Bursting the Online Filter Bubble

23

PIPELINE

▸ Graph built from:

• Retweets

• Mentions

• Social network

• Content

▸ Partitioning: any clustering algorithm

▸ Controversy measures:

• Random walk

• Edge-betweenness

• 2d-embedding

• Sentiment variance

▸ Output: controversy score

Page 24: Quantifying and Bursting the Online Filter Bubble

24

SENTIMENT VARIANCE
▸ Controversy = intensified sentiments
▸ Positive and negative sentiments on each side are higher compared to non-controversial issues
▸ Language dependent
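A minimal sketch of the sentiment-variance idea, assuming each tweet already has a sentiment score in [-1, 1] from some language-specific sentiment tool (the tool and the score range are assumptions, not taken from the slides). A topic with intensified positive and negative sentiment should show a larger variance than a neutral one.

```python
from statistics import pvariance


def sentiment_variance(scores):
    """Population variance of per-tweet sentiment scores.

    `scores` is an iterable of floats, assumed to lie in [-1, 1].
    Returns 0.0 when there are fewer than two scores.
    """
    scores = list(scores)
    if len(scores) < 2:
        return 0.0
    return pvariance(scores)


# Hypothetical scores, for illustration only.
heated_topic = [-0.9, 0.8, -0.7, 0.9]
calm_topic = [0.1, 0.0, -0.1, 0.2]
```

Comparing `sentiment_variance(heated_topic)` with `sentiment_variance(calm_topic)` gives a larger value for the heated topic, which is the signal this measure relies on.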

Page 25: Quantifying and Bursting the Online Filter Bubble

25

IDENTIFYING AND QUANTIFYING POLARIZED DISCUSSIONS

1 B. A MOTIF-BASED APPROACH FOR IDENTIFYING CONTROVERSY [UNDER REVIEW]
▸ Use motifs defined on the reply networks

Page 26: Quantifying and Bursting the Online Filter Bubble

26

REPLY NETWORKS

Page 27: Quantifying and Bursting the Online Filter Bubble

27

CONTROVERSIAL NON-CONTROVERSIAL

REPLY NETWORKS

Page 28: Quantifying and Bursting the Online Filter Bubble

28

MOTIFS

Page 29: Quantifying and Bursting the Online Filter Bubble

29

RESEARCH QUESTION II

2. POLARIZATION OVER TIME
▸ A. How do polarized debates change with interest?
▸ B. Has polarization on Twitter increased over the years?

Page 30: Quantifying and Bursting the Online Filter Bubble

30

POLARIZATION OVER TIME

2 A. HOW DO POLARIZED DEBATES CHANGE WITH INTEREST [UNDER REVIEW]
▸ Polarization increases with interest
▸ Most retweeting activity occurs within a side
▸ Endorsement network becomes more hierarchical and a large fraction of edges go from periphery to core
▸ Content becomes more similar between the two sides
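One way to check the claim that most retweeting happens within a side is to compute the share of retweet edges whose endpoints carry the same side label. This is a sketch under assumptions: the edge list format and the `side_of` mapping are hypothetical names, and the slides do not say how the finding was measured.

```python
def within_side_fraction(edges, side_of):
    """Fraction of retweet edges that stay inside one side.

    `edges` is an iterable of (retweeter, author) pairs and `side_of`
    maps each user to a side label. Edges with an unlabeled endpoint
    are skipped. Returns 0.0 if no edge can be evaluated.
    """
    same = 0
    total = 0
    for retweeter, author in edges:
        if retweeter not in side_of or author not in side_of:
            continue
        total += 1
        if side_of[retweeter] == side_of[author]:
            same += 1
    return same / total if total else 0.0
```

Tracking this fraction over time windows of increasing activity would show whether retweeting concentrates within sides as interest grows.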

Page 31: Quantifying and Bursting the Online Filter Bubble

31

POLARIZATION OVER TIME

2 B. HAS POLARIZATION INCREASED OVER THE YEARS? [UNDER REVIEW]
▸ Are Twitter users less likely to follow/retweet users from both sides?
▸ Are users less likely to use biased content?
▸ Large-scale study: 700,000 users, 2B tweets, 8 years

Page 32: Quantifying and Bursting the Online Filter Bubble

32

RESEARCH QUESTION III

3. REDUCING POLARIZATION
▸ A. Reducing Controversy by Connecting Opposing Views
▸ B. Balancing Information Exposure in Social Networks

Page 33: Quantifying and Bursting the Online Filter Bubble

33

REDUCING POLARIZATION

3 A. REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS [WSDM’17]

Page 34: Quantifying and Bursting the Online Filter Bubble

34

POLARIZATION - TWITTER

Page 35: Quantifying and Bursting the Online Filter Bubble

35

HOW CAN WE BRIDGE THE DIVIDE?

THIS PAPER

Page 36: Quantifying and Bursting the Online Filter Bubble

36

REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS

HOW CAN WE BRIDGE THE DIVIDE?
▸ Connect the two sides
▸ Model interactions as a graph
▸ Retweet graph: Nodes: users, Edges: retweets

Page 37: Quantifying and Bursting the Online Filter Bubble

37

MEASURE OF POLARIZATION
▸ Quantify degree of polarization in a network
▸ How well does information flow between the two sides?

Page 38: Quantifying and Bursting the Online Filter Bubble

38

RANDOM WALK CONTROVERSY SCORE
▸ Authoritative users exist on both sides of the controversy
▸ How likely a random user on either side is to be exposed to authoritative content from the opposing side
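A rough Monte Carlo sketch of a random-walk controversy score. The slides do not give the formula; the combination used here, RWC = P_XX * P_YY - P_YX * P_XY with P_AB = Pr[walk started in A | walk ended in B], is our reading of the cited WSDM'16 paper and should be checked against it. Treating the graph as undirected, taking the top-k degree nodes per side as the "authoritative" users, and all parameter names are assumptions made for illustration.

```python
import random


def top_degree(adj, side, k):
    """The k highest-degree nodes of one side (ties broken by node name)."""
    return set(sorted(side, key=lambda n: (-len(adj[n]), n))[:k])


def rwc_score(adj, side_x, side_y, k=1, walks=4000, max_steps=1000, seed=0):
    """Monte Carlo estimate of a random-walk controversy score.

    adj: dict mapping each node to the set of its neighbours (undirected).
    A walk starts at a uniformly random node of a randomly chosen side
    and stops at the first high-degree node it visits, on either side.
    """
    rng = random.Random(seed)
    auth_x = top_degree(adj, side_x, k)
    auth_y = top_degree(adj, side_y, k)
    starts = {"X": sorted(side_x), "Y": sorted(side_y)}
    # counts[(start side, end side)]
    counts = {(s, e): 0 for s in "XY" for e in "XY"}
    for _ in range(walks):
        start_side = rng.choice("XY")
        node = rng.choice(starts[start_side])
        for _ in range(max_steps):
            if node in auth_x or node in auth_y:
                end_side = "X" if node in auth_x else "Y"
                counts[(start_side, end_side)] += 1
                break
            if not adj[node]:
                break  # dead end: this walk is discarded
            node = rng.choice(sorted(adj[node]))

    def p(start, end):
        """Pr[walk started on `start` side | walk ended on `end` side]."""
        ended = counts[("X", end)] + counts[("Y", end)]
        return counts[(start, end)] / ended if ended else 0.0

    return p("X", "X") * p("Y", "Y") - p("Y", "X") * p("X", "Y")
```

Under this sketch, two sides with no edges between them score exactly 1, because every walk ends on the side where it started; the more often walks cross over to the other side's authoritative users, the lower the score.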

Page 39: Quantifying and Bursting the Online Filter Bubble

39

RANDOM WALK CONTROVERSY SCORE (RWC)

X Y

Page 40: Quantifying and Bursting the Online Filter Bubble

40

RANDOM WALK CONTROVERSY SCORE (RWC)

X Y

Page 41: Quantifying and Bursting the Online Filter Bubble

41

RANDOM WALK CONTROVERSY SCORE (RWC)

X Y

Page 42: Quantifying and Bursting the Online Filter Bubble

42

RANDOM WALK CONTROVERSY SCORE (RWC)

Page 43: Quantifying and Bursting the Online Filter Bubble

43

RWC SCORE: 0.95

RWC SCORE: 0.12

Page 44: Quantifying and Bursting the Online Filter Bubble

44

PROBLEM
▸ Given a graph
▸ Two sides
▸ RWC score

Page 45: Quantifying and Bursting the Online Filter Bubble

45

FIND THE k BEST EDGES TO ADD TO THE GRAPH THAT MAXIMIZE THE REDUCTION IN RWC SCORE

Page 46: Quantifying and Bursting the Online Filter Bubble

46

REDUCING POLARIZATION

Side 1 Side 2

REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS

Page 47: Quantifying and Bursting the Online Filter Bubble

47

ALGORITHMS
▸ Greedy
▸ Look for all pairs of nodes
▸ Find the k pairs that give the highest reduction in RWC
▸ O(n²), n: number of nodes
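The exhaustive search on this slide can be sketched as follows. The score is passed in as a function so that any controversy measure can be plugged in; `score_fn`, the undirected adjacency format and the restriction to cross-side pairs are assumptions made for illustration, not details from the slides.

```python
from itertools import product


def top_k_edges(adj, side_x, side_y, k, score_fn):
    """Return the k missing cross-side edges with the largest score reduction.

    adj: dict mapping each node to the set of its neighbours (undirected);
    it is not modified. score_fn(adj) -> float is supplied by the caller,
    for example an RWC estimate. Every candidate is scored once, so the
    number of score evaluations grows with |side_x| * |side_y|.
    """
    base = score_fn(adj)
    gains = []
    for u, v in product(sorted(side_x), sorted(side_y)):
        if v in adj[u]:
            continue  # edge already exists
        trial = {n: set(nbrs) for n, nbrs in adj.items()}
        trial[u].add(v)
        trial[v].add(u)
        gains.append((base - score_fn(trial), (u, v)))
    # Largest reduction first; ties broken by edge name for determinism.
    gains.sort(key=lambda g: (-g[0], g[1]))
    return [edge for _, edge in gains[:k]]
```

Each candidate requires a full re-evaluation of the score, which is what makes the quadratic number of pairs expensive on large graphs.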

Page 48: Quantifying and Bursting the Online Filter Bubble

48

REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS

Side 1 Side 2

OUR ALGORITHM

The best edges are between the highest degree nodes

Page 49: Quantifying and Bursting the Online Filter Bubble

49

REDUCING CONTROVERSY BY CONNECTING OPPOSING VIEWS

Side 1 Side 2

OUR ALGORITHM

The best edges are between the highest degree nodes

O(p²), p << n
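The pruning idea on this slide can be sketched by generating candidates only among the p highest-degree nodes of each side, which leaves at most p² pairs to score. The function name, the undirected adjacency format and the tie-breaking rule are assumptions made for illustration.

```python
def high_degree_candidates(adj, side_x, side_y, p):
    """Missing cross-side edges among the p highest-degree nodes per side.

    adj: dict mapping each node to the set of its neighbours (undirected).
    Returns at most p * p candidate edges (u, v) with u in side_x and
    v in side_y. Ties in degree are broken by node name.
    """
    def top(side):
        return sorted(side, key=lambda n: (-len(adj[n]), n))[:p]

    return [(u, v) for u in top(side_x) for v in top(side_y) if v not in adj[u]]
```

Only these candidates then need to be scored for their reduction in RWC, instead of every pair of nodes.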

Page 50: Quantifying and Bursting the Online Filter Bubble

50

NOT PRACTICAL
▸ High-degree users are highly retweeted users
▸ We cannot recommend @realDonaldTrump to follow @BarackObama
▸ Not likely to materialize

Page 51: Quantifying and Bursting the Online Filter Bubble

51

ACCEPTANCE PROBABILITY
▸ Take into account the probability of the user liking the recommendation
▸ Not all users are the same
▸ Popular users
▸ Highly polarized users
▸ Compute polarity scores for users

Page 52: Quantifying and Bursting the Online Filter Bubble

52

ACCEPTANCE PROBABILITY

POLARITY SCORE: -0.99

POLARITY SCORE: 0.95

Page 53: Quantifying and Bursting the Online Filter Bubble

53

ACCEPTANCE PROBABILITY
▸ Learn probabilities from data

p(u, v) = [formula not recoverable from the transcript; annotated "based on connections" and "based on retweets"]
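One plausible way to learn acceptance probabilities from data is to bucket users by polarity score and measure how often a user in one bucket endorsed content from a user in another bucket after being exposed to it. This is a sketch under assumptions: the event format, the number of buckets and the function names are hypothetical, and the slide's actual formula for p(u, v) is not recoverable from the transcript.

```python
from collections import defaultdict


def bucket(polarity, n_buckets=4):
    """Map a polarity score in [-1, 1] to a bucket index 0..n_buckets-1."""
    idx = int((polarity + 1.0) / 2.0 * n_buckets)
    return min(max(idx, 0), n_buckets - 1)


def acceptance_table(events, n_buckets=4):
    """Empirical acceptance rate per (bucket of u, bucket of v).

    `events` is an iterable of (polarity_u, polarity_v, accepted), where
    `accepted` says whether u endorsed content from v after being exposed
    to it (an assumed event format).
    """
    shown = defaultdict(int)
    taken = defaultdict(int)
    for pol_u, pol_v, accepted in events:
        key = (bucket(pol_u, n_buckets), bucket(pol_v, n_buckets))
        shown[key] += 1
        taken[key] += bool(accepted)
    return {key: taken[key] / shown[key] for key in shown}
```

A recommender could then weight the RWC reduction of a candidate edge (u, v) by the looked-up rate for the pair's buckets, so that edges unlikely to be accepted rank lower.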

Page 54: Quantifying and Bursting the Online Filter Bubble

54

DEMO

Page 55: Quantifying and Bursting the Online Filter Bubble

55

REDUCING POLARIZATION

Side 1 Side 2

3 B. BALANCING INFORMATION EXPOSURE IN SOCIAL NETWORKS

▸ Find a set of seed nodes that can balance the exposure of information

Page 56: Quantifying and Bursting the Online Filter Bubble

56

THANK YOU!

@gvrkiran

[email protected]