Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web
Zide Meng. Supervisors: Fabien Gandon, Catherine Faron Zucker

Upload: zide-meng

Post on 13-Apr-2017


TRANSCRIPT

Page 1: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

1

Temporal and semantic analysis of richly typed social networks from

user-generated content sites on the web

Zide Meng, Supervisor: Fabien Gandon, Catherine Faron Zucker

Page 2: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

2

Page 3: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

Question Title

Question Content

Question User

Question Comments

Answer Content

Answer User

Question Tags

Answer Votes

Question Votes

3

Page 4: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

4

Some facts about Q&A sites

• Traffic statistics in Oct. 2016 from Quantcast.com
– 49.9M unique devices visit Stackoverflow
– 52.9M unique devices visit Answer.com
– 3.9M unique devices visit YahooAnswer

• To compare:
– 211.8M unique devices visit Youtube.com
– 147.6M unique devices visit Facebook.com

[Chart annotations: 1/4 and 1/3]

Page 5: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

5

Page 6: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

6

Site info of StackOverflow.com

• total questions: 12.7M
• unanswered: 3.5M
• total answers: 20.1M
• total users: 6.2M
• questions/min: 2.93
• answers/min: 4.66

https://api.stackexchange.com/docs/info#filter=default&site=stackoverflow&run=true (accessed 2 Nov 2016)

Page 7: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

7

What is a Q&A site?

Page 8: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

8

Detect topics and activities of users

how to export jar?

Topic detection

Temporal analysis

Page 9: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

9

Detect topics and activities of groups

temporal dynamics

community detection

Page 10: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

10

Toward the functionalities

• Expert identification
• Question routing
• Community evolution
• Burst topic detection
• Event detection
• etc.

Page 11: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

11

Research Question (RQ1)
• How can we formalize user-generated content?

Page 12: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

12

Research Question (RQ2)
• How can we identify the common topics binding users together?

Page 13: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

13

Research Question (RQ3)
• How can we detect topic-based overlapping communities?

Page 14: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

14

Research Question (RQ4)
• How can we generate a semantic label for topics?

Java Development

Database

Page 15: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

15

Research Question (RQ5)
• How can we extract topic-based expertise and temporal dynamics?

Page 16: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

16

Agenda
• Background & Motivation
• RQ1: Formalize user-generated content
• RQ2: An efficient topic modeling method
• RQ3: Overlapping community detection
• RQ4: From a BOW to semantic labels
• RQ5: Temporal Topic Expertise Activity
• Conclusions and Perspectives

Page 17: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

17

Overview

[Diagram: processing pipeline with four stages: Data Preparation, Data Integration, Data Analysis, Application. Boxes shown: Q&A Data, Open Data, Other Data; Schema Mapping, Data Enrichment, Data Inter-Linking; Integrated Dataset; Community Detection, Topic Extraction, Temporal Analysis; Applications.]

Page 18: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

18

RQ1: How to formalize UGC?

[Overview pipeline diagram repeated, annotated with RQ1, RQ4, RQ3, RQ2, RQ5.]

Page 19: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

19

massive unstructured Q&A content

Page 20: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

20

Formalize UGC with semantic schema

• From unstructured to structured
• Explicit information (questions, answers…)
• Implicit information (interest, expertise…)

[Diagram labels: Original Q/A data, Q/A Triples, SIOC & FOAF, User Expertise, User Interest, QASM, Information Extraction, Data Mining, existing work]

Page 21: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

21

QASM vocabulary

Page 22: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

22

Formalize distribution

Page 23: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

RQ1 Discussion

We propose the QASM vocabulary to formalize both the explicit and the implicit information in user-generated content (Q&A sites).

How can implicit information be extracted from the original explicit information?

Page 24: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

24

RQ2: How to find topics?

[Overview pipeline diagram repeated, annotated with RQ1, RQ4, RQ3, RQ2, RQ5.]

Page 25: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

25

What kinds of topics are they talking about?

Page 26: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

26

What is a topic?
• e.g. given two documents on the topics “music” and “cooking”
• “guitar” and “singer” are more likely to appear in a document about “music”
• “recipe” and “pizza” are more likely to appear in a document about “cooking”
• “the” and “a” will appear equally in both
• score (0 ~ 1) -> relevance (weak ~ strong)
• Topic “music”:

“guitar”  “singer”  “recipe”  “pizza”  “the”  “a”
0.4       0.3       0.1       0.1      0.05   0.05

Page 27: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

27

Latent Dirichlet Allocation (LDA)

Page 28: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

28

Replace Document with User

• Original LDA: Document-Topic-Word
• In our problem: User-Topic-Tag

Page 29: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

29

How to get the topic assignment?
• User and Tag are observed information
• Topic is hidden information

P(topic|user)

P(tag|topic)

Page 30: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

30

How to get the distributions?

• Gibbs sampling
– Sample a new Zi
– Update distributions

User-Topic distribution: θuk

Topic-Tag distribution: θkw
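The two steps on this slide (sample a new Zi, update the distributions) can be sketched as a collapsed Gibbs sampler. This is a minimal illustration of the standard technique with users in place of documents and tags in place of words, not the author's implementation; the function name, the hyperparameters `alpha` and `beta`, the iteration count, and the toy user tag lists are all assumptions.

```python
import random

def gibbs_user_topic_tag(user_tags, num_topics, alpha=0.1, beta=0.01,
                         iters=200, seed=0):
    """Collapsed Gibbs sampling for a User-Topic-Tag model (hypothetical sketch)."""
    rng = random.Random(seed)
    vocab = sorted({t for tags in user_tags.values() for t in tags})
    V = len(vocab)
    n_uk = {u: [0] * num_topics for u in user_tags}               # user-topic counts
    n_kw = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]   # topic-tag counts
    n_k = [0] * num_topics                                        # tags per topic
    z = {}                                                        # topic of each tag occurrence

    # Random initial topic assignment for every tag occurrence.
    for u, tags in user_tags.items():
        z[u] = []
        for w in tags:
            k = rng.randrange(num_topics)
            z[u].append(k)
            n_uk[u][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1

    for _ in range(iters):
        for u, tags in user_tags.items():
            for i, w in enumerate(tags):
                # Remove the current assignment from the counts.
                k = z[u][i]
                n_uk[u][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Sample a new Zi proportional to P(topic|user) * P(tag|topic).
                weights = [
                    (n_uk[u][j] + alpha) * (n_kw[j][w] + beta) / (n_k[j] + V * beta)
                    for j in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Update the counts with the new assignment.
                z[u][i] = k
                n_uk[u][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Smoothed estimates of the user-topic and topic-tag distributions.
    theta_uk = {
        u: [(n_uk[u][k] + alpha) / (len(tags) + num_topics * alpha)
            for k in range(num_topics)]
        for u, tags in user_tags.items()
    }
    theta_kw = [
        {w: (n_kw[k][w] + beta) / (n_k[k] + V * beta) for w in vocab}
        for k in range(num_topics)
    ]
    return theta_uk, theta_kw

# Hypothetical toy input: two users with short tag lists.
user_tags = {
    "alice": ["html", "css", "html", "layout"],
    "bob": ["java", "tomcat", "java", "jvm"],
}
theta_uk, theta_kw = gibbs_user_topic_tag(user_tags, num_topics=2)
```

Each row of `theta_uk` and each entry of `theta_kw` sums to 1, which is the "sum to 1" property of LDA that the later community detection slides contrast with the tag-based method.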

Page 31: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

31

Intuition behind LDA

• How to create a user tag list

Page 32: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

32

Topic-Tag distribution

User-Topic distribution

Loop: choose a topic, choose a tag

html, css, eclipse, mysql, layout, ……
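The loop on this slide (choose a topic, then choose a tag) is the generative story behind LDA. A minimal sketch, assuming illustrative distributions; the topic names, tag names, probabilities and the helper `generate_tag_list` are hypothetical and only show the mechanics.

```python
import random

def generate_tag_list(user_topic, topic_tag, length, seed=0):
    """Generate a user's tag list: pick a topic, then pick a tag from that topic."""
    rng = random.Random(seed)
    topics = list(user_topic)
    tags = []
    for _ in range(length):
        # Choose a topic according to the user-topic distribution.
        topic = rng.choices(topics, weights=[user_topic[t] for t in topics])[0]
        # Choose a tag according to that topic's topic-tag distribution.
        names = list(topic_tag[topic])
        tag = rng.choices(names, weights=[topic_tag[topic][n] for n in names])[0]
        tags.append(tag)
    return tags

# Illustrative (assumed) distributions.
user_topic = {"web": 0.3, "java": 0.6, "database": 0.1}
topic_tag = {
    "web": {"html": 0.6, "css": 0.3, "layout": 0.1},
    "java": {"java": 0.6, "eclipse": 0.2, "tomcat": 0.2},
    "database": {"mysql": 0.7, "sql": 0.3},
}
tag_list = generate_tag_list(user_topic, topic_tag, length=10)
```

Topic inference runs this story in reverse: given only the observed tag lists, it recovers distributions that would make those lists likely.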

Page 33: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

33

Output of LDA (User-Topic-Tag)

User-Topic Distribution

Web Development  Java Development  Database
0.3              0.6               0.1

Topic-Tag Distribution

                  java  mysql  tomcat  html
Web Development   0.1   0.1    0.2     0.6
Java Development  0.6   0.1    0.2     0.1
Database          0.1   0.6    0.2     0.1

Page 34: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

Short summary

• Goal: we want to find topics, overlapping communities, user expertise, user activities…

• Method: LDA may solve the problems
• But LDA has problems, e.g. it is slow and complex

• Find an efficient and simple topic modeling method

34

Page 35: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

35

Some empirical findings

-> Find topics based on tags

Highly frequent tags are more general

The first tag normally indicates the domain

Each question has 1~5 tags indicating the key points

Page 36: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

36

Solution: Prefix Tree structure

Q1: html css element
Q2: html layout float
Q3: html layout css-layout
Q4: html forms select input
Q5: html forms autocomplete

html
  css
    element
  layout
    float
    css-layout
  forms
    select
      input
    autocomplete

1. The root tag can be used to represent the children tags
2. Tags in a tree belong to the same topic
3. The order is maintained in the tree structure
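The prefix tree for questions Q1 to Q5 can be built by inserting each ordered tag list as a path. A minimal sketch; the nested-dictionary layout, the per-node `count` field and the function name `build_prefix_tree` are assumptions made for illustration.

```python
def build_prefix_tree(questions):
    """Insert each ordered tag list as a path; count how often each node is visited."""
    root = {}
    for tags in questions:
        node = root
        for tag in tags:
            entry = node.setdefault(tag, {"count": 0, "children": {}})
            entry["count"] += 1
            node = entry["children"]
    return root

# Tag lists of Q1..Q5 from the slide, in their original order.
questions = [
    ["html", "css", "element"],
    ["html", "layout", "float"],
    ["html", "layout", "css-layout"],
    ["html", "forms", "select", "input"],
    ["html", "forms", "autocomplete"],
]
tree = build_prefix_tree(questions)
```

Because the first tag of every question becomes a root, all five questions end up in the single tree rooted at `html`, and the tag order is preserved along each path.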

Page 37: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

37

html

HTML prefix tree for StackOverflow

Page 38: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

38

Combine Prefix Trees
• why: some trees should be in the same topic
• how: compute the root tag similarity matrix
• output: combine trees to get topics

[Diagram: prefix trees rooted at html, javascript and java, with child tags including layout, mysql, forms, tomcat, jvm, css, jquery, json; the links between roots carry the similarities below.]

similarity   html   javascript   java
html         1      0.89         0.35
javascript   0.89   1            0.29
java         0.35   0.29         1
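One way to turn the root tag similarity matrix into topics is to merge every pair of roots whose similarity reaches a threshold. The slides do not say how trees are combined, so the threshold of 0.5 and the union-find merging below are assumptions; only the three similarity values come from the slide.

```python
def combine_roots(similarity, threshold):
    """Group root tags whose pairwise similarity is at least `threshold`."""
    roots = sorted({r for pair in similarity for r in pair})
    parent = {r: r for r in roots}

    def find(r):
        # Follow parent links to the representative, compressing the path.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for r in roots:
        groups.setdefault(find(r), []).append(r)
    return sorted(groups.values())

# Root tag similarities from the slide.
similarity = {
    ("html", "javascript"): 0.89,
    ("html", "java"): 0.35,
    ("javascript", "java"): 0.29,
}
topics = combine_roots(similarity, threshold=0.5)
```

With this assumed threshold, the `html` and `javascript` trees form one topic and the `java` tree stays on its own.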

Page 39: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

39

How to get the topic-tag distribution?

• MLE (Maximum Likelihood Estimation): divide each tag count by the topic total

Topic1: Web-dev (total: 100)

tag         count  probability
html        50     0.5
javascript  50     0.5
layout      10     0.1
mysql       20     0.2
forms       20     0.2
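The maximum likelihood estimate on this slide amounts to dividing each tag count by the topic total. A minimal sketch using the slide's counts; the function name is hypothetical, and the total of 100 is taken from the slide rather than recomputed, since the tree counts are nested.

```python
def topic_tag_distribution(tag_counts, topic_total):
    """MLE: probability of a tag in a topic = tag count / topic total."""
    return {tag: count / topic_total for tag, count in tag_counts.items()}

# Counts shown on the slide for Topic1 (Web-dev), whose total is 100.
counts = {"html": 50, "javascript": 50, "layout": 10, "mysql": 20, "forms": 20}
dist = topic_tag_distribution(counts, topic_total=100)
```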

Page 40: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

40

Topic-tag distribution

probability

tags

topics

sql database

highly related

eclipse

not related

Page 41: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

41

Topic Extraction experiment setup

• Dataset
– Stackoverflow (2008/08 to 2009/09)
– 103K users
– 242K questions and 870K answers

• Baseline Algorithms
– LDA (Latent Dirichlet Allocation)

Page 42: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

42

Topic extraction evaluation metric

• Metric: Perplexity
– how likely a model would generate the test dataset

• Example:

Training Set

          html  width  css
model 1   0.9   0.05   0.05   (less likely)
model 2   0.4   0.4    0.2    (more likely)

test case 1: html width css
test case 2: html width
test case 3: html width
test case 4: html width css

Higher probability of generating the test dataset -> lower perplexity score -> better performance
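The example can be checked numerically. The sketch below uses the usual definition of perplexity, the exponential of the negative average log-probability per tag, and treats every tag as drawn independently from a single distribution; that simplification and the function name are assumptions, while the probabilities and test cases are the ones on the slide.

```python
import math

def perplexity(model, test_cases):
    """exp(-average log-probability per tag) over all tags in the test cases."""
    log_likelihood = 0.0
    n = 0
    for case in test_cases:
        for tag in case:
            log_likelihood += math.log(model[tag])
            n += 1
    return math.exp(-log_likelihood / n)

model_1 = {"html": 0.9, "width": 0.05, "css": 0.05}
model_2 = {"html": 0.4, "width": 0.4, "css": 0.2}
test_cases = [
    ["html", "width", "css"],
    ["html", "width"],
    ["html", "width"],
    ["html", "width", "css"],
]
p1 = perplexity(model_1, test_cases)
p2 = perplexity(model_2, test_cases)
```

Model 2 assigns a higher probability to the test cases, so `p2` comes out lower than `p1`, which matches the "more likely" label on the slide.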

Page 43: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

43

Perplexity score compared with LDA (lower is better)

[Chart: perplexity of LDA and of our method; marked WE.]

Page 44: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

44

Scalability compared with LDA

LDA

Page 45: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

RQ2 Discussion
• If two tags co-occur many times, they should be in the same topic (in the same topic tree in our method)
• The probability of a tag in a topic is approximated by its frequency in that topic if the observed data is large enough

• The question tag list is short (3~5 tags), which is not suitable for LDA to get very good results

• Tested on a Flickr dataset, the method can also generate meaningful topics

• TTD: a simple and efficient topic modeling method that preserves topic quality

Page 46: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

46

RQ3: How to detect communities?

[Overview pipeline diagram repeated, annotated with RQ1, RQ4, RQ3, RQ2, RQ5.]

Page 47: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

47

Which users are interested in the same topic?

Page 48: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

48

Existing community detection approaches

We focus on a simple, efficient, topic-based, overlapping community detection method

Page 49: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

49

How to get the user-topic distribution?

• topic-tag distribution + user tag list

User tag list: Java *22, jsp *15, tomcat *11, html *9, ……

Topic-tag distribution from the last step:

           Java   jsp    tomcat  html
Web-Dev    0.10   0.20   0.30    0.30
Java-Dev   0.50   0.20   0.25    0.05
C#-Dev     0.05   0.05   0.15    0.10

Web-Dev:  0.10*22 + 0.20*15 + 0.30*11 + 0.30*9 = 11.2
Java-Dev: 0.50*22 + 0.20*15 + 0.25*11 + 0.05*9 = 17.2
C#-Dev:   0.05*22 + 0.05*15 + 0.15*11 + 0.10*9 = 4.4
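The arithmetic on this slide is a weighted sum: for each topic, multiply the topic-tag probability of every tag by the number of times the user used that tag. A minimal sketch; the numbers reproduce the slide's totals (11.2, 17.2, 4.4), but the assignment of the counts 22, 15, 11 and 9 to the tags Java, jsp, tomcat and html is an assumption read off the garbled figure, and the function name is hypothetical.

```python
def user_topic_scores(user_tag_counts, topic_tag):
    """Score each topic as sum over tags of P(tag|topic) * user's tag count."""
    return {
        topic: sum(probs.get(tag, 0.0) * count
                   for tag, count in user_tag_counts.items())
        for topic, probs in topic_tag.items()
    }

# User tag list with counts (tag-to-count pairing assumed).
user_tag_counts = {"java": 22, "jsp": 15, "tomcat": 11, "html": 9}

# Topic-tag probabilities from the slide.
topic_tag = {
    "Web-Dev": {"java": 0.10, "jsp": 0.20, "tomcat": 0.30, "html": 0.30},
    "Java-Dev": {"java": 0.50, "jsp": 0.20, "tomcat": 0.25, "html": 0.05},
    "C#-Dev": {"java": 0.05, "jsp": 0.05, "tomcat": 0.15, "html": 0.10},
}
scores = user_topic_scores(user_tag_counts, topic_tag)
```

The scores are not normalized across topics, so a high score for one topic does not force a low score for another.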

Page 50: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

50

User-topic distribution

[Heat map: users against topics, shaded from low interest to high interest; highlighted example: user12960 on web-dev and java-dev.]

Page 51: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

51

How to find overlapping communities
• use the user-topic distribution
• each topic represents one community
• for example, threshold: 0.3

[Diagram: users linked to the communities Web-Dev, Java-Dev and C#-Dev with the scores 0.75, 0.32, 0.42, 0.15, 0.78, 0.23, 0.15, 0.78, 0.82.]
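With one community per topic, thresholding the user-topic scores gives overlapping communities directly. A minimal sketch with the slide's example threshold of 0.3; the user names and the way the scores are assigned to users are hypothetical, since the figure does not survive in the transcript.

```python
def overlapping_communities(user_topic, threshold):
    """Put a user into every topic community where the score reaches the threshold."""
    communities = {}
    for user, scores in user_topic.items():
        for topic, score in scores.items():
            if score >= threshold:
                communities.setdefault(topic, set()).add(user)
    return communities

# Hypothetical users and scores, for illustration only.
user_topic = {
    "user_a": {"Web-Dev": 0.75, "Java-Dev": 0.32, "C#-Dev": 0.15},
    "user_b": {"Web-Dev": 0.42, "Java-Dev": 0.78, "C#-Dev": 0.23},
    "user_c": {"Web-Dev": 0.15, "Java-Dev": 0.78, "C#-Dev": 0.82},
}
communities = overlapping_communities(user_topic, threshold=0.3)
```

Here `user_a` and `user_b` each belong to both Web-Dev and Java-Dev, which is the overlap the method is after.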

Page 52: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

52

Overlapping Community Detection experiment setup

• Dataset
– Stackoverflow (2008/08 to 2009/09)
– 103K users
– 242K questions and 870K answers

• Compared Algorithms
– SLPA (Speaker-listener Label Propagation Algorithm)
– LDA (Latent Dirichlet Allocation)
– Ward (hierarchical clustering algorithm)

Page 53: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

53

Evaluation metrics

• metric: Jaccard Similarity & Cosine Similarity
– avg_inner: users in a community are closer to each other
– avg_rand: users in a community are far away from users outside
– avg_center: users in a community are closer to the center
– nmi = avg_inner/avg_rand

The larger the better
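The two similarity measures can be sketched as follows. The slides give only the intuition behind avg_inner, so computing it as the mean pairwise similarity between members of a community is an assumption, as are the function names and the toy users.

```python
import math
from itertools import combinations

def jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag sets: |A and B| / |A or B|."""
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0

def cosine(vec_a, vec_b):
    """Cosine similarity of two sparse tag-count vectors given as dicts."""
    dot = sum(v * vec_b.get(k, 0) for k, v in vec_a.items())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def avg_inner(members, profiles, similarity):
    """Assumed definition: mean pairwise similarity between community members."""
    pairs = list(combinations(members, 2))
    if not pairs:
        return 0.0
    return sum(similarity(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

# Hypothetical users described by their tag sets.
profiles = {
    "u1": {"html", "css"},
    "u2": {"html", "css", "layout"},
    "u3": {"html", "javascript"},
}
inner = avg_inner(["u1", "u2", "u3"], profiles, jaccard)
```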

Page 54: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

54

Jaccard Similarity

[Charts for avg_inner, avg_rand, avg_center and nmi; all four panels are marked WE.]

Page 55: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

55

Cosine Similarity

[Charts for avg_inner, avg_rand, avg_center and nmi; two panels are marked WE and two are marked LDA.]

LDA wins, but LDA has a “sum to 1” restriction

Page 56: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

56

RQ3 Discussion

• Both our method and LDA can detect topic-based communities; graph-based and clustering methods cannot.

• Compared with LDA, our method does not have a “sum-to-1” restriction (a high interest in one community does not necessarily lower the interest in another community)

• Our method is simple and efficient compared with LDA

Page 57: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

57

RQ4: How to label topics?

[Overview pipeline diagram repeated, annotated with RQ1, RQ4, RQ3, RQ2, RQ5.]

Page 58: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

58

Bag of words is hard to view and manage

Topic 1

Topic 2

Page 59: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

59

Existing topic labeling approaches

Page 60: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

60

Using External Knowledge: Link to DBpedia

Java

http://dbpedia.org/resource/Java_(programming_language)

For example

http://dbpedia.org/resource/Java

Disambiguation

Page 61: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

61

Disambiguation

Content cosine similarity between the “Java” tag description in StackOverflow and each candidate:
– Java (Place) description: 0.31
– Java (P.L.) description: 0.58

• Method 1: Babelfy (Moro et al. 2014)
• Method 2: DBpedia Lookup service
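The content-similarity idea can be sketched with a bag-of-words cosine: compare the tag description with the description of each candidate resource and keep the best match. This is an illustration only; the tokenization, the function names and all three description strings are made-up placeholders, and only the two DBpedia URIs come from the slides.

```python
import math
from collections import Counter

def text_cosine(text_a, text_b):
    """Cosine similarity between bag-of-words vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def disambiguate(tag_description, candidates):
    """Return the candidate URI whose description is closest to the tag description."""
    return max(candidates, key=lambda uri: text_cosine(tag_description, candidates[uri]))

# Placeholder descriptions (not real StackOverflow or DBpedia text).
tag_description = "java is a general purpose object oriented programming language"
candidates = {
    "http://dbpedia.org/resource/Java":
        "java is an island of indonesia",
    "http://dbpedia.org/resource/Java_(programming_language)":
        "java is a general purpose programming language that is object oriented",
}
best = disambiguate(tag_description, candidates)
```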

Page 62: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

62

Find missing links and resources

http://dbpedia.org/resource/Apache_Tomcat
http://dbpedia.org/resource/Java_p.l.
http://dbpedia.org/resource/eclipse
http://dbpedia.org/resource/j2ee

SPARQL queries to DBpedia

Page 63: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

63

Find the center

Java Development

Algorithms to find central node

the central node

Page 64: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

64

Algorithms to find the center

• InDegree (ID)
• Betweenness Centrality (BC)
• Degree Centrality (DC)
• PageRank (Page 1999) (PR)
• Random
• Top (sorted topic-tag distribution)
• Most (most selected by the above algorithms)
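Two of the listed measures, InDegree and PageRank, can be sketched in a few lines on a small directed graph of linked resources. The edge list below is hypothetical (it does not come from DBpedia), the damping factor and iteration count are conventional assumptions, and the slides do not say which implementation was used.

```python
def in_degree(edges):
    """Number of incoming links per node."""
    degree = {n: 0 for edge in edges for n in edge}
    for _, target in edges:
        degree[target] += 1
    return degree

def pagerank(edges, damping=0.85, iters=50):
    """Power-iteration PageRank; rank of dangling nodes is spread uniformly."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {
            n: (1.0 - damping) / len(nodes) + damping * dangling / len(nodes)
            for n in nodes
        }
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank

# Hypothetical links between resources of one topic.
edges = [
    ("Apache_Tomcat", "Java"),
    ("Eclipse", "Java"),
    ("J2EE", "Java"),
    ("J2EE", "Apache_Tomcat"),
]
degrees = in_degree(edges)
ranks = pagerank(edges)
center_by_in_degree = max(degrees, key=degrees.get)
center_by_pagerank = max(ranks, key=ranks.get)
```

On this toy graph both measures pick the same central node, which would then serve as the topic label.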

Page 65: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

65

Topic Labeling experiment: Survey

Page 66: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

66

Evaluation: NDCG
• Normalized Discounted Cumulative Gain
– used to compare two ranked lists
– Perfect match: NDCG=1.0
– Completely wrong: NDCG=0.0

Ground Truth: Survey Result
HTML 10
Firefox 8
Web-Development 7
CSS 1
Browser 0

Algorithm output
HTML 0.81
Web-Development 0.52
Firefox 0.31
CSS 0.02
Browser 0.01
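The slide's example can be scored with a short NDCG routine. The slides do not state which DCG variant was used, so the linear gain with a log2 position discount below is an assumption, as are the function names; the survey scores and the algorithm's ranking are taken from the slide.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain with linear gain and log2 position discount."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg(ranked_items, relevance):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    gains = [relevance[item] for item in ranked_items]
    ideal = sorted(relevance.values(), reverse=True)[:len(gains)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg else 0.0

# Ground truth: survey scores per label.
survey = {"HTML": 10, "Firefox": 8, "Web-Development": 7, "CSS": 1, "Browser": 0}

# Algorithm output, ordered by its scores (0.81, 0.52, 0.31, 0.02, 0.01).
algorithm_ranking = ["HTML", "Web-Development", "Firefox", "CSS", "Browser"]

score = ndcg(algorithm_ranking, survey)
```

The algorithm only swaps Firefox and Web-Development relative to the survey, so the score is close to, but below, 1.0.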

Page 67: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

67

Experiment:NDCG

NDCG=1 : perfect match

Page 68: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

68

RQ4 Discussion

• Many words cannot be linked to DBpedia
• One label is not enough; two or three labels are much better
• By using the external knowledge base DBpedia, we propose a method to automatically generate semantic labels for a bag of words.

Page 69: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

69

RQ5: How to model temporal dynamics?

[Overview pipeline diagram repeated, annotated with RQ1, RQ4, RQ3, RQ2, RQ5.]

Page 70: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

70

Who is active now?
On which topic are they active?
In which topic do they have expertise?

Page 71: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

71

Related Work

Page 72: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

72

LDA -> TTEA

Temporal

Expertise

Activity

Page 73: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

73

TTEA Model details

How to get topic assignment?

User-Topic distribution

Topic-Word distribution

Topic-Tag distribution

Topic-Time distribution

Topic-Expertise distribution

Page 74: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

74

How to get the distributions?

• Gibbs sampling

Sample a new Zi

Update distributions

Page 75: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

Intuition behind TTEA (Temporal)

Topic-Tag distribution

User-Topic distribution

html, css, eclipse, mysql, layout, ……

75

Topic-Time distribution

June 2016

Page 76: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

Intuition behind TTEA (Expertise)

Topic-Tag distribution

User-Topic distribution

html, css, eclipse, mysql, layout, ……

76

Topic-Exp distribution

June 2016 52

Page 77: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

77

Experiments and Evaluations

• StackOverflow dataset (07/2008-11/2013)

Page 78: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

78

Topic Extraction experiment setup

• Baseline Algorithms
– TEM (Yang 2013b): topic, expertise
– UQA (Guo 2008b): topic, categories
– GrosToT (Hu 2014): topic, temporal
– TTEA (ours): topic, expertise, activity, temporal

Page 79: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

79

Question Routing Task

• For each question in the test dataset, we recommend 5, 10, 20 or 30 users

• MSC: the number of successful predictions

[Bar chart: msc@5, msc@10, msc@20 and msc@30 for TTEA-ACT, TEM, UQA, GROSTOT and RANDOM; y-axis from 0 to 0.45; marked WE.]
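A possible reading of msc@k is sketched below. The slide defines MSC only as the number of successful predictions; counting a prediction as successful when at least one actual answerer appears among the top k recommended users, and dividing by the number of test questions (suggested by the 0 to 0.45 axis), are both assumptions. The function name and the toy data are hypothetical.

```python
def msc_at_k(recommendations, answerers, k):
    """Share of test questions with at least one actual answerer in the top k."""
    if not recommendations:
        return 0.0
    hits = sum(
        1
        for question, ranked_users in recommendations.items()
        if set(ranked_users[:k]) & answerers.get(question, set())
    )
    return hits / len(recommendations)

# Hypothetical ranked recommendations and actual answerers per question.
recommendations = {
    "q1": ["u3", "u7", "u1", "u9"],
    "q2": ["u2", "u4", "u8", "u5"],
}
answerers = {"q1": {"u1"}, "q2": {"u6"}}

msc_2 = msc_at_k(recommendations, answerers, k=2)
msc_3 = msc_at_k(recommendations, answerers, k=3)
```

Recommending more users can only keep the score or raise it, which is consistent with reporting msc@5 through msc@30 side by side.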

Page 80: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

80

Best answer prediction

• When we recommend 100 users (out of 6.2M users) for each test question, in around 44% of cases one of them not only answers the question but also receives the highest vote.

Page 81: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

81

Temporal illustrations

[Charts by Month, Day and Hour, at global level and at user level. Annotations: users in the same topic behave differently; user in different topics.]

Page 82: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

82

RQ5 Discussion

• TTEA: an extended LDA model to extract expertise, activity, and temporal dynamics.

• The extracted information could benefit question routing and expert detection tasks.

Page 83: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

83

Agenda
• Background & Motivation
• RQ1: Formalize user-generated content
• Apply LDA on user-generated content
• RQ2: An efficient topic modeling method
• RQ3: Overlapping community detection
• RQ4: From a BOW to semantic labels
• RQ5: Temporal Topic Expertise Activity
• Conclusions and Perspectives

Page 84: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

84

Overview of contributions

• temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

• key points:
– temporal analysis -> Community/topic evolution
– semantic analysis -> Topic Extraction
– social networks -> Community Detection
– user-generated content -> Question Answer site

Page 85: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

85

Detailed answers to questions

• RQ1: QASM: formalize implicit and explicit content

• RQ2: TTD: a simple and fast topic modeling method

• RQ3: a TTD-based overlapping community detection method

• RQ4: a DBpedia-based topic labeling method

• RQ5: TTEA: joint model of Topic, Temporal, Expertise and Activity

Page 86: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

86

Limitations & Perspectives

• RQ1: How to formalize UGC?
– formalize single platform vs. cross platform

• RQ2: How to detect topics?
– automatically generate tags from content

• RQ3: How to find overlapping communities?
– combine with graph-based community detection

• RQ4: How to generate labels for BOW?
– use extra knowledge bases or create links

• RQ5: How to extract temporal dynamics and expertise?
– use all the extracted information to provide more functions

Page 87: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

87

Publications

• Zide Meng, Fabien L. Gandon, Catherine Faron-Zucker, Ge Song: Detecting topics and overlapping communities in question and answer sites. Journal of Social Network Analysis and Mining. 2015

• Zide Meng, Fabien L. Gandon, Catherine Faron-Zucker: Overlapping Community Detection and Temporal Analysis on Q&A Sites. Journal of Web Intelligence and Agent Systems 2016. (to appear)

• Zide Meng, Fabien L. Gandon, Catherine Faron-Zucker: Joint model of topics, expertises, activities and trends for question answering web applications. IEEE/WIC/ACM 2016 (to appear)

• Zide Meng, Fabien L. Gandon, Catherine Faron-Zucker: Simplified detection and labeling of overlapping communities of interest in Q&A sites. IEEE/WIC/ACM Web Intelligence 2015

• Zide Meng, Fabien L. Gandon, Catherine Faron-Zucker: QASM: a Q&A Social Media system based social semantics. ISWC 2014.

• Zide Meng, Fabien L. Gandon, Catherine Faron-Zucker, Ge Song: Empirical study on overlapping community detection in question and answer sites. ASONAM 2014: 344-348

• Jean-Michel Dalle, Catherine Faron-Zucker, Fabien L. Gandon, Mathieu Lacage, Zide Meng: Online Knowledge Triage: Searching, Detecting, Labelling and Orienting User Generated Content. WWW (Companion Volume) 2016

Page 88: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

88

Thank you!

Page 89: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

89

• check when you use TAG and when you use WORD

• read reports of your reviewers and prepare answers : get ready

• documents of defense

Page 90: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

90

Page 91: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

91

Page 92: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

92

Evaluations (1/4)
Topic -> Perplexity

[Chart: perplexity score against number of topics; marked WE.]

Page 93: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

93

Topic

Topic over tag distribution

0.75

0.23

0.02

Html

css

eclipse

very related

not related

webdevelopment

Page 94: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

94

Temporal

Topic over time distribution

0.75

0.23

0.02

May

June

July

very popular

not popular

webdevelopment

Page 95: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

95

Expertise

User’s Expertise over topic distribution

0.75

0.23

0.02

Web-Dev

Java-Dev

C#-Dev

has high expertise

has low expertise

Page 96: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

96

Activity

Topic over user distribution

0.75

0.23

0.02

very active

not active

webdevelopment

Page 97: Temporal and semantic analysis of richly typed social networks from user-generated content sites on the web

97