quantitative study of innovation and knowledge building in questions&answers system with math...

Post on 07-Aug-2015

INSTITUTE OF PHYSICS BELGRADE

Introduction

Network topology

Entropy measures

Temporal patterns

Summary

Quantitative Study of Innovation and Knowledge Building in Questions & Answers System with Math Tags

Marija Mitrovic Dankulov, Bosiljka Tadic

Scientific Computing Laboratory, Institute of Physics Belgrade

University of Belgrade, Pregrevica 118, 11080 Belgrade

Collective Knowledge Building

A socio-cultural process that takes place through the self-organized dynamics of interactions among individuals.

Conditions that support collective knowledge building:

(i) Problems are posed as attempts to understand the world or a field.

(ii) The coherence, quality and utility of ideas are improved.

(iii) Interaction: participants negotiate the fit between their own ideas and those of others.

(iv) All participants must contribute.

(v) Knowledge-building discourse goes beyond knowledge sharing: participants engage in constructing, refining and transforming knowledge.

KnowEscape 2014| M. Mitrovic Dankulov: Quantitative study of knowledge building

Questions & Answers Sites

Rich repositories for studying the dynamics of collective knowledge building.

On Q&A sites:

Participants ask, answer and vote on questions.

They comment and engage in discussion about questions and answers.

All participants contribute through different types of actions: posting and voting on questions, answers and comments. They construct (ask/answer), refine (comment/vote) and transform knowledge.

Data: Stack Exchange

Stack Exchange: expert answers to your questions! A network of 130 Q&A sites where participants answer informational and factual questions.

Mathematics:

Data for a four-year period: from the beginning (July 2010) until April 2014.

Rich dataset: 77895 users posted 269819 questions, 400511 answers and 1265445 comments.

High temporal resolution.

Tags: a list of up to 5 tags is assigned to each question. Overall 1040 tags: calculus, linear algebra, complex analysis, application, ...

Quantitative study of knowledge building: methods

Tools and methods from statistical physics and complex network theory.

Complex networks: topological structure.

Entropy measures of user activity and of activity on different tags.

Time series analysis: power spectrum, avalanches, fluctuations.

Network mapping

Weighted bipartite network

Two partitions: Users and Questions.

Link weight: number of answers/comments.

Structural properties of the bipartite network and its projections onto the Question and User partitions.

[M. Mitrovic et al., EPJB 73, 293-301, (2010).]
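The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `build_bipartite` and `project_to_users`, the input format (one `(user, question)` pair per answer or comment) and the choice of projection weight (number of questions two users share) are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_bipartite(interactions):
    """Return {(user, question): weight}, where weight counts the
    answers/comments that the user posted on that question.

    `interactions` is an iterable of (user, question) pairs, one pair
    per answer or comment (assumed input format).
    """
    weights = defaultdict(int)
    for user, question in interactions:
        weights[(user, question)] += 1
    return dict(weights)


def project_to_users(weights):
    """Project the bipartite network onto the User partition.

    Two users are linked if they were active on the same question; the
    link weight is the number of shared questions (an assumed choice).
    """
    users_by_question = defaultdict(set)
    for user, question in weights:
        users_by_question[question].add(user)

    projection = defaultdict(int)
    for users in users_by_question.values():
        for u, v in combinations(sorted(users), 2):
            projection[(u, v)] += 1
    return dict(projection)


interactions = [("u1", "q1"), ("u1", "q1"), ("u2", "q1"), ("u2", "q2")]
bipartite = build_bipartite(interactions)
user_network = project_to_users(bipartite)
```

The projection onto the Question partition follows the same pattern with the roles of users and questions exchanged.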

Topology

Broad degree distributions for both partitions, stable over time and across tags.

[Figure: distributions on log-log axes. Two panels show P(s) versus s for the 1st, 2nd, 3rd and 4th year (one panel labelled Users); two panels show P(q) versus q for the tags homework, calculus, real-analysis and linear-algebra.]
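As a rough illustration of how such a distribution could be estimated from data, the sketch below computes the empirical probability of each observed value. The function name `empirical_distribution` is hypothetical, and because the slide does not define s and q, the function simply takes a generic list of per-node values.

```python
from collections import Counter


def empirical_distribution(values):
    """Return {value: fraction of nodes with that value}, sorted by value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in sorted(counts.items())}


# Toy example: four nodes with values 1, 1, 2 and 4.
distribution = empirical_distribution([1, 1, 2, 4])
```

For broad distributions like those on the slide, logarithmic binning of the values is a common refinement before plotting on log-log axes.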

Community structure

Two-week activity network.

Community detection method: Louvain method. [V. D. Blondel et al., JSTAT 2008 (10), P10008, (2008).]

Communities form around a few very active experts.

Focus and expertise of users

[Figure: histograms of user entropy H (horizontal axis) against number of users (vertical axis), in two panels: Questions and Answers+Comments.]

User activity on separate tags: $X_i = \{n_1, \dots, n_{max}\}$; total activity $\Sigma_i = \sum_l n_l$.

User's entropy: $H_i = -\sum_l \frac{n_l}{\Sigma_i} \log \frac{n_l}{\Sigma_i}$

Lower $H_i$ means higher focus.

[Adamic et al., Proceedings of WWW'08, (2008).]
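The entropy definition above translates directly into code. The following is a minimal sketch under stated assumptions: the function name `user_entropy` is hypothetical, and the logarithm base (2 by default here) is an assumption, since the slide does not state it.

```python
import math


def user_entropy(tag_counts, base=2.0):
    """Entropy H_i of one user's activity across tags.

    `tag_counts` holds n_l, the number of the user's posts on each tag.
    Tags with zero activity contribute nothing to the sum.
    The logarithm base is an assumption, not taken from the slide.
    """
    total = sum(tag_counts)  # Sigma_i, the user's total activity
    if total == 0:
        return 0.0
    return sum(
        -(n / total) * math.log(n / total, base)
        for n in tag_counts
        if n > 0
    )


focused = user_entropy([12, 0, 0, 0])   # all activity on one tag
spread = user_entropy([3, 3, 3, 3])     # activity spread evenly over 4 tags
```

A user active on a single tag gets the minimum entropy of zero, while activity spread evenly over m tags gives the maximum, log(m) in the chosen base.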

Zipf's and Heaps' law

Heaps' law

[Figure: D(N) versus N on log-log axes, for Tags and Combination of Tags.]

$D(N) \sim N^{\beta}$, $\beta < 1$: sublinear growth, with $\beta = 0.27$ (Tags) and $\beta = 0.92$ (Combination of Tags).
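One way to estimate a Heaps exponent from a stream of tag occurrences is sketched below. The names `heaps_curve` and `fit_loglog_slope` are hypothetical, and the plain least-squares fit in log-log space is an assumed method; the slide does not say how the exponents were fitted.

```python
import math


def heaps_curve(tag_stream):
    """Return [(N, D(N))]: number of distinct tags D after N occurrences."""
    seen = set()
    curve = []
    for n, tag in enumerate(tag_stream, start=1):
        seen.add(tag)
        curve.append((n, len(seen)))
    return curve


def fit_loglog_slope(points):
    """Least-squares slope of log(y) against log(x) (assumed fitting method)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


curve = heaps_curve(["calculus", "algebra", "calculus", "topology"])
beta = fit_loglog_slope([(1, 1), (2, 2), (4, 4)])  # linear growth, slope 1
```

For combinations of tags, each element of the stream could be the sorted tuple of a question's tags instead of a single tag.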

Zipf's law

[Figure: f(R) versus rank R on log-log axes, for Tags and Combination of Tags.]

$f(R) \sim R^{-\alpha}$, with $\alpha = 1.47$ (Tags) and $\alpha = 1$ (Combination of Tags).
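The rank-frequency data behind a Zipf plot can be built as in this minimal sketch. The function name `rank_frequency` is hypothetical; ties between equally frequent tags are broken arbitrarily, which does not affect the frequencies themselves.

```python
from collections import Counter


def rank_frequency(tags):
    """Return [(R, f(R))]: frequency of the R-th most frequent tag."""
    counts = Counter(tags)
    frequencies = sorted(counts.values(), reverse=True)
    return list(enumerate(frequencies, start=1))


ranks = rank_frequency(["a", "b", "a", "c", "a", "b"])
```

The exponent α could then be estimated from the slope of log f(R) against log R, with the sign reversed.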

Entropy of events associated to Tag

[Figure: normalized entropy S/log(K) versus K (logarithmic horizontal axis), for the data and for a reshuffled sequence.]

K: number of occurrences of the Tag.

Ψ: the sequence of events, divided into K equal intervals; $f_l$ is the number of occurrences of the Tag in interval $l$.

$S = -\sum_{l=1}^{K} \frac{f_l}{K} \log\left(\frac{f_l}{K}\right)$

$S = 0$: all events are in one interval; $S_{max} = \log(K)$: events are equally distributed.
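The definition above can be sketched as follows. The function name `tag_entropy` and the input format are assumptions: `positions` holds the index of each occurrence of the tag within the full event sequence Ψ, and `total_events` is the length of Ψ. The result is already divided by log(K), matching the quantity on the vertical axis of the figure.

```python
import math


def tag_entropy(positions, total_events):
    """Normalized entropy S/log(K) of one tag's occurrences.

    positions:    indices (0-based) of the tag's occurrences in the
                  event sequence (assumed input format)
    total_events: length of the whole event sequence
    """
    k = len(positions)
    if k < 2:
        return 0.0  # log(K) is zero or undefined; treat as fully concentrated

    # Split the sequence into K equal intervals and count occurrences f_l.
    counts = [0] * k
    for p in positions:
        interval = min(p * k // total_events, k - 1)
        counts[interval] += 1

    s = sum(-(f / k) * math.log(f / k) for f in counts if f > 0)
    return s / math.log(k)


bursty = tag_entropy([0, 1, 2, 3], total_events=100)     # all in one interval
uniform = tag_entropy([0, 25, 50, 75], total_events=100)  # one per interval
```

A reshuffled baseline, as in the figure, could be obtained by drawing the same number of positions at random from the sequence and recomputing the entropy.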

Power spectrum

The power spectrum is of 1/f type at small frequencies, indicating long-term correlations.

[Figure: time series in 10-minute bins, t[10min], of p(t) (New users), Na(t) (all), and N(t) for the tags homework and calculus, shown together with their binned power spectra on log-log axes.]

[M. Mitrovic et al., JSTAT 2011, P02005, (2011).]
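A power spectrum of an activity time series can be computed with a discrete Fourier transform. The sketch below uses a direct, quadratic-time transform from the standard library so that it is self-contained; the function name `power_spectrum` and the normalization are assumptions, and for real data a fast Fourier transform from a numerical library would be the practical choice.

```python
import cmath
import math


def power_spectrum(series):
    """Return [(frequency, power)] for frequencies k/n, k = 1 .. n//2.

    The mean is subtracted first, so the zero-frequency component is
    omitted. Power is |DFT coefficient|^2 / n (assumed normalization).
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]

    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


# Toy signal: a sine wave with 4 full periods over 64 time bins.
signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
spectrum = power_spectrum(signal)
peak_frequency = max(spectrum, key=lambda point: point[1])[0]
```

For a 1/f-type signal, the power plotted against frequency on log-log axes falls roughly on a straight line of slope close to -1 at small frequencies.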

Avalanche distribution

Time series of events N(t) ⇒ time series of avalanches $S_i$.

[Figure: distributions of avalanche sizes P(S) and avalanche durations P(T) on log-log axes, for all events and for the tags homework and calculus, and a panel showing a segment of the time series of events N(t) against t.]

Broad distributions of avalanche sizes and durations ⇒ self-organized criticality (SOC).
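The slide does not spell out how avalanches are extracted from N(t). A common convention, assumed here, is that an avalanche is a run of consecutive time bins in which the signal stays above a baseline; its size is the summed excess over the baseline and its duration is the number of bins. The function name `avalanches` and the `baseline` parameter are hypothetical.

```python
def avalanches(series, baseline=0):
    """Return [(size, duration)] for each avalanche in the time series.

    An avalanche is a maximal run of bins with value > baseline
    (assumed definition; the slide does not state one).
    """
    result = []
    size = 0
    duration = 0
    for value in series:
        if value > baseline:
            size += value - baseline
            duration += 1
        elif duration > 0:
            # The signal dropped back to the baseline: close the avalanche.
            result.append((size, duration))
            size = 0
            duration = 0
    if duration > 0:
        result.append((size, duration))  # avalanche still open at the end
    return result


events = [0, 2, 3, 0, 0, 1, 0]
found = avalanches(events)
```

The sizes and durations collected this way are the inputs to distributions such as P(S) and P(T).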

Avalanche size returns

[Figure: distribution P(d) of returns versus d/σ, with a logarithmic vertical axis, for the tags homework and calculus.]

Return: $d_i = S_{i+1} - S_i + \Delta$

$P(d) = P_0 \left[1 - (1-q)\left(\frac{d}{\sigma}\right)^2\right]^{\frac{1}{1-q}}$

SOC ⇒ peaked distribution with a fat tail.
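The fitting form above is a q-Gaussian. The sketch below evaluates it and computes returns from a list of avalanche sizes. The function names are hypothetical, the returns are taken as plain consecutive differences (the Δ term on the slide is not modeled), and no parameter values from the study are implied.

```python
import math


def size_returns(sizes):
    """Consecutive differences S_{i+1} - S_i (the slide's Δ is not modeled)."""
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


def q_gaussian(d, p0, q, sigma):
    """Evaluate P(d) = P0 * [1 - (1 - q) * (d / sigma)^2]^(1 / (1 - q)).

    For q = 1 the expression is replaced by its limit, the ordinary
    Gaussian form P0 * exp(-(d / sigma)^2). For q < 1 the bracket can
    become non-positive, where the distribution is taken to be zero.
    """
    x2 = (d / sigma) ** 2
    if q == 1:
        return p0 * math.exp(-x2)
    bracket = 1.0 - (1.0 - q) * x2
    if bracket <= 0.0:
        return 0.0
    return p0 * bracket ** (1.0 / (1.0 - q))


returns = size_returns([1, 4, 2])
peak = q_gaussian(0.0, p0=2.0, q=1.5, sigma=1.0)
tail = q_gaussian(1.0, p0=1.0, q=1.5, sigma=1.0)
```

Values of q above 1 give tails that decay as a power law, which is the fat-tail behaviour the slide associates with SOC.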

Summary

Collective knowledge building can be studied by applying methods of complex networks and statistical physics:

Complex networks: Q&A sites can be used for studying the dynamics of the collective knowledge-building process.

Entropy measures: most users focus on a few categories (expertise); tag-specific dynamics is a highly cooperative process.

Time series analysis: a self-organized criticality mechanism with long-range correlations is at the origin of collective knowledge building.
