Structure and Growth of Online Social Networks
Alan Mislove†‡ Krishna Gummadi† Peter Druschel†
†Max Planck Institute for Software Systems ‡Rice University
MSRC Social Networks Workshop, 07.12.2007
Why are social networks interesting?
• Popular way to connect, share content
  • Photos (Flickr), videos (YouTube), blogs (LiveJournal), profiles (Orkut)
  • Orkut (60 M), LiveJournal (5 M)
• Content organized with user-user links
  • Akin to Web’s page-page links
  • Social network structure influences how content is shared
Our research agenda
• Observe and understand online social networks
  • Measure static structural properties
  • Observe network growth
  • Characterize information flow
• Leverage social networks to build better systems
  • Trust can be used to solve security problems
  • Shared interest can improve content location
Computational sociology
• Marrying network measurement with sociology
  • Able to collect data at massive scale
  • Bringing measurement techniques to bear on social networks
• Data we have collected:
  • Structural information on 11.3M users and 328M links
  • Observed over 2.9M new users joining and 24M new links being created
  • Data on over 5M photos and videos
• All data is (or will be) publicly available
http://socialnetworks.mpi-sws.org
Part I: Analyzing network structure
Measuring online social networks
• Sites reluctant to give out data
  • Cannot enumerate user list
  • Instead, performed crawls of user graph
• Picked known seed user
  • Crawled all of his friends
  • Added new users to list
• Continued until all known users crawled
  • Effectively performed a BFS of graph
• Challenging to estimate coverage
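The crawl procedure on this slide can be sketched roughly as follows. This is a minimal illustration, not the crawler used for the study: `fetch_friends` is a hypothetical stand-in for whatever site-specific request returns a user's friend list, and the toy graph is made up.

```python
from collections import deque

def crawl(seed, fetch_friends):
    """Breadth-first crawl of a user graph starting from one known seed user.

    fetch_friends is a hypothetical callback: user -> iterable of friends.
    """
    seen = {seed}            # every user discovered so far
    queue = deque([seed])    # users whose friend lists are still to be fetched
    links = []               # directed links observed during the crawl
    while queue:
        user = queue.popleft()
        for friend in fetch_friends(user):
            links.append((user, friend))
            if friend not in seen:      # add new users to the list
                seen.add(friend)
                queue.append(friend)
    return seen, links

# Toy directed graph (illustrative only). User "e" links to "a", but nobody
# links to "e", so a crawl that follows outgoing links never discovers it.
toy_graph = {"a": ["b", "c"], "b": ["a"], "c": ["d"], "d": [], "e": ["a"]}
users, links = crawl("a", lambda u: toy_graph.get(u, []))
```

The undiscovered user "e" in the toy graph shows one reason coverage is hard to estimate: users reachable only through incoming links are invisible to a crawl that follows outgoing links.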
High-level data characteristics
• Able to crawl large portion of networks
• Node degrees vary by orders of magnitude
  • However, networks share many key properties

| | Flickr | LiveJournal | Orkut | YouTube |
|---|---|---|---|---|
| Number of Users | 1.8 M | 5.2 M | 3.0 M | 1.1 M |
| Avg. Friends per User | 12.2 | 16.9 | 106.1 | 4.2 |
Are online social networks power-law?
• Estimated coefficients with maximum likelihood testing
  • Flickr, LiveJournal, YouTube have good K-S goodness-of-fit
• Similar coefficients imply a similar distribution of in/outdegree
  • Unlike Web

| Network | Outdegree γ | Indegree γ |
|---|---|---|
| Web [INFOCOMM’99] | 2.09 | 2.67 |
| Flickr | 1.74 | 1.78 |
| LiveJournal | 1.59 | 1.65 |
| Orkut | 1.50 | 1.50 |
| YouTube | 1.63 | 1.99 |
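As a rough illustration of fitting a power-law coefficient by maximum likelihood and checking it with a Kolmogorov-Smirnov distance, the sketch below uses the standard continuous-approximation estimator γ = 1 + n / Σ ln(x / x_min). The fixed `x_min`, the synthetic samples, and the function names are assumptions made for the example. The slides do not say which estimator variant was used.

```python
import math
import random

def fit_power_law(samples, x_min=1.0):
    """Maximum-likelihood exponent for p(x) ~ x^(-gamma), x >= x_min.

    Continuous approximation; x_min is fixed by assumption rather than fitted.
    """
    tail = [x for x in samples if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)

def ks_distance(samples, gamma, x_min=1.0):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    tail = sorted(x for x in samples if x >= x_min)
    n = len(tail)
    worst = 0.0
    for i, x in enumerate(tail):
        model = 1.0 - (x / x_min) ** (1.0 - gamma)
        # Compare against the empirical CDF just before and just after x.
        worst = max(worst, abs(model - i / n), abs(model - (i + 1) / n))
    return worst

# Synthetic degrees drawn from a power law with a known exponent
# (inverse-transform sampling), only to exercise the estimator.
rng = random.Random(0)
true_gamma = 2.0
samples = [(1.0 - rng.random()) ** (-1.0 / (true_gamma - 1.0)) for _ in range(20000)]

gamma_hat = fit_power_law(samples)
ks = ks_distance(samples, gamma_hat)
```

A small K-S distance indicates that the fitted curve tracks the empirical distribution closely. Real degree data is discrete, so a discrete estimator would be more appropriate there.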
How are the links distributed?
• Distribution of indegree and outdegree is similar
  • Underlying cause is significant link symmetry

[Figure: fraction of links vs. fraction of users, with curves for Web outdegree, Web indegree, Flickr outdegree, and Flickr indegree]
Link symmetry
• Social networks show high level of link symmetry
  • Links in most networks are directed
• High symmetry increases network connectivity
  • Reduces network diameter

| | Flickr | LiveJournal | Orkut | YouTube |
|---|---|---|---|---|
| Symmetric Links | 62% | 73% | 100% | 79% |
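One plausible way to compute a symmetric-link percentage like those in the table is to count the directed links whose reverse link also exists. The sketch below assumes links are given as (source, destination) pairs; the function name and toy data are illustrative and not taken from the study.

```python
def symmetric_fraction(links):
    """Fraction of directed links (u, v) for which (v, u) is also present."""
    link_set = set(links)
    if not link_set:
        return 0.0
    symmetric = sum(1 for (u, v) in link_set if (v, u) in link_set)
    return symmetric / len(link_set)

# Toy data: a<->b and c<->d are reciprocated, a->c is not.
toy_links = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "c")]
fraction = symmetric_fraction(toy_links)  # 4 of 5 links are symmetric
```

In a network whose links are undirected by construction, every link counts as symmetric, which is consistent with the 100% reported for Orkut.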
Implications of high symmetry
• High link symmetry implies indegree equals outdegree
  • Users tend to receive as many links as they give
• Unlike other complex networks, such as the Web
  • Sites like cnn.com receive many more links than they give
• Implication is that ‘hubs’ become ‘authorities’
  • May impact search algorithms (PageRank, HITS)
• So far, observed networks are power-law with high symmetry
  • Take a closer look next
Complex network structure
• What is the high-level structure of online social networks?
  • A jellyfish, like the Internet? [JCN’06]
  • A bowtie, like the Web? [WWW’00]
• In particular, is there a core of the network?
  • Core is a (minimal) connected component
  • Removing core disconnects remaining nodes
• Approximate core detection by removing high-degree nodes
Does a core exist?
• Yes, networks contain core consisting of 1-10% of nodes
  • Removing core disconnects other nodes

[Figure: node distribution in remaining SCCs (1 node, 2-7 nodes, 8-63 nodes, large SCC) vs. fraction of network removed (0.01% to 10%)]
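The approximate core detection described here (remove the highest-degree nodes, then look at how the rest of the network fragments) might be sketched as below. As a simplifying assumption, links are treated as undirected and ordinary connected components stand in for the strongly connected components shown in the figure. The function name and toy graph are illustrative.

```python
from collections import defaultdict

def component_sizes_after_removal(edges, fraction):
    """Remove the top `fraction` of nodes by degree; return component sizes.

    Assumption: links are treated as undirected, so plain connected
    components approximate the SCCs used in the original analysis.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Rank nodes by degree, highest first; ties broken by node id.
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))
    removed = set(ranked[:int(len(ranked) * fraction)])
    seen = set(removed)          # removed nodes are never visited
    sizes = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:             # depth-first walk of one component
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Toy graph: node 0 is a hub linked to nodes 1..6, plus one fringe link 1-2.
toy_edges = [(0, i) for i in range(1, 7)] + [(1, 2)]
intact = component_sizes_after_removal(toy_edges, 0.0)
fragmented = component_sizes_after_removal(toy_edges, 0.15)  # removes the hub
```

In the toy graph, removing the single hub splits one connected network into a pair and four isolated nodes, which mirrors in miniature the fragmentation the figure reports.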
Implications of network structure
• Network contains dense core of users
  • Core necessary for connectivity of 90% of users
  • Most short paths pass through core
  • Could be used for quickly disseminating information
• Remaining nodes (fringe) are highly clustered
  • Users with few friends form mini-cliques
  • Similar to previously observed offline behavior
  • Could be leveraged for sharing information of local interest
Part II: Characterizing network growth
Observing network growth
• Online social networks growing at rapid pace
  • Not possible with Web, Internet
• Offers unique opportunity to observe growth
  • Validate or invalidate existing models
  • Predict future growth
• Also examined evolution of other complex information networks
  • Internet topology: CAIDA archives
  • Wikipedia: Wikimedia archives
Growth data characteristics
• Crawled social networks repeatedly for months
  • Observed 1.2M new users and 16.8M new links
• Question: What processes are driving network growth?

| Network | Observation Period | Node Growth Rate | Link Growth Rate |
|---|---|---|---|
| Flickr | 104 days | 242% | 455% |
| YouTube | 36 days | 145% | 215% |
| Wikipedia | 825 days | 54% | 120% |
| Internet | 1,281 days | 31% | 43% |
Can reciprocity explain symmetry?
• Yes, over 80% of symmetric links created within 48 hours
  • Sites often inform users of new incoming links

[Figure: CDF of time between establishment of the two halves of a link (0 to 30 days), for Flickr and YouTube]
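A measurement like the one plotted here could be derived from timestamped link-creation events by pairing each link with its reverse. The sketch below assumes events of the form (time in hours, source, destination); that event format, the function name, and the toy data are assumptions made for illustration.

```python
def reciprocation_delays(events):
    """Hours between the two halves of each symmetric link.

    events: iterable of (time_hours, src, dst) link-creation records
    (a hypothetical format chosen for this example).
    """
    first_seen = {}
    for t, src, dst in events:
        key = (src, dst)
        first_seen[key] = min(t, first_seen.get(key, t))
    delays = []
    for (u, v), t in first_seen.items():
        # Visit each unordered pair once, and only if both halves exist.
        if u < v and (v, u) in first_seen:
            delays.append(abs(t - first_seen[(v, u)]))
    return sorted(delays)

toy_events = [
    (0, "a", "b"), (5, "b", "a"),     # reciprocated after 5 hours
    (10, "a", "c"), (100, "c", "a"),  # reciprocated after 90 hours
    (20, "c", "d"),                   # never reciprocated
]
delays = reciprocation_delays(toy_events)
within_48h = sum(1 for d in delays if d <= 48) / len(delays)
```

Sorting the delays and plotting their cumulative share would give a CDF of the kind shown in the figure.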
Who creates and receives new links?
• Links created in proportion to outdegree (preferential creation)
• Links received in proportion to indegree (preferential reception)
• Is this preferential attachment?

[Figure: two log-log panels. Left: links received (new links/node/day) vs. indegree. Right: links created (new links/node/day) vs. outdegree.]
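To test whether links are created in proportion to outdegree, one option is to group nodes by their outdegree in an initial snapshot and average the number of new links each group creates per node per day. The sketch below makes that concrete under stated assumptions: the input format, the function name, and the toy numbers are all hypothetical.

```python
from collections import defaultdict

def creation_rate_by_outdegree(outdegree, new_links, days):
    """Average new links created per node per day, grouped by outdegree.

    outdegree: dict node -> outdegree in the initial snapshot
    new_links: iterable of (src, dst) links created during the observation
    days:      length of the observation period in days
    """
    nodes_at = defaultdict(int)
    created_at = defaultdict(int)
    for node, degree in outdegree.items():
        nodes_at[degree] += 1
    for src, _dst in new_links:
        if src in outdegree:     # skip users who joined after the snapshot
            created_at[outdegree[src]] += 1
    return {d: created_at[d] / (nodes_at[d] * days) for d in nodes_at}

# Toy data: two low-degree users create one link each, while one
# high-degree user creates twenty, all within a two-day window.
snapshot = {"a": 1, "b": 1, "c": 10}
observed = [("a", "x"), ("b", "y")] + [("c", "n%d" % i) for i in range(20)]
rates = creation_rate_by_outdegree(snapshot, observed, days=2)
```

Swapping in indegree and counting link destinations instead of sources would give the corresponding links-received rate.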
Does proximity matter?
• New friends much closer than preferential attachment predicts
  • Suggests links created by local rules

[Figure: CDF of distance (hops, 2 to 6) for new links, observed vs. predicted by preferential attachment; panels for Flickr, YouTube, Wikipedia, and Internet]
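The observed curve in a plot like this can be approximated by measuring, for each new link, how many hops separated its endpoints in the snapshot taken before the link appeared. The sketch below does this with a breadth-first search. The adjacency-list format, the function name, and the toy graph are assumptions for illustration only.

```python
from collections import deque

def hop_distance(adj, src, dst):
    """Shortest directed path length from src to dst, or None if unreachable."""
    if src == dst:
        return 0
    seen = {src}
    frontier = deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        for nb in adj.get(node, ()):
            if nb == dst:
                return dist + 1
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, dist + 1))
    return None

# Snapshot before the new links were created: a -> b -> c -> d.
before = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
new_links = [("a", "c"), ("a", "d"), ("b", "d")]
distances = [hop_distance(before, s, d) for s, d in new_links]
share_two_hops = distances.count(2) / len(distances)
```

A new link can only join users at least two hops apart, which is why the distance axis starts at 2. A large share of two-hop links is what one would expect from friend-of-a-friend link creation.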
Implications of network growth
• Observed growth of large, complex information networks
  • 2.9M new users and 24M new links
• Found multiple growth processes at work
  • Reciprocity leads to high symmetry
  • Proximity bias leads to high clustering
• Modeling complex network growth
  • Based on local rules
  • Can validate or invalidate models with detailed data
  • Allow verification of systems at arbitrary size
Part III: Information flow (ongoing work)
Information flow
• Examining information flow in social networks
  • Lead to better information flow prediction and search algorithms
• Observe content propagating through network
  • Sample of 500,000 Flickr photos
• Examine different popularity metrics
  • Obtained history of comments, favorites
  • Recorded views daily
How quickly does content spread?
• 50% of comments placed within 24 hours

[Figure: CDF of comments and favorites vs. time after photo upload (1 minute to 1 year)]
Rapid information propagation
• Information often propagates along links
  • Can track propagation
• What enables such rapid information flow?

[Figure: popularity over time, Apr 06 to Apr 07]
Summary
• Analyzed network structure and dynamics
  • Multiple networks have similar, unique characteristics
  • Consistent growth characteristics
• Many future directions
  • Examining information flow
  • Building better systems
• Open to expertise from other areas
http://socialnetworks.mpi-sws.org