Using Behaviour Analysis to Detect Cultural Aspects in Social Web Systems
Presented at:
– Aston Business School, Birmingham, UK, 2011
– Keynote presentation at the Detecting and Exploiting Cultural Diversity on the Social Web Workshop, 20th Annual Conference on Information and Knowledge Management, 2011
Dr Matthew Rowe
Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom
http://people.kmi.open.ac.uk/rowe | http://www.matthew-rowe.com
Web 1.0
http://www.flickr.com/photos/complexify/97303317/
• Web of documents
• Web presence constrained to HTML ‘experts’
• Fixed categories
• Static content
Web 2.0
http://www.flickr.com/photos/9119028@N05/591163479
• Data access through APIs
• Collective Intelligence
• User generated content
• Web presence for all
• Tagging
A Social Web
http://mmt.me.uk/slides/deri20110401/images/walledgardens.jpg
A Social Web System is an online platform that offers a useful service, normally for free, to users, through which they can interact and network
Example 1
Example 2
Δs of Social Web Systems
• Social Web Systems differ in their:
  – Domain
    • Flickr = photos
    • Facebook = social networking
    • Twitter = microblogging
  – Audience
    • SAP Community Network = programmers
    • Slashdot = technology enthusiasts
• How else do they differ?
• What are the Δs?
The Utility of Behaviour Analysis
• WeGov
  – Investigating the role of social networks in eGovernment
  – Enabling:
    • Tracking of political discussions and topics
    • Injection of policy content to maximise exposure
• ROBUST
  – Risk and opportunity management in online communities
  – Enabling:
    • Assessment of user churn in online communities
    • Community evolution prediction
    • Monitoring of community health
• Behaviour analysis is required to understand:
  – What behaviour drives content creation
  – How behaviour is associated with community evolution
Thesis: Microcultures
Social Web Systems contain micro-cultures that differ in terms of
a) user behaviour
b) how attention is generated
c) role compositions in such systems
Outline
• Analysis 1: Generating Attention
  – Understanding Attention Factors
  – Approach
  – Experiments
  – Findings
• Analysis 2: Behaviour Role Compositions
  – Analysing Community Evolution
  – Approach
  – Experiments
  – Findings
• Microcultures: Evidence
Generating Attention
Analysis I
Shared Content
• Social Web Systems are now used to:
  – Ask questions
  – Post opinions and ideas
  – Discuss events and current issues
• Content analysis in online communities is attractive for:
  – Market analysis
  – Brand consensus and product opinion
• Social network analytics in the US is predicted to reach $1 billion by 2014 (Forrester 2009)
• Masses of data are now being published in social web systems:
  – Facebook has more than 60 million status updates per day (Facebook statistics 2010)
The Need for Analysis
• Analysts need to know which piece of content will generate the most activity
  – i.e. the most auspicious or influential
  – Helps focus the attention of human and computerised analysts
    • What to track?
• Need to understand the effect that features (community and content) have on attention to content
Which features are key to stimulating activity?
How do these features influence activity length?
How do Social Web Systems differ in how attention is generated?
Approach: Attention Prediction
• Two-stage approach to predict attention to content:
1. Identify seed posts
  • E.g. thread starters on a message board
  • Will a given post start a discussion?
  • What are the properties that seed posts exhibit?
    – What parameters tend to trigger a discussion?
2. Predict discussion activity levels
  • From the identified seed posts
  • What is the level of activity that a seed post will generate?
  • What features correlate with heightened activity?
Features
• For each post, model: a) the author, b) the content and c) the topical concentration of the author
• F1: User Features
  – In-degree, out-degree: social network properties of the author
  – Post count, age, post rate: participation information of the author
• F2: Content Features
  – Post length, referral count, time in day: surface features of the post
  – Complexity: cumulative entropy of terms in the post
  – Readability: Gunning Fog index of the post
  – Informativeness: TF-IDF measure of terms within the post
  – Polarity: average sentiment of terms in the post
Which features are key to stimulating activity?
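Two of the F2 content features can be sketched in a few lines. This is a minimal illustration, not the implementation used in the study: it assumes complexity can be approximated by the Shannon entropy of the post's term distribution, and it uses a crude vowel-run heuristic to count syllables for the Gunning Fog index. The function names and the tokenisation rules are hypothetical.

```python
import math
import re
from collections import Counter


def term_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the term distribution in a post.

    Assumption: this stands in for the 'complexity' feature.
    """
    terms = re.findall(r"[a-z0-9']+", text.lower())
    if not terms:
        return 0.0
    counts = Counter(terms)
    total = len(terms)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of vowel runs (an approximation)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def gunning_fog(text: str) -> float:
    """Gunning Fog index: 0.4 * (words per sentence + 100 * share of complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    # "Complex" words are taken to be those with three or more syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences) + 100 * len(complex_words) / len(words))
```

A post that repeats a few terms scores low on entropy, while a post with a wide vocabulary scores high; the Fog index rises with longer sentences and more polysyllabic words.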
Features (2)
• F3: Focus Features
  – Topic entropy: the concentration of the author across community forums
    • Higher entropy indicates a wider spread of forum activity
    • More random distribution, less concentrated
  – Topic Likelihood: the likelihood that a user posts in a specific forum given his post history
    • Measures the affinity that a user has with a given forum
    • Lower likelihood indicates a user posting on an unfamiliar topic
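The two focus features can be illustrated as follows. This is a sketch under stated assumptions: the author's history is represented as a plain list of forum names (one entry per past post), and topic likelihood is estimated as a simple relative frequency, which may differ from the estimator used in the original work. Both function names are hypothetical.

```python
import math
from collections import Counter


def topic_entropy(forum_history: list[str]) -> float:
    """Entropy (bits) of an author's posts across forums; higher means a wider spread."""
    if not forum_history:
        return 0.0
    counts = Counter(forum_history)
    total = len(forum_history)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def topic_likelihood(forum_history: list[str], forum: str) -> float:
    """Share of the author's past posts made in `forum` (relative-frequency estimate)."""
    if not forum_history:
        return 0.0
    return forum_history.count(forum) / len(forum_history)
```

An author who only ever posts in one forum has zero topic entropy, and a likelihood of zero for a forum flags a post on an unfamiliar topic.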
Social Web Systems: Datasets
• Microblogging Platform: Twitter
  – Collected a random subset over a 24-hour period
  – Attention measure: length of @reply chain
• Community Message Board: Boards.ie
  – Analysed all posts and forums in 2006
  – Attention measure: number of posts in a thread
• Support Forum: SAP Community Network
  – Attention measure: number of replies to a question
• News-sharing Platform: Digg
  – Used previous dataset of ‘popular’ stories
  – Attention measure: number of comments (and replies) to a story
Experiments
• Experiment 1: Identifying Seed Posts
  – Will this post yield a reply?
  – Experiment 1(a): Model Selection
    • Which model performs best?
  – Experiment 1(b): Feature Assessment
    • How do features correlate with seed posts?
  – Datasets: Twitter and Boards.ie
• Experiment 2: Activity Level Prediction
  – What is the level of activity that seed posts yield?
  – Experiment 2(a): Model Selection
  – Experiment 2(b): Feature Assessment
    • How do features correlate with heightened attention?
  – Datasets: Twitter, Boards.ie, SCN and Digg
Results: 1(a) Model Selection
• Which model performs best?
Twitter Boards.ie
Results: 1(b) Feature Assessment
• How do features correlate with seed posts?
Results: 1(b) Feature Assessment
Boards.ie
Activity Distribution
[Figure panels: Twitter, Boards.ie, SCN, Digg]
1. Predict a ranking
2. Compare ranking against ground truth
3. Measure using Normalised Discounted Cumulative Gain @ varying ranks (k)
  • k={1,5,10,20,50,100}
4. Best model: highest nDCG averaged over k
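The evaluation steps above can be sketched as follows. This is one common formulation of nDCG (linear gain with a log2 rank discount), assumed here for illustration; the slides do not state which variant was used. The input is the list of observed activity levels (for example reply counts) arranged in the order in which a model ranked the posts, and the function names are hypothetical.

```python
import math


def dcg_at_k(relevances: list[float], k: int) -> float:
    """Discounted cumulative gain over the first k items of a ranking."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances[:k], start=1))


def ndcg_at_k(predicted_order: list[float], k: int) -> float:
    """nDCG@k: DCG of the predicted ranking divided by DCG of the ideal ranking."""
    ideal = sorted(predicted_order, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(predicted_order, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(predicted_order: list[float], ks=(1, 5, 10, 20, 50, 100)) -> float:
    """Average nDCG over the rank cut-offs, used to pick the best model."""
    return sum(ndcg_at_k(predicted_order, k) for k in ks) / len(ks)
```

A model that places the most active posts first scores 1.0 at every k; comparing `mean_ndcg` across candidate models then corresponds to step 4.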
Results: 2(a) Model Selection
• Which model performs best?
Results: 2(b) Feature Assessment
• How do features correlate with heightened attention?
• Heightened Activity on Twitter =
  • Shorter posts
  • Denser vocabulary
  • Fewer hyperlinks
  • Earlier in the day!
• Heightened Activity on Boards.ie =
  • Concentrated topics
  • Longer posts
  • Wider vocabulary
  • Fewer referrals
  • Negative sentiment
• Heightened Activity on SCN =
  • Less author participation
  • Contacted fewer people
  • User contacted by many people
  • Longer posts
  • Wider vocabulary
  • More hyperlinks
• Heightened Activity on Digg =
  • Concentrated topics
  • Longer posts
  • Later in the day
  • Familiar community terms
Generating Attention: Findings
How do Social Web Systems differ in how attention is generated?
• Commonalities
  – Fewer hyperlinks for microblogging platforms and discussion message boards
  – Use language familiar to the community
  – Negative content yields more activity
  – Activity distribution
• Idiosyncrasies
  – More hyperlinks on support forums
  – Lower topic affinity on news-sharing system
  – Models differ: a) best performing, b) coefficients:
    • Content: Twitter
    • User: Boards.ie, SCN
    • Focus: Digg
What drives attention in one system is not the same as another
Anticipating Discussion Activity on Community Forums. M Rowe, S Angeletou and H Alani. The Third IEEE International Conference on Social Computing. Boston, USA. (2011)
Behaviour Role Compositions
Analysis II
Online Communities in Social Web Systems
• Social Web Systems support online communities to function and grow, enabling:
  – Idea generation
  – Customer support
  – Problem solving
• Managing and hosting communities can be
  – Expensive
  – Time-consuming
• Social Web Systems have large investments, therefore they must:
  – flourish and remain active
  – remain… ‘healthy’
Increased Community Activity
What did the community look like at this point?
Decreased Community Activity
What were the conditions at this point?
The Need to Assess Behaviour
• How can we gauge community health?
  – Post Count?
  – Communication/Interaction?
  – Behaviour?
• Domination of one behaviour could lead to churn
  – Preece, 2000
• Behaviour in an online community is influenced by the roles that users assume
  – Preece, 2001
• To provide health insights we need to monitor behaviour over time
  – Combined with basic health metrics (e.g. post count)
• Enabling detection of how behaviour differs between systems
Modelling, Representing and Tracking Behaviour: How?
• Users exhibit different behaviour in different contexts:
  – How can we model user behaviour and represent its change over time?
• According to [Chan et al, 2010] users can be classified by their community role:
  – What behaviour correlates with community roles?
  – How can we label users as the system changes?
• Communities evolve and change over time:
  – Is there a correlation between community composition and health?
  – Can we predict community changes based on composition data?
How do Social Web Systems differ in terms of behaviour?
Behaviour Ontology
• How can we model user behaviour and represent its change over time?
http://purl.org/net/oubo/0.3
Behaviour Features
• In-degree Ratio
  – Proportion of users that reply to the user
• Posts Replied Ratio
  – Proportion of posts by the user that yield a reply
• Thread Initiation Ratio
  – Proportion of threads started by the user
• Bi-directional Threads Ratio
  – Proportion of threads where the user is involved in a reciprocal action
• Bi-directional Neighbours Ratio
  – Proportion of the user's neighbours with whom a reciprocal action has taken place
• Average Posts per Thread
  – Mean number of posts in the threads that the user has participated in
• Standard Deviation of Posts per Thread
  – Standard deviation of posts in the threads that the user has posted in
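Three of these ratios can be derived from a flat list of posts, as in the sketch below. The `Post` record and its field names are hypothetical, and the denominators are assumptions, since the slide does not spell them out: in-degree ratio is taken over all other users in the data, and thread initiation ratio over the threads the user took part in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Post:
    post_id: int
    author: str
    thread_id: int
    parent_id: Optional[int]  # None marks a thread starter


def behaviour_features(posts: list[Post], user: str) -> dict[str, float]:
    """Compute a subset of the behaviour features for one user."""
    by_id = {p.post_id: p for p in posts}
    users = {p.author for p in posts}
    own_posts = [p for p in posts if p.author == user]

    # Other users who replied to at least one of this user's posts.
    repliers = {
        p.author
        for p in posts
        if p.parent_id in by_id
        and by_id[p.parent_id].author == user
        and p.author != user
    }
    # This user's posts that received at least one reply.
    replied_ids = {p.parent_id for p in posts if p.parent_id is not None}
    replied_own = [p for p in own_posts if p.post_id in replied_ids]

    threads_joined = {p.thread_id for p in own_posts}
    threads_started = {p.thread_id for p in own_posts if p.parent_id is None}

    others = len(users - {user})
    return {
        "in_degree_ratio": len(repliers) / others if others else 0.0,
        "posts_replied_ratio": len(replied_own) / len(own_posts) if own_posts else 0.0,
        "thread_initiation_ratio": (
            len(threads_started) / len(threads_joined) if threads_joined else 0.0
        ),
    }
```

The reciprocity and posts-per-thread features would follow the same pattern, using the reply graph and per-thread post counts respectively.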
Behaviour Roles
Elitist
Grunt
Joining Conversationalist
Popular Initiator
Popular Participant
Supporter
Taciturn
Ignored
Jeffrey Chan, Conor Hayes, and Elizabeth Daly. Decomposing discussion forums using common user roles. In Proc. Web Science Conf. (WebSci10), Raleigh, NC: US, 2010.
Behaviour Roles (2)
• What behaviour correlates with community roles?
Constructing and Applying Rules
• How can we label users as the system changes?
Structural, social network, reciprocity, persistence, participation
Feature levels change with the dynamics of the community
Based on related work, we associate roles with a collection of feature-to-level mappings, e.g. in-degree -> high, out-degree -> high
Run rules over each user’s features and derive the community role composition
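The rule-based labelling can be sketched as below. Everything here is illustrative: the tercile binning into low/mid/high is an assumption about how feature levels track the current community, and the rule base is hypothetical. Only the pattern "in-degree -> high, out-degree -> high" appears on the slide; which role it belongs to is not stated, so the role names attached to the patterns are placeholders.

```python
from collections import Counter

# Hypothetical rule base: role -> required feature levels.
RULES = {
    "Popular Participant": {"in_degree": "high", "out_degree": "high"},
    "Ignored": {"in_degree": "low"},
}


def level_of(value: float, population: list[float]) -> str:
    """Map a value to low/mid/high using terciles of the current population."""
    ranked = sorted(population)
    low_cut = ranked[len(ranked) // 3]
    high_cut = ranked[(2 * len(ranked)) // 3]
    if value < low_cut:
        return "low"
    if value >= high_cut:
        return "high"
    return "mid"


def label_users(features: dict[str, dict[str, float]]) -> dict[str, str]:
    """Assign each user the first matching role, or 'Unlabelled'.

    Assumes every user has a value for every feature.
    """
    names = sorted({name for f in features.values() for name in f})
    populations = {n: [f[n] for f in features.values()] for n in names}
    labels = {}
    for user, f in features.items():
        levels = {n: level_of(f[n], populations[n]) for n in names}
        labels[user] = next(
            (
                role
                for role, rule in RULES.items()
                if all(levels.get(n) == lvl for n, lvl in rule.items())
            ),
            "Unlabelled",
        )
    return labels


def composition(labels: dict[str, str]) -> dict[str, float]:
    """Percentage of users holding each role."""
    counts = Counter(labels.values())
    return {role: 100 * c / len(labels) for role, c in counts.items()}
```

Because the cut-offs are recomputed from the population each time, re-running `label_users` on a later time window lets the same feature value map to a different level as the community changes.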
Composition vs Activity
• Is there a correlation between community composition and health?
• Community Message board: Boards.ie
  – All posts used from 2004 – 2006
  – Selected 3 forums for analysis
    • F246: Commuting and Transport
    • F388: Rugby
    • F411: Mobile Phones and PDAs
• Support Forum: Tiddlywiki
  – Software development forum used by BT’s development team
• Measured at 12-week increments:
  – Forum composition (% of roles)
    • E.g. 20% elitists, 10% grunts, etc
  – Number of posts
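One way to relate composition to activity is to correlate each role's share with the post count across the 12-week windows. The sketch below assumes Pearson's coefficient, which the slides do not name, and a hypothetical window format: a dict with a "composition" mapping (role to percentage) and a "posts" count.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def role_activity_correlations(windows: list[dict]) -> dict[str, float]:
    """Correlate each role's share with post count across time windows."""
    post_counts = [w["posts"] for w in windows]
    roles = sorted({r for w in windows for r in w["composition"]})
    return {
        # A role missing from a window is treated as a 0% share.
        role: pearson([w["composition"].get(role, 0.0) for w in windows], post_counts)
        for role in roles
    }
```

A coefficient near +1 for a role would suggest that its share rises and falls with activity in that forum, and a value near -1 the opposite.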
Correlation Results (1): Boards.ie
Forum 246 – Commuting and Transport
Correlation Results (2): Boards.ie
Forum 246 – Commuting and Transport
Forum 388 – Rugby Forum 411 – Mobile Phones and PDAs
Correlation Results: Tiddlywiki
Evolution Results (1): Boards.ie
Forum 246 – Commuting and Transport
Evolution Results (2): Boards.ie
Forum 246 – Commuting and Transport
Forum 388 – Rugby Forum 411 – Mobile Phones and PDAs
Evolution Results: Tiddlywiki
Predicting Community Health
• Can we predict community changes based on composition data?
1. Activity Change Detection:
  – Predict either an increase or decrease in activity
  – Features: roles and percentages
  – Class label: increase/decrease
  – Performed 10-fold cross validation with J48 decision tree
2. Post Count Prediction:
  – Predict post count from role composition
  – Independent variables: roles and percentages
  – Dependent variable: post count
  – Induced linear regression model and assessed the model
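The second task, post count prediction, can be illustrated with an ordinary least squares fit written out in plain Python. This is a sketch only: it assumes each row holds the role percentages for one time window, and it solves the normal equations directly rather than reproducing whatever toolkit the study used. The function names are hypothetical. The decision-tree task is not covered here.

```python
def fit_linear_regression(rows: list[list[float]], targets: list[float]) -> list[float]:
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, ..., coef_m].
    """
    design = [[1.0, *row] for row in rows]
    p = len(design[0])
    # Augmented normal-equation system (X^T X | X^T y).
    system = [
        [sum(r[i] * r[j] for r in design) for j in range(p)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; drop one role share")
        system[col], system[pivot] = system[pivot], system[col]
        pivot_value = system[col][col]
        system[col] = [v / pivot_value for v in system[col]]
        for r in range(p):
            if r != col:
                factor = system[r][col]
                system[r] = [a - factor * b for a, b in zip(system[r], system[col])]
    return [system[i][p] for i in range(p)]


def predict(weights: list[float], row: list[float]) -> float:
    """Predicted post count for one window's role percentages."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], row))
```

One practical caveat: if the role percentages in every row sum to 100, they are collinear with the intercept, so one role share would need to be left out before fitting. The sign of each fitted coefficient then indicates whether a larger share of that role goes with more or fewer posts.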
Activity Change Detection
Boards.ie
Tiddlywiki
Post Count Prediction
Boards.ie
Tiddlywiki
• Increased Community Activity on Boards.ie =
  • More initiators
  • More participants
  • Fewer supporters
  • Fewer ignored
• Increased Community Activity on Tiddlywiki =
  • More conversationalists
  • More initiators
  • Fewer supporters
  • Fewer ignored
Clustering Communities by Composition
Behaviour Role Compositions: Findings
• How do Social Web Systems differ in terms of behaviour?
• Commonalities
  – No grunts in either system
  – Increase in ignored users and supporters decreases health
  – Increase in initiators increases activity
• Idiosyncrasies
  – No elitists found on the support forum
  – Conversationalists improve activity in certain cases
  – Optimum behaviour compositions differ
Modelling and Analysis of User Behaviour in Online Communities. S Angeletou, M Rowe and H Alani. International Semantic Web Conference. Bonn, Germany. (2011)
Thesis: Microcultures
Social Web Systems contain micro-cultures that differ in terms of
a) user behaviour
b) how attention is generated
c) role compositions in such systems
Recap
Microcultures: Evidence
• Social Web Systems contain micro-cultures that differ in terms of
  – a) User behaviour
    • Non-existence of roles in certain communities
    • Conversation behaviour important in certain communities
  – b) How attention is generated
    • Differences in optimum prediction models
    • Factors differ in driving activity
      – E.g. referrals, topic affinity
  – c) Role compositions in such systems
    • Intra and inter composition differences
Questions?
Web: http://people.kmi.open.ac.uk/rowe http://www.matthew-rowe.com
Email: [email protected]
Twitter: @mattroweshow