Klout Score: Measuring Influence Across Multiple Social Networks
October 29, 2015
Mining Big Data in Social Networks Workshop
IEEE International Conference on Big Data, Santa Clara
*Adithya Rao, Nemanja Spasojevic, Zhisheng Li, Trevor DSouza
Link to paper: pdf arxiv
● Klout is a social influence measurement tool.
● Users register on Klout.com and connect their social network accounts.
● Klout collects authorized/public information from connected networks.
● Klout derives influence scores and topics for users from collected data.
● Klout recommends:
○ content to post
○ times when to post.
● Klout Website (klout.com)
What is Klout?
Paper Contributions
● Scalable Production System:
○ Full production system
○ 750 million public and registered user profiles
○ 45 billion interactions from 9 different networks
● Feature Generation:
○ How to generate features that signify influence?
○ Over 3600 features generated.
● Hierarchical Scoring:
○ How to combine networks into a single score?
● Validation:
○ Experiments and comparisons that validate the effectiveness of the Klout score
Scoring Methodology
Problem Statement
Formal Definition:
For each user u in a network G, let G_u be the subset of the network containing the users who may directly or indirectly interact with u, via a set of reactions R ⊆ A. Then an influence score I(u,T) is a measure of the degree and quantity of reactions that u can induce in G_u over a specified time period T.
In simpler words, an influence score may be defined as the ability of a user to drive actions among other users.
System Overview
Networks and Sources
● Facebook: Mentions, Likes, Comments, Subscribers, Wall Posts, Friends
● Twitter: Retweets, Mentions, List Memberships, Followers, Replies
● LinkedIn: Title, Education, Connections, Recommenders, Comments
● Foursquare: Check-ins and Tips, Friends and Mayorship
● Klout: +K received
● Wikipedia: Inlinks, Inlinks to Outlinks, Page Importance, Category counts
● Google+: Comments, +1's, Reshares
● YouTube: Subscribers, Views, Likes
● (network label missing in the extracted slide): Posts, Followers, Likes and Comments
Scoring
Step 1: Acquire labeled ground truth data
● 100k labels from human evaluators
● Each network has its own labels
Step 2: Derive features from interaction graph
● Long Lasting
● Dynamic
Step 3: Generate a score per network / community
● Fit a model for the labels with features using supervised learning models
● Non-negative least squares
Step 4: Hierarchically combine scores
● Use heuristics such as graph size to determine weights
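Step 3 can be illustrated with a minimal sketch of non-negative least squares. The slides do not say which solver was used; this example swaps in plain projected gradient descent, and the toy feature matrix, labels, and function name are hypothetical.

```python
def nnls_projected_gradient(X, y, steps=2000):
    """Minimise 0.5 * ||Xw - y||^2 subject to w >= 0 (projected gradient descent)."""
    n, d = len(X), len(X[0])
    # Step size 1 / ||X||_F^2 never exceeds 1 / L, where L is the largest
    # eigenvalue of X^T X, so the iteration cannot diverge.
    lr = 1.0 / sum(v * v for row in X for v in row)
    w = [0.0] * d
    for _ in range(steps):
        residual = [sum(X[i][j] * w[j] for j in range(d)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(d)]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[j] - lr * grad[j]) for j in range(d)]
    return w


# Toy data (hypothetical): three labeled users, two features each.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, -1.0, 1.0]
weights = nnls_projected_gradient(X, y)
print(weights)
```

An unconstrained fit of this toy data would give the second feature a negative weight; the constraint clamps it to zero, so no feature can lower a user's score.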
Long Lasting Features
Dynamic Features
● Who:
○ The characteristics of the audience who reacted to the original post from the user.
● When:
○ The difference between the current time and the time at which the reaction occurred.
● Where:
○ The social network on which the reaction was performed.
● What:
○ The unit of original content or action on which the reaction was performed.
● How:
○ The type of reaction.
HIGHER_(SCORED)-D3-FACEBOOK-POST-COMMENT
All-D3-FACEBOOK-POST-COMMENT
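The two feature names above appear to follow a WHO-WHEN-WHERE-WHAT-HOW pattern. The sketch below counts reactions under such keys. It is an illustration under stated assumptions, not the production pipeline: the bucket edges, the reading of "D3" as "within 3 days", the reading of "HIGHER_(SCORED)" as "the reactor scores higher than the user", and all field names are hypothetical.

```python
from collections import Counter

# Hypothetical window edges in days; the slides only show the label "D3".
WHEN_BUCKETS = (1, 3, 7, 30, 90)


def when_bucket(age_days):
    """Return the label of the smallest window that contains the reaction age."""
    for days in WHEN_BUCKETS:
        if age_days <= days:
            return f"D{days}"
    return None  # older than every window: ignored in this sketch


def feature_names(reaction, actor_score):
    """Yield feature keys ordered as WHO-WHEN-WHERE-WHAT-HOW."""
    when = when_bucket(reaction["age_days"])
    if when is None:
        return
    tail = "-".join(
        [when, reaction["network"], reaction["content"], reaction["type"]]
    ).upper()
    yield f"All-{tail}"
    # Assumption: this WHO bucket means the reactor outscores the original poster.
    if reaction["reactor_score"] > actor_score:
        yield f"HIGHER_(SCORED)-{tail}"


reactions = [
    {"age_days": 2.0, "network": "facebook", "content": "post", "type": "comment", "reactor_score": 70},
    {"age_days": 2.5, "network": "facebook", "content": "post", "type": "comment", "reactor_score": 40},
    {"age_days": 200.0, "network": "facebook", "content": "post", "type": "comment", "reactor_score": 90},
]
counts = Counter(name for r in reactions for name in feature_names(r, actor_score=55))
print(counts)
```

Crossing even a handful of values per dimension multiplies quickly, which is one plausible way to reach a feature count in the thousands.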
Dynamic Features - Cont.
Hierarchical Combining
Hierarchical Combining - Cont.
● Treat networks as orthogonal vectors since networks are mostly independent.
● Use heuristics such as network size to determine weights.
● Final Klout score is the Euclidean norm of the combined vector.
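The three bullets above can be sketched as follows. The weight values and the normalisation are assumptions made for illustration: weights are rescaled so their squares sum to 1, which keeps the combined score on the same 0 to 100 range as the per-network scores.

```python
import math


def combine_scores(network_scores, network_weights):
    """Euclidean norm of the weighted per-network score vector."""
    # Assumption: rescale weights so that their squares sum to 1.
    scale = math.sqrt(sum(w * w for w in network_weights.values()))
    return math.sqrt(
        sum((network_weights[name] / scale * score) ** 2
            for name, score in network_scores.items())
    )


# Hypothetical weights, e.g. derived from relative network size.
weights = {"twitter": 3.0, "facebook": 4.0}
combined = combine_scores({"twitter": 60.0, "facebook": 80.0}, weights)
print(round(combined, 2))
```

Because each network is treated as its own orthogonal axis, a strong score on one network is not diluted by being absent from another, unlike a weighted average.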
Key Insights
● Features are log-normalized => Klout scores are on a log scale
○ e.g. Order of magnitude difference between users scored 50 and 60
● Network models achieve 70-75% F1 scores.
○ Human evaluators do not always agree on influence ordering
● Wikipedia and LinkedIn are important sources for less active, high influence users
○ e.g. Warren Buffett => low social network activity, high score
● Twitter and Facebook are important sources for long tail users:
○ e.g. Low scored users with less influential interactions
● Temporal Dependence:
○ Combining long lasting and dynamic features allows influence measurement on different time scales
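The first insight can be illustrated with a small sketch of log normalization. The exact formula is not given in the slides; `log1p(count) / log1p(max_count)` is an assumed form chosen only to show why equal score steps correspond to multiplicative jumps in raw counts.

```python
import math


def log_normalize(count, max_count):
    """Map a raw count into [0, 1] on a log scale (assumed formula)."""
    return math.log1p(count) / math.log1p(max_count)


MAX_COUNT = 1000  # hypothetical largest observed count for this feature
for raw in (10, 100, 1000):
    print(raw, round(log_normalize(raw, MAX_COUNT), 3))
```

Each tenfold increase in the raw count adds roughly the same amount to the normalized feature, so a fixed gap in score reflects a ratio of underlying activity rather than a difference.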
Validation
Spreading information
● 87k users targeted with perks, encouraged to post messages
● 18k posts created, 394k reactions received
● Order of magnitude difference for users with Klout Score 60 vs 30
Comparison - Real world rankings
nDCG = 0.878 nDCG = 0.874
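For readers unfamiliar with the metric, nDCG (normalized discounted cumulative gain) compares a predicted ordering against the ideal ordering of the same items. The sketch below uses the common linear-gain, log2-discount variant; the slides do not state which variant or relevance grades were used, so the inputs here are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: gains discounted by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances_in_predicted_order):
    """DCG of the predicted order divided by DCG of the ideal order."""
    ideal = dcg(sorted(relevances_in_predicted_order, reverse=True))
    return dcg(relevances_in_predicted_order) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades listed in the order a scorer ranked the people.
print(ndcg([3, 2, 1]))  # already in ideal order
print(ndcg([1, 2, 3]))  # fully reversed order
```

A value of 1.0 means the ranking matches the reference ordering exactly; values near 0.87 indicate most highly ranked items are placed close to their reference positions.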
Comparison - Google Trends
Influencers by Topic
Conclusion
● A hierarchical scoring system called the Klout Score and a feature generation framework to capture different dimensions of influential interactions.
● Framework scales to hundreds of millions of users and billions of interactions across 9 social networks.
● Sources like Wikipedia and LinkedIn provide partial signals for real world influence. Temporal dependence is also considered.
● The Klout Score is only a partial representation of the influence of a user.
● However, an extensible system that is able to easily incorporate new sources of information can grow more accurate over time.
Paper Reference
● Klout Score: Measuring Influence Across Multiple Social Networks
Adithya Rao, Nemanja Spasojevic, Zhisheng Li, Trevor DSouza
Arxiv link to paper
Thank you!