Tim Budden: "Unlocking insights from social data"
TRANSCRIPT
Unlocking insights from social data
Tim Budden, VP Data Science at DataSift
Drew Conway’s Data Science Venn Diagram
Agenda
1 Expanding data universe
2 Evolution of social data
3 Social data analytics
4 Privacy by design
5 Examples
1 Expanding digital universe
Expanding digital universe
Expanding human data universe
2 Evolution of social data
The evolution of social data
From public to non-public spaces:
Public, Walled garden, 1 to 1, Image-based
Public
Where brands and consumers most commonly engage directly. This is where customer support and brand perception can be addressed directly by a brand.
Walled garden
Users engage each other in a non-public but large network. This is where users are more candid about their aspirations and attitudes toward brands.
1 to 1
Users engage each other directly on a one-to-one or small group basis. Thus far this space has been considered largely off limits to brands.
Image-based
Public spaces where people showcase their best visual content.
3 Social data analytics
Business applications of social media
Unlocking Insights from 2.1B People on Social Networks
2.1B People Globally on Social Networks
Challenges to extracting insights from data:
Volume and velocity
Natural language
Privacy
Example analytics project: Run on the banks?
The Bank of England experimented with trying to predict a bank run in the days preceding the Scottish independence referendum.
It observed a spike on 15 September in tweets mentioning “RBS” and “run”.
Scottish independence referendum
Run on the banks?
“Great run there! Arm tackles don’t bring down good RBs”
Ambiguity in natural language
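The false positive above can be reproduced with a short sketch. This is only an illustration, not the Bank of England's actual method: the function names are hypothetical, and the second tweet is invented for the example. A case-insensitive keyword filter flags the American football tweet ("RBs" meaning running backs), while a stricter, case-sensitive whole-word match on "RBS" does not.

```python
import re

def naive_match(tweet: str) -> bool:
    """Case-insensitive substring filter: flags any tweet containing 'rbs' and 'run'."""
    text = tweet.lower()
    return "rbs" in text and "run" in text

def stricter_match(tweet: str) -> bool:
    """Require the upper-case whole word 'RBS' plus the whole word 'run'."""
    has_bank = re.search(r"\bRBS\b", tweet) is not None
    has_run = re.search(r"\brun\b", tweet, flags=re.IGNORECASE) is not None
    return has_bank and has_run

tweets = [
    # Tweet quoted on the slide: about running backs, not the bank.
    "Great run there! Arm tackles don't bring down good RBs",
    # Invented example of a tweet that really is about the bank.
    "Is there going to be a run on RBS before the vote?",
]

for tweet in tweets:
    print(naive_match(tweet), stricter_match(tweet), tweet)
```

The stricter rule is still fragile (an all-caps football tweet would slip through), which is one reason keyword rules alone struggle with ambiguity.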
Synonymity in natural language
word2vec
king - man = queen - woman
Berlin - Germany = Paris - France
https://spacy.io/demos/sense2vec?NFL
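The vector arithmetic behind these analogies can be sketched with hand-made toy vectors. The two-dimensional values below are assumptions chosen so the arithmetic works out exactly; they are not output from a trained word2vec model, where the equalities hold only approximately.

```python
import math

# Toy 2-d vectors: (royalty, gender). Hand-made for illustration only.
vectors = {
    "king":  (1.0,  1.0),
    "queen": (1.0, -1.0),
    "man":   (0.0,  1.0),
    "woman": (0.0, -1.0),
    "apple": (0.1,  0.0),  # unrelated distractor word
}

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = add(sub(vectors[a], vectors[b]), vectors[c])
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# king - man equals queen - woman in this toy space.
print(sub(vectors["king"], vectors["man"]) == sub(vectors["queen"], vectors["woman"]))
print(analogy("king", "man", "woman"))
```

With these toy values, both printed results confirm the slide's relation: the difference vectors match, and the nearest word to king - man + woman is "queen".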
4 Privacy by design
How can information useful to business be extracted from non-public spaces, while wholeheartedly respecting people’s privacy?
Think in terms of audiences and demographics not individuals
Henman Hill at Wimbledon
Come on Djokovic!
Come on Roger!
Go for it Novak!
Great shot Federer!
Audience segments: Djokovic, Federer; female, male
Think in terms of topics and attitudes not verbatim
Sumptuous interior!
Beautiful lines!
Lots of storage
PYLON: Anonymised and Aggregated insights
Text available to algorithms but not output
Aggregated results
Audience sizes are quantised: minimum bucket size and intervals
Anonymised: all Personally Identifiable Information (PII) is dropped
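The anonymise-then-aggregate idea can be sketched as follows. This is a minimal illustration, not PYLON's implementation: the field names, the minimum bucket size of 100 and the interval of 100 are all assumptions made for the example.

```python
from collections import Counter

PII_FIELDS = {"name", "user_id"}  # hypothetical PII field names
MIN_BUCKET = 100                  # hypothetical minimum audience size
INTERVAL = 100                    # hypothetical quantisation interval

def anonymise(record: dict) -> dict:
    """Drop every PII field before any counting happens."""
    return {k: v for k, v in record.items() if k not in PII_FIELDS}

def quantise(count: int):
    """Return None below the minimum bucket size, else round down to the interval."""
    if count < MIN_BUCKET:
        return None
    return (count // INTERVAL) * INTERVAL

def aggregate(records, key):
    """Count anonymised records per value of `key` and quantise each count."""
    counts = Counter(anonymise(r)[key] for r in records)
    return {value: quantise(n) for value, n in counts.items()}

# Synthetic data: 250 male and 40 female authors.
records = (
    [{"user_id": i, "name": f"user{i}", "gender": "male"} for i in range(250)]
    + [{"user_id": 1000 + i, "name": f"user{1000 + i}", "gender": "female"} for i in range(40)]
)

print(aggregate(records, "gender"))  # → {'male': 200, 'female': None}
```

Only the quantised totals leave the function: the 40-person group falls under the minimum bucket and is withheld, and the 250-person group is reported as 200 rather than its exact size.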
DEMOGRAPHICS
Gender: Male
Age Range: 35-44
Region: California, USA
SENTIMENT
Negative, Neutral, Positive
TOPIC ANALYSIS
Automatic classification of related topics, e.g. Star Wars VII (Film)
LINKS
Analyze URLs shared across Facebook
ENGAGEMENT
Engagement and Demographics around Likes, Comments and Shares
TEXT ANALYSIS
Privacy-safe aggregate analysis of text
CONTENT
Can’t wait to take the kids to watch Star Wars VII
Topic Data is Multi-Dimensional. Build Insights into Content, Engagement, Audiences
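A rough sketch of what "multi-dimensional" means in practice: the same set of interactions can be counted along any combination of dimensions. The record layout and values below are invented for illustration and do not reflect the real topic data schema or API.

```python
from collections import Counter

# Invented, already-anonymised interaction records.
interactions = [
    {"topic": "Star Wars VII", "gender": "male", "age_range": "35-44", "sentiment": "positive"},
    {"topic": "Star Wars VII", "gender": "female", "age_range": "25-34", "sentiment": "positive"},
    {"topic": "Star Wars VII", "gender": "male", "age_range": "35-44", "sentiment": "neutral"},
    {"topic": "Star Wars VII", "gender": "male", "age_range": "18-24", "sentiment": "positive"},
]

def breakdown(records, *dimensions):
    """Count records for each combination of values of the given dimensions."""
    return Counter(tuple(r[d] for d in dimensions) for r in records)

by_sentiment = breakdown(interactions, "sentiment")
by_gender_and_sentiment = breakdown(interactions, "gender", "sentiment")

print(by_sentiment)
print(by_gender_and_sentiment)
```

In a privacy-safe setting these raw counts would not be returned directly; they would first be quantised with a minimum bucket size, as the PYLON slide describes.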
5 Examples
Analysing and visualising automotive
websequencediagrams.com
Writing the script with Facebook topic data
Unlocking Insights from 2.1B People on Social Networks
2.1B People Globally on Social Networks
Challenges to extracting insights from data:
Volume and velocity
Natural language
Privacy
THANK YOU