bye, bye research. hello data mining!
Have a question you’d like to ask regarding today’s presentation?
We welcome you to type your questions in the ‘Question & Answer’ window at any time during today’s Webinar. We will answer as many questions as time allows during the Q & A session following this presentation.
Got Tweet? #PLData
Bye, Bye Research. Hello Data Mining!
Hosted by Sean Case, SVP, Peanut Labs
Wednesday, March 10, 2010
Peanut Labs, Inc. · 114 Sansome Street, Suite 920 · San Francisco, CA 94104
www.peanutlabs.com
Presentations by:
Jean Davis, Co-founder, Conversition
Catherine van Zuylen, VP, Product Marketing, Attensity
Jim Schwab, VP, Business Development – Social Media, Alterian
Today’s Agenda
Social network mining and analysis
Text analytics
Predictive modeling and analytics
Emerging technologies in data mining
Plus more!
Pecha Kucha Defined
Usually pronounced in three syllables like “pe-chak-cha”
A presentation format in which one presenter shows 20 slides for 20 seconds each, for a total of six minutes and 40 seconds
Devised in Tokyo in February 2003 by Astrid Klein and Mark Dytham of Tokyo’s Klein-Dytham Architecture
Has since turned into a massive celebration, with events happening in hundreds of cities around the world
Jean Marie Davis, Co-founder, Conversition
Co-founded Conversition in February 2009
Formerly the President of Ipsos Online, North America
25+ years of experience in global marketing research
Known for her storytelling, Jean authored The Little Church that Could, a fun and inspirational review of the signs posted outside one church for an entire year
Follow Jean on Twitter @JeanMarie50
Not Bye, Bye Research.
It’s “Welcome, Social Media Research.”
In the social network arena, there is an opportunity to add social media data to the marketing research field.
Evolution of Research Science
• Marketing research techniques that assure data quality and create valuable data are very similar for each type of research – mail, face-to-face, phone, online.
• Process and methods need to be developed to make social media data be another source for Marketing Research.
New Data Set
New Data Collection Methodology
• Instead of asking survey participants to answer questions, we listen to what social media contributors want to talk about
Applying Research Science
Market research using a different data source
• Research means:
- Strict data quality processes
- Norms and competitive brands
- Standardized measures, both box scores and average scores
- Key research measures
- Category specific measures
- Customized client measures
- Sampling and weighting
• Create Search: Specify what client wants to measure
• Crawl: Identify relevant conversations
• Clean: Identify conversations that do not meet basic quality control requirements
• Score: Tiered system reflecting unique needs of different data sources
• Content Analysis: Content analysis is applied to every conversation
• Sample: Sampling is used to identify which sources are appropriate for a client
• Weight: Weighting is applied to the sampling matrix to ensure that the included sources are reflected in a consistent proportion over time
Creating the Process
Data Sources
                  Past 6 Months   Past 30 Days
Client Brand A          100,000         30,000
Client Brand B              800            100
Client Brand C          500,000        100,000
Category A              700,000         75,000
Category B               40,000          5,000
Category C            1,000,000        300,000
Competitive A            15,000          2,000
Competitive B           125,000         25,000
Competitive C           800,000        200,000
Sample Sizes
• The social media presence of Client Brands A and C and of Competitive Brands A, B, and C is very good and well suited to social media research.
• Social media presence of Client Brand B is extremely low and may not be suited for quantitative research at this time.
Demographics
Demographic Percentage
Sex Percentage
Female 55.77 %
Male 44.10 %
Total 100 %
Age Percentage
Under 18 11.30 %
18 to 34 26.89 %
35 to 64 55.44 %
65 and older 6.24 %
Total 100 %
Education Percentage
High school or less 24.61 %
College or more 75.26 %
Total 100 %
Income Percentage
Up to $24 999 13.89 %
$25 000 to $74 999 55.97 %
$75 000 and over 30.01 %
Total 100 %
Demographic Percentage
Region Percentage
USA 33.20 %
Canada 3.22 %
Europe 25.16 %
Asia 12.10 %
Middle East 7.19 %
Africa 4.13 %
Pacific 7.27 %
South America 7.60 %
Total 100 %
Hispanic Percentage
Hispanic 6.20 %
Not Hispanic 93.67 %
Total 100 %
Race Percentage
White 76.85 %
Black 11.29 %
Asian 3.66 %
Other 8.06 %
Total 100 %
• Social media contributors do not share their demographic information when they contribute online, but we do know the demographic make-up of many popular social media websites, including Twitter, Flickr, and Blogger.
• People talking about this brand are more likely to
- be women
- be aged 35 to 64
- have a college degree
- earn $25k to $75k
Scoring Methods
5 / Top Box
• I absolutely love this!
4 / Top 2 Box
• This is good
• It made me angry before but I quite like it now
3 / Neutral
• Whatever, I don’t care
• I have a love/hate relationship with it
2 / Bottom 2 Box
• This is stupid
• It’s ok but I hate going there cause the people are so rude
1 / Bottom Box
• This is the worst one ever
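Once each conversation carries a 1 to 5 score, the box scores and average scores mentioned earlier can be computed with simple arithmetic. The sketch below is only an illustration: the function name `box_scores` and the sample scores are assumptions, not part of the presenter's system.

```python
from collections import Counter


def box_scores(scores):
    """Summarize 1-5 conversation scores as box percentages and an average."""
    if not scores:
        raise ValueError("need at least one score")
    counts = Counter(scores)
    n = len(scores)

    def pct(*values):
        # Percentage of conversations whose score is one of `values`.
        return 100.0 * sum(counts[v] for v in values) / n

    return {
        "top_box": pct(5),
        "top_2_box": pct(4, 5),
        "neutral": pct(3),
        "bottom_2_box": pct(1, 2),
        "bottom_box": pct(1),
        "average": sum(scores) / n,
    }


# Hypothetical scores for eight conversations about one brand.
summary = box_scores([5, 4, 4, 3, 2, 1, 5, 4])
print(summary)
```

Reporting both the box percentages and the average keeps the output comparable with familiar survey reporting.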
Content Analysis
• A method of grouping similar Conversations together so that they can be evaluated as a whole.
• Retailers: Parking, check-out lines, categories (electronics, apparel)
• CPG: taste, feel, product, price
• Determine which sets of conversations are similar to each other based on tone of voice and content of the conversation.
• Sentiment: positive/negative
• Sources can be sampled and weighted according to the distribution of internet categories
• Can be weighted to redistribute sample so that overrepresented categories are less likely to skew the data
Source        Sample   Weight
Category
• Blog        Yes      60%
• Video       Yes      40%
• Photo       No       0%
Specific
• Blogspot    No       0%
• Flickr      Yes      35%
• YouTube     Yes      65%
Sampling & Weighting
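One way to picture the weighting step is as a per-conversation weight that pulls each included source toward its target share. The sketch below assumes simple proportional weighting; the function name `source_weights` and the observed counts are hypothetical, and only the 60% / 40% / 0% targets come from the table.

```python
def source_weights(observed_counts, target_shares):
    """Return a per-conversation weight for each source category.

    observed_counts: conversations collected per source, e.g. {"Blog": 300}
    target_shares:   desired share of the weighted sample per source
    Sources with a zero target share are excluded (weight 0).
    """
    included = {
        source: n
        for source, n in observed_counts.items()
        if target_shares.get(source, 0) > 0 and n > 0
    }
    total = sum(included.values())
    if total == 0:
        raise ValueError("no conversations from included sources")
    weights = {}
    for source in observed_counts:
        if source in included:
            observed_share = included[source] / total
            weights[source] = target_shares[source] / observed_share
        else:
            weights[source] = 0.0
    return weights


observed = {"Blog": 300, "Video": 600, "Photo": 100}   # hypothetical counts
targets = {"Blog": 0.60, "Video": 0.40, "Photo": 0.0}  # shares from the table
weights = source_weights(observed, targets)
print(weights)
```

With these numbers, blog conversations are weighted up and video conversations are weighted down, so the over-represented video category no longer dominates the totals.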
Reporting
• Data can bring results in familiar data reports.
- Brand comparisons
- Attribute reporting
• Data can bring results in new data reports.
- Cloud reporting
- Psychographics
Multiple Brand Comparison
[Bar chart: positive sentiment and volume of verbatims by brand, Brand A through Brand Q]
• Sentiment and volume of chatter were tracked beginning from September 1, 2009
• Brands with the most positive sentiment include Brand A, Brand G, Brand H, and Brand N.
• Brands with the most chatter include Brand B, Brand J, and Brand L
Past 30 days
n = 378 to 92,000
Retailer Attribute Comparison
Average Scores
5.0 = Positive
3.0 = Neutral
1.0 = Negative
Norms
4.0 = High
3.3 = Normal
3.0 = Low
• Radar maps allow one to evaluate multiple brands on multiple constructs in one single chart. Brands with the largest web, or circle, are viewed the most positively by consumers. In this case, constructs relevant to retailers have been selected to compare Brand Green retailer with Brand Grey retailer.
• Scores are most positive in relation to crowding, the parking lot, and the hours. On the other hand, scores are much lower for opinions of employees and the website.
• Brand Green outperforms Brand Grey on nearly every construct. However, Brand Green and Brand Grey generate very similar opinions related to their websites.
[Radar chart constructs: Employees, Crowding, Website, Washrooms, Hours, Parking Lot]
Clouds
• Data clouds indicate the specific words and phrases that people use in their conversations
• Popular words indicate:
- The interests of people talking about the brand, and therefore the contents of marketing materials
- Co-branding and co-sponsorship opportunities that are relevant to your consumers
- Appropriate language to use in marketing materials, whether slang or formal
Use tennis or football metaphors
Show basketball or football in marketing materials
Obtain tennis or football celebrity endorsements
Psychographics
• Despite the fact that Brand A and Brand B generate similar emotion scores, by reviewing the assortment of constructs and identifying those with greater and lesser frequencies, psychographic differentiators of brands can be discovered
• The first three constructs are revealing in that each word relates to the exact same idea. However, the words used among Brand A consumers are more intellectual.
• This trend follows through in the discussions of technology where Brand A consumers use more technical words.
• Income and schooling also reflect a higher socio-economic status for Brand A consumers
• Brand A consumers reflect a higher socio-economic status than Brand B consumers
Construct            Brand A                          Brand B
Creativity           Innovative                       Original
Fashionable          Stylish                          Popular
Intelligence         Clever                           Smart
Home Theatre         Blue-ray, Home Theater           Home entertainment, Blu-ray
Telephones           Android phone, PDA               Cell phone
Sports Celebrities   Dale Earnhardt, Peyton Manning   Dan Hornbuckle
Reading Materials    Textbook                         Magazine
With Whom            Family, Grandmother              Wife, Kid
Income               Financially stable, Rich         Poor, broke
Schooling            Degree, school                   Learn
The End
• Say “Hello” to Social Media Research
- New data collection methodology
- Create a process from data collection to reporting
- Apply research techniques to the data to create a valid, valuable, actionable data set
- Create new and familiar reports
- Continue to validate and improve processes
March 10, 2010
Thank you to Peanut Labs for inviting Conversition to share in their webinar!
Jean Davis, [email protected]
March 10, 2010
Any questions for Jean?
Catherine H van Zuylen, VP, Product Marketing, Attensity
A consultant at The Grommet Group
Formerly Vice President of Marketing at Block Shield
20 years of experience in product management, product marketing and marketing communications
A Silicon Valley native who grew up across from an apricot orchard and won several blue ribbons at the country fair for her fruits and vegetables
Follow Catherine on Twitter @catevz
Leveraging Customer Conversations Through LARA
Catherine H van Zuylen
VP, Product Marketing
[email protected]
www.attensity.com
Twitter: @attensity
A Few Words About Me and Attensity
Attensity:
Over 20 years experience understanding customer conversations in text; 6 patents in natural language processing
Suite of applications for social media monitoring, Voice of the Customer Analysis, and Self-Service/Agent Service
Over 500 customers worldwide
Me: 15 years in marketing; 10+ years in text analytics and internet media
“Customer Information” is changing and growing exponentially
Twitter hit the 10 billion tweet mark last week: over 20% are about products and services
Over 247 billion emails are sent every day
Millions of customer interaction records in a typical large company.
To effectively harness these “customer conversations”, you need a program to comprehensively
Listen across customer conversation channels
Analyze accurately and efficiently
Relate this information to other information
Act on the information
We call this the LARA methodology
LARA Methodology: Listen, Analyze, Relate, Act
Are you listening where your customers are talking?
Are your “social media” listening efforts isolated from your “CRM” listening efforts and separate from your “survey” listening? Are you monitoring your internal customer communities?
Text Analysis can help bridge these gaps.
Text Analysis is not Search
“Search” is for finding relevant or recent documents that contain a term of interest
But it’s hard with search to get the “big picture”
What do people think about my company?
What problems are they having?
Who is thinking of switching?
What do they like about me vs. the competition?
What new ideas do they have?
“Search” starts with you feeding a system words to look for. “Text Analysis” starts with the data itself and lets it tell a story
Documents
Dynamic Text Profiling
Entities, sentiments, events and relationships, intent, etc
XML or other “tags”
Text Analysis starts the same way some search engines do…
Automatic Language and Character Encoding Identification
Identify paragraphs and sentences within text
Word Segmentation (Tokenization) and De-Compounding
Part-of-Speech Tagging
Stemming
Noun-Phrase Identification
Then continues with Entity Extraction…
Who: People, Person Position, Social Security Numbers
What: Companies, Organizations, Financial Indexes, Products (software, weapons, vehicles, etc…)
When: Dates, Days, Holidays, Months, Years, Times, Time Periods
Where: Addresses, Cities, States, Countries, Facilities (stadiums, plants), Internet Addresses, Phone Numbers
How Much: Currencies, Measures
Concepts (i.e. Global piracy, unstructured data…)
Can be pattern-based – tell the system that a “Prop-Noun followed by Smith” is probably a person
Or machine learning – feed it a million proper names and let it deduce names from those examples…
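The pattern-based approach can be illustrated with a regular expression that treats a capitalized word followed by a known surname as a probable person. This is a minimal sketch, not the vendor's implementation: the surname list, `PERSON_PATTERN`, and `find_people` are all hypothetical names chosen for the example.

```python
import re

# Hypothetical surname list; a real system would use a far larger gazetteer.
KNOWN_SURNAMES = ["Smith", "Jones", "Garcia"]

# A capitalized word (stand-in for "Prop-Noun") followed by a known surname.
PERSON_PATTERN = re.compile(
    r"\b([A-Z][a-z]+)\s+(" + "|".join(KNOWN_SURNAMES) + r")\b"
)


def find_people(text):
    """Return 'First Last' strings matched by the surname pattern."""
    return [f"{first} {last}" for first, last in PERSON_PATTERN.findall(text)]


print(find_people("Yesterday Anna Smith emailed support about the Smith account."))
```

A rule this simple will also fire on any capitalized word that happens to precede a listed surname, which is one reason pattern rules are usually combined with the machine learning approach.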
Practical Text Analysis in Action
Let’s say that I am a major retailer, and someone posted a review that starts out
I bought this Gucci scarf for my mom in your Santana Row store last week.
Entities (brands, people, locations, times, products…)
To “connect the dots” in data, you also need to extract noun-verb relationships, sentiment…
I bought this Gucci scarf for my mom in your Santana Row store last week. I really like the pattern, but I don’t like how it itches.
Entities (brands, people, locations, times, products…)
Events and relationships: action and purchasing reason
Sentiment (extreme positive, positive, negative, extreme negative)
To “connect the dots” in data, you also need to extract suggestions, intent…
I bought this Gucci scarf for my mom in your Santana Row store last week. I really like the pattern, but I don’t like how it itches. I wish this scarf came in cotton. If Gucci made more cotton scarves, I would buy them all.
Entities (brands, people, locations, times, products…)
Events and relationships (I : buy : this Gucci scarf | I : buy : for mom)
Sentiment (extreme positive, positive, negative, extreme negative)
Suggestions (I : wish : this scarf came in cotton)
Intent (to purchase, to leave) (If Gucci made more cotton scarves, I would buy them.)
How do you do this? You parse sentences like a human…and extract triples…
…and voices (intent, recurrence, etc)
Question [?] voice: How can I get free shipping with future orders?
Condition [if/then] voice: I would shop more frequently if you offered free shipping.
Intent [intent] voice: I plan to place an order today.
Negation [not] negates the meaning of the verb: You did not have the size I was looking for in stock
Augment [more] voice: The staff were incredibly professional
Recurrence [again] voice: I had to enter my information several times for the order to process
Indefinite voice representing suggestions or requests: You should sell wedding dresses, too!
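To make the idea of voices concrete, here is a deliberately naive tagger that looks for cue words in a sentence. Real text analysis parses the sentence rather than matching keywords; the cue lists and the names `VOICE_CUES` and `tag_voices` are assumptions made for this sketch.

```python
import re

# Illustrative cue patterns per voice; these lists are assumptions.
VOICE_CUES = {
    "question": r"\?\s*$",
    "condition": r"\bif\b",
    "intent": r"\b(plan to|going to|intend to)\b",
    "negation": r"\b(not|never)\b|n['’]t\b",
    "augment": r"\b(incredibly|extremely|very|really)\b",
    "recurrence": r"\b(again|several times|repeatedly)\b",
    "indefinite": r"\b(should|could|wish)\b",
}


def tag_voices(sentence):
    """Return the voices whose cue pattern appears in the sentence."""
    return [
        voice
        for voice, pattern in VOICE_CUES.items()
        if re.search(pattern, sentence, flags=re.IGNORECASE)
    ]


for example in [
    "How can I get free shipping with future orders?",
    "I would shop more frequently if you offered free shipping.",
    "You did not have the size I was looking for in stock",
]:
    print(tag_voices(example), "<-", example)
```

A sentence can carry several voices at once, so the function returns a list rather than a single label.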
LARA Methodology: Listen, Analyze, Relate, Act
Once you’ve done text analysis, you can relate the text to structured information…
01/24/2010
By errodd from San Jose, CA
I bought this Gucci scarf for my mom in your Santana Row store last week. I really like the pattern, but I don’t like how it itches. I wish this scarf came in cotton. If Gucci made more cotton scarves, I would buy them all.
Can help you answer questions like
What were the top concerns of people who rated this product a “4”?
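A question like this becomes a simple filter and count once each review's structured rating sits next to the topics extracted from its text. The records, field names, and the `top_concerns` helper below are hypothetical and only sketch the idea.

```python
from collections import Counter

# Hypothetical records: a star rating (structured data) joined with the
# negative topics that a text-analysis step extracted from the review text.
reviews = [
    {"rating": 4, "negative_topics": ["itchy fabric"]},
    {"rating": 4, "negative_topics": ["itchy fabric", "price"]},
    {"rating": 5, "negative_topics": []},
    {"rating": 4, "negative_topics": ["itchy fabric"]},
    {"rating": 2, "negative_topics": ["itchy fabric", "color fading"]},
]


def top_concerns(records, rating, limit=3):
    """Count negative topics among reviews with the given rating."""
    counts = Counter(
        topic
        for record in records
        if record["rating"] == rating
        for topic in record["negative_topics"]
    )
    return counts.most_common(limit)


print(top_concerns(reviews, rating=4))
```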
LARA Methodology: Listen, Analyze, Relate, Act: What Can You Do with Text Analysis?
The output from text analysis can be exported as XML…
It can also be used directly in applications that
Seek out and deliver information to those who need it
Route and respond to communications
Mine and report on information
“Seek Out” information for a self-service knowledgebase
Problem
Solution
Manufacturer: Apple
Product: Macbook, Projector, Monitor
Component: Adapter cord, Mini-DVI, VGA
Action: Do a presentation, connect
Route and respond to all customer communications
Refund policy? Email
intent to leave tweet
Threatening to sue posting
“refund policy” email response auto-generated
Automatically routed as a mobile alert to legal for review
Responses can be reviewed by agent before sending
Read text and extract knowledge about what the document is saying
People, Places, Events, Topics, Sentiment …
Routed to Customer Service for Follow-up and Resolution
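The routing idea on this slide can be sketched as a lookup from a detected category to a destination and an action. The pairing of categories to destinations below is inferred from the slide labels, and the keyword cues stand in for real text analysis; `ROUTES`, `CUES`, and `route` are hypothetical names.

```python
import re

# Hypothetical routing table: detected category -> (destination, action).
ROUTES = {
    "refund_policy_question": ("agent_review_queue", "auto-generate reply for agent review"),
    "intent_to_leave": ("customer_service", "follow up and resolve"),
    "legal_threat": ("legal", "send mobile alert for review"),
}

# Naive keyword cues, checked in priority order (legal threats first).
CUES = [
    ("legal_threat", r"\b(sue|lawsuit|lawyer)\b"),
    ("intent_to_leave", r"\b(cancel my|switching to|leaving)\b"),
    ("refund_policy_question", r"\brefund\b"),
]


def route(message):
    """Return (category, destination, action) for an incoming message."""
    for category, pattern in CUES:
        if re.search(pattern, message, flags=re.IGNORECASE):
            destination, action = ROUTES[category]
            return category, destination, action
    return "unclassified", "general_inbox", "manual triage"


print(route("What is your refund policy?"))
print(route("I am switching to another retailer"))
```

Checking the cues in priority order means a message that mentions both a refund and a lawsuit is sent to legal first.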
Mine and report on sentiments, complaints, compliments, and “intentional” behavior across all customer conversations
Better understanding their customers
Better understanding their customers and gain early warning on product issues
Thank You.
Leveraging Customer Conversations Through LARA
Catherine H van Zuylen
VP, Product Marketing
[email protected]
www.attensity.com
Twitter: @attensity
Any questions for Catherine?
Jim Schwab, VP, Business Development – Social Media, Alterian
Formerly SVP of Sales and Marketing at Harris Interactive
Has close to 800 followers on Twitter
A graduate of the State University of New York College at Brockport
When not preoccupied with helping marketing, advertising, PR and customer service professionals to provide visibility and tools to understand what consumers and media are saying online, Jim enjoys keeping up with his 3 kids
Follow Jim on Twitter @JImSchwab
Bye, Bye Research. Hello Data Mining!
Tapping into Social Media
Jim Schwab
VP Business Development, Social Media
Alterian
March 10, 2010
Agenda
1. Quick Intro
   And my observations over the last couple years
2. Social Media, why should you care
3. Some caveats & challenges
4. Finding the right tool for mining social media
   And how to use it!
5. Some examples
6. What should you do?
   Listen, learn, engage and participate
Leveraging Social Media Content
Alterian SM2 at a Glance
A software technology focused on social media monitoring and analytics
Founded in 2005, commercially launched August 2008
10,000+ users globally
  Freemium
  Professional
Big brands and agencies alike
  Microsoft, Intuit, McKinsey Consulting
  Edelman, Carlson Marketing, Epsilon, Experian
Quick intro
My observations…..
1. We have to be where the consumers are!
2. Budgets are moving!
3. If I can do it anyone can!
4. The adoption curve is being followed
   But much more rapidly
   New solutions are emerging that make social media more main street focused
Quick intro
About me…..
1. I’m not a tech geek
2. I’m not a data jockey
3. I’m not a trained analyst
4. I’m passionate about understanding how to deliver the right message to the right audience at the right time using the right mix of channels
NOT AN EASY TASK!!
Why should you care?
Consumers are overwhelmed
Why should you care?
Listen, learn & engage
Twitter: “i was just talking about this the other day - how ineffective/lame the new tropicana packaging is…”
YouTube: “just got my new toshiba netbook. seems to be working great. will be nice to use this rather then lugging around my big dell….”
Blog: “if you really want to stretch your dollars you can use your registered starbucks card to buy an iced coffee and get a free refill….”
Some caveats & challenges
Social media content is dynamic and unpredictable
It’s not magic!!
Bloggers, tweeters and SM authors do not cooperate with marketers and customer service professionals
SM content is NOT like your regular customer database
  SM has no borders or zip codes
  SM has little demographics
You won’t capture every SM post out there
It’s unstructured
Automated sentiment is a real challenge
How do you get to relevant content?
Filter filter filter…..mine mine mine
The Universe of Content
1,000,000,000,000,000
Key words      Continuous cleaning
Exclusions     Alerts
Platforms      Content structure
Language       Representativeness
Location       Irrelevant content
Time period    Spam
Project goals
Content that is relevant to you
10,000 posts about my brand + purchase intent + promo terms & time period + competitive mentions
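Narrowing a large pool of posts to the relevant few is, at its simplest, a chain of filters on key words, exclusions, and time period. The sketch below uses plain substring matching on made-up posts; the function name `is_relevant`, the term lists, and the dates are all assumptions for illustration.

```python
from datetime import date


def is_relevant(post, brand_terms, intent_terms, exclusions, start, end):
    """Keep posts that mention the brand and purchase intent, fall inside
    the time period, and contain none of the excluded terms."""
    text = post["text"].lower()
    return (
        start <= post["date"] <= end
        and any(term in text for term in brand_terms)
        and any(term in text for term in intent_terms)
        and not any(term in text for term in exclusions)
    )


# Hypothetical posts; a real crawl would supply millions of these.
posts = [
    {"text": "Thinking of buying a Toshiba netbook this weekend", "date": date(2010, 2, 20)},
    {"text": "Win a free Toshiba laptop, click here to buy", "date": date(2010, 2, 21)},
    {"text": "Going to buy a Toshiba soon", "date": date(2009, 11, 1)},
    {"text": "My Dell is heavy to lug around", "date": date(2010, 2, 22)},
]

relevant = [
    post
    for post in posts
    if is_relevant(
        post,
        brand_terms=["toshiba"],
        intent_terms=["buy", "purchase"],
        exclusions=["click here"],   # crude spam exclusion
        start=date(2010, 2, 1),
        end=date(2010, 3, 10),
    )
]
print(len(relevant), "relevant post(s)")
```

Each added filter removes a different kind of noise: the exclusion drops the spam post, the time period drops the old one, and the brand terms drop the off-topic one.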
What is being said…..
Where is it being said…..
Who’s driving the conversations…..
Compared to my competition…..
Why should you care?
Turn unstructured text into actionable insight….
Social Media Monitoring Applications
Client survey results, bucketed into 10 categories
1. Listening / Monitoring
2. Reputation & Crisis Management
3. Engagement & outreach
4. Market Research
5. Influencer identification
6. Competitive analysis
7. Customer support
8. SEO and link building
9. Support Loyalty Programs
10. Augment mystery shopper programs
Increase brand recognition and media attention
The project
OLX is the next generation of free online classifieds.
Blogger outreach
Online PR
OLX wanted to run a 4 month trial period before proceeding any further.
Unknown territory…..
Chris Abraham, President and [email protected]
+1 202 352 5051
The payback
Year on year increase in the US
The payback
Increase in volume, across languages
The payback and key learnings
OLX.com web traffic increased 40% over the 4 month trial
Abraham & Harrison renewed for 12 month contract
Twitter accounts in 3 languages, 5 in 6 months
The project
Help client move the brand image among key influencers
Two part project
1. 12 month audit of conversations, in depth analytics
   Report and recommendations delivered
   Segmentation and profiles built of key targets
2. Approval on recommended approach to influencers
   Outreach and PR program
Wendy Scherer, Founder [email protected]
+1 202 715 3884
Segmentations & Profile
Their Views:
“..recent concerns about excessive dairy consumption and the possible effects on health.”
Favorite web sites
Most used social media channels
Their Profile:
“They heavily reference the writings of Michael Pollan, who advocates natural food production ……..generally recommend choosing foods from a variety of food groups.”
The payback and key learnings
Based on initial work the company has built a team (in house and agency) to do SM engagement.
Begun to specialize their team
Fantastic time saver in finding influencers
Can’t be salesy – this is SOCIAL media
Education materials on diet data, nutrition, gluten free…etc.
Market & thought leader type conversations have increased
Find the right tool for the job
Listen, learn, engage and participate…..
Self service vs Professional service/agency
Reporting
Powerful and flexible functionality
You HAVE to be able to dig into the weeds…….or you risk analysis based on bad data
There are many vendors!
  High tech software to low tech
  Jim’s Social Media Company
  Many start ups
There are only a few real players in the software space
  And many good agencies
Thank you
Jim Schwab
+1.585.261.9433
@jimschwab
SM2 Freemium http://sm2.techrigy.com
Alterian SM2
Social Media Monitoring
Sign up for a FREE Social Media Monitoring account!!!
Any questions for Jim?
Q & A Session
We welcome any questions you may have regarding the content of today’s Webinar.
Special thank you to each of our three presenters!
Thank you for joining us!
The slide deck along with a recording of today’s presentation will be available for download via our website. We will be sending all attendees a link to the slide deck as soon as it is available.