TRANSCRIPT
© 2016 Sawtooth Software, Inc. | www.sawtoothsoftware.com
Webinar
Learning to Use the
MaxDiff Typing Tool
Background on typing tools
How the naïve Bayes MaxDiff typing tool works
Practical use of Sawtooth Software’s MaxDiff typing tool
Using the software
Agenda
BACKGROUND
Section 1
We have a set of cases (e.g. respondents) with given segment membership assignments
Now we want to predict segment membership for new cases not included in the original segmentation exercise
This is a job for “supervised learning”
Unsupervised learning, e.g. cluster analysis, creates group memberships in the absence of a dependent (supervising) variable
Supervised learning identifies rules or equations that predict group membership when a supervising variable (group membership) is available
We want to apply segment assignments from a learning sample (or original sample tagged with segment membership) to a new sample of cases/respondents
The assignment
Linear-in-parameters models
Discriminant analysis
Logit
Machine learning methods
Nearest neighbor analysis
Tree-based methods
Tree ensembles (random forests)
Naïve Bayes classifiers
Other supervised learning methods (support vector machines, etc.)
Some supervised learning methods
Discriminant analysis has a categorical dependent variable (DV) and some number of independent variables (IVs), which may be metric or categorical predictors
To make a typing tool, let segment membership be the DV and search for a model in which a small number of independent variables predicts segment membership well
One set of outputs is Fisher’s linear discriminant functions
One linear function per level in the dependent variable
Compute the values of these functions with data from a given new respondent
Predicted segment is the one corresponding to the function with the largest value
Discriminant analysis
Age (in years), gender (1=male, 2=female), and the score on a 5-point rating scale (RS) predict membership in 3 segments
Functions
DF1: 14.3 - 0.4(Age) + 1.5(Gender) + 0.8 (RS)
DF2: -8.0 + 0.3(Age) + 0.9(Gender) - 0.4 (RS)
DF3: 1.8 + 0.1(Age) -0.1(Gender) + 0.5 (RS)
E.g., 30-year-old Mr. Jones, who gives a 2 on the rating question:
DF1: 14.3 - 0.4(30) + 1.5(1) + 0.8 (2) = 5.4
DF2: -8.0 + 0.3(30) + 0.9(1) - 0.4 (2) = 1.1
DF3: 1.8 + 0.1(30) - 0.1(1) + 0.5 (2) = 5.7
DF3 has the largest value, so Mr. Jones is assigned to the third segment
Discriminant analysis example
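The arithmetic on this slide can be checked with a short Python sketch. The coefficients are copied from the slide; the segment labels ("Segment 1" to "Segment 3") and the function names are illustrative assumptions, not part of any Sawtooth Software tool.

```python
# Coefficients copied from the slide: (intercept, age, gender, rating score).
# Segment labels are assumed to follow the DF1..DF3 numbering.
FUNCTIONS = {
    "Segment 1": (14.3, -0.4, 1.5, 0.8),
    "Segment 2": (-8.0, 0.3, 0.9, -0.4),
    "Segment 3": (1.8, 0.1, -0.1, 0.5),
}


def discriminant_scores(age, gender, rating):
    """Evaluate each segment's linear function for one respondent."""
    return {
        seg: b0 + b_age * age + b_gender * gender + b_rs * rating
        for seg, (b0, b_age, b_gender, b_rs) in FUNCTIONS.items()
    }


def predict_segment(age, gender, rating):
    """Return the segment whose function has the largest value."""
    scores = discriminant_scores(age, gender, rating)
    return max(scores, key=scores.get)


# Mr. Jones: 30 years old, male (1), rating of 2.
scores = discriminant_scores(age=30, gender=1, rating=2)
print({seg: round(value, 1) for seg, value in scores.items()})
print(predict_segment(age=30, gender=1, rating=2))
```

The same two functions could be applied row by row to a file of new respondents.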
One can also use a polytomous multinomial logit (MNL) with segment assignment as the DV and a small number of IVs
As with discriminant analysis, assign respondent to the segment with the largest value for the linear function
With MNL you can also use values resulting from the linear functions and the logit choice rule to compute the probability of segment membership
This can help us distinguish
Core segment members
Peripheral members
“Fence sitters”
Logit
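As a rough illustration of the logit choice rule, the sketch below turns a set of linear-function values into membership probabilities. The input values are hypothetical (borrowed from the discriminant example purely for convenience), and the 0.80 and 0.60 cutoffs used to label core, peripheral, and fence-sitting members are assumptions made for this example only.

```python
import math


def logit_probabilities(values):
    """Logit choice rule: exp(v_k) / sum_j exp(v_j), computed stably."""
    top = max(values.values())
    exponentiated = {seg: math.exp(v - top) for seg, v in values.items()}
    total = sum(exponentiated.values())
    return {seg: e / total for seg, e in exponentiated.items()}


def describe_membership(probs, core=0.80, fence=0.60):
    """Label a respondent by the size of the largest probability.

    The 0.80 and 0.60 cutoffs are illustrative assumptions only.
    """
    seg = max(probs, key=probs.get)
    p = probs[seg]
    if p >= core:
        label = "core"
    elif p >= fence:
        label = "peripheral"
    else:
        label = "fence sitter"
    return seg, label


# Hypothetical linear-function values for one respondent.
values = {"Segment 1": 5.4, "Segment 2": 1.1, "Segment 3": 5.7}
probs = logit_probabilities(values)
print({seg: round(p, 3) for seg, p in probs.items()})
print(describe_membership(probs))
```

With these inputs the two leading segments have similar probabilities, which is the kind of case the "fence sitter" label is meant to flag.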
Stepwise search to find the variables that most discriminate group membership
Tree identifies the variables and the rules that classify respondents best into known segments
Now we can apply those classification rules to new cases/respondents
Classification trees
If respondent reports “4” for Q17, is over 44 years old and reports “5” for Q8, classify as Segment B
Classification trees
Total Sample: 40% A, 25% B, 35% C
    Q17 < 4: 16% A, 14% B, 70% C
        Q10 < 3: 3% A, 1% B, 92% C
        Q10 > 2: 38% A, 55% B, 7% C
    Q17 > 3: 60% A, 15% B, 25% C
        45+: 22% A, 48% B, 30% C
            Q8 = 5: 10% A, 86% B, 4% C
            Q8 < 5: 28% A, 5% B, 67% C
        <45: 80% A, 15% B, 5% C
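Once the tree is grown, its rules are just nested if/else tests. The sketch below encodes the splits from this slide and returns the modal segment of each leaf. It assumes the Q10 split sits under the Q17 < 4 branch and that the second Q8 leaf covers answers below 5; the function name and argument names are illustrative.

```python
def classify(q17, age, q8, q10):
    """Follow the splits from the slide and return the leaf's modal segment."""
    if q17 > 3:
        if age >= 45:
            # Q8 = 5 leaf is 86% B; the other Q8 leaf is 67% C.
            return "B" if q8 == 5 else "C"
        return "A"  # under 45: 80% A
    if q10 < 3:
        return "C"  # 92% C
    return "B"  # Q10 > 2: 55% B


# The rule quoted on the slide: "4" for Q17, over 44 years old, "5" for Q8.
print(classify(q17=4, age=50, q8=5, q10=1))
```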
Grow an ensemble of partially-informed trees (use a random subset of variables for each tree and for each node in each tree)
Run each new observation through each tree in the forest
Assign respondent to the modal prediction of the forest
Random forests
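The last two bullets (run each case through every tree, then take the modal prediction) can be sketched as follows. This shows only the voting step, not how the trees are grown; the three one-split "trees" and the respondent record are hypothetical stand-ins.

```python
from collections import Counter


def forest_predict(trees, respondent):
    """Run one respondent through every tree and return the modal prediction.

    Ties are broken in favor of the prediction that was seen first.
    """
    votes = Counter(tree(respondent) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical single-split trees, each using a different variable.
trees = [
    lambda r: "A" if r["q17"] > 3 else "C",
    lambda r: "B" if r["age"] >= 45 else "A",
    lambda r: "B" if r["q8"] == 5 else "C",
]

respondent = {"q17": 4, "age": 50, "q8": 5}
print(forest_predict(trees, respondent))
```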
Classifying pets
[Figure: two-dimensional map with one axis running from Friendly to Hateful and the other from Smart to Stupid]
Cats and dogs
[Figure: map with axes Friendly to Hateful and Smart to Stupid]
Nearest neighbor analysis
[Figure: map with axes Friendly to Hateful and Smart to Stupid; known cases plotted as points labeled c, d, and h]
Nearest neighbor analysis
[Figure: map with axes Friendly to Hateful and Smart to Stupid; known cases plotted as points labeled c, d, and h, plus a new case labeled N]
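A minimal nearest neighbor sketch for a map like this one: find the k known cases closest to the new case N and assign the most common label among them. The coordinates below are invented for illustration (the slide gives none), the labels c, d, and h are kept as on the slide, and k = 3 is an arbitrary choice.

```python
import math
from collections import Counter

# Hypothetical (friendly, smart) coordinates for already-classified cases.
known = [
    ((-0.8, 0.7), "c"), ((-0.6, 0.9), "c"), ((-0.7, 0.5), "c"),
    ((0.8, 0.3), "d"), ((0.6, 0.5), "d"), ((0.9, 0.1), "d"),
    ((0.2, -0.8), "h"), ((0.1, -0.6), "h"),
]


def nearest_neighbors(point, data, k=3):
    """Return the k known cases closest to the point (Euclidean distance)."""
    ranked = sorted(data, key=lambda item: math.dist(point, item[0]))
    return ranked[:k]


def classify(point, data, k=3):
    """Assign the most common label among the k nearest neighbors."""
    labels = Counter(label for _, label in nearest_neighbors(point, data, k))
    return labels.most_common(1)[0][0]


new_case = (0.7, 0.4)  # the point marked N, at an assumed location
print(classify(new_case, known))
```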
HOW THE NAÏVE BAYES CLASSIFIER WORKS
Section 2
Naïve Bayes classifiers use conditional probabilities from Bayes’ Theorem (specifically the posterior probabilities) to identify most likely group membership from a set of input variables
It is “naïve” because it assumes that the input variables are conditionally independent given group membership, so their conditional probabilities can simply be multiplied together
Ideal for MaxDiff
Naïve Bayes classifiers
Our tool uses a type of Naïve Bayes to
1. Identify a small set of MaxDiff questions that do a good job of classifying respondents from an existing MaxDiff based segmentation database into their known segments
Likely these will be questions few or no respondents actually answered
It can also use non-MaxDiff “auxiliary” variables to improve its predictions
2. Classify new respondents into those segments using the small set of MaxDiff questions
Sawtooth Software MaxDiff Typing Tool
With the utilities of existing segment members in hand we can predict how the average member in any of the segments should answer any possible MaxDiff question and any possible combination of MaxDiff questions
Here’s the Bayesian part: we can also calculate the likelihood that a respondent giving any pattern of responses to a given set of MaxDiff questions belongs to any of the segments
How it works
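A minimal sketch of the calculation described above, not the tool's actual code. It assumes the "best" pick follows a logit rule over the utilities of the items shown and the "worst" pick follows a logit rule over the negated utilities of the remaining items; the segment sizes, utilities, and answers are hypothetical. Segment size serves as the prior, and the per-question likelihoods are multiplied together (the naïve step).

```python
import math


def best_worst_probability(utilities, shown, best, worst):
    """Probability of one best/worst answer for a given segment.

    Assumption: best ~ logit over shown items; worst ~ logit over the
    negated utilities of the items left after removing the best pick.
    """
    p_best = math.exp(utilities[best]) / sum(math.exp(utilities[i]) for i in shown)
    rest = [i for i in shown if i != best]
    p_worst = math.exp(-utilities[worst]) / sum(math.exp(-utilities[i]) for i in rest)
    return p_best * p_worst


def segment_posteriors(segments, answers):
    """Prior (segment size) times the product of per-question likelihoods,
    normalized so the probabilities sum to 1 across segments."""
    joint = {}
    for name, (size, utilities) in segments.items():
        likelihood = 1.0
        for shown, best, worst in answers:
            likelihood *= best_worst_probability(utilities, shown, best, worst)
        joint[name] = size * likelihood
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}


# Hypothetical segment sizes and average utilities for four items (0..3).
segments = {
    "A": (0.6, [2.0, 0.5, -1.0, -1.5]),
    "B": (0.4, [-1.5, -0.5, 0.5, 1.5]),
}
# Two answered questions: (items shown, best item, worst item).
answers = [((0, 1, 2), 2, 0), ((1, 2, 3), 3, 1)]

posteriors = segment_posteriors(segments, answers)
print(max(posteriors, key=posteriors.get), posteriors)
```

Here the respondent's answers match segment B's preferences, so B receives nearly all of the posterior probability despite its smaller prior.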
Using a fast swapping procedure, the tool searches for the subset of MaxDiff questions that assigns the greatest number of existing respondents to their correct segments
The search procedure has a random starting point, so we typically use many (50, 100, etc.) starting points so that we can arrive at a near-optimal solution
The user indicates how many questions of how many items each to include in each search
You can also choose to focus on success predicting particular segments
How it works
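The general idea of a swapping search with many random starting points can be sketched as below. This is a generic illustration, not the tool's algorithm: it searches over a flat set of items rather than over questions, and the toy score (a sum of made-up per-item values) merely stands in for "number of existing respondents classified correctly". All names are hypothetical.

```python
import random


def swap_search(items, k, score, starts=50, seed=0):
    """Random-restart swapping search for a k-item subset maximizing score()."""
    rng = random.Random(seed)
    best_set, best_score = None, float("-inf")
    for _ in range(starts):
        # Each start begins from a random subset.
        current = set(rng.sample(items, k))
        current_score = score(current)
        improved = True
        while improved:
            improved = False
            # Try swapping one selected item for one unselected item.
            for out_item in sorted(current):
                for in_item in items:
                    if in_item in current:
                        continue
                    trial = (current - {out_item}) | {in_item}
                    trial_score = score(trial)
                    if trial_score > current_score:
                        current, current_score = trial, trial_score
                        improved = True
                        break
                if improved:
                    break
        # Keep the best local optimum found across all starts.
        if current_score > best_score:
            best_set, best_score = current, current_score
    return best_set, best_score


# Toy stand-in for classification success: made-up per-item values.
hits = {1: 40, 2: 55, 3: 10, 4: 70, 5: 25, 6: 65}
chosen, total = swap_search(list(hits), k=3, score=lambda s: sum(hits[i] for i in s))
print(sorted(chosen), total)
```

With a real hit-rate objective, different starting points can end at different local optima, which is why many starts are used.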
Now we can ask the small set of MaxDiff questions to new respondents
New respondents will answer with one of the possible patterns of responses and we know which patterns most likely belong to which segments, so we can assign new respondents accordingly
How it works
EXPERIENCE USING THE MAXDIFF TYPING TOOL
Section 3
Percent correct predictions
[Figure: chart of percent correct predictions (vertical axis, 50% to 80%) by number of tasks (horizontal axis, 2 to 10), with one series each for 2, 3, 4, and 5 items per task]
Percent correct predictions
[Figure: chart of overall hit rate (vertical axis, 35% to 60%) by number of questions (horizontal axis, 2 to 8), with one series each for 2, 3, 4, and 5 items]
Fiendishly clever naïve Bayes classifier
[Figure: chart of correct classifications (vertical axis, 150 to 400) by number of sets (horizontal axis, 2 to 10), with one series each for 2, 3, 4, and 5 items]
USING THE SOFTWARE
Section 4
Overview of Using MaxDiff Typing Tool
1. Conduct a full MaxDiff study to develop a segmentation (usually via LC or cluster analysis)
2. Create a Typing Tool MaxDiff Questionnaire using Typing.EXE
3. Field the Typing Tool MaxDiff Questionnaire among new respondents
4. Assign new respondents to previous segments using Classifying.EXE
1. Conduct a full MaxDiff study to develop a segmentation (usually via LC or cluster analysis)
Often you have a full MaxDiff questionnaire with 12-36 items (where each item appears 2+ times)
You have developed a segmentation you like, usually via Latent Class or clustering on HB scores
You also have estimated HB scores for reporting, TURF, or other simulations
Client Wants a Typing Tool
The client really likes the segmentation scheme and wants to be able to assign respondents from new surveys to those same segments with a high degree of accuracy
Input files (can create using Excel or your favorite text editor like Notepad or Wordpad):
Typing.txt: Contains raw HB scores on all the MaxDiff items, segment membership, and (optionally) other survey variables that are highly predictive of segment membership (e.g. age, company size, intended usage)
Segmentscores.txt: Contains segment sizes, raw aggregate (pooled) logit scores for each of the segments on all the MaxDiff items
Params.txt: A file containing control parameters that tells Typing.EXE what to expect in the input files and what to do
2. Create a Typing Tool MaxDiff Questionnaire using Typing.EXE
Input Files (in Notepad)
More Info on Params.txt
Launch Typing.exe!
Fun, fun! The command prompt (the DOS prompt)
Luckily, you don’t need to remember any DOS commands…
Just double-click the LaunchCommandPrompt file, type “Typing”, and press the ENTER key
(Software Demo)
Output File Gives Typing Tool
Log.txt (open with Notepad or Wordpad)
Items to show in MaxDiff typing Questionnaire (e.g. show items 23, 18, and 14 in task #1)
Remember that you can export the MaxDiff design to .CSV, modify it (to insert the typing tool questionnaire design), and re-import it into Lighthouse Studio (you’ll need to “fool” the Designer by telling it to allow designs without connectivity)
Now you’re using Lighthouse Studio’s MaxDiff questions, but with your typing tool questionnaire!
3. Field the Typing Tool MaxDiff Questionnaire among new respondents
4. Assign new respondents to previous segments using Classifying.EXE
Input files (can create using Excel or your favorite text editor like Notepad or Wordpad):
Respdata.txt: Contains respondent answers to the typing tool questionnaire, plus (optionally) responses to additional survey variables that could help assign people into the right segments
Segdata.txt: A file containing average segment MaxDiff scores and average segment responses to the optional survey questions used for classification
Input Files (in Notepad)
Respdata.txt (one line per respondent):
Respondent#, #Survey_Variables, #MaxDiff_Sets, SurveyVariable_Values, #Items_In_Set1, Items_in_Set1, Best_Item_Set1, Worst_Item_Set1, Etc.
Segdata.txt:
Line 1: #Segments, #Survey_Variables, #MaxDiff_Items
Line 2: #Levels_for_Survey_Variables
Line 3: Segment_Size_Seg1, Survey_Variable_Probabilities_Seg1, Seg1_Logit_Scores
Line 4: Segment_Size_Seg2, Survey_Variable_Probabilities_Seg2, Seg2_Logit_Scores
Output File Gives Segment Assignments
Respclass.txt (open with Notepad or Wordpad)
Respondent#, Probability_of_Membership, Segment_Assignment
Classification Look-Up Table
Often clients don’t want to have to come back to you to run Classifying.exe each time they collect new respondents with the typing questionnaire
They want a lookup table that tells them the segment prediction given ANY possible combination of answers to the typing questionnaire
For our questionnaire of 4 sets with 3 items each (plus the optional survey question with a 2-category response), each set has 3 x 2 = 6 possible best/worst answers, so there are just 6x6x6x6x2 = 2592 possible ways that a respondent could answer the typing tool questionnaire plus the survey question
Create the Lookup Table
Create a file of 2592 “respondents” (respdata.txt) who represent all 2592 ways to answer the typing tool questionnaire plus additional survey question
Run Classifying.exe to generate the segment assignment for each of those 2592 “respondents”
Deliver classification lookup table to client
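Enumerating every possible answer pattern is straightforward to script. The sketch below builds all 2592 combinations for a 4-set, 3-item questionnaire plus one 2-category survey question. The item numbers in the design are placeholders (only the first set echoes the earlier example of items 23, 18, and 14), and writing the patterns out in the Respdata.txt layout is left to the reader.

```python
from itertools import permutations, product


def answers_for_set(items):
    """All (best, worst) pairs for one MaxDiff set; best and worst must differ."""
    return list(permutations(items, 2))


# Hypothetical typing-tool design: 4 sets of 3 item numbers each.
design = [(23, 18, 14), (5, 9, 2), (11, 7, 20), (3, 16, 1)]
survey_levels = (1, 2)  # the optional 2-category survey question

# Each pattern holds four (best, worst) pairs followed by the survey answer.
patterns = list(product(*(answers_for_set(s) for s in design), survey_levels))

print(len(answers_for_set(design[0])))  # 3 x 2 answers per set
print(len(patterns))                    # 6 x 6 x 6 x 6 x 2 patterns
print(patterns[0])
```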
QUESTIONS?
Bryan Orme
President
www.sawtoothsoftware.com
+1 801 477 4700
@sawtoothsoft
Keith Chrzan
SVP, Sawtooth Analytics
Lyon, David W. (2016) Naïve Bayes Classifiers: Or, How to Classify via MaxDiff without Doing MaxDiff, paper presented at the Sawtooth Software Conference, Park City.
Orme, Bryan and Rich Johnson (2009) A Procedure for Classifying New Respondents into Existing Segments Using Maximum Difference Scaling, available at: http://www.sawtoothsoftware.com/download/techpap/typing_tools_mrmag.pdf
References