discriminant analysis-market research

Upload: calmchandan

Post on 07-Apr-2018

223 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/4/2019 Discriminant Analysis-Market research

    1/29

    Business Research Method

    Discriminant Analysis

  • 8/4/2019 Discriminant Analysis-Market research

    2/29

    Discriminant Analysis

    Discriminant analysis helps in discriminating between two

    or more sets of objects or people based on the knowledge ofsome of their characteristics

    Discriminate between Bones or skeletons of males or

    females Dividing people into potential buyers or non buyers

    Classifying individuals as good or bad credit risk

    Classifying companies as good or bad investment risks

    Classifying consumers as brand loyal or brand switchers

  • 8/4/2019 Discriminant Analysis-Market research

    3/29

    Similarities and Differences between

    ANOVA, Regression, and Discriminant

    Analysis ANOVA REGRESSION DISCRIMINANT ANALYSIS

    Similarit iesNumber of One One OnedependentvariablesNumber ofindependent Multiple Mult iple Mult iplevariablesDifferencesNature of thedependent Metric Metric CategoricalvariablesNature of theindependent Categorical Metric Metricvariables

  • 8/4/2019 Discriminant Analysis-Market research

    4/29

    Discriminant Analysis

    Discriminant analysis is a technique for analyzing datawhen the criterion or dependent variable is categorical andthepredictor or independent variables are metricin nature.

    The objectives of discriminant analysis are as follows:

    Development ofdiscriminant functions, or linear

    combinations of the predictor or independent variables,which will best discriminate between the categories of thecriterion or dependent variable (groups).

    Classification of cases to one of the groups based on the

    values of the predictor variables. Evaluation of the accuracy of classification.

  • 8/4/2019 Discriminant Analysis-Market research

    5/29

    When the criterion variable has twocategories, the technique is known as

    two-group discriminant analysis. When three or more categories are

    involved, the technique is referred to as

    multiple discriminant analysis.

    Discriminant Analysis

  • 8/4/2019 Discriminant Analysis-Market research

    6/29

    Discriminant Analysis Model

    The discriminant analysis model involves linearcombinations of

    the following form:

    D = b0 + b1X1 + b2X2 + b3X3 + . . . + bkXk where

    D = discriminant score

    b 's = discriminant coefficient or weight

    X's = predictor or independent variable

    The coefficients, or weights (b), are estimated so

    that the groups differ as much as possible on the

    values of the discriminant function.

  • 8/4/2019 Discriminant Analysis-Market research

    7/29

    Conducting Discriminant Analysis

    Assess Validity of Discriminant Analysis

    Estimate the Discriminant Function Coefficients

    Determine the Significance of the Discriminant Function

    Formulate the Problem

    Interpret the Results

  • 8/4/2019 Discriminant Analysis-Market research

    8/29

    Conducting Discriminant Analysis

    Formulate Problem

    Identify the objectives, the dependent variable, and the independentvariables.

    The dependent variable must consist of two or more mutually exclusiveand collectively exhaustive categories. (Gender, Credit Risk,Investment Risk,)

    The independent variables should be selected based on a theoretical

    model or previous research, or the experience of the researcher. Collect data on independent variables for each category of criterion

    variable

    One part of the sample, called the estimation oranalysis sample, isused for estimation of the discriminant function.

    The other part, called the holdoutorvalidation sample, is reserved forvalidating the discriminant function.

    Often the distribution of the number of cases in the analysis andvalidation samples follows the distribution in the total sample.

  • 8/4/2019 Discriminant Analysis-Market research

    9/29

    Example To determine salient characteristics of families that visited a vacation

    resort during last two years

    Data were obtained from a sample of 42 families of which 30 wereincluded in analysis sample & 12 in validation sample

    HHs that visited resort coded as 1 & those that did not as 2

    Both analysis and hold out samples were balanced in terms of visits

    Independent variables were

    --- Family income (V1)

    ---Attitude towards travel measured on a 9 point scale (V2)

    ---Importance attached to family vacation measured on a 9 point scale(V3)

    ---HH Size(V4)

    ---Age of the head of the HH(V5)

  • 8/4/2019 Discriminant Analysis-Market research

    10/29

    Information on Resort Visits: Analysis Sample

    Annual Attitude Importance Household Age of AmountResort Family Toward Attached Size Head ofSpent on No. Visit Income Travelto Family Household Family($000) Vacation Vacation1 1 50.2 5 8 3 43 M (2)2 1 70.3 6 7 4 61 H (3)3 1 62.9 7 5 6 52 H (3)4 1 48.5 7 5 5 36 L (1)5 1 52.7 6 6 4 55 H (3)6 1 75.0 8 7 5 68 H (3)7 1 46.2 5 3 3 62 M (2)

    8 1 57.0 2 4 6 51 M (2)9 1 64.1 7 5 4 57 H (3)10 1 68.1 7 6 5 45 H (3)11 1 73.4 6 7 5 44 H (3)12 1 71.9 5 8 4 64 H (3)13 1 56.2 1 8 6 54 M (2)14 1 49.3 4 2 3 56 H (3)

    15 1 62.0 5 6 2 58 H (3)

  • 8/4/2019 Discriminant Analysis-Market research

    11/29

    Annual Attitude Importance Household Age of AmountResort Family Toward Attached Size Head ofSpent on No. Visit Income Travelto Family Household Family($000) Vacation Vacation

    16 2 32.1 5 4 3 58 L (1)17 2 36.2 4 3 2 55 L (1)18 2 43.2 2 5 2 57 M (2)19 2 50.4 5 2 4 37 M (2)20 2 44.1 6 6 3 42 M (2)21 2 38.3 6 6 2 45 L (1)22 2 55.0 1 2 2 57 M (2)23 2 46.1 3 5 3 51 L (1)

    24 2 35.0 6 4 5 64 L (1)25 2 37.3 2 7 4 54 L (1)26 2 41.8 5 1 3 56 M (2)27 2 57.0 8 3 2 36 M (2)28 2 33.4 6 8 2 50 L (1)29 2 37.5 3 2 3 48 L (1)30 2 41.3 3 3 2 42 L (1)

    Information on Resort Visits: Analysis Sample

  • 8/4/2019 Discriminant Analysis-Market research

    12/29

    Information on Resort Visits:

    Holdout Sample

    Annual Attitude Importance Household Age ofAmount Resort Family Toward Attached Size Head ofSpent on No. Visit IncomeTravel to Family Household Family($000) Vacation Vacation

    1 1 50.8 4 7 3 45 M(2)2 1 63.6 7 4 7 55 H (3)3 1 54.0 6 7 4 58 M(2)4 1 45.0 5 4 3 60 M(2)5 1 68.0 6 6 6 46 H (3)6 1 62.1 5 6 3 56 H (3)7 2 35.0 4 3 4 54 L (1)8 2 49.6 5 3 5 39 L (1)9 2 39.4 6 5 3 44 H (3)10 2 37.0 2 6 5 51 L (1)11 2 54.5 7 3 3 37 M(2)12 2 38.2 2 2 3 49 L (1)

  • 8/4/2019 Discriminant Analysis-Market research

    13/29

    Conducting Discriminant Analysis

    Estimate the Discriminant Function

    Coefficients The direct method involves estimating the discriminant

    function so that all the predictors are included

    simultaneously.

    In stepwise discriminant analysis, the predictor variables

    are entered sequentially, based on their ability to

    discriminate among groups.

  • 8/4/2019 Discriminant Analysis-Market research

    14/29

    Results of Two-Group Discriminant

    AnalysisGROUP MEANSVISIT INCOME TRAVEL VACATION HSIZE AGE

    1 60.52000 5.40000 5.80000 4.33333 53.733332 41.91333 4.33333 4.06667 2.80000 50.13333Total 51.21667 4.86667 4.9333 3.56667 51.93333

    Group Standard Deviations

    1 9.83065 1.91982 1.82052 1.23443 8.77062

    2 7.55115 1.95180 2.05171 .94112 8.27101Total 12.79523 1.97804 2.09981 1.33089 8.57395

    Pooled Within-Groups Correlation MatrixINCOME TRAVEL VACATION HSIZE AGE

    INCOME 1.00000TRAVEL 0.19745 1.00000VACATION 0.09148 0.08434 1.00000HSIZE 0.08887 -0.01681 0.07046 1.00000

    AGE - 0.01431 -0.19709 0.01742 -0.04301 1.00000Wilks' (U-statistic) and univariate F ratio with 1 and 28 degrees of freedom

    Variable Wilks' F Significance

    INCOME 0.45310 33.800 0.0000TRAVEL 0.92479 2.277 0.1425VACATION 0.82377 5.990 0.0209HSIZE 0.65672 14.640 0.0007

    AGE 0.95441 1.338 0.2572 Contd.

  • 8/4/2019 Discriminant Analysis-Market research

    15/29

    Interpretation When Predictors are considered individually

    only Income, Importance of vacation &HH sizesignificantly differentiate between those who

    visited resort & those who did not( F ratio with

    k=1 & n-k-1= 30-1-1=28 d.f )

    Wilks lambda (also called U statistics) is ratioof within group SS to total SS. Its value varies

    between 0 to 1

    Small value of lambda indicate that groupmeans are different & a better discrimination

    power of that variable

  • 8/4/2019 Discriminant Analysis-Market research

    16/29

    Conducting Discriminant Analysis

    Determine the Significance of Discriminant Function

    The null hypothesis that, in thepopulation, the means of discriminant

    functions in both groups are equal can

    be statistically tested.

    In SPSS this test is based on Wilks' .

    If the null hypothesis is rejected,indicating significant discrimination,

    one can proceed to interpret the results.

  • 8/4/2019 Discriminant Analysis-Market research

    17/29

    Results of Two-Group Discriminant Analysis

    cont.

    CANONICAL DISCRIMINANT FUNCTIONS% of Cum Canonical After Wilks'Func tion Eigenvalue Va riance % Corr el at ion Funct ion Chi-square df Significance: 0 0 .3589 26.130 5 0.00011* 1.7862 100.00 100.00 0.8007 :

    * marks the 1 canonical discriminant functions remaining in the analysis.Standard Canonical Discriminant Function Coefficients

    FUNC 1 INCOME 0.74301TRAVEL 0.09611VACATION 0.23329HSIZE 0.46911AGE 0.20922Structure Matrix:

    Pooled within-groups correlations between discriminating variables & canonical discriminant functions(variables ordered by size of correlation within function)FUNC 1 INCOME 0.82202HSIZE 0.54096VACATION 0.34607TRAVEL 0.21337AGE 0.16354

    Contd.

  • 8/4/2019 Discriminant Analysis-Market research

    18/29

    Results of Two-Group Discriminant Analysis

    cont.

    Unstandardized Canonical Discriminant Function CoefficientsFUNC 1INCOME 0.8476710E-01TRAVEL 0.4964455E-01VACATION 0.1202813HSIZE 0.4273893

    AGE 0.2454380E-01(constant) -7.975476Canonical discriminant functions evaluated at group means (group centroids)Group FUNC 11 1.291182 -1.29118Classification results for cases selected for use in analysis

    Predicted Group MembershipActual Group No. of Cases 1 2Group 1 15 12 380.0% 20.0%Group 2 15 0 150.0% 100.0%

    Percent of grouped cases correctly classified: 90.00% Contd.

  • 8/4/2019 Discriminant Analysis-Market research

    19/29

    Interpretation

    Since 0.0001 is less than .05 we reject the nullhypothesis of equality of group means indicating

    better discriminating power of the discriminantfunction

    The unstandardised discriminant function is

    D= -7.975476 +

    +0.8476710E-01(INCOME)

    +0.4964455E-01(TRAVEL)

    +0.1202813(VACATION)+0.4273893(HSIZE)

    +0.2454380E-01(AGE)

  • 8/4/2019 Discriminant Analysis-Market research

    20/29

    Classification Using Model

    Group Centroids are values of discriminant function at Group

    Means The average of the two Group Centroids gives the cut off

    point

    The Group Centroids For the Resort Example are:

    Group Centroid 1 1.29118

    2 -1.29118

    The average of two centroids is 0

    Therefore any positive value of discriminant Score willlead to Classification as Resort Visit & negative valueto No Resort Visit

    Classification Using Model

  • 8/4/2019 Discriminant Analysis-Market research

    21/29

    Classification Using Model Annual Attitude Importance Household Age of

    Resort Family Toward Attached Size Head of

    No. Visit Income Travel to Family

    Household

    ($000) Vacation

    1 2 50.8 4 7 3 45

    Value Of Dicriminant Function For the Above Individual Would Be:

    D= -7.975476 +

    +0.8476710E-01(50.8)

    +0.4964455E-01(4)

    +0.1202813(7)+0.4273893(3)

    +0.2454380E-01(45)

    = - 0.242

  • 8/4/2019 Discriminant Analysis-Market research

    22/29

    Results of Two-Group Discriminant Analysiscont.

    Classification Results for cases not selected for use in the analysis (holdout sample)Predicted Group Membersh ipActual Group No. of Cases 1 2

    Group 1 6 4 266.7% 33.3%Group 2 6 0 60.0% 100.0%

    Percent of grouped cases correctly classified: 83.33%.

  • 8/4/2019 Discriminant Analysis-Market research

    23/29

    Interpretation of Results

    Find out the percentage of cases correctly

    classified by the model

    Find out variables which are relatively

    better in discriminating between groups

    How to classify a new subject into one of

    the groups

  • 8/4/2019 Discriminant Analysis-Market research

    24/29

    Conducting Discriminant Analysis

    Interpret the Results

    We can obtain some idea of the relative importance of thevariables by examining the absolute magnitude of thestandardized discriminant function coefficients.

    Some idea of the relative importance of the predictors can

    also be obtained by examining the structure correlations,also called canonical loadings ordiscriminant loadings.These simple correlations between each predictor and thediscriminant function represent the variance that the

    predictor shares with the function. Another aid to interpreting discriminant analysis results is

    to develop a characteristic profile for each group bydescribing each group in terms of the group means for the

    predictor variables.

  • 8/4/2019 Discriminant Analysis-Market research

    25/29

    Conducting Discriminant Analysis

    Access Validity of Discriminant Analysis

    The discriminant weights, estimated by using theanalysis sample, are multiplied by the values of the

    predictor variables in the holdout sample to generate

    discriminant scores for the cases in the holdout

    sample.

    The cases are then assigned to groups based on

    their discriminant scores and an appropriate decision

    rule. The hit ratio, or the percentage of casescorrectly classified, can then be determined by

    summing the diagonal elements and dividing by the

    total number of cases.

  • 8/4/2019 Discriminant Analysis-Market research

    26/29

    Stepwise Discriminant Analysis

    Stepwise discriminant analysis is analogous to stepwise

    multiple regression in that the predictors are enteredsequentially based on their ability to discriminate betweenthe groups.

    AnFratio is calculated for each predictor by conducting aunivariate analysis of variance in which the groups are

    treated as the categorical variable and the predictor as thecriterion variable.

    The predictor with the highestFratio is the first to beselected for inclusion in the discriminant function, if it

    meets certain significance and tolerance criteria. A second predictor is added based on the highest adjusted orpartialFratio, taking into account the predictor alreadyselected.

  • 8/4/2019 Discriminant Analysis-Market research

    27/29

    Each predictor selected is tested for retention based on itsassociation with other predictors selected.

    The process of selection and retention is continued until all

    predictors meeting the significance criteria for inclusion andretention have been entered in the discriminant function.

    The selection of the stepwise procedure is based on theoptimizing criterion adopted. The Mahalanobis procedureis based on maximizing a generalized measure of thedistance between the two closest groups.

    The order in which the variables were selected alsoindicates their importance in discriminating between thegroups.

    Stepwise Discriminant Analysis

  • 8/4/2019 Discriminant Analysis-Market research

    28/29

    SPSS Windows

    The DISCRIMINANT program performs both two-group

    and multiple discriminant analysis. To select this procedure

    using SPSS for Windows click:

    Analyze>Classify>Discriminant

  • 8/4/2019 Discriminant Analysis-Market research

    29/29

    Eigen Value =

    Between S S/ Within SS

    Wilk,s Lamda= WithinSS/TotalSS

    Canonical R = Correlation beween

    estimated Y & actual Y