9 discriminant analysis

Upload: eduson2013

Post on 03-Apr-2018


  • 7/28/2019 9 Discriminant Analysis

    1/18

    Discriminant analysis

    Dr James Abdey

    Overview

    Discriminant analysis

    Discriminant analysis model

    Discriminant analysis statistics

    Formulate the problem

    Estimate the discriminant function coefficients

    Determine the significance of discriminant functions

    Interpret the results

    Assess validity of discriminant analysis

    Applied Marketing (Market Research Methods)

    Topic 9:

    Discriminant analysis

    Dr James Abdey

    Overview

    We discuss the technique of discriminant analysis, initially by examining its relationship to regression analysis

    Modelling of discriminant analysis is presented, along with formulation, estimation, significance, interpretation and validation of results

    Two-group and multiple-group discriminant analyses are introduced

    Discriminant analysis

    Discriminant analysis is a technique for analysing data when the dependent variable is categorical and the independent variables are measurable

    The main objectives of discriminant analysis are:

    Development of discriminant functions, i.e. linear combinations of the independent variables, which will best discriminate between the categories (groups) of the dependent variable

    Checking whether significant differences exist among the groups, in terms of the independent variables

    Determination of which independent variables contribute to most of the intergroup differences

    Classification of cases to one of the groups based on the values of the independent variables, and determining the accuracy of classification

    When the dependent variable has two categories, the technique is called two-group discriminant analysis

    When three or more categories are involved, the technique is called multiple discriminant analysis

    In the two-group case, it is possible to derive only one discriminant function

    In the multiple-group case, more than one function may be computed

    In general, with M groups and k independent variables, it is possible to estimate up to the smaller of M − 1 and k discriminant functions

    The first function has the highest ratio of between-groups to within-groups sum of squares

    The second function, uncorrelated with the first, has the second highest ratio, and so on

    However, not all the functions may be statistically significant
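
    The limit on the number of discriminant functions can be checked with a one-line helper. This is only an illustrative sketch; the function name is hypothetical and not part of the lecture.

```python
def max_discriminant_functions(n_groups: int, n_predictors: int) -> int:
    """Upper bound on the number of discriminant functions: min(M - 1, k)."""
    if n_groups < 2 or n_predictors < 1:
        raise ValueError("need at least two groups and one predictor")
    return min(n_groups - 1, n_predictors)


# Two groups always give a single function, however many predictors are used.
print(max_discriminant_functions(2, 5))
# With four groups but only two predictors, the predictors are the limit.
print(max_discriminant_functions(4, 2))
```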

    Discriminant analysis model

    The discriminant analysis model involves linear combinations of the following form:

    $D = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k$

    where

    $D$ = discriminant score

    $\beta$'s = discriminant coefficients

    $X$'s = independent variables

    The coefficients, $\beta$, are estimated so that the groups differ as much as possible on the values of the discriminant function

    This occurs when the ratio of between-group sum of squares to within-group sum of squares for the discriminant scores is at a maximum
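
    To make the criterion concrete, the sketch below computes discriminant scores for two candidate sets of coefficients and compares their between-group to within-group sum-of-squares ratios. The data, coefficients and function names are all hypothetical. It only evaluates the ratio for given coefficients and does not estimate the maximising coefficients.

```python
def discriminant_score(intercept, coefs, x):
    """D = intercept + sum of coefficient * predictor value."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))


def between_within_ratio(scores, groups):
    """Between-group SS divided by within-group SS of the scores."""
    grand_mean = sum(scores) / len(scores)
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)
    between = within = 0.0
    for values in by_group.values():
        mean = sum(values) / len(values)
        between += len(values) * (mean - grand_mean) ** 2
        within += sum((v - mean) ** 2 for v in values)
    return between / within


# Hypothetical cases: (X1, X2) and the group each case belongs to.
X = [(1, 5), (2, 6), (3, 5), (6, 6), (7, 5), (8, 6)]
groups = ["A", "A", "A", "B", "B", "B"]

for coefs in [(1.0, 0.0), (0.0, 1.0)]:
    scores = [discriminant_score(0.0, coefs, x) for x in X]
    print(coefs, between_within_ratio(scores, groups))
```

    In this toy data the first set of weights separates the groups far better than the second, which is what a larger ratio indicates.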

    Discriminant analysis statistics

    Canonical correlation: the extent of association between the discriminant scores and the groups. It is a measure of association between a discriminant function and the set of dummy variables that define the group membership

    Centroid: the mean value of the discriminant scores for a particular group. There are as many centroids as there are groups, with one centroid per group

    Classification matrix: contains the number of correctly classified and misclassified cases
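
    As a rough illustration of the last two definitions, the following sketch computes group centroids from a set of discriminant scores and tabulates a classification matrix. The scores, the cutoff at 0 and the helper names are assumptions made for the example only.

```python
from collections import defaultdict


def group_centroids(scores, groups):
    """Mean discriminant score per group (one centroid per group)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for score, group in zip(scores, groups):
        totals[group] += score
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}


def classification_matrix(actual, predicted, labels):
    """Rows are actual groups, columns are predicted groups."""
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


# Hypothetical discriminant scores and true group memberships.
scores = [-2.0, -1.0, 0.5, 1.0, 2.5, -0.5]
actual = ["A", "A", "A", "B", "B", "B"]
# Assumed decision rule for illustration: negative score -> A, otherwise B.
predicted = ["A" if s < 0 else "B" for s in scores]

print(group_centroids(scores, actual))
print(classification_matrix(actual, predicted, ["A", "B"]))
```

    The diagonal cells of the matrix count correctly classified cases; the off-diagonal cells count misclassifications.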

    Discriminant function coefficients: the (unstandardised) multipliers of the independent variables, when the variables are in the original units of measurement

    Discriminant scores: the discriminant function coefficients are multiplied by the values of the respective independent variables. These products are summed and added to the constant term to obtain the discriminant scores

    Eigenvalue: for each discriminant function, the eigenvalue is the ratio of between-group to within-group sums of squares. Larger eigenvalues indicate better functions
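
    Eigenvalues are unbounded, so they are often re-expressed on a 0 to 1 scale. The sketch below uses the standard identity that the squared canonical correlation equals eigenvalue / (1 + eigenvalue), and also reports each function's share of the summed eigenvalues. The eigenvalues and function names are hypothetical.

```python
import math


def canonical_correlation(eigenvalue):
    """Canonical correlation implied by an eigenvalue: sqrt(ev / (1 + ev))."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def share_of_discrimination(eigenvalues):
    """Each function's eigenvalue as a proportion of the total."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


# Hypothetical eigenvalues for two discriminant functions.
eigenvalues = [3.0, 1.0]

for ev in eigenvalues:
    print(ev, round(canonical_correlation(ev), 4))
print(share_of_discrimination(eigenvalues))
```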

    Group means and group standard deviations: computed for each independent variable for each group

    Standardised discriminant function coefficients: used as the multipliers when the independent variables have been standardised, i.e. have a mean of 0 and a variance of 1

    Structure correlations: simple correlations between the predictors and the discriminant function
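
    A minimal sketch of structure correlations, taken here as plain Pearson correlations between each predictor and the discriminant scores over all cases. The data and the coefficients (1.0, 0.5) are hypothetical. Some packages, SPSS among them, report pooled within-group correlations instead, so values from software may differ from this simple version.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical predictors and assumed discriminant coefficients.
x1 = [1, 2, 3, 6, 7, 8]
x2 = [5, 6, 5, 6, 5, 6]
scores = [1.0 * a + 0.5 * b for a, b in zip(x1, x2)]

print("X1:", round(pearson(x1, scores), 3))
print("X2:", round(pearson(x2, scores), 3))
```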

    Total correlation matrix: treating the cases as a single sample, a total correlation matrix is obtained

    Wilks' $\lambda$: for each independent variable, Wilks' $\lambda$ is the ratio of the within-group sum of squares to the total sum of squares. Its value varies between 0 and 1. Large values of $\lambda$ (near 1) indicate that the group means do not seem to be different. Small values of $\lambda$ (near 0) indicate that the group means do seem to be different
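
    The univariate ratio of within-group to total sum of squares can be computed directly. The sketch below does this for two hypothetical predictors, one whose group means differ clearly and one whose group means are nearly equal. The function name is illustrative.

```python
def univariate_wilks_lambda(values, groups):
    """Within-group SS divided by total SS for one independent variable."""
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    within_ss = 0.0
    for members in by_group.values():
        mean = sum(members) / len(members)
        within_ss += sum((v - mean) ** 2 for v in members)
    return within_ss / total_ss


groups = ["A", "A", "A", "B", "B", "B"]
x1 = [1, 2, 3, 6, 7, 8]   # group means 2 and 7: well separated
x2 = [5, 6, 5, 6, 5, 6]   # group means almost identical

print("X1:", round(univariate_wilks_lambda(x1, groups), 4))  # near 0
print("X2:", round(univariate_wilks_lambda(x2, groups), 4))  # near 1
```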

    Formulate the problem

    Identify the objectives, the dependent variable and the independent variables

    The dependent variable must consist of two or more mutually exclusive and collectively exhaustive categories

    The independent variables should be selected based on a theoretical model or previous research, or the experience of the researcher

    One part of the sample, called the estimation sample, is used for estimation of the discriminant function

    The other part, called the validation sample, is reserved for validating the discriminant function

    Often the distribution of the number of cases across groups in the estimation and validation samples follows the distribution in the total sample
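
    One way to obtain estimation and validation samples whose group proportions mirror the total sample is a stratified random split. The sketch below is an assumption about how this might be done, not a procedure given in the lecture; the function name, the 50/50 fraction and the seed are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(groups, validation_fraction=0.5, seed=0):
    """Split case indices so each group keeps roughly the same share
    in the estimation and validation samples."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    estimation, validation = [], []
    for indices in by_group.values():
        rng.shuffle(indices)
        n_validation = round(len(indices) * validation_fraction)
        validation.extend(indices[:n_validation])
        estimation.extend(indices[n_validation:])
    return sorted(estimation), sorted(validation)


# Hypothetical sample: 6 cases in group A and 4 in group B.
groups = ["A"] * 6 + ["B"] * 4
estimation, validation = stratified_split(groups, validation_fraction=0.5, seed=1)
print("estimation:", estimation)
print("validation:", validation)
```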

    Estimate the discriminant function coefficients

    The direct method involves estimating the discriminant function so that all the independent variables are included simultaneously

    In stepwise discriminant analysis, the independent variables are entered sequentially, based on their ability to discriminate among the groups

    Determine the significance of discriminant functions

    The null hypothesis that, in the population, the means of all discriminant functions in all groups are equal can be statistically tested

    In SPSS this test is based on Wilks' $\lambda$. If several functions are tested simultaneously (as in the case of multiple discriminant analysis), the Wilks' $\lambda$ statistic is the product of the univariate $\lambda$ for each function

    If the null hypothesis is rejected, indicating significant discrimination, one can proceed to interpret the results
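
    The product rule for the joint Wilks' statistic can be sketched as follows. It relies on the standard identity that the univariate value for a single function is 1 / (1 + eigenvalue). The eigenvalues and the function name are hypothetical, and the sketch stops at the statistic itself rather than the significance test built on it.

```python
import math


def joint_wilks_lambda(eigenvalues):
    """Product over functions of 1 / (1 + eigenvalue)."""
    return math.prod(1.0 / (1.0 + ev) for ev in eigenvalues)


# Hypothetical eigenvalues of two discriminant functions.
eigenvalues = [1.0, 3.0]

print(joint_wilks_lambda(eigenvalues))      # both functions together
print(joint_wilks_lambda(eigenvalues[1:]))  # second function only
```

    A value near 0 suggests the functions jointly separate the groups; a value near 1 suggests they do not.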

    Interpret the results

    The interpretation of the discriminant coefficients is similar to that in multiple regression analysis

    Given the multicollinearity in the independent variables, there is no unambiguous measure of the relative importance of the independent variables in discriminating between the groups

    Nevertheless, we can obtain some idea of the relative importance of the variables by examining the absolute magnitude of the standardised discriminant function coefficients

    Some idea of the relative importance of the independent variables can also be obtained by examining the structure correlations

    These simple correlations between each independent variable and the discriminant function represent the variance that the independent variable shares with the function

    Another aid to interpreting discriminant analysis results is to develop a characteristic profile for each group by describing each group in terms of the group means for the independent variables

    Assess validity of discriminant analysis

    The discriminant coefficients, estimated by using the estimation sample, are multiplied by the values of the independent variables in the validation sample to generate discriminant scores for the cases in the validation sample

    The cases are then assigned to groups based on their discriminant scores and an appropriate decision rule
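
    The validation step can be sketched as below. The lecture does not say which decision rule to use; this example assumes a nearest-centroid rule, which for two equally sized groups amounts to a cutoff midway between the centroids. The coefficients, centroids and validation cases are all hypothetical.

```python
def discriminant_score(intercept, coefs, x):
    """Score a case with coefficients estimated on the estimation sample."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))


def assign_group(score, centroids):
    """Assumed decision rule: pick the group whose centroid is closest."""
    return min(centroids, key=lambda group: abs(score - centroids[group]))


# Hypothetical estimates from the estimation sample.
intercept, coefs = -4.5, (1.0, 0.0)
centroids = {"A": -2.5, "B": 2.5}

# Hypothetical validation cases: (X1, X2).
validation_cases = [(2, 5), (7, 6), (4, 5), (5, 6)]

for case in validation_cases:
    d = discriminant_score(intercept, coefs, case)
    print(case, d, assign_group(d, centroids))
```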

    The hit ratio, or the percentage of cases correctly classified, can then be determined by summing the diagonal elements of the classification matrix and dividing by the total number of cases

    It is helpful to compare the percentage of cases correctly classified by discriminant analysis to the percentage that would be obtained by chance

    Classification accuracy achieved by discriminant analysis should be at least 25% greater than that obtained by chance
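
    A minimal sketch of the hit ratio and the chance comparison. Two assumptions are made that the lecture does not spell out: chance accuracy is taken as the proportional chance criterion (the sum of squared group proportions), and "25% greater" is read as 1.25 times the chance accuracy. The matrix values and function names are hypothetical.

```python
def hit_ratio(matrix):
    """Sum of diagonal cells divided by the total number of cases."""
    correct = sum(matrix[group][group] for group in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


def proportional_chance(matrix):
    """Assumed chance benchmark: sum of squared actual group proportions."""
    sizes = [sum(row.values()) for row in matrix.values()]
    total = sum(sizes)
    return sum((size / total) ** 2 for size in sizes)


# Hypothetical classification matrix: rows actual, columns predicted.
matrix = {
    "A": {"A": 12, "B": 3},
    "B": {"A": 2, "B": 13},
}

hits = hit_ratio(matrix)
chance = proportional_chance(matrix)
threshold = 1.25 * chance  # assumed reading of "25% greater than chance"
print(f"hit ratio {hits:.3f}, chance {chance:.3f}, threshold {threshold:.3f}")
print("acceptable" if hits >= threshold else "not acceptable")
```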
