multivariate data analysis
1
Chapter 1
Introduction
Copyright © 2007 Prentice-Hall, Inc.
2
Chapter 1: Introduction
LEARNING OBJECTIVES:
Upon completing this chapter, you should be able to do the following:
1. Explain what multivariate analysis is and when its application is appropriate.
2. Define and discuss the specific techniques included in multivariate analysis.
3. Determine which multivariate technique is appropriate for a specific research problem.
4. Discuss the nature of measurement scales and their relationship to multivariate techniques.
5. Describe the conceptual and statistical issues inherent in multivariate analyses.
3
What is Multivariate Analysis?
• What is it? Multivariate data analysis = all statistical methods that simultaneously analyze multiple measurements on each individual or object under investigation.
• Why use it?
  o Measurement
  o Explanation & Prediction
  o Hypothesis Testing
4
Basic Concepts of Multivariate Analysis
• The Variate
• Measurement Scales
  o Nonmetric
  o Metric
• Multivariate Measurement
• Measurement Error
• Types of Techniques
5
The Variate
• The variate is a linear combination of variables with empirically determined weights.
• Weights are determined to best achieve the objective of the specific multivariate technique.
• Variate equation: $Y' = W_1 X_1 + W_2 X_2 + \dots + W_n X_n$
• Each respondent has a variate value (Y’).
• The Y’ value is a linear combination of the entire set of variables. In a dependence technique such as multiple regression, it is the predicted value of the dependent variable.
• Potential Independent Variables:
  o X1 = income
  o X2 = education
  o X3 = family size
  o X4 = ??
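As a minimal sketch of the variate equation, the Python snippet below computes Y’ for one respondent. The weights and the respondent's values are hypothetical and chosen only for illustration; in practice the weights are estimated from data by the multivariate technique being applied.

```python
def variate(weights, values):
    """Return Y' = W1*X1 + W2*X2 + ... + Wn*Xn for one respondent."""
    if len(weights) != len(values):
        raise ValueError("need exactly one weight per variable")
    return sum(w * x for w, x in zip(weights, values))


# Hypothetical weights for X1 = income (in $1,000s), X2 = education (years),
# and X3 = family size. These are illustrative assumptions, not estimates.
weights = [0.5, 2.0, -1.0]
respondent = [40, 16, 3]

y_prime = variate(weights, respondent)
print(y_prime)  # 0.5*40 + 2.0*16 - 1.0*3
```

Each respondent gets one such Y’ value, so applying `variate` to every row of a data set yields the variate scores used in later stages of the analysis.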
6
Types of Data and Measurement Scales
Data
• Nonmetric or Qualitative
  o Nominal Scale
  o Ordinal Scale
• Metric or Quantitative
  o Interval Scale
  o Ratio Scale
7
Measurement Scales
• Nonmetric
  o Nominal – size of number is not related to the amount of the characteristic being measured.
  o Ordinal – larger numbers indicate more (or less) of the characteristic measured, but not how much more (or less).
• Metric
  o Interval – contains ordinal properties, and in addition, there are equal differences between scale points.
  o Ratio – contains interval scale properties, and in addition, there is a natural zero point.
NOTE: The level of measurement is critical in determining the appropriate multivariate technique to use!
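The four scale definitions can be written down as a small lookup table. The sketch below encodes each scale's properties as stated on the slide; the example variables and their assigned scales are hypothetical illustrations.

```python
# Properties of each measurement scale, following the slide's definitions.
SCALES = {
    "nominal":  {"metric": False, "ordered": False, "equal_intervals": False, "natural_zero": False},
    "ordinal":  {"metric": False, "ordered": True,  "equal_intervals": False, "natural_zero": False},
    "interval": {"metric": True,  "ordered": True,  "equal_intervals": True,  "natural_zero": False},
    "ratio":    {"metric": True,  "ordered": True,  "equal_intervals": True,  "natural_zero": True},
}


def is_metric(scale):
    """True for interval and ratio scales, False for nominal and ordinal."""
    return SCALES[scale]["metric"]


# Hypothetical example variables and the scale each is assumed to use.
example_variables = {
    "region": "nominal",
    "preference rank": "ordinal",
    "temperature in Celsius": "interval",
    "annual income": "ratio",
}

for name, scale in example_variables.items():
    kind = "metric" if is_metric(scale) else "nonmetric"
    print(f"{name}: {scale} scale ({kind})")
```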
8
Measurement Error
• All variables have some error. What are the sources of error?
• Measurement error = distorts observed relationships and makes multivariate techniques less powerful.
• Researchers use summated scales, for which several variables are summed or averaged together to form a composite representation of a concept.
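A summated scale can be sketched in a few lines of Python. The respondent IDs and item scores below are hypothetical; the sketch assumes three items that measure the same concept on a common 1–7 scale and averages them into one composite score per respondent.

```python
from statistics import mean


def summated_scale(item_scores):
    """Average several item scores into one composite score."""
    return mean(item_scores)


# Hypothetical responses: three items per respondent, each rated 1-7.
responses = {
    "r1": [6, 7, 5],
    "r2": [3, 4, 2],
}

scores = {rid: summated_scale(items) for rid, items in responses.items()}
print(scores)
```

Averaging rather than summing keeps the composite on the same 1–7 range as the individual items, which makes it easier to interpret.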
9
Measurement Error
In addressing measurement error, researchers evaluate two important characteristics of measurement:
• Validity = the degree to which a measure accurately represents what it is supposed to.
• Reliability = the degree to which the observed variable measures the “true” value and is thus error free.
10
Statistical Significance and Power
• Type I error, or α, is the probability of rejecting the null hypothesis when it is true.
• Type II error, or β, is the probability of failing to reject the null hypothesis when it is false.
• Power, or 1-β, is the probability of rejecting the null hypothesis when it is false.

                     H0 true               H0 false
Fail to Reject H0    1-α                   β (Type II error)
Reject H0            α (Type I error)      1-β (Power)
11
Power is Determined by Three Factors:
• Effect size: the actual magnitude of the effect of interest (e.g., the difference between means or the correlation between variables).
• Alpha (α): as α is set at smaller levels, power decreases. Typically, α = .05.
• Sample size: as sample size increases, power increases. With very large sample sizes, even very small effects can be statistically significant, raising the issue of practical significance vs. statistical significance.
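The three factors can be seen at work in a small calculation. The sketch below is an illustration under stated assumptions, not the textbook's own procedure: it assumes a two-sided, one-sample z-test with known variance and a standardized effect size d (the mean difference divided by the standard deviation), for which power is Φ(−z₁₋α/₂ + d√n) + Φ(−z₁₋α/₂ − d√n). The function name `z_test_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test for a standardized effect size."""
    z = NormalDist()                       # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)      # two-sided critical value
    shift = effect_size * sqrt(n)          # mean of the test statistic under H1
    return z.cdf(-z_crit + shift) + z.cdf(-z_crit - shift)


# Larger samples raise power; a stricter alpha lowers it.
for n in (10, 32, 100):
    print(n, round(z_test_power(0.5, n), 3))
print(round(z_test_power(0.5, 32, alpha=0.01), 3))
```

With an effect size of zero the function returns α itself, which matches the definition of the Type I error rate.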
12
Figure 1-1 Impact of Sample Size on Power
13
Rules of Thumb 1–1
Statistical Power Analysis
• Researchers should always design the study to achieve a power level of .80 at the desired significance level.
• More stringent significance levels (e.g., .01 instead of .05) require larger samples to achieve the desired power level.
• Conversely, power can be increased by choosing a less stringent alpha level (e.g., .10 instead of .05).
• Smaller effect sizes always require larger sample sizes to achieve the desired power.
• Any increase in power is most likely achieved by increased sample size.
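These rules of thumb can be checked numerically. The sketch below assumes the same simplified setting as a two-sided, one-sample z-test with standardized effect size d, and uses the common approximation n ≈ ((z₁₋α/₂ + z_power) / d)², which ignores the negligible far tail. The function name `required_n` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_n(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size for a two-sided one-sample z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    z_power = z.inv_cdf(power)           # quantile matching the target power
    return ceil(((z_alpha + z_power) / effect_size) ** 2)


# A stricter alpha or a smaller effect size both demand a larger sample.
for alpha in (0.10, 0.05, 0.01):
    print("alpha", alpha, "n", required_n(0.5, alpha=alpha))
for d in (0.8, 0.5, 0.2):
    print("effect size", d, "n", required_n(d))
```

Other designs (for example, two-group comparisons or regression) use different formulas, so these numbers only illustrate the direction of each rule.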
14
Types of Multivariate Techniques
• Dependence techniques: a variable or set of variables is identified as the dependent variable to be predicted or explained by other variables known as independent variables.
  o Multiple Regression
  o Multiple Discriminant Analysis
  o Logit/Logistic Regression
  o Multivariate Analysis of Variance (MANOVA) and Covariance
  o Conjoint Analysis
  o Canonical Correlation
  o Structural Equations Modeling (SEM)
15
Types of Multivariate Techniques
• Interdependence techniques: involve the simultaneous analysis of all variables in the set, without distinction between dependent variables and independent variables.
  o Principal Components and Common Factor Analysis
  o Cluster Analysis
  o Multidimensional Scaling (perceptual mapping)
  o Correspondence Analysis
16
Selecting a Multivariate Technique
1. What type of relationship is being examined – dependence or interdependence?
2. Dependence relationship: How many variables are being predicted?
   o What is the measurement scale of the dependent variable?
   o What is the measurement scale of the predictor variable?
3. Interdependence relationship: Are you examining relationships between variables, respondents, or objects?
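The selection questions can be sketched as a simple lookup. The mapping below is an assumption assembled from the one-line technique descriptions in this chapter: it covers only the cases those descriptions spell out, leaves out conjoint analysis and SEM, and the pairing of variables, respondents, and objects with specific interdependence techniques is an illustrative reading rather than a rule stated on the slide. The function name `suggest_techniques` is hypothetical.

```python
# (number of dependent variables, dependent scale, predictor scale) -> techniques
DEPENDENCE = {
    ("one", "metric", "metric"): ["multiple regression"],
    ("one", "nonmetric", "metric"): ["discriminant analysis", "logistic regression"],
    ("several", "metric", "nonmetric"): ["MANOVA"],
    ("several", "metric", "metric"): ["canonical correlation"],
}

# What the relationships are examined between -> techniques (assumed pairing)
INTERDEPENDENCE = {
    "variables": ["factor analysis"],
    "respondents": ["cluster analysis"],
    "objects": ["multidimensional scaling", "correspondence analysis"],
}


def suggest_techniques(relationship, n_dependent=None, dependent_scale=None,
                       predictor_scale=None, focus=None):
    """Return candidate techniques, or an empty list if no case matches."""
    if relationship == "dependence":
        count = "one" if n_dependent == 1 else "several"
        return DEPENDENCE.get((count, dependent_scale, predictor_scale), [])
    if relationship == "interdependence":
        return INTERDEPENDENCE.get(focus, [])
    raise ValueError("relationship must be 'dependence' or 'interdependence'")


print(suggest_techniques("dependence", 1, "metric", "metric"))
print(suggest_techniques("dependence", 3, "metric", "nonmetric"))
print(suggest_techniques("interdependence", focus="respondents"))
```

An empty result means the combination is not covered by this simplified table, not that no technique exists for it.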
17
Multiple Regression
A single metric dependent variable is predicted by several metric independent variables.
18
Discriminant Analysis
A non-metric (categorical) dependent variable is predicted by several metric independent variables.
Examples:
• Gender – Male vs. Female
• Heavy Users vs. Light Users
• Purchasers vs. Non-purchasers
• Good Credit Risk vs. Poor Credit Risk
• Member vs. Non-Member
19
Logistic Regression
A single nonmetric dependent variable is predicted by several metric independent variables. This technique is similar to discriminant analysis, but relies on calculations more like regression.
20
MANOVA
Several metric dependent variables are predicted by a set of nonmetric (categorical) independent variables.
21
CANONICAL ANALYSIS
Several metric dependent variables are predicted by several metric independent variables.
22
CONJOINT ANALYSIS
. . . is used to understand respondents’ preferences for products and services.
In doing this, it determines the importance of both:
• attributes and
• levels of attributes
. . . based on a smaller subset of combinations of attributes and levels.
23
CONJOINT ANALYSIS
Typical Applications:
• Soft Drinks
• Candy Bars
• Cereals
• Beer
• Apartment Buildings; Condos
• Solvents; Cleaning Fluids
24
Structural Equations Modeling (SEM)
Estimates multiple, interrelated dependence relationships based on two components:
1. Structural Model
2. Measurement Model
25
Factor Analysis
. . . analyzes the structure of the interrelationships among a large number of variables to determine a set of common underlying dimensions (factors).
26
Cluster Analysis
. . . groups objects (respondents, products, firms, variables, etc.) so that each object is similar to the other objects in the cluster and different from objects in all the other clusters.
27
Multidimensional Scaling
. . . identifies “unrecognized” dimensions that affect purchase behavior based on customer judgments of:
• similarities or
• preferences
and transforms these into distances represented as perceptual maps.
28
Correspondence Analysis
. . . uses non-metric data and evaluates either linear or non-linear relationships in an effort to develop a perceptual map representing the association between objects (firms, products, etc.) and a set of descriptive characteristics of the objects.
29
Guidelines for Multivariate Analysis
• Establish Practical Significance as Well as Statistical Significance.
• Sample Size Affects All Results.
• Know Your Data.
• Strive for Model Parsimony.
• Look at Your Errors.
• Validate Your Results.
30
A Structured Approach to Multivariate Model Building:
Stage 1: Define the Research Problem, Objectives, and Multivariate Technique(s) to be Used
Stage 2: Develop the Analysis Plan
Stage 3: Evaluate the Assumptions Underlying the Multivariate Technique(s)
Stage 4: Estimate the Multivariate Model and Assess Overall Model Fit
Stage 5: Interpret the Variate(s)
Stage 6: Validate the Multivariate Model
31
Description of HBAT Primary Database Variables

Variable  Description                                        Variable Type
Data Warehouse Classification Variables
X1        Customer Type                                      nonmetric
X2        Industry Type                                      nonmetric
X3        Firm Size                                          nonmetric
X4        Region                                             nonmetric
X5        Distribution System                                nonmetric
Performance Perceptions Variables
X6        Product Quality                                    metric
X7        E-Commerce Activities/Website                      metric
X8        Technical Support                                  metric
X9        Complaint Resolution                               metric
X10       Advertising                                        metric
X11       Product Line                                       metric
X12       Salesforce Image                                   metric
X13       Competitive Pricing                                metric
X14       Warranty & Claims                                  metric
X15       New Products                                       metric
X16       Ordering & Billing                                 metric
X17       Price Flexibility                                  metric
X18       Delivery Speed                                     metric
Outcome/Relationship Measures
X19       Satisfaction                                       metric
X20       Likelihood of Recommendation                       metric
X21       Likelihood of Future Purchase                      metric
X22       Current Purchase/Usage Level                       metric
X23       Consider Strategic Alliance/Partnership in Future  nonmetric
32
Multivariate Analysis Learning Checkpoint:
1. What is multivariate analysis?
2. Why use multivariate analysis?
3. Why is knowledge of measurement scales important in using multivariate analysis?
4. What basic issues need to be examined when using multivariate analysis?
5. Describe the process for applying multivariate analysis.