1 mgt 540 research methods data analysis. 2 additional “sources” compilation of sources:...

Mgt 540 Research Methods: Data Analysis

Upload: meghan-fox

Post on 11-Jan-2016


Page 1: Mgt 540 Research Methods, Data Analysis

Page 2: Additional “sources”
Compilation of sources:
  http://lrs.ed.uiuc.edu/tse-portal/datacollectionmethodologies/jin-tselink/tselink.htm
  http://web.utk.edu/~dap/Random/Order/Start.htm
Data Analysis Brief Book (glossary):
  http://rkb.home.cern.ch/rkb/titleA.html
Exploratory Data Analysis:
  http://www.itl.nist.gov/div898/handbook/eda/eda.htm
Statistical Data Analysis:
  http://obelia.jde.aca.mmu.ac.uk/resdesgn/arsham/opre330.htm

Page 3: FIGURE 12.1 (figure not reproduced in this transcript)
Copyright © 2003 John Wiley & Sons, Inc. Sekaran/RESEARCH 4E

Page 4: Data Analysis: Get the “feel” for the data
Get the mean, variance, and standard deviation of each variable.
Check that, for all items, responses range over the whole scale and are not restricted to one end of the scale alone.
Obtain Pearson correlations among the variables under study.
Get frequency distributions for all the variables.
Tabulate your data.
Describe your sample's key characteristics (demographic details such as sex composition, education, age, length of service, etc.).
See histograms, frequency polygons, etc.
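The first checks on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the course's SPSS procedure; the `responses` data and the helper names `describe` and `frequency_distribution` are assumptions made up for the example.

```python
import statistics
from collections import Counter

# Hypothetical responses to one 5-point Likert item (illustrative data only).
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]


def describe(values):
    """Return the mean, sample variance, sample standard deviation, and range."""
    return {
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # divides by n - 1
        "std_dev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def frequency_distribution(values):
    """Count how often each scale point occurs, listed in scale order."""
    counts = Counter(values)
    return {point: counts[point] for point in sorted(counts)}


summary = describe(responses)
frequencies = frequency_distribution(responses)
print(summary)
print(frequencies)
```

Here the minimum and maximum show that responses cover the whole 1 to 5 scale, and the frequency counts are what a histogram of the item would plot.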

Page 5: Quantitative Data
Each type of data requires different analysis method(s):
  Nominal
    Labeling
    No inherent “value” basis
    Categorization purposes only
  Ordinal
    Ranking, sequence
  Interval
    Relationship basis (e.g. age)

Page 6: Descriptive Statistics
Describing key features of data
  Central tendency: mean, median, mode
  Spread: variance, standard deviation, range
  Distribution (shape): skewness, kurtosis

Page 7: Descriptive Statistics
Describing key features of data
  Nominal
    Identification / categorization only
  Ordinal (Example on pg. 139)
    Non-parametric statistics
    Do not assume equal intervals
    Frequency counts
    Averages (median and mode)
  Interval
    Parametric
    Mean, standard deviation, variance

Page 8: Testing “Goodness of Fit”
Reliability
Validity
Internal Consistency
Split Half
Discriminant
Convergent
Factorial
Involves correlations and factor analysis

Page 9: Testing Hypotheses
Use appropriate statistical analysis
  t-test (one- or two-tailed)
    Tests the significance of the difference between the means of two groups
  ANOVA
    Tests the significance of differences among the means of more than two groups, using the F test
  Regression (simple or multiple)
    Establishes the variance explained in the DV by the variance in the IVs

Page 10: Statistical Power
Claiming a significant difference
Errors in methodology
  Type 1 error
    Reject the null hypothesis when you should not.
    Called an “alpha” error
  Type 2 error
    Fail to reject the null hypothesis when you should.
    Called a “beta” error
Statistical power refers to the ability to detect true differences
  Avoiding type 2 errors

Page 11: Statistical Power
See discussion at http://my.execpc.com/4A/B7/helberg/pitfalls/
Depends on 4 issues
  Sample size
  The effect size you want to detect
  The alpha (type 1 error rate) you specify
  The variability of the sample
Too little power
  Overlook effect
Too much power
  Any difference is significant
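How the four issues interact can be illustrated with a rough power calculation. The sketch below assumes a two-tailed one-sample z-test with known standard deviation, which is a simplification chosen for the example rather than anything the slides specify; the function name `z_test_power` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Approximate power of a two-tailed one-sample z-test.

    effect: true difference between the population mean and the null value
    sigma:  population standard deviation (the variability)
    n:      sample size
    alpha:  type 1 error rate
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * math.sqrt(n) / sigma  # effect in standard-error units
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


for n in (10, 30, 100):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

Under these assumptions power rises with sample size, effect size, and alpha, and falls as variability grows. With a very large n even a trivial effect is detected, which is the "too much power" case on the slide.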

Page 12: Parametric vs. nonparametric
Parametric (characteristics referring to specific population parameters)
  Parametric assumptions
    Independent samples
    Homogeneity of variance
    Data normally distributed
    Interval or better scale
Nonparametric assumptions
  Sometimes independence of samples

Page 13: t-tests (Look at t tables; p. 435)
Used to compare two means, or one observed mean against a guess about a hypothesized mean
  For large samples t and z can be considered equivalent
Calculate $t = \dfrac{\bar{x} - \mu}{S_{\bar{x}}}$
  where $S_{\bar{x}}$ is the standard error of the mean, $S/\sqrt{n}$, and $df = n - 1$
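As a worked instance of the formula, the following sketch computes a one-sample t statistic by hand. The sample values and the hypothesized mean of 4 are invented for illustration, and `one_sample_t` is a hypothetical helper name.

```python
import math
import statistics


def one_sample_t(sample, mu):
    """Return (t, df) for testing the sample mean against a hypothesized mean mu."""
    n = len(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)  # S / sqrt(n)
    t = (statistics.mean(sample) - mu) / std_error
    return t, n - 1


# Hypothetical data: five ratings tested against a hypothesized mean of 4.
t, df = one_sample_t([4, 5, 6, 5, 5], mu=4)
print(t, df)
```

The resulting t is then compared with the critical value from a t table at df = n - 1.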

Page 14: t-tests
Statistical programs will give you a choice between a matched-pair and an independent t-test.
  Your sample and research design determine which you will use.

Page 15: z-test for Proportions (Look at t tables; p. 435)
When data are nominal
  Describe by counting occurrences of each value
  From counts, calculate proportions
  Compare proportion of occurrence in sample to proportion of occurrence in population
    Hypothesis testing allows only one of two outcomes: success or failure

Page 16: z-test for Proportions (Look at t tables; p. 435)
$H_0: \pi = k$, where $k$ is a value between 0 and 1
$H_1: \pi \neq k$
$z = \dfrac{p - \pi}{\sigma_p} = \dfrac{p - \pi}{\sqrt{\pi(1 - \pi)/n}}$
Equivalent to $\chi^2$ for df = 1
Comparing sample proportion to the population proportion

Page 17: Chi-Square Test (sampling distribution)
One sample
  Measures sample variance
  Squared deviations from the mean, based on normal distribution
Nonparametric
  Compare expected with observed proportion
  $H_0$: observed proportion = expected proportion
  df = number of data points (categories, cells) $k$ minus 1
  $\chi^2 = \sum \dfrac{(O - E)^2}{E}$
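The nonparametric (goodness-of-fit) use of the statistic can be sketched as follows. The observed counts and expected proportions are made up for the example, and `chi_square_gof` is a hypothetical helper name.

```python
def chi_square_gof(observed, expected_props):
    """Return (chi-square, df) comparing observed counts with expected proportions."""
    n = sum(observed)
    expected = [prop * n for prop in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1  # df = k - 1


# Hypothetical counts in three categories, expected to split 25% / 50% / 25%.
chi2, df = chi_square_gof([30, 50, 20], [0.25, 0.50, 0.25])
print(chi2, df)
```

The statistic is then compared with the critical chi-square value for k - 1 degrees of freedom.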

Page 18: Univariate z Test
Test a guess about a proportion against an observed sample; e.g., MBAs constitute 35% of the managerial population
  $H_0: \pi = .35$
  $H_1: \pi \neq .35$ (two-tailed test suggested)
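The MBA example can be carried through numerically. The slide gives only the hypothesized proportion of .35, so the sample below (84 MBAs among 200 managers) is an assumption invented for illustration, as is the helper name `z_test_proportion`.

```python
import math
from statistics import NormalDist


def z_test_proportion(successes, n, pi0):
    """Two-tailed z-test of a sample proportion against a hypothesized proportion pi0."""
    p = successes / n
    std_error = math.sqrt(pi0 * (1 - pi0) / n)  # uses pi0, the value under H0
    z = (p - pi0) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sample: 84 of 200 managers hold an MBA; H0: pi = .35.
z, p_value = z_test_proportion(successes=84, n=200, pi0=0.35)
print(round(z, 3), round(p_value, 4))
```

Squaring this z gives the same value as a one-degree-of-freedom chi-square computed on the MBA / non-MBA counts, which is the equivalence noted on the z-test for proportions slide.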

Page 19: Univariate Tests
Some univariate tests are different in that they are among statistical procedures where you, the researcher, set the null hypothesis.
In many other statistical tests the null hypothesis is implied by the test itself.

Page 20: Contingency Tables
Relationship between nominal variables
  http://www.psychstat.smsu.edu/introbook/sbk28m.htm
Relationship between subjects' scores on two qualitative or categorical variables (Early childhood intervention)
If the columns are not contingent on the rows, then the row and column frequencies are independent. The test of whether the columns are contingent on the rows is called the chi-square test of independence. The null hypothesis is that there is no relationship between row and column frequencies.
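A minimal sketch of the chi-square test of independence follows. Expected cell counts are taken as row total times column total divided by the grand total, the standard computation; the 2 x 2 tables are invented data and `chi_square_independence` is a hypothetical helper name.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical table: rows = program group, columns = outcome category.
chi2, df = chi_square_independence([[30, 20], [20, 30]])
print(chi2, df)
```

A table whose rows are exact multiples of one another, such as [[10, 20], [20, 40]], gives a chi-square of zero: the columns are not contingent on the rows.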

Page 21: Correlations
A statistical summary of the degree and direction of association between two variables
Correlation itself does not distinguish between independent and dependent variables
Most common: Pearson's r

Page 22: Correlations
You believe that a linear relationship exists between two variables
The range is from -1 to +1
$R^2$, the coefficient of determination, is the % of variance explained in each variable by the other

Page 23: Correlations
$r = \dfrac{S_{xy}}{S_x S_y}$, or the covariance between x and y divided by the product of their standard deviations
Calculations needed
  The means, x-bar and y-bar
  Deviations from the means, (x - x-bar) and (y - y-bar), for each case
  The squares of the deviations from the means for each case, to ensure positive distance measures when added: (x - x-bar)^2 and (y - y-bar)^2
  The cross product for each case, (x - x-bar) times (y - y-bar)
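The calculation steps listed above can be followed one by one in code. This is a sketch with invented data; the function name `pearson_r` and the variable names are assumptions chosen to mirror the slide's terms.

```python
import math


def pearson_r(x, y):
    """Pearson's r built from the steps on the slide."""
    n = len(x)
    x_bar = sum(x) / n                        # the means
    y_bar = sum(y) / n
    dev_x = [value - x_bar for value in x]    # deviations from the means
    dev_y = [value - y_bar for value in y]
    ss_x = sum(d * d for d in dev_x)          # squared deviations, summed
    ss_y = sum(d * d for d in dev_y)
    cross = sum(a * b for a, b in zip(dev_x, dev_y))  # cross products, summed
    s_xy = cross / (n - 1)                    # covariance
    s_x = math.sqrt(ss_x / (n - 1))           # standard deviations
    s_y = math.sqrt(ss_y / (n - 1))
    return s_xy / (s_x * s_y)


# Hypothetical paired scores for five cases.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4))
```

The n - 1 divisors cancel, so r can equally be computed as the summed cross products divided by the square root of the product of the two sums of squares.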

Page 24: Correlations
The null hypothesis for correlations is
  $H_0: \rho = 0$
and the alternative is usually
  $H_1: \rho \neq 0$
However, if you can justify it prior to analyzing the data, you might also use
  $H_1: \rho > 0$ or $H_1: \rho < 0$, a one-tailed test

Page 25: Correlations
Alternative measures
  Spearman rank correlation, $r_{ranks}$
    $r_{ranks}$ and $r$ are nearly always equivalent measures for the same data (even when they are not, the differences are trivial)
  Phi coefficient, $r_\Phi$, when both variables are dichotomous; again, it is equivalent to Pearson's $r$

Page 26: Correlations
Alternative measures
  Point-biserial, $r_{pb}$, when correlating a dichotomous with a continuous variable
If a scatterplot shows a curvilinear relationship there are two options:
  A data transformation, or
  Use the correlation ratio, $\eta^2$ (eta-squared):
    $\eta^2 = 1 - \dfrac{SS_{within}}{SS_{total}}$
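The correlation ratio can be sketched as below. The example assumes the y scores have already been grouped by level (or bin) of x, which the slide does not spell out; the data are invented to show an inverted-U pattern, and `eta_squared` is a hypothetical helper name.

```python
def eta_squared(groups):
    """Correlation ratio: 1 - SS_within / SS_total for y scores grouped by x level."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((value - grand_mean) ** 2 for value in all_values)
    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_within += sum((value - group_mean) ** 2 for value in group)
    return 1 - ss_within / ss_total


# Hypothetical y scores at low, medium, and high x: rises, then falls.
eta2 = eta_squared([[1, 2, 3], [5, 6, 7], [2, 3, 4]])
print(eta2)
```

Because the group means rise and then fall, a straight line fits these data poorly even though eta-squared shows that x level accounts for most of the variance in y.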

Page 27: ANOVA
For two groups only, the t-test and ANOVA yield the same results
You must do paired comparisons when working with three or more groups to know where the means lie
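The two-group equivalence can be checked numerically: the square of the independent-samples t statistic equals the one-way ANOVA F. The sketch assumes the pooled-variance (equal variances) form of the t-test, and the data and function names are invented for illustration.

```python
import math
import statistics


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / std_error


def one_way_f(groups):
    """One-way ANOVA F statistic: mean square between / mean square within."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((v - statistics.mean(g)) ** 2 for v in g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for two groups.
group_a, group_b = [1, 2, 3], [3, 4, 5]
t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(t ** 2, f)
```

With three or more groups, `one_way_f` still returns a single F, which says only that some means differ; pairwise comparisons are needed to locate the differences.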

Page 28: Multivariate Techniques
Dependent variable
  Regression in its various forms
  Discriminant analysis
  MANOVA
Classificatory or data reduction
  Cluster analysis
  Factor analysis
  Multidimensional scaling

Page 29: Linear Regression
We would like to be able to predict y from x
Simple linear regression with raw scores
  y = dependent variable
  x = independent variable
  b = regression coefficient = $r_{xy} \dfrac{s_y}{s_x}$
  c = a constant term
The general model is
  y = bx + c (+ e)
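A small sketch of the raw-score model follows. The slope uses the slide's formula, b = r times s_y over s_x; the intercept formula c = y-bar minus b times x-bar is the standard least-squares result and is not stated on the slide. The data and the name `fit_line` are assumptions for illustration.

```python
import math
import statistics


def fit_line(x, y):
    """Return (b, c) for the simple regression y = b*x + c."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    cross = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    r_xy = cross / math.sqrt(ss_x * ss_y)
    slope = r_xy * statistics.stdev(y) / statistics.stdev(x)  # b = r * s_y / s_x
    intercept = y_bar - slope * x_bar  # standard result, assumed here
    return slope, intercept


# Hypothetical raw scores for five cases.
b, c = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b, 4), round(c, 4))
```

A predicted score is then b times x plus c; the difference between an observed y and its prediction is the error term e in the general model.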

Page 30: Linear Regression
The statistic for assessing the overall fit of a regression model is $R^2$, or the overall % of variance explained by the model
$R^2 = 1 - \dfrac{\text{unpredictable variance}}{\text{total variance}} = \dfrac{\text{predictable variance}}{\text{total variance}} = 1 - \dfrac{s_e^2}{s_y^2}$, where $s_e^2$ is the variance of the error or residual
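Both forms of the ratio can be verified on a small example. The data are hypothetical, and the line y = 0.6x + 2.2 is the least-squares fit to those particular values; sums of squares are used in place of variances because the common divisor cancels in each ratio. `r_squared_both_ways` is a hypothetical helper name.

```python
def r_squared_both_ways(x, y, b, c):
    """Return R^2 as (1 - unexplained/total, explained/total) for the line y = b*x + c."""
    y_bar = sum(y) / len(y)
    predicted = [b * value + c for value in x]
    ss_total = sum((obs - y_bar) ** 2 for obs in y)
    ss_error = sum((obs - pred) ** 2 for obs, pred in zip(y, predicted))
    ss_explained = sum((pred - y_bar) ** 2 for pred in predicted)
    return 1 - ss_error / ss_total, ss_explained / ss_total


# Hypothetical data and the least-squares line fitted to them.
r2_from_error, r2_from_explained = r_squared_both_ways(
    [1, 2, 3, 4, 5], [2, 4, 5, 4, 5], b=0.6, c=2.2
)
print(round(r2_from_error, 4), round(r2_from_explained, 4))
```

The two values agree only when the line is the least-squares fit; for an arbitrary line the explained and unexplained parts do not add up to the total.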

Page 31: Linear Regression
Multiple regression: more than one predictor
  $y = b_1 x_1 + b_2 x_2 + c$
Each regression coefficient b is assessed independently for its statistical significance; $H_0: b = 0$
So, in a statistical program's output, a statistically significant b rejects the notion that the variable associated with b contributes nothing to predicting y

Page 32: Linear Regression
Multiple regression
  $R^2$ still tells us the amount of variation in y explained by all of the predictors (x) together
  The F-statistic tells us whether the model as a whole is statistically significant
Several other types of regression models are available for data that do not meet the assumptions needed for least-squares models (such as logistic regression for dichotomous dependent variables)

Page 33: Regression by SPSS & other Programs
Methods for developing the model
  Stepwise: lets the computer try to fit all chosen variables, leaving out those that are not significant and re-examining the variables in the model at each step
  Enter: researcher specifies that all variables will be used in the model
  Forward, backward: begins with all (backward) or none (forward) of the variables and automatically adds or removes variables without reconsidering variables already in the model

Page 34: Multicollinearity
Best regression model has uncorrelated IVs
Model stability low with excessively correlated IVs
Collinearity diagnostics identify problems, suggesting variables to be dropped
High tolerance, low variance inflation factor are desirable
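For the simplest case of two IVs, tolerance and the variance inflation factor follow directly from their correlation: tolerance = 1 - r squared and VIF = 1 / tolerance. These are the standard definitions rather than anything shown on the slide, and with more than two IVs r squared is replaced by the R squared from regressing each IV on the others. The data and the name `tolerance_and_vif` are invented for the sketch.

```python
import math


def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for two predictors, from their squared correlation."""
    n = len(x1)
    mean_1, mean_2 = sum(x1) / n, sum(x2) / n
    cross = sum((a - mean_1) * (b - mean_2) for a, b in zip(x1, x2))
    ss_1 = sum((a - mean_1) ** 2 for a in x1)
    ss_2 = sum((b - mean_2) ** 2 for b in x2)
    r = cross / math.sqrt(ss_1 * ss_2)
    tolerance = 1 - r ** 2  # share of one IV's variance not shared with the other
    return tolerance, 1 / tolerance


# Hypothetical scores on two independent variables.
tolerance, vif = tolerance_and_vif([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(tolerance, 4), round(vif, 4))
```

Uncorrelated IVs give a tolerance of 1 and a VIF of 1; as the IVs approach perfect correlation, tolerance approaches 0 and the VIF grows without bound.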

Page 35: Discriminant Analysis
Regression requires DV to be interval or ratio
If DV is categorical (nominal), can use discriminant analysis
IVs should be interval or ratio scaled
Key result is number of cases classified correctly

Page 36: MANOVA
Compare means on two or more DVs (ANOVA limited to one DV)
Pure MANOVA via SPSS only from command syntax
Can use the general linear model though

Page 37: Factor Analysis
A data reduction technique: a large set of variables can be reduced to a smaller set while retaining the information from the original data set
Data must be on an interval or ratio scale
E.g., a variable called socioeconomic status might be constructed from variables such as household income, educational attainment of the head of household, and average per capita income of the census block in which the person resides

Page 38: Cluster Analysis
Cluster analysis seeks to group cases rather than variables; it too is a data reduction technique
Data must be on an interval or ratio scale
E.g., a marketing group might want to classify people into psychographic profiles regarding their tendencies to try or adopt new products: pioneers or early adopters, early majority, late majority, laggards

Page 39: Factor vs. Cluster Analysis
Factor analysis focuses on creating linear composites of variables
  Number of variables with which we must work is then reduced
  Technique begins with a correlation matrix to seed the process
Cluster analysis focuses on cases

Page 40: Potential Biases
Asking the inappropriate or wrong research questions.
Insufficient literature survey and hence inadequate theoretical model.
Measurement problems
Samples not being representative.
Problems with data collection:
  researcher biases
  respondent biases
  instrument biases
Data analysis biases:
  coding errors
  data punching & input errors
  inappropriate statistical analysis
Biases (subjectivity) in interpretation of results.

Page 41: Questions to ask
FIGURE 11.2 (figure not reproduced in this transcript). Copyright © 2003 John Wiley & Sons, Inc. Sekaran/RESEARCH 4E
Adopted from Robert Niles
  Where did the data come from?
  How (and by whom) were the data reviewed, verified, or substantiated?
  How were the data collected?
  How are the data presented?
    What is the context?
    Cherry-picking?
  Be skeptical when dealing with comparisons
    Spurious correlations