Rachel Tabak, Research Methods, November 12, 2012
QUANTITATIVE DATA ANALYSIS
OUTLINE
• Introduction to quantitative data
• Descriptive vs. inferential statistics
• Univariate vs. bivariate analysis
• Frequency distributions
• Central tendency
• Dispersion
• Levels of measurement
• Interpreting bivariate/multivariate tables
• Measures of association
• Effect size
• Substantive significance
• Inferential statistics
WHAT IS QUANTITATIVE DATA?
[Bar chart: Healthy Food and Beverage Options; number of hospitals offering low-sugar cereal, baked chips, skim or 1% milk, diet soda, and 100% fruit juice]
Figure: Collaboratives participating in the Coalitions and Networks for Active Living (CANAL) Study, shown against state estimates of physical inactivity in US adults aged 18 and older (map legend: 0%, 1-10%, 11-33%)
WHY IS IT IMPORTANT TO UNDERSTAND QUANTITATIVE METHODS?
REAL WORLD EXAMPLE-VIDEO
• The video made the file too large, so it has been removed from the PowerPoint file.
REAL WORLD EXAMPLE
• 5-hour ENERGY video
• Of the 73% of primary care physicians who would recommend a low-calorie energy supplement to their healthy patients who use energy supplements,
• 56% would specifically recommend 5-hour ENERGY for their healthy patients who use energy supplements.
• This ad insinuates that 73% of doctors surveyed recommended 5 Hour Energy
THE PROBLEM
• Connecting two statements that aren’t actually connected:
  • Doctors recommend low-calorie energy supplements for those already taking supplements.
  • 5 Hour Energy is a low-calorie energy supplement.
• Suggested Conclusion: Doctors must recommend 5 Hour Energy.
• HOWEVER, the ad is actually saying that 73% of doctors said that if you are going to take an energy supplement anyway, then they recommend a low-calorie supplement.
• The doctors are never actually advocating energy drinks. AND at no point do they support consuming 5 Hour Energy.
DESCRIPTIVE VS. INFERENTIAL
DESCRIPTIVE ANALYSIS
• Purpose:
  • Limited to describing data in the current study sample
  • Not assertions about the larger population from which the sample was drawn
  • Not assertions about the causal process to explain the observations in the current data
• Does not provide sufficient grounds for inferring relationships exist in general or have theoretical meaning
INFERENTIAL ANALYSIS
• In the data from our study 2 variables may appear to be strongly related
• HOWEVER, we have to consider the possibility that our results might be a fluke
• The 2 variables may not really be related:
  • In the population
  • In any theoretical sense
ANALYSIS EXAMPLE
• Your degree program (MSW or MPH)
• The number of letters in your last name

Raw data (excerpt):
Program   Number of letters
MPH       Odd
MPH       Even
MSW       Odd
MPH       Odd
MSW       Even
MPH       Even
MSW       Odd
MSW       Odd

Cross-tabulation:
          Odd    Even   Total
MPH       10     3      13
MSW       4      9      13
Total     14     12     26
ANALYSIS EXAMPLE
• Your degree program (MSW or MPH)
• The number of letters in your last name

Raw data (excerpt):
Program   Number of letters
MPH       Odd
MPH       Even
MSW       Odd
MPH       Odd
MSW       Even
MPH       Even
MSW       Odd
MSW       Odd

Cross-tabulation (row percentages):
          Odd    Even   Total
MPH       77%    23%    13
MSW       31%    69%    13
Total     14     12     26
ANALYSIS EXAMPLE-RESULTS
• MPH students are more likely to have an odd number of letters in their last name than MSW students
• The findings in our class show:
  • 77% of MPH students have odd-numbered last names
  • 31% of MSW students have odd-numbered last names
WHAT DOES THIS EXAMPLE SHOW?
• Mere chance (or sampling error) can account for relationships in the study data
• The relationships we observe in our findings do not necessarily really exist in the population or in a theoretical sense
• Cause or explanation for differences in our dependent variable may be:
  • Independent variable (or intervention)
  • Chance
DESCRIPTIVE VS. INFERENTIAL
• Descriptive Analysis limits us to describing data in the current study sample
• Inferential analysis makes assertions about:
  • The larger population from which the sample was drawn, or
  • The causal process that explains the observations in the current data
METHODS OF DESCRIPTIVE UNIVARIATE ANALYSIS
FREQUENCY DISTRIBUTIONS
• Range of ways to present data
• Example: client age
  • List all ages
  • Arrange such that x clients are one age, y clients are another age
• Frequency distributions:
  • Grouped data:
    • x clients are <45 yrs, y clients are 45-50
    • x% of clients were <45 and y% were 45-50, etc.
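As an illustration (not from the slides), here is a minimal Python sketch of building ungrouped and grouped frequency distributions; the client ages and the age brackets are hypothetical.

```python
from collections import Counter

# Hypothetical client ages (illustrative only)
ages = [38, 41, 44, 47, 47, 49, 52, 44, 46, 61]

# Ungrouped frequency distribution: how many clients are each age
ungrouped = Counter(ages)
print(ungrouped)

# Grouped frequency distribution: counts and percentages per age bracket
def bracket(age):
    return "<45" if age < 45 else "45-50" if age <= 50 else ">50"

grouped = Counter(bracket(a) for a in ages)
n = len(ages)
for label in ["<45", "45-50", ">50"]:
    count = grouped[label]
    print(f"{label}: {count} clients ({100 * count / n:.0f}%)")
```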
CENTRAL TENDENCY
• Summary tendencies
• Mode = most frequent attribute (grouped or ungrouped)
• Arithmetic mean = sum of the values of several observations divided by the number of observations
  • Susceptible to extreme values: a few very large or very small numbers
  • Example: the "mean person" in Redmond, Washington
• Median = middle attribute in the ranked distribution of observed attributes
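A minimal sketch of the three measures using Python's statistics module; the client ages are hypothetical and chosen so that one extreme value (47) pulls the mean upward while leaving the median and mode unchanged.

```python
import statistics

# Hypothetical client ages; the 47-year-old is the extreme value
ages = [14, 15, 15, 16, 14, 15, 13, 16, 15, 47]

print("Mode:  ", statistics.mode(ages))    # most frequent attribute -> 15
print("Mean:  ", statistics.mean(ages))    # pulled upward by the outlier -> 18
print("Median:", statistics.median(ages))  # middle of the ranked values -> 15.0
```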
CENTRAL TENDENCY +/-
• Averages reduce raw data to the most manageable form:
  • A single number represents all the detailed data for a variable
  • Comes at a cost: the reader cannot reconstruct the original data from an average
• Report summaries of the dispersion
DISPERSION
• Distribution of values around a central value
• Range
  • Simplest measure of dispersion
  • Distance separating the highest and lowest values
• Standard deviation
  • How far away from the mean individual scores are located, on average
DISPERSION-EXAMPLE
• Mean client age = 14.9, standard deviation = 1.0
  • A substantial portion of clients' ages fall between:
    • 13.9 (1.0 below the mean) and
    • 15.9 (1.0 above the mean)
• Mean client age = 14.9, standard deviation = 5
  • Many clients are:
    • As young as 9.9 (5 less than the mean) and
    • As old as 19.9 (5 more than the mean)
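A short sketch, with hypothetical ages rather than the 14.9/1.0/5 figures above, showing how two distributions with the same mean can have very different standard deviations.

```python
import statistics

# Two hypothetical sets of client ages with the same mean but different spread
tight  = [14, 14, 15, 15, 15, 15, 16, 16]
spread = [9, 11, 13, 15, 15, 17, 19, 21]

for label, ages in [("tight", tight), ("spread", spread)]:
    mean = statistics.mean(ages)
    sd = statistics.stdev(ages)  # sample standard deviation
    print(f"{label}: mean = {mean:.1f}, sd = {sd:.1f}, "
          f"so many values fall roughly between {mean - sd:.1f} and {mean + sd:.1f}")
```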
LEVELS OF MEASUREMENT
• Nominal
  • Qualitative attributes
  • Categorical only
  • Frequencies appropriate
• Ordinal
  • Rank-ordered: attributes represent relatively more or less
  • Differences between the attributes are not precise
• Interval
  • Differences between levels have the same meaning
• Ratio
  • Same as interval, but with a true zero point
STATISTICS BY LEVEL OF MEASUREMENT
• Medians, means, and standard deviations
  • Calculated only for interval and ratio data
  • Many "gray-area" situations in the calculation of averages
• Ordinal
  • No reason to believe the distance from 1 (very dissatisfied) to 2 (dissatisfied) is the same as the distance between 2 and 3 (satisfied)
• Technically questionable:
  • Mean rating or standard deviation for clients
  • Treats ordinal ratings as real values
  • Common, and still useful
• A mean score has no precise mathematical meaning, but it lets us compare large numbers of ratings across subgroups
STATISTICS BY LEVEL OF MEASUREMENT - EXAMPLE
• 4-point scale assessing client satisfaction
• Ethnic minority client mean rating = 1.4
• White client mean rating = 3.2
• Despite the imprecise meaning, this is a clear, useful indication
• The groups do not express the same satisfaction
STATISTICS BY LEVEL OF MEASUREMENT – DOUBLE EDGED SWORD
• The key is utility
• Statistical calculations are useful in guiding practice
• Applying statistics to data that do not warrant them carries a danger:
  • Thinking the results represent something truly precise
• Consider the utility and appropriateness of carrying statistics to three decimal places
INTERPRETING BIVARIATE TABLES
BIVARIATE/MULTIVARIATE ANALYSIS
• Relationships among 2 (or more) variables
• Looking at more than one variable, but without examining the relationship between them, would still be univariate analysis
EXPLANATORY BIVARIATE TABLES
• Compare independent-variable subgroups with one another in terms of a given attribute of the dependent variable
• Categories of the independent variable are spread across the rows
• The table shows the number of students with and without previous research experience in each Level category (undergraduate/graduate)
CROSS TABULATION OF STUDENT CHARACTERISTICS AND RESEARCH EXPERIENCE, N = 200
                 Previous research experience
Level            No     Yes    Total
Undergraduate    40     60     100
Graduate         20     80     100

Compare dependent variable rates (previous research experience) for each category of the independent variable.
EXAMPLE TABLE
• Relationship between Level and previous research experience
• Level: independent variable, because we wonder if this explains previous research experience
• Compare dependent variable rates (previous research experience) for each category of Level
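A minimal sketch of turning the counts in the table above into the row percentages we compare; the counts (40/60 and 20/80) come from the table, everything else is illustrative.

```python
# Row percentages for the Level x previous-research-experience table above
table = {
    "Undergraduate": {"No": 40, "Yes": 60},
    "Graduate":      {"No": 20, "Yes": 80},
}

for level, counts in table.items():
    total = sum(counts.values())                # row total (100 each)
    yes_pct = 100 * counts["Yes"] / total
    print(f"{level}: {yes_pct:.0f}% have previous research experience (n = {total})")
```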
INTERPRETING MULTIVARIATE TABLES
MULTIVARIATE TABLES
• Constructed from several variables
• Same logic as bivariate tables
• Instead of one independent and one dependent variable, there is more than one independent variable
• Explain the dependent variable using more than one independent variable
EXAMPLE USING LEVEL AND PREVIOUS RESEARCH EXPERIENCE
• Origin might affect the relationship between Level and previous research experience
• Maybe it's not Level that explains the difference in previous research experience
• Maybe undergrads are less likely to be international
  • A larger portion of graduate students than undergraduates may be international
• Maybe international students are less likely to have previous research experience than domestic students
• Then factors related to opportunities in their home countries may better explain differences in previous research experience than the broader variable, Level
CROSS TABULATION OF STUDENT CHARACTERISTICS AND RESEARCH EXPERIENCE, N = 200
                                 Previous research experience
Origin           Level           No     Yes    Total
International    Undergraduate   30     10     40
                 Graduate        10     30     40
Domestic         Undergraduate   10     50     60
                 Graduate        10     50     60
READING THE TABLE
• Two bivariate tables side by side
1. Examine the bivariate table pertaining just to domestic students
  • No difference in previous research experience
  • 60 domestic students in each group
  • 50 in each group have previous research experience
2. Examine the part of the table pertaining just to international students
  • Big difference in previous research experience
  • 10 undergraduates vs. 30 graduates have previous research experience
READING THE TABLE-SUMMARY
• Undergrad vs. graduate previous research experience:
  • Domestic students: no difference
  • International students: big difference; graduate students more likely
  • International students at either level are less likely to have experience than domestic students
• Opportunities in home countries appear to have more to do with previous experience than Level
• They appear to have a bigger impact on international undergrads than graduates
MEASURES OF ASSOCIATION
INTERPRETING MEASURES OF ASSOCIATION
• Measures of association
  • Measures of relationship strength
  • Measures of effect size
• Calling a difference "big" refers to the strength of the association
• Eyeballing different percents in a table
• Statistical procedures for interpreting how strongly variables are related
RANGE
• Some commonly used measures of association range:
  • 0 = no relationship
  • 1.0 = perfect relationship
  • -1.0 = perfect relationship; the minus sign means the variables are negatively (inversely) related
• The closer the value is to zero, the less we can predict the relative value of one variable by knowing the other
EXAMPLE 1
• Same service refusal rate for 2 groups: knowing a case was in one group does not affect calculation of the odds that the case would refuse services
• We might be able to predict the odds from the overall service refusal rate for cases in either group
• Knowing which group the case was in would not influence the prediction
• The relationship magnitude = 0
• It is still possible to have a descriptive finding
EXAMPLE 2
• Group 1: 0% service refusal rate
• Group 2: 100% service refusal rate
• Knowing which group a case was in, we can predict with 100% accuracy whether that case would refuse
• Relationship magnitude = 1.0, a perfect relationship
EXAMPLE 3
• Perfect degree of predictability, but with a minus sign in front of the 1.0
• As one variable increases, the other decreases
• Magnitude is -1.0 because each increase in the independent variable reduces the dependent variable by the same amount
• Perfect relationship: the independent variable is all we need to know to predict the dependent variable
• Minus sign does not signify a weaker relationship
EXAMPLE 4-PROBLEM
• 40% service refusal rate for the U.S.-born
• 60% for immigrants
• Predict service refusal for every immigrant referral and service utilization for every U.S.-born referral
• We would be wrong 40% of the time
EXAMPLE 4-ANSWER
• Overall service refusal rate of 50%
• Try to predict without knowing the group: wrong 50% of the time
• Knowing which group the referral was in reduces the percentage of errors from 50% to 40%
• Some relationship between the variables: knowing one variable reduces errors in predicting the other
• The stronger the relationship, the more prediction errors are reduced
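A small sketch of the error-counting logic in Example 4, assuming (hypothetically) 100 referrals in each group so the 50% and 40% error rates fall out of the arithmetic.

```python
# Error counting for Example 4, assuming 100 referrals per group (hypothetical n)
n_per_group = 100
us_born_refusal = 0.40    # 40% of U.S.-born referrals refuse services
immigrant_refusal = 0.60  # 60% of immigrant referrals refuse services

# Without knowing the group: overall refusal rate is 50%, so either blanket
# prediction is wrong for about half of the 200 referrals
errors_without_group = 0.50 * 2 * n_per_group

# Knowing the group: predict refusal for immigrants and utilization for the
# U.S.-born; we are wrong for the 40% of immigrants who do not refuse and the
# 40% of U.S.-born who do refuse
errors_with_group = (1 - immigrant_refusal) * n_per_group + us_born_refusal * n_per_group

print(errors_without_group, errors_with_group)  # 100.0 vs 80.0 -> 50% vs 40% error rate
```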
WHICH MEASURE OF ASSOCIATION SHOULD BE USED?
• Depends primarily on the variables' level of measurement
• Some common measures of association:
  • Pearson's product-moment correlation (r)
  • Lambda
  • Yule's Q
  • Phi
  • Cramér's V
  • Eta
  • Point-biserial correlation coefficient
• Don't get overwhelmed by the measure used
• Focus on what it means when an article says the correlation (or other measure of association) was 0.30 or -0.40
CREATING OUR OWN DATA
• Take your pulse
  • Count for 30 seconds and multiply that by 2
  • Or, count for 15 seconds and multiply by 4
• Do 30 jumping jacks
• Take your pulse a second time
  • Count for 30 seconds and multiply that by 2
  • Or, count for 15 seconds and multiply by 4
EFFECT SIZE
MEASURES OF ASSOCIATION
• Effect-size statistics = measures of association that range from 0 to +/-1
  • So are other measures of association
• Effect-size statistics:
  • Portray the strength of association found in any study
  • Regardless of the outcome measure used
  • In terms that are comparable across studies
• Enable us to compare the effects of different interventions across studies using different types of outcome measures
EXAMPLE-PROBLEM
• Two experimental evaluations of the effectiveness of different approaches to treating male batterers:
  1. Evaluates a cognitive-behavioral approach and finds that during a post-treatment follow-up period:
    • Mean of 2 physically abusive incidents for the experimental group
    • Mean of 3 physically abusive incidents for the control group
  2. Evaluates a psychosocial approach with a nominal measure of whether any abuse occurred during the follow-up period:
    • 40% of experimental subjects were abusive
    • 60% of control subjects were abusive
• Difficult to judge which intervention had the stronger effect, because the outcomes and indicators are different
EXAMPLE-ANSWER
• The correlation coefficient for the first study can be compared to the correlation statistic for the second
• The intervention with the larger effect size is the one with the greater effect on its dependent variable
• Not everyone who reports correlations refers to them as the "effect size":
  • "The effect size was substantial, because the correlation was .50," or
  • "The relationship was strong, because the correlation was .50."
COHEN’S d
• Another group of effect-size statistics
• Used when interval- or ratio-level data permit dividing the difference between the means of the experimental and control groups by the standard deviation
• Different approaches vary on the standard deviation used:
  • Control group's standard deviation
  • Pooled standard deviation of the two groups
  • Estimate of the standard deviation of the population about which a study is attempting to generalize
• Here, we'll just use the pooled standard deviation
CALCULATING COHEN’S d
• Example: Using the two studies just mentioned (about treating male batterers), but with ratio-level measures of outcome
EXAMPLE 1 OF COHEN’S d
• Mean number of physically abusive incidents:
  • Experimental group mean = 2
  • Control group mean = 3
  • Standard deviation = 1
• The +/- sign of the numerator indicates whether the change in the outcome measure represents a desirable or undesirable effect
• Here, desirable: reduced physical abuse
• Interpret this effect size of +1.0 as the experimental group's mean being one standard deviation better than the control group's
EXAMPLE 2 OF COHEN’S d
• Mean number of verbally and physically abusive incidents combined
• The first study assessed only physical abuse and had a shorter post-treatment measurement period
• The second study finds:
  • Experimental group mean = 20
  • Control group mean = 26
  • Standard deviation = 10
EXAMPLE 1 VS 2
• The two studies quantified the outcome in different ways, one dealing with much larger numbers
• Dividing by the standard deviation makes the results comparable
• The results indicate that the cognitive-behavioral intervention evaluated in the first study had stronger effects on its outcome measure than the psychosocial intervention in the second study had on its outcome measure
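A minimal sketch of the Cohen's d calculation for the two examples; it divides the difference between group means by the single standard deviation reported on the slides rather than computing a pooled standard deviation from raw data.

```python
# Cohen's d for the two batterer-treatment studies, using the standard
# deviation reported on the slides (a fuller version would pool the two
# groups' standard deviations)
def cohens_d(control_mean, experimental_mean, sd):
    return (control_mean - experimental_mean) / sd

# Study 1: physically abusive incidents (means 3 vs. 2, SD = 1)
print(cohens_d(3, 2, 1))     # 1.0

# Study 2: verbal + physical incidents combined (means 26 vs. 20, SD = 10)
print(cohens_d(26, 20, 10))  # 0.6
```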
STRONG, MEDIUM, AND WEAK EFFECT SIZES
• Effect sizes are often interpreted in terms such as weak, medium, or strong
• There is a tendency to attach the same meanings to correlations as to exam scores
• Those benchmarks may be applicable to studies of measurement reliability
• But using them as general guide points to distinguish strong and weak relationships is incorrect
EFFECT SIZE STRENGTH EXAMPLE
• Example 1: Experiment to evaluate the effectiveness of an intervention to prevent recidivism among parents referred for child abuse
  • 40% recidivism rate for 100 treated cases
  • 60% recidivism rate for 100 untreated cases
  • Correlation = 0.20
  • Is this a weak relationship?
  • What if prior experimental research found alternative interventions had a weaker effect or no impact on recidivism rates?
• Example 2: What if the experiment found a 35% recidivism rate for treated cases versus a 65% recidivism rate for untreated cases?
  • phi statistic = 0.30
  • May seem small
  • But the 65% recidivism rate for untreated cases is nearly double the 35% rate for treated cases
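A small sketch of the phi calculations behind these examples, assuming (hypothetically) 100 cases per group; the sign of phi depends only on how the table is oriented.

```python
from math import sqrt

def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Example 1: 40% vs. 60% recidivism, assuming 100 cases per group
print(phi(40, 60, 60, 40))  # -0.20 (the sign just reflects table orientation)

# Example 2: 35% vs. 65% recidivism, again assuming 100 cases per group
print(phi(35, 65, 65, 35))  # -0.30
```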
DETERMINING EFFECT SIZE STRENGTH
• Cohen (1988) argues that a Cohen's d of approximately 0.5 should be considered of medium strength
• A Cohen's d of 0.8 should be considered strong
• Cohen deems a Cohen's d of about 0.2 to be weak
• Rosenthal and Rubin (1982): the value of some interventions with "weak" effect sizes is often underestimated
  • Example: a new intervention improves the treatment success rate from 45% to 55%
  • Correlation = 0.1
  • An increase from 0.45 to 0.55 might be very important for outcome indicators such as survival rates or cure rates
BEYOND EFFECT SIZE
• An intervention's effect size alone does not indicate its value
• An intervention with a stronger effect size is not necessarily "better" than one with a weaker effect size
• Example 1: An intervention that reduces the rate of child abuse from 55% to 45% may be more valuable to society than an intervention that reduces the annual turnover rate among Big Brother or Big Sister volunteers from 60% to 40%.
• Example 2: An intervention that reduces the rate of extreme physical abuse from 55% to 45% might be deemed more valuable than an intervention that reduces the rate of mild verbal abuse from 65% to 35%.
• Which intervention is "better" or more valuable?
SUBSTANTIVE SIGNIFICANCE (PRACTICAL SIGNIFICANCE)
IMPORTANCE/MEANINGFULNESS FROM A PRACTICAL STANDPOINT
• Strength of a relationship does not automatically mean substantive significance
• For clinical interventions, this is called clinical significance
• No matter how strong a relationship is, still ask whether it constitutes a substantively important or trivial finding
SUBSTANTIVE SIGNIFICANCE EXAMPLE 1
• Child abuse example
  • Study 1 intervention: difference between a 35% and a 65% recidivism rate
  • Study 2: different intervention, same target population; difference between a 20% and an 80% recidivism rate
• Assuming the two interventions were equal in terms of cost, time, and so on, the more substantively significant finding would be straightforward: the one with the 20% recidivism rate
SUBSTANTIVE SIGNIFICANCE EXAMPLE 2
• Child abuse example
  • Study 3: after five years of daily psychoanalysis costing $500/week, 80% of treated abusive parents, compared to 20% of untreated parents, say they agree with Freud's ideas
• Which intervention would you find more valuable if you were providing services to that target population:
  • One that made a 30% difference in child abuse recidivism, or
  • One that made a 60% difference in agreement with Freud's ideas?
• It's important to know how strong a relationship is and how it compares to the findings of comparable studies that used comparable variables; not all studies are comparable in practical importance
INFERENTIAL ANALYSIS
INFERENCE?
• No matter how strongly two variables appear to be related in data from our study, the two variables may not really be related:
  • In the population
  • In any theoretical sense
• Example we used earlier: are MSW students more or less likely to have an odd number of letters in their last name?
• We found a difference, but you would not interpret it to mean that MPH students are more likely than MSW students to have an odd number of letters in their last names
INFERENTIAL STATISTICS
• Drawing inferences about the causal process
  • Is there a relationship between X and Y?
  • Did X lead to Y?
• The t-statistic, F-value, and r-value will tell you whether there is a relationship…
• The p-value will tell you how likely it is that the relationship is due to chance…
LUCK AND SAMPLING ERROR
• Refuting chance: we need to rule out chance (or sampling error) as an explanation for relationships observed in our data
• Descriptive statistics won't enable us to rule out chance
• We need to use inferential statistics
• If the pertinent inferential statistics show that the likelihood a relationship can be attributed to chance is quite tiny, the relationship is deemed statistically significant
• To determine that likelihood, we conduct a test of statistical significance
• Regardless of the type of significance test, the result of the test is reported in terms of a probability level
STATISTICAL SIGNIFICANCE
• When a research report tests a hypothesis, p < .05 means the probability (likelihood) that the results supporting that hypothesis can be attributed to chance is less than 5%
• Most research reports call a relationship statistically significant when the p-value is at or below .05
  • .05 means there is only a 1 in 20 chance (5 in 100) that the relationship can be attributed to chance
• Some studies with very small samples will use a 0.10 cutoff point
SIGNIFICANCE & P-VALUES
• When p < .05
• The probability that these findings are due to chance is less than 5%.
• We are 95% sure that the relationship between X and Y is not due to chance.
• The relationship between X and Y is statistically significant.
ONE- AND TWO-TAILED TESTS
• One-tailed tests are used for directional hypotheses
  • Infer whether the relationship can be generalized in the predicted direction
• Two-tailed tests are used for non-directional hypotheses
  • The hypothesis doesn't specify a positive or negative relationship
  • Interested in extreme values in either tail
  • Often used even when the hypothesis is directional
Type I error: Rejecting H0 when it is true; incorrectly determining that a relationship exists
Type II error: Failing to reject H0 when it is false; incorrectly determining that there is no relationship when one really exists

H0: There is no relation between coffee consumption and stress.
What would a Type I error be here? A Type II error?
SELECTING A SIGNIFICANCE TEST
• Level of measurement of the variables
• Number of variables (bivariate vs. multivariate)
• Sampling methods
• Distribution of variables in the target population
PARAMETRIC TESTS
• At least one variable has an interval or ratio level of measurement
• Main assumptions:
  • Normal distribution
  • Homogeneity of variance
  • Independent groups are being compared
• Common tests:
  • t-test
  • ANOVA
  • Pearson's correlation
T-TESTS
• IV: Nominal or ordinal, dichotomous
• DV: Interval or ratio level
• Example: Compare average QOL scores for men and women
• Accounts for degrees of freedom
  • df = sample size - 2
• Result: t-statistic
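A minimal sketch of an independent-samples t-test in Python (SciPy); the QOL scores for men and women are hypothetical.

```python
from scipy import stats

# Hypothetical QOL scores for men and women
qol_men   = [62, 70, 68, 75, 71, 66, 73, 69]
qol_women = [74, 78, 69, 81, 77, 72, 80, 75]

t_stat, p_value = stats.ttest_ind(qol_men, qol_women)  # independent-samples t-test
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```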
RESEARCH METHODS PROFICIENCY EXAMINATION SCORE BY STUDENT CHARACTERISTICS, N = 200
T-TEST
Characteristic (n)        Proficiency exam score, M (SD)    t      p-value
Gender
  Female (109)            88.83 (6.78)                      2.05   .46
  Male (91)               87.65 (4.32)
Level
  Undergraduate (100)     82.95 (10.72)                     5.16   .006**
  Graduate (100)          93.86 (5.84)
Origin
  International (57)      91.48 (6.76)                      1.16   .68
  Domestic (143)          89.96 (8.12)
Note: * significant at p≤.05; ** significant at p≤.01
LET’S TRY OUR OWN
ANOVA
• Same logic as the t-test, however:
  • IV: Nominal or ordinal; can be dichotomous or have multiple categories
  • DV: Must be interval or ratio level
• Example: Compare QOL among current smokers, former smokers, and never smokers
• Result: F-value
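A minimal sketch of a one-way ANOVA in Python (SciPy); the QOL scores for the three smoking-status groups are hypothetical.

```python
from scipy import stats

# Hypothetical QOL scores for three smoking-status groups
current_smokers = [58, 63, 61, 66, 60]
former_smokers  = [67, 70, 72, 65, 69]
never_smokers   = [74, 71, 78, 76, 73]

f_value, p_value = stats.f_oneway(current_smokers, former_smokers, never_smokers)
print(f"F = {f_value:.2f}, p = {p_value:.3f}")
```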
RESEARCH METHODS PROFICIENCY EXAMINATION SCORE BY MSW CONCENTRATION, N = 200
ANOVA
Characteristic (n)        Proficiency exam score, M (SD)    F      p-value
CYF (58)                  87.58 (5.84)                      2.05   .46
Mental health (47)        88.83 (6.78)
SED (53)                  87.65 (4.32)
Individualized (42)       90.37 (8.61)
Note: CYF = children, youth & families; SED = social & economic development
Note: * significant at p≤.05; ** significant at p≤.01
PEARSON’S CORRELATION
• What is the probability that the null hypothesis (r = 0) is true for the entire population, and the correlation we found in our sample happened by chance?
• IV: Interval or ratio
• DV: Interval or ratio
• Result: r
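A minimal sketch of a Pearson correlation with its p-value in Python (SciPy); the quiz and exam scores are hypothetical.

```python
from scipy import stats

# Hypothetical quiz and proficiency-exam scores for the same students
quiz_scores = [70, 75, 80, 85, 88, 90, 93, 95]
exam_scores = [72, 78, 74, 84, 90, 88, 95, 96]

r, p_value = stats.pearsonr(quiz_scores, exam_scores)
print(f"r = {r:.2f}, p = {p_value:.3f}")
```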
PEARSON CORRELATION FOR RESEARCH METHODS PROFICIENCY EXAMINATION SCORE BY ASSIGNMENT GRADES IN RESEARCH METHODS, N = 200
PEARSON CORRELATION
Assignment r
Quiz 1 -.13
Quiz 2 .49*
Midterm exam .54**
Quiz 3 -.06
Quiz 4 .12
Final exam .39*
Note: * significant at p≤.05; ** significant at p≤.01
NON-PARAMETRIC TESTS
• Use when parametric assumptions are not met
• Most do not require interval or ratio levels of measurement
• Can be used with nominal or ordinal data that are not distributed normally
• Common tests:
  • Chi-square: IV and DV are both nominal or ordinal
  • Fisher's exact test: use when the sample is too small for chi-square
CHI-SQUARE
• IV: Nominal
• DV: Nominal
• Example: Are there differences in the proportion of men and women who have a chronic medical condition?
• Assess the extent to which the frequencies you observe differ from what is expected if the distribution occurred by chance
DEGREES OF FREEDOM (DF)
• The number of degrees of freedom depends on the number of categories in the variable
• Example for calculating df for chi-square:
  • df = (r-1)*(c-1)
  • r = # of categories in the row variable
  • c = # of categories in the column variable
• Used when assessing significance
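A minimal sketch of a chi-square test of independence in Python (SciPy), using the Level-by-experience counts from the earlier bivariate table; note that SciPy applies Yates' continuity correction to 2x2 tables by default.

```python
from scipy import stats

# Level x previous-research-experience counts from the earlier bivariate table
# rows: Undergraduate, Graduate; columns: No, Yes
observed = [[40, 60],
            [20, 80]]

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")
# dof = (2 - 1) * (2 - 1) = 1, matching the formula on the previous slide
```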
CROSS TABULATION OF STUDENT CHARACTERISTICS AND RESEARCH EXPERIENCE, N = 200
2X2-CHI-SQUARE
                          Previous research experience
Characteristic (n)        No     Yes     χ2      df    p-value
Gender
  Female (109)            91     18      .05     1     .83
  Male (91)               77     14
Level
  Undergraduate (100)     93     7       7.20    1     .03*
  Graduate (100)          32     68
Origin
  International (57)      26     31      1.16    1     .68
  Domestic (143)          75     68
Note: * significant at p≤.05; ** significant at p≤.01
TAKE HOME POINTS
• Choose statistical significance tests based upon the nature of the data, considering the:
  • Level of measurement
  • Number of variables
  • Distribution of variables in the population
• Statistical tests indicate the probability of a Type I error
• By convention, significance is determined at the p< 0.05 level
• In practice, weigh statistical vs. clinical significance
QUESTIONS???