-notes 06 - correlation
DESCRIPTION
Brief discussion on correlation
TRANSCRIPT
Chapter Three Numerically Summarizing Data
Exploring Relationships
Lesson 1
Correlation
Lesson 1: Correlation
Bivariate Data
Bivariate data is data in which two variables are measured on an individual.
The response variable is the variable whose value can be explained or determined based upon the value of the predictor variable.
A lurking variable is one that is related to the response and/or predictor variable, but is excluded from the analysis.
Unit 2: Probability Distributionstzf
Scatter Diagrams
A scatter diagram shows the relationship between two quantitative variables measured on the same individual.
The value of the predictor variable is read on the horizontal axis and the value of the response variable on the vertical axis.
Each individual in the data set is represented by a point in the scatter diagram.
Do not connect the points when drawing a scatter diagram.
Example 1: Drawing a Scatter Diagram
P 202, #16. An engineer wanted to determine how the weight of a car affected the gas mileage. The data represent the weight of various domestic cars and their city mileage rating (in mpg) for the 2001 model year.
(a) Determine which is the likely predictor variable and which is the likely response variable.
Predictor variable: weight
Response variable: mileage

Weight (pounds)   Miles Per Gallon
3565              19
3440              20
3970              17
3305              19
3340              20
3200              20
3230              19
2560              28
2520              28
3065              20
3600              18
3300              19
3625              19
3590              19
2605              23
2370              28
Example 1: Drawing a Scatter Diagram
P 202, #16.
(b) Draw a scatter diagram.
[Figure: scatter diagram "Weight vs. Mileage". Horizontal axis: Weight (lbs), 2000 to 4000. Vertical axis: City Mileage (MPG), 15 to 30.]
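As a rough illustration of part (b), the scatter diagram can be sketched as a text grid in plain Python. The function name `ascii_scatter` and the grid size are arbitrary choices made for this sketch, not part of the exercise; each `*` marks a grid cell that contains at least one car, and the points are deliberately left unconnected.

```python
# Data from Example 1 (P 202, #16): weight in pounds, city mileage in mpg.
weights = [3565, 3440, 3970, 3305, 3340, 3200, 3230, 2560,
           2520, 3065, 3600, 3300, 3625, 3590, 2605, 2370]
mpg = [19, 20, 17, 19, 20, 20, 19, 28,
       28, 20, 18, 19, 19, 19, 23, 28]


def ascii_scatter(xs, ys, width=40, height=12):
    """Return a list of text rows; '*' marks a cell with at least one point."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each coordinate to a column/row index on the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return ["".join(r) for r in grid]


rows = ascii_scatter(weights, mpg)
for r in rows:
    print("|" + r)          # vertical axis: mileage (mpg)
print("+" + "-" * 40)       # horizontal axis: weight (lbs)
```

The lightest car (2370 lbs, 28 mpg) lands in the top-left corner and the heaviest (3970 lbs, 17 mpg) in the bottom-right, so the downward trend is visible even in this coarse rendering.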
Relationships Between Two Variables
Scatter diagrams reveal the type of relationship or trend that exists between two variables.
[Figure: five example scatter diagrams labeled Linear (Decreasing), Nonlinear, No trend, Linear (Increasing), and Nonlinear.]
Example 2: Identifying the Trend
P 199, #1-4. Determine whether the relationship between the variables is linear or nonlinear. If linear, indicate whether there is a positive or negative trend.
[Figure: four scatter diagrams, numbered 1 to 4, with the answer labels Nonlinear; Linear, Negative; Linear, Positive; Nonlinear.]
Positive Linear Relationships
Two variables that are linearly related are said to be positively associated when above average values of one variable are associated with above average values of the other variable.
That is, two variables are positively associated when, as the values of the predictor variable increase, the values of the response variable also increase.
[Figure: scatter diagram with x and y axes, divided into quadrants I, II, III, and IV.]
Negative Linear Relationships
Two variables that are linearly related are said to be negatively associated when above average values of one variable are associated with below average values of the other variable.
That is, two variables are negatively associated when, as the values of the predictor variable increase, the values of the response variable decrease.
[Figure: scatter diagram with x and y axes, divided into quadrants I, II, III, and IV.]
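The "above average" and "below average" wording can be checked numerically. The sketch below is an added illustration, not part of the lesson: it assumes the quadrants are formed by splitting the plane at the point of means, and counts how many points of the Example 1 car data fall in each one. The helper name `quadrant_counts` is hypothetical.

```python
from statistics import mean

# Data from Example 1 (P 202, #16): weight in pounds, city mileage in mpg.
weights = [3565, 3440, 3970, 3305, 3340, 3200, 3230, 2560,
           2520, 3065, 3600, 3300, 3625, 3590, 2605, 2370]
mpg = [19, 20, 17, 19, 20, 20, 19, 28,
       28, 20, 18, 19, 19, 19, 23, 28]


def quadrant_counts(xs, ys):
    """Count points in quadrants I-IV relative to (mean of x, mean of y).

    Points lying exactly on a dividing line are not counted.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    counts = {"I": 0, "II": 0, "III": 0, "IV": 0}
    for x, y in zip(xs, ys):
        dx, dy = x - x_bar, y - y_bar
        if dx > 0 and dy > 0:
            counts["I"] += 1      # above average x, above average y
        elif dx < 0 and dy > 0:
            counts["II"] += 1     # below average x, above average y
        elif dx < 0 and dy < 0:
            counts["III"] += 1    # below average x, below average y
        elif dx > 0 and dy < 0:
            counts["IV"] += 1     # above average x, below average y
    return counts


counts = quadrant_counts(weights, mpg)
print(counts)  # {'I': 0, 'II': 4, 'III': 2, 'IV': 10}
```

Fourteen of the sixteen cars sit in quadrants II and IV, where an above average value of one variable is paired with a below average value of the other, which is the pattern of a negative association.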
Measuring the Strength of the Linear Relationship
The linear correlation coefficient (or Pearson product moment correlation coefficient) is a measure of the strength of the linear relation between two quantitative variables.
We use the Greek letter ρ (rho) to represent the population correlation coefficient and r to represent the sample correlation coefficient.
We shall only present the formula for the sample correlation coefficient:
$$ r = \frac{\sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)}{n - 1} $$
Here $\bar{x}$ and $\bar{y}$ are the sample means, $s_x$ and $s_y$ are the sample standard deviations, and $n$ is the number of individuals.
The correlation coefficient is a unitless measure of association. The units of measure for x and y play no role in the interpretation of r.
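A minimal sketch of the sample formula, using only the Python standard library, might look as follows. The function name `pearson_r` is chosen here for illustration. The car data from Example 1 are reused, and the weights are also converted to kilograms to illustrate that r does not depend on the units.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Sample correlation: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


# Data from Example 1 (P 202, #16).
weights_lb = [3565, 3440, 3970, 3305, 3340, 3200, 3230, 2560,
              2520, 3065, 3600, 3300, 3625, 3590, 2605, 2370]
mpg = [19, 20, 17, 19, 20, 20, 19, 28,
       28, 20, 18, 19, 19, 19, 23, 28]

# Same weights expressed in kilograms (1 lb = 0.45359237 kg).
weights_kg = [w * 0.45359237 for w in weights_lb]

r_lb = pearson_r(weights_lb, mpg)
r_kg = pearson_r(weights_kg, mpg)
print(round(r_lb, 2), round(r_kg, 2))  # -0.92 -0.92
```

Changing pounds to kilograms rescales both the deviations and the standard deviation of x by the same factor, so each z-score, and therefore r, is unchanged.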
Properties of the Linear Correlation Coefficient
The linear correlation coefficient is always between -1 and 1, inclusive.
If r = +1, there is a perfect positive linear relation between the two variables.
The closer r is to +1, the stronger the evidence of positive association between the two variables.
[Figure: three scatter diagrams labeled r = 1, r ≈ 0.9, and r ≈ 0.4.]
Properties of the Linear Correlation Coefficient
If r = -1, there is a perfect negative linear relation between the two variables.
The closer r is to -1, the stronger the evidence of negative association between the two variables.
[Figure: three scatter diagrams labeled r = -1, r ≈ -0.9, and r ≈ -0.4.]
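The two extreme cases can be illustrated with made-up data that lie exactly on a line. The values below are hypothetical and chosen only for this sketch; `pearson_r` is an illustrative helper implementing the sample formula for r.

```python
from math import isclose
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Sample correlation: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)


xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]     # points exactly on y = 2x + 1
falling = [-3 * x + 5 for x in xs]   # points exactly on y = -3x + 5

r_up = pearson_r(xs, rising)
r_down = pearson_r(xs, falling)

# Compare with a tolerance because of floating-point rounding.
print(isclose(r_up, 1.0), isclose(r_down, -1.0))  # True True
```

Note that the slope of the line affects only the sign of r, not its size: any set of points lying exactly on a non-horizontal line gives r = +1 or r = -1.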
Properties of the Linear Correlation Coefficient
If r is close to 0, there is little or no linear relation between the two variables. A value of r near 0 does not rule out a nonlinear relation.
[Figure: two scatter diagrams, both with r ≈ 0: one showing no relationship and one showing a nonlinear relationship.]
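A small made-up example shows how r can be 0 even when y is completely determined by x. The data below are hypothetical (y = x² on values of x symmetric about 0), and `pearson_r` is an illustrative helper, not a library function.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Sample correlation: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]   # a perfect, but nonlinear, relationship

r = pearson_r(xs, ys)
print(abs(r) < 1e-12)  # True: r is 0 up to floating-point rounding
```

The positive and negative products of deviations cancel because the parabola is symmetric, so r measures no linear trend even though the relationship is exact. This is why a scatter diagram should be drawn before interpreting r.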
Example 3: Estimating Correlation from a Scatter Plot
P 200, #6. Match the correlation coefficient to the scatter diagram.
(a) r = 0.969
(b) r = 0.049
(c) r = 1
(d) r = 0.992
[Figure: four scatter diagrams, labeled in order (c) r = 1, (d) r = 0.992, (b) r = 0.049, (a) r = 0.969.]
Example 4: Anticipating Correlation
P 205, #27. For each of the following statements, state whether you think the variables will have a positive correlation, negative correlation, or no correlation.
(a) Number of children in the household under the age of 3 and expenditures on diapers.
Positive correlation
(b) Interest rates on car loans and the number of cars sold.
Negative correlation
(c) Number of hours per week on the treadmill and cholesterol level.
Negative correlation
(d) Price of a Big Mac and the number of McDonald's french fries sold in a week.
Negative correlation
(e) Shoe size and IQ.
No correlation
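Calculating the Correlation Coefficient
A more efficient formula for computing the correlation coefficient is
$$ r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} $$
where
$$ S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} $$
$$ S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n} $$
$$ S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n} $$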
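For reference, the two forms of $S_{xx}$ agree because $\sum x_i = n\bar{x}$. The short derivation below is a standard identity added here as a sketch; $S_{yy}$ and $S_{xy}$ follow in the same way.

$$
\begin{aligned}
S_{xx} &= \sum (x_i - \bar{x})^2
        = \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2 \\
       &= \sum x_i^2 - 2n\bar{x}^2 + n\bar{x}^2
        = \sum x_i^2 - n\bar{x}^2
        = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}
\end{aligned}
$$

Since $s_x = \sqrt{S_{xx}/(n-1)}$ and $s_y = \sqrt{S_{yy}/(n-1)}$, the sum-of-z-scores formula for r reduces to the same quantity:

$$
r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{(n-1)\,s_x\,s_y}
  = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}
$$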
Example 5: Computing a Correlation
P 200, #8. Given the data:

x     y
2     5.7
3     5.2
5     2.8
6     1.9
6     2.2

(a) Draw a scatter diagram.
[Figure: scatter diagram of the data. Horizontal axis: x, 1 to 6. Vertical axis: y, 0 to 6.]
Example 5: Computing a Correlation
P 200, #8. Given the data:
(b) Compute the correlation coefficient.
Compute x², y², and xy. Sum all columns.

x     y      x²     y²      xy
2     5.7    4      32.49   11.4
3     5.2    9      27.04   15.6
5     2.8    25     7.84    14.0
6     1.9    36     3.61    11.4
6     2.2    36     4.84    13.2
22    17.8   110    75.82   65.6    (column sums)

Calculate Sxx, Syy, and Sxy.
$$ S_{xx} = 110 - \frac{22^2}{5} = 13.2 $$
$$ S_{yy} = 75.82 - \frac{17.8^2}{5} = 12.452 $$
$$ S_{xy} = 65.6 - \frac{(22)(17.8)}{5} = -12.72 $$
Calculate the correlation:
$$ r = \frac{-12.72}{\sqrt{(13.2)(12.452)}} = -0.99 $$
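The column sums and the three S quantities in Example 5 can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic; the variable names are chosen here for illustration.

```python
from math import sqrt

# Data from Example 5 (P 200, #8).
x = [2, 3, 5, 6, 6]
y = [5.7, 5.2, 2.8, 1.9, 2.2]
n = len(x)

# Column sums, as in the worked table.
sum_x = sum(x)                                 # 22
sum_y = sum(y)                                 # 17.8
sum_x2 = sum(v * v for v in x)                 # 110
sum_y2 = sum(v * v for v in y)                 # 75.82
sum_xy = sum(a * b for a, b in zip(x, y))      # 65.6

# Computational ("shortcut") formulas.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

r = s_xy / sqrt(s_xx * s_yy)
print(round(s_xx, 3), round(s_yy, 3), round(s_xy, 3), round(r, 2))
# 13.2 12.452 -12.72 -0.99
```

Values are rounded before printing because the decimal inputs are not represented exactly in floating point.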
Example 5: Computing a Correlation
P 200, #8. Given the data:
(c) Comment on the relationship between x and y.
The correlation coefficient indicates there is a strong negative linear relationship between x and y.
Example 6: Weight vs. Mileage Rating
P 202, #16. The data represent the weight of various domestic cars and their city mileage rating (in mpg) for the 2001 model year. These are the same data as in Example 1.
(c) What type of relation appears to exist between the weight of a car and its city mileage rating?
[Figure: scatter diagram "Weight vs. Mileage". Horizontal axis: Weight (lbs), 2000 to 4000. Vertical axis: City Mileage (MPG), 15 to 30.]
There is a negative linear relationship between weight and mileage.
Example 6: Weight vs. Mileage Rating
P 202, #16. The data represent the weight of various domestic cars and their city mileage rating (in mpg) for the 2001 model year. These are the same data as in Example 1.
(d) Compute the linear correlation coefficient between the weight of a car and its city mileage rating.
[Figure: scatter diagram "Weight vs. Mileage". Horizontal axis: Weight (lbs), 2000 to 4000. Vertical axis: City Mileage (MPG), 15 to 30.]
r = -0.92
Correlation & Causation
A word of caution when interpreting the correlation coefficient:
A linear correlation coefficient that indicates a strong positive or negative association, when computed from observational data, does not imply causation between the variables.
The predictor and response variables may both be determined by an unknown lurking variable.
If the data are obtained through a controlled experiment, a strong linear correlation can also be taken as evidence of causation.
Example 7: Brain Size and Intelligence
P 203, #21. Researchers interested in whether a person's brain size is related to mental capacity selected a sample of 20 students who had SAT scores higher than 1350 and administered an IQ test. Brain size was determined by an MRI scan.
(a) Use the TI-83 to draw a scatter diagram treating MRI count as the predictor variable and IQ as the response variable.

Gender   MRI Count   IQ
Female   816932      133
Female   951545      137
Female   991305      138
Female   833868      132
Female   856472      140
Female   852244      132
Female   790619      135
Female   866662      130
Female   857782      133
Female   948066      133
Male     949395      140
Male     1001121     140
Male     1038437     139
Male     965353      133
Male     955466      133
Male     1079549     141
Male     924059      135
Male     955003      139
Male     935494      141
Male     949589      144
Example 7: Brain Size and Intelligence
P 203, #21.
(b) Use the TI-83 to compute the correlation coefficient between the MRI count and IQ. Do they appear to be linearly related?
Example 7: Brain Size and Intelligence
P 203, #21.
(c) Gender is a lurking variable in the analysis. Draw separate scatter diagrams for each gender. What do you notice?
Example 7: Brain Size and Intelligence
P 203, #21.
(d) Calculate the correlation coefficient separately for males and females. Do you still believe that MRI count and IQ are linearly related?
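The slides use a TI-83 for parts (b) and (d). As an alternative sketch, the same correlations can be computed in Python from the Example 7 table. The helper `pearson_r` and the variable names are illustrative choices, not part of the exercise.

```python
from math import sqrt

# (MRI count, IQ) pairs transcribed from the Example 7 table.
female = [(816932, 133), (951545, 137), (991305, 138), (833868, 132),
          (856472, 140), (852244, 132), (790619, 135), (866662, 130),
          (857782, 133), (948066, 133)]
male = [(949395, 140), (1001121, 140), (1038437, 139), (965353, 133),
        (955466, 133), (1079549, 141), (924059, 135), (955003, 139),
        (935494, 141), (949589, 144)]


def pearson_r(pairs):
    """Sample linear correlation coefficient, r = Sxy / sqrt(Sxx * Syy)."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_yy = sum((y - y_bar) ** 2 for _, y in pairs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return s_xy / sqrt(s_xx * s_yy)


r_all = pearson_r(female + male)   # part (b): all 20 students pooled
r_female = pearson_r(female)       # part (d): females only
r_male = pearson_r(male)           # part (d): males only
print(f"all: {r_all:.2f}  female: {r_female:.2f}  male: {r_male:.2f}")
```

With these data the pooled coefficient comes out near 0.55, while the within-group coefficients are noticeably smaller, roughly 0.36 for females and 0.24 for males. Much of the pooled association reflects the difference between the two groups, which is the effect of the lurking variable described in part (c).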