Gerald Kruse, Ph.D. & Cathy Stenson, Ph.D.
Juniata College Mathematics Department
CityMPG = EPA's estimated miles per gallon for city driving
Weight = Weight of the car (in pounds)
FuelCapacity = Size of the gas tank (in gallons)
QtrMile = Time (in seconds) to go 1/4 mile from a standing start
Acc060 = Time (in seconds) to accelerate from zero to 60 mph
PageNum = Page number on which the car appears in the buying guide
Place the letter for each pair on the chart below to indicate your guess as to the direction (negative, neutral, or positive) and strength of the association between the two variables.
(a) Weight vs. CityMPG
(b) Weight vs. FuelCapacity
(c) PageNum vs. FuelCapacity
(d) Weight vs. QtrMile
(e) Acc060 vs. QtrMile
(f) CityMPG vs. QtrMile
Strong Negative
Moderate Negative
Weak Negative
No Association
Weak Positive
Moderate Positive
Strong Positive
[Figure: Matrix Plot - Car Data. Scatterplot matrix of CityMPG, Weight, FuelCap, QtrMile, Acc060, and PageNum.]
Placement of the pairs along the chart, from most negative to most positive: (a), (d), (c), (f), (b), (e)
Definition: The correlation, r, measures the strength of linear association between two quantitative variables.
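Measure of Correlation
$$ r = \frac{1}{n-1} \sum \left( \frac{X - \bar{X}}{S_X} \right) \left( \frac{Y - \bar{Y}}{S_Y} \right) $$
where
$\bar{X}$ = mean of X values
$\bar{Y}$ = mean of Y values
$S_X$ = Std Dev of X values
$S_Y$ = Std Dev of Y values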
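The formula for r can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `correlation` and the five data points are made up for the example and are not taken from the 1999 car data.

```python
import math

def correlation(xs, ys):
    """Sample correlation r: the sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    total = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical values for illustration only (NOT the Consumer Reports data)
weights = [2400, 2800, 3100, 3300, 3600]   # pounds
city_mpg = [27, 24, 22, 21, 20]            # miles per gallon

r = correlation(weights, city_mpg)
print(f"r = {r:.3f}")
```

Heavier cars in this made-up sample get fewer miles per gallon, so the computed r is negative and close to -1.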
Sample Correlations in 1999 Car Data
          CityMPG   Weight  FuelCap  QtrMile   Acc060
Weight     -0.907
FuelCap    -0.793    0.894
QtrMile     0.510   -0.450   -0.469
Acc060      0.506   -0.454   -0.465    0.994
PageNum     0.283   -0.237   -0.081    0.196    0.205
Strong Negative: r “between” -1.0 and -0.8
Moderate Negative: r “between” -0.8 and -0.5
Weak Negative: r “between” -0.5 and 0
No Association: r “around” 0
Weak Positive: r “between” 0 and 0.5
Moderate Positive: r “between” 0.5 and 0.8
Strong Positive: r “between” 0.8 and 1.0
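(a) Weight vs. CityMPG: r = -0.907
(d) Weight vs. QtrMile: r = -0.450
(c) PageNum vs. FuelCapacity: r = -0.081
(f) CityMPG vs. QtrMile: r = 0.510
(b) Weight vs. FuelCapacity: r = 0.894
(e) Acc060 vs. QtrMile: r = 0.994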
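As a rough sketch, the chart's ranges can be turned into a small lookup function. The function name `classify` and the cutoff `near_zero=0.1` for r “around” 0 are assumptions made for this example; the chart itself gives no exact cutoff.

```python
def classify(r, near_zero=0.1):
    """Map a correlation r to one of the chart's strength labels.

    near_zero is a hypothetical cutoff for "around 0"; the chart
    does not specify one.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size < near_zero:
        return "No Association"
    if size >= 0.8:
        strength = "Strong"
    elif size >= 0.5:
        strength = "Moderate"
    else:
        strength = "Weak"
    direction = "Positive" if r > 0 else "Negative"
    return f"{strength} {direction}"

# Sample correlations for the six pairs, from the 1999 car data table
pairs = {
    "(a) Weight vs. CityMPG": -0.907,
    "(b) Weight vs. FuelCapacity": 0.894,
    "(c) PageNum vs. FuelCapacity": -0.081,
    "(d) Weight vs. QtrMile": -0.450,
    "(e) Acc060 vs. QtrMile": 0.994,
    "(f) CityMPG vs. QtrMile": 0.510,
}

labels = {name: classify(r) for name, r in pairs.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```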
1) -1 ≤ r ≤ 1
2) The sign indicates the direction of association
   positive association: r > 0
   negative association: r < 0
   no linear association: r ≈ 0
3) The closer r is to ±1, the stronger the linear association
4) r has no units and does not depend on the units of measurement
5) The correlation between X and Y is the same as the correlation between Y and X
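Properties 4 and 5 can be checked numerically. The sketch below uses made-up values (not the 1999 car data) and a hypothetical helper named `correlation`; it converts weight from pounds to kilograms and swaps the roles of X and Y.

```python
import math

def correlation(xs, ys):
    """Sample correlation r computed from z-scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical values for illustration only
weights_lb = [2400, 2800, 3100, 3300, 3600]
city_mpg = [27, 24, 22, 21, 20]

# Property 4: changing units (pounds to kilograms) should not change r
weights_kg = [w * 0.45359237 for w in weights_lb]
r_lb = correlation(weights_lb, city_mpg)
r_kg = correlation(weights_kg, city_mpg)

# Property 5: swapping X and Y should not change r
r_swapped = correlation(city_mpg, weights_lb)

print(f"pounds:    r = {r_lb:.6f}")
print(f"kilograms: r = {r_kg:.6f}")
print(f"swapped:   r = {r_swapped:.6f}")
```

All three printed values should agree up to floating-point rounding.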
(0) faculty.juniata.edu/kruse
(1) Open the Excel file: ConsumerReportsCarData1999.xlsx
(2) Highlight column C, City MPG
(3) CTRL-click and highlight column F, Weight
(4) Insert -> Scatter -> Scatterplot
(5) Remove legend
(6) “Zoom” on axes
(7) Add axes titles
(8) Modify plot title, “City MPG vs. Weight”
(9) Add trendline
We were given that the r-value for this data is -0.907.
Excel calculated R² as 0.8225?
Let’s take the square root…
0.906918, which, if we round to three decimal places and attach the negative sign of the trendline's slope, gives -0.907, as we would expect.
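The arithmetic above can be reproduced with a short Python check. The variable names are chosen for this example; the only inputs are the R² value reported by Excel and the fact that the trendline slopes downward.

```python
import math

r_squared = 0.8225         # R² reported on the Excel trendline
slope_is_negative = True   # the City MPG vs. Weight trendline slopes downward

# The square root recovers the size of r; the slope supplies the sign
r = math.sqrt(r_squared)
if slope_is_negative:
    r = -r

print(f"r = {r:.6f}")
print(f"rounded: {round(r, 3)}")
```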
We could also calculate the r-value:
(1) using the Data Analysis Add-In in Excel
(2) by “hand,” in Excel
A correlation near zero does not (necessarily) mean that the two variables are unrelated.
EXAMPLE: A circus performer (the Human Cannonball) is interested in how the distance downrange (Y) that a projectile shot from a cannon will travel depends on the angle of elevation (X) of the cannon.
Suppose that we designed an experiment to examine this relationship by test firing (dummies) at various angles ranging from X = 0° to X = 90°. Sketch a typical scatterplot that you might expect to see from such an experiment.
Would you say that there is likely to be a strong relationship between angle X and distance downrange Y? Estimate the correlation between the X and Y variables from your scatterplot.
Remember: Correlation measures the strength of linear association between two variables.
http://stat.duke.edu/courses/Fall12/sta101.002/Sec2-145.pdf
[Figure: sketch of distance downrange Y versus angle of elevation X, with X running from 0 deg to 90 deg.]
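The cannonball example can be simulated under a simplifying assumption that is not part of the slides: an ideal projectile with no air resistance, whose range is proportional to sin(2X). The helper `correlation` and the choice of angles in 5-degree steps are also made up for this sketch.

```python
import math

def correlation(xs, ys):
    """Sample correlation r computed from z-scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Angles of elevation from 0 to 90 degrees in 5-degree steps (assumed design)
angles = list(range(0, 91, 5))

# Assumed ideal-projectile model: distance is proportional to sin(2 * angle)
distances = [math.sin(math.radians(2 * a)) for a in angles]

r = correlation(angles, distances)
print(f"r = {r:.6f}")
```

Under this model the distance is completely determined by the angle, yet r comes out essentially zero, because the curve rises to a peak at 45 degrees and then falls symmetrically. The relationship is strong but not linear.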
A strong correlation does not (necessarily) imply a cause/effect relationship.
Would you agree that there is a fairly strong negative association between these two variables? Given this association, would it be reasonable to set a foreign policy goal to send lots of TVs to the countries with the lowest life expectancies, thus decreasing the number of people per TV and thereby helping the inhabitants to live longer lives?
http://www.public.iastate.edu/~pcaragea/S226S09/Notes/student.notes.section2.4.pdf
R = -0.8038
A strong correlation does not (necessarily) imply a cause/effect relationship.
http://www.nbcnews.com/id/41479869/ns/health-diet_and_nutrition/t/daily-diet-soda-tied-higher-risk-stroke-heart-attack/
The following web page has a Java applet that can be used to construct scatterplots and calculate Pearson’s Correlation Coefficient.
http://illuminations.nctm.org/LessonDetail.aspx?ID=L456
1) Coefficient of Correlation lies between -1 and +1
2) Coefficients of Correlation are independent of Change of Origin and Scale
3) Coefficients of Correlation possess the property of Symmetry
4) Coefficient of Correlation measures only linear correlation between X and Y
5) If two variables X and Y are independent, the coefficient of correlation between them will be zero.