Gerald Kruse, Ph.D. & Cathy Stenson, Ph.D. Juniata College Mathematics Department

Post on 16-Jan-2016


Page 1

Gerald Kruse, Ph.D. & Cathy Stenson, Ph.D.

Juniata College Mathematics Department

Page 2
Page 3

CityMPG = EPA's estimated miles per gallon for city driving

Weight = Weight of the car (in pounds)

FuelCapacity = Size of the gas tank (in gallons)

QtrMile = Time (in seconds) to go 1/4 mile from a standing start

Acc060 = Time (in seconds) to accelerate from zero to 60 mph

PageNum = Page number on which the car appears in the buying guide

Page 4

Place the letter for each pair on the chart below to indicate your guess as to the direction (negative, neutral, or positive) and strength of the association between the two variables.

(a) Weight vs. CityMPG
(b) Weight vs. FuelCapacity
(c) PageNum vs. FuelCapacity
(d) Weight vs. QtrMile
(e) Acc060 vs. QtrMile
(f) CityMPG vs. QtrMile

Strong Negative
Moderate Negative
Weak Negative
No Association
Weak Positive
Moderate Positive
Strong Positive

Page 5

[Figure: "Matrix Plot - Car Data", a scatterplot matrix of CityMPG, Weight, FuelCap, QtrMile, Acc060, and PageNum.]

Page 6

Place the letter for each pair on the chart below to indicate your guess as to the direction (negative, neutral, or positive) and strength of the association between the two variables.

(a) Weight vs. CityMPG
(b) Weight vs. FuelCapacity
(c) PageNum vs. FuelCapacity
(d) Weight vs. QtrMile
(e) Acc060 vs. QtrMile
(f) CityMPG vs. QtrMile

Strong Negative
Moderate Negative
Weak Negative
No Association
Weak Positive
Moderate Positive
Strong Positive

Placement on the chart, from most negative to most positive: (a), (d), (c), (f), then (b) and (e) together.

Page 7

Measure of Correlation

Definition: The correlation, r, measures the strength of linear association between two quantitative variables.

$$r = \frac{1}{n-1} \sum \left(\frac{X - \bar{X}}{S_X}\right)\left(\frac{Y - \bar{Y}}{S_Y}\right)$$

Page 8

Measure of Correlation

$\bar{X}$ = mean of X values
$\bar{Y}$ = mean of Y values
$S_X$ = Std Dev of X values
$S_Y$ = Std Dev of Y values
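The definition of r can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual sample form of Pearson's r (sample standard deviations, with n - 1 in the denominator). The function name `pearson_r` and the five data points are hypothetical and are not taken from the 1999 car data.

```python
import statistics

def pearson_r(xs, ys):
    """Sample correlation: r = 1/(n-1) * sum(((x - mean_x)/s_x) * ((y - mean_y)/s_y))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    total = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys))
    return total / (n - 1)

# Hypothetical (made-up) weights in pounds and city MPG values, NOT the slide's data.
weights = [2400, 2700, 3000, 3300, 3600]
city_mpg = [28, 25, 23, 20, 19]

r = pearson_r(weights, city_mpg)
print(round(r, 3))
```

Each term in the sum is a product of two standardized deviations, so points where X and Y are both above or both below their means push r up, and points where they fall on opposite sides push r down.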

Page 9

Sample Correlations in 1999 Car Data

          CityMPG   Weight   FuelCap   QtrMile   Acc060
Weight     -0.907
FuelCap    -0.793    0.894
QtrMile     0.510   -0.450    -0.469
Acc060      0.506   -0.454    -0.465     0.994
PageNum     0.283   -0.237    -0.081     0.196    0.205

Page 10

Place the letter for each pair on the chart below to indicate your guess as to the direction (negative, neutral, or positive) and strength of the association between the two variables.

(a) Weight vs. CityMPG
(b) Weight vs. FuelCapacity
(c) PageNum vs. FuelCapacity
(d) Weight vs. QtrMile
(e) Acc060 vs. QtrMile
(f) CityMPG vs. QtrMile

Strong Negative, r “between” -1.0 and -0.8: (a) = -0.907
Moderate Negative, r “between” -0.8 and -0.5
Weak Negative, r “between” -0.5 and 0: (d) = -0.450
No Association, r “around” 0: (c) = -0.081
Weak Positive, r “between” 0 and 0.5
Moderate Positive, r “between” 0.5 and 0.8: (f) = 0.510
Strong Positive, r “between” 0.8 and 1.0: (b) = 0.894, (e) = 0.994
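The ranges on the chart can be written as a small lookup, sketched below. The slide leaves "around 0" unquantified, so the `near_zero=0.1` cutoff is an assumption made only for this illustration, as is the choice to assign boundary values (exactly 0.5 or 0.8) to the stronger category. The function name `classify` is hypothetical.

```python
def classify(r, near_zero=0.1):
    """Map r onto the chart's verbal scale.

    `near_zero` is an assumed cutoff for "around 0"; the slide gives no number.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    if abs(r) < near_zero:
        return "No Association"
    direction = "Positive" if r > 0 else "Negative"
    size = abs(r)
    if size >= 0.8:
        strength = "Strong"
    elif size >= 0.5:
        strength = "Moderate"
    else:
        strength = "Weak"
    return f"{strength} {direction}"

# Sample correlations for the six pairs, as reported on the slides.
pairs = {
    "(a) Weight vs. CityMPG": -0.907,
    "(b) Weight vs. FuelCapacity": 0.894,
    "(c) PageNum vs. FuelCapacity": -0.081,
    "(d) Weight vs. QtrMile": -0.450,
    "(e) Acc060 vs. QtrMile": 0.994,
    "(f) CityMPG vs. QtrMile": 0.510,
}

for label, r in pairs.items():
    print(f"{label}: r = {r:+.3f} -> {classify(r)}")
```

The verbal labels are conventions rather than sharp boundaries, which is why the slide puts "between" and "around" in quotation marks.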

Page 11

1) -1 ≤ r ≤ 1

2) The sign indicates the direction of association:
positive association: r > 0
negative association: r < 0
no linear association: r ≈ 0

3) The closer r is to ±1, the stronger the linear association

4) r has no units and does not depend on the units of measurement

5) The correlation between X and Y is the same as the correlation between Y and X
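Properties 4 and 5 can be checked numerically. The sketch below uses hypothetical data (not the 1999 car data) and an algebraically equivalent form of Pearson's r written with sums of squares and cross-products. It converts weights from pounds to kilograms and swaps the roles of X and Y.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r as S_xy / sqrt(S_xx * S_yy), equivalent to the standardized form."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical (made-up) values for illustration only.
weights_lb = [2400, 2700, 3000, 3300, 3600]
city_mpg = [28, 25, 23, 20, 19]

# Property 4: changing units (pounds -> kilograms) leaves r unchanged.
weights_kg = [w * 0.45359237 for w in weights_lb]
r_lb = pearson_r(weights_lb, city_mpg)
r_kg = pearson_r(weights_kg, city_mpg)

# Property 5: swapping X and Y leaves r unchanged.
r_swapped = pearson_r(city_mpg, weights_lb)

print(r_lb, r_kg, r_swapped)
```

The unit change cancels because the same conversion factor appears in both the numerator and the denominator of r.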

Page 12

(0) faculty.juniata.edu/kruse
(1) Open the Excel file: ConsumerReportsCarData1999.xlsx
(2) Highlight column C, City MPG
(3) CTRL-click and highlight column F, Weight
(4) Insert -> Scatter -> Scatterplot
(5) Remove legend
(6) “Zoom” on axes
(7) Add axes titles
(8) Modify plot title, “City MPG vs. Weight”
(9) Add trendline

Page 13

We were given that the r-value for this data is -0.907.

Excel calculated R² as 0.8225.

Let’s take the square root: √0.8225 ≈ 0.906918. Rounding to three decimal places and attaching the negative sign of the trendline's slope gives r ≈ -0.907, which is what we would expect.

We could also calculate the r-value:
(1) using the Data Analysis Add-In in Excel
(2) by “hand,” in Excel
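The link between a trendline's R² and r can be sketched outside Excel as well. The code below is an illustration, not Excel's implementation: it fits a least-squares line, computes R², and recovers r by taking the square root and attaching the sign of the slope. The helper names `fit_line` and `r_from_fit` and the five data points are hypothetical; only the value R² = 0.8225 comes from the slide.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope, intercept, and R^2 for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_res / s_yy
    return slope, intercept, r_squared

def r_from_fit(slope, r_squared):
    """Recover r: magnitude sqrt(R^2), sign taken from the trendline slope."""
    return math.copysign(math.sqrt(r_squared), slope)

# The slide's value: R^2 = 0.8225 with a downward-sloping trendline.
# Only the sign of the slope argument matters here.
r_slide = r_from_fit(-1.0, 0.8225)
print(round(r_slide, 3))  # -0.907

# Hypothetical (made-up) data to show the full round trip.
weights = [2400, 2700, 3000, 3300, 3600]
city_mpg = [28, 25, 23, 20, 19]
slope, intercept, r2 = fit_line(weights, city_mpg)
r_check = r_from_fit(slope, r2)
print(slope < 0, round(r2, 4), round(r_check, 4))
```

The square root alone always returns a non-negative number, so the direction of the association has to come from the slope.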

Page 14

A correlation near zero does not (necessarily) mean that the two variables are unrelated.

EXAMPLE: A circus performer (the Human Cannonball) is interested in how the distance downrange (Y) that a projectile shot from a cannon will travel depends on the angle of elevation (X) of the cannon.

Suppose that we designed an experiment to examine this relationship by test firing (dummies) at various angles ranging from X = 0° to X = 90°. Sketch a typical scatterplot that you might expect to see from such an experiment.

Would you say that there is likely to be a strong relationship between angle X and distance downrange Y? Estimate the correlation between the X and Y variables from your scatterplot.

Remember: Correlation measures the strength of linear association between two variables.

http://stat.duke.edu/courses/Fall12/sta101.002/Sec2-145.pdf

[Figure: sketch of distance downrange Y against angle of elevation X, with X running from 0° to 90°.]
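The cannonball example can be simulated under a simplifying assumption that the slide does not state: an ideal projectile on level ground with no air resistance, whose range is v² sin(2θ) / g. The launch speed of 30 m/s, the function names, and the 5-degree spacing of the test angles are all hypothetical choices for this sketch.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r as S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def downrange_distance(angle_deg, speed=30.0, g=9.81):
    """Ideal projectile range on level ground, ignoring air resistance (assumed model)."""
    theta = math.radians(angle_deg)
    return speed ** 2 * math.sin(2 * theta) / g

# Test firings every 5 degrees from 0 to 90.
angles = list(range(0, 91, 5))
distances = [downrange_distance(a) for a in angles]

r_all = pearson_r(angles, distances)
print(f"r over 0 to 90 degrees: {r_all:.6f}")

# Restricting to 0 to 45 degrees, where distance only increases with angle.
r_first_half = pearson_r(angles[:10], distances[:10])
print(f"r over 0 to 45 degrees: {r_first_half:.3f}")
```

Under this model the distance is completely determined by the angle, yet the rise up to 45° and the matching fall after it cancel in the calculation of r, so the correlation over the full range is essentially zero.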

Page 15

A strong correlation does not (necessarily) imply a cause/effect relationship.

Would you agree that there is a fairly strong negative association between these two variables (people per TV and life expectancy)? Given this association, would it be reasonable to set a foreign policy goal to send lots of TVs to the countries with the lowest life expectancies, thus decreasing the number of people per TV and thereby helping the inhabitants to live longer lives?

http://www.public.iastate.edu/~pcaragea/S226S09/Notes/student.notes.section2.4.pdf

r = -0.8038
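One way such a correlation can arise without any cause/effect link is through a third, lurking variable. The simulation below is purely synthetic and is not the data behind the slide: it assumes a hypothetical "wealth" score that lowers people per TV and raises life expectancy, with no direct link between the two. All variables are in standardized, made-up units.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson's r as S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

random.seed(0)
n = 2000

# Hypothetical lurking variable; the slide does not name one.
wealth = [random.gauss(0, 1) for _ in range(n)]

# Each variable depends on wealth plus its own independent noise.
people_per_tv = [-w + 0.5 * random.gauss(0, 1) for w in wealth]
life_expectancy = [w + 0.5 * random.gauss(0, 1) for w in wealth]

r_observed = pearson_r(people_per_tv, life_expectancy)

# Remove the wealth component (known exactly here because we built the data).
tv_residual = [t + w for t, w in zip(people_per_tv, wealth)]
life_residual = [y - w for y, w in zip(life_expectancy, wealth)]
r_adjusted = pearson_r(tv_residual, life_residual)

print(f"observed r: {r_observed:.3f}")
print(f"r after removing wealth: {r_adjusted:.3f}")
```

In this construction, sending TVs would change people per TV without touching the wealth score, so life expectancy would not move.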

Page 16

A strong correlation does not (necessarily) imply a cause/effect relationship.

http://www.nbcnews.com/id/41479869/ns/health-diet_and_nutrition/t/daily-diet-soda-tied-higher-risk-stroke-heart-attack/

Page 17

The following web page has a Java applet that can be used to construct scatterplots and calculate Pearson’s correlation coefficient.

http://illuminations.nctm.org/LessonDetail.aspx?ID=L456

Page 18

1) The coefficient of correlation lies between -1 and +1.

2) The coefficient of correlation is independent of change of origin and of (positive) change of scale.

3) The coefficient of correlation is symmetric: the correlation of X with Y equals the correlation of Y with X.

4) The coefficient of correlation measures only linear correlation between X and Y.

5) If two variables X and Y are independent, the coefficient of correlation between them is zero. The converse does not hold: a correlation of zero does not by itself show that X and Y are independent.