Chapters 14 and 15 – Linear Regression and Correlation

Upload: kristian-tenpenny

Post on 14-Dec-2015

Page 1: Chapters 14 and 15 – Linear Regression and Correlation

Chapters 14 and 15 – Linear Regression and Correlation

Page 2: Chapters 14 and 15 – Linear Regression and Correlation

Contingency tables are useful for displaying information on two qualitative variables.
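As a rough illustration of what a contingency table holds, the sketch below counts how often each pair of categories occurs. The survey responses, category names, and the helper `contingency_table` are all hypothetical and are not taken from the slides.

```python
from collections import Counter

# Hypothetical responses: (class year, preferred study method).
responses = [
    ("Freshman", "Alone"), ("Freshman", "Group"), ("Senior", "Alone"),
    ("Senior", "Alone"), ("Freshman", "Group"), ("Senior", "Group"),
]

# Count how many times each (row category, column category) pair appears.
counts = Counter(responses)
rows = sorted({r for r, _ in responses})   # ['Freshman', 'Senior']
cols = sorted({c for _, c in responses})   # ['Alone', 'Group']

def contingency_table(counts, rows, cols):
    """Return the counts as a list of rows, one inner list per row category."""
    return [[counts[(r, c)] for c in cols] for r in rows]

table = contingency_table(counts, rows, cols)

# Print the table with row and column labels.
print("".ljust(10) + "".join(c.ljust(8) for c in cols))
for label, row in zip(rows, table):
    print(label.ljust(10) + "".join(str(v).ljust(8) for v in row))
```

Each cell is simply a count of observations that fall into that combination of the two qualitative variables.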

Page 3: Chapters 14 and 15 – Linear Regression and Correlation

Scatter plots are useful for displaying information on two quantitative variables.

Page 4: Chapters 14 and 15 – Linear Regression and Correlation

What type of relationship is present in the following scatter plot?

A. No relationship
B. Linear relationship
C. Quadratic relationship
D. Other type of relationship

Page 5: Chapters 14 and 15 – Linear Regression and Correlation

What type of relationship is present in the following scatter plot?

A. No relationship
B. Linear relationship
C. Quadratic relationship
D. Other type of relationship

Page 6: Chapters 14 and 15 – Linear Regression and Correlation

What type of relationship is present in the following scatter plot?

A. No relationship
B. Linear relationship
C. Quadratic relationship
D. Other type of relationship

Page 7: Chapters 14 and 15 – Linear Regression and Correlation

What type of relationship is present in the following scatter plot?

A. No relationship
B. Linear relationship
C. Quadratic relationship
D. Other type of relationship

Page 8: Chapters 14 and 15 – Linear Regression and Correlation

What type of relationship is present in the following scatter plot?

A. No relationship
B. Linear relationship
C. Quadratic relationship
D. Other type of relationship

Page 9: Chapters 14 and 15 – Linear Regression and Correlation

What type of relationship is present in the following scatter plot?

A. No relationship
B. Linear relationship
C. Quadratic relationship
D. Other type of relationship

Page 10: Chapters 14 and 15 – Linear Regression and Correlation

What type of relationship is present in the following scatter plot?

A. No relationship
B. Linear relationship
C. Quadratic relationship
D. Other type of relationship

Page 11: Chapters 14 and 15 – Linear Regression and Correlation

We can quantify how strong the linear relationship is by calculating a correlation coefficient.

The formula is:
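One common way to write the sample correlation coefficient (Pearson's r) for n data pairs is shown below. The symbols for the sample means, \bar{x} and \bar{y}, are notation assumed here; other algebraically equivalent forms exist.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```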

It is easier to let technology do the calculation!

Page 12: Chapters 14 and 15 – Linear Regression and Correlation

You have multiple options:
• Calculator
• Minitab
• Excel
• Websites

Page 13: Chapters 14 and 15 – Linear Regression and Correlation

Calculation example

Correlation Coefficient = -0.492

Correlation Coefficient is abbreviated by r.

r = -0.492

x   y
5   7
2   6
3   8
6   4
5   5
4   6
2   6
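For readers who want to see the calculation done by software, here is a minimal Python sketch that computes r for the seven data pairs above. The function name `correlation` is an illustrative choice; it uses the standard Pearson formula rather than any particular package.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [5, 2, 3, 6, 5, 4, 2]
y = [7, 6, 8, 4, 5, 6, 6]

r = correlation(x, y)
print(round(r, 3))  # -0.492
```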

Page 14: Chapters 14 and 15 – Linear Regression and Correlation

TI Calculator: Type the x data into L1 and the y data into L2, then go to VARS -> Statistics -> EQ -> r.

r = -0.492

Note: it does not matter which is the x data and which is the y data for computing r.

x   y
5   7
2   6
3   8
6   4
5   5
4   6
2   6
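The note that the roles of x and y do not matter can be checked directly. The sketch below (with an illustrative helper named `correlation`) computes r both ways for the same data and compares the results.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

x = [5, 2, 3, 6, 5, 4, 2]
y = [7, 6, 8, 4, 5, 6, 6]

r_xy = correlation(x, y)  # x treated as the first variable
r_yx = correlation(y, x)  # roles swapped

print(round(r_xy, 3), round(r_yx, 3))
```

The formula is symmetric in the two variables, which is why swapping them leaves r unchanged.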

Page 15: Chapters 14 and 15 – Linear Regression and Correlation

Consider the following data:

A. r = -0.734
B. r = 0.538
C. r = 0.734
D. r = 0.466
E. r = -0.538

x    y
2    14
14   10
57   28
14   16
23   16
56   18
8    1

Page 16: Chapters 14 and 15 – Linear Regression and Correlation

Consider the following data:

A. r = -0.034
B. r = -0.724
C. r = -0.545
D. r = -0.983
E. r = -0.241

x1   x2
0    -4
8    -6
7    -8
9    -9
5    -5
8    -8
8    -7
6    -9

Page 17: Chapters 14 and 15 – Linear Regression and Correlation

Properties of the Correlation Coefficient

• If r < 0, then there is a negative relationship between the two variables.
• If r > 0, then there is a positive relationship between the two variables.
• r only measures a linear relationship.
• The greater |r|, the stronger the relationship.
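To illustrate the point that r only measures a linear relationship, the sketch below uses made-up data that follow y = x² exactly. The data and the helper `correlation` are illustrative assumptions, not from the slides. The two variables are perfectly related, yet the relationship is not linear, so r comes out at zero.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a perfect quadratic relationship, symmetric about x = 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = correlation(x, y)
print(r)  # essentially zero even though y is completely determined by x
```

A value of r near 0 therefore means "no linear relationship", not "no relationship".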

Page 18: Chapters 14 and 15 – Linear Regression and Correlation

The correlation coefficient is 0.734. There is a positive relationship.

x    y
2    14
14   10
57   28
14   16
23   16
56   18
8    1

Page 19: Chapters 14 and 15 – Linear Regression and Correlation

The correlation coefficient is -0.724. There is a negative relationship.

x1   x2
0    -4
8    -6
7    -8
9    -9
5    -5
8    -8
8    -7
6    -9
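Both of the preceding results can be reproduced with a short script. The sketch below recomputes r for the two data sets and reads the direction of the relationship from its sign; the helper names `correlation` and `direction` are illustrative.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

def direction(r):
    """Describe the sign of r in words."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "no linear"

# First data set (x, y).
x = [2, 14, 57, 14, 23, 56, 8]
y = [14, 10, 28, 16, 16, 18, 1]

# Second data set (x1, x2).
x1 = [0, 8, 7, 9, 5, 8, 8, 6]
x2 = [-4, -6, -8, -9, -5, -8, -7, -9]

r_a = correlation(x, y)
r_b = correlation(x1, x2)

print(round(r_a, 3), direction(r_a))  # 0.734 positive
print(round(r_b, 3), direction(r_b))  # -0.724 negative
```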

Page 20: Chapters 14 and 15 – Linear Regression and Correlation

Guess the correlation

A. r = -0.821
B. r = -0.759
C. r = 0.388
D. r = 0.674
E. r = 0.983

r = 0.983

Page 21: Chapters 14 and 15 – Linear Regression and Correlation

Guess the correlation

A. r = 0.121
B. r = 0.372
C. r = 0.644
D. r = 0.865
E. r = 0.978

r = 0.865

Page 22: Chapters 14 and 15 – Linear Regression and Correlation

Guess the correlation

A. r = 0.372
B. r = 0.522
C. r = 0.644
D. r = 0.865
E. r = 0.978

r = 0.522

Page 23: Chapters 14 and 15 – Linear Regression and Correlation

Guess the correlation

A. r = -0.034
B. r = -0.299
C. r = -0.438
D. r = -0.601
E. r = -0.894

r = -0.601

Page 24: Chapters 14 and 15 – Linear Regression and Correlation

Guess the correlation

A. r = -0.004
B. r = -0.156
C. r = -0.441
D. r = -0.699
E. r = -0.923

r = -0.156

Page 25: Chapters 14 and 15 – Linear Regression and Correlation

Guess the correlation

A. r = 0.7484
B. r = 0.3156
C. r = 0.0116
D. r = -0.2994
E. r = -0.6235

r = 0.0116

Page 26: Chapters 14 and 15 – Linear Regression and Correlation

Guess the correlation

A. r = 0.7484
B. r = 0.2676
C. r = 0.0018
D. r = -0.1944
E. r = -0.7588

r = 0.0018

Page 27: Chapters 14 and 15 – Linear Regression and Correlation
Page 28: Chapters 14 and 15 – Linear Regression and Correlation

Fill in the blank: If one variable tends to increase linearly as the other variable increases, the variables are __________ correlated.

A. Positively
B. Negatively
C. Not

Page 29: Chapters 14 and 15 – Linear Regression and Correlation

Fill in the blank: If one variable tends to increase linearly as the other variable decreases, the variables are __________ correlated.

A. Positively
B. Negatively
C. Not

Page 30: Chapters 14 and 15 – Linear Regression and Correlation

If there is a correlation (relationship) between two variables, it does not necessarily mean there is a causal relationship between the two variables (one variable affects the other).

Page 31: Chapters 14 and 15 – Linear Regression and Correlation

If there is a correlation (relationship) between two variables, it does not necessarily mean there is a causal relationship between the two variables (one variable affects the other).

Page 32: Chapters 14 and 15 – Linear Regression and Correlation

Nobel Prize and McDonalds data set

The correlation coefficient of this data set is closest to what value?

A. -0.999
B. 0.999
C. 0.099
D. -0.099

Country          Nobel Prize Count   McDonalds Count
Austria          11                  148
Czech Republic   2                   60
Denmark          13                  99
Finland          2                   93
Greece           2                   48
Hungary          3                   76
Iceland          1                   3
Ireland          5                   62
Luxembourg       0                   6
Norway           8                   55
Portugal         2                   91
Slovakia         2                   10
Turkey           0                   133
United States    270                 12804

Page 33: Chapters 14 and 15 – Linear Regression and Correlation

The correlation between the number of Nobel Prizes awarded and number of McDonald’s Restaurants for select countries is strong. Therefore, we can correctly conclude that if a country were to build more McDonald’s Restaurants its inhabitants would be more likely to receive Nobel Prizes.

A. True
B. False

Country          Nobel Prize Count   McDonalds Count
Austria          11                  148
Czech Republic   2                   60
Denmark          13                  99
Finland          2                   93
Greece           2                   48
Hungary          3                   76
Iceland          1                   3
Ireland          5                   62
Luxembourg       0                   6
Norway           8                   55
Portugal         2                   91
Slovakia         2                   10
Turkey           0                   133
United States    270                 12804

Page 34: Chapters 14 and 15 – Linear Regression and Correlation

Nobel Prize and McDonalds data set

A confounding variable is a variable that is not accounted for that can affect both variables being studied.

Country          Nobel Prize Count   McDonalds Count
Austria          11                  148
Czech Republic   2                   60
Denmark          13                  99
Finland          2                   93
Greece           2                   48
Hungary          3                   76
Iceland          1                   3
Ireland          5                   62
Luxembourg       0                   6
Norway           8                   55
Portugal         2                   91
Slovakia         2                   10
Turkey           0                   133
United States    270                 12804
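The strength of the correlation in the Nobel Prize and McDonalds data can be checked with a few lines of Python. This is only a sketch using the counts listed in the table and an illustrative helper named `correlation`. Note that the calculation uses nothing but the two columns of counts, so a large r cannot by itself say anything about cause or about confounding variables.

```python
from math import sqrt

# (country, Nobel Prize count, McDonalds count) as listed in the table.
data = [
    ("Austria", 11, 148), ("Czech Republic", 2, 60), ("Denmark", 13, 99),
    ("Finland", 2, 93), ("Greece", 2, 48), ("Hungary", 3, 76),
    ("Iceland", 1, 3), ("Ireland", 5, 62), ("Luxembourg", 0, 6),
    ("Norway", 8, 55), ("Portugal", 2, 91), ("Slovakia", 2, 10),
    ("Turkey", 0, 133), ("United States", 270, 12804),
]

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

nobel = [row[1] for row in data]
mcdonalds = [row[2] for row in data]

r = correlation(nobel, mcdonalds)
print(round(r, 3))  # very close to 1
```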

Page 35: Chapters 14 and 15 – Linear Regression and Correlation

Recall the equation of a line is y = mx + b, where m is the slope of the line and b is the y-intercept.

In statistics we use this notation: y = β₀ + β₁x, where β₁ is the slope and β₀ is the y-intercept. The values of β₀ and β₁ are unknown and must be estimated from the data.

Page 36: Chapters 14 and 15 – Linear Regression and Correlation

The values of the intercept β₀ and the slope β₁ are unknown and are estimated using a method called “least squares.” This method picks the line that minimizes the sum of the squared errors of all the data points.

What is an error?

Page 37: Chapters 14 and 15 – Linear Regression and Correlation

An error is the vertical distance between a data point and the line and is abbreviated as ε.

Page 38: Chapters 14 and 15 – Linear Regression and Correlation

The method of least squares picks the line that makes the sum of the squared errors, Σ ε², as small as possible.
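To make the quantity being minimized concrete, the sketch below computes the sum of squared errors for two candidate lines, using the small data set from the earlier calculation example. The function name `sum_squared_errors` and both candidate lines are illustrative assumptions, not taken from the slides.

```python
def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances between the points and a line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

x = [5, 2, 3, 6, 5, 4, 2]
y = [7, 6, 8, 4, 5, 6, 6]

# Candidate 1: a flat line at y = 6 (hypothetical choice).
sse_flat = sum_squared_errors(x, y, intercept=6.0, slope=0.0)

# Candidate 2: a downward-sloping line y = 7.5 - 0.4x (hypothetical choice).
sse_tilted = sum_squared_errors(x, y, intercept=7.5, slope=-0.4)

print(sse_flat, round(sse_tilted, 2))  # 10.0 7.59
```

Of these two candidates, the second has the smaller sum of squared errors, so least squares would prefer it. The least squares line itself is the one line that no other candidate can beat on this measure.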

Page 39: Chapters 14 and 15 – Linear Regression and Correlation

We will let computers calculate the line of best fit (also called the least squares line) because deriving it requires multivariable calculus.
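As a sketch of what the computer does, the calculus leads to simple closed-form results: the estimated slope is the sum of the products of deviations divided by the sum of squared x deviations, and the line passes through the point of means. The function name `least_squares_line` is illustrative, and the data are the seven pairs from the earlier calculation example.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope

x = [5, 2, 3, 6, 5, 4, 2]
y = [7, 6, 8, 4, 5, 6, 6]

intercept, slope = least_squares_line(x, y)
print(round(intercept, 3), round(slope, 3))  # 7.558 -0.404
```

The negative slope agrees with the negative correlation (r = -0.492) found for the same data.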

Page 40: Chapters 14 and 15 – Linear Regression and Correlation

The regression line below is a poor fit of the data and results in high error.

Page 41: Chapters 14 and 15 – Linear Regression and Correlation

The regression line below is a better fit of the data and results in lower error.

Page 42: Chapters 14 and 15 – Linear Regression and Correlation

The regression line below is the line of best fit or the least squares line.

Page 43: Chapters 14 and 15 – Linear Regression and Correlation

Review of properties of a line! Consider: where x measures time in hours and y measures distance in miles. The interpretation of the slope is?

A. An increase of 1 hour results in an increase of 2.4 miles.

B. A decrease of 1 hour results in an increase of 2.4 miles.

C. A decrease of 2.4 miles results in a decrease of 1 mile.

D. An increase of 2.4 miles results in a decrease of 1 mile.

Page 44: Chapters 14 and 15 – Linear Regression and Correlation

A study looked at the weight (in hundreds of pounds) and mpg of 82 vehicles. Following is the scatter plot:

Page 45: Chapters 14 and 15 – Linear Regression and Correlation

The line of best fit is: MPG = 68.2 - 1.11 Weight

Page 46: Chapters 14 and 15 – Linear Regression and Correlation

The line of best fit is: MPG = 68.2 - 1.11 Weight. What does the slope tell us?

A. An increase in mpg of 1 results in an increase in weight of 111 pounds.

B. A decrease in mpg of 1 results in an increase in weight of 111 pounds.

C. An increase in weight of 100 pounds results in a decrease in gas mileage of 1.11 mpg.

D. An increase in weight of 100 pounds results in an increase in gas mileage of 1.11 mpg.

Page 47: Chapters 14 and 15 – Linear Regression and Correlation

The line of best fit is: MPG = 68.2 - 1.11 Weight. What does the y-intercept tell us?

A. A car with a weight of 0 lbs gets 68.2 mpg

B. A car with a weight of 100 lbs gets 68.2 mpg

C. A car with a weight of 1000 lbs gets 68.2 mpg

D. A car with a weight of 2000 lbs gets 68.2 mpg
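The slope and y-intercept of the fitted line MPG = 68.2 - 1.11 Weight can be explored numerically. In the sketch below, the function name `predicted_mpg` and the example weights of 30 and 31 (that is, 3,000 and 3,100 pounds) are illustrative choices; weight is in hundreds of pounds, as in the study.

```python
def predicted_mpg(weight_hundreds_lb):
    """Predicted gas mileage from the fitted line MPG = 68.2 - 1.11 * Weight."""
    return 68.2 - 1.11 * weight_hundreds_lb

# y-intercept: the predicted mpg when Weight is 0.
at_zero = predicted_mpg(0)

# Slope: the change in predicted mpg for one more unit of Weight (100 pounds).
change_per_100_lb = predicted_mpg(31) - predicted_mpg(30)

print(at_zero)                      # 68.2
print(round(change_per_100_lb, 2))  # -1.11
```

The intercept is the prediction at a weight of 0, which no real vehicle has, so it mainly serves to position the line rather than describe an actual car.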

Page 48: Chapters 14 and 15 – Linear Regression and Correlation

Consider the following data set and graph. The graph is of this data.

A. True
B. False

X   Y
6   5
2   7
7   5
3   6
6   3
7   8
4   4
7   4

[Figure: Scatterplot of Y vs X. X-axis ticks run from 2 to 7; Y-axis ticks run from 3 to 7.]

Page 49: Chapters 14 and 15 – Linear Regression and Correlation

A direct relationship means an increase in one variable tends to go with an increase in the other. This is also called a positive correlation.

An inverse relationship means an increase in one variable tends to go with a decrease in the other. This is also called a negative correlation.

Page 50: Chapters 14 and 15 – Linear Regression and Correlation

A. There is a negative correlation between the two variables which indicates a direct relationship between femur length and horse height.

B. There is a positive correlation between the two variables which indicates an inverse relationship between femur length and horse height.

C. There is a negative relationship between the two variables which indicates an inverse relationship between femur length and horse height.

D. There is a positive relationship between the two variables which indicates a direct relationship between femur length and horse height.

E. None of the above

[Figure: "Equestrian Quantification", a scatter plot of Horse Height (hands) versus Femur Length (cm). X-axis ticks run from 50 to 100; Y-axis ticks run from 10 to 18.]