What is a Single Linear Regression

Single Linear Regression Conceptual Explanation

Upload: byu-center-for-teaching-learning

Post on 23-Jun-2015


Category:

Education



TRANSCRIPT

Page 1: What is a Single Linear Regression

Single Linear Regression

Conceptual Explanation

Page 4: What is a Single Linear Regression

• Welcome to this explanation of Single Linear Regression.

• Single linear regression is an extension of correlation.

[Diagram: Correlation extends to Single Linear Regression]

Page 9: What is a Single Linear Regression

• Correlation is designed to render a single coefficient that represents the strength and direction of the relationship between two variables.

+.99: As one variable increases, the other increases.

This coefficient represents an almost perfect positive correlation, or relationship, between these two variables.

Page 13: What is a Single Linear Regression

Ave Daily Temp: 50°, 60°, 70°, 80°, 90°

-.99: As one variable decreases, the other increases.

This coefficient represents an almost perfect negative correlation, or relationship, between these two variables.

Page 22: What is a Single Linear Regression

• Single linear regression uses that information to predict the value of one variable (ice cream sales) based on the given value of the other variable (temperature).

• For example: If the following data set were real, what would you predict ice cream sales would be when the temperature reaches 100°?

Ave Daily Ice Cream Sales    Ave Daily Temp
?                            100°
560                          90°
480                          80°
350                          70°
320                          60°
230                          50°

• Rather than simply examining the relationship between the variables (as is the case with the Pearson Product Moment Correlation), one variable is used as the predictor (temperature) and the other is used as the outcome, or predicted variable (ice cream sales).

• Linear regression makes it possible to estimate a value like 630 for the missing entry.
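The estimate of roughly 630 can be checked with a short least-squares sketch. This is only an illustration under stated assumptions: the helper name `fit_line` is hypothetical, the temperatures are read as 50° to 90°, and the slide does not say how its value of 630 was obtained. Fitting the five known rows of the table gives a prediction of about 634 at 100°, which is in line with "a value like 630".

```python
# Illustrative least-squares fit to the slide's made-up temperature/sales table.
temps = [50, 60, 70, 80, 90]          # Ave Daily Temp (degrees)
sales = [230, 320, 350, 480, 560]     # Ave Daily Ice Cream Sales


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(temps, sales)
predicted_100 = intercept + slope * 100   # predicted sales at 100 degrees
print(round(intercept, 1), round(slope, 1), round(predicted_100))
```

The predictor (temperature) goes in as `xs` and the outcome (sales) as `ys`; swapping them would fit a different line.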

Page 26: What is a Single Linear Regression

• In some cases, which variable is considered the predictor and which the outcome is arbitrary, as with measures of depression and anxiety.

Composite Depression Score    Composite Anxiety Score
33                            103
26                            100
22                            92
14                            74
12                            52
6                             26

• It’s not clear which influences which. Most likely depression and anxiety mutually influence one another.

Page 30: What is a Single Linear Regression

• In some cases, either by theory or by the nature of the research design, one variable will be rationally defined as the predictor and the other as the outcome.

Ave Daily Exposure to Sunlight    Levels of Vitamin E after two months
3.3 hrs                           10.3 units
2.6 hrs                           8.1 units
2.2 hrs                           7.3 units
1.4 hrs                           7.0 units
1.2 hrs                           6.8 units
0.6 hrs                           5.7 units

In this example, exposure to sunlight may impact levels of Vitamin E. But levels of Vitamin E would not impact the amount of sunlight one gets.

Page 33: What is a Single Linear Regression

• An easy way to conceptualize single linear regression is to create a scatterplot in Cartesian space.

Let’s plot the following data set:

Composite Depression Score    Composite Anxiety Score
33                            103
26                            100
22                            92
14                            74
12                            52
6                             26

Page 37: What is a Single Linear Regression

• First, we assign the predictor variable to the X axis, which in this case we’ll arbitrarily say is depression.

• ... and the outcome variable to the Y axis, which we’ll arbitrarily say is anxiety.

[Scatterplot axes: "Relationship between Depression & Anxiety"; X axis: Depression (0 to 35); Y axis: Anxiety (0 to 120)]

Page 51: What is a Single Linear Regression

• Now, let’s plot each point:

Depression: 33, 26, 22, 14, 12, 6
Anxiety: 103, 100, 92, 74, 52, 26

Points plotted in turn: (33, 103), (26, 100), (22, 92), (14, 74), (12, 52), (6, 26)

[Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression (0 to 35); Y axis: Anxiety (0 to 120)]

Page 54: What is a Single Linear Regression

• Visually, one can see in the plotted space whether there is a tendency for the variables to be related and in what direction they are related.

[Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression (0 to 35); Y axis: Anxiety (0 to 120)]

In this case there is a strong tendency to relate and the relationship is positive.

Page 58: What is a Single Linear Regression

• With this data set the tendency for the variables to relate is strong and the direction is negative:

Depression: 6, 12, 14, 22, 26, 33
Anxiety: 103, 100, 92, 74, 52, 26

[Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression (0 to 35); Y axis: Anxiety (0 to 120)]

Strong and Negative
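To put numbers on "strong and positive" and "strong and negative", the Pearson coefficient for the two depression/anxiety data sets can be sketched as follows. The function name `pearson_r` is a hypothetical helper, and the value lists are read from the slides' flattened tables. The two coefficients come out at roughly +.94 and -.97.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


anxiety = [103, 100, 92, 74, 52, 26]
depression_positive = [33, 26, 22, 14, 12, 6]   # first data set
depression_negative = [6, 12, 14, 22, 26, 33]   # same scores, reversed order

r_positive = pearson_r(depression_positive, anxiety)
r_negative = pearson_r(depression_negative, anxiety)
print(round(r_positive, 2), round(r_negative, 2))
```

Only the pairing of the scores changes between the two data sets, which is enough to flip the sign of the coefficient.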

Page 68: What is a Single Linear Regression

• When no relationship exists, the scatterplot tends to look like a big circle.

Depression: 22, 33, 12, 6, 14, 26
Anxiety: 103, 100, 92, 74, 52, 26

[Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression (0 to 35); Y axis: Anxiety (0 to 120)]

Weak and Positive

Depression: 22, 6, 33, 26, 14, 12
Anxiety: 103, 100, 92, 74, 52, 26

[Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression (0 to 35); Y axis: Anxiety (0 to 120)]

Weak and Negative

Depression: 6, 14, 33, 26, 12, 22
Anxiety: 103, 100, 74, 92, 52, 26

[Scatterplot: "Relationship between Depression & Anxiety"; X axis: Depression (0 to 35); Y axis: Anxiety (0 to 120)]

Page 71: What is a Single Linear Regression

• You might have noticed that when the variables are related, either positively or negatively, the plot looks more like an oval tilted one way or the other.

[Two scatterplots of "Relationship between Depression & Anxiety" (X axis: Depression, 0 to 35; Y axis: Anxiety, 0 to 120), labeled "Weak and Negative" and "Weak and Positive"]

Page 77: What is a Single Linear Regression

• As mentioned before, linear regression is used to predict one variable (ice cream sales) from another related variable (temperature).

• The stronger the relationship (e.g., +.99 or -.99), the more accurate the prediction.

• The weaker the relationship (e.g., +.14 or -.03), the less accurate the prediction.

• One of the ways to represent those relationships is of course with the coefficients (e.g., +.99, +.14, -.03, -.99).

• Another way to represent it is by graphing the relationship.

Page 87: What is a Single Linear Regression

• Recall that a line in Cartesian space is defined by its slope and its Y intercept (the value of Y when X equals 0).

$Y = \text{intercept} + (\text{slope} \cdot X)$

[Graph: a straight line on axes that both run from 0 to 6, with rise = 1 and run = 1 marked]

• In this case the slope would be 1. You may remember that this value is derived by taking what is called the “rise” over the “run”.

• So the equation for this line so far would look like this:

$y = 0 + \frac{1}{1}x$

The 0 is where the line crosses the Y axis. The 1/1 is the slope, which is the rise over the run.
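As a small sketch of "rise over run", the slope can be computed from any two points on the line and then used to evaluate the line. The helper names `slope_between` and `line_value` are hypothetical, and the two points (0, 0) and (1, 1) are chosen here because they satisfy the slide's equation y = 0 + (1/1)x.

```python
def slope_between(point_a, point_b):
    """Slope of the line through two points: rise (change in y) over run (change in x)."""
    (x1, y1), (x2, y2) = point_a, point_b
    return (y2 - y1) / (x2 - x1)


def line_value(intercept, slope, x):
    """Evaluate Y = intercept + (slope * X)."""
    return intercept + slope * x


slope = slope_between((0, 0), (1, 1))              # rise 1 over run 1
ys = [line_value(0, slope, x) for x in range(7)]   # Y values for X = 0..6
print(slope, ys)
```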

Page 91: What is a Single Linear Regression

• A line represents the functional relationship between variable X and variable Y. Therefore, that line can be used to predict a Y value from any given X value.

Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560

• In this case the two variables (temperature and ice cream sales) have a perfect linear relationship. This is rarely ever seen among variables such as these in the real world, but for illustrative purposes we have created a perfect relationship.

[Chart: "Average Monthly Ice Cream Sales"; horizontal axis 40 to 120, vertical axis 0 to 700, axis label "Ave Monthly Temperature"]

Page 97: What is a Single Linear Regression

• Now let’s say we have data for the average temperature during the month of July, but we don’t have the data for the average ice cream sales for July.

Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560
Jul      100°                       ?

• Using single linear regression we can predict the average ice cream sales for July. Here is the formula we will use for the prediction:

[Equation]

• There are many ways to write this equation. Here is one way:

[Equation]

Page 100: What is a Single Linear Regression

• Using this data set we can create a formula for a straight line that represents that relationship:

[Chart: "Average Monthly Ice Cream Sales"; horizontal axis 40 to 120, vertical axis 0 to 700, axis label "Ave Monthly Temperature"]

$\hat{y} = -162 + 8(x)$
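For readers who want to see where coefficients like -162 and 8 can come from, here is a minimal least-squares sketch on the Feb to Jun table. It assumes the temperatures are 50° to 90° and that the line was fitted by ordinary least squares, which the slides do not state; the helper name `fit_line` is hypothetical. On these values the unrounded fit is about -161.6 + 8.02x, which rounds to the equation shown, and the rounded equation gives 638 at 100°.

```python
# Feb to Jun values from the table.
temps = [50, 60, 70, 80, 90]          # Ave Monthly Temperature (degrees)
sales = [239, 320, 400, 480, 560]     # Ave Monthly Ice Cream Sales


def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(temps, sales)

# Round the coefficients to whole numbers, as in the equation on the slide,
# then evaluate the rounded equation at 100 degrees.
rounded_intercept, rounded_slope = round(intercept), round(slope)
rounded_prediction = rounded_intercept + rounded_slope * 100
print(round(intercept, 1), round(slope, 2), rounded_prediction)
```

Because the first sales value is 239 rather than 240, the fitted coefficients are not whole numbers, so the rounded and unrounded equations give slightly different predictions.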

Page 107: What is a Single Linear Regression

• With this equation we can now plug in the average temperature for July (100°) and see what the predicted average ice cream sales would be:

$\hat{y} = -162 + 8(100)$

$\hat{y} = -162 + 800$

$\hat{y} = 638$

Month    Ave Monthly Temperature    Ave Monthly Ice Cream Sales
Feb      50°                        239
Mar      60°                        320
Apr      70°                        400
May      80°                        480
Jun      90°                        560
Jul      100°                       638

[Chart: "Average Monthly Ice Cream Sales"; horizontal axis 40 to 120, vertical axis 0 to 700, axis label "Ave Monthly Temperature"]

Page 110: What is a Single Linear Regression

• So, based on our single linear regression analysis, we would predict that the average monthly ice cream sales for July will be 638.

• This is a simple demonstration of how regression works.

• In reality, however, most variables will not correlate as perfectly as these did.

Page 115: What is a Single Linear Regression

• Most will look like this:

[Scatterplot of the data with the best fitting line]

• This line is called the best fitting line because it minimizes the overall distance between the line and all of the points. You will notice again that we have a linear equation for that line:

$\hat{y} = -50.93 + 7.21(x)$

• This equation is calculated by using the standard deviations and means of the two variables, together with their correlation. For brevity’s sake we will not go into this here.
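Although the slides skip the calculation, the standard textbook relationship between the line and those summary statistics can be written briefly. The symbols below are assumptions introduced for this sketch, not notation from the slides: $a$ is the intercept, $b$ the slope, $r$ the Pearson correlation, $s_x$ and $s_y$ the standard deviations, and $\bar{x}$ and $\bar{y}$ the means of the two variables.

```latex
\begin{aligned}
b &= r \, \frac{s_y}{s_x} \\
a &= \bar{y} - b \, \bar{x} \\
\hat{y} &= a + b \, x
\end{aligned}
```

In the equation above, $a$ corresponds to -50.93 and $b$ to 7.21.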

Page 116: What is a Single Linear Regression

• Of the infinite number of lines that could be fitted through a scatterplot, the one that best represents the functional relationship between X and Y is the line that results in the smallest cumulative squared error between the predicted values of Y and the true observed values of Y for each given X.

In the plot, the dots represent the actual data, and the line shows the predicted values of Y calculated from the equation.
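To make the least-squares idea concrete, the sketch below compares the summed squared error of the line ŷ = -50.93 + 7.21(x) with that of a few other candidate lines. The alternative intercepts and slopes are hypothetical values picked only for comparison, and the helper name sse is an assumption of this example.

```python
# Monthly data from the presentation: temperature (x) and ice cream sales (y).
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]


def sse(intercept, slope):
    """Sum of squared errors between observed sales and the line's predictions."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(temps, sales))


best = sse(-50.93, 7.21)

# Hypothetical competing lines, not taken from the presentation.
alternatives = {
    "steeper slope": sse(-50.93, 8.0),
    "flatter slope": sse(-50.93, 6.5),
    "higher intercept": sse(0.0, 7.21),
}

print("best-fitting line:", round(best))
for name, value in alternatives.items():
    print(name, round(value), "larger than best:", value > best)
```

Every alternative line produces a larger summed squared error than the best-fitting line.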

Page 120: What is a Single Linear Regression

• We don’t have to actually plot the coordinates and lines. We can operate solely on the equations to generate predicted values and errors in prediction. In this way we can determine if temperature is a statistically significant predictor of ice cream sales.

Page 121: What is a Single Linear Regression

• So here are the actual data from which we plotted the points:

Month   (X) Ave Monthly Temp   (y) Actual Ave Monthly Ice Cream Sales
Jan     40                     300
Feb     50                     320
Mar     60                     370
Apr     70                     480
May     80                     560
Jun     90                     640
Jul     100                    720
Aug     90                     600
Sep     80                     400
Oct     60                     300
Nov     40                     200
Dec     20                     122

• We can now plot the predicted Y using the equation:

ŷ = -50.93 + 7.21(x)

• Which is the equation for the best-fitting line between these two variables.

Page 127: What is a Single Linear Regression

• We can now calculate the predicted Y for each month by entering its temperature (x) into the equation ŷ = -50.93 + 7.21(x):

Month   (X) Ave Monthly Temp   (y) Actual Sales   Calculation           Predicted Ave Monthly Ice Cream Sales
Jan     40                     300                -50.93 + 7.21(40)     237.47
Feb     50                     320                -50.93 + 7.21(50)     309.57
Mar     60                     370                -50.93 + 7.21(60)     381.67
Apr     70                     480                -50.93 + 7.21(70)     453.77
May     80                     560                -50.93 + 7.21(80)     525.87
Jun     90                     640                -50.93 + 7.21(90)     597.97
Jul     100                    720                -50.93 + 7.21(100)    670.07
Aug     90                     600                -50.93 + 7.21(90)     597.97
Sep     80                     400                -50.93 + 7.21(80)     525.87
Oct     60                     300                -50.93 + 7.21(60)     381.67
Nov     40                     200                -50.93 + 7.21(40)     237.47
Dec     20                     122                -50.93 + 7.21(20)     93.27

• With this information we can now determine if x (temperature) is a statistically significant predictor of y (ice cream sales).
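The predicted values can be reproduced directly from the equation. The short sketch below is an illustration only; the function name predict and its default arguments are assumptions of this example, while the coefficients and temperatures are those given in the presentation.

```python
# Average monthly temperatures (x), January through December.
temps = [40, 50, 60, 70, 80, 90, 100, 90, 80, 60, 40, 20]


def predict(x, intercept=-50.93, slope=7.21):
    """Predicted ice cream sales for a given average monthly temperature."""
    return intercept + slope * x


predicted = [round(predict(x), 2) for x in temps]
print(predicted)
# [237.47, 309.57, 381.67, 453.77, 525.87, 597.97,
#  670.07, 597.97, 525.87, 381.67, 237.47, 93.27]
```

Note that the value entered into the equation is the temperature (x), not the actual sales figure (y).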

Page 133: What is a Single Linear Regression

• To begin, we need to determine the total sum of squares, just as we would with analysis of variance.

• This is done by subtracting the average, or mean, ice cream sales for the whole year from each actual “Y” (ice cream sales) value.

• The mean is calculated by adding up the values and dividing by how many there are.

• (300+320+370+480+560+640+720+600+400+300+200+122) / 12 = 5012 / 12 ≈ 417.67 average ice cream sales (rounded down to 417 in the calculations that follow)

Page 137: What is a Single Linear Regression

• We then subtract the mean (417) from each y value:

(y) Actual Ave Monthly Ice Cream Sales   Mean   Difference
300                                      417    -117
320                                      417    -97
370                                      417    -47
480                                      417    63
560                                      417    143
640                                      417    223
720                                      417    303
600                                      417    183
400                                      417    -17
300                                      417    -117
200                                      417    -217
122                                      417    -295

• Note: if we did not know the functional relationship between X and Y, our best prediction of any one month’s Y value would be the mean of Y.

Page 140: What is a Single Linear Regression

• Because we are calculating the total sum of squares, we square each difference and then sum up the results. (Averaging the squared differences instead, 372,844 / 12 ≈ 31,070, would give the variance of all of the scores.)

(y) Actual Ave Monthly Ice Cream Sales   Difference   Squared
300                                      -117         13689
320                                      -97          9409
370                                      -47          2209
480                                      63           3969
560                                      143          20449
640                                      223          49729
720                                      303          91809
600                                      183          33489
400                                      -17          289
300                                      -117         13689
200                                      -217         47089
122                                      -295         87025

SUM 372,844
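The total sum of squares can be checked with a short script. This is a sketch under the presentation's own rounding: it uses 417 as the mean to reproduce the figures shown, and also computes the value with the unrounded mean for comparison. All variable names are chosen for this example.

```python
# Actual average monthly ice cream sales (y), January through December.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

slide_mean = 417                       # rounded mean used in the presentation
exact_mean = sum(sales) / len(sales)   # unrounded mean, about 417.67

# Deviation of each month from the rounded mean, then squared.
differences = [y - slide_mean for y in sales]
squared = [d ** 2 for d in differences]

sst_slide = sum(squared)
sst_exact = sum((y - exact_mean) ** 2 for y in sales)

print(differences)
print(sst_slide)            # 372844
print(round(sst_exact, 2))  # slightly smaller, because the mean is not rounded
```

The small gap between the two totals comes only from rounding the mean to 417.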

Page 144: What is a Single Linear Regression

• Now we find the regression (good) and residual (bad) sums of squares. To have better prediction power, we want the regression sum of squares to be large and the residual, or error, sum of squares to be small.

• Let’s see if the residual or the regression is greater.

• We know that the total sum of squares is 372,844.

• Now we will calculate the residual (error) and the regression sums of squares, which will add up to 372,844.

                   Sum of Squares   df   Mean Square   F-ratio   Significance
Regression         ?
Residual (error)   ?
Total              372,844

Page 150: What is a Single Linear Regression

• Before we calculate the residual and regression sums of squares, let’s see visually how we calculated the total sum of squares (372,844).

• Once again, we subtract the mean of the actual Y values from each actual Y value.

• The first column holds the actual Y values. From each of them we subtract the mean (417), which would be our best prediction if we did not know the relationship between X (temperature) and Y (ice cream sales).

(y) Actual Ave Monthly Ice Cream Sales   Predicted Ave Monthly Ice Cream Sales (the mean)
300                                      417
320                                      417
370                                      417
480                                      417
560                                      417
640                                      417
720                                      417
600                                      417
400                                      417
300                                      417
200                                      417
122                                      417

[Figure: scatterplot of the actual Y values (average ice cream sales)]

Page 159: What is a Single Linear Regression

• Here is the graphic depiction of our subtracting the mean (417) from each data point:

[Figure: scatterplot of the actual Y values, showing the distance of individual points from the mean of 417]

For example:

122 - 417 = -295
200 - 417 = -217
720 - 417 = +303

(y) Actual Ave Monthly Ice Cream Sales   Predicted Ave Monthly Ice Cream Sales (the mean)   Difference
300                                      417                                                -117
320                                      417                                                -97
370                                      417                                                -47
480                                      417                                                63
560                                      417                                                143
640                                      417                                                223
720                                      417                                                303
600                                      417                                                183
400                                      417                                                -17
300                                      417                                                -117
200                                      417                                                -217
122                                      417                                                -295

Page 167: What is a Single Linear Regression

• Now we have the difference between each actual value of Y (ice cream sales) and the mean of the values for Y (417).

Page 169: What is a Single Linear Regression

• As we showed previously, we have to square these differences, because if we don’t, they will sum to zero.

Difference   Squared
-117         13689
-97          9409
-47          2209
63           3969
143          20449
223          49729
303          91809
183          33489
-17          289
-117         13689
-217         47089
-295         87025

SUM = 0      SUM = 372,844

(The differences sum to exactly zero when the unrounded mean, 417.67, is used. With the rounded mean of 417 shown here, they sum to 8.)

• We are doing all this once again to show a visual depiction of what the total sum of squares is:

             Sum of Squares   df   Mean Square   F-ratio   Significance
Total        372,844
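The reason for squaring can be seen numerically. The sketch below, with variable names invented for this example, sums the raw deviations from the mean and then the squared deviations.

```python
# Actual average monthly ice cream sales (y), January through December.
sales = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]

exact_mean = sum(sales) / len(sales)

# Raw deviations cancel out: positives and negatives offset each other.
raw_sum_exact = sum(y - exact_mean for y in sales)

# With the rounded mean of 417 the cancellation is only approximate.
raw_sum_rounded = sum(y - 417 for y in sales)

# Squared deviations are all non-negative, so they cannot cancel.
squared_sum = sum((y - 417) ** 2 for y in sales)

print(abs(raw_sum_exact) < 1e-9)  # True: zero up to floating-point error
print(raw_sum_rounded)            # 8
print(squared_sum)                # 372844
```

Because the raw deviations always cancel, they say nothing about how spread out the values are, whereas the squared deviations do.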

Page 173: What is a Single Linear Regression

• Now that we’ve seen a visual depiction of how we calculated the total sum of squares, we compare the sums of squares that are associated with error (residual) and those associated with regression.

• Let’s calculate the error or residual sum of squares now.

             Sum of Squares   df   Mean Square   F-ratio   Significance
Regression
Residual
Total        372,844

Page 176: What is a Single Linear Regression

• The error or residual sum of squares is computed by subtracting each predicted Y value from the corresponding actual Y value.

• Here are the actual Y values, or average ice cream sales:

[Figure: scatterplot of the actual Y values (average ice cream sales)]

(y) Actual Ave Monthly Ice Cream Sales: 300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122

Page 180: What is a Single Linear Regression

• Here are the predicted values using the linear regression formula, with each month’s temperature (x) entered into ŷ = -50.93 + 7.21(x):

(X) Ave Monthly Temp   (y) Actual Ave Monthly Ice Cream Sales   Calculation           Predicted Ave Monthly Ice Cream Sales
40                     300                                      -50.93 + 7.21(40)     237.47
50                     320                                      -50.93 + 7.21(50)     309.57
60                     370                                      -50.93 + 7.21(60)     381.67
70                     480                                      -50.93 + 7.21(70)     453.77
80                     560                                      -50.93 + 7.21(80)     525.87
90                     640                                      -50.93 + 7.21(90)     597.97
100                    720                                      -50.93 + 7.21(100)    670.07
90                     600                                      -50.93 + 7.21(90)     597.97
80                     400                                      -50.93 + 7.21(80)     525.87
60                     300                                      -50.93 + 7.21(60)     381.67
40                     200                                      -50.93 + 7.21(40)     237.47
20                     122                                      -50.93 + 7.21(20)     93.27

[Figure: scatterplot of the actual Y values together with the predicted values]

Page 184: What is a Single Linear Regression

• From these points and the linear regression formula, a line can be drawn.

[Figure: scatterplot of average ice cream sales with the regression line drawn through the predicted values]

Page 187: What is a Single Linear Regression

• The difference between each actual value (orange) and the predicted value (green line) is called the error or residual. The closer these two values are to each other, the smaller the error. The farther apart they are, the larger the error and the weaker the predictive power of the regression line.

[Figure: scatterplot with the regression line; the vertical gaps between the actual values and the line are labeled “Difference”]

Page 189: What is a Single Linear Regression

• Let’s subtract the green line predicted values from the orange actual values. For example, 122 - 93.27 = +28.73 and 400 - 525.87 = -125.87, and so on for each month:

(y) Actual Ave Monthly Ice Cream Sales   Predicted Ave Monthly Ice Cream Sales   Difference
300                                      237.47                                  62.53
320                                      309.57                                  10.43
370                                      381.67                                  -11.67
480                                      453.77                                  26.23
560                                      525.87                                  34.13
640                                      597.97                                  42.03
720                                      670.07                                  49.93
600                                      597.97                                  2.03
400                                      525.87                                  -125.87
300                                      381.67                                  -81.67
200                                      237.47                                  -37.47
122                                      93.27                                   28.73

[Figure: scatterplot with the regression line, showing the gap between individual actual values and their predicted values]

Page 193: What is a Single Linear Regression

• We then square those difference (deviations)

Page 194: What is a Single Linear Regression

• We then square those difference (deviations)

(y) Actual Ave Monthly Ice Cream Sales

300

320

370

480

560

640

720

600

400

300

200

122

Predicted Ave Monthly Ice Cream

Sales

237.47

309.57

381.67

453.77

525.87

597.97

670.07

597.97

525.87

381.67

237.47

93.27

Difference

62.53

10.43

-11.67

26.23

34.13

42.03

49.93

2.03

-125.87

-81.67

-37.47

28.73

------------

============

Squared

3910.00

108.78

136.19

688.01

1164.86

1766.52

2493.00

4.12

15843.26

6669.99

1404.00

825.41

Page 195: What is a Single Linear Regression

• We then square those differences (deviations) and sum them up:

Sum of the squared differences = 35,014
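The subtract, square, and sum steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the two lists are copied from the slide table, and the variable names (`actual`, `predicted`, `residual_ss`) are assumptions introduced here, not part of the original presentation.

```python
# Actual and predicted average monthly ice cream sales, copied from the table.
actual = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]
predicted = [237.47, 309.57, 381.67, 453.77, 525.87, 597.97,
             670.07, 597.97, 525.87, 381.67, 237.47, 93.27]

# Step 1: difference (residual) = actual - predicted, one per month.
differences = [a - p for a, p in zip(actual, predicted)]

# Step 2: square each difference so positive and negative errors both count.
squared = [d ** 2 for d in differences]

# Step 3: sum the squares to get the residual sum of squares.
residual_ss = sum(squared)

print(round(residual_ss))  # the slides report 35,014
```

Squaring before summing matters here: the raw differences nearly cancel each other out, so their plain sum would say almost nothing about how far the points sit from the line.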

Page 196: What is a Single Linear Regression

              Sum of Squares    df    Mean Square    F-ratio    Significance
Regression
Residual              35,014
Total                372,844

Page 197: What is a Single Linear Regression

• We will now calculate the regression sum of squares.

Page 198: What is a Single Linear Regression

• Our hope is that this value will be much bigger than the residual sum of squares (35,014).

Page 199: What is a Single Linear Regression

• The regression sum of squares is calculated by subtracting the mean from each predicted value, then squaring and summing those differences.

Page 200: What is a Single Linear Regression

• Let’s see what this looks like visually. The green line shows the predicted values for Y, that is, the regression line.

Page 201: What is a Single Linear Regression

[Figure: scatterplot of average ice cream sales (y-axis, 0 to 800; x-axis, 0 to 110) with the green regression line]

Page 203: What is a Single Linear Regression

[Figure: the same scatterplot with a blue line added at the mean]

The blue line is the mean (417), which is the best predictor absent anything else.

Page 204: What is a Single Linear Regression

• You can probably already tell that it will be bigger because a simple way to calculate it is to subtract the residual (35,014) from the total (372,844).
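The shortcut mentioned here relies on the total sum of squares splitting into a regression part and a residual part. Written out with the values from the table (the symbols below are notation assumed for this note, not taken from the slides):

```latex
\[
SS_{\text{total}} = SS_{\text{regression}} + SS_{\text{residual}}
\quad\Longrightarrow\quad
SS_{\text{regression}} = 372{,}844 - 35{,}014 = 337{,}830
\]
```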


Page 205: What is a Single Linear Regression

• However, we will calculate it the long way so you can see what is happening.

Page 206: What is a Single Linear Regression

• We subtract the mean of the actual Y values from each predicted value

Page 207: What is a Single Linear Regression

Columns: predicted average monthly ice cream sales, mean monthly ice cream sales, and their difference (predicted minus mean).

Predicted    Mean     Difference
237.47       417.7      -180.2
309.57       417.7      -108.1
381.67       417.7       -36.0
453.77       417.7        36.1
525.87       417.7       108.2
597.97       417.7       180.3
670.07       417.7       252.4
597.97       417.7       180.3
525.87       417.7       108.2
381.67       417.7       -36.0
237.47       417.7      -180.2
 93.27       417.7      -324.4

Page 208: What is a Single Linear Regression

• For example, in the last row: 93 − 417 = −324

Page 209: What is a Single Linear Regression

• And in the seventh row: 670 − 417 = +252

Page 210: What is a Single Linear Regression

• Then we square the differences (or deviations)

Page 211: What is a Single Linear Regression

Columns: predicted average monthly ice cream sales, mean monthly ice cream sales, their difference, and the squared difference.

Predicted    Mean     Difference    Squared
237.47       417.7      -180.2       32470.8
309.57       417.7      -108.1       11684.9
381.67       417.7       -36.0       1295.76
453.77       417.7        36.1       1303.45
525.87       417.7       108.2       11708
597.97       417.7       180.3       32509.3
670.07       417.7       252.4       63707.4
597.97       417.7       180.3       32509.3
525.87       417.7       108.2       11708
381.67       417.7       -36.0       1295.76
237.47       417.7      -180.2       32470.8
 93.27       417.7      -324.4       105233

Page 212: What is a Single Linear Regression

• Then we square the differences (or deviations) and sum them up

Sum of the squared differences = 337,830
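The "long way" described above can be sketched as follows. This is a hedged illustration using the values from the slide tables; the variable names are assumptions introduced here. Because the slide values are rounded, summing the squares directly lands close to, but not exactly on, the 337,830 shown, which is the figure obtained from total minus residual (372,844 − 35,014).

```python
# Values copied from the slide tables.
actual = [300, 320, 370, 480, 560, 640, 720, 600, 400, 300, 200, 122]
predicted = [237.47, 309.57, 381.67, 453.77, 525.87, 597.97,
             670.07, 597.97, 525.87, 381.67, 237.47, 93.27]

# The mean of the actual Y values (the slides round this to 417.7).
mean_y = sum(actual) / len(actual)

# Deviation of each predicted value from the mean: predicted - mean.
deviations = [p - mean_y for p in predicted]

# Square the deviations and sum them: the regression sum of squares.
regression_ss = sum(d ** 2 for d in deviations)

print(round(mean_y, 1), round(regression_ss))
```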

Page 213: What is a Single Linear Regression

              Sum of Squares    df    Mean Square    F-ratio    Significance
Regression           337,830
Residual              35,014
Total                372,844

Page 214: What is a Single Linear Regression

• Now we have all of the information to test for significance

Page 216: What is a Single Linear Regression

• The degrees of freedom (df) for the regression are the number of parameters being estimated (in this case the Y intercept and the slope) minus 1.

Page 217: What is a Single Linear Regression

• 2 parameters − 1 = 1

              Sum of Squares    df    Mean Square    F-ratio    Significance
Regression           337,830     1
Residual              35,014
Total                372,844

Page 218: What is a Single Linear Regression

• The degrees of freedom for the residual are the number of cases (12) minus the number of parameters (2).

Page 219: What is a Single Linear Regression

• 12 months − 2 parameters (slope and Y intercept) = 10

Page 220: What is a Single Linear Regression

              Sum of Squares    df    Mean Square    F-ratio    Significance
Regression           337,830     1
Residual              35,014    10
Total                372,844

Page 221: What is a Single Linear Regression

• We now have the information we need to calculate the Mean Square values. They are calculated by dividing the sums of squares by the degrees of freedom.

Page 222: What is a Single Linear Regression

              Sum of Squares    df    Mean Square    F-ratio    Significance
Regression           337,830     1        337,830
Residual              35,014    10          3,501
Total                372,844
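The degrees-of-freedom and mean-square steps can be written out as a short Python sketch. The numbers come from the slides; the variable names are assumptions made for this illustration.

```python
n_cases = 12        # months of data
n_parameters = 2    # Y intercept and slope

regression_ss = 337_830
residual_ss = 35_014

# Degrees of freedom as described in the slides.
df_regression = n_parameters - 1      # 2 - 1 = 1
df_residual = n_cases - n_parameters  # 12 - 2 = 10

# Mean square = sum of squares divided by its degrees of freedom.
ms_regression = regression_ss / df_regression
ms_residual = residual_ss / df_residual

print(df_regression, df_residual)
print(ms_regression, ms_residual)
```

The residual mean square works out to 3,501.4, which the table shows rounded to 3,501.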

Page 223: What is a Single Linear Regression

• The F-ratio is computed by dividing the Regression Mean Square by the Residual Mean Square

Page 224: What is a Single Linear Regression

• 337,830 / 3,501 = 96.5

Page 225: What is a Single Linear Regression

              Sum of Squares    df    Mean Square    F-ratio    Significance
Regression           337,830     1        337,830       96.5
Residual              35,014    10          3,501
Total                372,844
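Putting the pieces of the table together, the F-ratio can be written as a single expression (the symbols are notation assumed for this note; the numbers are those in the table):

```latex
\[
F = \frac{MS_{\text{regression}}}{MS_{\text{residual}}}
  = \frac{SS_{\text{regression}} / df_{\text{regression}}}
         {SS_{\text{residual}} / df_{\text{residual}}}
  = \frac{337{,}830 / 1}{35{,}014 / 10}
  \approx 96.5
\]
```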

Page 226: What is a Single Linear Regression

• With this information we can turn to the F-distribution table to determine the significance value.

Page 229: What is a Single Linear Regression

• The regression degrees of freedom (1) are represented by the columns of the F-distribution table.

Page 232: What is a Single Linear Regression

• The residual degrees of freedom (10) are represented by the rows of the F-distribution table.

Page 235: What is a Single Linear Regression

• Put them together and we have found the critical F value at the .05 alpha level to be 4.96.


Page 237: What is a Single Linear Regression

• Because the F-ratio (96.5) exceeds the critical F value (4.96), we reject the null hypothesis and conclude that temperature is a statistically significant predictor of ice cream sales.
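As a cross-check on the table lookup, the tail probability can be computed directly. The sketch below is an assumption-laden illustration, not the method used in the slides: it relies on the fact that an F statistic with 1 numerator degree of freedom is the square of a Student's t statistic, and it uses the closed-form series for the t distribution that applies only when the residual degrees of freedom are even (as with 10 here). The helper name `f_upper_tail` is hypothetical.

```python
import math

def f_upper_tail(f_value, df_residual):
    """Return P(F > f_value) for an F distribution with 1 and df_residual
    degrees of freedom. Sketch only: df_residual must be even and >= 2."""
    if f_value < 0:
        raise ValueError("f_value must be non-negative")
    if df_residual < 2 or df_residual % 2 != 0:
        raise ValueError("this sketch only handles even df_residual >= 2")
    t = math.sqrt(f_value)                       # F(1, v) = t(v) squared
    theta = math.atan(t / math.sqrt(df_residual))
    cos_sq = math.cos(theta) ** 2
    term = 1.0
    series = 1.0
    # Series: 1 + (1/2)cos^2 + (1*3)/(2*4)cos^4 + ... up to cos^(v-2).
    for k in range(1, df_residual // 2):
        term *= (2 * k - 1) / (2 * k) * cos_sq
        series += term
    central_area = math.sin(theta) * series      # P(-t < T < t)
    return 1.0 - central_area

f_ratio = 96.5       # from the table
f_critical = 4.96    # critical value quoted for alpha = .05

p_at_critical = f_upper_tail(f_critical, 10)  # should be close to 0.05
p_observed = f_upper_tail(f_ratio, 10)        # far smaller than 0.05

reject_null = f_ratio > f_critical
print(reject_null)
```

Comparing `f_ratio` with `f_critical` and comparing `p_observed` with .05 are two views of the same decision rule, so both lead to rejecting the null hypothesis here.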

Page 242: What is a Single Linear Regression

In Summary

• The whole point of this demonstration was to:
(1) explain that linear regression is used to predict the value of one variable (ice cream sales) based on another variable (temperature);
(2) show that the total variance in Y can be partitioned into regression (prediction power) and residual (error);
(3) show how this partition can be used to test whether the prediction is better than chance.