© The McGraw-Hill Companies, Inc., 2000 Correlation and Regression Further Mathematics - CORE

Upload: barry-ross

Post on 17-Dec-2015


TRANSCRIPT

Page 1:

© The McGraw-Hill Companies, Inc., 2000

Correlation and Regression

Further Mathematics - CORE

Page 2:

Outline

11-1 Introduction
11-2 Scatter Plots
11-3 Correlation
11-4 Regression

Page 3:

Outline

11-5 Coefficient of Determination and Standard Error of Estimate

Page 4:

Objectives

Draw a scatter plot for a set of ordered pairs.

Find the correlation coefficient.

Test the hypothesis H0: ρ = 0.

Find the equation of the regression line.

Page 5:

Objectives

Find the coefficient of determination.

Find the standard error of estimate.

Page 6:

11-2 Scatter Plots

A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable, x, and the dependent variable, y.

Page 7:

11-2 Scatter Plots - Example

Construct a scatter plot for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects.

The data is given on the next slide.

Page 8:

11-2 Scatter Plots - Example

Subject   Age, x   Pressure, y
A         43       128
B         48       120
C         56       135
D         61       143
E         67       141
F         70       152

Page 9:

11-2 Scatter Plots - Example

[Figure: scatter plot of Pressure (vertical axis, 120 to 150) against Age (horizontal axis, 40 to 70), titled "Positive Relationship".]

Page 10:

11-2 Scatter Plots - Other Examples

[Figure: scatter plot of Final grade (vertical axis, 40 to 90) against Number of absences (horizontal axis, 5 to 15), titled "Negative Relationship".]

Page 11:

11-2 Scatter Plots - Other Examples

[Figure: scatter plot of y (vertical axis, 0 to 10) against x (horizontal axis, 0 to 70), titled "No Relationship".]

Page 12:

11-3 Correlation Coefficient

The correlation coefficient computed from the sample data measures the strength and direction of a relationship between two variables.

Sample correlation coefficient, r.

Population correlation coefficient, ρ.

Page 13:

11-3 Range of Values for the Correlation Coefficient

The correlation coefficient ranges from -1 to +1. Values near -1 indicate a strong negative relationship, values near 0 indicate no linear relationship, and values near +1 indicate a strong positive relationship.

Page 14:

11-3 Formula for the Correlation Coefficient r

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$

where n is the number of data pairs.

Page 15:

11-3 Correlation Coefficient - Example (Verify)

Compute the correlation coefficient for the age and blood pressure data.

Σx = 345, Σy = 819, Σxy = 47 634, Σx² = 20 399, Σy² = 112 443.

Substituting in the formula for r gives r = 0.897.
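The "verify" step can be sketched in a few lines of Python. This is only an illustration: the function name `correlation` and the variable names are assumptions, and the data are the six age and pressure pairs from the example table.

```python
from math import sqrt

# Age (x) and systolic blood pressure (y) for subjects A-F, from the example table.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]


def correlation(xs, ys):
    """Sample correlation coefficient r, using the sums formula from the slides."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = correlation(ages, pressures)
print(round(r, 3))  # 0.897
```

The intermediate sums computed inside the function match the values quoted on the slide (345, 819, 47 634, 20 399 and 112 443).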

Page 16:

11-3 The Significance of the Correlation Coefficient

The population correlation coefficient, ρ, is the correlation between all possible pairs of data values (x, y) taken from a population.

Page 17:

11-3 The Significance of the Correlation Coefficient

H0: ρ = 0

H1: ρ ≠ 0

This tests for a significant correlation between the variables in the population.

Page 18:

11-3 Formula for the t-test for the Correlation Coefficient

$$t = r\sqrt{\frac{n-2}{1-r^2}}$$

with d.f. = n - 2.

Page 19:

11-3 Example

Test the significance of the correlation coefficient for the age and blood pressure data. Use α = 0.05 and r = 0.897.

Step 1: State the hypotheses. H0: ρ = 0, H1: ρ ≠ 0.

Page 20:

11-3 Example

Step 2: Find the critical values. Since α = 0.05 and there are 6 – 2 = 4 degrees of freedom, the critical values are t = +2.776 and t = –2.776.

Step 3: Compute the test value. t = 4.059 (verify).

Page 21:

11-3 Example

Step 4: Make the decision. Reject the null hypothesis, since the test value falls in the critical region (4.059 > 2.776).

Step 5: Summarize the results. There is a significant relationship between the variables of age and blood pressure.
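Steps 3 and 4 of this test can be checked with a short Python sketch. The helper name `t_statistic` is an assumption made for illustration. The critical value 2.776 is copied from the slides rather than computed, and r is the rounded value 0.897 used in the example.

```python
from math import sqrt


def t_statistic(r, n):
    """Test value t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


r = 0.897           # rounded sample correlation coefficient from the example
n = 6               # number of data pairs
critical_t = 2.776  # two-tailed critical value quoted on the slides (alpha = 0.05, d.f. = 4)

t = t_statistic(r, n)
reject_null = abs(t) > critical_t  # two-tailed decision rule

print(round(t, 3), reject_null)  # 4.059 True
```

Using the unrounded r instead of 0.897 would give a slightly different test value, so the sketch keeps the rounded figure to match the slide.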

Page 22:

11-4 Regression

The scatter plot for the age and blood pressure data displays a linear pattern.

We can model this relationship with a straight line.

This line is called the line of best fit or the regression line.

The equation of the line is y = a + bx.

Page 23:

11-4 Formulas for the Regression Line y = a + bx

$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2}$$

$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$$

where a is the y intercept and b is the slope of the line.

Page 24:

11-4 Example

Find the equation of the regression line for the age and blood pressure data.

Substituting into the formulas gives a = 81.048 and b = 0.964 (verify).

Hence, y = 81.048 + 0.964x. Note that a represents the intercept and b the slope of the line.
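The values of a and b can be verified with the following Python sketch. The function name `regression_line` is hypothetical, and the data are the six pairs from the example table.

```python
# Age (x) and systolic blood pressure (y) for subjects A-F, from the example table.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]


def regression_line(xs, ys):
    """Return (a, b) for the line y = a + b*x, using the sums formulas from the slides."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2  # shared by both formulas
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


a, b = regression_line(ages, pressures)
print(round(a, 3), round(b, 3))  # 81.048 0.964
```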

Page 25:

11-4 Example

[Figure: scatter plot of Pressure (vertical axis, 120 to 150) against Age (horizontal axis, 40 to 70), labelled y = 81.048 + 0.964x.]

Page 26:

11-4 Using the Regression Line to Predict

The regression line can be used to predict a value for the dependent variable (y) for a given value of the independent variable (x).

Caution: Use x values within the experimental region when predicting y values.

Page 27:

11-4 Example

Use the equation of the regression line to predict the blood pressure for a person who is 50 years old.

Since y = 81.048 + 0.964x, then y = 81.048 + 0.964(50) = 129.248 ≈ 129.2.

Note that the value of 50 is within the range of x values.
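A minimal Python sketch of this prediction, including the caution about staying within the experimental region, might look as follows. The function name `predict` and its parameters are illustrative assumptions. The bounds 43 and 70 are the smallest and largest ages in the example data.

```python
def predict(x, a=81.048, b=0.964, x_min=43, x_max=70):
    """Predict y = a + b*x, refusing x values outside the observed range of ages."""
    if not x_min <= x <= x_max:
        raise ValueError(f"x = {x} is outside the experimental region [{x_min}, {x_max}]")
    return a + b * x


print(round(predict(50), 1))  # 129.2
```

Calling `predict(90)` raises a `ValueError`, because 90 lies outside the ages observed in the sample.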

Page 28:

11-5 Coefficient of Determination and Standard Error of Estimate

The coefficient of determination, denoted by r², is a measure of the variation of the dependent variable that is explained by the regression line and the independent variable.

Page 29:

11-5 Coefficient of Determination and Standard Error of Estimate

r² is the square of the correlation coefficient.

The coefficient of nondetermination is (1 – r²).

Example: If r = 0.90, then r² = 0.81.
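As a quick check of this arithmetic, here is a small Python sketch. The function name `determination` is an assumption used only for illustration.

```python
def determination(r):
    """Return (r^2, 1 - r^2): the coefficients of determination and nondetermination."""
    r_squared = r ** 2
    return r_squared, 1 - r_squared


r_squared, non_determination = determination(0.90)
print(round(r_squared, 2), round(non_determination, 2))  # 0.81 0.19
```

For r = 0.90, about 81% of the variation in y is explained by the regression line and about 19% is not.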

Page 30:

11-5 Coefficient of Determination and Standard Error of Estimate

The standard error of estimate, denoted by s_est, is the standard deviation of the observed y values about the predicted y values.

The formula is given on the next slide.

Page 31:

11-5 Formula for the Standard Error of Estimate

$$s_{est} = \sqrt{\frac{\sum (y - y')^2}{n - 2}}$$

or

$$s_{est} = \sqrt{\frac{\sum y^2 - a\sum y - b\sum xy}{n - 2}}$$

where y' is the predicted value of y from the regression line.
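The two forms of the formula can be compared numerically. The Python sketch below applies both to the age and blood pressure data from the earlier example, not to the example on the next slide, whose data are not given in this transcript. The function name `standard_error` is an assumption.

```python
from math import sqrt

# Age (x) and systolic blood pressure (y) for subjects A-F, from the earlier example.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]


def standard_error(xs, ys):
    """Return s_est computed two ways: from the residuals and from the sums shortcut."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Unrounded intercept a and slope b of the regression line y = a + b*x.
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator

    # First form: squared deviations of observed y from predicted y.
    sse_residuals = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Second form: the shortcut using the sums.
    sse_shortcut = sum_y2 - a * sum_y - b * sum_xy

    return sqrt(sse_residuals / (n - 2)), sqrt(sse_shortcut / (n - 2))


by_residuals, by_shortcut = standard_error(ages, pressures)
print(round(by_residuals, 2), round(by_shortcut, 2))  # 5.64 5.64
```

The sketch uses the unrounded a and b. With the rounded values 81.048 and 0.964 the shortcut form can drift noticeably, because it subtracts large, nearly equal quantities.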

Page 32:

11-5 Standard Error of Estimate - Example

Given the regression equation y = 55.57 + 8.13x and n = 6, find s_est.

Here, a = 55.57, b = 8.13, and n = 6. Substituting into the formula gives s_est = 6.48 (verify).