CHAPTER 3 INTRODUCTORY LINEAR REGRESSION

Upload: agnes-smith

Post on 20-Jan-2016


Page 1: CHAPTER 3 INTRODUCTORY LINEAR REGRESSION. Introduction  Linear regression is a study on the linear relationship between two variables. This is done by

CHAPTER 3

INTRODUCTORY LINEAR REGRESSION


Introduction

Linear regression is the study of the linear relationship between two variables. This is done by fitting a linear equation to the observed data.

The linear equation is then used to predict values for the data.

In a simple linear relationship, only two variables are involved:
a) X is the independent variable: the variable that is controlled.
b) Y is the dependent variable: the response variable.

In other words, the value of y depends on the value of x.


Example

A nutritionist studying weight loss programs might want to find out if reducing carbohydrate intake can help a person reduce weight.
a) X is the carbohydrate intake (independent variable).
b) Y is the weight (dependent variable).

An entrepreneur might want to know whether increasing the cost of packaging his new product will have an effect on the sales volume.
a) X is the cost (independent variable).
b) Y is the sales volume (dependent variable).


Scatter plots

A scatter plot is a plot of the paired (x, y) values.

The purpose of constructing the plot is to examine the relationship between the two variables.

[Figure: scatter plots of Y against X. Panel (a): positive linear relationship.]


A linear regression equation is a mathematical equation that can be used to predict the values of one dependent variable from known values of an independent variable.

This equation represents a straight line, so it is of the form y = mx + c, where m is the slope and c is the y-intercept.

The true regression line or the probabilistic model is given by

$$y = \beta_0 + \beta_1 x + \varepsilon$$

where

$\beta_0$: intercept of the line.

$\beta_1$: slope of the line.

$\varepsilon$: random error.


We will call this model the simple linear regression model because it has only one independent variable.

This regression line is estimated from the collected data by fitting a straight line to the data set and obtaining the equation of the straight line,

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$


Least Squares Method

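The least squares method is the most commonly used method for estimating the regression coefficients, $\beta_0$ and $\beta_1$.

The straight line fitted to the data set is the line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, with

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

where

$$S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}; \quad S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

$$S_{yy} = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$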
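The least squares formulas can be sketched in Python as follows. This is a minimal illustration rather than part of the original slides: the function name `least_squares_fit` and the four sample points are hypothetical, chosen so that the points lie exactly on the line y = 1 + 2x.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the least squares line through (xs, ys)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    # S_xy and S_xx computed from raw sums
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = s_xy / s_xx                         # estimated slope
    intercept = sum_y / n - slope * sum_x / n   # y-bar minus slope times x-bar
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x
b0, b1 = least_squares_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```

Because the sample points are exactly collinear, the fitted intercept and slope should recover 1 and 2.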

Example


To test the existence of a linear relationship between any two variables x and y, we proceed with testing the hypothesis

$H_0: \beta_1 = 0$ (the slope is zero, meaning there is no linear relationship)

$H_1: \beta_1 \neq 0$ (the slope is not zero, meaning there exists a linear relationship)

Note: If $\beta_1 = 0$, the model reduces to $y = \beta_0 + \varepsilon$. This implies the values of x have no effect on y, that is, there is no relationship between the two variables.


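Testing procedure:

Set up hypothesis:

$H_0: \beta_1 = 0$

$H_1: \beta_1 \neq 0$

Calculate the test statistic:

$$t_{test} = \frac{\hat{\beta}_1}{\sqrt{Var(\hat{\beta}_1)}}$$

where

$$Var(\hat{\beta}_1) = \frac{1}{n-2} \left( \frac{S_{yy} - \hat{\beta}_1 S_{xy}}{S_{xx}} \right)$$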


Determine the rejection region:

This is a two-tailed test, so reject $H_0$ if $t_{test} < -t_{\alpha/2,\,n-2}$ or $t_{test} > t_{\alpha/2,\,n-2}$, where the critical values $\pm t_{\alpha/2,\,n-2}$ are based on $n-2$ degrees of freedom.
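The testing procedure can be sketched as a small Python function. This is an illustrative sketch under stated assumptions: the name `slope_t_test` is hypothetical, and the critical value is passed in by the caller (read from a t table) because the Python standard library has no t distribution. The summary values used are those of the before/after scores example in this chapter.

```python
import math


def slope_t_test(n, s_xx, s_yy, s_xy, t_critical):
    """Two-tailed t test of H0: slope = 0, from summary quantities.

    t_critical is t_{alpha/2, n-2}, supplied by the caller from a t table.
    Returns (t_test, reject_h0).
    """
    if n <= 2:
        raise ValueError("n must exceed 2 so that n - 2 degrees of freedom exist")
    slope = s_xy / s_xx
    # Var(slope) = (1 / (n - 2)) * (S_yy - slope * S_xy) / S_xx
    var_slope = (s_yy - slope * s_xy) / ((n - 2) * s_xx)
    if var_slope <= 0:
        raise ValueError("zero residual variance; the t statistic is undefined")
    t_test = slope / math.sqrt(var_slope)
    return t_test, abs(t_test) > t_critical


# Summary values from the chapter's before/after scores example (n = 10)
t_test, reject_h0 = slope_t_test(10, 2418.1, 1850.4, 1991.8, t_critical=2.306)
print(round(t_test, 2), reject_h0)
```

Carrying full precision gives a statistic of roughly 7.91, slightly below the 7.926 obtained when the variance is first rounded to 0.0108. The decision to reject $H_0$ is the same either way.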

Example


The analysis of variance (ANOVA) method is an approach to test the significance of the regression. We can arrange the test procedure using this approach in an ANOVA table as shown below.

Source of Variation | Sum of Squares | Degrees of freedom | Mean Square | $f_{test}$
Regression | $SSR = \hat{\beta}_1 S_{xy}$ | 1 | MSR | $f_{test} = MSR/MSE$
Residual | SSE = SST - SSR | n-2 | MSE |
Total | $SST = S_{yy}$ | n-1 | |

Each mean square is the sum of squares divided by its degrees of freedom: MSR = SSR/1 and MSE = SSE/(n-2).

The test hypotheses are

$H_0: \beta_1 = 0$

$H_1: \beta_1 \neq 0$

We will reject $H_0$ if $f_{test} > f_{\alpha,1,n-2}$ at the α level of significance. Then we conclude that there exists a linear relationship between the two variables being investigated.
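The entries of the ANOVA table can be computed with a short Python sketch. The function name `regression_anova` and the dictionary layout are hypothetical. The summary values are those of the chapter's before/after scores example, and the critical value 5.32 for $f_{0.05,1,8}$ is assumed to come from a standard F table.

```python
def regression_anova(n, s_xx, s_yy, s_xy):
    """Return the ANOVA quantities for simple linear regression."""
    if n <= 2:
        raise ValueError("n must exceed 2")
    slope = s_xy / s_xx
    ssr = slope * s_xy      # regression sum of squares
    sst = s_yy              # total sum of squares
    sse = sst - ssr         # residual sum of squares
    msr = ssr / 1           # regression has 1 degree of freedom
    mse = sse / (n - 2)     # residual has n - 2 degrees of freedom
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "MSR": msr, "MSE": mse, "f_test": msr / mse}


table = regression_anova(10, 2418.1, 1850.4, 1991.8)
f_critical = 5.32           # assumed value of f_{0.05,1,8} from an F table
reject_h0 = table["f_test"] > f_critical
print(round(table["f_test"], 2), reject_h0)
```

For simple linear regression, $f_{test}$ equals the square of the t statistic for the slope, so the ANOVA test and the two-tailed t test lead to the same decision.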


Correlation

Correlation measures the strength of a linear relationship between the two variables. One numerical measure is the Pearson product moment correlation coefficient, r.

$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}, \quad -1 \le r \le 1$$

Properties of r:
Values of r close to 1 imply there is a strong positive linear relationship between x and y.
Values of r close to -1 imply there is a strong negative linear relationship between x and y.
Values of r close to 0 imply little or no linear relationship between x and y.
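A minimal Python sketch of the Pearson coefficient follows. The function name `pearson_r` and the five sample points are hypothetical and serve only to illustrate the formula $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$.

```python
import math


def pearson_r(xs, ys):
    """Pearson product moment correlation coefficient of two samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical sample with a moderately strong positive relationship
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4))
```

For this sample, $S_{xy} = 6$, $S_{xx} = 10$ and $S_{yy} = 6$, so r is $6/\sqrt{60}$, about 0.77.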


Before (x) | 65 | 63 | 76 | 46 | 68 | 72 | 68 | 57 | 36 | 96
After (y) | 68 | 66 | 86 | 48 | 65 | 66 | 71 | 57 | 42 | 87

$\alpha = 0.05$.


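Solution

$n = 10$, $\sum xy = 44435$

$\sum x = 647$, $\sum x^2 = 44279$, $\bar{x} = 64.7$

$\sum y = 656$, $\sum y^2 = 44884$, $\bar{y} = 65.6$

$$S_{xy} = 44435 - \frac{(647)(656)}{10} = 1991.8$$

$$S_{xx} = 44279 - \frac{647^2}{10} = 2418.1$$

$$S_{yy} = 44884 - \frac{656^2}{10} = 1850.4$$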


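a) $\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{1991.8}{2418.1} = 0.8237$

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 65.6 - 0.8237(64.7) = 12.3063$

$\hat{y} = 12.3063 + 0.8237x$

b) $x = 60$

$\hat{y} = 12.3063 + 0.8237(60) = 61.7283$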
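Parts a) and b) can be checked from the raw scores with a short Python sketch. The helper name `fit_line` is hypothetical, and the code keeps full precision instead of rounding the slope to 0.8237, so the last digits may differ slightly from the hand calculation.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through (xs, ys)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    slope = s_xy / s_xx
    return sum_y / n - slope * sum_x / n, slope


before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]   # x
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]    # y

intercept, slope = fit_line(before, after)
predicted = intercept + slope * 60                   # part b): x = 60
print(round(intercept, 4), round(slope, 4), round(predicted, 2))
```

The fitted slope and intercept should agree with 0.8237 and 12.3063 to four decimal places, and the prediction at x = 60 with 61.73 to two.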


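c)

1. $H_0: \beta_1 = 0$ (no linear relationship)

$H_1: \beta_1 \neq 0$ (a linear relationship exists)

2. $t_{test} = \frac{\hat{\beta}_1}{\sqrt{Var(\hat{\beta}_1)}} = \frac{0.8237}{\sqrt{0.0108}} = 7.9260$

where

$$Var(\hat{\beta}_1) = \frac{1}{n-2} \left( \frac{S_{yy} - \hat{\beta}_1 S_{xy}}{S_{xx}} \right) = \frac{1}{8} \left( \frac{1850.4 - 0.8237(1991.8)}{2418.1} \right) = 0.0108$$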


3. $\alpha = 0.05$, $t_{0.025,8} = 2.306$

4. $t_{test} = 7.926 > t_{0.025,8} = 2.306$

Reject $H_0$: the scores before are linearly related to the scores after the trip.

[Figure: two-tailed rejection region with critical values -2.306 and 2.306.]


d) $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{1991.8}{\sqrt{(2418.1)(1850.4)}} = 0.9416$

There is a strong positive linear relationship between the scores obtained before and after.


Suppose you wish to investigate the relationship between the number of hours students spent studying for an examination and the marks they achieved.

Student | A | B | C | D | E | F | G | H
Number of hours (x) | 5 | 8 | 9 | 10 | 10 | 12 | 13 | 15
Final marks (y) | 49 | 60 | 55 | 72 | 65 | 80 | 82 | 85

The number of hours students spent studying for an examination (x, the independent variable) will cause the mark they achieved (y, the dependent variable).


Linear regression model: $\hat{y} = 26.89 + 4.06x$

Strong positive linear correlation.

89.26% of the variation in marks achieved is due to variation in the number of hours students spent studying.
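The fitted line and the 89.26% figure can be reproduced from the table of hours and marks. The sketch below is illustrative only: the variable names are hypothetical, and it assumes the percentage quoted is the coefficient of determination $r^2 = S_{xy}^2 / (S_{xx} S_{yy})$, which matches the stated value.

```python
hours = [5, 8, 9, 10, 10, 12, 13, 15]        # x, students A to H
marks = [49, 60, 55, 72, 65, 80, 82, 85]     # y

n = len(hours)
sum_x, sum_y = sum(hours), sum(marks)
s_xy = sum(x * y for x, y in zip(hours, marks)) - sum_x * sum_y / n
s_xx = sum(x * x for x in hours) - sum_x ** 2 / n
s_yy = sum(y * y for y in marks) - sum_y ** 2 / n

slope = s_xy / s_xx
intercept = sum_y / n - slope * sum_x / n
r_squared = s_xy ** 2 / (s_xx * s_yy)        # proportion of variation explained

print(round(intercept, 2), round(slope, 2), round(100 * r_squared, 2))
```

Here $S_{xy} = 274$, $S_{xx} = 67.5$ and $S_{yy} = 1246$, which gives a slope near 4.06, an intercept near 26.89 and an $r^2$ of about 0.8926.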


This chapter introduces important methods (regression) for making inferences about a relationship between two variables and for describing such a relationship with an equation that can be used to predict the value of one variable given the value of the other.