Chapter 8 Correlation and regression analysis

Post on 19-Dec-2015


Statistics in Practice

In Western countries the restaurant trade has an unwritten rule: diners are expected to leave a tip for the service they receive. Many people have heard of this, but how much should the tip be?

Is it about 16% of the bill? Consider Table 10-1, which gives sample data collected by survey. By observing and analyzing these data we can find the quantitative relationship between the two variables.

Table 10-1 Bill and tip data (dollars)

Bill   33.5   50.7   87.9   98.8   63.6   107.3   120.7
Tip     5.5    5.0    8.1   17.0   12.0    16.0    18.6

The questions are:

1 Is there enough evidence to conclude that a relationship exists between the bill and the tip?

2 If the relationship exists, how can it be used to determine how much tip should be left?

This chapter is about drawing inferences from sample data that come in pairs. In the example above we want to find out whether a relationship exists between the bill and the tip and, if it does, to describe it with a formula, so that we can find the rule people follow when they tip. Many questions have this form, for example the relationship between:

(1) The crime rate and the theft rate;

(2) Cigarette consumption and the incidence of cancer;

(3) Individual earnings and years of education;

(4) Age and blood pressure;

(5) The height of parents and the height of their children;

(6) Salaries and the price of alcohol;

(7) The length of the lifeline on a person's palm and the length of that person's life.

Key points

1 Correlation relationships and their description by a regression equation;

2 Determining whether a correlation relationship exists;

3 Fitting the regression equation;

4 Applying the regression equation.

Difficult points

1 Calculating the product-moment correlation coefficient

2 The total sum of squared deviations and its decomposition


Overview of correlation relationships

1 Relationships between variables

(1) Functional relationship

Definition: a completely determined (quantitative) relationship.

A: Each value of one group of variables corresponds to exactly one value of the other group.

[Example] Piece-rate wages (y) and output (x): y = f(x) = 10x; x0 = 1 piece, y0 = 10 yuan; x1 = 2 pieces, y1 = 20 yuan. Area of a circle: S = πR²; R = 10, S = 100π.

B: y is the explained variable (dependent variable); x is the explanatory variable (independent variable).

(2) Correlation relationship

Definition: a relationship that is not completely determined.

A: One group of variables is related to the other, but not in one-to-one correspondence.

[Example] Height y and weight x: A: x = 60 kg, y = 1.70 m; B: x = 60 kg, y = 1.72 m; C: x = 60 kg, y = 1.68 m; D: x = 60 kg, y = 1.65 m.

B: Description: y = f(x) + ε. Factors that affect height: weight, heredity, exercise, quality of sleep.

2 Causes

(1) Some influencing factors have not been recognized;

(2) Some factors are recognized but cannot be measured;

(3) Measurement error.

[Example] A fruit costs P yuan per kilo, so the purchase amount is y = Px. For a quantity x = 2 kilos, y = 2P + ε = 2 × 1.9 + 0.2.

3 Forms of quantitative relationships

(1) One-way cause and effect;

(2) Mutual cause and effect;

(3) Concomitant relationships.

4 Types of correlation

(1) By degree of correlation. A: Perfect correlation: a functional relationship; B: No correlation: no relationship; C: Imperfect correlation.

(2) By direction of correlation. A: Positive correlation: the variables change in the same direction (they increase together and decrease together); B: Negative correlation: the variables change in opposite directions (one increases while the other decreases).

(3) By form of correlation. A: Linear correlation; B: Nonlinear correlation.

[Figures: one scatter diagram in which the correlation is close and one in which it is not close.]

(4) By number of influencing factors

A: Simple correlation: only one independent variable.

[Example] Study grades and the time spent studying; blood pressure and age; yield per unit of area and the amount of fertilizer applied.

B: Multiple correlation: two or more independent variables.

[Example] The relationship between economic growth and population growth, the level of science and technology, natural resources, the level of management and so on; the relationship between weight and appetite, sleeping time and so on.

C: Partial correlation: measure the degree of correlation between two variables among several, holding the other variables constant.

[Example] For y = ax1 + bx2 + ε, investigate the relationship between y and x1, holding x2 constant.

Measuring linear correlation

[Purpose] Measure the direction and closeness of the correlation among the variables.

1 Correlation tables and graphs

(1) Correlation table

A: Correlation table grouped on a single variable: the independent variable is grouped and its frequencies are counted; for the dependent variable only the group averages are calculated.

Data for 30 enterprises of the same kind

Output x (pieces)   Number of enterprises   Average unit cost y (yuan)
20                  9                       16.8
30                  5                       15.6
40                  5                       15.0
50                  6                       14.8
80                  5                       14.2

B: Correlation table grouped on both variables: both the dependent variable and the independent variable are grouped.

Note: the independent variable goes on the X axis, the dependent variable on the Y axis.

Data for 30 enterprises of the same kind

Unit cost y         Output x (pieces)
(yuan/piece)     20    30    40    50    80    Total
18                4     -     -     -     -       4
16                4     3     1     1     -       9
15                1     2     3     3     1      10
14                -     -     1     2     4       7
Total             9     5     5     6     5      30

(2) Correlation graph: scatter diagram

[Shortcoming] Tables and graphs cannot reflect the closeness of the correlation accurately.

2 (Linear) correlation coefficient

(1) Product-moment formula

Suppose $(x_i, y_i)$, $i = 1, \dots, n$, is a set of sample observations of $(X, Y)$. Then the correlation coefficient of x and y is

$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y},$$

where $\sigma_{xy}$ is the covariance and $\sigma_x$, $\sigma_y$ are the standard deviations. Written out,

$$r = \frac{\sum (x-\bar x)(y-\bar y)/n}{\sqrt{\sum (x-\bar x)^2/n}\,\sqrt{\sum (y-\bar y)^2/n}} = \frac{\sum (x-\bar x)(y-\bar y)}{\sqrt{\sum (x-\bar x)^2 \sum (y-\bar y)^2}} = \frac{L_{xy}}{\sqrt{L_{xx} L_{yy}}}.$$

(2) The role of the covariance $\sigma_{xy}$

First, it shows the direction of the correlation between x and y.

[Figure: scatter diagram of the points $(x_1, y_1), \dots, (x_n, y_n)$ on axes X and Y; the lines $x = \bar x$ and $y = \bar y$ divide the plane into quadrants (1) to (4).]

[Positive correlation] When the points lie mainly in quadrants (1) and (3), $(x-\bar x)(y-\bar y) > 0$, so

$$\sigma_{xy} = \frac{\sum (x-\bar x)(y-\bar y)}{n} > 0 \quad\Rightarrow\quad r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} > 0.$$

[Negative correlation] When the points lie mainly in quadrants (2) and (4), $(x-\bar x)(y-\bar y) < 0$, so

$$\sigma_{xy} = \frac{\sum (x-\bar x)(y-\bar y)}{n} < 0 \quad\Rightarrow\quad r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} < 0.$$

Second, it shows the degree of correlation between x and y.

[Figure: two scatter diagrams on axes X and Y with marked points P and Q. Panel A: densely distributed points. Panel B: loosely scattered points.]

With the points in quadrants (1) and (3) (positive correlation), $\sum (x-\bar x)(y-\bar y)$ tends to be larger in panel A (dense distribution) and smaller in panel B (scattered distribution). The same comparison holds in absolute value for negative correlation, with the points in quadrants (2) and (4).

[No correlation]

Panel A: $x - \bar x = 0 \Rightarrow \sum (x-\bar x)(y-\bar y) = 0 \Rightarrow \sigma_{xy} = 0$

Panel B: $y - \bar y = 0 \Rightarrow \sum (x-\bar x)(y-\bar y) = 0 \Rightarrow \sigma_{xy} = 0$

In both panels x and y have no linear correlation.

[Conclusion] The role of $\sigma_{xy}$

Firstly, it shows the direction of the correlation between x and y. Since $r = \sigma_{xy} / (\sigma_x \sigma_y)$:

$\sigma_{xy} = 0 \Rightarrow r = 0$: no linear correlation
$\sigma_{xy} > 0 \Rightarrow r > 0$: positive correlation
$\sigma_{xy} < 0 \Rightarrow r < 0$: negative correlation

Secondly, it shows the closeness of the correlation between x and y:

the larger $|\sigma_{xy}|$ is, the higher the degree of correlation between x and y;
the smaller $|\sigma_{xy}|$ is, the lower the degree of correlation between x and y.

(3) The role of $\sigma_x$ and $\sigma_y$

Dividing by the standard deviations standardizes the covariance, so that covariances of different variables can be compared directly:

$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{\sum (x-\bar x)(y-\bar y)/n}{\sigma_x \sigma_y} = \frac{1}{n} \sum \frac{x-\bar x}{\sigma_x} \cdot \frac{y-\bar y}{\sigma_y}$$

That is, r is the standardized covariance.
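The standardization idea above can be sketched in Python. This is a minimal illustration, assuming a small made-up height and weight sample; the data and the function names `covariance` and `standardized_covariance` are hypothetical and not part of the chapter. It suggests why r, unlike the covariance, does not depend on the units of measurement.

```python
import math

def covariance(xs, ys):
    """Covariance with divisor n: sum((x - mean_x) * (y - mean_y)) / n."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def standardized_covariance(xs, ys):
    """r computed as the covariance divided by both standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # covariance of x with itself is its variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical sample: heights in centimetres and weights in kilograms.
heights_cm = [150.0, 160.0, 170.0, 180.0]
weights_kg = [52.0, 58.0, 69.0, 75.0]
heights_m = [h / 100 for h in heights_cm]  # same heights, different unit

cov_cm = covariance(heights_cm, weights_kg)
cov_m = covariance(heights_m, weights_kg)
r_cm = standardized_covariance(heights_cm, weights_kg)
r_m = standardized_covariance(heights_m, weights_kg)

print(f"covariance: {cov_cm:.3f} (cm) vs {cov_m:.5f} (m)")
print(f"r:          {r_cm:.4f} (cm) vs {r_m:.4f} (m)")
```

Changing the unit rescales the covariance by a factor of 100 but leaves r unchanged, which is the sense in which standardized covariances are directly comparable.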

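Range of r: $-1 \le r \le 1$, that is, $|r| \le 1$

A sum of squares cannot be negative, so

$$\frac{1}{n} \sum \left( \frac{x-\bar x}{\sigma_x} + \frac{y-\bar y}{\sigma_y} \right)^2 \ge 0.$$

Expanding the square and using $\sum (x-\bar x)^2 / n = \sigma_x^2$ and $\sum (y-\bar y)^2 / n = \sigma_y^2$:

$$\frac{1}{n} \sum \frac{(x-\bar x)^2}{\sigma_x^2} + \frac{2}{n} \sum \frac{(x-\bar x)(y-\bar y)}{\sigma_x \sigma_y} + \frac{1}{n} \sum \frac{(y-\bar y)^2}{\sigma_y^2} = 1 + 2r + 1 \ge 0$$

$$2 + 2r \ge 0 \quad\Rightarrow\quad r \ge -1$$

In the same way, starting from the difference instead of the sum of the two standardized terms, it can be proved that $r \le 1$.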

(4) Shortcut formula for the product-moment correlation coefficient

$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{\sum (x-\bar x)(y-\bar y)/n}{\sqrt{\sum (x-\bar x)^2/n}\,\sqrt{\sum (y-\bar y)^2/n}} = \frac{\sum (x-\bar x)(y-\bar y)}{\sqrt{\sum (x-\bar x)^2 \sum (y-\bar y)^2}}$$

Numerator:

$$\sum (x-\bar x)(y-\bar y) = \sum (xy - \bar x y - x \bar y + \bar x \bar y) = \sum xy - \bar x \sum y - \bar y \sum x + n \bar x \bar y$$

$$= \sum xy - \frac{\sum x \sum y}{n} - \frac{\sum x \sum y}{n} + \frac{\sum x \sum y}{n}$$

Conclusion:

$$\sum (x-\bar x)(y-\bar y) = \sum xy - \frac{\sum x \sum y}{n}$$

[Shortcut calculation formula] Denominator:

$$\sum (x-\bar x)^2 = \sum (x^2 - 2x\bar x + \bar x^2) = \sum x^2 - 2\bar x \sum x + n \bar x^2 = \sum x^2 - \frac{2(\sum x)^2}{n} + \frac{(\sum x)^2}{n}$$

Conclusion:

$$\sum (x-\bar x)^2 = \sum x^2 - \frac{(\sum x)^2}{n}, \qquad \sum (y-\bar y)^2 = \sum y^2 - \frac{(\sum y)^2}{n}$$

[Concise calculation formula for r]

$$r = \frac{\sum (x-\bar x)(y-\bar y)}{\sqrt{\sum (x-\bar x)^2 \sum (y-\bar y)^2}} = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \dfrac{(\sum x)^2}{n}}\,\sqrt{\sum y^2 - \dfrac{(\sum y)^2}{n}}} = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - (\sum x)^2}\,\sqrt{n \sum y^2 - (\sum y)^2}}$$

Dividing numerator and denominator by n instead gives the form in terms of means:

$$r = \frac{\overline{xy} - \bar x \bar y}{\sqrt{\overline{x^2} - \bar x^2}\,\sqrt{\overline{y^2} - \bar y^2}} = \frac{\overline{xy} - \bar x \bar y}{\sigma_x \sigma_y}$$

(5) Rules of thumb for judging linear correlation

$|r| < 0.3$: negligible correlation
$0.3 \le |r| < 0.5$: low correlation
$0.5 \le |r| < 0.8$: marked correlation
$0.8 \le |r| < 1$: high correlation
$|r| = 0$: x and y have no linear relationship, but may be related in other ways
$|r| = 1$: x and y have a perfect linear relationship (a functional relationship)

[Example] To study the quantitative relationship between restaurant bills and tips, 10 customers were selected by random sampling. The amounts obtained are as follows (unit: dollars):

Consumption   33.5   50.7   87.9   98.8   63.6   107.3   120.7   78.5   102.3   140.6
Tip            5.5    5.0    8.1   17.0   12.0    16.0    18.6    9.4    15.4    22.5
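As a check on the formulas above, the following Python sketch computes r for the ten bill and tip pairs twice, once from the definition and once from the shortcut formula. The function names are illustrative, and the last tip is taken as 22.5, the value consistent with the sums Σy = 129.5 and Σy² = 1987.59 used later in the chapter.

```python
import math

bills = [33.5, 50.7, 87.9, 98.8, 63.6, 107.3, 120.7, 78.5, 102.3, 140.6]
tips = [5.5, 5.0, 8.1, 17.0, 12.0, 16.0, 18.6, 9.4, 15.4, 22.5]

def r_definition(xs, ys):
    """r from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def r_shortcut(xs, ys):
    """r from raw sums: (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2)(n*Syy - Sy^2))."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r_def = r_definition(bills, tips)
r_short = r_shortcut(bills, tips)
print(f"r (definition) = {r_def:.4f}, r (shortcut) = {r_short:.4f}")
```

Both routes agree and round to 0.92, which by the rules of thumb above counts as a high correlation.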

Some people believe that the length of the lifeline on the palm can forecast a person's life span.

In a letter published in the Journal of the American Medical Association, M. E. Wilson and L. E. Mather refuted this belief with a study of cadavers. Ages at death were recorded together with the lengths of the palm lifelines. The authors concluded that there is no significant correlation between age at death and the length of the lifeline. Palmistry loses, hands down.

(6) Characteristics of the sample correlation coefficient

1. Both variables are random variables.

2. The two variables are treated symmetrically: $r_{xy} = r_{yx}$.

3. How close $|r|$ is to 1 depends on the sample size n: when n is small, $|r|$ tends toward 1. Special case: when n = 2, $|r| = 1$.

[Example] The sample (x, y) is (6, 12.6), (1, 3.0), with n = 2:

$$r = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - (\sum x)^2}\,\sqrt{n \sum y^2 - (\sum y)^2}} = \frac{48}{\sqrt{25 \times 92.16}} = \frac{48}{48} = 1$$
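The n = 2 special case is easy to confirm numerically. The sketch below applies the shortcut formula to the chapter's two points and to a second, made-up pair with a falling line (that pair and the function name are illustrative only).

```python
import math

def r_shortcut(xs, ys):
    """Correlation coefficient from raw sums (shortcut formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# The chapter's sample: (6, 12.6) and (1, 3.0).
r_chapter = r_shortcut([6, 1], [12.6, 3.0])

# Hypothetical second sample in which y falls as x rises.
r_falling = r_shortcut([1, 4], [9.0, 2.5])

print(f"chapter sample: r = {r_chapter:.6f}")
print(f"falling sample: r = {r_falling:.6f}")
```

Any two distinct points lie exactly on a straight line, so a sample of size 2 gives r = 1 or r = -1 regardless of whether the variables are related, which is why a large |r| from a very small sample carries little evidence.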

[Example] Ten stores are drawn at random from 100 stores.

[Table: store, money and profit (%); the values are not recoverable.]

(7) Common errors concerning correlation

When interpreting the results of a correlation analysis, three errors are common.

1. Concluding that correlation implies cause and effect. For example, one study indicated that the salaries of statistics professors are positively correlated with per capita beer consumption, but both variables are affected by economic conditions.

2. Concluding that a correlation coefficient of zero means the variables are certainly unrelated. A zero coefficient only rules out a linear relationship.

3. Confusing the degree of correlation found in averaged data with that found in individual data. For example, in one study the paired data on individual income and education gave a linear correlation coefficient of 0.4, but when regional averages were used the linear correlation coefficient rose to 0.7.
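The second error can be illustrated with a small Python sketch. The data are made up for the purpose: y is completely determined by x through y = x², yet the linear correlation coefficient is zero. The function name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: a perfect functional (but nonlinear) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)
print(f"r = {abs(r):.4f} although y is an exact function of x")
```

Here the positive and negative products of deviations cancel exactly, so r = 0 only says that no straight line describes the data.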

(8) Hypothesis test for linear correlation (two methods)

1. State the null and alternative hypotheses: $H_0: \rho = 0$, $H_1: \rho \ne 0$.

2. Choose the significance level α.

3. Choose the test method and construct the test statistic.

4. Compare the test statistic with the critical value. If the absolute value of the test statistic is larger than the critical value, reject the null hypothesis; otherwise do not reject it.

t test:

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

r test: use the computed r itself as the test statistic; its critical value can be found in a table.

[Example] As in the earlier example, r between the bill and the tip is 0.92.

$$H_0: \rho = 0, \qquad H_1: \rho \ne 0$$

r test: n = 10, r = 0.92, $r_\alpha = 0.632$. Since $r > r_\alpha$, reject the null hypothesis and conclude that there is a significant linear correlation between the two.

t test: if $|t|$ exceeds the critical value of t with n - 2 degrees of freedom, reject the null hypothesis and conclude that there is a significant linear relationship between the bill and the tip.
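Both tests can be sketched in a few lines of Python for the bill and tip example (n = 10, r = 0.92). The r critical value 0.632 is the one quoted in the chapter. The t critical value 2.306 is an assumption added here: it is the standard two-sided table value for α = 0.05 with 8 degrees of freedom, and the chapter does not state which α it uses.

```python
import math

def t_statistic(r, n):
    """t = r / sqrt((1 - r^2) / (n - 2)), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

n, r = 10, 0.92
R_CRITICAL = 0.632  # critical r quoted in the chapter for n = 10
T_CRITICAL = 2.306  # assumed: two-sided t table value, alpha = 0.05, 8 df

t = t_statistic(r, n)
reject_by_r = abs(r) > R_CRITICAL
reject_by_t = abs(t) > T_CRITICAL

print(f"t = {t:.2f}")
print(f"reject H0 by r test: {reject_by_r}, by t test: {reject_by_t}")
```

Under these assumptions the two methods lead to the same decision, as expected, since the r table is derived from the t distribution.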

Section 3 Regression analysis

A. Overview of regression analysis

(1) Concepts

1. Linear correlation analysis: calculate the linear correlation coefficient r to establish the direction and closeness of the correlation between two variables.

[Limitation] It cannot indicate cause and effect between the two variables, and it cannot predict the change of a variable (y) from one or several variables ($x_i$).

Bills and tips of ten restaurant customers (r = 0.92)

Bill x   33.5   50.7   63.6   78.5   87.9   98.8   107.3   102.3   120.7   140.6
Tip y     5.5    5.0   12.0    9.4    8.1   17.0    16.0    15.4    18.6    22.5

2. Regression analysis: explain the change of one variable through the change of other variables.

$y = a + bx$; $y = a + b_1 x_1 + b_2 x_2$; $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$

[Regression] The term was first introduced by the English biologist F. Galton in a study of the height of parents (x) and the height of their offspring (y), $y = f(x) + \varepsilon$, in which offspring height tends back toward the average height.

(2) Types of regression analysis

1. Classified by the number of independent variables

(1) Simple regression: only one independent variable

[Example] $y = a + bx$, the simple regression equation

(2) Multiple regression: two or more independent variables

[Example] $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$

2. Classified by the form of the regression equation

(1) Linear regression: the dependent variable is a linear function of the independent variable

[Example] $y = a + bx$, the simple linear regression equation

(2) Nonlinear regression: the dependent variable is a nonlinear function of the independent variable

[Example] Hyperbolic regression equation; exponential regression equation; logarithmic regression equation; power-function regression equation

(3) Steps of regression analysis

1. Identify the independent and dependent variables

[Example] Grain output (y) and amount of fertilizer applied (x); consumption expenditure (y) and national income (x); fire loss (y) and the distance between the fire and the nearest fire station (x).

2. Establish the sample regression equation

3. Carry out statistical tests

4. Forecast or control

[Example] The regression equation of consumption on income: $\hat y = a + bx = 200 + 0.15x$

Given x, determine y: estimation or forecasting

Given y, determine x: control
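The two uses of a fitted equation in step 4 can be sketched in Python using the chapter's example $\hat y = 200 + 0.15x$. The function names and the income figure of 1000 are illustrative assumptions, not values from the chapter.

```python
A, B = 200.0, 0.15  # intercept and slope from the chapter's example

def forecast_consumption(income):
    """Forecasting: given x (income), estimate y (consumption)."""
    return A + B * income

def income_for_target(consumption):
    """Control: given a target y, solve y = A + B * x for x."""
    return (consumption - A) / B

predicted = forecast_consumption(1000.0)  # hypothetical income level
required = income_for_target(predicted)   # invert the same equation

print(f"forecast consumption at income 1000: {predicted:.1f}")
print(f"income needed for consumption {predicted:.1f}: {required:.1f}")
```

Inverting the equation is only a sketch of the control idea; it treats the fitted line as exact and ignores the random disturbance discussed in the next section.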

B. Fitting the simple linear regression equation

(1) Population regression equation

[Example] Disposable income and consumption expenditure of 40 families

[Table: income and consumption of the 40 families in five groups (first to fifth), with the conditional probability and the conditional mean of consumption for each group; the values are not recoverable.]

[Figure: consumption $Y_i$ (axis 50 to 200) against income $X_i$ (axis 80 to 200), showing the distribution of Y at each level of X and the population regression line.]

[Assumption] The means of the distributions of Y all lie on one straight line, the population regression line:

$$E(Y_i \mid X_i) = \alpha + \beta X_i$$

Premise 1: there is a linear relationship between X and E(Y|X).

Premise 2: N

Premise 3: the effects of chance factors cancel out.

An individual value differs from its conditional mean by a random disturbance $\varepsilon_i$:

$$Y_i \mid X_i = \text{conditional mean} + \varepsilon_i = \alpha + \beta X_i + \varepsilon_i$$

[Figure: the same axes, with the disturbance $\varepsilon_i$ marked for a value $Y_i \mid X_i = 160$ above the population regression line.]

Random disturbance and its assumption, where $\bar Y_i$ denotes the conditional mean:

$$\operatorname{var}(\varepsilon_i) = \frac{\sum (Y_i - \bar Y_i)^2}{N} = \sigma^2$$

Sample regression equation: $\hat y = a + bx$, which estimates the population regression line $E(Y_i \mid X_i) = \alpha + \beta X_i$.

[Fitting idea] Draw a sample of size n from the population of size N.

(2) Fitting the sample regression equation

Draw a random sample from the population to obtain a set of sample observations.

[Example] Disposable income and consumption expenditure of 40 families

[Table: income, consumption, conditional probability and conditional mean; the values are not recoverable.]

[Figure: sample points on the same axes ($Y_i$ from 50 to 200, $X_i$ from 80 to 200) with the sample regression line and the residuals $e_1$, $e_2$ marked.]

Sample regression equation (line):

$$y_i = \hat y_i + e_i = a + b x_i + e_i$$

Residual: observed value minus regression value, $e_i = y_i - \hat y_i$.

Regression coefficients: the sample a estimates the population α; the sample b estimates the population β.

The population regression equation is unknown; the sample regression line is known.

Steps: 1. Use the sample data to fit the sample regression line, keeping the error as small as possible; 2. Test how well the sample regression line can stand in for the population regression line.

(3) Methods for fitting the sample regression equation

Population: $E(Y \mid X) = \alpha + \beta X$. Sample: $\hat y = a + bx$.

1. Absolute-value method

The line that makes the sum of the absolute residuals smallest is the "best line".

2. Ordinary least squares (OLS)

Basic idea: the line that makes the residual sum of squares smallest is the "best line":

$$Q = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat y_i)^2 = \sum_{i=1}^{n} (y_i - a - b x_i)^2 = \min$$

Finding the best line means finding the best a and b.

We look for the values of a and b that make Q smallest:

$$Q = \sum (y - \hat y)^2 = \sum (y - a - bx)^2 = \min$$

Setting the partial derivatives to zero:

$$\frac{\partial Q}{\partial a} = 2 \sum (y - a - bx)(-1) = 0$$

$$\frac{\partial Q}{\partial b} = 2 \sum (y - a - bx)(-x) = 0$$

We get the normal equations:

$$\sum y = na + b \sum x \qquad (1)$$

$$\sum xy = a \sum x + b \sum x^2 \qquad (2)$$

From equation (1):

$$a = \frac{\sum y}{n} - b \frac{\sum x}{n} = \bar y - b \bar x$$

Substituting a into equation (2), we get

$$\sum xy = \left( \frac{\sum y}{n} - b \frac{\sum x}{n} \right) \sum x + b \sum x^2 = \frac{\sum x \sum y}{n} - b \frac{(\sum x)^2}{n} + b \sum x^2$$

Rearranging:

$$b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}}$$

[Simplified calculation]

$$a = \frac{\sum y}{n} - b \frac{\sum x}{n} = \bar y - b \bar x$$

$$b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2}$$

Since $\sum (x-\bar x)^2 = \sum x^2 - (\sum x)^2 / n$ and $\sum (x-\bar x)(y-\bar y) = \sum xy - \sum x \sum y / n$,

$$b = \frac{\sum (x-\bar x)(y-\bar y)}{\sum (x-\bar x)^2} = \frac{\sum (x-\bar x)(y-\bar y)/n}{\sum (x-\bar x)^2/n} = \frac{\sigma_{xy}}{\sigma_x^2}$$

The relationship between the correlation coefficient r and the regression coefficient b

For the regression equation $\hat y = a + bx$ we have

$$r = \frac{\sigma_{xy}}{\sigma_x \sigma_y}, \qquad b = \frac{\sigma_{xy}}{\sigma_x^2}$$

so that

$$r = \frac{\sigma_{xy}}{\sigma_x^2} \cdot \frac{\sigma_x}{\sigma_y} = b \frac{\sigma_x}{\sigma_y}, \qquad b = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \cdot \frac{\sigma_y}{\sigma_x} = r \frac{\sigma_y}{\sigma_x}$$

(1) r and b always have the same sign;

(2) r reflects the direction and closeness of the correlation, while b reflects the average change in y when x changes by one unit.
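The derivation above can be checked numerically. The following Python sketch fits a line from the raw sums, then recomputes the slope as $\sigma_{xy} / \sigma_x^2$ and as $r \sigma_y / \sigma_x$. The five data points and the function names `fit_line` and `moments` are made up for illustration.

```python
import math

def fit_line(xs, ys):
    """Least-squares a and b from the normal equations, written with raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b

def moments(xs, ys):
    """Return (sigma_xy, sigma_x, sigma_y), all with divisor n."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sigma_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sigma_xy, sigma_x, sigma_y

# Hypothetical data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
sigma_xy, sigma_x, sigma_y = moments(xs, ys)
r = sigma_xy / (sigma_x * sigma_y)
b_from_cov = sigma_xy / sigma_x ** 2
b_from_r = r * sigma_y / sigma_x

# First normal equation: the residuals of the fitted line sum to zero.
residual_sum = sum(y - (a + b * x) for x, y in zip(xs, ys))

print(f"a = {a:.3f}, b = {b:.3f}, r = {r:.4f}")
print(f"b from covariance = {b_from_cov:.3f}, b from r = {b_from_r:.3f}")
```

All three expressions for b agree, and b and r share the same sign, as points (1) and (2) state.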

[Example] To study the relationship between restaurant bills and tips, ten restaurant customers were drawn at random. The sample data are as follows (unit: dollars):

Consumption   33.5   50.7   63.6   78.5   87.9   98.8   107.3   102.3   120.7   140.6
Tip            5.5    5.0   12.0    9.4    8.1   17.0    16.0    15.4    18.6    22.5

The sample correlation coefficient is r = 0.92. Fit the sample regression equation.

[Figure: scatter diagram of tip (axis 0 to 25) against consumption for the ten customers, drawn with Excel.]

Solution: the scatter diagram shows an approximately linear relationship between the bill and the tip, so we let

$$y = a + bx$$

The required sums are

$$n = 10, \quad \sum x = 883.9, \quad \sum y = 129.5, \quad \sum x^2 = 87703.23, \quad \sum y^2 = 1987.59, \quad \sum xy = 13031.18$$

$$b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{10 \times 13031.18 - 883.9 \times 129.5}{10 \times 87703.23 - 883.9^2} = \frac{15846.75}{95753.09} \approx 0.166$$

$$a = \bar y - b \bar x = \frac{\sum y}{n} - b \frac{\sum x}{n} = 12.95 - 0.166 \times 88.39 = -1.723$$

Regression equation:

$$\hat y = a + bx = -1.723 + 0.166x$$

(Carrying more digits gives b ≈ 0.1655 and a ≈ -1.678; the values above use the slope rounded to 0.166.)

Economic meaning: when the bill increases by 100 dollars, the tip increases by 16.6 dollars on average.
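The worked example can be reproduced with a short Python sketch. It recomputes the sums and the least-squares coefficients from the ten data pairs; the helper name `predicted_tip` is illustrative. Without intermediate rounding the slope comes out near 0.1655 and the intercept near -1.678, whereas the slides round the slope to 0.166 first and so report -1.723.

```python
bills = [33.5, 50.7, 63.6, 78.5, 87.9, 98.8, 107.3, 102.3, 120.7, 140.6]
tips = [5.5, 5.0, 12.0, 9.4, 8.1, 17.0, 16.0, 15.4, 18.6, 22.5]

n = len(bills)
sum_x = sum(bills)
sum_y = sum(tips)
sum_xy = sum(x * y for x, y in zip(bills, tips))
sum_x2 = sum(x * x for x in bills)

# Least-squares slope and intercept from the normal equations.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

def predicted_tip(bill):
    """Fitted tip for a given bill, using the unrounded coefficients."""
    return a + b * bill

print(f"sum x = {sum_x:.1f}, sum y = {sum_y:.1f}, sum xy = {sum_xy:.2f}")
print(f"b = {b:.4f}, a = {a:.3f}")
print(f"fitted tip for a 100 dollar bill: {predicted_tip(100.0):.2f}")
```

The difference between the two intercepts comes only from when the rounding is done; either version describes essentially the same line over the range of the data.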

Analysis of variance for the regression equation

The question: how well does the sample regression line represent the data, that is, what is its goodness of fit?

(1) Decomposition of the total sum of squared deviations

Total deviation = residual + regression deviation:

$$(y - \bar y) = (y - \hat y) + (\hat y - \bar y)$$

Here $y = a + bx + e$ and $\hat y = a + bx$, so the residual is $y - \hat y = e$. Since $\bar y = a + b \bar x$, the regression deviation is

$$\hat y - \bar y = (a + bx) - (a + b \bar x) = b(x - \bar x)$$

[Figure: the line $\hat y = a + bx$ on axes x and y, with the horizontal line $y = \bar y$ and, for one point, the segments $(y - \bar y)$, $(y - \hat y)$ and $(\hat y - \bar y)$ marked.]

Starting from $(y - \bar y) = (y - \hat y) + (\hat y - \bar y)$, square both sides and sum:

$$\sum (y - \bar y)^2 = \sum [(y - \hat y) + (\hat y - \bar y)]^2 = \sum (y - \hat y)^2 + \sum (\hat y - \bar y)^2 + 2 \sum (y - \hat y)(\hat y - \bar y)$$

The cross term vanishes. Using $\hat y = a + bx$ and $\bar y = a + b \bar x$:

$$\sum (y - \hat y)(\hat y - \bar y) = \sum [(y - \bar y) - b(x - \bar x)] \, b(x - \bar x) = b \left[ \sum (x - \bar x)(y - \bar y) - b \sum (x - \bar x)^2 \right] = 0$$

because

$$b = \frac{\sum (x - \bar x)(y - \bar y)}{\sum (x - \bar x)^2} \quad\Rightarrow\quad \sum (x - \bar x)(y - \bar y) = b \sum (x - \bar x)^2$$

[Analysis of deviation]

Total sum of squares (SST) = residual sum of squares (SSE) + regression sum of squares (SSR):

$$\sum (y - \bar y)^2 = \sum (y - \hat y)^2 + \sum (\hat y - \bar y)^2$$

(1) SSE: the error that comes from the residuals

$$\sum (y - \hat y)^2 = \sum (y - a - bx)^2 = \sum e^2$$

The smaller the error e, the closer $\hat y$ is to y and the better the fit; the larger the error e, the farther $\hat y$ is from y and the worse the fit.

(2) SSR: the deviation that comes from the change in x

$$\sum (\hat y - \bar y)^2 = \sum (a + bx - a - b \bar x)^2 = b^2 \sum (x - \bar x)^2$$
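The decomposition can be verified on the bill and tip data with the Python sketch below. It fits the line by least squares, then computes the three sums of squares and the cross term directly from their definitions (the variable names are illustrative).

```python
bills = [33.5, 50.7, 63.6, 78.5, 87.9, 98.8, 107.3, 102.3, 120.7, 140.6]
tips = [5.5, 5.0, 12.0, 9.4, 8.1, 17.0, 16.0, 15.4, 18.6, 22.5]

n = len(bills)
mean_x = sum(bills) / n
mean_y = sum(tips) / n

# Least-squares coefficients in deviation form.
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(bills, tips))
     / sum((x - mean_x) ** 2 for x in bills))
a = mean_y - b * mean_x
fitted = [a + b * x for x in bills]

sst = sum((y - mean_y) ** 2 for y in tips)                    # total
sse = sum((y - f) ** 2 for y, f in zip(tips, fitted))         # residual
ssr = sum((f - mean_y) ** 2 for f in fitted)                  # regression
cross = sum((y - f) * (f - mean_y) for y, f in zip(tips, fitted))

print(f"SST = {sst:.3f}, SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"SSE + SSR = {sse + ssr:.3f}, cross term = {abs(cross):.6f}")
```

The cross term is zero up to floating-point error only because a and b are the least-squares values; for any other line the identity SST = SSE + SSR would fail.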

Chapter 8 Correlation and Regression analysisSTAT

Page 53: Chapter 8 Correlation and regression analysis. Chapter 8 Correlation and Regression Analysis Statistics in Practice The restaurant vocation in western

is larger (y-y) is smaller the effect of y fit y is good

is smaller (y-y) is larger the effect of y fit y is bad

222 )ˆ()ˆ()( yyyyyy

2

2

2

2

2

2

)()ˆ(

)()ˆ(

)()(

yyyy

yyyy

yyyy

22

2

2

2

)(

)ˆ(

)(

)ˆ(1 r

yy

yy

yy

yy

2

22

)(

)ˆ(1

yy

yyr

2

2

)(

)ˆ(

yy

yy

2

2

r

r

the proportion of SSR account for the STR

(determinant coefficient)

•Determinant coefficient

Chapter 8 Correlation and Regression analysisSTAT

Page 54: Chapter 8 Correlation and regression analysis. Chapter 8 Correlation and Regression Analysis Statistics in Practice The restaurant vocation in western

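• [Interpreting the coefficient of determination]

$$ r^2 = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2} = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}, \qquad 0 \le r^2 \le 1 $$

$r^2 = 1 \;\Rightarrow\; \hat{y} = y$ for every observation (perfect fit)

$r^2 = 0 \;\Rightarrow\; \hat{y} = \bar{y}$ (the regression explains none of the deviation)

[Figure: the regression line $\hat{y} = a + bx$ and the horizontal line $\bar{y}$ on x and y axes, with the total deviation $(y - \bar{y})$ of one point split into the residual $(y - \hat{y})$ and the regression deviation $(\hat{y} - \bar{y})$.]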
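The two boundary cases can be reproduced with small made-up data sets. The numbers below are hypothetical and chosen only so that one set lies exactly on a line and the other has zero covariance between x and y; `coefficient_of_determination` is an illustrative helper, not something defined in the slides.

```python
def coefficient_of_determination(x, y):
    """r^2 = SSR / SST for the least-squares line through (x, y)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    ssr = sum((a + b * xi - y_bar) ** 2 for xi in x)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return ssr / sst


x = [1, 2, 3, 4, 5]

# Hypothetical data lying exactly on y = 2 + 3x: every y_hat equals y.
y_on_line = [5, 8, 11, 14, 17]
r2_perfect = coefficient_of_determination(x, y_on_line)

# Hypothetical data with zero covariance: the fitted line is flat at y_bar.
y_unrelated = [1, 0, 2, 0, 1]
r2_none = coefficient_of_determination(x, y_unrelated)

print(f"points on a line: r^2 = {r2_perfect:.4f}")
print(f"no linear relation: r^2 = {r2_none:.4f}")
```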

Page 55

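The relationship between the coefficient of determination $r^2$ and the correlation coefficient $r$

$$ r^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} $$

and: $\hat{y} = a + bx$, $\bar{y} = a + b\bar{x}$, so

$$ \sum (\hat{y} - \bar{y})^2 = \sum \left[ (a + bx) - (a + b\bar{x}) \right]^2 = b^2 \sum (x - \bar{x})^2 $$

$$ r^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} = \frac{b^2 \sum (x - \bar{x})^2}{\sum (y - \bar{y})^2} = \frac{b^2 \sum (x - \bar{x})^2 / n}{\sum (y - \bar{y})^2 / n} = \frac{b^2 \sigma_x^2}{\sigma_y^2} $$

Since $b = \sigma_{xy} / \sigma_x^2$:

$$ \frac{b^2 \sigma_x^2}{\sigma_y^2} = \left( \frac{\sigma_{xy}}{\sigma_x^2} \right)^2 \frac{\sigma_x^2}{\sigma_y^2} = \left( \frac{\sigma_{xy}}{\sigma_x \sigma_y} \right)^2 = r^2, \qquad b = r \, \frac{\sigma_y}{\sigma_x} $$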
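A quick numerical check of this relationship, assuming the Table 10-1 bill and tip data and population-style moments (division by n, as in the derivation). The variable names are illustrative only.

```python
import math

# Bill and tip data from Table 10-1 (dollars).
bills = [33.5, 50.7, 87.9, 98.8, 63.6, 107.3, 120.7]
tips = [5.5, 5.0, 8.1, 17.0, 12.0, 16.0, 18.6]

n = len(bills)
x_bar, y_bar = sum(bills) / n, sum(tips) / n

# Variances and covariance with divisor n, matching sigma_x, sigma_y, sigma_xy.
var_x = sum((xi - x_bar) ** 2 for xi in bills) / n
var_y = sum((yi - y_bar) ** 2 for yi in tips) / n
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(bills, tips)) / n
sigma_x, sigma_y = math.sqrt(var_x), math.sqrt(var_y)

r = cov_xy / (sigma_x * sigma_y)  # correlation coefficient
b = cov_xy / var_x                # regression slope
a = y_bar - b * x_bar             # regression intercept

# Coefficient of determination from its definition SSR / SST.
ssr = sum((a + b * xi - y_bar) ** 2 for xi in bills)
sst = n * var_y
r2 = ssr / sst

print(f"r = {r:.4f}, r*r = {r * r:.4f}, SSR/SST = {r2:.4f}")
print(f"b = {b:.4f}, r*sigma_y/sigma_x = {r * sigma_y / sigma_x:.4f}")
```

Note that squaring loses the sign: $r$ carries the direction of the relationship (the sign of $b$), while $r^2$ only measures its strength.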

Page 56

• (3) Standard error of the estimate
• 1. Definition: the average error between the observed values and the regression values.
• 2. Formula:

Regression analysis:
population: $E(Y) = \alpha + \beta X$
sample: $\hat{y} = a + bx$

$\sum (y - \hat{y})^2$: the sum of squared deviations between the observed values and the regression values.

$$ S_{yx} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} $$

$S_{yx}$: the average error between $y$ and $\hat{y}$.

The larger $S_{yx}$ is, the worse the fit.

The smaller $S_{yx}$ is, the better the fit.
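The definition translates directly into code. The following is a minimal sketch, assuming simple linear regression with an intercept (hence the divisor n - 2); the function name `standard_error_of_estimate` is hypothetical, and the data are the Table 10-1 bills and tips.

```python
import math


def standard_error_of_estimate(x, y):
    """S_yx = sqrt(sum((y - y_hat)^2) / (n - 2)) for the least-squares line."""
    n = len(x)
    if n <= 2:
        raise ValueError("at least three observations are needed")
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))


# Bill and tip data from Table 10-1 (dollars).
bills = [33.5, 50.7, 87.9, 98.8, 63.6, 107.3, 120.7]
tips = [5.5, 5.0, 8.1, 17.0, 12.0, 16.0, 18.6]

s_yx = standard_error_of_estimate(bills, tips)
print(f"S_yx = {s_yx:.4f} dollars")
```

Because $S_{yx}$ is in the units of $y$, it reads here as the typical size, in dollars, of the gap between an observed tip and the tip predicted from the bill.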

Page 57

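Graph analysis

[Figure: scatter plot of $Y_i$ (vertical axis, 50 to 200) against $X_i$ (horizontal axis, 80 to 200), with the population regression line.]

$$ S_{yx} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} $$

$S_{yx}^2$ is the unbiased estimator of the corresponding population variance $\sigma_{yx}^2$, which measures the dispersion of the $N$ population values of $Y$ about the population regression line.

Simple and fast calculation formula:

$$ S_{yx} = \sqrt{\frac{\sum y^2 - a \sum y - b \sum xy}{n - 2}} $$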
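To see that the fast calculation formula gives the same result as the definition, both can be evaluated on the same data. This is a sketch under the assumption that `a` and `b` are the unrounded least-squares estimates; the data are the Table 10-1 bills and tips, and all names are illustrative.

```python
import math

# Bill and tip data from Table 10-1 (dollars).
bills = [33.5, 50.7, 87.9, 98.8, 63.6, 107.3, 120.7]
tips = [5.5, 5.0, 8.1, 17.0, 12.0, 16.0, 18.6]

n = len(bills)
sum_x, sum_y = sum(bills), sum(tips)
sum_x2 = sum(x * x for x in bills)
sum_y2 = sum(y * y for y in tips)
sum_xy = sum(x * y for x, y in zip(bills, tips))

# Least-squares estimates from the raw sums.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Residual sum of squares: by definition and by the shortcut.
sse_definition = sum((y - (a + b * x)) ** 2 for x, y in zip(bills, tips))
sse_shortcut = sum_y2 - a * sum_y - b * sum_xy

s_definition = math.sqrt(sse_definition / (n - 2))
s_shortcut = math.sqrt(sse_shortcut / (n - 2))
print(f"definition: {s_definition:.6f}, shortcut: {s_shortcut:.6f}")
```

The shortcut subtracts large numbers of similar size, so rounding `a` and `b` to a few decimals before using it can visibly change the result; keeping full precision until the last step avoids that.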

Page 58

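(4) The method of variance and the coefficient of determination

Coefficient of determination:

$$ r^2 = \frac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2} = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2} $$

Dividing numerator and denominator by $n$:

$$ r^2 = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2} = 1 - \frac{\sum (y - \hat{y})^2 / n}{\sum (y - \bar{y})^2 / n} $$

When $n$ is very large, $n \approx n - 2$, so:

$$ r^2 \approx 1 - \frac{\sum (y - \hat{y})^2 / (n - 2)}{\sigma_y^2} = 1 - \frac{S_{yx}^2}{\sigma_y^2} $$

The method of variance:

$$ \frac{S_{yx}^2}{\sigma_y^2} = 1 - r^2 \;\Rightarrow\; S_{yx} = \sigma_y \sqrt{1 - r^2} $$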
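The size of the large-n approximation can be made explicit. The sketch below, which assumes the Table 10-1 bill and tip data (n = 7, so far from "very large"), compares the exact $S_{yx}$ with the variance-method value $\sigma_y \sqrt{1 - r^2}$; the two differ by exactly the factor $\sqrt{n / (n - 2)}$. Names are illustrative only.

```python
import math

# Bill and tip data from Table 10-1 (dollars).
bills = [33.5, 50.7, 87.9, 98.8, 63.6, 107.3, 120.7]
tips = [5.5, 5.0, 8.1, 17.0, 12.0, 16.0, 18.6]

n = len(bills)
x_bar, y_bar = sum(bills) / n, sum(tips) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(bills, tips))
     / sum((x - x_bar) ** 2 for x in bills))
a = y_bar - b * x_bar

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(bills, tips))
sst = sum((y - y_bar) ** 2 for y in tips)

r2 = 1 - sse / sst                # exact coefficient of determination
sigma_y = math.sqrt(sst / n)      # standard deviation of y, divisor n

s_exact = math.sqrt(sse / (n - 2))                 # definition of S_yx
s_variance_method = sigma_y * math.sqrt(1 - r2)    # large-n approximation
ratio = s_exact / s_variance_method

print(f"exact S_yx = {s_exact:.4f}")
print(f"variance method = {s_variance_method:.4f}")
print(f"ratio = {ratio:.4f}, sqrt(n/(n-2)) = {math.sqrt(n / (n - 2)):.4f}")
```

As n grows, the factor $\sqrt{n / (n - 2)}$ approaches 1, which is why the slide restricts the method to large samples.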

Page 59

Example: Given the following data, calculate the coefficient of determination and the standard error of the estimate.

Revenue x   Expenditure y   x²      y²     xy
20          7               400     49     140
30          9               900     81     270
33          8               1089    64     264
40          11              1600    121    440
15          5               225     25     75
13          4               169     16     52
26          8               676     64     208
38          10              1444    100    380
35          9               1225    81     315
43          10              1849    100    430
Total 293   81              9577    701    2574

Page 60

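Answer: The standard error of the estimate is 0.73, that is, the observed values deviate from the regression values by about 0.73 on average, and 88.03% of the total deviation is explained by the change in x.

$$ n = 10, \quad \sum x = 293, \quad \sum y = 81, \quad \sum x^2 = 9577, \quad \sum y^2 = 701, \quad \sum xy = 2574 $$

$$ b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{25740 - 23733}{95770 - 85849} = \frac{2007}{9921} = 0.2023 $$

$$ a = \bar{y} - b\bar{x} = 8.1 - 0.2023 \times 29.3 = 2.1726 $$

$$ \sum (y - \hat{y})^2 = \sum y^2 - a \sum y - b \sum xy = 4.2992 $$

$$ S_{yx} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} = \sqrt{\frac{4.2992}{10 - 2}} = \sqrt{0.5374} = 0.73 $$

$$ r^2 = 1 - \frac{S_{yx}^2}{\sigma_y^2} = 1 - \frac{0.5374}{\overline{y^2} - \bar{y}^2} = 1 - \frac{0.5374}{70.1 - 8.1^2} = 1 - \frac{0.5374}{4.49} = 88.03\% $$

This value of $r^2$ comes from the variance method, which treats $n - 2$ as equal to $n$. With only 10 observations the approximation is rough: computed directly as $1 - 4.2992 / 44.9$, with $\sum (y - \bar{y})^2 = 701 - 10 \times 8.1^2 = 44.9$, the coefficient of determination is about 90.4%.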
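The worked example can be reproduced in a few lines. This is a sketch that follows the slide's own steps (raw sums, fast formula for the residual sum of squares, variance method for $r^2$) and, for comparison, also computes $r^2$ directly from its definition; variable names are illustrative. Run as written it should give roughly a = 2.17, b = 0.2023, $S_{yx}$ = 0.73, and $r^2$ of about 0.88 by the variance method against about 0.90 by the definition.

```python
import math

# Revenue (x) and expenditure (y) from the example table.
revenue = [20, 30, 33, 40, 15, 13, 26, 38, 35, 43]
expenditure = [7, 9, 8, 11, 5, 4, 8, 10, 9, 10]

n = len(revenue)
sum_x = sum(revenue)
sum_y = sum(expenditure)
sum_x2 = sum(x * x for x in revenue)
sum_y2 = sum(y * y for y in expenditure)
sum_xy = sum(x * y for x, y in zip(revenue, expenditure))

# Least-squares slope and intercept from the raw sums.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Residual sum of squares via the fast calculation formula.
sse = sum_y2 - a * sum_y - b * sum_xy
s_yx = math.sqrt(sse / (n - 2))

# Variance of y with divisor n: mean of y^2 minus squared mean of y.
var_y = sum_y2 / n - (sum_y / n) ** 2

r2_variance_method = 1 - s_yx ** 2 / var_y  # the slide's large-n approximation
r2_exact = 1 - sse / (n * var_y)            # 1 - SSE / SST

print(f"a = {a:.4f}, b = {b:.4f}")
print(f"SSE = {sse:.4f}, S_yx = {s_yx:.2f}")
print(f"r^2 (variance method) = {r2_variance_method:.4f}")
print(f"r^2 (definition)      = {r2_exact:.4f}")
```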

Page 61

• Section 3: Multiple linear regression analysis

1. Multiple linear regression model: the study of the quantitative relationship between a dependent variable and two or more independent variables, under the condition that the relationship is linear. The model is:

$$ y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + e_i $$

2. Parameter estimation for the multiple linear regression model: the least squares method. The estimates of the regression coefficients are usually obtained with statistical software. The equations can be expressed in matrix form:

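$$
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
X = \begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{pmatrix}, \quad
U = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}, \quad
B = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}
$$

$$
\hat{Y} = \begin{pmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \vdots \\ \hat{y}_n \end{pmatrix}, \quad
\hat{B} = \begin{pmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \vdots \\ \hat{\beta}_k \end{pmatrix}, \quad
e = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}
$$

In this notation the model reads $Y = XB + U$, the fitted values are $\hat{Y} = X\hat{B}$, and the residuals are $e = Y - \hat{Y}$.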
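The slides leave the computation to statistical software. As a rough illustration of what such software does, the sketch below solves the least-squares normal equations $(X'X)\hat{B} = X'Y$ in plain Python. The normal equations are the standard least-squares result and are not stated in the slides; the helper names (`transpose`, `matmul`, `solve`, `fit_multiple`) and the data are hypothetical, with the data constructed to lie exactly on $y = 1 + 2x_1 + 3x_2$ so the recovered coefficients are easy to check.

```python
def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(ai * bj for ai, bj in zip(row, col)) for col in zip(*b)]
            for row in a]


def solve(a, rhs):
    """Solve a @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_multiple(rows, y):
    """Least-squares estimates [b0, b1, ..., bk] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # leading column of ones for b0
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)


# Hypothetical observations (x1, x2) with y = 1 + 2*x1 + 3*x2 exactly.
observations = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y_values = [1 + 2 * x1 + 3 * x2 for x1, x2 in observations]

coefficients = fit_multiple(observations, y_values)
print([round(c, 6) for c in coefficients])
```

In practice a dedicated routine is preferable to forming $X'X$ by hand, since the normal equations become numerically unstable when the independent variables are strongly correlated.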

Page 62

Thanks for Your Attention