correlation & regression


Upload: hitesh-thakur

Post on 26-Oct-2014


TRANSCRIPT

Page 1: Correlation & Regression

Name of Institution

1

CORRELATION & REGRESSION ANALYSIS

Page 2: Correlation & Regression


CORRELATION

• When the relationship between two variables is of a quantitative nature, the statistical tool for discovering and measuring that relationship and expressing it in a brief formula is known as correlation.

• The measure of correlation, called the coefficient of correlation, indicates the strength and direction of the relationship between two variables.

• The coefficient of correlation between two variables x and y is denoted by r, rxy, or ρ.

• It lies between -1 and +1.

• If r = 0, the variables are said to be uncorrelated: there is no linear relationship between them. This alone does not make them independent.
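A coefficient of zero rules out only a linear relationship. As a minimal sketch (the data and the helper `pearson_r` are illustrative and not part of the slides), the following computes r for a case where y is completely determined by x and the coefficient is still zero:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Illustrative data (an assumption, not from the slides): y = x^2 on symmetric x.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)  # the positive and negative deviations cancel, so r is zero
```

Here y depends entirely on x, yet r is zero because the relationship is not linear.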

Page 3: Correlation & Regression


TYPES OF CORRELATION

I) Based on Direction:

• Positive Correlation: When an increase (or decrease) in the value of one variable is accompanied by a corresponding increase (or decrease) in the value of the other variable.

• Negative Correlation: When an increase (or decrease) in the value of one variable is accompanied by a corresponding decrease (or increase) in the value of the other variable.

II) Based on Degree:

• High

• Moderate

• Low

Page 4: Correlation & Regression


METHODS OF STUDYING CORRELATION

1) Scatter Diagram Method.

2) Karl Pearson’s Coefficient of Correlation.

3) Spearman’s Rank Correlation Coefficient.

Page 5: Correlation & Regression


SCATTER DIAGRAM

• The simplest method for studying correlation between two variables is a special type of dot chart called a dotogram or scatter diagram.

• In this method the given data are plotted as dots, one for each pair of X and Y values.

• The more widely the plotted points scatter over the chart, the lower the degree of relationship between the two variables.

• The more closely the points cluster around a straight line, the higher the degree of relationship.

Page 6: Correlation & Regression

[Figure: four scatter diagrams of Y against X, captioned in order: Perfect Negative Correlation (r = -1), No Correlation (r = 0), Perfect Positive Correlation (r = +1), No Correlation (r = 0).]

Page 7: Correlation & Regression


Advantages:

1. It is readily comprehensible and enables us to form a rough idea of the nature of the relationship between the two variables x and y.

2. It is not affected by extreme observations.

Disadvantages:

1. It is not a suitable method if the number of observations is fairly large.

2. It is only a rough measure of correlation; the exact magnitude cannot be known from it.

Page 8: Correlation & Regression


KARL PEARSON COEFFICIENT OF CORRELATION

• Also known as Pearsonian Coefficient of Correlation.

• It describes the degree & direction of relationship between two variables X and Y.

• It is denoted by the symbol ‘r’.

• The value of Pearson’s coefficient of correlation lies between -1 and +1.

• If X and Y are independent variables, then the coefficient of correlation is zero.

Page 9: Correlation & Regression

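PEARSON FORMULA

• The correlation coefficient, denoted by r, is given by the formula:

First form:

$$r = \frac{\operatorname{Cov}(x, y)}{\sqrt{\operatorname{var}(x)\,\operatorname{var}(y)}} \quad \text{or} \quad r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y}$$

Second form:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

Third form:

$$r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left[\sum x^2 - \dfrac{(\sum x)^2}{n}\right]\left[\sum y^2 - \dfrac{(\sum y)^2}{n}\right]}}$$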
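The three forms of the formula are algebraically equivalent. The sketch below implements each one with the Python standard library only and checks that they agree; the function names and the small data set are illustrative assumptions, not part of the slides.

```python
import math

def pearson_first_form(xs, ys):
    """r = Cov(x, y) / sqrt(var(x) * var(y)), with divide-by-n moments."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)

def pearson_second_form(xs, ys):
    """r from sums of deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                    * sum((y - mean_y) ** 2 for y in ys))
    return num / den

def pearson_third_form(xs, ys):
    """r from raw sums, without computing the means first."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    num = sum_xy - sum_x * sum_y / n
    den = math.sqrt((sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
    return num / den

# Illustrative data (an assumption, not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

forms = (pearson_first_form, pearson_second_form, pearson_third_form)
results = [form(xs, ys) for form in forms]
print([round(r, 4) for r in results])  # each form gives about 0.7746
```

The third form is usually the most convenient for hand calculation because it needs only the column totals of x, y, xy, x² and y².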

Page 10: Correlation & Regression


Ques 1. Calculate Karl Pearson’s coefficient of correlation.

X Y

12 14

9 8

8 6

10 9

11 11

13 12

7 3
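As a check on Ques 1, the deviation form of Pearson's formula can be worked by hand. The intermediate sums below were computed from the seven pairs in the table and are offered as a sketch of one possible solution, not as the author's own working.

```latex
\[
\begin{aligned}
\bar{x} &= \frac{70}{7} = 10, \qquad \bar{y} = \frac{63}{7} = 9 \\
\sum (x - \bar{x})(y - \bar{y}) &= 46, \qquad
\sum (x - \bar{x})^2 = 28, \qquad
\sum (y - \bar{y})^2 = 84 \\
r &= \frac{46}{\sqrt{28 \times 84}} = \frac{46}{\sqrt{2352}} \approx 0.949
\end{aligned}
\]
```

The value is close to +1, indicating a high degree of positive correlation.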

Page 11: Correlation & Regression


Ques 2. A financial analyst wanted to find out whether inventory turnover influences a company’s earnings per share. A random sample of 7 companies listed on the stock exchange was selected and the following data were recorded for each. Find the correlation coefficient.

Company Inventory turnover Earnings per share (%)

A 4 11

B 5 9

C 7 13

D 8 7

E 6 13

F 3 8

G 5 8

Page 12: Correlation & Regression


Ques 3. The following table gives the indices of industrial production and number of registered unemployed people (in lakhs). Calculate Karl Pearson’s coefficient of correlation.

Index of production No. of unemployed (in lakhs)

100 15

102 12

104 13

107 11

105 12

112 12

103 19

99 26

Page 13: Correlation & Regression

SPEARMAN CORRELATION

• Rank X and Y separately.

• The largest value gets rank 1, the second largest rank 2, and so on.

• Formula is:

$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}, \quad \text{where } d = \text{Rank } X - \text{Rank } Y$$

• For tied ranks:

$$\rho = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + \dots\right]}{n(n^2 - 1)}$$

Here m is the number of times a value is repeated.
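The ranking steps and both formulas can be sketched in a few lines of Python. Two points here are assumptions rather than statements from the slides: tied values are given the average of the ranks they occupy (the usual convention that goes with the tie correction), and the function names and demo data are illustrative.

```python
from collections import Counter

def descending_ranks(values):
    """Rank values so the largest gets rank 1; ties share the average rank."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    average_rank = {v: sum(p) / len(p) for v, p in positions.items()}
    return [average_rank[v] for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value that is repeated m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_rho(xs, ys):
    """Rank correlation; the correction term is zero when there are no ties."""
    n = len(xs)
    rank_x, rank_y = descending_ranks(xs), descending_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

# Illustrative data (an assumption, not from the slides); 20 appears twice in xs.
xs = [10, 20, 20, 30, 40]
ys = [1, 3, 2, 5, 4]

print(descending_ranks(xs))            # [5.0, 3.5, 3.5, 2.0, 1.0]
print(round(spearman_rho(xs, ys), 4))  # 0.85
```

With no repeated values the correction term vanishes and the function reduces to the first formula.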

Page 14: Correlation & Regression


Question 1) Calculate the coefficient of correlation for the following heights (in inches) of fathers (X) and sons (Y).

X Y

65 67

66 68

67 65

67 68

68 72

69 72

70 69

72 71

Page 15: Correlation & Regression


Question 2) Find rank correlation coefficient between x and y.

X Y

85 18.3

91 20.8

56 16.9

72 15.7

95 19.2

76 18.1

89 17.5

51 14.9

59 18.9

90 15.4
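Question 2 has no repeated values, so the untied formula applies directly. The working below is a sketch computed from the table, with the largest value ranked 1. In table order, the ranks of X are 5, 2, 9, 7, 1, 6, 4, 10, 8, 3 and the ranks of Y are 4, 1, 7, 8, 2, 5, 6, 10, 3, 9.

```latex
\[
\begin{aligned}
d &= 1,\ 1,\ 2,\ -1,\ -1,\ 1,\ -2,\ 0,\ 5,\ -6 \\
\sum d^2 &= 1 + 1 + 4 + 1 + 1 + 1 + 4 + 0 + 25 + 36 = 74 \\
\rho &= 1 - \frac{6 \times 74}{10\,(10^2 - 1)} = 1 - \frac{444}{990} \approx 0.552
\end{aligned}
\]
```

The two largest rank differences (5 and -6) contribute 61 of the 74, which is why the coefficient is only moderate.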

Page 16: Correlation & Regression


Question 3) Obtain the rank correlation coefficient for the following data.

X Y

68 62

64 58

75 68

50 45

64 81

80 60

75 68

40 48

55 50

64 70

Page 17: Correlation & Regression

REGRESSION

• Regression analysis provides a mathematical model of the relationship between two variables, in which one is independent and one is dependent.

• If X and Y are two variables, then we have two regression lines:-

(a) Regression line of X on Y.

(b) Regression line of Y on X.

Page 18: Correlation & Regression

Regression line X on Y.

The regression line of X on Y is given by:

X = a + bY

where b is called the regression coefficient of X on Y, denoted by bxy.

Here, Y is the independent variable and X is the dependent variable.

Normal equations to estimate a and b are:

$$\sum X = na + b \sum Y$$

$$\sum XY = a \sum Y + b \sum Y^2$$

Page 19: Correlation & Regression

Another form of the regression equation of X on Y is:

$$X - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(Y - \bar{Y}), \quad \text{where } b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$$

Page 20: Correlation & Regression

Regression line Y on X.

The regression line of Y on X is given by:

Y = a + bX

where b is called the regression coefficient of Y on X, denoted by byx.

Here, X is the independent variable and Y is the dependent variable.

Normal equations to estimate a and b are:

$$\sum Y = na + b \sum X$$

$$\sum XY = a \sum X + b \sum X^2$$

Page 21: Correlation & Regression

Another form of the regression equation of Y on X is:

$$Y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(X - \bar{X}), \quad \text{where } b_{yx} = r\,\frac{\sigma_y}{\sigma_x}$$
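The two ways of writing the line of Y on X are connected by solving the normal equations. The step below is a standard derivation, added as a sketch; the line of X on Y follows in the same way with the roles of X and Y exchanged.

```latex
\[
\begin{aligned}
\sum Y = na + b \sum X \;&\Rightarrow\; a = \bar{Y} - b\,\bar{X} \\
\sum XY = a \sum X + b \sum X^2 \;&\Rightarrow\;
b_{yx} = \frac{n \sum XY - \sum X \sum Y}{n \sum X^2 - \left(\sum X\right)^2}
       = \frac{\operatorname{Cov}(X, Y)}{\operatorname{var}(X)}
       = r\,\frac{\sigma_y}{\sigma_x} \\
Y = a + b_{yx} X \;&\Rightarrow\; Y - \bar{Y} = b_{yx}\,(X - \bar{X})
\end{aligned}
\]
```

The last line also shows that the fitted line always passes through the point of means.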

Page 22: Correlation & Regression

Properties of regression lines and coefficients

• Both regression lines pass through the point $(\bar{x}, \bar{y})$.

• The correlation coefficient is the geometric mean of the two regression coefficients, i.e.

$$r = \pm\sqrt{b_{xy}\, b_{yx}}$$

• If one of the regression coefficients is greater than 1 in magnitude, the other must be less than 1 in magnitude.

• bxy, byx and the correlation coefficient (r) have the same sign.

For example, if bxy = -0.664 and byx = -0.234, then r = -(0.664 × 0.234)^(1/2) = -0.394.
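The geometric-mean property and the bound on the coefficients both follow from multiplying the two regression coefficients together. A short sketch of the reasoning:

```latex
\[
b_{xy}\, b_{yx}
  = \left(r\,\frac{\sigma_x}{\sigma_y}\right)\left(r\,\frac{\sigma_y}{\sigma_x}\right)
  = r^2 \le 1
\]
```

Because the product cannot exceed 1, the two coefficients cannot both be larger than 1 in magnitude. Because the standard deviations are positive, each coefficient takes its sign from r, which settles the choice of sign when the square root is taken.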

Page 23: Correlation & Regression


QUESTION 1) You are given the following information about advertising expenditure and sales.

Advertisement (x) Sales (y)

A.M. (arithmetic mean) 10 90

S.D. (standard deviation) 3 12

and r = 0.8.

(a) Obtain the two regression lines.

(b) Find the likely sales when the advertisement budget is Rs 15 lakhs.
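One way to approach Question 1 is to build both lines from the means, standard deviations and r, using the relations bxy = r·σx/σy and byx = r·σy/σx. The sketch below does this; the variable names are illustrative, and the prediction is in whatever units the sales figures use, which the question does not state.

```python
# Summary statistics given in Question 1.
mean_x, mean_y = 10.0, 90.0   # advertisement, sales
sd_x, sd_y = 3.0, 12.0
r = 0.8

# Regression coefficients from r and the standard deviations.
b_yx = r * sd_y / sd_x        # slope of Y on X, about 3.2
b_xy = r * sd_x / sd_y        # slope of X on Y, about 0.2

# Each line passes through the point of means, which fixes the intercept.
a_yx = mean_y - b_yx * mean_x  # about 58, so Y = 58 + 3.2 X
a_xy = mean_x - b_xy * mean_y  # about -8, so X = -8 + 0.2 Y

# Part (b): use the line of Y on X, because sales is the variable being predicted.
predicted_sales = a_yx + b_yx * 15
print(round(predicted_sales, 2))  # about 106
```

Sales are predicted from the line of Y on X; the line of X on Y would be used only to estimate advertising from a given sales figure.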

Page 24: Correlation & Regression


QUESTION 2) The two regression lines are given by:-

3 X + 12 Y = 19

9 X +3 Y = 46

And σx = 4.

Obtain:-

(a) Mean values of X and Y.

(b) The value of correlation coefficient.

(c) Standard deviation of y.
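A sketch of one way to work Question 2. Both lines pass through the point of means, so solving them simultaneously gives the means. Which equation is the line of Y on X is not stated; the assignment below is the only one for which the product of the regression coefficients does not exceed 1 (the opposite assignment gives (-4)(-3) = 12).

```latex
\[
\begin{aligned}
3\bar{X} + 12\bar{Y} = 19,\quad 9\bar{X} + 3\bar{Y} = 46
  \;&\Rightarrow\; \bar{X} = 5,\quad \bar{Y} = \tfrac{1}{3} \\
3X + 12Y = 19 \;&\Rightarrow\; Y = \tfrac{19}{12} - \tfrac{1}{4}X,
  \quad b_{yx} = -\tfrac{1}{4} \\
9X + 3Y = 46 \;&\Rightarrow\; X = \tfrac{46}{9} - \tfrac{1}{3}Y,
  \quad b_{xy} = -\tfrac{1}{3} \\
r &= -\sqrt{\tfrac{1}{4} \times \tfrac{1}{3}} = -\sqrt{\tfrac{1}{12}} \approx -0.289 \\
\sigma_y &= \frac{b_{yx}\,\sigma_x}{r} = \frac{(-0.25)(4)}{-0.289} \approx 3.46
\end{aligned}
\]
```

The negative sign of r is taken from the regression coefficients, which are both negative.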

Page 25: Correlation & Regression


Question 3. For the following data, obtain the two regression equations and hence find the correlation coefficient.

X 1 2 3 4 5

Y 2 5 3 8 7
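Question 3 can be approached by solving the normal equations for each line and then taking the geometric mean of the two slopes. The sketch below uses the five pairs from the table; `fit_line` is an illustrative helper name, not something defined in the slides.

```python
import math

def fit_line(dependent, independent):
    """Solve the two normal equations for dependent = a + b * independent."""
    n = len(dependent)
    sum_i = sum(independent)
    sum_d = sum(dependent)
    sum_ii = sum(v * v for v in independent)
    sum_id = sum(i * d for i, d in zip(independent, dependent))
    b = (n * sum_id - sum_i * sum_d) / (n * sum_ii - sum_i ** 2)
    a = (sum_d - b * sum_i) / n
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2, 5, 3, 8, 7]

a_yx, b_yx = fit_line(ys, xs)  # Y on X: roughly Y = 1.1 + 1.3 X
a_xy, b_xy = fit_line(xs, ys)  # X on Y: roughly X = 0.5 + 0.5 Y

# r is the geometric mean of the two slopes and carries their common sign.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

print(round(a_yx, 3), round(b_yx, 3))
print(round(a_xy, 3), round(b_xy, 3))
print(round(r, 3))  # about 0.806
```

The same helper fits either line; only the roles of the dependent and independent variables are exchanged.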

Page 26: Correlation & Regression


Question 4. The following data gives the ages and blood pressure of 10 women.

(i) Find the correlation coefficient between age and blood pressure.

(ii) Determine the regression equation of blood pressure on age.

(iii) Estimate the blood pressure of a woman whose age is 45 years.

Age 56 42 36 47 49 42 60 72 63 55

B.P 147 125 118 128 145 140 155 160 149 150
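The three parts of Question 4 can be sketched with the raw-sum form of the formulas, which avoids fractional means. The data are the ten pairs from the table; the variable names are illustrative.

```python
import math

ages = [56, 42, 36, 47, 49, 42, 60, 72, 63, 55]
bps = [147, 125, 118, 128, 145, 140, 155, 160, 149, 150]

n = len(ages)
sum_x, sum_y = sum(ages), sum(bps)
sum_xy = sum(x * y for x, y in zip(ages, bps))
sum_xx = sum(x * x for x in ages)
sum_yy = sum(y * y for y in bps)

# Corrected sums scaled by n, so that everything stays in integers.
s_xy = n * sum_xy - sum_x * sum_y
s_xx = n * sum_xx - sum_x ** 2
s_yy = n * sum_yy - sum_y ** 2

# (i) Correlation coefficient between age and blood pressure.
r = s_xy / math.sqrt(s_xx * s_yy)

# (ii) Regression of blood pressure (Y) on age (X): Y = a + b X.
b_yx = s_xy / s_xx
a_yx = (sum_y - b_yx * sum_x) / n

# (iii) Estimated blood pressure at age 45.
bp_at_45 = a_yx + b_yx * 45

print(round(r, 3))                       # about 0.892
print(round(a_yx, 2), round(b_yx, 3))    # about 83.76 and 1.11
print(round(bp_at_45, 1))                # about 133.7
```

The line of blood pressure on age is used for the estimate because blood pressure is the dependent variable in part (ii).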