Ch 11 Correlation Analysis

Upload: iamaking

Post on 30-May-2018


  • 8/14/2019 Ch 11 Correlation Analysis


    Correlation Analysis

    Correlation is a statistical tool which studies the relationship between two variables. Correlation analysis involves the various methods and techniques used for studying and measuring the extent of that relationship.

    Correlation analysis is a statistical procedure by which we can determine the degree of association or relationship between two or more variables.

    Prof. Kuldeep Sharma, IIBS Bengaluru



    Statistical Relationship


    Relation between height & weight; price & demand; age & height; radius & area of a circle.

    Two variables are said to be correlated if a change in the value of one variable is accompanied by a change in the value of the other.

    Such a relationship is called a statistical relationship.

    When both variables in bivariate data are quantitative, we use the term correlation analysis for the methods used to find out whether a relationship exists.


    Croxton and Cowden defined correlation as:

    "When the relationship is of a quantitative nature, the appropriate statistical tool for discovering and measuring the relationship and expressing it in a brief formula is known as correlation."

    According to the statistician A. M. Tuttle:

    "Correlation is an analysis of the covariation between two or more variables."


    Sample Data for House Price Model

    House Price in $1000s (Y)    Square Feet (X)
    245                          1400
    312                          1600
    279                          1700
    308                          1875
    199                          1100
    219                          1550
    405                          2350
    324                          2450
    319                          1425
    255                          1700


    Graphical Presentation

    House price model: scatterplot

    [Figure: scatter plot of House Price ($1000s), vertical axis from 0 to 450, against Square Feet, horizontal axis from 0 to 3000]


    Types of Relationships

    [Figure: scatter plots of Y against X showing linear relationships and curvilinear relationships]


    Types of Relationships (continued)

    [Figure: scatter plots of Y against X showing strong relationships and weak relationships]


    Types of Relationships (continued)

    [Figure: scatter plots of Y against X showing no relationship]


    UNIVARIATE & BIVARIATE DISTRIBUTION

    In a bivariate population we are interested in knowing whether some sort of functional relationship exists between the two variables involved.

    Does a change in one variable bring about a change in the other variable?

    If yes, what is the nature of this relationship?


    COVARIANCE

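    Covariance is an absolute measure of the joint variation of two variables X and Y, denoted by Cov(X, Y) and defined as

    $$\mathrm{Cov}(X,Y) = \frac{1}{n}\sum (x - \bar{x})(y - \bar{y})$$

    or, equivalently,

    $$\mathrm{Cov}(X,Y) = \frac{1}{n}\left\{\sum xy - \frac{1}{n}\Big(\sum x\Big)\Big(\sum y\Big)\right\}$$

    The sign of the covariance shows the direction of the linear relationship between the two variables. Because covariance is expressed in the units of X and Y, its magnitude alone is not a comparable measure of strength.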
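As a rough check of the two covariance expressions, the sketch below evaluates both on the house price sample data from the earlier slide. The function names are illustrative choices, not part of the slides, and the divisor is n (population form), as in the definition above.

```python
# House price sample data from the slides:
# price in $1000s (Y) and floor area in square feet (X).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]


def covariance_definition(x, y):
    """Cov(X, Y) = (1/n) * sum((x - mean_x) * (y - mean_y))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n


def covariance_from_sums(x, y):
    """Cov(X, Y) = (1/n) * (sum(xy) - (1/n) * sum(x) * sum(y))."""
    n = len(x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    return (sum_xy - sum(x) * sum(y) / n) / n


cov_def = covariance_definition(square_feet, price)
cov_sums = covariance_from_sums(square_feet, price)
print(cov_def, cov_sums)  # both forms should agree
```

The result is positive (larger houses tend to cost more), but its size depends on the units: measuring price in dollars instead of $1000s would multiply it by 1000.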


    SCATTER DIAGRAM OR DOT DIAGRAM METHOD

    A scatter diagram is a graphical method of showing the correlation between two variables x & y.

    The scatter diagram may indicate both the degree and the type of correlation.

    From a scatter diagram we can form a fairly good, though rough, idea about the relationship between the two variables.


    Scatter Plot

    A scatter plot (or scatter diagram) can be used to show the relationship between two variables.

    [Figure: "Cost per Day vs. Production" scatter plot, with cost per day on the vertical axis and volume per day on the horizontal axis]

    Volume per day    Cost per day
    23                125
    26                140
    29                146
    33                160
    38                167
    42                170
    50                188
    55                195
    60                200


    Advantages & Disadvantages of Scatter Diagram

    Advantages:

    Readily comprehensible; enables us to form a rough idea of the nature of the relationship between the two variables.

    Not influenced by extreme observations.

    Disadvantages:

    Not a suitable method if the number of observations is very large.

    Provides only a rough measure of correlation, which can differ from one observer to another.


    Co-efficient of Correlation r

    It gives the degree of association or relationship between two variables.

    Correlation is a relationship between two variables such that a change in one variable is accompanied by a positive or negative change in the other, and a greater change in one variable is accompanied by a correspondingly greater or smaller change in the other.


    Coefficient of Correlation

    $$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$

    Measures the strength of the linear relationship between two quantitative variables.
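To make the coefficient concrete, here is a minimal sketch that applies the deviation-from-mean formula for r to the house price sample data from the earlier slide. The helper name `pearson_r` is an illustrative choice, not something defined in the slides.

```python
import math

# House price sample data from the slides:
# price in $1000s (Y) and floor area in square feet (X).
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]


def pearson_r(x, y):
    """r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2)), deviations taken from the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


r_house = pearson_r(square_feet, price)
print(round(r_house, 4))  # about 0.76: a fairly strong positive linear association
```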


    Application of Correlation analysis

    Correlation analysis is used to measure the strength of the association (linear relationship) between two variables.

    Correlation is only concerned with the strength of the relationship.

    No causal effect is implied by correlation.


    Properties of Co-efficient of Correlation

    1. It is a measure of the closeness of a fit in a relative sense

    2. r lies between -1 & +1

    3. The correlation is perfect negative when r = -1

    4. The correlation is perfect positive when r = +1

    5. If r = 0 then there is no linear correlation; independent variables have r = 0, but r = 0 alone does not prove that the variables are independent

    6. r is a pure number and is not affected by a change of origin & scale

    7. Relative measure of association between two or more variables
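The properties on range, origin and scale, and r = 0 can be checked numerically. The sketch below uses the volume and cost data from the scatter plot slide. The shift and scale constants (40, 5, 170, 10) and the quadratic example are arbitrary illustrative choices, not taken from the slides. Both scale factors are positive; a negative factor on one variable would flip the sign of r.

```python
import math


def pearson_r(x, y):
    """Pearson's r from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    return numerator / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Volume per day and cost per day, from the scatter plot slide.
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]
r_original = pearson_r(volume, cost)

# Change of origin and scale: u = (x - 40) / 5, v = (y - 170) / 10.
u = [(x - 40) / 5 for x in volume]
v = [(y - 170) / 10 for y in cost]
r_transformed = pearson_r(u, v)

# r = 0 without independence: y is completely determined by x (y = x^2),
# yet the relationship is not linear, so r comes out as zero.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
r_quadratic = pearson_r(xs, ys)

print(r_original, r_transformed, r_quadratic)
```

The last case is worth keeping in mind when reading r = 0: it rules out a linear relationship only.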


    Scatter Plots of Data with Various Correlation Coefficients

    [Figure: five scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = .6 and r = 1]


    Karl Pearson's Coefficient of Correlation

    Karl Pearson (1857-1936), a great statistician, provided a formula for measuring the magnitude of the linear correlation between two variables.

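    $$r(X,Y) = r_{xy} = \frac{\mathrm{Cov}(x,y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$$

    $$r(X,Y) = r_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$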


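    Karl Pearson's Coefficient of Correlation (contd.)

    $$r(X,Y) = r_{xy} = \frac{n\sum xy - \sum x \sum y}{\sqrt{\{n\sum x^2 - (\sum x)^2\}\{n\sum y^2 - (\sum y)^2\}}}$$

    The above formula saves a lot of computational labour.

    It also reduces the error due to rounding off, since the means need not be computed and rounded first.

    Other forms can also be used:

    $$r(X,Y) = r_{xy} = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \, \sum d_y^2}}, \qquad d_x = (x - \bar{x}),\; d_y = (y - \bar{y})$$

    $$r(X,Y) = r_{xy} = \frac{\sum d_x d_y}{n\,\sigma_x\,\sigma_y}$$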


    Another formula, called the short-cut method

    $$r(X,Y) = r_{xy} = \frac{n\sum d_x d_y - \sum d_x \sum d_y}{\sqrt{\{n\sum d_x^2 - (\sum d_x)^2\}\{n\sum d_y^2 - (\sum d_y)^2\}}}$$

    where $d_x = (x - a)$, a is the assumed mean for X, and $d_x^2 = (x - a)^2$

    where $d_y = (y - b)$, b is the assumed mean for Y, and $d_y^2 = (y - b)^2$
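The short-cut method works because r is unaffected by a change of origin: taking deviations from any assumed means a and b gives the same value as the raw-sums formula. The sketch below checks this on the volume and cost data from the scatter plot slide. The assumed means a = 40 and b = 170 are arbitrary round numbers chosen for illustration, and the function names are hypothetical.

```python
import math

# Volume per day and cost per day, from the scatter plot slide.
volume = [23, 26, 29, 33, 38, 42, 50, 55, 60]
cost = [125, 140, 146, 160, 167, 170, 188, 195, 200]


def r_from_sums(x, y):
    """r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def r_short_cut(x, y, a, b):
    """Same formula applied to deviations dx = x - a, dy = y - b from assumed means."""
    dx = [xi - a for xi in x]
    dy = [yi - b for yi in y]
    return r_from_sums(dx, dy)


r_direct = r_from_sums(volume, cost)
r_assumed = r_short_cut(volume, cost, a=40, b=170)
print(r_direct, r_assumed)  # the two values should coincide
```

Choosing a and b near the middle of the data keeps the deviations small, which is the practical appeal of the method for hand calculation.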


    Nature of Relationship

    Positive correlation means that low values of one variable are associated with low values of the other, and high values of one variable are associated with high values of the other.

    Negative correlation means that low values of one variable are associated with high values of the other, and high values of one variable are associated with low values of the other.

    The degree of correlation between two variables is measured by the Pearsonian (product-moment) correlation coefficient, r.

    The nearer r is to +1 or -1, the stronger the relationship.


    Spearman's Rank Correlation Coefficient R

    It is applied to problems in which the data cannot be measured quantitatively but a qualitative assessment is possible, such as beauty, honesty etc.

    In this case the best individual is given rank 1, the next rank 2, and so on.

    $$R = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$

    where $\sum D^2$ is the sum of the squares of the differences of corresponding ranks and n is the number of pairs of observations.
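A minimal sketch of the rank formula, assuming the ranks are already given and contain no ties. The two rank lists are made-up illustrative data (say, two judges ranking five candidates), not taken from the slides.

```python
def spearman_rank(rank_x, rank_y):
    """R = 1 - 6 * sum(D^2) / (n * (n^2 - 1)), for untied ranks."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks given by two judges to the same five candidates.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]

# D = [-1, 1, -1, 1, 0], so sum(D^2) = 4 and R = 1 - 24 / 120.
R = spearman_rank(judge_1, judge_2)
print(R)  # 0.8
```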


    Spearman's Rank Correlation Coefficient When Ranks Are Tied or Repeated

    $$R = 1 - \frac{6\left[\sum D^2 + \frac{p^3 - p}{12} + \frac{q^3 - q}{12} + \dots\right]}{n(n^2 - 1)}$$

    where p, q, ... are the numbers of times a value is repeated.
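The tied-rank formula can be sketched as follows, assuming the usual convention that tied values share the average of the ranks they would otherwise occupy and that rank 1 goes to the largest value. The scores are made-up illustrative data and the helper names are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank 1 = largest value; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every value repeated m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_tied(x, y):
    """R = 1 - 6 * (sum(D^2) + tie corrections) / (n * (n^2 - 1))."""
    n = len(x)
    rank_x = average_ranks(x)
    rank_y = average_ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    correction = tie_correction(x) + tie_correction(y)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))


# Hypothetical scores: the value 20 appears twice in the first series (p = 2).
scores_a = [10, 20, 20, 30]
scores_b = [1, 2, 3, 4]

# Ranks of scores_a are [4, 2.5, 2.5, 1]; sum(D^2) = 0.5; correction = 0.5.
R_tied = spearman_tied(scores_a, scores_b)
print(R_tied)  # 0.9
```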


    Spearman's Rank Correlation Coefficient

    It is simpler to understand and easier to calculate than Karl Pearson's method.

    It is useful for qualitative data such as beauty, honesty, efficiency etc.

    It is a useful method when the actual data are not given but only ranks are given.

    Limitations

    It can't be used for a grouped frequency distribution.

    It is not as accurate as Pearson's coefficient.

    It can't be used in continuous series.

    When the number of items is greater than 30 and ranks are not given, it takes more time and therefore can't be used conveniently.


    Quiz

    State the nature of the following correlations (positive, negative or no correlation):

    1. The amount of rainfall & the yield of crops

    2. The colour of a saree and the intelligence of the girl wearing it

    3. Age of the life insurance applicant & the premium of insurance

    4. Demand for goods and their prices under normal times

    5. Production of pig iron and soot content in Durgapur

    6. Unemployment index and the purchasing power of the common man
