session 1-upload et

Upload: ankur-chugh

Post on 25-Feb-2018


  • 7/25/2019 Session 1-Upload ET


    PGPEX

    Session 1

    2016

    Understanding Econometrics: A Case of Simple Regression


    Data types we face in Real World

    Cross section: bank wages (bankwages.csv)

    Time series: stock return data (stockreturn.csv)

    Panel data (paneldata.csv)

    1/28/2016 Understanding Econometrics

    Graphical illustration of the correlation coefficient

    [Figure: scatter plot of Y against X, divided into quadrants 1 to 4 by the lines $x = \bar{x}$ and $y = \bar{y}$]

    Quadrant   $y_i - \bar{y}$   $x_i - \bar{x}$   $(y_i - \bar{y})(x_i - \bar{x})$
    1          +                 +                 +
    2          +                 -                 -
    3          -                 -                 +
    4          -                 +                 -

    Algebraic signs of the quantities $(y_i - \bar{y})$ and $(x_i - \bar{x})$
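The sign table can be checked numerically. The sketch below is illustrative only: it borrows the first (Y1, X1) data set from the "Flaw in Correlation Coefficient" slide, assigns each point to a quadrant by the signs of its deviations from the means, and then computes the correlation coefficient as the scaled sum of the products of those deviations. The helper name `quadrant` is an assumption of this sketch, not something from the slides.

```python
import numpy as np

# X1 and Y1 as listed on the "Flaw in Correlation Coefficient" slide
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])

dx = x - x.mean()  # x_i - x_bar
dy = y - y.mean()  # y_i - y_bar


def quadrant(dy_i, dx_i):
    """Quadrant number per the sign table; 0 if the point lies on a mean line."""
    if dy_i > 0 and dx_i > 0:
        return 1
    if dy_i > 0 and dx_i < 0:
        return 2
    if dy_i < 0 and dx_i < 0:
        return 3
    if dy_i < 0 and dx_i > 0:
        return 4
    return 0


quadrants = [quadrant(a, b) for a, b in zip(dy, dx)]
counts = {q: quadrants.count(q) for q in (0, 1, 2, 3, 4)}

# Correlation coefficient: sum of products of deviations, scaled by their magnitudes
r = (dy * dx).sum() / np.sqrt((dy ** 2).sum() * (dx ** 2).sum())

print("points per quadrant:", counts)
print("correlation:", round(float(r), 4))
```

For this data set every point off the mean lines falls in quadrant 1 or 3, where the product of deviations is positive, which is why the correlation comes out positive.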


    What the correlation coefficient fails to capture

    The relationship: Y=50-X^2

    The data:

    Y     X
    1    -7
    14   -6
    25   -5
    34   -4
    41   -3
    46   -2
    49   -1
    50    0
    49    1
    46    2
    41    3
    34    4
    25    5
    14    6
    1     7

    [Figure: scatter plot of Y against X]
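A short check of this example, as a sketch: Y is determined exactly by X through Y = 50 - X^2, yet the linear correlation coefficient is zero because the positive and negative products of deviations cancel. The data are generated from the relationship stated on the slide.

```python
import numpy as np

# X runs from -7 to 7, and Y follows the exact relationship Y = 50 - X^2
x = np.arange(-7, 8, dtype=float)
y = 50 - x ** 2

# Pearson correlation between X and Y
r = np.corrcoef(x, y)[0, 1]

print("first and last Y values:", y[0], y[-1])
print("correlation is numerically zero:", abs(r) < 1e-9)
```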


    Flaw in Correlation Coefficient

    Y1      X1   Y2     X2   Y3      X3   Y4      X4
    8.04    10   9.14   10   7.46    10   6.58    8
    6.95    8    8.14   8    6.77    8    5.76    8
    7.58    13   8.74   13   12.74   13   7.71    8
    8.81    9    8.77   9    7.11    9    8.84    8
    8.33    11   9.26   11   7.81    11   8.47    8
    9.96    14   8.10   14   8.84    14   7.04    8
    7.24    6    6.13   6    6.08    6    5.25    8
    4.26    4    3.10   4    5.39    4    12.50   19
    10.84   12   9.13   12   8.15    12   5.56    8
    4.82    7    7.26   7    6.42    7    7.91    8
    5.68    5    4.74   5    5.73    5    6.89    8

    use http://www.ats.ucla.edu/stat/stata/examples/chp/p025b, clear

    Correlation = 0.82 for each of the four (Y, X) pairs
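The stated correlation can be reproduced from the tabulated values. The following is a minimal sketch in Python rather than Stata (the slide itself loads the data with a Stata `use` command); the values are transcribed from the table, and the dictionary name `correlations` is just a label chosen here.

```python
import numpy as np

# Rows transcribed from the slide, columns: Y1 X1 Y2 X2 Y3 X3 Y4 X4
rows = [
    (8.04, 10, 9.14, 10, 7.46, 10, 6.58, 8),
    (6.95, 8, 8.14, 8, 6.77, 8, 5.76, 8),
    (7.58, 13, 8.74, 13, 12.74, 13, 7.71, 8),
    (8.81, 9, 8.77, 9, 7.11, 9, 8.84, 8),
    (8.33, 11, 9.26, 11, 7.81, 11, 8.47, 8),
    (9.96, 14, 8.10, 14, 8.84, 14, 7.04, 8),
    (7.24, 6, 6.13, 6, 6.08, 6, 5.25, 8),
    (4.26, 4, 3.10, 4, 5.39, 4, 12.50, 19),
    (10.84, 12, 9.13, 12, 8.15, 12, 5.56, 8),
    (4.82, 7, 7.26, 7, 6.42, 7, 7.91, 8),
    (5.68, 5, 4.74, 5, 5.73, 5, 6.89, 8),
]
data = np.array(rows, dtype=float)

# Correlation of each (Yk, Xk) pair
correlations = {}
for k in range(4):
    y = data[:, 2 * k]
    x = data[:, 2 * k + 1]
    correlations[f"Y{k + 1},X{k + 1}"] = float(np.corrcoef(x, y)[0, 1])

for name, r in correlations.items():
    print(f"corr({name}) = {r:.2f}")
```

The four pairs share essentially the same correlation even though their scatter plots look very different, which is the flaw the slide points to.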


    Flaw in Correlation Coefficient

    [Figure: four scatter plots with fitted values: Y1 against X1, Y2 against X2, Y3 against X3, Y4 against X4]


    A flowchart illustrating the dynamic, iterative regression process

    Start

    1. Formulate the problem
    2. Choose a set of variables
    3. Choose form of model
    4. Choose method of fitting
    5. Specify assumptions
    6. Fit the model: use method of fitting
    7. Validate assumptions: residual plots, outliers detection, sensitivity analysis. OK? If no, return to an earlier step.
    8. Evaluate the fitted model: goodness of fit tests. OK? If no, return to an earlier step.
    9. Use the model for the intended purpose

    Stop


    Bank wage

    Scatter: A starting point

    [Figure: scatter plot of y against EDUC]

    Correlation: 0.6967


    Simple Linear Regression

    Although we could fit a line "by eye", e.g. using a transparent ruler, this would be a subjective approach and therefore unsatisfactory.

    An objective, and therefore better, way of determining the position of a straight line is to use the method of least squares.

    Using this method, we choose the line for which the sum of squared vertical distances of all points from the line is minimized.


    Least-squares or regression line

    These distances, i.e. the differences between the observed y values and their corresponding estimated values on the line, are called residuals.

    The line that fits best is called the regression line or, sometimes, the least-squares line.

    The line always passes through the point defined by the mean of Y and the mean of X.
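Both claims can be illustrated with a small sketch: the least-squares line has a residual sum of squares no larger than any other line, and it passes through the point of means. The data are the (Y1, X1) columns from the "Flaw in Correlation Coefficient" slide; the "by eye" intercept and slope are hypothetical values picked only for comparison.

```python
import numpy as np

# X1 and Y1 as listed on the "Flaw in Correlation Coefficient" slide
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])


def rss(intercept, slope):
    """Sum of squared residuals (vertical distances) for a given line."""
    residuals = y - (intercept + slope * x)
    return float((residuals ** 2).sum())


# Least-squares fit; np.polyfit returns the highest-degree coefficient first
slope, intercept = np.polyfit(x, y, 1)

# A hypothetical line drawn "by eye" (assumed values, for comparison only)
eye_intercept, eye_slope = 2.0, 0.6

rss_least_squares = rss(intercept, slope)
rss_by_eye = rss(eye_intercept, eye_slope)

# Gap between the mean of Y and the fitted line evaluated at the mean of X
gap_at_means = float(y.mean() - (intercept + slope * x.mean()))

print("RSS, least squares:", rss_least_squares)
print("RSS, by eye:", rss_by_eye)
print("fitted line passes through the means:", abs(gap_at_means) < 1e-9)
```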


    Steps involved

    Step 1: Statement of theory (labour economics): salary depends on education

    $Y = f(X)$

    Step 2: Econometric model

    $y = \log(\text{salary}) = \beta_1 + \beta_2\,\text{education} + u$

    Step 3: Data: bank wage


    Econometric Model Building Step

    Why does the error term appear?

    1. Omission of other explanatory variables (examples?)

       Note that there can be many X variables: multiple regression model

    2. Measurement error and model misspecification

    3. Purely random variation

    4. Linear approximation


    Statistical Inference (Step 4)

    4. The next task is to estimate the parameters of the model so that we can state the relative increase in salary due to one additional year of education.

    To obtain a fitted line

    $\hat{Y} = b_1 + b_2 X$

    where $Y = \log(\text{salary})$ and $X = \text{education}$.
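Because the dependent variable is log(salary), the slope b2 is read as a relative change: one more year of education multiplies fitted salary by exp(b2), so the exact relative increase is exp(b2) - 1, which is close to b2 itself when b2 is small. The sketch below shows that arithmetic; the function name `relative_increase` and the slope value used are hypothetical and are not estimates from the bank wage data.

```python
import math


def relative_increase(b2: float) -> float:
    """Exact proportional change in salary for one more year of education
    when the fitted model is log(salary) = b1 + b2 * education."""
    return math.exp(b2) - 1.0


# Hypothetical slope for illustration only, NOT an estimate from the bank wage data
b2_example = 0.10

approximate = b2_example                  # small-b2 approximation
exact = relative_increase(b2_example)     # exp(b2) - 1

print(f"approximate relative increase: {approximate:.4f}")
print(f"exact relative increase:       {exact:.4f}")
```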


    DERIVING LINEAR REGRESSION COEFFICIENTS: method of OLS

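    True model: $Y = \beta_1 + \beta_2 X + u$

    Fitted line: $\hat{Y} = b_1 + b_2 X$

    Fitted values: $\hat{Y}_1 = b_1 + b_2 X_1$, ..., $\hat{Y}_n = b_1 + b_2 X_n$

    [Figure: observations $Y_1, \dots, Y_n$ plotted against $X_1, \dots, X_n$ with the fitted line; intercept $b_1$, slope $b_2$]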


    DERIVING LINEAR REGRESSION COEFFICIENTS

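    Residuals from the fitted line $\hat{Y} = b_1 + b_2 X$:

    $e_1 = Y_1 - \hat{Y}_1 = Y_1 - b_1 - b_2 X_1$

    .....

    $e_n = Y_n - \hat{Y}_n = Y_n - b_1 - b_2 X_n$

    [Figure: observations and fitted line, with the residuals $e_1, \dots, e_n$ shown as vertical distances between each $Y_i$ and $\hat{Y}_i$]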


    Bank wage data

    [Figure: scatter plot of y against EDUC with fitted values]


    Actual and Fitted Model


    DERIVING LINEAR REGRESSION COEFFICIENTS


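    $RSS = e_1^2 + \dots + e_n^2 = (Y_1 - b_1 - b_2 X_1)^2 + \dots + (Y_n - b_1 - b_2 X_n)^2$

    $= Y_1^2 + b_1^2 + b_2^2 X_1^2 - 2 b_1 Y_1 - 2 b_2 X_1 Y_1 + 2 b_1 b_2 X_1$

    $+ \dots$

    $+ Y_n^2 + b_1^2 + b_2^2 X_n^2 - 2 b_1 Y_n - 2 b_2 X_n Y_n + 2 b_1 b_2 X_n$

    $= \sum Y_i^2 + n b_1^2 + b_2^2 \sum X_i^2 - 2 b_1 \sum Y_i - 2 b_2 \sum X_i Y_i + 2 b_1 b_2 \sum X_i$

    $\frac{\partial RSS}{\partial b_1} = 0 \Rightarrow 2 n b_1 - 2 \sum Y_i + 2 b_2 \sum X_i = 0$

    $n b_1 = \sum Y_i - b_2 \sum X_i$

    $b_1 = \bar{Y} - b_2 \bar{X}$

    $\frac{\partial RSS}{\partial b_2} = 0 \Rightarrow 2 b_2 \sum X_i^2 - 2 \sum X_i Y_i + 2 b_1 \sum X_i = 0$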
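The step from the two first-order conditions to a closed-form slope is not spelled out on the slide. One way to fill it in, using only $\sum X_i = n\bar{X}$ and $\sum Y_i = n\bar{Y}$ (the definitions of the sample means), is to substitute $b_1 = \bar{Y} - b_2 \bar{X}$ into the condition for $b_2$:

```latex
\begin{aligned}
2 b_2 \sum X_i^2 - 2 \sum X_i Y_i + 2 b_1 \sum X_i &= 0 \\
b_2 \sum X_i^2 - \sum X_i Y_i + (\bar{Y} - b_2 \bar{X})\, n \bar{X} &= 0 \\
b_2 \left( \sum X_i^2 - n \bar{X}^2 \right) &= \sum X_i Y_i - n \bar{X} \bar{Y} \\
b_2 &= \frac{\sum X_i Y_i - n \bar{X} \bar{Y}}{\sum X_i^2 - n \bar{X}^2}
\end{aligned}
```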


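    Solving the two first-order conditions for $b_1$ and $b_2$:

    $b_1 = \bar{Y} - b_2 \bar{X}$

    $b_2 = \frac{\sum X_i Y_i - n \bar{X} \bar{Y}}{\sum X_i^2 - n \bar{X}^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$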
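The closed-form OLS expressions can be evaluated directly. The sketch below computes the slope in both of its equivalent forms (raw sums and deviations from the means) and the intercept, then compares them with NumPy's own least-squares fit. The function name `ols_simple` is an assumption of this sketch, and the data are the (Y1, X1) columns from the "Flaw in Correlation Coefficient" slide, since the bank wage file is not reproduced here.

```python
import numpy as np


def ols_simple(x, y):
    """Return (b1, b2_dev, b2_raw) for the fitted line Y_hat = b1 + b2 * X.

    b2_dev uses deviations from the means; b2_raw uses raw sums.
    The two are algebraically identical.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_bar, y_bar = x.mean(), y.mean()

    b2_dev = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()
    b2_raw = ((x * y).sum() - n * x_bar * y_bar) / ((x ** 2).sum() - n * x_bar ** 2)
    b1 = y_bar - b2_dev * x_bar
    return float(b1), float(b2_dev), float(b2_raw)


# X1 and Y1 as listed on the "Flaw in Correlation Coefficient" slide
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

b1, b2_dev, b2_raw = ols_simple(x, y)

# Cross-check against NumPy's least-squares polynomial fit (degree 1)
slope_np, intercept_np = np.polyfit(x, y, 1)

print(f"b1 = {b1:.4f}, b2 = {b2_dev:.4f}")
print("both slope formulas agree:", abs(b2_dev - b2_raw) < 1e-9)
print("matches np.polyfit:", abs(b2_dev - slope_np) < 1e-9 and abs(b1 - intercept_np) < 1e-9)
```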