Day 2: Review of Regression (OLS 1)
-
Economics 20 - Prof. Anderson 1
Simple regression model
$y = \beta_0 + \beta_1 x + u$
-
Forecasting with Time Series Models
Review of Regression
Nguyễn Ngọc Anh, Center for Policy and Development Research
Nguyễn Việt Cường, National Economics University
-
What is regression?
Regression is one of the most important tools of economic researchers.
Regression is a method for describing and evaluating the relationship between one variable (called the dependent variable, usually denoted $y$) and one or more other variables (called the independent variables, $x_1, x_2, \ldots, x_k$).
-
Comparing regression and correlation
In a correlation relationship, the two variables $y$ and $x$ are treated symmetrically.
In a regression model, the independent variable and the dependent variable are treated as entirely different: $y$ is assumed to be random, while $x$ is assumed to be fixed (to take fixed values).
-
Comparing regression and correlation (continued)
A regression model lets us estimate the population parameters and draw statistical inferences about them.
In econometrics, our goal is to estimate the causal effect on $Y$ of a one-unit change in $X$.
-
Simple regression model
By analogy, estimating a regression model is much like estimating a mean.
In a regression model, statistical inference consists of:
- Estimation: how should the parameters be estimated?
- Hypothesis testing: is the estimated parameter different from 0?
- Confidence intervals: constructing a confidence interval for the estimated parameter.
-
Simple regression model
The model contains only one independent variable ($k = 1$), so $y$ depends on a single variable $x$.
A model can contain several $x$ variables, but we consider that case later.
The simple regression model can be used in cases such as:
- Inflation and unemployment
- How the return on a stock is related to its risk
- Modeling the relationship between stock prices and dividends
-
Simple regression model: example
Suppose we have the following data and want to study the relationship between $x$ and $y$:

Year, t    Excess return             Excess return on market index
           = r_XXX,t - rf_t          = rm_t - rf_t
1          17.8                      13.7
2          39.0                      23.2
3          12.8                       6.9
4          24.2                      16.8
5          17.2                      12.3
-
Scatter plot
[Figure: scatter plot of the excess return on fund XXX (vertical axis, 0 to 45) against the excess return on the market portfolio (horizontal axis, 0 to 25).]
-
Finding the line of best fit
We can use the equation $y = \beta_0 + \beta_1 x$ to estimate the best-fitting line; $\beta_1$ is the slope of the line.
This line is also called the population regression line.
We do not know $\beta_0$ and $\beta_1$, so they must be estimated.
Is such a line, which is entirely deterministic, reasonable?
-
Notation and terminology
More generally, for the simple linear regression model we write $y = \beta_0 + \beta_1 x + u$; this is called the population linear regression model.
We usually call $y$ the dependent variable and $x$ the independent variable (or control variable).
$\beta_0$ is the intercept and $\beta_1$ is the slope.
$u$ is the error term of the population regression line.
-
Why is there an error term $u$?
- We may have omitted factors that affect $y_i$.
- The measurement or recording of $y_i$ may contain errors.
- There are random influences on $y_i$ that we cannot model.
-
Graphical representation of the model
-
Some assumptions
The mean of the error term in the regression model is zero: $E(u) = 0$.
This is not a very restrictive assumption, because we can always use $\beta_0$ to normalize the mean (expected value) of $u$, $E(u)$, to zero.
-
Assumptions of the regression model
We need to make an assumption about how $u$ and $x$ are related.
We want to assume that what we know about $x$ tells us nothing about $u$, so that $u$ and $x$ are completely unrelated:
$E(u \mid x) = E(u) = 0$, which implies $E(y \mid x) = \beta_0 + \beta_1 x$.
-
$E(u \mid x) = E(u) = 0$
-
Least squares method
The basic idea of regression is to estimate the population parameters from a sample of data.
Let $\{(x_i, y_i): i = 1, \ldots, n\}$ be a random sample of size $n$ drawn from the population.
For each observation in the sample we have $y_i = \beta_0 + \beta_1 x_i + u_i$.
-
Population regression line, data points, and the associated errors
[Figure: the population regression line $E(y \mid x) = \beta_0 + \beta_1 x$ with four data points $(x_1, y_1), \ldots, (x_4, y_4)$ and their errors $u_1, \ldots, u_4$.]
-
Estimation by least squares
To estimate by least squares, note that our key assumption is $E(u \mid x) = E(u) = 0$, which implies
$\mathrm{Cov}(x, u) = E(xu) = 0$
Why? From basic probability theory, $\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$.
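The step from the zero conditional mean assumption to $E(xu) = 0$ is not spelled out on the slide. One way to fill it in, assuming the law of iterated expectations (a standard result that these slides do not state), is:

```latex
\begin{align*}
E(xu) &= E\big[E(xu \mid x)\big] = E\big[x\,E(u \mid x)\big] = E[x \cdot 0] = 0,\\
\operatorname{Cov}(x, u) &= E(xu) - E(x)\,E(u) = 0 - E(x) \cdot 0 = 0.
\end{align*}
```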
-
Estimation by least squares
To find the best-fitting line, we set up a minimization problem: we choose the parameters so that the expression below is as small as possible:
-
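$$L = \sum_{i=1}^{n} \hat u_i^2 = \sum_{i=1}^{n} \left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right)^2$$
First-order conditions:
$$\frac{\partial L}{\partial \hat\beta_0} = -2 \sum_{i=1}^{n} \left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right) = 0$$
$$\frac{\partial L}{\partial \hat\beta_1} = -2 \sum_{i=1}^{n} x_i \left(y_i - \hat\beta_0 - \hat\beta_1 x_i\right) = 0$$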
-
Estimation by least squares
We can solve this minimization problem with calculus: take the first derivatives with respect to $\hat\beta_0$ and $\hat\beta_1$ and solve the resulting equations. This yields estimates of the regression parameters.
$$\hat\beta_1 = \frac{\sum_{i=1}^{N} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{N} (X_i - \bar X)^2} = \frac{S_{XY}}{S_X^2}$$
where $S_{XY}$ is the sample covariance of $(X, Y)$ and $S_X^2$ is the sample variance of $X$.
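As a rough illustration, the slope formula can be applied to the five observations from the example slide. This is a minimal sketch in plain Python; the function name `ols_fit` is made up for this example, and the intercept uses $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$, which follows from the first of the two first-order conditions.

```python
# Five observations from the example slide (values in percent).
x = [13.7, 23.2, 6.9, 16.8, 12.3]    # excess return on the market index
y = [17.8, 39.0, 12.8, 24.2, 17.2]   # excess return on fund XXX


def ols_fit(x, y):
    """Return (beta0_hat, beta1_hat) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    # Dividing both by (n - 1) would give S_XY and S_X^2; the factor cancels.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must vary in the sample")
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


beta0_hat, beta1_hat = ols_fit(x, y)
print(f"intercept = {beta0_hat:.2f}, slope = {beta1_hat:.2f}")
```

On these five points the sketch gives a slope of about 1.64 and an intercept of about -1.74.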
-
Summary of the slope estimate
The slope estimate is the sample covariance between $x$ and $y$ divided by the sample variance of $x$.
If $x$ and $y$ are positively correlated, the slope estimate is positive.
If $x$ and $y$ are negatively correlated, the slope estimate is negative.
We only need $x$ to vary in the sample.
-
OLS
Intuitively, OLS fits a line through the sample data points so that the sum of squared residuals is as small as possible, hence the name least squares.
The residual, $\hat u$, is an estimate of the error term $u$; it is the difference between the fitted line (the sample regression line) and the data point.
-
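Sample regression line, data points, and the estimated errors
[Figure: the sample regression line $\hat y = \hat\beta_0 + \hat\beta_1 x$ with four data points $(x_1, y_1), \ldots, (x_4, y_4)$ and their residuals $\hat u_1, \ldots, \hat u_4$.]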
-
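The population regression line is the model we believe generated the data; its true parameters are $\beta_0$ and $\beta_1$.
Population regression: $y_i = \beta_0 + \beta_1 x_i + u_i$
Sample regression: $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$, and we know that $\hat u_i = y_i - \hat y_i$.
We use the sample regression line to draw inferences about the population regression line.
We also want to know whether $\hat\beta_0$ and $\hat\beta_1$ are good estimates.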
-
Properties of OLS
- The OLS residuals sum to zero, so their sample mean is also zero.
- The sample covariance between the independent variable and the OLS residuals is zero.
- The OLS line passes through the point of sample means of the data.
-
In algebraic terms, we have
$$\sum_{i=1}^{n} \hat u_i = 0 \quad \text{and thus} \quad \frac{1}{n}\sum_{i=1}^{n} \hat u_i = 0$$
$$\sum_{i=1}^{n} x_i \hat u_i = 0$$
$$\bar y = \hat\beta_0 + \hat\beta_1 \bar x$$
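The three algebraic properties can be checked numerically. The sketch below, in plain Python, refits the line on the five observations from the example slide and evaluates each sum; the variable names are chosen for this illustration only, and the sums come out as zero only up to floating-point rounding.

```python
# Five observations from the example slide (values in percent).
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS slope and intercept.
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar

# Residuals: observed y minus fitted y.
residuals = [yi - (beta0_hat + beta1_hat * xi) for xi, yi in zip(x, y)]

sum_resid = sum(residuals)                                   # property 1
sum_x_resid = sum(xi * ui for xi, ui in zip(x, residuals))   # property 2
gap_at_means = y_bar - (beta0_hat + beta1_hat * x_bar)       # property 3

print(f"sum of residuals      = {sum_resid:.2e}")
print(f"sum of x_i * residual = {sum_x_resid:.2e}")
print(f"y_bar - fitted(x_bar) = {gap_at_means:.2e}")
```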
-
Properties of the OLS estimator
- Linear
- Unbiased
- Best (most efficient)
Together: Best Linear Unbiased Estimator (BLUE)
-
Using STATA to estimate OLS
Running a regression in STATA is very simple. For example, to estimate the regression of y on x, we only need to type the command
reg y x
-
Estimation using STATA
regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581
-------------------------------------------------------------------------
        |               Robust
testscr |      Coef.   Std. Err.       t    P>|t|   [95% Conf. Interval]
--------+----------------------------------------------------------------
    str |  -2.279808   .5194892    -4.39   0.000   -3.300945   -1.258671
  _cons |    698.933   10.36436    67.44   0.000    678.5602    719.3057
-------------------------------------------------------------------------
-
Goodness of fit
We think of each observation as made up of two parts, an explained part and an unexplained part:
$$y_i = \hat y_i + \hat u_i$$
We then define:
- $\sum (y_i - \bar y)^2$: the total sum of squares (SST)
- $\sum (\hat y_i - \bar y)^2$: the explained sum of squares (SSE)
- $\sum \hat u_i^2$: the residual sum of squares (SSR)
We have SST = SSE + SSR.
-
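Proof that SST = SSE + SSR
$$\begin{aligned}
\sum (y_i - \bar y)^2 &= \sum \left[(y_i - \hat y_i) + (\hat y_i - \bar y)\right]^2 \\
&= \sum \left[\hat u_i + (\hat y_i - \bar y)\right]^2 \\
&= \sum \hat u_i^2 + 2 \sum \hat u_i (\hat y_i - \bar y) + \sum (\hat y_i - \bar y)^2 \\
&= SSR + 2 \sum \hat u_i (\hat y_i - \bar y) + SSE
\end{aligned}$$
and we know that $\sum \hat u_i (\hat y_i - \bar y) = 0$, because the OLS residuals sum to zero and satisfy $\sum x_i \hat u_i = 0$.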
-
Goodness of fit
How do we judge the regression line we have estimated? Does it fit the data?
We can compute the fraction of the total sum of squares (SST) that is explained by the model, and call it the R-squared of the regression.
$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$
$R^2$ lies between 0 and 1. The larger it is, the better the fit.
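To make the decomposition concrete, the following sketch computes SST, SSE, and SSR for the five observations from the example slide and checks that both expressions for $R^2$ agree. It is an illustration in plain Python, not part of the original slides, and the variable names are arbitrary.

```python
# Five observations from the example slide (values in percent).
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
beta0_hat = y_bar - beta1_hat * x_bar

fitted = [beta0_hat + beta1_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

sst = sum((yi - y_bar) ** 2 for yi in y)        # total sum of squares
sse = sum((fi - y_bar) ** 2 for fi in fitted)   # explained sum of squares
ssr = sum(ui ** 2 for ui in residuals)          # residual sum of squares

r_squared = sse / sst
r_squared_alt = 1 - ssr / sst

print(f"SST = {sst:.3f}, SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"R^2 = {r_squared:.4f} (via SSE/SST), {r_squared_alt:.4f} (via 1 - SSR/SST)")
```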
-
Sampling distribution of the OLS estimator
The OLS estimate is computed from one sample of data; a different sample would give a different value of $\hat\beta_1$. This is called the sampling uncertainty of $\hat\beta_1$. We want to:
- quantify the sampling uncertainty of $\hat\beta_1$;
- use $\hat\beta_1$ to test hypotheses such as $\beta_1 = 0$;
- construct a confidence interval for $\beta_1$.
All of these require us to study the sampling distribution of the OLS estimator.
-
Distribution of $\hat\beta_1$
Like the sample mean $\bar Y$, $\hat\beta_1$ has a sampling distribution.
- What is $E(\hat\beta_1)$? If $E(\hat\beta_1) = \beta_1$, then the OLS estimator is unbiased, which is what we want.
- What is $\mathrm{var}(\hat\beta_1)$? (It tells us how uncertain the estimate is.)
- What is the distribution of $\hat\beta_1$ in small samples? This is a hard question.
- What is the distribution of $\hat\beta_1$ in large samples? In large samples, $\hat\beta_1$ is approximately normally distributed.
-
Unbiasedness of OLS
- Assume the population model is linear in parameters: $y = \beta_0 + \beta_1 x + u$.
- Assume we use a sample of size $n$, $\{(x_i, y_i): i = 1, 2, \ldots, n\}$, drawn from the population model, so that the sample model is $y_i = \beta_0 + \beta_1 x_i + u_i$.
- Assume $E(u \mid x) = 0$, and thus $E(u_i \mid x_i) = 0$.
- Assume there is variation in the $x_i$.
-
Unbiasedness of OLS
To examine unbiasedness, we rewrite the estimator in terms of the population parameters. Start from the simple formula
$$\hat\beta_1 = \frac{\sum (x_i - \bar x)\, y_i}{s_x^2}, \quad \text{where } s_x^2 \equiv \sum (x_i - \bar x)^2$$
-
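Unbiasedness of OLS
$$\begin{aligned}
\sum (x_i - \bar x)\, y_i &= \sum (x_i - \bar x)(\beta_0 + \beta_1 x_i + u_i) \\
&= \sum (x_i - \bar x)\beta_0 + \sum (x_i - \bar x)\beta_1 x_i + \sum (x_i - \bar x) u_i \\
&= \beta_0 \sum (x_i - \bar x) + \beta_1 \sum (x_i - \bar x) x_i + \sum (x_i - \bar x) u_i
\end{aligned}$$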
-
Unbiasedness of OLS
$$\sum (x_i - \bar x) = 0, \qquad \sum (x_i - \bar x) x_i = \sum (x_i - \bar x)^2 = s_x^2$$
so the numerator can be rewritten as $\beta_1 s_x^2 + \sum (x_i - \bar x) u_i$, and thus
$$\hat\beta_1 = \beta_1 + \frac{\sum (x_i - \bar x) u_i}{s_x^2}$$
-
Unbiasedness of OLS
Let $d_i = (x_i - \bar x)$, so that
$$\hat\beta_1 = \beta_1 + \frac{1}{s_x^2} \sum d_i u_i$$
then
$$E(\hat\beta_1) = \beta_1 + \frac{1}{s_x^2} \sum d_i E(u_i) = \beta_1$$
-
Unbiasedness of OLS
- The OLS estimators of $\beta_1$ and $\beta_0$ are unbiased.
- The proof of unbiasedness relies on the four assumptions. If any one of them fails, OLS is not necessarily unbiased.
- Note that unbiasedness is a property of the estimator; in a particular sample, the estimate obtained may be close to or far from the true parameter.
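Unbiasedness can also be illustrated by simulation. The sketch below is not from the slides: it assumes a population with $\beta_0 = 1$, $\beta_1 = 2$, and normal errors with $\sigma = 1$ (all values picked arbitrarily), holds the $x_i$ fixed, redraws the errors many times, and averages the resulting slope estimates.

```python
import random


def ols_slope(x, y):
    """OLS slope: sum of (x_i - x_bar) * y_i divided by sum of (x_i - x_bar)^2."""
    x_bar = sum(x) / len(x)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / s_xx


# Assumed "true" population values, chosen only for this illustration.
BETA0, BETA1, SIGMA = 1.0, 2.0, 1.0
N_OBS, N_REPS = 50, 2000

rng = random.Random(12345)                           # fixed seed for reproducibility
x = [rng.uniform(0.0, 10.0) for _ in range(N_OBS)]   # x is held fixed across samples

slopes = []
for _ in range(N_REPS):
    # Draw fresh errors with mean zero, independent of x, and build a new sample.
    y = [BETA0 + BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in x]
    slopes.append(ols_slope(x, y))

mean_slope = sum(slopes) / N_REPS
print(f"true slope = {BETA1}, average estimate = {mean_slope:.4f}")
print(f"smallest estimate = {min(slopes):.3f}, largest = {max(slopes):.3f}")
```

Individual estimates scatter around the true slope, while their average should sit very close to it, which is the distinction between an estimate and an estimator drawn on the slide.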
-
Variance of the OLS estimator
We know that the sampling distribution of the estimator is centered on the true parameter.
We want to know how spread out this distribution is.
To do so, we add an assumption about the variance: $\mathrm{Var}(u \mid x) = \sigma^2$ (homoskedasticity).
-
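Variance of the OLS estimator
$\mathrm{Var}(u \mid x) = E(u^2 \mid x) - [E(u \mid x)]^2$
$E(u \mid x) = 0$, so $\sigma^2 = E(u^2 \mid x) = E(u^2) = \mathrm{Var}(u)$.
Thus $\sigma^2$ is also the unconditional variance, called the error variance; $\sigma$ is called the standard deviation of the error.
We can say that $E(y \mid x) = \beta_0 + \beta_1 x$ and $\mathrm{Var}(y \mid x) = \sigma^2$.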
-
Homoskedastic case
[Figure: conditional densities $f(y \mid x)$ at $x_1$ and $x_2$ around the line $E(y \mid x) = \beta_0 + \beta_1 x$.]
-
Heteroskedastic case
[Figure: conditional densities $f(y \mid x)$ at $x_1$, $x_2$, and $x_3$ around the line $E(y \mid x) = \beta_0 + \beta_1 x$.]
-
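Variance of the OLS estimator
With $d_i = x_i - \bar x$ and $s_x^2 = \sum d_i^2$:
$$\begin{aligned}
\mathrm{Var}(\hat\beta_1) &= \mathrm{Var}\left(\beta_1 + \frac{1}{s_x^2} \sum d_i u_i\right) \\
&= \left(\frac{1}{s_x^2}\right)^2 \mathrm{Var}\left(\sum d_i u_i\right) \\
&= \left(\frac{1}{s_x^2}\right)^2 \sum d_i^2\, \mathrm{Var}(u_i) \\
&= \left(\frac{1}{s_x^2}\right)^2 \sum d_i^2 \sigma^2 = \sigma^2 \left(\frac{1}{s_x^2}\right)^2 \sum d_i^2 \\
&= \sigma^2 \left(\frac{1}{s_x^2}\right)^2 s_x^2 = \frac{\sigma^2}{s_x^2}
\end{aligned}$$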
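The result $\mathrm{Var}(\hat\beta_1) = \sigma^2 / s_x^2$ can be compared with a simulation. This sketch assumes homoskedastic normal errors and arbitrary population values ($\beta_0 = 1$, $\beta_1 = 2$, $\sigma = 1.5$), none of which come from the slides; it holds $x$ fixed and compares the variance of the simulated slope estimates with the formula.

```python
import random
import statistics

# Assumed population values, for illustration only.
BETA0, BETA1, SIGMA = 1.0, 2.0, 1.5
N_OBS, N_REPS = 40, 4000

rng = random.Random(2024)                            # fixed seed for reproducibility
x = [rng.uniform(0.0, 10.0) for _ in range(N_OBS)]   # x is held fixed across samples
x_bar = sum(x) / N_OBS
d = [xi - x_bar for xi in x]                         # d_i = x_i - x_bar
s_xx = sum(di ** 2 for di in d)                      # s_x^2 = sum of d_i^2

slopes = []
for _ in range(N_REPS):
    u = [rng.gauss(0.0, SIGMA) for _ in range(N_OBS)]        # Var(u|x) = sigma^2
    y = [BETA0 + BETA1 * xi + ui for xi, ui in zip(x, u)]
    slopes.append(sum(di * yi for di, yi in zip(d, y)) / s_xx)

theoretical_var = SIGMA ** 2 / s_xx
simulated_var = statistics.variance(slopes)

print(f"sigma^2 / s_x^2    = {theoretical_var:.6f}")
print(f"simulated variance = {simulated_var:.6f}")
```

The two numbers should agree up to simulation noise; they will not match exactly.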
-
Variance of the OLS estimator
- The larger the error variance $\sigma^2$, the larger the variance of the slope estimator.
- The more the $x_i$ vary, the smaller the variance of the slope estimator.
- As a result, a larger sample reduces the variance of the slope estimator.
- The problem is that we do not know the error variance.
-
Estimating the error variance
We do not know the error variance $\sigma^2$, because we do not observe the errors $u_i$.
We only observe the residuals $\hat u_i$.
We can use the residuals $\hat u_i$ to estimate the error variance.
-
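Estimating the error variance
$$\begin{aligned}
\hat u_i &= y_i - \hat\beta_0 - \hat\beta_1 x_i \\
&= (\beta_0 + \beta_1 x_i + u_i) - \hat\beta_0 - \hat\beta_1 x_i \\
&= u_i - (\hat\beta_0 - \beta_0) - (\hat\beta_1 - \beta_1) x_i
\end{aligned}$$
Then an unbiased estimator of $\sigma^2$ is
$$\hat\sigma^2 = \frac{1}{n - 2} \sum \hat u_i^2 = \frac{SSR}{n - 2}$$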
-
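Estimating the error variance
$\hat\sigma = \sqrt{\hat\sigma^2}$ is the standard error of the regression.
Recall that $\mathrm{sd}(\hat\beta_1) = \sigma / s_x$, with $s_x = \left(\sum (x_i - \bar x)^2\right)^{1/2}$.
If we substitute $\hat\sigma$ for $\sigma$, then we have the standard error of $\hat\beta_1$:
$$\mathrm{se}(\hat\beta_1) = \frac{\hat\sigma}{\left(\sum (x_i - \bar x)^2\right)^{1/2}}$$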
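As a worked example of these formulas, the sketch below computes $\hat\sigma^2 = SSR/(n-2)$ and $\mathrm{se}(\hat\beta_1)$ for the five observations from the example slide. It assumes homoskedasticity, as in the derivation, and with only five points the numbers are purely illustrative.

```python
import math

# Five observations from the example slide (values in percent).
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)      # s_x^2 in the slides' notation

beta1_hat = sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

residuals = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]
ssr = sum(ui ** 2 for ui in residuals)

sigma2_hat = ssr / (n - 2)                     # unbiased estimator of sigma^2
sigma_hat = math.sqrt(sigma2_hat)              # standard error of the regression
se_beta1 = sigma_hat / math.sqrt(s_xx)         # standard error of the slope

print(f"sigma^2 hat = {sigma2_hat:.3f}, sigma hat = {sigma_hat:.3f}")
print(f"se(beta1 hat) = {se_beta1:.4f}")
```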
-
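Summary of the sampling distribution of $\hat\beta_1$
If the OLS assumptions hold, the sampling distribution of $\hat\beta_1$ has:
- $E(\hat\beta_1) = \beta_1$ (that is, $\hat\beta_1$ is unbiased);
- $\mathrm{var}(\hat\beta_1) = \dfrac{1}{n} \times \dfrac{\mathrm{var}[(X_i - \mu_X) u_i]}{\sigma_X^4} \propto \dfrac{1}{n}$;
- in large samples, $\dfrac{\hat\beta_1 - E(\hat\beta_1)}{\sqrt{\mathrm{var}(\hat\beta_1)}} \sim N(0, 1)$ (CLT).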
-
Hypothesis testing and the standard error of $\hat\beta_1$
The goal of hypothesis testing in a regression model is to use data to test a hypothesis about the population, such as $\beta_1 = 0$, and to conclude whether the hypothesis is supported or not.
Null hypothesis and two-sided alternative: $H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 \neq \beta_{1,0}$
where $\beta_{1,0}$ is the hypothesized value.
Null hypothesis and one-sided alternative: $H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 < \beta_{1,0}$
-
Testing procedure: construct the t (or z) statistic, then compute the p-value or compare the statistic with the critical value of the $N(0, 1)$ distribution.
In general: t = (estimate - hypothesized value) / (standard error of the estimate)
When testing the mean of $Y$:
$$t = \frac{\bar Y - \mu_{Y,0}}{s_Y / \sqrt{n}}$$
When testing $\beta_1$:
$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)}$$
-
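Formula for $SE(\hat\beta_1)$
$$\hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\text{estimator of } \sigma_v^2}{(\text{estimator of } \sigma_X^2)^2} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^{n} \hat v_i^2}{\left[\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2\right]^2}$$
where $\hat v_i = (X_i - \bar X)\hat u_i$, and $SE(\hat\beta_1) = \sqrt{\hat\sigma^2_{\hat\beta_1}}$.
$SE(\hat\beta_1)$ looks complicated, but STATA computes it very quickly and we do not need to memorize the formula.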
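The formula is easier to read once it is turned into arithmetic. The sketch below applies it, term by term, to the five observations from the example slide. Five points are far too few for a large-sample formula to be reliable, so this only illustrates the calculation; the variable names are chosen for this example.

```python
import math

# Five observations from the example slide (values in percent).
x = [13.7, 23.2, 6.9, 16.8, 12.3]
y = [17.8, 39.0, 12.8, 24.2, 17.2]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

beta1_hat = sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
residuals = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]

# v_hat_i = (X_i - X_bar) * u_hat_i
v_hat = [(xi - x_bar) * ui for xi, ui in zip(x, residuals)]

var_v_hat = sum(vi ** 2 for vi in v_hat) / (n - 2)   # estimator of sigma_v^2
var_x_hat = s_xx / n                                  # estimator of sigma_X^2

var_beta1_hat = (1.0 / n) * var_v_hat / var_x_hat ** 2
se_robust = math.sqrt(var_beta1_hat)

print(f"SE(beta1 hat) from the formula = {se_robust:.4f}")
```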
-
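Example
Estimated regression model: Test score = 698.9 - 2.28 × STR
STATA also reports the standard errors of the estimates:
$SE(\hat\beta_0) = 10.4$, $SE(\hat\beta_1) = 0.52$
We can compute the test statistic for $\beta_1$ under the null hypothesis $H_0: \beta_{1,0} = 0$:
$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)} = \frac{-2.28 - 0}{0.52} = -4.38$$
(STATA reports -4.39 because it uses the unrounded estimates.)
At the 1% significance level the two-sided critical value is 2.58, and $|t| = 4.38 > 2.58$, so we reject the null hypothesis at the 1% level.
We could also compute the p-value, but STATA has already done that for us.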
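The arithmetic in this example can be reproduced in a few lines. The sketch below uses the rounded estimates from the slide and the large-sample $N(0, 1)$ approximation for the p-value; the helper name `two_sided_p_value_normal` is made up for this illustration, and the 95% interval uses the usual normal critical value 1.96.

```python
import math


def two_sided_p_value_normal(t):
    """Two-sided p-value under the large-sample N(0, 1) approximation."""
    # 2 * (1 - Phi(|t|)) equals erfc(|t| / sqrt(2)).
    return math.erfc(abs(t) / math.sqrt(2.0))


beta1_hat = -2.28     # rounded slope estimate from the slide
se_beta1 = 0.52       # rounded standard error from the slide
beta1_null = 0.0      # hypothesized value under H0

t_stat = (beta1_hat - beta1_null) / se_beta1
p_value = two_sided_p_value_normal(t_stat)
reject_at_1pct = abs(t_stat) > 2.58          # two-sided 1% critical value

# Approximate 95% confidence interval: estimate +/- 1.96 standard errors.
ci_95 = (beta1_hat - 1.96 * se_beta1, beta1_hat + 1.96 * se_beta1)

print(f"t = {t_stat:.2f}, p-value = {p_value:.6f}, reject at 1%: {reject_at_1pct}")
print(f"95% CI: ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
```

The interval from the rounded inputs is roughly (-3.30, -1.26), close to the one in the STATA output, which is computed from unrounded values.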
-
Estimation using STATA
regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581
-------------------------------------------------------------------------
        |               Robust
testscr |      Coef.   Std. Err.       t    P>|t|   [95% Conf. Interval]
--------+----------------------------------------------------------------
    str |  -2.279808   .5194892    -4.39   0.000   -3.300945   -1.258671
  _cons |    698.933   10.36436    67.44   0.000    678.5602    719.3057
-------------------------------------------------------------------------
-
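Summary: testing $H_0: \beta_1 = \beta_{1,0}$ vs. $H_1: \beta_1 \neq \beta_{1,0}$
Compute the t-statistic:
$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)} = \frac{\hat\beta_1 - \beta_{1,0}}{\sqrt{\hat\sigma^2_{\hat\beta_1}}}$$
Reject the null hypothesis at the 5% significance level if $|t| > 1.96$.
Reject the null hypothesis if the p-value is below the chosen significance level.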
-
Reading STATA output
regress testscr str, robust

Regression with robust standard errors          Number of obs =     420
                                                F(  1,   418) =   19.26
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0512
                                                Root MSE      =  18.581
-------------------------------------------------------------------------
        |               Robust
testscr |      Coef.   Std. Err.       t    P>|t|   [95% Conf. Interval]
--------+----------------------------------------------------------------
    str |  -2.279808   .5194892    -4.39   0.000   -3.300945   -1.258671
  _cons |    698.933   10.36436    67.44   0.000    678.5602    719.3057
-------------------------------------------------------------------------

$\widehat{TestScore} = 698.9 - 2.28 \times STR$, $R^2 = .05$
Standard errors: (10.4) for the intercept and (0.52) for the slope.
$t(\beta_1 = 0) = -4.38$ from the rounded estimates (-4.39 in the output), p-value = 0.000 (2-sided)
The 95% confidence interval for $\beta_1$ is $(-3.30, -1.26)$.