I. Bayesian econometrics · econweb.ucsd.edu/~jhamilto/Econ226_1CE_slides.pdf · 2015. 4. 6.


Post on 08-Mar-2021


I. Bayesian econometrics

A. Introduction

B. Bayesian inference in the univariate regression model

C. Statistical decision theory

Question: once we’ve calculated the posterior distribution, what do we do with it?

1. Example: portfolio allocation problem

$r_{jt}$ = gross return on asset $j$ at time $t$

$r_t = (r_{1t}, \ldots, r_{Jt})'$

$r_t \mid \mu, \Omega \sim N(\mu, \Omega)$ ($\Omega$ known)

likelihood:

$$p(r_1, \ldots, r_T \mid \mu, \Omega) = \frac{1}{(2\pi)^{JT/2} |\Omega|^{T/2}} \exp\left[ -\frac{1}{2} \sum_{t=1}^{T} (r_t - \mu)' \Omega^{-1} (r_t - \mu) \right]$$

classical inference:

$$\hat{\mu} = T^{-1} \sum_{t=1}^{T} r_t = \bar{r}$$

$$\hat{\mu} \sim N(\mu, T^{-1} \Omega)$$

Bayesian prior:

$$\mu \sim N(m, M)$$

Bayesian posterior:

$$\mu \mid r_1, \ldots, r_T \sim N(m^*, M^*)$$

$$M^* = (M^{-1} + T \Omega^{-1})^{-1}$$

$$m^* = M^* (M^{-1} m + T \Omega^{-1} \bar{r})$$

Classical econometrician:

Step 1: Solve portfolio allocation problem as if $\mu, \Omega$ known with certainty

Step 2: Estimate $\hat{\mu}$ from data and plug results into Step 1

Step 1: portfolio allocation if $\mu, \Omega$ known

$a_j$ = quantity of asset $j$ purchased, $j = 1, \ldots, J$

$y$ = income

budget constraint:

$$\sum_{j=1}^{J} a_j = y$$

$c$ = future consumption

$$c = \sum_{j=1}^{J} r_j a_j$$

$$\max_{\{a_1, \ldots, a_J\}} E\, U\Big( \sum_{j=1}^{J} r_j a_j \Big) \quad \text{s.t.} \quad \sum_{j=1}^{J} a_j = y$$

$$U(c) = -\exp(-\gamma c)$$

$$E[U(c) \mid \mu, \Omega] = -E \exp(-\gamma a' r) = -\exp[-\gamma a' \mu + (\gamma^2/2)\, a' \Omega a]$$

(since for $r \sim N(\mu, \Omega)$, $E \exp(s' r) = \exp[s' \mu + (1/2) s' \Omega s]$; here $s = -\gamma a$)
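The closed form for expected exponential utility can be checked numerically. The sketch below does this for a single asset ($J = 1$), comparing the formula with a Monte Carlo average. The function names and all numerical values are illustrative assumptions, not taken from the slides.

```python
import math
import random

def expected_cara_utility(a, mu, omega, gamma):
    """Closed form -exp(-gamma*a*mu + (gamma^2/2)*a^2*omega) for one asset."""
    return -math.exp(-gamma * a * mu + 0.5 * gamma**2 * a**2 * omega)

def simulated_cara_utility(a, mu, omega, gamma, draws=200_000, seed=0):
    """Monte Carlo average of -exp(-gamma*a*r) with r ~ N(mu, omega)."""
    rng = random.Random(seed)
    sd = math.sqrt(omega)
    total = 0.0
    for _ in range(draws):
        total += -math.exp(-gamma * a * rng.gauss(mu, sd))
    return total / draws

# Hypothetical inputs: one unit of an asset with mean gross return 1.05
# and return variance 0.04, risk-aversion parameter gamma = 2.
closed = expected_cara_utility(a=1.0, mu=1.05, omega=0.04, gamma=2.0)
simulated = simulated_cara_utility(a=1.0, mu=1.05, omega=0.04, gamma=2.0)
print(closed, simulated)
```

The two numbers should agree up to simulation error, which shrinks as `draws` grows.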

Conclusion: if classical econometrician in Step 1 solves portfolio decision as if $\mu, \Omega$ known with certainty, then solves

$$\max_{a}\; -\exp[-\gamma a' \mu + (\gamma^2/2)\, a' \Omega a] \quad \text{s.t.} \quad a' 1 = y$$

where $1$ denotes a $J \times 1$ vector of ones. Maximizing $-\exp(x)$ is equivalent to minimizing $x$, so the Lagrangian is

$$\mathcal{L} = -\gamma a' \mu + (\gamma^2/2)\, a' \Omega a + \lambda (y - a' 1)$$

$$-\gamma \mu + \gamma^2 \Omega a - \lambda 1 = 0$$

Optimal decision when $\mu$ known:

$$a = (\gamma^2 \Omega)^{-1} (\gamma \mu + \lambda 1)$$

with $\lambda$ chosen so that the budget constraint $a' 1 = y$ holds.

Step 2: Estimate $\mu$ from data and plug in:

$$\hat{a} = (\gamma^2 \Omega)^{-1} (\gamma \hat{\mu} + \hat{\lambda} 1)$$

Bayesian econometrician:

Solve optimization problem under uncertainty

$r_{T+1} \mid \mu \sim N(\mu, \Omega)$

(or $r_{T+1} = \mu + \varepsilon_{T+1}$, $\varepsilon_{T+1} \sim N(0, \Omega)$)

$\mu \mid r_1, \ldots, r_T \sim N(m^*, M^*)$

$\Rightarrow r_{T+1} \mid r_1, \ldots, r_T \sim N(m^*, \Omega + M^*)$

$$E[U(c_{T+1}) \mid r_1, \ldots, r_T] = -E \exp(-\gamma a' r_{T+1}) = -\exp[-\gamma a' m^* + (\gamma^2/2)\, a' (\Omega + M^*) a]$$

(since for $r_{T+1} \sim N(m^*, \Omega + M^*)$, $E \exp(s' r_{T+1}) = \exp[s' m^* + (1/2) s' (\Omega + M^*) s]$; here $s = -\gamma a$)

That is, correct decision problem recognizes that not only is $r_{T+1}$ random, we also have uncertainty about $\mu$ and this matters for making the optimal decision

$$a^* = [\gamma^2 (\Omega + M^*)]^{-1} (\gamma m^* + \lambda^* 1)$$

uncertainty about $\mu$ influences portfolio allocation decision (even if we have diffuse prior so that $m^* = \hat{\mu}$)
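To make the contrast concrete, here is a minimal sketch for $J = 2$ assets that computes the posterior $(m^*, M^*)$ and then both the classical plug-in allocation and the Bayesian allocation. The helper functions (`inv2`, `matvec`, `add`, `scale`, `allocate`) and every number are assumptions chosen for illustration. The multiplier $\lambda$ is solved from the budget constraint $a'1 = y$.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def add(m1, m2):
    return [[m1[i][j] + m2[i][j] for j in range(2)] for i in range(2)]

def scale(k, m):
    return [[k * m[i][j] for j in range(2)] for i in range(2)]

def allocate(mean, var, gamma, y):
    """a = (gamma^2 V)^{-1} (gamma*mean + lam*1), with lam set so sum(a) = y."""
    A = inv2(scale(gamma**2, var))
    A_mean = matvec(A, mean)
    A_one = matvec(A, [1.0, 1.0])
    lam = (y - gamma * sum(A_mean)) / sum(A_one)
    return [gamma * A_mean[i] + lam * A_one[i] for i in range(2)]

# Illustrative numbers (assumptions, not from the slides).
omega = [[0.04, 0.00], [0.00, 0.01]]            # known return covariance
prior_m, prior_M = [1.0, 1.0], [[0.02, 0.0], [0.0, 0.02]]
r_bar, T = [1.08, 1.03], 20                     # sample mean return, sample size
gamma, y = 2.0, 1.0

# Posterior: M* = (M^{-1} + T Omega^{-1})^{-1}, m* = M*(M^{-1} m + T Omega^{-1} r_bar)
M_inv, omega_inv = inv2(prior_M), inv2(omega)
M_star = inv2(add(M_inv, scale(T, omega_inv)))
rhs = [u + v for u, v in zip(matvec(M_inv, prior_m),
                             matvec(scale(T, omega_inv), r_bar))]
m_star = matvec(M_star, rhs)

a_classical = allocate(r_bar, omega, gamma, y)            # plug in mu_hat = r_bar
a_bayes = allocate(m_star, add(omega, M_star), gamma, y)  # predictive mean, variance
print(a_classical, a_bayes)
```

Both allocations exhaust the budget, but they differ because the Bayesian version uses the predictive variance $\Omega + M^*$ rather than $\Omega$ alone.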


Bayesian considers the statistical inference problem to be: calculate the posterior distribution

How this distribution is used to come up with a “parameter estimate” requires specifying a loss function

I. Bayesian econometrics

C. Statistical decision theory

1. Example: portfolio allocation problem

2. General decision theory

$\theta$ = unknown true value

$\hat{\theta}$ = estimate

$\ell(\hat{\theta}, \theta)$ = loss function = how much we are concerned if we announce an estimate of $\hat{\theta}$ but the truth is $\theta$

$\hat{\theta}$ is solution to

$$\min_{\hat{\theta}} \int_{\Theta} \ell(\hat{\theta}, \theta)\, p(\theta \mid Y)\, d\theta$$

where $\theta \in \Theta$

Scalar examples:

(1) quadratic loss

$$\ell(\hat{\theta}, \theta) = (\theta - \hat{\theta})^2$$

Claim: optimal $\hat{\theta} = E(\theta \mid Y)$

Proof:

$$E_{\theta|Y} (\theta - \hat{\theta})^2 = E_{\theta|Y} [\theta - E(\theta|Y) + E(\theta|Y) - \hat{\theta}]^2$$

$$= E_{\theta|Y} [\theta - E(\theta|Y)]^2 + [E(\theta|Y) - \hat{\theta}]^2 + 2\, E_{\theta|Y}[\theta - E(\theta|Y)]\, [E(\theta|Y) - \hat{\theta}]$$

$$= E_{\theta|Y} [\theta - E(\theta|Y)]^2 + [E(\theta|Y) - \hat{\theta}]^2$$

(the cross term vanishes because $E_{\theta|Y}[\theta - E(\theta|Y)] = 0$)

minimized at $\hat{\theta} = E(\theta|Y)$

Conclusion: for quadratic loss, optimal estimate is posterior mean

(2) absolute loss

$$\ell(\hat{\theta}, \theta) = |\theta - \hat{\theta}|$$

Claim: optimal $\hat{\theta} = \theta_{\text{med}}$, where

$$\int_{-\infty}^{\theta_{\text{med}}} p(\theta|Y)\, d\theta = 0.5$$

Proof:

$$\int_{-\infty}^{\infty} |\theta - \hat{\theta}|\, p(\theta|Y)\, d\theta = \int_{-\infty}^{\hat{\theta}} (\hat{\theta} - \theta)\, p(\theta|Y)\, d\theta + \int_{\hat{\theta}}^{\infty} (\theta - \hat{\theta})\, p(\theta|Y)\, d\theta$$

differentiating with respect to $\hat{\theta}$ gives:

$$(\hat{\theta} - \hat{\theta})\, p(\hat{\theta}|Y) + \int_{-\infty}^{\hat{\theta}} p(\theta|Y)\, d\theta - (\hat{\theta} - \hat{\theta})\, p(\hat{\theta}|Y) - \int_{\hat{\theta}}^{\infty} p(\theta|Y)\, d\theta$$

minimized when

$$\int_{-\infty}^{\hat{\theta}} p(\theta|Y)\, d\theta = \int_{\hat{\theta}}^{\infty} p(\theta|Y)\, d\theta$$

Conclusion: for absolute loss, optimal estimate is posterior median

(3) point loss (discrete case)

$\theta \in \{\theta_1, \ldots, \theta_J\}$

$$\ell(\hat{\theta}, \theta) = \begin{cases} 0 & \text{if } \hat{\theta} = \theta \\ 1 & \text{if } \hat{\theta} \neq \theta \end{cases}$$

$$\hat{\theta} = \arg\min \sum_{j=1}^{J} [1 - \delta(\hat{\theta} = \theta_j)]\, P(\theta = \theta_j \mid Y)$$

where $\delta(\cdot)$ equals 1 if its argument is true and 0 otherwise

$\Rightarrow \hat{\theta} = \theta_j$ for which $P(\theta = \theta_j \mid Y)$ is highest

Conclusion: for point loss, optimal estimate is posterior mode
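The three conclusions (mean, median, mode) can be checked by brute force on a small discrete posterior. The sketch below uses a hypothetical five-point posterior, picked so that the three estimates differ, and searches a grid of candidate estimates for each loss function. All names and numbers are illustrative assumptions.

```python
support = [0.0, 1.0, 2.0, 3.0, 4.0]
probs = [0.10, 0.35, 0.25, 0.20, 0.10]   # hypothetical P(theta = theta_j | Y)

def expected_loss(estimate, loss):
    """Posterior expected loss of announcing `estimate`."""
    return sum(p * loss(estimate, theta) for theta, p in zip(support, probs))

quadratic = lambda est, theta: (theta - est) ** 2
absolute = lambda est, theta: abs(theta - est)
point = lambda est, theta: 0.0 if est == theta else 1.0

# Candidate estimates: a fine grid for the continuous losses,
# the support itself for point loss.
grid = [i / 100 for i in range(401)]
best_quadratic = min(grid, key=lambda e: expected_loss(e, quadratic))
best_absolute = min(grid, key=lambda e: expected_loss(e, absolute))
best_point = min(support, key=lambda e: expected_loss(e, point))

posterior_mean = sum(p * theta for theta, p in zip(support, probs))
print(best_quadratic, best_absolute, best_point, posterior_mean)
```

With this posterior the quadratic-loss minimizer coincides with the posterior mean, the absolute-loss minimizer with the posterior median (the first support point whose cumulative probability exceeds 0.5), and the point-loss minimizer with the posterior mode.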

Returning to example from first lecture

$y_t \mid \mu \sim N(\mu, \sigma^2)$ ($\sigma$ known)

$\mu \sim N(m, \tau^2)$ (prior)

$\mu \mid Y \sim N(m^*, \tau^{*2})$ (posterior)

$$m^* = \frac{\sigma^2/T}{(\sigma^2/T) + \tau^2}\, m + \frac{\tau^2}{(\sigma^2/T) + \tau^2}\, \bar{y}$$

for any of these three loss functions (quadratic, absolute, point), the estimate would be $m^*$

diffuse prior: $\tau \to \infty \Rightarrow \hat{\mu} = \bar{y}$

I. Bayesian econometrics

C. Statistical decision theory

1. Example: portfolio allocation problem

2. General decision theory

3. Bayesian statistics and admissibility

More generally, we can consider some action $a$ we plan to take.

Parameter estimation: $a = \hat{\theta}$ means we announce that our estimate is $\hat{\theta}$.

Hypothesis testing:

$a = 0$ if we accept $H_0: \theta = \theta_0$

$a = 1$ if we reject $H_0: \theta = \theta_0$

$\ell(\theta, a)$ = loss if we take the action $a$ when the true value of the parameter turns out to be $\theta$.

Bayesian decision: choose action to minimize posterior expected loss:

$a_B(Y)$ minimizes $\int \ell(\theta, a(Y))\, p(\theta \mid Y)\, d\theta$,

where expectation is with respect to $\theta$ taking data $Y$ as given.

In other words, Bayesian maximizes expected utility given current uncertainty.

Classical decision: choose action to minimize expected loss across samples:

$a_C(Y)$ minimizes $\int \ell(\theta, a(Y))\, p(Y \mid \theta)\, dY$

where expectation is with respect to $Y$ taking parameter $\theta$ as given.

A decision rule $a(Y)$ is said to be inadmissible if there exists an alternative rule $a_A(Y)$ such that

$$\int \ell(\theta, a_A(Y))\, p(Y \mid \theta)\, dY \le \int \ell(\theta, a(Y))\, p(Y \mid \theta)\, dY$$

for all $\theta$ with strict inequality for some $\theta$.

Under certain regularity conditions, any Bayesian decision is admissible.

Proof for simple case. Suppose

$\theta \in \{\theta_1, \ldots, \theta_J\}$

$Y \in \{Y_1, \ldots, Y_K\}$

$p(\theta_j) > 0$ for $j = 1, \ldots, J$

$\ell(\theta_j, a(Y_k)) < c$ for all $j, k$

For $a_B(Y_k)$ the Bayes decision and $a_A(Y_k)$ any other decision,

$$\sum_{j=1}^{J} \ell(\theta_j, a_B(Y_k))\, p(\theta_j \mid Y_k) \le \sum_{j=1}^{J} \ell(\theta_j, a_A(Y_k))\, p(\theta_j \mid Y_k).$$

Multiplying by $p(Y_k) = \sum_{i=1}^{J} p(Y_k \mid \theta_i)\, p(\theta_i)$ and adding over $k$,

$$\sum_{k=1}^{K} \sum_{j=1}^{J} \ell(\theta_j, a_B(Y_k))\, p(\theta_j \mid Y_k)\, p(Y_k) \le \sum_{k=1}^{K} \sum_{j=1}^{J} \ell(\theta_j, a_A(Y_k))\, p(\theta_j \mid Y_k)\, p(Y_k).$$

But if Bayesian rule was inadmissible, there would have to exist $a_A(Y_k)$ for which

$$\sum_{k=1}^{K} \ell(\theta_j, a_B(Y_k))\, p(Y_k \mid \theta_j) \ge \sum_{k=1}^{K} \ell(\theta_j, a_A(Y_k))\, p(Y_k \mid \theta_j)$$

for all $j$, with strict inequality for some $j$.

Multiplying by $p(\theta_j) > 0$ and adding over $j$,

$$\sum_{j=1}^{J} \sum_{k=1}^{K} \ell(\theta_j, a_B(Y_k))\, p(Y_k \mid \theta_j)\, p(\theta_j) > \sum_{j=1}^{J} \sum_{k=1}^{K} \ell(\theta_j, a_A(Y_k))\, p(Y_k \mid \theta_j)\, p(\theta_j)$$

Since $p(Y_k \mid \theta_j)\, p(\theta_j) = p(\theta_j \mid Y_k)\, p(Y_k)$, this is the same as

$$\sum_{j=1}^{J} \sum_{k=1}^{K} \ell(\theta_j, a_B(Y_k))\, p(\theta_j \mid Y_k)\, p(Y_k) > \sum_{j=1}^{J} \sum_{k=1}^{K} \ell(\theta_j, a_A(Y_k))\, p(\theta_j \mid Y_k)\, p(Y_k)$$

which contradicts definition of Bayes decision.

Converse also true:

If $a_C(Y_k)$ is admissible, there exists a prior $p(\theta_j)$, $j = 1, \ldots, J$, for which $a_C(Y_k)$ is Bayes decision.

A class of decision rules $\mathcal{D}$ is said to be a complete class if all admissible rules are contained in $\mathcal{D}$.

The class is said to be minimal complete if no proper subclass is complete.

Complete Class Theorem: under certain regularity conditions, the set of Bayes decisions for all possible priors is a minimal complete class.

Example: hypothesis testing

$a = 1$ (reject $H_0: \theta = \theta_0$)

$a = 0$ (accept $H_0$)

$$\ell(\theta, 1) = \begin{cases} 0 & \text{if } \theta \neq \theta_0 \\ 1 & \text{if } \theta = \theta_0 \end{cases}$$

$$\ell(\theta, 0) = \begin{cases} 0 & \text{if } \theta = \theta_0 \\ c & \text{if } \theta \neq \theta_0 \end{cases}$$

Bayes decision: choose $a = 1$ if

$$E[\ell(\theta, 1) \mid Y] < E[\ell(\theta, 0) \mid Y]$$

$$P(\theta = \theta_0 \mid Y) < c\,[1 - P(\theta = \theta_0 \mid Y)]$$

$$P(\theta = \theta_0 \mid Y) < c/(1 + c)$$
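The rejection rule can be written out directly. The sketch below computes the two posterior expected losses and confirms that comparing them is the same as comparing the posterior probability of $H_0$ with $c/(1+c)$. Function names and the probabilities tried are hypothetical.

```python
def expected_losses(p_null, c):
    """Posterior expected loss of rejecting (a = 1) and of accepting (a = 0) H0."""
    loss_reject = 1.0 * p_null          # loss 1 when H0 is true but rejected
    loss_accept = c * (1.0 - p_null)    # loss c when H0 is false but accepted
    return loss_reject, loss_accept

def bayes_rejects(p_null, c):
    """Reject H0 when P(theta = theta_0 | Y) < c / (1 + c)."""
    return p_null < c / (1.0 + c)

# p_null stands for a posterior probability P(theta = theta_0 | Y).
for p_null in (0.05, 0.40, 0.90):
    reject, accept = expected_losses(p_null, c=1.0)
    print(p_null, bayes_rejects(p_null, c=1.0), reject < accept)
```

With $c = 1$ the threshold is 0.5, so the Bayes rule rejects whenever $H_0$ is less likely than not. A larger $c$ (accepting a false null is costlier) raises the threshold and makes rejection easier.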

The hypothesis test

reject $H_0$ if $T(Y) > t$

is said to be inadmissible if there exists an alternative test

reject $H_0$ if $S(Y) > s$

such that:

(1) for every $\theta = \theta_0$,

$$\int_{T(Y) > t} p(Y \mid \theta)\, dY \ge \int_{S(Y) > s} p(Y \mid \theta)\, dY$$

(2) for every $\theta \neq \theta_0$,

$$\int_{T(Y) > t} p(Y \mid \theta)\, dY \le \int_{S(Y) > s} p(Y \mid \theta)\, dY$$

(3) there is some $\theta$ for which the inequality in either (1) or (2) is strict

I. Bayesian econometrics

C. Statistical decision theory

D. Large sample results

Goal of this section: A Bayesian is doing something with the data. How would a classical econometrician describe what that is?

1. Background: The Kullback-Leibler information inequality

Claim:

$$\log x \le x - 1$$

with equality only if $x = 1$

[Figure: plot illustrating the claim; horizontal axis from 0 to 5, vertical axis from -5 to 4]

Implication:

$$E \log x \le E(x) - 1$$

with equality only if $x = 1$ with probability 1

Application of claim to case of discrete parameter space and discrete random variables

$\theta \in \{\theta_1, \ldots, \theta_J\}$

$\theta_0$ = true value

$y_t \in \{1, \ldots, I\}$

Define

$$x(y_t, \theta_j) = \frac{P(Y = y_t \mid \theta = \theta_j)}{P(Y = y_t \mid \theta = \theta_0)}$$

This is a random variable (because $y_t$ is random) that with probability $P(Y = i \mid \theta = \theta_0)$ takes on the value $P(Y = i \mid \theta = \theta_j) / P(Y = i \mid \theta = \theta_0)$

$$E_{\theta_0}[x(y_t, \theta_j)] = \sum_{i=1}^{I} \frac{P(Y = i \mid \theta = \theta_j)}{P(Y = i \mid \theta = \theta_0)}\, P(Y = i \mid \theta = \theta_0) = \sum_{i=1}^{I} P(Y = i \mid \theta = \theta_j) = 1$$

$$E_{\theta_0}[\log x(y_t, \theta_j)] = \sum_{i=1}^{I} \log\left[ \frac{P(Y = i \mid \theta = \theta_j)}{P(Y = i \mid \theta = \theta_0)} \right] P(Y = i \mid \theta = \theta_0) = E_{\theta_0} \log \frac{p(y_t \mid \theta_j)}{p(y_t \mid \theta_0)}$$

The claim

$$E \log x \le E(x) - 1$$

implies for this case that

$$E_{\theta_0} \log \frac{p(y_t \mid \theta_j)}{p(y_t \mid \theta_0)} \le 1 - 1 = 0$$

with equality only if $p(y_t \mid \theta_j) = p(y_t \mid \theta_0)\ \forall y_t$

Kullback-Leibler information inequality:

$$E_{\theta_0} \log \frac{p(y_t \mid \theta)}{p(y_t \mid \theta_0)} \le 0$$

with equality only if $\theta = \theta_0$
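A quick numerical check of the inequality for a discrete outcome $y \in \{1, 2, 3\}$. The probability vectors below are hypothetical; `expected_log_ratio` is an illustrative helper that evaluates $E_{\theta_0} \log[p(y \mid \theta)/p(y \mid \theta_0)]$.

```python
import math

def expected_log_ratio(p_theta, p_true):
    """E under p_true of log[p_theta(y) / p_true(y)] for y in {1, ..., I}."""
    return sum(q * math.log(p / q) for p, q in zip(p_theta, p_true))

p_true = [0.2, 0.5, 0.3]                       # hypothetical P(Y = i | theta_0)
candidates = {
    "same as truth": [0.2, 0.5, 0.3],
    "alternative 1": [0.3, 0.4, 0.3],
    "alternative 2": [0.6, 0.2, 0.2],
}
for name, p_theta in candidates.items():
    print(name, expected_log_ratio(p_theta, p_true))
```

The value is zero when the candidate matches the true distribution and negative otherwise, becoming more negative the further the candidate is from the truth.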

I. Bayesian econometrics

C. Statistical decision theory

D. Large sample results

1. Background: The Kullback-Leibler information inequality

2. Implications of K-L for Bayesian posterior probabilities

will illustrate how data eventually overwhelm any prior

$$p(\theta_s \mid Y) = \frac{p(\theta_s)\, p(Y \mid \theta_s)}{\sum_{j=1}^{J} p(\theta_j)\, p(Y \mid \theta_j)} = \frac{p(\theta_s) \prod_{t=1}^{T} p(y_t \mid \theta_s)}{\sum_{j=1}^{J} p(\theta_j) \prod_{t=1}^{T} p(y_t \mid \theta_j)}$$

$$= \frac{p(\theta_s) \prod_{t=1}^{T} [p(y_t \mid \theta_s) / p(y_t \mid \theta_0)]}{\sum_{j=1}^{J} p(\theta_j) \prod_{t=1}^{T} [p(y_t \mid \theta_j) / p(y_t \mid \theta_0)]}$$

$$= \frac{\exp\left\{ \log p(\theta_s) + \sum_{t=1}^{T} \log \frac{p(y_t \mid \theta_s)}{p(y_t \mid \theta_0)} \right\}}{\sum_{j=1}^{J} \exp\left\{ \log p(\theta_j) + \sum_{t=1}^{T} \log \frac{p(y_t \mid \theta_j)}{p(y_t \mid \theta_0)} \right\}}$$

LLN:

$$T^{-1} \sum_{t=1}^{T} \log \frac{p(y_t \mid \theta_s)}{p(y_t \mid \theta_0)} \xrightarrow{p} E_{\theta_0} \log \frac{p(y_t \mid \theta_s)}{p(y_t \mid \theta_0)}$$

which is $= 0$ if $\theta_s = \theta_0$ and $< 0$ if $\theta_s \neq \theta_0$

$$p(\theta_s \mid Y) \xrightarrow{p} \begin{cases} 0 & \text{if } \theta_s \neq \theta_0 \\ 1 & \text{if } \theta_s = \theta_0 \end{cases}$$

conclusion: Bayesian posterior distribution collapses to a spike at truth for i.i.d. discrete data
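The collapse of the posterior can be seen in a small simulation. The sketch below assumes i.i.d. Bernoulli data and a three-point parameter space, with a prior that deliberately puts little weight on the true value. The setup, names, and numbers are all illustrative assumptions. The posterior is computed from log prior plus log likelihood; dividing every likelihood by $p(y_t \mid \theta_0)$ as in the expression above would not change the result.

```python
import math
import random

def posterior(prior, thetas, data):
    """Posterior P(theta_s | Y) for i.i.d. Bernoulli data, computed in logs."""
    log_post = []
    for p0, theta in zip(prior, thetas):
        loglik = sum(math.log(theta if y == 1 else 1.0 - theta) for y in data)
        log_post.append(math.log(p0) + loglik)
    top = max(log_post)                              # subtract max to avoid underflow
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

thetas = [0.3, 0.5, 0.7]          # hypothetical parameter space
prior = [0.45, 0.10, 0.45]        # prior that puts little weight on the truth
true_theta = 0.5
rng = random.Random(1)
data = [1 if rng.random() < true_theta else 0 for _ in range(2000)]

for T in (0, 10, 100, 2000):
    print(T, posterior(prior, thetas, data[:T]))
```

At $T = 0$ the function returns the prior. As $T$ grows, the probability assigned to the true value should approach one even though the prior favored the other two values.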

I. Bayesian econometrics

C. Statistical decision theory

D. Large sample results

1. Background: The Kullback-Leibler information inequality

2. Implications of K-L for Bayesian posterior probabilities

3. Bayesian posterior distribution as approximation to asymptotic distribution of MLE

$$\log p(Y \mid \theta) = \sum_{t=1}^{T} \log p(y_t \mid \theta)$$

define

$$\hat{\theta}_T = \arg\max_{\theta} \log p(Y \mid \theta)$$

$$\left. \frac{\partial \log p(Y \mid \theta)}{\partial \theta} \right|_{\theta = \hat{\theta}_T} = 0$$

$$\log p(Y \mid \theta) = \log p(Y \mid \hat{\theta}_T) + \left. \frac{\partial \log p(Y \mid \theta)}{\partial \theta'} \right|_{\theta = \hat{\theta}_T} (\theta - \hat{\theta}_T) + \frac{1}{2} (\theta - \hat{\theta}_T)' \left. \frac{\partial^2 \log p(Y \mid \theta)}{\partial \theta\, \partial \theta'} \right|_{\theta = \tilde{\theta}_T} (\theta - \hat{\theta}_T)$$

$$\tilde{\theta}_T = \lambda_T \theta + (1 - \lambda_T) \hat{\theta}_T$$

$$H_t(\theta) = -\frac{\partial^2 \log p(y_t \mid \theta)}{\partial \theta\, \partial \theta'}$$

$$H(\theta) = -E\left[ \frac{\partial^2 \log p(y_t \mid \theta)}{\partial \theta\, \partial \theta'} \right]$$

$$\log p(Y \mid \theta) = \log p(Y \mid \hat{\theta}_T) - \frac{1}{2} [\sqrt{T} (\theta - \hat{\theta}_T)]' \left[ T^{-1} \sum_{t=1}^{T} H_t(\tilde{\theta}_T) \right] [\sqrt{T} (\theta - \hat{\theta}_T)]$$

$$\log p(Y \mid \theta) \approx \log p(Y \mid \hat{\theta}_T) - \frac{1}{2} [\sqrt{T} (\theta - \hat{\theta}_T)]'\, H(\theta_0)\, [\sqrt{T} (\theta - \hat{\theta}_T)]$$

$$p(\theta \mid Y) \approx k_T\, p(\theta) \exp\left\{ -(1/2) [\sqrt{T} (\theta - \hat{\theta}_T)]'\, H(\theta_0)\, [\sqrt{T} (\theta - \hat{\theta}_T)] \right\} \propto p(\theta)\, q_T(\theta)$$

$q_T(\theta)$ = kernel of $N(0, H(\theta_0)^{-1})$ density for $\sqrt{T} (\theta - \hat{\theta}_T)$

[Figure: horizontal axis $\theta$ from 0 to 10, vertical axis from 0 to 1.4. blue: $p(\theta)$, green: $q_{10}(\theta)$, red: $q_{100}(\theta)$]

[Figure: posterior distributions; horizontal axis $\theta$ from 0 to 10, vertical axis from 0 to 1.4. blue: $T = 0$, green: $T = 10$, red: $T = 100$]

Conclusions: the sequence of posterior distributions $p(\theta \mid Y_T)$ has the property

$p(\theta \mid Y_T) \xrightarrow{p} 1$ at $\theta = \theta_0$

$p(\theta \mid Y_T) \xrightarrow{p} 0$ at $\theta \neq \theta_0$

Let $\theta_T$ be sequence of random variables with distribution $p(\theta \mid Y_T)$. Then conditional on $\{Y_T\}$ we have

$$\sqrt{T} (\theta_T - \hat{\theta}_T) \xrightarrow{L} N(0, H(\theta_0)^{-1})$$

where distribution is across realizations of $\theta_T$

Contrast with classical result:

$$\sqrt{T} (\hat{\theta}_T - \theta_0) \xrightarrow{L} N(0, H(\theta_0)^{-1})$$

where distribution is across realizations of $Y_T$

Implication: calculating the Bayesian posterior distribution is a way to find the asymptotic distribution of the MLE when regularity conditions hold

$y_t \mid \mu \sim N(\mu, \sigma^2)$ ($\sigma$ known)

$\mu \sim N(m, \tau^2)$ (prior)

$\mu \mid Y \sim N(m^*, \tau^{*2})$ (posterior)

$$m^* = \frac{\sigma^2/T}{(\sigma^2/T) + \tau^2}\, m + \frac{\tau^2}{(\sigma^2/T) + \tau^2}\, \bar{y}_T$$

$$\tau^{*2} = \frac{\tau^2 \sigma^2 / T}{(\sigma^2/T) + \tau^2}$$

Conditional on $Y_T$, the variable $\mu \mid Y_T$ has a distribution characterized by

$$(\tau^*)^{-1} (\mu_T - m_T^*) \sim N(0, 1)$$

$$\frac{\sqrt{T}\, [(\sigma^2/T) + \tau^2]^{1/2}\, (\mu_T - m_T^*)}{\tau \sigma} \sim N(0, 1)$$

As $T \to \infty$:

$$\frac{\sqrt{T}}{\sigma} (\mu_T - \bar{y}_T) \xrightarrow{L} N(0, 1)$$

classical result:

$$\frac{\sqrt{T}}{\sigma} (\bar{y}_T - \mu_0) \sim N(0, 1)$$
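The limiting behavior in this Normal example can be tabulated directly from the posterior formulas. In the sketch below, `normal_posterior` is an illustrative helper, the numbers are hypothetical, and the sample mean is held fixed across sample sizes purely to isolate the effect of $T$.

```python
import math

def normal_posterior(m, tau2, sigma2, y_bar, T):
    """Posterior mean m* and variance tau*^2 for mu when sigma^2 is known."""
    s2 = sigma2 / T
    m_star = (s2 / (s2 + tau2)) * m + (tau2 / (s2 + tau2)) * y_bar
    tau2_star = tau2 * s2 / (s2 + tau2)
    return m_star, tau2_star

# Hypothetical values: prior N(0, 1), sigma^2 = 4, sample mean fixed at 2.5.
m, tau2, sigma2, y_bar = 0.0, 1.0, 4.0, 2.5
for T in (1, 10, 100, 10_000):
    m_star, tau2_star = normal_posterior(m, tau2, sigma2, y_bar, T)
    # m* should approach y_bar and sqrt(T) * tau* should approach sigma
    print(T, m_star, math.sqrt(T * tau2_star))
```

As $T$ grows the prior mean loses its influence on $m^*$, and the posterior standard deviation scaled by $\sqrt{T}$ approaches $\sigma$, matching the scale of the classical result for $\bar{y}_T$.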

I. Bayesian econometrics

C. Statistical decision theory

D. Large sample results

E. Diffuse priors

Interpretations:

(1) Start with finite $\tau$, calculate posterior, and consider limiting properties of sequence as $\tau \to \infty$

(2) Start with $\tau = \infty$?

$$p(\mu) = \frac{1}{\sqrt{2\pi}\, \tau} \exp\left[ -\frac{(\mu - m)^2}{2 \tau^2} \right]$$

limit as $\tau \to \infty$ is not a density

(3) Just use kernels?

$$p(Y \mid \mu) \propto \exp\left[ -\frac{\mu^2 - 2 \mu \bar{y}}{2 (\sigma^2 / T)} \right]$$

$p(\mu) \propto 1$? (diffuse prior?)

implies

$$p(\mu \mid Y) \propto \exp\left[ -\frac{\mu^2 - 2 \mu \bar{y}}{2 (\sigma^2 / T)} \right]$$

$$\mu \mid Y \sim N(\bar{y}, \sigma^2 / T)$$

gives the correct answer in this case

But $p(\mu) \propto 1$ is not a proper density for $\mu \in \mathbb{R}^1$

$p(\mu) \propto 1$ is called an "improper" prior

In this case, it gave us the correct answer. In other cases it can fail (with either analytical or numerical methods)

Another problem with the improper prior $p(\theta) \propto 1$ is that it is not invariant with respect to reparameterization.

Example: $T = 1$

$$p(y_1 \mid \mu; \sigma) = \frac{1}{\sqrt{2\pi}\, \sigma} \exp\left[ -\frac{(y_1 - \mu)^2}{2 \sigma^2} \right]$$

If parameter of interest is $\sigma^{-1}$ and $p(\sigma^{-1}) \propto 1$ then

$$p(\sigma^{-1} \mid y_1; \mu) \propto \frac{1}{\sigma} \exp\left[ -\frac{(y_1 - \mu)^2}{2 \sigma^2} \right]$$

With the constant of proportionality needed to ensure $\int_0^{\infty} p(\sigma^{-1} \mid y_1; \mu)\, d\sigma^{-1} = 1$, this is

$$p(\sigma^{-1} \mid y_1; \mu) = \frac{(y_1 - \mu)^2}{\sigma} \exp\left[ -\frac{(y_1 - \mu)^2}{2 \sigma^2} \right]$$

Suppose instead parameter of interest is taken to be $\sigma^{-2}$ and prior is $p(\sigma^{-2}) \propto 1$

$$p(\sigma^{-2} \mid y_1; \mu) \propto \frac{1}{(\sigma^2)^{1/2}} \exp\left[ -\frac{(y_1 - \mu)^2}{2 \sigma^2} \right]$$

$$p(\sigma^{-2} \mid y_1; \mu) = \frac{[(y_1 - \mu)^2]^{3/2}}{\sqrt{2\pi}\, (\sigma^2)^{1/2}} \exp\left[ -\frac{(y_1 - \mu)^2}{2 \sigma^2} \right]$$

(a $\Gamma(3/2, (y_1 - \mu)^2 / 2)$ distribution)

Problem:

$$P(\sigma^{-1} > 1 \mid y_1; \mu) = \int_1^{\infty} p(\sigma^{-1} \mid y_1; \mu)\, d\sigma^{-1} \neq \int_1^{\infty} p(\sigma^{-2} \mid y_1; \mu)\, d\sigma^{-2} = P(\sigma^{-2} > 1 \mid y_1; \mu)$$

Issue: if $\omega = g(\theta)$ then

$$p_{\omega}(w) = p_{\theta}(g^{-1}(w)) \left| \frac{d g^{-1}(w)}{dw} \right|$$

Conclusion: the "improper priors" $p(\sigma^{-1}) \propto 1$ and $p(\sigma^{-2}) \propto 1$ represent different prior beliefs
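The disagreement between the two flat priors can be put in numbers. Writing $d = (y_1 - \mu)^2$, the first posterior has density $d\, w \exp(-d w^2/2)$ in $w = \sigma^{-1}$, whose upper tail at 1 is $\exp(-d/2)$. The second is a Gamma distribution with shape 3/2 and rate $d/2$, whose upper tail can be written with the complementary error function. The function names and the value of $d$ below are illustrative assumptions.

```python
import math

def prob_inv_sigma_above_one(d):
    """P(sigma^{-1} > 1 | y1) under the flat prior on sigma^{-1}."""
    # Integral from 1 to infinity of d*w*exp(-d*w^2/2) dw.
    return math.exp(-d / 2.0)

def prob_inv_sigma2_above_one(d):
    """P(sigma^{-2} > 1 | y1) under the flat prior on sigma^{-2}."""
    # Upper tail of Gamma(shape 3/2, rate d/2) at 1:
    # erfc(sqrt(x)) + 2*sqrt(x/pi)*exp(-x), with x = d/2.
    x = d / 2.0
    return math.erfc(math.sqrt(x)) + 2.0 * math.sqrt(x / math.pi) * math.exp(-x)

d = 1.0   # hypothetical value of (y1 - mu)^2
print(prob_inv_sigma_above_one(d), prob_inv_sigma2_above_one(d))
```

The events $\sigma^{-1} > 1$ and $\sigma^{-2} > 1$ are the same event, yet the two posteriors assign it different probabilities.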


Question: which (if either) should be called a “diffuse prior” corresponding to complete uncertainty?

Jeffreys prior:

$$p(\theta) \propto [h(\theta)]^{1/2}$$

$$h(\theta) = -\int_{\mathbb{R}^T} \frac{\partial^2 \log p(y \mid \theta)}{\partial \theta^2}\, p(y \mid \theta)\, dy$$

for $y \in \mathbb{R}^T$

Example: if $\theta = \sigma^{-1}$

$$\log p(y \mid \theta) = -(T/2) \log 2\pi + T \log \sigma^{-1} - (1/2) \sum_{t=1}^{T} (y_t - \mu)^2 (\sigma^{-1})^2$$

$$\partial \log p(y \mid \theta) / \partial \theta = T / \sigma^{-1} - \sum_{t=1}^{T} (y_t - \mu)^2 \sigma^{-1}$$

$$\partial^2 \log p(y \mid \theta) / \partial \theta^2 = -T / \sigma^{-2} - \sum_{t=1}^{T} (y_t - \mu)^2$$

$$-E[\partial^2 \log p(y \mid \theta) / \partial \theta^2] = T \sigma^2 + T \sigma^2 = 2 T \sigma^2$$

$$p(\theta) \propto [h(\theta)]^{1/2} \Rightarrow p(\sigma^{-1}) \propto \sigma$$

If we instead take $\theta = \sigma^{-2}$:

$$\log p(y \mid \theta) = -(T/2) \log 2\pi + (T/2) \log \sigma^{-2} - (1/2) \sum_{t=1}^{T} (y_t - \mu)^2 \sigma^{-2}$$

$$\partial \log p(y \mid \theta) / \partial \theta = T / (2 \sigma^{-2}) - (1/2) \sum_{t=1}^{T} (y_t - \mu)^2$$

$$\partial^2 \log p(y \mid \theta) / \partial \theta^2 = -T / (2 \sigma^{-4})$$

$$-E[\partial^2 \log p(y \mid \theta) / \partial \theta^2] = (T/2)\, \sigma^4$$

$$p(\theta) \propto [h(\theta)]^{1/2} \Rightarrow p(\sigma^{-2}) \propto \sigma^2$$

Advantage of Jeffreys prior:

Probabilities implied by $p(\sigma^{-1} \mid Y; \mu)$ derived from $p(\sigma^{-1}) \propto \sigma$ are identical to those implied by $p(\sigma^{-2} \mid Y; \mu)$ derived from $p(\sigma^{-2}) \propto \sigma^2$
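The invariance claim can be checked numerically in the simplest case. The sketch below assumes $T = 1$ and writes $d = (y_1 - \mu)^2$. It integrates each Jeffreys posterior over the region above 1 with a midpoint rule and divides by the known normalizing constant. The helper `midpoint_integral`, the truncation points, and the value of $d$ are all assumptions made for illustration.

```python
import math

def midpoint_integral(f, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

d = 1.0   # hypothetical value of (y1 - mu)^2, with T = 1

# Parameter w = sigma^{-1}: Jeffreys prior is proportional to 1/w and the
# likelihood to w*exp(-d*w^2/2), so the posterior kernel is exp(-d*w^2/2).
post_w = lambda w: math.exp(-d * w * w / 2.0)
norm_w = math.sqrt(math.pi / (2.0 * d))               # integral over (0, inf)
p_w = midpoint_integral(post_w, 1.0, 12.0) / norm_w   # P(sigma^{-1} > 1 | y1)

# Parameter v = sigma^{-2}: Jeffreys prior is proportional to 1/v and the
# likelihood to v^{1/2}*exp(-d*v/2), so the kernel is v^{-1/2}*exp(-d*v/2).
post_v = lambda v: v ** -0.5 * math.exp(-d * v / 2.0)
norm_v = math.sqrt(2.0 * math.pi / d)                 # integral over (0, inf)
p_v = midpoint_integral(post_v, 1.0, 80.0) / norm_v   # P(sigma^{-2} > 1 | y1)

print(p_w, p_v)
```

Up to numerical integration error the two probabilities coincide, unlike the case where a flat prior is placed on each parameterization.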

Note: for the Normal-gamma prior

$$p(\sigma^{-2}) = \frac{(\lambda/2)^{N/2}}{\Gamma(N/2)}\, (\sigma^{-2})^{(N/2) - 1} \exp\left[ -\frac{\lambda \sigma^{-2}}{2} \right]$$

we characterized the diffuse prior as $N \to 0$, $\lambda \to 0$ or

$$p(\sigma^{-2}) \propto \sigma^2$$

Concerns about Jeffreys prior: does not seem to represent "prior ignorance" in many examples

My recommendation:

Use improper prior $p(\theta) \propto 1$ or Jeffreys prior only for guidance, checking results, or in cases where operation is well understood.

Use mildly informative prior to avoid all problems.