
Estimation and Hypothesis Testing

The Investment Decision

What would you like to know?

What will be the return on my investment? Not possible to know in advance.

The PDF for the return: use statistics to estimate the correct PDF. This can be done for discrete PDFs. For continuous PDFs, estimating the full PDF is beyond the scope of this course. Instead:

1) Assume the normal PDF.

2) Use statistics to estimate E[r] and $\sigma[r]$.

The Game of Investing

Suppose you’re offered the chance to play a game:

Cost: $1.25
Flip a coin
If heads, you get $2 (return is 60%)
If tails, you get $0 (return is -100%)
The coin may be biased

Assume the true pdf for this investment is

$$
f(r) =
\begin{cases}
0.70 & \text{for } r = 60\% \\
0.30 & \text{for } r = -100\%
\end{cases}
\qquad E[r] = 12\%, \quad Var[r] = 0.5376
$$

Of course, this information is unknown!
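As a check on the numbers quoted for this pdf, the following minimal sketch recomputes E[r] and Var[r] directly from the two outcomes and their probabilities. The variable names are illustrative and not part of the original slides.

```python
# Discrete pdf of the coin-flip game as assumed on the slide:
# each possible return maps to its probability.
pdf = {0.60: 0.70, -1.00: 0.30}

# E[r] is the probability-weighted average of the possible returns.
expected_return = sum(r * p for r, p in pdf.items())

# Var[r] is the probability-weighted average squared deviation from E[r].
variance = sum(p * (r - expected_return) ** 2 for r, p in pdf.items())

print(f"E[r]   = {expected_return:.4f}")  # 0.1200
print(f"Var[r] = {variance:.4f}")         # 0.5376
```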

The Game of Investing

You would like to know:

What will the coin flip turn out to be? Not possible to know.

What is the PDF (probability of getting heads/tails)? Not possible to know with certainty: need to estimate.

E[r] and $\sigma[r]$: Not possible to know perfectly: need to estimate.

How to estimate? Why not flip the coin a few times?

Random Sample

Suppose you flip the coin 10 times and get

H, H, T, H, H, H, H, H, H, T
8 heads, 2 tails

This is an independent random sample:
A sample of observations generated by the same pdf
Independent: One outcome does not affect others
Nothing other than the pdf determines how observations are chosen. No “cherry picking”: picking certain observations you like, and eliminating others

How do we use this information to estimate f(heads) and E[r]?

Estimators

Estimator: a function of the outcomes for a random sample.

An estimator is a random variable. It has its own PDF!

Let $\hat{\mu}$ be an estimator of E[r].

Two important properties of estimators:

Unbiased: $E[\hat{\mu}] = E[r]$

Consistent: As the number of observations we observe becomes large, $Var[\hat{\mu}] \to 0$

Suppose we use the following rule to get our estimator of E[r]: For any random sample, choose the first observation as our estimate of E[r]. Call this estimator r1.

Estimator of E[r]

Is r1 unbiased?

For any random sample of any size, r1 is simply a random variable governed by the pdf

$$
f(r) =
\begin{cases}
0.70 & \text{for } r = 60\% \\
0.30 & \text{for } r = -100\%
\end{cases}
$$

So E[r1] = 12% = E[r].

r1 is therefore an unbiased estimator of E[r].

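Estimator of E[r]

Is r1 consistent?

For any random sample of any size, r1 is simply a random variable governed by the pdf

$$
f(r) =
\begin{cases}
0.70 & \text{for } r = 60\% \\
0.30 & \text{for } r = -100\%
\end{cases}
$$

So Var[r1] = 0.5376 for any sample size; it does not shrink toward 0 as the sample grows.

r1 is therefore not a consistent estimator of E[r].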

Estimator of E[r]

Suppose we use the following rule to get our estimator of E[r]:

For any random sample, take the average return as our estimator of E[r].

Call this estimator $\bar{r}$.

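Estimator of E[r]

Is $\bar{r}$ unbiased?

What is $E[\bar{r}]$? Use stat rule #1.

$$
\bar{r} = \frac{r_1 + r_2 + \dots + r_n}{n} = \frac{1}{n} r_1 + \frac{1}{n} r_2 + \dots + \frac{1}{n} r_n
$$

$$
\begin{aligned}
E[\bar{r}] &= \frac{1}{n} E[r_1] + \frac{1}{n} E[r_2] + \dots + \frac{1}{n} E[r_n] \\
&= \frac{1}{n}(.12) + \frac{1}{n}(.12) + \dots + \frac{1}{n}(.12) \\
&= \frac{n}{n}(.12) = (.12) = E[r]
\end{aligned}
$$

$\bar{r}$ is therefore an unbiased estimator of E[r].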

Estimator of E[r]

Is $\bar{r}$ consistent?

How do we find the variance of a sum of random variables? We haven’t learned this yet, but we will later.

We can generate random samples of estimators to get some idea of the properties of their PDF.

For each outcome for an estimator, we need to generate a random sample of observations, and then compute the estimator.

Use Excel Spreadsheet (posted on course website)
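The course spreadsheet is not reproduced here. As a stand-in, the simulation it describes could be sketched in Python roughly as follows: for each trial, draw a fresh random sample from the coin-flip pdf, compute the estimator, and then look at the mean and variance of the estimator across trials. The function names, the seed, and the number of trials are illustrative assumptions.

```python
import random
import statistics

def draw_sample(n, rng, p_heads=0.70):
    """Draw n independent returns from the coin-flip pdf on the slides."""
    return [0.60 if rng.random() < p_heads else -1.00 for _ in range(n)]

def first_observation(sample):
    """The estimator r1: use only the first observation."""
    return sample[0]

def sample_average(sample):
    """The estimator r-bar: the equally weighted average return."""
    return statistics.fmean(sample)

def simulate(estimator, n, trials, rng):
    """Return the mean and variance of an estimator across many samples."""
    outcomes = [estimator(draw_sample(n, rng)) for _ in range(trials)]
    return statistics.fmean(outcomes), statistics.pvariance(outcomes)

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (10, 100, 1000):
    for name, estimator in (("r1", first_observation), ("r_bar", sample_average)):
        results[(name, n)] = simulate(estimator, n, trials=1000, rng=rng)
        mean, var = results[(name, n)]
        print(f"n={n:5d}  {name:5s}  mean={mean:7.4f}  variance={var:7.4f}")
```

In a run of this sketch, both estimators should average out near 0.12, but only the variance of the sample average should fall as n grows. The variance of r1 should stay near 0.5376 at every sample size.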

Comment

Why do we use probability weights when calculating E[r] from a pdf, but when we estimate E[r] we just use an equally weighted average?

Given the PDF above, we should expect in any random sample to see heads about 70% of the time.

Assume we draw a sample of n observations where exactly 70% are heads:
70% of the returns in the sample will be 0.60
30% of the returns in the sample will be -1.00
A simple average across observations is equal to

$$
\frac{0.70\,n\,(0.60) + 0.30\,n\,(-1.00)}{n} = 0.70\,(0.60) + 0.30\,(-1.00)
$$

Simple averages naturally put more weight on those outcomes which are more likely.

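Estimator of Stdev[r]

How do we use data to estimate Stdev[r]?

$$
Var[r] = E\big[(r - E[r])^2\big] = E[r^2] - E[r]^2
$$

$$
Stdev[r] = \sqrt{Var[r]}
$$

Suppose we use the following rule to get our estimator of Stdev[r]:

$$
\hat{\sigma} = \left[ \frac{1}{n} \sum_i (r_i - \bar{r})^2 \right]^{1/2}
= \left[ \frac{1}{n} \sum_i r_i^2 - \left( \frac{1}{n} \sum_i r_i \right)^2 \right]^{1/2}
$$

Is the estimator unbiased?
Is the estimator consistent?
Use Excel spreadsheet.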
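To make the rule concrete, here is a small sketch that applies a 1/n standard deviation estimator, written in its two algebraically equivalent forms (average squared deviation, and mean of squares minus squared mean), to the ten coin flips from the Random Sample slide. The function names are illustrative. The use of 1/n rather than 1/(n-1) is an assumption based on how the slide's formula appears to read.

```python
import math

def stdev_from_deviations(returns):
    """Square root of the average squared deviation from the sample mean."""
    n = len(returns)
    mean = sum(returns) / n
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / n)

def stdev_from_moments(returns):
    """Same estimator written as mean of squares minus squared mean."""
    n = len(returns)
    mean = sum(returns) / n
    mean_of_squares = sum(r * r for r in returns) / n
    return math.sqrt(mean_of_squares - mean ** 2)

# The ten flips from the Random Sample slide: 8 heads, 2 tails.
flips = "HHTHHHHHHT"
sample = [0.60 if flip == "H" else -1.00 for flip in flips]

sigma_hat_a = stdev_from_deviations(sample)
sigma_hat_b = stdev_from_moments(sample)
print(f"{sigma_hat_a:.4f} {sigma_hat_b:.4f}")  # 0.6400 0.6400
```

For this particular sample the estimate is 0.64, while the true value is sqrt(0.5376), roughly 0.733. One sample says nothing about bias on its own, which is why the slide asks for a simulation over many samples.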

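Average

The same results apply to continuous PDFs.

For a given random sample:

$$
\hat{E}[r] = \frac{1}{n} \sum_i r_i \quad \text{is an unbiased, consistent estimator of } E[r].
$$

$$
\hat{\sigma} = \left[ \frac{1}{n-1} \sum_i (r_i - \bar{r})^2 \right]^{1/2} \quad \text{is an unbiased, consistent estimator of } \sigma.
$$

Strictly, it is $\hat{\sigma}^2$ that is exactly unbiased for Var[r]; $\hat{\sigma}$ itself is consistent and only slightly biased in small samples.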

Estimates

When is the average a good estimate of E[r]?
When is our estimator for standard deviation a good estimate of Stdev[r]?

When you have a large sample of outcomes
When the PDF doesn’t change mid-sample

Stat Rules

Stat Rule 1.E

Let x1,…,xn be a random sample of the random variable X.

Let y1,…,yn be a random sample of the random variable Y.

Let zi = a xi + b yi, for i=1,…,n, where a and b are constants. Then

$$
\bar{z} = a\,\bar{x} + b\,\bar{y}
$$

Stat Rule 2.E

Let x1,…,xn be a random sample of the random variable X.

Let zi = a xi + c, for i=1,…,n, where a and c are constants. Then

$$
\hat{\sigma}_z^2 = a^2\,\hat{\sigma}_x^2, \qquad \hat{\sigma}_z = |a|\,\hat{\sigma}_x
$$
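Both rules are easy to check numerically. The sketch below assumes the rules read "the mean of a x + b y is a times the mean of x plus b times the mean of y" and "the standard deviation of a x + c is |a| times the standard deviation of x". The data and the constants a, b, c are hypothetical and chosen only for illustration.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def stdev(xs):
    """Sample standard deviation (1/n convention; the rule also holds with n-1)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

# Hypothetical samples and constants, for illustration only.
x = [0.06, 0.57, 0.46, 0.15, -0.18]
y = [0.02, 0.03, 0.01, 0.04, 0.05]
a, b, c = 0.6, 0.4, 0.01

# Stat Rule 1.E: z_i = a*x_i + b*y_i  implies  mean(z) = a*mean(x) + b*mean(y)
z1 = [a * xi + b * yi for xi, yi in zip(x, y)]
rule1_gap = mean(z1) - (a * mean(x) + b * mean(y))

# Stat Rule 2.E: z_i = a*x_i + c  implies  stdev(z) = |a| * stdev(x)
z2 = [a * xi + c for xi in x]
rule2_gap = stdev(z2) - abs(a) * stdev(x)

print(f"rule 1 gap: {rule1_gap:.2e}")
print(f"rule 2 gap: {rule2_gap:.2e}")
```

Both gaps should be zero up to floating-point rounding. Note that the added constant c shifts the mean but has no effect on the standard deviation.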

Estimated Sharpe Ratio

The Sharpe Ratio may be estimated as

$$
\widehat{\text{Sharpe Ratio}} = \frac{\hat{E}[r] - r_f}{\hat{\sigma}}
$$

where we use the yield on a t-bill as a proxy for the risk-free rate $r_f$.
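A minimal sketch of this calculation, assuming the estimate is the sample average minus the risk-free rate, divided by the sample standard deviation. It uses the ten annual returns (1999 to 2008) that appear later in these slides. The t-bill yield of 3% is a hypothetical placeholder, not a figure from the source.

```python
import statistics

def estimated_sharpe_ratio(returns, risk_free_rate):
    """(sample average return - risk-free rate) / sample standard deviation."""
    mean_return = statistics.fmean(returns)
    stdev_return = statistics.stdev(returns)  # uses n-1 in the denominator
    return (mean_return - risk_free_rate) / stdev_return

# Annual returns for 1999-2008, taken from the hypothesis-testing example.
annual_returns = [0.06, 0.57, 0.46, 0.15, -0.18, 0.57, 0.35, -0.15, 0.13, -0.13]
t_bill_yield = 0.03  # hypothetical proxy for the risk-free rate

sharpe = estimated_sharpe_ratio(annual_returns, t_bill_yield)
print(f"estimated Sharpe ratio: {sharpe:.3f}")
```

The returns and the t-bill yield must be measured over the same time period (here, one year) for the ratio to be meaningful.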

Time and E[r] and Stdev[r]

E[r] and Stdev[r] have a unit of time attached to them.

E[r]=10% over a year is much different than E[r]=10% over a day.

$\sigma[r]$=0.16 over a year is much different than $\sigma[r]$=0.16 over a day.

Let p denote a “short” time period (e.g., a month).
Let P denote a “long” time period (e.g., a year).
Let N denote the number of “short” time periods in a “long” time period (e.g., 12).
Let $E_p[r]$ and $\sigma_p[r]$ be the appropriate parameters over the short time period.
Let $E_P[r]$ and $\sigma_P[r]$ be the appropriate parameters over the long time period.

Then to a close approximation,

$$
E_P[r] = N \, E_p[r]
$$

$$
\sigma_P[r] = \sqrt{N} \, \sigma_p[r]
$$
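As an illustration of the scaling rule, the sketch below converts monthly parameters to annual ones with N = 12. The monthly figures are hypothetical. The square-root rule for the standard deviation relies on returns in different short periods being roughly independent, which is an assumption stated here rather than on the slide.

```python
import math

def scale_to_long_period(mean_short, stdev_short, periods):
    """Approximate long-period E[r] and Stdev[r] from short-period values."""
    return periods * mean_short, math.sqrt(periods) * stdev_short

# Hypothetical monthly parameters, for illustration only.
monthly_mean, monthly_stdev = 0.01, 0.05

annual_mean, annual_stdev = scale_to_long_period(monthly_mean, monthly_stdev, 12)
print(f"annual E[r]     ~ {annual_mean:.4f}")   # 0.1200
print(f"annual Stdev[r] ~ {annual_stdev:.4f}")  # 0.1732
```

The mean grows in proportion to N while the standard deviation grows only with the square root of N, so a 5% monthly standard deviation corresponds to about 17%, not 60%, per year.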

How Good Are the Estimates?

Does the E[r] for a stock meet some pre-determined benchmark?

You can’t observe the PDF to calculate the true E[r].

Over the past 10 years, returns have been as follows:

Year    Return
1999     0.06
2000     0.57
2001     0.46
2002     0.15
2003    -0.18
2004     0.57
2005     0.35
2006    -0.15
2007     0.13
2008    -0.13

From this you estimate E[r] to be 18.3%.

Is this enough information to reject the hypothesis that the true E[r] for the PDF that generated this sample is 10%?

Hypothesis Testing

Null Hypothesis
The hypothesis to be tested
E[r] = 10%

Alternative Hypothesis
E[r] ≠ 10%

Hypothesis Testing

c = distance of test statistic from null hypothesis that defines zone of acceptance.

Hypothesis Testing

Standard Practice: Choose c so that we know the probability of making a type 1 error (rejecting the null hypothesis when it is actually true).

Hypothesis Tests of the Mean

Let (X1, X2, …, Xn) be an independent random sample from any PDF with
True mean = $\mu$
True standard deviation = $\sigma$

What PDF governs the outcome for the sample average?

The laws of statistics say that the sample average is approximately
Normally distributed
True mean = $\mu$
Standard deviation = $\sigma / \sqrt{n}$

The standard deviation of the sample average is called the “standard error”.
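The standard error formula can be checked by simulation. The sketch below draws many samples of size n = 100 from the coin-flip pdf used earlier, then compares the spread of the sample averages with the true standard deviation divided by the square root of n. It also counts how often the sample average lands within 1.96 standard errors of the true mean. The seed, sample size, and number of trials are illustrative choices.

```python
import math
import random
import statistics

P_HEADS, WIN, LOSS = 0.70, 0.60, -1.00
TRUE_MEAN = P_HEADS * WIN + (1 - P_HEADS) * LOSS
TRUE_STDEV = math.sqrt(P_HEADS * (WIN - TRUE_MEAN) ** 2
                       + (1 - P_HEADS) * (LOSS - TRUE_MEAN) ** 2)

def sample_average(n, rng):
    """Average return across n independent plays of the game."""
    draws = [WIN if rng.random() < P_HEADS else LOSS for _ in range(n)]
    return statistics.fmean(draws)

rng = random.Random(1)  # fixed seed so the run is repeatable
n, trials = 100, 5000
averages = [sample_average(n, rng) for _ in range(trials)]

standard_error = TRUE_STDEV / math.sqrt(n)   # theoretical value
simulated_sd = statistics.pstdev(averages)   # spread actually observed
share_within = sum(abs(avg - TRUE_MEAN) <= 1.96 * standard_error
                   for avg in averages) / trials

print(f"theoretical standard error: {standard_error:.4f}")
print(f"simulated std of averages:  {simulated_sd:.4f}")
print(f"share within 1.96 SE:       {share_within:.3f}")
```

The two spread figures should agree closely, and the share within 1.96 standard errors should be near 95%. It will not be exactly 95%, because the underlying pdf is discrete and the normal distribution is only an approximation.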

Hypothesis Testing

Null Hypothesis
E[r] = 10%

Alternative Hypothesis
E[r] ≠ 10%

From the ten annual returns for 1999-2008:

$$
\bar{r} = 0.183, \qquad \hat{\sigma} / \sqrt{n} = 0.0921
$$

Assuming the null is true, the sample mean is approximately normally distributed with mean 0.10 and standard deviation (standard error) 0.0921.

Normal Distribution

Hypothesis Testing

For any normally distributed random variable, there is only a 5% probability of getting an outcome above $\mu + 1.96\sigma$ or below $\mu - 1.96\sigma$.

Assuming the null is true, there is only a 5% chance of drawing a sample average above 0.10 + 1.96*(0.0921) = 0.2805 or below 0.10 - 1.96*(0.0921) = -0.0805.

If the sample average is above 0.2805 or below -0.0805, we therefore conclude that it is too unlikely (<5%) that we would observe such an outcome if the null were true.

Hence, we conclude that the null is not true. Reject H0.

Hypothesis Testing

The sample average we observe is 18.3%.
This is in the zone of acceptance.
Do not reject the null hypothesis.
“We cannot reject the hypothesis that the true mean is 10% with 95% confidence.”
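The test above can be reproduced from the raw returns. The sketch below assumes the standard error is the sample standard deviation (with n-1 in the denominator) divided by the square root of n, which matches the 0.0921 quoted on the slides. The variable names are illustrative.

```python
import math
import statistics

# Annual returns from the ten-year example.
returns = {1999: 0.06, 2000: 0.57, 2001: 0.46, 2002: 0.15, 2003: -0.18,
           2004: 0.57, 2005: 0.35, 2006: -0.15, 2007: 0.13, 2008: -0.13}

null_mean = 0.10       # E[r] under the null hypothesis
critical_value = 1.96  # cutoff for a 5% two-sided test under normality

values = list(returns.values())
n = len(values)
sample_mean = statistics.fmean(values)
standard_error = statistics.stdev(values) / math.sqrt(n)

# c is the half-width of the zone of acceptance around the null mean.
c = critical_value * standard_error
lower, upper = null_mean - c, null_mean + c
reject = not (lower <= sample_mean <= upper)

print(f"sample mean    = {sample_mean:.4f}")
print(f"standard error = {standard_error:.4f}")
print(f"zone of acceptance: [{lower:.4f}, {upper:.4f}]")
print("reject H0" if reject else "do not reject H0")
```

The bounds agree with the slide's 0.2805 and -0.0805 up to rounding, since the slide rounds the standard error to 0.0921 before multiplying.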

Hypothesis Tests of the Mean

Example
Hypothesis: $\mu = 2\%$
Alternative: $\mu \neq 2\%$

100 years of stock market returns

Sample Average = 16%
Standard Deviation = 0.18

Hence, the standard error is 0.18/√100 = 0.18/10 = 0.018

c = 1.96*0.018 = 0.03528

The zone of acceptance is 0.02 ± 0.03528, that is, from -0.01528 to 0.05528. The sample average of 0.16 lies well outside it.

Assuming the null hypothesis is true, it is too unlikely that you would observe the actual sample mean. Reject the null hypothesis.

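T-statistic

$$
\bar{X} > \mu_0 + 1.96\,\sigma_{\bar{X}} \quad \text{implies} \quad \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} > 1.96
$$

$$
\bar{X} < \mu_0 - 1.96\,\sigma_{\bar{X}} \quad \text{implies} \quad \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} < -1.96
$$

$\dfrac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}$ is called the "t-statistic", where $\mu_0$ is the mean under the null hypothesis and $\sigma_{\bar{X}}$ is the standard error.

If |"t-statistic"| > 1.96, then you can reject the null hypothesis that $\mu = \mu_0$.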
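Applied to the two examples in these slides, the t-statistic rule could be sketched as follows. The function names are illustrative. The inputs are the rounded figures quoted on the slides.

```python
def t_statistic(sample_mean, null_mean, standard_error):
    """Distance of the sample mean from the null, in standard errors."""
    return (sample_mean - null_mean) / standard_error

def reject_null(t_stat, critical_value=1.96):
    """Two-sided test at the 5% level under the normal approximation."""
    return abs(t_stat) > critical_value

# Ten-year stock example: sample mean 18.3%, null 10%, standard error 0.0921.
t_stock = t_statistic(0.183, 0.10, 0.0921)

# 100-year market example: sample mean 16%, null 2%, standard error 0.018.
t_market = t_statistic(0.16, 0.02, 0.018)

print(f"stock:  t = {t_stock:.2f}, reject = {reject_null(t_stock)}")    # t = 0.90, reject = False
print(f"market: t = {t_market:.2f}, reject = {reject_null(t_market)}")  # t = 7.78, reject = True
```

Both conclusions match the zone-of-acceptance approach, as they must, since the two are algebraically the same test.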
