Estimation and Hypothesis Testing
The Investment Decision
What would you like to know?
  What will be the return on my investment? Not possible to know.
  What is the PDF for the return?
    Use statistics to estimate the correct PDF. This can be done for discrete PDFs. For continuous PDFs, it is beyond the scope of this course.
      1) Assume the normal PDF.
      2) Use statistics to estimate E[r] and $\sigma$.
The Game of Investing
Suppose you're offered the chance to play a game:
  Cost: $1.25
  Flip a coin
  If heads, you get $2 (return is 60%)
  If tails, you get $0 (return is -100%)
  The coin may be biased
Assume the true PDF for this investment is

$$ f(r) = \begin{cases} 0.70 & \text{for } r = 60\% \\ 0.30 & \text{for } r = -100\% \end{cases} \qquad E[r] = 12\%, \quad Var[r] = 0.5376 $$

Of course, this information is unknown!
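As a quick check of the stated moments, the following Python sketch computes E[r] and Var[r] directly from the two-outcome PDF. The variable names are illustrative and not part of the original slides.

```python
# Two-outcome PDF for the coin-flip game: return -> probability.
pdf = {0.60: 0.70, -1.00: 0.30}

# E[r] is the probability-weighted average of the outcomes.
expected_return = sum(r * p for r, p in pdf.items())

# Var[r] is the probability-weighted average squared deviation from E[r].
variance = sum(p * (r - expected_return) ** 2 for r, p in pdf.items())

print(round(expected_return, 4))  # 0.12
print(round(variance, 4))         # 0.5376
```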
The Game of Investing
You would like to know:
  What will the coin flip turn out to be? Not possible to know.
  What is the PDF (the probability of getting heads or tails)? Not possible to know with certainty: need to estimate.
  What are E[r] and $\sigma$? Not possible to know perfectly: need to estimate.
How to estimate? Why not flip the coin a few times?
Random Sample
Suppose you flip the coin 10 times and get
  H, H, T, H, H, H, H, H, H, T
  8 heads, 2 tails
This is an independent random sample:
  A sample of observations generated by the same PDF.
  Independent: one outcome does not affect the others.
  Nothing other than the PDF determines how observations are chosen. No "cherry picking": picking certain observations you like and eliminating others.
How do we use this information to estimate
  f(heads)
  E[r]
Estimators
Estimator: a function of the outcomes of a random sample.
  An estimator is a random variable.
  It has its own PDF!
Let $\hat{\mu}$ be an estimator of E[r].
Two important properties of estimators:
  Unbiased: $E[\hat{\mu}] = E[r]$
  Consistent: as the number of observations becomes large, $Var[\hat{\mu}] \to 0$
Suppose we use the following rule to get our estimator of E[r]:
  For any random sample, choose the first observation as our estimate of E[r].
  Call this estimator r1.
Estimator of E[r]
Is r1 unbiased?
For any random sample of any size, r1 is simply a random variable governed by the PDF

$$ f(r) = \begin{cases} 0.70 & \text{for } r = 60\% \\ 0.30 & \text{for } r = -100\% \end{cases} $$

So E[r1] = 12% = E[r].
r1 is therefore an unbiased estimator of E[r].
Estimator of E[r]
Is r1 consistent?
For any random sample of any size, r1 is simply a random variable governed by the PDF

$$ f(r) = \begin{cases} 0.70 & \text{for } r = 60\% \\ 0.30 & \text{for } r = -100\% \end{cases} $$

So Var[r1] = 0.5376 for any sample size.
r1 is therefore not a consistent estimator of E[r].
Estimator of E[r]
Suppose we use the following rule to get our estimator of E[r]:
  For any random sample, take the average return as our estimator of E[r].
  Call this estimator $\bar{r}$.
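To make the averaging rule concrete, here is a minimal Python sketch that applies it to the 10 flips listed earlier (8 heads, 2 tails). The variable names are illustrative.

```python
# Outcomes from the 10 flips; "H" pays a 60% return, "T" pays -100%.
flips = ["H", "H", "T", "H", "H", "H", "H", "H", "H", "T"]
returns = [0.60 if flip == "H" else -1.00 for flip in flips]

# Estimate f(heads) as the fraction of heads in the sample.
f_heads_hat = flips.count("H") / len(flips)

# Estimate E[r] as the equally weighted sample average.
r_bar = sum(returns) / len(returns)

print(f_heads_hat)       # 0.8
print(round(r_bar, 4))   # 0.28
```

With only 10 flips the estimate (28%) is well away from the true E[r] of 12%, which is why the properties of the estimator matter.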
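Estimator of E[r]
Is $\bar{r}$ unbiased?
What is $E[\bar{r}]$? Use stat rule #1.

$$ \bar{r} = \frac{r_1 + r_2 + \dots + r_n}{n} = \frac{1}{n} r_1 + \frac{1}{n} r_2 + \dots + \frac{1}{n} r_n $$

$$ E[\bar{r}] = \frac{1}{n} E[r_1] + \frac{1}{n} E[r_2] + \dots + \frac{1}{n} E[r_n] = \frac{1}{n}(.12) + \frac{1}{n}(.12) + \dots + \frac{1}{n}(.12) = n \cdot \frac{1}{n}(.12) = .12 = E[r] $$

$\bar{r}$ is therefore an unbiased estimator of E[r].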
Estimator of E[r]
Is $\bar{r}$ consistent?
How do we find the variance of a sum of random variables? We haven't learned this yet, but we will later.
We can generate random samples of estimators to get some idea of the properties of their PDF.
  For each outcome of an estimator, we need to generate a random sample of observations and then compute the estimator.
  Use the Excel spreadsheet (posted on the course website).
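The same experiment can be sketched in Python instead of a spreadsheet. This is not the course spreadsheet, only an illustration under assumptions: it uses the coin-flip PDF (60% with probability 0.70, otherwise -100%), and the function names, seed, and trial counts are arbitrary choices.

```python
import random
import statistics

def draw_return(rng):
    """One coin-flip return: 60% with probability 0.70, else -100%."""
    return 0.60 if rng.random() < 0.70 else -1.00

def simulate(n, trials, rng):
    """Return the r1 and r-bar estimates from `trials` random samples of size n."""
    first_obs, averages = [], []
    for _ in range(trials):
        sample = [draw_return(rng) for _ in range(n)]
        first_obs.append(sample[0])        # estimator r1: first observation
        averages.append(sum(sample) / n)   # estimator r-bar: sample average
    return first_obs, averages

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 100, 1000):
    r1, r_bar = simulate(n, trials=1000, rng=rng)
    # Columns: n, mean and variance of r1, mean and variance of r-bar.
    print(n,
          round(statistics.mean(r1), 3), round(statistics.pvariance(r1), 3),
          round(statistics.mean(r_bar), 3), round(statistics.pvariance(r_bar), 4))
```

In a run like this, both estimators should average close to 12%, but only the variance of the sample average should shrink as n grows; the variance of r1 stays near 0.5376.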
Comment
Why do we use probability weights when calculating E[r] from a PDF, but just an equally weighted average when we estimate E[r]?
Given the PDF above, we should expect to see heads 70% of the time in any random sample.
Assume we draw a sample of n observations in which exactly 70% are heads:
  70% of the returns in the sample will be 0.60
  30% of the returns in the sample will be -1.00
  A simple average across observations is equal to

$$ \frac{0.70n(0.60) + 0.30n(-1.00)}{n} = 0.70(0.60) + 0.30(-1.00) $$

Simple averages naturally put more weight on those outcomes which are more likely.
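A short numerical check of this point, assuming a hypothetical sample of 10 flips with exactly 7 heads:

```python
import math

n = 10        # hypothetical sample size
n_heads = 7   # exactly 70% heads
sample = [0.60] * n_heads + [-1.00] * (n - n_heads)

# Equally weighted average over the observations.
simple_average = sum(sample) / n

# Probability-weighted average over the two possible outcomes.
probability_weighted = 0.70 * 0.60 + 0.30 * (-1.00)

print(math.isclose(simple_average, probability_weighted))  # True
```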
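Estimator of Stdev[r]
How do we use data to estimate Stdev[r]?
  Var[r] = E[(r - E[r])^2] = E[r^2] - E[r]^2
  Stdev[r] = sqrt(Var[r])
Suppose we use the following rule to get our estimator of Stdev[r]:

$$ \hat{\sigma} = \left[ \frac{1}{n} \sum_i (r_i - \bar{r})^2 \right]^{1/2} = \left[ \frac{1}{n} \sum_i r_i^2 - \left( \frac{1}{n} \sum_i r_i \right)^2 \right]^{1/2} $$

Is the estimator unbiased?
Is the estimator consistent?
Use the Excel spreadsheet.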
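As an illustration, the sketch below applies a 1/n rule for the standard deviation to a sample with 8 heads and 2 tails, computing it both from squared deviations and from the "mean of squares minus squared mean" form. The function name `stdev_hat` is hypothetical.

```python
import math

def stdev_hat(returns):
    """Estimate Stdev[r] with the 1/n rule, computed in two equivalent ways."""
    n = len(returns)
    r_bar = sum(returns) / n
    by_deviations = math.sqrt(sum((r - r_bar) ** 2 for r in returns) / n)
    by_moments = math.sqrt(sum(r ** 2 for r in returns) / n - r_bar ** 2)
    return by_deviations, by_moments

# Sample with 8 heads (60%) and 2 tails (-100%).
returns = [0.60] * 8 + [-1.00] * 2
a, b = stdev_hat(returns)
print(round(a, 4), round(b, 4))  # 0.64 0.64
```

For comparison, the true Stdev[r] under the assumed PDF is sqrt(0.5376), roughly 0.733.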
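Average
The same results apply to continuous PDFs.
For a given random sample:

$$ \hat{E}[r] = \frac{1}{n} \sum_i r_i $$

is an unbiased, consistent estimator of E[r].

$$ \hat{\sigma} = \left[ \frac{1}{n-1} \sum_i (r_i - \bar{r})^2 \right]^{1/2} $$

is a consistent estimator of $\sigma$, and its square $\hat{\sigma}^2$ is an unbiased estimator of Var[r].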
Estimates
When is the average a good estimate of E[r]?
When is our estimator for standard deviation a good estimate of Stdev[r]?
  When you have a large sample of outcomes
  When the PDF doesn't change mid-sample
Stat Rules
Stat Rule 1.E
  Let x1,…,xn be a random sample of the random variable X.
  Let y1,…,yn be a random sample of the random variable Y.
  Let zi = a xi + b yi, for i=1,…,n, where a and b are constants.
  Then

$$ \bar{z} = a\bar{x} + b\bar{y} $$

Stat Rule 2.E
  Let x1,…,xn be a random sample of the random variable X.
  Let zi = a xi + c, for i=1,…,n, where a and c are constants.
  Then

$$ \hat{\sigma}_z^2 = a^2 \hat{\sigma}_x^2, \qquad \hat{\sigma}_z = |a| \hat{\sigma}_x $$
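Both rules can be checked numerically. The sketch below uses made-up samples and constants (all hypothetical) and compares each side of the two rules; `statistics.stdev` divides by n - 1, but the scaling rule holds for the 1/n version as well.

```python
import statistics

x = [0.10, -0.05, 0.20, 0.02]   # hypothetical sample of X
y = [0.03, 0.04, -0.01, 0.06]   # hypothetical sample of Y
a, b, c = 0.6, 0.4, 0.01        # hypothetical constants

# Stat Rule 1.E: the average of z = a*x + b*y equals a*mean(x) + b*mean(y).
z1 = [a * xi + b * yi for xi, yi in zip(x, y)]
lhs1 = statistics.mean(z1)
rhs1 = a * statistics.mean(x) + b * statistics.mean(y)

# Stat Rule 2.E: for z = a*x + c, the estimated standard deviation scales by |a|.
z2 = [a * xi + c for xi in x]
lhs2 = statistics.stdev(z2)
rhs2 = abs(a) * statistics.stdev(x)

print(abs(lhs1 - rhs1) < 1e-12, abs(lhs2 - rhs2) < 1e-12)
```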
Estimated Sharpe Ratio
The Sharpe Ratio may be estimated as

$$ \frac{\hat{E}[r] - r_f}{\hat{\sigma}} $$

where we use the yield on a T-bill as a proxy for the risk-free rate $r_f$.
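A minimal Python sketch of this estimate follows. The return series and the T-bill yield are hypothetical numbers chosen only for illustration, and `estimated_sharpe` is an illustrative name; the standard deviation uses the n - 1 divisor.

```python
import statistics

def estimated_sharpe(returns, risk_free_rate):
    """(sample mean - risk-free rate) / sample standard deviation."""
    return (statistics.mean(returns) - risk_free_rate) / statistics.stdev(returns)

returns = [0.12, -0.04, 0.20, 0.08, 0.03]  # hypothetical annual returns
t_bill_yield = 0.03                         # hypothetical T-bill yield
print(round(estimated_sharpe(returns, t_bill_yield), 3))
```

The returns and the risk-free rate should cover the same time period (for example, both annual) for the ratio to be meaningful.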
Time and E[r] and Stdev[r]
E[r] and Stdev[r] have a unit of time attached to them.
  E[r]=10% over a year is much different than E[r]=10% over a day.
  $\sigma$[r]=0.16 over a year is much different than $\sigma$[r]=0.16 over a day.
Let p denote a "short" time period (e.g., a month).
Let P denote a "long" time period (e.g., a year).
Let N denote the number of "short" time periods in a "long" time period (e.g., 12).
Let Ep[r] and $\sigma$p[r] be the appropriate parameters over the short time period.
Let EP[r] and $\sigma$P[r] be the appropriate parameters over the long time period.
Then, to a close approximation,

$$ E_P[r] = N \, E_p[r], \qquad \sigma_P[r] = \sqrt{N} \, \sigma_p[r] $$
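As a worked example of the usual scaling approximation (mean scales with N, standard deviation with the square root of N, which relies on returns being uncorrelated across short periods), here is a small Python sketch. The monthly inputs are hypothetical and the function name is illustrative.

```python
import math

def scale_to_long_period(mean_short, stdev_short, n_periods):
    """Approximate long-period E[r] and Stdev[r] from short-period values."""
    return n_periods * mean_short, math.sqrt(n_periods) * stdev_short

# Hypothetical monthly parameters: E[r] = 1%, Stdev[r] = 5%.
annual_mean, annual_stdev = scale_to_long_period(0.01, 0.05, 12)
print(round(annual_mean, 4), round(annual_stdev, 4))  # 0.12 0.1732
```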
How Good Are the Estimates?
Does the E[r] for a stock meet some pre-determined benchmark?
  You can't observe the PDF to calculate the true E[r].
  Over the past 10 years, returns have been as follows:

  Year   Return
  1999    0.06
  2000    0.57
  2001    0.46
  2002    0.15
  2003   -0.18
  2004    0.57
  2005    0.35
  2006   -0.15
  2007    0.13
  2008   -0.13

  From this you estimate E[r] to be 18.3%.
  Is this enough information to reject the hypothesis that the true E[r] for the PDF that generated this sample is 10%?
Hypothesis Testing
Null Hypothesis
  The hypothesis to be tested
  E[r] = 10%
Alternative Hypothesis
  E[r] ≠ 10%
Hypothesis Testing
c = distance of test statistic from null hypothesis that defines zone of acceptance.
Hypothesis Testing
Standard Practice: Choose c so that we know the probability of making a Type I error (rejecting the null hypothesis when it is true).
Hypothesis Tests of the Mean
Let (X1, X2, …, Xn) be an independent random sample from any PDF
  True mean = $\mu$
  True standard deviation = $\sigma$
What PDF governs the outcome for the sample average?
The laws of statistics say that the sample average is approximately
  Normally distributed
  True mean = $\mu$
  Standard deviation = $\sigma/\sqrt{n}$
The standard deviation of the sample average is called the "standard error".
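One way to see the standard error at work is a small simulation. The sketch below is only an illustration under assumptions: it draws repeated samples from the coin-flip PDF used earlier (60% with probability 0.70, otherwise -100%, so Var[r] = 0.5376), and the sample size, trial count, and seed are arbitrary.

```python
import math
import random
import statistics

rng = random.Random(0)          # fixed seed so the run is repeatable
sigma = math.sqrt(0.5376)       # true Stdev[r] of the coin-flip game
n, trials = 50, 5000            # hypothetical sample size and number of samples

sample_means = []
for _ in range(trials):
    sample = [0.60 if rng.random() < 0.70 else -1.00 for _ in range(n)]
    sample_means.append(sum(sample) / n)

print(round(statistics.pstdev(sample_means), 4))  # simulated standard error
print(round(sigma / math.sqrt(n), 4))             # sigma / sqrt(n)
```

The two printed numbers should be close, and a histogram of `sample_means` would look roughly bell-shaped even though each individual return takes only two values.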
Hypothesis Testing
Null Hypothesis
  E[r] = 10%
Alternative Hypothesis
  E[r] ≠ 10%
From the 1999-2008 sample of annual returns:
  $\bar{r} = 0.183$
  $\hat{\sigma}/\sqrt{n} = 0.0921$
Assuming the null is true, the sample mean is approximately normally distributed with
  mean = 0.10
  standard error = 0.0921
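The sample average and standard error quoted here can be reproduced from the ten annual returns. The sketch below uses the n - 1 divisor for the standard deviation, which is the version that matches the 0.0921 figure.

```python
import math
import statistics

# Annual returns for 1999 through 2008.
returns = [0.06, 0.57, 0.46, 0.15, -0.18, 0.57, 0.35, -0.15, 0.13, -0.13]

r_bar = statistics.mean(returns)
sigma_hat = statistics.stdev(returns)                  # divides by n - 1
standard_error = sigma_hat / math.sqrt(len(returns))   # sigma_hat / sqrt(n)

print(round(r_bar, 3))            # 0.183
print(round(standard_error, 4))   # 0.0921
```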
Normal Distribution
Hypothesis Testing
For any normally distributed random variable, there is only a 5% probability of getting an outcome above $\mu + 1.96\sigma$ or below $\mu - 1.96\sigma$.
Assuming the null is true, there is only a 5% chance of drawing a sample average above 0.10 + 1.96*(0.0921) = 0.2805 or below 0.10 - 1.96*(0.0921) = -0.0805.
If the sample average is above 0.2805 or below -0.0805, we therefore conclude that it's too unlikely (<5%) that we would observe such an outcome, given the null is true.
Hence, we conclude that the null is not true. Reject H0.
Hypothesis Testing
The sample average we observe is 18.3%.
This is in the zone of acceptance.
Do not reject the null hypothesis.
"We cannot reject the hypothesis that the true mean is 10% with 95% confidence."
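The decision can be retraced in a few lines of Python. This is a sketch using the numbers from the example (null mean 0.10, standard error 0.0921, sample average 0.183); the function name `zone_of_acceptance` is illustrative.

```python
def zone_of_acceptance(null_mean, standard_error, z=1.96):
    """Two-sided 5% zone of acceptance around the null mean."""
    c = z * standard_error
    return null_mean - c, null_mean + c

lower, upper = zone_of_acceptance(0.10, 0.0921)
sample_average = 0.183

# Reject only if the sample average falls outside the zone.
reject = not (lower <= sample_average <= upper)

print(round(lower, 4), round(upper, 4))  # -0.0805 0.2805
print(reject)                            # False
```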
Hypothesis Tests of the Mean
Example
  Hypothesis: $\mu$ = 2%
  Alternative: $\mu$ ≠ 2%
100 years of stock market returns
  Sample average = 16%
  Standard deviation = 0.18
Hence, the standard error is 0.18/sqrt(100) = 0.18/10 = 0.018.
Hypothesis Tests of the Mean
c = 1.96*0.018 = 0.03528
The zone of acceptance is 0.02 ± 0.03528, and the sample average of 0.16 lies far outside it.
Assuming the null hypothesis is true, it is too unlikely that you would observe the actual sample mean. Reject the null hypothesis.
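T-statistic
Let $\mu_0$ be the mean under the null hypothesis and $\sigma_{\bar{X}}$ the standard error of the sample average $\bar{X}$.

$$ \bar{X} > \mu_0 + 1.96\,\sigma_{\bar{X}} \ \text{ implies } \ \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} > 1.96 $$

$$ \bar{X} < \mu_0 - 1.96\,\sigma_{\bar{X}} \ \text{ implies } \ \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} < -1.96 $$

$(\bar{X} - \mu_0)/\sigma_{\bar{X}}$ is called the "t-statistic".
If |"t-statistic"| > 1.96, then you can reject the null hypothesis that $\mu = \mu_0$.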
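As a worked example, the sketch below computes the t-statistic for the two cases discussed in this deck: the 10-year return sample (sample mean 0.183, null 0.10, standard error 0.0921) and the 100-year example (sample mean 0.16, null 0.02, standard error 0.018). The function name is illustrative.

```python
def t_statistic(sample_mean, null_mean, standard_error):
    """Distance of the sample mean from the null mean, in standard errors."""
    return (sample_mean - null_mean) / standard_error

t_ten_years = t_statistic(0.183, 0.10, 0.0921)   # 10-year return sample
t_century = t_statistic(0.16, 0.02, 0.018)       # 100-year example

for t in (t_ten_years, t_century):
    print(round(t, 2), "reject" if abs(t) > 1.96 else "do not reject")
# 0.9 do not reject
# 7.78 reject
```

Both conclusions agree with the zone-of-acceptance approach, since comparing |t| with 1.96 is the same test expressed in units of the standard error.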