Standard Errors in CF
Robin Greenwood
Empirical Topics in Corporate Finance
March 2011
Panel regressions
• “Clustering is a Cambridge sickness”
– Tuomo Vuolteenaho
Panel Regressions
• Hard to disentangle issues associated with specification design from issues related to standard errors
• Today all about SEs
Lang, Ofek, Stulz (1995)
• Leverage and Investment (panel regression)
– OLS, p-values, no firm fixed effects
Opler, Pinkowitz, Stulz, Williamson (1999)
• Corporate Cash Holdings
• White SE, or Fama-MacBeth, plus FE regs
Stulz and Williamson (2003)
• Culture, Openness, and Finance
• Back to OLS country regressions
Helwege, Pirinsky, and Stulz
• Why do firms become widely held?
• Pooled OLS: Depvar = 5% drop in ownership
Stulz and Fahlenbrach (2009)
• Changes in q and changes in ownership, cluster
Doidge, Karolyi, and Stulz
• Why has IPO activity picked up everywhere but for the US?
Fama-MacBeth
• Workhorse empirical method in modern finance
• Used to deal with panels where there is a high degree of cross-sectional correlation, but not much time correlation
• Makes sense to use this when describing returns
– Mostly random
– Correlated across firms
– Some time-dependence, however, at least in the expected return component
• Vastly overused
– Together with “portfolio” approaches
Panel Analysis
[Figure: N × T panel of X_it and Y_it (N across, T down); Fama-MacBeth runs one cross-sectional regression per period]
$Y_{it} = a_t + b_t X_{it} + \varepsilon_{it}$
Panel Analysis
[Figure: same N × T panel, now with the Fama-MacBeth standard error of the average slope]
$SE = \sigma(b_t) / \sqrt{T}$
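The procedure in the figure can be sketched in a few lines. This is a minimal illustration, assuming a single regressor and a made-up balanced panel; the function names and the simulated data are illustrative and do not come from the slides.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit with an intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def fama_macbeth(panel):
    """panel maps each period t to a list of (x_it, y_it) pairs.

    Returns (average slope, Fama-MacBeth SE, t-stat).
    """
    slopes = []
    for t in sorted(panel):
        x, y = zip(*panel[t])
        slopes.append(ols_slope(x, y))           # b_t from the period-t cross-section
    T = len(slopes)
    b_bar = statistics.fmean(slopes)
    se = statistics.stdev(slopes) / math.sqrt(T)  # sigma(b_t) / sqrt(T)
    return b_bar, se, b_bar / se


# Hypothetical toy panel: 20 periods, 100 firms, one common shock per period.
rng = random.Random(0)
panel = {}
for t in range(20):
    common = rng.gauss(0, 1)                      # time effect shared by all firms
    panel[t] = []
    for _ in range(100):
        x = rng.gauss(0, 1)
        panel[t].append((x, 1.0 + 0.5 * x + common + rng.gauss(0, 1)))

b_bar, se, t_stat = fama_macbeth(panel)
print(f"b = {b_bar:.3f}, SE = {se:.3f}, t = {t_stat:.2f}")
```

Because each period gets its own intercept, the common shock is absorbed period by period, which is why the method suits data with a time effect.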
Watch out for persistence
• Good scenario for the b_t's:
• Bad scenario:
Simple fix: Modify using Newey-West
In my experience, approximately doubles the SEs
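A minimal sketch of the Newey-West fix applied to the time series of b_t estimates, assuming Bartlett weights and a hand-picked lag length. The function name, the lag choice, and the simulated AR(1) series are illustrative assumptions, not taken from the slides.

```python
import math
import random


def newey_west_se_of_mean(series, lags):
    """Newey-West (Bartlett kernel) standard error of the sample mean.

    With lags=0 this is the usual sd / sqrt(T), using the 1/T variance convention.
    """
    T = len(series)
    mean = sum(series) / T
    dev = [v - mean for v in series]

    def gamma(j):
        # Sample autocovariance at lag j.
        return sum(dev[t] * dev[t - j] for t in range(j, T)) / T

    long_run_var = gamma(0)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1.0)           # Bartlett weight
        long_run_var += 2.0 * weight * gamma(j)
    return math.sqrt(long_run_var / T)


# Hypothetical "bad scenario": persistent slopes, AR(1) with coefficient 0.8.
rng = random.Random(1)
b_t, state = [], 0.0
for _ in range(200):
    state = 0.8 * state + rng.gauss(0, 0.1)
    b_t.append(0.5 + state)

naive = newey_west_se_of_mean(b_t, lags=0)
adjusted = newey_west_se_of_mean(b_t, lags=5)
print(f"naive SE = {naive:.4f}, Newey-West SE = {adjusted:.4f}")
```

How much the SE grows depends on the persistence of the b_t series and on the lag length, so the factor reported by any one run should not be read as general.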
Fama & MacBeth
• Original use in asset pricing
• Stage 1: Estimate betas
• Stage 2: Estimate cross-sectional relation between returns and betas
• Stage 3: Collect your estimates and get t-stat
• Benefit: Flexible parameters, not memory intensive
• These benefits are less apparent today, yet method still popular because it’s hard to game
• Main benefit: Weights PERIODS equally
• Can get close to this by running panel and weighting by 1/N(t), but people will be suspicious
Still mostly used in Asset Pricing
• Pontiff and Woodgate, Share issuance and cross-sectional returns
• Table VI
Gong, Louis, Sun
“Earnings management following open-market repurchases”
Examples of FM from Corporate Finance
• Fama French 2002
– Testing tradeoff vs. pecking order
Main table 1965-1999
Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches
Mitch Petersen
RFS 2009
Paper’s Contribution
• Examines a variety of approaches to estimating standard errors and statistical significance in panel data sets
• Also looks at a variety of papers published from 2001-2004:
– Only 42% of papers adjusted standard errors for possible dependence in residuals.
• Many different approaches.
• Which are correct under what circumstances?
• The bar for you will be much higher
Overview
• OLS standard errors are unbiased when residuals are independent and identically distributed.
• Residuals of a given firm may be correlated across time through a persistent firm-specific effect.
– Firm effect.
• Residuals of a given year may be correlated across different firms (cross-sectional dependence).
– Time effect.
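The two cases can be written as error-component models. The notation below is a sketch following the usual textbook decomposition; the symbols are not in the slides and should be read as assumptions.

```latex
% Regression of interest
Y_{it} = X_{it}\beta + \varepsilon_{it}

% Firm effect: a component fixed within firm i
\varepsilon_{it} = \gamma_i + \eta_{it}, \qquad X_{it} = \mu_i + \nu_{it}

% Time effect: a component common to all firms in period t
\varepsilon_{it} = \delta_t + \eta_{it}, \qquad X_{it} = \zeta_t + \nu_{it}

% Share of variance due to the firm effect
\rho_\varepsilon = \sigma_\gamma^2 / \sigma_\varepsilon^2, \qquad
\rho_X = \sigma_\mu^2 / \sigma_X^2
```

OLS standard errors are only distorted when both the residual and the regressor carry the common component, which is why the simulations vary both shares.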
Paper’s Approach
• Simulate data that has either a firm effect or a time effect.
• Test various estimation techniques.
• See how they deal with the simulated data.
• Then take the regression approaches to actual data and compare them.
Firm Fixed Effects
• Assumption of OLS is that the residual cross-product matrix has non-zero entries only on the diagonal.
• Figure 1 – Example of a firm effect.
– Cluster standard errors by firm.
[Figure 1: residual cross-product matrix with rows and columns ordered Firm 1 date 1, Firm 1 date 2, …, Firm 2 date 1, Firm 2 date 2; regions labeled “Time for this firm” and “Time for other firms”]
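OLS vs. Clustering by Firm vs. FM with Firm Effect
• Simulate 5000 samples with 5000 observations.
– 500 firms and ten years of observations.
• Let the residual and independent variable variance due to the firm effect vary between 0 and 75%.
How do you do this?
NUM_FIRMS = 500; NUM_YEARS = 10; NUM_BOTH = NUM_FIRMS*NUM_YEARS;
variation_X = 0.5; variation_E = 0.5;        % assumed example shares of variance due to the firm effect
c_f = repelem((1:NUM_FIRMS)', NUM_YEARS);    % assumed: firm index of each observation
X_g = normrnd(0,1,[NUM_FIRMS 1]);            % firm component of X
X_i = normrnd(0,1,[NUM_BOTH 1]);             % idiosyncratic component of X
E_g = normrnd(0,2,[NUM_FIRMS 1]);            % firm component of the residual
E_i = normrnd(0,2,[NUM_BOTH 1]);             % idiosyncratic component of the residual
X = sqrt(variation_X)*X_g(c_f) + sqrt(1-variation_X)*X_i;
E = sqrt(variation_E)*E_g(c_f) + sqrt(1-variation_E)*E_i;
• 500 clusters by firm.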
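A sketch of a single draw from this experiment, comparing the conventional OLS standard error with one clustered by firm. It assumes a true slope of 1, one regressor, and no small-sample correction in the clustered variance; all function names are illustrative.

```python
import math
import random


def simulate_panel(n_firms, n_years, share_x, share_e, beta=1.0, seed=0):
    """Rows of (firm, year, x, y); share_x and share_e are the fractions of the
    variance of x and of the residual that come from a fixed firm component."""
    rng = random.Random(seed)
    rows = []
    for firm in range(n_firms):
        x_firm = rng.gauss(0, 1)                  # firm component of X (sd 1)
        e_firm = rng.gauss(0, 2)                  # firm component of the residual (sd 2)
        for year in range(n_years):
            x = math.sqrt(share_x) * x_firm + math.sqrt(1 - share_x) * rng.gauss(0, 1)
            e = math.sqrt(share_e) * e_firm + math.sqrt(1 - share_e) * rng.gauss(0, 2)
            rows.append((firm, year, x, beta * x + e))
    return rows


def slope_and_ses(rows):
    """OLS slope, its conventional SE, and its SE clustered by firm."""
    n = len(rows)
    mx = sum(r[2] for r in rows) / n
    my = sum(r[3] for r in rows) / n
    xd = [r[2] - mx for r in rows]                # demeaned regressor
    sxx = sum(v * v for v in xd)
    b = sum(v * (r[3] - my) for v, r in zip(xd, rows)) / sxx
    resid = [(r[3] - my) - b * v for v, r in zip(xd, rows)]
    se_ols = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    firm_sums = {}                                # sum of x*e within each firm
    for v, e, r in zip(xd, resid, rows):
        firm_sums[r[0]] = firm_sums.get(r[0], 0.0) + v * e
    se_firm = math.sqrt(sum(s * s for s in firm_sums.values())) / sxx
    return b, se_ols, se_firm


rows = simulate_panel(500, 10, share_x=0.5, share_e=0.5)
b, se_ols, se_firm = slope_and_ses(rows)
print(f"b = {b:.3f}, OLS SE = {se_ols:.4f}, firm-clustered SE = {se_firm:.4f}")
```

Repeating this over many seeds and comparing the average reported SE with the standard deviation of the slope estimates would mimic the structure of the simulation described above.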
OLS vs. Clustering on Firm vs. FM with Firm Effect
• Table 1 – Compare average coefficients, st. dev. of coefficient estimates, % significant, average SE clustered and % significant with clustered SE.
– Vary how much of the independent variable variation is due to firm effect and how much of the residual variation is due to firm effect.
• Figure 2 – Compare OLS, Clustered by firm, and Fama-MacBeth.
• Table 2 – Fama-MacBeth
Table 1
Table 1
• Why is the true standard error increasing as we ramp up the firm effect?
$\sqrt{1 + (T-1)\rho_X \rho_\varepsilon} = \sqrt{1 + 9\rho_X \rho_\varepsilon} = \sqrt{1 + 9 \times 0.5 \times 0.5} \approx 1.8$
$(0.0508 / 0.0283) \approx 1.8$!
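The arithmetic on this slide can be checked directly. The helper below is a sketch assuming the inflation factor is the square root of 1 + (T-1) times the two firm-effect shares; the function name is illustrative.

```python
import math


def se_inflation(T, rho_x, rho_e):
    """Ratio of the true SE to the OLS SE under a pure firm effect."""
    return math.sqrt(1.0 + (T - 1) * rho_x * rho_e)


# Ten years per firm, half of each variance due to the firm effect.
print(round(se_inflation(10, 0.5, 0.5), 3))   # 1.803
# Ratio of the two standard errors quoted on the slide.
print(round(0.0508 / 0.0283, 3))              # 1.795
```

If either share is zero the factor is 1, so OLS standard errors are fine when only the residual or only the regressor has a firm component.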
Table 2
OLS vs. Clustering by Time vs. FM with Time Effect
• Simulate 5000 samples with 5000 observations.
• Let the residual and independent variable variance due to the time effect vary between 0 and 75%.
• Note this is the situation that FM was developed for.
• Clustering will be by the 10 years.
OLS vs. Clustering by Time vs. FM with Time Effect
• Table 3 – Compare OLS and Clustering by time.
• OLS does a pretty poor job.
• Table 4 – Using FM to estimate.
Table 3
Table 4
Lit Review
• Petersen points out many papers which regress persistent firm characteristics on other persistent firm characteristics. Both OLS and FM standard errors will be biased here
– Fama and French 2001 (DivPayer on M/B, size, etc)
– M/B on firm chars
• Pastor and Veronesi, Kemsley and Nissim
– Capital structure regressions
• Baker and Wurgler 2002; Fama and French 2002; Johnson 2003
Lit Review
• Obnoxious
• Wu (2004) “FM method accounts for the lack of independence because of multiple yearly observations per company”
• Denis, Denis, Yost (2002) “pooling of cross-sectional and TS data in our tests creates a lack of independence in the regression models… to address the importance of this bias, we estimate the regression model separately for each of the 14 years…”
• Choe, Bong-Chan, and Stulz (2005) “The FM regressions take into account the cross-correlations and the serial correlation in the error term, so that the t-stats are more conservative”
OLS vs. Clustering by Time vs. FM with Firm and Time Effect
• In many typical examples, could have both a firm and time effect.
• Figure 6, typical structure with both.
• Can cluster by firm and time together.
– See Samuel Thompson’s 2006 working paper for math.
– We’ll cover this later today
Figure 6
OLS vs. Clustering by Time vs. FM with Firm and Time Effect
• Simulate 5000 samples with 5000 observations.
• Let the residual and independent variable variance due to firm and time effect vary
• Table 5 – Compare OLS, with and without firm dummies, Clustered by firm and time, GLS, and FM.
Real Data
• Table 6 – Look at asset pricing application.
– Equity returns on asset tangibility.
– Different methods matter.
– OLS and firm clusters do poorly.
– Time and firm clustering and FM work well.
– Seems to say that returns may be more affected by a time effect.
Real Data
• Table 7 – Capital structure regressions.
– OLS, clustering by time, and FM do poorly.
– Clustering by firm and clustering by firm and time do well.
– Says that within corporate finance a lot of the effects seem to have firm level persistence.
Table 6
Table 7
Recommendations
• Think about the structure of the panel data.
• What is the likely source of dependence?
• Comparing different methods may provide additional information about the research question.
• Starting point should probably be double clustering by firm and year
Samuel Thompson
• Simple formulas for standard errors that cluster by both firm and time (JFE 2011)
• Basic formula:
$Var(\hat{\beta}) = \hat{V}_{firm} + \hat{V}_{time,0} - \hat{V}_{white,0}$
• This means you can do it in STATA
• Email Sam Hanson or go to Mitch Petersen’s website, there is pre-packaged code to do this
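The basic formula adds the firm-clustered and time-clustered variances and subtracts the White (heteroskedasticity-robust) variance, which would otherwise be counted twice. Below is a minimal sketch for a single regressor with an intercept, assuming no persistent common shocks and no small-sample corrections; the function names and toy data are illustrative.

```python
import math
from collections import defaultdict


def clustered_var(xd, resid, keys, sxx):
    """Sandwich variance of the slope, clustering on the given keys."""
    sums = defaultdict(float)
    for v, e, k in zip(xd, resid, keys):
        sums[k] += v * e                          # sum of x*e within each cluster
    return sum(s * s for s in sums.values()) / (sxx * sxx)


def double_cluster(firms, times, x, y):
    """Returns (slope, double-clustered SE, (V_firm, V_time, V_white))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    xd = [v - mx for v in x]
    sxx = sum(v * v for v in xd)
    b = sum(v * (w - my) for v, w in zip(xd, y)) / sxx
    resid = [(w - my) - b * v for v, w in zip(xd, y)]
    v_firm = clustered_var(xd, resid, firms, sxx)
    v_time = clustered_var(xd, resid, times, sxx)
    v_white = clustered_var(xd, resid, range(n), sxx)   # each observation is its own cluster
    var = v_firm + v_time - v_white
    # The combination is not guaranteed to be positive in small samples.
    se = math.sqrt(var) if var > 0 else float("nan")
    return b, se, (v_firm, v_time, v_white)


# Hypothetical 2-firm, 2-period toy data.
firms = [0, 0, 1, 1]
times = [0, 1, 0, 1]
x = [-1, -1, 1, 1]
y = [-1, -3, 1, 3]
b, se, parts = double_cluster(firms, times, x, y)
print(b, se, parts)   # 2.0 0.5 (0.0, 0.5, 0.25)
```

In practice the pre-packaged routines mentioned above are the safer choice; this sketch only shows how the three pieces combine.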
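Firm effects, time effects, and persistent common shocks
• Firm effects: errors have arbitrary correlation across time for a given firm
$E(\varepsilon_{it}\varepsilon_{ik} \mid x_{it}, x_{ik}) \neq 0$ for $t \neq k$
• Time effects: errors have arbitrary correlation across firms at a moment in time
$E(\varepsilon_{it}\varepsilon_{jt} \mid x_{it}, x_{jt}) \neq 0$ for $i \neq j$
• Persistent common shocks: correlation between different firms at different dates, but these shocks die out over time (after L periods)
$E(\varepsilon_{it}\varepsilon_{jk} \mid x_{it}, x_{jk}) \neq 0$ for $i \neq j$ and $|t-k| \leq L$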
Single vs. Double Cluster
• Bias largest when the time and firm dimensionality of the dataset is approximately the same
• If you have ten firms and 1000 time periods, biggest bias reduction?
– Cluster by firm
When does double clustering work?
• Monte Carlos suggest that double clustering is pretty good for N & T greater than 25
• To allow for persistent common shocks, need T>100
Application to industry Profitability
• Hypothesis: Profitability is higher in more concentrated industries
• Unit of observation: Industry-year
• 434 industries, 43 years
• “Clustering makes a big difference when both the error and the regressor are correlated within the clustering dimension”
Application to industry Profitability
• Average ROA varies across time but not across industries
• HHI varies a lot between industries, but not much over time within an industry
• Double clustering gives you a conservative estimate in both cases
Thompson – summary
• More and more papers are using this double clustering; it will probably become the de facto standard