Cahier de recherche 2018-03
Estimating panel data fixed and random effects with application to the new Fama-French model using GMM robust
instruments
François-Éric Racicot
Telfer School of Management, University of Ottawa, Ottawa, ON K1N 6N5, Canada. Affiliate Research Fellow, IPAG Business School, Paris, France
Chaire d’information financière et organisationnelle, ESG-UQAM
E-mail: [email protected]
William F. Rentz
Telfer School of Management, University of Ottawa, Ottawa, ON K1N 6N5, Canada
Raymond Théoret
École des sciences de la gestion, Université du Québec à Montréal
Chaire d’information financière et organisationnelle, ESG-UQAM
May 9, 2018
Abstract
We investigate the five-factor Fama-French (2015) model using a GMM robust instrumental variables technique
comparing panel data fixed and random effects approaches. We rely on an improved Hausman artificial regression
to test for measurement errors. We also study a six-factor model that adds the Pástor-Stambaugh (2003) illiquidity
risk factor. While the fixed effects model is the most used in practice, we find that the random effects model is the
most appropriate for the data sample we resort to. Our fixed and random effects panel data approaches using
robust instrumental variables strongly suggest that the only consistently significant factor is the market risk factor.
Keywords: fixed and random effects; GMM; higher moment instruments; illiquidity; Fama-French five-factor model
JEL classification: C10; G11
Robust GMM estimation of panel data fixed and random effects:
The case of the new Fama and French model
Summary
We investigate the new five-factor model of Fama and French (FF, 2015, 2016), augmented with a well-known liquidity measure (Pástor and Stambaugh, 2003), using a modification of GMM that relies on robust instruments, within a panel data analysis. When we use the OLS estimator, our Fama-French model appears to have explanatory power for the returns of a twelve-sector portfolio. However, our panel study suggests that the only significant factor is the market risk premium, which leads us to reject the extended Fama-French model. Depending on the technique used, we find that measurement errors may be the source of this result, which would tend to support the extended Fama-French model.
Keywords: fixed and random effects; GMM; robust instruments; illiquidity; Fama and French five-factor model.
JEL classification: C10; G11.
1. Introduction
Sharpe (1964), Lintner (1965), and Mossin (1966) developed what is known as the capital asset
pricing model (CAPM). Jensen (1968) is credited with the development of alpha ($\alpha$), which he
applied to investigate the performance of mutual funds via the CAPM. Black (1972) extended
the theory of the CAPM to what is known as the zero-beta CAPM. Collectively, these ideas form
the basis of modern portfolio management and equity valuation. They have been widely
implemented over the last 50 years by academics and practitioners. There were many attempts to
extend the CAPM to a dynamic framework, such as the intertemporal CAPM (Merton, 1973) and
the consumption CAPM due to Hansen and Singleton (1982, 1984). Later Mehra and Prescott
(1985) studied the consumption CAPM to further investigate what is known as the equity
premium puzzle¹. Nevertheless, it appears that the most appealing extension of the CAPM is the
static Fama and French (FF, 1992, 1993) three-factor asset pricing model that is akin to the
arbitrage pricing theory (APT) of Ross (1976). In addition to the excess market return factor, the
FF three-factor model includes size and value factors.
According to Cochrane (2011, p.1061), there is a “zoo of new variables”. In our study,
we will restrict the factors to the original FF three factors, the profitability and investment
factors recently introduced by FF (2015), and the Pástor-Stambaugh (PS, 2003) illiquidity factor.
These risk factors appear to be the most widely recognized factors explaining the cost of equity².
All of these factors may be replicated by portfolios. If these portfolio risk factors do not span the
whole space of the unknown state factors, then specification errors could occur. Furthermore, as
noted by FF (2015, p. 2), the book/market ratio “is a noisy proxy for expected return”, which
implies potential measurement errors.
In addition to potential specification errors, some of the explanatory variables may be
highly interrelated. Cochrane (1991, 2011) used a modified version of Tobin’s (1969) Q theory
to show a link between asset prices and investment. Cochrane’s link can be modified to express a
relation between expected returns and investment³. Since Cochrane’s Q is approximated by the
market/book ratio, the FF value and investment factors are likely to be highly related.
¹ For a summary of these developments, see Campbell et al. (1997) or Cochrane (2005, 2008). Note that Hansen and Singleton (1984) also previously found the equity premium puzzle.
² See Pinto et al. (2015, ch. 2).
³ See Hou, Xue, and Zhang (2015).
The original illiquidity factor of Pástor-Stambaugh (2003) is an example of what is
considered a generated variable because it is a parameter obtained from a regression, in this case
relating stock return to its trading volume. Note that there is a portfolio version of this variable
which is the one we use. This portfolio is long in illiquid stocks and short in liquid stocks.
However, this portfolio factor is statistically indistinguishable from its original version. Although
the OLS estimator remains unbiased, generated variables can increase the variance of the OLS
estimator according to Pagan (1984, 1986)⁴ and Shanken (1992)⁵. Thus, the resulting inference
may be biased. Furthermore, Adrian et al. (2017) argue that traditional illiquidity measures are
endogenous variables, which therefore results in biased coefficients using OLS.
A powerful solution to the problems of specification and measurement errors is the
generalized method of moments (GMM) developed by Hansen (1982). However, the usefulness
of this method is questionable in the presence of weak instruments. Nelson and Startz (1990a,b);
Bound, Jaeger, and Baker (1995); and Hahn and Hausman (2003) show that the two-stage least
squares (2SLS) estimator is inconsistent when instrumental variables are weak.
Dagenais and Dagenais (1994, 1997) develop a method that creates instruments with
greater robustness. These robust instruments are generated through a Bayesian averaging
approach originally developed by Theil and Goldberger (1961). This approach employs
generalized versions of Durbin (1954) and Pal (1980) higher moment estimators. The two
principal features of this approach are i) it is parsimonious in the sense that it requires minimal
computational power and ii) it essentially minimizes a distance (d) measure. Based on this
distance notion, we refer to this approach as GMMd.
This article develops an empirical extension of Racicot (2015) that generalizes the GMMd
approach to a fixed and random effects panel data framework. In addition, we allow not only for
the Jensen performance measure to vary across individuals (sectors) but also the systematic
risk measure to vary⁶. This generalization enables us to i) evaluate the robustness of the new
five-factor FF (2015) model and ii) compare this model to a six-factor model that incorporates
the PS (2003) illiquidity risk factor.
⁴ Pagan and Ullah (1988), however, find that when estimating a regression using a generated variance regressor (e.g. from GARCH), the resulting estimator will be biased.
⁵ In the two-pass regression approach, the second step uses estimated betas. These betas may be considered as generated variables. Shanken (1992) showed that the standard error from this two-step approach should be corrected. This result appears analogous to Pagan (1984, 1986).
⁶ However, note that a seemingly unrelated regression (SUR) procedure would have been more appropriate here since, in our applications, the cross-section is too small relative to the time range.
This empirical framework allows us to provide some new
insights on the effects of unobserved heterogeneity in panel data models that may compound
measurement errors if not tackled properly. One approach to removing unobserved heterogeneity
is to rely on first-differencing. In fact, this may actually worsen the situation. Arellano (2003)⁷
showed that it is only by chance that the method of first-differencing in a panel data framework
will diminish measurement errors.
Fama and MacBeth (1973) introduced a process for estimating cross-sectional regressions
and standard errors correcting for cross-sectional correlation in a panel data framework.
Cochrane (2005, p.245) showed that when the right-hand side variables are invariant through
time, the Fama-MacBeth results are equivalent to (i) the pooled regression, (ii) cross-section OLS
with standard errors corrected for cross-sectional correlations, and (iii) single cross-sectional
regression on time series averages with standard errors corrected for cross-sectional correlations.
Shanken (1992) proposed a way to correct the bias in the estimation process for the standard
errors caused by the two-pass regression approach. However, as Cochrane (2005) points out, one
way to tackle all of these problems is to use the more powerful GMM approach. One of the
virtues of our proposed generalized GMMd panel data framework is a systematic treatment of the
previous specification errors including the problem of measurement errors. To the best of our
knowledge, we are the first to use panel data for both fixed and random effects models for
estimating factor risk premiums using our new GMMd approach.
In this paper, we find, using the Jarque-Bera (1980) statistic, that the return data for the
FF 12 sectors, the FF 5 portfolio risk factors and for the PS portfolio illiquidity factor all depart
significantly from normality. In general, our results show that using OLS in panel data for fixed
or random effects models, most of the new FF portfolio risk factors are significant although the
PS portfolio illiquidity is not. However, when using the GMMd approach, we obtain a different
picture, viz., the only strongly significant risk factor is the market factor and the illiquidity factor
is weakly significant for the pooled GMMd (fixed effects). We also find significant measurement
errors for the new FF investment factor and for the PS illiquidity factor relying on our modified
artificial regression Hausman (1978) test (Hausd).
⁷ Dagenais (1994) showed that when pseudo-differencing is used to correct for autocorrelation, as in the iterative method of Cochrane and Orcutt (1949), the problem of measurement error is exacerbated.
The remainder of this article is organized as follows. Section 2 introduces an extension of
the basic fixed and random effects panel data framework in the context of errors in variables in
the new Fama-French (FF, 2015) five-factor model and the six-factor model that includes Pástor-
Stambaugh (PS, 2003) illiquidity. Section 3 incorporates the GMMd approach into the panel data
framework. Section 4 discusses our Hausd test for measurement errors. Section 5 lays out the
testing procedures for random versus fixed effects models. Section 6 interprets some descriptive
statistics of the data used in this paper. Section 7 presents our empirical results. Section 8
discusses our conclusions and suggestions for further research.
2. The Fixed and Random Effects Fama-French Models
2.1 The five- and six-risk-factor models⁸
In two influential papers, Fama and French (FF, 1992, 1993) introduced their three-factor asset
pricing model. Their idea was to improve on the explanatory power of observed equity returns.
The capital asset pricing model (CAPM) developed by Sharpe (1964), Lintner (1965), and Mossin
(1966) is known to have only modest explanatory power for individual equity returns⁹. Fama and
French improved the cost of equity calculation by adding the size ($SMB_t$) and value ($HML_t$) factors
to the CAPM excess market return factor $R_{Mt} - R_{Ft}$ to create the following three-factor model.
$$R_{it} - R_{Ft} = a_i + b_i (R_{Mt} - R_{Ft}) + s_i\,SMB_t + h_i\,HML_t + e_{it} \tag{1}$$
$SMB_t$ is the difference in returns in period t of a diversified small cap portfolio and a diversified
large (i.e. big) cap portfolio. Note that this differential return also may be proxied by computing
the difference in return of the Russell 2000 and the S&P 500 index. $HML_t$ is the difference in
returns in period t of a high book-to-market portfolio and a low book-to-market portfolio.
To further refine their model, FF (2015) introduced two additional factors, the
profitability factor $RMW_t$ and the investment factor $CMA_t$, to create the following five-factor
model¹⁰.
⁸ Note that some authors have recently considered other factors instead of illiquidity, like the momentum factor (see Barillas and Shanken, 2015). This factor is, however, not new and is well documented in the literature (see Carhart, 1997).
⁹ Several authors (e.g. Benninga, 2014) show that the explanatory power of the CAPM substantially improves when applied to a portfolio of equities.
¹⁰ The data for the five FF factors and sector returns are available from http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
$$R_{it} - R_{Ft} = a_i + b_i (R_{Mt} - R_{Ft}) + s_i\,SMB_t + h_i\,HML_t + r_i\,RMW_t + c_i\,CMA_t + e_{it} \tag{2}$$
The profitability factor $RMW_t$ is the difference in returns in period t of diversified portfolios of
stocks with robust and weak profitability. The investment factor $CMA_t$ is the difference in returns
in period t of diversified portfolios of conservative and aggressive firms with respect to
investment behavior.
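As an illustration only, the time-series regression in (2) for a single sector can be sketched with ordinary least squares on simulated data. The sample size, the simulated factor returns, and the "true" coefficients below are hypothetical and are not taken from the paper's data set.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 240  # number of monthly observations (hypothetical)

# Simulated factor returns; columns: R_M - R_F, SMB, HML, RMW, CMA
factors = rng.normal(0.0, 1.0, size=(T, 5))
true_coefs = np.array([0.2, 1.1, 0.3, -0.2, 0.1, 0.05])  # a, b, s, h, r, c (hypothetical)

# Regressor matrix with a constant, and excess returns generated from (2)
X = np.column_stack([np.ones(T), factors])
excess_ret = X @ true_coefs + rng.normal(0.0, 0.5, size=T)

# OLS estimates of (a_i, b_i, s_i, h_i, r_i, c_i) and the residuals e_it
coefs, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
resid = excess_ret - X @ coefs
```

With real data, `factors` would hold the five FF factor series and `excess_ret` the sector return in excess of the risk-free rate.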
As a starting point for justifying these new factors, FF (2015) examined the market value
per share $m_t$, which is the discounted value at time t of the expected dividends per share $E(d_{t+\tau})$,
where $\tau = 1, \ldots, \infty$ and $r$ is the cost of equity.
$$m_t = \sum_{\tau=1}^{\infty} \frac{E(d_{t+\tau})}{(1+r)^{\tau}} \tag{3}$$
As FF explain, (3) can be manipulated to extract the relation between expected return and
expected profitability and between expected return and investment.
FF’s approach follows Miller and Modigliani’s (MM, 1961) approach, albeit with slightly
different notation. MM used the following expression to value the firm, $V(0)$, at time 0,
$$V(0) = \sum_{t=0}^{\infty} \frac{1}{(1+\rho)^{t+1}} \left[ X(t) - I(t) \right] \tag{4}$$
where $I(t)$ is the investment at time t, $X(t)$ is the total net profitability at time t, and $\rho$ is the
discount rate. Note that in (4), investment and profitability are explicitly considered.
Essentially, FF generalized (4) to hold at any time t, explicitly considered the expectation
operator $E(\cdot)$, and divided both sides of (4) by the book value $B_t$ of the firm at time t to obtain
(5), which illustrates why book-to-market or B/M ratios are related to the rate of return of a
financial asset.
$$\frac{M_t}{B_t} = \frac{\sum_{\tau=1}^{\infty} E\left(NI_{t+\tau} - \Delta B_{t+\tau}\right)/(1+r)^{\tau}}{B_t} \tag{5}$$
$NI_t$ is the net income for period t, $\Delta B_t = B_t - B_{t-1}$ is the change in total book value
of equity, and $r$ is the return on the financial asset. The change in book value of equity for any
period is the investment (disinvestment if negative) when a firm is all equity financed.
Macroeconomists would call this the change in capital, viz. $K_t - K_{t-1} = B_t - B_{t-1} = I(t)$ for the
period t. Note that (5) is also a proxy for Tobin’s (1969) Q, which is the market value of installed
capital divided by its replacement cost.
A firm should invest more when its marginal Q is high¹¹ in order to maximize
shareholder wealth. As the firm invests more, it will move down its investment opportunity
schedule until marginal benefit equals marginal cost. Hence, higher investment will drive down
the firm’s rate of return. Hou et al. (2015) used this argument to develop the following equation
for stock i at time t,
$$E_t[r_{it}] = \frac{E_t[\Pi_{it+1}]}{1 + a\,(I_{it}/A_{it})} \tag{6}$$
where $E_t[r_{it}]$ and $E_t[\Pi_{it+1}]$ are the conditional expected return and profitability, respectively; $a$
is a parameter for adjustment costs; $I_{it}$ is the investment; and $A_{it}$ are the firm’s productive assets.
This model is based on Lin and Zhang’s (2013) stochastic general equilibrium model in a two-
period setting, where the rate of return on investment is equated to the firm’s discount rate or
cost of capital. Equation (6) provides a rationale for the factors $RMW_t$ and $CMA_t$ in (2).
Using Bellman’s (1957) equation of dynamic programming, Abel (1983)¹² related
investment $I_t$ to Tobin’s $Q_t$¹³, the interest rate $r$, and the elasticity of investment as shown in
(7).
1
1
1t
t
QI with
r
(7)
To be more specific, Abel proposed a simple model of investment where a firm undertakes to
accumulate (reduce) its capital stock in order to maximize its discounted net revenues subject to
the constraint of a Cobb-Douglas (1938) production function and to an uncertain future price for
its product or services¹⁴. Note that (7) is consistent with (6) in the sense that investment increases
with Q and is inversely related to r.
¹¹ The marginal Q is the NPV of future cash flows generated from an additional unit of assets. Note that (6) is derived by equating the marginal benefit to the marginal cost.
¹² See also Chow (1997), who also discussed this model.
¹³ Tobin’s Q is the expected marginal revenue product of capital.
¹⁴ Instead of using the cumbersome approach of dynamic programming, Chow (1997) showed how this problem can be transformed into a simple Lagrange optimization problem.
Pástor and Stambaugh (2003) introduced a liquidity factor $LIQ_t$ to the original Fama and
French (1992) three-factor model. The Pástor-Stambaugh liquidity factor may be viewed as a
generated variable. $LIQ_t$ is an average of the stock-level coefficients $\gamma_{it}$ obtained from regression (8).
$$r_{i,d+1,t} - r_{m,d+1,t} = \theta_{it} + \phi_{it}\, r_{idt} + \gamma_{it}\,\mathrm{sign}(r_{idt} - r_{mdt})\, v_{idt} + \varepsilon_{i,d+1,t} \tag{8}$$
where $r_{idt}$ is the return of stock i on day d in month t, $r_{mdt}$ is the market return on the same day, and $v_{idt}$ is the dollar trading volume of stock
i on day d in month t. Pagan (1984, 1986)¹⁵ shows that generated variables may increase the
variance of the OLS estimator but the estimator remains unbiased. In this paper, we compare (2)
with an augmented version of this equation that includes the liquidity as a sixth factor¹⁶.
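To make the structure of regression (8) concrete, the following minimal sketch estimates $\gamma_{it}$ for one stock in one month from simulated daily data. The number of trading days, the return and volume series, and their units are hypothetical; this is not the Pástor-Stambaugh data construction.

```python
import numpy as np

rng = np.random.default_rng(7)
D = 22                                    # trading days in the month (hypothetical)
r = rng.normal(0.0, 0.02, size=D + 1)     # daily stock returns r_{i,d,t}
r_m = rng.normal(0.0, 0.01, size=D + 1)   # daily market returns r_{m,d,t}
v = rng.uniform(1.0, 5.0, size=D + 1)     # dollar volume v_{i,d,t} (hypothetical units)

excess = r - r_m
lhs = excess[1:]                          # r_{i,d+1,t} - r_{m,d+1,t}

# Regressors of (8): constant, lagged return, signed lagged volume
rhs = np.column_stack([np.ones(D), r[:-1], np.sign(excess[:-1]) * v[:-1]])
(theta, phi, gamma), *_ = np.linalg.lstsq(rhs, lhs, rcond=None)
```

The coefficient `gamma` plays the role of $\gamma_{it}$; averaging such estimates across stocks gives a liquidity measure of the kind described above.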
2.2 Fixed Effects Model¹⁷
We extend the model in (2) to a fixed effects panel data framework including the LIQ factor in
(9) below, written in a stacked vector format for the 12 FF sectors.
$$Y = R - R_F = \sum_{i=1}^{12} \alpha_i D_i + \sum_{i=1}^{12} \beta_i D_i (R_M - R_F) + s\,SMB + h\,HML + r\,RMW + c\,CMA + l\,LIQ + e \tag{9}$$
$Y' = (R_{1,1} - R_{F1}, \ldots, R_{1,T} - R_{FT}, \ldots, R_{12,1} - R_{F1}, \ldots, R_{12,T} - R_{FT})$ represents the transpose of the
stacked vector Y of excess returns for each sector. $D_i' = (0, \ldots, 0, 1, \ldots, 1, 0, \ldots, 0)$ is the transpose
of the stacked dummy variable, which is 0 everywhere except for the T observations for sector i.
$\alpha_i$ is the Jensen (1968) performance measure for sector i.
$(R_M - R_F)' = (R_{M1} - R_{F1}, \ldots, R_{MT} - R_{FT}, \ldots, R_{M1} - R_{F1}, \ldots, R_{MT} - R_{FT})$ is the transpose of the
stacked vector of excess market returns. That is, the excess market returns are stacked 12 times, once for
each sector, and the product $D_i (R_M - R_F)$ is taken element by element. $\beta_i$ is the sector i CAPM systematic risk beta. The other explanatory variables are
similarly defined. The coefficients of these other variables are 12-sector pooled coefficients. $e$ is
the stacked vector of error terms.
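The stacking in (9) can be sketched as follows. This is a minimal illustration with simulated factor series and a deliberately tiny time dimension; the array names are hypothetical and do not come from the paper.

```python
import numpy as np

N, T = 12, 4                         # 12 sectors; T kept tiny for readability (hypothetical)
rng = np.random.default_rng(1)

mkt = rng.normal(size=T)             # R_M - R_F for t = 1..T
other = rng.normal(size=(T, 5))      # SMB, HML, RMW, CMA, LIQ

# Column i of D is the stacked dummy D_i: ones on the T rows of sector i
D = np.kron(np.eye(N), np.ones((T, 1)))

mkt_stacked = np.tile(mkt, N)        # excess market return stacked N times
D_mkt = D * mkt_stacked[:, None]     # column i is D_i times (R_M - R_F), element by element
pooled = np.tile(other, (N, 1))      # regressors with pooled coefficients

# Regressor matrix of (9): N intercepts, N market betas, 5 pooled slopes
X = np.hstack([D, D_mkt, pooled])
```

Regressing the stacked excess returns on `X` then yields sector-specific $\alpha_i$ and $\beta_i$ together with pooled coefficients on the remaining factors.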
For the fixed effects (FE) model, we will need the estimate of the variance-covariance
matrix. One way to proceed is by transforming the model into its deviation from the time means.
Consider first the basic least squares dummy variable (LSDV) model:
$$Y = D\alpha + X\beta + e \tag{10}$$
¹⁵ See also Pagan and Ullah (1988) and Shanken (1992) for more information on related matters.
¹⁶ The LIQ factor is available from Pástor’s website, http://faculty.chicagobooth.edu/lubos.pastor/research/. We use the tradable LIQ factor and multiply it by 100 to put it in percentage form.
¹⁷ See Heij et al. (2004) for a parsimonious introduction to the panel data framework with EViews applications.
Following Wooldridge (2002), we can transform (10) into its deviation form by subtracting the
time mean from both sides (i.e., $\ddot{y}_{it} = y_{it} - \bar{y}_i$, $\ddot{x}_{it} = x_{it} - \bar{x}_i$ and $\ddot{e}_{it} = e_{it} - \bar{e}_i$) to obtain:
$$\ddot{Y} = \ddot{X}\beta + \ddot{e} \tag{11}$$
The fixed effects estimator is obtained by applying OLS to (11):
$$\hat{\beta}_{FE} = (\ddot{X}'\ddot{X})^{-1}\ddot{X}'\ddot{Y} \tag{12}$$
The variance-covariance matrix of this estimator is therefore identical to the OLS one except for
the fact that it is in deviation form,
$$\hat{V}_{FE}(\hat{\beta}_{FE} \mid X) = \hat{\sigma}_e^2 (\ddot{X}'\ddot{X})^{-1} \tag{13}$$
where $\hat{\sigma}_e^2 = \hat{e}'\hat{e}/(NT - N - k)$ and $\hat{e}_{it} = \ddot{y}_{it} - \ddot{x}_{it}'\hat{\beta}_{FE}$.
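A short numerical check, on simulated data, that the within estimator in (12) reproduces the slope coefficients of the LSDV regression (10), together with the variance estimate (13). The panel dimensions and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, k = 12, 50, 3
alpha = rng.normal(size=N)                    # sector-specific intercepts
beta = np.array([1.0, -0.5, 0.25])            # common slopes (hypothetical)
X = rng.normal(size=(N, T, k))
y = alpha[:, None] + X @ beta + rng.normal(scale=0.3, size=(N, T))

# Within transformation: deviations from each sector's time mean
Xd = (X - X.mean(axis=1, keepdims=True)).reshape(N * T, k)
yd = (y - y.mean(axis=1, keepdims=True)).reshape(N * T)

XtX_inv = np.linalg.inv(Xd.T @ Xd)
beta_fe = XtX_inv @ Xd.T @ yd                 # equation (12)
resid = yd - Xd @ beta_fe
sigma2 = resid @ resid / (N * T - N - k)      # degrees of freedom as in (13)
V_fe = sigma2 * XtX_inv                       # equation (13)

# LSDV: regress y on the sector dummies and X, as in (10)
D = np.kron(np.eye(N), np.ones((T, 1)))
Z = np.hstack([D, X.reshape(N * T, k)])
lsdv, *_ = np.linalg.lstsq(Z, y.reshape(N * T), rcond=None)
```

The last k entries of `lsdv` coincide with `beta_fe` up to numerical precision, which is the equivalence exploited in the text.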
2.3 Random Effects Model¹⁸
To introduce the standard random effects model, we begin with the classic model where only the
constant term is allowed to vary randomly. We then progress to the general random parameters
model where all the parameters are allowed to vary randomly.
In the case of the standard random effects model, a generalized reformulation of (9) for
sector i at time t yields
$$y_{it} = x_{it}'\beta + (\alpha + u_i) + \varepsilon_{it} \tag{14}$$
where $x_{it}'\beta$ is the product of the explanatory variables and the vector of coefficients in (9) and
$u_i = z_i'\alpha - E(z_i'\alpha)$ is the random heterogeneity of the ith sector added to $\alpha$, the constant term. $u_i$
may be viewed as a set of factors for the ith sector, $z_i'\alpha$, that are not in the regression and are
specific to that sector. Note that one way to remove this heterogeneity is by transforming the
model to deviation (Baltagi, 2001; Arellano, 2003; Greene, 2012) from the group mean, that is
$$\begin{aligned} y_{it} - \bar{y}_i &= (u_i - u_i) + (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i) \\ &= (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i) \end{aligned} \tag{15}$$
where $\bar{y}_i = (\alpha + u_i) + \bar{x}_i'\beta + \bar{\varepsilon}_i$, $i = 1, \ldots, N$. In this setting, the LSDV estimator is a consistent
estimator of $\beta$. This approach has the virtue of being robust to specification errors¹⁹.
¹⁸ We follow the presentation of Greene (2012, 2015).
¹⁹ That is, if we wrongly choose the random or the fixed effects model, the LSDV estimator remains consistent.
However, the approach, while instructive, is like the OLS estimator: not efficient. An efficient GLS estimator exists
and this is the preferred method to estimate (14). The GLS estimator (Greene, 2012) is given by
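$$\hat{\beta} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}y = \left(\sum_{i=1}^{N} X_i'\Sigma^{-1}X_i\right)^{-1} \left(\sum_{i=1}^{N} X_i'\Sigma^{-1}y_i\right) \tag{16}$$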
where $\Omega = (I_N \otimes \Sigma)$, $\Sigma^{-1/2} = \frac{1}{\sigma_\varepsilon}\left(I - \frac{\theta}{T} i_T i_T'\right)$, and $\theta = 1 - \frac{\sigma_\varepsilon}{\sqrt{\sigma_\varepsilon^2 + T\sigma_u^2}}$. The transformation of the
dependent and explanatory variables used for the GLS is obtained by multiplying these variables by
$\Sigma^{-1/2}$. The GLS estimator can be shown to be, like the pooled OLS, a weighted average (matrix)
of the within and between-units estimators (Greene, 2012):
$$\hat{\beta} = \hat{F}^{within} b^{within} + \hat{F}^{between} b^{between} \tag{17}$$
where $\hat{F}^{between} = I - \hat{F}^{within}$, $\hat{F}^{within} = (S_{xx}^{within} + \lambda S_{xx}^{between})^{-1} S_{xx}^{within}$, and $\lambda = (1 - \theta)^2 = \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_u^2}$. From this, it can be seen that the inefficiency of ordinary least squares, that is when
$\lambda = 1$, comes from the fact that it puts too much weight on the between-unit variation compared to
the GLS estimator. In practice, with $\Sigma$ rarely known, one remedy is to rely on an estimated
version of (10), which removes the heterogeneity, to obtain an estimator of $\sigma_\varepsilon^2$ (Greene, 2012)
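$$\hat{\sigma}_\varepsilon^2 = \hat{\sigma}_{LSDV}^2 = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T} \left(\hat{\varepsilon}_{it} - \bar{\hat{\varepsilon}}_i\right)^2}{NT - N - K} \tag{18}$$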
where $\hat{\varepsilon}_{it}$ and $\bar{\hat{\varepsilon}}_i$ are, respectively, the estimated residuals and the mean over the time period of
the estimated residuals, both obtained from (14). An estimate of $\sigma_u^2$ can be obtained by first
applying OLS to the pooled model (that is, $y = X\beta + \varepsilon$, where all the data are stacked) to obtain
$\hat{\sigma}_{Pooled}^2$ and then computing $\hat{\sigma}_u^2 = \hat{\sigma}_{Pooled}^2 - \hat{\sigma}_{LSDV}^2$²⁰. The estimator $\hat{\beta}$ in (16) is unbiased and
consistent, provided matrix X is uncorrelated with the errors. Otherwise, as in the case of the fixed
effects, there is no guarantee that the errors in the explanatory variables and the unobserved
heterogeneity would cancel out. For that matter, Arellano (2003) uses a simple cross-sectional
setting to show how this attenuation of the bias²¹ may happen by chance.
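The feasible version of the GLS estimator (16) can be sketched on simulated data as follows. The variance components are estimated as described above (within residuals for $\sigma_\varepsilon^2$, pooled minus LSDV for $\sigma_u^2$), and the GLS step is carried out through the equivalent quasi-demeaning $z_{it} - \theta \bar{z}_i$ implied by $\Sigma^{-1/2}$. Panel dimensions, parameter values, and helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, k = 12, 60, 2
beta = np.array([0.8, -0.3])                  # slopes (hypothetical)
u = rng.normal(size=N)                        # random sector effects u_i
X = rng.normal(size=(N, T, k))
y = 0.5 + u[:, None] + X @ beta + rng.normal(scale=0.5, size=(N, T))

def ols(A, b):
    """Return OLS coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef, b - A @ coef

def group_mean(a):
    """Time mean of each sector, kept broadcastable."""
    return a.mean(axis=1, keepdims=True)

# sigma_eps^2 from the within (LSDV) residuals, as in (18)
_, e_w = ols((X - group_mean(X)).reshape(-1, k), (y - group_mean(y)).reshape(-1))
s2_eps = e_w @ e_w / (N * T - N - k)

# Pooled OLS residual variance, then sigma_u^2 as the difference
Xc = np.concatenate([np.ones((N, T, 1)), X], axis=2)   # add a constant
_, e_p = ols(Xc.reshape(-1, k + 1), y.reshape(-1))
s2_pooled = e_p @ e_p / (N * T - k - 1)
s2_u = max(s2_pooled - s2_eps, 0.0)           # truncated at zero (an assumption of this sketch)

theta = 1.0 - np.sqrt(s2_eps / (s2_eps + T * s2_u))

# Quasi-demeaning, equivalent to premultiplying by Sigma^{-1/2} up to scale
Xq = (Xc - theta * group_mean(Xc)).reshape(-1, k + 1)
yq = (y - theta * group_mean(y)).reshape(-1)
beta_re, _ = ols(Xq, yq)                      # [constant, slopes]
```

When `theta` is close to 1 the estimator approaches the within estimator; when it is close to 0 it approaches pooled OLS.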
²⁰ See Greene (2012, pp. 375-376) for potential problems in implementing this approach.
²¹ Not to be confused with the well-known bias due to errors in variables called attenuation, which results in an estimator that tends to 0.
2.4 Generalized random effects model
Note that we are using a generalized version of the fixed effects model in (9)²². Thus, to be
comparable, we will use a generalized version of the random effects model where all the
parameters are allowed to vary randomly. This version of the model can be written as follows:
$$y_i = X_i\beta_i + e_i \tag{19}$$
with $X_i$ a matrix of observations of dimension $T \times k$; $\beta_i = \beta + v_i$, where $v_i$ is a vector of random
effects of dimension $k \times 1$ with $E(v_i \mid X_i) = 0$ and $E(v_i v_i' \mid X_i) = \Gamma$; and $e_i$ a vector of random errors
of dimension $T \times 1$. Thus, we can see that the $\beta_i$ for an FF sector is the result of a random
process with mean vector $\beta$ and covariance matrix $\Gamma$.
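We can rewrite (2) in the format of the general random effects model as follows
$$R_i - R_F = \begin{bmatrix} \mathbf{1} & R_M - R_F & SMB & HML & RMW & CMA \end{bmatrix} \begin{bmatrix} a + v_{1i} \\ b + v_{2i} \\ s + v_{3i} \\ h + v_{4i} \\ r + v_{5i} \\ c + v_{6i} \end{bmatrix} + e_i \tag{20}$$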
where $R_i$, $R_F$, $R_M$, SMB, HML, RMW, CMA, and $e_i$ are vectors of dimension $T \times 1$, $\mathbf{1}$ is a $T \times 1$ vector of ones, and $v_{ki}$
($k = 1, \ldots, 6$) is a random variable. Note that we are assuming that all of the parameters are random,
not just the intercept term and the coefficient for the excess market return factor. The random
effects model simplifies considerably assuming that there is no
autocorrelation or cross-section correlation in $e_i$. Following Swamy (1970) and Greene (2012),
we apply GLS to (19)²³ to obtain
$$\hat{\beta} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}y = \sum_{i=1}^{12} W_i^* b_i \tag{21}$$
which is a simple weighted average of the OLS $b_i$. Note that in this equation, $\Omega$ has to be
estimated. An empirical estimate of $W_i^*$ in (21) is needed to implement this model. $W_i^*$ in its
theoretical form is given by
$$W_i^* = \left[\sum_{i=1}^{N} \left(\Gamma + \sigma_{e_i}^2 (X_i'X_i)^{-1}\right)^{-1}\right]^{-1} \left(\Gamma + \sigma_{e_i}^2 (X_i'X_i)^{-1}\right)^{-1}, \tag{22}$$
where $N = 12$ is the number of FF sectors. To estimate $W_i^*$, Swamy (1970) estimated $\Gamma$ using
the empirical variance of a set of N least squares estimates for the vector $b_i$ minus the average
value of $s_i^2 (X_i'X_i)^{-1}$, viz.
$$\hat{\Gamma} = \frac{1}{N-1}\left[\sum_{i=1}^{N} b_i b_i' - N\,\bar{b}\,\bar{b}'\right] - \frac{1}{N}\sum_{i=1}^{N} V_i$$
where $\bar{b} = \frac{1}{N}\sum_{i=1}^{N} b_i$ and $V_i = s_i^2 (X_i'X_i)^{-1}$.
In summary, we obtain an estimate of the vector $\beta$ given by the weighted
average vector $\hat{\beta}$, with an estimator of the covariance matrix $\Gamma$ given by $\hat{\Gamma}$.
²² In this paper we are not considering the more modern version of the random effects model that is referred to as the hierarchical model. See Greene (2012) for a discussion.
²³ This is equivalent to (20) using the variables in our model.
We can write the empirical version of (21) for the random effects (RE) model by
substituting $\hat{\Gamma}$ for $\Gamma$ and $s_i^2 (X_i'X_i)^{-1}$ for $\sigma_{e_i}^2 (X_i'X_i)^{-1}$ (i.e., $\hat{\Omega}$ for $\Omega$, which implies substituting
$\hat{W}_i^*$ for $W_i^*$) to obtain the feasible generalized least squares (FGLS) estimator
$$\hat{\beta}_{RE} = (X'\hat{\Omega}^{-1}X)^{-1} X'\hat{\Omega}^{-1}y = \sum_{i=1}^{12} \hat{W}_i^* b_i \tag{23}$$
The asymptotic RE variance-covariance matrix of (23) is given by the standard GLS one
(Wooldridge, 2002):
$$\hat{V}_{RE}(\hat{\beta}) = (X'\hat{\Omega}^{-1}X)^{-1} \tag{24}$$
which translates empirically to (Swamy, 1970):
$$\hat{V}_{RE}(\hat{\beta}) = \left[\sum_{i=1}^{N} \left(\hat{\Gamma} + s_i^2 (X_i'X_i)^{-1}\right)^{-1}\right]^{-1} \tag{25}$$
As can be seen, the estimated variance-covariance matrix of the random effects model is
simply given by the first part of (22).
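The steps from (21) to (25) can be sketched numerically as follows, using simulated sector-level regressions. The dimensions, the true mean vector, and the true covariance matrix are hypothetical. Note that in small samples $\hat{\Gamma}$ need not be positive semidefinite; this sketch does not handle that case.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T, k = 12, 80, 2
beta_bar = np.array([0.5, 1.0])               # mean coefficient vector (hypothetical)
Gamma_true = np.diag([0.04, 0.09])            # covariance of the random coefficients

b, V = [], []
for i in range(N):
    X_i = np.column_stack([np.ones(T), rng.normal(size=T)])
    beta_i = beta_bar + rng.multivariate_normal(np.zeros(k), Gamma_true)
    y_i = X_i @ beta_i + rng.normal(scale=0.5, size=T)
    XtX_inv = np.linalg.inv(X_i.T @ X_i)
    b_i = XtX_inv @ X_i.T @ y_i               # sector-level OLS estimate b_i
    e_i = y_i - X_i @ b_i
    s2_i = e_i @ e_i / (T - k)
    b.append(b_i)
    V.append(s2_i * XtX_inv)                  # V_i = s_i^2 (X_i'X_i)^{-1}
b, V = np.array(b), np.array(V)

# Swamy's estimator of Gamma
b_mean = b.mean(axis=0)
Gamma_hat = (b.T @ b - N * np.outer(b_mean, b_mean)) / (N - 1) - V.mean(axis=0)

# Weights (22), variance (25), and the FGLS estimator (23)
A = np.array([np.linalg.inv(Gamma_hat + V_i) for V_i in V])
V_re = np.linalg.inv(A.sum(axis=0))
W = np.array([V_re @ A_i for A_i in A])
beta_re = sum(W_i @ b_i for W_i, b_i in zip(W, b))
```

By construction the matrix weights in `W` sum to the identity, so `beta_re` is a matrix-weighted average of the sector-level OLS estimates.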
Our purpose here is to propose a parsimonious approach to tackle measurement errors or
the endogeneity of the explanatory variables X based on the generalized method of moments. This method
has the virtue of freeing the analyst from having to choose between one instrument and another.
As the literature has well established (e.g., Anderson and Rubin, 1949, 1950; Dufour, 2003;
Nelson and Startz, 1990a,b; Hahn and Hausman, 2002, 2003; Stock and Yogo, 2005; Hausman,
Stock and Yogo, 2005; Stock and Watson, 2011; Greene, 2012; Olea and Pflueger, 2013), weak
instruments present a perverse problem. Choosing the wrong instruments may
worsen the problem one hoped to confront in the first place. That is, it may transform the
estimator into a biased and inconsistent one. For example, it may bias the two-stage least squares
estimator toward the OLS²⁴. Also, it will render the basic framework for statistical inference
inappropriate (Nelson and Startz, 1990a,b; Hahn and Hausman, 2003).
3. The GMMd approach in the panel data framework
3.1 Standard instrumental variables approach – differencing required
Before discussing our proposed methodology to tackle the endogeneity introduced by
measurement errors, we first show the standard instrumental variables approach (Hausman and
Taylor, 1981; Arellano and Bond, 1991). In this framework, some sort of differencing is required
either from their group mean or first differencing.
Assume that we have the following equation to estimate using panel data
$$y_{it} = x_{it}'\beta + \varepsilon_{it} \tag{26}$$
where no fixed or random effect is shown explicitly and no transformation has been applied.
We also assume there are errors in the explanatory variables, which take the following form:
$$x_{it} = \tilde{x}_{it} + v_{it} \tag{27}$$
where $\tilde{x}_{it}$ is the vector of unobserved explanatory variables, measured with errors $v_{it}$. Applying the first
difference approach to (26) yields²⁵:
$$\begin{aligned} y_{it} - y_{it-1} &= (x_{it} - x_{it-1})'\beta + (\varepsilon_{it} - \varepsilon_{it-1}) \\ &= (x_{it} - x_{it-1})'\beta + \Delta\varepsilon_{it} \end{aligned} \tag{28}$$
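The first step in this framework is to apply 2SLS to (28) assuming a matrix of instrumental
variables Z, resulting in (Greene, 2012, 2015)²⁶:
$$\hat{\beta}_{2SLS} = \left[\left(\sum_{i=1}^{N} X_i'Z_i\right)\left(\sum_{i=1}^{N} Z_i'Z_i\right)^{-1}\left(\sum_{i=1}^{N} Z_i'X_i\right)\right]^{-1} \left(\sum_{i=1}^{N} X_i'Z_i\right)\left(\sum_{i=1}^{N} Z_i'Z_i\right)^{-1}\left(\sum_{i=1}^{N} Z_i'y_i\right) \tag{29}$$
where $X_i$ is a $T \times k$ matrix as in (15) and $Z_i$ is a matrix of instruments.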
²⁴ The OLS estimator is biased and inconsistent in that context.
²⁵ There are generally three common approaches to deal with heterogeneity: first differencing, group demeaning, and for the fixed effects model, the least squares dummy variable approach.
²⁶ See also Arellano (2003) for a presentation of the GMM in a panel data context.
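The second step consists of forming the weighting matrix ($\hat{w}$) for the GMM estimator based on
the estimated residuals of (28):
$$\hat{w} = \frac{1}{N^2}\sum_{i=1}^{N} Z_i'\hat{\varepsilon}_i\hat{\varepsilon}_i'Z_i \tag{30}$$
Substituting (30) in the criterion for GMM estimation gives
$$q = \left(\frac{1}{N}\sum_{i=1}^{N} \hat{\varepsilon}_i'Z_i\right)\hat{w}^{-1}\left(\frac{1}{N}\sum_{i=1}^{N} Z_i'\hat{\varepsilon}_i\right) \tag{31}$$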
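Finally, minimizing (31) with respect to the parameters $\beta$ yields
$$\hat{\beta}_{GMM} = \left[\left(\sum_{i=1}^{N} X_i'Z_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} Z_i'X_i\right)\right]^{-1} \left(\sum_{i=1}^{N} X_i'Z_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} Z_i'y_i\right) \tag{32}$$
where the asymptotic variance-covariance matrix of the estimator $\hat{\beta}_{GMM}$ is given by:
$$\mathrm{Est.Asy.Var}\left[\hat{\beta}_{GMM}\right] = \left[\left(\sum_{i=1}^{N} X_i'Z_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} Z_i'X_i\right)\right]^{-1} \tag{33}$$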
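The two-step procedure in (29), (30), and (32) can be sketched on a simulated panel with one endogenous regressor and two instruments. The data-generating process, the dimensions, and the variable names are hypothetical, and only point estimates are computed here.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T = 200, 5                                  # units and periods (hypothetical)
beta = np.array([1.5])

X, Z, Y = [], [], []
for _ in range(N):
    Z_i = rng.normal(size=(T, 2))              # two instruments
    common = rng.normal(size=T)                # shock shared by x and the error: endogeneity
    X_i = (Z_i @ np.array([1.0, 0.5]) + common + rng.normal(size=T))[:, None]
    y_i = X_i @ beta + common + rng.normal(size=T)
    X.append(X_i); Z.append(Z_i); Y.append(y_i)

Sxz = sum(x.T @ z for x, z in zip(X, Z))       # sum of X_i'Z_i
Szz = sum(z.T @ z for z in Z)                  # sum of Z_i'Z_i
Szy = sum(z.T @ y for z, y in zip(Z, Y))       # sum of Z_i'y_i

# Step 1: 2SLS, equation (29)
A = Sxz @ np.linalg.inv(Szz)
b_2sls = np.linalg.solve(A @ Sxz.T, A @ Szy)

# Step 2: weighting matrix (30) built from the 2SLS residuals
resid = [y - x @ b_2sls for x, y in zip(X, Y)]
w = sum(z.T @ np.outer(e, e) @ z for z, e in zip(Z, resid)) / N**2

# GMM estimator, equation (32)
B = Sxz @ np.linalg.inv(w)
b_gmm = np.linalg.solve(B @ Sxz.T, B @ Szy)
```

The scale of `w` does not affect the point estimate `b_gmm`, since the weighting matrix enters both factors of (32).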
A parsimonious instrumental variables approach – no differencing required
Our parsimonious approach has the virtue of not requiring any type of differencing in
the context of our panel data approach to the five-factor FF (2015) model. In this context, our estimator
(Racicot, 2015) is obtained by replacing the Z instruments with our new robust instruments, which
can be characterized as strong instruments. We describe below our GMMd approach in the context of
panel data fixed effects and generalized random effects models.
3.2 GMMd fixed effects model
The GMM estimator $\hat{\beta}_{GMM_d}$ for estimating the fixed effects panel data regression models is
given by
$$\hat{\beta}_{GMM_d} = \left[\left(\sum_{i=1}^{N} X_i'd_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} d_i'X_i\right)\right]^{-1} \left(\sum_{i=1}^{N} X_i'd_i\right)\hat{w}^{-1}\left(\sum_{i=1}^{N} d_i'Y_i\right) \tag{34}$$
where $d_i = x_i - \hat{x}_i$ is a vector of robust “distance” instruments. These new instruments (the d
“distance” instruments) can be computed using a matrix-weighted average by applying GLS to
a combination of two robust estimators, namely the Durbin (1954) and Pal (1980) estimators.
16
These estimators are respectively defined, in there multivariate representation, by (Racicot,
2015):
-1
' '1 1D z x z y (Durbin) (35)
-1
' '2 2P z x z y (Pal) (36)
where $z_1 = [x_{ij}^2]$, $z_2 = z_3 - 3\,\mathrm{Diag}(x'x/N)\,x'$, $z_3 = [x_{ij}^3]$, and
$\mathrm{Diag}(x'x/N) = (x'x/N) \odot I_k$ are stacked vectors with i representing the sectors
(i = 1, …, N), k the number of explanatory variables (either 5 or 6), and t the time subscript
(t = 1, …, T). The notation $\odot$ is the Hadamard product. The second and third powers (moments)
of the de-meaned variables (x) are then computed. This is analogous to computing the second and
third moments of the explanatory variables. In short, the instruments are obtained by taking the
matrix of explanatory variables (X) in deviation from its mean (x). Next, we obtain the weighted
estimator ($\beta_H$) by an application of GLS to the following combination (Racicot, 2015):
$$\beta_H = W \begin{pmatrix} \beta_D \\ \beta_P \end{pmatrix} \tag{37}$$
where $W = (C'S^{-1}C)^{-1}C'S^{-1}$ is the GLS weighting matrix, S is the covariance matrix of
$(\beta_D', \beta_P')'$ under the null hypothesis (i.e., no measurement errors), and $C = (I_k, I_k)'$ is a
matrix of two stacked identity matrices of dimension k. Note that this weighting approach, which
relies on GLS as the weighting matrix, is optimal in the Aitken (1935) sense27. However, we opt for
the GMM method to weight the Durbin and Pal estimators. We consider this to be a more efficient
procedure than the one used by Dagenais and Dagenais (1997) in that we rely on the asymptotic
properties of the GMM estimator with respect to the correction of heteroskedasticity and
autocorrelation to weight the instruments obtained with GLS. Note that when using GMM, we
give up some efficiency gain in order to avoid completely specifying the nature of the
autocorrelation or heteroskedasticity of the innovation and the data generating process of the
measurement errors (Hansen, 1982). Again, we consider this a great advantage over the GLS
estimator.
27 Note that we use W as a weighting matrix in the GLS estimator in (37). As is well known, this matrix can be
replaced by the White (1980) or the Newey-West (1987) HAC asymptotically consistent variance-covariance matrix.
For the problem of cross-sectional correlation (or spatial correlation), see Driscoll and Kraay (1998).
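The construction in (35) to (37) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, the Pal-type instrument is written in the dimensionally conformable form x³ − 3x·Diag(x'x/N), and the “distance” instruments are taken, as an assumption, to be the residuals from projecting x on a constant and the higher-moment instruments. The covariance matrix S must be supplied by the user.

```python
import numpy as np

def higher_moment_instruments(X):
    """Return de-meaned regressors x and the instrument blocks z1 and z2."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    x = X - X.mean(axis=0)                  # X in deviation from its mean
    z1 = x * x                              # element-wise squares
    z3 = x * x * x                          # element-wise cubes
    diag = (x.T @ x / n) * np.eye(k)        # Diag(x'x/N): Hadamard product with I_k
    z2 = z3 - 3.0 * x @ diag                # assumed conformable orientation
    return x, z1, z2

def durbin_pal(x, y, z1, z2):
    """Durbin (35) and Pal (36) estimators; y is assumed de-meaned."""
    beta_d = np.linalg.solve(z1.T @ x, z1.T @ y)
    beta_p = np.linalg.solve(z2.T @ x, z2.T @ y)
    return beta_d, beta_p

def gls_combine(beta_d, beta_p, S):
    """Weighted estimator (37): beta_H = W (beta_D, beta_P)' with GLS weights."""
    k = beta_d.shape[0]
    C = np.vstack([np.eye(k), np.eye(k)])   # two stacked identity matrices
    S_inv = np.linalg.inv(S)
    W = np.linalg.solve(C.T @ S_inv @ C, C.T @ S_inv)
    return W @ np.concatenate([beta_d, beta_p])

def distance_instruments(x, z1, z2):
    """d = x - x_hat, with x_hat the projection of x on [1, z1, z2] (assumption)."""
    Z = np.column_stack([np.ones(len(x)), z1, z2])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    return x - Z @ coef
```

With S set to an identity matrix, `gls_combine` reduces to the simple average of the two estimators, which is a convenient sanity check before plugging in an estimated S.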
3.3 Implementing the panel data fixed effects GMMd approach
To implement the GMMd approach in a fixed effects panel data framework, first create
the dummy variables for each sector. Next compute the robust instruments using the above
algorithm described in (37). Then calculate the GMM estimators in (34) using a HAC matrix
with the newly computed robust instruments and the sector dummy instruments.
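As a rough illustration of the last step, the sketch below computes an estimator of the form (34) in two steps. It assumes that X already contains the sector dummies and that d contains the robust instruments together with the same dummies. For brevity it uses a White-type (heteroskedasticity-only) weighting matrix rather than a full HAC matrix, and the function name `gmm_d` is hypothetical.

```python
import numpy as np

def gmm_d(X, d, Y):
    """Two-step GMM in the form of (34), with a White-type weighting matrix."""
    X, d, Y = (np.asarray(a, dtype=float) for a in (X, d, Y))
    Xd = X.T @ d                            # X'd
    dY = d.T @ Y                            # d'Y
    # Step 1: identity weighting gives preliminary residuals.
    theta0 = np.linalg.solve(Xd @ Xd.T, Xd @ dY)
    e = Y - X @ theta0
    # Step 2: w = sum_t e_t^2 d_t d_t' (a simplification of the HAC matrix).
    w = (d * e[:, None] ** 2).T @ d
    w_inv = np.linalg.inv(w)
    A = Xd @ w_inv @ Xd.T
    theta = np.linalg.solve(A, Xd @ w_inv @ dY)
    cov = np.linalg.inv(A)                  # estimated asymptotic covariance
    return theta, cov
```

When d has as many columns as X, the system is exactly identified and the weighting matrix no longer affects the point estimates; passing d = X reproduces OLS, which gives a quick check of the implementation.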
3.4 Implementing the panel data random effects GMMd approach
To implement the GMMd estimator in the context of the generalized random effects
model, simply substitute $\hat{\theta}^{d}_{GMM}$ given by (34) for $b_i$ in (23). Also, the least squares
variance-covariance estimator $s_i^2 (X_i' X_i)^{-1}$ should be replaced by $s_{i,dGMM}^2 (X_i' X_i)^{-1}$.
4. Hausd test for measurement errors
To test whether there are measurement errors, we rely on a modified Hausman (1978) artificial
regression which we refer to as Hausd. Each variable in the original five-factor and six-factor
models has a companion variable in Hausd with its own t statistic that indicates whether the
original variable contains measurement error.
To implement the Hausd artificial regression, start by estimating the following equation
using OLS:
$$Y = X\beta + \hat{w}\varphi + e \tag{38}$$
It is a two-stage least squares (2SLS) estimator because $\hat{w}$ is also obtained by OLS and (38) can
be rewritten as
$$Y = X\hat{\beta}_{2SLS} + \hat{w}\varphi + e^{*} \tag{39}$$
where $\varphi$, the coefficient vector of $\hat{w}$, measures the under/over estimation of the OLS benchmark estimator.
Following Pindyck and Rubinfeld (1998, pp. 195-197), $\hat{w}$ can be obtained using the following
procedure. Assume a regression model of the form $Y = X\beta + \varepsilon$, where X is an unobservable
variable that is related to the observable variable X* and where X* = X + v and v is a matrix of
measurement errors that are assumed to be normally distributed. The OLS regression
$Y = X^{*}\beta + \varepsilon^{*}$ is related to the original regression by noting that $\varepsilon^{*} = \varepsilon - v\beta$. We can write
$X^{*} = \hat{X}^{*} + \hat{w}$, where $\hat{w}$ are the regression residuals from applying OLS on
$x_{it} = \hat{\gamma}_0 + z\hat{\gamma} + \hat{w}_{it}$. Note this expression is just another representation of
$P_Z X = Z(Z'Z)^{-1}Z'X = Z\hat{\gamma} = \hat{X}$, the projection of X. Substituting for X* in the equation for Y
yields $Y = \hat{X}^{*}\beta + \hat{w}\beta + \varepsilon^{*}$. Let $\delta$ represent the coefficient of the variable $\hat{w}$.
Substituting $\hat{X}^{*} = X^{*} - \hat{w}$ yields $Y = X^{*}\beta + \hat{w}(\delta - \beta) + \varepsilon^{*}$,
which is analogous to (39). The resulting t statistics can be analyzed in the usual fashion. That is,
if a significant t statistic is obtained for any $\hat{w}$ variable, there are significant
specification/measurement errors in the model28.
(39) is a Hausman (1978) artificial regression that can also be obtained using 2SLS with the
same set of instruments (Spencer and Berk, 1981). To be precise, the estimated ‘slope’
coefficients of the GMMd regression should be the same as the corresponding ‘slope’
coefficients in the Hausman artificial regression.
In (39), $\hat{w}$ is a matrix of residuals of the regression of each explanatory variable on the
instrument set. The notation $\hat{w}$ is commonly used in Hausman artificial regressions. It is
equivalent to the $d_i = x_i - \hat{x}_i$ residual that emphasizes the idea of a ‘distance’ variable.
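The mechanics of such an artificial regression can be sketched as follows. This is an illustrative outline only: each explanatory variable is regressed on a constant and the instruments, the residuals are added as companion variables, and ordinary (non-robust) t statistics are reported. The function names and the dictionary keys are hypothetical, and a HAC correction would replace the plain OLS standard errors in practice.

```python
import numpy as np

def ols(A, y):
    """OLS coefficients and conventional t statistics."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    s2 = resid @ resid / (A.shape[0] - A.shape[1])
    se = np.sqrt(np.diag(s2 * np.linalg.inv(A.T @ A)))
    return coef, coef / se

def hausman_artificial_regression(X, Z, Y):
    """Regress Y on [1, X, w_hat], where w_hat are residuals of X on [1, Z]."""
    n, k = X.shape
    Zc = np.column_stack([np.ones(n), Z])
    gamma, *_ = np.linalg.lstsq(Zc, X, rcond=None)
    w_hat = X - Zc @ gamma                  # one companion variable per regressor
    A = np.column_stack([np.ones(n), X, w_hat])
    coef, t = ols(A, Y)
    return {"const": coef[0], "beta": coef[1:1 + k],
            "coef_w": coef[1 + k:], "t_w": t[1 + k:]}
```

Because the fitted values of X and the residuals are orthogonal, the slope coefficients on X returned here coincide with the 2SLS estimates based on the same instruments, in line with the Spencer and Berk (1981) result cited above.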
5. Testing for random effects versus fixed effects models
The standard approach to test whether the fixed effects model should be retained over the
random effects model is usually performed via a Hausman (1978) specification test. The test
simply verifies whether the quadratic distance between the fixed effects estimator and the
random effects one is significantly different from zero. The test can be written as follows
(Greene, 2012; Wooldridge, 2002)
$$H = (b_{FE} - \hat{\beta}_{RE})'\,[V_{FE} - V_{RE}]^{-1}\,(b_{FE} - \hat{\beta}_{RE}) \overset{a}{\sim} \chi^2_k \tag{40}$$
which is asymptotically distributed as chi-squared with k degrees of freedom. To be precise, we
have k coefficients (i.e., k explanatory variables excluding the constant term) in the estimator
vectors $b_{FE}$ and $\hat{\beta}_{RE}$ for the fixed effects and random effects models, where $V_{FE}$ and $V_{RE}$ can be
estimated using (13) and (25), respectively. Note that if there were only one parameter in these
vectors, the square root of statistic (40) would asymptotically follow a t distribution and under
certain assumptions would in fact follow a normal distribution asymptotically (Wooldridge, 2002).
28 An F test can be done to see if collectively, none of the coefficients of the $\hat{w}$ variables in the artificial
regression are significantly different from zero. This turns out to be unnecessary, since at least one coefficient in
every regression is significantly different from zero using t tests on the individual coefficients.
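For concreteness, the quadratic form in (40) can be computed as in the short sketch below. The function name is hypothetical, and the inputs are assumed to be the slope estimates and their covariance matrices already obtained from the fixed and random effects estimations.

```python
import numpy as np

def hausman_statistic(b_fe, b_re, v_fe, v_re):
    """H = (b_FE - b_RE)' [V_FE - V_RE]^{-1} (b_FE - b_RE)."""
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    v_diff = np.asarray(v_fe, dtype=float) - np.asarray(v_re, dtype=float)
    # Solve instead of inverting explicitly for numerical stability.
    return float(diff @ np.linalg.solve(v_diff, diff))
```

The resulting value is compared with the chi-squared critical value for k degrees of freedom, for example 11.07 at the 5% level when k = 5 and 12.59 when k = 6.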
That being said, in this paper, we would prefer to rely on an auxiliary regression version
of this test. This is consistent with the approach used in this paper. The implementation of the
test is, in our view, more practical and parsimonious. This approach can be implemented via the
following regression (Mundlak, 1978; Schmidheiny, 2015)
$$y_{it} = \alpha_i + x_{it}'\beta + z_{it}'\gamma + \bar{x}_i'\lambda + \delta_t + u_{it} \tag{41}$$
In this generalized version of the test, $\bar{x}_i = (1/T)\sum_t x_{it}$ is the time average of the explanatory
variables. In our case, this implies that we have to compute the time averages of each of our risk
factors. Note also that in our case $\gamma = \delta_t = 0$, since we are using the constrained version of the
test as we did not consider the time varying effects model or other sources of heterogeneity. The
test is generally implemented by testing the null hypothesis that the vector of coefficients $\lambda = 0$
(H0: $\lambda = 0$) using a Wald test corrected for clustered errors that may be heteroskedastic or
autocorrelated29. Essentially, the test amounts to running an LSDV regression, where the term
$\bar{x}_i'$ is added. More precisely, one needs to stack the time data for each sector for the returns Y
into a vector and for the risk factors X into a matrix. The term $\bar{x}_i$ is a stacked vector of the
average factor value for each factor. To further provide implementation details about (41),
rewrite it into its matrix format
$$Y = D\alpha + X\beta + \bar{X}\lambda + u \tag{42}$$
where Y are the stacked returns for each sector, X are the stacked FF factors including the
liquidity factor, $\beta$ its associated vector of coefficients that remain constant for each sector, and
$\bar{X}$ is defined as follows
29 Note that an expeditious way of testing the joint significance of the parameters in $\lambda$ is to use an F test and look
at its associated p-value, which should be under 5% if the desired level of significance is 5%.
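$$\bar{X} = \begin{bmatrix}
\bar{x}_{11} & \bar{x}_{12} & \cdots & \bar{x}_{16} \\
\bar{x}_{11} & \bar{x}_{12} & \cdots & \bar{x}_{16} \\
\vdots & \vdots & & \vdots \\
\bar{x}_{21} & \bar{x}_{22} & \cdots & \bar{x}_{26} \\
\bar{x}_{21} & \bar{x}_{22} & \cdots & \bar{x}_{26} \\
\vdots & \vdots & & \vdots \\
\bar{x}_{12,1} & \bar{x}_{12,2} & \cdots & \bar{x}_{12,6} \\
\bar{x}_{12,1} & \bar{x}_{12,2} & \cdots & \bar{x}_{12,6} \\
\vdots & \vdots & & \vdots
\end{bmatrix}_{NT \times k}$$
$\bar{X}$ is a matrix of data of dimension NT × k, where NT = 12 sectors × number of monthly observations,
k = 6 factors in the augmented version of the FF (2015) model, and λ is a vector of dimension k × 1.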
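Building the stacked matrix of time averages is mostly a matter of bookkeeping. The sketch below assumes a balanced panel whose rows are stacked sector by sector (all T observations of sector 1, then sector 2, and so on). The function name is hypothetical.

```python
import numpy as np

def time_average_matrix(X, n_sectors):
    """Repeat each sector's time-averaged regressors over its T rows (NT x k)."""
    X = np.asarray(X, dtype=float)
    nt, k = X.shape
    t = nt // n_sectors                               # balanced panel assumed
    means = X.reshape(n_sectors, t, k).mean(axis=1)   # N x k sector averages
    return np.repeat(means, t, axis=0)                # each average repeated T times
```

For the sample described in the next section, the call would take a 6,768 × 6 matrix of stacked factors with `n_sectors=12` and return a matrix of the same dimension.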
6. Data
6.1 Data
Our sample is composed of monthly returns of 12 indices classified by FF industrial sectors. The
observation period runs from January 1968 through December 2014 for a total of 564 monthly
observations. The panel data framework yields 12 sectors × 564 monthly observations = 6,768
total observations. The FF risk factors are drawn from French’s website30. The PS liquidity factor
is from Pástor’s website31.
30 French’s website is http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html.
31 Pástor’s website is http://faculty.chicagobooth.edu/lubos.pastor/research/liq_data_1962_2012.txt
6.2 Descriptive statistics
Tables 1 and 2 present the descriptive statistics of the dependent and independent variables,
respectively32.
Insert Table 1 here
For all sectors, note that the JB statistic is greater than 5.99, which is the critical value of
the chi-square distribution at the 5% level for 2 degrees of freedom. Thus, we reject the null
hypothesis of normality for all sector returns. Mandelbrot (1963, 1972) and Fama (1963, 1965)
reached similar conclusions. This empirical behavior was discovered even earlier by Mitchell
(1915), who may have been the first to notice both time varying volatility and high-peaked (fat
tails) in commodity prices. Pareto also found a fat-tailed distribution of income in the late 1800s
and developed a theory for this (Haug, 2007). Note that nowadays authors consider modeling
fund returns using the tempered class of distributions (Bianchi, 2015).
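The normality check used above can be reproduced with a few lines of Python. The sketch uses the standard form of the statistic, JB = (n − k)[skew²/6 + (kurt − 3)²/24], with population (biased) moment estimators. The function name is hypothetical, and k defaults to zero for raw data.

```python
import numpy as np

def jarque_bera(x, k=0):
    """Jarque-Bera statistic from sample skewness and kurtosis."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dev = x - x.mean()
    m2 = np.mean(dev ** 2)
    skew = np.mean(dev ** 3) / m2 ** 1.5
    kurt = np.mean(dev ** 4) / m2 ** 2        # equals 3 for a normal distribution
    return (n - k) * (skew ** 2 / 6.0 + (kurt - 3.0) ** 2 / 24.0)
```

A value above 5.99 rejects normality at the 5% level, since the statistic is asymptotically chi-squared with 2 degrees of freedom.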
Sector 6 Business Equipment has the highest standard deviation of 6.68. On a standalone
basis in the Markowitz (1959)33 mean-variance framework, this would indicate that Business
Equipment is the riskiest sector. However, in the higher-moments framework of Rubinstein
(1973) and of Jurczenko and Maillet (2006), this sector has the second lowest kurtosis. This
suggests that perhaps Business Equipment is not the most risky.
Nine of the 12 sectors show negative skewness, which is an indicator of downside risk.
Only Sector 2 Durables, Sector 4 Energy, and Sector 10 Health have the desirable positive
skewness, which is an indicator of strong upside potential.
Insert Table 2 here
32 The Jarque-Bera (1980) statistic is calculated by
$$JB = (n - k)\left[\frac{skew^2}{6} + \frac{(kurt - 3)^2}{24}\right] \overset{a}{\sim} \chi^2(2)$$
where n is the number of observations, k is the number of regressors which is zero when using the raw data, skew is
the skewness of the data which is zero for a normal distribution, and kurt is the kurtosis which is three for the normal
distribution. However, note that the kurtosis and the skewness measures are not independent. Wilkins (1944) shows
that $kurt \geq skew^2 + 1$, and Schopflocher and Sullivan (2005) go further and write that $kurt = a + b \cdot skew^2$.
These measures must therefore be handled with care.
33 Markowitz (2012) noted that the mean-variance model still works well in the presence of moderate amounts of
skewness and kurtosis.
In Table 2, the JB statistics are even more indicative of non-normality. The new FF risk
factor RMW has an extremely high JB statistic, and the new FF risk factor CMA has the lowest
JB statistic. Nevertheless, at 46.41, the CMA JB statistic is still well above the 5.99 chi-square
5% cutoff value. The values for all of the risk factors indicate that extreme events occur far more
frequently than with the normal distribution. This is a reflection of the kurtosis measuring well
over the normal distribution value of 3 for each of these 6 risk factors. The highest kurtosis value
is for the RMW risk factor at 14.17, being over 4 times the normal distribution value. Only the
kurtosis and JB statistics for RMW fall outside the range of the kurtosis and JB values from Table
1 for the sector returns.
All these results support the logic of our proposed methodology, which uses higher
moments (cumulants) as instruments for the GMM estimation process. Using OLS when such
strong non-normality is present in both the dependent and explanatory variables may lead to
wrong inferences.
7. Empirical Results
7.1 The Fixed Effects estimation of the FF new five-factor model
Table 3a presents our estimation results for the new FF five-factor model using the LSDV and
GMMd approaches for the fixed effects model.
Insert Table 3a here
For the FF five-factor OLS pooled model for the FF twelve-sector portfolio, the
coefficients of all five factors are significant at 1% except for SMB which is significant at 10%.
This suggests strong support for the FF five factors. However, using GMMd pooled, only the
coefficient of the market factor is significant at 1% with the RMW coefficient significant at 5%.
From an investment performance perspective, the Jensen performance measure is
negative but not significant for both the OLS and GMMd pooled approaches for the FF twelve-
sector portfolio. For OLS, the twelve-sector portfolio appears to be weighted towards firms that
are small cap (SMB, 0.0216), value (HML, 0.1091), robust profitability (RMW, 0.1671), and
conservative investment policy (CMA, 0.0748). For GMMd, the conclusions are the same except
that growth rather than value seems to slightly predominate (HML, - 0.0016). The Hausd test
suggests significant measurement errors in the RMW factor (coefficient of the RMW companion
variable = - 0.2805, t = - 2.36). Adding the coefficient of RMW and the coefficient of its companion
variable yields 0.3976 – 0.2805 = 0.1171. This Hausd result is an approximation of the OLS
estimation of 0.1671, reminding the reader that Hausd is asymptotic.
According to LSDV, we have 8 sectors (Durables, 0.0053; Manufacturing, 0.0054;
Business Equipment, 0.0097; Money, 0.0041; Other, 0.0023) that generate significant positive
excess returns while there are 3 sectors (Non-durables, - 0.0025; Telecom, - 0.0038; Utilities,
- 0.0110) that significantly underperform and 1 sector (Energy, 0.0000) with neutral
performance. The relative (to the market) systematic measure of risk for all 12 sectors is
significantly different from 0 and for all 12 sectors is within 0.25 of the market of 1.
For the fixed effects GMMd, only one sector (Energy, 0.0224) has a Jensen performance
measure that is significantly different from 0! Given that the Jensen performance measure for
Energy is only significant at the 10% level, these sector results are strongly supportive of
efficient markets. Again, the relative systematic measure of risk is significantly different from 0
for all 12 sectors with 10 sectors within 0.25 of 1. Only the betas for Energy at 1.4261 and
Telecom at 0.5862 fall outside the range of 0.25 of 1.
7.2 The Fixed Effects estimation of the FF new augmented six-factor model
Table 3b presents our estimation results for the new augmented (LIQ) FF six-factor model using
the LSDV and GMMd approaches for the fixed effects model.
Insert Table 3b here
For the FF six-factor OLS pooled model for the FF twelve-sector portfolio, all of the
coefficients of the five FF risk factors retain their previous significance level when the LIQ risk
factor is added. The LIQ risk factor, however, is insignificant, which suggests that LIQ on
average does not have a risk premium. This again suggests strong support for the FF five factors.
However, using GMMd pooled, only the coefficient of the market factor is significant at 1% and
illiquidity may matter (20% level of significance). When testing for measurement errors using
the Hausd test, the LIQ risk factor is significant at the 5% level and seems to be measured with
significant errors (coefficient of the LIQ companion variable = - 0.1105, t = - 2.30). Furthermore,
the coefficient of the CMA risk factor becomes significant once again at the 10% level (0.2242, t = 1.86).
From an investment performance perspective, the Jensen performance measure is
negative but not significant for the OLS pooled approach for the FF twelve-sector portfolio.
However, for the GMMd pooled approach, the Jensen measure is negative and significant at the
5% level (- 0.1653, t = - 1.99). For OLS, the twelve-sector portfolio appears to be weighted
towards firms that are small cap (SMB, 0.0218), value (HML, 0.1090), robust profitability (RMW,
0.1669), conservative investment policy (CMA, 0.0748), and slightly illiquid (LIQ, 0.0098). For
GMMd, the twelve-sector portfolio appears to be weighted towards firms that are large cap
(SMB, - 0.0501), value (HML, 0.1697), robust profitability (RMW, 0.1345), conservative
investment policy (CMA, 0.2242), and illiquid (LIQ, 0.1152).
According to LSDV, of course, the previous alpha and beta remain exactly the same for
the individual sectors, since the market risk factor is the only risk factor used in LSDV.
7.3 Sector Analysis of the random effects models: the new FF five-factor model
Table 4a presents the OLS and GMMd results for the 12 FF sectors of the new five-factor FF
model, since these results are needed in the estimation of the random effects model.
Insert Table 4a here
Note that the coefficient for the market factor RM – Rf is significant at 1% for all 12 FF
sectors using OLS and for 11 of the 12 FF sectors using GMMd. For the Utilities sector, the
coefficient using GMMd is insignificant at the standard 1%, 5%, and 10% levels but is significant
at the 20% level (t = 1.36 > 1.28).
For SMB, its OLS estimated coefficient was significant at 1% for 9 sectors with 5 of these
coefficients being positive and 4 negative. The OLS estimated SMB coefficient was significant
at the 5% level for 1 sector and insignificant for the other 2. Turning to the GMMd estimated
coefficients for SMB, no coefficients are significant!
For the HML factor, Fama and French (2015) themselves felt that HML could be
redundant34 with the addition of the RMW and CMA factors. In other words, there could be
multicollinearity with its attendant problems. For OLS, the HML coefficient is significant at the
1% level for 8 sectors with 2 having a negative sign. 1 sector is significant at the 10% level and 3
are insignificant. As with SMB, the GMMd estimated coefficients of the HML factor are all
insignificant at the standard levels of significance! However, 4 sectors are significant at the 20%
level with 1 sector having a negative sign.
34 See Fama and French (2015), p. 2.
Turning to the first new factor RMW, we note that the OLS estimated coefficients are
significant at the 1% level for 8 sectors with 2 having a negative sign. 2 sectors are significant at
the 5% level, and 2 are insignificant. For the GMMd estimated RMW coefficients, only the Non-
Durables sector has a significant coefficient at a standard level (10%).
For the CMA factor, the OLS estimated coefficients are significant at the 1% level for 4
sectors with one of these sectors having a negative sign. 1 sector is significant at the 5% level
with a negative sign, and 7 sectors are insignificant. The GMMd estimated CMA coefficients are
significant at the 1% level for 3 sectors. 2 of these sectors, Chemicals and Health, have a positive
sign and overlap with the OLS results. The third sector, Telecom has a negative sign and is
insignificant with OLS. 2 sectors are significant at the 5% level with GMMd with 1 being
positive and 1 being negative. Manufacturing is the one with the positive sign, but the sign is
insignificant with OLS. Business Equipment has a negative sign for both GMMd and OLS with
the sign being even more significant at 1% for OLS. 1 sector is significant at the 10% level with
GMMd, and 6 sectors are insignificant. Thus, the GMMd estimation results suggest that only the
new FF factor CMA seems to have some explanatory power.
7.4 An Investment Perspective for the random effects FF five factor model
Now, turning to an investment perspective, we will focus on the 5 sectors (Energy, Business
Equipment, Telecom, Utilities, and Health) that generate positive risk-adjusted abnormal returns
or positive Jensen (1968) alpha based on OLS (see Table 4a). Note, though, that only 2 of these
sectors (Business Equipment and Telecom) have positive alphas (0.4964 and 0.6862,
respectively)35 using GMMd.
For Business Equipment using OLS, it seems that this sector is dominated by small firms
(SMB coefficient 0.0854) with low book to market ratios (HML, - 0.4199), weak profitability
(RMW, - 0.4180), and aggressive investment behavior (CMA, - 0.4395) with the coefficients of
these variables all being significant at the 1% level except for SMB which is significant at the 5%
level. For GMMd, Business Equipment is dominated by large firms (SMB, - 0.0379) with high
book to market ratios (HML, 0.0425) weak profitability (RMW, - 0.7035), and aggressive
investment behavior (CMA, - 0.9481). Only RMW and CMA are significant, albeit at the 20% and
5% levels, respectively.
35 These numbers are in percent return per month. On a nominal annual basis, these abnormal returns are 5.96% and
8.23%, respectively.
Using OLS, Telecom is dominated by large firms (SMB coefficient - 0.2658) with a high
book to market ratio (HML, 0.1101), a weak profitability (RMW, - 0.3185) and a conservative
investment behavior (CMA, 0.0503) with the coefficients being significant at the 1%, 10%, and
1% levels for SMB, HML, and RMW, respectively, and insignificant for CMA. We obtain
identical results for GMMd with respect to the signs except that CMA now has a negative sign.
The levels of significance for the factors have changed. The CMA coefficient is now significant
at the 1% level, whereas it was the only insignificant one using OLS. Now SMB, HML, and
RMW are all insignificant.
Of these 5 sectors with positive alpha, only Business Equipment has a beta greater than
one for the market factor RM – Rf. To the investor, this suggests that one can earn abnormal
returns in 4 sectors while taking on less relative risk than the market portfolio. Note, however,
under GMMd only Telecom has positive alpha and beta less than one.
7.5 Sector analysis of the random effects model: the augmented new FF six-factor model
Table 4b presents our estimation results for the new augmented FF six-factor model using the
OLS and GMMd approaches for the random effects model.
Insert Table 4b here
Again, Business Equipment and Telecom have positive alphas using OLS with values of
0.3762 and 0.1978, respectively, and significance levels of 1% and 20%. Using GMMd, both
coefficients are positive at 0.3000 and 0.7119, respectively, with Business Equipment
insignificant and Telecom significant at 5%. Using OLS, the coefficients and t values for the new
FF five factors in the six-factor model are essentially the same. This is not surprising given that
the liquidity coefficient is not significantly different from 0 for both sectors. Turning to GMMd
for Telecom, the coefficients and t values for the new FF five factors did change somewhat with
the SMB, HML, and RMW coefficients remaining insignificant and the significance level of the CMA
coefficient dropping from 1% to 5%. The LIQ coefficient is insignificant for both OLS and
GMMd.
LIQ is really a measure of illiquidity because we are using the portfolio version of the
LIQ factor (tradable LIQ that is long on an illiquid portfolio and short on a liquid one).
Coefficients should be positive to generate a risk premium. For example, the Durables sector has
a positive sign and is significant at the 5% level for OLS and at the 20% level for GMMd. This is
consistent with the idea that durables are difficult to sell during periods of illiquidity. Only 3 of
the 12 FF sectors (Health, Money, and Other) have negative coefficients for both OLS and
GMMd. Although these LIQ coefficients are significant at the 5%, 1%, and 1% level,
respectively, for Health, Money, and Other for OLS, they are all insignificant for GMMd.
7.6 The Random Effects estimation of the FF new five-factor model
The t statistics in Table 4a for the coefficients of random effects model are calculated by 2
methods. First, weighted averages of the t statistics for the 12 sectors for each coefficient are
calculated using (22). Then the t statistics are calculated using the Swamy (1970) variance-
covariance matrix given by (25).
The estimation of the Jensen (1968) alpha (constant term) for the random effects model is
slightly negative but insignificant for both FGLS and GMMd. The insignificance of alpha is an
indicator of market efficiency. The beta coefficient for the market factor RM – Rf is close to 1 for
both FGLS and GMMd. Thus, the 12-sector portfolio has essentially the same relative market
risk as the market itself and has no abnormal or superior return. This suggests that the market
portfolio should be the preferred investment vehicle, as it can be cost effectively obtained from
either index mutual funds or exchange traded funds (ETFs).
For SMB, the t values are insignificant for both FGLS and GMMd and for both methods
of calculating t. For HML, all results are also insignificant except the weighted average t for
FGLS is significant at the 10% level.
Using FGLS, the new FF RMW factor is positive and significant at the 1% level using the
weighted average t and at the 10% level using the Swamy variance-covariance matrix. For
GMMd, RMW is insignificant using the weighted average t and significant at only the 20% level
using the Swamy variance-covariance matrix. These coefficients are 0.1674 for FGLS and
0.2704 for GMMd. These values are much bigger than the insignificant SMB and HML values
that ranged from 0.0217 to 0.1093. Therefore, robust profitability firms (RMW) do seem to have
some explanatory power for the 12-sector portfolio returns.
Meanwhile, conservative firms (CMA) do not seem to explain much of the 12-sector
portfolio returns with an FGLS coefficient of 0.0741 and a t that is almost significant at the 20%
level for the weighted average method. However, the t value is insignificant for the Swamy
method, and GMMd yields insignificant results.
7.7 The Random Effects estimation of the FF new six-factor model
For the six-factor model using FGLS, the coefficients of the FF five factors in the twelve-sector
FF equally weighted portfolio are imperceptibly different from their values in the five-factor
model (see Tables 4a and 4b). The t values have the same levels of significance, except for the
HML coefficient which has improved from the 10% level to the 5% level using the weighted
average method for calculating t.
Looking at the investment performance of the FF twelve-sector portfolio, the Jensen
(1968) performance measure is negative but insignificant even at the 20% level36. Using GMMd,
it appears that the portfolio is weighted towards stocks that are large cap (SMB, - 0.0504), high
book to market (HML, 0.1720), robust profitability (RMW, 0.1338), conservative investment
(CMA, 0.2227), and illiquid (LIQ, 0.1170). These results seem consistent with our previous
Tobin Q and investment perspective. Normally, one expects large cap stocks to be liquid and
hence, the LIQ coefficient should not be significantly different from 0 or possibly significantly
negative. Here, we find that it is barely significant at the 20% level using GMMd and the Swamy
variance-covariance matrix. Perhaps this is an effect of the 2007-2009 financial crisis when even
large cap stocks were somewhat illiquid.
7.8 F test for the fixed effects versus the pooled models
When testing the fixed effects model over the pooled one, the F test rejects the pooled regression
approach. The F test is given by (e.g., Greene, 2012)
$$F(N-1,\; NT-N-k) = \frac{(R^2_{LSDV} - R^2_{Pooled})/(N-1)}{(1 - R^2_{LSDV})/(NT-N-k)} \tag{43}$$
where $R^2_{LSDV}$ is the coefficient of determination for the least squares dummy variables regression,
$R^2_{Pooled}$ is the coefficient of determination for the pooled regression, N is the number of sectors,
T is the number of months, and k is the number of regressors. Table 5 provides the F values for
the five and six factor models using OLS and GMMd estimation methods.
36 The t value is positive for the weighted average approach using FGLS even though the alpha is negative because
in this particular case the weighted summation of the sectors with positive t values outweighs the magnitude of the
weighted summation of the sectors with negative t values.
Insert Table 5 here
Note that all the F tests are significant at the 1% level, which means that for the 5 and 6 risk factor
models, the pooled model is rejected in favor of the fixed effects model using either OLS or
GMMd estimation methods37.
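The F statistic in (43) only requires the two coefficients of determination and the panel dimensions, as the following sketch shows. The function name is hypothetical, and the R² values in the usage note are made-up illustrative numbers, not the paper's results.

```python
def fixed_effects_f(r2_lsdv, r2_pooled, n_sectors, n_periods, k):
    """F statistic of (43) with its numerator and denominator degrees of freedom."""
    df1 = n_sectors - 1
    df2 = n_sectors * n_periods - n_sectors - k
    f = ((r2_lsdv - r2_pooled) / df1) / ((1.0 - r2_lsdv) / df2)
    return f, df1, df2
```

For instance, with N = 12, T = 564 and k = 5, the degrees of freedom are 11 and 6,751; hypothetical values of 0.60 and 0.50 for the LSDV and pooled R² would give an F statistic of about 153.4.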
7.9 Hausman H test for the fixed effects versus the random effects models38
While the F tests let us draw conclusions about the fixed effects model, they cannot help
discriminate between the fixed and random effects models. The Hausman (1978) test is
particularly well adapted to discriminate between competing models, in our case, the
fixed effects versus the random effects models. The Hausman test statistic is given by (40) and is
chi-squared distributed with degrees of freedom equal to the number of slope coefficients (5 and 6
for the five- and six-factor models, respectively). Intuitively, the H test is a quadratic
distance weighted by its variance, the distance being between the fixed and random effects
estimations. Turning to our result, Table 5 shows that the Hausman test cannot reject the random
effects model using either OLS or GMMd and for the 5 and 6 risk factor models39. Thus, the
fixed effects model is rejected in favor of the random effects model.
8. Conclusions
We find that using OLS or LSDV estimation, the new Fama and French (2015) five factors are
highly significant. However, adding to this model the illiquidity factor of Pástor and Stambaugh
(2003) does not provide more explanatory power to the new FF model.
When applying the GMM approach proposed in this paper to either the FF five-factor or
augmented six-factor models, a different picture emerges. In the five-factor model, only the
market risk factor at the 1% level and the profitability RMW factor at the 5% level are significant
for the fixed effects model, with the Hausman auxiliary regression showing significant
measurement errors for RMW. Turning to the random effects model, the market factor is again
significant at the 1% level; whereas, the RMW factor falls to the non-standard 20% level.
Adding the PS illiquidity factor to the FF 5 factors changes the conclusions in the GMMd
universe. Except for the market risk factor, none of the factors is significant at the standard levels.
This result is consistent with MacKinlay (1995). The illiquidity factor, however, could be
considered significant if we lower the bar to the 20% level. Note, though, that the illiquidity
factor is measured with significant errors using the Hausman auxiliary regression test. Also note
that when using this test, the investment policy CMA factor becomes significant at 10%.
37 Note that the critical value for the F test in either model is 1.54.
38 Note that we did not perform the auxiliary regression version of the test. This is because we have repeated
observations of the regressors, therefore rendering the test difficult to apply for this financial application. We
therefore rely on the Hausman test.
39 The critical value for the chi-squared distribution of the H test is in our case 11.07 or 12.59, respectively, for the
five and six risk factor models.
For the fixed effects model, we find that the Jensen performance measure alpha is
negative and significant for the FF twelve-sector pooled augmented six-factor model using our
GMMd approach. Furthermore, although alpha is not significant for either the pooled OLS or
GMMd five-factor model or the pooled OLS augmented six-factor model, the coefficient
nevertheless has a negative sign. While markets may be efficient ex-ante and not ex-post, this
result shows ex-post that the twelve-sector portfolio is somewhat inefficient. Therefore, investors
would be better off holding the market portfolio, rather than this one. Turning to the random
effects model, the alpha is also negative, but insignificant for both the FGLS and GMMd
approaches and for both the five-factor and augmented six-factor models.
Because the FF model is theoretically firmly grounded, we believe that it is still capable
of explaining returns sufficiently well, even in the light of our mixed results. Further
research, for example on hedge fund returns that are likely to have even higher levels of
skewness and kurtosis than equities, might well show the value of using the FF factors in our
higher moments GMMd approach.
References
Abel, A.B., 1983. Optimal investment under uncertainty, American Economic Review, 73, 228-
233.
Adrian, T., Fleming, M., Shachar, O., Vogt, E., 2017. Market liquidity after the financial crisis,
Annual Review of Financial Economics, 9, 43-83.
Aitken, A.C., 1935. On least squares and linear combinations of observations, Proceedings of the
Royal Society of Edinburgh, 55, 42-48.
Anderson, T., Rubin, H., 1949. Estimation of the parameters of a single equation in a complete
system of stochastic equations, Annals of Mathematical Statistics, 20, 46-63.
Anderson, T., Rubin, H., 1950. The asymptotic properties of the parameters of a single equation
in a complete system of stochastic equations, Annals of Mathematical Statistics, 21, 570-582.
Arellano, M. 2003. Panel Data Econometrics, Oxford University Press, New York.
Arellano, M., Bond, S., 1991. Some tests of specification for panel data: Monte Carlo evidence
and an application to employment equations, Review of Economic Studies, 58, 277-297.
Baltagi, B., 2001. Econometric Analysis of Panel Data, 2nd ed., John Wiley & Sons, New York.
Bellman, R., 1957. Dynamic Programming, Princeton University Press, Princeton, NJ.
Benninga, S., 2014. Financial Modeling, 4th ed., MIT Press, Cambridge, MA.
Barillas, F., Shanken, J., 2015. Comparing asset pricing models, NBER Working Paper 21771,
Cambridge, MA.
Black, F., 1972. Capital market equilibrium with restricted borrowing, Journal of Business, 45,
444-455.
Bianchi, M.L., 2015. Are the log-returns of Italian open-end mutual funds normally distributed?
A risk assessment perspective, Journal of Asset Management, 16, 437–449.
Bound, J., Jaeger, D., Baker, R. 1995. Problems with instrumental variables estimation when the
correlation between the instruments and the endogenous explanatory variables is weak, Journal
of the American Statistical Association, 90, 443-450.
Campbell, J., Lo, A., MacKinlay, A., 1997. The Econometrics of Financial Markets, Princeton
University Press, Princeton, NJ.
Carhart, M., 1997. On the persistence in mutual fund performance, Journal of Finance, 52, 57-
82.
Chow, G.C., 1997. Dynamic Economics: Optimization by the Lagrange Method, Oxford
University Press, New York.
Cobb, C.M., Douglas, P.H., 1938. A theory of production, American Economic Review
Supplement, 18, 139-165.
Cochrane, D., Orcutt, G., 1949. Application of the least squares regression to relationships
containing autocorrelated error terms, Journal of the American Statistical Association, 44, 32-61.
Cochrane, J.H., 1991. Production-based asset pricing and the link between stock returns and
economic fluctuations, Journal of Finance, 46, 209-237.
Cochrane, J.H., 2005. Asset Pricing, revised ed., Princeton University Press, Princeton, NJ.
Cochrane, J.H., 2008. The dog that did not bark: A defense of return predictability, Review of
Financial Studies, 21, 1533-1575.
Cochrane, J.H., 2011. Presidential address: Discount rates, Journal of Finance, 66, 1047-1108.
Dagenais, M.G., Dagenais, D.L., 1994. GMM estimators for linear regression models with errors
in the variables, Centre de recherche et développement en économique (CRDE), Working Paper
0594, University of Montreal, Montreal, QC.
Dagenais, M.G., Dagenais, D.L., 1997. Higher moments estimators for linear regression models
with errors in the variables, Journal of Econometrics, 76, 193-221.
Driscoll, J.C., Kraay, A.C., 1998. Consistent covariance matrix estimation with spatially
dependent panel data, Review of Economics and Statistics, 80, 546-560.
Dufour, J.M., 2003. Identification, weak instruments and statistical inference in econometrics,
Working Paper No. 2003s-49, CIRANO, University of Montreal, Montreal, QC.
Durbin, J., 1954. Errors in variables, International Statistical Review, 22(1/3), 23-32.
Fama, E.F., 1963. Mandelbrot and the stable Paretian hypothesis, Journal of Business, 36, 420-429.
Fama, E.F., 1965. Portfolio analysis in a stable Paretian market, Management Science, 11, 404-
419.
Fama, E.F., French, K.R., 1992. The cross-section of expected stock returns, Journal of Finance,
47, 427-465.
Fama, E.F., French, K.R., 1993. Common risk factors in the returns of stocks and bonds, Journal
of Financial Economics, 33, 3-56.
Fama, E.F., French, K.R., 2015. A five-factor asset pricing model, Journal of Financial
Economics, 116, 1-22.
Fama, E.F., MacBeth, J.D., 1973. Risk, return, and equilibrium: Empirical tests, Journal of
Political Economy, 81, 607-636.
Greene, W.H., 2012. Econometric Analysis, 7th ed., Pearson Education, Inc., Boston, MA.
Greene, W.H., 2015. Class notes on The Econometric Analysis of Panel Data, Class 9, Stern
School of Business, New York University, New York, Winter 2015. Available at
http://people.stern.nyu.edu/wgreene/Econometrics/PanelDataNotes-9.ppt
Hahn, J., Hausman, J., 2002. A new specification test for the validity of instrumental variables,
Econometrica, 70, 163-189.
Hahn, J., Hausman, J., 2003. Weak instruments: Diagnosis and cures in empirical economics,
American Economic Review, 93, 118-125.
Hansen, L., 1982. Large sample properties of the generalized method of moments estimators,
Econometrica, 50, 1029-1054.
Hansen, L., Singleton, K., 1982. Generalized instrumental variables estimation of nonlinear rational
expectations models, Econometrica, 50, 1269-1286.
Hansen, L., Singleton, K., 1984. Generalized instrumental variables estimation of nonlinear rational
expectations models: Errata, Econometrica, 52, 267-268.
Haug, E.S., 2007. Derivatives Models on Models, Wiley, Chichester, England.
Hausman, J., 1978. Specification tests in econometrics, Econometrica, 46(6), 1251-1271.
Hausman, J., Taylor, W., 1981. Panel data and unobservable individual effects, Econometrica,
49, 1377-1398.
Hausman, J., Stock, J., Yogo, M., 2005. Asymptotic properties of the Hahn-Hausman test for
weak instruments, Economics Letters, 89, 333-342.
Heij, C., de Boer, P., Franses, P.H., Kloek, T., van Dijk, H.K., 2004. Econometric Methods with
Applications in Business and Economics, Oxford University Press, Oxford, England.
Hou, K., Xue, C., Zhang, L., 2015. Digesting anomalies: An investment approach, Review of
Financial Studies, 28, 650-705.
Jarque, C.M., Bera, A.K., 1980. Efficient tests for normality, homoscedasticity and serial
independence of regression residuals, Economics Letters, 6, 255-259.
Jensen, M.C., 1968. The performance of mutual funds in the period 1945-64, Journal of Finance,
23, 389-416.
Jurczenko, E., Maillet, B., 2006. The four-moment capital asset pricing model: Between asset
pricing and asset allocation. In Multi-Moment Asset Allocation and Pricing Models (eds.
Jurczenko, E., Maillet, B.), John Wiley & Sons, Chichester, England, Ch. 6.
Lin, X., Zhang, L., 2013. The investment manifesto, Journal of Monetary Economics, 60, 351-366.
Lintner, J., 1965. The valuation of risk assets and the selection of risky investments in stock
portfolios and capital budgets, Review of Economics and Statistics, 47, 13-37.
MacKinlay, A.C., 1995. Multifactor models do not explain deviations from the CAPM, Journal
of Financial Economics, 38, 3-28.
Mandelbrot, B., 1963. The variation of certain speculative prices, Journal of Business, 36, 394-419.
Mandelbrot, B., 1972. Correction of an error in “The variation of certain speculative prices”,
Journal of Business, 45, 542-543.
Markowitz, H., 1959. Portfolio Selection: Efficient Diversification of Investments, Cowles
Foundation Monograph 16, John Wiley & Sons, New York.
Markowitz, H., 2012. The “Great Confusion” concerning MPT, Aestimatio, The IEB
International Journal of Finance, 4, 8-27.
Mehra, R., Prescott, E., 1985. The equity premium: A puzzle, Journal of Monetary
Economics, 15, 145-161.
Merton, R.C., 1973. An intertemporal capital asset pricing model, Econometrica, 41, 867-887.
Miller, M.H., Modigliani, F., 1961. Dividend policy, growth, and the valuation of shares, Journal
of Business, 34, 411-433.
Mitchell, W.C., 1915. The making and using of index numbers, In Introduction to Index
Numbers and Wholesales Prices in the United States and Foreign Countries, Bulletin No. 173,
U.S. Bureau of Labor Statistics.
Mossin, J., 1966. Equilibrium in a capital asset market, Econometrica, 34, 768-783.
Mundlak, Y., 1978. On the pooling of time series and cross section data, Econometrica, 46, 69-
85.
Nelson, C., Startz, R., 1990a. Some further results on the exact small sample properties of the
instrumental variables estimator, Econometrica, 58, 967-976.
Nelson, C., Startz, R., 1990b. The distribution of the instrumental variables estimator and its t-ratio
when the instrument is a poor one, Journal of Business, 63, S125-S140.
Newey, W.K., West, K.D., 1987. A simple, positive semi-definite, heteroskedasticity and
autocorrelation consistent covariance matrix, Econometrica, 55, 703-708.
Olea, J.L.M., Pflueger, C., 2013. A robust test of weak instruments, Journal of Business and
Economic Statistics, 31, 358–369.
Pagan, A.R., 1984. Econometric issues in the analysis of regressions with generated regressors,
International Economic Review, 25, 221-247.
Pagan, A.R., 1986. Two stage and related estimators and their applications, Review of Economic
Studies, 53, 517-538.
Pagan, A.R., Ullah, A., 1988. The econometric analysis of models with risk terms, Journal of
Applied Econometrics, 3, 87-105.
Pal, M., 1980. Consistent moment estimators of regression coefficients in the presence of errors
in variables, Journal of Econometrics, 14, 349-364.
Pástor, L., Stambaugh, R.F., 2003. Liquidity risk and expected stock returns, Journal of Political
Economy, 111, 642-685.
Pinto, J.E., Henry, E., Robinson, T.R., Stowe, J.D., 2015. Equity Valuation, 3rd ed., John Wiley
& Sons, New York.
Pindyck, R., Rubinfeld, D., 1998. Econometric Models and Economic Forecasts, 4th ed., Irwin
McGraw-Hill, Boston, MA.
Racicot, F.E., 2015. Engineering robust instruments for panel data regression models with errors
in variables: A note, Applied Economics, 47, 981-989.
Ross, S.A., 1976. The arbitrage theory of capital asset pricing, Journal of Economic Theory, 13,
341-360.
Rubinstein, M., 1973. The fundamental theorem of parameter-preference security valuation.
Journal of Financial and Quantitative Analysis, 8, 61-69.
Schmidheiny, K., 2015. Panel data: Fixed and random effects, Short Guides to
Microeconometrics, Universität Basel, Basel, Switzerland. Available at: http://www.schmidheiny.name/teaching/panel2up.pdf
Schopflocher, T.P., Sullivan, P.J., 2005. The relationship between skewness and kurtosis of a
diffusing scalar, Boundary-Layer Meteorology, 115, 341–358.
Shanken, J., 1992. On the estimation of beta-pricing models, Journal of Finance, 47, 1-34.
Sharpe, W.F., 1964. Capital asset prices: A theory of market equilibrium under conditions of
risk, Journal of Finance, 19, 425-442.
Spencer, D., Berk, K., 1981. A limited information specification test, Econometrica, 49, 1079-
1085.
Stock, J., Yogo, M., 2005. Testing for weak instruments in linear IV regression. In Identification
and Inference in Econometrics: Festschrift in Honor of Thomas Rothenberg (eds. Stock, J.,
Andrews, D.), Cambridge University Press, Cambridge, England, 80-108.
Stock, J., Watson, M., 2011. Introduction to Econometrics, 3rd ed., Pearson Education, Inc.,
Boston, MA.
Swamy, P., 1970. Efficient inference in a random coefficient regression model, Econometrica,
38, 311-323.
Theil, H., Goldberger, A.S., 1961. On pure and mixed estimation in economics, International
Economic Review, 2, 65-78.
Tobin, J., 1969. A general equilibrium approach to monetary theory, Journal of Money, Credit
and Banking, 1, 15-29.
White, H., 1980. A heteroscedasticity-consistent covariance matrix estimator and a direct test for
heteroscedasticity, Econometrica, 48, 817-838.
Wilkins, J.E., 1944. A note on skewness and kurtosis, The Annals of Mathematical Statistics, 15,
333-335.
Wooldridge, J.M., 2002. Econometric Analysis of Cross Section and Panel Data, MIT Press,
Cambridge, MA.
Table 1
Fama and French 12 sectors 1968m01 – 2014m12
Sector        Mean  Median    Max     Min  Std. Dev.  Skewness  Kurtosis  Jarque-Bera
1 Nodur       1.10    1.13  18.88  -21.03       4.40     -0.28      4.99       100.20
2 Durbl       0.87    0.84  42.62  -32.63       6.45      0.12      7.81       544.55
3 Manuf       0.97    1.27  21.08  -28.58       5.41     -0.51      5.64       188.14
4 Enrgy       1.06    1.03  24.56  -18.33       5.57      0.02      4.25        36.81
5 Chems       0.96    1.10  20.22  -24.59       4.72     -0.25      5.20       119.11
6 Buseq       0.89    0.87  20.75  -26.07       6.68     -0.20      4.20        37.51
7 Telcm       0.96    1.18  21.34  -16.22       4.75     -0.25      4.23        41.21
8 Utils       0.91    0.96  18.84  -12.65       4.13     -0.12      4.01        25.37
9 Shops       1.05    1.11  25.85  -28.25       5.34     -0.27      5.32       133.58
10 Hlth       1.07    1.11  29.52  -20.46       4.96      0.07      5.51       148.03
11 Money      1.02    1.33  21.10  -22.10       5.60     -0.42      4.64        79.21
12 Other      0.78    1.13  19.35  -29.24       5.54     -0.49      5.14       130.16
Panel data    0.97    1.10  42.62  -32.63       5.34     -0.21      5.68      2075.57
Notes: For each sector, there are 47 years x 12 months = 564 monthly observations for a total of 12 sectors x 564 = 6,768 for the panel data set.
Table 2
Fama-French (2015) and Pástor-Stambaugh risk factors 1968m01 - 2014m12
Factor        Mean  Median    Max     Min  Std. Dev.  Skewness  Kurtosis  Jarque-Bera
Rm-Rf         0.49    0.86  16.10  -23.24       4.58     -0.52      4.78       100.05
SMB           0.20    0.04  19.05  -15.26       3.09      0.43      6.80       356.73
HML           0.38    0.34  13.89  -12.61       2.95      0.01      5.38       133.54
RMW           0.27    0.19  12.24  -17.60       2.20     -0.43     14.17      2951.29
CMA           0.37    0.25   8.93   -6.76       2.00      0.31      4.26        46.41
LIQ           0.43    0.20  21.46  -10.80       3.52      0.44      5.64       181.62
Notes: For each sector, there are 564 observations for a total of 6,768 for the panel data set. Here the panel data contain the 564 monthly
observations, repeated for each of the 12 sectors.
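As an illustrative check on the Jarque-Bera column of Tables 1 and 2, the statistic can be recomputed from the reported skewness and kurtosis with the standard formula JB = (n/6)[S² + (K − 3)²/4]. The sketch below is not part of the original estimation: the function name `jarque_bera` and the choice of rows are ours, and the inputs are the two-decimal values printed in the tables, so small rounding differences from the reported statistics are expected.

```python
def jarque_bera(n_obs, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness and raw kurtosis."""
    return n_obs / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Sample sizes as stated in the table notes.
N_MONTHS = 47 * 12       # monthly observations per sector or factor
N_PANEL = 12 * N_MONTHS  # observations in the stacked panel

# (label, n, skewness, kurtosis, JB as reported in Tables 1 and 2)
rows = [
    ("Nodur", N_MONTHS, -0.28, 4.99, 100.20),
    ("Panel data", N_PANEL, -0.21, 5.68, 2075.57),
    ("RMW", N_MONTHS, -0.43, 14.17, 2951.29),
]

for label, n, s, k, reported in rows:
    recomputed = jarque_bera(n, s, k)
    print(f"{label:<10} recomputed JB = {recomputed:9.2f}   reported = {reported:9.2f}")
```

Under normality the statistic is asymptotically chi-squared with two degrees of freedom, whose 5% critical value is about 5.99, so every series in the two tables rejects normality.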
Table 3a
Fixed Effects Model, LSDV vs. GMMd estimation methods for the FF five-factor model by FF 12 sectors
Notes: LSDV is the least squares dummy variable method for estimating the alpha and beta for the FF 12 sectors. GMMd is the generalized
method of moments method using our robust distance instruments given in (34) with the Newey-West (1987) HAC variance-covariance
estimator. *** indicates significance at 1%; **, 5%; and *, 10%. R̄² is the adjusted coefficient of determination, and DW is the Durbin-Watson
statistic for autocorrelation of order 1. Hausd is the Hausman (1978) artificial regression test for measurement errors using our robust distance
instruments. Ɨ The pooled OLS and GMMd are equivalent to an average of the LSDV/GMMd estimations.
c Rm-Rf SMB HML RMW CMA R̄² DW
Sector Fama-French (2015)
1 NoDur LSDV -0.0025 0.8626
t-stat -1.90* 44.49***
GMMd -0.0045 0.9071
t-stat -0.50 6.15***
2 Durbl LSDV 0.0053 1.1039
t-stat 2.96*** 41.28***
GMMd -0.0122 0.8310
t-stat -0.95 3.77***
3 Manuf LSDV 0.0054 1.0702
t-stat 5.48*** 72.66***
GMMd 0.0060 1.0805
t-stat 1.32 17.61***
4 Enrgy LSDV 0.0000 0.9399
t-stat 0.00 29.11***
GMMd 0.0224 1.4261
t-stat 1.69* 7.42***
5 Chems LSDV 0.0011 0.9613
t-stat 0.86 50.37***
GMMd -0.0105 0.8892
t-stat -0.74 3.94***
6 BusEq LSDV 0.0097 1.0921
t-stat 5.97*** 44.84***
GMMd -0.0024 0.7924
t-stat -0.25 4.93***
7 Telcm LSDV -0.0038 0.8616
t-stat -2.27** 34.02***
GMMd -0.0099 0.5862
t-stat -0.97 3.29***
8 Utils LSDV -0.0110 0.7529
t-stat -6.26*** 28.62***
GMMd -0.0116 0.7717
t-stat -1.14 5.45***
9 Shops LSDV 0.0019 0.9510
t-stat 1.32 44.42***
GMMd 0.0007 0.9365
t-stat 0.09 8.12***
10 Hlth LSDV 0.0011 0.8947
t-stat 0.69 36.51***
GMMd -0.0032 0.9662
t-stat -0.24 4.22***
11 Money LSDV 0.0041 1.0494
t-stat 2.97*** 50.68***
GMMd 0.0069 1.0363
t-stat 0.96 8.75***
12 Other LSDV 0.0023 1.0382
t-stat 2.19** 66.93***
GMMd 0.0046 1.1466
t-stat 0.81 13.73***
Pooled Model
OLS -0.0492Ɨ 0.9928Ɨ 0.0216 0.1091 0.1671 0.0748 0.69 1.94
t-stat 1.02 35.82*** 1.67* 6.09*** 8.97*** 2.69***
GMMd -0.0614Ɨ 0.9374Ɨ 0.0906 -0.0016 0.3976 0.0901 0.67 1.93
t-stat -1.03 7.67*** 0.97 -0.01 2.04** 0.47
Hausd -0.0238Ɨ 0.9374Ɨ 0.0906 -0.0016 0.3976 0.0901 0.0542 -0.0670 0.0927 -0.2805 -0.0292 0.69 1.94
t-stat -0.99 11.66*** 1.53 -0.02 3.40*** 0.84 1.24 -1.10 1.03 -2.36** -0.26
Table 3b
Fixed Effects Model, LSDV vs. GMMd estimation methods for the augmented (LIQ) FF six-factor model by FF 12 sectors
Notes: LSDV is the least squares dummy variable method for estimating the alpha and beta for the FF 12 sectors. GMMd is the generalized
method of moments method using our robust distance instruments given in (34) with the Newey-West (1987) HAC variance-covariance
estimator. *** indicates significance at 1%; **, 5%; and *, 10%. R̄² is the adjusted coefficient of determination, and DW is the Durbin-Watson
statistic for autocorrelation of order 1. Hausd is the Hausman (1978) artificial regression test for measurement errors using our robust distance
instruments. Ɨ The pooled OLS and GMMd are equivalent to an average of the LSDV/GMMd estimations.
c Rm-Rf SMB HML RMW CMA LIQ R̄² DW
Sector Fama-French (2015) and Pástor-Stambaugh (2003)
1 NoDur LSDV -0.0025 0.8626
t-stat -1.90* 44.49***
GMMd -0.0045 0.9071
t-stat -0.50 6.15***
2 Durbl LSDV 0.0053 1.1039
t-stat 2.96*** 41.28***
GMMd -0.0122 0.8310
t-stat -0.95 3.77***
3 Manuf LSDV 0.0054 1.0702
t-stat 5.48*** 72.66***
GMMd 0.0060 1.0805
t-stat 1.32 17.61***
4 Enrgy LSDV 0.0000 0.9399
t-stat 0.00 29.11***
GMMd 0.0224 1.4261
t-stat 1.69 7.42***
5 Chems LSDV 0.0011 0.9613
t-stat 0.86 50.37***
GMMd -0.0105 0.8892
t-stat -0.74 3.94***
6 BusEq LSDV 0.0097 1.0921
t-stat 5.97*** 44.84***
GMMd -0.0024 0.7924
t-stat -0.25 4.93***
7 Telcm LSDV -0.0038 0.8616
t-stat -2.27** 34.02***
GMMd -0.0099 0.5862
t-stat -0.97 3.29***
8 Utils LSDV -0.0110 0.7529
t-stat -6.26*** 28.62***
GMMd -0.0116 0.7717
t-stat -1.14 5.45***
9 Shops LSDV 0.0019 0.9510
t-stat 1.32 44.42***
GMMd 0.0007 0.9365
t-stat 0.09 8.12***
10 Hlth LSDV 0.0011 0.8947
t-stat 0.69 36.51***
GMMd -0.0032 0.9662
t-stat -0.24 4.22***
11 Money LSDV 0.0041 1.0494
t-stat 2.97*** 50.68***
GMMd 0.0069 1.0363
t-stat 0.96 8.75***
12 Other LSDV 0.0023 1.0382
t-stat 2.19** 66.93***
GMMd 0.0046 1.1466
t-stat 0.81 13.73***
Pooled Model
OLS -0.0535Ɨ 0.9932Ɨ 0.0218 0.1090 0.1669 0.0748 0.0098 0.69 1.94
t-stat -1.03 35.83*** 1.68* 6.08*** 8.96*** 2.69*** 0.95
GMMd -0.1653Ɨ 1.0148Ɨ -0.0501 0.1697 0.1345 0.2242 0.1152 0.65 1.89
t-stat -1.99** 7.77*** -0.39 0.79 0.52 1.20 1.34
Hausd -0.0256Ɨ 1.0148Ɨ -0.0501 0.1697 0.1345 0.2242 0.1152 -0.0232 0.0740 -0.0790 -0.0179 -0.1632 -0.1105 0.69 1.94
t-stat -0.99 11.76*** -0.61 1.52 0.85 1.86* 2.46** -1.16 0.89 -0.69 -0.11 -1.32 -2.30**
Table 4a
Random Effects Model, OLS vs. GMMd estimation methods for the FF five-factor model by FF 12 sectors
Notes: FGLS is calculated using (23) for the random coefficient model. t-stat is calculated first as a Swamy (1970) weighted
average of the OLS sector t-stats using (22) and then using the estimated Swamy variance-covariance matrix given by (25). GMMd
is the generalized method of moments using our robust distance instruments given in (34) with the Newey-West (1987)
HAC variance-covariance estimator for the random coefficient model. *** indicates significance at 1%; **, 5%; and *, 10%. R̄²
is the adjusted coefficient of determination, and DW is the Durbin-Watson statistic for autocorrelation of order 1.
c Rm-Rf SMB HML RMW CMA R̄² DW
Sector Fama-French (2015)
1 NoDur OLS -0.0990 0.9100 0.0981 -0.0311 0.6417 0.4210 0.78 1.93
t-stat -1.08 41.74*** 3.15*** -0.72 14.36*** 6.32***
GMMd -0.1668 0.8790 0.2595 -0.0114 0.7132 0.4862 0.77 1.92
t-stat -1.33 8.32*** 1.27 -0.03 1.64* 1.68*
2 Durbl OLS -0.4367 1.2217 0.2181 0.5399 0.1692 0.0049 0.71 2.09
t-stat -2.79*** 32.95*** 4.12*** 7.39*** 2.23** 0.04
GMMd -0.4989 1.5280 -0.4635 1.5563 -0.9739 -0.0720 0.37 1.89
t-stat -1.60 6.43*** -0.90 1.45 -0.76 -0.13
3 Manuf OLS -0.1732 1.1424 0.1486 0.1724 0.2519 0.0255 0.89 2.03
t-stat -2.14** 59.48*** 5.42*** 4.56*** 6.40*** 0.43
GMMd -0.4176 1.3581 -0.1281 0.2938 0.0129 0.5937 0.80 1.94
t-stat -3.08*** 12.56*** -0.52 0.69 0.02 2.03**
4 Enrgy OLS 0.0744 0.9163 -0.1986 0.2310 0.0376 0.1657 0.46 1.90
t-stat 0.41 21.17*** -3.21*** 2.71*** 0.42 1.25
GMMd -0.0494 0.8585 0.1868 -1.2231 1.4670 0.8104 -0.12 1.70
t-stat -0.12 2.66*** 0.32 -1.45 1.28 1.13
5 Chems OLS -0.1995 1.0032 -0.0332 0.0266 0.4751 0.3349 0.79 2.04
t-stat -2.07** 43.83*** -1.02 0.59 10.13*** 4.79***
GMMd -0.3843 1.0222 0.0105 -0.1575 0.5345 0.9243 0.76 1.93
t-stat -2.33** 6.51*** 0.05 -0.52 1.29 2.62***
6 BusEq OLS 0.3866 1.0386 0.0854 -0.4199 -0.4180 -0.4395 0.83 2.01
t-stat 3.16*** 35.83*** 2.06** -7.35*** -7.04*** -4.97***
GMMd 0.4964 1.0529 -0.0379 0.0425 -0.7035 -0.9481 0.81 2.04
t-stat 2.65*** 7.21*** -0.14 0.09 -1.32 -2.27**
7 Telcm OLS 0.2114 0.8328 -0.2658 0.1101 -0.3185 0.0503 0.61 1.99
t-stat 1.59 26.40*** -5.90*** 1.77* -4.93*** 0.52
GMMd 0.6862 0.5372 -0.1448 0.5532 -0.3034 -1.3553 0.42 1.92
t-stat 3.34*** 3.84*** -0.49 1.25 -0.55 -2.73***
8 Utils OLS 0.0163 0.6564 -0.1648 0.4404 -0.0257 0.0717 0.45 1.95
t-stat 0.12 20.33*** -3.57*** 6.92*** -0.39 0.73
GMMd -0.0456 0.3437 0.3370 -0.2393 0.8909 0.4089 0.03 1.81
t-stat -0.16 1.36 0.75 -0.37 1.09 0.61
9 Shops OLS -0.1053 1.0279 0.2837 0.0045 0.5588 0.0749 0.80 1.85
t-stat -0.98 40.33*** 7.80*** 0.09 10.70*** 0.96
GMMd -0.1076 1.0142 0.3085 0.3837 0.4526 -0.2201 0.77 1.76
t-stat -0.49 5.49*** 1.23 0.76 0.78 -0.61
10 Hlth OLS 0.2070 0.8685 -0.1874 -0.4582 0.3768 0.3656 0.65 2.12
t-stat 1.58 28.02*** -4.24*** -7.50*** 5.93*** 3.86***
GMMd -0.1778 0.8985 0.0447 -0.8786 0.8722 1.3019 0.56 2.04
t-stat -0.97 5.89*** 0.14 -1.33 1.20 2.94***
11 Money OLS -0.1489 1.1757 -0.0308 0.5757 0.1178 -0.1812 0.84 1.89
t-stat -1.48 49.24*** -0.90 12.24*** 2.41** -2.48**
GMMd -0.0493 1.1164 -0.1781 0.6553 0.0337 -0.3112 0.83 1.78
t-stat -0.33 8.14*** -0.81 1.49 0.07 -1.06
12 Other OLS -0.3233 1.1206 0.3063 0.1181 0.1386 0.0033 0.91 1.97
t-stat -4.30*** 62.90*** 12.04*** 3.36*** 3.80*** 0.06
GMMd -0.4614 1.2232 0.1731 0.1156 0.2502 0.2327 0.89 1.92
t-stat -4.45*** 14.07*** 1.04 0.47 0.81 0.90
Random Effects Model: Swamy's weighted average
FGLS -0.0516 0.9928 0.0217 0.1093 0.1674 0.0741 0.73 1.98
t-stat (weighted avg) 0.62 38.56*** 1.34 1.89* 3.60*** 1.27
t-stat (Swamy) -0.75 21*** 0.38 1.16 1.79* 1.08
GMMd -0.0956 0.9859 0.0312 0.0909 0.2704 0.1544 0.58 1.89
t-stat (weighted avg) -0.62 6.87*** 0.17 0.21 0.47 0.42
t-stat (Swamy) -0.91 10.51*** 0.46 0.44 1.33 0.69
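The summary rows above combine the twelve sector-level estimates into a single coefficient. As a rough illustration only, the sketch below takes the equal-weight mean of the twelve OLS market betas reported in Table 4a. Swamy's (1970) estimator is a matrix-weighted average, so the equal weighting used here is a simplifying assumption of ours rather than the estimator applied in the paper; we show it only because, for the market beta, it reproduces the reported FGLS value to four decimals. The variable names are hypothetical.

```python
from statistics import mean

# OLS market betas (Rm-Rf column) for the 12 sectors, copied from Table 4a.
ols_market_betas = [
    0.9100,  # NoDur
    1.2217,  # Durbl
    1.1424,  # Manuf
    0.9163,  # Enrgy
    1.0032,  # Chems
    1.0386,  # BusEq
    0.8328,  # Telcm
    0.6564,  # Utils
    1.0279,  # Shops
    0.8685,  # Hlth
    1.1757,  # Money
    1.1206,  # Other
]

# Equal-weight average: a simplification of Swamy's matrix-weighted average.
equal_weight_beta = mean(ols_market_betas)

REPORTED_FGLS_BETA = 0.9928  # FGLS row, Rm-Rf column, Table 4a

print(f"equal-weight mean = {equal_weight_beta:.4f}, reported FGLS = {REPORTED_FGLS_BETA:.4f}")
```

The agreement should not be expected for every coefficient: the same calculation applied to the intercepts gives about -0.0492, against a reported FGLS intercept of -0.0516.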
Table 4b
Random Effects Model, OLS vs GMMd estimation methods for the augmented (LIQ) FF six-factor model by FF 12 sectors
Notes: FGLS is calculated using (23) for the random coefficient model. t-stat is calculated first as a Swamy (1970) weighted
average of the OLS sector t-stats using (22) and then using the estimated Swamy variance-covariance matrix given by (25).
GMMd is the generalized method of moments using our robust distance instruments given in (34) with the Newey-West (1987)
HAC variance-covariance estimator for the random coefficient model. *** indicates significance at 1%; **, 5%; and *, 10%. R̄² is the adjusted coefficient of determination, and DW is the Durbin-Watson statistic for autocorrelation of order 1.
c Rm-Rf SMB HML RMW CMA LIQ R̄² DW
Sector Fama-French (2015) and Pástor-Stambaugh (2003)
1 NoDur OLS -0.1115 0.9111 0.0985 -0.0315 0.6411 0.4212 0.0281 0.78 1.92
t-stat -1.20 41.76*** 3.16*** -0.73 14.35*** 6.33*** 1.14
GMMd -0.3596 0.9800 0.0522 0.1747 0.3783 0.7256 0.2648 0.67 1.82
t-stat -1.48 5.26*** 0.16 0.42 0.65 1.57 1.55
2 Durbl OLS -0.4743 1.2249 0.2192 0.5387 0.1675 0.0053 0.0845 0.72 2.02
t-stat -3.02*** 33.10*** 4.15*** 7.40*** 2.21** 0.05 2.02**
GMMd -1.0225 1.7739 -1.0884 2.1724 -2.0290 0.4932 0.8474 -0.74 1.78
t-stat -1.32 3.18*** -0.96 1.24 -0.89 0.40 1.38
3 Manuf OLS -0.1877 1.1436 0.1490 0.1719 0.2513 0.0256 0.0328 0.89 2.03
t-stat -2.30** 59.56*** 5.44*** 4.55*** 6.39*** 0.44 1.51
GMMd -0.5326 1.3985 -0.3152 0.5242 -0.3415 0.6645 0.2637 0.64 1.85
t-stat -1.72* 6.27*** -0.72 0.77 -0.39 1.26 1.10
4 Enrgy OLS 0.0467 0.9187 -0.1978 0.2301 0.0363 0.1660 0.0624 0.47 1.87
t-stat 0.25 21.23*** -3.20*** 2.70*** 0.41 1.26 1.27
GMMd 0.1995 0.7029 0.4206 -1.4060 1.8204 0.4497 -0.2548 0.00 1.68
t-stat 0.30 1.44 0.42 -1.10 1.00 0.42 -0.56
5 Chems OLS -0.2052 1.0036 -0.0331 0.0264 0.4748 0.3349 0.0128 0.79 2.02
t-stat -2.11** 43.78*** -1.01 0.59 10.12*** 4.79*** 0.49
GMMd -0.3527 0.9971 0.0431 -0.1892 0.5884 0.8752 -0.0228 0.76 1.96
t-stat -1.57 5.10*** 0.14 -0.50 1.03 2.15** -0.15
6 BusEq OLS 0.3762 1.0395 0.0857 -0.4202 -0.4185 -0.4394 0.0234 0.83 1.99
t-stat 3.05*** 35.82*** 2.07** -7.36*** -7.04*** -4.96*** 0.71
GMMd 0.3000 1.1705 -0.2669 0.2753 -1.0946 -0.7021 0.2530 0.71 1.98
t-stat 0.93 5.70*** -0.59 0.42 -1.27 -1.35 1.09
7 Telcm OLS 0.1978 0.8339 -0.2654 0.1096 -0.3192 0.0504 0.0305 0.61 1.96
t-stat 1.48 26.41*** -5.89*** 1.76* -4.94*** 0.52 0.85
GMMd 0.7119 0.5282 -0.0605 0.4142 -0.1145 -1.3413 -0.0960 0.37 1.87
t-stat 2.43** 2.75*** -0.14 0.70 -0.14 -2.21** -0.48
8 Utils OLS -0.0146 0.6590 -0.1639 0.4394 -0.0272 0.0721 0.0695 0.48 1.94
t-stat -0.11 20.44*** -3.56*** 6.92*** -0.41 0.73 1.90*
GMMd -0.1939 0.4174 0.1457 -0.0325 0.5525 0.5663 0.2407 0.24 1.86
t-stat -0.64 1.59 0.30 -0.05 0.61 0.93 1.04
9 Shops OLS -0.1227 1.0294 0.2842 0.0040 0.5579 0.0751 0.0391 0.80 1.90
t-stat -1.13 40.38*** 7.82*** 0.08 10.69*** 0.97 1.35
GMMd -0.3114 1.1150 0.0924 0.5706 0.1093 0.0275 0.2922 0.66 1.81
t-stat -0.88 4.03*** 0.22 1.03 0.14 0.05 1.44
10 Hlth OLS 0.2469 0.8651 -0.1886 -0.4569 0.3787 0.3651 -0.0899 0.67 2.09
t-stat 1.89* 28.03*** -4.28*** -7.52*** 5.99*** 3.88*** -2.57**
GMMd -0.0743 0.8456 0.1457 -0.9557 1.0233 1.1712 -0.1390 0.54 2.03
t-stat -0.24 4.09*** 0.29 -1.22 1.01 2.34** -0.49
11 Money OLS -0.1076 1.1723 -0.0320 0.5770 0.1198 -0.1817 -0.0930 0.84 1.93
t-stat -1.07 49.53*** -0.95 12.39*** 2.47** -2.52** -3.47***
GMMd 0.0687 1.0551 0.0276 0.3810 0.4393 -0.3959 -0.2368 0.79 1.91
t-stat 0.25 4.54*** 0.07 0.72 0.60 -0.87 -0.92
12 Other OLS -0.2866 1.1175 0.3052 0.1193 0.1404 0.0029 -0.0828 0.91 2.02
t-stat -3.84*** 63.58*** 12.17*** 3.45*** 3.90*** 0.05 -4.16***
GMMd -0.4166 1.1934 0.2026 0.1076 0.2820 0.1567 -0.0303 0.90 1.92
t-stat -3.20*** 11.17*** 0.97 0.40 0.75 0.59 -0.32
Random Effects Model: Swamy's weighted average
FGLS -0.0544 0.9934 0.0217 0.1091 0.1671 0.0750 0.0093 0.73 1.97
t-stat (weighted avg) 1.11 38.38*** 1.30 2.19** 3.35*** 0.81 0.65
t-stat (Swamy) -0.79 21.08*** 0.38 1.16 1.78* 1.10 0.52
GMMd -0.1595 1.0140 -0.0504 0.1720 0.1338 0.2227 0.1170 0.52 1.87
t-stat (weighted avg) -0.45 4.58*** 0.01 0.25 0.25 0.44 0.40
t-stat (Swamy) -1.24 9.47*** -0.46 0.68 0.47 1.07 1.31
Table 5
Testing fixed effects versus random effects models
                5 factors                  6 factors
          Pool/FE     GMMd/FE        Pool/FE     GMMd/FE
F test    29.88       18.56          29.64       16.79
          OLS RE/FE   GMMd RE/FE     OLS RE/FE   GMMd RE/FE
H test    0.00074     -2.04          0.0031      -0.0053
Notes: F test is a Fisher F test for testing the pooled versus the fixed effects models.
Pool/FE designates the pooled OLS versus LSDV fixed effects models. GMMd/FE
designates the pooled GMMd estimation method versus the fixed effects model
estimated via GMMd. H test is the Hausman test for testing fixed versus random
effects models. OLS RE/FE designates the FGLS for the random effects versus
the LSDV models. GMMd RE/FE designates the GMMd estimation method for the
random effects versus the fixed effects models.
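The decision rules implied by Table 5 can be sketched as follows, using the critical values quoted in footnotes 37 and 39 (1.54 for the F test; 11.07 and 12.59 for the H test, which match the 5% chi-squared quantiles with five and six degrees of freedom). This is a minimal illustration and not the code used for the paper; the dictionary layout and function names are hypothetical.

```python
# Critical values as quoted in the paper's footnotes.
F_CRITICAL = 1.54
CHI2_CRITICAL = {5: 11.07, 6: 12.59}  # keyed by number of risk factors

# Test statistics from Table 5, keyed by (number of factors, comparison).
f_stats = {
    (5, "Pool/FE"): 29.88,
    (5, "GMMd/FE"): 18.56,
    (6, "Pool/FE"): 29.64,
    (6, "GMMd/FE"): 16.79,
}
h_stats = {
    (5, "OLS RE/FE"): 0.00074,
    (5, "GMMd RE/FE"): -2.04,
    (6, "OLS RE/FE"): 0.0031,
    (6, "GMMd RE/FE"): -0.0053,
}


def rejects_pooled(f_stat):
    """True if the F test rejects the pooled model in favour of fixed effects."""
    return f_stat > F_CRITICAL


def rejects_random_effects(h_stat, n_factors):
    """True if the Hausman test rejects random effects in favour of fixed effects."""
    return h_stat > CHI2_CRITICAL[n_factors]


for (k, label), stat in f_stats.items():
    print(f"{k} factors, {label}: F = {stat}, reject pooled: {rejects_pooled(stat)}")

for (k, label), stat in h_stats.items():
    print(f"{k} factors, {label}: H = {stat}, reject RE: {rejects_random_effects(stat, k)}")
```

Every F statistic exceeds its critical value while no H statistic does, which is in line with the paper's preference for the random effects model. The negative H values are treated here simply as non-rejections; a negative Hausman statistic can arise in finite samples when the estimated difference of covariance matrices is not positive definite.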