does index arbitrage distort the market reaction to shocks?tensity of arbitrage allows to separately...
TRANSCRIPT
![Page 1: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/1.jpg)
Does Index Arbitrage Distort the Market Reaction to
Shocks?∗
Stanislav Anatolyev
CERGE-EI & NES
Sergei Seleznev
INECO Capital Ltd
Veronika Selezneva†
CERGE-EI
January 27, 2020
Abstract
We show that ETF arbitrage distorts the market reaction to fundamental shocks.
We con�rm this hypothesis by creating a new measure of the intensity of arbitrage
transactions at the individual stock level and using an event study analysis to estimate
the market reaction to economic shocks. Our measure of the intensity of arbitrage
is the probability of simultaneous trading of ETF shares with shares of underlying
stocks estimated using high frequency data. The new measure accounts for statistical
arbitrage, passive investment strategies, and netting of arbitrage positions over the day,
which the existing measures cannot do.
Keywords: high-frequency data, stock market, ETF, arbitrage intensity, oil shock, market
e�ciency
JEL classi�cations: G12, G14, G23, Q43
∗We thank Lubos Pastor, Carol Alexander, and seminar participants at the University of Sussex, and con-ference participants of NASMES-2019, CEMA-2019, and ESEM-2019 for helpful comments and suggestions.Grant GACR 17-27567S from the Czech Science Foundation is gratefully acknowledged.†Corresponding author, [email protected].
1
![Page 2: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/2.jpg)
1 Introduction
The recent growth of exchange traded funds (ETFs) has intensi�ed index trading by opening
access to the market for a broader set of market participants. Although the bene�ts of
increased participation are apparent, the side e�ects of the dramatic change in the market
structure are yet to be fully assessed.
Recent research on equity ETFs examines whether ETF arbitrage can damage the un-
derlying markets. The literature faces two main challenges: how to measure the intensity
of arbitrage transactions and how to assess distortions in the underlying markets. In this
paper, we innovate along both dimensions. We show that ETF arbitrage distorts the market
reaction to fundamental shocks. In order to do that, we create a new measure of arbitrage
intensity at the individual stock level.
A standard way to measure the intensity of ETF arbitrage at the stock level is to use
changes in ETF holdings. Ben-David et al. (2018) construct an ETF ownership measure
by multiplying assets under management by the index weight of the security.1 A similar
approach is taken in Brown et al. (2018), Israeli et al. (2017), Agarwal et al. (2018), Saglam
et al. (2019), and Brogaard et al. (2019). Currently ETF redemptions and creations data
are available even at daily frequency. However, changes in ETF positions may signi�cantly
underestimate arbitrage intensity. If positive and negative price deviations are equally likely
to occur, the net accumulated position of arbitrageurs over the day can be rather small
despite intensive arbitrage.2 Moreover, ETF arbitrage can also be performed by statistical
arbitrageurs who close their positions at market prices and thus do not participate in the
creation/redemption process. To the best of our knowledge, only Da and Shive (2018) take
a di�erent approach and use ETF turnover to measure arbitrage intensity, but only at the
ETF level.
We construct a direct measure of arbitrage intensity. We propose to estimate the prob-
ability of simultaneous trading of ETF shares with shares of underlying stocks using high
1This approach might not work for a sizable proportion of funds that replicate their target index byinvesting in a representative basket of securities rather than the entire index basket. Brogaard et al. (2019)document that at least 22% of ETFs replicate their target index, including six of the ten largest ETFs as of2018.
2Evans et al. (2019) also argue that the time period between the original mispricing/arbitrage event andthe corresponding creation event may be as large as six days, as authorized participants have incentives todelay ETF creation.
2
![Page 3: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/3.jpg)
frequency data. We are motivated by the mechanics of ETF arbitrage. All arbitrageurs, ir-
respective of their type, respond to a price deviation by simultaneously trading ETF shares
and the shares of underlying securities.3 Simultaneity is crucial to avoid taking the unneces-
sary risks of adverse market movements. Our measure can be computed over any interval of
the day and can account for statistical arbitrage. Thus, our approach delivers a more precise
measure of the intensity of arbitrage transactions compared to existing approaches in the
literature.
Another challenge is to document distortions in the underlying markets induced by ETFs.
The most explored channel is the propagation of new liquidity shocks, brought to the market
by ETFs' clientele and transmitted to underlying markets by ETF arbitrage. As a result, the
underlying markets can be characterized by excess volatility as shown by Ben-David et al.
(2018), and excess comovement (see Glosten et al. (2016), Israeli et al. (2017), Staer and
Sottile (2018), Da and Shive (2018)); �nally, Brown et al. (2018) show that ETF creations
and redemptions predict returns for both the underlying securities and the ETFs themselves.
The main di�culty in this line of research is to show that the volatility and/or comovement
of the underlying returns are not driven by the market reaction to market-wide news. A
few papers measure the impact of ETFs on price e�ciency on the underlying markets by
investigating the market response to earnings announcements. Israeli et al. (2017) document
that an increase in ETF ownership is associated with lower return responsiveness to earnings
and higher return synchronicity. In contrast, Glosten et al. (2016) �nd that an increase in
ETF ownership brings quarterly stock returns closer to earnings realized over the same
period.
We aim to document the distortion in the market reaction to fundamental shocks due
to ETF arbitrage. ETFs facilitate price discovery, hence when new information arrives it
can be incorporated into the ETF price �rst. As the ETF price deviates from the price
of the underlying basket, it triggers arbitrage transactions. By purchasing or selling the
underlying securities, arbitrageurs transmit the initial shock to the individual securities.
This mechanical pricing pressure can be substantial if arbitrage is intense while the liquidity
of the underlying market is low. As a result, the stock price response to the shock no longer
3An o�setting operation can be performed later, once the price di�erence has been eliminated. Au-thorized participants can exchange the baskets for units of ETF shares in an in-kind transaction via acreation/redemption mechanism.
3
![Page 4: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/4.jpg)
re�ects the true fundamental exposure.
We use event study methodology to identify exogenous fundamental oil shocks. Oil shocks
have an important advantage over standard macroeconomic news shocks by containing large
idiosyncratic component. An individual �rm's exposure to oil shocks depends mainly on the
nature of its business. Oil betas � betas from regressing a stock's return on the oil return
� of most non-oil related companies are naturally expected to be negative (at least in the
period before the shale boom). Intuitively, an increase in oil prices will raise input costs
for most businesses, as well as force consumers to spend more money on gasoline and less
on everything else. Although a systematic component may be present in the oil shocks,
the idiosyncrasy is likely to dominate. We use oil inventory announcements to identify oil
shocks. Oil inventory reports are released by the Department of Energy on a weekly basis
at a pre-speci�ed time and strongly a�ect the price of oil, thus providing us with a strong
and frequent signal.4
We use high frequency data on stocks and oil futures to estimate stock market sensitivity
to oil shocks. To measure arbitrage intensity, we sample data at a one-second frequency, and
estimate the probability of simultaneous trading of ETF shares with shares of underlying
stocks. Finally, we estimate a linear panel regression of stocks' sensitivity to oil shocks on
the probability of simultaneous trading. In our regression analysis, we include stock-level
�xed e�ects, sector by year �xed e�ects, various measures of the intensity of overall trading,
measures of liquidity and price impact, volatility, and variables that capture time-varying
�rm value and size. Our sample of �rms consists of all non-energy �rms comprising the S&P
500 index and covers the period from 2010 to 2016. We consider the SPDR S&P 500 ETF
(SPY) that tracks the S&P 500 index and represents the largest ETF in the world currently
managing $280 billion in assets under management.
Our results con�rm the main hypothesis that ETF arbitrage distorts the market reaction
to shocks. The �rms that are traded more often with SPY display a stronger reaction to oil
shocks, regardless of their fundamental exposure. The e�ect is economically and statistically
signi�cant; an increase in the probability of simultaneous trading from the �rst to the third
quantile is associated with an increase in the oil beta by 40% of the median oil beta.
One way to further test our mechanism is to show that the distortion is greater in ex-
4See Anatolyev et al. (2018) for institutional details and for references to other papers that use oilinventory announcements to identify oil shocks.
4
![Page 5: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/5.jpg)
actly those �rms where theory would predict that the e�ect of arbitrage transactions on
prices is likely to be stronger. A natural characteristic of a stock to explore is, of course,
liquidity. However, the presence of statistical arbitrage complicates the problem, as the same
characteristic can have opposite e�ects on the intensity of arbitrage transactions and on the
magnitude of distortion that the arbitrage transactions cause. Our direct measure of the in-
tensity of arbitrage allows to separately measure these two e�ects. We show that our results
are consistent with the arbitrage mechanism: the less liquid the stock is, the less likely it
will be included in the arbitrage portfolio, but if included, the distortion will be larger.
Although the percent increase in oil betas due to ETF arbitrage activity is large, one
can argue that the absolute increase in oil betas may not be large enough to be considered
meaningful for the overall market and have serious implications. It should be noted that our
empirical setup delivers the most conservative estimate of the e�ects of ETF arbitrage on the
market reaction to shocks, large part of the e�ect is picked up by our controls, especially by
the sector by year �xed e�ects. Moreover, there is nothing speci�c about oil shocks used in
our analysis, we focus on oil inventory news for identi�cation purposes only. While a median
exposure to oil shocks is relatively low, a median exposure to other fundamental shocks can
be much larger, and so can be the distortion.
There are two main challenges with our approach that open up our results to alternative
interpretations. First, there is a clear omitted variable in our regressions which is the true
oil beta. Second, our ability to clean our measure of arbitrage intensity from the intensity of
overall trading may be considered insu�cient. As a result, the positive correlation between
the estimated oil betas and the probability of simultaneous trading that we document can
potentially be driven by a positive relationship between the true oil betas and the intensity
of overall trading. Although in our regression analysis we do include a large number of
controls to proxy for the true oil beta and its variation over time, and we use a battery
of di�erent measures to control for the overall intensity of trading, one may not be fully
convinced, especially considering that the positive link between the volume of trade and stock
returns around public announcements can easily be generated by an asset pricing model with
heterogeneous prior beliefs. Intuitively, the larger is the announcement return, the larger
must be the revisions of agents beliefs, and thus the larger is the trading volume triggered
by rebalancing activity. We carefully address this important alternative interpretation of
our �ndings and run a number of tests to show that it is unlikely to drive our results.
5
![Page 6: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/6.jpg)
One particular advantage of our approach is our usage of oil shocks and oil betas (along
with a clean identi�cation of oil shocks) rather than macroeconomic shocks and market
betas. Oil exposure of most non-energy companies is likely to be negative (or zero). The
presence of negative oil betas allows us to directly test our mechanism against this alternative
explanation. If there is a link between true oil betas and the intensity of overall trading
it should be a negative relationship for a sizeable subset of the data, while our arbitrage
mechanism predicts a positive relationship uniformly.5 It should also be noted that we
calculate the probability of simultaneous trading over the entire year, and not just over the
short windows of time following oil inventory announcements.
Recent evidence of stale ETF pricing and correction of mispricings through quote adjust-
ment, not through arbitrage trading documented by Box et al. (2019), also raises a serious
concern against plausibility of our mechanism. However, we show that though some smaller
ETFs may indeed have negligible e�ects on the underlying markets, that is clearly not the
case for SPY. We show that SPY and the underlying portfolio react simultaneously to our oil
inventory announcements even at a one-second frequency, while Box et al. (2019) work with
a one-minute frequency. We also �nd that SPY overreacts to news relative to the underlying
portfolio, and thus opens up an arbitrage opportunity in the direction that is consistent
with our mechanism (arbitrageurs would push the prices of the underlying securities in the
direction of the initial shock). The behavior of the directional volume is also consistent with
arbitrage activity.
Overall, our �ndings are robust and can stand against a wide range of reasonable alter-
natives, we list and carefully discuss the most plausible alternative interpretations for our
results. As an additional robustness check, we perform an ETF-level analysis using non-
energy U.S. equity ETFs. Our results indicate that more liquid ETFs, which are more likely
to be involved in intensive arbitrage transactions, display a stronger reaction to oil shocks.
Hence, our ETF-level analysis con�rms our previous results that index trading distorts the
market reaction to fundamental oil shocks.
We also �nd that the distortion is not short-lived. The long-lived distortions may seem to
contradict our mechanism, as one would expect fundamental traders to (eventually) correct
the mispricings. Although it is not unusual in the �nance literature to document prolonged
mispricings as arbitrage capital can be limited and slow moving, recent research highlights
5This test would not be possible, if we were to use market betas instead of oil betas.
6
![Page 7: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/7.jpg)
negative e�ects of ETFs on trading and information acquisition behavior of informed traders.6
Bhattacharya and O'Hara (2018) show that when ETFs are available for trading, informed
agents optimally choose to herd on systematic part of information. As a result, idiosyncratic
component is not re�ected in asset prices. This is consistent with our �ndings that idiosyn-
cratic variation in the oil betas gets partially lost due to the presence of ETFs and ETF
arbitrage.7
Our main contribution is to show that index arbitrage distorts the market reaction to
fundamental shocks. Relatively little empirical evidence exists on whether fundamental
shocks are ampli�ed or distorted by arbitrage activity. One exception is Hong et al. (2012),
who show that arbitrage activity ampli�es the market reaction to earnings surprises, but
they consider a very di�erent economic mechanism: that stock prices overreact to good news
due to short covering. Weller (2018) documents negative e�ects of algorithmic trading on
fundamental information acquisition; see also Lee and Watts (2018). Improved screening of
informed order �ow may transfer prospective rents away from information acquires, however
this mechanism is not directly tested in the paper. In a closely related paper, Shim (2018)
aims to document the distortion in the market reaction to factor information caused by ETF
arbitrage. Our clean identi�cation of fundamental shocks and a direct measure of arbitrage
intensity allows us to solve a number of identi�cation issues present in Shim (2018).8 It
is also worth emphasizing, that the distortion in the market reaction to oil shocks that we
document represents a new source of an increase in the volatility of underlying assets, which
6Related is Weller (2018)who documents negative e�ects of algorithmic trading on information acquisition,see also Lee and Watts (2018).
7Our results show that the non-energy companies in the S&P 500 index overreact to oil shocks. Now, ifthe stock market index correctly responds to oil shocks, then it mechanically implies underreaction of theenergy �rms in the index to oil shocks. As a result, idiosyncratic variation in the betas gets partially lostdue to the presence of ETFs and ETF arbitrage, in line with Bhattacharya and O'Hara (2018).
8Shim (2018) uses ETF betas � betas from regressing a constituent stock's return on the ETF's returnusing daily data (basically using market betas as ETF betas are likely to be highly correlated with marketbetas). Usage of ETF betas makes it challenging to identify fundamental news arrivals and separate themfrom liquidity shocks that could drive ETF returns. Therefore, the relationship between the intensity ofarbitrage and ETF betas cannot be easily interpreted as the distortion in the market reaction to fundamentalinformation. Moreover, the ETF price reaction to news itself can be distorted, and thus, relative responsesbecome even harder to interpret. Finally, Shim (2018) uses a measure of arbitrage sensitivity, which representsa combination of index weight and price impact sensitivity, and thus similar to other existing measures itdoes not account for statistical arbitrage. It is also impossible to separately measure the e�ect of liquidityon the intensity of arbitrage from its e�ect on the magnitude of distortion, while our approach allows us todo that.
7
![Page 8: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/8.jpg)
is unrelated to liquidity shocks introduced by ETFs' clientele, and thus our results can be
viewed as complimentary to Ben-David et al. (2018).
Our second contribution to the literature is to construct a direct measure of the intensity
of arbitrage. Our measure is model free and it can be easily computed for an arbitrary set of
assets linked by arbitrage. In contrast to other popular measures based on a cointegration
assumption, our approach does not require long data series needed to estimate a long-term
relationship between the assets. Our measure does not require access to regulatory transac-
tions data, typically needed to construct current measures of algorithmic trading. To provide
additional empirical evidence, we perform several tests of our measure. Our approach is to
identify times when arbitrage intensity is known to change and to test if our measure captures
these changes. As a �rst preliminary test, we compare our measure with the popular mea-
sures of the intensity of algorithmic trading that pick up variation in the intensity of market
making activity. We show that our measure has a strong and robust negative correlation
with the intensity of market making, which is consistent with other studies that document
that market making activity reduces pro�tability of arbitrage.9 Second, we analyze the S&P
500 index additions and deletions. We show that the probability of simultaneous trading
increases when a �rms is added to the index, and decreases when it is deleted, con�rming our
interpretation that the probability of simultaneous trading re�ects changes in index trading
intensity. Third, we analyze SPY creations and redemptions and show a positive link be-
tween changes in holdings and our measure. Finally, we use mutual fund �re-sales to identify
exogenous variations in prices and document a positive link with our measure.
From a practical perspective, our results have important implications for risk manage-
ment. We document distortions in the market reaction to oil shocks; we can naturally expect
that the stock market reaction to other fundamental shocks is also distorted. Hence, stan-
dard measures of risk exposures have to be adjusted to remove the bias induced by the index
arbitrage. Our �ndings also caution against estimation of discontinuous betas recently put
forward by Todorov and Bollerslev (2010), Bollerslev et al. (2013), Bollerslev et al. (2016).
Although large market movements (either triggered by arrivals of news or related to other
sources of market discontinuity) should provide more information about changes in funda-
mentals and should be less subject to noise in the price formation process, thus, making
estimation of betas more precise, we argue that such betas are likely to be signi�cantly
9See Chaboud et al. (2014) and Brogaard et al. (2014)
8
![Page 9: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/9.jpg)
distorted by index arbitrage. More generally, our results imply that any event study analy-
sis has to remove the arbitrage bias before the results can be interpreted as re�ecting true
fundamental exposures.
This paper is organized as follows. Section 2 illustrates the main mechanism and builds
the foundation for our empirical framework. Section 3 outlines our empirical methodology,
while section 4 describes our data. Our main results are presented in Section 5, where
we also discuss other alternative explanations for our �ndings and discuss the magnitude
of our e�ect. Subsequently, in Section 6 we provide empirical evidence that our measure
captures the intensity of arbitrage transactions. We also carefully compare our measure of
the intensity of arbitrage transactions with existing measures in the literature. We perform
an additional exercise using ETF-level data in Section 7. Finally, in Section 8 we discuss our
results and conclude.
2 Hypothesis Development
The main testable hypothesis of this paper is that ETF arbitrage distorts the market reaction
to fundamental shocks. This section illustrates the mechanism and builds a foundation for
our empirical framework. We start by brie�y outlining the mechanics of ETF arbitrage.
Two types of ETF arbitrage
ETF arbitrage can be conducted in two ways: via redemption/creation of ETF shares by
Authorized Participants or via statistical arbitrage. Both strategies assume the same initial
response, but di�er in the way the o�setting operation is performed.
Authorized Participants can pro�t from riskless arbitrage. Consider a case in which an
ETF price exceeds the weighted price of the basket of underlying securities. An AP can
simultaneously purchase the constituents of the index that the ETF holds in the exact same
weights as the index, and sell the ETF shares. Once the market closes, an in-kind transaction
occurs: The AP delivers the stocks to the sponsor of the fund and in return he receives a
block of ETF shares, which covers his previous short position. This creation or redemption
mechanism allows the AP to conduct riskless arbitrage. Typically, only large banks like
Goldman Sachs and Bank of America can become APs. Moreover, only the elimination
9
![Page 10: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/10.jpg)
of su�ciently large price deviations should be pro�table given the large transaction costs
associated with a purchase of the entire basket of securities.
However, if an arbitrageur is eager to bear a certain amount of risk, arbitrage can take a
di�erent form. Statistical arbitrage also prescribes a simultaneous purchase(sale) of a subset
of underlying securities and a sale(purchase) of ETF shares in response to a price deviation.
In contrast to the previous strategy, however, an o�setting transaction is conducted as a
typical market transaction once the price deviation has been eliminated. Although statistical
arbitrage involves risk, it allows one to economize on transaction costs: An arbitrageur does
not have to acquire the entire basket of underlying securities. A subset of the stocks can do
the job as long as it tracks the index su�ciently well.
ETF arbitrage and the market reaction to shocks
How could ETF arbitrage distort the market reaction to shocks? We provide a simple
illustration of the mechanism. Once introduced, ETFs facilitate price discovery. Imagine that
new information arrives and becomes incorporated into the ETF price �rst.10 When a shock
arrives and the price of the ETF deviates from the price of the underlying basket, arbitrageurs
start trading to eliminate the discrepancy. By purchasing or selling the underlying securities,
arbitrageurs transmit the initial shock to the individual securities. This mechanical pricing
pressure can be substantial if arbitrage is intensive while the liquidity of the underlying
security is low. As a result, the stock price response to the shock no longer re�ects the true
fundamental exposure. The distortion, however, may be transitory if, over time, fundamental
traders eliminate the induced mispricing.
Our goal in this paper is to create a measure of the intensity of arbitrage transactions
and document a link between arbitrage transactions and distortions in the market reaction
to fundamental news.
10We would like to emphasize that we do not assume that all information is �rst incorporated in the ETFprice. However, it is reasonable to assume that the ETF market plays at least a partial role in the pricediscovery process.
10
![Page 11: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/11.jpg)
Stocks characteristics and the magnitude of distortion
Can we predict which stocks will experience the largest distortions associated with ETF
arbitrage? In the presence of statistical arbitrage, we have to consider two channels.
First, a certain stock characteristic can in�uence arbitrage intensity. Not all stocks should
necessarily be included into the arbitrage transaction. The portfolio of chosen �rms must
track the index well enough to avoid the risk of substantial price deviations at the time
of o�setting the transaction. However, some �rms can be omitted to economize on costs.
Firms with larger index weight, volatility and liquidity are more likely to be included in
arbitrage transactions. Indeed, when index weight is large, any variation in the stock price
leads to a substantial variation in the index price. Similarly, the more volatile the stock
price is, the more likely such a large deviation occurs. Finally, liquidity or the price impact
function determines to what extent the price of the security will be a�ected by arbitrageurs'
transactions, and thus a�ects the pro�tability of arbitrage.
At the same time, for a given intensity of arbitrage transactions, the same stock charac-
teristic can determine the magnitude of the market distortion. The less liquid the stock is,
the larger the a�ect of arbitrage transactions on its price will be, and thus we can expect a
larger distortion in the stock's response to shocks.
The total e�ect of some stock characteristics on the size of market distortion can be am-
biguous. Indeed, the less liquid the stock is, the less likely it will be included in the portfolio.
Hence, arbitrage will be less intensive, but the distortion caused by arbitrage transactions
will be larger. Without a direct measure of arbitrage intensity, it is impossible to separate
these two opposing e�ects of liquidity. However, because we directly measure the intensity
of arbitrage transactions, we can separately measure the e�ect of a stock characteristic on
the intensity of arbitrage transactions from its e�ect on the magnitude of distortion that
these arbitrage transactions cause.
3 Methodology
In this section we outline our empirical strategy. We aim to estimate the e�ect of ETF
arbitrage on the market response to fundamental shocks. Our approach involves three steps.
First, we construct a measure of the intensity of ETF arbitrage transactions. Second, we
11
![Page 12: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/12.jpg)
use event study analysis to identify oil shocks and estimate the stock market reaction to
these shocks. Finally, we use regression analysis to document the relationship between the
intensity of arbitrage transactions and the market reaction to shocks. In what follows, we
describe each step in detail.
3.1 Measuring the intensity of arbitrage transactions
Our measure is based on the following observation. Regardless of whether arbitrage is
conducted by Authorized Participants or by statistical arbitrageurs, the initial response to
a price deviation is the same: arbitrage prescribes the simultaneous trading of ETF shares
and its constituents. Simultaneity is crucial to avoid taking the unnecessary risks of adverse
market movements. Hence, we propose to use the estimated probability of simultaneous
trading of ETF shares with shares of underlying stocks as a measure of the intensity of
arbitrage transactions at the individual stock level.
Formally, consider an index and an ETF that tracks this index. For each �rm j in the
index, we use high-frequency data to estimate the probability of simultaneous trading of �rm
j and ETF shares over day t:
πj,t =
T∑τ=1
IVETF,τ>0IVj,τ>0
T,
where Vj,τ is the volume of stock j traded over interval τ = 1, .., T of day t.11 We propose
to sample data on 1 second frequency.
3.2 Measuring the market response to shocks
In the second step, we identify fundamental shocks and measure stock market sensitivity
to these shocks. We follow on our previous research, Anatolyev et al. (2018), and use oil
inventory announcements to identify fundamental oil shocks.
Weekly estimates of crude oil inventories in the U.S. are provided by the U.S. Energy
Information Administration (EIA), a statistical and analytical agency within the U.S. De-
11For robustness we also use an estimate of conditional probability (see Internet Appendix A.4). Byconditioning on �rm j's trading, we can further �lter our intensity of overall trading from our measure of theintensity of arbitrage transactions.
12
![Page 13: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/13.jpg)
partment of Energy. A summary report is released in the form of an EIA publication, the
Weekly Petroleum Status Report.12 The report becomes available to the public at 10:30am
Eastern time. The oil price typically reacts quite strongly to the announcements, thus pro-
viding us with a necessary signal.
We use an event study approach to estimate the stock market sensitivity to oil shocks.
Our main identi�cation assumption is the absence of any market-wide shocks except for the
oil inventory news in the announcement window. To determine the size of the window, we
face the standard tradeo�: we need the window to be narrow enough to justify our identi�-
cation assumption, but we would like to avoid the complications of modeling microstructure
noise. Hence, in our main exercise, we use a 1 minute window. Formally, we use the most
recent transaction price as of 10:31:00 am (or one minute after the release) and the most
recent transaction price 5 seconds before the announcement.13 For robustness, we repeat
our analysis with announcement returns calculated using a 5-minute window to account for
potentially delayed information processing.14
For each year and each �rm j we estimate the following linear regression:
rj,t = αj + βjroil,t + εj,t (1)
where t represents an announcement day, rj,t is the �rm j′s announcement return on an-
nouncement day t, roil,t is the �rst oil futures contract announcement return on announcement
day t, and E(εj,t|rj,t) = 0.We estimate oil betas separately for each �rm over each year using
OLS.
3.3 Regression analysis
To measure the e�ect of ETF arbitrage on the market reaction to shocks, we estimate the
following �xed e�ect panel regression:
12For some weeks which include holidays, releases are delayed by one day.13For each �rm we exclude weeks during which no trading is recorded in the 5-minute interval before the
announcement. Only a negligible number of weeks were excluded.14We do not expect the lack of trading to be a problem in our dataset. Inventory announcements move oil
prices signi�cantly, and trading in the oil futures market intensi�es around EIA announcements. Moreover,the S&P 500 index is comprised of actively traded �rms, and our data typically shows multiple transactionsrecorded in the minute both before and after the announcement.
13
![Page 14: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/14.jpg)
βj,t = cj + αJ,t + γtπj,t + δtXj,t + εj,t (2)
where βj,t is the estimated oil beta of stock j in year t, πj,t is the estimated probability
of simultaneous trading of stock j and SPY over the same period, and Xj,t is a vector of
control variables. Controls include variables describing the size and value of the �rms such
as the logarithms of market capitalization and book-to-market ratio. We also include the
logarithms of the dollar volume and turnover to capture the overall intensity of trading
of j′s stock. It is critical to control for the overall intensity of trading, as our measure
cannot distinguish arbitrage transactions from coincidental trades, which are more likely
for actively traded �rms. Finally, we include the bid-ask spread and the price impact ratio
calculated using high frequency data to capture overall liquidity and price impact of trades.
The coe�cient cj captures the unobserved time-invariant stock �xed e�ect, which can be
interpreted as stock j′s true sensitivity to oil shocks. We also include year by sector e�ects,
αJ,t. Time variability can be captured by allowing the coe�cient of interest, γt, and the
e�ects of controls δt to vary over time.15
4 Data and descriptive statistics
Before we proceed to our main results, we describe our data and characterize the behavior
of the estimated intensity of arbitrage transactions and oil betas.
4.1 Data
In our empirical exercise we focus on SPY, an ETF that tracks the S&P 500 index. SPY is
the largest ETF in the world, currently managing $280 billion in assets under management.
We consider the �rms comprising the S&P 500 index as of November 2018. Although the
composition of the index has changed signi�cantly over the last decade, we �x the set of
�rms to facilitate the estimation.
We use the closest to maturity WTI oil futures traded at NYMEX (CME group) to
measure oil betas. We use a standard rolling procedure, replacing a soon-to-expire futures
15On a technical note, we do not consider the estimated probability as a generated regressor. We treatthe estimated probability itself as a measure of the intensity of arbitrage transactions.
14
![Page 15: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/15.jpg)
contract with the next one on the 5th day of the month. At about that time, the liquidity
on the market moves from the �rst to the second contract markets.
The high-frequency data on stocks are obtained from the Trade and Quote Database
(TAQ) and on oil futures from TickData.16 Our sample covers the period from 2005 to 2016.
To calculate an estimate of the probability of simultaneous trading, we consider the time
period from 9:30 am to 4:00 pm, and sample the data on 1 second frequency.
We use CRSP to calculate bid-ask spreads, and Compustat for market capitalization,
turnover, and book-to-market ratio.
4.2 Descriptive statistics
In this subsection, we describe the time and cross-sectional behavior of our measure of the
intensity of arbitrage transactions and oil betas.
Intensity of arbitrage transactions
Figure 1 displays the average probability of simultaneous trading with the SPY. The black
line corresponds to SPY itself, and thus displays the probability of trading over a 1 second
interval. The blue and red lines correspond to Apple and Microsoft, respectively, two �rms
with the largest weights in the index. Finally, the three lower lines correspond to the �rst,
second and third quantiles of the distribution of probabilities across all �rms in the S&P 500
index.
We see substantial time variation in trading intensity and in the estimated probability of
simultaneous trading. Trading clearly intensi�ed over the �rst 4 years of the sample; by 2008
16Prior to December 9, 2013 TAQ does not report odd lot trades (seeO'Hara et al. (2014) for the generaldiscussion of this issue). The omission of the odd lots in the earlier subsample can lead to an underestimationof the probability of simultaneous trading and thus underestimation of the intensity of arbitrage. However, webelieve that it is unlikely to bias our main results. To bias our results, the underestimation of the probabilityof simultaneous trading must correlate with the underestimation of oil betas. However, our sample of the�rms consists of the most liquid �rms. As we observe trading immediately following the announcements,the bias in the estimates of oil betas should be minimal (if any). Moreover, as we discuss below our resultsbecome even more pronounced as we increase the estimation window from 1 minutes to 5 minutes afterthe oil announcement. To provide further support, we repeat our main estimations including measures ofalgorithmic trading, in particular the odd lot volume ratio. We show that the results remain the same (seeInternet Appendix A.11). It is also worth noticing, that we do not observe any signi�cant changes in theprobability of simultaneous trading around the date of the change in reporting requirements.
15
![Page 16: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/16.jpg)
Figure 1: Probability of simultaneous trading.
2004 2006 2008 2010 2012 2014 20160
0.1
0.2
0.3
0.4
0.5
0.6
0.7
SPYAAPLMSFTq1medianq3
SPY was traded about 60% of the time. The median probability of simultaneous trading
also increased by 2008, but continued �uctuating afterwards.
More importantly, we observe signi�cant cross-sectional variation in the estimated proba-
bility of simultaneous trading with SPY.17 In particular, Apple and Microsoft, two �rms with
the highest index weight, clearly have much larger probability to be traded synchronously
with SPY. This is a more general result, and partially re�ects active trading of these stocks.
We further discuss the determinants of the probability of simultaneous trading in section 6.
Oil betas
Figure 2 displays the evolution of oil betas for the sample of non-energy �rms in the S&P 500
index over our sample period. Before 2009, all oil betas are negative18, in line with historical
evidence of a negative relationship between oil prices and the macroeconomy. However,
the relationship suddenly becomes positive in 2009 and remains positive subsequently. The
timing of the structural shift might not be surprising since in times of �nancial crises, �re sales
and alleviated systematic risk, all assets typically move together. What is more surprising
is that the positive link has not been reversed subsequently.
17The breakdown by industry is provided in Internet Appendix in Table 11. We do not see any particularpatterns across industries, except for somewhat lower numbers in the real estate sector, and higher in theenergy sector.
18See the breakdown of oil betas by industry and years in the Internet Appendix in Table 11.
16
![Page 17: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/17.jpg)
Figure 2: Evolution of oil betas.
2006 2008 2010 2012 2014 2016−0.2
−0.15
−0.1
−0.05
0
0.05
0.1
β oil
Sensitivity to oil shocks, non−energy S&P 500 firms
medianq1 and q3
It is beyond the scope of this paper to investigate the drivers of the shift, i.e. whether
it was the development of unconventional oil, or geopolitical changes, or something else.
However, in our regression analysis we assume time-invariant �rm �xed e�ects. Hence, we
need to restrict our sample to start in 2010 to avoid this dramatic shift in betas. Moreover,
Figure 1 documents a period of rising overall intensity of trading until 2008. As we cannot
fully wash out the intensity of overall trading from our measure of the intensity of arbitrage
transactions, we have an additional reason to restrict the sample to exclude earlier years.
5 Main Results
5.1 Arbitrage intensity and market reaction to oil shocks
In this section, we document the relationship between sensitivities to oil shocks and our
measure of index trading intensity. Table 1 reports our main results. Each column corre-
sponds to a di�erent speci�cation, depending on whether we include controls and allow the
regression coe�cient to vary over time. For robustness, we repeat estimation for 1 minute
and 5 minute returns.
17
![Page 18: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/18.jpg)
Table 1: Impact of probability of simultaneous trading with SPY on oil betas in the panelof S&P 500 �rms.
The table displays the estimates of the following panel �xed e�ect regression: βj,t = cj+αJ,t+γtπj,t+δtXj,t+ εj,t, where βj,t is the estimated oil beta of stock j in year t, πj,t is the average probability ofsimultaneous trading of stock j and SPY over the same period, αJ,t are year by sector dummies, andXj,t is a vector of control variables. Controls include logarithms of market capitalization, book-to-
market ratio, dollar volume, turnover, volatility, and three measures of liquidity and price impact:
bid-ask spread, the price impact coe�cient estimated using high frequency data, and the Amihud
ratio. St.err are clustered at the �rm level, t-statistics in parenthesis. The sample period covers
2010-2016.
1 min returns(1) (2) (3) (4) (5) (6) (7) (8)
ProbTrad 0.49 0.47 0.28 0.31 0.28 0.29 0.20(7.53) (4.81) (2.53) (2.31) (2.37) (1.87) (1.94)
Prob -2010 0.29 (1.23)2011 0.31 (1.86)2012 0.27 (1.72)2013 0.12 (0.61)2014 0.25 (1.94)2015 0.23 (2.06)2016 0.36 (2.84)
Controls(mcap,book-to-market,turnover, dollarvolume)
Yes Yes Yes Yes Yes Yes Yes
Bid-ask spread YesPrice impact YesAmihud YesVolatility Yes
Year sectore�ects
Yes Yes Yes Yes Yes Yes
Time varyingγt, δt
Yes
R-sq within 0.037 0.096 0.254 0.266 0.254 0.254 0.269 0.292R-sq between 0.003 0.207 0.022 0.020 0.022 0.018 0.099 0.000
N 3,058 2,877 2,877 2,876 2,877 2,877 2,877 2,877N groups 461 444 444 444 444 444 444 444
18
![Page 19: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/19.jpg)
Table 1(continue)
5 min returns(1) (2) (3) (4) (5) (6) (7) (8)
ProbTrad 1.25 1.42 0.61 0.65 0.61 0.65 0.49(10.15) (8.06) (3.43) (3.08) (3.26) (2.73) (3.00)
Prob -2010 0.60 (1.59)2011 0.56 (2.10)2012 0.36 (1.34)2013 0.55 (1.87)2014 0.51 (2.41)2015 0.51 (2.76)2016 0.56 (3.33)
Controls(mcap,book-to-market,turnover, dollarvolume)
Yes Yes Yes Yes Yes Yes Yes
Bid-ask spread YesPrice impact YesAmihud YesVolatility Yes
Year sectore�ects
Yes Yes Yes Yes Yes Yes
Time varyingγt, δt
Yes
R-sq within 0.069 0.268 0.489 0.490 0.489 0.490 0.497 0.514R-sq between 0.015 0.139 0.072 0.068 0.071 0.052 0.140 0.001
N 3,058 2,877 2,877 2,876 2,877 2,877 2,877 2,877N groups 461 444 444 444 444 444 444 444
19
![Page 20: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/20.jpg)
Regarding the results for 1 minute returns, the probability of trading has a large and
positive coe�cient, and is robust to the inclusion of controls and time e�ects. The �rms
that are traded more often with SPY display a stronger reaction to oil shocks. To gauge the
magnitude of the e�ect, consider an increase in the probability of simultaneous trading by
0.07, which is the di�erence between the third and the �rst quantiles. Such an increase is
associated with an increase in oil beta by 0.02. In relative terms, the e�ect is large, as the
median oil beta is 0.05 for most industries (see Figure 2 or Table 11). We further argue that
our approach delivers a conservative estimate of the magnitude of the ETF distortion e�ect
in section 5.5.
Importantly, inclusion of the measures of the overall intensity of trading, such as dollar
volume and turnover, as well as the inclusion of market capitalization, does not a�ect our
results. Hence, our measure of the intensity of trading contains additional information
orthogonal to the intensity of overall trading, and we show that this information has the
power to explain �rms' response to oil shocks. In the Internet Appendix A.4, we repeat the
estimation using the conditional probability of simultaneous trading, which helps to �lter
our overall intensity of trading. We obtain even stronger results. In the Internet Appendix
A.11 we also show that inclusion additional controls for the intensity of algorithmic trading
following Weller (2018) and Lee and Watts (2018) does not change our results.19
The e�ect of probability on oil betas also survives the inclusion of various measure of
liquidity and price impact: the point estimate remains unchanged and signi�cant. The e�ect
of probability weakens when we include volatility, but remains signi�cant at the 10% level.
One reason could be that ETF arbitrage increases stock volatility itself, as shown by Ben-
David et al. (2018), as new noise and liquidity shocks arise in the ETF market and propagate
to the underlying markets via ETF arbitrage. We con�rm this basic �nding in our data using
our measure of arbitrage intensity in the Internet Appendix A.6.
The results are similar when we allow all coe�cients to vary over time, see speci�cation
IV. With the exception of 2013, the estimates are extremely close to the value of 0.31 in the
time-invariant speci�cations. The coe�cients are signi�cant in the 2014-16 period, although
lacking signi�cance in the earlier years. The lack of signi�cance can be due to the lack of a
19We follow the paper and use the SEC's MIDAS data to construct two measures of the intensity ofalgorithmic trading: the cancel-to-trade ratio, or the number of cancellations divided by the number oftrades; and the trade-to-order volume ratio, or the total volume divided by the total volume across all ordersplaced. The data is available since 2012.
20
![Page 21: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/21.jpg)
strong e�ect of oil inventory announcements on oil prices, especially over the 2012-13 period
as we document in Anatolyev et al. (2018). When the oil price responses to news are small,
the signal-to-noise ratio decreases and our estimates of oil betas are likely to become less
precise.
We also see that the results become even stronger when announcement returns are mea-
sured over 5 minutes after the release.
Overall, our results indicate a signi�cant link between probabilities and oil betas con-
�rming that ETF arbitrage distorts the market reaction to oil shocks.
5.2 Di�erential e�ects
One way to provide further support for our mechanism is to show that the distortion is greater
in exactly those �rms where theory would predict that the e�ect of arbitrage transactions
on prices is likely to be stronger. A natural characteristic of a stock to explore is, of course,
liquidity. However, the presence of statistical arbitrage complicates the problem, as the same
characteristic can have opposite e�ects on the intensity of arbitrage transactions and on the
magnitude of distortion that these arbitrage transactions cause. Our direct measure of the
intensity of arbitrage allows to separately measure these two e�ects. In this section we aim
to test if for the same intensity of arbitrage transactions, a stock characterized by a higher
price impact displays a stronger reaction to shocks.
We use three di�erent measures of liquidity and price impact. First, we use the bid-ask
spread calculated as the quoted spread divided by the price from CRSP. Second, we follow
Amihud (2002) and measure price impact by computing the absolute daily return divided
by the total dollar daily volume. Finally, we calculate price impact using high frequency
data.20
For brevity, we only consider one speci�cation that assumes the time invariant e�ects of
controls and probabilities, but contains time by sector �xed e�ects (speci�cations (4-5) in
Table 1). We consider each of our three measures of liquidity and price impact separately,
and we interact each of them with the probability of simultaneous trading. Panel A in Table
2 shows the estimates for the entire sample period. All the interaction terms are positive,
implying that conditional on the level of intensity of arbitrage transactions, �rms that are
20See Internet Appendix A.1 for details.
21
![Page 22: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/22.jpg)
Table 2: Estimates of the di�erential e�ect of probability on oil betas for high and lowliquidity stocks
The table displays the estimates of the following panel �xed e�ect regression: βj,t = cj + αJ,t +(γ0 + γ1Zj,t)πj,t + δ0Zj,t + δ′Xj,t + εj,t, where βj,t is the estimated oil beta of stock j in year t, πj,tis the average probability of simultaneous trading of stock j and SPY over the same period, αJ,tare year by sector dummies, and Xj,t is a vector of control variables. Controls include logarithms of
market capitalization, book-to-market ratio, dollar volume, and turnover. Variables Zj,t representthree measures of liquidity and price impact: bid-ask spread, price impact calculated using intraday
data, and Amihud ratio calculated using monthly returns and volumes. St.err are clustered at the
�rm level, t-statistics in parenthesis.
Panel A: The entire period 2010-2016
1 min returns 5 min returns(1) (2) (3) (1) (2) (3)
ProbTrad 2.12 4.89 7.98 2.77 6.01 12.84(1.24) (1.36) (1.81) (1.04) (1.10) (2.04)
ProbTrad × Bid ask 0.22 0.26(1.13) (0.85)
ProbTrad × Price impact 0.33 0.39(1.31) (1.02)
ProbTrad × Amihud 0.31 0.49(1.79) (1.99)
Panel B: The period from 2014-2016.
1 min returns 5 min returns(1) (2) (3) (1) (2) (3)
ProbTrad 3.82 10.85 8.83 4.31 11.63 9.56(2.45) (4.19) (3.42) (2.70) (4.05) (3.24)
ProbTrad × Bid ask 0.41 0.41(2.37) (2.33)
ProbTrad × Price impact 0.74 0.76(4.11) (3.82)
ProbTrad × Amihud 0.34 0.36(3.39) (3.05)
22
![Page 23: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/23.jpg)
characterized by lower liquidity and a larger price impact, display a larger reaction to oil
shocks, which is consistent with our mechanism. However, we only �nd signi�cance when
we use the Amihud measure. Panel B shows the estimates for the 2014-2016 period when
oil price �uctuations provide a stronger signal. Now we see that all the coe�cients for
the interaction of probability and measures of liquidity and price impact are positive and
signi�cant, no matter which measure we use and how we measure announcement returns.
In section6 we also show that less liquid stocks are associated with less intense arbitrage.
Hence, our results are consistent with the arbitrage mechanism: the less liquid the stock is,
the less likely it will be included in the arbitrage portfolio, but if included, the distortion
will be larger.
5.3 Long-term e�ects
One concern may be that our results document only a temporary distortion of the market
reaction to shocks by ETF arbitrage. Eventually, fundamental traders may correct the
mispricing and thus, the detrimental e�ect of ETFs is short-lived.
Even if that is the case, our results imply that ETFs introduce yet another source of
noise to the market. There is nothing speci�c about oil shocks in our analysis; our choice of
oil shocks was driven by our ability to identify these fundamental shocks using oil inventory
announcements. Naturally, we may expect that the market reaction to all fundamental
shocks becomes distorted by ETF arbitrage. Hence, ETF arbitrage introduces a continuous
cycle of mispricings and corrections and leads to excessive �uctuations of asset prices. We
identify another source of the increased volatility of underlying securities due to the presence
of ETFs. Our paper can be viewed as complimentary to the �ndings of Ben-David et al.
(2018).
However, it is likely that initial distortions never become fully corrected. In the Internet
Appendix A.7, we repeat our analysis using oil betas calculated over the entire day. We
calculate the cumulative return on each stock from the time of announcement until the end
of the day. The oil beta is de�ned as the regression coe�cient of these cumulative returns
on 1 minute oil announcement returns (to keep clean identi�cation of oil shocks). We show
that our e�ect is still present and signi�cant. Even by the end of the day, the mispricing
associated with the intensity of ETF arbitrage is not corrected.
23
![Page 24: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/24.jpg)
We �nd that ETF distortions are not short-lived. Hence, ETFs not only introduce a new
source of noise to the market, but actually a�ect the very basic role of asset prices to re�ect
fundamental information. Hence, ETFs have a detrimental impact on price e�ciency.
5.4 Alternative explanations
Our preferred interpretation of the results in Table 1 is the distortion of the market reaction to
oil shocks induced by ETF arbitrage. In this section, we discuss other potential explanations
for our �ndings. We also discuss our assumptions and our approach in general.
There are two main challenges with our approach that open up our results to alternative
interpretations. First, there is a clear omitted variable in our regressions which is the true
oil beta. Second, our ability to clean our measure of arbitrage intensity from the intensity
of overall trading can be limited. Hence, the positive correlation between the estimated oil
betas and the probability of simultaneous trading that we document can potentially be driven
by a relationship between the true oil betas and the intensity of overall trading. Below we
discuss one potentially important driver of this relationship, however, �rst we would like to
describe our general e�orts to solve this identi�cation issue. First, in our regression analysis
we include a number of controls to proxy for the true oil beta and its variation over time.
In particular, we include stock �xed e�ects and sector by year �xed e�ects, in addition to
other standard controls. It should also be noted that we carefully control for the intensity of
overall trading in our regressions. We use the dollar volume and turnover as our preferred
measures, and in addition, in the Internet Appendix A.4 we repeat estimation using the
conditional probability of simultaneous trading, which further �lters out the intensity of
trading of the individual stock. We obtain even stronger results. To provide even more
empirical evidence that our measure indeed measures the intensity of arbitrage transactions
and not simply re�ect the intensity of overall trading, we devote Section 6 to various empirical
tests of our measure. Finally, we would like to emphasize that we use oil shocks and oil betas,
and not macroeconomic shocks and market betas. Below we argue how this alleviates the
identi�cation problem, because true oil betas are more likely to be orthogonal to both the
intensity of arbitrage transactions and the intensity of overall trading, than market betas.
We start by addressing the most plausible alternative interpretation of our results, which
is based on the relationship between true oil betas and the intensity of overall trading. The
24
![Page 25: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/25.jpg)
reverse causality argument states that the stocks with larger true oil betas are traded more
intensely when oil news arrive. This prediction can be generated by an asset pricing model
with heterogeneous prior beliefs. The larger is the announcement return, the larger must be
the revisions of agents beliefs, and thus the larger is the trading volume that we observe as
the agents rebalance their portfolios. Although plausible, this alternative reverse causality
argument is unlikely to drive our results for three reasons. First, we use oil shocks and oil
betas, and not macroeconomic shocks and market betas. True oil exposure of most non-energy
companies is likely to be negative (or zero). Intuitively, an increase in oil prices will raise
input costs for most business, as well as force consumers to spend more money on gasoline
and less on everything else. The idiosyncrasy in �rms' sensitivities to oil shocks dominates
any systematic component of oil price �uctuations. Thus, even if there is a link between true
oil betas and the intensity of overall trading, it actually should be a negative relationship
for a sizable subset of the data. The presence of negative oil betas allows us to test our
mechanism against the alternative explanation. The test can be performed by repeating our
main regressions using the absolute value of the beta. If the alternative mechanism drives
our results, then taking the absolute value of the beta should make the results even stronger,
as stocks with the greater exposure (i.e. more negative betas, larger in absolute value) are
associated with more intensive trading. In contrast, we argue that more intensive arbitrage
increases oil betas uniformly, in particular, it makes negative oil betas less negative, hence
the results should become weaker. As shown in the Internet Appendix A.10, the results do
become weaker when we use the absolute value of the oil beta thus supporting our story.
Note that, if we were to use market betas instead such a test would be impossible and we
would have a more serious identi�cation issue, as the true market betas are more likely to
be related to the intensity of overall trading.21,22
Second, we calculate the probability of simultaneous trading over the entire trading period
21Although there is little evidence of a strong relationship even between true market betas and the intensityof overall trading. To illustrate it, we use two measures of the intensity of trading, the dollar volume andthe probability of trading (estimated as a fraction of one second intervals with non-zero trading volumes),to calculate the cross-sectional correlations with estimated market betas (by year). In 7 out of 12 years inour sample, the correlations of market betas with either of the two measures of the intensity of trading werelower than 0.15, although positive, and only slightly higher in other years.
22We also calculate the cross-sectional correlation of oil and market betas for each year and �nd thatthis correlation was typically very low (see Internet Appendix A.5 for details and additional discussion,including the discussion of the shale boom and its potential e�ects on the systematic component of oil price�uctuations).
25
![Page 26: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/26.jpg)
using all trading days, not just over a short window immediately following an oil inventory
announcement. Oil news constitute only a negligible fraction of overall news and thus can
hardly drive the average intensity of trading, and thus our results. In sum, it is unlikely that
our results are driven by the reverse causality argument.
It is worth emphasizing, that the reverse causality mechanism inadvertently makes a
very restrictive assumption on the cross-sectional distribution of the true oil betas. The
companies that are traded intensely and have large estimated oil betas, are presumed to
have large true oil betas. In the data, we �nd that �rms that are traded more intensely, are
typically the �rms with the largest weight in the index. However, it is not clear why such
companies, including, say, Amazon and Facebook, should necessarily have larger positive
true exposures to oil shocks relative to the �rms with lower index weight. In contrast, our
mechanism does not impose any restrictions on the distribution of true oil betas. It is the
arbitrage activity that creates a link between the intensity of overall trading and observed
oil betas. Amazon and Facebook are more likely to be involved in statistical arbitrage, and
thus are more exposed to arbitrage distortions and have more distorted oil betas. The true
oil betas can be uniformly zero or even negative. Thus, our mechanism generates a link
between observed betas and trading volumes under much less restrictive assumptions.
Our results can also be partially driven by infrequently traded �rms. Such �rms have low
probability of simultaneous trading with SPY due to low trading activity. Moreover, these
same �rms could have zero or close to zero estimated oil betas, as the lack of trading in a
short window around the announcement prevents new information from being incorporated
into the price. However, we believe that our results are not driven by infrequently traded
�rms. First, we consider the S&P 500 �rms, which are large and actively traded companies.
Second, we observe no trading in the event window only for a negligibly small number of
�rm-event pairs. More importantly, as Table 1 reports, our results become stronger for 5
minute returns, which goes against this alternative explanation. As we increase the event
window from 1 minute to 5 minutes, at least some trading is more likely to occur and thus
oil betas become more precise and no longer zero. Hence, the e�ect should have weakened if
the alternative explanation was true.
The e�ect of arbitrage on stock's oil betas that we interpret as a distortion, may al-
ternatively be interpreted as an enhancement, a positive e�ect of ETFs on price e�ciency.
According to this view, by attracting new clientele and thus improving liquidity of the un-
26
![Page 27: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/27.jpg)
derlying stocks, the ETFs make trading based on fundamental information more attractive
and thus improve stock responsiveness to news. Although this could be a plausible alter-
native for small and illiquid stocks, it is hardly relevant for the most liquid and frequently
traded S&P 500 stocks that we consider. Indeed, this idea is further studied by Glosten et
al. (forthcoming) who investigate the e�ect of ETF ownership on stock responsiveness to
earnings announcements and only �nd a positive and signi�cant e�ect in small stocks and
stocks with low analyst coverage. Moreover, this alternative view also suggests that the
e�ect should be stronger when the absolute values of betas are used, however that is not the
case.
Clearly, we cannot rule out all possible alternative explanations for a positive relationship
between oil betas and the probability of simultaneous trading. However, if one takes into
account stock �xed e�ects and year by industry e�ects, as well as a large set of controls, our
�ndings can stand against a wide range of reasonable alternatives. It should also be noted
that any alternative explanation should also account for the di�erential e�ects of liquidity,
that we document.
One might also question our approach to the estimation of oil betas. We use oil inventory
surprises during each year to estimate stocks' sensitivity to oil shocks. In doing so, we do not
distinguish the sources of shocks that drive inventory changes. However, Kilian and Park
(2009) argue that the stock market reaction to oil shocks di�ers depending on whether the
change in the price of oil is driven by demand or supply shocks in the oil market. This is
not necessarily an issue for us. As long as the distribution of shocks is relatively over time,
we do not have a problem at all. However, even if the composition of shocks changes over
time, under a mild monotonicity assumption our results remain valid. Indeed, in our main
regression speci�cation we include individual stock �xed e�ects and sector by year e�ects.
As long as individual �rms that belong to one sector experience a uniform shift in betas over
the years due to changes in the composition of shocks, the resulting changes in oil betas will
be picked up by sector-year e�ects and thus would not a�ect our main results. The only
situation in which our results would be a�ected is if we were to have individual changes in oil
betas positively correlated with changes in the probability of simultaneous trading with SPY
due to changes in the composition of shocks. We �nd this combination of e�ects extremely
unlikely, and thus we believe our results are robust to changes in the shocks composition.
Another concern has been raised recently by Box et al. (2019) who identify intraday ETF
27
![Page 28: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/28.jpg)
mispricings and investigate how trading in the ETF and the underlying stocks resolves these
mispricings. According to their �ndings, mispricing is typically preceded by a permanent
shock in the underlying portfolio and is subsequently corrected by quote adjustment and not
by arbitrage transactions. Although we also consider an arrival of fundamental information
as a source of mispricing, stale ETF pricing and absence of arbitrage transactions could
raise a serious argument against our mechanism. In the Internet Appendix A.9, we follow
their approach and investigate the behavior of quotes and volumes around our oil inventory
announcements. In contrast to Box et al. (2019), we show that SPY and the underlying
portfolio react simultaneously to our oil inventory announcements even when we use 1 second
frequency data, while Box et al. (2019) work with one minute frequency. Moreover, we show
that SPY reacts stronger to news and thus potentially opens up an arbitrage opportunity
in the direction that is consistent with our mechanism. We also get di�erent results for the
behavior of the directional volume. Thus, even though some smaller ETFs may indeed have
negligible e�ects on the underlying markets, that is clearly not the case for SPY.23
5.5 Magnitude of the e�ect and further discussion
We would like to comment on the magnitude of our main results. Although the percent
increase in oil betas is non-trivial (40% in our benchmark case), one can argue that going
from a very small exposure, 0.05, to a slightly larger exposure, 0.07, does not lend con�dence
that this e�ect is meaningful for the overall market and has serious implications if the change
is driven by distortions. It is worth emphasizing, that while a median exposure to oil shocks
is relatively low, a median exposure to other fundamental shocks can be much larger, and
so can be the distortion. There is nothing speci�c to oil shocks in our analysis, our choice of
oil shocks is driven solely by our desire for a clean identi�cation.
We also argue that our empirical setup delivers the most conservative estimate of the
e�ects of ETF arbitrage on the market reaction to shocks. Our e�ect is partially picked up
by numerous controls that we use in our regression analysis. For example, �uctuations of
the overall intensity of trading are partially driven by changes in the intensity of arbitrage
transactions. When we remove the controls, the e�ect of ETF arbitrage on betas becomes
23In a di�erent setting and for a di�erent time period, Foucault et al. (2016) identify arbitrage oppor-tunities due to asynchronous price adjustments. They �nd out that these opportunities terminate with anarbitrageurs' trade in about two-thirds of the case.
28
![Page 29: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/29.jpg)
two and a half times larger.
Finally, we maintain the assumption that any changes in average industry sensitivities to
oil shocks are not related to ETF arbitrage and potentially re�ect fundamental changes in
the U.S. market (perhaps due to the development of unconventional oil). In our regressions,
these changes are picked up by sector by year indicators. However, instead, these changes,
including the dramatic change in oil betas from uniformly negative to uniformly positive in
2008, may be yet another consequence of the presence of ETFs.
Interestingly, the relative dynamics of the average industry oil betas provides yet another
argument in favor of our mechanism. Table 11 shows that all non-energy sectors experienced
a dramatic increase in oil betas from uniformly negative to uniformly positive. Surprisingly,
the average beta of the energy sector, actually decreased from 0.42-0.45 in the four years
before 2008 to just 0.31 in the period of 2009-2013. Thus, while non-oil sectors became
positively related to oil shocks, the sensitivity of the oil sector itself to oil shocks decreased.
The uni�cation of betas, an increase in the non-energy oil betas and a decrease in the
energy oil betas, although puzzling from a standard macro perspective, is consistent with
our arbitrage mechanism. As long as two stocks have similar weights in the index and have
similar liquidity, we should observe the same price pressure following news arrival due to
arbitrage transactions, irrespective of whether the stock belongs to the energy sector or not.
Moreover, if the non-energy companies overreact to oil shocks, while the stock market index
correctly responds to oil shocks,24 then it mechanically implies underreaction of the energy
stocks in the index to oil shocks. Hence, ETF arbitrage has a tendency to unify individual
market responses to news. Of course, in reality fundamental traders at least partially correct
the mispricings, reducing the price response of non-energy companies and increasing the price
response of energy companies. However, if this fundamental trading is limited or slow, the
uni�cation of betas can be observed over a su�cient period of time to be detected25, in line
with our �ndings.
One can argue that long-lasting permanent e�ects are less consistent with market distor-
24The market response consists of the direct e�ect of the revaluaton of the energy �rms included in theindex, but also of the general equilibrium e�ects of higher or lower oil prices relevant for all �rms.
25We conduct our main analysis using only stocks of the non-energy �rms for two reasons. First, arbitragepressure is more likely to dominate fundamental trading in individual non energy securities as oil news haveonly marginal importance for these �rms, relative to energy �rms. Second, we do not have a su�cientlylarge number of energy stocks in the S&P 500 index to conduct meaningful inference.
29
![Page 30: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/30.jpg)
tions and more consistent with permanent changes in prices. Although it is not unusual in
the �nance literature to document prolonged mispricings as arbitrage capital can be limited
and slow moving, recent research highlights negative e�ects of ETFs on trading and infor-
mation acquisition behavior of informed traders. Some papers suggest existence of equilibria
with permanently incorrect prices. Bhattacharya and O'Hara (2018) show that when ETFs
are available for trading, informed agents optimally choose to herd on systematic part of
information. As a result, idiosyncratic component is not re�ected in asset prices.26 This
e�ect can be ampli�ed if information acquisition is endogenous, as over time, any incentives
to acquire and process idiosyncratic information should cease, as it is not used for trading.
Our results can be seen as an empirical evidence of such herding, because we document the
loss of idiosyncratic variation in oil betas.
Other recent papers study how algorithmic trading can alter incentives for fundamental
traders to acquire information. This literature is particularly relevant for us, as liquid and
cheap ETFs are perfectly suited to attract high frequency traders. Weller (2018) documents
that algorithmic trading deter information acquisition. Intuitively, improved screening of in-
formed order �ow erodes information rents and reduces pro�tability of fundamental trading.
Similarly, Lee and Watts (2018) exploit SEC's randomized tick size experiment to provide
more direct evidence of the causal e�ect of tick size (and thus intensity of algorithmic trad-
ing) on fundamental information acquisition. Among other results, they �nd that with a
larger tick size treatment �rms experience a signi�cant increase in EDGAR web tra�c in
the days leading up to each earnings announcement. Foucault et al. (2016) argue that ar-
bitrageurs are often able to detect arbitrage opportunities due to asynchronous adjustments
in asset prices following information arrival. By arbitraging away such a price discrepancy,
arbitrageur cannibalize on stale quotes and expose dealers and other market participants to
the risk of trading against more informed counter party. Foucault et al. (2016) document
26Bhattacharya and O'Hara (2018) study the informational e�ects of ETFs in a factor model of prices. Anewly introduced ETF attracts new clientele with additional information about the fundamental values of theunderlying securities, both the systemic factor and individual shocks. The market makers on the underlyingmarkets can now extract information from the ETF price. However, if informed traders use short-termtrading strategies, the introduction of an ETF facilitates a herding equilibrium (in the spirit of Froot et al.(1992)). In a herding equilibrium, informed traders rationally choose to trade based on information on thesystematic factor only and completely disregard all idiosyncratic information. Each informed trader on themarket for each individual security, can accurately anticipate the information �ow into their own marketonce market makers learn from observing other markets and the ETF price, and thus can foresee the pricebump and pro�t from it.
30
![Page 31: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/31.jpg)
an increase in bid-ask spreads in the a�ected markets as the dealers seek to insure them-
selves against this risk. However, some market participants may also respond by pulling out
from the trading altogether, or at least at times of known public announcements, further
deteriorating fundamental trading in the individual markets.
Overall, the presence of ETFs can negatively a�ect expected pro�tability of fundamental
trading and thus allow for mispricings to persist in the markets for individual stocks over a
long period of time consistent with our �ndings.
6 Empirical tests of our measure and comparison with
other measures
In this section, we investigate our measure and provide empirical evidence that the estimated
probability of simultaneous trading with SPY measures the intensity of arbitrage transac-
tions. Our approach is to identify times when arbitrage intensity is known to increase, and
analyze whether our measure captures this intensi�cation. We analyze i) the intensity of the
algorithmic market making activity; ii) additions and deletions to the S&P 500 index; iii)
days of intensive creations and redemptions of SPY units; and iv) times of mutual fund �re
sales. Finally, we compare our measure to existing measures in the literature.
Determinants of the probability of simultaneous trading
We start by analyzing the determinants of our measure. We transform our estimate of
probability using the logistic function and estimate the following �xed e�ects regression:
yj,t = cj + αJ,t + δtXi,t + εi,t,
where yj,t = lnπj,t
1−πj,t is the logarithm of the odds, πj,t is the estimated probability of simulta-
neous trading of stock j and SPY in year t, and Xj,t is a vector of our explanatory variables,
including measures of liquidity and price impact and also dollar volume, turnover, market
capitalization and book-to-market ratio. Time-invariant stock e�ects are captured by cj.
We also include year by sector �xed e�ects. Time variability is captured by allowing the
coe�cients δt to vary over time.
31
![Page 32: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/32.jpg)
Panel A in Table 3 shows the results for the time invariant speci�cation. Not surprisingly,
we �nd a strong relationship between the probability of simultaneous trading and dollar
volume (and turnover). As SPY is traded frequently, when an individual stock is traded
more often, a simultaneous transaction becomes more likely to occur. As we cannot fully
�lter out these coincidental trades, it is critical to account for the overall intensity of trading
in our regressions.
Panel A in Table 3 also shows that liquidity measures have an insigni�cant e�ect on the
probability. To further explore this relationship, Panel B allows for the e�ects of liquidity
measures (as well as other controls) to vary over time. We �nd that, in the early years, all
coe�cients are negative and mostly signi�cant. The high frequency measure of price impact
also indicates a negative and signi�cant relationship in 2015. The negative link between
price impact and the intensity of arbitrage transactions partially re�ects the reluctance of
arbitrageurs to include an illiquid stock in the arbitrage portfolio. By conducting statistical
arbitrage, the arbitrageurs can avoid purchasing the entire basket of individual stocks to
economize on transaction and market impact costs. A su�ciently large portfolio would
su�ce, as long as it tracks the index well enough. Thus, our results (weakly) con�rm the
presence of statistical arbitrage.
Test 1: Market making activity and arbitrage
We start by exploiting the tight relationship between di�erent types of algorithmic trading
activities. This should be viewed as suggestive evidence, rather than a precise test.
The statistical arbitrage is just one of many applications of algorithmic trading, along
with optimal order execution and market making. While representing separate activities,
arbitrage and market making are not independent and often compete for the same order
�ow. Indeed, a transitory shock triggered by a large demand order can be traded away be
arbitrageurs or can be absorbed by limit orders supplied by liquidity providers. Thus, active
market making reduces arbitrage pro�ts, eventually ceasing arbitrage trading activity. In
line with this argument, Chaboud et al. (2014) and Brogaard et al. (2014) �nd that HFTs27
tend to consume liquidity when eliminating temporary pricing errors; at the same time the
frequency of arbitrage opportunities decreases as algotrading intensi�es. Thus, we can use
27It is worth noting, that many papers speci�cally focus on the speed of transactions aim to identify highfrequency trading, but do not try to separate market making and arbitrage activities.
32
![Page 33: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/33.jpg)
Table 3: Determinants of the probability of simultaneous trading with SPY.
The table displays the estimates of the following panel regression: yj,t = cj + αJ,t + δXj,t + εj,t,where yj,t = ln
πj,t1−πj,t is the logarithm of the odds, πj,t is the average probability of simultaneous
trading of stock j and SPY in year t, and Xj,t is a vector of explanatory variables: logarithms of
dollar volume, turnover, market capitalization and book-to-market ratio. St.err are clustered at the
�rm level, t-statistics in parenthesis.
Panel A: Time invariant speci�cation
(1) (2) (3)Bid-Ask Spread 0.021
(1.09)Price Impact -0.014
(-0.58)Amihud Ratio 0.012
(0.25)Dollar Volume 0.418 0.405 0.425
(14.41) (13.75) (7.96)Turnover 0.119 0.128 0.121
(3.03) (3.23) (2.60)Mcap -0.061 -0.063 -0.062
(-2.33) (-2.41) (-2.43)Book to market ratio 0.023 0.025 0.024
(1.97) (2.17) (2.08)Year sector e�ects Yes Yes YesTime varying γt, δt No No No
R-sq within 0.81 0.81 0.81R-sq between 0.76 0.76 0.75N obs 2,876 2,877 2,877N groups 444 444 444
Panel B: Time varying e�ects of liquidity on probability.
(1) (2) (3)Bid-Ask Spread - Price Impact- Amihud Ratio-2010 -0.057 (-1.59) -0.087 (-2.14) -0.166 (-2.92)2011 -0.012 (-0.56) -0.073 (-2.60) -0.059 (-1.54)2012 0.032 (1.66) 0.053 (2.17) 0.052 (1.98)2013 0.031 (1.48) 0.039 (1.37) 0.030 (0.9)2014 0.011 (0.57) 0.025 (0.96) 0.061 (1.85)2015 -0.015 (-0.86) -0.063 (-2.03) 0.066 (1.47)2016 0.004 (0.22) -0.053 (-1.65) 0.161 (3.67)Controls Yes Yes YesYear sector e�ects Yes Yes YesTime varying γt, δt Yes Yes Yes
33
![Page 34: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/34.jpg)
Table 4: Correlations of our measure of the intensity of arbitrage with the four measures ofthe intensity of algorithmic trading from Weller (2018).Average (over time) cross-sectional correlation of 4 measures of AT with our measure of theintensity of arbitrage
Trade-to-OrderVolume Ratio
Cancel-to-TradeRatio
Odd LotVolume Ratio
Average TradeSize
Probability ofSimultaneousTrading
0.49 -0.47 -0.54 0.49
Median (across all S&P 500 �rms) time-series correlation of 4 measures of AT with ourmeasure of the intensity of arbitrage for each year separately
Trade-to-OrderVolume Ratio
Cancel-to-TradeRatio
Odd LotVolume Ratio
Average TradeSize
Probability ofSimultaneousTrading-2012
0.44 -0.40 -0.39 0.21
2013 0.45 -0.38 -0.37 0.222014 0.26 -0.18 -0.32 0.132015 0.18 -0.17 -0.22 0.102016 0.22 -0.23 -0.37 0.25
34
![Page 35: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/35.jpg)
currently developed measures of algorithmic trading activity that pick up variation in market
making, to show the ability our measure to pick up variation in the intensity of statistical
arbitrage.
Earlier papers on the e�ects of algorithmic trading were con�ned to use regulatory and
proprietary data. Recently, however, huge regulatory interest led to a creation of the United
States Securities and Exchange Commissions's Market Information Data Analytics (MIDAS)
system which makes some of the stock-level statistics publicly available. We follow the SEC's
suggestions and Weller (2018) and construct two types of daily measures of the intensity of
algotrading. First two measures calculate the frequency of order cancellations. We use the
cancel-to-trade ratio, or the number of cancellations divided by the number of trades and
the trade-to-order volume ratio, or the total volume divided by the total volume across all
orders placed. Next two measures calculate the proportion of small trades. We use the total
volume executed in quantities smaller than 100 shares divided by total volume traded and
the average trade size. Higher number of cancellations and smaller average trade size indicate
more intensive algorithmic trading. It can immediately be seen that �rst two measures tend
to capture market making activity, while the last two measures can also pick up variation in
order execution activities (see also Hendershott et al. (2011)). Weller (2018) �nds that the
measures are highly correlated within each group (-0.83 and -0.94), but less than perfectly
correlated between groups (in absolute value the correlations range from 0.34 to 0.56), which
con�rms di�erent informational content of the measures.
We use all four measures to investigate the relationship between arbitrage intensity and
algorithmic trading. Table 4 displays various time-series and cross-sectional correlations.
The results show strong and stable correlations both in the cross-section and across time
within �rms (we only report the median), moreover, the correlations with our measure are
comparable in magnitude with the correlations reported in Weller (2018). We observe a
negative correlation of our measure with the cancel-to-trade ratio and the odd lot volume
ratio and a positive correlation with the trade-to-order ratio and the average trade size.
Hence, our results consistently indicate that the intensi�cation of activity that is picked up
by our measure is associated with less intensive algorithmic trading. The correlations with
the measures of the intensity of market making activity are especially important, as they
indicate that our measure indeed picks up variation in arbitrage intensity.
35
![Page 36: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/36.jpg)
Test 2: Additions to and deletions from the S&P 500 index
To further investigate the informational content of our measure, we study additions to and
deletions from the S&P 500 index.28 When a stock is added to the index, it becomes exposed
to index trading and thus we can expect an abrupt increase in arbitrage activity. Similarly,
we should see a drop in index trading activity when a stock is deleted from the index. Our
measure should re�ect these changes.
Data and methodology We study S&P 500 index inclusions and deletions between Jan-
uary 1, 2005 and December 31, 2016. In total we have 148 inclusion events and 143 deletion
events. As usual, inclusion events are excluded if the new �rm is a spin-o� of a �rm, and
deletion events are excluded if the old �rm engaged in a merger or acquisition. The remaining
sample contains 126 inclusion events and 72 deletion events.
For each event, we estimate the probability of simultaneous trading with SPY for the 90
working day periods before and after the event.29 To account for possible time variation, we
also perform estimation separately for each calendar year.
The changes in the index composition are preannounced. However, the channel that we
are interested in is arbitrage activity between the ETFs tracking the S&P 500 and the basket.
There is no reason for the new �rm to be used as a part of the arbitraging basket before it is
included in the index. Thus, we only consider the e�ective date of the inclusion or deletion
event.
Results Table 5 displays the change in the probability of simultaneous trading (with SPY)
for the stocks added to and deleted from the S&P 500 index. To gauge the magnitude of
the e�ect, the third column shows the average probability of simultaneous trading before the
inclusion or deletion event.
Panel A in Table 5 con�rms that a stock added to the index becomes more likely to
be traded simultaneously with SPY. The average e�ect across all events is an increase in
probability of 13%. The e�ect is especially pronounced in the earlier subsample. In 2007, the
28We follow large empirical literature that uses this identi�cation scheme, see Wurgler and Zhuravskaya(2002) and Barberis et al. (2005). Similarly, Ben-David et al. (2018) use the annual reconstitution of theRussell indexes.
29For robustness, we repeated the estimation with longer periods. The results are similar and can be sentupon request.
36
![Page 37: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/37.jpg)
Table 5: Changes in the probability of joint trading with SPY of stocks added to and deletedfrom the S&P 500 index.
Panel A: Stocks added to the S&P 500 index.
Mean∆p Median ∆p Mean pbefore Number of Events
2005 0.0092 0.0092 0.0255 12006 02007 0.0475 0.0417 0.0697 62008 02009 -0.0048 -0.0057 0.0931 42010 0.0006 0.0036 0.1076 92011 -0.0012 -0.0001 0.0992 132012 0.0079 0.0091 0.0821 142013 0.0186 0.0174 0.0855 162014 0.0112 0.0096 0.0677 122015 0.0196 0.0129 0.0882 242016 0.0039 0.0012 0.0719 27
all years 0.0110 0.0078 0.0830 126
Panel B: Stocks deleted from the S&P 500 index.
Mean∆p Median∆p Mean pbefore Number of Events2005 02006 02007 02008 0.0617 0.0617 0.1864 12009 -0.0273 -0.0301 0.1243 32010 -0.0111 -0.0077 0.1130 62011 -0.0146 -0.0182 0.1096 92012 0.0020 -0.0030 0.0964 92013 -0.0121 -0.0096 0.1225 122014 -0.0014 0.0018 0.1032 92015 0.0165 0.0178 0.1250 122016 -0.0002 0.0018 0.1211 11
all years -0.0023 -0.0043 0.1156 72
37
![Page 38: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/38.jpg)
probability of trading increases by almost 70%. In 2015, the increase is about 20%, and in
2016 about 5% on average. We focus on relative changes in probabilities rather than absolute
changes, because the �rms that are added to the index are relatively small �rms which receive
relatively small weights in the index. Such �rms are likely to be omitted from some arbitrage
portfolios as traders economize on trading costs. As a result, one cannot expect to detect
large absolute changes in probabilities. Statistically, we �nd that an increase in probability
is signi�cant in 85 of 126 index addition events (under the assumption that each second is
an independent observation). Overall, the additions results provide strong support for our
measure of index trading.
The deletion results are presented in Panel B. We see mostly negative numbers, con�rming
that a �rm deleted from the index experiences a decrease in the probability of joint trading
with SPY; on average the probability decreases by 2% when we consider all deletion events
in our sample. The magnitude of the e�ect in relative terms can be as large as -22% in
2009, -13% in 2011, and -10% in 2010 and 2013. We �nd that a decrease in probability is
signi�cant in 39 of 72 index deletion events (under the same assumption as before). The
number of deletions that can be used for our exercise is much smaller, as frequently a �rm
is deleted due to a merger or an acquisition, which we exclude from our analysis.
Overall, our results con�rm that the probability of simultaneous trading with SPY in-
creases after an addition to the index and decreases after a deletion. We consider this result
as evidence in favor of using the probability of simultaneous trading as a measure of index
trading intensity.
One may be concerned that our results re�ect the overall intensi�cation of trading. How-
ever, in this case, the increase in the trading volume is itself driven by index trading, and
thus represents our e�ect.
Test 3: SPY creations and redemptions
Another way to test our measure is to analyze creations and redemptions of ETF units. Each
Authorized Participant can exchange ETF shares for a basket of underlying securities.30
Imagine that the price of ETF shares happens to be larger than the weighted price of the
underlying securities. An AP purchases the underlying securities in accordance with their
30Although only in large amounts and at a fee.
38
![Page 39: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/39.jpg)
weights in the index and simultaneously sells ETF shares. If the accumulated position is
su�ciently large by the end of the day, the AP brings the underlying securities to the fund
sponsor, receives ETF shares in exchange, and uses them to cover his short position. As a
result, the number of outstanding ETF shares increases and correspondingly increases the
ETF's position in each of its underlying securities. Hence, intensive arbitrage transactions
during the day are re�ected in changes in ETF holdings. Thus, creations and redemptions
can be used to test our measure of the intensity of arbitrage transactions.31
Data and methodology We use the ETF Global database for information on daily hold-
ings of SPY. The sample period starts in 2012. For each individual component, we calculate
daily changes in SPY holdings and take the logarithm of the absolute change, log|∆j,t|.For estimation purposes, we transform the probability of simultaneous trading and cal-
culate the logarithm of the odds as yj,t = lnπj,t
1−πj,t . To test our hypothesis, we estimate a
panel regression of the transformed daily probabilities of simultaneous trading with SPY
on changes in SPY holdings. To distinguish redemptions and creations, we also estimate a
speci�cation that separates positive and negative changes in SPY holdings. We include the
logarithm of the dollar volume and the full set of year-month dummies as controls. Our
database covers the period from 2012 to 2016.
Results Table 6 shows our main results. The coe�cients of log|∆| are all positive and
signi�cant. Our results con�rm that changes in SPY holdings, and thus the intensity of
arbitrage transactions, are associated with a higher probability of simultaneous trading.
In our main exercise, we do not distinguish redemptions and creations. The last column
in Table 6 breaks down changes in SPY position into positive and negative ones. We �nd
no di�erence in the estimated coe�cients, which is also consistent with the interpretation of
measures as re�ecting the intensity of arbitrage transactions.
31Another reason for changes in holdings is rebalancing. However, most ETFs passively track an indexand thus rarely have to rebalance their portfolio. Moreover, rebalancing activity due to changes in the indexcomposition or weightening, typically happens at the end of the day, whereas our measure of probability iscalculated as an average over the entire day.
39
![Page 40: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/40.jpg)
Table 6: Probability of simultaneous trading and SPY redemptions and creations.
Panel A: yj,t = cj + ατ + γlog|∆j,t| + δXj,t + εj,t, where yj,t = lnπj,t
1−πj,t is the logarithm of
the odds, πj,t is the average probability of simultaneous trading of stock j and SPY in yeart, log|∆j.t| is a change in SPY holdings of �rm j, and controls Xj,t contain the logarithmof dollar volume,ατ are year-month dummies. The sample covers the 2012-2016 period.Standard errors are clustered at the �rm level, t-statistics are in parenthesis.
(I) (II) (III)
log|∆| 0.007 0.010(4.53) (6.40)
log|∆|+ 0.010(6.22)
log|∆|− 0.010(6.58)
DollarVolume Yes Yes YesYear-month dummies Yes Yes
R-sq within 0.30 0.47 0.47R-sq between 0.72 0.70 0.70
N 368,528 368,528 368,528N groups 449 449 449
40
![Page 41: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/41.jpg)
Test 4: Mutual fund �re sales
The unique structure of mutual funds gives us another way to test our measure. When a
mutual fund experiences a large out�ow of assets, its asset position has to be reduced. If
the selling pressure is large enough, it can temporarily drive down the prices of individual
securities. If the prices of some of the individual S&P 500 components fall, the overall value
of the index decreases and thus deviates from the price of SPY shares. This opens up an
arbitrage possibility. Hence, we should observe intensi�cation of index arbitrage transactions,
which should be captured by our probability of simultaneous trading with SPY.
Data and methodology We follow the literature32 and use Thomson Returns data on
mutual fund holdings and the CRSP mutual funds database to identify exposed mutual
funds and calculate the overall selling pressure on each stock in each quarter. Ideally, the
�re sales of mutual funds should represent a series of sell orders unrelated to any news
about stock value. Thus, we follow Edmans et al. (2012) and calculate the (logarithm) of
the total mutual fund hypothetical sales of stock j in quarter t, denoted as log|MFHSj,t|,see Internet Appendix A.8 for details. We estimate a panel regression of our transformed
probability of simultaneous trading on mutual fund �re sales. We include the logarithm of
the dollar volume and an indicator on zero sales as controls, and also include year-quarter
dummies.
Results Table 7 displays our results. The coe�cient of log|MFHSj,t| is positive and
signi�cant in all speci�cations. Active mutual fund sales are associated with an increased
probability of simultaneous trading with SPY, which is consistent with our interpretation of
increased arbitrage activity.
Comparison to other measures
In what follows, we brie�y identify and discuss issues of commonly used arbitrage measures.
A standard way to measure the intensity of ETF arbitrage at the stock level is to use
changes in ETF holdings. Ben-David et al. (2018) construct an ETF ownership measure by
32See for example Edmans et al. (2012), Dessaint et al. (2018), Phillips and Zhdanov (2013). See also therecent discussion of this identi�cation scheme in Wardlaw (2018) and Berger (2019).
41
![Page 42: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/42.jpg)
Table 7: Probability of simultaneous trading and mutual fund �re sales.
Panel A: yj,t = cj + αt + γlog|MFHSj,t|+ δXj,t + εj,t, where yj,t = lnπj,t
1−πj,t is the logarithm
of the odds, πj,t is the average probability of simultaneous trading of stock j and SPY inyear t, log|MFHSj,t| denotes mutual fund �re sales of stock j in quarter t, and controls Xj,t
contain the logarithm of dollar volume and the indicator on zero MFHS, αt are year-quarterdummies. We only consider domestic equity funds, and exclude sectoral funds. The �owthreshold is set at 5%. Standard errors are clustered at the �rm level, t-statistics are inparenthesis.
(I) (II)
log|MFHS| 0.074 0.032(7.62) (4.34)
ZeroMHFS Yes YesDollarVolume Yes YesYear-quarter dummies No Yes
R-sq within 0.41 0.83R-sq between 0.87 0.85
N 18,873 18,873N groups 430 430
42
![Page 43: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/43.jpg)
multiplying assets under management by the index weight of the security. A similar approach
is taken in Brown et al. (2018) and Israeli et al. (2017). Currently, ETF redemptions and
creations data are available directly at daily frequency. As far as we are aware, only Da and
Shive (2018) take a di�erent approach and use ETF turnover to measure arbitrage intensity,
but at the ETF level only.
To start, measures based on ETF ownership cannot account for passive investment.
Growth in the assets under management does not necessarily translate into growth in the in-
tensity of arbitrage transactions. Even for SPY, we have seen substantial �uctuations in the
dollar volume and the overall intensity of trading, even though its assets have been steadily
growing. Moreover, as documented using index weights and AUM may be imprecise for a
sizable proportion of funds that replicate their target index by investing in a representative
basket of securities rather than the entire index basket. Brogaard et al. (2019) document
that at least 22% of ETFs replicate their target index, including six of the ten largest ETFs
as of 2018.
Another concern is that measures based on ETF creations and redemptions do not ac-
count for the netting of positions over the day. One can only observe the end of the day
holdings. However, during the day arbitrageurs can trade in both directions, arbitraging
away positive and negative deviations between the ETF price and the price of the underly-
ing basket. If the frequency and magnitude of positive and negative deviations are roughly
comparable, the accumulated end-of-the-day position can be relatively small. Thus, we may
have days with quite intensive arbitrage transactions, but close to zero net changes in the
holdings. Hence, by using changes in ETF holdings, one may signi�cantly underestimate the
intensity of arbitrage transactions. Moreover, Evans et al. (2019) document that authorized
participants have incentives to delay creation of ETF shares. ETF creation can only be done
in bulk (typically 50,000 ETF shares) and at a fee, thus APs may prefer to wait until their
position builds up to a size of the creation unit, which might take several days. In addition, if
the underlying market is not perfectly liquid and if the ETF's order �ow if expected to mean
revert, APs may prefer to take the unhedged position by only selling overvalued ETF shares
and waiting for the mispricing to be reversed (Evans et al. (2019) call such selling of ETF
shares that have not yet been created as 'operational shorting'). Evans et al. (2019) argue
that the time period between the original mispricing/arbitrage event and the corresponding
creation event may be as large as six days. Hence, daily creations and redemptions repre-
43
![Page 44: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/44.jpg)
sent highly imprecise and noisy measure of the intensity of arbitrage, which can explain the
absence of any e�ects of creations and redemptions on the underlying markets documented
by Box et al. (2019).
More importantly, measures that use �xed index weights are speci�cally constructed to
re�ect ETF creations and redemptions, and therefore do not account for statistical arbitrage.
As we have seen, �rms with larger index weight are traded more often simultaneously with
SPY (controlling for the intensity of overall trading), which suggests the presence of statistical
arbitrage. Not accounting for statistical arbitrage distorts the estimates of the intensity of
arbitrage transactions: overestimates for the small �rms, and underestimates for the large
�rms.
In contrast, our approach allows us to measure arbitrage intensity over any interval of
the day and accounts for statistical arbitrage. Thus, it represents a more precise measure of
the intensity of arbitrage transactions than existing approaches in the literature.
Our direct approach has yet another advantage. It allows us to separately measure the
e�ect of a stock characteristic on the intensity of arbitrage transactions from its e�ect on
the magnitude of distortion that these arbitrage transactions cause. For example, without a
direct measure of arbitrage intensity, it is impossible to separately measure these two opposite
e�ects of liquidity (see the discussion in Section 2). Shim (2018) estimates price impact to
account for di�erential stock price sensitivity to mechanical arbitrage, but he abstracts from
statistical arbitrage and thus does not consider the di�erential e�ect of liquidity on the
intensity of arbitrage transactions across �rms. In contrast, because we directly measure the
intensity of arbitrage transactions, we can assess both e�ects independently. In particular,
our results show a strong e�ect of liquidity on the market distortion, but a negative e�ect of
liquidity on arbitrage intensity, thus further con�rming our mechanism.
More generally, our measure can be viewed as complementary to existing measures of
algotrading intensity. It is worth emphasizing, that our measure only requires access to a
widely used commercial dataset such as TAQ, while existing measures of algotrading require
access to regulatory data. Limited access to regulatory data is one of the reasons why
measures of statistical arbitrage are scarce. For example, NASDAQ only provides a limited
sample of randomly selected stocks (Brogaard et al. (2014) only use the data on 120 stocks)
which makes it impossible to study the cross-market arbitrage. We are aware of only one
study, Buyuksahin and Robe (2014) that uses a unique regulatory dataset to study cross
44
![Page 45: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/45.jpg)
arbitrage in the futures markets, by comparing the positions of the same traders at the
equity futures and the commodity futures markets. Brogaard et al. (2019) study cross
market activities of HFTs in 15 Canadian stocks that are not cross-listed in the Unites
States, however, they do not try to separately identify arbitrage activity33. In contrast,
our measure can be easily calculated and applied to an arbitrary combination of assets that
one wishes to study over an arbitrary period of time. It should also be noted that recently
created MIDAS data cannot be directly used to measure statistical arbitrage. For example,
the correlations between the four measure of algotrading calculated for the SPY and for
the individual constituents show very low correlations and thus cannot be used to pick up
variation in arbitrage activity.
7 Additional evidence: ETF-level analysis
So far, we have focused on one particular ETF (SPY) and explored the intensity of arbitrage
transactions at the individual �rm level. To further support our results, we perform an
additional empirical exercise, this time exploring the reaction of ETFs to oil shocks as a
function of the intensity of arbitrage transactions.
Intensity of arbitrage transactions at the ETF level
To predict the intensity of arbitrage transactions at the ETF level, we propose to use a
measure of liquidity at the ETF market. Indeed, arbitrage activity requires transactions
with ETF shares. If the ETF market is not liquid enough, certain price deviations can stay
unarbitraged for some time, as it may not be pro�table to eliminate them by purchasing or
selling ETF shares in a thin market. Therefore, more liquid ETFs can be more involved in
intensive arbitrage.
One of the standard ways to measure liquidity is to use the bid-ask spread. The spread
is basically zero for SPY, but can be extremely large for smaller ETFs trading less liquid
stocks.
33It should also be noted, that classi�cation of traders based on their trading patterns is not straightforwardeven with regulatory data. Large trading �rms such as Goldman tend to utilize multiple IDs. To solve thisissue, Brogaard et al. (2014) only work with 26 small independent proprietary �rms.
45
![Page 46: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/46.jpg)
Methodology
We follow the same approach as described above to estimate oil betas for each ETF. Similarly
to our main exercise, we then estimate the following �xed e�ect panel regression:
βj,t = cj + αt + γtsj,t + δ′tXj,t + εj,t (3)
where βj,t is the estimated oil beta of ETF j in year t, sj,t is (the logarithm of) the average
bid-ask spread on the ETF j's market over the same year, and Xj,t is a vector of control
variables. Controls include the logarithm of assets under management. The coe�cient cj
captures the unobserved time-invariant individual e�ect, which can be interpreted as the
true sensitivity to oil shocks of a particular set of �rms that an ETF tracks.
Data and descriptive statistics
We consider U.S. non-energy equity ETFs. We download a list of equity ETFs from etf.com
and keep those with more than $250 mln in assets under management as of November 2018.
In total that leaves us with 540 funds. As we only keep U.S. equity funds, and exclude
leveraged and inverse ETFs, the remaining sample contains 302 funds. We do not have high
frequency data on 18 ETFs, and thus the �nal sample has 284 funds. We �nd only a very
limited number of energy-related ETFs: 5 funds invest in a broad set of energy related �rms,
one ETF specializes in equipment and services, and two on exploration and production. We
exclude these funds from our analysis. As before, we use the Trade and Quote Database
(TAQ) for the high-frequency data, and our sample covers the period from 2005 to 2016.
The main source of data on ETFs is the ETF Global database. We obtain the data on
NAV and shares outstanding from 2006 to calculate assets under management for the entire
period. However, other variables, including expenses, are available only from 2012. As most
funds do not change expenses over time, we do not incorporate it into our analysis.
We use two main sources to calculate the bid-ask spread: CRSP and WRDS Intraday
Database that reports the best bids and o�ers as of 1 pm. We use these quotes to calculate
the bid-ask spread percentage. Unfortunately, the WRDS intraday indicators are calculated
up to 2014 only, when TAQ switched to milliseconds. ETF Global also provides a measure
of the bid-ask spread, but is available starting from 2012 only.
46
![Page 47: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/47.jpg)
Table 8: Di�erent measures of bid-ask spreads.
ETF Global o�ers average intraday bid and asks, but only after 2012. CRSP reports the best bid
and o�er at market closing. WRDS Intraday Database reports the best bids and o�ers as of 1 pm;
the data are available up to 2014. The table reports the descriptive statistics and cross-sectional
correlations of di�erent measures of spreads (in logs) over the overlapping period 2012-2014.
Panel A: Descriptive statistics
Mean Std.Dev Min MaxETF Global -6.26 1.39 -9.82 -1.32CRSP -7.25 0.92 -9.84 -3.15WRDS, 1pm -7.16 0.88 -9.81 -3.53
Panel B: Cross-sectional correlations
ETF Global CRSP WRDS, 1pmETF Global 1.00CRSP 0.45 1WRDS, 1pm 0.46 0.94 1
Descriptive statistics
We provide detailed information on the composition of our sample in Table 12 in the Internet
Appendix. The vast majority of our ETFs are broad-based funds investing in a broad set of
stocks.
Table 12 also displays estimated oil betas. We can see a familiar shift from negative to
positive betas in 2008-2009. By comparing Table 12 with Table 11, we notice that oil betas
are smaller for ETFs compared to the S&P 500 �rms. Industry ETFs invest in a broader set
of �rms, including smaller �rms outside major indices. Hence, we �nd that larger �rms on
average tend to display a stronger reaction to oil shocks. This result is yet another piece of
evidence in favor of the index trading hypothesis.
Table 8 provides some basic statistics for our measures of the bid-ask spread. CRSP and
47
![Page 48: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/48.jpg)
WRDS show similar �rst and second sample moments, and the correlation is also large, 0.94.
The ETF Global measure has a larger mean and variance, and it is less correlated with the
other two measures.In our main exercise, we will use CRSP and WRDS bid-ask measures.
Results
Table 9 reports our main results. The spread coe�cients are negative and signi�cant for
both measures of ETF liquidity and for most years. The negative sign means that more
liquid ETFs that have lower spreads display a stronger reaction to oil shocks. As a lower
spread is likely to indicate more active arbitrage transactions, this con�rms our previous
�rm-level results that intensive index trading is associated with a stronger response to oil
shocks. To see the magnitude of the e�ect, consider a decrease in the CRSP (log) spread
from the third to the forth quantile, which equals -1.15. Such a decrease in the spread would
imply an increase in oil beta by 0.022 - 0.026 in the earlier period of our sample (we use 1
minute returns). Thus, the e�ect is similar in magnitude to the e�ect of index arbitrage on
oil betas for the stocks.
We �nd a lack of signi�cance in the later years. One potential reason is that ETFs have
grown substantially and have reached su�cient liquidity. Liquidity of ETF markets is no
longer an issue, and the intensity of arbitrage transactions is driven only by the liquidity of
the underlying markets. In this case, the bid-ask spread on the ETF market becomes a poor
predictor of the intensity of arbitrage transactions.
Overall, our ETF-level analysis con�rms our previous results showing that index trading
intensi�es the market reaction to fundamental oil shocks.
8 Conclusion
Exchange traded funds have transformed the investment industry. By providing access to the
markets to a broader set of market participants, ETFs have enhanced trading opportunities
and have improved risk sharing. However, the side e�ects of such a drastic transformation can
be substantial. As arbitrage is an essential feature of ETF trading, we investigate the e�ect
of arbitrage on price discovery. We introduce a new measure of the intensity of arbitrage
transactions and use it to show that ETF arbitrage a�ects price e�ciency by distorting the
48
![Page 49: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/49.jpg)
Table 9: Impact of the probability of joint trading on oil betas in the panel of U.S. non-energyequity ETFs.We estimate the following panel regression of ETF betas on bid-ask spread: βj,t = cj + αt +γtsj,t + δ′tXj,t + εj,t, where βj,t is the estimated oil beta of ETF j in year t, sj,t is (the logarithmof) the average bid-ask spread on the ETF j's market over the same year, and Xj,t is a vector of
control variables, cj corresponds to unobserved ETF-level individual e�ects. Controls include the
logarithm of assets under management. St.err are clustered at the ETFs level, second column in
displays t-statistics.
The sample starts in 2010 and ends in 2016 when we use CRSP bid-ask data, or in 2014 if we use
WRDS 1 pm data.
CRSP WRDS 1pm1 min returns 5 min returns 1 min returns 5 min returns
Spread -2010 -0.023 (-2.68) -0.011 (-1.07) -0.031 (-4.31) -0.038 (-3.33)2011 -0.023 (-3.26) -0.010 (-1.00) -0.021 (-3.38) -0.026 (-2.21)2012 -0.019 (-3.16) -0.024 (-2.30) -0.015 (-1.76) -0.028 (-2.15)2013 -0.013 (-2.48) -0.020 (-2.17) -0.017 (-2.64) -0.040 (-3.31)2014 -0.001 (-0.23) 0.004 (0.52) 0.003 (0.44) -0.008 (-0.68)2015 -0.004 (-0.74) 0.003 (0.27)2016 0.005 (0.45) 0.011 (0.66)
AUM Yes Yes Yes YesTime varying- αt Yes Yes Yes Yes- γt, δt Yes Yes Yes Yes
R-sq within 0.31 0.58 0.24 0.58R-sq between 0.33 0.23 0.39 0.35
N 1,385 1,385 1,010 1,010N groups 247 247 247 247
49
![Page 50: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/50.jpg)
market reaction to fundamental shocks.
However, our results are likely to underestimate the e�ect of the ETF presence on the
market reaction to oil shocks. So far, we have only considered the mechanical arbitrage
channel that describes how the propagation of fundamental shocks can be distorted, whereas,
for example, the informational channel can explain why certain shocks become more likely
to hit the market, as informed traders reoptimize their trading strategies when ETFs are
traded.
In a recent theoretical paper, Bhattacharya and O'Hara (2018) study the informational
e�ects of ETFs in a factor model of prices. A newly introduced ETF attracts new clientele
with additional information about the fundamental values of the underlying securities, both
the systemic factor and individual shocks. The market makers on the underlying markets can
now extract information from the ETF price. However, if informed traders use short-term
trading strategies, the introduction of an ETF facilitates a herding equilibrium (in the spirit
of Froot et al. (1992)). In a herding equilibrium, informed traders rationally choose to trade
based on information on the systematic factor only and completely disregard all idiosyncratic
information. Each informed trader on the market for each individual security, can accurately
anticipate the information �ow into their own market once market makers learn from observ-
ing other markets and the ETF price, and thus can foresee the price bump and pro�t from
it. As a result, in equilibrium asset prices exhibit lower informational e�ciency. In other
words, the market reaction to news is distorted, as all individual information is disregarded
and never becomes re�ected in prices. It should also be noted that endogenous information
acquisition and processing further reinforce the herding. Over time, any incentives to acquire
and process idiosyncratic information cease if informed traders use short-term strategies and
trade based on a systematic piece of information only.
Some of our results are suggestive of herding. We document a dramatic shift in average
oil sensitivities at the end of 2008. In our analysis, we restrict the sample to start in 2010 to
avoid this shift, and we include sector by year indicators to pick up any �uctuations in average
industry sensitivities over time. We maintain the assumption that any changes in average
industry sensitivities to oil shocks are not related to ETF arbitrage and potentially re�ect
fundamental changes in the U.S. market (perhaps due to the development of unconventional
oil). However, instead, these changes may be yet another consequence of the presence of
ETFs, perhaps due to a switch to a herding equilibrium. More research is needed to test the
50
![Page 51: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/51.jpg)
informational channel, we plan to pursue this in future research by extending our ETF-level
analysis.
References
Agarwal, V., P. Hanouna, R. Moussawi, and C. W. Stahel (2018). Do ETFs increase the
commonality in liquidity of underlying stocks? Working Paper.
Anatolyev, S., S. Seleznev, and V. Selezneva (2018). The role of inventories in shaping the
behavior of oil prices. Working paper.
Anatolyev, S., S. Seleznev, and V. Selezneva (2019). Does index arbitrage distort market
reaction to oil shocks? Working paper.
Barberis, N., A. Shleifer, and J. Wurgler (2005). Comovement. Journal of Financial Eco-
nomics 75 (2), 283�317.
Ben-David, I., F. Franzoni, and R. Moussawi (2018). Do ETFs increase volatility? Journal
of Finance 73 (6), 2471�2535.
Berger, E. (2019). Selection bias in mutual fund �ow-induced �re sales: Causes and conse-
quences. Working paper.
Bhattacharya, A. and M. O'Hara (2018). Can ETFs increase market fragility? E�ect of
information linkages in ETF markets. Working paper.
Bollerslev, T., S. Z. Li, and V. Todorov (2016). Roughing up beta: Continuous versus
discontinuous betas and the cross section of expected stock returns. Journal of Financial
Economics .
Bollerslev, T., V. Todorov, and S. Z. Li (2013). Jump tails, extreme dependencies, and the
distribution of stock returns. Journal of Econometrics 172 (2), 307�324.
Box, T., R. Davis, R. Evans, and A. Lynch (2019). Intraday arbitrage between ETFs and
their underlying portfolios. Working Paper.
51
![Page 52: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/52.jpg)
Brogaard, J., D. Heath, and D. Huang (2019). The rise of ETF trading and the bifurcation
of liquidity. Working Paper.
Brogaard, J., T. Hendershott, and R. Riordan (2014). High-frequency trading and price
discovery. The Review of Financial Studies 27 (8), 2267�2306.
Brogaard, J., T. Hendershott, and R. Riordan (2019). Price discovery without trading:
Evidence from limit orders. The Journal of Finance 74 (4), 1621�1658.
Brown, D. C., S. Davies, and M. Ringgenberg (2018). ETF �ows, non-fundamental demand,
and return predictability. Working paper.
Buyuksahin, B. and M. A. Robe (2014). Speculators, commodities and cross-market linkages.
Journal of International Money and Finance 42, 38�70.
Chaboud, A. P., B. Chiquoine, E. Hjalmarsson, and C. Vega (2014). Rise of the machines:
Algorithmic trading in the foreign exchange market. The Journal of Finance 69 (5), 2045�
2084.
Coval, J. and E. Sta�ord (2007). Asset �re sales (and purchases) in equity markets. Journal
of Financial Economics 86 (2), 479�512.
Da, Z. and S. Shive (2018). Exchange traded funds and asset return correlations. European
Financial Management 24 (1), 136�168.
Dessaint, O., T. Foucault, L. Fresard, and A. Matray (2018). Noisy stock prices and corporate
investment. Rotman School of Management Working Paper 2707999.
Edmans, A., I. Goldstein, and W. Jiang (2012). The real e�ects of �nancial markets: The
impact of prices on takeovers. The Journal of Finance 67 (3), 933�971.
Evans, R., R. Moussawi, M. Pagano, and J. Sedunov (2019). ETF short interest and failures-
to-deliver: Naked short-selling or operational shorting? Working Paper.
Foucault, T., R. Kozhan, and W. W. Tham (2016). Toxic arbitrage. The Review of Financial
Studies 30 (4), 1053�94.
52
![Page 53: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/53.jpg)
Froot, K. A., D. S. Scharfstein, and J. C. Stein (1992). Herd on the street: Informational
ine�ciencies in a market with short-term speculation. The Journal of Finance 47 (4),
1461�1484.
Glosten, L. R., S. Nallareddy, and Y. Zou (2016). ETF activity and informational e�ciency
of underlying securities. Working paper.
Hasbrouck, J. (2009). Trading costs and returns for us equities: Estimating e�ective costs
from daily data. The Journal of Finance 64 (3), 1445�1477.
Hendershott, T., C. M. Jones, and A. J. Menkveld (2011). Does algorithmic trading improve
liquidity? The Journal of Finance 66 (1), 1�33.
Hong, H., J. D. Kubik, and T. Fishman (2012). Do arbitrageurs amplify economic shocks?
Journal of Financial Economics 103 (3), 454�470.
Israeli, D., C. M. Lee, and S. A. Sridharan (2017). Is there a dark side to exchange traded
funds? an information perspective. Review of Accounting Studies 22 (3), 1048�1083.
Kilian, L. and C. Park (2009). The impact of oil price shocks on the us stock market.
International Economic Review 50 (4), 1267�1287.
Lee, C. and E. M. Watts (2018). Tick size tolls: Can a trading slowdown improve price
discovery? Working Paper.
O'Hara, M., C. Yao, and M. Ye (2014). What's not there: Odd lots and market data. The
Journal of Finance 69 (5), 2199�2236.
Phillips, G. M. and A. Zhdanov (2013). R&d and the incentives from merger and acquisition
activity. The Review of Financial Studies 26 (1), 34�78.
Saglam, M., T. Tuzun, and R. Wermers (2019). Do ETFs increase liquidity?. Working Paper.
Shim, J. J. (2018). Arbitrage comovement. Working paper.
Staer, A. and P. Sottile (2018). Equivalent volume and comovement. Quarterly Review of
Economics and Finance 68, 143�157.
53
![Page 54: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/54.jpg)
Todorov, V. and T. Bollerslev (2010). Jumps and betas: A new framework for disentangling
and estimating systematic risks. Journal of Econometrics 157 (2), 220�235.
Wardlaw, M. (2018). Measuring mutual fund �ow pressure as shock to stock returns. Working
paper.
Weller, B. M. (2018). Does algorithmic trading reduce information acquisition? The Review
of Financial Studies 31 (6), 2184�2226.
Wurgler, J. and E. Zhuravskaya (2002, October). Does arbitrage �atten demand curves for
stocks? Journal of Business 75 (4), 583�608.
54
![Page 55: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/55.jpg)
A Internet Appendix
A.1 Estimation of the price impact using high frequency data
To estimate price impact, we use high frequency data. For each day d in our sample, we �rst identify
all 1-sec intervals with non-zero trading and we number these intervals with τ . We then estimate
the following regression of absolute returns on the square root of trading volume in dollars:
|rτ | = αd + βd√Vτ + ετ
This gives us an estimate of βd, which we use as a measure of the price impact over day d. Our
approach is similar to Hasbrouck (2009), Shim (2018), and many other papers that document a
square root relationship between returns and volumes. To create annual measures, we take a simple
average of price impacts over the year. 34Our measure of price impact is conceptually similar to
the Amihud ratio, except that we use high frequency data and specify a square root relationship,
whereas the Amihud ratio takes the average of the daily ratios of absolute returns over trading
volume (also in dollars).
In our regressions, we use logarithms of three liquidity measures: bid-ask spread, the Amihud
ratio, and our measure of price impact. Table 10 shows correlations of three di�erent measures for
our sample of non energy S&P 500 �rms and for the period from 2010 to 2016. We can see that our
measure is highly correlated with the Amihud ratio; the correlation is 0.81, and has 0.63 correlation
with the bid-ask spread. However, for robustness we use all three measures in our analysis.
Table 10: Correlation of di�erent measures of liquidity and price impact.
Bid ask spread Price impact Amihud ratioBid ask spread 1Price impact 0.63 1Amihud ratio 0.38 0.81 1
34It should be noted that an estimate of the beta coe�cient in a linear model: |rτ | = αd+βdVτ+ετ ,as well as the price impact calculated as the total sum of absolute values of 1-sec returns over thetotal volume(
∑τ |rτ |/
∑τ Vτ ) produce very similar time series of the price impact which highly
correlated with each other. Our main results also remain unchanged.
55
![Page 56: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/56.jpg)
A.2 Descriptive statistics
Table 11: Estimated oil betas for the �rms in the S&P 500 index.
Panel A: Average oil beta by sector.The table displays average oil betas calculated using 1 minute returns.
2005-2006 2007-2008 2009-2013 2014-2016
Basic Materials -0.05 0.01 0.11 0.11Consumer -0.07 -0.08 0.06 0.05Financial -0.06 -0.12 0.07 0.05Health Care -0.07 -0.04 0.04 0.04Industrials -0.07 -0.05 0.07 0.06Information Technology -0.09 -0.05 0.07 0.06Real Estate -0.04 -0.08 0.07 0.02Telecommunications -0.07 -0.05 0.07 0.05Utilities -0.05 -0.01 0.04 0.04Energy 0.42 0.45 0.31 0.40
Panel B: Average probability of simultaneous trading with SPY by sector
2005-2006 2007-2008 2009-2013 2014-2016
Basic Materials 0.04 0.12 0.10 0.09Consumer 0.05 0.13 0.11 0.11Financial 0.04 0.15 0.13 0.11Health Care 0.05 0.11 0.10 0.11Industrials 0.04 0.12 0.10 0.10Information Technology 0.07 0.14 0.12 0.12Real Estate 0.02 0.09 0.08 0.07Telecommunications 0.05 0.12 0.13 0.15Utilities 0.04 0.10 0.08 0.10Energy 0.08 0.17 0.14 0.16
56
![Page 57: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/57.jpg)
Table 12: Descriptive statistics of U.S. equity ETFs.
Panel A: Sector composition of the U.S. equity ETFs.
2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
Broad-based 52 63 74 84 86 101 114 121 139 153 167 175
Basic Materials 3 3 3 3 3 3 3 3 3 3 3 3
Consumer 6 6 7 9 9 8 8 8 11 11 11 11
Financial 3 7 7 8 10 10 10 11 13 13 13 13
Health Care 3 5 6 7 7 8 10 8 10 11 11 11
Industrials 3 3 3 3 3 4 4 4 5 5 5 5
Internet & Softw 1 2 3 3 4 4 4 4 3 3 4 4
Real Estate 4 4 5 5 5 5 7 7 7 7 9 10
Telecomm 1 2 2 2 2 2 2 2 2 2 2 2
Utilities 3 3 3 5 5 4 5 5 5 6 6 6
Energy 3 3 4 4 4 4 4 4 5 5 5 5
Oil & Gas Eq&Serv 0 1 1 1 1 1 1 1 1 1 1 1
Oil & Gas E&P 0 2 2 2 2 2 2 2 2 2 2 2
Other Sectors 11 22 22 22 24 27 28 30 33 35 36 36
Total 93 126 142 158 165 183 202 210 239 257 275 284
Panel B: Average oil betasThe table displays average oil betas calculated using 1 minute returns.
2007-2008 2009-2013 2014-2016
Broad-based ETFs -0.006 0.032 0.027
Basic Materials 0.024 0.085 0.066
Consumer -0.023 0.015 0.016
Financial -0.081 0.039 0.024
Health Care -0.012 0.007 0.018
Industrials -0.010 0.028 0.023
Internet and Software 0.003 0.005 0.029
Real Estate -0.062 0.047 0.011
Telecommunications -0.012 0.014 0.024
Utilities 0.018 0.015 0.018
Energy 0.253 0.144 0.271
Oil & Gas Equipment & Services 0.059 0.119 0.189
Oil & Gas Exploration & Production 0.147 0.263 0.381
Other Sectors -0.017 0.034 0.048
57
![Page 58: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/58.jpg)
A.3 Subsamples
Although the development of unconventional oil had already gained momentum by 2010,
U.S. oil production dramatically increased afterwards. It is likely that the U.S. economy has
continued adjusting to new sources of domestic oil, new �nancing of the oil industry, and
thus the sensitivities of �rms to oil has continued to evolve. Hence, our assumption of time
invariant individual e�ects may be too strong. To show that our results are not driven by
changes in the fundamental sensitivites to oil, we break the sample into two subperiods. As
2014 is characterized by a signi�cant collapse of oil prices that triggered a series of defaults
in the oil producing industry, we break the sample at the end of 2013.
Table 13 breaks the sample period into 2010/13 and 2014/16. The point estimates are
large and positive for all years in both subsamples and for both frequencies of returns. As
before, we observe signi�cant results for the second subsample and, in addition, for some
years in the �rst subsample. Again, the results are stronger for 5 minute returns. Overall
the results are similar.
58
![Page 59: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/59.jpg)
Table 13: Impact of the probability of joint trading on oil betas in the cross section of S&P500 �rms by subperiod 2010-2013 and 2014-2016..
1 min returns 5 min returns2010-13 2014-16 2010-13 2014-16
Prob -2010 1.03 1.83
(1.32) (1.60)2011 0.69 1.39
(1.36) (1.88)2012 0.77 1.28
(1.28) (1.47)2013 0.83 1.94
(1.62) (2.56)2014 0.29 0.74
(1.61) (2.68)2015 0.30 0.68
(1.80) (2.64)2016 0.46 0.76
(2.45) (3.09)
Controls(mcap,book-to-market,turnover, dollarvolume, priceimpact)
Yes Yes Yes Yes
Year sectore�ects
Yes Yes Yes Yes
Time varyingγt, δt
Yes Yes Yes Yes
R-sq within 0.48 0.50 0.59R-sq between 0.01 0.01 0.02
N 1,606 1,271 1,606 1,271N groups 418 440 418 440
59
![Page 60: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/60.jpg)
A.4 Conditional probability
For robustness, we also use an estimate of conditional probability calculated as
πcj,t =
T∑τ=1
IVETF,τ>0IVj,τ>0
T∑τ=1
IVj,τ>0
.
By conditioning on �rm j's trading, we can further �lter our intensity of overall trading
from our measure of the intensity of arbitrage transactions. In our sample, πcj,t lies in
the range from 0.70 to 0.91, and the �rst, second, and third quantiles are 0.83, 0.85,0.87,
respectively. Thus, surprisingly, we have very little variation in conditional probability. Table
14 repeats our main estimation and shows the results. The results are now even stronger. We
see signi�cant coe�cients in all speci�cations, for all years and both frequencies of returns.
To compare the magnitude of the e�ect, again consider an increase in conditional probability
associated with a move from the �rst to the third quantile. This increase is associated with
an increase in beta by 0.1 (using an estimate for 1 minute returns). Thus, the e�ect is even
larger when we use conditional probabilities.
60
![Page 61: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/61.jpg)
Table 14: Impact of conditional probability of simultaneous trading with SPY on oil betasin the panel of S&P 500 �rms.
The table displays the estimates of the following panel �xed e�ect regression: βj,t = cj + αJ,t +γtπ
cj,t + δtXj,t + εj,t, where βj,t is the estimated oil beta of stock j in year t, πcj,t is the estimated
probability of simultaneous trading of stock j and SPY conditional on trading of stock j over thesame period, αJ,t are year by sector dummies, and Xj,t is a vector of control variables. Controls
include logarithms of market capitalization, book-to-market ratio, dollar volume, turnover, volatility,
and three measures of liquidity and price impact: bid-ask spread, the price impact coe�cient
estimated using high frequency data, and the Amihud ratio. St.err are clustered at the �rm level,
t-statistics in parenthesis. The sample period covers 2010-2016.
1 min returns 5 min returns
(4) (5) (6) (7) (8) (4) (5) (6) (7) (8)
ProbTrad 2.58 3.14 2.55 2.45 3.60 4.34 3.46 3.39
(2.19) (2.41) (2.26) (2.10) (2.05) (2.22) (2.07) (1.93)
Prob -
2010 3.75 (2.03) 5.39 (1.94)
2011 2.06 (4.01) 3.08 (5.66)
2012 1.51 (3.29) 2.31 (3.63)
2013 1.18 (2.61) 1.51 (2.28)
2014 1.68 (2.97) 1.40 (1.62)
2015 1.46 (2.79) 2.16 (2.82)
2016 1.40 (2.52) 2.54 (3.16)
Controls(mcap,
book-to-
market,
turnover,
dollar volume)
Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
Bid-ask spread Yes Yes
Price impact Yes Yes
Amihud Yes Yes
Volatility Yes Yes
Year sector
e�ects
Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
Time varying
γt, δt
Yes Yes
R-sq within 0.34 0.37 0.34 0.36 0.40 0.54 0.55 0.54 0.55 0.58
R-sq between 0.08 0.12 0.09 0.12 0.00 0.12 0.14 0.10 0.14 0.001
N 2,876 2,884 2,884 2,884 2,884 2,876 2,884 2,884 2,884 2,884
N groups 444 445 445 445 445 444 445 445 445 445
61
![Page 62: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/62.jpg)
A.5 Oil betas vs. market betas
The cash �ow sensitivities to the oil price are largely driven by the nature of business and
re�ect a lot of idiosyncratic variation. As a result, our oil betas are quite di�erent from the
market betas. To show that, we calculate the cross-sectional correlations of the market betas
with our oil betas. To calculate market betas, we use daily returns on the stocks in the S&P
500 index and the SPY returns. As in the main exercise, we use 1 minute and 5 minute
announcement returns to estimate oil betas. The correlations are calculated separately for
each calendar year.
Figure 3 displays the results. Before 2008 the correlations were extremely low suggesting
absolutely no connection between oil and market betas. The oil betas of most non-oil related
companies were negative (see Figure2) in line with conventional wisdom. Intuitively, an
increase in oil prices raises input costs for most business, as well as forces consumers to
spend more money on gasoline and less on everything else. Thus, although oil risk could
represent a market source of risk, the idiosyncratic component outweighed the systematic
one.
By 2012 the oil betas became positive and the correlation increased dramatically. One
potential reason behind these changes is the shale boom. The development of unconventional
oil through various channels, including the high indebtedness of the energy sector, could
have ampli�ed the importance of the systematic component of the oil price �uctuations
(seeAnatolyev et al. (2019)). However, the correlation fell below 0.1 in 2013 right in the
midst of the shale boom. As the unconventional oil production continued to gain momentum
in 2013, it is highly unlikely that the shale boom permanently transformed oil price risk into
a market wide risk.
In sum, oil betas re�ect mostly idiosyncratic variation. Hence, our approach of focusing
on oil betas rather than on market betas allows to mitigate a concern that joint exposure to a
market wide risk triggers simultaneous trading at the stock and ETF markets in response to
oil news and thus drives a spurious relationship between our measure of arbitrage intensity
and betas.
62
![Page 63: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/63.jpg)
Figure 3: Correlation of oil betas and market betas.
2006 2008 2010 2012 2014 2016−0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
Correlation of oil betas and market betas
1 min5 min
A.6 ETF arbitrage and stock volatility
Ben-David et al. (2018) �nd the e�ect of ETF ownership on the volatility of underlying
stocks. As an additional test of our measure, we estimate the relationship between the
probability of simultaneous trading with SPY and the volatility of individual stocks estimated
as the variance of daily returns. Table 15 shows our results: The �rms more actively traded
with SPY indeed have higher volatility, consistent with the original hypothesis. The lower
panel in Table 15 shows the estimates of the interaction coe�cients added to speci�cations
(4)-(6) and estimated over the 2014-2016 period. All interaction terms are positive. If the
intraday measure of price impact is used, the interaction coe�cient is also signi�cant. Hence,
similarly to our main results, the distortive e�ects are stronger for stocks with lower liquidity
and stronger price impact, consistent with the ETF arbitrage mechanism. We repeat the
estimation using realized variance, and obtain similar results (available upon request).
63
![Page 64: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/64.jpg)
Table 15: Impact of the probability of simultaneous trading with SPY on volatility in thepanel of S&P 500 �rms.
The table displays the estimates of the following panel �xed e�ect regression: lnσ2j,t = cj + αJ,t +
γtπj,t + δtXj,t + εj,t, , where σ2j,t is the estimated volatility of stock j in year t, πj,t is the average
probability of simultaneous trading of stock j and SPY over the same period, αJ,t are year by
sector dummies, and Xj,t is a vector of control variables. Controls include logarithms of market
capitalization, book-to-market ratio, dollar volume, turnover, and three measures of liquidity and
price impact: bid-ask spread, the price impact coe�cient estimated using high frequency data, and
the Amihud ratio. St.err are clustered at the �rm level, t-statistics in parenthesis. The sample
period covers 2010-2016.
(3) (4) (5) (6) (7) (8) (9)
ProbTrad 3.22 2.93 2.70 1.32
(5.77) (4.93) (3.73) (2.53)
Prob -
2010 3.61 (3.59) 0.80 (0.6) 1.86 (2.16)
2011 2.85 (3.47) 1.15 (1.34) 0.79 (1.4)
2012 2.91 (3.31) 0.40 (0.45) 0.74 (1.2)
2013 2.50 (2.72) 1.58 (1.56) 0.54 (0.78)
2014 2.52 (3.18) 2.88 (3.21) 0.87 (1.57)
2015 3.22 (4.03) 4.40 (4.68) 1.12 (1.64)
2016 4.01 (5.37) 5.03 (5.88) 1.08 (2.01)
Controls(mcap,
book-to-market,
turnover, dollar
volume)
Yes Yes Yes Yes Yes Yes Yes
Bid-ask spread Yes Yes
Price impact Yes Yes
Amihud Yes Yes
Year sector
e�ects
Yes Yes Yes Yes Yes Yes Yes
Time varying
γt, δt
Yes Yes Yes
R-sq within 0.70 0.71 0.73 0.82 0.73 0.77 0.84
R-sq between 0.54 0.55 0.64 0.84 0.43 0.16 0.85
N 2,877 2,876 2,877 2,877 2,876 2,877 2,877
N groups 444 444 444 444 444 444 444
2014-2016
(1) (2) (3)
ProbTrad 14.11 (2.55) 30.03 (2.89) 33.02 (1.73)
ProbTrad × Bid ask 1.14 (1.83)
ProbTrad × Price impact 1.71 (2.36)
ProbTrad × Amihud 1.19 (1.60)
64
![Page 65: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/65.jpg)
65
![Page 66: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/66.jpg)
A.7 Long-term e�ects
Table 16: Impact of the probability of simultaneous trading with SPY on oil betas in thepanel of S&P 500 �rms.The table displays the estimates of the following panel �xed e�ect regression: βj,t = cj+αJ,t+γtπj,t+δtXj,t+ εj,t, where βj,t is the estimated oil beta of stock j in year t, πj,t is the average probability ofsimultaneous trading of stock j and SPY over the same period, αJ,t are year by sector dummies, andXj,t is a vector of control variables. Controls include logarithms of market capitalization, book-to-
market ratio, dollar volume, turnover, volatility, and three measures of liquidity and price impact:
bid-ask spread, the price impact coe�cient estimated using high frequency data, and the Amihud
ratio. St.err are clustered at the �rm level, t-statistics in parenthesis. The sample period covers
2010-2016.
(3) (4) (5) (6) (7) (8)
ProbTrad 1.34 1.45 1.45 1.49 1.41(2.36) (2.51) (2.49) (2.54) (2.40)
Prob -2010 1.34 (1.78)2011 0.03 (0.03)2012 0.16 (0.15)2013 0.36 (0.27)2014 3.37 (3.04)2015 1.01 (1.5)2016 1.23 (1.88)
Controls(mcap,book-to-market,turnover, dollarvolume)
Yes Yes Yes Yes Yes Yes
Bid-ask spread YesPrice impact YesAmihud YesVolatility Yes
Year sectore�ects
Yes Yes Yes Yes Yes Yes
Time varyingγt, δt
Yes
R-sq within 0.45 0.45 0.45 0.45 0.45 0.48R-sq between 0.02 0.03 0.03 0.03 0.03 0.007
N 2,852 2,851 2,852 2,852 2,852 2,852N groups 440 440 440 440 440 440
66
![Page 67: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/67.jpg)
A.8 Mutual fund �re sales
Mutual fund �re sales are widely used in the literature to identify exogenous variations in
the prices of securities. Ideally, �re sales of mutual funds should represent a series of sell
orders unrelated to any news about the value of stocks. To ensure that selling pressure does
not re�ect fundamental news, it is common to restrict the choice only to funds that invest
in a broad set of stocks rather than cover a narrow sector. Another issue is that managers
of the funds have a choice over which securities to keep and which to sell. To dampen any
strategic consideration, Edmans et al. (2012) suggest using hypothetical sales and not actual
changes in positions. The two will coincide only if the fund proportionally sells all of its
securities in the portfolio. Coval and Sta�ord (2007) have shown that the identi�ed price
pressure is signi�cant and large, but temporary, which con�rms the liquidity pressure story.
The SEC requires all mutual funds to report their asset holdings at the end of each
quarter. We use Thomson Returns data on mutual fund holdings and the CRSP mutual
funds database to identify exposed mutual funds and calculate the overall selling pressure
on each stock in each quarter. Flows to fund k in quarter t are given by the growth rate of
the total net assets under management after adjusting for the change in the market value of
the mutual fund's assets
Flowk,t =TNAk,t − TNAk,t−1(1 +Rk,t)
TNAk,t−1
We identify exposed funds as funds losing at least 5% of their assets. For each individual
stock j we then calculate total hypothetical sales asMFHSj,t =∑
k IFlowk,t<−5%Flowk,tSj,k,t−1,where Sj,k,t−1 denotes the number of shares of �rm j held by fund k in quarter t. It is standard
in the literature to normalize MFHS by trading volume. In our case, that would complicate
the interpretation because periods of high trading volume can be associated with more active
trading and thus higher probability of simultaneous trading.
A.9 Intraday Evidence of Price Discovery and ETF Arbitrage
Our mechanism relies on the assumption that the ETF market plays at least a partial role
in the price discovery process. When new information arrives it can be incorporated into the
ETF price �rst, and only afterwards be transmitted to the underlying market by arbitrage. In
67
![Page 68: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/68.jpg)
this section, we investigate the behavior of quotes and volumes around our announcements.
We pick sixteen oil inventory announcements that induced the largest movements in the
price of oil, eight negative and eight positive. Our goal is to examine how new information is
incorporated into the prices following each announcement. In the �rst exercise, we investigate
the behavior of quotes. We use the data on best bids and o�ers at each second prepared
by WRDS to calculate the midpoints. When calculating the quotes for the underlying
portfolio, we use the quotes of the underlying constituents and weight them with the index
weights. 35 We plot the cumulative midpoint returns over a two minute period following each
announcement on Figures 4 and A.9. Each picture corresponds to a particular announcement
day; the red line corresponds to the SPY, and the blue line re�ects the underlying portfolio.
A number of interesting observations emerge. First, many days are characterized by
a distinct jump which occurs simultaneously on the ETF and the underlying markets. It
implies that the ETF and the underlying portfolio react simultaneously to news. Hence, our
results contradict the �ndings of Box et al. (2019) who argue that the underlying portfolio
tends to incorporate new information faster. It should be noted that we use higher frequency
data than Box et al. (2019), and still we �nd no di�erence in the timing of responses.
Moreover, we �nd that in most cases SPY overreacts to news relative to the underlying
portfolio. Indeed, the red line lies above the blue line when the announcement return is
positive, and lies below when the return is negative. This e�ect is especially pronounced
when the announcement return is large. Hence, SPY plays at least a partial role in the
price discovery process. Importantly, if a price deviation or a mispricing caused by SPY
overreaction is large enough for an arbitrage opportunity to open up, its direction is such
that arbitrageurs would push the price of the underlying securities in the direction of the
initial shock. For example, when the announcement return is positive and SPY overreacts,
arbitrageurs should purchase the underlying stocks, pushing prices upwards, consistent with
our story.
We also document a widening of the bid-ask spread at the underlying market following
each announcement. Figure 5 depicts the di�erence in cumulative changes in the best bids
and o�ers. To save the space, we only show the results for the four largest announcements
(other days show similar pattern). We see a clear increase in the spread immediately after
an announcement and a slow convergence afterwards, which is consistent with an increased
35We use the weights provided by ETF Global.
68
![Page 69: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/69.jpg)
Figure 4: Cumulative midpoint returns following oil inventory announcements.
Each picture corresponds to one of the 16 announcement days characterized by the largest oil price
announcement returns. We consider a two minute period following each announcement; the x-axis
runs from 5 seconds before the announcement to 2 minutes after. The red line re�ects the cumulative
midpoint SPY returns, the blue line corresponds to the underlying portfolio cumulative midpoint
returns.
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20160210
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20161102
nav spy
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20160113
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20161026
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20160908
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20160511
nav spy
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20151209
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20160218
69
![Page 70: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/70.jpg)
Figure 3 (continue)
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20150415
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20151007
nav spy
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20150422
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20150219
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20150708
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20160727
nav spy
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20150902
0 60 120−3
−2
−1
0
1
2
3x 10
−3 20150211
70
![Page 71: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/71.jpg)
Figure 5: Di�erence in cumulative returns on best bid and best ask for the underlyingportfolio following oil inventory announcements.
0 60 120−1
−0.5
0
0.5
1x 10
−3 20160210
0 60 120−1
−0.5
0
0.5
1x 10
−3 20161102
nav
0 60 120−1
−0.5
0
0.5
1x 10
−3 20160113
0 60 120−1
−0.5
0
0.5
1x 10
−3 20161026
71
![Page 72: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/72.jpg)
Figure 6: Directional volume and returns.Each dot represents one of the 16 announcement days with the largest oil price movements.
−2 −1 0 1 2
x 10−3
−0.4
−0.3
−0.2
−0.1
0
0.1
0.2
0.3
0.4
0.5SPY
return
volu
me
ratio
−2 −1 0 1 2
x 10−3
−0.4
−0.3
−0.2
−0.1
0
0.1
0.2
0.3
0.4
0.5Basket
return
volu
me
ratio
72
![Page 73: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/73.jpg)
buying or selling pressure in response to news arrivals. This result again contradicts the
�ndings of Box et al. (2019) that bid and ask adjust smoothly and symmetrically to the
arrival of new information to the underlying market.
To provide additional evidence, we investigate the behavior of trading volumes. We
consider the directional volume of the ETF and underlying portfolio for the two minute pe-
riod following each announcement. The directional volume is calculated as the normalized
di�erence between buy and sell volume over the two minute period following an announce-
ment: dV olt =BuyV olt − SellV oltBuyV olt + SellV olt
. Buy (sell) volume is the trading volume occurring at
prices above (below) the prevailing midpoint36. For the underlying portfolio, we take the
portfolio-weighted average of the directional volumes of the individual constituents.
Figure 6 shows the results. Both the ETF and the underlying portfolio display a positive
link between returns and order �ow. Marketable buy orders arrive more frequently when the
return is positive. Our results again contradict Box et al. (2019) who show that overvalued
ETFs are typically purchased following the mispricing despite a generally negative trend in
ETF returns that corrects the initial mispricing. More generally, Box et al. (2019) do not
�nd any evidence that directional trading in the ETF market has any relation to midpoint
ETF returns, and thus argue that ETF order �ow appears to be devoid of information. In
contrast, we �nd a strong positive link even when we use one minute intervals outside the
announcement window. The median correlation of one minute SPY returns and directional
volumes is 0.44, where the median is taken over our chosen announcement days37.
We would like to emphasize that directional volume results can neither con�rm, no reject
the presence of arbitrage transactions. We see that marketable orders tend to follow the
returns, and thus can mask the arbitrage transactions. Indeed, when the market receives
positive news, we cannot say whether the buy orders for the underlying stocks re�ect arbi-
trage transactions or informed buy orders following positive news. Similarly, even though
arbitrageurs are expected to sell overvalued SPY shares, these transactions can be dominated
by the bulk of buy orders again following positive news.
It also should be noted, that our mechanism may work indirectly, if the market makers
change their behavior in the presence of ETF arbitrage. If the market makers on the markets
36We use the data �les prepared by WRDS, where each transaction is matched with the correspondingbest bid and best o�ers.
37For each day we consider the period from 9:30 to 11 am.
73
![Page 74: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/74.jpg)
for individual stocks rationally expect arbitrage transactions to occur in response to news,
they may be inclined to adjust the quotes accordingly in anticipation of the trading pressure.
Hence, we can observe a change in quotes without actual trading volumes, however this e�ect
still represents a direct consequence of the presence of ETF arbitrage, and thus should be
considered as a part of our story.
Box et al. (2019) question the results of all empirical papers that try to identify the
e�ects of ETF arbitrage on the underlying market. They �nd that mispricing is typically
originated from a permanent shock in the underlying portfolio, and that it is corrected by
quote adjustment and not by arbitrage transactions. However, our results contradict all
major �ndings of Box et al. (2019). One reason for the discrepancy of results, could be our
exclusive focus on SPY. Perhaps smaller ETFs induce much smaller e�ects on the underlying
markets that are harder to detect. However, SPY is the world's largest ETF, it accounts
for more than 10% of the overall investment in the ETFs, thus, it deserves special attention.
Another potential reason is our improved identi�cation of the shocks that cause mispricings.
We focus on clearly identi�ed fundamental shocks that come at a prespeci�ed time and are
known to signi�cantly move the market. Finally, it might also be that the 1-minute frequency
utilized by Box et al. (2019) is too low to capture arbitrage behavior.
74
![Page 75: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/75.jpg)
A.10 Absolute value of betas
Table 17: Impact of probability of simultaneous trading with SPY on absolute oil betas inthe panel of S&P 500 �rms.
The table repeats the estimation in Table 1, but using the absolute value of the estimated oil beta
of stock j in year t, |βj,t|, as the dependent variable. See Table 1 for details. St.err are clustered at
the �rm level, t-statistics in parenthesis. The sample period covers 2010-2016.
1 min returns(1) (2) (3) (4) (5) (6) (7) (8)
ProbTrad 0.28 0.20 0.13 0.07 0.09 -0.01 0.06(5.03) (2.49) (1.34) (0.55) (0.80) (-0.05) (0.67)
Prob -2010 -0.08 (-0.37)2011 -0.01 (-0.09)2012 0.10 (0.67)2013 -0.02 (-0.15)2014 -0.03 (-0.31)2015 0.10 (1.07)2016 0.21 (2.00)
Controls(mcap,book-to-market,turnover, dollarvolume)
Yes Yes Yes Yes Yes Yes Yes
Bid-ask spread YesPrice impact YesAmihud YesVolatility YesYear sectore�ects
Yes Yes Yes Yes Yes Yes
Time varyingγt, δt
Yes
R-sq within 0.016 0.191 0.325 0.338 0.345 0.377 0.338 0.355R-sq between 0.003 0.238 0.147 0.163 0.184 0.224 0.199 0.050N 3,058 2,877 2,877 2,876 2,877 2,877 2,877 2,877N groups 461 444 444 444 444 444 444 444
75
![Page 76: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/76.jpg)
Table 1(continue)
5 min returns(1) (2) (3) (4) (5) (6) (7) (8)
ProbTrad 0.92 0.91 0.41 0.32 0.35 0.20 0.29(8.33) (6.92) (2.72) (1.64) (1.92) (0.97) (2.09)
Prob -2010 0.10 (0.30)2011 0.28 (1.12)2012 0.18 (0.75)2013 0.43 (1.93)2014 0.18 (0.92)2015 0.50 (3.15)2016 0.56 (3.69)
Controls(mcap,book-to-market,turnover, dollarvolume)
Yes Yes Yes Yes Yes Yes Yes
Bid-ask spread YesPrice impact YesAmihud YesVolatility YesYear sectore�ects
Yes Yes Yes Yes Yes Yes
Time varyingγt, δt
Yes
R-sq within 0.051 0.317 0.507 0.516 0.522 0.549 0.520 0.528R-sq between 0.016 0.216 0.162 0.175 0.197 0.269 0.247 0.013N 3,058 2,877 2,877 2,876 2,877 2,877 2,877 2,877N groups 461 444 444 444 444 444 444 444
76
![Page 77: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/77.jpg)
A.11 Measures of algorithmic trading
Table 18: Impact of probability of simultaneous trading with SPY on oil betas in the panelof S&P 500 �rms.The table repeats the estimation in Table 1, but the controls also include the two measures of
algorithmic trading, cancels-to-trades ratio and trades-to-orders volume ratio developed inWeller
(2018) and calculated using MIDAS data. The sample period covers 2012-2016, as MIDAS data on
AT starts only in 2012. St.err are clustered at the �rm level, t-statistics in parenthesis.
1 min returns(1) (2) (3) (4) (5) (6) (7) (8)
ProbTrad 0.39 0.62 0.28 0.27 0.30 0.21 0.20(5.08) (5.70) (2.45) (2.34) (2.57) (1.79) (1.75)
Prob -2012 0.38 (2.25)2013 0.01 (0.06)2014 0.30 (1.91)2015 0.21 (1.48)2016 0.41 (2.65)Controls(mcap,book-to-market,turnover, dollarvolume,cancel-to-trade,trade-to-orders)
Yes Yes Yes Yes Yes Yes Yes
Bid-ask spread YesPrice impact YesAmihud YesVolatility YesYear sectore�ects
Yes Yes Yes Yes Yes Yes
Time varyingγt, δt
Yes
R-sq within 0.0292 0.1912 0.3846 0.3873 0.3898 0.4024 0.3970 0.4365R-sq between 0.0024 0.2238 0.1260 0.1302 0.1459 0.1616 0.1701 0.0217N 2,223 1,861 1,861 1,861 1,861 1,861 1,861 1,861N groups 461 399 399 399 399 399 399 399
77
![Page 78: Does Index Arbitrage Distort the Market Reaction to Shocks?tensity of arbitrage allows to separately measure these two e ects. We show that our results are consistent with the arbitrage](https://reader031.vdocuments.mx/reader031/viewer/2022040402/5e7db65048288f205e3461fe/html5/thumbnails/78.jpg)
Table 1(continue)
5 min returns(1) (2) (3) (4) (5) (6) (7) (8)
ProbTrad 0.45 0.97 0.58 0.56 0.60 0.53 0.51(4.14) (5.60) (3.55) (3.39) (3.61) (3.39) (3.10)
Prob -2012 0.22 (0.76)2013 0.14 (0.35)2014 0.60 (2.30)2015 0.46 (1.98)2016 0.47 (2.08)
Controls(mcap,book-to-market,turnover, dollarvolume,cancel-to-trade,trade-to-orders)
Yes Yes Yes Yes Yes Yes Yes
Bid-ask spread YesPrice impact YesAmihud YesVolatility Yes
Year sectore�ects
Yes Yes Yes Yes Yes Yes
Time varyingγt, δt
Yes
R-sq within 0.0124 0.2092 0.4263 0.4281 0.4286 0.4299 0.4302 0.4742R-sq between 0.0141 0.1369 0.0954 0.1002 0.1095 0.1088 0.1128 0.0610
N 2,223 1,861 1,861 1,861 1,861 1,861 1,861 1,861N groups 461 399 399 399 399 399 399 399
78