high-dimensional covariance forecasting for short intra-day horizons
This article was downloaded by: [University of Iowa Libraries] On: 04 October 2014, At: 23:59
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Quantitative Finance
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/rquf20
High-dimensional covariance forecasting for short intra-day horizons
Roel C. A. Oomen a
a Deutsche Bank, London, and Department of Quantitative Economics, University of Amsterdam
Published online: 06 Apr 2010.
To cite this article: Roel C. A. Oomen (2010) High-dimensional covariance forecasting for short intra-day horizons,Quantitative Finance, 10:10, 1173-1185, DOI: 10.1080/14697680903220349
To link to this article: http://dx.doi.org/10.1080/14697680903220349
Quantitative Finance, Vol. 10, No. 10, December 2010, 1173–1185
High-dimensional covariance forecasting for
short intra-day horizons
ROEL C. A. OOMEN*
Deutsche Bank, London, and Department of Quantitative Economics, University of Amsterdam
(Received 4 November 2007; in final form 28 July 2009)
Asset return covariances at intra-day horizons are known to tend towards zero due to market microstructure effects. Thus, traders who simply scale their daily covariance forecast to match their trading horizon are likely to over-estimate the actual experienced asset dependence. In this paper, some of the key challenges are discussed that are encountered when forecasting high-dimensional covariance matrices for short intra-day horizons. Based on a novel evaluation methodology, and extensive empirical analysis, specific recommendations are made regarding model design and data sampling.
Keywords: Vast covariance matrices; Forecast evaluation; Market microstructure; Factor models
1. Introduction
This paper discusses some of the key ingredients of a robust and reliable approach to large-dimensional covariance matrix forecasting over short intra-day horizons. With the rapid development of algorithmic portfolio execution engines, and statistical arbitrage strategies operating at increasingly high frequencies, good covariance forecasts for intra-day use are important. In this setting, the main challenge is to deal correctly with the market microstructure contaminations that emerge in data sampled at ultra-high frequency. In this study, we contrast five competing methods, three based on intra-day data and two based on daily data. We cover models with and without microstructure noise corrections and models with and without factor structure imposed. Our performance evaluation approach centres around portfolio optimality and portfolio stability. The large amount of available data allows us to discriminate accurately amongst the performance of the alternative models and we illustrate their relative merits and weaknesses using the FTSE-100 index constituents.
Our main findings can be summarized by the following ‘rules-of-thumb’ or recommendations that should be considered when developing a covariance forecasting procedure at intra-day horizons.
don’t use low-frequency risk models for high-frequency covariance forecasting
Our analysis shows that models based on daily or lower-frequency data perform poorly at intra-day frequencies. This is primarily because, at intra-day frequencies, the dependence structure of returns changes in subtle ways due to various market microstructure effects and this cannot be captured by models based on daily data. Thus, the common practice of scaling the covariance matrix using the ‘$\sqrt{T}$-rule’ fails when moving to intra-day horizons.
impose a factor structure to stabilize portfolio weights
Particularly in large-dimensional systems, the stability of the covariance matrix can be at risk when the number of time series observations is limited. Our analysis shows that a statistical factor reduction is effective in ‘stabilizing’ the covariance matrix and that this leads to significant reduction in portfolio weight variation and resulting transaction costs.
align sampling frequency with trading or evaluation frequency
The mis-match observed between the level of covariance at daily and intra-day frequencies is not a spurious effect introduced by data sampling. Instead, it reflects the genuinely different dependence structure that a trader experiences over short intra-day horizons. To best capture the covariance structure that one is exposed to it is key to align the data sampling frequency with the (expected) trading or evaluation frequency. Failing to do so can severely under- or over-estimate the experienced dependence among the assets.

*Email: [email protected]
Quantitative Finance ISSN 1469–7688 print/ISSN 1469–7696 online © 2010 Taylor & Francis
http://www.informaworld.com DOI: 10.1080/14697680903220349
2. Covariance forecasting and evaluation
Below, we describe the forecasting models, performance evaluation criteria, and data we use in this paper. As the focus is on covariance forecasting over short horizons, the models are deliberately built using intra-day data. We make forecasts at the start of each day and then evaluate these, using various portfolio optimality and stability criteria, over subsets of the asset universe and short intervals throughout that day. Here the intervals and subsets are randomly selected, and the procedure repeated many times per day. We should emphasise that the covariance forecasting methods we use in this paper are rather simple and there is substantial scope for refinement. For instance, one could consider the use of irregularly sampled transaction data to estimate covariances (e.g. Hayashi and Yoshida 2005). Also, incorporating long memory effects (e.g. Chiriac and Voev 2009), accounting for diurnal patterns in volatility and correlation dynamics, and extending the methodology to allow for real-time updates of the forecast, would be interesting. We will further discuss and illustrate some of these aspects in the concluding remarks, but leave a detailed analysis for future work as it goes well beyond the scope of the current paper and distracts from the main points we want to make.
2.1. Forecasting methods
Define the $n \times 1$ vector of the $i$th intra-day return on day $t$ for a set of $n$ assets as

$$r_{t+i/M} = p_{t+i/M} - p_{t+(i-1)/M} \quad \text{for } i = 1, 2, \ldots, M,$$

where $p$ denotes the log price vector and $M$ denotes the number of synchronously sampled intra-day returns. The ex-post realised covariance (RC) for day $t$, using returns sampled at frequency $M$, is defined as

$$RC_{M,t} \equiv \sum_{i=1}^{M} r_{t+i/M}\, r'_{t+i/M}. \tag{1}$$
Under ideal conditions, in a frictionless market, prices are martingales and it is well known (see, e.g., Barndorff-Nielsen and Shephard 2004) that RC is an unbiased and consistent estimator of the true covariance matrix with increasingly fine sampling, i.e. $M \to \infty$. In other words, the expectation of RC is invariant to the choice of $M$ but the accuracy of the estimates improves by increasing $M$. In practice, however, market microstructure effects such as bid–ask bounce, non-synchronous trading, and sluggish adjustment of prices, are a reality and RC loses these properties (see for instance Fisher 1966, Scholes and Williams 1977, Epps 1979, Roll 1984, Griffin and Oomen forthcoming). In particular, Epps (1979) was the first to note that as the sampling frequency increases, covariance estimates tend to zero as a consequence of non-synchronous trading effects (see also Lo and MacKinlay 1990). This so-called Epps effect is illustrated in figure 1 for the FTSE-100 dataset used in this paper. Consequently, when implementing RC the sampling frequency plays a key role: it controls the level of noise contamination present in the sampled data and determines the horizon of the covariance estimate. To mitigate the Epps effect, Scholes and Williams (1977)† suggest incorporating a cross-autocovariance correction, i.e.

$$SW_{q,M,t} \equiv RC_{M,t} + \sum_{j=1}^{q} \sum_{i=1}^{M-j} \left( r_{t+(i+j)/M}\, r'_{t+i/M} + r_{t+i/M}\, r'_{t+(i+j)/M} \right). \tag{2}$$
Note that with $q = 1$, the diagonal elements of SW correspond to the bias-corrected realised variance estimator of Zhou (1996). Griffin and Oomen (forthcoming) provide a detailed theoretical and empirical study of the properties of SW in a bi-variate setting with noise and non-synchronous trading and compare its performance to RC and the Hayashi and Yoshida (2005) estimator. From figure 1 we observe that the SW modification is quite effective in reducing the bias induced by non-synchronous trading: while the average correlation at a five-minute frequency for RC is about 20%, SW with $q = 2$ estimates the correlation at 26%, which is the level at which it stabilizes as the sampling frequency is lowered and the noise progressively reduced. An important drawback of the estimator in equation (2) is that it is not guaranteed to be positive definite. This can be resolved by applying suitable kernel weights to the cross-autocovariance terms. Barndorff-Nielsen et al. (2008) derive various asymptotic results for such an estimator based on a refresh time sampling scheme.
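Equations (1) and (2) translate almost line by line into array code. The following is a minimal sketch, assuming the $M$ synchronised intra-day return vectors of one day are stacked row-wise in a NumPy array; the function names are illustrative and not part of the paper.

```python
import numpy as np


def realised_covariance(returns: np.ndarray) -> np.ndarray:
    """Equation (1): sum of outer products of the M intra-day return vectors.

    `returns` has shape (M, n): one row per interval, one column per asset.
    """
    return returns.T @ returns


def scholes_williams(returns: np.ndarray, q: int) -> np.ndarray:
    """Equation (2): RC plus q leads and lags of the cross-autocovariance."""
    m = returns.shape[0]
    sw = realised_covariance(returns)
    for j in range(1, q + 1):
        # sum over i of r_{t+(i+j)/M} r'_{t+i/M}
        lead = returns[j:].T @ returns[: m - j]
        sw = sw + lead + lead.T
    return sw


# Toy input: asset 2 repeats the return of asset 1 one interval later.
r = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
print(realised_covariance(r))    # off-diagonal entries are 0
print(scholes_williams(r, q=1))  # off-diagonal entries are 1
```

In the toy input the two assets are perfectly dependent at a one-interval lag: RC misses this entirely, while the first lead-lag term restores the covariance. The result of `scholes_williams` is symmetric by construction but, as noted in the text, not guaranteed to be positive definite.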
We now turn to a description of the competing covariance forecasts we consider in this study. Given a time series of realised covariance matrices, the baseline forecasting method is constructed by simple exponential filtering of past RC measurements, i.e.

$$RC_{M,t|t-1} = \frac{1-\lambda}{1-\lambda^{\tau}} \sum_{j=1}^{\tau} \lambda^{j-1}\, RC_{M,t-j}. \tag{3}$$
Thus, the covariance matrix is forecast at a daily frequency, by exponentially smoothing a time series of covariance matrices computed from intra-day data, and is then assumed to remain unchanged throughout the day.
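The exponential filter can be sketched as a weighted sum over a stack of past daily matrices. The code below is an illustration under two assumptions that are not taken from the paper: the history array is ordered from most recent to oldest, and a "month" is converted to a decay parameter using 21 trading days.

```python
import numpy as np


def exp_filter_forecast(history: np.ndarray, lam: float) -> np.ndarray:
    """Exponentially weighted average of past daily covariance matrices.

    `history` has shape (tau, n, n); history[0] is day t-1 and history[-1]
    is day t-tau (ordering is an assumption of this sketch). 0 < lam < 1.
    """
    tau = history.shape[0]
    # Weights lam^(j-1), j = 1..tau, normalised so that they sum to one.
    weights = (1.0 - lam) / (1.0 - lam**tau) * lam ** np.arange(tau)
    return np.tensordot(weights, history, axes=1)


def decay_from_half_life(half_life_days: float) -> float:
    """Decay parameter whose weight halves after `half_life_days` days."""
    return 0.5 ** (1.0 / half_life_days)


# Hypothetical usage: 60 days of 3x3 matrices, half-life of 21 trading days.
history = np.repeat(np.eye(3)[None, :, :], 60, axis=0)
forecast = exp_filter_forecast(history, decay_from_half_life(21))
```

Because the weights are normalised by $(1-\lambda)/(1-\lambda^{\tau})$, a constant history is reproduced exactly, which is a convenient sanity check on any implementation.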
In large-dimensional systems, particularly when we have more assets than return observations (i.e. $n > M$), realised covariance, or a forecast thereof, may become unstable and a factor based approach is often desirable. Therefore, the second forecasting method we consider is a standard principal components (PC) reduction applied to the covariance forecast by equation (3). Here, the number of statistical factors to include is optimally set following the methodology outlined in Johnstone (2001). See appendix B for further details.
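One common way to carry out a principal components reduction is sketched below. It is an assumption of this sketch, not a description of appendix B, that the $k$ leading eigen-components are kept as the common part and that the discarded part is replaced by its diagonal, so that the asset variances are left unchanged. The function name `pc_reduce` is hypothetical.

```python
import numpy as np


def pc_reduce(sigma: np.ndarray, k: int) -> np.ndarray:
    """Keep the k leading principal components of a covariance matrix.

    Assumed construction: common part from the top-k eigenpairs plus a
    diagonal residual that preserves the original variances.
    """
    eigval, eigvec = np.linalg.eigh(sigma)  # eigenvalues in ascending order
    top_val = eigval[-k:]
    top_vec = eigvec[:, -k:]
    common = (top_vec * top_val) @ top_vec.T
    residual = np.diag(np.diag(sigma - common))
    return common + residual


sigma = np.array([[2.0, 1.0], [1.0, 2.0]])
reduced = pc_reduce(sigma, k=1)  # diagonal stays at 2, off-diagonal becomes 1.5
```

With a positive semi-definite input the diagonal residual is non-negative, so the reduced matrix remains positive semi-definite, and with `k` equal to the dimension the input is recovered.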
†See also Dimson (1979) and Cohen et al. (1983).
As already mentioned above, microstructure effects can be a concern when constructing covariance forecasts from intra-day data. Motivated by this, our third forecasting method is based on exponential filtering of a time-series of SW measurements, i.e.

$$SW_{q,M,t|t-1} = \frac{1-\lambda}{1-\lambda^{\tau}} \sum_{j=1}^{\tau} \lambda^{j-1}\, SW_{q,M,t-j}. \tag{4}$$
In addition to the above methods, we further consider two benchmark forecasts commonly used in this literature. The first is a ‘BARRA-type’ fundamental factor model that decomposes risk into industry exposure and an idiosyncratic company specific component. The model is estimated from daily data, using market cap weighting, and optimised for a daily forecasting horizon. Hence, we refer to it as the daily factor model or DF. See Briner and Connor (2008) for a detailed description of such models. The second benchmark method we consider is the popular dynamic conditional correlation (DCC) model of Engle and Sheppard (2001) and Engle (2002). Here, univariate GARCH models are estimated for each individual asset in the universe with highly parsimonious joint correlation dynamics imposed on the full system. Like the DF model, we estimate the DCC model from daily data. Even though DF and DCC are expressly not designed to forecast covariances at intra-day horizons, we recognize that these types of risk models are used extensively throughout the industry and it is conceivable that they are being applied to increasingly short horizons as the speed of trading continues to grow. The results presented here should therefore be viewed as a measure of the potential gains or losses associated with the arguably sub-optimal use of a low-frequency model for high-frequency forecasting.
2.2. Forecast evaluation
The methods described above will produce, for each day, five competing forecasts of the covariance matrix, namely RC, SW, PC, DF and DCC. In evaluating the quality of these forecasts we concentrate on two criteria: portfolio optimality and portfolio stability or sensitivity.
2.2.1. Evaluation criterion I: portfolio optimality. The primary use of a covariance forecast is often to determine an ‘optimal’ portfolio allocation strategy where risk is minimized subject to certain user-defined constraints. In such a setting, the best covariance forecast is the one that generates portfolio returns with the lowest ex-post realised variance. Our first evaluation criterion is based on this insight. In particular, for a given covariance forecast $\Sigma$, we derive the associated optimal portfolio weights $\omega^*_\Sigma$ for a number of commonly encountered minimum variance allocation problems. Based on the out-of-sample asset returns $r$ realised over the forecast horizon we then estimate $V(r'\omega^*_\Sigma)$ and identify the best forecasting method as the one which attains the lowest ex-post realised portfolio return variance. Here the statistical significance of differences in performance between competing forecasts can be established using a standard bootstrap procedure.
[Figure 1: average correlation (vertical axis, 0.08 to 0.28) plotted against sampling frequency in minutes (horizontal axis, 0 to 30) for RC, SW 1 and SW 2.]

Figure 1. The Epps effect and the Scholes–Williams bias correction. Note that this figure plots the average correlation as a function of sampling frequency in minutes, between the constituents of the FTSE-100 index over the period 1 April 2006 through 31 March 2009. RC is defined in equation (1), and SW1 and SW2 are defined in equation (2) with $q = 1$ and $q = 2$, respectively.
See Patton and Sheppard (2009) for some further discussion of this approach.

Below, we list the minimum variance portfolio allocation strategies used to evaluate the forecast performance of the competing methods (here $n$ denotes the dimension of $\Sigma$ and $\iota$ is an $n \times 1$ vector of ones).
(i) The ‘net long’ strategy, i.e.

$$\min_{\omega}\; \omega'\Sigma\omega \quad \text{s.t.} \quad \iota'\omega = 1.$$

‘take an overall long position, spreading weights to exploit diversification’

An explicit solution to this optimisation problem is available: $\omega^*_\Sigma = \Sigma^{-1}\iota / (\iota'\Sigma^{-1}\iota)$.

(ii) The ‘long only’ strategy, i.e.

$$\min_{\omega}\; \omega'\Sigma\omega \quad \text{s.t.} \quad \iota'\omega = 1 \text{ and } \{\omega_i \geq 0\}_{i=1}^{n}.$$

‘take a long-only position, spreading weights to exploit diversification’

Quadratic programming can be used to obtain optimal portfolio weights.

(iii) The ‘ad hoc long–short’ strategy, i.e.

$$\min_{\omega}\; \omega'\Sigma\omega \quad \text{s.t.} \quad \iota'\omega = 0,\; \iota'|\omega| = 1,\; \{\omega_i \geq 0\}_{i=1}^{\lfloor n/2 \rfloor}, \text{ and } \{\omega_i \leq 0\}_{i=\lfloor n/2 \rfloor + 1}^{n}.$$

‘take a cash-neutral long–short position, going long the first and short the second half of the asset universe’

Quadratic programming can be used to obtain optimal portfolio weights.

(iv) The ‘target long–short’ strategy, i.e.

$$\min_{\omega}\; \omega'\Sigma\omega \quad \text{s.t.} \quad \iota'\omega = 0, \text{ and } \mu'\omega = c > 0.$$

‘take a cash-neutral long–short position, targeting a positive expected return’

An explicit solution to this optimisation problem is available:

$$\omega^*_\Sigma(c) = \Sigma^{-1}(\mu, \iota)\left[(\mu, \iota)'\Sigma^{-1}(\mu, \iota)\right]^{-1}(c, 0)'. \tag{5}$$

(v) The ‘min eig’ strategy, i.e.

$$\min_{\omega}\; \omega'\Sigma\omega \quad \text{s.t.} \quad \omega'\omega = 1.$$

‘invest in a long–short position in low-volatility high-correlation assets’

The vector of optimal portfolio weights $\omega^*_\Sigma$ is the eigenvector of $\Sigma$ associated with the smallest eigenvalue.
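The three strategies with explicit solutions, (i), (iv) and (v), can be sketched directly from the formulas above. The function names are illustrative; strategies (ii) and (iii) need a quadratic programming solver and are not covered here. The sketch assumes a positive definite `sigma` and, for strategy (iv), an expected return vector `mu` that is not proportional to the vector of ones.

```python
import numpy as np


def net_long_weights(sigma: np.ndarray) -> np.ndarray:
    """Strategy (i): Sigma^{-1} iota / (iota' Sigma^{-1} iota)."""
    iota = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, iota)
    return x / (iota @ x)


def target_long_short_weights(sigma: np.ndarray, mu: np.ndarray, c: float) -> np.ndarray:
    """Strategy (iv), equation (5), for a given target c."""
    a = np.column_stack([mu, np.ones(len(mu))])  # n x 2 matrix (mu, iota)
    sinv_a = np.linalg.solve(sigma, a)           # Sigma^{-1} (mu, iota)
    return sinv_a @ np.linalg.solve(a.T @ sinv_a, np.array([c, 0.0]))


def min_eig_weights(sigma: np.ndarray) -> np.ndarray:
    """Strategy (v): unit-norm eigenvector of the smallest eigenvalue."""
    _, eigvec = np.linalg.eigh(sigma)  # eigenvalues in ascending order
    return eigvec[:, 0]


sigma = np.diag([1.0, 4.0])
print(net_long_weights(sigma))  # inverse-variance weights: [0.8 0.2]
```

Using `np.linalg.solve` rather than forming $\Sigma^{-1}$ explicitly is a numerical design choice of the sketch. The sign of the eigenvector returned for strategy (v) is arbitrary, which does not affect the portfolio variance.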
Strategy 1 is the standard minimum variance problem. Motivated by Jagannathan and Ma (2003) strategy 2 adds a short-sell constraint. Both strategies are fully invested in that $\iota'\omega = 1$. In contrast, strategies 3 and 4 enter into long–short portfolios with $\iota'\omega = 0$, i.e. the proceeds from taking short positions are fully re-invested to make the overall position cash-neutral. Strategy 3 assigns the short assets in an ad hoc fashion while strategy 4 does this based on the expected return $\mu$. Finally, strategy 5 is a minimum variance strategy where the positions taken can be both long and short provided that the squared weights sum to unity. Because the portfolio weights are given by the eigenvector of $\Sigma$ associated with the smallest eigenvalue, this strategy provides an interesting setting in which to evaluate the performance of PC. In particular, if PC has retained too few principal components, the realised portfolio variance of strategy 5 will be inflated. As such, our evaluation method in effect provides an out-of-sample test for the number of (economically) significant principal components.
The implementation of strategy 4 requires one to fix $\mu$ and $c$. From equation (5) and appendix A we see that the optimal portfolio weights scale linearly in $c$ and inverse-linearly in $\mu$: when the expected returns are halved, we need to double our position (and thus increase risk) to attain the same target mean. In this paper we set $\mu$ equal to the sign of the daily open-to-close return. Even though such an approach uses day-$t$ information and is therefore infeasible in practice, $\mu$ is the same for all the competing covariance forecasting methods and therefore still allows us to gauge their relative performance. For given $\mu$, the parameter $c$ is then identified by imposing an additional constraint on the magnitude of the position, i.e. we find $c^*$ such that $\iota'|\omega| = 1$.

By following the above logic underlying the portfolio optimality criterion, one may also construct maximum variance portfolios. In that case the best forecast should generate the highest realised portfolio return variance. Although such a scenario has little relevance from an economic viewpoint, it does provide yet another dimension along which to judge the statistical quality of the forecast. Amongst the allocation strategies considered in this paper, we therefore add the maximum variance analogue to strategy 5.
(vi) The ‘max eig’ strategy, i.e.

$$\max_{\omega}\; \omega'\Sigma\omega \quad \text{s.t.} \quad \omega'\omega = 1.$$

‘invest in a one-sided position in high-volatility assets’

The vector of optimal portfolio weights $\omega^*_\Sigma$ is the eigenvector of $\Sigma$ associated with the largest eigenvalue.
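Two implementation details from the preceding paragraphs lend themselves to a short sketch: the choice of $c^*$ for strategy 4 and the ‘max eig’ weights of strategy (vi). Because the weights in equation (5) are linear in $c$, one can solve once with $c = 1$ and rescale to unit gross exposure. The function names and the sign convention for the eigenvector (flipped so that the weights sum to a non-negative number) are assumptions of this sketch.

```python
import numpy as np


def target_long_short_unit_gross(sigma: np.ndarray, mu: np.ndarray):
    """Strategy 4 weights scaled so that the absolute weights sum to one.

    Returns the weights and the implied target c*.
    """
    a = np.column_stack([mu, np.ones(len(mu))])  # (mu, iota)
    sinv_a = np.linalg.solve(sigma, a)
    w_unit = sinv_a @ np.linalg.solve(a.T @ sinv_a, np.array([1.0, 0.0]))  # c = 1
    c_star = 1.0 / np.abs(w_unit).sum()          # weights are linear in c
    return c_star * w_unit, c_star


def max_eig_weights(sigma: np.ndarray) -> np.ndarray:
    """Strategy (vi): unit-norm eigenvector of the largest eigenvalue."""
    _, eigvec = np.linalg.eigh(sigma)            # eigenvalues in ascending order
    w = eigvec[:, -1]
    return w if w.sum() >= 0 else -w             # sign convention (assumed)


w, c_star = target_long_short_unit_gross(np.eye(3), np.array([2.0, 0.0, -2.0]))
print(w, c_star)  # weights [0.5, 0, -0.5] with c* = 2
```

In the example, doubling `mu` from (1, 0, -1) to (2, 0, -2) halves the $c = 1$ weights, so $c^*$ doubles to restore unit gross exposure; the final weights are unchanged, which mirrors the scaling argument in the text.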
In recent work, DeMiguel et al. (2009) find that naive equally weighted ‘1/n’ strategies outperform those based on risk minimization using covariance forecasts. Motivated by this study, we consider ‘naive’ implementations of strategies 1, 2 and 3 that require no covariance forecast. In particular, for strategies 1 and 2, the portfolio weights are given by $\omega_i = 1/n$ for $i = 1, \ldots, n$. For the ad hoc long–short strategy 3, the portfolio weights are given by $\omega_i = 1/(2\lfloor n/2 \rfloor)$ for $i = 1, \ldots, \lfloor n/2 \rfloor$ and $\omega_i = -1/(2(n - \lfloor n/2 \rfloor))$ for $i = \lfloor n/2 \rfloor + 1, \ldots, n$.
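The naive weights are simple enough to state in a few lines. The sketch below uses a hypothetical helper `naive_weights`; the only logic is the floor split of the universe for the long–short case.

```python
import numpy as np


def naive_weights(n: int, long_short: bool = False) -> np.ndarray:
    """Naive weights: 1/n for strategies 1 and 2, half long and half short for 3."""
    if not long_short:
        return np.full(n, 1.0 / n)
    h = n // 2                        # floor(n / 2) assets on the long side
    w = np.empty(n)
    w[:h] = 1.0 / (2 * h)
    w[h:] = -1.0 / (2 * (n - h))
    return w


print(naive_weights(5, long_short=True))  # two weights of 0.25, three of -1/6
```

With this split each side carries a gross weight of one half, so the portfolio is cash-neutral and the absolute weights sum to one even when $n$ is odd.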
2.2.2. Evaluation criterion II: portfolio stability. In addition to portfolio optimality, one may also judge the quality of a covariance forecast by the stability and spread of the portfolio weights. For instance, extreme position taking is often the result of error-maximisation in the optimisation step.† Also, very small positions are impractical due to contract divisibility and fixed transaction costs while very large positions may incur excessive market impact. Motivated by this, we compute the cross-sectional variation of portfolio weights implied by the competing forecasting models, i.e.

$$S_1 = \left\| \omega^*_\Sigma \right\|_2,$$

where $\|A\|_p \equiv \left( \sum_{ij} |A_{ij}|^p \right)^{1/p}$. Note that criterion $S_1$ is minimized for the equally weighted ‘1/n’ strategy: it will favour covariance forecasting methods that result in well balanced allocations. In the tables below, we refer to $S_1$ as the ‘smoothness’ criterion.
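The claim that the ‘1/n’ allocation minimizes $S_1$ follows from the Cauchy–Schwarz inequality. As a brief worked step, for any fully invested portfolio with $\iota'\omega = 1$:

$$1 = (\iota'\omega)^2 \leq (\iota'\iota)(\omega'\omega) = n\,\|\omega\|_2^2 \quad\Longrightarrow\quad \|\omega\|_2 \geq \frac{1}{\sqrt{n}},$$

with equality if and only if $\omega$ is proportional to $\iota$, i.e. $\omega = \iota/n$. The same bound holds under the gross exposure constraint $\iota'|\omega| = 1$, applying the inequality to $|\omega|$ instead of $\omega$.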
In similar spirit, another evaluation criterion can be constructed specific to the ‘target long–short’ strategy 4. It is motivated by the question: ‘if we revise our view on expected returns, by how much do the optimal portfolio weights change?’. From a transaction costs viewpoint, the covariance forecast that yields the more stable weights is clearly preferred. We compute the following two quantities:

$$S_2 = \left\| \partial\omega^*_\Sigma / \partial\mu \right\|_1, \qquad S_3 = \left\| \partial\omega^*_\Sigma / \partial\mu \right\|_2,$$

where $\partial\omega^*_\Sigma / \partial\mu$ can be expressed in closed form, see equation (A1) in appendix A. Note that $S_2$ ($S_3$) is consistent with linear (quadratic) transaction costs and in the tables below we refer to it as the ‘linear costs’ (‘quadratic costs’) criterion.
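The sensitivity measures can be approximated numerically when the closed form is not at hand. The sketch below is a stand-in and not equation (A1): it differentiates the strategy 4 weights of equation (5) with respect to $\mu$ by central finite differences, holding $c$ fixed (an assumption of this sketch), and then applies the entrywise norm defined above. All function names are illustrative.

```python
import numpy as np


def target_weights(sigma: np.ndarray, mu: np.ndarray, c: float) -> np.ndarray:
    """Strategy 4 weights from equation (5)."""
    a = np.column_stack([mu, np.ones(len(mu))])
    sinv_a = np.linalg.solve(sigma, a)
    return sinv_a @ np.linalg.solve(a.T @ sinv_a, np.array([c, 0.0]))


def weight_jacobian(sigma, mu, c, eps=1e-6):
    """Central finite-difference approximation of d omega / d mu (c held fixed)."""
    n = len(mu)
    jac = np.empty((n, n))
    for k in range(n):
        step = np.zeros(n)
        step[k] = eps
        up = target_weights(sigma, mu + step, c)
        down = target_weights(sigma, mu - step, c)
        jac[:, k] = (up - down) / (2.0 * eps)
    return jac


def entrywise_norm(a: np.ndarray, p: float) -> float:
    """(sum_ij |A_ij|^p)^(1/p), the norm used for S1, S2 and S3."""
    return float((np.abs(a) ** p).sum() ** (1.0 / p))


sigma = np.array([[1.0, 0.2, 0.1], [0.2, 1.5, 0.3], [0.1, 0.3, 2.0]])
mu = np.array([1.0, 0.2, -1.0])
jac = weight_jacobian(sigma, mu, c=1.0)
s2, s3 = entrywise_norm(jac, 1), entrywise_norm(jac, 2)
```

A useful check on any implementation is that the weights are homogeneous of degree minus one in $\mu$, so the Jacobian must satisfy $(\partial\omega/\partial\mu)\,\mu = -\omega$.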
2.3. Data and notes on implementation
The dataset we use in this paper consists of last-tick interpolated 15-second mid-quote data for the constituents of the FTSE-100 index over the period 3 January 2006 through 31 March 2009. With official trading hours from 8:00 to 16:30 London time, this results in 2041 price observations per asset per day.
To calculate the forecast in equations (3) and (4), we use a rolling history of $\tau = 60$ days, a value of $\lambda$ to imply an ad hoc half-life of 1 month, and 15-minute returns or $M = 34$. For PC we include the first eight principal components: these were found to be significant by the Johnstone (2001) test described in appendix B. The DCC model is specified in its simplest form (i.e. a GARCH(1,1) for the asset volatilities and one innovation term and lagged correlation for the correlation dynamics) with parameters estimated using maximum likelihood.‡ We start the forecasting exercise on 1 April 2006 and run it over the subsequent three years up to 31 March 2009, totalling 781 trading days.

For a given day and covariance forecast, we calculate
the optimal portfolio weights for the strategies described above for a restricted universe of $n = 10$ randomly selected names. Then, with given portfolio weights (which are assumed to be fixed throughout the day), we calculate 100 realised portfolio returns from 100 sets of synchronously sampled asset returns obtained by selecting a random starting point within the day and a random duration with a pre-specified mean $\bar{d}$. This procedure is then repeated 50 times. Specifically, let $p^{(j)}$ denote the $n \times 1$ price vector associated with the $j$th random draw of $n$ names from the universe. The realised portfolio returns are calculated as

$$z_{t,j,i} = \left( p^{(j)}_{t+(s_i+d_i)/2040} - p^{(j)}_{t+s_i/2040} \right)' \omega^{(j)}_{t|t-1},$$

for $i = 1, 2, \ldots, 100$, $j = 1, 2, \ldots, 50$, and $t = 1, 2, \ldots, 781$. The starting point $s \sim$ i.i.d. $U(0, 2040)$ and duration $d - 1 \sim$ i.i.d. Poisson$(\bar{d} - 1)$. This sampling procedure is graphically illustrated in figure 2. When $s + d > 2040$ we ‘wrap’ the data by continuing to sample the remaining returns from the beginning of the same trading day. Such an approach is quite common in the bootstrap literature (e.g. Politis and Romano 1994), and is suitable here as well. In the analysis below, we vary the trade or evaluation horizon $\bar{d}$ between 1 and 45 minutes
depending on the experiment.

In summary, for each day from 1 April 2006 onwards, we compute the five competing covariance forecasts RC, PC, SW, DF and DCC. Next, for the six allocation strategies we compute their implied optimal portfolio weights using 50 randomly drawn subsets of the asset universe each of size 10, we then randomly draw 100 synchronous returns for the respective stocks, and finally compute the realised portfolio return, stacked over universe draws, return draws and days. This leads to $50 \times 100 \times 781 = 3\,905\,000$ returns for each strategy and covariance forecasting method. Based on these return series, we then compute the realised portfolio return volatility (in basis points) and use a bootstrap re-sampling method to determine significant differences in performance. The main advantages of a forecast evaluation approach as described here are: (i) it thoroughly ‘exercises’ the covariance matrix along several dimensions by considering subsets of the full asset universe as well as various allocation problems and random time horizons; (ii) it mimics a real trading environment with random timing and duration of trades; (iii) the large number of generated portfolio returns allows us to measure accurately the statistical significance of differences in performance among the competing forecasts; and (iv) the procedure concentrates on the dependence among assets and is invariant to scaling of the covariance matrix.

†The solution to the minimum variance allocation problem tends to invest in assets for which the volatility is under-estimated and take long–short positions in assets for which the correlation is over-estimated. In this case, the ex-ante portfolio risk is evidently lower than the ex-post portfolio risk.
‡The DCC model is estimated using the ‘UCSD GARCH’ Matlab™ toolbox of Kevin Sheppard available from www.kevinsheppard.com.
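The random return sampling with wrap-around described in this section can be sketched as follows. The sketch makes several assumptions that the text does not pin down: start points are drawn on the integer grid 0, ..., 2039, durations are measured in 15-second grid intervals, and a wrapped window is evaluated by summing the 15-second returns it covers. Function names are illustrative.

```python
import numpy as np

GRID = 2040  # 15-second intervals between 08:00 and 16:30 (2041 prices per day)


def draw_windows(rng, n_draws: int, mean_duration: float):
    """Draw start points s and durations d, both in grid intervals.

    Assumed: s uniform on the integers 0..GRID-1, d - 1 ~ Poisson(mean_duration - 1).
    """
    start = rng.integers(0, GRID, size=n_draws)
    duration = 1 + rng.poisson(mean_duration - 1, size=n_draws)
    return start, duration


def realised_portfolio_returns(log_prices, weights, start, duration):
    """Portfolio return over each window; `log_prices` has shape (GRID + 1, n)."""
    step = np.diff(log_prices, axis=0)       # (GRID, n) 15-second log returns
    out = np.empty(len(start))
    for k, (s, d) in enumerate(zip(start, duration)):
        idx = (s + np.arange(d)) % GRID      # wrap past the close back to the open
        out[k] = step[idx].sum(axis=0) @ weights
    return out


# Hypothetical usage: 100 windows with a mean duration of 60 intervals (15 minutes).
rng = np.random.default_rng(0)
start, duration = draw_windows(rng, 100, 60)
```

Summing 15-second returns over the window is equivalent to the price difference in the formula for $z_{t,j,i}$ whenever the window does not wrap, and gives a natural definition when it does.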
3. Empirical results
Below, we discuss the forecast evaluation results. To facilitate comparison and interpretation of the results, we report the performance statistics as fractions relative to RC. In all tables, bootstrapped p-values are reported in parentheses, for the null hypothesis that the performance statistic associated with a particular method is lower than that of RC. For all evaluation criteria, except the ‘max eig’ realised portfolio volatility, a ratio of less than one with a p-value sufficiently close to one means that the method under consideration is a significant improvement over RC and vice versa.
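How such a ratio and p-value might be computed is sketched below. The paper does not spell out its bootstrap, so everything here is an assumption made for illustration: a paired i.i.d. resampling of the two return series, and a p-value defined as the share of bootstrap replications in which the competing method attains a lower volatility than RC.

```python
import numpy as np


def bootstrap_vol_ratio(z_method, z_rc, n_boot=1000, seed=0):
    """Volatility ratio relative to RC and a bootstrap p-value.

    Assumed design: paired i.i.d. resampling of portfolio returns; the p-value
    is the fraction of replications with ratio below one.
    """
    rng = np.random.default_rng(seed)
    n = len(z_rc)
    ratios = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # same indices for both series
        ratios[b] = z_method[idx].std() / z_rc[idx].std()
    ratio = z_method.std() / z_rc.std()
    p_value = float(np.mean(ratios < 1.0))
    return ratio, p_value


# Hypothetical usage with simulated returns: the competitor is twice as volatile.
z_rc = np.random.default_rng(1).normal(size=2000)
ratio, p_value = bootstrap_vol_ratio(2.0 * z_rc, z_rc, n_boot=200)
```

Under this convention a ratio below one paired with a p-value near one signals a significant improvement over RC, matching the reading rule given in the text. A block or stationary bootstrap would be a natural alternative if serial dependence in the returns is a concern.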
don’t use low-frequency risk models for high-frequency covariance forecasting
Panel A of table 1 reports evaluation criterion I, i.e. the realised portfolio volatility, for the competing forecasting methods. The message is unambiguous: the forecasting methods DF and DCC, which are based on daily data, significantly under-perform RC. At the same time, the performance of PC is statistically indistinguishable from RC at conventional confidence levels. These results hold for all strategies considered. For instance, for the ‘target long–short’ minimum variance strategy 4, the realised portfolio volatility when using RC is 14.07 basis points (over a 15-minute horizon). For PC, this figure is 0.2% higher but statistically insignificant with a p-value of 39%, whereas for DF and DCC the realised portfolio volatility is significantly increased by 8.3 and 3.5%, respectively. Similarly, for the ‘max eig’ maximum variance strategy 6, RC attains a volatility of 88 basis points. For PC this figure is 0.1% lower but insignificant, while for DF and DCC the reduction is highly significant. Figure 3 further illustrates these findings. Panels A and B plot the bootstrapped distributions of the performance criterion for the two strategies and we clearly observe a close correspondence between the RC and PC distributions, while those for DF and DCC are shifted in unfavourable directions (to the right for the minimum variance strategy and to the left for the maximum variance strategy). Also note the performance of PC for ‘min eig’ strategy 5. This strategy invests in the eigenvector of the covariance matrix associated with the smallest eigenvalue. If the number of principal components used for the construction of PC was too small, we would expect the resulting portfolio volatility to be significantly higher than for RC. Yet, we find that it is comparable to RC, indicating that PC is both statistically
and economically well specified.

As already alluded to above, the under-performance of DF and DCC is in itself not that surprising: the models are designed for longer-term horizons and estimated from daily data. In contrast, both RC and PC are based on intra-day data and therefore better able to capture the dependence structure of returns at short horizons. Note from table 1 that the average correlation (across all asset combinations and over time) for DF and DCC is roughly 10 to 13 percentage points higher than it is for the RC and PC methods based on 15-minute data. As already discussed and illustrated in figure 1, this is a direct consequence of non-synchronous trading leading to the well known Epps effect. The important point, however, is that this downward ‘bias’ in correlation is not spurious. It reflects the weakened dependence between assets that one is exposed to when the trading or evaluation horizon is very short. Put differently, RC computed from 15-minute data does provide an unbiased estimate for the true covariance between assets at a 15-minute horizon: it measures the dependence ‘experienced’ by the trader over this short horizon. Yet, the RC estimates are biased when evaluated over a daily horizon. We can further substantiate this point by looking at the performance of SW. As we can see from figure 1 and table 1, SW forecasts higher correlation amongst the assets, thanks to the correction that captures the lead–lag dependence induced by non-synchronous trading or sluggish adjustment of prices. If the Epps effect was spurious, then SW should perform better than RC or PC but we find the opposite: under SW, all the minimum variance strategies lead to significantly higher realised portfolio volatility.

[Figure 2: 100 numbered horizontal lines, one per random return draw, plotted against time of day (08:00 to 16:00).]

Figure 2. Illustration of intra-day random return sampling. Note that this figure illustrates the procedure for randomly sampling intra-day returns. Each numbered horizontal line indicates the sampling horizon of a random return draw. With $\bar{d} = 15$ minutes, the majority of durations lie between 10 and 20 minutes.

[Figure 3: four panels (A to D) of bootstrap distributions for RC, PC, SW, DF and DCC.]

Figure 3. Bootstrap distributions of selected evaluation criteria. Note that this figure reports the bootstrap distributions of the realised portfolio volatility for minvar strategy 4 and maxvar strategy 6 in Panels A and B and the portfolio sensitivity measures 2 and 3 in Panels C and D. Panel A: minvar ‘long–short target’ strategy, Panel B: maxvar ‘max eig’ strategy, Panel C: sensitivity measure S2 and Panel D: sensitivity measure S3.

Table 1. Forecast evaluation of competing models.

Panel A: evaluation criterion I (realised portfolio volatility by strategy)

| Strategy | RC | PC | SW | DF | DCC | NAIVE |
|---|---|---|---|---|---|---|
| 1. ‘net long’ | 21.055 | 1.002 (0.37) | 1.028 (0.00) | 1.135 (0.00) | 1.133 (0.00) | 1.201 (0.00) |
| 2. ‘long only’ | 21.145 | 1.001 (0.43) | 1.018 (0.00) | 1.057 (0.00) | 1.094 (0.00) | 1.196 (0.00) |
| 3. ‘ad hoc long–short’ | 10.048 | 1.002 (0.36) | 1.023 (0.00) | 1.155 (0.00) | 1.064 (0.00) | 1.214 (0.00) |
| 4. ‘target long–short’ | 14.067 | 1.002 (0.39) | 1.015 (0.04) | 1.083 (0.00) | 1.035 (0.00) | – |
| 5. ‘min eig’ | 22.799 | 1.009 (0.07) | 1.039 (0.00) | 1.157 (0.00) | 1.109 (0.00) | – |
| 6. ‘max eig’ | 88.346 | 0.999 (0.54) | 0.998 (0.61) | 0.963 (1.00) | 0.981 (0.99) | – |

Panel B: evaluation criterion II (portfolio sensitivity measures for strategy 4)

| Measure | RC | PC | SW | DF | DCC | NAIVE |
|---|---|---|---|---|---|---|
| 1. ‘smoothness’ | 0.168 | 0.986 (1.00) | 1.057 (0.00) | 1.108 (0.00) | 1.106 (0.00) | – |
| 2. ‘linear costs’ | 3.600 | 0.974 (1.00) | 1.054 (0.00) | 1.031 (0.00) | 1.102 (0.00) | – |
| 3. ‘quadratic costs’ | 0.595 | 0.930 (1.00) | 1.089 (0.00) | 1.079 (0.00) | 1.203 (0.00) | – |
| Average correlation | 0.240 | 0.245 | 0.270 | 0.344 | 0.375 | |

Note that this table reports the realised portfolio volatility for the strategies described in section 2.2.1 and the portfolio sensitivity statistics for strategy 4 as described in section 2.2.2. RC is taken as the benchmark, and the statistics for all methods are reported as ratios relative to RC. Bootstrapped p-values are reported in parentheses, for the null hypothesis that the sample statistic associated with RC is higher than its competitor. The column ‘NAIVE’ reports results for the naive equally weighted strategies that require no covariance forecast.
We should make three further observations. Firstly, there is recent work which shows that the use of intra-day data helps to improve volatility forecasts (see for example Andersen et al. 2003). One may argue that our finding that forecasting methods based on intra-day data outperform those based on daily data simply reflects this. Yet, the comparison between RC/PC and SW shows that appropriately capturing the dependence structure of returns at intra-day horizons is key. Thus, at least some of the improvement obtained by moving from daily to intra-day data comes from this effect and not merely from having more data. Secondly, we see from table 1 that the naive ‘1/n’ strategy of DeMiguel et al. (2009) under-performs all other methods, including those based on daily data. Thus, covariance models are useful and do play an important role in asset allocation, at least in the current setting. Thirdly, Jaganathan and Ma (2003) find that imposing a long-only constraint helps the performance of minimum variance strategies. Our results, however, show that the realised portfolio volatility of ‘long only’ strategy 2 is always higher than that of ‘net long’ strategy 1. The constraint harms the performance rather than improving it by providing more structure.
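To make the comparison between a covariance-based allocation and the naive ‘1/n’ rule concrete, the following sketch computes both for a small example. It assumes a plain minimum variance portfolio whose weights sum to one; the strategies defined in section 2.2.1 may differ in detail, and the covariance matrix and function names are hypothetical.

```python
import numpy as np

def min_variance_weights(cov):
    """Weights minimising w' cov w subject to sum(w) == 1 (shorts allowed)."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)      # proportional to cov^{-1} 1
    return raw / raw.sum()

def portfolio_volatility(weights, cov):
    """Volatility implied by `cov` for a given weight vector."""
    return float(np.sqrt(weights @ cov @ weights))

# Hypothetical three-asset covariance matrix built from vols and correlations.
vols = np.array([0.10, 0.20, 0.30])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.5],
                 [0.1, 0.5, 1.0]])
cov = np.outer(vols, vols) * corr

w_mv = min_variance_weights(cov)
w_naive = np.full(3, 1 / 3)               # equally weighted '1/n' portfolio
print("minimum variance weights:", np.round(w_mv, 3))
print("volatility (min variance):", portfolio_volatility(w_mv, cov))
print("volatility (1/n):         ", portfolio_volatility(w_naive, cov))
```

Under the covariance matrix used to build it, the minimum variance portfolio cannot have higher volatility than ‘1/n’. Whether that advantage survives out of sample depends on the quality of the covariance forecast, which is what the realised portfolio volatility comparison in the tables measures.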
impose a factor structure to stabilize portfolio weights
Panel B of table 1 reports evaluation criterion II, i.e. the portfolio sensitivity measures for strategy 4. Up to this point, we found that RC and PC are the best performing methods and statistically indistinguishable based on evaluation criterion I. Yet, now we see that when RC and PC are compared based on their implied portfolio sensitivity, PC is clearly superior. For instance, under ‘linear costs’ the improvement is about 2.5% and highly significant (also note that SW, DF and DCC all deteriorate relative to RC and are thus inferior on both criteria). For ‘quadratic costs’ the pattern is even more pronounced. As before, Panels C and D of figure 3 reiterate these findings by plotting the bootstrapped distribution of the performance measures for all strategies. The distribution of PC clearly lies to the left of all its competitors. From all this, it is evident that the imposed factor structure is key in stabilizing the covariance matrix, and consequently the portfolio weights, while retaining good performance in terms of realised portfolio volatility.
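One common way to impose a factor structure on a covariance matrix is to keep its leading principal components and replace the remainder by a diagonal residual. The sketch below follows that recipe as an assumption; the paper's PC method is specified in an earlier section and may differ, and the function name and toy data are hypothetical.

```python
import numpy as np

def factor_structured_covariance(cov, n_factors):
    """Leading principal components plus a diagonal residual variance."""
    eigval, eigvec = np.linalg.eigh(cov)          # eigenvalues in ascending order
    lead_val = eigval[-n_factors:]
    lead_vec = eigvec[:, -n_factors:]
    systematic = (lead_vec * lead_val) @ lead_vec.T
    # The residual keeps each asset's total variance unchanged.
    residual_var = np.diag(cov) - np.diag(systematic)
    return systematic + np.diag(residual_var)

# Hypothetical toy returns: ten assets driven by one common factor plus noise.
rng = np.random.default_rng(0)
factor = rng.normal(size=(500, 1))
loadings = rng.uniform(0.5, 1.5, size=(1, 10))
returns = factor @ loadings + rng.normal(size=(500, 10))
sample_cov = returns.T @ returns / len(returns)

pc_cov = factor_structured_covariance(sample_cov, n_factors=1)
print("condition number, sample:", np.linalg.cond(sample_cov))
print("condition number, factor:", np.linalg.cond(pc_cov))
```

The structured matrix preserves the individual variances and stays positive definite as long as the residual variances are positive, while off-diagonal entries are driven only by the retained components.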
To gauge the robustness of the above results, consider table 2. From Panels A–C, we see that our conclusions remain unchanged over different sample periods. Note that the realised portfolio volatility for all strategies is significantly higher in the third subsample from April 2008 to March 2009. This is of course a direct reflection of the extremely turbulent market environment over this period. Yet, the relative performance of the forecast strategies remains unchanged, with RC and PC superior to SW, DCC and DF according to portfolio optimality criterion I and, in addition, PC superior to RC according to portfolio stability criterion II. The results are also robust to increasing the portfolio size to 25 assets (Panel D) and to alternative sampling frequencies of 5 and 45 minutes (Panels E and F).
align sampling frequency with trading or evaluation frequency
As already discussed above, the use of high-frequency data is important because it allows one to capture the dependence structure of returns at intra-day horizons. But how should we select the sampling frequency in relation to the trading or evaluation frequency? To shed some light on this question, consider table 3. Here, to compute the PC forecast, the sampling frequency and evaluation frequency are varied between 1 and 45 minutes. The statistics and bootstrapped p-values are now computed relative to the benchmark case where the sampling frequency is equal to the evaluation frequency. Looking at the results, a clear overall pattern emerges, namely: it is optimal to align the sampling frequency with the evaluation frequency. Consider for instance the ‘net long’ minimum variance strategy 1 with an evaluation frequency of 1 minute. When the sampling frequency is also 1 minute, the realised portfolio volatility is 5.09 basis points, but this deteriorates by more than 6% when lowering the sampling frequency to 45 minutes. At the same time, when the evaluation frequency is 15 minutes, but data are sampled at a 1-minute frequency (Panel C), then the realised portfolio volatility also deteriorates significantly. There are a few cases where this pattern does not hold up (e.g. the ‘target long–short’ strategy) but here the results are not significant and therefore do not contradict the statement. When lowering the evaluation frequency further to 45 minutes, we see that the harm of sampling at 15
Table 2. Performance evaluation robustness analysis.

Panel A: Apr06–Mar07
                         RC       PC       SW       DF       DCC      NAIVE
1. ‘net long’            10.99    1.00     1.03     1.10     1.21     1.14
                                  (0.41)   (0.00)   (0.00)   (0.00)   (0.00)
2. ‘long only’           11.00    1.00     1.02     1.08     1.15     1.14
                                  (0.42)   (0.00)   (0.00)   (0.00)   (0.00)
3. ‘ad hoc long–short’   6.70     1.00     1.02     1.08     1.06     1.15
                                  (0.30)   (0.00)   (0.00)   (0.00)   (0.00)
4. ‘target long–short’   9.08     1.00     1.01     1.04     1.03     –
                                  (0.44)   (0.09)   (0.00)   (0.00)
5. ‘min eig’             15.42    1.01     1.03     1.12     1.07     –
                                  (0.05)   (0.00)   (0.00)   (0.00)
6. ‘max eig’             45.20    1.00     0.99     0.97     0.97     –
                                  (0.51)   (0.86)   (1.00)   (1.00)

1. ‘smoothness’          0.17     0.99     1.05     1.09     1.11     –
                                  (1.00)   (0.00)   (0.00)   (0.00)
2. ‘linear costs’        3.47     0.98     1.05     1.06     1.11     –
                                  (0.98)   (0.00)   (0.00)   (0.00)
3. ‘quadratic costs’     0.57     0.94     1.10     1.10     1.23     –
                                  (1.00)   (0.00)   (0.00)   (0.00)

Average correlation      0.15     0.15     0.22     0.31     0.37     –

Panel B: Apr07–Mar08
                         RC       PC       SW       DF       DCC      NAIVE
1. ‘net long’            17.93    1.00     1.03     1.06     1.11     1.16
                                  (0.41)   (0.00)   (0.00)   (0.00)   (0.00)
2. ‘long only’           17.96    1.00     1.02     1.04     1.08     1.16
                                  (0.45)   (0.04)   (0.00)   (0.00)   (0.00)
3. ‘ad hoc long–short’   8.51     1.00     1.02     1.09     1.05     1.12
                                  (0.20)   (0.00)   (0.00)   (0.00)   (0.00)
4. ‘target long–short’   11.70    1.00     1.02     1.04     1.03     –
                                  (0.38)   (0.06)   (0.00)   (0.00)
5. ‘min eig’             19.88    1.01     1.03     1.08     1.08     –
                                  (0.10)   (0.00)   (0.00)   (0.00)
6. ‘max eig’             68.92    1.00     0.99     0.98     0.98     –
                                  (0.53)   (0.78)   (0.96)   (0.98)

1. ‘smoothness’          0.17     0.99     1.06     1.10     1.11     –
                                  (0.99)   (0.00)   (0.00)   (0.00)
2. ‘linear costs’        3.55     0.97     1.07     1.05     1.10     –
                                  (1.00)   (0.00)   (0.00)   (0.00)
3. ‘quadratic costs’     0.58     0.93     1.11     1.10     1.20     –
                                  (1.00)   (0.00)   (0.00)   (0.00)

Average correlation      0.25     0.26     0.28     0.31     0.38     –

Panel C: Apr08–Mar09
                         RC       PC       SW       DF       DCC      NAIVE
1. ‘net long’            29.68    1.00     1.03     1.17     1.13     1.22
                                  (0.43)   (0.00)   (0.00)   (0.00)   (0.00)
2. ‘long only’           29.85    1.00     1.02     1.06     1.09     1.22
                                  (0.46)   (0.02)   (0.00)   (0.00)   (0.00)
3. ‘ad hoc long–short’   13.57    1.00     1.02     1.20     1.07     1.26
                                  (0.47)   (0.00)   (0.00)   (0.00)   (0.00)
4. ‘target long–short’   19.28    1.00     1.02     1.11     1.04     –
                                  (0.47)   (0.11)   (0.00)   (0.00)
5. ‘min eig’             30.33    1.01     1.04     1.20     1.13     –
                                  (0.21)   (0.00)   (0.00)   (0.00)
6. ‘max eig’             128.38   1.00     1.00     0.96     0.98     –
                                  (0.51)   (0.50)   (1.00)   (0.96)

1. ‘smoothness’          0.17     0.98     1.06     1.13     1.10     –
                                  (1.00)   (0.00)   (0.00)   (0.00)
2. ‘linear costs’        3.78     0.97     1.05     0.99     1.10     –
                                  (0.99)   (0.00)   (0.72)   (0.00)
3. ‘quadratic costs’     0.64     0.92     1.06     1.04     1.18     –
                                  (1.00)   (0.00)   (0.01)   (0.00)

Average correlation      0.31     0.31     0.31     0.41     0.37     –

Panel D: size = 25, freq = 15 min
                         RC       PC       SW       DF       DCC      NAIVE
1. ‘net long’            18.09    1.01     1.05     1.21     1.20     1.29
                                  (0.09)   (0.00)   (0.00)   (0.00)   (0.00)
2. ‘long only’           18.53    1.00     1.04     1.10     1.14     1.26
                                  (0.40)   (0.00)   (0.00)   (0.00)   (0.00)
3. ‘ad hoc long–short’   6.11     1.01     1.06     1.28     1.09     1.26
                                  (0.01)   (0.00)   (0.00)   (0.00)   (0.00)
4. ‘target long–short’   9.79     1.01     1.14     1.13     1.03     –
                                  (0.28)   (0.00)   (0.00)   (0.00)
5. ‘min eig’             19.62    1.07     1.10     1.24     1.14     –
                                  (0.00)   (0.00)   (0.00)   (0.00)
6. ‘max eig’             125.51   1.00     1.00     0.98     0.97     –
                                  (0.53)   (0.74)   (1.00)   (1.00)

1. ‘smoothness’          0.08     0.94     1.21     1.22     1.18     –
                                  (1.00)   (0.00)   (0.00)   (0.00)
2. ‘linear costs’        4.56     0.92     1.24     1.01     1.15     –
                                  (1.00)   (0.00)   (0.04)   (0.00)
3. ‘quadratic costs’     0.45     0.82     1.22     1.11     1.31     –
                                  (1.00)   (0.00)   (0.00)   (0.00)

Average correlation      0.24     0.25     0.27     0.34     0.37     –

Panel E: size = 10, freq = 5 min
                         RC       PC       SW       DF       DCC      NAIVE
1. ‘net long’            12.10    1.00     1.02     1.16     1.17     1.18
                                  (0.40)   (0.00)   (0.00)   (0.00)   (0.00)
2. ‘long only’           12.14    1.00     1.01     1.08     1.12     1.18
                                  (0.43)   (0.01)   (0.00)   (0.00)   (0.00)
3. ‘ad hoc long–short’   6.15     1.00     1.01     1.16     1.08     1.20
                                  (0.29)   (0.01)   (0.00)   (0.00)   (0.00)
4. ‘target long–short’   8.22     1.00     1.01     1.09     1.05     –
                                  (0.41)   (0.08)   (0.00)   (0.00)
5. ‘min eig’             14.08    1.01     1.03     1.16     1.12     –
                                  (0.05)   (0.00)   (0.00)   (0.00)
6. ‘max eig’             49.70    1.00     1.00     0.96     0.98     –
                                  (0.52)   (0.66)   (1.00)   (1.00)

1. ‘smoothness’          0.17     0.99     1.04     1.13     1.12     –
                                  (1.00)   (0.00)   (0.00)   (0.00)
2. ‘linear costs’        3.55     0.98     1.05     1.05     1.13     –
                                  (1.00)   (0.00)   (0.00)   (0.00)
3. ‘quadratic costs’     0.58     0.95     1.10     1.12     1.25     –
                                  (1.00)   (0.00)   (0.00)   (0.00)

Average correlation      0.20     0.20     0.26     0.34     0.37     –

Panel F: size = 10, freq = 45 min
                         RC       PC       SW       DF       DCC      NAIVE
1. ‘net long’            36.24    1.00     1.04     1.12     1.09     1.21
                                  (0.33)   (0.00)   (0.00)   (0.00)   (0.00)
2. ‘long only’           36.30    1.00     1.02     1.04     1.06     1.21
                                  (0.41)   (0.00)   (0.00)   (0.00)   (0.00)
3. ‘ad hoc long–short’   16.99    1.00     1.04     1.13     1.04     1.21
                                  (0.43)   (0.00)   (0.00)   (0.00)   (0.00)
4. ‘target long–short’   25.87    1.00     1.02     1.06     1.02     –
                                  (0.41)   (0.00)   (0.00)   (0.03)
5. ‘min eig’             39.04    1.01     1.07     1.12     1.07     –
                                  (0.04)   (0.00)   (0.00)   (0.00)
6. ‘max eig’             151.74   1.00     1.00     0.97     0.99     –
                                  (0.53)   (0.70)   (1.00)   (0.85)

1. ‘smoothness’          0.17     0.98     1.10     1.08     1.07     –
                                  (1.00)   (0.00)   (0.00)   (0.00)
2. ‘linear costs’        3.64     0.96     1.09     1.00     1.07     –
                                  (1.00)   (0.00)   (0.36)   (0.00)
3. ‘quadratic costs’     0.61     0.91     1.09     1.03     1.15     –
                                  (1.00)   (0.00)   (0.00)   (0.00)

Average correlation      0.25     0.26     0.27     0.34     0.37     –

Note that this table reports the same statistics as in table 1 for three subsamples (in Panels A–C), different portfolio size (in Panel D), and different sampling frequencies (in Panels E and F). The evaluation frequency is set equal to the sampling frequency in all cases.
minutes is not significant, but at five minutes it is for most strategies.
Based on the above, we come to the following conclusion. At very high trading frequencies, where microstructure effects are ubiquitous, the alignment of sampling frequency with evaluation frequency is key. In this frequency range, the return dependence varies in complex ways with the sampling frequency and, in particular, using a lower sampling frequency tends to over-estimate the covariance among assets. But when the trading frequency is lowered, and market microstructure effects diminish, the importance of alignment diminishes as well. As a rough guide for where the important frequency range lies, one may consider figure 1 and Panel D of figure 4. The Epps curve stabilizes at around a 25-minute frequency: when trading at higher frequencies than this, alignment is important as the dependence structure of returns varies with the sampling frequency, but when trading at lower frequencies alignment is less important and should be balanced against the loss of information incurred when lowering the sampling frequency.
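The alignment argument can be reproduced in a stylised experiment: estimate the covariance at two sampling frequencies on one half of a sample, build minimum variance weights from each, and measure realised portfolio volatility at the fine evaluation frequency on the other half. Everything below (the price process with a lagged common component, the helper names, the sample split) is an assumption made for illustration and not the paper's experimental design.

```python
import numpy as np

def realised_cov(log_prices, step):
    """Realised covariance from returns sampled every `step` observations."""
    r = np.diff(log_prices[::step], axis=0)
    return r.T @ r

def min_var_weights(cov):
    """Minimum variance weights that sum to one."""
    raw = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return raw / raw.sum()

def realised_portfolio_vol(log_prices, weights, step):
    """Root mean squared portfolio return at the evaluation frequency."""
    r = np.diff(log_prices[::step], axis=0) @ weights
    return float(np.sqrt(np.mean(r ** 2)))

# Hypothetical data: asset 2 loads on the common shock with a one-step lag,
# so the two assets are nearly uncorrelated at the base frequency only.
rng = np.random.default_rng(1)
T = 40_000
f = rng.normal(size=T + 1)
e = rng.normal(size=(T, 2))
returns = np.column_stack([f[1:] + 0.5 * e[:, 0], 2.0 * f[:-1] + e[:, 1]])
log_prices = np.vstack([np.zeros(2), np.cumsum(returns, axis=0)])
train, test = log_prices[: T // 2 + 1], log_prices[T // 2:]

w_aligned = min_var_weights(realised_cov(train, 1))    # sampled at evaluation freq
w_coarse = min_var_weights(realised_cov(train, 60))    # sampled 60x more coarsely
vol_aligned = realised_portfolio_vol(test, w_aligned, 1)
vol_coarse = realised_portfolio_vol(test, w_coarse, 1)
print("aligned sampling:", vol_aligned)
print("coarse sampling: ", vol_coarse)
```

The coarse estimate sees a high correlation and takes an offsetting short position that does not hedge anything at the fine horizon, so the out-of-sample volatility is higher than with aligned sampling.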
4. Concluding remarks
In this paper we highlight a number of important considerations when building covariance models to forecast return dependence over short intra-day horizons. We find that forecasts that are based on intra-day data, impose a factor structure, and use a sampling frequency that is aligned with the trading or evaluation frequency perform best in terms of the various portfolio optimality and stability metrics considered here. To the best of our knowledge, the focus on short intra-day horizons and the performance evaluation methodology used in this paper are new in this literature. We also note that our results contradict those of Jaganathan and Ma (2003) and DeMiguel et al. (2009): in our setting, the long-only constraint harms performance and the naive ‘1/n’ strategy severely under-performs model-based minimum variance portfolios.
As already alluded to above, the forecasting models we use are rather simple. This is deliberate, as the focus is more on basic modelling principles than on subtle refinements. Yet, a number of interesting improvements are worth investigating. Consider figure 4, which reports some summary statistics for the FTSE-100 dataset studied in this paper. From Panels A and B we observe pronounced diurnal patterns in both volatility and correlation. The volatility curve takes on the well-known U-shape, with large spikes around the US open and at scheduled news announcement times. The correlation pattern is less well known, but equally prevalent: correlation is low at the beginning of the day, gradually grows, and reaches its maximum once the US market is trading. Both these salient features provide scope for improving the covariance model by allowing the forecast to depend on the time of the day. In Panel C we plot the bid–ask spread curve, which shows a steep decline by a factor of three over the first hour of trading, after which it stabilizes and
Table 3. Sampling frequency versus evaluation frequency.

Panel A: 1-minute evaluation frequency
                         Sampling frequency
                         1-min    5-min    15-min   45-min
1. ‘net long’            5.086    1.019    1.044    1.065
                                  (0.00)   (0.00)   (0.00)
2. ‘long only’           5.090    1.015    1.032    1.044
                                  (0.01)   (0.00)   (0.00)
3. ‘ad hoc long–short’   3.016    1.006    1.015    1.027
                                  (0.13)   (0.00)   (0.00)
4. ‘target long–short’   3.872    1.001    1.004    0.998
                                  (0.46)   (0.38)   (0.54)
5. ‘min eig’             6.945    1.008    1.026    1.047
                                  (0.14)   (0.00)   (0.00)
6. ‘max eig’             20.961   0.997    0.992    0.984
                                  (0.65)   (0.88)   (0.99)

Panel B: 5-minute evaluation frequency
                         Sampling frequency
                         1-min    5-min    15-min   45-min
1. ‘net long’            1.007    12.100   1.009    1.026
                         (0.14)            (0.08)   (0.00)
2. ‘long only’           1.007    12.138   1.005    1.016
                         (0.15)            (0.21)   (0.01)
3. ‘ad hoc long–short’   1.004    6.153    1.004    1.016
                         (0.25)            (0.23)   (0.00)
4. ‘target long–short’   1.008    8.223    0.996    0.988
                         (0.20)            (0.65)   (0.88)
5. ‘min eig’             1.010    14.082   1.009    1.027
                         (0.02)            (0.06)   (0.00)
6. ‘max eig’             0.991    49.697   1.000    0.993
                         (0.89)            (0.51)   (0.84)

Panel C: 15-minute evaluation frequency
                         Sampling frequency
                         1-min    5-min    15-min   45-min
1. ‘net long’            1.015    0.999    21.055   1.011
                         (0.01)   (0.54)            (0.03)
2. ‘long only’           1.015    1.002    21.145   1.007
                         (0.00)   (0.38)            (0.15)
3. ‘ad hoc long–short’   1.010    1.001    10.048   1.009
                         (0.02)   (0.45)            (0.04)
4. ‘target long–short’   1.020    1.008    14.067   0.988
                         (0.01)   (0.17)            (0.91)
5. ‘min eig’             1.027    1.004    22.799   1.011
                         (0.00)   (0.25)            (0.04)
6. ‘max eig’             0.982    0.996    88.346   0.996
                         (0.99)   (0.69)            (0.66)

Panel D: 45-minute evaluation frequency
                         Sampling frequency
                         1-min    5-min    15-min   45-min
1. ‘net long’            1.021    1.004    1.000    36.235
                         (0.00)   (0.45)   (0.70)
2. ‘long only’           1.019    1.005    1.000    36.300
                         (0.00)   (0.21)   (0.50)
3. ‘ad hoc long–short’   1.016    1.004    1.000    16.993
                         (0.01)   (0.50)   (0.79)
4. ‘target long–short’   1.024    1.010    1.000    25.872
                         (0.00)   (0.00)   (0.06)
5. ‘min eig’             1.039    1.012    1.000    39.038
                         (0.00)   (0.06)   (0.67)
6. ‘max eig’             0.977    0.993    1.000    151.743
                         (1.00)   (0.82)   (0.49)

Note that this table reports the realised portfolio volatility for the strategies described in section 2.2.1 for forecasting method PC. The sampling and
evaluation frequency is varied between 1 and 45 minutes. The benchmark case is where the sampling frequency is equal to the evaluation frequency.
All other statistics are reported as ratios relative to the benchmark. Bootstrapped p-values are reported in parentheses below, for the null hypothesis
that the sample statistic associated with the benchmark case is higher than its competitor.
remains roughly constant for the remainder of the day. Microstructure noise effects induced by the bid–ask bounce (see for example Roll 1984) are thus expected to be stronger in the morning. This may at least partially account for the depressed correlation observed early in the morning. The diurnal pattern in microstructure noise is again something that may be incorporated in the modelling framework.
Figure 4. Summary statistics. Note that this figure reports summary statistics, averaged over the FTSE-100 universe. Panel A reports the average volatility (normalized asset-by-asset and day-by-day) as a function of time of the day. Panel B reports the average correlation at one-minute frequency as a function of time of the day. Panel C reports the average bid–ask spread (normalized asset-by-asset and day-by-day) as a function of time of the day. Panel D reports the average (normalized) volatility as a function of sampling frequency in minutes. Panel A: volatility curve, Panel B: correlation curve, Panel C: spread curve and Panel D: volatility signature.
Table 4. Sparse-sampling versus subsampling.

                         freq = 1 min      freq = 5 min      freq = 15 min     freq = 45 min
                         Sparse   Subsam   Sparse   Subsam   Sparse   Subsam   Sparse   Subsam
1. ‘net long’            5.086    1.000    12.100   1.000    21.055   0.999    36.235   0.997
                                  (0.50)            (0.48)            (0.57)            (0.67)
2. ‘long only’           5.090    1.000    12.138   1.000    21.145   0.999    36.300   0.997
                                  (0.50)            (0.50)            (0.57)            (0.67)
3. ‘ad hoc long–short’   3.016    1.000    6.153    0.999    10.048   0.999    16.993   0.995
                                  (0.52)            (0.61)            (0.62)            (0.82)
4. ‘target long–short’   3.872    1.000    8.223    0.999    14.067   0.998    25.872   0.996
                                  (0.52)            (0.55)            (0.61)            (0.64)
5. ‘min eig’             6.945    0.999    14.082   0.997    22.799   0.996    39.038   0.992
                                  (0.54)            (0.66)            (0.73)            (0.92)
6. ‘max eig’             20.961   1.000    49.697   1.001    88.346   1.000    151.743  1.004
                                  (0.48)            (0.46)            (0.50)            (0.29)

Note that this table reports the realised portfolio volatility for the strategies described in section 2.2.1 for forecasting method PC based on sparse sampling (first column) and ‘subsampling and averaging’ (second column). In parentheses below are the p-values for the null hypothesis that the forecast constructed using subsampling reduces the realised portfolio volatility.
In this paper we use synchronized 15-second data, but irregularly spaced tick data may be used as an alternative. The advantage is that all available data can then be used, with methods such as those advocated in Hayashi and Yoshida (2005) and others. However, constructing high-dimensional covariance matrices in this fashion is not straightforward, as they are typically not guaranteed to be positive definite and regularisation would be required. Another way of making more efficient use of the data would be to subsample and average the estimators, see for instance Zhang et al. (2005). Table 4 implements such an approach, where we obtain the forecast as an average over PCs each computed at the same frequency but with the starting point shifted by 15-second increments. So at a sampling frequency of 1 (45) minute(s), the forecast is constructed as the average over 4 (180) individual forecasts. The results are quite intuitive, namely that at lower sampling frequencies the scope for improvement by subsampling is greater. Still, the benefit appears marginal at best, even at a 45-minute sampling frequency. One explanation for this may be that the smoothing done to obtain the forecast diminishes the benefits. All these issues, and more, warrant further investigation and we leave this for future research.
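The subsample-and-average idea can be sketched in a few lines. The code below averages realised covariance estimators over all shifted grids, which is a simplification: in table 4 the averaging is applied to PC forecasts, whose construction is described earlier in the paper. The function names and the simulated prices are hypothetical; the base grid is taken to be 15 seconds, so `step=4` corresponds to 1 minute and `step=180` to 45 minutes.

```python
import numpy as np

def realised_cov(log_prices, step, offset=0):
    """Realised covariance on the grid that starts at `offset`."""
    r = np.diff(log_prices[offset::step], axis=0)
    return r.T @ r

def subsampled_cov(log_prices, step):
    """Average of the estimators over all `step` possible grid offsets."""
    estimates = [realised_cov(log_prices, step, offset) for offset in range(step)]
    return sum(estimates) / step

# Hypothetical day of 15-second log prices for three assets (8.5 hours).
rng = np.random.default_rng(0)
log_prices = np.cumsum(1e-4 * rng.normal(size=(2041, 3)), axis=0)

cov_1min = subsampled_cov(log_prices, step=4)      # average over 4 grids
cov_45min = subsampled_cov(log_prices, step=180)   # average over 180 grids
print(np.round(cov_1min * 1e6, 3))
print(np.round(cov_45min * 1e6, 3))
```

At the base frequency there is only one grid, so the subsampled and sparse estimators coincide; the number of grids being averaged, and hence the potential gain, grows as the sampling frequency is lowered.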
Acknowledgements
The author would like to thank two anonymous referees, Karim Bannouh, Andy Ferraris, Alexander Gerko and Tom Halahan for helpful comments and suggestions.
References
Andersen, T.G., Bollerslev, T., Diebold, F.X. and Labys, P., Modeling and forecasting realised volatility. Econometrica, 2003, 71, 579–625.
Barndorff-Nielsen, O.E., Hansen, P.R., Lunde, A. and Shephard, N., Multivariate realised kernels: consistent positive semi-definite estimators of the covariation of equity prices with noise and non-synchronous trading. Working paper, Oxford-Man Institute, 2008.
Barndorff-Nielsen, O.E. and Shephard, N., Econometric analysis of realised covariation: high frequency based covariance, regression and correlation in financial economics. Econometrica, 2004, 72, 885–925.
Briner, B.G. and Connor, G., How much structure is best? A comparison of market model, factor model and unstructured equity covariance matrices. J. Risk, 2008, 10, 3–30.
Chiriac, R. and Voev, V., Modelling and forecasting multivariate realized volatility. J. Appl. Econom., 2009, forthcoming.
Cohen, K.J., Hawawini, G.A., Maier, S.F., Schwartz, R.A. and Whitcomb, D.K., Friction in the trading process and the estimation of systematic risk. J. Finan. Econ., 1983, 12, 263–278.
DeMiguel, V., Garlappi, L. and Uppal, R., Optimal versus naive diversification: how inefficient is the 1/N portfolio strategy? Rev. Finan. Stud., 2009, 22, 1915–1953.
Dimson, E., Risk measurement when shares are subject to infrequent trading. J. Finan. Econ., 1979, 7, 197–226.
Engle, R., Dynamic conditional correlation: a simple class of multivariate GARCH models. J. Bus. & Econom. Statist., 2002, 20, 339–350.
Engle, R.F. and Sheppard, K., Theoretical and empirical properties of dynamic conditional correlation multivariate GARCH. Unpublished paper, University of California, 2001.
Epps, T.W., Comovements in stock prices in the very short run. J. Amer. Statist. Assoc., 1979, 74, 291–298.
Fisher, L., Some new stock-market indexes. J. Bus., 1966, 39, 191–225.
Griffin, J.E. and Oomen, R.C., Covariance measurement in the presence of non-synchronous trading and market microstructure noise. J. Econometrics, forthcoming.
Hayashi, T. and Yoshida, N., On covariance estimation of non-synchronously observed diffusion processes. Bernoulli, 2005, 11, 359–379.
Jaganathan, R. and Ma, T., Risk reduction in large portfolios: why imposing the wrong constraints helps. J. Finan., 2003, 58, 1651–1683.
Johnstone, I., On the distribution of the largest eigenvalue in principal components analysis. Ann. Statist., 2001, 29, 295–327.
Lo, A.W. and MacKinlay, A.C., An econometric analysis of nonsynchronous-trading. J. Econom., 1990, 45, 181–212.
Patton, A. and Sheppard, K., Evaluating volatility and correlation forecasts. In Handbook of Financial Time Series, edited by T.G. Andersen, R.A. Davis, J.P. Kreiss, and T. Mikosch, pp. 801–838, 2009 (Springer-Verlag: Berlin).
Politis, D.N. and Romano, J.P., The stationary bootstrap. J. Amer. Statist. Assoc., 1994, 89, 1303–1313.
Roll, R., A simple implicit measure of the effective bid–ask spread in an efficient market. J. Finan., 1984, 39, 1127–1139.
Scholes, M. and Williams, J., Estimating betas from nonsynchronous data. J. Finan. Econ., 1977, 5, 309–327.
Zhang, L., Mykland, P.A. and Aït-Sahalia, Y., A tale of two time scales: determining integrated volatility with noisy high frequency data. J. Amer. Statist. Assoc., 2005, 100, 1394–1411.
Zhou, B., High frequency data and volatility in foreign-exchange rates. J. Bus. & Econom. Statist., 1996, 14, 45–52.
Appendix A: Stability of portfolio weights
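For portfolio strategy 4, we consider the variability of optimal weights with respect to changes in expected return. Using the expression for the partitioned inverse matrix, note that

\[
\omega^{*}
= \begin{bmatrix} \Sigma^{-1}\mu & \Sigma^{-1}\iota \end{bmatrix}
\begin{bmatrix} \mu'\Sigma^{-1}\mu & \mu'\Sigma^{-1}\iota \\ \iota'\Sigma^{-1}\mu & \iota'\Sigma^{-1}\iota \end{bmatrix}^{-1}
\begin{bmatrix} c \\ \varsigma \end{bmatrix}
= \begin{bmatrix} \Sigma^{-1}\mu & \Sigma^{-1}\iota \end{bmatrix}
\begin{bmatrix} A & B \\ B & D \end{bmatrix}
\begin{bmatrix} c \\ \varsigma \end{bmatrix}
= c\,\Sigma^{-1}\mu A + \varsigma\,\Sigma^{-1}\mu B + c\,\Sigma^{-1}\iota B + \varsigma\,\Sigma^{-1}\iota D,
\]

where

\[
A = \frac{1}{\mu'\Sigma^{-1}\mu} + \frac{1}{\mu'\Sigma^{-1}\mu}\,
\frac{\mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu}{\mu'\Sigma^{-1}\mu\,\iota'\Sigma^{-1}\iota - \mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu},
\]

\[
B = \frac{-\mu'\Sigma^{-1}\iota}{\mu'\Sigma^{-1}\mu\,\iota'\Sigma^{-1}\iota - \mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu},
\]

\[
D = \frac{\mu'\Sigma^{-1}\mu}{\mu'\Sigma^{-1}\mu\,\iota'\Sigma^{-1}\iota - \mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu}.
\]

From this it then follows that

\[
\frac{d\omega^{*}}{d\mu} = \Sigma^{-1}(cA + \varsigma B) + c\,\Sigma^{-1}\mu\,\frac{dA}{d\mu}
+ (\varsigma\,\Sigma^{-1}\mu + c\,\Sigma^{-1}\iota)\,\frac{dB}{d\mu}
+ \varsigma\,\Sigma^{-1}\iota\,\frac{dD}{d\mu}, \qquad \text{(A1)}
\]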
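where

\[
\frac{dA}{d\mu} = -\,\iota'\Sigma^{-1}\iota\,
\frac{2\,\iota'\Sigma^{-1}\iota\,\mu'\Sigma^{-1} - 2\,\mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}}
{\left(\mu'\Sigma^{-1}\mu\,\iota'\Sigma^{-1}\iota - \mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu\right)^{2}},
\]

\[
\frac{dB}{d\mu} = \mu'\Sigma^{-1}\iota\,
\frac{2\,\iota'\Sigma^{-1}\iota\,\mu'\Sigma^{-1} - 2\,\mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}}
{\left(\mu'\Sigma^{-1}\mu\,\iota'\Sigma^{-1}\iota - \mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu\right)^{2}}
- \frac{\iota'\Sigma^{-1}}{\mu'\Sigma^{-1}\mu\,\iota'\Sigma^{-1}\iota - \mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu},
\]

\[
\frac{dD}{d\mu} = \frac{2\,\mu'\Sigma^{-1}}{\mu'\Sigma^{-1}\mu\,\iota'\Sigma^{-1}\iota - \mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu}
- \mu'\Sigma^{-1}\mu\,
\frac{2\,\iota'\Sigma^{-1}\iota\,\mu'\Sigma^{-1} - 2\,\mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}}
{\left(\mu'\Sigma^{-1}\mu\,\iota'\Sigma^{-1}\iota - \mu'\Sigma^{-1}\iota\,\iota'\Sigma^{-1}\mu\right)^{2}}.
\]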
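A derivative of this kind is easy to get wrong, so a numerical cross-check is useful. The sketch below rests on assumptions about the notation: the covariance matrix is called `cov`, the expected return vector `mu`, the weights satisfy `w @ mu == c` and `w.sum() == s`, and the 2x2 inverse has entries `A`, `B`, `D` ordered so that `A` multiplies the expected return column. All function names and inputs are hypothetical. It compares an analytic Jacobian of the weights with respect to `mu` against central finite differences.

```python
import numpy as np

def target_weights(cov, mu, c, s):
    """Minimum variance weights with w'mu == c and w'1 == s."""
    C = np.column_stack([mu, np.ones(len(mu))])   # constraint vectors
    inv_C = np.linalg.solve(cov, C)               # cov^{-1} [mu, 1]
    M = C.T @ inv_C                               # 2x2 matrix of quadratic forms
    return inv_C @ np.linalg.solve(M, np.array([c, s]))

def analytic_sensitivity(cov, mu, c, s):
    """Jacobian d w / d mu' obtained by differentiating A, B and D."""
    inv = np.linalg.inv(cov)
    ones = np.ones(len(mu))
    u, v = inv @ mu, inv @ ones
    a, b, d = mu @ u, mu @ v, ones @ v
    det = a * d - b ** 2
    A, B, D = d / det, -b / det, a / det
    ddet = 2 * d * u - 2 * b * v                  # gradient of the determinant
    dA = -d * ddet / det ** 2
    dB = -v / det + b * ddet / det ** 2
    dD = 2 * u / det - a * ddet / det ** 2
    return (inv * (c * A + s * B) + c * np.outer(u, dA)
            + np.outer(s * u + c * v, dB) + s * np.outer(v, dD))

def numeric_sensitivity(cov, mu, c, s, eps=1e-6):
    """Central finite-difference approximation of the same Jacobian."""
    n = len(mu)
    jac = np.empty((n, n))
    for j in range(n):
        bump = np.zeros(n)
        bump[j] = eps
        jac[:, j] = (target_weights(cov, mu + bump, c, s)
                     - target_weights(cov, mu - bump, c, s)) / (2 * eps)
    return jac

# Hypothetical inputs: three assets, target return 1 and zero net investment.
vols = np.array([1.0, 1.5, 2.0])
corr = np.array([[1.0, 0.3, 0.1], [0.3, 1.0, 0.5], [0.1, 0.5, 1.0]])
cov = np.outer(vols, vols) * corr
mu = np.array([0.5, 1.0, 0.8])
c, s = 1.0, 0.0

w = target_weights(cov, mu, c, s)
gap = np.abs(analytic_sensitivity(cov, mu, c, s) - numeric_sensitivity(cov, mu, c, s)).max()
print("weights:", np.round(w, 4))
print("largest analytic/numeric gap:", gap)
```

A small gap indicates that the analytic expression and the weight formula are mutually consistent for these inputs; it does not by itself confirm that the constraint values match those used for strategy 4 in the paper.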
Appendix B: Testing for the number of significant principal components
Let $X$ denote an $n \times k$ matrix of i.i.d. standard normal random variables and let $\lambda_{\max}$ denote the largest eigenvalue of $X'X$. Johnstone (2001) proves that the distribution of $\lambda_{\max}$, when centred by $(\sqrt{n-1} + \sqrt{k})^{2}$ and scaled by $(\sqrt{n-1} + \sqrt{k})\,(1/\sqrt{n-1} + 1/\sqrt{k})^{1/3}$, converges to the Tracy–Widom law of order 1 when $k, n \to \infty$ and $n/k \to c \geq 1$. Unreported simulations confirm the finding of Johnstone (2001) that the finite sample properties of the test for realistic sample sizes are excellent.
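The centring and scaling constants are simple to compute. The sketch below standardises the largest eigenvalue of X'X for a simulated Gaussian matrix; the function names are hypothetical, and comparing the statistic with a Tracy–Widom critical value would additionally require a table or a specialised library, which is not shown here.

```python
import numpy as np

def johnstone_constants(n, k):
    """Centring and scaling for the largest eigenvalue of X'X, with X of size n x k."""
    root_n, root_k = np.sqrt(n - 1), np.sqrt(k)
    centre = (root_n + root_k) ** 2
    scale = (root_n + root_k) * (1 / root_n + 1 / root_k) ** (1 / 3)
    return centre, scale

def standardised_largest_eigenvalue(x):
    """(lambda_max - centre) / scale for the matrix x'x."""
    n, k = x.shape
    centre, scale = johnstone_constants(n, k)
    lam_max = np.linalg.eigvalsh(x.T @ x)[-1]     # eigvalsh sorts ascending
    return (lam_max - centre) / scale

# Simulated i.i.d. standard normal data, as in the setting of the theorem.
rng = np.random.default_rng(0)
stat = standardised_largest_eigenvalue(rng.normal(size=(200, 20)))
print("standardised largest eigenvalue:", stat)
```

In an application to returns, a standardised statistic far in the right tail of the Tracy–Widom law would indicate that the corresponding principal component is significant.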