Random Matrix Theory and Correlation Estimation

Jim Gatheral

Baruch College Mathematics Society, February 24, 2015


Motivation

We would like to understand:

- what random matrix theory (RMT) is;
- how to apply RMT to the estimation of covariance matrices;
- whether the resulting covariance matrix performs better than (for example) the Barra covariance matrix.


Outline

1. Random matrix theory
   - Random matrix examples
   - Wigner's semicircle law
   - The Marcenko-Pastur density
   - The Tracy-Widom law
   - Impact of fat tails

2. Estimating correlations
   - Uncertainty in correlation estimates
   - Example with SPX stocks
   - A recipe for filtering the sample correlation matrix

3. Comparison with Barra
   - Comparison of eigenvectors
   - The minimum variance portfolio
   - Comparison of weights
   - In-sample and out-of-sample performance

4. Conclusions

5. Appendix with a sketch of Wigner's original proof


Example 1: Normal random symmetric matrix

Generate a 5,000 × 5,000 random symmetric matrix with entries a_ij ∼ N(0, 1).

Compute eigenvalues.

Draw a histogram.

Here's some R code to generate a symmetric random matrix whose off-diagonal elements have variance 1/N:

n <- 5000
m <- array(rnorm(n^2), c(n, n))
m2 <- (m + t(m))/sqrt(2*n)  # Make m symmetric
lambda <- eigen(m2, symmetric=TRUE, only.values=TRUE)
e <- lambda$values
hist(e, breaks=seq(-2.01, 2.01, .02), main=NA, xlab="Eigenvalues", freq=FALSE)


Example 1: continued

Here’s the result:

[Figure: histogram of eigenvalues between −2 and 2; density between 0.0 and 0.4.]


Example 2: Uniform random symmetric matrix

Generate a 5,000 × 5,000 random symmetric matrix with entries a_ij ∼ Uniform(0, 1).

Compute eigenvalues.

Draw a histogram.

Here's some R code again (the entries are shifted and scaled so that the off-diagonal elements again have mean 0 and variance 1/N):

n <- 5000
mu <- array(runif(n^2), c(n, n))
mu2 <- sqrt(12)*(mu + t(mu) - 1)/sqrt(2*n)
lambdau <- eigen(mu2, symmetric=TRUE, only.values=TRUE)
eu <- lambdau$values
hist(eu, breaks=seq(-2.01, 2.01, .02), main=NA, xlab="Eigenvalues", freq=FALSE)


Example 2: continued

Here’s the result:

[Figure: histogram of eigenvalues between −2 and 2; density between 0.0 and 0.4.]


What do we see?

We note a striking pattern: the density of eigenvalues is a semicircle!


Wigner’s semicircle law

Consider an $N \times N$ matrix $A$ with entries $a_{ij} \sim N(0, \sigma^2)$. Define

$$A_N = \frac{1}{\sqrt{2N}}\left\{A + A'\right\}.$$

Then $A_N$ is symmetric with

$$\mathrm{Var}[a_{ij}] = \begin{cases} \sigma^2/N & \text{if } i \neq j \\ 2\sigma^2/N & \text{if } i = j. \end{cases}$$

The density of eigenvalues of $A_N$ is given by

$$\rho_N(\lambda) := \frac{1}{N}\sum_{i=1}^{N}\delta(\lambda - \lambda_i) \;\xrightarrow[N \to \infty]{}\; \begin{cases} \frac{1}{2\pi\sigma^2}\sqrt{4\sigma^2 - \lambda^2} & \text{if } |\lambda| \leq 2\sigma \\ 0 & \text{otherwise} \end{cases} =: \rho(\lambda).$$
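The deck's examples are in R; as a quick numerical sanity check, here is a sketch in Python/NumPy (an illustration, not part of the original slides) that evaluates the semicircle density and confirms that the spectrum of a simulated A_N stays inside [−2σ, 2σ]:

```python
import numpy as np

def wigner_density(lam, sigma=1.0):
    """Semicircle density: sqrt(4 sigma^2 - lam^2) / (2 pi sigma^2) on |lam| <= 2 sigma."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    rho = np.zeros_like(lam)
    inside = np.abs(lam) <= 2 * sigma
    rho[inside] = np.sqrt(4 * sigma**2 - lam[inside]**2) / (2 * np.pi * sigma**2)
    return rho

# Simulate A_N = (A + A') / sqrt(2N) with a_ij ~ N(0, 1), as in the R example
rng = np.random.default_rng(0)
N = 1000
A = rng.standard_normal((N, N))
AN = (A + A.T) / np.sqrt(2 * N)
eigs = np.linalg.eigvalsh(AN)
```

For σ = 1 the density peaks at ρ(0) = 1/π ≈ 0.318, which matches the height of the histograms above.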


Example 1: Normal random matrix with Wigner density

Now superimpose the Wigner semicircle density:

[Figure: eigenvalue histogram with the Wigner semicircle density superimposed; eigenvalues between −2 and 2, density between 0.0 and 0.4.]


Example 2: Uniform random matrix with Wigner density

Again superimpose the Wigner semicircle density:

[Figure: eigenvalue histogram with the Wigner semicircle density superimposed; eigenvalues between −2 and 2, density between 0.0 and 0.4.]


Random correlation matrices

Suppose we have M stock return series with T elements each. The elements of the M × M empirical correlation matrix E are given by

$$E_{ij} = \frac{1}{T}\sum_{t=1}^{T} x_{it}\, x_{jt}$$

where $x_{it}$ denotes the $t$th return of stock $i$, normalized by its standard deviation so that $\mathrm{Var}[x_{it}] = 1$.

In matrix form, this may be written as

$$E = \frac{1}{T}\, H\, H'$$

where H is the M × T matrix whose rows are the normalized time series of returns, one for each stock.


Eigenvalue spectrum of random correlation matrix

Suppose the entries of H are random with variance σ². Then, in the limit T, M → ∞ with the ratio Q := T/M ≥ 1 held constant, the density of eigenvalues of E is given by

$$\rho(\lambda) = \frac{Q}{2\pi\sigma^2}\,\frac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{\lambda}$$

where the maximum and minimum eigenvalues are given by

$$\lambda_\pm = \sigma^2\left(1 \pm \sqrt{\frac{1}{Q}}\right)^2.$$

ρ(λ) is known as the Marcenko-Pastur density.
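A sketch of the Marcenko-Pastur density in Python/NumPy (again an illustration, not from the original deck; `mp_bounds` and `mp_density` are hypothetical helper names):

```python
import numpy as np

def mp_bounds(Q, sigma=1.0):
    """Spectrum edges: lambda_pm = sigma^2 (1 +/- sqrt(1/Q))^2."""
    lm = sigma**2 * (1 - np.sqrt(1 / Q))**2
    lp = sigma**2 * (1 + np.sqrt(1 / Q))**2
    return lm, lp

def mp_density(lam, Q, sigma=1.0):
    """rho(lam) = Q / (2 pi sigma^2) * sqrt((lp - lam)(lam - lm)) / lam on (lm, lp)."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lm, lp = mp_bounds(Q, sigma)
    rho = np.zeros_like(lam)
    inside = (lam > lm) & (lam < lp)
    rho[inside] = (Q / (2 * np.pi * sigma**2)
                   * np.sqrt((lp - lam[inside]) * (lam[inside] - lm)) / lam[inside])
    return rho
```

For Q = 5 and σ = 1 this gives λ− ≈ 0.31 and λ+ ≈ 2.09, consistent with the histograms that follow.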


Example: IID random normal returns

Here’s some R-code again:

t <- 5000
m <- 1000
h <- array(rnorm(m*t), c(m, t))  # Time series in rows
e <- h %*% t(h)/t  # Form the correlation matrix
lambdae <- eigen(e, symmetric=TRUE, only.values=TRUE)
ee <- lambdae$values
hist(ee, breaks=seq(0.01, 3.01, .02), main=NA, xlab="Eigenvalues", freq=FALSE)


Here’s the result with the Marcenko-Pastur density superimposed:

[Figure: eigenvalue histogram with the Marcenko-Pastur density superimposed; eigenvalues between 0 and 3, density between 0.0 and 1.0.]


Here's the result with M = 100, T = 500 (again with the Marcenko-Pastur density superimposed):

[Figure: eigenvalue histogram with the Marcenko-Pastur density superimposed; eigenvalues between 0 and 3, density between 0.0 and 1.0.]


...and again with M = 10, T = 50:

[Figure: eigenvalue histogram with the Marcenko-Pastur density superimposed; eigenvalues between 0 and 3, density between 0.0 and 1.0.]

We see that even for rather small matrices, the theoretical limiting density approximates the actual density very well.


Some Marcenko-Pastur densities

The Marcenko-Pastur density depends on Q = T/M. Here are graphs of the density for Q = 1 (blue), 2 (green) and 5 (red).

[Figure: Marcenko-Pastur densities for Q = 1, 2 and 5; eigenvalues between 0 and 5, density between 0.0 and 1.0.]


Distribution of the largest eigenvalue

For applications where we would like to know where the random bulk of eigenvalues ends and the spectrum of eigenvalues corresponding to true information begins, we need to know the distribution of the largest eigenvalue.

The distribution of the largest eigenvalue of a random correlation matrix is given by the Tracy-Widom law:

$$\Pr(T\,\lambda_{\max} < \mu_{TM} + s\,\sigma_{TM}) = F_1(s)$$

with

$$\mu_{TM} = \left(\sqrt{T - \tfrac{1}{2}} + \sqrt{M - \tfrac{1}{2}}\right)^2$$

$$\sigma_{TM} = \left(\sqrt{T - \tfrac{1}{2}} + \sqrt{M - \tfrac{1}{2}}\right)\left(\frac{1}{\sqrt{T - \tfrac{1}{2}}} + \frac{1}{\sqrt{M - \tfrac{1}{2}}}\right)^{1/3}.$$
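The centering and scaling constants above are easy to compute; here is a Python sketch (illustrative, with a hypothetical helper name `tw_center_scale`) that rescales the largest eigenvalue of a pure-noise sample correlation matrix into the Tracy-Widom variable s:

```python
import numpy as np

def tw_center_scale(T, M):
    """mu_TM and sigma_TM for the Tracy-Widom approximation to T * lambda_max."""
    a, b = np.sqrt(T - 0.5), np.sqrt(M - 0.5)
    mu = (a + b)**2
    sigma = (a + b) * (1 / a + 1 / b)**(1 / 3)
    return mu, sigma

# Rescale the largest eigenvalue of a correlation matrix of iid noise
rng = np.random.default_rng(1)
T, M = 500, 100
H = rng.standard_normal((M, T))
E = H @ H.T / T
lam_max = np.linalg.eigvalsh(E).max()
mu, sigma = tw_center_scale(T, M)
s = (T * lam_max - mu) / sigma  # approximately F1-distributed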


Fat-tailed random matrices

So far, we have considered matrices whose entries are either Gaussian or drawn from distributions with finite moments.

Suppose that entries are drawn from a fat-tailed distribution such as Levy-stable.

This is of practical interest because we know that stock returns follow a cubic law and so are fat-tailed.

Bouchaud et al. find that fat tails can massively increase the maximum eigenvalue in the theoretical limiting spectrum of the random matrix.

Where the distribution of matrix entries is extremely fat-tailed (Cauchy, for example), the semicircle law no longer holds.


Sampling error

Suppose we compute the sample correlation matrix of M stocks with T returns in each time series.

Further suppose that the true correlation matrix were the identity matrix. What would we expect the greatest sample correlation to be?

For N(0, 1) distributed returns, the typical maximum correlation ρmax should satisfy

$$\frac{2}{M\,(M-1)} \sim N\!\left(-\rho_{\max}\sqrt{T}\right)$$

where $N(\cdot)$ here denotes the standard normal cumulative distribution function. With M = 500, T = 1000, we obtain ρmax ≈ 0.14.

So, sampling error induces spurious (and potentially significant) correlations between stocks!
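Inverting this relation for ρmax needs only the standard normal quantile function; a sketch using the Python standard library (illustrative; `rho_max` is a hypothetical helper):

```python
from math import sqrt
from statistics import NormalDist

def rho_max(M, T):
    """Solve 2 / (M (M - 1)) = Phi(-rho * sqrt(T)) for the typical maximum correlation rho."""
    p = 2.0 / (M * (M - 1))
    return -NormalDist().inv_cdf(p) / sqrt(T)
```

rho_max(500, 1000) evaluates to about 0.14, matching the figure quoted above.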


An experiment with real data

We take 431 stocks in the SPX index for which we have 2,155 = 5 × 431 consecutive daily returns.

Thus, in this case, M = 431 and T = 2,155, so Q = T/M = 5. There are M(M − 1)/2 = 92,665 distinct entries in the correlation matrix to be estimated from 2,155 × 431 = 928,805 data points. With these parameters, we would expect the maximum error in our correlation estimates to be around 0.09.

First, we compute the eigenvalue spectrum and superimpose the Marcenko-Pastur density with Q = 5.


The eigenvalue spectrum of the sample correlation matrix

Here’s the result:

[Figure: eigenvalue histogram of the sample correlation matrix with the Marcenko-Pastur density superimposed; eigenvalues between 0 and 5, density between 0.0 and 1.5.]

Note that the top eigenvalue is 105.37 – way off the end of the chart! The next biggest eigenvalue is 18.73.


With randomized return data

Suppose we now shuffle the returns in each time series. We obtain:

[Figure: eigenvalue histogram of the correlation matrix of shuffled returns with the Marcenko-Pastur density superimposed; eigenvalues between 0 and 2.5, density between 0.0 and 1.0.]


Repeat 1,000 times and average

Repeating this 1,000 times gives:

[Figure: average eigenvalue histogram over the 1,000 shuffles with the Marcenko-Pastur density superimposed; eigenvalues between 0 and 2.5, density between 0.0 and 1.0.]


Distribution of largest eigenvalue

We can compare the empirical distribution of the largest eigenvalue with the Tracy-Widom density (in red):

[Figure: histogram of the largest eigenvalue, between 2.02 and 2.14, with the Tracy-Widom density superimposed in red.]


Interim conclusions

From this simple experiment, we note that:

Even though return series are fat-tailed:

- the Marcenko-Pastur density is a very good approximation to the density of eigenvalues of the correlation matrix of the randomized returns;
- the Tracy-Widom density is a good approximation to the density of the largest eigenvalue of the correlation matrix of the randomized returns.

The Marcenko-Pastur density does not remotely fit the eigenvalue spectrum of the sample correlation matrix, from which we conclude that there is nonrandom structure in the return data.

We may compute the theoretical spectrum arbitrarily accurately by performing numerical simulations.


Problem formulation

Which eigenvalues are significant, and how do we interpret their corresponding eigenvectors?


A hand-waving practical approach

Suppose we find the values of σ and Q that best fit the bulk of the eigenvalue spectrum. We find

σ = 0.73, Q = 2.90

and obtain the following plot:

[Figure: eigenvalue histogram with the fitted Marcenko-Pastur density superimposed; eigenvalues between 0 and 5, density between 0.0 and 1.5.]

The maximum and minimum Marcenko-Pastur eigenvalues are 1.34 and 0.09 respectively. Finiteness effects could take the maximum eigenvalue to 1.38 at the most.


Some analysis

If we are to believe this estimate, a fraction σ² = 0.53 of the variance is explained by eigenvalues that correspond to random noise. The remaining fraction, 0.47, has information.

From the plot, it looks as if we should cut off eigenvalues above 1.5 or so.

Summing the eigenvalues themselves, we find that 0.49 of the variance is explained by eigenvalues greater than 1.5.

Similarly, we find that 0.47 of the variance is explained by eigenvalues greater than 1.78.

The two estimates are pretty consistent!


More carefully: correlation matrix of residual returns

Now, for each stock, subtract the factor returns associated with the top 25 eigenvalues (λ > 1.6).

We find that σ = 1, Q = 4 gives the best fit of the Marcenko-Pastur density and obtain the following plot:

[Figure: eigenvalue histogram of the residual returns with the fitted Marcenko-Pastur density superimposed; eigenvalues between 0 and 4, density between 0.0 and 1.0.]

The maximum and minimum Marcenko-Pastur eigenvalues are 2.25 and 0.25 respectively.


Distribution of eigenvector components

If there is no information in an eigenvector, we expect the distribution of its components to be a maximum-entropy distribution.

Specifically, if we normalize the eigenvector u such that its components $u_i$ satisfy

$$\sum_{i=1}^{M} u_i^2 = M,$$

the distribution of the $u_i$ should have the limiting density

$$p(u) = \sqrt{\frac{1}{2\pi}}\,\exp\left\{-\frac{u^2}{2}\right\}.$$

Let's now superimpose the empirical distribution of eigenvector components and the zero-information limiting density for various eigenvalues.
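As an illustration (not from the original deck), the following Python sketch checks this on a pure-noise correlation matrix: a bulk eigenvector, normalized so that its squared components sum to M, has components that look N(0, 1):

```python
import numpy as np

rng = np.random.default_rng(2)
M, T = 200, 1000
H = rng.standard_normal((M, T))
E = H @ H.T / T
vals, vecs = np.linalg.eigh(E)

# Take an eigenvector from the bulk and normalize so that sum(u_i^2) = M
u = vecs[:, M // 2]
u = u * np.sqrt(M / np.sum(u**2))
```

Its sample mean is close to 0 and its sample variance close to 1, as the zero-information density predicts.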


Informative eigenvalues

Here are pictures for the six largest eigenvalues:

[Figure: component histograms, each with the zero-information density superimposed, for the eigenvectors of the six largest eigenvalues: #1 = 105.37, #2 = 18.73, #3 = 14.45, #4 = 9.81, #5 = 6.99, #6 = 6.4.]


Non-informative eigenvalues

Here are pictures for six eigenvalues in the bulk of the distribution:

[Figure: component histograms, each with the zero-information density superimposed, for six eigenvectors from the bulk: #25 = 1.62, #100 = 0.85, #175 = 0.6, #250 = 0.43, #325 = 0.29, #400 = 0.17.]


The resulting recipe

1. Fit the Marcenko-Pastur distribution to the empirical density to determine Q and σ.

2. All eigenvalues above some number λ∗ are considered informative; the remaining eigenvalues relate to noise.

3. Replace all noise-related eigenvalues λi below λ∗ with a constant and renormalize so that $\sum_{i=1}^{M} \lambda_i = M$. Recall that each eigenvalue relates to the variance of a portfolio of stocks. A very small eigenvalue means that there exists a portfolio of stocks with very small out-of-sample variance – something we probably don't believe.

4. Undo the diagonalization of the sample correlation matrix C to obtain the denoised estimate C′.
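The four steps can be sketched in a few lines of Python/NumPy (an illustration of the recipe, not a production implementation; `denoise_correlation` and `lam_star` are hypothetical names):

```python
import numpy as np

def denoise_correlation(C, lam_star):
    """Filter a sample correlation matrix: keep eigenvalues above lam_star,
    replace the rest by a constant chosen so the eigenvalues still sum to M,
    then undo the diagonalization."""
    M = C.shape[0]
    vals, vecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    noise = vals < lam_star
    if noise.any():
        vals = vals.copy()
        vals[noise] = (M - vals[~noise].sum()) / noise.sum()
    return vecs @ np.diag(vals) @ vecs.T

# Example on a pure-noise sample correlation matrix
rng = np.random.default_rng(3)
M, T = 100, 500
H = rng.standard_normal((M, T))
C = H @ H.T / T
C_prime = denoise_correlation(C, lam_star=1.5)
```

By construction the trace of C′ equals M and no eigenvalue is pathologically small; the diagonal, however, is no longer exactly one.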


An extra detail

In general, we will have $C'_{ii} \neq 1$.

We can set the diagonal elements to one by reweighting the eigenvector components.

Let D be the diagonal matrix with elements $D_i = 1/\sqrt{C'_{ii}}$. Then

$$C'' = D\, C'\, D$$

has $C''_{ii} = 1$.

Eigenvector components will be reweighted by $1/\sqrt{C'_{ii}}$.
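This rescaling is a one-liner in Python/NumPy (illustrative; the helper name is hypothetical):

```python
import numpy as np

def rescale_to_unit_diagonal(C_prime):
    """C'' = D C' D with D_i = 1 / sqrt(C'_ii), so that C''_ii = 1."""
    d = 1.0 / np.sqrt(np.diag(C_prime))
    return C_prime * np.outer(d, d)

# Tiny example with diagonal entries different from one
C_prime = np.array([[1.2, 0.3],
                    [0.3, 0.8]])
C_pp = rescale_to_unit_diagonal(C_prime)
```

The off-diagonal entries are scaled by the geometric mean of the corresponding diagonal corrections, so symmetry is preserved.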


Comparison with Barra

We might wonder how this random matrix recipe compares to Barra. For example:

- How similar are the top eigenvectors of the sample and Barra matrices?
- How similar are the eigenvalue densities of the filtered and Barra matrices?
- How do the minimum variance portfolios compare in-sample and out-of-sample?


Comparing the top eigenvector

We compare the eigenvectors corresponding to the top eigenvalue (the market components) of the sample and Barra correlation matrices:

[Scatter plot: top Barra eigenvector components (x-axis) vs. top sample eigenvector components (y-axis), both axes from 0.00 to 0.08; the point labelled NEM lies off the diagonal.]

The eigenvectors are rather similar except for Newmont (NEM), which has no weight in the sample market component.

The next four eigenvectors

The next four are:

[Four scatter plots: Barra components (x-axis) vs. sample components (y-axis) for eigenvectors #2, #3, #4 and #5.]

The first three of these are very similar but #5 diverges.

The minimum variance portfolio

We may construct a minimum variance portfolio by minimizing the variance w′ Σ w subject to ∑_i w_i = 1.

The weights in the minimum variance portfolio are given by

w_i = (∑_j σ⁻¹_ij) / (∑_{i,j} σ⁻¹_ij)

where σ⁻¹_ij are the elements of Σ⁻¹.
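The weight formula above amounts to normalizing the row sums of Σ⁻¹. A minimal sketch (our own; it solves a linear system rather than inverting Σ explicitly):

```python
import numpy as np

def min_variance_weights(Sigma):
    """Fully invested minimum variance weights:
    w_i = sum_j (Sigma^-1)_ij / sum_{i,j} (Sigma^-1)_ij."""
    ones = np.ones(Sigma.shape[0])
    x = np.linalg.solve(Sigma, ones)   # x = Sigma^{-1} 1
    return x / x.sum()                 # normalize so weights sum to 1
```

For a diagonal Σ this reduces, as expected, to weights proportional to 1/variance.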

We compute characteristics of the minimum variance portfolios corresponding to

the sample covariance matrix
the filtered covariance matrix (keeping only the top 25 factors)
the Barra covariance matrix

Comparison of portfolios

We compute the minimum variance portfolios given the sample, filtered and Barra correlation matrices respectively.

From the picture below, we see that the filtered portfolio is closer to the Barra portfolio than the sample portfolio.

[Two scatter plots of minimum variance portfolio weights: filtered vs. Barra, and sample vs. Barra; both axes run from −0.10 to 0.10.]

Consistent with the pictures, we find that the absolute position sizes (adding long and short sizes) are: Sample: 4.50; Filtered: 3.82; Barra: 3.40.

In-sample performance

In sample, these portfolios performed as follows:

[Plot of portfolio return against life of portfolio (0–2000 days); returns range from −0.5 to 1.0.]

Figure: Sample in red, filtered in blue and Barra in green.

In-sample characteristics

In-sample statistics are:

          Volatility   Max Drawdown
Sample    0.523%       18.8%
Filtered  0.542%       17.7%
Barra     0.725%       55.5%

Naturally, the sample portfolio has the lowest in-sample volatility.

Out of sample comparison

We plot minimum variance portfolio returns from 04/26/2007 to 09/28/2007.

The sample, filtered and Barra portfolio performances are in red, blue and green respectively.

[Plot of portfolio return against life of portfolio (0–100 days); returns range from −0.08 to 0.02.]

Sample and filtered portfolio performances are pretty similar, and both much better than Barra!

Out of sample summary statistics

Portfolio volatilities and maximum drawdowns are as follows:

          Volatility   Max Drawdown
Sample    0.811%       8.65%
Filtered  0.808%       7.96%
Barra     0.924%       10.63%

The minimum variance portfolio computed from the filtered covariance matrix wins according to both measures!

However, the sample covariance matrix doesn’t do too badly ...

Main result

It seems that the RMT filtered sample correlation matrix performs better than Barra.

Although our results here indicate little improvement over the sample covariance matrix from filtering, that is probably because we had Q = 5. In practice, we are likely to be dealing with more stocks (M greater) and fewer observations (T smaller).

Moreover, the filtering technique is easy to implement.

When and when not to use a factor model

Quoting from Fan, Fan and Lv:

The advantage of the factor model lies in the estimation of the inverse of the covariance matrix, not the estimation of the covariance matrix itself. When the parameters involve the inverse of the covariance matrix, the factor model shows substantial gains, whereas when the parameters involve the covariance matrix directly, the factor model does not have much advantage.

Moral of the story

Fan, Fan and Lv’s conclusion can be extended to all techniques for “improving” the covariance matrix:

In applications such as portfolio optimization where the inverse of the covariance matrix is required, it is important to use a better estimate of the covariance matrix than the sample covariance matrix.

Noise in the sample covariance estimate leads to spurious sub-portfolios with very low or zero predicted variance.

In applications such as risk management where only a good estimate of risk is required, the sample covariance matrix (which is unbiased) should be used.

Miscellaneous thoughts/observations

There are reasons to think that the RMT recipe might be robust to changes in details:

It doesn’t really seem to matter much exactly how many factors you keep. In particular, Tracy-Widom seems to be irrelevant in practice.

The better performance of the RMT correlation matrix relative to Barra probably relates to the RMT filtered matrix uncovering real correlation structure in the time series data which Barra does not capture.

With Q = 5, the sample covariance does very well, even when it is inverted. That suggests that the key is to reduce sampling error in correlation estimates.

We could perhaps increase sample size by subsampling (hourly for example).

It seems that correlations are really different at different timescales, so we can’t use tick data to get a better estimate of the one-day correlation matrix.

References

Jean-Philippe Bouchaud, Marc Mézard, and Marc Potters. Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management. Cambridge University Press, Cambridge, 2nd edition, 2003.

Jianqing Fan, Yingying Fan, and Jinchi Lv. Large dimensional covariance matrix estimation via a factor model. Journal of Econometrics, 147(1):186–197, 2008.

Giulio Biroli, Jean-Philippe Bouchaud, and Marc Potters. On the top eigenvalue of heavy-tailed random matrices. Europhysics Letters, 78(1):10001 (5pp), April 2007.

Iain M. Johnstone. High dimensional statistical inference and random matrices. In International Congress of Mathematicians I, 307–333, Eur. Math. Soc., Zürich, 2007.

Vasiliki Plerou, Parameswaran Gopikrishnan, Bernd Rosenow, Luís A. Nunes Amaral, Thomas Guhr, and H. Eugene Stanley. Random matrix approach to cross correlations in financial data. Physical Review E, 65(6):066126, June 2002.

Jean-Philippe Bouchaud and Marc Potters. Financial Applications of Random Matrix Theory: a short review. In The Oxford Handbook of Random Matrix Theory. Oxford University Press, 2011.

Eugene P. Wigner. Characteristic vectors of bordered matrices with infinite dimensions. The Annals of Mathematics, 2nd Ser., 62(3):548–564, November 1955.

Wigner’s proof

The Eigenvalue Trace Formula

For any real symmetric matrix A, there exists a unitary matrix U, consisting of the (normalized) eigenvectors of A, such that

L = U′ A U

is diagonal. The entries λ_i of L are the eigenvalues of A. Noting that L^k = U′ A^k U, it follows that the eigenvalues of A^k are λ_i^k. In particular,

Tr[A^k] = Tr[L^k] = ∑_{i=1}^{N} λ_i^k → N E[λ^k] as N → ∞.

That is, the kth moment of the distribution ρ(λ) of eigenvalues is given by

E[λ^k] = lim_{N→∞} (1/N) Tr[A^k].

Matching moments

Then, to prove Wigner’s semi-circle law, we need to show that the moments of the semicircle distribution are equal to the traces on the right hand side in the limit N → ∞.

For example, if A is a Wigner matrix,

(1/N) Tr[A] = (1/N) ∑_{i=1}^{N} a_ii → 0 as N → ∞

and 0 is the first moment of the semi-circle density.

Now for the second moment:

(1/N) Tr[A²] = (1/N) ∑_{i,j} a_ij a_ji = (1/N) ∑_{i,j} a_ij² → σ² as N → ∞.

It is easy to check that σ² is the second moment of the semi-circle density.

The third moment

(1/N) Tr[A³] = (1/N) ∑_{i,j,k} a_ij a_jk a_ki

Because the a_ij are assumed iid with zero mean, this sum tends to zero. This is true for all odd powers of A and, because the semi-circle law is symmetric, all odd moments are zero.

The fourth moment

(1/N) Tr[A⁴] = (1/N) ∑_{i,j,k,l} a_ij a_jk a_kl a_li

To get a nonzero contribution to this sum in the limit N → ∞, we must have at least two pairs of indices equal. We also get a nonzero contribution from the N cases where all four indices are equal, but that contribution goes away in the limit N → ∞. Terms involving diagonal entries a_ii also vanish in the limit. In the case k = 4, we are left with two distinct terms to give

(1/N) Tr[A⁴] = (1/N) ∑_{i,j,k,l} {a_ij a_ji a_il a_li + a_ij a_jk a_kj a_ji} → 2σ⁴ as N → ∞.

Naturally, 2σ⁴ is the fourth moment E_ρ[λ⁴] of the semi-circle density.

Higher moments

Only products of pairs of matrix elements contribute. Just as in the case of the fourth moment, there are at least a factor N fewer terms involving higher order combinations.

So we need to count how many different combinations of two elements contribute in the N → ∞ limit.

This turns out to be the number of planar (non-crossing) ways of connecting pairs of elements.

For k = 2j, the number of such combinations is given by the Catalan numbers

C_j = (2j)! / ((j+1)! j!).

Thus, for k = 2j,

lim_{N→∞} (1/N) Tr[A^k] = C_j σ^{2j},

identical to the moments of the Wigner semicircle density.
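As a quick numerical sanity check of this moment formula (our own illustration, not from the slides; we take σ = 1 and a Gaussian Wigner matrix of size N = 2000):

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(0)
N, sigma = 2000, 1.0

# Wigner matrix: symmetric, off-diagonal entries with variance sigma^2 / N
A = rng.normal(scale=sigma / np.sqrt(N), size=(N, N))
A = (A + A.T) / np.sqrt(2.0)

lam = np.linalg.eigvalsh(A)
for j in (1, 2, 3):
    catalan = factorial(2 * j) // (factorial(j + 1) * factorial(j))
    moment = np.mean(lam ** (2 * j))   # (1/N) Tr[A^{2j}]
    print(j, moment, catalan * sigma ** (2 * j))
```

At this size the empirical even moments land close to the Catalan numbers 1, 2, 5, with finite-N corrections of order 1/N.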

Remarks on Wigner’s result

The elements a_ij of A don’t need to be normally distributed. In Wigner’s original proof, a_ij = ±ν for some fixed ν.

However, we do need the higher moments of the distribution of the a_ij to vanish sufficiently rapidly. In practice, this means that if returns are fat-tailed, we need to be careful.

The Wigner semi-circle law is like a Central Limit theorem forrandom matrices.