Random Matrix Theory and Numerical Linear Algebra: A Story of Communication
Alan Edelman, Mathematics, Computer Science & AI Labs
ILAS Meeting, June 3, 2013

TRANSCRIPT

  • Slide 1
  • Random Matrix Theory and Numerical Linear Algebra: A story of communication Alan Edelman Mathematics Computer Science & AI Labs ILAS Meeting June 3, 2013
  • Slide 2 (image only; no text)
  • Slide 3 (image only; no text)
  • Slide 4 (image only; no text)
  • Slide 5
  • An Intriguing Thesis: The results and methodologies of NUMERICAL LINEAR ALGEBRA would be valuable even if computers had never been built. But I'm glad we have computers, and I'm especially grateful for the Julia computing system.
  • Slide 6
  • Eigenvalues of GOE (β=1 means real eigenvalues). Naïve way in four languages:
    MATLAB:      A=randn(n); S=(A+A')/sqrt(2*n); eig(S)
    Julia:       A=randn(n,n); S=(A+A')/sqrt(2*n); eigvals(S)
    R:           A=matrix(rnorm(n*n),ncol=n); S=(A+t(A))/sqrt(2*n); eigen(S,symmetric=T,only.values=T)$values
    Mathematica: A=RandomArray[NormalDistribution[],{n,n}]; S=(A+Transpose[A])/Sqrt[2*n]; Eigenvalues[S]
  • Slide 7
  • (Same code as Slide 6.) If for you simulation is just intuition, a check, a figure, or a student project, then software like this is written quickly and victory is declared.
  • Slide 8
  • (Same code as Slide 6.) For me, simulation is a chance to research efficiency, numerical-linear-algebra style, and this research cycles back to the mathematics!
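For comparison, the same naïve recipe can be written in Python with NumPy. Python is not one of the slide's four languages, so this is an illustrative addition rather than the author's code:

```python
import numpy as np

n = 200
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
S = (A + A.T) / np.sqrt(2 * n)   # symmetrize and normalize: semicircle on [-2, 2]
eigs = np.linalg.eigvalsh(S)     # O(n^3) time, O(n^2) storage: the "naive way"
```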
  • Slide 9
  • Tridiagonal model is more efficient (β=1: same eigenvalue distribution!) (Silverstein, Trotter; general β: Dumitriu & E., etc.). Diagonal entries g_i ~ N(0,2); off-diagonal entries are χ-distributed with parameter decreasing down the matrix. Julia: eig(SymTridiagonal(d,e)). Storage: O(n) (vs O(n^2)). Time: O(n^2) (vs O(n^3)).
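A sketch of the tridiagonal model in Python (the slide's code is Julia; the function name and the /sqrt(2n) normalization matching the earlier dense recipe are my own choices):

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

def beta_hermite_eigs(n, beta=1.0, seed=None):
    """Eigenvalues from the Dumitriu-Edelman tridiagonal beta-Hermite model:
    O(n) storage and O(n^2) time, with the same eigenvalue distribution as
    the dense symmetrized Gaussian for beta = 1 (and any beta > 0 works)."""
    rng = np.random.default_rng(seed)
    d = np.sqrt(2.0) * rng.standard_normal(n)                   # diagonal: N(0, 2)
    e = np.sqrt(rng.chisquare(beta * np.arange(n - 1, 0, -1)))  # off-diagonal: chi
    # divide by sqrt(2n) to match the slide's (A + A')/sqrt(2n) normalization
    return eigh_tridiagonal(d, e, eigvals_only=True) / np.sqrt(2 * n)
```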
  • Slide 10
  • Histogram without histogramming: Sturm sequences. Count #eigs < s: count the sign changes in the sequence det((A-s*I)[1:k,1:k]) as k runs from 1 to n. Count #eigs in [x,x+h]: take the difference of the sign-change counts at x+h and at x. Speed comparison vs. the naïve way. Mentioned in Dumitriu and E 2006; used theoretically in Albrecht, Chan, and E 2008.
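A minimal Python sketch of the Sturm-sequence count for a symmetric tridiagonal matrix (the function names and the zero-pivot guard are my own):

```python
import numpy as np

def count_eigs_below(d, e, s):
    """Number of eigenvalues < s of the symmetric tridiagonal matrix with
    diagonal d and off-diagonal e: count the negative terms among the ratios
    of consecutive leading principal minors of A - s*I (no eigensolve)."""
    count = 0
    t = d[0] - s                       # 1x1 leading principal minor
    if t < 0:
        count += 1
    for k in range(1, len(d)):
        if t == 0.0:
            t = -1e-300                # standard guard against a zero pivot
        t = (d[k] - s) - e[k - 1] ** 2 / t   # ratio of k+1 to k minor
        if t < 0:
            count += 1
    return count

def histogram_bin(d, e, x, h):
    """Number of eigenvalues in [x, x+h) without computing any eigenvalues."""
    return count_eigs_below(d, e, x + h) - count_eigs_below(d, e, x)
```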
  • Slide 11
  • A good computational trick is a good theoretical trick! Finite semicircle laws for any β! Finite Tracy-Widom laws for any β!
  • Slide 12
  • Efficient Tracy-Widom simulation. Naïve way: A=randn(n); S=(A+A')/sqrt(2*n); max(eig(S)). Better way: only create the leading 10*n^(1/3) segment of the diagonal and off-diagonal, since the Airy decay tells us the largest eigenvalue hardly depends on the rest. This led to the notion of the stochastic operator limit.
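A hedged Python sketch of the "better way" (the constant 10 is from the slide; the function name and the semicircle normalization are mine):

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

def largest_eig_fast(n, beta=2.0, seed=None):
    """Largest eigenvalue of the beta-Hermite ensemble, normalized to the
    semicircle on [-2, 2], using only the leading ~10*n^(1/3) block of the
    tridiagonal model: Airy decay means the top eigenvector barely touches
    the rest of the matrix, so the truncation error is tiny."""
    rng = np.random.default_rng(seed)
    k = int(10 * n ** (1 / 3))                    # initial segment length
    d = np.sqrt(2.0) * rng.standard_normal(k)
    e = np.sqrt(rng.chisquare(beta * (n - 1 - np.arange(k - 1))))
    return eigh_tridiagonal(d, e, eigvals_only=True)[-1] / np.sqrt(2 * n)
```

For n = 10^6 this solves a ~1000 x 1000 tridiagonal problem instead of a dense million-by-million one.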
  • Slide 13
  • Google Translate
  • Slide 14
  • Google Translate (in my dreams). Lost in translation: eigs = svds squared (the eigenvalues of A'A are the squared singular values of A).
  • Slide 15
  • The Jacobi Random Matrix Ensemble (Constantine 1963). Suppose A and B are randn(m1,n) and randn(m2,n) (iid standard normals). The eigenvalues of (A'A)(A'A+B'B)^(-1), or of its symmetric form, have a joint density of Jacobi type (formula on the slide).
  • Slide 16
  • The Jacobi ensemble geometrically: take a uniformly random n-hyperplane in R^m (m>n) and a fixed reference subspace (any dimension). The orthogonal projection of the unit ball of the random hyperplane onto the reference subspace is an ellipsoid, and the semi-axis lengths are Jacobi-distributed cosines (of the principal angles).
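The geometric picture is easy to check numerically. A small Python sketch (function name and dimensions are my own; QR of a Gaussian matrix supplies the uniformly random plane):

```python
import numpy as np

def jacobi_cosines(m, n, k, seed=None):
    """Semi-axes of the orthogonal projection of the unit ball of a
    uniformly random n-plane in R^m onto the reference subspace spanned by
    the first k coordinate vectors: the cosines of the principal angles."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((m, n)))   # orthonormal basis of a random n-plane
    return np.linalg.svd(Q[:k, :], compute_uv=False)   # singular values = cosines, in [0, 1]
```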
  • Slide 17
  • The GSVD: usual presentation vs. underlying geometry. Stack A (m1 x n) on B (m2 x n) to get [A; B] (m x n, m = m1+m2), and take the hyperplane it spans. X = span(first m1 columns of I_m); Y = span(last m2 columns of I_m).
  • Slide 18
  • GSVD mathematics (cosine, sine): any such hyperplane is the span of a matrix of the form [UC; VS] with U'U = V'V = I and C^2 + S^2 = I (diagonal). One can verify this directly by taking Jacobians, projecting out the W'dW directions we do not care about, and further understanding C (Jacobian formula on the slide).
  • Slide 19
  • Circular ensembles: a random complex unitary matrix has eigenvalues distributed on the unit circle with density proportional to the product over pairs of |e^(i*theta_j) - e^(i*theta_k)|^beta (beta = 2 for the unitary case). It is desirable to construct random matrices we can compute with for general beta (Killip and Nenciu 2004: a product-of-tridiagonals solution).
  • Slide 20
  • Numerical linear algebra: Ammar, Gragg, Reichel (1991), Hessenberg unitary matrices built from Givens-type factors G_j (formula on the slide). Recalled in Forrester and Rains (2005), which converts Killip and Nenciu to the Ammar-Gragg-Reichel format. Classical orthogonal polynomials on the unit circle! The story relates to CMV matrices; Verblunsky coefficients = Schur parameters.
  • Slide 21
  • Implementation. Needed facts: 1) the Ammar-Gragg-Reichel format; 2) generating the variables requires some thinking: |alpha_j| ~ sqrt(1-rand()^(2/(beta*j))), alpha_j = |alpha_j|*exp(2*pi*i*rand()). Remark: the authors would rightly feel that all the mathematical information is in their papers. Still, this presenter had considerable trouble gathering enough of the pieces to produce the ultimate arbiter of understanding the result: a working code! Plea: Random Matrix Theory is so valuable to so many that pseudocode or available code is a great technology-transfer mechanism.
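A Python sketch of the variable-generation step above (the formulas are from the slide; the index convention j = n-1, ..., 1 and the function name are my assumptions):

```python
import numpy as np

def verblunsky_coefficients(n, beta=2.0, seed=None):
    """Sample Verblunsky (Schur) coefficients alpha_j for the circular
    beta-ensemble using the slide's recipe:
        |alpha_j| = sqrt(1 - rand()^(2/(beta*j))), phase uniform on [0, 2*pi)."""
    rng = np.random.default_rng(seed)
    j = np.arange(n - 1, 0, -1)                                # assumed index convention
    r = np.sqrt(1.0 - rng.random(n - 1) ** (2.0 / (beta * j)))
    return r * np.exp(2j * np.pi * rng.random(n - 1))
```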
  • Slide 22
  • Julia code (similar to MATLAB etc.): really useful! (code shown on slide)
  • Slide 23
  • The Method of Ghosts and Shadows for Beta Ensembles
  • Slide 24
  • So far I have tried to hint at an introduction to ghosts. G_1 is a standard normal N(0,1). G_2 is a complex normal (G_1 + i*G_1). G_4 is a quaternion normal (G_1 + i*G_1 + j*G_1 + k*G_1). G_beta (beta > 0) seems to often work just fine: the ghost Gaussian.
  • Slide 25
  • Chi-squared. Defn: chi-squared with beta degrees of freedom is the sum of beta iid squares of standard normals if beta = 1, 2, ...; it generalizes to non-integer beta just as the gamma function interpolates the factorial. chi_beta is the square root of that sum of squares (which also generalizes; see the Wikipedia chi distribution). |G_1| is chi_1, |G_2| is chi_2, |G_4| is chi_4. So why not: |G_beta| is chi_beta? I call chi_beta the shadow of G_beta.
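The claim that chi_beta makes sense for non-integer beta is easy to check numerically; a Python sketch (the particular beta and sample size are illustrative choices of mine):

```python
import numpy as np
from math import gamma, sqrt

rng = np.random.default_rng(0)
beta = 2.5                                        # non-integer "dimension"
x = np.sqrt(rng.chisquare(beta, size=200_000))    # chi_beta = sqrt(chi-squared_beta)

# the chi distribution with k degrees of freedom has mean
# sqrt(2) * Gamma((k+1)/2) / Gamma(k/2)
exact_mean = sqrt(2) * gamma((beta + 1) / 2) / gamma(beta / 2)
```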
  • Slide 26
  • (figure only; no text)
  • Slide 27
  • Working with ghosts: extracting a real quantity (equations on slide).
  • Slide 28
  • Singular values of a ghost Gaussian matrix times a real diagonal (see related work by Forrester 2011; Dubbs, E, Koev, Venkataramana 2013).
  • Slide 29
  • The Algorithm (equations on slide)
  • Slide 30
  • Removing U and V (equations on slide)
  • Slide 31
  • Algorithm, cont. (equations on slide)
  • Slide 32
  • Completion of the Recursion (equations on slide)
  • Slide 33
  • Monte Carlo trials, parallel histograms: no tears!
  • Slide 34
  • Julia: parallel histogram of the 3rd eigenvalue, pylab plot, 8 seconds! 75 processors.
  • Slide 35
  • Linear algebra support is too limited elsewhere; Julia lets me put together what I need, e.g.: a tridiagonal eigensolver, a fast rank-one update, an arrow-matrix eigensolver. I can surgically use LAPACK without tears.
  • Slide 36
  • Weekend Julia run: histogram of the smallest singular value of randn(200), spiked by doubling the first column. Julia: A=randn(200,200); A[:,1]*=2. Reduced mathematically to bidiagonal form; used Julia's bidiagonal SVD. Ran 2.25 billion trials in 20 hours on 75 processors; a naïve serial algorithm would take 16 years.
  • Slide 37
  • (Continuing the weekend run.) I knew the limiting histogram from my thesis, universality, etc. With this kind of power, I could obtain the first-order correction for finite n! Semilogy plot of the absolute correction and a prediction: this is like having an electron microscope to see small details that would be invisible with conventional tools!
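A single trial of the experiment, sketched in Python rather than Julia (the function name is mine; this uses a naive dense SVD, whereas the slide first reduces to bidiagonal form and ran n = 200 for 2.25 billion trials):

```python
import numpy as np

def spiked_smallest_sv(n=200, seed=None):
    """One trial: smallest singular value of randn(n, n) after doubling
    ("spiking") the first column, computed the naive dense way."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    A[:, 0] *= 2.0                                   # the spike
    return np.linalg.svd(A, compute_uv=False)[-1]    # singular values come sorted descending
```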
  • Slide 38
  • Conclusion: if you already held numerical linear algebra in high esteem, then thanks to random matrix theory there are now even more reasons! Try Julia. (Google: julia)
  • Slide 39
  • (figure only; no text)
  • Slide 40
  • Wishart matrices (arbitrary covariance). G = m x n matrix of Gaussians; Sigma = positive semidefinite covariance matrix. Sigma*GG' is similar to A = Sigma^(1/2) GG' Sigma^(1/2). For beta = 1, 2, 4, the joint eigenvalue density of A has a formula, known for beta = 2 in some circles as Harish-Chandra-Itzykson-Zuber.
  • Slide 41
  • Eigenvalue density of Sigma*GG' (similar to A = Sigma^(1/2) GG' Sigma^(1/2)). Present an algorithm for sampling from this density; show how the method of Ghosts and Shadows can be used to derive this algorithm; further evidence that beta = 1, 2, 4 need not be special.
  • Slide 42
  • Scary ideas in mathematics: zero, negative, radical, irrational, imaginary... Ghosts: something like a sometimes-commutative algebra of random variables that generalizes random reals, complexes, and quaternions, and inspires theoretical results and numerical computation.
  • Slide 43
  • Did you say commutative?? Quaternions don't commute. Yes, but random quaternions do: if x and y are G_4, then x*y and y*x are identically distributed!
  • Slide 44
  • RMT densities (orthogonalized by Jack polynomials):
    Hermite:  c * prod |lambda_i - lambda_j|^beta * exp(-sum lambda_i^2 / 2)  (Gaussian ensembles)
    Laguerre: c * prod |lambda_i - lambda_j|^beta * prod lambda_i^m * exp(-sum lambda_i)  (Wishart matrices)
    Jacobi:   c * prod |lambda_i - lambda_j|^beta * prod lambda_i^m1 * (1-lambda_i)^m2  (MANOVA matrices)
    Fourier:  c * prod |e^(i*theta_j) - e^(i*theta_k)|^beta  (on the complex unit circle; circular ensembles)
  • Slide 45
  • (Repeats Slide 40: Wishart matrices with arbitrary covariance.)
  • Slide 46
  • Joint eigenvalue density of Sigma*GG': the 0F0 function is a hypergeometric function of two matrix arguments that depends only on the eigenvalues of the matrices. Formulas and software exist.
  • Slide 47
  • Generalization of Laguerre. Laguerre density versus Wishart density: the Wishart (arbitrary covariance) density is the Laguerre density with an extra 0F0 factor involving Sigma (formulas on the slide).
  • Slide 48
  • General beta? The joint density on the slide is a probability density for all beta > 0. Goals: an algorithm for sampling from this density; get a feel for the density's ghost meaning.
  • Slide 49
  • Main result: an algorithm derived from ghosts that samples eigenvalues, and a MATLAB implementation that is consistent with other beta-ized formulas, for both the largest and the smallest eigenvalue.
  • Slide 50
  • More practice with ghosts (equations on slide).
  • Slide 51
  • Bidiagonalizing with Sigma = I: Z'Z has the Sigma = I density, giving a special case of the general formula.
  • Slide 52
  • Numerical experiments, largest eigenvalue: analytic formula for the largest-eigenvalue distribution; E and Koev: software to compute it.
  • Slide 53
  • (figure only; no text)
  • Slide 54
  • (figure only; no text)
  • Slide 55
  • (figure only; no text)
  • Slide 56
  • Smallest eigenvalue as well: the cdf of the smallest eigenvalue (formula on the slide).
  • Slide 57
  • Cdfs of the smallest eigenvalue (figure).
  • Slide 58
  • Goals: a continuum of Haar measures generalizing orthogonal, unitary, symplectic; place finite random matrix theory in the same framework as infinite random matrix theory, specifically as a knob to turn down the randomness, e.g. the Airy kernel d^2/dx^2 + x + (2/sqrt(beta)) dW, with dW white noise.
  • Slide 59
  • Formally: let S_n = 2*pi^(n/2)/Gamma(n/2) = surface area of the sphere, defined for any n = beta > 0. A beta-ghost x is formally defined by a function f_x(r) such that the integral from r = 0 to infinity of f_x(r) * r^(beta-1) * S_beta dr equals 1. Note: for integer beta, x can be realized as a random spherically symmetric variable in beta dimensions. Example: a beta-normal ghost is defined by f(r) = (2*pi)^(-beta/2) * e^(-r^2/2). Example: zero is defined with constant * delta(r). Can we do algebra? Can we do linear algebra? Can we add? Can we multiply?
  • Slide 60
  • Understanding prod |lambda_i - lambda_j|^beta: define the volume element (dx)^wedge-beta by (r dx)^wedge-beta = r^beta * (dx)^wedge-beta (a beta-dimensional volume, like fractals, though no actual fractal theory appears here). Jacobians: A = Q*Lambda*Q' (symmetric eigendecomposition), so Q'*dA*Q = dLambda + (Q'dQ)*Lambda - Lambda*(Q'dQ). Then (dA)^wedge = (Q'*dA*Q)^wedge splits into a diagonal part, prod dlambda_i = (dLambda)^wedge, and a strictly-upper part, prod ((Q'dQ)_ij * (lambda_i - lambda_j))^wedge = (Q'dQ)^wedge * prod |lambda_i - lambda_j|^beta.