Efficient Gaussian Process Regression for Large Data Sets
DESCRIPTION
Efficient Gaussian Process Regression for Large Data Sets. Anjishnu Banerjee, David Dunson, Surya Tokdar. Biometrika, 2008.

TRANSCRIPT
![Page 1: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/1.jpg)
Efficient Gaussian Process Regression for large Data Sets
ANJISHNU BANERJEE, DAVID DUNSON, SURYA TOKDAR
Biometrika, 2008
![Page 2: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/2.jpg)
Introduction

We have noisy observations y_1, ..., y_n from an unknown function f, observed at locations x_1, ..., x_n respectively. The unknown function f is assumed to be a realization of a Gaussian process (GP).

The prediction for a new input x is

f̂(x) = k(x)^T K_{f,f}^{-1} y

where K_{f,f} is the n x n covariance matrix and k(x) is the vector of covariances between f(x) and the observed values.

Problem: the necessary matrix inversion is O(n^3), with n denoting the number of data points.
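A minimal sketch of this exact GP predictor, assuming a squared-exponential kernel and synthetic data (the kernel choice and all names here are illustrative, not from the paper); the O(n^3) cost sits in the linear solve:

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict_mean(X, y, x_new, noise_var=0.1):
    """Exact GP predictive mean k(x)^T K_{f,f}^{-1} y, treating K_{f,f}
    as the covariance of the noisy observations (kernel + noise)."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    k_star = rbf_kernel(np.atleast_1d(x_new), X)   # 1 x n
    alpha = np.linalg.solve(K, y)                  # the O(n^3) bottleneck
    return (k_star @ alpha).item()

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0 * np.pi, 200)
y = np.sin(X) + 0.1 * rng.standard_normal(200)
print(gp_predict_mean(X, y, np.pi / 2))  # close to sin(pi/2) = 1
```

For n in the tens of thousands, forming and solving this n x n system becomes infeasible, which motivates the low-rank methods on the following slides.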
![Page 3: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/3.jpg)
Key idea
• Existing solutions: "knots" or "landmarks" based methods
– Determining the location and spacing of knots is hard, and the choice has a substantial impact
– Methods have been proposed for allowing uncertain numbers and locations of knots in the predictive process using reversible jump
– Unfortunately, such free-knot methods increase the computational burden substantially, partially eliminating the computational savings due to a low-rank method

• Motivated by the literature on compressive sensing, the authors propose an alternative: a random projection of all the data points onto a lower-dimensional subspace
![Page 4: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/4.jpg)
Nyström Approximation (Williams and Seeger, 2000)
![Page 5: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/5.jpg)
Nyström Approximation (Williams and Seeger, 2000)
![Page 6: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/6.jpg)
Landmark based Method
• Let X* = {x*_1, ..., x*_m} be a set of m knots. Defining K_{f,f*} = cov(f(x), f(X*)) and K_{f*,f*} = cov(f(X*), f(X*)), an approximation to K_{f,f} is

K_{f,f} ≈ K_{f,f*} K_{f*,f*}^{-1} K_{f*,f}
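As a hedged illustration of this landmark approximation (the kernel, knot placement, and names are demo choices, not the authors' setup):

```python
import numpy as np

def rbf_kernel(X1, X2, ls=1.0):
    return np.exp(-0.5 * (X1[:, None] - X2[None, :]) ** 2 / ls ** 2)

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0.0, 10.0, 300))   # all n = 300 inputs
X_star = np.linspace(0.0, 10.0, 20)        # m = 20 knots X*

K_ff = rbf_kernel(X, X)                    # n x n target covariance
K_fs = rbf_kernel(X, X_star)               # K_{f, f*}
K_ss = rbf_kernel(X_star, X_star)          # K_{f*, f*}

# Low-rank landmark approximation K_{f,f*} K_{f*,f*}^{-1} K_{f*,f};
# a small jitter stabilizes the m x m inverse.
K_approx = K_fs @ np.linalg.solve(K_ss + 1e-8 * np.eye(len(X_star)), K_fs.T)

rel_err = np.linalg.norm(K_ff - K_approx) / np.linalg.norm(K_ff)
print(rel_err)  # small: the m knots summarize the smooth surface well
```

Only the m x m matrix K_{f*,f*} is inverted, so the cost drops from O(n^3) to O(n m^2); the quality, however, hinges on where the knots sit, which is exactly the sensitivity the slides point out.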
![Page 7: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/7.jpg)
Random projection method
• The key idea of random projection is to use Φf instead of f, where Φ is an m x n random projection matrix

• Writing K_{f,f}(x, x') = cov(f(x), f(x')), the covariance of the augmented vector (f, Φf) is

K_aug = [ K_{f,f}      K_{f,f} Φ^T
          Φ K_{f,f}    Φ K_{f,f} Φ^T ]

• Let Q^{RP}_{f,f} be the random projection approximation to K_{f,f}:

Q^{RP}_{f,f} = K_{f,f} Φ^T (Φ K_{f,f} Φ^T)^{-1} Φ K_{f,f}
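The projected covariances and Q^{RP}_{f,f} can be sketched numerically as follows (the kernel and dimensions are illustrative assumptions; a small jitter is added before the solve for stability):

```python
import numpy as np

def rbf_kernel(x, ls=1.0):
    return np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ls ** 2)

rng = np.random.default_rng(2)
n, m = 300, 25
X = np.sort(rng.uniform(0.0, 10.0, n))
K = rbf_kernel(X)

Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # m x n random projection

# Covariances involving the projected process Phi f:
K_pp = Phi @ K @ Phi.T          # K_{Phi f, Phi f} = Phi K Phi^T
K_pf = Phi @ K                  # K_{Phi f, f}    = Phi K

# Q^{RP}_{f,f} = K Phi^T (Phi K Phi^T)^{-1} Phi K
Q_rp = K_pf.T @ np.linalg.solve(K_pp + 1e-8 * np.eye(m), K_pf)

rel_err = np.linalg.norm(K - Q_rp) / np.linalg.norm(K)
print(rel_err)  # small: m = 25 random directions capture the smooth process
```

Unlike the landmark method, no knot locations have to be chosen: every data point contributes to each of the m projected "pseudo-observations".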
![Page 8: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/8.jpg)
Some properties

• When m = n, Φ is full rank, so Q^{RP}_{f,f} = K_{f,f} and we get back the original process with a full-rank random projection

• Relation to the Nyström approximation
– Approximations in the machine learning literature were viewed as reduced-rank approximations to the covariance matrices
– It is easy to see that Q^{RP}_{f,f} corresponds to a Nyström approximation to K_{f,f}
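A quick numerical check of the m = n property, using an arbitrary well-conditioned covariance and a square Gaussian Φ (both chosen only for this demo): when Φ is invertible, (Φ K Φ^T)^{-1} = Φ^{-T} K^{-1} Φ^{-1}, so Q^{RP}_{f,f} collapses back to K_{f,f}.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
A = rng.standard_normal((n, n))
K = A @ A.T + n * np.eye(n)        # an arbitrary well-conditioned SPD matrix

Phi = rng.standard_normal((n, n))  # m = n: almost surely invertible

# Q^{RP} = K Phi^T (Phi K Phi^T)^{-1} Phi K reduces to K for invertible Phi.
Q = K @ Phi.T @ np.linalg.solve(Phi @ K @ Phi.T, Phi @ K)

rel_err = np.linalg.norm(Q - K) / np.linalg.norm(K)
print(rel_err)  # at floating-point round-off level
```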
![Page 9: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/9.jpg)
Choice of Φ
![Page 10: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/10.jpg)
Low Distortion embeddings
• Embed the matrix K from the original n-dimensional space into a lower m-dimensional space using a random projection matrix Φ
• Embeddings with low distortion properties have been well studied, and Johnson–Lindenstrauss (J-L) transforms are among the most popular
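A small sketch of the J-L phenomenon with a scaled Gaussian projection (dimensions and names are arbitrary demo choices): pairwise squared distances are approximately preserved after projection.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, m = 100, 1000, 300           # 100 points in R^1000, projected to R^300

X = rng.standard_normal((n, d))
Phi = rng.standard_normal((m, d)) / np.sqrt(m)   # scaled Gaussian J-L map
Y = X @ Phi.T

def pairwise_sq_dists(Z):
    """All pairwise squared Euclidean distances via the Gram matrix."""
    G = Z @ Z.T
    sq = np.diag(G)
    return sq[:, None] + sq[None, :] - 2.0 * G

mask = ~np.eye(n, dtype=bool)      # ignore zero self-distances
ratios = pairwise_sq_dists(Y)[mask] / pairwise_sq_dists(X)[mask]
print(ratios.min(), ratios.max())  # concentrated around 1
```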
![Page 11: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/11.jpg)
Given a fixed rank m, what is the near-optimal projection for that rank m?
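One standard way to approach this question, shown here only as an illustrative sketch (not necessarily the authors' construction), is a randomized range finder: project K onto m random directions, orthonormalize, and compare the result against the optimal rank-m truncation from the eigendecomposition.

```python
import numpy as np

def rbf_kernel(x, ls=1.0):
    return np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ls ** 2)

rng = np.random.default_rng(5)
n, m = 400, 30
K = rbf_kernel(np.linspace(0.0, 10.0, n))

# Randomized range finder: sketch the column space of K, then orthonormalize.
Omega = rng.standard_normal((n, m))
Q, _ = np.linalg.qr(K @ Omega)     # n x m orthonormal basis for the sketch
K_rand = Q @ (Q.T @ K)             # rank-m approximation

# Best possible rank-m approximation, via the eigendecomposition.
w, V = np.linalg.eigh(K)
idx = np.argsort(w)[::-1][:m]
K_best = (V[:, idx] * w[idx]) @ V[:, idx].T

err_rand = np.linalg.norm(K - K_rand)
err_best = np.linalg.norm(K - K_best)
print(err_rand, err_best)
```

For covariance matrices with rapidly decaying spectra, as here, the randomized error is typically within a small factor of the optimal rank-m error, at a fraction of the cost of a full eigendecomposition.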
![Page 12: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/12.jpg)
Finding range for a given target error condition
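A hedged sketch of adaptive range finding in the spirit of randomized linear algebra (the block size, tolerance, and function names are assumptions for the demo): grow the basis block by block until a target relative error is met, so the rank is chosen by the error condition rather than fixed in advance.

```python
import numpy as np

def rbf_kernel(x, ls=1.0):
    return np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ls ** 2)

def adaptive_range(K, tol, block=5, rng=None):
    """Grow an orthonormal basis Q block by block until
    ||K - Q Q^T K||_F <= tol * ||K||_F."""
    if rng is None:
        rng = np.random.default_rng()
    n = K.shape[0]
    norm_K = np.linalg.norm(K)
    Q = np.empty((n, 0))
    while Q.shape[1] < n:
        Y = K @ rng.standard_normal((n, block))   # sample the range of K
        Y -= Q @ (Q.T @ Y)                        # remove directions already found
        Q = np.linalg.qr(np.hstack([Q, Y]))[0]    # re-orthonormalize
        if np.linalg.norm(K - Q @ (Q.T @ K)) <= tol * norm_K:
            break
    return Q

K = rbf_kernel(np.linspace(0.0, 10.0, 400))
Q = adaptive_range(K, tol=1e-6, rng=np.random.default_rng(6))
print(Q.shape[1])  # rank chosen adaptively for the target error
```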
![Page 13: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/13.jpg)
Results
• Conditioning numbers
– The full covariance matrix for a smooth GP tracked at a dense set of locations will be ill-conditioned and nearly rank-deficient in practice
– Inverses may be highly unstable and severely degrade inference
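This ill-conditioning is easy to reproduce (the kernel and grid are arbitrary demo choices): a smooth squared-exponential covariance on a dense grid has a huge condition number, while a small nugget, or a reduced-rank approximation, restores stability.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200)                                # dense grid
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.5 ** 2)  # smooth SE kernel

print(np.linalg.cond(K))                       # astronomically large
print(np.linalg.cond(K + 1e-6 * np.eye(200)))  # a tiny nugget tames it
```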
![Page 14: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/14.jpg)
Results
• Parameter estimation (toy data)
![Page 15: Efficient Gaussian Process Regression for large Data Sets](https://reader036.vdocuments.mx/reader036/viewer/2022081512/568162b5550346895dd33e42/html5/thumbnails/15.jpg)
Results

• Parameter estimation (real data)