
A practical introduction to Gaussian Processes for astronomy

Dan Foreman-Mackey, Flatiron Institute // dfm.io // github.com/dfm // @exoplaneteer

Resources

a gaussianprocess.org/gpml

b george.readthedocs.io

c dfm.io/gp.js

d github.com/dfm/gp

e foreman.mackey@gmail.com

1 Gaussian Processes

[Figure: screenshot from ui.adsabs.harvard.edu showing the number of published astronomy papers containing the words "gaussian process"]

Photo by Adi Goldstein on Unsplash

Don't you think you should be using Gaussian Processes?


After today, you will be.

Cool.

Today

1 Why?

2 What?

3 How?

2 The importance of correlated noise: a motivating example

[Figures: simulated data, y vs. x for x in (−5, 5); the panels show the data, the underlying truth, and a fit assuming independent noise, which goes badly wrong. BOMBCAT.GIF]

What happened?

[Figures: the same data, y vs. x, next to the true covariance matrix, indexed by datapoint n and datapoint m]

For independent Gaussian noise, the log-likelihood is

\log p(\{y_n\} \mid \theta) = -\frac{1}{2} \sum_{n=1}^{N} \left[ \frac{(y_n - m_n)^2}{\sigma_n^2} + \log(2\pi\,\sigma_n^2) \right]

or, equivalently, in matrix form:

\log p(\{y_n\} \mid \theta) = -\frac{1}{2}\, r^\mathsf{T} C^{-1} r - \frac{1}{2} \log \det C - \frac{N}{2} \log(2\pi)

where

r = \begin{pmatrix} y_1 - m_1 \\ \vdots \\ y_N - m_N \end{pmatrix}
\quad\text{and}\quad
C = \begin{pmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_N^2 \end{pmatrix}

if... we let C have non-zero off-diagonal elements*

* Note: this part has everything to do with Gaussian processes
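As a quick check (a minimal sketch, not from the slides; the numbers here are made up), the two forms give the same answer when C is diagonal:

import numpy as np

np.random.seed(42)
N = 10
y = np.random.randn(N)            # observations
m = np.zeros(N)                   # mean model evaluated at the data points
sigma = 0.1 + np.random.rand(N)   # per-point uncertainties

# sum form
ll_sum = -0.5 * np.sum((y - m)**2 / sigma**2 + np.log(2*np.pi*sigma**2))

# matrix form with C = diag(sigma^2)
r = y - m
C = np.diag(sigma**2)
ll_mat = -0.5 * (r @ np.linalg.solve(C, r)
                 + np.linalg.slogdet(C)[1]
                 + N*np.log(2*np.pi))

assert np.allclose(ll_sum, ll_mat)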

[Figure: the same data fit with a Gaussian process; the fit now tracks the truth]

👍 Excellent.

But: we don't know C.*

* we need to fit for it.

3 The math of Gaussian processes

Rasmussen & Williams: gaussianprocess.org/gpml

\log p(\{y_n\} \mid \theta, \alpha) = -\frac{1}{2}\, r_\theta^\mathsf{T} K_\alpha^{-1} r_\theta - \frac{1}{2} \log \det K_\alpha - \frac{N}{2} \log(2\pi)

θ: parameters of the mean model
α: parameters of the covariance model
r_θ: the residual vector
K_α: the covariance matrix

This is the equation for an N-dimensional Gaussian*
* hint: this is where the name comes from…

The generative model:

y \sim \mathcal{N}(m_\theta,\, K_\alpha)

with mean model m_θ and covariance matrix K_α.

[K_\alpha]_{nm} = \sigma_n^2\, \delta_{nm} + k_\alpha(x_n, x_m)

where n and m index the data points (pixels) and k_α is the "kernel" function.

for example*:

k_\alpha(x_n, x_m) = a^2 \exp\left( -\frac{(x_n - x_m)^2}{2\,\tau^2} \right)

* Note: this part has nothing to do with the name Gaussian process
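In numpy, building that covariance matrix with the example kernel takes a few lines (a sketch; the inputs x and yerr here are placeholders):

import numpy as np

def kernel(x_n, x_m, a=1.0, tau=1.0):
    # the example kernel above: k(x_n, x_m) = a^2 exp(-(x_n - x_m)^2 / (2 tau^2))
    return a**2 * np.exp(-0.5*(x_n - x_m)**2 / tau**2)

x = np.linspace(-5, 5, 50)       # placeholder inputs
yerr = 0.1 * np.ones_like(x)     # placeholder uncertainties

# [K]_{nm} = sigma_n^2 delta_{nm} + k(x_n, x_m)
K = kernel(x[:, None], x[None, :])
K[np.diag_indices_from(K)] += yerr**2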

[Figure series: for several kernel settings, the kernel value vs. Δx = x_n − x_m (bottom panels, 0 to 10) next to samples drawn from the corresponding GP (top panels); the annotations highlight the "scale", which sets the correlation length of the samples, and the "amplitude", which sets their spread]
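To reproduce the gist of these panels (a sketch, assuming the example kernel above; the settings are arbitrary), draw samples from N(0, K):

import numpy as np

x = np.linspace(0, 10, 200)
for a, tau in [(1.0, 0.5), (1.0, 2.0), (2.0, 2.0)]:
    K = a**2 * np.exp(-0.5*(x[:, None] - x[None, :])**2 / tau**2)
    K[np.diag_indices_from(K)] += 1e-10   # jitter for numerical stability
    # each row is one function drawn from the GP prior:
    # larger tau -> smoother draws; larger a -> larger excursions
    samples = np.random.multivariate_normal(np.zeros_like(x), K, size=3)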

\log p(\{y_n\} \mid \theta, \alpha) = -\frac{1}{2}\, r_\theta^\mathsf{T} K_\alpha^{-1} r_\theta - \frac{1}{2} \log \det K_\alpha - \frac{N}{2} \log(2\pi)

a drop-in replacement for χ²

import numpy as np

def gp_log_like(params, x, y, yerr):
    # squared-exponential kernel: params[0] is the amplitude a,
    # params[1] is the scale tau
    K = params[0]**2 * np.exp(-0.5*(x[:, None] - x[None, :])**2/params[1]**2)
    # add the measurement variances along the diagonal
    K[np.diag_indices_from(K)] += yerr**2
    # note: assumes a zero mean model (y is the residual vector) and
    # drops the constant N/2 log(2 pi) term
    ll = np.dot(y, np.linalg.solve(K, y))
    ll += np.linalg.slogdet(K)[1]
    return -0.5*ll

A fully functional GP implementation in Python

+ scipy.optimize or emcee
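For example, maximum likelihood with scipy (a sketch using gp_log_like from above; the data here are synthetic, just for illustration):

import numpy as np
from scipy.optimize import minimize

# synthetic data
x = np.sort(np.random.uniform(-5, 5, 50))
yerr = 0.1 * np.ones_like(x)
y = np.sin(x) + yerr * np.random.randn(50)

# minimize the negative log-likelihood over (amplitude, scale)
result = minimize(lambda p: -gp_log_like(p, x, y, yerr), x0=[1.0, 1.0])
print(result.x)   # maximum-likelihood (amplitude, scale)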

Using GPs in Python

1 george

2 scikit-learn

3 GPy

4 PyMC3

5 etc.

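For instance, the same squared-exponential model in george looks something like this (a sketch based on the george docs; x, y, yerr are the data arrays from the sketch above):

import george
from george import kernels

# amplitude^2 times a squared-exponential kernel with scale tau^2
kernel = 1.0**2 * kernels.ExpSquaredKernel(metric=1.0**2)
gp = george.GP(kernel)
gp.compute(x, yerr)           # factorize K for the observed inputs
print(gp.log_likelihood(y))   # the GP log-likelihood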

4 The problems with Gaussian processes

\log p(\{y_n\} \mid \theta, \alpha) = -\frac{1}{2}\, r_\theta^\mathsf{T} K_\alpha^{-1} r_\theta - \frac{1}{2} \log \det K_\alpha - \frac{N}{2} \log(2\pi)

a Choosing the kernel

b Scaling to large datasets

[Figure: computational cost [s] vs. number of datapoints, 10² to 10⁴, log-log; the measured cost ("experiment") of the direct GP likelihood scales as O(N³)]

a Approximation

b Structure

Approximation methods

i Subsample

ii Sparsity

iii Low-rank approximations (see HODLR solver in george)

iv etc.

Structured models

i Kronecker products

ii Evenly sampled data

iii Semi-separable kernels (see celerite)

iv etc.

celerite.readthedocs.io
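A minimal celerite sketch (assuming its "terms" interface; the SHOTerm parameters are arbitrary, and x, y, yerr are as before):

import celerite
from celerite import terms

# a stochastically driven, damped harmonic oscillator term as the kernel
kernel = terms.SHOTerm(log_S0=0.0, log_Q=0.0, log_omega0=0.0)
gp = celerite.GP(kernel)
gp.compute(x, yerr)           # O(N) factorization
print(gp.log_likelihood(y))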

[Figures: celerite benchmarks. Left: computational cost [seconds] vs. number of data points N, 10² to 10⁵, for J = 1, 2, 4, …, 256 terms; the cost scales as O(N), compared with the direct solver. Right: cost vs. number of terms J, 10⁰ to 10², for N = 64 to 262144; the cost scales as O(J²)]

5 Recap

if there is correlated noise (instrumental or astrophysical) in your data*, try a Gaussian process:

\log p(\{y_n\} \mid \theta, \alpha) = -\frac{1}{2}\, r_\theta^\mathsf{T} K_\alpha^{-1} r_\theta - \frac{1}{2} \log \det K_\alpha - \frac{N}{2} \log(2\pi)

* Hint: there is!

to do this in Python, try:

import george   # george.readthedocs.io

