
A Video from Google DeepMind

http://www.nature.com/nature/journal/v518/n7540/fig_tab/nature14236_SV2.html

Can machine learning teach us cluster updates?

Lei Wang Institute of Physics, CAS https://wangleiphy.github.io

Li Huang and LW, 1610.02746; LW, 1702.08586

Local vs Cluster algorithms

The local update is slower than the cluster update.

Algorithmic innovation outperforms Moore’s law!

Recommender Systems

Learn preferences

Recommendations


Restricted Boltzmann Machine

Energy based model:

E(x, h) = -Σ_{i=1}^{N} a_i x_i - Σ_{j=1}^{M} b_j h_j - Σ_{i=1}^{N} Σ_{j=1}^{M} x_i W_{ij} h_j

x ∈ {0, 1}^N (visible units x_1, x_2, …, x_N), h ∈ {0, 1}^M (hidden units h_1, h_2, …, h_M)

Probabilities:

p(x, h) ∼ e^{-E(x, h)}
p(x) = Σ_h p(x, h)
p(h|x) = p(x, h) / p(x)

Universal approximator of probability distributions: Freund and Haussler, 1989; Le Roux and Bengio, 2008

Smolensky 1986; Hinton and Sejnowski 1986
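The bipartite energy above is easy to evaluate directly, and because the sum over h factorizes across hidden units, the marginal p(x) has a closed form too. A minimal NumPy sketch (the array shapes are illustrative assumptions):

```python
import numpy as np

# RBM energy E(x, h) = -a.x - b.h - x.W.h for binary units.
# Assumed shapes: a (N,), b (M,), W (N, M); x in {0,1}^N, h in {0,1}^M.
def rbm_energy(x, h, a, b, W):
    return -(a @ x) - (b @ h) - x @ W @ h

def joint_unnormalized(x, h, a, b, W):
    # p(x, h) ~ exp(-E(x, h)), up to the partition function Z
    return np.exp(-rbm_energy(x, h, a, b, W))

def log_prob_unnormalized(x, a, b, W):
    # Summing p(x, h) over all h factorizes over hidden units:
    # ln p(x) = a.x + sum_j ln(1 + exp(b_j + sum_i x_i W_ij)) - ln Z
    return a @ x + np.sum(np.logaddexp(0.0, b + x @ W))
```

Summing the joint weight over all 2^M hidden configurations reproduces exp(log_prob_unnormalized) exactly, which is a handy correctness check for small M.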

Generative Learning

Learning: fit the model to the target distribution, p(x) ∼ π(x) (Hinton 2002). Learn the model from data.

Generating: blocked Gibbs sampling. Generate more data from the learned model.

[Diagram: blocked Gibbs chain x → h → x′, sampling p(h|x) then p(x′|h); and the guided chain x → h → h′ → x′, sampling p(h|x), accepting h′ with min[1, p(h′)/p(h)], then sampling p(x′|h′)]
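Because the RBM is bipartite, both conditionals factorize into independent logistic units, which is what makes blocked Gibbs sampling cheap. A sketch, with the same assumed array shapes as before:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# p(h_j = 1 | x) = sigmoid(b_j + sum_i x_i W_ij)
def sample_h_given_x(x, b, W, rng):
    return (rng.random(b.shape) < sigmoid(b + x @ W)).astype(float)

# p(x_i = 1 | h) = sigmoid(a_i + sum_j W_ij h_j)
def sample_x_given_h(h, a, W, rng):
    return (rng.random(a.shape) < sigmoid(a + W @ h)).astype(float)

def blocked_gibbs(x0, a, b, W, n_steps, rng):
    # One block step is x -> h -> x'; iterating draws samples from p(x)
    x = x0.copy()
    for _ in range(n_steps):
        h = sample_h_given_x(x, b, W, rng)
        x = sample_x_given_h(h, a, W, rng)
    return x
```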

MNIST database of handwritten digits

Learned weights W_ij of an RBM with 784 visible units (x_1, x_2, …, x_N) and 100 hidden units (h_1, h_2, …, h_M).

Theano deep learning tutorial: http://www.deeplearning.net/tutorial/rbm.html#rbm

Which one is written by a human?

Good Generative Model? MNIST Handwritten Digit Dataset

Challenge

Salakhutdinov, 2015 Deep learning summer school

Idea

Model the QMC data with an RBM, then sample from the RBM

Li Huang and LW, 1610.02746

cf. Liu, Qi, Meng, Fu, 1610.03137; Liu, Shen, Qi, Meng, Fu, 1611.09364

Xu, Qi, Liu, Fu, Meng,1612.03804

Why is it useful?

• When the fitting is perfect, we can completely bypass the "quantum" part of the QMC
• Even with an imperfect fitting, the RBM can still guide the QMC sampling
• Bonus: it is also fun to see what it discovers for the physical model

RBM is a recommender system for the QMC simulations

cf. Torlai, Melko, 1606.02718

Learned Weight

of an 8×8 Falicov-Kimball model

Accept or not?

The art of Monte Carlo methods:

A(x → x′) = min[1, (T(x′ → x) / T(x → x′)) · (π(x′) / π(x))]

Sample the RBM, and propose the move to QMC.

Detailed balance condition for the RBM: T(x′ → x) / T(x → x′) = p(x) / p(x′)

Hence

A(x → x′) = min[1, (p(x) / p(x′)) · (π(x′) / π(x))]

Li Huang and LW, 1610.02746
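In practice the acceptance test is evaluated in log space for numerical stability. In this sketch, log_p and log_pi are hypothetical callables returning unnormalized log-probabilities of the RBM proposal and the physical target; any constant normalization cancels in the differences:

```python
import numpy as np

# A(x -> x') = min[1, p(x)/p(x') * pi(x')/pi(x)], evaluated in log space.
def accept_recommended(x, x_new, log_p, log_pi, rng):
    log_A = (log_p(x) - log_p(x_new)) + (log_pi(x_new) - log_pi(x))
    return np.log(rng.random()) < min(0.0, log_A)
```

When the RBM fits the target perfectly (p ∝ π), log_A is identically zero and every recommended move is accepted.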

Acceptance rate of recommended update

R. M. Neal, Bayesian learning for neural networks, 1996. J. S. Liu, Monte Carlo strategies in scientific computing, 2008.

S. Zhang, Auxiliary-field QMC for correlated electron systems, 2013

“surrogate function”

“force bias”

RBM Physical model

Can Boltzmann Machines Discover Cluster Updates?

LW, 1702.08586

Question

Cluster Update in a Nutshell (Swendsen and Wang, 1987)

• Sample auxiliary bond variables
• Build clusters
• Flip clusters randomly

Rejection free cluster update!

Swendsen-Wang ↔ Boltzmann Machine

• Ising spins ↔ visible units
• Auxiliary bond variables ↔ hidden units
• Build clusters ↔ sample h given s
• Flip clusters ↔ sample s given h

[Diagram: visible units s_i = ±1 (Ising spins), hidden units h_ℓ ∈ {0, 1} (bond variables); sample p(h|s), then flip clusters]

Boltzmann Machine for cluster update

E(s, h) = -Σ_ℓ (W s_{iℓ} s_{jℓ} + b) h_ℓ

where s_{iℓ} and s_{jℓ} are the two spins on edge ℓ, s ∈ {-1, +1}^|Vertex|, h ∈ {0, 1}^|Edge|. An active hidden unit h_ℓ = 1 links s_{iℓ} and s_{jℓ}; an inactive one (h_ℓ = 0) breaks the link.

Cluster Update in the BM language

• Sample hidden variables given visible units
• Inactive hidden units break the links
• Randomly flip clusters of visible units
• Voila!

Boltzmann Machine for cluster update

E(s, h) = -Σ_ℓ (W s_{iℓ} s_{jℓ} + b) h_ℓ, with s ∈ {-1, +1}^|Vertex| and h ∈ {0, 1}^|Edge|

• Rejection free Monte Carlo updates when (1 + e^{b+W}) / (1 + e^{b-W}) = e^{2βJ}
• Encompasses general cluster algorithm frameworks: Niedermayer, 1988; Kandel and Domany, 1991; Kawashima and Gubernatis, 1995
• In general, with "features" F_α(s): E(s, h) = E(s) - Σ_α [W_α F_α(s) + b_α] h_α
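Under the rejection-free condition above, standard Swendsen-Wang is recovered in the limit W → ∞ with b + W = ln(e^{2βJ} - 1): antiparallel bonds are never activated, parallel bonds with probability 1 - e^{-2βJ}. A sketch of one sweep on a periodic L×L Ising lattice (the union-find cluster bookkeeping is an implementation choice, not part of the slides):

```python
import numpy as np

# One Swendsen-Wang sweep phrased as Boltzmann-machine sampling:
# each bond (hidden) unit is drawn from p(h_l = 1 | s), then whole
# clusters of visible units are flipped at random.
def find(parent, i):
    # union-find with path halving
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def sw_sweep(s, beta_J, rng):
    L = s.shape[0]
    p_bond = 1.0 - np.exp(-2.0 * beta_J)  # p(h=1|s) for parallel spins
    parent = np.arange(L * L)
    for x in range(L):
        for y in range(L):
            i = x * L + y
            for nx, ny in ((x + 1) % L, y), (x, (y + 1) % L):
                j = nx * L + ny
                if s[x, y] == s[nx, ny] and rng.random() < p_bond:
                    parent[find(parent, i)] = find(parent, j)  # activate bond
    flip = rng.random(L * L) < 0.5  # flip each cluster with prob 1/2
    for x in range(L):
        for y in range(L):
            if flip[find(parent, x * L + y)]:
                s[x, y] *= -1
    return s
```

At strong coupling the whole lattice merges into a single cluster, so the sweep either flips everything or nothing, which is a simple sanity check.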

• Boltzmann Machines are learnable so they can adapt to various physical problems

• Boltzmann Machines parametrize Monte Carlo policies which can be optimized for efficiency

• The hidden units learn to play smart roles: Fortuin-Kasteleyn and Hubbard-Stratonovich transformations

[Diagram: Learn ↔ Generate loop between the Boltzmann Machine and the physical model]

Plaquette Ising model

"However, for K ≠ 0, NO simple and efficient global update method is known." — 1610.03137

Decouple the K-term with a hidden unit per plaquette:

e^{-K s_1 s_2 s_3 s_4} → Σ_h e^{W (s_1 s_2 + s_3 s_4) h}

• Given s, sample h on each plaquette independently
• Given h, the Boltzmann Machine is an ordinary Ising model with modulated interactions, which can be sampled efficiently
• Rejection free cluster update!

LW, 1702.08586
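The first bullet above follows directly from the plaquette weight e^{W(s_1 s_2 + s_3 s_4)h} with h ∈ {0, 1}: each hidden unit's conditional is a logistic function of its plaquette feature. A sketch of that step; the periodic layout and the corner pairing below are illustrative assumptions:

```python
import numpy as np

# Step one of the plaquette update: given spins s, each plaquette's
# hidden unit is conditionally independent with
# p(h_p = 1 | s) = sigmoid(W * (s1*s2 + s3*s4)).
def sample_plaquette_hidden(s, W, rng):
    # s: (L, L) array of +/-1 spins on a periodic square lattice;
    # one plaquette per site, with the assumed corner pairing
    # (s1, s2) horizontal and (s3, s4) the shifted pair.
    s1 = s
    s2 = np.roll(s, -1, axis=0)
    s3 = np.roll(s, -1, axis=1)
    s4 = np.roll(np.roll(s, -1, axis=0), -1, axis=1)
    feature = s1 * s2 + s3 * s4          # takes values in {-2, 0, +2}
    p = 1.0 / (1.0 + np.exp(-W * feature))
    return (rng.random(s.shape) < p).astype(int)
```

Given the sampled h, the remaining model is pairwise in s (the "ordinary Ising model with modulated interactions" of the second bullet) and can be updated with a cluster algorithm.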

Machine learning for many-body physics

• New tools for (quantum) many-body problems
• Moreover, it offers a new way of thinking
• Can we make new scientific discoveries with it?
• Can one design better algorithms with it?

LW, 1606.00318; Li Huang and LW, 1610.02746; Li Huang, Yi-feng Yang and LW, 1612.01871; LW, 1702.08586

The fun just starts!

Quantum many-body physics for ML


MNIST database vs. random images, from the "Deep Learning" book by Goodfellow, Bengio, Courville: https://www.deeplearningbook.org/

Quantum entanglement perspective on deep learning

Jing Chen, Song Cheng, Haidong Xie, LW, and Tao Xiang, 1701.04831

Dong-Ling Deng, Xiaopeng Li and S. Das Sarma, 1701.04844

Xun Gao, L.-M. Duan, 1701.05039; Y. Huang and J. E. Moore, 1701.06246

A(x → x′) = min[1, (p(x) / p(x′)) · (π(x′) / π(x))]

RBM ↔ Physical model, e.g. the Falicov-Kimball (classical field + fermions) model

For the Falicov-Kimball target,

ln π(x) = -(U/2) Σ_{i=1}^{N} x_i + ln det[1 + e^{-βH(x)}], with H_{ij}(x) = K_{ij} + δ_{ij} U (x_i - 1/2),

while for the RBM,

ln p(x) = Σ_{i=1}^{N} a_i x_i + Σ_{j=1}^{M} ln(1 + e^{b_j + Σ_{i=1}^{N} x_i W_{ij}}),

with x ∈ {0, 1}^N (visible units x_1, …, x_N) and h ∈ {0, 1}^M (hidden units h_1, …, h_M).

Train the RBM so that

Σ_{i=1}^{N} a_i x_i + Σ_{j=1}^{M} ln(1 + e^{b_j + Σ_{i=1}^{N} x_i W_{ij}}) = -(U/2) Σ_{i=1}^{N} x_i + ln det[1 + e^{-βH}]
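The fit equates two very different-looking functions of x, and both sides can be evaluated with standard linear algebra. The eigenvalue route for the fermion determinant below is a numerically safe implementation choice; K, U, β and the array shapes are illustrative assumptions:

```python
import numpy as np

# Target: ln pi(x) = -(U/2) sum_i x_i + ln det[1 + e^{-beta H(x)}],
# with H_ij(x) = K_ij + delta_ij U (x_i - 1/2), K a symmetric hopping matrix.
def log_pi_physical(x, K, U, beta):
    H = K + np.diag(U * (x - 0.5))
    eps = np.linalg.eigvalsh(H)  # spectrum of H(x)
    # ln det[1 + e^{-beta H}] = sum_k ln(1 + e^{-beta eps_k})
    return -0.5 * U * np.sum(x) + np.sum(np.logaddexp(0.0, -beta * eps))

# RBM side: ln p(x) = a.x + sum_j ln(1 + e^{b_j + sum_i x_i W_ij}), up to ln Z
def log_p_rbm(x, a, b, W):
    return a @ x + np.sum(np.logaddexp(0.0, b + x @ W))
```

Training tunes a, b, W so that log_p_rbm tracks log_pi_physical on sampled configurations, i.e. a supervised fit of the free energy F(x).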

Supervised learning of the RBM

[Diagram: network mapping inputs x_1, x_2, …, x_N to the free energy F(x)]

p(x) ∼ π(x)
