
Advances in Neural Turing Machines

Truyen Tran
Deakin University

Aug 2018 (slides dated 10/09/2018)

@truyenoz

truyentran.github.io

[email protected]

letdataspeak.blogspot.com

goo.gl/3jJ1O0

Source: rdn consulting


https://twitter.com/nvidia/status/1010545517405835264


(Real) Turing machine


It is possible to invent a single machine which can be used to compute any computable sequence. If this machine U is supplied with the tape on the beginning of which is written the string of quintuples separated by semicolons of some computing machine M, then U will compute the same sequence as M.

Wikipedia
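The idea in the quote, one machine that reads the description of another machine and then behaves like it, can be sketched in a few lines. This is a minimal illustration, not Turing's construction: the function name run_turing_machine, the quintuple layout (state, read symbol, write symbol, move, next state), the blank symbol "_" and the bit-flipping example machine are all assumptions made for this sketch.

```python
from collections import defaultdict


def run_turing_machine(quintuples, tape, start="q0", halt="halt",
                       blank="_", max_steps=1000):
    """Simulate a machine M described by quintuples on the given tape string.

    Each quintuple is (state, read_symbol, write_symbol, move, next_state),
    with move being "L" or "R". The simulator itself never changes; only the
    description of M does.
    """
    # Index the description of M by (state, symbol under the head).
    table = {(s, r): (w, m, n) for (s, r, w, m, n) in quintuples}
    # Sparse tape: unvisited cells read as blank.
    cells = defaultdict(lambda: blank, enumerate(tape))
    state, head = start, 0
    for _ in range(max_steps):
        if state == halt:
            break
        write, move, state = table[(state, cells[head])]
        cells[head] = write
        head += 1 if move == "R" else -1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip(blank)


# Hypothetical example machine: invert every bit, halt at the first blank.
FLIP_BITS = [
    ("q0", "0", "1", "R", "q0"),
    ("q0", "1", "0", "R", "q0"),
    ("q0", "_", "_", "R", "halt"),
]
```

Calling run_turing_machine(FLIP_BITS, "1011") returns "0100". Passing a different quintuple list changes the simulated machine without touching the simulator, which is the sense in which one machine can stand in for any other.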


Can we learn from data a model that is as powerful as a Turing machine?


Agenda

Neural Turing machine (NTM)

Dual-view in sequences (KDD’18)

Bringing variability in output sequences (NIPS’18)

Bringing relational structures into memory (ICPR’18+)

Looking ahead (ACL’19, KDD’19, CVPR’19, ICML’19, NIPS’19 ?)

Let’s review current offerings

Feedforward nets (FFN)

Recurrent nets (RNN)

Convolutional nets (CNN)

Message-passing graph nets (MPGNN)

Universal transformer

…..

Work surprisingly well on LOTS of important problems

Enter the age of differentiable programming

BUT …

No storage of intermediate results

Little choice over what to compute and what to use

Little support for complex chained reasoning

Little support for rapid switching of tasks


Searching for better priors

Translation invariance in CNN

Recurrence in RNN

Permutation invariance in attention and graph neural networks

Memory for complex computation

Memory-augmented neural networks (MANN)

(LeCun, 2015)

What is missing? A memory

Use multiple pieces of information

Store intermediate results (RAM like)

Episodic recall of previous tasks (Tape like)

Encode/compress & generate/decompress long sequences

Learn/store programs (e.g., fast weights)

Store and query external knowledge

Spatial memory for navigation


Rare but important events (e.g., snake bite)

Needed for complex control

Short-cuts for ease of gradient propagation = constant path length

Division of labour: program, execution and storage

Working memory is an indicator of IQ in humans

Example: Code language model

Still needs a better memory for:

Repetitiveness, e.g., for (int i = 0; i < n; i++)

Localness, e.g., "for (int size" may appear more often than "for (int i" in some source files.

Very long sequence (big file, or char level)
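One simple way to picture the localness point is a cache that is mixed into a global language model. The sketch below is only an illustration under stated assumptions, not the model from the talk: the class name CacheUnigramLM, the unigram simplification and the cache_weight mixing parameter are all hypothetical.

```python
from collections import Counter


class CacheUnigramLM:
    """Toy unigram model: global counts mixed with a per-file cache."""

    def __init__(self, corpus_tokens, cache_weight=0.5):
        self.global_counts = Counter(corpus_tokens)
        self.global_total = len(corpus_tokens)
        self.cache = Counter()            # tokens seen so far in the current file
        self.cache_weight = cache_weight  # assumed mixing weight (hypothetical)

    def observe(self, token):
        """Record a token from the file currently being read."""
        self.cache[token] += 1

    def prob(self, token):
        """Interpolate the global estimate with the local cache estimate."""
        p_global = self.global_counts[token] / self.global_total
        cache_total = sum(self.cache.values())
        if cache_total == 0:
            return p_global
        p_cache = self.cache[token] / cache_total
        return (1 - self.cache_weight) * p_global + self.cache_weight * p_cache


# Globally "i" is the common loop variable, but this file keeps using "size".
lm = CacheUnigramLM(["i"] * 8 + ["size"] * 2)
before = lm.prob("size")
for _ in range(3):
    lm.observe("size")
after = lm.prob("size")
```

Here the probability of "size" rises from 0.2 to 0.6 once the cache has seen it three times in the current file. A learned external memory plays a similar role, without the hand-set mixing weight.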


Example: Electronic medical records

Three interwoven processes:

Disease progression

Interventions & care processes

Recording rules

Source: medicalbillingcodings.org

[Figure: timeline of visits/admissions separated by time gaps, ending at a prediction point; stages labelled Abstraction and Modelling]

Need memory to handle thousands of events


Conjecture: Healthcare is Turing computational

Healthcare processes as executable computer programs obeying hidden “grammars”

The “grammars” are learnable through observational data

With “generative grammars”, entire health trajectory can be simulated.


Get sick

See doctor

Enjoy life


Neural Turing machine (NTM)


RNN: theoretically powerful, practically limited

Classification

Image captioning

Sentence classification

Neural machine translation

Sequence labelling

Source: http://karpathy.github.io/assets/rnn/diags.jpeg

Neural Turing machine (NTM)

A controller that takes input/output and talks to an external memory module.

Memory has read/write operations.

The main issue is where to write, and how to update the memory state.

All operations are differentiable.

https://rylanschaeffer.github.io/content/research/neural_turing_machine/main.html
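The read and write operations can be made concrete with a small NumPy sketch. It follows the content-based addressing and erase/add write described in the original NTM paper (Graves et al., 2014), but leaves out location-based addressing (interpolation, shift, sharpening) and the controller network. The function names and the parameters key, beta, erase and add are naming assumptions for this illustration.

```python
import numpy as np


def content_addressing(memory, key, beta):
    """Soft attention over memory rows: softmax of beta * cosine similarity."""
    eps = 1e-8
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    scores = beta * sim
    scores = scores - scores.max()  # subtract the max for numerical stability
    w = np.exp(scores)
    return w / w.sum()


def read(memory, w):
    """Read vector: attention-weighted sum of memory rows."""
    return w @ memory


def write(memory, w, erase, add):
    """Erase then add, both scaled per row by the attention weights."""
    memory = memory * (1.0 - np.outer(w, erase))
    return memory + np.outer(w, add)


# Toy memory with 3 slots of width 3 (values chosen only for illustration).
M = np.eye(3)
w = content_addressing(M, key=M[1], beta=50.0)   # focus sharply on slot 1
r = read(M, w)
M_new = write(M, w, erase=np.ones(3), add=np.array([1.0, 2.0, 3.0]))
```

Every step is a smooth function of the memory, key and weights, so gradients can flow through both where the machine looks and what it stores. With a large beta the weights approach a one-hot selection; with a small beta the read blends several slots.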


NTM operations

https://medium.com/@aidangomez/the-neural-turing-machine-79f6e806c0a1

https://rylanschaeffer.github.io


NTM unrolled in time with LSTM as controller

#Ref: https://medium.com/snips-ai/ntm-lasagne-a-library-for-neural-turing-machines-in-lasagne-2cdce6837315


Differentiable neural computer (DNC)


Source: deepmind.com

#REF: Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature 538.7626 (2016): 471-476.

https://rylanschaeffer.github.io

NTM (2014), DNC (2016)

Dual-view sequential problems

Hung Le, Truyen Tran & Svetha Venkatesh

KDD’18

Synchronous two-view sequential learning

[Figure: visual and speech views aligned over time steps 1, 2, 3, 4]

Asynchronous two-view sequential learning

Healthcare: medicine prescription

[Figure: diagnoses (E11, I10, N18; Z86, E11) and procedures (1916, 1910; 1952, 1893) as the two input views; medicines (DOCU100L, ACET325) as output]

Asynchronous two-view sequential learning

Healthcare: disease progression

[Figure: previous diagnoses (E11, I10, N18; Z86, E11) and previous interventions (1916, 1910; DOCU100L, ACET325) as input; future diagnoses (???) as output]

Intra-view & inter-view interactions

output


Dual architecture

Dual Memory Neural Computer (DMNC). There are two encoders and one decoder implemented as LSTMs. The dashed arrows represent cross-memory access in early-fusion mode.

Intra-interaction

Inter-interaction

Long-term dependencies

#Ref: Le, Hung, Truyen Tran, and Svetha Venkatesh. "Dual Memory Neural Computer for Asynchronous Two-view Sequential Learning." KDD’18.

Accuracy

Learning curve

Simple sum, but distant, asynchronous

[Figure: Medicine prescription performance (data: MIMIC-III). Metrics: AUC, F1, P@1, P@2, P@3 (axis ticks 70 to 90). Methods: LSTM, DNC, WLAS, DMNC.]

[Figure: Disease progression performance (data: MIMIC-III). Methods: DMNC, WLAS, DeepCare. Vertical axis from 44 to 69.]

Bar labels, one row per series:

P@1 Diabetes   P@2 Diabetes   P@3 Diabetes   P@1 Mental   P@2 Mental   P@3 Mental
67.6           61.3           57             53.6         50           47.1
65.9           60.8           56.5           51.8         48.9         45.7
66.2           59.6           53.7           52.7         49.4         46.2
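For readers unfamiliar with the P@k metric in these charts, the sketch below shows one common definition: the fraction of the k highest-scored items that are truly relevant. The exact variant used in the reported experiments is not stated on the slides, so treat this as an assumption. The function name precision_at_k and the toy scores are illustrative.

```python
import numpy as np


def precision_at_k(scores, true_labels, k):
    """Fraction of the k top-scored item indices that appear in true_labels."""
    top_k = np.argsort(-np.asarray(scores, dtype=float))[:k]
    hits = sum(1 for i in top_k if int(i) in true_labels)
    return hits / k


# Toy example: four candidate codes, of which codes 1 and 2 are correct.
scores = [0.1, 0.9, 0.3, 0.8]
truth = {1, 2}
p1 = precision_at_k(scores, truth, 1)
p2 = precision_at_k(scores, truth, 2)
p3 = precision_at_k(scores, truth, 3)
```

In this toy case p1 is 1.0, p2 is 0.5 and p3 is 2/3. Some papers instead divide by min(k, number of true labels), which gives different values when few labels are correct.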


Bringing variability in output sequences

Hung Le, Truyen Tran & Svetha Venkatesh

NIPS’18

Motivation: Dialog system

A dialog system needs to maintain the chat history (which could span hours) → memory is needed.

Response generation needs to be flexible, adapting to variations in mood and style. Current techniques are mostly based on LSTM, leading to “stiff” default responses (e.g., “I see”).

There are many ways to express the same thought → variational generative methods are needed.

Variational Auto-Encoder (VAE) (Kingma & Welling, 2014)

Two separate processes: generative (hidden → visible) versus recognition (visible → hidden)

http://kvfrans.com/variational-autoencoders-explained/

Gaussian hidden variables

Data

Generative net

Recognising net
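Two pieces of arithmetic connect the recognition and generative processes in a VAE with Gaussian hidden variables: the reparameterised sample and the KL term against a standard normal prior. The sketch below shows only those two pieces and omits the networks themselves. The names mu and log_var for the recognition net's outputs are assumptions for this illustration.

```python
import numpy as np


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), so z is differentiable in mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var))


rng = np.random.default_rng(0)
mu = np.array([1.0, 0.0])
log_var = np.zeros(2)             # unit variance in both dimensions
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

In training, the generative net would decode z back to the data, and the loss would be the reconstruction error plus this KL term. For the values above the KL term is 0.5, contributed entirely by the shifted mean in the first dimension.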


Variational memory encoder-decoder (VMED)

[Figure: Conditional Variational Auto-Encoder, with labels: context, latent variables, generated]

[Figure: VMED, with labels: context, latent variables, memory reads, generated]

Sample response


Sample response (2)


Bringing relational structures into memory

Trang Pham, Truyen Tran & Svetha Venkatesh

ICPR’18+

NTM as matrix machine

Controller and memory operations can be conceptualized as matrix operations

Controller is a vector changing over time

Memory is a matrix changing over time

#REF: Kien Do, Truyen Tran, Svetha Venkatesh, “Learning Deep Matrix Representations”, arXiv preprint arXiv:1703.01454

Recurrent dynamics

Idea: Relational memory

Independent memory slots not suitable for relational reasoning

Human working memory sub-processes seem inter-dependent

[Equation annotations: relational structure; new memory proposal; new information; transformation; old memory; time-aware bias]

Relational Dynamic Memory Network (RDMN)

[Figure: NTM (controller, memory, query, output, read, write) beside the Relational Dynamic Memory Network (controller, memory, graph, query, output, read, write)]

RDMN unrolled


Input process

Memory process

Output process

Controller process

Message passing
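The message-passing label can be illustrated with a generic update in which each memory cell combines its own state with the summed states of its graph neighbours. This is a sketch of plain message passing under assumed names (message_passing_step, w_self, w_neigh), not the RDMN update itself, which also involves the controller, the query and the read/write processes listed above.

```python
import numpy as np


def message_passing_step(memory, adjacency, w_self, w_neigh):
    """One round: each cell mixes its own state with the sum of its neighbours' states."""
    messages = adjacency @ memory  # row i holds the sum of the states of i's neighbours
    return np.tanh(memory @ w_self + messages @ w_neigh)


# Toy graph: a path 0 - 1 - 2 with one memory cell per node (values are illustrative).
memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 0.0]])
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
identity = np.eye(2)
updated = message_passing_step(memory, adjacency, identity, identity)
```

After one step, cell 2, which started empty, carries information from cell 1. Repeating the step lets information travel further along the graph, which is what makes the memory slots inter-dependent rather than independent.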


Drug-disease response


Molecule Bioactivity


Chemical reaction


Molecules Reaction


Looking ahead


The good old

• Representation learning (RBM, DBN, DBM, DDAE)
• Ensemble
• Back-propagation
• Adaptive stochastic gradient

The popular

• Attention
• Batch-norm
• ReLU & skip-connections
• Highway nets, LSTM/GRU & CNN

On the rise

• Reinforcement learning, imagination & planning
• Deep generative models + Bayesian methods
• Memory & reasoning
• Lifelong/meta/continual/few-shot/zero-shot learning
• Universal transformer

The black sheep

• Cognitive architecture | Unified Theory of Cognition
• Quantum ML/AI
• Theory of consciousness (e.g., Penrose’s microtubules)
• Value-aligned ML

Better memory theory

Sparse writing

Explaining memory operations

Dynamic memory structure (other than a fixed-size matrix), e.g., differentiable pooling (NIPS’18)

Loading long-term/episodic memory into working memory, e.g., walk(man, dog; day1); walk(woman, dog; day2) → couple(man, woman)

A grand unified theory of memory? Maybe the Free-Energy Principle by Karl Friston

Intelligence as emergence? We are just a little bit better than apes in each intelligence dimension, but far more intelligent overall.

http://www.rainbowrehab.com/executive-functioning/

Memory types

Short-term/working (temporary storage)

Episodic (events that happened at a specific time)

Long-term/semantic (facts, objects, relations)

Procedural (sequence of actions)

Applications of memory

Rare events

Video captioning

QA, VQA

Machine translation

Machine reading (stories, books, DNA)

Business process continuation

Software execution

Code generation


Graph as sequence of edges

Event sequences

Graph traversal

Algorithm learning (e.g., sort)

Dialog systems (e.g., chat bots)

Reinforcement learning agents

Multi-agents with shared memory

Learning to optimize


Memory-supported intelligence

Reasoning with working memory (NTM style)

Meta-learning with episodic memory

Meta-remembering of memory operations

Learning to plan with procedural memory

Learning world knowledge with semantic memory

Learning to navigate with spatial memory

Learning to socialize with collective memory and memory of others (???)

Toward a full cognitive architecture


#REF: A Cognitive Architecture Approach to Interactive Task Learning, John E. Laird, University of Michigan


Knowledge graphs

Planning, RL

NTM Attention, RL

RL

Unsupervised

CNN

Episodic memory

Where is the processor?


Team @ Deakin (A2I2)


Thanks to many people who have created beautiful graphics & open-source programming frameworks.


References

Memory-Augmented Neural Networks for Predictive Process Analytics, A Khan, H Le, K Do, T Tran, A Ghose, H Dam, R Sindhgatta, arXiv preprint arXiv:1802.00938

Learning deep matrix representations, K Do, T Tran, S Venkatesh, arXiv preprint arXiv:1703.01454

Variational memory encoder-decoder, H Le, T Tran, T Nguyen, S Venkatesh, arXiv preprint arXiv:1807.09950

Relational dynamic memory networks, T Pham, T Tran, S Venkatesh, arXiv preprint arXiv:1808.04247

Dual Memory Neural Computer for Asynchronous Two-view Sequential Learning, H Le, T Tran, S Venkatesh, KDD'18

Dual control memory augmented neural networks for treatment recommendations, H Le, T Tran, S Venkatesh, PAKDD'18