COMP 5013: Deep Learning Architectures. Daniel L. Silver, March 2014.

TRANSCRIPT

  • Slide 1
  • COMP 5013: Deep Learning Architectures. Daniel L. Silver, March 2014
  • Slide 2
  • Y. Bengio (McGill): 2009 Deep Learning Tutorial; 2013 Deep Learning towards AI. Deep Learning of Representations (Y. Bengio): http://www.youtube.com/watch?v=4xsVFLnHC_0
  • Slide 3
  • Deep Belief (RBM) Networks with Geoff Hinton. Learning layers of features by stacking RBMs: http://www.youtube.com/watch?v=VRuQf3DjmfM Discriminative fine-tuning in DBNs: http://www.youtube.com/watch?v=-I2pgcH02QM What happens during fine-tuning? http://www.youtube.com/watch?v=yxMeeySrfDs
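The idea behind "stacking RBMs" can be sketched in a few lines of numpy: train one Restricted Boltzmann Machine with 1-step contrastive divergence (CD-1), then use its hidden-unit probabilities as the "data" for the next RBM in the stack. This is an illustrative sketch, not Hinton's actual training code; the class, learning rate, layer sizes, and the synthetic Bernoulli data below are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Restricted Boltzmann Machine trained with 1-step contrastive divergence."""
    def __init__(self, n_visible, n_hidden, rng=None):
        self.rng = rng or np.random.default_rng(0)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: sample hidden units given the data.
        p_h0 = self.hidden_probs(v0)
        h0 = (self.rng.random(p_h0.shape) < p_h0).astype(float)
        # Negative phase: one Gibbs step back to a "reconstruction".
        p_v1 = self.visible_probs(h0)
        p_h1 = self.hidden_probs(p_v1)
        # Move weights toward the data statistics, away from the model's.
        batch = v0.shape[0]
        self.W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / batch
        self.b_v += lr * (v0 - p_v1).mean(axis=0)
        self.b_h += lr * (p_h0 - p_h1).mean(axis=0)
        return np.mean((v0 - p_v1) ** 2)   # reconstruction error

# Stacking: train one RBM, then feed its hidden probabilities to the next.
rng = np.random.default_rng(0)
data = (rng.random((64, 20)) < 0.3).astype(float)   # toy binary "images"
rbm1 = RBM(20, 8, rng)
errs = [rbm1.cd1_step(data) for _ in range(200)]    # error should fall
feats = rbm1.hidden_probs(data)   # layer-1 features become layer-2 "data"
rbm2 = RBM(8, 4, rng)
for _ in range(200):
    rbm2.cd1_step(feats)
```

The discriminative fine-tuning the videos describe would then treat the stacked weights as the initialization of an ordinary feed-forward network and continue with backpropagation.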
  • Slide 4
  • Deep Belief (RBM) Networks with Geoff Hinton. Learning handwritten digits: http://www.cs.toronto.edu/~hinton/digits.html Modeling real-valued data (G. Hinton): http://www.youtube.com/watch?v=jzMahqXfM7I
  • Slide 5
  • Deep Learning Architectures Consider the problem of trying to classify these hand-written digits.
  • Slide 6
  • Deep Learning Architectures. [Figure: images of digits 0-9 (28 x 28 pixels) feed 500 neurons (low-level features), then 500 neurons (higher-level features), topped by 2000 top-level artificial neurons joined with the ten label units 0-9.] Neural network: trained on 40,000 examples; learns both to label/recognize images and to generate images from labels; probabilistic in nature. Demo.
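The recognition direction of the slide's network (28 x 28 pixels → 500 → 500 → 2000 top-level units joined with ten label units) can be sketched as a bottom-up pass. This is only a shape-level illustration with random, untrained weights; the sigmoid activations, softmax over labels, and weight scales are assumptions, not the slide's actual (RBM-based) inference procedure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
sizes = [784, 500, 500]            # pixels -> two feature layers
weights = [0.01 * rng.standard_normal((a, b))
           for a, b in zip(sizes[:-1], sizes[1:])]
# Top associative layer: 2000 units connected to features and to labels.
W_top = 0.01 * rng.standard_normal((500, 2000))
W_lab = 0.01 * rng.standard_normal((10, 2000))

def recognize(image):
    """Bottom-up pass: pixels -> features -> top layer -> label probabilities."""
    h = image
    for W in weights:
        h = sigmoid(h @ W)
    top = sigmoid(h @ W_top)
    scores = top @ W_lab.T                      # one score per digit 0-9
    return np.exp(scores) / np.exp(scores).sum()   # softmax over labels

probs = recognize(rng.random(784))   # a random 28 x 28 "image", flattened
```

The generative direction the slide mentions would run the same connections top-down: clamp a label unit, settle the top layer, and propagate down to produce a fantasy image.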
  • Slide 7
  • Deep Convolution Networks. Intro: http://www.deeplearning.net/tutorial/lenet.html#lenet
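The two core operations of a convolutional network like LeNet (covered in the tutorial above) are a small filter slid over the image and a pooling step that shrinks the resulting feature map. A minimal sketch, assuming a "valid" convolution and non-overlapping max pooling; the toy edge image and kernel are invented for the example.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as in most conv nets)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling, shrinking each spatial dim by `size`."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

image = np.zeros((8, 8))
image[:, 4] = 1.0                   # a vertical edge
kernel = np.array([[1.0, -1.0]])    # responds to horizontal intensity change
fmap = conv2d(image, kernel)        # shape (8, 7): fires along the edge
pooled = max_pool(fmap)             # shape (4, 3): same response, coarser grid
```

Because the same kernel is applied everywhere, the detector's response survives small translations of the edge, which is the weight-sharing property that makes convolutional nets robust on images.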
  • Slide 8
  • ML and Computing Power. Andrew Ng's work on deep learning networks (ICML 2012). Problem: learn to recognize human faces, cats, etc. from unlabeled data. Dataset of 10 million images; each image has 200 x 200 pixels. 9-layered locally connected neural network (1B connections). Parallel algorithm; 1,000 machines (16,000 cores) for three days. Building High-level Features Using Large Scale Unsupervised Learning. Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng. ICML 2012: 29th International Conference on Machine Learning, Edinburgh, Scotland, June 2012.
  • Slide 9
  • ML and Computing Power. Results: a face detector that is 81.7% accurate; robust to translation, scaling, and rotation. Further results: 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a 70% relative improvement over the previous state of the art.
  • Slide 10
  • Deep Belief Convolution Networks. Deep Belief convolution network demo (JavaScript); runs well under Google Chrome: https://www.jetpac.com/deepbelief
  • Slide 11
  • Google and DLA: http://www.youtube.com/watch?v=JBtfRiGEAFI http://www.technologyreview.com/news/524026/is-google-cornering-the-market-on-deep-learning/
  • Slide 12
  • Cloud-Based ML (Google): https://developers.google.com/prediction/
  • Slide 13
  • Additional References: http://deeplearning.net http://en.wikipedia.org/wiki/Deep_learning Coursera course Neural Networks for Machine Learning: https://class.coursera.org/neuralnets-2012-001/lecture ML: Hottest Tech Trend in Next 3-5 Years: http://www.youtube.com/watch?v=b4zr9Zx5WiE Geoff Hinton's homepage: https://www.cs.toronto.edu/~hinton/
  • Slide 14
  • Open Questions in ML
  • Slide 15
  • Challenges & Open Questions. The stability-plasticity problem: how do we integrate new knowledge with old? No loss of new knowledge; no loss of prior knowledge; efficient methods of storage and recall. ML methods that can retain learned knowledge will be approaches to common knowledge representation, a big AI problem.
  • Slide 16
  • Challenges & Open Questions. Practice makes perfect! An LML system must be capable of learning from examples of tasks over a lifetime. Practice should increase model accuracy and overall domain knowledge. How can this be done? Research important to AI, psychology, and education.
  • Slide 17
  • Challenges & Open Questions. Scalability: often a difficult but important challenge. Methods must scale with increasing number of inputs and outputs, number of training examples, number of tasks, and complexity of tasks (size of hypothesis representation). Preferably, linear growth.
  • Slide 18
  • Never-Ending Language Learner. Carlson et al. (2010). Each day: extracts information from the web to populate a growing knowledge base of language semantics, and learns to perform this task better than on the previous day. Uses an MTL approach in which a large number of different semantic functions are trained together.