Deep Learning - UGR

Deep Learning
Francisco Herrera
Research Group on Soft Computing and Intelligent Information Systems (SCI2S), http://sci2s.ugr.es
Dept. of Computer Science and A.I., University of Granada, Spain
Email: [email protected]


Deep Learning

Deep learning is a set of algorithms that attempts to model high-level abstractions in data by using architectures composed of multiple nonlinear transformations.

Bibliography: L. Deng and D. Yu. Deep Learning: Methods and Applications. Foundations and Trends in Signal Processing, Vol. 7, Issues 3-4, 2014.

Note: Deep learning introduces learning structures that require efficient, distributed processing architectures (GPU, Spark, …) and shows important results in image, speech and natural language processing, among others.

Deep Learning

Deep Learning (deep structure learning): machine learning algorithms based on learning multiple levels of representation/abstraction.

Amazing improvements in error rate in object recognition, object detection, speech recognition and, more recently, natural language processing/understanding.

Credits: Yoshua Bengio, http://www.iro.umontreal.ca/~bengioy/talks/DL-Tutorial-NIPS2015.pdf

Deep Learning

Some definitions (Deng and Yu, 2014)


Deep Learning (also called Hierarchical Learning)

Hierarchical Learning

• Natural progression from low-level to high-level structure, as seen in natural complexity

• Easier to monitor what is being learnt and to guide the machine to better subspaces

• Usually best when the input space is locally structured (spatial or temporal): images, language, etc. vs. arbitrary input features

Deep Learning

Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracting complex structure and building internal representation from rich sensory inputs.

Historically, the concept of deep learning originated from artificial neural network research. (Hence, one may occasionally hear the discussion of “new-generation neural networks.”) Feed-forward neural networks or MLPs with many hidden layers, which are often referred to as deep neural networks (DNNs), are good examples of the models with a deep architecture.
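To make the idea of an MLP with many hidden layers concrete, here is a minimal sketch of a forward pass through a stack of layers in plain Python. The layer sizes, the hand-picked weights and the choice of tanh are illustrative assumptions, not details taken from the slides.

```python
import math

def dense(inputs, weights, biases):
    """Affine map: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Forward pass through a stack of (weights, biases) layers.

    Every layer except the last applies tanh, so each hidden layer is one
    nonlinear feature transformation; the last layer is left linear.
    """
    activation = x
    for index, (weights, biases) in enumerate(layers):
        activation = dense(activation, weights, biases)
        if index < len(layers) - 1:
            activation = [math.tanh(a) for a in activation]
    return activation

# Hypothetical 2-3-3-1 network: two hidden layers, hand-picked weights.
layers = [
    ([[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]], [0.0, 0.0, 0.0]),
    ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0]),
    ([[1.0, 1.0, 1.0]], [0.0]),
]
output = forward([1.0, 1.0], layers)
```

Adding depth here only means appending more `(weights, biases)` pairs to `layers`; a shallow model would keep a single hidden layer.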

Deep Learning

Machine learning: shallow-structured architectures

• Gaussian mixture models (GMMs)
• Linear or nonlinear dynamical systems
• Conditional random fields (CRFs)
• Maximum entropy (MaxEnt) models
• Support vector machines (SVMs)
• Logistic regression/kernel regression
• Multilayer perceptrons (MLPs) with a single hidden layer, including extreme learning machines (ELMs)

These architectures typically contain at most one or two layers of nonlinear feature transformations.

Deep Learning

Traditional recognition approaches

Features are not learned


Deep Learning

“Shallow” vs. “deep” architectures


Deep Learning

Backpropagation

Credits: The Evolution of Neural Learning Systems: A Novel Architecture Combining the Strengths of NTs, CNNs, and ELMs. N Martinel, C Micheloni… - IEEE SMC Magazine, 2015 - ieeexplore.ieee.org

Deep Learning

Backpropagation

• Minimize the error of the calculated output
• Adjust weights
• Gradient descent
• Procedure
  • Forward phase
  • Backpropagation of errors
• For each sample, multiple epochs

[Figure: network with units i, j and k, connected by weights vij and wjk]
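The procedure listed on this slide (forward phase, backpropagation of errors, repeated for each sample over multiple epochs) can be sketched as follows. This is a minimal illustration that assumes sigmoid units, a squared-error loss, a learning rate of 0.5 and a toy logical-OR dataset, none of which the slide specifies. The names `v` and `w` mirror the vij and wjk labels of the figure, read here as input-to-hidden and hidden-to-output weights.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, v, w):
    """Forward phase: inputs i -> hidden units j (v[i][j]) -> outputs k (w[j][k])."""
    x = x + [1.0]  # constant bias input
    hidden = [sigmoid(sum(x[i] * v[i][j] for i in range(len(x))))
              for j in range(len(v[0]))]
    hidden = hidden + [1.0]  # constant bias unit
    output = [sigmoid(sum(hidden[j] * w[j][k] for j in range(len(hidden))))
              for k in range(len(w[0]))]
    return x, hidden, output

def train_step(x, target, v, w, rate):
    """One forward phase plus one backpropagation of errors for a single sample."""
    x, hidden, output = forward(x, v, w)
    # Output deltas for squared error with sigmoid units.
    delta_out = [(output[k] - target[k]) * output[k] * (1.0 - output[k])
                 for k in range(len(output))]
    # Hidden deltas: send the output deltas back through w (bias unit skipped).
    delta_hid = [hidden[j] * (1.0 - hidden[j]) *
                 sum(w[j][k] * delta_out[k] for k in range(len(output)))
                 for j in range(len(hidden) - 1)]
    # Gradient descent updates of both weight layers.
    for j in range(len(hidden)):
        for k in range(len(output)):
            w[j][k] -= rate * delta_out[k] * hidden[j]
    for i in range(len(x)):
        for j in range(len(delta_hid)):
            v[i][j] -= rate * delta_hid[j] * x[i]

def mean_squared_error(samples, v, w):
    total = 0.0
    for x, target in samples:
        _, _, output = forward(x, v, w)
        total += sum((o - t) ** 2 for o, t in zip(output, target))
    return total / len(samples)

random.seed(0)
n_in, n_hidden, n_out = 2, 3, 1
v = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_in + 1)]
w = [[random.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_hidden + 1)]
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]  # logical OR

error_before = mean_squared_error(samples, v, w)
for epoch in range(2000):          # multiple epochs
    for x, target in samples:      # for each sample
        train_step(x, target, v, w, rate=0.5)
error_after = mean_squared_error(samples, v, w)
```

Comparing `error_before` with `error_after` shows whether the weight adjustments reduced the error of the calculated output.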

Deep Learning

Problems with Backpropagation

• Multiple hidden layers
  • Gets stuck in local optima: weights start from random positions
  • Slow convergence to the optimum: large training set needed
  • Only uses labeled data: most data is unlabeled

Deep Learning

Deep Architecture (train networks with many layers)

• Multiple hidden layers
• Motivation (why go deep?)
  • Approximate complex decision boundaries
    • Fewer computational units for the same functional mapping
  • Hierarchical learning
    • Increasingly complex features
  • Work well in different domains
    • Vision, audio, …

Deep Learning

Credits: Yoshua Bengio, http://www.iro.umontreal.ca/~bengioy/talks/DL-Tutorial-NIPS2015.pdf

Deep Learning (Hierarchical Learning)

Hierarchical learning/deep structure learning: automating feature discovery

From the simplest features to complex ones
From unsupervised learning to supervised learning

Deep Learning

Deep Architecture (train networks with many layers)

Some models:

• Deep networks for unsupervised or generative learning: deep belief network (DBN), stack of restricted Boltzmann machines (RBMs), autoencoder, …

• Deep networks for supervised learning: deep neural networks (DNN), convolutional neural network (CNN), …

• Hybrid deep networks: DBN-DNN (when a DBN is used to initialize the training of a DNN, the resulting network is sometimes called the DBN-DNN)

Deep Learning

Autoencoder

An autoencoder neural network is an unsupervised learning algorithm that applies backpropagation, setting the target values to be equal to the inputs.

The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for the purpose of dimensionality reduction.
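The definition above, backpropagation with the targets set equal to the inputs, can be illustrated with a small sketch. It assumes one tanh hidden layer, a linear output layer, a squared reconstruction error and toy four-attribute data; the sizes, learning rate and data are hypothetical and unrelated to the H2O experiments shown later.

```python
import math
import random

def encode(x, w_enc, b_enc):
    """Hidden code: tanh of an affine map of the input."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w_enc, b_enc)]

def decode(h, w_dec, b_dec):
    """Linear reconstruction of the input from the code."""
    return [sum(w * hj for w, hj in zip(row, h)) + b
            for row, b in zip(w_dec, b_dec)]

def reconstruction_error(data, params):
    w_enc, b_enc, w_dec, b_dec = params
    return sum(sum((r - xi) ** 2 for r, xi in
                   zip(decode(encode(x, w_enc, b_enc), w_dec, b_dec), x))
               for x in data) / len(data)

def train_step(x, params, rate):
    """One backpropagation step in which the target is the input itself."""
    w_enc, b_enc, w_dec, b_dec = params
    h = encode(x, w_enc, b_enc)
    r = decode(h, w_dec, b_dec)
    d_out = [ri - xi for ri, xi in zip(r, x)]  # target = x
    d_hid = [(1.0 - h[j] ** 2) *
             sum(d_out[k] * w_dec[k][j] for k in range(len(x)))
             for j in range(len(h))]
    for k in range(len(x)):
        for j in range(len(h)):
            w_dec[k][j] -= rate * d_out[k] * h[j]
        b_dec[k] -= rate * d_out[k]
    for j in range(len(h)):
        for i in range(len(x)):
            w_enc[j][i] -= rate * d_hid[j] * x[i]
        b_enc[j] -= rate * d_hid[j]

random.seed(0)
n_inputs, n_code = 4, 2  # hypothetical sizes: 4 attributes compressed to 2
params = (
    [[random.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_code)],
    [0.0] * n_code,
    [[random.uniform(-0.5, 0.5) for _ in range(n_code)] for _ in range(n_inputs)],
    [0.0] * n_inputs,
)
# Toy data with two underlying degrees of freedom (a, b).
data = [[a, b, (a + b) / 2, (a - b) / 2]
        for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0)]

error_before = reconstruction_error(data, params)
for epoch in range(500):
    for x in data:
        train_step(x, params, rate=0.05)
error_after = reconstruction_error(data, params)
code = encode(data[0], params[0], params[1])  # 2-value encoding of one sample
```

The values in `code` are the reduced-dimension representation; plotting them for every sample gives the kind of low-dimensional view that an autoencoder is typically used for.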

Deep Learning

Autoencoder (H2O autoencoder, a single hidden layer of 3 neurons and 1000 epochs). All the autoencoders use the hyperbolic tangent as the activation function. Dataset: WDBC (569 instances with 30 input attributes).

Credits: D. Charte

Deep Learning

Example: designing a classifier for Iris

• A simple, well-known problem: iris classification.
• Three classes of iris: setosa, versicolor and virginica.
• Four attributes: length and width of the petal and the sepal, respectively.
• 150 examples, 50 of each class.
• Available at http://www.ics.uci.edu/~mlearn/MLRepository.html

Deep Learning

setosa, versicolor (C) and virginica

Deep Learning

setosa, versicolor and virginica

[Figure: IRIS, original training set. Scatter plot of petal width against petal length, both axes scaled from 0 to 1, for the classes setosa, versicolor and virginica.]

Deep Learning

Autoencoder (H2O autoencoder, output of the middle layer): hidden layers of [8, 5, 3, 5, 8] neurons and 100 epochs (the three-dimensional one), and [8, 5, 2, 5, 8] neurons with 1000 epochs (the two-dimensional one).

setosa, versicolor and virginica

Credits: D. Charte

Deep Learning

Hybrid deep networks: DBN-DNN

Credits: L. Deng and D. Yu. Deep Learning: Methods and Applications. Foundations and Trends in Signal Processing, Vol. 7, Issues 3-4, 2014, p. 246.

Deep Learning

Convolutional Neural Networks (Supervised)

Each module consists of a convolutional layer and a pooling layer.

A CNN typically tries to compress large data (images) into a smaller set of robust features, based on local variations.

Basic convolution can still create many features.

CNNs have been found highly effective and are commonly used in computer vision and image recognition.

Deep Learning

Convolutional Neural Networks

C layers are convolutions, S layers pool/sample
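A minimal sketch of one C layer followed by one S layer is given below. It assumes a "valid" convolution without kernel flipping and non-overlapping max pooling. The slide only says that S layers pool/sample, so the choice of max pooling, the toy image and the kernel are assumptions.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution as used in CNNs (cross-correlation, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]

# Hypothetical 5x5 image with a vertical edge, and a 2x2 vertical-edge kernel.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1],
          [-1, 1]]
feature_map = convolve2d(image, kernel)  # C layer: 4x4 feature map
pooled = max_pool(feature_map)           # S layer: 2x2 summary
```

The feature map responds only where the edge lies, and pooling keeps that response while shrinking the map, which is the compression into a smaller set of robust local features described on the previous slide.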


Deep Learning

Convolutional Neural Networks


Deep Learning

Credits: http://www.iro.umontreal.ca/~bengioy/talks/DL-Tutorial-NIPS2015.pdf


Deep Learning

Credits: http://www.iro.umontreal.ca/~bengioy/talks/DL-Tutorial-NIPS2015.pdf

Deep Learning

From academia to industry: DNNresearch Inc and Google DeepMind

Google Brain is a deep learning research project at Google.

In 2013, Google acquired DNNresearch Inc, a company created by one of the pioneers of deep learning (Geoffrey Hinton).

In January 2014 it took control of the startup DeepMind Technologies, a small London-based company employing some of the leading experts in deep learning.

DeepMind: startup (2011). Demis Hassabis, Shane Legg and Mustafa Suleyman.

Convolutional Neural Networks

NIPS 2012: a CNN success story for the ILSVRC 2010 challenge

ImageNet Classification with Deep Convolutional Neural Networks
Part of: Advances in Neural Information Processing Systems 25 (NIPS 2012)

ImageNet is a dataset of over 15 million labeled high-resolution images belonging to roughly 22,000 categories. The images were collected from the web and labeled by human labelers using Amazon’s Mechanical Turk crowd-sourcing tool. Starting in 2010, as part of the Pascal Visual Object Challenge, an annual competition called the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) has been held. ILSVRC uses a subset of ImageNet with roughly 1000 images in each of 1000 categories. In all, there are roughly 1.2 million training images, 50,000 validation images, and 150,000 testing images.

Deep Learning: Challenges in “intelligent” games

http://arxiv.org/abs/1312.5602

Deep Learning: Challenges in “intelligent” games

Arcade games (Breakout)

http://elpais.com/elpais/2015/02/25/ciencia/1424860455_667336.html

http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

Deep Learning: Challenges in “intelligent” games

http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

Schematic illustration of the convolutional neural network.

V Mnih et al. Nature 518, 529-533 (2015) doi:10.1038/nature14236

http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

Deep Learning: Challenges in “intelligent” games

http://arxiv.org/abs/1509.01549

Deep Learning: Challenges in “intelligent” games

http://arxiv.org/abs/1509.01549
https://chessprogramming.wikispaces.com/Giraffe

Deep Learning: Challenges in “intelligent” games

https://www.technologyreview.com/s/541276/deep-learning-machine-teaches-itself-chess-in-72-hours-plays-at-international-master/

Ref: arxiv.org/abs/1509.01549: Giraffe: Using Deep Reinforcement Learning to Play Chess

Some facts:
• von Neumann introduced the minimax algorithm in 1928
• 363 features
• The evaluator network converges in about 72 hours on a machine with 2x10-core Intel Xeon E5-2660v2 CPU
• Giraffe is able to play at the level of an FIDE International Master

Deep Learning: Challenges in “intelligent” games

http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html

Deep Learning: Challenges in “intelligent” games

http://elpais.com/elpais/2016/01/26/ciencia/1453766578_683799.html

Deep Learning: Challenges in “intelligent” games

Each simulation traverses the tree by selecting the edge with maximum action value Q, plus a bonus u(P) that depends on a stored prior probability P for that edge. b, The leaf node may be expanded; the new node is processed once by t…

http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html
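The selection step described in this caption can be sketched as follows. The caption only states that the bonus u(P) depends on the prior P. The specific form used below, proportional to P and decaying with the visit count N, is believed to follow the cited Nature paper but should be read as an assumption. The constant `c_puct` and the edge statistics are hypothetical.

```python
import math

def select_edge(edges, c_puct=5.0):
    """Pick the edge maximizing Q + u, with u proportional to P / (1 + N).

    `edges` maps an action to a dict with keys Q (action value),
    P (prior probability) and N (visit count).
    """
    total_visits = sum(edge["N"] for edge in edges.values())

    def score(edge):
        bonus = c_puct * edge["P"] * math.sqrt(total_visits) / (1 + edge["N"])
        return edge["Q"] + bonus

    return max(edges, key=lambda action: score(edges[action]))

# Hypothetical statistics for three candidate moves.
edges = {
    "a": {"Q": 0.50, "P": 0.2, "N": 10},
    "b": {"Q": 0.40, "P": 0.7, "N": 2},
    "c": {"Q": 0.10, "P": 0.1, "N": 0},
}
chosen = select_edge(edges)
greedy = select_edge(edges, c_puct=0.0)  # no bonus: pure action value
```

With the bonus, the rarely visited move with a high prior is preferred over the move with the best current value, which is how the prior steers exploration in this sketch.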

Deep Learning: Challenges in “intelligent” games

Neural network training pipeline and architecture

D Silver et al. Nature 529, 484–489 (2016) doi:10.1038/nature16961

http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html

Deep Learning: Challenges in “intelligent” games

How AlphaGo (black, to play) selected its move in an informal game against Fan Hui

D Silver et al. Nature 529, 484–489 (2016) doi:10.1038/nature16961

http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html

Deep Learning: Challenges in “intelligent” games

http://www.nature.com/news/google-ai-algorithm-masters-ancient-game-of-go-1.19234

http://www.nature.com/news/the-go-files-champion-preps-for-1-million-machine-match-1.19541

https://actualidad.rt.com/ciencias/201602-inteligencia-artificial-alphago-google-gana-leyenda-go

9/03/2016

Deep Learning: Challenges in “intelligent” games

https://gogameguru.com/tag/deepmind-alphago-lee-sedol/

Deep Learning: Challenges in “intelligent” games

https://gogameguru.com/alphago-defeats-lee-sedol-4-1/

IMAGENET (ILSVRC): Microsoft Wins ImageNet Using Extremely Deep Neural Networks

Microsoft's network was really deep at 150 layers (extremely deep neural network). To do this the team had to overcome a fundamental problem inherent in training deep neural networks. As the network gets deeper, training becomes more difficult, so you encounter a seemingly paradoxical situation: adding layers makes the performance worse. The solution proposed is called deep residual learning.

http://www.image-net.org/challenges/LSVRC/

http://www.i-programmer.info/news/105-artificial-intelligence/9266-microsoft-wins-imagenet-using-extremely-deep-neural-networks.html
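The idea behind deep residual learning can be sketched in a few lines: each block outputs its input plus a learned correction F(x), so a block whose correction is zero simply passes its input through. The two-layer form of F, the ReLU activations and the tiny vectors below are illustrative assumptions and not the architecture of the winning network.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def linear(values, weights):
    """Multiply a vector by a square weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F is two linear layers with a ReLU in between."""
    residual = linear(relu(linear(x, w1)), w2)
    return relu([r + xi for r, xi in zip(residual, x)])

zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 2.0]

# With all-zero weights F(x) = 0, so a non-negative input passes through
# unchanged however many blocks are stacked.
y = x
for _ in range(50):
    y = residual_block(y, zeros, zeros)

doubled = residual_block(x, identity, identity)  # F(x) = x, so y = 2x
```

Because extra blocks can fall back to the identity in this way, adding depth need not, in principle, degrade what the shallower network already computes.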

IMAGENET (ILSVRC 2015): Microsoft Wins ImageNet Using Extremely Deep Neural Networks

http://arxiv.org/abs/1512.03385

Deep Learning: Challenges in “painting”

http://arxiv.org/abs/1508.06576

Deep Learning: Challenges in “painting”

Credits: http://arxiv.org/abs/1508.06576

Deep Learning: Challenges in “painting”

http://www.deepart.io/

Deep Learning: Challenges in “painting”

Examples of DeepART results

van Gogh

http://www.deepart.io/

Deep Learning: Challenges in “painting”

CNN model used and description of the methodology

http://arxiv.org/abs/1409.1556
http://arxiv.org/abs/1508.06576

Deep Learning: Digit Recognizer Kaggle

Case study: Digit Recognizer Kaggle (A. Herrera-Poyatos)

Andrés Herrera Poyatos
GitHub repository with the code: https://github.com/andreshp/Kaggle

Deep Learning

Case study: Digit Recognizer Kaggle

• Building a digit recognizer is one of the classic problems in data science.
• It serves as a benchmark for testing new algorithms. No human gets 100% right!
• Practical applications: license plate detection, conversion of handwriting to text, …

[Figure: examples of handwritten digits]

Credits: A. Herrera-Poyatos

Deep Learning

Case study: Digit Recognizer Kaggle

Kaggle hosts a public competition:

http://www.kaggle.com/c/digit-recognizer

Data to analyze: MNIST data (60,000 instances), http://yann.lecun.com/exdb/mnist/

Rodrigo Benenson has compiled an informative summary page:

http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html

Credits: A. Herrera-Poyatos

Deep Learning

Case study: Digit Recognizer Kaggle

Data set:
• Training set: 42,000 images
• Test set: 28,000 images

Image:
• 10 classes: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
• 28x28 pixels

Example: [image]

Score for the overall ranking on Kaggle: accuracy on 25% of the test set.

Credits: A. Herrera-Poyatos

Deep Learning

Case study: Digit Recognizer Kaggle

1. First step

1. Use the best-known algorithms as a benchmark
• KNN with k = 10: 0.96557 on Kaggle
• Random Forest with 1000 trees: 0.96829 on Kaggle

2. Tune the parameters of a simple algorithm
• Cross-validation over KNN to find the best value of k.

Result: k = 1, 0.97114 on Kaggle

Credits: A. Herrera-Poyatos
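The parameter search in step 2 could look like the sketch below, which scores each candidate k by cross-validation and keeps the best one. The toy two-cluster data, the fold layout and the function names are hypothetical. The original study worked on the Kaggle training set and may well have relied on a library implementation.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority label among the k training points closest to `query`."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

def cross_validated_accuracy(data, k, folds):
    """Accuracy over `folds` splits (fold f holds out items f, f + folds, ...)."""
    correct = 0
    for f in range(folds):
        held_out = data[f::folds]
        train = [item for index, item in enumerate(data) if index % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in held_out)
    return correct / len(data)

def best_k(data, candidates, folds=5):
    """Candidate k with the highest cross-validated accuracy (ties: smallest k)."""
    return max(sorted(candidates),
               key=lambda k: cross_validated_accuracy(data, k, folds))

# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
data = [((0.0, 0.1), "a"), ((0.1, 0.0), "a"), ((0.2, 0.1), "a"),
        ((0.1, 0.2), "a"), ((0.0, 0.0), "a"),
        ((1.0, 1.1), "b"), ((1.1, 1.0), "b"), ((1.2, 1.1), "b"),
        ((1.1, 1.2), "b"), ((1.0, 1.0), "b")]
chosen_k = best_k(data, candidates=[1, 3, 5])
```

On real image data each feature vector would be the flattened pixel values, and the candidate list would normally cover a wider range of k.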

Deep Learning

Case study: Digit Recognizer Kaggle

2. Visualization

Mean of all the training set images, by class:

Observation: even the means are not centered (see 6 and 7). This causes problems when trying to classify them correctly.

Solution: preprocessing

Credits: A. Herrera-Poyatos

Deep Learning

Case study: Digit Recognizer Kaggle

3. Preprocessing

Idea: remove the blank rows and columns of pixels.

Problem: the new images have different dimensions.

Credits: A. Herrera-Poyatos

Deep Learning

Case study: Digit Recognizer Kaggle

3. Preprocessing

Solution: resize the images to 20x20 pixels (after the previous step, the largest image has that dimension).

Mean of the preprocessed training set images:

All of them are centered!

KNN with k = 1 on the preprocessed data: 0.97557 on Kaggle

Credits: A. Herrera-Poyatos
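The two preprocessing steps (dropping blank rows and columns, then resizing to a common size) can be sketched as below. The sketch assumes that blank pixels have value 0, that cropping means taking the bounding box of the non-blank pixels, and that nearest-neighbour sampling is used for resizing. The slides do not state which resizing method was actually used, and the function names are hypothetical.

```python
def crop_to_content(image, blank=0):
    """Crop to the bounding box of the non-blank pixels."""
    rows = [r for r, row in enumerate(image) if any(p != blank for p in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] != blank for row in image)]
    if not rows:
        return [row[:] for row in image]  # all-blank image: nothing to crop
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]

def resize_nearest(image, height, width):
    """Nearest-neighbour resize to height x width."""
    src_h, src_w = len(image), len(image[0])
    return [[image[r * src_h // height][c * src_w // width]
             for c in range(width)]
            for r in range(height)]

def preprocess(image, size=20):
    """Crop away the blank border, then bring every image to size x size."""
    return resize_nearest(crop_to_content(image), size, size)

# Hypothetical 6x6 "digit": a 2x3 block of ink away from the centre.
image = [[0] * 6 for _ in range(6)]
for r in (1, 2):
    for c in (3, 4, 5):
        image[r][c] = 255
cropped = crop_to_content(image)
resized = preprocess(image, size=4)
```

After this step the ink fills the whole frame regardless of where it sat in the original image, which is what makes the class means line up.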

Deep Learning

Case study: Digit Recognizer Kaggle

A library that includes deep learning: H2O

http://0xdata.com/

• World record on the MNIST problem without preprocessing: http://0xdata.com/blog/2015/02/deep-learning-performance/
• Support for R, Python, Hadoop and Spark
• It can be installed on any machine, including a laptop, a computer cluster, …
• How it works: it creates a Java virtual machine in which it optimizes the parallelism of the algorithms.

Credits: A. Herrera-Poyatos

Page 68: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Case study: Digit Recognizer (Kaggle)

http://cran.r-project.org/web/packages/h2o/index.html

Credits: A. Herrera-Poyatos

Page 69: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Case study: Digit Recognizer (Kaggle)

http://www.h2o.ai/resources/

Deep Neural Network (DNN), includes:

- The default initialization scheme is the uniform adaptive option, which is an optimized initialization based on the size of the network.
- H2O's Deep Learning framework supports regularization techniques to prevent overfitting (among them, dropout (Hinton et al., 2012)).
- It uses the adaptive learning rate algorithm ADADELTA (Zeiler, 2012), which automatically combines the benefits of learning rate annealing and momentum training to avoid slow convergence.
- It utilizes HOGWILD!, the recently developed lock-free parallelization scheme (Niu et al., 2011).

Credits: A. Herrera-Poyatos
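To make the ADADELTA item more concrete, the update rule from Zeiler (2012) can be sketched in a few lines of NumPy. This is a plain, single-threaded illustration and not H2O's implementation; the function name and the values rho=0.95 and eps=1e-6 are assumptions chosen for the example.

```python
import numpy as np

def adadelta_minimize(grad_fn, x0, rho=0.95, eps=1e-6, steps=100):
    """Minimise a function given its gradient, using the ADADELTA update rule."""
    x = np.asarray(x0, dtype=float).copy()
    avg_sq_grad = np.zeros_like(x)   # running average of squared gradients
    avg_sq_step = np.zeros_like(x)   # running average of squared updates
    for _ in range(steps):
        g = grad_fn(x)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * g ** 2
        # Per-parameter step: ratio of the two RMS values, no global learning rate.
        step = -np.sqrt(avg_sq_step + eps) / np.sqrt(avg_sq_grad + eps) * g
        avg_sq_step = rho * avg_sq_step + (1 - rho) * step ** 2
        x += step
    return x

# Toy problem: f(x) = sum(x^2), whose gradient is 2x.
start = np.array([1.0, -2.0])
result = adadelta_minimize(lambda x: 2 * x, start)
print(result)
```

The running average of squared gradients plays the role of learning rate annealing, while the running average of squared updates gives a momentum-like effect, which is the combination the slide refers to.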

Page 70: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Case study: Digit Recognizer (Kaggle)

http://www.h2o.ai/resources/

Machine Learning with Sparkling Water: H2O + Spark

Sparkling Water allows users to combine the fast, scalable machine learning algorithms of H2O with the capabilities of Spark. With Sparkling Water, users can drive computation from Scala/R/Python and utilize the H2O Flow UI, providing an ideal machine learning platform for application developers.

Credits: A. Herrera-Poyatos

Page 71: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Case study: Digit Recognizer (Kaggle)

4. Deep Learning on the preprocessed MNIST data

Andrés Herrera Poyatos
GitHub repository with the code: https://github.com/andreshp/Kaggle

Credits: A. Herrera-Poyatos

Page 72: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Case study: Digit Recognizer (Kaggle)

4. Deep Learning on the preprocessed MNIST data

hidden=c(1024,1024,2048)

Andrés Herrera Poyatos
GitHub repository with the code: https://github.com/andreshp/Kaggle

Credits: A. Herrera-Poyatos

Page 73: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Case study: Digit Recognizer (Kaggle)

4. Deep Learning on the preprocessed MNIST data

Running time: 2.5 hours of computation on an Intel i5 processor at 2.5 GHz.

Results obtained:
Deep Learning: 0.98229 on Kaggle
Preprocessing + Deep Learning: 0.98729 on Kaggle

The first result was 0.96557!

Credits: A. Herrera-Poyatos

Page 74: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

http://neuralnetworksanddeeplearning.com/chap6.html

A complete description, by Michael Nielsen (Jan 2016)

Page 75: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

http://neuralnetworksanddeeplearning.com/chap6.html

Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling. Let's look at each of these ideas in turn.

Local receptive fields: To be more precise, each neuron in the first hidden layer will be connected to a small region of the input neurons, say, for example, a 5×5 region, corresponding to 25 input pixels. So, for a particular hidden neuron, we might have connections that look like this:

Page 76: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

http://neuralnetworksanddeeplearning.com/chap6.html

Local receptive fields: sliding a 5×5 receptive field one pixel at a time across the 28×28 input image gives 24×24 neurons in the first hidden layer.
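The relation between the 28×28 input and the 24×24 hidden layer can be checked with a small sketch. It assumes a stride of one pixel, a single 5×5 set of weights reused at every position, and a sigmoid activation; the function name and the random data are hypothetical and only serve the illustration.

```python
import numpy as np

def feature_map(image, weights, bias):
    """Slide one k x k receptive field over the image, one pixel at a time."""
    k = weights.shape[0]
    out_h = image.shape[0] - k + 1
    out_w = image.shape[1] - k + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + k, j:j + k]          # local receptive field
            z = np.sum(patch * weights) + bias       # same weights at every position
            out[i, j] = 1.0 / (1.0 + np.exp(-z))     # sigmoid activation
    return out

rng = np.random.default_rng(0)
image = rng.random((28, 28))                         # stand-in for an MNIST digit
hidden = feature_map(image, rng.normal(size=(5, 5)), bias=0.0)
print(hidden.shape)  # (24, 24)
```

Each output position corresponds to one hidden neuron, and 28 - 5 + 1 = 24 positions fit along each axis.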

Page 77: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

http://neuralnetworksanddeeplearning.com/chap6.html

Shared weights and biases: the same weights and bias are used for each of the 24×24 hidden neurons (sigmoid function).

The map from the input layer to the hidden layer is called a feature map.

In the example shown, there are 3 feature maps. If we have 20 feature maps, that's a total of 20×26=520 parameters defining the convolutional layer (each feature map has 5×5=25 shared weights plus one shared bias). By comparison, suppose we had a fully connected first layer, with 784=28×28 input neurons and 30 hidden neurons: that is 784×30 weights plus 30 biases, a total of 23,550 parameters.
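The two parameter counts quoted above follow from simple arithmetic, reproduced here as a short check. The helper names are hypothetical; the numbers are the ones given on the slide.

```python
def conv_layer_params(num_feature_maps, field_size):
    """Each feature map shares field_size x field_size weights plus one bias."""
    return num_feature_maps * (field_size * field_size + 1)

def dense_layer_params(num_inputs, num_outputs):
    """A fully connected layer has one weight per input/output pair plus one bias per output."""
    return num_inputs * num_outputs + num_outputs

conv_total = conv_layer_params(20, 5)
dense_total = dense_layer_params(28 * 28, 30)
print(conv_total)   # 520
print(dense_total)  # 23550
```

Weight sharing therefore cuts the number of parameters of the first layer by a factor of roughly 45 in this example.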

Page 78: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

http://neuralnetworksanddeeplearning.com/chap6.html

The 20 images correspond to 20 different feature maps

Page 79: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

http://neuralnetworksanddeeplearning.com/chap6.html

Pooling layers: what the pooling layers do is simplify the information in the output from the convolutional layer. One common procedure for pooling is known as max-pooling, in which each pooling unit outputs the maximum activation in a 2×2 input region.
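A minimal NumPy sketch of 2×2 max-pooling over non-overlapping regions is shown below. The function name is hypothetical, and the 24×24 input is just a convenient stand-in for one feature map.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the maximum activation of every non-overlapping 2x2 region."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Group the pixels into (h2, 2, w2, 2) blocks and reduce over each 2x2 block.
    blocks = feature_map[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))

fm = np.arange(24 * 24, dtype=float).reshape(24, 24)  # stand-in feature map
pooled = max_pool_2x2(fm)
print(pooled.shape)  # (12, 12)
```

A 24×24 feature map is thus condensed to 12×12 pooling units, each keeping only the strongest response in its region.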

Page 80: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

http://neuralnetworksanddeeplearning.com/chap6.html

Real experiment for DIGIT:

Different results and preprocessing options are analyzed in the chapter. One of them is expanding the training data by displacing each training image by a single pixel: up one pixel, down one pixel, left one pixel, or right one pixel.
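The one-pixel displacement can be sketched as follows. This is an illustrative version, not the code used in the chapter: it assumes images are 2-D NumPy arrays with a zero background, fills the vacated pixels with zeros, and keeps the original image alongside its four shifted copies; the function names are hypothetical.

```python
import numpy as np

def shift(image, dy, dx):
    """Shift an image by (dy, dx) pixels, filling vacated pixels with zeros."""
    h, w = image.shape
    shifted = np.zeros_like(image)
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    shifted[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)] = src
    return shifted

def expand(images, labels):
    """Return every image plus its copies moved up, down, left and right by one pixel."""
    offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    new_images = [shift(img, dy, dx) for img in images for dy, dx in offsets]
    new_labels = [lab for lab in labels for _ in offsets]
    return np.stack(new_images), np.array(new_labels)

# Two hypothetical 28x28 images with a square blob instead of a real digit.
images = np.zeros((2, 28, 28))
images[:, 10:18, 10:18] = 1.0
labels = np.array([3, 7])
big_x, big_y = expand(images, labels)
print(big_x.shape)  # (10, 28, 28)
```

The labels are simply repeated, since moving a digit by one pixel does not change which digit it is.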

Page 81: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

http://neuralnetworksanddeeplearning.com/chap6.html

Final experiment for DIGIT (ensemble with different configurations): 99.67 percent accuracy, that is, 33 of the 10,000 test images are misclassified. The label in the top right is the correct classification, according to the MNIST data, while in the bottom right is the label output by our ensemble of nets:

Page 82: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

On the left, the raw input digits. On the right, graphical representations of the learned features. In essence, the network learns to “see” lines and loops.

Credits: https://www.datarobot.com/blog/a-primer-on-deep-learning/

Page 83: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

Credits: http://www.iro.umontreal.ca/~bengioy/talks/DL-Tutorial-NIPS2015.pdf

Page 84: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer and Convolutional NN

Credits: L. Deng and D. Yu. Deep Learning methods and applications.Foundations and Trends in Signal Processing. Vol. 7, Issues 3-4, 2014, pag. 325

Page 85: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer

http://yann.lecun.com/exdb/mnist/

Page 86: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer

http://yann.lecun.com/exdb/mnist/

Page 87: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer

http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html

Page 88: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer

Page 89: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Digit Recognizer

http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html

Page 90: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

http://www.teglor.com/b/deep-learning-libraries-language-cm569/

Python, Matlab, C++, R, Java, …

Page 91: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

http://www.teglor.com/b/deep-learning-libraries-language-cm569/

Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Google's DeepDream is based on the Caffe framework. This framework is a BSD-licensed C++ library with a Python interface.

Lasagne is a lightweight library to build and train neural networks in Theano. It is governed by the principles of simplicity, transparency, modularity, pragmatism, focus and restraint.

nolearn contains a number of wrappers and abstractions around existing neural network libraries, most notably Lasagne, along with a few machine learning utility modules.

Deeplearning4j describes itself as the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala. It is designed to be used in business environments, rather than as a research tool.

Page 92: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

http://caffe.berkeleyvision.org/

Caffe is a deep learning framework made with expression, speed, and modularity in mind.

It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors.

Google's DeepDream is based on the Caffe framework. This framework is a BSD-licensed C++ library with a Python interface.

SparkNet

https://github.com/amplab/SparkNet

Page 93: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

http://www.kdnuggets.com/2015/12/spark-deep-learning-training-with-sparknet.html
By Matthew Mayo, KDnuggets.

https://github.com/amplab/SparkNet

Page 94: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

https://pypi.python.org/pypi/Theano
http://deeplearning.net/software/theano/

Lasagne: http://lasagne.readthedocs.org/en/latest/index.html

nolearn - Web: https://pythonhosted.org/nolearn/
Including: DBN and CNN.

Page 95: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

TensorFlow
https://www.tensorflow.org/

TensorFlow™ is an open source software library for numerical computation using data flow graphs.

The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.

TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.
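The idea of "numerical computation using data flow graphs" can be illustrated with a toy graph evaluator in plain Python. This is not the TensorFlow API: the Node class and the helper functions below are hypothetical and only show the principle of first building a graph of operations and then evaluating it.

```python
class Node:
    """One vertex of the graph: an operation plus the nodes feeding into it."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    return Node("const", value=value)

def add(a, b):
    return Node("add", (a, b))

def mul(a, b):
    return Node("mul", (a, b))

def evaluate(node):
    """Evaluate a node by first evaluating the nodes it depends on."""
    if node.op == "const":
        return node.value
    args = [evaluate(n) for n in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError("unknown operation: " + node.op)

# Build the graph for y = w * x + b; nothing is computed until evaluate() runs.
x, w, b = constant(3.0), constant(2.0), constant(1.0)
y = add(mul(w, x), b)
print(evaluate(y))  # 7.0
```

Separating the description of the computation from its execution is what lets a graph-based library place different parts of the graph on different CPUs or GPUs.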

Page 96: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

TensorFlow
https://www.tensorflow.org/

Scikit Flow: Easy Deep Learning with TensorFlow and Scikit-learn

https://github.com/tensorflow/skflow

Deep Neural Network
Convolutional NN

http://www.kdnuggets.com/2016/02/scikit-flow-easy-deep-learning-tensorflow-scikit-learn.html

Page 97: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

http://deeplearning4j.org/

Page 98: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

Page 99: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

http://spark-packages.org/user/deeplearning4j

Page 100: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

Page 101: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

deepnet implements some deep learning architectures and neural network algorithms, including BP, RBM, DBN, deep autoencoder and so on.

Page 102: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

Page 103: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning

Deep Learning libraries

In summary

darch: Package for Deep Architectures and Restricted Boltzmann Machines

deepnet: deep learning toolkit in R

autoencoder: Sparse Autoencoder for Automatic Learning of Representative Features from Unlabeled Data

Page 104: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Relevant researchers

The Fathers of Deep Learning

Credits: https://www.datarobot.com/blog/a-primer-on-deep-learning/

Page 105: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Relevant researchers

Credits: http://www.slideshare.net/david.kh/promises-of-deep-learning

Page 106: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Relevant researchers

Pages of the 3 leading researchers:

https://www.cs.toronto.edu/~hinton/
(Geoffrey E. Hinton)
University of Toronto
Google Lab - Toronto

http://www.iro.umontreal.ca/~bengioy/yoshua_en/index.html
(Yoshua Bengio)
Université de Montréal

http://yann.lecun.com/
(Yann LeCun)
Director of AI Research, Facebook

Page 107: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Final Comments: Overview of representation structures for learning and deep learning

In this review article, Bengio and coauthors give a very interesting survey of representation learning for features, which is fundamental to understanding deep learning, analyzing the state of the art and future perspectives.

Page 108: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Readings: Recent Overview

http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html

Page 109: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Readings: Recent Overview

Deep learning, Yann LeCun, Yoshua Bengio & Geoffrey Hinton

Abstract

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html
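The role of backpropagation mentioned in the abstract can be illustrated on a tiny network with one hidden layer. The sketch below is an assumption-laden toy (sigmoid hidden units, linear output, squared error, no biases, random data, hypothetical function names), and it compares one backpropagated gradient entry with a finite-difference estimate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    """Compute each layer's representation from the previous layer's."""
    h = sigmoid(W1 @ x)   # hidden representation
    y = W2 @ h            # linear output
    return h, y

def loss(x, target, W1, W2):
    _, y = forward(x, W1, W2)
    return 0.5 * np.sum((y - target) ** 2)

def backprop(x, target, W1, W2):
    """Propagate the error backwards to get the gradient for every weight."""
    h, y = forward(x, W1, W2)
    delta_out = y - target                           # error at the output layer
    grad_W2 = np.outer(delta_out, h)
    delta_hidden = (W2.T @ delta_out) * h * (1 - h)  # error passed back through the sigmoid
    grad_W1 = np.outer(delta_hidden, x)
    return grad_W1, grad_W2

rng = np.random.default_rng(0)
x, target = rng.normal(size=4), rng.normal(size=2)
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(2, 3))
grad_W1, grad_W2 = backprop(x, target, W1, W2)

# Finite-difference estimate of the same derivative for one weight.
eps = 1e-6
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
numeric = (loss(x, target, W1_plus, W2) - loss(x, target, W1_minus, W2)) / (2 * eps)
print(abs(numeric - grad_W1[0, 0]))
```

The gradients tell the network how to change its internal parameters: subtracting a small multiple of grad_W1 and grad_W2 from the weights is one step of gradient descent.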

Page 110: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Readings: Recent Overview

Deep learning, Yann LeCun, Yoshua Bengio & Geoffrey Hinton

Convolutional neural networks

http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html

Page 111: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Final Comments: The future of deep learning (by LeCun, Bengio and Hinton)

Unsupervised learning [91-98] had a catalytic effect in reviving interest in deep learning, but has since been overshadowed by the successes of purely supervised learning. Although we have not focused on it in this Review, we expect unsupervised learning to become far more important in the longer term. Human and animal learning is largely unsupervised: we discover the structure of the world by observing it, not by being told the name of every object.

Human vision is an active process that sequentially samples the optic array in an intelligent, task-specific way using a small, high-resolution fovea with a large, low-resolution surround. We expect much of the future progress in vision to come from systems that are trained end-to-end and combine ConvNets with RNNs that use reinforcement learning to decide where to look. Systems combining deep learning and reinforcement learning are in their infancy, but they already outperform passive vision systems [99] at classification tasks and produce impressive results in learning to play many different video games [100].

Natural language understanding is another area in which deep learning is poised to make a large impact over the next few years. We expect systems that use RNNs to understand sentences or whole documents will become much better when they learn strategies for selectively attending to one part at a time [76, 86].

Ultimately, major progress in artificial intelligence will come about through systems that combine representation learning with complex reasoning. Although deep learning and simple reasoning have been used for speech and handwriting recognition for a long time, new paradigms are needed to replace rule-based manipulation of symbolic expressions by operations on large vectors [101].

http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html

Page 112: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Final Comments

The Wikipedia entry on Deep Learning gives a quick tour of deep learning, the different neural network models associated with it, and some of its current fields of application.

https://en.wikipedia.org/wiki/Deep_learning

There is a wide variety of Deep Neural Network architectures.

Page 113: Deep Learning - UGR · Deep Learning Human information processing mechanisms (e.g., vision and audition) suggest the need of deep architectures for extracti l d b ildi i ling complex

Deep Learning Final Comments

The Wikipedia entry on Deep Learning gives a quick tour of deep learning, the different neural network models associated with it, and some of its current fields of application.

https://en.wikipedia.org/wiki/Deep_learning