Unsupervised feature learning for audio classification using convolutional deep belief networks



Honglak Lee, Yan Largman, Peter Pham and Andrew Y. Ng

Presented by Bo Chen, 5.7, 2010

Outline

• 1. What’s Deep Learning?

• 2. Why use Deep Learning?

• 3. Foundations of Deep Learning

• 4. Convolutional Deep Belief Networks

• 5. Results

Deep Architecture

• Deep architectures: compositions of many layers of adaptive non-linear components.

Difficulty: parameter search is hard (training can get stuck in poor local minima).

• Deep belief nets: probabilistic generative models that are composed of multiple layers of stochastic, latent variables. (Hinton et al., 2006)

Deep Learning Wiki

Why Use Deep Learning

• Insufficient depth can hurt. In our experience, a one-layer machine learns only a set of generic dictionary elements unless a very large number of dictionary elements is used.

• The brain has a deep architecture

• Cognitive processes seem deep

• Learn feature hierarchies, or complicated functions, that can represent high-level abstractions

For example: Pixels → Edgelets → Motifs → Parts → Objects → Scenes

Some material from Yoshua Bengio’s course notes and Yann LeCun et al., 2010

One-layer dictionary

30 16x16 dictionary elements and reconstructed images

250 16x16 dictionary elements and reconstructed images

Restricted Boltzmann Machine

Figure from R. Salakhutdinov et al.

Energy functions for the binary-valued and real-valued cases
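The equations on this slide did not survive extraction. As a hedged reminder, the energy functions commonly written for an RBM look as follows; the symbols (v for visible units, h for hidden units, W for weights, b and c for hidden and visible biases) and the unit-variance assumption in the real-valued case are assumptions here, not taken from the slide.

```latex
% Binary-valued visible units
E(\mathbf{v}, \mathbf{h}) = -\sum_{i,j} v_i W_{ij} h_j - \sum_j b_j h_j - \sum_i c_i v_i

% Real-valued visible units (assumed unit-variance Gaussian)
E(\mathbf{v}, \mathbf{h}) = \frac{1}{2}\sum_i v_i^2 - \sum_{i,j} v_i W_{ij} h_j - \sum_j b_j h_j - \sum_i c_i v_i
```

The two forms differ only in the quadratic term on the visible units; the interaction and bias terms are the same.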

Contrastive divergence is used to train the RBM, since exact maximum-likelihood learning is intractable. (Hinton et al., 2006)
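To make the training step concrete, here is a minimal sketch of one contrastive divergence update with a single Gibbs step (CD-1) for a binary RBM. The function name `cd1_step`, the learning rate, and the variable names are illustrative assumptions, not the presenter's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1, rng=None):
    """One CD-1 update on a batch of binary data.

    v0: (n_samples, n_visible) data batch
    W:  (n_visible, n_hidden) weights
    b:  (n_hidden,) hidden biases; c: (n_visible,) visible biases
    Returns updated copies of (W, b, c).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Positive phase: hidden probabilities and a binary sample given the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to the visibles and up again.
    pv1 = sigmoid(h0 @ W.T + c)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)
    # Approximate gradient: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b = b + lr * (ph0 - ph1).mean(axis=0)
    c = c + lr * (v0 - v1).mean(axis=0)
    return W, b, c
```

In practice this step would be repeated over many mini-batches; the sketch omits momentum, weight decay, and sparsity terms.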

Deep Architectures

The RBMs in different layers can be trained independently, one layer at a time (greedy layer-wise training).
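The layer-by-layer idea can be sketched as follows: train one RBM, freeze it, and use its hidden activation probabilities as the data for the next RBM. This is a simplified illustration under assumed names (`train_rbm`, `train_stack`) with full-batch CD-1 and mean-field reconstructions, not the training setup used in the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.1, rng=None):
    """Train a binary RBM with full-batch CD-1; returns (W, hidden_bias)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b = np.zeros(n_hidden)   # hidden biases
    c = np.zeros(n_visible)  # visible biases
    for _ in range(epochs):
        ph0 = sigmoid(data @ W + b)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        v1 = sigmoid(h0 @ W.T + c)  # mean-field reconstruction
        ph1 = sigmoid(v1 @ W + b)
        W += lr * (data.T @ ph0 - v1.T @ ph1) / len(data)
        b += lr * (ph0 - ph1).mean(axis=0)
        c += lr * (data - v1).mean(axis=0)
    return W, b

def train_stack(data, layer_sizes, **kwargs):
    """Greedy layer-wise training: each RBM sees the previous layer's output."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b = train_rbm(x, n_hidden, **kwargs)
        layers.append((W, b))
        x = sigmoid(x @ W + b)  # the frozen layer's activations feed the next RBM
    return layers
```

For example, `train_stack(data, [8, 4])` would train a first RBM with 8 hidden units on `data` and a second RBM with 4 hidden units on the first layer's activations.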

Convolutional Network Architecture

Figure from Yann LeCun et al., 1998

Intuitively, in each layer the weight matrix captures the most consistent ‘structures’ across all of the images.

3-dimensional Dictionary elements in the second layer

The dictionary element in the second layer is a 3-dimensional matrix.

D: the first-layer dictionary element

E: the second-layer dictionary element

S: the convolution of the image and the first-layer elements
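A small sketch may help show why the second-layer element is 3-dimensional: S has one map per first-layer element, so E needs one 2-D slice per map, and the slice responses are summed. The helper names and shapes below are assumptions for illustration, and the filter is applied without flipping (cross-correlation), which is a common simplification.

```python
import numpy as np

def filter2d_valid(image, kernel):
    """'Valid' 2-D filtering without kernel flipping (cross-correlation)."""
    H, W = image.shape
    h, w = kernel.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

def first_layer_maps(image, D):
    """D: (K, h, w) first-layer elements -> S: (K, H-h+1, W-w+1)."""
    return np.stack([filter2d_valid(image, d) for d in D])

def second_layer_map(S, E):
    """E: (K, h2, w2), a single 3-D second-layer element.

    Each slice E[k] filters the matching map S[k]; the K results are summed.
    """
    return sum(filter2d_valid(S[k], E[k]) for k in range(S.shape[0]))
```

With a 10x10 image, three 3x3 first-layer elements, and one 3x2x2 second-layer element, S has shape (3, 8, 8) and the second-layer response has shape (7, 7).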

Convolutional RBM with Probabilistic Max-Pooling Layer

Max-pooling Layer
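As a rough illustration of probabilistic max-pooling: the detection units in each C x C block compete through a softmax that includes one extra "all off" state, so at most one unit per block is on and the pooling unit is on exactly when one of them is. The function name `prob_max_pool` and the single-group, 2-D input are assumptions made for this sketch.

```python
import numpy as np

def prob_max_pool(I, C):
    """Probabilistic max-pooling for one group of detection units.

    I: (H, W) bottom-up input to the detection units; H and W divisible by C.
    Returns (p_detect, p_pool_on) with shapes (H, W) and (H//C, W//C).
    """
    H, W = I.shape
    # Gather each C x C block into the last axis: shape (H//C, W//C, C*C).
    blocks = (I.reshape(H // C, C, W // C, C)
               .transpose(0, 2, 1, 3)
               .reshape(H // C, W // C, C * C))
    # Softmax over the block's units plus an 'off' state with input 0,
    # shifted by m for numerical stability.
    m = np.maximum(blocks.max(axis=-1, keepdims=True), 0.0)
    e = np.exp(blocks - m)
    off = np.exp(-m)
    denom = off + e.sum(axis=-1, keepdims=True)
    p = e / denom
    # The pooling unit is on unless the block is in the 'off' state.
    p_pool_on = 1.0 - (off / denom)[..., 0]
    # Scatter the per-block probabilities back to the original layout.
    p_detect = (p.reshape(H // C, W // C, C, C)
                 .transpose(0, 2, 1, 3)
                 .reshape(H, W))
    return p_detect, p_pool_on
```

With an all-zero 4x4 input and C = 2, each block has four units plus the off state at equal weight, so every detection unit has probability 0.2 and every pooling unit has probability 0.8.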

Convolutional Deep Belief Networks

Weight matrix connecting pooling unit Pk to detection unit H’l.

Results on Natural Images

Results on Caltech101 Images
