
Building High-Level Features Using Large-Scale Unsupervised Learning

Anh Nguyen, Bay-yuan Hsu
CS290D – Data Mining (Spring 2014)
University of California, Santa Barbara

Slides adapted from Andrew Ng (Stanford) and Nando de Freitas (UBC)

Agenda

1. Motivation
2. Approach
   1. Sparse Deep Auto-encoder
   2. Local Receptive Field
   3. L2 Pooling
   4. Local Contrast Normalization
   5. Overall Model
3. Parallelism
4. Evaluation
5. Discussion

1. Motivation


Motivation

• Feature learning
  • Supervised learning
    • Needs a large number of labeled examples
  • Unsupervised learning
    • Example: build a face detector without having any labeled face images
• Goal: building high-level features using unlabeled data

Motivation

• Previous work
  • Auto-encoders
  • Sparse coding
  • Result: learned only low-level features
  • Reason: computational constraints
• This paper's approach: scale up the dataset, the model, and the computational resources

2. Approach


Sparse Deep Auto-encoder

• Auto-encoder
  • Neural network
  • Unsupervised learning
  • Trained with back-propagation

Sparse Deep Auto-encoder (cont'd)

• Sparse Coding
  • Input: images x(1), x(2), ..., x(m)
  • Learn: bases (features) f1, f2, ..., fk, so that each input x can be approximately decomposed as x ≈ Σj aj fj, where the coefficients aj are mostly zero ("sparse")
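As a concrete illustration, here is a minimal sparse-coding sketch using scikit-learn's dictionary learner; the random "patches" and all hyperparameters are placeholders, not the paper's setup.

    # Minimal sparse-coding sketch: learn bases f_j and sparse codes a_j.
    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.default_rng(0)
    patches = rng.standard_normal((500, 64))   # 500 flattened 8x8 patches

    learner = MiniBatchDictionaryLearning(n_components=32, alpha=1.0,
                                          random_state=0)
    codes = learner.fit_transform(patches)     # coefficients a_j, mostly zero
    bases = learner.components_                # learned bases f_j

    # Each patch is approximately reconstructed as codes @ bases = sum_j a_j f_j
    print(codes.shape, bases.shape)            # (500, 32) (32, 64)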


Sparse Deep Auto-encoder (cont'd)

• Sparse Coding
  • A regularizer on the coefficients enforces sparsity

Sparse Deep Auto-encoder (cont'd)

• Sparse Deep Auto-encoder
  • Multiple hidden layers, used to achieve particular characteristics in the learned features
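A minimal PyTorch sketch of the idea, assuming an L1 activation penalty as the sparsity regularizer (an assumption of this writeup; the paper's actual model is far larger and locally connected):

    # Sparse auto-encoder sketch: reconstruct the input while penalizing
    # hidden activations (L1) so that features activate sparsely.
    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        def __init__(self, n_in=784, n_hidden=256):
            super().__init__()
            self.encode = nn.Sequential(nn.Linear(n_in, n_hidden), nn.Sigmoid())
            self.decode = nn.Linear(n_hidden, n_in)

        def forward(self, x):
            h = self.encode(x)
            return self.decode(h), h

    model = SparseAutoencoder()
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.rand(64, 784)                    # dummy input batch
    lam = 1e-3                                 # sparsity weight (illustrative)

    for step in range(10):
        x_hat, h = model(x)
        loss = ((x_hat - x) ** 2).mean() + lam * h.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()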

Local Receptive Field

• Definition: each feature in the auto-encoder connects only to a small region of the lower layer
• Goals:
  • Learn features efficiently
  • Enable parallelism
• Training on small image patches
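A small numpy sketch of the definition above: one group of features sees only an 18×18 patch of the frame (sizes follow the slides; the random weights stand in for learned, untied filters).

    # Local receptive field sketch: 8 features connect to one 18x18 patch
    # only, not to the whole 200x200 frame. Weights are random stand-ins.
    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.random((200, 200))
    rf = 18                                    # receptive field size (slides)
    W = rng.standard_normal((8, rf * rf))      # 8 neurons per patch, untied

    r, c = 40, 60                              # top-left corner of one patch
    patch = image[r:r + rf, c:c + rf].reshape(-1)
    features = W @ patch                       # 8 responses for this location
    print(features.shape)                      # (8,)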

L2 Pooling

• Goal: robustness to local distortions
• Approach: group similar features together to achieve invariance
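Concretely, an L2-pooled unit outputs the square root of the summed squares of the units in its group. A short numpy sketch over non-overlapping blocks (the block size is illustrative):

    # L2 pooling sketch: each pooled output is sqrt(sum of squares) of a
    # group of inputs, which gives some invariance to local distortions.
    import numpy as np

    def l2_pool(h, group=5):
        n = h.shape[0] - h.shape[0] % group    # trim to a multiple of group
        b = h[:n, :n].reshape(n // group, group, n // group, group)
        return np.sqrt((b ** 2).sum(axis=(1, 3)))

    h = np.random.default_rng(0).standard_normal((20, 20))
    print(l2_pool(h).shape)                    # (4, 4)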


Local Contrast Normalization

• Goal: robustness to variation in light intensity
• Approach: normalize local contrast
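A sketch of the normalization step, assuming a simple box filter for the local mean and variance (the real model uses a small weighted neighborhood, 5×5 per the slides):

    # Local contrast normalization sketch: subtract a local mean, then
    # divide by a local standard deviation (floored to avoid blow-up).
    import numpy as np
    from scipy.ndimage import uniform_filter

    def lcn(img, size=5, eps=1e-2):
        centered = img - uniform_filter(img, size)
        local_std = np.sqrt(uniform_filter(centered ** 2, size))
        return centered / np.maximum(local_std, eps)

    img = np.random.default_rng(0).random((200, 200))
    out = lcn(img)
    print(out.mean(), out.std())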


Overall Model

• 3 layers, each with three sublayers:
  • Simple (filtering): 18×18 px receptive fields, 8 neurons per patch
  • Complex (L2 pooling): 5×5 px
  • LCN: 5×5 px
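One layer of this stack can be sketched in PyTorch as below. Two hedges: Conv2d shares weights across locations, whereas the paper's filters are untied, and LocalResponseNorm is only a stand-in for true local contrast normalization.

    # One layer = simple (filtering) -> complex (L2 pooling) -> normalization.
    import torch
    import torch.nn as nn

    layer = nn.Sequential(
        nn.Conv2d(1, 8, kernel_size=18),          # "simple": 18x18 fields, 8 maps
        nn.LPPool2d(norm_type=2, kernel_size=5),  # "complex": L2 pooling, 5x5
        nn.LocalResponseNorm(2),                  # stand-in for 5x5 LCN
    )

    x = torch.rand(1, 1, 200, 200)                # one 200x200 frame
    print(layer(x).shape)                         # torch.Size([1, 8, 36, 36])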


Overall Model

• Training:
  • Reconstruct the input of each layer
  • Optimization function (sketched below)
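For reference, the objective in Le et al. (reference 1) couples a reconstruction term with the pooled-sparsity term; roughly, with W_e/W_d the encoding/decoding weights, H_j the fixed L2-pooling weights, and λ the trade-off:

    \min_{W_d,\, W_e} \sum_{i=1}^{m} \Big( \big\lVert W_d W_e x^{(i)} - x^{(i)} \big\rVert_2^2
        + \lambda \sum_{j=1}^{k} \sqrt{\epsilon + H_j \big( W_e x^{(i)} \big)^2} \Big)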

Overall Model

• Complex model?


3. Parallelism


Asynchronous SGD

Two recent lines of research in speeding up large learning problems:

• Parallel/distributed computing
• Online (and mini-batch) learning algorithms: stochastic gradient descent, perceptron, MIRA, stepwise EM

How can we bring together the benefits of parallel computing and online learning?

Asynchronous SGD

SGD (Stochastic Gradient Descent):

• Choose an initial parameter vector W and learning rate α
• Repeat until an approximate minimum is obtained:
  • Randomly shuffle the examples in the training set
  • For each example i, update W ← W − α ∇Qi(W)
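A runnable toy version of these steps on a least-squares problem (the data, loss, and hyperparameters are illustrative):

    # SGD sketch: shuffle the examples, then step W along the negative
    # gradient of one example's loss at a time.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 3))
    y = X @ np.array([1.0, -2.0, 0.5])

    W = np.zeros(3)                            # initial parameter vector
    alpha = 0.1                                # learning rate

    for epoch in range(20):
        for i in rng.permutation(len(X)):      # randomly shuffle examples
            grad = 2 * (X[i] @ W - y[i]) * X[i]
            W -= alpha * grad                  # W <- W - alpha * grad_i
    print(W)                                   # ~ [1.0, -2.0, 0.5]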

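Asynchronous SGD, in a toy single-process form: several workers apply lock-free updates to a shared parameter vector, in the spirit of the distributed parameter-server setup the paper relies on (the data, rates, and thread counts here are illustrative).

    # Asynchronous SGD sketch: workers update shared W without locking.
    # A real system distributes this across machines with a parameter server.
    import threading
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 3))
    y = X @ np.array([1.0, -2.0, 0.5])

    W = np.zeros(3)                            # shared parameters
    alpha = 0.05

    def worker(seed):
        r = np.random.default_rng(seed)
        for _ in range(2000):
            i = r.integers(len(X))
            grad = 2 * (X[i] @ W - y[i]) * X[i]
            W[:] = W - alpha * grad            # racy, lock-free update

    threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(W)                                   # close to [1.0, -2.0, 0.5]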

Model Parallelism

• Weights are divided according to locality in the image and stored on different machines
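The effect can be sketched in numpy by partitioning a layer's weight matrix and computing each slice independently. Note the paper partitions by image locality; this toy splits by output block for simplicity.

    # Model parallelism sketch: split W across 4 "machines", compute each
    # slice independently, then gather. Matches the single-machine product.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.random(100 * 100)                  # flattened input
    W = rng.standard_normal((64, x.size))      # full weight matrix

    parts = np.array_split(W, 4, axis=0)       # one slice per machine
    partial = [Wp @ x for Wp in parts]         # computed in parallel in practice
    h = np.concatenate(partial)

    assert np.allclose(h, W @ x)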

4. Evaluation

Evaluation

• 10M unlabeled YouTube frames of size 200×200
• 1B parameters
• 1,000 machines
• 16,000 cores

Experiment on Faces

• Test set
  • 37,000 images, of which 13,026 are face images
• Report the accuracy of the best single neuron

Experiment on Faces (cont'd)

• Visualization
  • Top stimuli (images) for the face neuron
  • Optimal stimulus for the face neuron

Experiment on Faces (cont'd)

• Invariance properties


Experiment on Cat/Human Body

• Test set
  • Cat: 10,000 positive, 18,409 negative
  • Human body: 13,026 positive, 23,974 negative
• Accuracy

ImageNet classification

• Recognizing images
• Dataset
  • 20,000 categories
  • 14M images
• Accuracy
  • 15.8%, versus 9.3% for the previous state of the art

5. Discussion

Discussion

• Deep learning
  • Unsupervised feature learning
  • Learning multiple layers of representation
• Accuracy increased by invariance (pooling) and contrast normalization
• Scalability

6. References

References

1. Quoc Le et al., “Building High-level Features using Large Scale Unsupervised Learning”
2. Nando de Freitas, “Deep Learning”, https://www.youtube.com/watch?v=g4ZmJJWR34Q
3. Andrew Ng, “Sparse autoencoder”, http://www.stanford.edu/class/archive/cs/cs294a/cs294a.1104/sparseAutoencoder.pdf
4. Andrew Ng, “Machine Learning and AI via Brain Simulations”, https://forum.stanford.edu/events/2011slides/plenary/2011plenaryNg.pdf
5. Andrew Ng, “Deep Learning”, http://www.ipam.ucla.edu/publications/gss2012/gss2012_10595.pdf
