![Page 1: Building high-level features using large-scale unsupervised learning Anh Nguyen, Bay-yuan Hsu CS290D – Data Mining (Spring 2014) University of California,](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56649ee85503460f94bf98b9/html5/thumbnails/1.jpg)
Building high-level features using large-scale unsupervised learning
Anh Nguyen, Bay-yuan Hsu
CS290D – Data Mining (Spring 2014)
University of California, Santa Barbara
Slide adapted from Andrew Ng (Stanford), Nando de Freitas (UBC)
Agenda
1. Motivation
2. Approach
   1. Sparse Deep Auto-encoder
   2. Local Receptive Field
   3. L2 Pooling
   4. Local contrast normalization
   5. Overall Model
3. Parallelism
4. Evaluation
5. Discussion
1. Motivation
Motivation
• Feature learning
  • Supervised learning
    • Needs a large amount of labeled data
  • Unsupervised learning
    • Example: Build a face detector without having labeled face images
    • Building high-level features using unlabeled data
Motivation
• Previous works
  • Auto-encoder
  • Sparse coding
  • Result: Only learns low-level features
  • Reason: Computational constraints
• Approach
  • Dataset
  • Model
  • Computational resources
2. Approach
Sparse Deep Auto-encoder
• Auto-encoder
  • Neural network
  • Unsupervised learning
  • Back-propagation
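To make the three bullets concrete, here is a minimal sketch of an auto-encoder: a small neural network trained by back-propagation to reproduce its own input, so no labels are needed. The layer sizes, the tanh activation, the learning rate, and the random toy data are all assumptions chosen for illustration; they are not taken from the slides.

```python
import numpy as np

def train_autoencoder(X, n_hidden=3, lr=0.1, epochs=300, seed=0):
    """Train a one-hidden-layer auto-encoder by full-batch gradient descent.

    The network is trained to reproduce its own input, so no labels are used.
    Returns (W1, W2, history), where history holds the mean squared
    reconstruction error measured before each update.
    """
    rng = np.random.default_rng(seed)
    m, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden))  # encoder weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d))  # decoder weights
    b2 = np.zeros(d)
    history = []
    for _ in range(epochs):
        # Forward pass: encode, then decode.
        H = np.tanh(X @ W1 + b1)
        X_hat = H @ W2 + b2
        err = X_hat - X
        history.append(float(np.mean(err ** 2)))
        # Backward pass for the loss (1 / 2m) * sum ||x_hat - x||^2.
        G = err / m
        grad_W2 = H.T @ G
        grad_b2 = G.sum(axis=0)
        dZ = (G @ W2.T) * (1.0 - H ** 2)  # chain rule through tanh
        grad_W1 = X.T @ dZ
        grad_b1 = dZ.sum(axis=0)
        W1 -= lr * grad_W1
        b1 -= lr * grad_b1
        W2 -= lr * grad_W2
        b2 -= lr * grad_b2
    return W1, W2, history

rng = np.random.default_rng(1)
X = 0.5 * rng.normal(size=(50, 6))   # toy data standing in for image patches
W1, W2, history = train_autoencoder(X)
print(history[0], history[-1])       # reconstruction error before and after training
```

The hidden layer is narrower than the input, so the network has to compress the data; the hidden activations are the learned features.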
Sparse Deep Auto-encoder (cnt’d)
• Sparse Coding
  • Input: Images $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$
  • Learn: Bases (features) $f_1, f_2, \ldots, f_k$, so that each input $x$ can be approximately decomposed as $x \approx \sum_{j} a_j f_j$, s.t. the $a_j$'s are mostly zero (“sparse”)
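As a rough illustration of the decomposition above, the sketch below finds sparse coefficients a for a fixed set of bases by minimizing a squared reconstruction error plus an L1 penalty. The solver (ISTA, iterative soft-thresholding), the penalty weight `lam`, and the random bases are assumptions made for this example; the slides do not say which solver or penalty was used, and learning the bases themselves is not shown.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink every entry of v toward zero by t, clipping small entries to exactly zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(x, F, lam=0.1, n_iter=200):
    """ISTA for: minimize 0.5 * ||x - F a||^2 + lam * ||a||_1 over a.

    The columns of F are the bases f_j; the returned vector holds the a_j.
    """
    L = np.linalg.norm(F, 2) ** 2        # Lipschitz constant of the smooth part
    a = np.zeros(F.shape[1])
    for _ in range(n_iter):
        grad = F.T @ (F @ a - x)         # gradient of the reconstruction term
        a = soft_threshold(a - grad / L, lam / L)
    return a

rng = np.random.default_rng(0)
F = rng.normal(size=(8, 16))
F /= np.linalg.norm(F, axis=0)           # unit-norm bases
x = 1.5 * F[:, 3] - 2.0 * F[:, 10]       # input built from two bases only
a = sparse_code(x, F)
print(np.count_nonzero(np.abs(a) > 1e-6), "of", a.size, "coefficients are non-zero")
```

The soft-threshold step is what pushes most coefficients to exactly zero, which is the "sparse" property the slide refers to.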
Sparse Deep Auto-encoder (cnt’d)
• Sparse Coding
  • Regularizer
Sparse Deep Auto-encoder (cnt’d)
• Sparse Deep Auto-encoder
  • Multiple hidden layers to achieve particular characteristics in the learned features
Local Receptive Field
• Definition: Each feature in the autoencoder can connect only to a small region of the lower layer
• Goal:
  • Learn features efficiently
  • Parallelism
  • Training on small image patches
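The definition above can be sketched as follows: every output unit reads only a small window of the input instead of the whole image. The function name, the window size, the stride, and the choice to give each position its own (untied) filter are assumptions for illustration.

```python
import numpy as np

def local_responses(image, weights, stride=1):
    """Responses of units that each see only a small window of the image.

    weights[i, j] is the (field, field) filter of the unit at output
    position (i, j); filters are not shared between positions here.
    """
    n_rows, n_cols, field, _ = weights.shape
    out = np.empty((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            r, c = i * stride, j * stride
            patch = image[r:r + field, c:c + field]   # the unit's receptive field
            out[i, j] = np.sum(patch * weights[i, j])
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(6, 6))
weights = rng.normal(size=(4, 4, 3, 3))   # 4x4 units, each with a 3x3 field
responses = local_responses(image, weights)

# Parameter count: local connections versus connecting every unit to every pixel.
print("local:", weights.size, "fully connected:", 16 * image.size)
```

Because each unit depends on a small patch only, units covering different regions can be computed and trained on different machines, which is the link to the parallelism goal.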
L2 Pooling
• Goal: Robust to local distortion
• Approach: Group similar features together to achieve invariance
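A small sketch of L2 pooling: each output is the square root of the sum of squared feature values in a group. Non-overlapping square groups and the group size are simplifying assumptions for this example.

```python
import numpy as np

def l2_pool(features, size):
    """Square root of the sum of squares over non-overlapping size x size blocks."""
    h, w = features.shape
    h2, w2 = h // size, w // size
    blocks = features[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
    return np.sqrt((blocks ** 2).sum(axis=(1, 3)))

# The same feature, shifted by one pixel inside a pooling block.
a = np.zeros((4, 4))
a[0, 0] = 2.0
b = np.zeros((4, 4))
b[1, 1] = 2.0
print(l2_pool(a, 2))
print(l2_pool(b, 2))
```

Both inputs give the same pooled output, which is the sense in which pooling makes the representation robust to small local shifts.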
Local Contrast Normalization
• Goal: Robust to variation in light intensity
• Approach: Normalize contrast
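One simple way to normalize contrast is sketched below: subtract the mean of a local window from each pixel, then divide by the standard deviation of that window. The uniform (unweighted) window, its size, the reflect padding, and the floor `eps` are assumptions for this example and may differ from the normalization used in the actual model.

```python
import numpy as np

def local_contrast_normalize(img, size=5, eps=1e-2):
    """Subtract the local mean and divide by the local standard deviation."""
    img = np.asarray(img, dtype=float)
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            window = padded[i:i + size, j:j + size]   # neighbourhood centred on (i, j)
            out[i, j] = (img[i, j] - window.mean()) / max(window.std(), eps)
    return out

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))
brighter = 3.0 * img + 10.0               # same scene, different lighting
same = np.allclose(local_contrast_normalize(img), local_contrast_normalize(brighter))
print("unchanged under a brightness/contrast change:", same)
```

The floor `eps` keeps flat regions from being divided by a value close to zero.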
Overall Model
• 3 layers
  • Simple: 18x18 px
    • 8 neurons/patch
  • Complex: 5x5 px
  • LCN: 5x5 px
Overall Model
• Train:
  • Reconstruct input of each layer
  • Optimization function
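The formula itself did not survive on this slide. As a sketch, the objective in Le et al. (reference 1) has the following form; the reading of the symbols given here is an assumption and should be checked against the paper: $W_1$ are encoding weights, $W_2$ decoding weights, $H_j$ the fixed pooling weights of pooling unit $j$, $\lambda$ the trade-off between reconstruction and sparsity, and $\epsilon$ a small constant.

```latex
\min_{W_1, W_2} \; \sum_{i=1}^{m} \left(
  \left\| W_2 W_1^{\top} x^{(i)} - x^{(i)} \right\|_2^2
  + \lambda \sum_{j=1}^{k} \sqrt{\epsilon + H_j \left( W_1^{\top} x^{(i)} \right)^2}
\right)
```

The first term asks each layer to reconstruct its input; the second term penalizes the pooled (L2) responses, which encourages sparse pooling features.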
Overall Model
• Complex model?
3. Parallelism
Asynchronous SGD
Two recent lines of research in speeding up large learning problems:
• Parallel/distributed computing
• Online (and mini-batch) learning algorithms: stochastic gradient descent, perceptron, MIRA, stepwise EM
How can we bring together the benefits of parallel computing and online learning?
Asynchronous SGD
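SGD (Stochastic Gradient Descent):
• Choose an initial vector of parameters W and learning rate α
• Repeat until an approximate minimum is obtained:
  • Randomly shuffle examples in the training set
  • For each example in turn, update W by a step of size α against the gradient of the loss on that example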
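The procedure above can be sketched on a toy problem. The least-squares loss, the synthetic data, and the fixed number of passes (used here in place of a convergence check) are assumptions for illustration; this is plain sequential SGD, not the asynchronous variant.

```python
import numpy as np

def sgd(X, y, alpha=0.05, epochs=50, seed=0):
    """Stochastic gradient descent for the per-example loss 0.5 * (x . w - y)^2."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])                 # initial parameter vector
    for _ in range(epochs):                  # fixed number of passes over the data
        for i in rng.permutation(len(y)):    # randomly shuffle the examples
            grad = (X[i] @ w - y[i]) * X[i]  # gradient on example i only
            w -= alpha * grad                # step of size alpha against it
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                               # noiseless targets
w = sgd(X, y)
print(w)                                     # should be close to w_true
```

Each update touches a single example, which is what makes the method cheap per step and a natural candidate for running many workers in parallel.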
Model Parallelism
• Weights are divided according to image locality and stored on different machines
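A toy sketch of this idea follows: the image is cut into strips, and each "machine" holds only the weights that read its own strip. The machines are simulated by a loop in one process, and the function name, the vertical-strip layout, and the shard shapes are hypothetical choices for illustration.

```python
import numpy as np

def forward_partitioned(image, weight_shards):
    """Compute first-layer responses with the weights sharded by image region.

    weight_shards[k] has shape (n_units, pixels_in_strip_k) and would live on
    machine k; here the machines are simulated by a plain loop.
    """
    strips = np.array_split(image, len(weight_shards), axis=1)
    outputs = []
    for strip, W_k in zip(strips, weight_shards):
        # Machine k needs only its own strip of the image and its own weights.
        outputs.append(W_k @ strip.ravel())
    return np.concatenate(outputs)

image = np.arange(32, dtype=float).reshape(4, 8)
shards = [np.ones((3, 16)), np.ones((3, 16))]   # two simulated machines
out = forward_partitioned(image, shards)
print(out)
```

This split works because of local receptive fields: a unit that looks at the left part of the image never needs the weights or pixels held by the machine responsible for the right part.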
4. Evaluation
Evaluation
• 10M unlabeled YouTube frames of size 200x200
• 1B parameters
• 1,000 machines
• 16,000 cores
Experiment on Faces
• Test set
  • 37,000 images
  • 13,026 face images
• Best neuron
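One way to pick a "best neuron" is sketched below: treat each neuron's activation as a face/non-face score, try a set of thresholds, and keep the neuron and threshold with the highest classification accuracy on the test set. The use of 20 equally spaced thresholds between each neuron's minimum and maximum activation is an assumption here, as are the function name and the toy data.

```python
import numpy as np

def best_neuron(activations, labels, n_thresholds=20):
    """Return (neuron index, threshold, accuracy) of the best single-neuron classifier.

    activations: (n_images, n_neurons) array of neuron outputs.
    labels: boolean array of length n_images, True for face images.
    """
    best = (-1, None, 0.0)
    for n in range(activations.shape[1]):
        a = activations[:, n]
        for t in np.linspace(a.min(), a.max(), n_thresholds):
            acc = float(np.mean((a > t) == labels))   # predict "face" when a > t
            if acc > best[2]:
                best = (n, float(t), acc)
    return best

# Toy data: neuron 0 separates the classes, neuron 1 does not.
activations = np.array([[0.1, 0.5],
                        [0.2, 0.4],
                        [0.8, 0.5],
                        [0.9, 0.4]])
labels = np.array([False, False, True, True])
index, threshold, accuracy = best_neuron(activations, labels)
print(index, threshold, accuracy)
```

No labels are used during training; they enter only here, to measure how well a single learned feature happens to separate the classes.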
Experiment on Faces (cnt’d)
• Visualization
  • Top stimuli (images) for face neuron
  • Optimal stimulus for face neuron
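The "optimal stimulus" can be read as the input that maximizes a neuron's activation under a norm constraint. The sketch below does this by projected gradient ascent on the unit sphere with a finite-difference gradient. The unit-norm constraint, the optimizer, and the toy linear neuron are assumptions for illustration; the real face neuron is a deep network, not a dot product.

```python
import numpy as np

def numerical_grad(f, x, h=1e-5):
    """Central finite-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def optimal_stimulus(f, dim, lr=0.5, n_iter=200, seed=0):
    """Search for the unit-norm input that maximizes f."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=dim)
    x /= np.linalg.norm(x)
    for _ in range(n_iter):
        x = x + lr * numerical_grad(f, x)   # ascent step
        x /= np.linalg.norm(x)              # project back onto the unit sphere
    return x

w = np.array([3.0, 4.0])
neuron = lambda x: float(w @ x)             # toy linear "neuron"
x_star = optimal_stimulus(neuron, dim=2)
print(x_star)                               # for a linear neuron this is w / ||w||
```

The top stimuli, by contrast, need no optimization: they are simply the test images with the largest activations.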
Experiment on Faces (cnt’d)
• Invariance properties
Experiment on Cat/Human body
• Test set
  • Cat: 10,000 positive, 18,409 negative
  • Human body: 13,026 positive, 23,974 negative
• Accuracy
ImageNet classification
• Recognizing images
• Dataset
  • 20,000 categories
  • 14M images
• Accuracy
  • 15.8%
  • Previous state of the art: 9.3%
5. DISCUSSION
Discussion
• Deep learning
  • Unsupervised feature learning
  • Learning multiple layers of representation
• Increase accuracy: Invariance, contrast normalization
• Scalability
6. REFERENCES
References
1. Quoc Le et al., “Building High-level Features using Large Scale Unsupervised Learning”
2. Nando de Freitas, “Deep Learning”, URL: https://www.youtube.com/watch?v=g4ZmJJWR34Q
3. Andrew Ng, “Sparse autoencoder”, URL: http://www.stanford.edu/class/archive/cs/cs294a/cs294a.1104/sparseAutoencoder.pdf
4. Andrew Ng, “Machine Learning and AI via Brain Simulations”, URL: https://forum.stanford.edu/events/2011slides/plenary/2011plenaryNg.pdf
5. Andrew Ng, “Deep Learning”, URL: http://www.ipam.ucla.edu/publications/gss2012/gss2012_10595.pdf