Building high-level features using large-scale unsupervised learning
Anh Nguyen, Bay-yuan Hsu
CS290D – Data Mining (Spring 2014)
University of California, Santa Barbara
Slides adapted from Andrew Ng (Stanford) and Nando de Freitas (UBC)


Agenda
1. Motivation
2. Approach
   1. Sparse Deep Auto-encoder
   2. Local Receptive Field
   3. L2 Pooling
   4. Local Contrast Normalization
   5. Overall Model
3. Parallelism
4. Evaluation
5. Discussion


1. Motivation


Motivation

• Feature learning
  • Supervised learning
    • Needs a large amount of labeled data
  • Unsupervised learning
    • Example: build a face detector without labeled face images
• Building high-level features from unlabeled data


Motivation

• Previous work
  • Auto-encoder
  • Sparse coding
• Result: only low-level features are learned
• Reason: computational constraints
• Approach: scale up
  • Dataset
  • Model
  • Computational resources


2. Approach


Sparse Deep Auto-encoder

• Auto-encoder
  • Neural network
  • Unsupervised learning
  • Back-propagation


Sparse Deep Auto-encoder (cont'd)
• Sparse coding
  • Input: images $x^{(1)}, x^{(2)}, \ldots, x^{(m)}$
  • Learn: bases (features) $f_1, f_2, \ldots, f_k$ such that each input $x$ can be approximately decomposed as $x \approx \sum_{j=1}^{k} a_j f_j$, where the $a_j$ are mostly zero ("sparse")
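The decomposition above can be made concrete with a small sketch. It assumes the common formulation in which sparsity is encouraged by an L1 penalty on the coefficients; the penalty weight `lam`, the function names, and the toy bases are hypothetical and are not taken from the slides.

```python
def reconstruct(bases, coeffs):
    """Return sum_j a_j * f_j for bases f_j given as equal-length lists."""
    dim = len(bases[0])
    return [sum(a * f[i] for a, f in zip(coeffs, bases)) for i in range(dim)]


def sparse_coding_objective(x, bases, coeffs, lam=0.1):
    """Squared reconstruction error plus an L1 penalty on the coefficients.

    lam is a hypothetical trade-off weight chosen for illustration.
    """
    x_hat = reconstruct(bases, coeffs)
    error = sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    penalty = lam * sum(abs(a) for a in coeffs)
    return error + penalty


# Toy data: k = 3 bases in a 4-dimensional input space.
bases = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0]]
coeffs = [0.0, 2.0, 0.0]        # mostly zero, i.e. sparse
x = [0.0, 2.0, 0.0, 0.0]        # exactly representable by the second basis
objective = sparse_coding_objective(x, bases, coeffs)
```

Here the reconstruction error is zero, so the objective reduces to the penalty on the single active coefficient.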


Sparse Deep Auto-encoder (cont'd)
• Sparse coding
  • Regularizer


Sparse Deep Auto-encoder (cont'd)
• Sparse deep auto-encoder
  • Multiple hidden layers to achieve particular characteristics in the learned features


Local Receptive Field

• Definition: each feature in the auto-encoder connects only to a small region of the lower layer
• Goal:
  • Learn features efficiently
  • Parallelism
• Training on small image patches
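A minimal sketch of a locally connected layer, to illustrate the definition above. It assumes that each output location has its own filter (weights are not shared across locations, as Le et al. describe for their model); the function name, field size, stride, and toy weights are hypothetical.

```python
def local_receptive_field_layer(image, weights, field=2, stride=2):
    """Each output unit sees only its own field x field patch of the image.

    weights[(r, c)] is a separate field x field filter for output location
    (r, c), so nothing is shared between locations (unlike a convolution).
    """
    rows = (len(image) - field) // stride + 1
    cols = (len(image[0]) - field) // stride + 1
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            w = weights[(r, c)]
            output[r][c] = sum(
                w[i][j] * image[r * stride + i][c * stride + j]
                for i in range(field)
                for j in range(field)
            )
    return output


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
# Three locations sum their patch; the last one uses a different filter.
weights = {(r, c): [[1.0, 1.0], [1.0, 1.0]] for r in range(2) for c in range(2)}
weights[(1, 1)] = [[0.0, 0.0], [0.0, 1.0]]
output = local_receptive_field_layer(image, weights)
```

Because every output unit depends on one small patch only, the patches (and their weights) can be assigned to different machines, which is the link to the parallelism goal.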


L2 Pooling

• Goal: robustness to local distortion
• Approach: group similar features together to achieve invariance
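As a sketch of the idea, L2 pooling can be written as the square root of the sum of squared activations in each pooling window. The window size, stride, and function name below are assumptions for illustration (the overall model uses 5x5 pooling regions).

```python
import math


def l2_pool(feature_map, size=2, stride=2):
    """Return sqrt(sum of squares) over each size x size window."""
    rows = (len(feature_map) - size) // stride + 1
    cols = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            math.sqrt(sum(
                feature_map[r * stride + i][c * stride + j] ** 2
                for i in range(size)
                for j in range(size)
            ))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


original = [[3.0, 0.0],
            [0.0, 4.0]]
shifted = [[0.0, 3.0],
           [4.0, 0.0]]   # same activations, moved within the window
pooled_original = l2_pool(original)
pooled_shifted = l2_pool(shifted)
```

Both inputs pool to the same value, which illustrates why grouping nearby features gives robustness to small local shifts.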


Local Contrast Normalization

• Goal: robustness to variation in light intensity
• Approach: normalize contrast
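One way to sketch contrast normalization is a subtractive step (remove the local mean) followed by a divisive step (divide by the local standard deviation). This is a simplified illustration: it assumes a uniform, border-clipped window on a single feature map, whereas published variants typically use Gaussian weighting. The names `radius` and `floor` are hypothetical.

```python
import math


def _neighbors(rows, cols, r, k, radius):
    """Coordinates of the window around (r, k), clipped at the borders."""
    return [
        (i, j)
        for i in range(max(0, r - radius), min(rows, r + radius + 1))
        for j in range(max(0, k - radius), min(cols, k + radius + 1))
    ]


def local_contrast_normalize(image, radius=1, floor=0.01):
    rows, cols = len(image), len(image[0])
    # Subtractive step: remove the local mean from every pixel.
    centered = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for k in range(cols):
            nbrs = _neighbors(rows, cols, r, k, radius)
            mean = sum(image[i][j] for i, j in nbrs) / len(nbrs)
            centered[r][k] = image[r][k] - mean
    # Divisive step: divide by the local standard deviation, with a floor
    # to avoid amplifying noise in flat regions.
    normalized = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for k in range(cols):
            nbrs = _neighbors(rows, cols, r, k, radius)
            std = math.sqrt(sum(centered[i][j] ** 2 for i, j in nbrs) / len(nbrs))
            normalized[r][k] = centered[r][k] / max(floor, std)
    return normalized


image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 10.0]]
# Same scene under different lighting: doubled contrast plus a brightness offset.
brighter = [[2.0 * v + 10.0 for v in row] for row in image]
normalized_image = local_contrast_normalize(image)
normalized_brighter = local_contrast_normalize(brighter)
```

Up to floating-point error, both versions of the image normalize to the same output, which is the robustness to light intensity stated in the goal.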


Overall Model

• 3 layers
  • Simple: 18x18 px
    • 8 neurons/patch
  • Complex: 5x5 px
  • LCN: 5x5 px


Overall Model

• Train:
  • Reconstruct the input of each layer
  • Optimization function


Overall Model

• Complex model?


3. Parallelism


Asynchronous SGD

Two recent lines of research in speeding up large learning problems:
• Parallel/distributed computing
• Online (and mini-batch) learning algorithms: stochastic gradient descent, perceptron, MIRA, stepwise EM
How can we bring together the benefits of parallel computing and online learning?


Asynchronous SGD

SGD (Stochastic Gradient Descent):
• Choose an initial vector of parameters W and a learning rate α
• Repeat until an approximate minimum is obtained:
  • Randomly shuffle the examples in the training set
  • For each example, update W := W − α × (gradient of that example's loss at W)
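The SGD steps above can be sketched for a one-parameter least-squares problem. This is the plain, single-machine version, not the asynchronous variant. The fixed epoch count stands in for "until an approximate minimum is obtained", and the function names, data, and hyperparameters are hypothetical.

```python
import random


def sgd(examples, grad, w, alpha=0.01, epochs=200, seed=0):
    """Plain SGD: shuffle each epoch, then update on one example at a time."""
    rng = random.Random(seed)
    examples = list(examples)
    for _ in range(epochs):            # fixed budget instead of a convergence test
        rng.shuffle(examples)
        for example in examples:
            w = w - alpha * grad(w, example)
    return w


def squared_error_grad(w, example):
    """Gradient of (w * x - y)^2 with respect to w."""
    x, y = example
    return 2.0 * (w * x - y) * x


# Noise-free data generated from y = 2x, so the minimum is at w = 2.
data = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
w = sgd(data, squared_error_grad, w=0.0)
```

Each update uses a single example, which is what makes the method a natural candidate for splitting work across machines.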


Model Parallelism

• Weights are partitioned according to image locality and stored on different machines


4. Evaluation


Evaluation

• 10M unlabeled YouTube frames of size 200x200
• 1B parameters
• 1,000 machines
• 16,000 cores


Experiment on Faces

• Test set
  • 37,000 images
  • 13,026 face images
• Best neuron
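The slide only names the "best neuron". One plausible way to select it, sketched below, is to score every neuron as a face/non-face classifier by sweeping equally spaced thresholds between its minimum and maximum activation and keeping the best accuracy. The threshold count, function names, and toy activations are assumptions for illustration.

```python
def best_threshold_accuracy(activations, labels, num_thresholds=20):
    """Best accuracy over equally spaced thresholds on one neuron's activations."""
    lo, hi = min(activations), max(activations)
    best = 0.0
    for t in range(num_thresholds):
        threshold = lo + (hi - lo) * t / (num_thresholds - 1)
        correct = sum(
            (a >= threshold) == bool(y) for a, y in zip(activations, labels)
        )
        best = max(best, correct / len(labels))
    return best


def best_neuron(activations_per_neuron, labels):
    """Return (index, accuracy) of the neuron that classifies the labels best."""
    scores = [best_threshold_accuracy(acts, labels) for acts in activations_per_neuron]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


labels = [0, 1, 0, 1]                      # 1 = face, 0 = not a face
activations_per_neuron = [
    [0.1, 0.9, 0.2, 0.8],                  # fires strongly on faces
    [0.5, 0.4, 0.6, 0.3],                  # not a useful face detector
]
result = best_neuron(activations_per_neuron, labels)
```

In this toy example the first neuron separates the two classes perfectly, so it is selected.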


Experiment on Faces (cont'd)

• Visualization
  • Top stimuli (images) for the face neuron
  • Optimal stimulus for the face neuron


Experiment on Faces (cont'd)

• Invariance properties


Experiment on Cat/Human body
• Test set
  • Cat: 10,000 positive, 18,409 negative
  • Human body: 13,026 positive, 23,974 negative

• Accuracy


ImageNet classification

• Recognizing images
• Dataset
  • 20,000 categories
  • 14M images
• Accuracy
  • This model: 15.8%
  • Previous state of the art: 9.3%


5. Discussion


Discussion

• Deep learning
  • Unsupervised feature learning
  • Learning multiple layers of representation
• Increased accuracy: invariance, contrast normalization

• Scalability


6. References


References
1. Quoc Le et al., “Building High-level Features using Large Scale Unsupervised Learning”
2. Nando de Freitas, “Deep Learning”, URL: https://www.youtube.com/watch?v=g4ZmJJWR34Q
3. Andrew Ng, “Sparse autoencoder”, URL: http://www.stanford.edu/class/archive/cs/cs294a/cs294a.1104/sparseAutoencoder.pdf
4. Andrew Ng, “Machine Learning and AI via Brain Simulations”, URL: https://forum.stanford.edu/events/2011slides/plenary/2011plenaryNg.pdf
5. Andrew Ng, “Deep Learning”, URL: http://www.ipam.ucla.edu/publications/gss2012/gss2012_10595.pdf