Get Compact Representation Using Deep Networks
DESCRIPTION
This presentation focuses on the motivation, methods, and applications of getting compact representations using deep networks.
TRANSCRIPT
Get Compact Representation Using Deep Networks: Method and Application
Zhengbo Li
Shanghai Jiao Tong University
November 19, 2015
Overview
Motivation
Method
Performance
Application
Future work
Motivation: why do we need compact representations?
Useful: a compact representation of the original data needs less computational and spatial resources.
Interesting: we want to know what the compact representations are (essentially the same as asking what the gates learn).
Dataset
A low-resolution version of MNIST.
Convert the 28 by 28 pictures to 14 by 14 pictures, so each input is a 196-dimensional vector.
This is due to limited computational resources and time.
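Concretely, each picture can be shrunk by pooling 2 by 2 blocks of pixels. A minimal sketch in Python (the talk does not say which downsampling method was used, so average pooling is an assumption):

```python
import numpy as np

def downsample(img28):
    """Shrink a 28x28 MNIST image to 14x14 by averaging each 2x2
    block of pixels, then flatten it into a 196-dimensional vector."""
    img14 = img28.reshape(14, 2, 14, 2).mean(axis=(1, 3))
    return img14.reshape(196)
```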
Method: Autoencoder
Dilemma:
Shallow autoencoders (a single hidden layer or a few hidden layers):
  Advantage: easy to find a good local minimum.
  Disadvantage: not complex enough to learn good representations.
Deep autoencoders (more hidden layers):
  Advantage: complex enough that good representations are possible.
  Disadvantage: very likely to get stuck in poor local minima.
(Both architectures are sketched below.)
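For reference, the two architectures might look like this in PyTorch (a sketch; the framework, the sigmoid activations, and the intermediate layer sizes are assumptions, since the slides only fix the 196-dimensional input):

```python
import torch.nn as nn

# Shallow: a single hidden layer of 4 gates. Easy to train, but the
# map from 196 pixels down to 4 numbers may be too simple.
shallow = nn.Sequential(
    nn.Linear(196, 4), nn.Sigmoid(),   # encoder: 196 -> 4
    nn.Linear(4, 196), nn.Sigmoid(),   # decoder: 4 -> 196
)

# Deep: the same 4-dimensional bottleneck reached through several
# layers. Expressive enough, but from random initialization it is
# very likely to get stuck in a poor local minimum.
deep = nn.Sequential(
    nn.Linear(196, 100), nn.Sigmoid(),
    nn.Linear(100, 4),   nn.Sigmoid(),
    nn.Linear(4, 100),   nn.Sigmoid(),
    nn.Linear(100, 196), nn.Sigmoid(),
)
```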
Method: Combine the Advantages
Example: get a 4-dimensional representation of the 196-dimensional handwritten digits, i.e., use 4 real numbers to represent a picture.
Step 1: Use the 196-dimensional original input to train a 100-dimensional representation.
Step 2: Use the 100-dimensional representation to train a 50-dimensional representation.
(A code sketch of both steps follows the figure.)
Figure 1: Step 1 (left), Step 2 (right)
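A minimal sketch of Steps 1 and 2 (PyTorch, sigmoid activations, and the SGD hyperparameters are all assumptions; `x` is a placeholder for the matrix of training images):

```python
import torch
import torch.nn as nn

def train_autoencoder(data, d_in, d_hidden, epochs=50, lr=0.1):
    """Train a one-hidden-layer autoencoder d_in -> d_hidden -> d_in
    and return its encoder and decoder halves."""
    enc = nn.Sequential(nn.Linear(d_in, d_hidden), nn.Sigmoid())
    dec = nn.Sequential(nn.Linear(d_hidden, d_in), nn.Sigmoid())
    opt = torch.optim.SGD(list(enc.parameters()) + list(dec.parameters()), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        # sum-of-squares reconstruction error, averaged over inputs
        loss = ((dec(enc(data)) - data) ** 2).sum(dim=1).mean()
        loss.backward()
        opt.step()
    return enc, dec

x = torch.rand(1000, 196)  # placeholder for the 14x14 training images

# Step 1: train 196 -> 100 -> 196 on the raw inputs.
enc1, dec1 = train_autoencoder(x, 196, 100)
# Step 2: train 100 -> 50 -> 100 on the codes produced by Step 1.
codes = enc1(x).detach()
enc2, dec2 = train_autoencoder(codes, 100, 50)
```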
Method: Combine the Advantages, cont.
Step 3: Combine these two networks. Use the red and blue weights we got as initial weights and continue training; this gives a 50-dimensional representation of the original 196-dimensional input. (A sketch follows the figure.)
Figure 2: Step 3
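Continuing the sketch above, Step 3 stacks the pretrained halves into one 196 -> 100 -> 50 -> 100 -> 196 network and keeps training end to end (the epoch count and learning rate remain illustrative):

```python
def fine_tune(model, data, epochs=50, lr=0.1):
    """Continue training a stacked autoencoder end to end, starting
    from the pretrained weights it was assembled with."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((model(data) - data) ** 2).sum(dim=1).mean()
        loss.backward()
        opt.step()

# Step 3: the red and blue weights from Steps 1-2 are the initial
# weights of the combined network; training then continues.
model = nn.Sequential(enc1, enc2, dec2, dec1)
fine_tune(model, x)
```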
Method: Combine the Advantages, cont.
Step 4: Use the 50-dimensional representation to train a 20-dimensional representation.
Step 5: Combine the networks. Use the red’, blue’, and green weights we got as initial weights and continue training; this gives a 20-dimensional representation of the original 196-dimensional input.
Figure 3: Step 4 (left), Step 5 (right)
Method: Combine the Advantages, cont.
Keep inserting hidden layers in the middle to get more compact representations (the full loop is sketched below).
Final network structure: [196, 100, 50, 20, 10, 4, 10, 20, 50, 100, 196]
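The whole procedure then fits in one loop, reusing `train_autoencoder` and `fine_tune` from the sketches above (still an assumed reconstruction, not the author's exact code):

```python
# Grow the network one bottleneck at a time until it reaches
# [196, 100, 50, 20, 10, 4, 10, 20, 50, 100, 196].
dims = [196, 100, 50, 20, 10, 4]
encoders, decoders = [], []
codes = x
for d_in, d_hidden in zip(dims, dims[1:]):
    enc, dec = train_autoencoder(codes, d_in, d_hidden)  # pretrain the new pair
    encoders.append(enc)
    decoders.insert(0, dec)                              # decoders mirror encoders
    fine_tune(nn.Sequential(*encoders, *decoders), x)    # end-to-end pass, as in Step 3
    codes = nn.Sequential(*encoders)(x).detach()         # codes for the next round
```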
Performance
To evaluate performance, the cost is the sum of squared differences between input and output, averaged over all inputs.
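In symbols (my reading of the slide, with x_i the i-th input, its reconstruction written with a hat, and N the number of inputs):

```latex
\mathrm{cost} = \frac{1}{N} \sum_{i=1}^{N} \lVert x_i - \hat{x}_i \rVert^2
```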
Method                                                   Cost
Top 4 principal components (SVD)                      12.1284
Single-hidden-layer autoencoder with 4 hidden gates    6.1094
Same architecture, but all layers trained together    10.0036
Our method                                             2.2951
Table 1: Cost comparison for different methods
Application: Generating samples
Dimension reduction has many applications; they are omitted here.
Pick a random 4-dimensional vector. With high probability it corresponds to a handwritten digit once decoded (see the sketch below the figure).
Figure 4: Generated Numbers
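A sketch of the sampling step, using the `decoders` stack from the loop above (the distribution of the random code is an assumption; the talk does not say how the vectors were drawn):

```python
decoder = nn.Sequential(*decoders)  # trained 4 -> 10 -> 20 -> 50 -> 100 -> 196 half
z = torch.rand(1, 4)                # random 4-dimensional code
img = decoder(z).detach().reshape(14, 14)  # decoded 14x14 picture
```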
Future work
Try other datasets.
See what these 4 hidden gates learn (i.e., why the 4-dimensional representation achieves such a low cost).
Understand why deep networks so easily get stuck in poor local minima.
Thank you for listening.