
Page 1: Deep Learning for Data Mining - Aris Anagnostopoulos - Homearis.me/contents/teaching/data-mining-2018/slides/DeepLearningDat… · Algorithms Deep Learning for Data Mining ... ILSVRC14

Deep Learning for Data Mining

University of Rome "La Sapienza"

Dep. of Computer, Control and Management Engineering A. Ruberti

Valsamis Ntouskos, ALCOR Lab

Outline

• Introduction - Motivation

• Theoretical aspects

• Convolutional Neural Networks

• Recurrent Neural Networks

• Generative models

• Further directions

13/12/2018

Deep Learning with CNNs

Compositional Models

Learned End-to-End

Hierarchy of Representations

- vision: pixel, motif, part, object

- text: character, word, clause, sentence

- speech: audio, band, phone, word

concrete → abstract (learning)


Back-propagation jointly learns all of the model parameters to optimize the output for the task.

Slides from Caffe framework tutorial @ CVPR2015

Motivation of CNNs

Inputs are usually treated as general feature vectors

In some cases inputs have special structure:

• Audio

• Images

• Videos

Signals: numerical representations of physical quantities

Deep learning can be applied directly to signals by using suitable operators

Motivation

Audio: 1D data - (variable-length) vectors

. . . 0.0468 0.0468 0.0468 0.0390 0.0390 0.0390 0.0546 0.0625 0.0625 0.0390 0.0312 0.0468 0.0625 . . .


Images: 2D data - matrices

Video: a sequence of images sampled through time - 3D data
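A rough sketch of how these three kinds of signal might be held as NumPy arrays. The sizes used here (one second of audio at 16 kHz, 480x640 grayscale frames, 30 frames) are made-up values for illustration only.

```python
import numpy as np

# Illustrative sizes only; none of them come from the slides.
audio = np.zeros(16_000)            # 1D: amplitude samples over time
image = np.zeros((480, 640))        # 2D: rows x columns of pixel intensities
video = np.zeros((30, 480, 640))    # 3D: frames x rows x columns

for name, signal in [("audio", audio), ("image", image), ("video", video)]:
    print(name, signal.ndim, signal.shape)
```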

What is a CNN?

Some theory

Convolution

• Image filtering is based on convolution with special kernels
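As a rough illustration of filtering by convolution, the sketch below slides a 3x3 kernel over a small image with plain NumPy. The function name `conv2d_valid` and the choice of a Sobel kernel are assumptions made for this example, not something taken from the slides. The kernel is flipped, as true convolution requires; CNN layers usually skip the flip and compute a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """2D convolution with stride 1 and no padding ('valid' mode)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]              # convolution flips the kernel
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

# Image with a vertical edge: dark left half, bright right half.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# Sobel kernel that responds to horizontal intensity changes.
sobel_x = np.array([[1.0, 0.0, -1.0],
                    [2.0, 0.0, -2.0],
                    [1.0, 0.0, -1.0]])

edges = conv2d_valid(image, sobel_x)
print(edges)   # non-zero only in the columns that straddle the edge
```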


Pooling

• Introduces subsampling
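A minimal sketch of the subsampling that pooling introduces, assuming non-overlapping 2x2 max pooling (a common choice; the slide does not fix the window size or the pooling function). `max_pool2d` is a name chosen for this example.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h_out, w_out = x.shape[0] // size, x.shape[1] // size
    x = x[:h_out * size, :w_out * size]
    # Axis 1 and axis 3 index the position inside each pooling window.
    windows = x.reshape(h_out, size, w_out, size)
    return windows.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 1, 5, 6],
                        [2, 2, 7, 8]])
pooled = max_pool2d(feature_map)
print(pooled)   # each output value summarizes one 2x2 block
```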


Activation

Standard way to model a neuron

$f(x) = \tanh(x)$ or $f(x) = (1 + e^{-x})^{-1}$

Very slow to train (saturation)

Non-saturating nonlinearity (ReLU)

$f(x) = \max(0, x)$

Quick to train
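To make the saturation argument concrete, the sketch below evaluates the derivative of each activation at a few inputs. The function names are chosen for this example. For large |x| the sigmoid and tanh derivatives shrink towards zero, while the ReLU derivative stays at 1 for any positive input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Subgradient convention: the derivative is taken as 0 at x == 0.
    return (x > 0).astype(float)

x = np.array([-10.0, 0.0, 10.0])
print("sigmoid'", sigmoid_grad(x))
print("tanh'   ", tanh_grad(x))
print("relu'   ", relu_grad(x))
```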


Regularization

Dropout

• Typically applied to the fully-connected layers

• During training, each node is dropped with probability α

• During testing, all nodes are kept and their outputs are weighted by the keep probability 1 − α

Image from Srivastava et al., "Dropout: A Simple Way to Prevent Neural Networks from Overfitting"
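A minimal sketch of dropout on a vector of activations, assuming α denotes the probability of dropping a node, so that the test-time weight is the keep probability 1 − α (with the common choice α = 0.5 the two values coincide). The function names are chosen for this example.

```python
import numpy as np

def dropout_train(activations, drop_prob, rng):
    """Training: zero each unit independently with probability drop_prob."""
    keep_mask = rng.random(activations.shape) >= drop_prob
    return activations * keep_mask

def dropout_test(activations, drop_prob):
    """Testing: keep every unit, scaled by the keep probability."""
    return activations * (1.0 - drop_prob)

rng = np.random.default_rng(0)
h = np.ones(100_000)
alpha = 0.5
train_mean = dropout_train(h, alpha, rng).mean()
test_mean = dropout_test(h, alpha).mean()
print(train_mean, test_mean)   # the two averages should roughly agree
```

The scaling at test time keeps the expected input to the next layer the same as during training.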


Every convolutional layer of a CNN transforms the 3D input volume to a 3D output volume of neuron activations.

A regular 3-layer Neural Network

Material from Fei-Fei’s group


Each neuron is connected to a local region in the input volume spatially, but to all channels

The neurons still compute a dot product of their weights with the input followed by a non-linearity
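The description above can be sketched as a naive convolutional layer: each output neuron sees an F x F spatial patch across all input channels, takes a dot product with its filter, and applies a non-linearity (ReLU is assumed here). The array layout (height, width, channels), stride 1 and the absence of padding are assumptions of this example.

```python
import numpy as np

def conv_layer(volume, filters, biases):
    """volume: (H, W, C); filters: (K, F, F, C); biases: (K,). Stride 1, no padding."""
    H, W, C = volume.shape
    K, F, _, _ = filters.shape
    out = np.zeros((H - F + 1, W - F + 1, K))
    for k in range(K):
        for i in range(H - F + 1):
            for j in range(W - F + 1):
                patch = volume[i:i + F, j:j + F, :]           # local in space, full depth
                pre_activation = np.sum(patch * filters[k]) + biases[k]   # dot product
                out[i, j, k] = max(0.0, pre_activation)       # ReLU non-linearity
    return out

rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 8, 3))       # 3D input volume
filters = rng.standard_normal((4, 3, 3, 3))   # four 3x3 filters spanning 3 channels
biases = np.zeros(4)
activations = conv_layer(volume, filters, biases)
print(activations.shape)                      # 3D output volume, one slice per filter
```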

Material from Fei-Fei’s group

Algorithms

• Each* neuron/layer is differentiable!

• Just apply backpropagation (chain rule)

• Use standard gradient-based optimization algorithms (SGD, AdaGrad, …)

• The devil lies in the details though …

▪ Choosing hyperparameters / loss function

▪ Exploding/vanishing gradients – batch normalization

▪ Overfitting – regularization

▪ Cost of performing experiments

▪ Convergence

▪ …

*what about max-pooling?
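A minimal sketch of backpropagation plus mini-batch SGD on a tiny two-layer network, written with plain NumPy. The architecture, the toy regression target, the learning rate and the number of steps are all assumptions made for this example. ReLU, like max-pooling, is not differentiable everywhere; the usual practice, followed here, is to pass the gradient only through the active (or maximal) input.

```python
import numpy as np

def forward(params, x):
    W1, b1, W2, b2 = params
    z1 = x @ W1 + b1
    h1 = np.maximum(0.0, z1)            # ReLU
    y_hat = h1 @ W2 + b2
    return z1, h1, y_hat

def loss_and_grads(params, x, y):
    W1, b1, W2, b2 = params
    z1, h1, y_hat = forward(params, x)
    n = x.shape[0]
    loss = 0.5 * np.mean(np.sum((y_hat - y) ** 2, axis=1))
    # Chain rule, applied layer by layer from the output backwards.
    d_y = (y_hat - y) / n
    dW2 = h1.T @ d_y
    db2 = d_y.sum(axis=0)
    d_h1 = d_y @ W2.T
    d_z1 = d_h1 * (z1 > 0)              # gradient flows only where ReLU was active
    dW1 = x.T @ d_z1
    db1 = d_z1.sum(axis=0)
    return loss, [dW1, db1, dW2, db2]

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 2))
y = x[:, :1] * x[:, 1:]                 # toy target: product of the two inputs
params = [rng.standard_normal((2, 16)) * 0.5, np.zeros(16),
          rng.standard_normal((16, 1)) * 0.5, np.zeros(1)]

initial_loss, _ = loss_and_grads(params, x, y)
learning_rate = 0.05
for step in range(500):
    batch = rng.choice(64, size=16, replace=False)     # stochastic mini-batch
    _, grads = loss_and_grads(params, x[batch], y[batch])
    params = [p - learning_rate * g for p, g in zip(params, grads)]
final_loss, _ = loss_and_grads(params, x, y)
print(initial_loss, final_loss)
```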

Kernels and feature maps

Material from Fei-Fei’s group

Case study: image classification

Slides from Caffe framework tutorial @ CVPR2015


Brief history of CNNs

Foundational work done in the middle of the 1900s

• 1940s-1960s: Cybernetics [McCulloch and Pitts 1943, Hebb 1949, Rosenblatt 1958]

• 1980s-mid 1990s: Connectionism [Rumelhart 1986, Hinton 1989]

• 1990s: modern convolutional networks [LeCun et al. 1998], LSTM [Hochreiter & Schmidhuber 1997], MNIST and other large datasets


Hubel & Wiesel [60s]: simple & complex cells architecture

Fukushima’s Neocognitron [70s]

Yann LeCun’s early CNNs [80s]


Convolutional Networks: 1989

LeNet: a layered model composed of convolution and subsampling operations followed by a holistic representation and ultimately a classifier for handwritten digits. [ LeNet ]


Recent success

• Parallel Computation (GPU)

• Larger training sets

• International Competitions

• Theoretical advancements

– Dropout

– ReLUs

– Batch Normalization

Larger training sets: ImageNet

• Over 15M labeled high-resolution images

• Roughly 22K categories

• Collected from the web and labeled by Amazon Mechanical Turk workers

Competitions: ILSVRC

• Annual competition on large-scale image classification

• 1.2M images in 1K categories

• Classification: make 5 guesses about the image label

[Figure: example classes EntleBucher and Appenzeller]
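The "5 guesses" rule corresponds to the top-5 error: a prediction counts as correct when the true label is among the five highest-scoring classes. Below is a small sketch on made-up scores for 7 classes; `top_k_error` and the numbers are illustrative assumptions, not competition code or data.

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is not among the k highest scores."""
    top_k = np.argsort(scores, axis=1)[:, -k:]
    hits = (top_k == labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()

# Made-up class scores for 4 images and 7 classes.
scores = np.array([
    [0.10, 0.05, 0.30, 0.20, 0.15, 0.12, 0.08],   # true class 2: ranked 1st
    [0.30, 0.25, 0.01, 0.20, 0.10, 0.09, 0.05],   # true class 2: ranked last
    [0.05, 0.10, 0.15, 0.20, 0.25, 0.13, 0.12],   # true class 0: ranked last
    [0.02, 0.40, 0.08, 0.11, 0.25, 0.05, 0.09],   # true class 6: ranked 4th
])
labels = np.array([2, 2, 0, 6])

print(top_k_error(scores, labels, k=5))   # two of the four images are missed
print(top_k_error(scores, labels, k=1))   # only the first image is right at top-1
```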

CNNs in Computer Vision

• Image classification

Evolution of CNNs for image classification

Convolutional Nets: 2012

AlexNet: a layered model composed of convolution, subsampling, and further operations followed by a holistic representation and all-in-all a landmark classifier on ILSVRC12. [ AlexNet ]

Convolutional Nets: 2014

ILSVRC14 Winners: ~6.6% Top-5 error

- GoogLeNet: composition of multi-scale dimension-reduced modules

- VGG: 16 layers of 3x3 convolution interleaved with max pooling + 3 fully-connected layers

+ depth

+ data

+ dimensionality reduction

Convolutional Nets: 2015

ResNet

ILSVRC15 Winner: ~3.6% Top-5 error

Intuition: Easier to learn zero than identity function
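A minimal sketch of that intuition, using fully-connected layers as a stand-in for the convolutions of a real residual block (an assumption of this example, as are the function names). The block outputs the input plus a learned residual F(x), so driving the weights to zero is enough to get the identity mapping.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, W1, W2):
    """y = relu(x + F(x)), with F(x) = relu(x W1) W2 as the learned residual."""
    residual = relu(x @ W1) @ W2
    return relu(x + residual)

x = np.array([[1.0, 2.0, 3.0]])
zero_W1 = np.zeros((3, 3))
zero_W2 = np.zeros((3, 3))

# With all-zero weights the residual vanishes and the block passes x through.
out = residual_block(x, zero_W1, zero_W2)
print(out)
```

A plain (non-residual) pair of layers with zero weights would instead output zeros, so it would have to learn the identity explicitly.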

Reasonable questions

• Is this just for a particular dataset? – No!

• Is this just for a particular task? – No!

- Object localization [R-CNN, HyperColumns, OverFeat, etc.]

- Pose estimation [Tompson et al., CVPR’15]

- Semantic segmentation [Pinheiro, Collobert, Dollár, ICCV’15]

Slides from ICCV 2015 Math of Deep Learning tutorial

Fine Tuning

Take a pre-trained model and fine-tune it to new tasks [DeCAF] [Zeiler-Fergus] [OverFeat]

[Figure: Lots of Data (ImageNet) → Your Task (Style Recognition)]

Dogs vs. Cats: top 10 in 10 minutes (© kaggle.com)

RNNs

Dealing with sequences

• data points are related

• order is important!

(C)NNs disregard this information


Recurrent Neural Networks (RNNs) address this problem by introducing cells with loops

• allow information to persist

• the value of the output $\mathbf{h}_t$ depends on both $\mathbf{x}_t$ and the previous output $\mathbf{h}_{t-1}$

Material from Colah’s blog


Another way to see RNNs is to unfold the loop

• each cell is dedicated to a different data sample

• the output $\mathbf{h}_t$ is computed based also on $\mathbf{h}_{t-1}$
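The unfolded view can be sketched as a loop that applies the same cell to each sample in turn. A plain tanh cell is assumed here, and the weight names `W_x`, `W_h`, `b` as well as the sizes are choices made for this example.

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b, h0):
    """Apply the same cell at every time step; returns all outputs h_1..h_T."""
    h = h0
    hs = []
    for x_t in xs:                                   # one unfolded cell per sample
        h = np.tanh(x_t @ W_x + h @ W_h + b)         # h_t depends on x_t and h_{t-1}
        hs.append(h)
    return np.stack(hs)

rng = np.random.default_rng(0)
T, input_dim, hidden_dim = 5, 3, 4
xs = rng.standard_normal((T, input_dim))
W_x = rng.standard_normal((input_dim, hidden_dim)) * 0.5
W_h = rng.standard_normal((hidden_dim, hidden_dim)) * 0.5
b = np.zeros(hidden_dim)

hs = rnn_forward(xs, W_x, W_h, b, np.zeros(hidden_dim))
hs_reversed = rnn_forward(xs[::-1], W_x, W_h, b, np.zeros(hidden_dim))
print(hs[-1])
print(hs_reversed[-1])    # same samples in a different order give a different output
```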

RNN structure

Material from Colah’s blog


RNNs (and variants) are very effective in various domains:

• Speech recognition

• Translation

• Image/Video captioning

• Video summarization

• Activity recognition

• ...


Problem

Plain RNNs struggle to capture long-term dependencies

• so-called vanishing/exploding gradients problem

a) "The clouds are in the _." (the relevant context is nearby)

b) "I grew up in France... I speak fluent _." (the relevant context is far back)
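A rough numerical illustration of the vanishing/exploding gradients problem, under a deliberately simplified assumption: a linear recurrence h_t = W h_{t-1} with no non-linearity and no input. The gradient of the last state with respect to the first is then a product of identical Jacobians, whose norm shrinks or grows geometrically with the number of steps.

```python
import numpy as np

def gradient_norm_through_time(W, steps):
    """Spectral norm of d h_T / d h_0 for the linear recurrence h_t = W h_{t-1}."""
    jacobian = np.eye(W.shape[0])
    for _ in range(steps):
        jacobian = W @ jacobian
    return np.linalg.norm(jacobian, 2)

small = 0.5 * np.eye(3)   # contracting recurrence: gradients vanish
large = 1.5 * np.eye(3)   # expanding recurrence: gradients explode

for steps in (1, 10, 50):
    print(steps,
          gradient_norm_through_time(small, steps),
          gradient_norm_through_time(large, steps))
```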


LSTMs

Long Short-Term Memory networks (LSTMs)

• RNNs with special structure

• designed to capture long-term dependencies


Hochreiter & Schmidhuber (1997) Long short-term memory

Main idea: separate the cell output $\mathbf{h}_t$ from the cell state $C_t$

• the state is modified through a series of structures called gates


LSTMs – step by step

1st gate: forget mechanism

• drop elements of $C_{t-1}$ based on the values of $\mathbf{h}_{t-1}$ and the input $\mathbf{x}_t$

2nd and 3rd gate: update mechanism

i. decide which elements of $C_{t-1}$ to update

ii. compute the update vector $\tilde{C}_t$

iii. update the elements selected via $i_t$ with the values computed in $\tilde{C}_t$

Last step: compute output

• Based on $C_t$, $\mathbf{h}_{t-1}$, $\mathbf{x}_t$
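The steps above can be put together in a single LSTM step. The sketch below follows the commonly used formulation in which every gate is a sigmoid layer acting on the concatenation of the previous output and the current input; the parameter names, shapes and initial values are assumptions of this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, C_prev, params):
    """One LSTM step; each gate acts on the concatenation [h_prev, x_t]."""
    W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o = params
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W_f @ z + b_f)           # 1st gate: what to drop from C_prev
    i_t = sigmoid(W_i @ z + b_i)           # 2nd gate: which elements to update
    C_tilde = np.tanh(W_C @ z + b_C)       # candidate update vector
    C_t = f_t * C_prev + i_t * C_tilde     # new cell state
    o_t = sigmoid(W_o @ z + b_o)           # output gate
    h_t = o_t * np.tanh(C_t)               # output based on C_t, h_prev, x_t
    return h_t, C_t

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4
params = []
for _ in range(4):                         # forget, input, candidate, output
    params.append(rng.standard_normal((hidden_dim, hidden_dim + input_dim)) * 0.1)
    params.append(np.zeros(hidden_dim))

h, C = np.zeros(hidden_dim), np.zeros(hidden_dim)
for x_t in rng.standard_normal((6, input_dim)):
    h, C = lstm_step(x_t, h, C, params)
print(h, C)
```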


LSTM variants 1/3

Add peephole connections

All gate layers “see” the state $C_{t-1}$

Gers & Schmidhuber (2000) Recurrent nets that time and count


LSTM variants 2/3

Coupled input and forget gates:

• only forget something if you are going to input something else


LSTM variants 3/3

Gated Recurrent Units (GRUs)

• no cell state! - operate directly on the output $\mathbf{h}_t$

• just two gates instead of three

Yao et al. (2015) Depth-gated recurrent neural networks


LSTMs

Further improvements:

• Further modifications of gates/structure

• Bi-directional LSTMs

• Attention mechanisms

• ...


LSTMs – typical use cases

One-to-one – classical NNs (no RNN)

One-to-many – sequence output (e.g. image captioning)

Many-to-one – sequence input (e.g. sentiment analysis)

Many-to-many – sequence to sequence (e.g. machine translation)

Many-to-many synced – e.g. frame per frame video analysis

LSTMs – use case examples

Visual Sequence Tasks

Jeff Donahue et al. CVPR’15


Based on Long short-term memory (LSTM) layers


Using LSTMs

Some videos:

- Facebook AI

- Music classification

- Action recognition SecondHands project


Beyond Regression/Classification

Generative models

• Variational Auto-Encoders (VAEs)

– focus on learning latent space structure

• Generative Adversarial Networks (GANs)

– focus on learning a distribution

– no latent space (in general)


GANs

Goal

Sample from the input data distribution 𝒳

Idea

Invert a (Convolutional) Neural Network


Encoder (conventional CNN)

• processes an image and produces a vector/code


Decoder

• receives a code and produces an image

• uses “deconvolutional” layers


Material from Frans’ blog


Problem

How to train the decoder to produce meaningful data?

Idea

Adversarial Training


A GAN is a combination of two networks

1. a generator network (decoder)

2. a discriminator network (critic)


Roles

• Generator: produces realistic samples of 𝒳

• Discriminator: identifies if a sample actually comes from the (unknown) 𝒳 or not

Training

make the networks compete with each other

• generator tries to fool the discriminator into believing that the sample is ‘real’

• discriminator tries to discriminate as well as possible ‘real’ from ‘fake’ samples
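The competition can be written as two loss functions. The slides do not specify the losses, so the sketch below assumes the commonly used binary cross-entropy formulation with a non-saturating generator loss; it takes the discriminator outputs (estimated probability that a sample is real) as given and does not train any network.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Cross-entropy: real samples should score near 1, generated samples near 0."""
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating loss: low when the discriminator rates fakes as real."""
    return -np.mean(np.log(d_fake + eps))

# Made-up discriminator outputs for illustration.
confident_d_real = np.array([0.90, 0.95])   # real samples recognised as real
confident_d_fake = np.array([0.10, 0.05])   # fakes recognised as fake
fooled_d_fake = np.array([0.90, 0.95])      # fakes mistaken for real

print(discriminator_loss(confident_d_real, confident_d_fake))
print(discriminator_loss(confident_d_real, fooled_d_fake))
print(generator_loss(confident_d_fake))
print(generator_loss(fooled_d_fake))
```

Each network lowers its own loss by raising the other's, which is what makes the training adversarial.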


Example on CIFAR dataset (32x32 images)

Are these real images or generated?



Results at 300, 900 and 5700 iterations


Example: Celebrity faces (1024x1024 images)


VAEs

Goal

• modify data in specific directions

• identify meaningful directions in latent space

Examples

- ‘distort’ faces (change expression, add glasses)

- produce digits from different hand-writing styles

- distort 3D meshes

- ...

What is an auto-encoder?

- a combination of an encoder and a decoder

- trained based on reconstruction loss

- provides low-dimensional representation


Material from Kurita’s blog


Goal

Feed vectors and get realistic samples of 𝒳

AE problem: we don’t know if the vectors are valid or not


Main idea:

Encoder produces a distribution instead of a vector

Decoder operates on samples from this distribution


Problems

1. How to produce a distribution?

2. How to prevent degeneration?

Solutions

1. parametric distributions – typically Gaussian

– produce mean $\mu$ and covariance $\Sigma$

2. add a loss term based on the Kullback-Leibler divergence
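A minimal sketch of the Kullback-Leibler loss term, assuming the usual setup in which the encoder outputs a diagonal Gaussian (mean and log-variance per latent dimension) and the reference distribution is a standard normal. Both assumptions, and the function name, are choices made for this example; the slides do not state them.

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

# The term is zero when the encoder output already matches N(0, I) ...
print(kl_to_standard_normal(np.zeros(2), np.zeros(2)))
# ... and grows as the mean drifts away or the variance collapses.
print(kl_to_standard_normal(np.array([1.0, 0.0]), np.zeros(2)))
print(kl_to_standard_normal(np.zeros(2), np.log(np.array([1e-3, 1e-3]))))
```

Penalizing a collapsing variance is what keeps the encoder from degenerating into a plain auto-encoder that outputs a single point.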


One more problem:

- Sampling operation is not differentiable

Solution:

- re-parametrization
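A minimal sketch of the re-parametrization idea, assuming a diagonal Gaussian parameterized by a mean and a log-variance (names chosen for this example). Instead of sampling z directly, noise is drawn from a fixed N(0, I) and then shifted and scaled, so z becomes a deterministic, differentiable function of the encoder outputs.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """z = mu + sigma * eps with eps ~ N(0, I); randomness is isolated in eps."""
    eps = rng.standard_normal(mu.shape)
    sigma = np.exp(0.5 * log_var)
    return mu + sigma * eps

rng = np.random.default_rng(0)
mu = np.array([2.0, -1.0])
log_var = np.log(np.array([0.25, 4.0]))      # standard deviations 0.5 and 2.0

samples = np.stack([sample_latent(mu, log_var, rng) for _ in range(20_000)])
print(samples.mean(axis=0), samples.std(axis=0))   # should be close to mu and sigma
```

With eps treated as a constant input, the gradient of z with respect to mu is 1 and with respect to sigma is eps, so back-propagation can pass through the sampling step.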


Images from Keita Kurita’s blog


Example: faces (Hou et al., 2016)


Example: 3D Mesh deformation (Tan et al., 2018)


Further directions

Few-shot and zero-shot learning

• meta-learner architectures

• Prototypical networks


One-shot learning example: Mullet

Zero-shot learning example: Zebra (from horse)


Learning on manifolds

• learning functions defined on surfaces


• learning functions defined on graphs


Learning with constraints

• Frank-Wolfe networks


Resources

Frameworks:

• TensorFlow (Google) | C/C++, Python, Java, Go

• Torch/PyTorch (Facebook) | Lua/Python

• Caffe/Caffe 2 (UC Berkeley) | C/C++, Python, Matlab

• Theano (U Montreal) | Python

• CNTK (Microsoft) | Python, C++ , C#/.Net, Java

• MxNet (DMLC) | Python, C++, R, Perl, …

• Darknet (Redmon J.) | C

• …


High-level libraries:

• Keras | Backends: TensorFlow (TF), Theano

Models:

• Depends on the framework, e.g.

– https://github.com/BVLC/caffe/wiki/Model-Zoo (Caffe)

– https://github.com/tensorflow/models/tree/master/research (TF)

Interactive Interfaces:

• DIGITS (NVIDIA) | Caffe, TF, Torch

• TensorBoard (TF)

Tools:

• http://ethereon.github.io/netscope (for networks defined in protobuf)


Thank you!
