Deep Learning

Niklas Wahlström

Department of Information Technology, Uppsala University, Sweden

September 21, 2017

[email protected] Guest lecture: Empirical modeling (HT 2017)


Deep Learning: Caption generation

Generate captions automatically from images.

Xu, K., Lei Ba, J., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., Zemel, R. S., and Bengio, Y. Show, attend and tell: neural image caption generation with visual attention. In Proceedings of the 32nd International Conference on Machine Learning (ICML), Lille, France, July 2015.


Deep learning: AI Go player

An AI defeated a human professional for the first time in the game of Go.

Silver, D. et al. Mastering the game of Go with deep neural networks and tree search, Nature, Vol 529, 484–489 (2016)


Outline

1. What is a neural network (NN)?

2. Why do deep neural networks work so well?

a) Why neural networks?
b) Why deep?

3. Some comments, pointers and summary


Skin cancer – background

One recent result on the use of deep learning in medicine: detecting skin cancer (February 2017).

Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., and Thrun, S. Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542, 115–118, February 2017.

Some background figures (from the US) on skin cancer:

- Melanomas represent less than 5% of all skin cancers, but account for 75% of all skin-cancer-related deaths.

- Early detection is absolutely critical. Estimated 5-year survival rate for melanoma: over 99% if detected in its earlier stages and 14% if detected in its later stages.


Skin cancer – taxonomy used

Image copyright Nature (doi:10.1038/nature21056)


Skin cancer – task

Image copyright Nature (doi:10.1038/nature21056)


Skin cancer – solution (ultrabrief)

Start from a neural network trained on 1.28 million images (transfer learning).

Make minor modifications to this model, specializing it to the present situation.

Learn new model parameters using 129 450 clinical images (∼ 100 times more images than any previous study).

[Figure: unseen data is passed to the model, which outputs a prediction.]


Skin cancer – indication of the results

$$\text{sensitivity} = \frac{\text{true positive}}{\text{positive}}, \qquad \text{specificity} = \frac{\text{true negative}}{\text{negative}}$$

Image copyright Nature (doi:10.1038/nature21056)
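The two ratios can be computed directly from labels and predictions. The following is a minimal sketch in plain Python; the function name and the label vectors are made up for illustration and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive, 0 = negative)."""
    # True positives: positive cases that the model also flags as positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # True negatives: negative cases that the model also flags as negative.
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(1 for t in y_true if t == 1)
    negatives = sum(1 for t in y_true if t == 0)
    return tp / positives, tn / negatives


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

A classifier that outputs a score rather than a label traces out a whole curve of such (sensitivity, specificity) pairs as the decision threshold is varied.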


Constructing an NN for regression

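A neural network (NN) is a nonlinear function $y = g_\theta(\varphi)$ from an input variable $\varphi$ to an output variable $y$, parameterized by $\theta$.

Linear regression models the relationship between a continuous target variable $y$ and an input variable $\varphi$,

$$y = \sum_{i=1}^{n} \varphi_i \theta_i + \theta_0 = \varphi^T \theta,$$

where $\theta$ collects the parameters, composed of the "weights" $\theta_i$ and the offset ("bias") term $\theta_0$,

$$\theta = \begin{pmatrix} \theta_0 & \theta_1 & \theta_2 & \cdots & \theta_n \end{pmatrix}^T, \qquad \varphi = \begin{pmatrix} 1 & \varphi_1 & \varphi_2 & \cdots & \varphi_n \end{pmatrix}^T.$$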
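As a small illustration of the notation, the sketch below evaluates the linear predictor by prepending the constant 1 to the input so that the first parameter acts as the offset. The function name and the numbers are hypothetical.

```python
def linear_predict(phi, theta):
    """Evaluate y = theta_0 + sum_i phi_i * theta_i."""
    # Prepend the constant 1 so that theta[0] plays the role of the offset.
    phi_ext = [1.0] + list(phi)
    return sum(p * t for p, t in zip(phi_ext, theta))


theta = [0.5, 2.0, -1.0]  # theta_0, theta_1, theta_2 (made-up values)
phi = [3.0, 4.0]          # phi_1, phi_2

y = linear_predict(phi, theta)  # 0.5 + 2*3 - 1*4
print(y)
```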


Generalized linear regression

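We can generalize this by introducing a nonlinear transformation of the predictor $\varphi^T\theta$,

$$y = \sigma(\varphi^T \theta).$$

[Figure: a single unit. The inputs $1, \varphi_1, \dots, \varphi_n$ are weighted by $\theta_0, \dots, \theta_n$, summed and passed through $\sigma$ to give $y$.]

We call $\sigma(x)$ the activation function. Two common choices are:

Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$

ReLU: $\sigma(x) = \max(0, x)$

[Figure: plots of the sigmoid and ReLU activation functions.]

Let us consider an example of a feed-forward NN, where "feed-forward" indicates that the information flows from the input layer to the output layer.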
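The two activation functions and the resulting single-unit model can be sketched in a few lines of plain Python. The names `glm_predict`, `theta` and `phi` are illustrative assumptions, not part of the lecture.

```python
import math


def sigmoid(x):
    """Sigmoid activation: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """ReLU activation: max(0, x)."""
    return max(0.0, x)


def glm_predict(phi, theta, activation):
    """Evaluate y = sigma(theta_0 + sum_i phi_i * theta_i)."""
    linear = theta[0] + sum(t * p for t, p in zip(theta[1:], phi))
    return activation(linear)


theta = [0.5, 2.0, -1.0]  # made-up parameters; the linear predictor below is 2.5
phi = [3.0, 4.0]

y_sigmoid = glm_predict(phi, theta, sigmoid)
y_relu = glm_predict(phi, theta, relu)
print(y_sigmoid, y_relu)
```

The only change relative to linear regression is the final call to `activation`; everything before it is the same linear predictor.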


Neural network - construction

An NN is a sequential construction of several linear regression models.


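[Figure: feed-forward network with inputs $1, \varphi_1, \dots, \varphi_n$, hidden units $z_1, \dots, z_M$ (each applying $\sigma$) and output $y$. Column labels: Inputs, Hidden units, Outputs.]

$$\begin{aligned}
z_1 &= \sigma\Big(\theta^{(1)}_{01} + \sum_{j=1}^{n} \theta^{(1)}_{j1} \varphi_j\Big) \\
z_2 &= \sigma\Big(\theta^{(1)}_{02} + \sum_{j=1}^{n} \theta^{(1)}_{j2} \varphi_j\Big) \\
&\;\;\vdots \\
z_M &= \sigma\Big(\theta^{(1)}_{0M} + \sum_{j=1}^{n} \theta^{(1)}_{jM} \varphi_j\Big) \\
y &= \theta^{(2)}_{0} + \sum_{m=1}^{M} \theta^{(2)}_{m} z_m
\end{aligned}$$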


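In matrix notation:

$$z = \sigma(W_1^T \varphi + b_1^T), \qquad
b_1 = \begin{bmatrix} \theta^{(1)}_{01} & \dots & \theta^{(1)}_{0M} \end{bmatrix}, \qquad
W_1 = \begin{bmatrix} \theta^{(1)}_{11} & \dots & \theta^{(1)}_{1M} \\ \vdots & \ddots & \vdots \\ \theta^{(1)}_{n1} & \dots & \theta^{(1)}_{nM} \end{bmatrix}$$

$$y = W_2^T z + b_2^T, \qquad
b_2 = \begin{bmatrix} \theta^{(2)}_{0} \end{bmatrix}, \qquad
W_2 = \begin{bmatrix} \theta^{(2)}_{1} \\ \vdots \\ \theta^{(2)}_{M} \end{bmatrix}$$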


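[Figure: feed-forward network with inputs $1, \varphi_1, \dots, \varphi_n$, a first hidden layer $z^{(1)}_1, \dots, z^{(1)}_{M_1}$, a second hidden layer $z^{(2)}_1, \dots, z^{(2)}_{M_2}$ and output $y$. Column labels: Inputs, Hidden units, Hidden units, Outputs.]

$$\begin{aligned}
z^{(1)} &= \sigma(W_1^T \varphi + b_1^T) \\
z^{(2)} &= \sigma(W_2^T z^{(1)} + b_2^T) \\
y &= W_3^T z^{(2)} + b_3^T
\end{aligned}$$

A deep network (several layers) typically learns better than a wide and shallow network.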
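The layered equations translate directly into a forward pass. The sketch below is a minimal plain-Python version under a few assumptions: each weight matrix is stored as a list of rows with one row per input and one column per unit (so that the code computes $W^T x + b^T$), the sigmoid is used as $\sigma$, and all names and weight values are invented for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, x):
    """Compute W^T x + b, with W given as len(x) rows and len(b) columns."""
    return [b[m] + sum(W[j][m] * x[j] for j in range(len(x)))
            for m in range(len(b))]


def forward(phi, layers):
    """Forward pass: sigma on every hidden layer, linear output layer.

    `layers` is a list of (W, b) pairs; the last pair is the output layer.
    """
    z = list(phi)
    for W, b in layers[:-1]:
        z = [sigmoid(a) for a in affine(W, b, z)]
    W_out, b_out = layers[-1]
    return affine(W_out, b_out, z)


# Hypothetical network: n = 2 inputs, M1 = 3 and M2 = 2 hidden units, scalar output.
W1 = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.5, -0.3], [0.8, 0.2], [-0.6, 0.9]]
b2 = [0.05, -0.05]
W3 = [[1.0], [-1.0]]
b3 = [0.25]

y = forward([1.0, 2.0], [(W1, b1), (W2, b2), (W3, b3)])
print(y)
```

Dropping the middle `(W2, b2)` pair (and resizing `W3` accordingly) gives the network with a single hidden layer.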


Why neural networks?

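Continuous multiplication gate

A neural network with only four hidden units can model multiplication of two numbers arbitrarily well.

[Figure: the inputs $\varphi_1$ and $\varphi_2$ feed four hidden units $f$ through the weight pairs $(\lambda, \lambda)$, $(-\lambda, -\lambda)$, $(\lambda, -\lambda)$ and $(-\lambda, \lambda)$; the hidden units feed the output $y \approx \varphi_1 \varphi_2$ through the weights $\mu, \mu, -\mu, -\mu$.]

If we choose $\mu = \frac{1}{4 \lambda^2 f''(0)}$, then $y \to \varphi_1 \varphi_2$ as $\lambda \to 0$.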

Henry W. Lin and Max Tegmark. (2016) Why does deep and cheap learning work so well?, arXiv
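The gate can be checked numerically. The sketch below assumes the weight layout read off the figure, i.e. $y = \mu\,[f(\lambda(\varphi_1 + \varphi_2)) + f(-\lambda(\varphi_1 + \varphi_2)) - f(\lambda(\varphi_1 - \varphi_2)) - f(-\lambda(\varphi_1 - \varphi_2))]$, and picks the softplus function as $f$ because its second derivative at zero is nonzero ($f''(0) = 1/4$). The choice of softplus and all names are assumptions made for illustration.

```python
import math


def softplus(x):
    """Smooth activation f(x) = log(1 + exp(x)); its second derivative at 0 is 1/4."""
    return math.log1p(math.exp(x))


F2_AT_0 = 0.25  # f''(0) for softplus


def multiply_gate(phi1, phi2, lam=1e-2):
    """Approximate phi1 * phi2 with four hidden units and output weights +/- mu."""
    mu = 1.0 / (4.0 * lam ** 2 * F2_AT_0)
    s = lam * (phi1 + phi2)
    d = lam * (phi1 - phi2)
    hidden = [softplus(s), softplus(-s), softplus(d), softplus(-d)]
    return mu * (hidden[0] + hidden[1] - hidden[2] - hidden[3])


# The approximation of 3 * (-2) improves as lambda shrinks.
for lam in (1.0, 0.1, 0.01):
    print(lam, multiply_gate(3.0, -2.0, lam))
```

The reason it works is a Taylor expansion: $f(x) + f(-x) = 2f(0) + f''(0)x^2 + O(x^4)$, so the constant terms cancel and the remaining quadratic terms give $4 f''(0) \lambda^2 \varphi_1 \varphi_2$, which $\mu$ rescales to $\varphi_1 \varphi_2$.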


A regression example

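Input: $u \in \mathbb{R}^{1000}$

Output: $y \in \mathbb{R}$

Task: Model a quadratic relationship between $y$ and $u$

Linear regression

$$y = u_1 u_1 \theta_{1,1} + u_1 u_2 \theta_{1,2} + \dots + u_{1000} u_{1000} \theta_{1000,1000} = \varphi^T \theta$$

where

$$\varphi = \begin{bmatrix} u_1 u_1 & u_1 u_2 & \dots & u_{1000} u_{1000} \end{bmatrix}^T, \qquad \theta = \begin{bmatrix} \theta_{1,1} & \theta_{1,2} & \dots & \theta_{1000,1000} \end{bmatrix}^T$$

Requires $\approx \frac{1000 \cdot 1000}{2} = 500\,000$ parameters!

Neural network

To model all products with a neural network we would need $4 \cdot 500\,000 = 2 \cdot 10^6$ hidden units and hence 2 billion parameters...

[Figure: network with inputs $u_1, u_2, u_3, \dots, u_{1000}$, hidden units $z_1, \dots, z_{2\,000\,000}$ and output $y$.]

$$1000 \cdot (2 \cdot 10^6) + 2 \cdot 10^6 \approx 2 \cdot 10^9 \text{ parameters}$$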


A regression example (cont.)

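Input: $u \in \mathbb{R}^{1000}$

Output: $y \in \mathbb{R}$

Task: Model a quadratic relationship between $y$ and $u$

Assume that only 10 of the regressors $u_i u_j$ are of importance.

Linear regression

$$y = u_1 u_1 \theta_{1,1} + u_1 u_2 \theta_{1,2} + \dots + u_{1000} u_{1000} \theta_{1000,1000} = \varphi^T \theta$$

where

$$\varphi = \begin{bmatrix} u_1 u_1 & u_1 u_2 & \dots & u_{1000} u_{1000} \end{bmatrix}^T, \qquad \theta = \begin{bmatrix} \theta_{1,1} & \theta_{1,2} & \dots & \theta_{1000,1000} \end{bmatrix}^T$$

You probably want to regularize, but $500\,000$ parameters are still required!

Neural network

To model 10 products with a neural network we would need $4 \cdot 10 = 40$ hidden units, leading to only $\approx 40\,000$ parameters!

[Figure: network with inputs $u_1, u_2, u_3, \dots, u_{1000}$, hidden units $z_1, \dots, z_{40}$ and output $y$.]

$$1000 \cdot 40 + 40 = 40\,040$$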
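The parameter counts in this example are simple arithmetic and can be reproduced as follows. This is a sketch that follows the slide's convention of counting only the input-to-hidden and hidden-to-output weights (bias terms are ignored); the function name is made up.

```python
n = 1000  # input dimension

# Distinct products u_i * u_j with i <= j: n(n+1)/2, roughly n^2 / 2.
n_products = n * (n + 1) // 2
print("quadratic regressors:", n_products)


def shallow_nn_params(n_inputs, n_products_modeled, units_per_product=4):
    """Weights in a one-hidden-layer network with four hidden units per product."""
    hidden = units_per_product * n_products_modeled
    # Input-to-hidden weights plus hidden-to-output weights (biases ignored).
    return n_inputs * hidden + hidden


all_products = shallow_nn_params(n, 500_000)  # every product modeled
ten_products = shallow_nn_params(n, 10)       # only the 10 relevant products
print("NN, all products:", all_products)
print("NN, 10 products: ", ten_products)
```

The linear regression has to carry one parameter per candidate regressor regardless of how many matter, whereas the network only pays for the products it actually models.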


Why deep? - A regression example

- Consider the same example. Now we want a model with complexity corresponding to polynomials of degree $1000$.

- Keep 250 products in each layer $\Rightarrow 250 \cdot 4 = 1000$ hidden units.

[Figure: deep network with inputs $u_1, u_2, u_3, \dots, u_{1000}$, ten hidden layers $z^{(1)}, \dots, z^{(10)}$ with units $z^{(l)}_1, \dots, z^{(l)}_{1000}$ each, and output $y$.]

$$\underbrace{10^6 + 10^6 + 10^6 + 10^6 + 10^6 + 10^6 + 10^6 + 10^6 + 10^6 + 10^6 + 10^3}_{\approx 10^7 \text{ parameters}}$$

Linear regression would require $\approx \frac{1000^{1000}}{1000!}$ parameters to model such a relationship...
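The two counts can be compared with exact integer arithmetic. The sketch below assumes ten hidden layers of 1000 units each, as in the sum above, and reads the degree claim as each layer of multiplication gates doubling the polynomial degree; that reading is an assumption, not something stated explicitly in the lecture.

```python
import math

n_inputs, width, depth = 1000, 1000, 10

# Ten weight matrices of size 1000 x 1000 plus 1000 output weights (biases ignored).
deep_params = n_inputs * width + (depth - 1) * width * width + width
print("deep network parameters:", deep_params)

# If each layer of products doubles the degree, ten layers reach degree 2^10.
print("reachable degree:", 2 ** depth)

# The slide's estimate for linear regression: 1000^1000 / 1000!.
linreg_estimate = 1000 ** 1000 // math.factorial(1000)
print("linear regression parameters: a number with",
      len(str(linreg_estimate)), "digits")
```

Python integers have arbitrary precision, so the quotient is computed exactly rather than overflowing a float.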


Why deep? - Image classification

Example: Image classification

Input: pixels of an image

Output: object identity

Each hidden layer extracts increasingly abstract features.

Zeiler, M. D. and Fergus, R. Visualizing and understanding convolutional networks. Computer Vision - ECCV (2014).


Deep neural networks

Deep learning methods allow a machine to make use of raw data to automatically discover the representations (abstractions) that are necessary to solve a particular task.

This is accomplished by using multiple levels of representation. Each level transforms the representation at the previous level into a new and more abstract representation,

$$z^{(l+1)} = f\big(W^{(l+1)} z^{(l)} + b^{(l+1)}\big),$$

starting from the input (raw data) $z^{(0)} = u$.

Key aspect: The layers are not designed by human engineers; they are generated from (typically lots of) data using a learning procedure and lots of computations.
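The recursion can be written as a short loop that keeps every intermediate representation. This is a minimal sketch in plain Python: here each $W^{(l)}$ is stored with one row per output unit so that the code computes $W z + b$ as in the formula, ReLU is used as $f$, and the weights are hand-picked for illustration, whereas in practice they would be learned from data.

```python
def relu(x):
    return max(0.0, x)


def layer(W, b, z, f):
    """One level of representation: f(W z + b), with W given as a list of rows."""
    return [f(b_i + sum(w_ij * z_j for w_ij, z_j in zip(row, z)))
            for row, b_i in zip(W, b)]


def representations(u, params, f=relu):
    """Return [z0, z1, ..., zL] with z0 = u and z(l+1) = f(W(l+1) z(l) + b(l+1))."""
    zs = [list(u)]
    for W, b in params:
        zs.append(layer(W, b, zs[-1], f))
    return zs


# Hypothetical two-level example with made-up weights.
params = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.5]),  # W1 (3x2), b1
    ([[2.0, 1.0, 1.0]], [-0.5]),                              # W2 (1x3), b2
]

zs = representations([1.0, -2.0], params)
for level, z in enumerate(zs):
    print(level, z)
```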


Some comments - Why now?

Neural networks have been around for more than fifty years. Why have they become so popular now (again)?

To solve really interesting problems you need:

1. Efficient learning algorithms

2. Efficient computational hardware

3. A lot of labeled data!

These three factors have not been fulfilled to a satisfactory level until the last 5-10 years.


Some pointers

A book has recently been written:

I. Goodfellow, Y. Bengio and A. Courville. Deep learning. MIT Press, 2016.

http://www.deeplearningbook.org/

A well-written introduction:

LeCun, Y., Bengio, Y., and Hinton, G. (2015) Deep learning, Nature, 521(7553), 436–444.

Details about the multiplication gate:

Henry W. Lin and Max Tegmark. (2016) Why does deep and cheap learning work so well?, arXiv

You will also find more material than you can possibly want here:

http://deeplearning.net/


Summary

A neural network (NN) is a nonlinear function $y = g_\theta(u)$ from an input variable $u$ to an output variable $y$, parameterized by $\theta$.

We can think of an NN as a sequential/recursive construction of several generalized linear regressions.

Deep learning refers to learning NNs with several hidden layers. It allows for data-driven models that automatically learn representations of data (features) with multiple layers of abstraction.

A deep NN is very parameter-efficient when modelling high-dimensional, complex data.


Thank you!
