Midterm Review - CS230 Deep Learning (cs230.stanford.edu/fall2018/midterm_review.pdf)

Midterm Review CS230 Fall 2018

Broadcasting

Calculating Means

How would you calculate the means across the rows of the following matrix? How about the columns?

Rows: row_mu = np.sum(M, axis=1) / M.shape[1]
Cols: col_mu = np.sum(M, axis=0) / M.shape[0]
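A minimal sketch of the two reductions on a small matrix. The matrix values are hypothetical (the slide's matrix is not in the transcript), and the `keepdims=True` line is an addition showing how to keep a shape that broadcasts back against M.

```python
import numpy as np

# Hypothetical 2 x 3 matrix, used only for illustration.
M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Mean of each row: sum along axis 1, divide by the number of columns.
row_mu = np.sum(M, axis=1) / M.shape[1]   # shape (2,)

# Mean of each column: sum along axis 0, divide by the number of rows.
col_mu = np.sum(M, axis=0) / M.shape[0]   # shape (3,)

# keepdims=True keeps shape (2, 1), so the row means broadcast against M.
centered = M - np.mean(M, axis=1, keepdims=True)
```

Without `keepdims=True` the row means have shape (2,), which does not broadcast against a (2, 3) matrix.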

Computing Softmax

How would you compute the softmax across the columns of the following matrix?

exp = np.exp(M)

smx = exp / np.sum(exp, axis=0)
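The two lines above work for moderate values but `np.exp` overflows for large entries. A common variant, sketched here as an addition to the slide, subtracts each column's maximum first; this leaves the softmax unchanged. The function name `softmax_columns` and the test matrix are hypothetical.

```python
import numpy as np

def softmax_columns(M):
    """Softmax over each column of M (normalizing along axis 0)."""
    # Subtracting the per-column max does not change the result
    # but keeps np.exp from overflowing.
    shifted = M - np.max(M, axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=0, keepdims=True)

# Hypothetical input: the second column would overflow without the shift.
M = np.array([[1.0, 1000.0],
              [2.0, 1001.0],
              [3.0, 1002.0]])
smx = softmax_columns(M)
```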

Computing Distances

How would you compute the closest column in the matrix X to the vector V (in terms of Euclidean distance)?

sq_diff = np.square(X-V)
dists = np.sqrt(np.sum(sq_diff, axis=0))
nearest = np.argmin(dists)
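A runnable sketch of the same three steps. It assumes V is a column vector of shape (d, 1) so that it broadcasts across the columns of X; a 1-D V of shape (d,) would need `V.reshape(-1, 1)` first. The data values are hypothetical.

```python
import numpy as np

# Hypothetical data: X holds three 2-D points as columns, V has shape (2, 1).
X = np.array([[0.0, 3.0, 1.0],
              [0.0, 4.0, 1.0]])
V = np.array([[1.0],
              [2.0]])

sq_diff = np.square(X - V)                 # V broadcasts across the columns of X
dists = np.sqrt(np.sum(sq_diff, axis=0))   # one Euclidean distance per column
nearest = np.argmin(dists)                 # index of the closest column
```

Since the square root is monotonic, `np.argmin` over the summed squares alone gives the same index.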

L1/L2 Regularization

Logistic Regression and Separable Data

What’s the issue with training a logistic regression model on the following data?

The data is linearly separable, so scaling the weights up always lowers the loss. The parameters will tend to plus/minus infinity, so training will never converge.

Solving the Exploding Weights Issue

What modification of the loss function can you implement to solve this issue? Write out the new loss function.

Add L2 Regularization

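$$J(w) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log\hat{y}^{(i)} + (1-y^{(i)})\log(1-\hat{y}^{(i)})\Big] + \frac{\lambda}{2}\lVert w\rVert_2^2$$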

This new loss function will keep the magnitude of the weights from exploding!
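A small experiment can make this concrete. The sketch below assumes a penalty of the form (lam / 2) * ||w||^2, a model without a bias term, and a hypothetical 1-D separable dataset; none of these come from the slides. Without the penalty the weight keeps growing with more steps, while with it the weight settles.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, y, lam, steps, lr=0.5):
    """Gradient descent on mean logistic loss + (lam / 2) * ||w||^2 (no bias)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        y_hat = sigmoid(X @ w)
        grad = X.T @ (y_hat - y) / len(y) + lam * w
        w -= lr * grad
    return w

# Hypothetical separable data: negatives left of 0, positives right of 0.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w_plain_short = train(X, y, lam=0.0, steps=500)
w_plain_long = train(X, y, lam=0.0, steps=5000)   # still growing
w_l2_short = train(X, y, lam=0.1, steps=500)
w_l2_long = train(X, y, lam=0.1, steps=5000)      # same as the short run
```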

Gradient of the New Loss

Compute the gradient of the new loss function with respect to the weight vector.
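The answer on the slide is not in the transcript. As a sketch, assuming the loss is mean cross-entropy plus (lam / 2) * ||w||^2 with examples stored as rows of X and no bias term, the gradient is X^T (y_hat - y) / m + lam * w. The code checks that formula against central finite differences on hypothetical random data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y, lam):
    """Mean cross-entropy plus (lam / 2) * ||w||^2 (assumed penalty form)."""
    y_hat = sigmoid(X @ w)
    ce = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    return ce + 0.5 * lam * np.dot(w, w)

def grad(w, X, y, lam):
    """Analytic gradient: X^T (y_hat - y) / m + lam * w."""
    y_hat = sigmoid(X @ w)
    return X.T @ (y_hat - y) / len(y) + lam * w

# Hypothetical random data, only used to check the formula numerically.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) < 0.5).astype(float)
w = rng.normal(size=3)
lam = 0.1

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
numeric = np.array([
    (loss(w + eps * e, X, y, lam) - loss(w - eps * e, X, y, lam)) / (2 * eps)
    for e in np.eye(3)
])
analytic = grad(w, X, y, lam)
```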

Another Solution...

What is another, similar modification to the loss function that could help with this issue? Compute its gradient.

Add L1 Regularization:
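The equation from the slide is not in the transcript. One standard way to write it, with J(w) denoting the unregularized logistic loss and the penalty weight λ as assumed notation, is sketched below. The absolute value is not differentiable at zero, so the expression is a subgradient there.

```latex
J_{L1}(w) = J(w) + \lambda \lVert w \rVert_1 = J(w) + \lambda \sum_{j} \lvert w_j \rvert

\frac{\partial J_{L1}}{\partial w_j} = \frac{\partial J}{\partial w_j} + \lambda \,\mathrm{sign}(w_j), \qquad w_j \neq 0
```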

Backprop

CNN Input/Output Sizes

Basic (no padding, stride 1)

[Figure: Input, Filter, Conv Output]

Output shape = n - f + 1

Padding

[Figure: zero-padded Input, Filter, Conv Output]

Output shape = n + 2p - f + 1

Valid and Same Convolutions

● Valid
    ○ No padding
    ○ Output shape -> n - f + 1

● Same
    ○ Pad so that the output size is the same as the input size
    ○ Output shape -> n + 2p - f + 1 = n, so p = (f - 1)/2 (stride 1)

Stride

[Figure: Input, Filter, Conv Output]

Output shape = floor((n - f)/s) + 1

With Stride

n x n image

f x f filter

p padding

s stride

Output size -> floor((n + 2p - f)/s) + 1
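The formula can be sketched as a small helper. The function name `conv_output_size` and the example sizes are hypothetical; integer floor division covers the case where the stride does not divide n + 2p - f evenly.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output side length for an n x n input, f x f filter, padding p, stride s."""
    # Floor division: a final partial window is dropped.
    return (n + 2 * p - f) // s + 1

valid = conv_output_size(6, 3)           # no padding, stride 1: n - f + 1
same = conv_output_size(6, 3, p=1)       # p = (f - 1) / 2 keeps the size at n
strided = conv_output_size(7, 3, s=2)    # stride 2, no padding
pooled = conv_output_size(4, 2, s=2)     # 2 x 2 pooling with stride 2 on a 4 x 4 input
```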

Maxpool

Forward prop

Size of output: floor((n - f)/s) + 1. For a 4 x 4 input with f = 2 and s = 2 this is (4 - 2)/2 + 1 = 2.

Input:

1  3  2  1
4 10  5  1
1  6  6  5
2  4  2  9

2 x 2 Pooling layer with stride 2

Output:

10  5
 6  9
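The forward pass on this example can be reproduced with a short loop. This is a minimal sketch for a single 2-D array; the function name `maxpool_forward` is hypothetical, and real layers also handle batch and channel dimensions.

```python
import numpy as np

def maxpool_forward(x, f=2, s=2):
    """Max pooling on a 2-D array with an f x f window and stride s."""
    n_h, n_w = x.shape
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    out = np.zeros((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # Take the maximum of the window whose top-left corner is (i*s, j*s).
            window = x[i * s:i * s + f, j * s:j * s + f]
            out[i, j] = window.max()
    return out

# The 4 x 4 input from the slide.
x = np.array([[1, 3, 2, 1],
              [4, 10, 5, 1],
              [1, 6, 6, 5],
              [2, 4, 2, 9]])
pooled = maxpool_forward(x)
```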

Backprop

Input to maxpool layer:

1  3  2  1
4 10  5  1
1  6  6  5
2  4  2  9

Output of maxpool layer:

10  5
 6  9

Gradient w.r.t. output:

-4  7
 5 -6

Gradient w.r.t. input:

? ? ? ?
? ? ? ?
? ? ? ?
? ? ? ?

ReLU

Maxpool

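$$\frac{\partial\,\mathrm{Maxpool}(m_{ij})}{\partial m_{ij}} = \begin{cases} 0 & \text{if } m_{ij} \neq \max(m) \\ 1 & \text{if } m_{ij} = \max(m) \end{cases}$$

$$\frac{\partial L}{\partial m_{ij}} = \begin{cases} 0 & \text{if } m_{ij} \neq \max(m) \\ \dfrac{\partial L}{\partial\,\mathrm{Maxpool}(m_{ij})} & \text{if } m_{ij} = \max(m) \end{cases}$$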

Backprop

Keep track of where the maximum value is

Input:

1  3  2  1
4 10  5  1
1  6  6  5
2  4  2  9

Output:

10  5
 6  9

Mask:

0 0 0 0
0 1 1 0
0 1 0 0
0 0 0 1

Backprop

Mask:

0 0 0 0
0 1 1 0
0 1 0 0
0 0 0 1

Gradient w.r.t. output:

-4  7
 5 -6

Gradient w.r.t. input:

0  0  0  0
0 -4  7  0
0  5  0  0
0  0  0 -6
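The mask-based backward pass can be sketched in a few lines. The function name `maxpool_backward` is hypothetical, and this version sends the gradient to every position tied for the maximum in a window, which is one of several possible conventions. Each upstream gradient lands at the position of its window's maximum, so the 7 from the top-right output cell goes to where the 5 sits in the input.

```python
import numpy as np

def maxpool_backward(x, d_out, f=2, s=2):
    """Route each upstream gradient to the max position(s) of its window."""
    d_x = np.zeros_like(x, dtype=float)
    out_h, out_w = d_out.shape
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * s:i * s + f, j * s:j * s + f]
            mask = (window == window.max())   # 1 where the max is, 0 elsewhere
            d_x[i * s:i * s + f, j * s:j * s + f] += mask * d_out[i, j]
    return d_x

# Input and upstream gradient from the slides.
x = np.array([[1, 3, 2, 1],
              [4, 10, 5, 1],
              [1, 6, 6, 5],
              [2, 4, 2, 9]])
d_out = np.array([[-4.0, 7.0],
                  [5.0, -6.0]])
d_x = maxpool_backward(x, d_out)
```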

Error Analysis

Dog Classifier

Trying to predict dog vs not dog.

Improving performance

Two kinds of errors - misclassification on muffins and fried chicken

https://medium.freecodecamp.org/chihuahua-or-muffin-my-search-for-the-best-computer-vision-api-cbda4d6b425d

https://barkpost.com/doodle-or-fried-chicken-twitter/

Error analysis

- Get 100 misclassified examples from the dev set

Image number | Classified as muffin | Classified as chicken | ... | Comments
1            | Y                    | -                     |     |
2            | -                    | Y                     |     |
...          | -                    | -                     |     |
100          | -                    | Y                     |     |
% of total   | 5%                   | 50%                   |     |
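The tally at the bottom of the table is just a per-category fraction. As a sketch, with a hypothetical list of hand-assigned labels constructed to match the 5% and 50% on the slide (the "other" bucket is a made-up filler for the remaining images):

```python
from collections import Counter

# Hypothetical annotations for 100 dev-set errors, one label per image.
labels = ["muffin"] * 5 + ["chicken"] * 50 + ["other"] * 45

counts = Counter(labels)
fractions = {category: count / len(labels) for category, count in counts.items()}

# The largest category is the most promising one to work on first.
top_category = max(fractions, key=fractions.get)
```

With these numbers, fixing every muffin error would remove at most 5% of the errors, while fixing the fried chicken errors could remove up to 50%.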

Strategic Data Acquisition

Trigger Word Detection

Classification

Dropout

Batchnorm
