Basics of Neural Network Programming - Deep Learning cs230.stanford.edu/files/C1M2_old.pdf · Andrew...

Basics of Neural Network Programming Binary Classification deeplearning.ai

Post on 05-Apr-2020

TRANSCRIPT

Page 1: Basics of Neural Network Programming - Deep Learningcs230.stanford.edu/files/C1M2_old.pdf · Andrew Ng 1 (cat) vs 0 (non cat) 255134 93 22 123 94 83 2 34 44 187 30 34 76 232124 67

Basics of Neural Network Programming

Binary Classification

deeplearning.ai

1 (cat) vs 0 (non cat)

Pixel intensity values for the Red, Green, and Blue channels (three 5 x 4 matrices):

255 134  93  22
123  94  83   2
 34  44 187  30
 34  76 232 124
 67  83 194 142

255 134 202  22
123  94  83   4
 34  44 187 192
 34  76 232  34
 67  83 194  94

255 231  42  22
123  94  83   2
 34  44 187  92
 34  76 232 124
 67  83 194 202

Binary Classification
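A minimal sketch of how an image like this might be turned into one input feature vector, assuming the common approach of unrolling the three channel matrices into a single column. The variable names and the assignment of each matrix to red, green, or blue are illustrative assumptions, not taken from the slide.

```python
import numpy as np

# Three 5x4 channel matrices; which matrix is which colour is an assumption.
red = np.array([[255, 134, 93, 22], [123, 94, 83, 2], [34, 44, 187, 30],
                [34, 76, 232, 124], [67, 83, 194, 142]])
green = np.array([[255, 134, 202, 22], [123, 94, 83, 4], [34, 44, 187, 192],
                  [34, 76, 232, 34], [67, 83, 194, 94]])
blue = np.array([[255, 231, 42, 22], [123, 94, 83, 2], [34, 44, 187, 92],
                 [34, 76, 232, 124], [67, 83, 194, 202]])

# Unroll: all red values, then all green, then all blue, as one column vector.
x = np.concatenate([red.ravel(), green.ravel(), blue.ravel()]).reshape(-1, 1)
y = 1  # label: 1 (cat) vs 0 (non cat)
print(x.shape)
```

With 5 x 4 pixels and three channels, the feature vector has 5 * 4 * 3 = 60 entries.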

Notation

Basics of Neural Network Programming

Logistic Regression

deeplearning.ai

Logistic Regression

Basics of Neural Network Programming

Logistic Regression cost function

deeplearning.ai

Logistic Regression cost function

Loss (error) function: $\mathcal{L}(\hat{y}, y) = -\big(y \log \hat{y} + (1 - y) \log(1 - \hat{y})\big)$

Basics of Neural Network Programming

Gradient Descent

deeplearning.ai

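Gradient Descent

Recap: $\hat{y} = \sigma(w^T x + b)$, $\sigma(z) = \frac{1}{1 + e^{-z}}$

$$J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(\hat{y}^{(i)}, y^{(i)}) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log \hat{y}^{(i)} + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)}) \right]$$

Want to find $w, b$ that minimize $J(w, b)$

[Figure: plot of $J(w, b)$ over the axes $w$ and $b$]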
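The cost J(w, b) from the recap can be sketched in NumPy as follows. This is an illustrative sketch, not code from the slides: the function name `cost`, the (1, m) array layout, and the sample predictions are assumptions.

```python
import numpy as np

def cost(y_hat, y):
    """Average cross-entropy loss over m examples; y_hat and y have shape (1, m)."""
    m = y.shape[1]
    losses = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    return float(np.sum(losses) / m)

y = np.array([[1, 0, 1]])
y_hat = np.array([[0.9, 0.2, 0.6]])   # made-up predictions for illustration
print(cost(y_hat, y))
```

Predictions close to the true labels give a cost near zero, while confident wrong predictions are penalized heavily by the logarithm.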

Gradient Descent

[Figure: plot with horizontal axis $w$]
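A minimal sketch of the gradient descent update, which repeatedly moves w a small step against the derivative of the cost. The toy cost J(w) = (w - 3)^2, the learning rate `alpha`, and the function name are assumptions chosen for illustration; they do not appear on the slide.

```python
def gradient_descent(dJ_dw, w0, alpha=0.1, steps=100):
    """Repeat w := w - alpha * dJ/dw, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - alpha * dJ_dw(w)
    return w

# Toy cost J(w) = (w - 3)**2 with derivative 2 * (w - 3); its minimum is at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_min)
```

When the derivative is positive the update decreases w, and when it is negative the update increases w, so on a convex cost the iterate moves toward the minimum from either side.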

Basics of Neural Network Programming

Logistic Regression as a Neural Network

deeplearning.ai

Logistic Regression as a Neural Network

$\hat{y} = \sigma(w^T x + b)$, where $\sigma(z) = \frac{1}{1 + e^{-z}}$
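The prediction formula above could be written in NumPy roughly as follows. The function names `sigmoid` and `predict` and the sample values are assumptions made for this sketch.

```python
import numpy as np

def sigmoid(z):
    """sigma(z) = 1 / (1 + e^(-z)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, x):
    """y_hat = sigma(w^T x + b) for a single example x of shape (n_x, 1)."""
    z = np.dot(w.T, x) + b
    return sigmoid(z).item()

w = np.zeros((3, 1))
x = np.array([[0.5], [-1.0], [2.0]])
print(predict(w, 0.0, x))   # with all-zero parameters the output is sigma(0) = 0.5
```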

Basics of Neural Network Programming

Derivatives

deeplearning.ai

Intuition about derivatives

Basics of Neural Network Programming

More derivative examples

deeplearning.ai

Intuition about derivatives

More derivative examples

Basics of Neural Network Programming

Computation Graph

deeplearning.ai

Computation Graph

Basics of Neural Network Programming

Derivatives with a Computation Graph

deeplearning.ai

Computing derivatives

[Figure: computation graph with node values 6, 11, 33]

Computing derivatives

[Figure: computation graph with node values 6, 11, 33]
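A sketch of a forward and backward pass through a small computation graph. The function J = 3(a + bc) and the inputs a = 5, b = 3, c = 2 are assumptions: they are chosen because they reproduce the node values 6, 11, and 33 that survive from the slide, but the slide's own graph is not recoverable from the transcript.

```python
# Forward pass (left to right) through an assumed graph J = 3 * (a + b * c).
a, b, c = 5.0, 3.0, 2.0
u = b * c        # 6
v = a + u        # 11
J = 3 * v        # 33

# Backward pass (right to left), applying the chain rule one node at a time.
dJ_dv = 3.0              # J = 3 * v
dJ_du = dJ_dv * 1.0      # v = a + u
dJ_da = dJ_dv * 1.0
dJ_db = dJ_du * c        # u = b * c
dJ_dc = dJ_du * b
print(J, dJ_da, dJ_db, dJ_dc)
```

Each derivative reuses the one computed for the node to its right, which is why the backward pass is cheaper than differentiating every input from scratch.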

Basics of Neural Network Programming

Logistic Regression Gradient Descent

deeplearning.ai

Logistic regression recap

Logistic regression derivatives
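A hedged sketch of the derivatives for a single training example with two features, using the standard result that dL/dz = a - y for the sigmoid combined with the cross-entropy loss. The function name, the dictionary return value, and the sample inputs are assumptions for illustration.

```python
import math

def single_example_grads(w1, w2, b, x1, x2, y):
    """Backward pass for one example with two features."""
    z = w1 * x1 + w2 * x2 + b
    a = 1.0 / (1.0 + math.exp(-z))      # a = y_hat = sigma(z)
    dz = a - y                          # dL/dz for the cross-entropy loss
    return {"dw1": x1 * dz, "dw2": x2 * dz, "db": dz}

grads = single_example_grads(w1=0.0, w2=0.0, b=0.0, x1=1.0, x2=2.0, y=1)
print(grads)
```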

Basics of Neural Network Programming

Gradient descent on m examples

deeplearning.ai

Logistic regression on m examples

Logistic regression on m examples

Basics of Neural Network Programming

Vectorization

deeplearning.ai

What is vectorization?
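As a rough illustration of the idea, the sketch below computes z = w^T x + b twice: once with an explicit for-loop and once with a single `np.dot` call. The vector length and the random data are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
w = rng.standard_normal(n)
x = rng.standard_normal(n)
b = 0.5

# Non-vectorized: explicit for-loop over the n features.
z_loop = 0.0
for i in range(n):
    z_loop += w[i] * x[i]
z_loop += b

# Vectorized: a single call to np.dot.
z_vec = np.dot(w, x) + b
print(np.isclose(z_loop, z_vec))
```

Both versions give the same value up to floating-point rounding; the vectorized form hands the loop to optimized library code, which is typically much faster for large n.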

Basics of Neural Network Programming

More vectorization examples

deeplearning.ai

Neural network programming guideline

Whenever possible, avoid explicit for-loops.

Vectors and matrix valued functions

Say you need to apply the exponential operation on every element of a matrix/vector.

import math
import numpy as np

u = np.zeros((n,1))
for i in range(n):
    u[i] = math.exp(v[i])
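The loop above can be replaced by a single element-wise NumPy call. A minimal sketch, with a made-up input vector `v`:

```python
import numpy as np

v = np.array([[0.0], [1.0], [2.0]])   # made-up input for illustration
u = np.exp(v)                         # element-wise exponential, same shape as v
print(u.shape)
```

NumPy offers similar element-wise functions such as `np.log`, `np.abs`, and `np.maximum`, so the same pattern applies to other per-element operations.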

Logistic regression derivatives

J = 0, dw1 = 0, dw2 = 0, db = 0

for i = 1 to n:
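The body of this loop is not preserved in the transcript. The sketch below shows one plausible non-vectorized version, assuming two features, the standard derivative dz = a - y, and accumulators that are averaged after the loop. The function name, the array layout, and the data are assumptions, and the number of examples is called `m` in the code.

```python
import numpy as np

def gradients_with_loop(w1, w2, b, X, Y):
    """Accumulate J, dw1, dw2, db over the examples, then average.
    X has shape (2, m) and Y has shape (m,)."""
    m = X.shape[1]
    J = dw1 = dw2 = db = 0.0
    for i in range(m):
        x1, x2, y = X[0, i], X[1, i], Y[i]
        z = w1 * x1 + w2 * x2 + b
        a = 1.0 / (1.0 + np.exp(-z))
        J += -(y * np.log(a) + (1 - y) * np.log(1 - a))
        dz = a - y
        dw1 += x1 * dz
        dw2 += x2 * dz
        db += dz
    return J / m, dw1 / m, dw2 / m, db / m

X = np.array([[1.0, 2.0, -1.0], [0.5, -1.5, 2.0]])   # made-up data
Y = np.array([1, 0, 1])
print(gradients_with_loop(0.0, 0.0, 0.0, X, Y))
```

With more features, the separate `dw1`, `dw2` accumulators would need a second loop over features, which is the kind of explicit loop that vectorization removes.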

Basics of Neural Network Programming

Vectorizing Logistic Regression

deeplearning.ai

Vectorizing Logistic Regression
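A minimal sketch of a vectorized forward pass, assuming the m training examples are stacked as the columns of a matrix X so that all predictions are computed in one matrix product. The function name, shapes, and data are assumptions for illustration.

```python
import numpy as np

def forward(w, b, X):
    """Compute A = sigma(w^T X + b) for all m examples at once.
    w: (n_x, 1), X: (n_x, m), result: (1, m)."""
    Z = np.dot(w.T, X) + b      # the scalar b is broadcast across the m columns
    return 1.0 / (1.0 + np.exp(-Z))

X = np.array([[1.0, 2.0, -1.0], [0.5, -1.5, 2.0]])   # made-up data, m = 3
w = np.array([[0.1], [-0.2]])
A = forward(w, 0.0, X)
print(A.shape)
```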

Basics of Neural Network Programming

Vectorizing Logistic Regression’s Gradient Computation

deeplearning.ai

Vectorizing Logistic Regression

Implementing Logistic Regression

for i = 1 to m:
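The loop over the m examples can be removed by computing all gradients with matrix operations. The sketch below is one way to do that under stated assumptions: examples are stored as columns of X, labels as a (1, m) row Y, and `alpha` is a learning rate picked for illustration. None of these names come from the transcript.

```python
import numpy as np

def gradient_step(w, b, X, Y, alpha=0.1):
    """One vectorized gradient descent step.
    w: (n_x, 1), X: (n_x, m), Y: (1, m); alpha is an assumed learning rate."""
    m = X.shape[1]
    A = 1.0 / (1.0 + np.exp(-(np.dot(w.T, X) + b)))
    dZ = A - Y                       # shape (1, m)
    dw = np.dot(X, dZ.T) / m         # shape (n_x, 1)
    db = np.sum(dZ) / m
    return w - alpha * dw, b - alpha * db

X = np.array([[1.0, 2.0, -1.0], [0.5, -1.5, 2.0]])   # made-up data
Y = np.array([[1, 0, 1]])
w, b = np.zeros((2, 1)), 0.0
for _ in range(1000):                # the loop over iterations remains
    w, b = gradient_step(w, b, X, Y)
print(w.ravel(), b)
```

Only the loop over training examples disappears; running several gradient descent iterations still needs an outer loop.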

Basics of Neural Network Programming

Broadcasting in Python

deeplearning.ai

Broadcasting example

Calories from Carbs, Proteins, Fats in 100g of different foods:

           Apples    Beef    Eggs    Potatoes
Carb         56.0     0.0     4.4        68.0
Protein       1.2   104.0    52.0         8.0
Fat           1.8   135.0    99.0         0.9

cal = A.sum(axis = 0)
percentage = 100*A/(cal.reshape(1,4))
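A runnable version of the two lines from the slide, with the matrix A filled in from the calorie table. The row order (Carb, Protein, Fat) is an assumption based on the slide's caption, since the row labels are scrambled in the transcript.

```python
import numpy as np

# Rows: Carb, Protein, Fat (assumed order); columns: Apples, Beef, Eggs, Potatoes.
A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])

cal = A.sum(axis=0)                          # column totals, shape (4,)
percentage = 100 * A / cal.reshape(1, 4)     # (3, 4) divided by (1, 4)
print(np.round(percentage, 1))
```

The division works without a loop because the (1, 4) row of totals is broadcast down the three rows of A, so each entry is divided by the total for its own food.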

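Broadcasting example

$$\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} + 100 = \begin{bmatrix} 101 \\ 102 \\ 103 \\ 104 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} + \begin{bmatrix} 100 & 200 & 300 \end{bmatrix} = \begin{bmatrix} 101 & 202 & 303 \\ 104 & 205 & 306 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} + \begin{bmatrix} 100 \\ 200 \end{bmatrix} = \begin{bmatrix} 101 & 102 & 103 \\ 204 & 205 & 206 \end{bmatrix}$$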
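The three additions on this slide can be reproduced in NumPy as a quick check. The variable names are assumptions made for the sketch.

```python
import numpy as np

col = np.array([[1], [2], [3], [4]])
r1 = col + 100                                # scalar added to every element

M = np.array([[1, 2, 3], [4, 5, 6]])
r2 = M + np.array([[100, 200, 300]])          # (2, 3) + (1, 3): row added to each row
r3 = M + np.array([[100], [200]])             # (2, 3) + (2, 1): column added to each column
print(r1.ravel(), r2, r3, sep="\n")
```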

General Principle
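The slide's own statement of the principle is not preserved in the transcript. As a hedged illustration of NumPy's behavior, the sketch below checks that combining an (m, n) matrix with a (1, n) row or an (m, 1) column gives the same result as first copying the smaller operand out to shape (m, n). The arrays and names are assumptions.

```python
import numpy as np

M = np.arange(6).reshape(2, 3)        # shape (2, 3)
row = np.array([[10, 20, 30]])        # shape (1, 3)
col = np.array([[100], [200]])        # shape (2, 1)

# Broadcasting behaves as if the smaller operand were copied to shape (2, 3).
row_copied = np.broadcast_to(row, M.shape)
col_copied = np.broadcast_to(col, M.shape)
same_row = np.array_equal(M + row, M + row_copied)
same_col = np.array_equal(M * col, M * col_copied)
print(same_row, same_col)
```

The same rule applies to addition, subtraction, multiplication, and division, all of which operate element-wise once the shapes match.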

Basics of Neural Network Programming

Explanation of logistic regression cost function (Optional)

deeplearning.ai

Logistic regression cost function

If $y = 1$: $p(y \mid x) = \hat{y}$

If $y = 0$: $p(y \mid x) = 1 - \hat{y}$

Logistic regression cost function
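The two cases can be combined into a single expression, p(y|x) = ŷ^y (1 - ŷ)^(1 - y), whose logarithm is the negative of the cross-entropy loss. The sketch below checks this numerically; the function names and the sample prediction are assumptions, and the combined formula is the standard one rather than text recovered from the slide.

```python
import math

def prob_of_label(y_hat, y):
    """p(y|x) = y_hat**y * (1 - y_hat)**(1 - y) for y in {0, 1}."""
    return (y_hat ** y) * ((1 - y_hat) ** (1 - y))

def loss(y_hat, y):
    """Cross-entropy loss for one example."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

y_hat = 0.8   # made-up prediction
for y in (0, 1):
    p = prob_of_label(y_hat, y)
    # log p(y|x) equals the negative loss, so maximizing one minimizes the other.
    print(y, p, math.isclose(math.log(p), -loss(y_hat, y)))
```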

Cost on m examples