cs6501: deep learning for visual recognition introduction...
TRANSCRIPT
![Page 1: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/1.jpg)
CS6501: Deep Learning for Visual Recognition
Introduction to Machine Learning
![Page 2: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/2.jpg)
Today’s Class
Intro to Machine Learning
• What is Machine Learning?
• Supervised vs Unsupervised Learning
• Supervised Learning: Classification with K-nearest neighbors
• Unsupervised Learning: Clustering with K-means clustering
• Softmax Classifier
![Page 3: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/3.jpg)
Teaching Assistants
Tianlu Wang (tw8cb@virginia.edu)
Hours: 10am to noon on Weds at Rice 436
Ziyan Yang ([email protected])
Hours: 11am to 12:30pm on Tues and Thurs at Rice 436
![Page 4: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/4.jpg)
Also…
• Assignment 2 released.
• Subscribe to Piazza and check it regularly; important information about assignments will go there. Please don't email the instructor with questions that could be answered through Piazza.
![Page 5: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/5.jpg)
Image Features: HoG
Scikit-image implementation
Paper by Navneet Dalal & Bill Triggs presented at CVPR 2005 for detecting people.
Images by Satya Mallick https://www.learnopencv.com/histogram-of-oriented-gradients/
![Page 6: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/6.jpg)
Compute gradients: $g_x$, $g_y$, and the gradient magnitude $\sqrt{g_x^2 + g_y^2}$
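The gradient step can be sketched in plain Python. This is only an illustration: the centered-difference filter `[-1, 0, 1]`, the clamped borders, and the tiny test image are assumptions, not details taken from the slides.

```python
import math

def gradients(image):
    """Return (gx, gy) for a 2-D list of gray values using centered differences."""
    h, w = len(image), len(image[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Clamp indices at the border (an assumption; other paddings are possible).
            left, right = image[r][max(c - 1, 0)], image[r][min(c + 1, w - 1)]
            up, down = image[max(r - 1, 0)][c], image[min(r + 1, h - 1)][c]
            gx[r][c] = right - left
            gy[r][c] = down - up
    return gx, gy

def magnitude_and_angle(gx, gy):
    """Per-pixel magnitude sqrt(gx^2 + gy^2) and unsigned direction in [0, 180) degrees."""
    mag = [[math.hypot(x, y) for x, y in zip(rx, ry)] for rx, ry in zip(gx, gy)]
    ang = [[math.degrees(math.atan2(y, x)) % 180.0 for x, y in zip(rx, ry)]
           for rx, ry in zip(gx, gy)]
    return mag, ang

# Hypothetical image with a single vertical edge between columns 1 and 2.
image = [[0, 0, 10, 10],
         [0, 0, 10, 10],
         [0, 0, 10, 10]]
gx, gy = gradients(image)
mag, ang = magnitude_and_angle(gx, gy)
```

The edge shows up as a nonzero horizontal gradient in the two middle columns, while the vertical gradient stays zero everywhere.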
![Page 7: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/7.jpg)
We will aggregate gradient magnitude and directions in 8x8 pixel regions
![Page 8: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/8.jpg)
Compute a histogram with 9 bins for angles from 0 to 180
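A minimal sketch of the per-cell histogram, assuming each pixel votes with its gradient magnitude into exactly one 20-degree bin. The original HoG paper spreads each vote between neighboring bins, so the hard assignment here is a simplification, and the cell contents are made up.

```python
def cell_histogram(mag, ang, bins=9):
    """Magnitude-weighted histogram of unsigned angles (degrees in [0, 180)) for one cell."""
    width = 180.0 / bins                      # 20 degrees per bin when bins == 9
    hist = [0.0] * bins
    for row_m, row_a in zip(mag, ang):
        for m, a in zip(row_m, row_a):
            hist[int(a // width) % bins] += m  # hard assignment to a single bin
    return hist

# Hypothetical 8x8 cell: every pixel has magnitude 2 and direction 45 degrees.
mag = [[2.0] * 8 for _ in range(8)]
ang = [[45.0] * 8 for _ in range(8)]
hist = cell_histogram(mag, ang)
```

All 64 votes land in the third bin (40 to 60 degrees), so that bin holds the total magnitude of the cell.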
![Page 9: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/9.jpg)
Normalize histograms with respect to histograms of adjacent neighbors.
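One common way to do this normalization is to group neighboring cells into a block and rescale the concatenated histograms to unit length. The sketch below assumes 2x2-cell blocks and plain L2 normalization; both are assumptions rather than details stated on the slide.

```python
import math

def normalize_block(cell_histograms, eps=1e-6):
    """Concatenate the histograms of the cells in one block and L2-normalize the result."""
    block = [v for hist in cell_histograms for v in hist]
    norm = math.sqrt(sum(v * v for v in block) + eps ** 2)
    return [v / norm for v in block]

# Four hypothetical 9-bin cell histograms forming one 2x2 block.
cells = [[0.0] * 9 for _ in range(4)]
cells[0][2] = 3.0
cells[3][5] = 4.0
block = normalize_block(cells)
```

Normalizing against neighbors makes the descriptor less sensitive to local changes in brightness and contrast, since only the relative sizes of the bins within a block survive.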
![Page 10: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/10.jpg)
Image (or image region) represented by a vector containing all the histograms.
In this case how long is that vector?
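The answer depends on the image size and on how blocks are formed, neither of which survives in this transcript. As a hedged worked example, assume a 64x128 window (the size used in the Dalal and Triggs person detector), 8x8-pixel cells, 9 bins, and 2x2-cell blocks that slide one cell at a time.

```python
def hog_length(width, height, cell=8, block=2, bins=9):
    """Descriptor length when every block of block x block cells contributes its own histograms."""
    cells_x, cells_y = width // cell, height // cell
    blocks_x, blocks_y = cells_x - block + 1, cells_y - block + 1
    return blocks_x * blocks_y * block * block * bins

# Assumed 64x128 window: 8 x 16 cells, 7 x 15 overlapping blocks, 36 values per block.
n_with_blocks = hog_length(64, 128)          # 7 * 15 * 36 = 3780
# Without block overlap, one 9-bin histogram per cell.
n_cells_only = (64 // 8) * (128 // 8) * 9    # 8 * 16 * 9 = 1152
```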
![Page 11: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/11.jpg)
Image Features: HoG
Paper by Navneet Dalal & Bill Triggs presented at CVPR 2005 for detecting people.
Figure from Zhuolin Jiang, Zhe Lin, Larry S. Davis, ICCV 2009 for human action recognition.
+ Block Normalization
![Page 12: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/12.jpg)
Slide by Fei-Fei Li
Extract SIFT Feature Descriptors
Compute Histograms of Features
![Page 13: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/13.jpg)
Summary: Image Features
• Many other features proposed
  • LBP: Local Binary Patterns: Useful for recognizing faces.
  • Dense SIFT: SIFT features computed on a grid similar to the HOG features.
  • etc.
• Largely replaced by Neural networks
• Still useful to study for inspiration in designing neural networks that compute features.
![Page 14: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/14.jpg)
Machine Learning
• Machine learning is the subfield of computer science that gives "computers the ability to learn without being explicitly programmed."
- Term coined by Arthur Samuel in 1959 while at IBM
• The study of algorithms that can learn from data.
• In contrast to previous Artificial Intelligence systems based on Logic, e.g. "Expert Systems"
![Page 15: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/15.jpg)
Supervised Learning vs Unsupervised Learning
Supervised: labeled examples (cat, cat, cat, dog, dog, dog, bear, bear, bear), $x \rightarrow y$
Unsupervised: unlabeled examples, $x$
![Page 16: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/16.jpg)
![Page 17: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/17.jpg)
Supervised Learning vs Unsupervised Learning
Supervised learning: Classification
Unsupervised learning: Clustering
![Page 18: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/18.jpg)
Supervised Learning Examples
cat
Language Parsing
Face Detection
Classification
Structured Prediction
![Page 19: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/19.jpg)
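Supervised Learning Examples
cat = $f(\cdot)$
[Two further examples of the same form, output = $f(\text{input})$; their inputs and outputs were images on the slide.]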
![Page 20: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/20.jpg)
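Supervised Learning – k-Nearest Neighbors
Training labels: cat, cat, cat, dog, dog, dog, bear, bear, bear
k=3: the three nearest neighbors of the query image are cat, cat, dog, so the majority vote is cat.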
![Page 21: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/21.jpg)
Supervised Learning – k-Nearest Neighbors
k=3: for a second query image, the three nearest neighbors are bear, dog, dog, so the majority vote is dog.
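A minimal sketch of k-nearest neighbors with majority voting. The 2-D feature vectors, the Euclidean distance, and the query point are all made up for illustration; in the slides the features would come from images.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label). Return (predicted label, labels of the k nearest)."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest = [label for _, label in by_distance[:k]]
    return Counter(nearest).most_common(1)[0][0], nearest

# Hypothetical 2-D features for nine labeled training images.
train = [((0.0, 0.0), "cat"), ((0.5, 0.2), "cat"), ((0.2, 0.6), "cat"),
         ((3.0, 3.0), "dog"), ((3.4, 2.8), "dog"), ((2.9, 3.5), "dog"),
         ((6.0, 0.5), "bear"), ((6.3, 0.1), "bear"), ((5.8, 0.9), "bear")]

label, neighbors = knn_predict(train, (1.6, 1.6), k=3)
```

For this query the three nearest neighbors are two cats and one dog, so the vote returns `"cat"`.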
![Page 22: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/22.jpg)
Supervised Learning – k-Nearest Neighbors
• How do we choose the right K?
• How do we choose the right features?
• How do we choose the right distance metric?
![Page 23: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/23.jpg)
Answer: Just choose the combination that works best! BUT not on the test data.
Instead, split the training data into a "Training set" and a "Validation set" (also called "Development set").
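The selection procedure can be sketched as follows. The synthetic three-class data, the 80/20 split, and the candidate values of k are assumptions chosen for illustration; the test set is deliberately absent from this code.

```python
import math
import random
from collections import Counter

def knn_predict(train, query, k):
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, held_out, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in held_out)
    return hits / len(held_out)

# Hypothetical 2-D data: three Gaussian blobs, one per class.
rng = random.Random(0)
centers = {"cat": (0.0, 0.0), "dog": (3.0, 3.0), "bear": (6.0, 0.0)}
data = [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
        for label, (cx, cy) in centers.items() for _ in range(30)]
rng.shuffle(data)

# Split the training data 80/20 into a training set and a validation set.
split = int(0.8 * len(data))
train_set, val_set = data[:split], data[split:]

# Try several values of k and keep the one with the best validation accuracy.
scores = {k: accuracy(train_set, val_set, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
```

The same loop extends to features and distance metrics: evaluate every candidate combination on the validation set and keep the best one.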
![Page 24: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/24.jpg)
Training, Validation (Dev), Test Sets
Training Set Validation Set
Testing Set
![Page 25: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/25.jpg)
Training Set and Validation Set: Used during development.
![Page 26: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/26.jpg)
Testing Set: Only to be used for evaluating the model at the very end of development. Any changes made to the model after running it on the test set could be influenced by what you saw happen on the test set, which would invalidate any future evaluation.
![Page 27: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/27.jpg)
Unsupervised Learning – k-means clustering
k = 3
1. Initially assign all images to a random cluster
![Page 28: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/28.jpg)
2. Compute the mean image (in feature space) for each cluster
![Page 29: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/29.jpg)
3. Reassign images to clusters based on similarity to cluster means
![Page 30: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/30.jpg)
4. Keep repeating this process until convergence
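The four steps can be sketched in plain Python. The 2-D points, the squared Euclidean distance, the fixed random seed, and the handling of empty clusters are assumptions made for this illustration.

```python
import random

def mean(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k=3, seed=0, max_iters=100):
    rng = random.Random(seed)
    # 1. Initially assign every point to a random cluster.
    assign = [rng.randrange(k) for _ in points]
    for _ in range(max_iters):
        # 2. Compute the mean of each cluster (an empty cluster is reseeded with a random point).
        means = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            means.append(mean(members) if members else rng.choice(points))
        # 3. Reassign each point to the cluster with the closest mean.
        new_assign = [min(range(k), key=lambda c: sq_dist(p, means[c])) for p in points]
        # 4. Repeat until the assignments stop changing.
        if new_assign == assign:
            break
        assign = new_assign
    return assign, means

# Hypothetical feature vectors forming three loose groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3),
          (9.0, 0.2), (9.1, 0.0), (8.8, 0.4)]
assign, means = kmeans(points, k=3)
```

The result depends on the random initial assignment, so different seeds can converge to different clusterings.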
![Page 31: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/31.jpg)
![Page 32: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/32.jpg)
![Page 33: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/33.jpg)
Unsupervised Learning – k-means clustering
• How do we choose the right K?
• How do we choose the right features?
• How do we choose the right distance metric?
• How sensitive is this method with respect to the random assignment of clusters?
Answer: Just choose the combination that works best! BUT not on the test data.
Instead, split the training data into a "Training set" and a "Validation set" (also called "Development set").
![Page 34: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/34.jpg)
Supervised Learning - Classification
cat
cat cat
dog
dog
dog
bear
bear
bear
Training Data Test Data
![Page 35: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/35.jpg)
Supervised Learning - Classification
Training Data: cat, cat, dog, ..., bear
Test Data: ...
![Page 36: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/36.jpg)
Supervised Learning - Classification
Training Data: cat, cat, dog, ..., bear
$x_1 = [\ ]$, $x_2 = [\ ]$, $x_3 = [\ ]$, ..., $x_n = [\ ]$
$y_1 = [\ ]$, $y_2 = [\ ]$, $y_3 = [\ ]$, ..., $y_n = [\ ]$
![Page 37: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/37.jpg)
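Supervised Learning - Classification
Training Data

| inputs | targets / labels / ground truth | predictions |
|---|---|---|
| $x_1 = [x_{11}\ x_{12}\ x_{13}\ x_{14}]$ | $y_1 = 1$ | $\hat{y}_1 = 1$ |
| $x_2 = [x_{21}\ x_{22}\ x_{23}\ x_{24}]$ | $y_2 = 1$ | $\hat{y}_2 = 2$ |
| $x_3 = [x_{31}\ x_{32}\ x_{33}\ x_{34}]$ | $y_3 = 2$ | $\hat{y}_3 = 2$ |
| ... | ... | ... |
| $x_n = [x_{n1}\ x_{n2}\ x_{n3}\ x_{n4}]$ | $y_n = 3$ | $\hat{y}_n = 1$ |

$\hat{y}_i = f(x_i; \theta)$

We need to find a function that maps x to y for any of them.
How do we "learn" the parameters of this function?
We choose the ones that make the following quantity small:

$\sum_{i=1}^{n} \text{Cost}(\hat{y}_i, y_i)$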
![Page 38: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/38.jpg)
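Supervised Learning – Linear Softmax
Training Data

| inputs | targets / labels / ground truth |
|---|---|
| $x_1 = [x_{11}\ x_{12}\ x_{13}\ x_{14}]$ | $y_1 = 1$ |
| $x_2 = [x_{21}\ x_{22}\ x_{23}\ x_{24}]$ | $y_2 = 1$ |
| $x_3 = [x_{31}\ x_{32}\ x_{33}\ x_{34}]$ | $y_3 = 2$ |
| ... | ... |
| $x_n = [x_{n1}\ x_{n2}\ x_{n3}\ x_{n4}]$ | $y_n = 3$ |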
![Page 39: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/39.jpg)
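Supervised Learning – Linear Softmax
Training Data

| inputs | targets / labels / ground truth | predictions |
|---|---|---|
| $x_1 = [x_{11}\ x_{12}\ x_{13}\ x_{14}]$ | $y_1 = [1\ 0\ 0]$ | $\hat{y}_1 = [0.85\ 0.10\ 0.05]$ |
| $x_2 = [x_{21}\ x_{22}\ x_{23}\ x_{24}]$ | $y_2 = [1\ 0\ 0]$ | $\hat{y}_2 = [0.40\ 0.45\ 0.15]$ |
| $x_3 = [x_{31}\ x_{32}\ x_{33}\ x_{34}]$ | $y_3 = [0\ 1\ 0]$ | $\hat{y}_3 = [0.20\ 0.70\ 0.10]$ |
| ... | ... | ... |
| $x_n = [x_{n1}\ x_{n2}\ x_{n3}\ x_{n4}]$ | $y_n = [0\ 0\ 1]$ | $\hat{y}_n = [0.40\ 0.25\ 0.35]$ |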
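To make the targets and predictions concrete, the sketch below turns class indices into one-hot vectors, reads off the most probable class of each prediction, and scores the predictions with a negative-log cost. The class numbering (1 = cat, 2 = dog, 3 = bear) and the choice of cost are assumptions here, although the negative log of the correct-class probability is the loss this deck minimizes later.

```python
import math

def one_hot(label, num_classes=3):
    """Class index 1..num_classes -> vector with a single 1, e.g. 2 -> [0, 1, 0]."""
    return [1 if c == label else 0 for c in range(1, num_classes + 1)]

def predicted_class(probs):
    """1-based index of the largest probability."""
    return max(range(len(probs)), key=probs.__getitem__) + 1

def cross_entropy(targets, predictions):
    """Sum over examples of -log(probability assigned to the correct class)."""
    return sum(-math.log(p[y.index(1)]) for y, p in zip(targets, predictions))

labels = [1, 1, 2, 3]
targets = [one_hot(y) for y in labels]
predictions = [[0.85, 0.10, 0.05],
               [0.40, 0.45, 0.15],
               [0.20, 0.70, 0.10],
               [0.40, 0.25, 0.35]]

classes = [predicted_class(p) for p in predictions]   # [1, 2, 2, 1]
loss = cross_entropy(targets, predictions)            # about 2.485
```

Because each target row contains a single 1, a sum over all classes of $-y_{i,j}\log(\hat{y}_{i,j})$ keeps only the term at the label's position, which is why the code indexes the prediction directly.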
![Page 40: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/40.jpg)
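Supervised Learning – Linear Softmax

$y_i = [1\ 0\ 0]$, $x_i = [x_{i1}\ x_{i2}\ x_{i3}\ x_{i4}]$, $\hat{y}_i = [f_c\ f_d\ f_b]$

$g_c = w_{c1}x_{i1} + w_{c2}x_{i2} + w_{c3}x_{i3} + w_{c4}x_{i4} + b_c$
$g_d = w_{d1}x_{i1} + w_{d2}x_{i2} + w_{d3}x_{i3} + w_{d4}x_{i4} + b_d$
$g_b = w_{b1}x_{i1} + w_{b2}x_{i2} + w_{b3}x_{i3} + w_{b4}x_{i4} + b_b$

$f_c = e^{g_c} / (e^{g_c} + e^{g_d} + e^{g_b})$
$f_d = e^{g_d} / (e^{g_c} + e^{g_d} + e^{g_b})$
$f_b = e^{g_b} / (e^{g_c} + e^{g_d} + e^{g_b})$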
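A minimal sketch of this forward pass for one input with four features and three classes (cat, dog, bear). The weight, bias, and feature values are made up, and subtracting the maximum score before exponentiating is a standard numerical-stability trick that does not change the result.

```python
import math

def linear_softmax(x, W, b):
    """x: 4 features; W: 3 rows (cat, dog, bear) of 4 weights; b: 3 biases."""
    # Linear scores g_c, g_d, g_b: one weighted sum plus bias per class.
    g = [sum(w_j * x_j for w_j, x_j in zip(row, x)) + b_c for row, b_c in zip(W, b)]
    g_max = max(g)                       # subtract the max for numerical stability
    exps = [math.exp(v - g_max) for v in g]
    total = sum(exps)
    return [e / total for e in exps]     # f_c, f_d, f_b: positive and summing to 1

x_i = [1.0, 2.0, 0.5, -1.0]              # hypothetical feature vector
W = [[0.2, 0.1, 0.0, -0.3],              # hypothetical weights for cat
     [0.0, -0.1, 0.4, 0.2],              # dog
     [-0.2, 0.3, 0.1, 0.0]]              # bear
b = [0.0, 0.1, -0.1]
y_hat = linear_softmax(x_i, W, b)
```

With these values the linear scores are 0.7, -0.1, and 0.35, so the cat entry of `y_hat` is the largest.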
![Page 41: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/41.jpg)
How do we find a good w and b?

$y_i = [1\ 0\ 0]$, $x_i = [x_{i1}\ x_{i2}\ x_{i3}\ x_{i4}]$, $\hat{y}_i = [f_c(w, b)\ f_d(w, b)\ f_b(w, b)]$

We need to find w and b that minimize the following:

$L(w, b) = \sum_{i=1}^{n} \sum_{j=1}^{3} -y_{i,j} \log(\hat{y}_{i,j})$

Why?

$L(w, b) = \sum_{i=1}^{n} -\log(\hat{y}_{i,label}) = \sum_{i=1}^{n} -\log f_{i,label}(w, b)$
![Page 42: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/42.jpg)
Gradient Descent (GD)

$L(w, b) = \sum_{i=1}^{n} -\log f_{i,label}(w, b)$

λ = 0.01
Initialize w and b randomly
for e = 0, num_epochs do
    Compute: ∂L(w, b)/∂w and ∂L(w, b)/∂b (expensive: L sums over all n training examples)
    Update w: w = w − λ ∂L(w, b)/∂w
    Update b: b = b − λ ∂L(w, b)/∂b
    Print: L(w, b) // Useful to see if this is becoming smaller or not.
end
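The loop can be sketched as runnable Python for a linear softmax classifier. The gradient used here follows from the softmax and negative-log loss: the derivative with respect to each linear score is the predicted probability minus the one-hot target. The toy data, the initialization range, and the number of epochs are assumptions, and the loss is stored in a list rather than printed.

```python
import math
import random

def softmax(g):
    m = max(g)
    exps = [math.exp(v - m) for v in g]
    total = sum(exps)
    return [v / total for v in exps]

def loss_and_grads(X, labels, W, b):
    """Return L(w, b) = sum_i -log f_{i,label}, plus dL/dW and dL/db over all examples."""
    loss = 0.0
    dW = [[0.0] * len(W[0]) for _ in W]
    db = [0.0] * len(b)
    for x, label in zip(X, labels):
        f = softmax([sum(w * v for w, v in zip(row, x)) + bc for row, bc in zip(W, b)])
        loss -= math.log(f[label])
        for c in range(len(b)):
            err = f[c] - (1.0 if c == label else 0.0)   # dL/dg_c for this example
            db[c] += err
            for j, v in enumerate(x):
                dW[c][j] += err * v
    return loss, dW, db

def gradient_descent(X, labels, n_cls, lam=0.01, num_epochs=200, seed=0):
    rng = random.Random(seed)
    # Initialize w and b randomly.
    W = [[rng.uniform(-0.1, 0.1) for _ in X[0]] for _ in range(n_cls)]
    b = [rng.uniform(-0.1, 0.1) for _ in range(n_cls)]
    history = []
    for _ in range(num_epochs):
        loss, dW, db = loss_and_grads(X, labels, W, b)   # uses every training example
        history.append(loss)                             # useful to see if it gets smaller
        W = [[w - lam * g for w, g in zip(rw, rg)] for rw, rg in zip(W, dW)]
        b = [v - lam * g for v, g in zip(b, db)]
    return W, b, history

# Hypothetical training data: 4 features per example, labels 0 = cat, 1 = dog, 2 = bear.
X = [[1.0, 0.0, 0.0, 0.2], [0.9, 0.1, 0.0, 0.1], [0.0, 1.0, 0.1, 0.0], [0.1, 0.0, 1.0, 0.3]]
labels = [0, 0, 1, 2]
W, b, history = gradient_descent(X, labels, n_cls=3)
```

Every iteration touches all training examples, which is the cost that the mini-batch variant reduces.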
![Page 43: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/43.jpg)
Gradient Descent (GD) (idea)
[Figure: plot of L(w) against w, with the current point marked at w = 12]
1. Start with a random value of w (e.g. w = 12)
2. Compute the gradient (derivative) of L(w) at point w = 12. (e.g. dL/dw = 6)
3. Recompute w as: w = w – lambda * (dL / dw)
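A small numeric sketch of these three steps. The loss L(w) = w²/4 is an assumption chosen so that dL/dw = 6 at w = 12, and the step size lambda = 1/3 is likewise assumed; neither comes from the slides.

```python
def L(w):
    return w * w / 4.0        # assumed loss, chosen so that dL/dw = 6 at w = 12

def dL_dw(w):
    return w / 2.0

lam = 1.0 / 3.0               # assumed step size
w = 12.0                      # step 1: starting value
trajectory = [w]
for _ in range(5):
    grad = dL_dw(w)           # step 2: gradient at the current point
    w = w - lam * grad        # step 3: move against the gradient
    trajectory.append(w)
```

Under these assumptions the first update moves w from 12 to about 10, and each further step shrinks w toward the minimum at 0 while L(w) keeps decreasing.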
![Page 44: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/44.jpg)
Gradient Descent (GD) (idea)
[Figure: plot of L(w) against w, with the updated point marked at w = 10]
![Page 45: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/45.jpg)
(mini-batch) Stochastic Gradient Descent (SGD)

$L(w, b) = \sum_{i \in B} -\log f_{i,label}(w, b)$

λ = 0.01
Initialize w and b randomly
for e = 0, num_epochs do
    for b = 0, num_batches do
        Compute: ∂L(w, b)/∂w and ∂L(w, b)/∂b
        Update w: w = w − λ ∂L(w, b)/∂w
        Update b: b = b − λ ∂L(w, b)/∂b
        Print: L(w, b) // Useful to see if this is becoming smaller or not.
    end
end
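A runnable sketch of the mini-batch loop for the same linear softmax model. The toy data, batch size, shuffling each epoch, and initialization range are assumptions. The batch loop variable is called `start` here because the slide's `b` would clash with the bias vector.

```python
import math
import random

def softmax(g):
    m = max(g)
    exps = [math.exp(v - m) for v in g]
    total = sum(exps)
    return [v / total for v in exps]

def batch_loss_and_grads(batch, W, b):
    """L(w, b) = sum over i in B of -log f_{i,label}, with dL/dW and dL/db."""
    loss = 0.0
    dW = [[0.0] * len(W[0]) for _ in W]
    db = [0.0] * len(b)
    for x, label in batch:
        f = softmax([sum(w * v for w, v in zip(row, x)) + bc for row, bc in zip(W, b)])
        loss -= math.log(f[label])
        for c in range(len(b)):
            err = f[c] - (1.0 if c == label else 0.0)
            db[c] += err
            for j, v in enumerate(x):
                dW[c][j] += err * v
    return loss, dW, db

def sgd(data, n_cls, lam=0.01, num_epochs=100, batch_size=2, seed=0):
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in data[0][0]] for _ in range(n_cls)]
    b = [rng.uniform(-0.1, 0.1) for _ in range(n_cls)]
    epoch_losses = []
    for _ in range(num_epochs):
        order = data[:]
        rng.shuffle(order)                        # visit examples in a new order each epoch
        total = 0.0
        for start in range(0, len(order), batch_size):
            batch = order[start:start + batch_size]
            loss, dW, db = batch_loss_and_grads(batch, W, b)   # only the examples in B
            W = [[w - lam * g for w, g in zip(rw, rg)] for rw, rg in zip(W, dW)]
            b = [v - lam * g for v, g in zip(b, db)]
            total += loss
        epoch_losses.append(total)
    return W, b, epoch_losses

# Hypothetical data: (features, label) with labels 0 = cat, 1 = dog, 2 = bear.
data = [([1.0, 0.0, 0.0, 0.2], 0), ([0.9, 0.1, 0.0, 0.1], 0),
        ([0.0, 1.0, 0.1, 0.0], 1), ([0.1, 0.9, 0.0, 0.2], 1),
        ([0.1, 0.0, 1.0, 0.3], 2), ([0.0, 0.2, 0.9, 0.1], 2)]
W, b, epoch_losses = sgd(data, n_cls=3)
```

Each update is cheaper than a full gradient descent step because it only looks at one batch, at the price of a noisier gradient estimate.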
![Page 46: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/46.jpg)
Source: Andrew Ng
![Page 47: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/47.jpg)
Three more things
• How to compute the gradient
• Regularization
• Momentum updates
![Page 48: CS6501: Deep Learning for Visual Recognition Introduction ...vicente/deeplearning/2019/slides/...Tianlu Wang (tw8cb@virginia.edu) Hours: 10am to noon on Weds at Rice 436 3 Teaching](https://reader033.vdocuments.mx/reader033/viewer/2022052519/5f8b3f313c92d717a06174e8/html5/thumbnails/48.jpg)
Questions?