CS376 Computer Vision Lecture 20: Deep Learning Basics
Qixing Huang
April 10th, 2019
Slide material from: http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture04.pdf
http://cs231n.stanford.edu/slides/2018/cs231n_2018_lecture05.pdf
Today’s Topic
• Neural Networks
– Activation functions
– Fully connected layers
– Convolutional neural networks
• Optimization
• Back-propagation
Neural Networks
From linear to multi-layer
Activation functions
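For concreteness, here is a minimal NumPy sketch of three standard activation functions (sigmoid, tanh, ReLU); the sample inputs are arbitrary and the snippet is an illustration, not a reproduction of the slides.

```python
import numpy as np

def sigmoid(x):
    # squashes inputs to (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # squashes inputs to (-1, 1), zero-centered
    return np.tanh(x)

def relu(x):
    # thresholds at zero: max(0, x)
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(x), tanh(x), relu(x))
```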
Neural Networks: Architectures
Summary
• We arrange neurons into fully-connected layers
• The abstraction of a layer has the nice property that it allows us to use efficient vectorized code (e.g. matrix multiplies)
• Neural networks are not really neural
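To illustrate the vectorized view, below is a minimal NumPy sketch of the forward pass of a 2-layer fully-connected network; the layer sizes, weight initialization, and variable names are assumptions chosen for the example, not taken from the slides.

```python
import numpy as np

# illustrative sizes: 3072-dim input (e.g. a 32x32x3 image flattened),
# 100 hidden units, 10 class scores
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((100, 3072)) * 0.01, np.zeros(100)
W2, b2 = rng.standard_normal((10, 100)) * 0.01, np.zeros(10)

def relu(x):
    return np.maximum(0, x)

def forward(x):
    # each layer is just a matrix multiply plus bias, then a nonlinearity
    h = relu(W1 @ x + b1)   # hidden layer
    s = W2 @ h + b2         # class scores
    return s

x = rng.standard_normal(3072)   # one flattened input
print(forward(x).shape)         # (10,)
```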
Convolutional Neural Network
Convolutional Neural Networks
• A bit of history
Recall fully connected layers
Convolution layer
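To make the convolution-layer computation concrete, the following is a deliberately naive NumPy sketch: each filter slides over the input volume, and every output value is the dot product between the filter and a local region of the input, plus a bias. The shapes, stride of 1, and zero padding of 0 are illustrative assumptions, not values from the slides.

```python
import numpy as np

def conv_forward_naive(x, w, b):
    """x: input volume (C, H, W); w: filters (K, C, F, F); b: biases (K,).
    Stride 1, no padding, loops everywhere -- for clarity, not speed."""
    C, H, W = x.shape
    K, _, F, _ = w.shape
    H_out, W_out = H - F + 1, W - F + 1
    out = np.zeros((K, H_out, W_out))
    for k in range(K):                    # over filters
        for i in range(H_out):            # over output rows
            for j in range(W_out):        # over output columns
                region = x[:, i:i+F, j:j+F]
                out[k, i, j] = np.sum(region * w[k]) + b[k]
    return out

x = np.random.randn(3, 32, 32)             # e.g. a 32x32 RGB image
w = np.random.randn(6, 3, 5, 5)            # six 5x5x3 filters
b = np.zeros(6)
print(conv_forward_naive(x, w, b).shape)   # (6, 28, 28)
```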
ConvNet
• ConvNet is a sequence of Convolution Layers, interspersed with activation functions
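A minimal PyTorch-style sketch of that idea, with arbitrary channel counts and 5x5 filters chosen only for illustration:

```python
import torch
import torch.nn as nn

# three convolution layers, each followed by a ReLU nonlinearity
convnet = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=5),    # 3-channel input -> 6 activation maps
    nn.ReLU(),
    nn.Conv2d(6, 10, kernel_size=5),
    nn.ReLU(),
    nn.Conv2d(10, 16, kernel_size=5),
    nn.ReLU(),
)

x = torch.randn(1, 3, 32, 32)
print(convnet(x).shape)   # torch.Size([1, 16, 20, 20])
```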
A typical ConvNet
A closer look at spatial dimensions
Activations
Output size
Padding is necessary
Padding
Note that
An example
General form of a Conv layer
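For reference, the standard output-size arithmetic for a conv layer with input width W, filter size F, padding P, and stride S is (W - F + 2P) / S + 1. A small Python helper (the specific numbers below are illustrative, not the slides' example):

```python
def conv_output_size(W, F, S=1, P=0):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    assert (W - F + 2 * P) % S == 0, "filter does not tile the input evenly"
    return (W - F + 2 * P) // S + 1

print(conv_output_size(32, 5))         # 28: 5x5 filter, stride 1, no padding
print(conv_output_size(32, 5, P=2))    # 32: padding of 2 preserves the size
print(conv_output_size(7, 3, S=2))     # 3
```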
An extreme case
The brain/neuron view of CONV layer
Reminder: Fully Connected Layer
Pooling layer
• makes the representations smaller and more manageable
• operates over each activation map independently
Max pooling
General form
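A minimal NumPy sketch of 2x2 max pooling with stride 2 (a common setting, assumed here for illustration), applied to each activation map independently:

```python
import numpy as np

def max_pool_2x2(x):
    """x: activation volume (C, H, W) with H, W even.
    Returns (C, H/2, W/2): the max over each non-overlapping 2x2 window."""
    C, H, W = x.shape
    x = x.reshape(C, H // 2, 2, W // 2, 2)
    return x.max(axis=(2, 4))

x = np.random.randn(6, 28, 28)
print(max_pool_2x2(x).shape)   # (6, 14, 14)
```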
Fully Connected Layer (FC Layer)
• Contains neurons that connect to the entire input volume, as in ordinary Neural Networks
Summary
• ConvNets stack CONV, POOL, FC layers
• Trend towards smaller filters and deeper architectures
• Trend towards getting rid of POOL/FC layers (just CONV)
• Typical architectures look like [(CONV-RELU)*N-POOL?]*M-(FC-RELU)*K, SOFTMAX, where N is usually up to ~5, M is large, and 0 <= K <= 2
– but recent advances such as ResNet/GoogLeNet challenge this paradigm
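One possible instantiation of that pattern, written as a PyTorch sketch with N = 2, M = 2, K = 1; the channel counts and input size are assumptions for illustration, not an architecture from the slides:

```python
import torch
import torch.nn as nn

# [(CONV-RELU)*2 - POOL] * 2 - (FC-RELU)*1 - SOFTMAX, for 32x32x3 inputs
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                       # 32x32 -> 16x16
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                       # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
    nn.Linear(128, 10),                    # class scores; softmax is usually folded into the loss
)

x = torch.randn(1, 3, 32, 32)
print(model(x).shape)   # torch.Size([1, 10])
```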
Optimization
Let us go back to the linear case
Optimization
Gradient descent
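A minimal sketch of vanilla gradient descent on a linear least-squares problem; the data, loss, and learning rate are illustrative assumptions, not the slides' example. The key step is moving the parameters opposite the gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))          # 100 examples, 3 features
y = X @ np.array([1.0, -2.0, 0.5])         # targets from a known linear model

W = np.zeros(3)                            # parameters to learn
lr = 0.1                                   # learning rate (step size)
for step in range(200):
    pred = X @ W
    loss = np.mean((pred - y) ** 2)        # mean squared error
    grad = 2 * X.T @ (pred - y) / len(y)   # analytic gradient of the loss
    W -= lr * grad                         # gradient descent update
print(W)                                   # approaches [1.0, -2.0, 0.5]
```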
Computational graphs
Back-propagation
A toy example
Back-propagation
Back-propagation
• recursive application of the chain rule along a computational graph to compute the gradients of all inputs / parameters / intermediates
• implementations maintain a graph structure, where the nodes implement the forward() / backward() API
• forward: compute result of an operation and save any intermediates needed for gradient computation in memory
• backward: apply the chain rule to compute the gradient of the loss function with respect to the inputs
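A minimal sketch of the forward() / backward() node API applied to a tiny graph f = (x + y) * z; the gates and input values are chosen for illustration and may differ from the slides' toy example.

```python
class MultiplyGate:
    """One node in a computational graph: z = x * y."""
    def forward(self, x, y):
        # compute the result and cache the inputs needed for the backward pass
        self.x, self.y = x, y
        return x * y

    def backward(self, dz):
        # chain rule: dL/dx = dL/dz * dz/dx = dz * y, and symmetrically for y
        return dz * self.y, dz * self.x

class AddGate:
    """One node in a computational graph: z = x + y."""
    def forward(self, x, y):
        return x + y

    def backward(self, dz):
        # addition passes the upstream gradient unchanged to both inputs
        return dz, dz

# tiny graph: f = (x + y) * z
add, mul = AddGate(), MultiplyGate()
x, y, z = -2.0, 5.0, -4.0
q = add.forward(x, y)        # q = 3
f = mul.forward(q, z)        # f = -12
dq, dz = mul.backward(1.0)   # df/dq = z = -4, df/dz = q = 3
dx, dy = add.backward(dq)    # df/dx = df/dy = df/dq = -4
print(f, dx, dy, dz)         # -12.0 -4.0 -4.0 3.0
```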