Computer Vision: Models, Learning and Inference - Introduction
Oren Freifeld and Ron Shapira-Weber
Computer Science, Ben-Gurion University
Feb 25, 2019
www.cs.bgu.ac.il/~cv192/ Introduction (ver. 1.00) Feb 25, 2019 1 / 32
1 General Info
2 CV192 Goals
3 Requirements and Grading
4 Computer Vision in a Nutshell
5 Methods and Applications
6 A Bit More on Computer Vision
  Related Fields
  Industry and Academia
  Low-, Mid-, and High-Level Vision
  The General Setting
  Regularization and Priors
  Typical Difficulties and Computational Challenges
General Info
Welcome to CV192
Lecturer: Oren Freifeld
Email: [email protected]
Office Hour: Tue 10:00-12:00, Bld 37, Room 204
Teaching Assistant: Ron Shapira Weber
Email: [email protected]
Office Hour: Mon, 12:00-13:00, Bld 37, Room 316
Few Links
Website: www.cs.bgu.ac.il/~cv192
FAQ: www.cs.bgu.ac.il/~cv192/FAQ
Updated syllabus: www.cs.bgu.ac.il/~cv192/Syllabus
Times
Formally:
Lectures (4 hours): Mon, 9-12; Wed, 12-13
Practical Session (1 hour): Wed, 13-14
However, we will often switch between the times of the lecture and PS.
Moreover, the overall 4:1 ratio will be (approximately) preserved but not necessarily on a weekly basis.
E.g., in one week it may be 5:0, in another 3:2, etc.
Emails and Office Hours
You are encouraged to use office hours and/or email me if you have questions (not already included in the FAQ), if you are stuck for too long with your HW, etc.
When you email me, please include CV192 in the subject line.
Slides, Notes, Reading, etc.
We will post slides (usually after the lecture or practical session), notes, as well as pointers for relevant mandatory and optional reading, at www.cs.bgu.ac.il/~cv192/Lectures_And_Practical_Sessions
There are no official textbooks in our class; here are two unofficial ones:
Simon J.D. Prince: "Computer Vision: Models, Learning, and Inference"
Rick Szeliski's Computer-Vision textbook
The two books are freely available online at the authors' websites. For details, see https://www.cs.bgu.ac.il/~cv192/Reading
CV192 Goals
CV192: Goals
Study certain fundamental models/methods in CV, with emphasis on probabilistic/statistical/machine-learning ones, and how they may be applied to solving several key problems in the field.
Understand (a subset of the) main challenges, ideas, and principles of the field of CV.
Provide a good basis for reading/understanding CV literature and talks.
Provide students with a good background for becoming researchers and/or algorithm developers in CV.
Requirements and Grading
To pass the class, you must pass the exam.
Assuming a passing grade in the exam:
final grade = 0.4× exam grade + 0.6× homework grade
Homework
Critical for understanding the material in this class.
1-3 students can submit an assignment together
There will be both coding and math
Regarding the math: not so much theorem proving (though we may have some); rather, emphasis will be more on understanding and computational aspects.
The coding is in Python
Read the Late Policy: www.cs.bgu.ac.il/~cv192/Late_Policy
Yet Another Requirement: Maturity (slide 1 out of 2)
Grad students:
Please don’t scare your fellow students by derailing the discussion into other, or more advanced, topics and/or by throwing everything you know into the air.
All:
New notation? Get over it.
Yet Another Requirement: Maturity (slide 2 out of 2)
All: Please show some patience and trust. E.g.
Parts of the lectures might be overwhelming: the pace is fast and the material is not always easy.
So it’s ok if not everything is understood during the lectures.
Questions during class are encouraged, but without rereading the slides after class and trying to understand what just hit you, and without reading other material you are pointed to, this won’t work. If you try this and things remain unclear: office hours or email.
When you hear “this will become clearer later”.
When you are taught something (especially math) and don’t immediately see what you will need it for.
When something is skipped in class but you’re told you will learn more about it in the HW.
Computer Vision in a Nutshell
Some informal definitions:
Inference: extract information or draw conclusions from data.
Visual data:
Narrow definition: (camera) images or image ensembles; videos.
Broader definition also includes: 3D meshes; range images; 3D point clouds; medical images; etc.
Visual inference: inference from visual data.
Computer vision: computational/automated visual inference.
Visual data may come with additional non-visual data: audio, text, GPS info, metadata (e.g., time stamp, camera info). Adopting a broad perspective, all of this is of interest to CV today.
Methods and Applications
Some Computer-vision Applications We Will Touch Upon
The following is a tentative list:
statistical image models
denoising
inpainting
segmentation
tracking
object detection
object recognition
image/object classification
motion analysis
deblurring
intermediate image/video/scene representations
Methods We Will Touch Upon
The following is a tentative and partial list (some items overlap):
linear and nonlinear filtering
parameter and density estimation
least squares
likelihood and Bayesian methods
robust statistics
linear models and dimensionality reduction
clustering and mixture models
probabilistic graphical models (Markov Chains, Markov Random Fields, and, time permitting, Bayesian Networks)
sampling (in the statistical sense, not in the sense of signal processing)
Markov Chain Monte Carlo (MCMC)
deep learning in computer vision
Time permitting: Bayesian nonparametric models
A Bit More on Computer Vision: Related Fields
CV Utilizes Tools from Various Disciplines
Math:
Linear algebra, multivariate calculus, optimization, probability, statistics, stochastic processes, differential equations, calculus of variations, differential geometry, projective geometry, numerical analysis, group theory, harmonic analysis, ...
CS:
Algorithms, machine learning, computational geometry, computer graphics, graph theory, distributed computing, software, neural networks, ...
EE:
Image processing, signal processing, estimation theory, linear filtering,. . .
Physics:
Optics, Newtonian mechanics, material properties, statistical physics
Cognitive science
Biological vision
Additional Closely-Related Fields
Artificial Intelligence
Human-machine interface
Robotics
Geometry processing
Augmented reality
Medical imaging
Computational photography/imaging
Photogrammetry
. . .
A Bit More on Computer Vision: Industry and Academia
From Pascal Fua’s LinkedIn Post (22/2/2019)
Source: https://www.linkedin.com/pulse/computer-vision-student-numbers-pascal-fua/
“This morning I gave my first computer vision class of the semester in front of a full auditorium, which is the first time this happens in 20 years. When I started teaching Computer Vision at EPFL in 1996, I never imagined I would one day have 190 students sitting in front of me and I therefore plotted my student numbers over the years. I would like to believe that the upward trend reflects my teaching abilities but it probably has much more to do with GAFA. The students seem convinced that computer vision will help them get them a job there and they may be right. And I am not complaining: The current level of enthusiasm makes it much easier to find outstanding PhD students.”
GAFA = Google, Amazon, Facebook, and Apple
Computer Vision in the Industry and Academia
Industry:
Very “hot”
Numerous applications
Google, Apple, Facebook, Amazon, Microsoft, eBay, Mobileye, Intel, IBM, Samsung, Applied Materials, Orbotech, Siemens, Philips, 3M, Adobe, startups, military industries, car industry, Hollywood-related industry, sports analytics, entertainment, ...
Industry-academia collaborations
Many interesting research questions in both academia and industry
Israel Computer-Vision Day (usually Dec)
Israel Machine Vision Conference: March 18, 2019
Do I need an MSc or a PhD to get a CV Position in the Industry?
In short: no, though it usually helps quite a bit.
Certain positions, however, primarily in developing algorithms, usually require an MSc or a PhD.
And yes, exceptions exist.
A Bit More on Computer Vision: Low-, Mid-, and High-Level Vision
Low-, Mid-, and High-Level (Computer) Vision
Inter-level boundaries often blurred
In this class: low- and mid-level vision
A very high-level vision example: [Fritz Heider & Marianne Simmel, 1944] "An experimental study of apparent behavior."
A Bit More on Computer Vision: The General Setting
The General Setting
1 D: (visual) data
2 x: An unknown quantity of interest
3 D and x are related somehow
4 Given D, want to find:
x
g(x) for some function of interest g
Probabilistic quantities related to x (given D).
Questions:
1 How do we represent x and D mathematically? (and on a computer?)
2 How do we decide what a good value of x is?
3 How do we find such a value? (and how much can we trust our solution?)
The General Setting: Three conceptual stages
1 Representation
2 Modeling
Models can often be learned
3 Inference
Impact each other and their interplay is key, but it is also important todistinguish between the three.
Modeling and Inference: A Deterministic View
In the basic version, the goodness of x is defined via a (D-dependent) cost function:
$$\hat{x} = \arg\min_x f(x; D)$$
Need to pick f
Need to solve the (mathematical) optimization problem
Typical tradeoff between the f we want and how easy it is to work with
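As a minimal sketch of the deterministic view, assume (for illustration only; this choice is not from the slides) that D is a set of scalar measurements and that f(x; D) is the sum of squared differences between x and each measurement. For this particular f the minimizer has a closed form, the sample mean, which a brute-force search over candidate values of x should reproduce:

```python
import numpy as np

def cost(x, data):
    """Illustrative cost f(x; D) = sum_i (x - D_i)^2 (an assumed choice of f)."""
    return np.sum((x - data) ** 2)

# Hypothetical scalar measurements playing the role of D.
data = np.array([1.0, 2.0, 4.0, 5.0])

# Brute-force argmin over a grid of candidate values of x.
candidates = np.linspace(0.0, 6.0, 601)
costs = np.array([cost(x, data) for x in candidates])
x_hat_grid = candidates[np.argmin(costs)]

# For this particular f, the minimizer is known in closed form: the mean of D.
x_hat_closed = data.mean()

print(x_hat_grid, x_hat_closed)
```

The grid search works for any f but scales poorly with dim(x); the closed form is cheap but exists only because f was chosen to be easy to work with, which is the tradeoff mentioned above.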
Modeling and Inference: A Probabilistic View
Still in the basic version, we can also have a probabilistic take on this; i.e., the goodness of x is defined via one of the following:
a cost function: $\arg\min_x f(x; D)$
a likelihood model: $\arg\max_x L(x; D)$
Example: $x, D \in \mathbb{R}$
$f(x; D) = (x - D)^2$
$L(x; D) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}(x - D)^2\right)$
Similar questions: how to pick L(x;D); how to maximize it; tradeoff
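For the example on this slide, the two views select the same x. Taking the negative log of the likelihood gives the cost up to a positive scale factor and an additive constant, and neither changes the minimizer:

```latex
\begin{align*}
-\log L(x; D) &= \tfrac{1}{2}(x - D)^2 + \tfrac{1}{2}\log(2\pi)
               = \tfrac{1}{2} f(x; D) + \text{const}, \\
\arg\max_x L(x; D) &= \arg\min_x \bigl(-\log L(x; D)\bigr)
               = \arg\min_x f(x; D) = D .
\end{align*}
```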
A Bit More on Computer Vision: Regularization and Priors
Regularization and Priors
Often, we have an a-priori notion of which values of x are preferable.
Regularization
Want some $f_2(x)$ to be small ⇒ a cost function with regularization:
$$\arg\min_x f_1(x; D) + \lambda f_2(x), \qquad \lambda > 0$$
Again: how to pick f2, how to optimize, tradeoff
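A minimal sketch of a regularized cost for 1D denoising, under assumptions that are not from the slides: the data term is f1(x; D) = ||x - D||^2 and the regularizer is f2(x) = sum of squared differences between neighboring entries of x. With these quadratic choices the minimizer solves a linear system, and λ trades fidelity to D against smoothness. The signal and noise level below are made up.

```python
import numpy as np

def denoise_quadratic(data, lam):
    """Minimize ||x - data||^2 + lam * ||diff(x)||^2 over x (illustrative f1 and f2)."""
    n = len(data)
    # Finite-difference operator: (G @ x)[i] = x[i+1] - x[i], shape (n-1, n).
    G = np.diff(np.eye(n), axis=0)
    # Setting the gradient to zero gives (I + lam * G^T G) x = data.
    A = np.eye(n) + lam * (G.T @ G)
    return np.linalg.solve(A, data)

def roughness(x):
    """f2(x): sum of squared neighbor differences."""
    return np.sum(np.diff(x) ** 2)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100)
clean = np.sin(2 * np.pi * t)                       # hypothetical ground-truth signal
noisy = clean + 0.3 * rng.standard_normal(t.size)   # plays the role of D

x_unreg = denoise_quadratic(noisy, lam=0.0)    # no regularization: returns D itself
x_smooth = denoise_quadratic(noisy, lam=5.0)   # larger lam: smoother estimate

print(roughness(noisy), roughness(x_smooth))
```

A quadratic f2 is easy to optimize but also smooths across edges; regularizers that preserve edges are usually harder to work with, which is again the tradeoff.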
Priors
Have a prior distribution, p(x). Write L(x;D) as p(D|x).
⇒ a posterior distribution, p(x|D):
Maximize the posterior:
$$\arg\max_x p(x|D) = \arg\max_x \frac{p(D|x)\,p(x)}{p(D)} = \arg\max_x p(D|x)\,p(x)$$
Posterior mean: $E(x|D)$
Sample from the posterior: $x \sim p(x|D)$
Similar questions (choosing p(x), how to maximize p(x|D), tradeoff) but also some new ones: how can we compute the expectation? How can we sample?
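A minimal sketch for the scalar case, assuming (for illustration; not from the slides) a Gaussian likelihood p(D|x) = N(D; x, sigma2) and a Gaussian prior p(x) = N(x; mu0, tau2). This conjugate pair gives a Gaussian posterior in closed form, so the maximizer and the mean coincide and sampling is direct. All numbers are hypothetical.

```python
import numpy as np

def gaussian_posterior(d, sigma2, mu0, tau2):
    """Mean and variance of p(x | D=d) for likelihood N(d; x, sigma2), prior N(x; mu0, tau2)."""
    post_var = 1.0 / (1.0 / sigma2 + 1.0 / tau2)
    post_mean = post_var * (d / sigma2 + mu0 / tau2)
    return post_mean, post_var

# Hypothetical numbers: one observation d, unit noise variance, prior centered at 0.
d, sigma2, mu0, tau2 = 2.0, 1.0, 0.0, 1.0
post_mean, post_var = gaussian_posterior(d, sigma2, mu0, tau2)

# MAP by brute force: maximize log p(D|x) + log p(x); p(D) does not depend on x.
xs = np.linspace(-5.0, 5.0, 10001)
log_post_unnorm = -0.5 * (d - xs) ** 2 / sigma2 - 0.5 * (xs - mu0) ** 2 / tau2
x_map = xs[np.argmax(log_post_unnorm)]

# Sample from the posterior and estimate the posterior mean by Monte Carlo.
rng = np.random.default_rng(0)
samples = rng.normal(post_mean, np.sqrt(post_var), size=100_000)

print(x_map, post_mean, samples.mean())
```

For most models of interest none of the three operations has a closed form, which is where optimization and sampling methods such as MCMC come in.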
A Bit More on Computer Vision: Typical Difficulties and Computational Challenges
Computational/Mathematical/Algorithmic Challenges
Inverse problems are usually harder to solve than the forward problems(e.g., hard to reason in 3D when observations are 2D)
Both dim(x) and dim(D) can be large
Number of data points in D can be large
“Wrong” assumptions
Outliers
Missing data
Hard-to-optimize functions
High dimensionality (of x and/or D)
Complicated dependency structures
Distributions: construction, maximization, expectation, sampling
Structures to be respected and exploited (sometimes the structures are hidden)
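To make the outlier item concrete, here is a small sketch with made-up numbers. For scalar data, the squared-error cost is minimized by the mean, while the absolute-error cost is minimized by the median; a single gross outlier drags the former far more than the latter.

```python
import numpy as np

# Hypothetical measurements clustered near 10, plus one gross outlier.
inliers = np.array([9.8, 10.1, 10.0, 9.9, 10.2])
data = np.append(inliers, 1000.0)

# argmin_x sum_i (x - D_i)^2 is the mean; argmin_x sum_i |x - D_i| is the median.
x_l2 = data.mean()
x_l1 = np.median(data)

print(x_l2, x_l1)
```

The choice of cost function is thus also a statement about which kinds of errors the model should tolerate.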
Version Log
25/2/2019, ver 1.00.