computer vision crash course
Post on 07-Jan-2017
![Page 1: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/1.jpg)
Computer Vision Crash Course
Jia-Bin Huang
University of Illinois, Urbana-Champaign
www.jiabinhuang.com Jan 12, 2016
![Page 2: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/2.jpg)
Overview
• A little about me
• Introduction to Computer Vision
=== Intermission ===
• Fundamentals and Applications
• Resources
![Page 3: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/3.jpg)
About me
• Born in Kaohsiung, raised in Taipei
![Page 4: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/4.jpg)
About me
National Chiao-Tung University: B.S. in EE
IIS, Academia Sinica: Research Assistant
UIUC: Ph.D. in ECE (expected summer 2016)
UC Merced: Visiting Student
Microsoft Research: Research Intern, 2012 and 2013
Disney Research: Research Intern, 2014
![Page 5: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/5.jpg)
My research: Computational Photography
Image completion, image super-resolution
![Page 6: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/6.jpg)
My Research: Video-based Bird Migration Monitoring
![Page 7: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/7.jpg)
My Research: Visual Tracking
Object tracking, multi-face tracking
![Page 8: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/8.jpg)
What is Computer Vision?
• Make computers understand images and videos.
• What kind of scene?
• Where are the cars?
• How far is the building?
![Page 9: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/9.jpg)
What is Computer Vision?
• Make computers understand images and videos.
• What are they doing?
• Why is this happening?
• What is important?
• What will I see?
![Page 10: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/10.jpg)
Computer Vision and Nearby Fields
Images (2D)
Geometry (3D): Shape
Photometry: Appearance
Digital Image Processing, Computational Photography
Computer Graphics
Computer Vision
Machine learning: Vision = Machine learning applied to visual data
![Page 11: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/11.jpg)
Visual data on the Internet
• Flickr
  • 10+ billion photographs
  • 60 million images uploaded a month
• Facebook
  • 250 billion+
  • 300 million a day
• Instagram
  • 55 million a day
• YouTube
  • 100 hours uploaded every minute
90% of net traffic will be visual!
Mostly about cats
![Page 12: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/12.jpg)
Too big for humans
• Need automatic tools to access and analyze visual data!
http://www.petittube.com/
![Page 13: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/13.jpg)
Slide credit: Alexei A. Efros
![Page 14: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/14.jpg)
Vision is Really Hard
• Vision is an amazing feature of natural intelligence
  • Visual cortex occupies about 50% of the macaque brain
  • More of the human brain is devoted to vision than to anything else
Is that a queen or a bishop?
![Page 15: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/15.jpg)
Why is Computer Vision Hard?
![Page 16: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/16.jpg)
Why is Computer Vision Hard?
![Page 17: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/17.jpg)
What did you see?
• Where was this picture taken?
• How many people are there?
• What are they doing?
• What object is the person on the left standing on?
• Why is this a funny picture?
![Page 18: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/18.jpg)
Why is Computer Vision Hard?
![Page 19: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/19.jpg)
Why is Computer Vision Hard?
![Page 20: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/20.jpg)
Why is Computer Vision Hard?
![Page 21: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/21.jpg)
Why is Computer Vision Hard?
![Page 22: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/22.jpg)
Why is Computer Vision Hard?
![Page 23: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/23.jpg)
Why is Computer Vision Hard?
![Page 24: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/24.jpg)
Computer: okay, it’s a funny picture
![Page 25: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/25.jpg)
Computer Vision Matters
Safety Health Security
Comfort Access Fun
![Page 26: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/26.jpg)
History of Computer Vision
Marvin Minsky, MIT; Turing Award, 1969
“In 1966, Minsky hired a first-year undergraduate student and assigned him a problem to solve over the summer: connect a camera to a computer and get the machine to describe what it sees.” (Crevier 1993, pg. 88)
![Page 27: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/27.jpg)
Half a century later, we're still working on it.
![Page 28: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/28.jpg)
History of Computer Vision
Marvin Minsky, MIT; Turing Award, 1969
Gerald Sussman, MIT
“You’ll notice that Sussman never worked in vision again!” – Berthold Horn
![Page 29: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/29.jpg)
1960’s: interpretation of synthetic worlds
Larry Roberts, “Father of Computer Vision”
Larry Roberts, PhD thesis, MIT, 1963: Machine Perception of Three-Dimensional Solids
Figure panels: input image; 2x2 gradient operator; computed 3D model rendered from new viewpoint
Slide credit: Steve Seitz
![Page 30: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/30.jpg)
1970’s: some progress on interpreting selected images
The representation and matching of pictorial structures, Fischler and Elschlager, 1973
![Page 31: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/31.jpg)
1970’s: some progress on interpreting selected images
The representation and matching of pictorial structures, Fischler and Elschlager, 1973
![Page 32: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/32.jpg)
1980’s: ANNs come and go; shift toward geometry and increased mathematical rigor
Image credit: Rick Szeliski
![Page 33: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/33.jpg)
Goodbye science
![Page 34: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/34.jpg)
1990’s: face recognition; statistical analysis in vogue
Image credit: Rick Szeliski
![Page 35: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/35.jpg)
2000’s: broader recognition; large annotated datasets available; video processing starts
Image credit: Rick Szeliski
![Page 36: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/36.jpg)
2010’s: resurgence of deep learning
[AlexNet NIPS 2012] [DeepFace CVPR 2014]
[DeepPose CVPR 2014] [Show, Attend and Tell ICML 2015]
![Page 37: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/37.jpg)
2020’s: autonomous vehicles
![Page 38: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/38.jpg)
2030’s: robot uprising?
![Page 39: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/39.jpg)
5-minute break
![Page 40: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/40.jpg)
Examples of Computer Vision Applications
• How is computer vision used today?
![Page 41: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/41.jpg)
Face detection
• Most digital cameras and smart phones detect faces (and more)
  • Canon, Sony, Fuji, …
• For smart focus, exposure compensation, and cropping
Slide credit: Steve Seitz
![Page 42: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/42.jpg)
Face recognition
Facebook face auto-tagging
![Page 43: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/43.jpg)
Face Landmark Alignment
TAAZ.com: Virtual makeover
Face morphing: Jia-Bin Huang, Miss Daegu 2013 Contestants Face Morphing
![Page 44: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/44.jpg)
Face Landmark Alignment – Pose Estimation
Pitch: ~15 degrees
Yaw: ~+20 or -20 degrees
Jia-Bin Huang, What's the Best Pose for a Selfie?
![Page 45: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/45.jpg)
Face Landmark Alignment – 3D Persona
What Makes Tom Hanks Look Like Tom Hanks, ICCV 2015
![Page 46: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/46.jpg)
Video-based face replacement
![Page 47: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/47.jpg)
Smile Detection
Sony Cyber-shot® T70 Digital Still Camera
Slide credit: Steve Seitz
![Page 48: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/48.jpg)
Eye contact detection
Gaze locking, UIST 2013
![Page 49: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/49.jpg)
Vision-based Biometrics
“How the Afghan Girl was Identified by Her Iris Patterns” (read the story; see also Wikipedia)
Slide credit: Steve Seitz
![Page 50: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/50.jpg)
Vision-based Biometrics
![Page 51: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/51.jpg)
Optical Character Recognition (OCR)
• Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software
Digit recognition, AT&T Labs: http://www.research.att.com/~yann/
License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Slide credit: Steve Seitz
![Page 52: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/52.jpg)
Computer vision in sports
Hawk-Eye: helping/improving referee decisions
![Page 54: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/54.jpg)
Computer vision in sports
Replay Technologies: improving viewer experiences
![Page 55: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/55.jpg)
Computer vision in sports
Play tracking
![Page 57: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/57.jpg)
Visual recognition for photo organization
Google Photos
![Page 58: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/58.jpg)
Matching street clothing photos in online shops
Street2shop, ICCV 2015
![Page 59: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/59.jpg)
3D scanner from a mobile camera
MobileFusion
![Page 60: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/60.jpg)
Earth viewers (3D modeling)
Image from Microsoft’s Virtual Earth (see also: Google Earth)
Slide credit: Steve Seitz
![Page 61: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/61.jpg)
3D from thousands of images
Building Rome in a Day: Agarwal et al. 2009
![Page 62: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/62.jpg)
3D from thousands of images
[Furukawa et al. CVPR 2010]
![Page 63: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/63.jpg)
Microsoft PhotoSynth: Photo Tourism
![Page 64: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/64.jpg)
MS PhotoSynth in CSI
![Page 65: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/65.jpg)
Indoor Scene Reconstruction
Structured Indoor Modeling, ICCV 2015
![Page 66: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/66.jpg)
Tour into picture
Adobe After Effects
![Page 67: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/67.jpg)
First-person Hyperlapse Videos
[Kopf et al. SIGGRAPH 2014]
![Page 68: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/68.jpg)
3D Time-lapse from Internet Photos
3D Time-lapse from Internet Photos, ICCV 2015
![Page 69: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/69.jpg)
Special effects: matting and composition
Kylie Minogue - Come Into My World
![Page 70: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/70.jpg)
Style transfer
A Neural Algorithm of Artistic Style [Gatys et al. 2015]
Target image (content), source image (style), output (deepart)
![Page 71: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/71.jpg)
The Matrix movies, ESC Entertainment, XYZRGB, NRC
Special effects: shape capture
Slide credit: Steve Seitz
![Page 72: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/72.jpg)
Pirates of the Caribbean, Industrial Light and Magic
Special effects: motion capture
Slide credit: Steve Seitz
![Page 73: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/73.jpg)
Google cars
Google in talks with Ford, Toyota and Volkswagen to realise driverless cars
http://www.theatlantic.com/technology/archive/2014/05/all-the-world-a-track-the-trick-that-makes-googles-self-driving-cars-work/370871/
![Page 74: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/74.jpg)
Interactive Games: Kinect
• Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
• Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
• 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
• Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
![Page 75: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/75.jpg)
Vision in space
Vision systems (JPL) used for several tasks
• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.
NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.
![Page 76: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/76.jpg)
Industrial robots
Vision-guided robots position nut runners on wheels
http://www.automationworld.com/computer-vision-opportunity-or-threat
![Page 77: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/77.jpg)
Mobile robots
http://www.robocup.org/
NASA’s Mars Spirit Rover
Saxena et al. 2008
STAIR at Stanford: http://www.youtube.com/watch?v=DF39Ygp53mQ
![Page 78: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/78.jpg)
Medical imaging
Image guided surgery: Grimson et al., MIT
3D imaging: MRI, CT
![Page 79: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/79.jpg)
Computer vision for healthcare
BiliCam, Ubicomp 2014
![Page 82: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/82.jpg)
Computer vision for the masses
Counting cells
Analyzing social effects of green space
![Page 83: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/83.jpg)
20-minute coffee break
![Page 84: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/84.jpg)
Fundamentals of Computer Vision
• Light – What an image records
• Matching – How to measure the similarity of two regions
• Alignment
  – How to align points/patches
  – How to recover transformation parameters based on matched points
• Geometry – How to relate world coordinates and image coordinates
• Grouping – What points/regions/lines belong together?
• Categorization – What similarities are important?
Slide credit: Derek Hoiem
![Page 85: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/85.jpg)
Image Formation
![Page 86: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/86.jpg)
Digital camera
A digital camera replaces film with a sensor array
• Each cell in the array is a light-sensitive diode that converts photons to electrons
• Two common types:
  • Charge Coupled Device (CCD): larger yet slower, better quality
  • Complementary Metal Oxide Semiconductor (CMOS): high bandwidth, lower quality
• http://electronics.howstuffworks.com/digital-camera.htm
Slide by Steve Seitz
![Page 87: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/87.jpg)
Sensor Array
CMOS sensor
CCD sensor
![Page 88: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/88.jpg)
![Page 89: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/89.jpg)
0.92 0.93 0.94 0.97 0.62 0.37 0.85 0.97 0.93 0.92 0.99
0.95 0.89 0.82 0.89 0.56 0.31 0.75 0.92 0.81 0.95 0.91
0.89 0.72 0.51 0.55 0.51 0.42 0.57 0.41 0.49 0.91 0.92
0.96 0.95 0.88 0.94 0.56 0.46 0.91 0.87 0.90 0.97 0.95
0.71 0.81 0.81 0.87 0.57 0.37 0.80 0.88 0.89 0.79 0.85
0.49 0.62 0.60 0.58 0.50 0.60 0.58 0.50 0.61 0.45 0.33
0.86 0.84 0.74 0.58 0.51 0.39 0.73 0.92 0.91 0.49 0.74
0.96 0.67 0.54 0.85 0.48 0.37 0.88 0.90 0.94 0.82 0.93
0.69 0.49 0.56 0.66 0.43 0.42 0.77 0.73 0.71 0.90 0.99
0.79 0.73 0.90 0.67 0.33 0.61 0.69 0.79 0.73 0.93 0.97
0.91 0.94 0.89 0.49 0.41 0.78 0.78 0.77 0.89 0.99 0.93
![Page 90: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/90.jpg)
Perception of Intensity
from Ted Adelson
• A is brighter?
• B is brighter?
• They are equally bright
![Page 91: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/91.jpg)
Perception of Intensity
from Ted Adelson
![Page 92: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/92.jpg)
Digital Color Images
![Page 93: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/93.jpg)
Color image: R, G, B channels
![Page 94: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/94.jpg)
Image filtering
• Linear filtering: function is a weighted sum/difference of pixel values
• Really important!
  • Enhance images
    • Denoise, smooth, increase contrast, etc.
  • Extract information from images
    • Texture, edges, distinctive points, etc.
  • Detect patterns
    • Template matching
![Page 95: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/95.jpg)
Example: box filter
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
Slide credit: David Lowe (UBC)
![Page 96: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/96.jpg)
Image filtering
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
f[.,.] (input):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
h[.,.] (output so far):
0
$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$
Credit: S. Seitz
![Page 97: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/97.jpg)
Image filtering
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
f[.,.] (input):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
h[.,.] (output so far):
0 10
$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$
Credit: S. Seitz
![Page 98: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/98.jpg)
Image filtering
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
f[.,.] (input):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
h[.,.] (output so far):
0 10 20
$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$
Credit: S. Seitz
![Page 99: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/99.jpg)
Image filtering
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
f[.,.] (input):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
h[.,.] (output so far):
0 10 20 30
$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$
Credit: S. Seitz
![Page 100: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/100.jpg)
Image filtering
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
f[.,.] (input):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
h[.,.] (output so far):
0 10 20 30 30
$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$
Credit: S. Seitz
![Page 101: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/101.jpg)
Image filtering
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
f[.,.] (input):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
h[.,.] (output so far):
0 10 20 30 30
?
$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$
Credit: S. Seitz
![Page 102: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/102.jpg)
Image filtering
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
f[.,.] (input):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
h[.,.] (output so far):
0 10 20 30 30
50
?
$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$
Credit: S. Seitz
![Page 103: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/103.jpg)
Image filtering
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
f[.,.] (input):
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 90 0 90 90 90 0 0
0 0 0 90 90 90 90 90 0 0
0 0 0 0 0 0 0 0 0 0
0 0 90 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
h[.,.] (output):
0 10 20 30 30 30 20 10
0 20 40 60 60 60 40 20
0 30 60 90 90 90 60 30
0 30 50 80 80 90 60 30
0 30 50 80 80 90 60 30
0 20 30 50 50 60 40 20
10 20 30 30 30 30 20 10
10 10 10 0 0 0 0 0
$$h[m,n] = \sum_{k,l} g[k,l]\, f[m+k,\, n+l]$$
Credit: S. Seitz
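The walkthrough above can be reproduced in a few lines. This is a minimal pure-Python sketch, not code from the slides: the function name `correlate_valid` is made up for illustration, the 1/9 scaling of the box filter is inferred from the output values (one 90 in the window gives 10), and only pixels whose 3x3 window fits entirely inside the image are computed, which is why a 10x10 input gives an 8x8 output.

```python
def correlate_valid(f, g):
    """Weighted sum of each window of f with kernel g (no padding).

    Same sum as the slide formula, but k and l run from 0 here, so
    output pixel (m, n) corresponds to the window whose top-left
    corner is f[m][n].
    """
    kh, kw = len(g), len(g[0])
    out = []
    for m in range(len(f) - kh + 1):
        row = []
        for n in range(len(f[0]) - kw + 1):
            acc = 0.0
            for k in range(kh):
                for l in range(kw):
                    acc += g[k][l] * f[m + k][n + l]
            row.append(acc)
        out.append(row)
    return out

# Input image f from the slides
f = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 0, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 90, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

# 3x3 box filter: every weight is 1/9, so the output is a local average
box = [[1 / 9] * 3 for _ in range(3)]

# Round away floating-point noise so the values match the slide
h = [[round(v) for v in row] for row in correlate_valid(f, box)]
for row in h:
    print(row)
```

The printed rows should match the h[.,.] grid on the slide, including the dip to 50 and 80 around the single 0 inside the bright square.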
![Page 104: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/104.jpg)
What does it do?
• Replaces each pixel with an average of its neighborhood
• Achieves a smoothing effect (removes sharp features)
Box filter:
$$g[\cdot,\cdot] = \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
Slide credit: David Lowe (UBC)
![Page 105: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/105.jpg)
Smoothing with box filter
![Page 106: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/106.jpg)
Practice with linear filters
Original filtered with
$$\begin{bmatrix}0 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0\end{bmatrix}$$
= ?
Source: D. Lowe
![Page 107: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/107.jpg)
Practice with linear filters
Original filtered with
$$\begin{bmatrix}0 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0\end{bmatrix}$$
= Filtered (no change)
Source: D. Lowe
![Page 108: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/108.jpg)
Practice with linear filters
Original filtered with
$$\begin{bmatrix}0 & 0 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{bmatrix}$$
= ?
Source: D. Lowe
![Page 109: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/109.jpg)
Practice with linear filters
Original filtered with
$$\begin{bmatrix}0 & 0 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{bmatrix}$$
= Shifted left by 1 pixel
Source: D. Lowe
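The direction of the shift depends on whether the kernel is applied as a correlation (the sum used in the image filtering slides) or as a convolution (the same sum with the kernel flipped). The sketch below is an illustration under assumptions, not code from the slides: `correlate_same` and `flip` are made-up helper names, and pixels outside the image are treated as 0. With the single 1 in the left column, correlation copies each pixel's left neighbor, so the content moves right; convolution moves it left, which is the reading that matches the slide's caption.

```python
def correlate_same(f, g):
    """h[m][n] = sum over k, l of g[k][l] * f[m+k][n+l], with k and l
    centered on 0 and zeros assumed outside the image."""
    rows, cols = len(f), len(f[0])
    r = len(g) // 2  # kernel radius; assumes a square, odd-sized kernel
    h = [[0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            for k in range(-r, r + 1):
                for l in range(-r, r + 1):
                    if 0 <= m + k < rows and 0 <= n + l < cols:
                        h[m][n] += g[k + r][l + r] * f[m + k][n + l]
    return h

def flip(g):
    """Rotate the kernel by 180 degrees (turns correlation into convolution)."""
    return [row[::-1] for row in g[::-1]]

f = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
shift = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]

same = correlate_same(f, identity)      # unchanged image
corr = correlate_same(f, shift)         # h[m][n] = f[m][n-1]: content moves right
conv = correlate_same(f, flip(shift))   # h[m][n] = f[m][n+1]: content moves left

print(same[0], corr[0], conv[0])
```

For symmetric kernels such as the box filter the two conventions give the same result, so the distinction only matters for asymmetric kernels like this one.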
![Page 110: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/110.jpg)
Practice with linear filters
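Original filtered with
$$\begin{bmatrix}0 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 0\end{bmatrix} - \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
= ?
(Note that filter sums to 1)
Source: D. Lowe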
![Page 111: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/111.jpg)
Practice with linear filters
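Original filtered with
$$\begin{bmatrix}0 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & 0\end{bmatrix} - \frac{1}{9}\begin{bmatrix}1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 1\end{bmatrix}$$
Sharpening filter: accentuates differences with local average
Source: D. Lowe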
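A small sketch of why this kernel sharpens, under the assumption that the subtracted all-ones kernel is the 1/9-scaled box filter (that is what makes the filter sum to 1). Each output pixel equals twice the center value minus the local average. The helper name and the test image are made up for illustration; exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

def correlate_valid(f, g):
    """Weighted window sums of f with kernel g, no padding."""
    kh, kw = len(g), len(g[0])
    return [[sum(g[k][l] * f[m + k][n + l]
                 for k in range(kh) for l in range(kw))
             for n in range(len(f[0]) - kw + 1)]
            for m in range(len(f) - kh + 1)]

ninth = Fraction(1, 9)
# 2 * identity minus the box filter: center weight 17/9, all others -1/9
sharpen = [[(2 if (k, l) == (1, 1) else 0) - ninth for l in range(3)]
           for k in range(3)]
kernel_sum = sum(sum(row) for row in sharpen)

# A soft step edge: dark on the left, bright on the right (hypothetical data)
image = [[10, 10, 10, 20, 20, 20]] * 3
sharpened = correlate_valid(image, sharpen)[0]

print(kernel_sum)                      # weights sum to 1
print([float(v) for v in sharpened])   # flat regions unchanged, edge exaggerated
```

In flat regions the output equals the input (10 and 20), while next to the edge the dark side dips below 10 and the bright side rises above 20, which is the accentuated difference from the local average.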
![Page 112: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/112.jpg)
Sharpening
Source: D. Lowe
![Page 113: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/113.jpg)
Other filters
Sobel
$$\begin{bmatrix}-1 & 0 & 1\\ -2 & 0 & 2\\ -1 & 0 & 1\end{bmatrix}$$
Vertical edge (absolute value)
![Page 114: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/114.jpg)
Other filters
Sobel
$$\begin{bmatrix}-1 & -2 & -1\\ 0 & 0 & 0\\ 1 & 2 & 1\end{bmatrix}$$
Horizontal edge (absolute value)
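To see why one Sobel kernel picks out vertical edges and the other horizontal edges, the sketch below applies both to a tiny synthetic image containing a single vertical edge. The image and the helper name `correlate_valid` are assumptions for illustration only.

```python
def correlate_valid(f, g):
    """Weighted window sums of f with kernel g, no padding."""
    kh, kw = len(g), len(g[0])
    return [[sum(g[k][l] * f[m + k][n + l]
                 for k in range(kh) for l in range(kw))
             for n in range(len(f[0]) - kw + 1)]
            for m in range(len(f) - kh + 1)]

# Differences taken left to right: responds to vertical edges
sobel_vertical = [[-1, 0, 1],
                  [-2, 0, 2],
                  [-1, 0, 1]]
# Differences taken top to bottom: responds to horizontal edges
sobel_horizontal = [[-1, -2, -1],
                    [0, 0, 0],
                    [1, 2, 1]]

# Hypothetical 4x4 image: dark left half, bright right half (a vertical edge)
image = [[0, 0, 9, 9]] * 4

vertical_response = [[abs(v) for v in row]
                     for row in correlate_valid(image, sobel_vertical)]
horizontal_response = [[abs(v) for v in row]
                       for row in correlate_valid(image, sobel_horizontal)]

print(vertical_response)    # strong response everywhere the window spans the edge
print(horizontal_response)  # zero: intensity does not change from top to bottom
```

Taking the absolute value, as the slide captions indicate, makes the response independent of whether the edge goes from dark to bright or bright to dark.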
![Page 115: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/115.jpg)
Basic gradient filters
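Horizontal gradient:
$$\begin{bmatrix}0 & 0 & 0\\ 1 & 0 & -1\\ 0 & 0 & 0\end{bmatrix} \quad\text{or}\quad \begin{bmatrix}1 & 0 & -1\end{bmatrix}$$
Vertical gradient:
$$\begin{bmatrix}0 & 1 & 0\\ 0 & 0 & 0\\ 0 & -1 & 0\end{bmatrix} \quad\text{or}\quad \begin{bmatrix}1\\ 0\\ -1\end{bmatrix}$$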
![Page 116: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/116.jpg)
Demo - image filtering
![Page 117: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/117.jpg)
Correspondence and alignment
• Correspondence: matching points, patches, edges, or regions across images
≈
![Page 118: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/118.jpg)
Correspondence and alignment
• Alignment: solving the transformation that makes two things match better
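As a concrete, deliberately simple instance of alignment, the sketch below recovers a pure translation from matched point pairs. The slides do not fix a transformation model, so the translation-only model, the function name `estimate_translation`, and the sample points are all assumptions for illustration. For this model, the least-squares solution is just the mean of the point differences.

```python
def estimate_translation(src, dst):
    """Least-squares translation (tx, ty) that maps src points onto dst.

    Minimizing the sum of squared distances |src_i + t - dst_i|^2 over t
    gives the mean of the differences dst_i - src_i.
    """
    n = len(src)
    tx = sum(d[0] - s[0] for s, d in zip(src, dst)) / n
    ty = sum(d[1] - s[1] for s, d in zip(src, dst)) / n
    return tx, ty

# Hypothetical matched points: dst is src moved by (3, 1)
src = [(0, 0), (2, 0), (0, 2)]
dst = [(3, 1), (5, 1), (3, 3)]

t = estimate_translation(src, dst)
aligned = [(x + t[0], y + t[1]) for x, y in src]
print(t, aligned)
```

Richer models (similarity, affine, homography) follow the same pattern of fitting parameters to matched points, but need more correspondences and a linear solver.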
![Page 119: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/119.jpg)
Example: fitting a 2D shape template
Slide from Silvio Savarese
![Page 120: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/120.jpg)
Example: fitting a 3D object model
Slide from Silvio Savarese
![Page 121: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/121.jpg)
Example: estimating the “fundamental matrix” that relates two views
Slide from Silvio Savarese
![Page 122: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/122.jpg)
Example: tracking points
frame 0 frame 22 frame 49
![Page 123: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/123.jpg)
Overview of Keypoint Matching
K. Grauman, B. Leibe
1. Find a set of distinctive key-points
2. Define a region around each keypoint
3. Extract and normalize the region content
4. Compute a local descriptor from the normalized region
5. Match local descriptors: accept a pair $f_A$, $f_B$ when $d(f_A, f_B) < T$
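The matching step can be sketched as a nearest-neighbour search with a distance threshold T. This is an illustrative sketch only: the descriptors are made-up 3-vectors (real SIFT descriptors have 128 dimensions), Euclidean distance is assumed for d, and `match_descriptors` is a hypothetical name.

```python
import math

def match_descriptors(desc_a, desc_b, threshold):
    """For each descriptor in A, keep its nearest neighbour in B if d(f_A, f_B) < threshold.

    Returns a list of (index_in_a, index_in_b, distance).
    """
    matches = []
    for i, fa in enumerate(desc_a):
        best_j, best_d = None, math.inf
        for j, fb in enumerate(desc_b):
            d = math.dist(fa, fb)          # Euclidean distance between descriptors
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None and best_d < threshold:
            matches.append((i, best_j, best_d))
    return matches

# Toy descriptors: the third descriptor in A has no counterpart in B.
desc_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
desc_b = [(0.0, 0.9, 0.1), (1.0, 0.1, 0.0)]

matches = match_descriptors(desc_a, desc_b, threshold=0.5)
matched_pairs = [(i, j) for i, j, _ in matches]
```

With this data the first two descriptors in A pair up with their counterparts in B, and the third is rejected because its nearest neighbour is farther than T.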
![Page 124: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/124.jpg)
Goals for Keypoints
Detect points that are repeatable and distinctive
![Page 125: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/125.jpg)
Key trade-offs
Detection
• More Repeatable: robust detection, precise localization
• More Points: robust to occlusion, works with less texture
Description
• More Distinctive: minimize wrong matches
• More Flexible: robust to expected variations, maximize correct matches
![Page 126: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/126.jpg)
Choosing interest points
Where would you tell your friend to meet you?
![Page 127: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/127.jpg)
Choosing interest points
Where would you tell your friend to meet you?
![Page 128: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/128.jpg)
Local Descriptors: SIFT Descriptor
[Lowe, ICCV 1999]
Histogram of oriented gradients
• Captures important texture information
• Robust to small translations / affine deformations
K. Grauman, B. Leibe
![Page 129: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/129.jpg)
Examples of recognized objects
![Page 130: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/130.jpg)
Demo – interest point detection
![Page 131: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/131.jpg)
Single-view Geometry
How tall is this woman?
Which ball is closer?
How high is the camera?
What is the camera rotation?
What is the focal length of the camera?
![Page 132: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/132.jpg)
Single-view Geometry
Mapping between image and world coordinates
• Pinhole camera model
• Projective geometry
  • Vanishing points and lines
• Projection matrix
![Page 133: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/133.jpg)
Image formation
Let’s design a camera
– Idea 1: put a piece of film in front of an object
– Do we get a reasonable image?
Slide credit: Steve Seitz
![Page 134: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/134.jpg)
Pinhole camera
Idea 2: add a barrier to block off most of the rays
• This reduces blurring
• The opening is known as the aperture
Slide credit: Steve Seitz
![Page 135: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/135.jpg)
Pinhole camera
Figure from David Forsyth
f = focal length
c = center of the camera
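A sketch of the pinhole projection, assuming camera coordinates with the camera center c at the origin, the optical axis along Z, and no principal-point offset or pixel scaling. Under those assumptions a point (X, Y, Z) maps to (f X / Z, f Y / Z). `project` is a hypothetical helper.

```python
def project(point, f):
    """Project a 3D point (X, Y, Z) in camera coordinates to image coordinates (x, y)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)

f = 1.0
# The tops of two poles of the same height (Y = 2), one twice as far away as the other.
near_top = project((0.0, 2.0, 4.0), f)
far_top = project((0.0, 2.0, 8.0), f)
```

Here `near_top` is `(0.0, 0.5)` and `far_top` is `(0.0, 0.25)`: doubling the distance halves the image height, which is why absolute size cannot be read off a single image without extra information.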
![Page 136: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/136.jpg)
Figures © Stephen E. Palmer, 2002
Dimensionality Reduction Machine (3D to 2D)
Point of observation
3D world 2D image
![Page 137: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/137.jpg)
Projection can be tricky…
Slide source: Seitz
![Page 138: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/138.jpg)
Projection can be tricky…
Slide source: Seitz
Making of 3D sidewalk art: http://www.youtube.com/watch?v=3SNYtd0Ayt0
![Page 139: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/139.jpg)
Vanishing points and lines
Parallel lines in the world intersect in the image at a “vanishing point”
![Page 140: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/140.jpg)
Vanishing points and lines
Vanishing point
Vanishing line
Vanishing point
Vertical vanishing point (at infinity)
Slide from Efros, Photo from Criminisi
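One common way to compute a vanishing point is with homogeneous coordinates: the line through two image points is their cross product, and the intersection of two lines is again a cross product. The sketch below assumes two image line segments that are known to be images of parallel world lines; the coordinates are made up.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def intersect(l1, l2):
    """Intersection of two homogeneous lines, or None if it lies at infinity."""
    x, y, w = cross(l1, l2)
    if abs(w) < 1e-12:
        return None          # lines are parallel in the image
    return (x / w, y / w)

# Two rails that converge in the image.
rail_left = line_through((0.0, 0.0), (4.0, 8.0))
rail_right = line_through((10.0, 0.0), (6.0, 8.0))
vanishing_point = intersect(rail_left, rail_right)

# Two vertical edges that stay parallel in the image.
edge_a = line_through((0.0, 0.0), (0.0, 5.0))
edge_b = line_through((3.0, 0.0), (3.0, 5.0))
vertical_vp = intersect(edge_a, edge_b)
```

The rails meet at `(5.0, 10.0)`, while the vertical edges return `None`, the case the slide labels as a vanishing point at infinity.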
![Page 141: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/141.jpg)
Comparing heights
Vanishing Point
Slide by Steve Seitz
![Page 142: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/142.jpg)
Measuring height
[Figure: heights read off against a reference scale with ticks 1 to 5; labeled values 5.3, 2.8, 3.3; camera height marked]
Slide by Steve Seitz
![Page 143: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/143.jpg)
![Page 144: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/144.jpg)
Image Stitching
• Combine two or more overlapping images to make one larger image
Slide credit: Vaibhav Vaish
![Page 145: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/145.jpg)
Illustration
Camera Center
![Page 146: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/146.jpg)
RANSAC for Homography
Initial Matched Points
![Page 147: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/147.jpg)
RANSAC for Homography
Final Matched Points
![Page 148: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/148.jpg)
RANSAC for Homography
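A dependency-free sketch of RANSAC for a homography. The slides do not give the procedure, so this is an assumed, textbook-style variant: sample 4 correspondences, solve for H with h33 fixed to 1, count matches that reproject within a pixel tolerance, and keep the model with the most inliers. All function names, the tolerance, the iteration count, and the test data are illustrative.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting; None if singular."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-9:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_homography(pairs):
    """Homography (9 values, row-major, h33 = 1) from exactly 4 pairs ((x, y), (u, v))."""
    A, b = [], []
    for (x, y), (u, v) in pairs:
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b)
    return None if h is None else h + [1.0]

def apply_h(h, p):
    """Map point p through homography h; None if it maps to infinity."""
    x, y = p
    w = h[6] * x + h[7] * y + h[8]
    if abs(w) < 1e-12:
        return None
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)

def ransac_homography(pairs, iterations=500, tol=1.0, seed=0):
    """Return (best_h, inlier_indices) over random 4-point samples."""
    rng = random.Random(seed)
    best_h, best_inliers = None, []
    for _ in range(iterations):
        h = fit_homography(rng.sample(pairs, 4))
        if h is None:
            continue  # degenerate sample, e.g. collinear points
        inliers = []
        for i, (p, q) in enumerate(pairs):
            proj = apply_h(h, p)
            if proj is not None and math.dist(proj, q) < tol:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_h, best_inliers = h, inliers
    return best_h, best_inliers

# Synthetic "initial matches": 8 consistent with one homography, 3 gross mismatches.
true_h = [1.0, 0.1, 5.0, 0.05, 1.2, -3.0, 1e-4, 2e-4, 1.0]
src = [(0, 0), (200, 40), (60, 220), (280, 300), (140, 100), (400, 180), (100, 360), (340, 20)]
pairs = [(p, apply_h(true_h, p)) for p in src]
pairs += [((50, 50), (300.0, 10.0)), ((250, 150), (20.0, 330.0)), ((320, 260), (90.0, 40.0))]

best_h, inliers = ransac_homography(pairs)
```

On this synthetic data the surviving inliers are the first eight matches, mirroring the step from initial to final matched points. A production pipeline would normally refit H to all inliers with a least-squares solve and use a tested library routine instead of this hand-rolled solver.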
![Page 149: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/149.jpg)
Bundle Adjustment
• New images initialised with rotation, focal length of best matching image
![Page 150: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/150.jpg)
Bundle Adjustment
• New images initialised with rotation, focal length of best matching image
![Page 151: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/151.jpg)
Details to make it look good
• Choosing seams
• Blending
![Page 152: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/152.jpg)
Choosing seams
• Better method: dynamic program to find seam along well-matched regions
Illustration: http://en.wikipedia.org/wiki/File:Rochester_NY.jpg
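A sketch of the dynamic-programming idea, assuming the overlap has already been reduced to a per-pixel cost map (for example, the absolute difference between the two aligned images) and that the seam runs top to bottom, moving at most one column per row. `find_seam` and the cost values are illustrative.

```python
def find_seam(cost):
    """Return, for each row, the column of the minimum-cost top-to-bottom seam."""
    h, w = len(cost), len(cost[0])
    # total[y][x] = cheapest cost of any seam from the top row down to (y, x)
    total = [list(cost[0])]
    for y in range(1, h):
        row = []
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            row.append(cost[y][x] + min(total[y - 1][lo:hi + 1]))
        total.append(row)
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(w), key=lambda i: total[-1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: total[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam

# Low values mark pixels where the two images agree well.
cost = [[9, 1, 9, 9],
        [9, 9, 1, 9],
        [9, 9, 9, 1],
        [9, 9, 1, 9]]

seam = find_seam(cost)
```

The seam `[1, 2, 3, 2]` follows the low-cost cells, so the cut between the two images passes through the well-matched region.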
![Page 153: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/153.jpg)
Gain compensation
• Simple gain adjustment
  • Compute average RGB intensity of each image in overlapping region
  • Normalize intensities by ratio of averages
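A minimal sketch of the simple gain adjustment, assuming each image is a list of (R, G, B) tuples, that the overlapping pixels of each image have already been extracted, and that a single scalar gain is applied to the second image. The function names and pixel values are made up.

```python
def mean_intensity(pixels):
    """Average over all pixels and all three RGB channels."""
    return sum(sum(p) for p in pixels) / (3 * len(pixels))

def gain_compensate(overlap_a, overlap_b, image_b):
    """Scale image B so its overlap brightness matches image A's; clip at 255."""
    gain = mean_intensity(overlap_a) / mean_intensity(overlap_b)
    adjusted = [tuple(min(255.0, c * gain) for c in p) for p in image_b]
    return gain, adjusted

# Image B was captured darker than image A in the region they share.
overlap_a = [(100, 120, 140), (110, 130, 150)]
overlap_b = [(50, 60, 70), (55, 65, 75)]
image_b = [(50, 60, 70), (55, 65, 75), (20, 30, 40)]

gain, compensated = gain_compensate(overlap_a, overlap_b, image_b)
```

The overlap averages are 125 and 62.5, so the gain is 2.0 and every pixel of image B is brightened by that factor.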
![Page 154: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/154.jpg)
Multi-band Blending
• Burt & Adelson 1983
  • Blend frequency bands over range l
![Page 155: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/155.jpg)
Blending comparison (IJCV 2007)
![Page 156: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/156.jpg)
![Page 157: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/157.jpg)
Demo – feature matching and image stitching
![Page 158: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/158.jpg)
Depth from Stereo
• Goal: recover depth by finding image coordinate x’ that corresponds to x
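[Figure: stereo geometry with camera centers C and C’, baseline B, focal length f, scene point X at depth z, and image points x and x’]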
![Page 159: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/159.jpg)
Correspondence Problem
• We have two images taken from cameras with different intrinsic and extrinsic parameters
• How do we match a point in the first image to a point in the second? How can we constrain our search?
![Page 160: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/160.jpg)
Key idea: Epipolar constraint
Potential matches for x have to lie on the corresponding line l’.
Potential matches for x’ have to lie on the corresponding line l.
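In the usual formulation the constraint is written with a fundamental matrix F: the epipolar line for x is l’ = F x, and a candidate x’ is consistent when it lies on (or near) that line. The sketch below assumes the F of an idealised rectified pair, for which l’ is simply the scanline with the same y coordinate; the matrix and the points are illustrative.

```python
def epipolar_line(F, x):
    """Line l' = F x in the second image for point x = (x, y) in the first image."""
    xh = (x[0], x[1], 1.0)
    return tuple(sum(F[r][c] * xh[c] for c in range(3)) for r in range(3))

def distance_to_line(line, p):
    """Perpendicular distance from point p = (x, y) to the line a*x + b*y + c = 0."""
    a, b, c = line
    return abs(a * p[0] + b * p[1] + c) / (a * a + b * b) ** 0.5

# Assumed fundamental matrix of a rectified pair (pure horizontal camera shift).
F_rectified = [[0, 0, 0],
               [0, 0, -1],
               [0, 1, 0]]

x = (120.0, 45.0)
line = epipolar_line(F_rectified, x)              # the scanline y = 45

on_line = distance_to_line(line, (100.0, 45.0))   # candidate on the scanline
off_line = distance_to_line(line, (100.0, 52.0))  # candidate 7 pixels below it
```

`on_line` is 0 and `off_line` is 7, so thresholding this distance is one way to keep only the matches that agree with a given F.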
![Page 161: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/161.jpg)
![Page 162: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/162.jpg)
![Page 163: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/163.jpg)
Basic stereo matching algorithm
• If necessary, rectify the two stereo images to transform epipolar lines into scanlines
• For each pixel x in the first image
  • Find corresponding epipolar scanline in the right image
  • Search the scanline and pick the best match x’
  • Compute disparity x-x’ and set depth(x) = fB/(x-x’)
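The loop above can be sketched on a single pair of rectified scanlines. This assumes a sum-of-squared-differences window cost and a bounded search range, neither of which the slide specifies; the intensities, focal length, and baseline are made up.

```python
def match_scanline(left, right, half_window=1, max_disparity=4):
    """For each interior pixel x of `left`, return the disparity d = x - x' with lowest SSD."""
    n = len(left)
    disparities = []
    for x in range(half_window, n - half_window):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            xr = x - d                       # candidate position x' in the right scanline
            if xr - half_window < 0:
                break                        # window would leave the image
            cost = sum((left[x + k] - right[xr + k]) ** 2
                       for k in range(-half_window, half_window + 1))
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities

def depth_from_disparity(d, f, baseline):
    """depth = f * B / (x - x'); zero disparity means the point is at infinity."""
    return f * baseline / d if d > 0 else float("inf")

# The same intensity pattern appears 2 pixels further right in the left image.
right = [0, 0, 5, 9, 3, 0, 0, 0, 0, 0]
left = [0, 0, 0, 0, 5, 9, 3, 0, 0, 0]

disparities = match_scanline(left, right)
# disparities[i] belongs to pixel x = i + half_window, so index 4 is x = 5.
depth_at_5 = depth_from_disparity(disparities[4], f=700.0, baseline=0.5)
```

Pixels 4 to 6 of the left scanline, where there is texture, all get disparity 2. In the flat regions every candidate costs the same, which is why textureless areas are hard for this basic algorithm.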
![Page 164: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/164.jpg)
800 most confident matches among 10,000+ local features.
![Page 165: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/165.jpg)
Epipolar lines
![Page 166: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/166.jpg)
Keep only the matches that are “inliers” with respect to the “best” fundamental matrix
![Page 167: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/167.jpg)
The Fundamental Matrix Song
![Page 168: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/168.jpg)
5-minute break
![Page 169: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/169.jpg)
What do you see in this image?
Can I put stuff in it?
Forest
Trees
Bear
Man
Rabbit Grass
Camera
![Page 170: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/170.jpg)
Describe, predict, or interact with the object based on visual cues
Is it alive?
Is it dangerous?
How fast does it run?
Is it soft?
Does it have a tail?
Can I poke with it?
![Page 171: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/171.jpg)
Why do we care about categories?
• From an object’s category, we can make predictions about its behavior in the future, beyond what is immediately perceived.
• Pointers to knowledge
  • Help to understand individual cases not previously encountered
• Communication
![Page 172: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/172.jpg)
Image categorization: Fine-grained recognition
Visipedia Project
![Page 173: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/173.jpg)
Place recognition
Places Database [Zhou et al. NIPS 2014]
![Page 175: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/175.jpg)
Training phase
Training Images → Image Features → Classifier Training → Trained Classifier
Training Labels → Classifier Training
![Page 176: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/176.jpg)
Testing phase
Training: Training Images → Image Features → Classifier Training (with Training Labels) → Trained Classifier
Testing: Test Image → Image Features → Trained Classifier → Prediction: Outdoor
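The two phases can be sketched end to end with deliberately simple stand-ins: mean RGB colour as the image feature and a nearest-centroid rule as the classifier. Both are assumptions chosen to keep the example short, not the features or classifier used in the talk, and the pixel values are made up.

```python
import math
from collections import defaultdict

def extract_features(image):
    """Toy feature: mean R, G, B over all pixels of an image given as (R, G, B) tuples."""
    n = len(image)
    return tuple(sum(p[c] for p in image) / n for c in range(3))

def train(images, labels):
    """Training phase: one centroid of feature vectors per label."""
    grouped = defaultdict(list)
    for image, label in zip(images, labels):
        grouped[label].append(extract_features(image))
    return {label: tuple(sum(f[c] for f in feats) / len(feats) for c in range(3))
            for label, feats in grouped.items()}

def predict(model, image):
    """Testing phase: label of the centroid closest to the test image's features."""
    features = extract_features(image)
    return min(model, key=lambda label: math.dist(model[label], features))

train_images = [
    [(90, 160, 230), (80, 170, 220)],    # mostly sky
    [(70, 180, 90), (100, 150, 240)],    # grass and sky
    [(200, 120, 60), (180, 100, 50)],    # warm indoor lighting
    [(150, 90, 70), (210, 130, 80)],
]
train_labels = ["outdoor", "outdoor", "indoor", "indoor"]

model = train(train_images, train_labels)
prediction = predict(model, [(85, 165, 210), (95, 175, 200)])
```

The test image's mean colour is close to the outdoor centroid, so the prediction is `"outdoor"`. The same feature extraction must be used in both phases, which is the point of the diagram.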
![Page 177: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/177.jpg)
Q: What are good features for…
• recognizing a beach?
![Page 178: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/178.jpg)
Q: What are good features for…
• recognizing cloth fabric?
![Page 179: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/179.jpg)
Q: What are good features for…
• recognizing a mug?
![Page 180: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/180.jpg)
What are the right features?
Depend on what you want to know!
• Object: shape
  • Local shape info, shading, shadows, texture
• Scene: geometric layout
  • Linear perspective, gradients, line segments
• Material properties: albedo, feel, hardness
  • Color, texture
• Action: motion
  • Optical flow, tracked points
![Page 181: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/181.jpg)
“Shallow” vs. “deep” architectures
Traditional recognition: “Shallow” architecture
Image/Video Pixels → Hand-designed feature extraction → Trainable classifier → Object Class
Deep learning: “Deep” architecture
Image/Video Pixels → Layer 1 → … → Layer N → Simple classifier → Object Class
![Page 182: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/182.jpg)
Object Detection
Search by Sliding Window Detector
• May work well for rigid objects
• Key idea: simple alignment for simple deformations
Object or Background?
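A sketch of the sliding-window search, assuming the "object or background?" decision is a simple template comparison (sum of absolute differences under a threshold). A real detector would score each window with a trained classifier and also scan over scales; the image, template, and threshold here are made up.

```python
def sliding_window_detect(image, template, threshold):
    """Score every window position; return (x, y, score) where score <= threshold."""
    th, tw = len(template), len(template[0])
    detections = []
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            # Sum of absolute differences between the window and the template.
            score = sum(abs(image[y + j][x + i] - template[j][i])
                        for j in range(th) for i in range(tw))
            if score <= threshold:
                detections.append((x, y, score))
    return detections

template = [[9, 1],
            [1, 9]]

image = [[0, 0, 0, 0, 0],
         [0, 0, 9, 1, 0],
         [0, 0, 1, 9, 0],
         [0, 0, 0, 0, 0]]

detections = sliding_window_detect(image, template, threshold=5)
```

Only the window whose top-left corner is at x = 2, y = 1 passes the threshold. Because each window is compared rigidly, this style of search suits rigid objects, as the slide notes.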
![Page 183: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/183.jpg)
Object Detection
Search by Parts-based model
• Key idea: more flexible alignment for articulated objects
• Defined by models of part appearance, geometry or spatial layout, and search algorithm
![Page 184: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/184.jpg)
Demo – face detection and alignment
![Page 185: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/185.jpg)
Vision as part of an intelligent system
3D Scene
Feature Extraction: Texture, Color, Optical Flow, Stereo Disparity
Grouping: Surfaces, Bits of objects, Sense of depth, Motion patterns
Interpretation: Objects, Agents and goals, Shapes and properties, Open paths, Words
Action: Walk, touch, contemplate, smile, evade, read on, pick up, …
![Page 186: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/186.jpg)
Well-Established (patch matching)
• Multi-view Geometry
• Face Detection/Recognition
• Object Tracking / Flow
Major Progress (pattern matching++)
• Category Detection
• Human Pose
• 3D Scene Layout
New Opportunities (interpretation/tasks)
• Vision for Robots
• Life-long Learning
• Entailment/Prediction
Slide credit: Derek Hoiem
![Page 187: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/187.jpg)
This course has provided fundamentals
• What is computer vision?
• Examples of computer vision applications
• Basic principles of computer vision:
  • Image formation
  • Filtering
  • Correspondence and alignment
  • Geometry
  • Recognition
![Page 188: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/188.jpg)
How do you learn more?
Explore!
![Page 189: Computer Vision Crash Course](https://reader034.vdocuments.mx/reader034/viewer/2022050613/587081fb1a28ab57368b697b/html5/thumbnails/189.jpg)
Resources – where to learn more
• Awesome Computer Vision
  • A curated list of awesome computer vision resources
• Computer Vision Taiwanese Group
  • > 2,200 members on Facebook
• Videolectures and ML&CV Talks
  • Lectures, Keynotes, Panel Discussions
• OpenCV – C++/Python/Java