Introduction to Computer Vision Olac Fuentes Computer Science Department University of Texas at El Paso El Paso, TX, U.S.A

Post on 20-Jan-2016

TRANSCRIPT

<ul>
<li><p>Introduction to Computer Vision</p><p>Olac Fuentes</p><p>Computer Science Department, University of Texas at El Paso</p><p>El Paso, TX, U.S.A.</p></li>
<li><p>What is Computer Vision?</p><p>Computer vision is the process of extracting knowledge about the world from one or more digital images.</p></li>
<li><p>Digital images are 2D arrays (matrices) of numbers.</p></li>
<li><p>Digital Images</p><p>Color images are formed with three 2D arrays, representing the red, green, and blue components of the image.</p></li>
<li><p>Computer Vision: Main Tasks</p><p>Model generation</p><p>Object recognition</p><p>Object detection</p><p>Tracking</p></li>
<li><p>Computer Vision: Object Detection</p><p>Detecting faces</p></li>
<li><p>Computer Vision: Object Detection</p><p>Detecting pedestrians</p></li>
<li><p>Computer Vision: Object Detection</p><p>Detecting cars</p></li>
<li><p>Computer Vision: Object Detection</p><p>How to do it? Idea: use machine learning.</p><p>Training:</p><p>Training set: positive examples are images of objects that belong to the class of interest; negative examples are images of objects that don't belong to that class.</p><p>Train a classifier using the training set.</p><p>Detection:</p><p>Given an image to analyze, apply the classifier to every subimage. There are lots of them, so a low false-positive rate is important!</p></li>
<li><p>Face Detection: Training Images</p></li>
<li><p>Efficient Object Detection: Viola &amp; Jones, 2005</p><p>Idea #1: Classifier structure</p><p>Build a cascade of classifiers, where stage i is simpler (and faster) than stage i+1.</p></li>
<li><p>Efficient Object Detection: Viola &amp; Jones, 2005</p><p>Idea #2: Features</p><p>Use a large number of very simple features.</p></li>
<li><p>Efficient Object Detection: Viola &amp; Jones, 2005</p><p>Idea #3: Feature computation</p><p>Compute the features very efficiently using the integral image.</p></li>
<li><p>Efficient Object Detection: Viola &amp; Jones, 2005</p><p>Idea #4: Dealing with multiple scales</p><p>Obvious solution: Build a
detector for each possible scale.</p><p>Better idea: Build a detector for a single scale. During detection, scale the image.</p></li>
<li><p>Efficient Object Detection</p><p>The modified census transform (Froba and Ernst, 2004)</p><p>Used local intensity descriptors as features.</p><p>Used simple voting classifiers and AdaBoost to build a cascade of classifiers.</p></li>
<li><p>Efficient Object Detection</p><p>Histograms of Gradients (Dalal, 2005)</p><p>Used histograms of oriented gradients as features.</p><p>Used a support vector machine as the classifier.</p><p>Best results to date.</p></li>
<li><p>Object Recognition</p><p>Training: labeled example images (Owl, Duck, Toucan, Egret).</p><p>Testing: unlabeled images (????).</p></li>
<li><p>Object Recognition: Face Recognition</p><p>Eigenfaces are a set of "standardized face ingredients", derived from statistical analysis of many pictures of faces.</p><p>First four eigenfaces from the AT&amp;T database.</p></li>
<li><p>Eigenfaces</p><p>One person's face might be made up of 10% from face 1, 24% from face 2, and so on.</p><p>Very few eigenvector terms are needed to give a fair likeness of most people's faces.</p><p>Eigenfaces provide a means of applying data compression to faces for identification purposes.</p></li>
<li><p>Eigenfaces</p><p>Let E<sub>1</sub>, ..., E<sub>n</sub> be the eigenfaces obtained from a face database.</p><p>Let F<sub>1</sub>, ..., F<sub>m</sub> be the images in our training/testing sets.
(For the training images we also know the person's identity.)</p><p>The attributes of F<sub>i</sub> are given by the sums of the pixel-by-pixel products of F<sub>i</sub> with each of E<sub>1</sub>, ..., E<sub>n</sub>; that is, F<sub>i</sub> is represented by n numbers: [F<sub>i</sub>&middot;E<sub>1</sub>, F<sub>i</sub>&middot;E<sub>2</sub>, ..., F<sub>i</sub>&middot;E<sub>n</sub>].</p><p>Using the attribute vectors and the class information, we can now construct a classifier.</p></li>
<li><p>Tracking</p><p>Continuous detection of objects of interest in video streams.</p></li>
<li><p>Reconstruction</p><p>Build a 3D model of the world given 2D images.</p><p>Most common approach: stereo vision</p><p>Inspired by human 3D perception.</p><p>Use two cameras of known geometry.</p><p>Take images.</p><p>Find correspondences.</p><p>Reconstruct using correspondences and known geometry.</p></li>
<li><p>Reconstruction</p><p>Problems with stereo vision:</p><p>Finding matches reliably is difficult.</p><p>Calibration is difficult.</p><p>It is hard to deal with featureless areas.</p><p>Computationally expensive.</p></li>
<li><p>Reconstruction</p><p>Microsoft to the rescue!</p><p>Seriously!</p></li>
<li><p>Reconstruction</p><p>Microsoft Kinect</p><p>Reconstruction using active illumination.</p><p>Project a known pattern of light at an invisible wavelength.</p><p>Learn the appearance of that pattern at different distances.</p><p>Fast and easy.</p></li>
</ul>
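<p>The integral-image idea from the Viola &amp; Jones slides (Idea #3) can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names are hypothetical, and the two-rectangle feature is an assumption about what a "very simple feature" might look like, since the slides do not spell the features out.</p>

```python
def integral_image(img):
    """Return ii where ii[r][c] is the sum of img over rows < r and columns < c.

    The extra leading row and column of zeros remove the need for
    boundary checks when looking up rectangle sums.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current row
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using only four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Hypothetical feature: sum of the left half minus sum of the right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
print(rect_sum(ii, 1, 1, 2, 2))  # sum of the bottom-right 2x2 block
```

<p>Once the integral image is built in a single pass, the cost of a rectangle sum no longer depends on the rectangle's size, which is what makes evaluating a large number of simple features affordable.</p>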
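<p>The detection procedure described in the slides (apply a classifier to every subimage, arrange the classifier as a cascade, and handle scale by resizing the image rather than the detector) might be combined roughly as below. Everything here is an illustrative assumption: the stage functions are toy stand-ins for trained classifiers, the nearest-neighbour resize is deliberately crude, and the names <code>detect</code>, <code>cascade_accepts</code>, and <code>example_stages</code> are hypothetical.</p>

```python
def cascade_accepts(window, stages):
    """Run the stages in order and reject as soon as one stage says no."""
    for stage in stages:
        if not stage(window):
            return False
    return True


def crop(img, top, left, size):
    """Extract a size x size window from a list-of-lists image."""
    return [row[left:left + size] for row in img[top:top + size]]


def downscale(img, factor):
    """Nearest-neighbour downscaling by a factor >= 1."""
    rows = int(len(img) / factor)
    cols = int(len(img[0]) / factor)
    return [[img[int(r * factor)][int(c * factor)] for c in range(cols)]
            for r in range(rows)]


def detect(img, stages, size, step=1, scale_step=2.0):
    """Slide a fixed-size window over progressively smaller copies of img.

    Returns (top, left, side) boxes expressed in original-image coordinates.
    """
    detections = []
    factor = 1.0
    scaled = img
    while len(scaled) >= size and len(scaled[0]) >= size:
        for top in range(0, len(scaled) - size + 1, step):
            for left in range(0, len(scaled[0]) - size + 1, step):
                if cascade_accepts(crop(scaled, top, left, size), stages):
                    detections.append(
                        (int(top * factor), int(left * factor), int(size * factor)))
        factor *= scale_step
        scaled = downscale(img, factor)
    return detections


def mean(window):
    return sum(map(sum, window)) / (len(window) * len(window[0]))


# Toy stand-ins for trained stages: a cheap test first, a stricter one second.
example_stages = [
    lambda w: mean(w) > 0.5,
    lambda w: all(p == 1 for row in w for p in row),
]

image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(detect(image, example_stages, size=2))
```

<p>Most windows are rejected by the cheap first stage, so the more expensive later stages only run on a small fraction of subimages.</p>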
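<p>The eigenface attribute vector described in the Eigenfaces slides can be sketched as a list of dot products. The sketch assumes the eigenfaces are already available as flattened pixel lists; computing them from a face database (and subtracting a mean face, which is common practice but not mentioned in the slides) is left out. The nearest-neighbour rule is only one possible choice of classifier, picked here for brevity, and all names are hypothetical.</p>

```python
def dot(a, b):
    """Sum of the pixel-by-pixel products of two flattened images."""
    return sum(x * y for x, y in zip(a, b))


def attributes(face, eigenfaces):
    """Represent a face by n numbers: its dot product with each eigenface."""
    return [dot(face, e) for e in eigenfaces]


def classify(face, eigenfaces, train_faces, train_labels):
    """Return the label of the training face with the closest attribute vector."""
    query = attributes(face, eigenfaces)
    best_label, best_dist = None, float("inf")
    for train_face, label in zip(train_faces, train_labels):
        ref = attributes(train_face, eigenfaces)
        dist = sum((q - r) ** 2 for q, r in zip(query, ref))  # squared Euclidean
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data: 4-pixel "images" and two made-up eigenfaces.
eigenfaces = [[1, 0, 0, 0], [0, 1, 0, 0]]
train_faces = [[1, 0, 5, 5], [0, 1, 5, 5]]
train_labels = ["person_a", "person_b"]
print(attributes([2, 3, 9, 9], eigenfaces))
print(classify([0.9, 0.1, 0, 0], eigenfaces, train_faces, train_labels))
```

<p>Because each face is reduced to n numbers, comparing two faces costs n operations instead of one per pixel, which is the data-compression benefit the slides mention.</p>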
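<p>The last stereo step, "reconstruct using correspondences and known geometry", can be illustrated for the simplest camera arrangement. The sketch assumes a rectified pair (identical cameras, parallel optical axes, horizontal offset only), in which case depth is focal length times baseline divided by disparity. The slides do not state which geometry is used, so this setup, the parameter names, and the example numbers are all assumptions.</p>

```python
def triangulate(x_left, x_right, y, focal_px, baseline, cx=0.0, cy=0.0):
    """Recover a 3D point from one correspondence in a rectified stereo pair.

    x_left, x_right: column of the same scene point in the left/right image.
    y: row of the point (the same in both images after rectification).
    focal_px: focal length in pixels; baseline: distance between the cameras.
    cx, cy: principal point (image centre) in pixels.
    Returns (X, Y, Z) in the units of the baseline, in the left camera frame.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline / disparity   # depth from similar triangles
    x = (x_left - cx) * z / focal_px      # back-project the left pixel
    y3d = (y - cy) * z / focal_px
    return x, y3d, z


# Hypothetical numbers: 700 px focal length, cameras 0.5 units apart.
print(triangulate(x_left=385, x_right=350, y=240,
                  focal_px=700, baseline=0.5, cx=320, cy=240))
```

<p>Depth is inversely proportional to disparity, so a one-pixel matching error matters far more for distant points than for near ones. This is one reason the slides list reliable matching among the main difficulties of stereo vision.</p>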