COS 429: Computer Vision

Thanks to Chris Bregler


• Instructor: Szymon Rusinkiewicz [email protected]

• TA: Nathaniel Dirksen [email protected]

• Course web page

http://www.cs.princeton.edu/courses/cs496/

What is Computer Vision?

• Input: images or video

• Output: description of the world

• But also: measuring, classifying, interpreting visual information

Low-Level or “Early” Vision

• Considers local properties of an image

“There’s an edge!”

Mid-Level Vision

• Grouping and segmentation

“There’s an object and a background!”

High-Level Vision

• Recognition

“It’s a chair!”

Big Question #1: Who Cares?

• Applications of computer vision

– In AI: vision serves as the “input stage”

– In medicine: understanding human vision

– In engineering: model extraction

Vision and Other Fields

Computer Vision

Artificial Intelligence

Cognitive Psychology

Signal Processing

Computer Graphics

Pattern Analysis

Metrology

Big Question #2: Does It Work?

• Situation much the same as AI:

– Some fundamental algorithms

– Large collection of hacks / heuristics

• Vision is hard!

– Especially at high level, physiology unknown

– Requires integrating many different methods

– Requires reasoning and understanding: “AI completeness”

Computer and Human Vision

• Emulating effects of human vision

• Understanding physiology of human vision

• Analogues of human vision at low, mid, and high levels

Image Formation

• Human: lens forms image on retina, sensors (rods and cones) respond to light

• Computer: lens system forms image, sensors (CCD, CMOS) respond to light

Low-Level Vision

Hubel


• Retinal ganglion cells

• Lateral Geniculate Nucleus – function unknown (visual adaptation?)

• Primary Visual Cortex

– Simple cells: orientational sensitivity

– Complex cells: directional sensitivity

• Further processing

– Temporal cortex: what is the object?

– Parietal cortex: where is the object? How do I get it?


• Net effect: low-level human vision can be (partially) modeled as a set of multiresolution, oriented filters

Low-Level Depth Cues

• Focus

• Vergence

• Stereo

• Not as important as popularly believed

Low-Level Computer Vision

• Filters and filter banks

– Implemented via convolution

– Detection of edges, corners, and other local features

– Can include multiple orientations

– Can include multiple scales: “filter pyramids”

• Applications

– First stage of segmentation

– Texture recognition / classification

– Texture synthesis
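As a rough illustration of filtering via convolution, the sketch below applies two oriented 3×3 kernels to a tiny synthetic image and combines their responses into an edge strength. The Sobel kernels, the 2×2 box-average `downsample`, and all function names are assumptions chosen for this example; they are not taken from the course materials.

```python
def convolve2d(image, kernel):
    """'Valid' 2D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flipping the kernel in both axes is what makes this convolution
    # rather than correlation.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * flipped[j][i]
            row.append(acc)
        out.append(row)
    return out


# Two orientations of the same derivative filter (Sobel kernels, an assumption).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def edge_strength(image):
    """Gradient magnitude from the horizontal and vertical filter responses."""
    gx = convolve2d(image, SOBEL_X)
    gy = convolve2d(image, SOBEL_Y)
    return [[(a * a + b * b) ** 0.5 for a, b in zip(ra, rb)]
            for ra, rb in zip(gx, gy)]


def downsample(image):
    """Halve the resolution by averaging 2x2 blocks.

    A crude stand-in for the blur-and-subsample step of a pyramid.
    """
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
             for x in range(0, len(image[0]) - 1, 2)]
            for y in range(0, len(image) - 1, 2)]


# A 5x6 image with a vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edges = edge_strength(image)   # strong response only near the step
coarse = downsample(image)     # next (coarser) pyramid level
```

Running `edge_strength` again on `coarse` would give the same filter at a second scale, which is the idea behind a filter pyramid.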

Texture Analysis / Synthesis

Multiresolution Oriented Filter Bank

Original Image

Image Pyramid

Texture Analysis / Synthesis

Original Texture

Synthesized Texture

Heeger and Bergen


• Optical flow

– Detecting frame-to-frame motion

– Local operator: looking for gradients

• Applications

– First stage of tracking

Optical Flow

Image #1

Optical Flow Field

Image #2


• Shape from X

– Stereo

– Motion

– Shading

– Texture foreshortening

3D Reconstruction

Tomasi+Kanade

Debevec, Taylor, Malik

Pighin et al.

Forsyth et al.

Mid-Level Vision

• Physiology unclear

• Observations by Gestalt psychologists

– Proximity

– Similarity

– Common fate

– Common region

– Parallelism

– Closure

– Symmetry

– Continuity

– Familiar configuration

Wertheimer

Grouping Cues

Mid-Level Computer Vision

• Techniques

– Clustering based on similarity

– Limited work on other principles

• Applications

– Segmentation / grouping

– Tracking
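Clustering based on similarity can be sketched with k-means, which groups pixels by how close their colors are. The slides do not name a specific clustering algorithm, so k-means, the RGB test values, and the starting centers below are all assumptions for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two color tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)),
                key=lambda k: squared_distance(p, centers[k]))
            for p in points]


def kmeans(points, centers, iterations=10):
    """Alternate between assigning points and recomputing cluster means."""
    for _ in range(iterations):
        labels = assign(points, centers)
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # leave an empty cluster's center alone
        centers = new_centers
    return centers, assign(points, centers)


# Hypothetical pixel colors: three dark pixels, then three bright ones.
pixels = [(10, 10, 10), (12, 8, 11), (9, 11, 10),
          (200, 190, 210), (205, 195, 205), (198, 192, 207)]
centers, labels = kmeans(pixels, centers=[(0, 0, 0), (255, 255, 255)])
```

The resulting `labels` list is a two-region segmentation of the pixels by color similarity alone; none of the other grouping cues are used.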

Snakes: Active Contours

Contour Evolution for Segmenting an Artery

Birchfield

Histograms

Expectation Maximization (EM)

Color Segmentation

Bayesian Methods

• Prior probability

– Expected distribution of models

• Conditional probability P(A|B)

– Probability of observation A given model B



• Bayes’s Rule

P(B|A) = P(A|B) P(B) / P(A)

– Probability of model B given observation A

Thomas Bayes (c. 1702-1761)
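A small worked instance of Bayes’s Rule, with made-up numbers: two candidate models with prior probabilities P(B), and the probability P(A|B) each assigns to one observation A. The model names and all probabilities below are hypothetical. P(A) is obtained by summing P(A|B) P(B) over the models.

```python
def posterior(prior, likelihood):
    """Apply Bayes's Rule to every model.

    prior:      {model: P(B)}    expected distribution of models
    likelihood: {model: P(A|B)}  probability of the observation given the model
    returns:    {model: P(B|A)}
    """
    # P(A): total probability of the observation over all models.
    evidence = sum(prior[m] * likelihood[m] for m in prior)
    return {m: likelihood[m] * prior[m] / evidence for m in prior}


# Hypothetical numbers: model_1 is more likely a priori,
# but model_2 explains the observation much better.
prior = {"model_1": 0.7, "model_2": 0.3}
likelihood = {"model_1": 0.1, "model_2": 0.6}
post = posterior(prior, likelihood)
# P(A) = 0.7 * 0.1 + 0.3 * 0.6 = 0.25
# P(model_1 | A) = 0.07 / 0.25 = 0.28
# P(model_2 | A) = 0.18 / 0.25 = 0.72
```

Even with the lower prior, model_2 ends up more probable once the observation is taken into account.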


[Figure: P(X|a) and P(X|b), each plotted against # black pixels]


• Human mechanisms: ???


• Computational mechanisms

– Bayesian networks

– Templates

– Linear subspace methods

– Kinematic models

Cootes et al.

Template-Based Methods

Linear Subspaces

Data

PCA

New Basis Vectors

Kirby et al.

Principal Components Analysis (PCA)
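To make the "data → PCA → new basis vectors" picture concrete, the sketch below finds the first principal component of a small data set: the direction of greatest variance after subtracting the mean. It uses power iteration on the covariance matrix instead of a full eigendecomposition, purely to stay within the standard library. The data and function name are assumptions.

```python
def first_principal_component(data, iterations=100):
    """Return (mean, unit vector along the direction of greatest variance).

    Assumes the data are not all identical and that the starting vector is
    not orthogonal to the dominant eigenvector.
    """
    n, dim = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    centered = [[row[j] - mean[j] for j in range(dim)] for row in data]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    v = [1.0] * dim
    for _ in range(iterations):
        # Power iteration: repeatedly multiply by cov and renormalize.
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


# Hypothetical 2D data lying exactly on the line y = 2x.
data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
mean, axis = first_principal_component(data)
# Coordinate of each point along the new basis vector.
coords = [sum((p[j] - mean[j]) * axis[j] for j in range(2)) for p in data]
```

Here every point is described by a single coordinate along `axis`, which is the sense in which PCA yields a lower-dimensional linear subspace.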

Kinematic Models

• Optical Flow/Feature tracking: no constraints

• Layered Motion: rigid constraints

• Articulated: kinematic chain constraints

• Nonrigid: implicit / learned constraints

Real-world Applications

Osuna et al:

Course Outline

• Image formation and capture

• Filtering and feature detection

• Motion estimation

• Segmentation and clustering

• PCA

• 3D projective geometry

• Shape acquisition

• Shape analysis

3D Scanning

Image-Based Modeling and Rendering

Debevec et al.

Manex

Course Mechanics

• 65%: 4 written / programming assignments

• 35%: Final group project

Course Mechanics

• Book: Introductory Techniques for 3-D Computer Vision, Emanuele Trucco and Alessandro Verri

• Papers


