image processing and computer vision lecture 4, multimedia e-commerce course november 5, 2002 mike...


Page 1: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Image Processing and Computer Vision

Lecture 4, Multimedia E-Commerce Course

November 5, 2002

Mike Christel

(significant input by Henry Schneiderman, http://www.cs.cmu.edu/~hws)

© Copyright 2002 Michael G. Christel and Alexander G. Hauptmann Carnegie Mellon

Page 2

Outline

• Defining Image Processing and Computer Vision

• Emerging Technology
  • Digitization of documents
  • Digitization of images/photographs
  • Biometrics
  • Management of images on computers
  • Other: manufacturing, military, games, …

• Research in Image Processing and Computer Vision
  • Automatically Finding Faces and Cars
  • Content-based Image Retrieval

Page 3

Image Processing vs. Computer Vision

• Image Processing
  • Research area within electrical engineering/signal processing
  • Focus on syntax, low level features

• Computer Vision
  • Research area within computer science/artificial intelligence
  • Focus on semantics, symbolic or geometric descriptions

[Figure labels: image, image (image processing); image, "Faces, People, Chairs, etc." (computer vision)]

Page 4

Optical Character Recognition (OCR)

• First patent in OCR in 19th century

• First applications in post office and banks

• Documents easier to distribute, search, organize, and edit in digital form
  • Typewriter has been replaced by word processor
  • Lots of legacy materials (the world’s libraries of books) available only in print

• State of the art not perfect, but 99% accurate on cleanly printed pages

• Examples of errors. . .

Page 5

Heavy Print

Output from 3 commercial OCR systems

Page 6

Light Print

Page 7

Stray Marks

Page 8

Typography

Page 9

Processing Overlaid Text in Video

Video → Text Area Detection → Text Area Preprocessing → Commercial OCR → ASCII Text

The Video OCR (VOCR) process used by the Informedia research group at Carnegie Mellon

Page 10

Text Area Detection

Page 11

[Figure columns: Video Frames (1/2 s intervals), Filtered Frames, AND-ed Frames]
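The slide does not say how the frames are combined, so the following is only a minimal sketch under stated assumptions: each filtered frame is taken to be a binary mask (1 = pixel kept by the filter), and the frames sampled at 1/2 s intervals are combined with a pixel-wise AND. Overlaid text tends to stay in place while the background changes, so only stable pixels survive. The name `and_frames` and the toy masks are hypothetical.

```python
def and_frames(masks):
    """Pixel-wise AND of equally sized binary masks (lists of rows of 0/1)."""
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [int(all(mask[r][c] for mask in masks)) for c in range(cols)]
        for r in range(rows)
    ]


# Three toy "filtered frames": the middle row is stable (a caption),
# the other rows flicker with the moving background.
frames = [
    [[1, 0, 1, 0],
     [1, 1, 1, 1],
     [0, 0, 1, 0]],
    [[0, 1, 0, 0],
     [1, 1, 1, 1],
     [0, 1, 0, 0]],
    [[0, 0, 0, 1],
     [1, 1, 1, 1],
     [1, 0, 0, 0]],
]
text_mask = and_frames(frames)
for row in text_mask:
    print(row)
```

Only the row that is set in every frame remains, which is the candidate text area handed on to preprocessing.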

Page 12

VOCR Preprocessing Problems

Page 13

Augmenting VOCR with Dictionary Look-up
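The slide gives only the title, so this is a hedged illustration of the general idea rather than the Informedia method: each OCR token is compared with a word list and replaced by the closest entry when it is similar enough. The vocabulary, the cutoff of 0.8, and the function name `correct_tokens` are assumptions; the matching uses `difflib.get_close_matches` from the Python standard library.

```python
import difflib

VOCABULARY = ["CARNEGIE", "MELLON", "VIDEO"]  # hypothetical word list


def correct_tokens(tokens, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace each token by its closest vocabulary word, if one is similar enough."""
    corrected = []
    for token in tokens:
        matches = difflib.get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
        # Keep the raw OCR output when nothing in the vocabulary is close.
        corrected.append(matches[0] if matches else token)
    return corrected


print(correct_tokens(["CARNEG1E", "MELL0N", "XQZ"]))
```

A token such as "CARNEG1E" (digit 1 read in place of I) is close to one entry and is replaced; "XQZ" resembles nothing and passes through unchanged.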

Page 14

Handwriting Recognition

• Natural progression from OCR work for print

• Works if there are constraints on the writer, e.g. Palm Pilot, where user is asked to conform to specific style or convention

Page 15

Other Document Processing

• Not just for text. . .

• Examples:
  • Engineering document to CAD file
  • Maps to GIS format
  • Music score to MIDI representation


Page 17

Digital Cameras = Convenience

• Easy to capture photos

• Easy to store and organize photos

• Easy to duplicate photos

• Easy to edit photos

• Rough Multimedia eCommerce class survey:
  • 1999: 10% own digital cameras
  • 2000: 25%
  • 2001: 50%
  • 2002: ??

Page 18

Digital Camera Cautions

Via “Photo Industry Reporter” e-Magazine at: http://www.photoreporter.com/2002/10-21/photokina_report_look_at_35mm.html

• Film cameras still outsell digital cameras by almost three to one

• The household penetration of digital is at about 15%

• “But let’s face it: film’s days are numbered. Anyone staying solely with film these days will have a glorious buggy whip in a market that will be clamoring for cars.”

Page 19

Digital Camera Growth

• Photo Marketing Association on US digital camera sales:
  • 4.5 million in 2000
  • 6.9 million in 2001
  • Projected 9.3 million for 2002
  • http://www.visioneer.com/About/press/june2402.html

• InfoTrends Research Group estimates that the U.S. photo-enabled TV set-top installed base will grow from less than 1 million units in 2002, to over 114 million units in 2006. Household penetration will climb from under 1% to around 85%.

• InfoTrends projects digital camera sales to grow at a rate of 38% through 2003
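As a quick check on the sales figures above, the year-over-year growth they imply can be worked out directly. This is only arithmetic on the numbers quoted from the Photo Marketing Association; the 2002 value is a projection, and the dictionary and function names are made up for the example.

```python
US_SALES_MILLIONS = {2000: 4.5, 2001: 6.9, 2002: 9.3}  # 2002 is a projection


def yearly_growth(sales):
    """Map each year (after the first) to its fractional growth over the prior year."""
    years = sorted(sales)
    return {
        later: (sales[later] - sales[earlier]) / sales[earlier]
        for earlier, later in zip(years, years[1:])
    }


for year, rate in yearly_growth(US_SALES_MILLIONS).items():
    print(f"{year}: {rate:.1%}")
```

The implied growth is roughly 53% for 2001 and roughly 35% for the 2002 projection.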

Page 20

State of the Art: Digital Cameras

• Film is currently better in resolution and color
  • Professional photographers
    • Digital for low quality newspaper advertisements
    • Film for portrait photos

• Computer storage limitations: 1 high resolution digital image = 20-25 Megabytes
  • http://pic.templetons.com/brad/photo/pixels.html
  • 3500 line pairs/35 mm or about 5000 dots/inch, but grainy
  • At 3:2 frame size, ~20 million pixels
  • Conclusion: “a 5300 x 4000 digital camera would produce a shot equivalent to a scan from a quality 35mm camera -- provided you can get more than 8 bits per pixel. …A 3000 x 2000 digital camera would match the 35mm for a good percentage of shots.”

• Printing: home printers not comparable to commercial printers
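The resolution figures above can be sanity-checked with a few lines of arithmetic. This sketch assumes that one line pair needs two dots to resolve and that megabytes are decimal (1,000,000 bytes); the slide states neither, and the function names are illustrative only.

```python
MM_PER_INCH = 25.4


def dots_per_inch(line_pairs, length_mm, dots_per_pair=2):
    """Convert a line-pair count over a length in mm to dots per inch."""
    return line_pairs * dots_per_pair / length_mm * MM_PER_INCH


def raw_megabytes(width, height, bits_per_pixel):
    """Uncompressed image size in decimal megabytes."""
    return width * height * bits_per_pixel / 8 / 1_000_000


print(dots_per_inch(3500, 35))        # compare with "about 5000 dots/inch"
print(5300 * 4000, 3000 * 2000)       # pixel counts of the two quoted sensors
print(raw_megabytes(5300, 4000, 8))   # one byte per pixel
print(raw_megabytes(5300, 4000, 24))  # three bytes per pixel
```

Under these assumptions 3500 line pairs over 35 mm works out to about 5080 dots per inch, and a 5300 x 4000 image holds 21.2 million pixels. At one byte per pixel that is about 21 MB, which falls inside the quoted 20-25 Megabyte range, although the slide does not say which bit depth it assumed.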

Page 21

Future of Digital Cameras

• Improved resolution and color

• “Smart” cameras

• More programmable features
  • Auto-focus on object of interest
  • “Everything in focus” photo
  • Capture photo when event X occurs


Page 23

Biometrics

• Technology for identification
  • Finger/palm print
  • Iris
  • Face

Page 24

Fingerprints

• Minutiae – splits and merges of ridges

Page 25

Face Identification

• Not quite reliable yet.
  • Performance degrades rapidly with uncontrolled lighting, facial expression, and size of database
  • Several companies exist:
    • Visionics (Rockefeller University spin-off)
    • Viisage (MIT spin-off)
    • EyeMatic (USC spin-off)
    • Miros (MIT spin-off)
    • Banque-Tec Intl (Australia)
    • C-VIS Computer Vision (Germany)
    • LAU Technologies

• Commercial systems installed in London and Brazil to catch criminals

Page 26

Automatic Age Progression

Original Image (1962)

Computer-Aged (1997)

Actual Photo (1997)


Page 28

Management of images on computers

• Compression – reducing storage size needed for images

• Watermarking – Protecting copyright

• Microsoft, Bell Labs, NEC, etc.Microsoft, Bell Labs, NEC, etc.

Visible watermark

Page 29

Photo Manipulation

• Adobe Photoshop, Corel PhotoPaint, Pixami, PhotoIQ, etc.

• Image editing: crop an image, adjust the color, paint over part of any image, airbrush part of an image, combine images, etc.

• Future: Applications of computer vision, e.g., discriminating foreground from background.

Page 30

Online Digital Image Collections

• Stock photos of use to graphic designers, artists, etc.

• Large collections of images exist
  • Corbis 67 million images
  • Getty 70 million stock photography images
  • AP collects 1000s of digitized images per day


Page 32

Inspection for Manufacturing

• Occum – inspection of printed circuit boards ($100M / year)

• Cognex – Do-it-yourself toolkits for inspection (400 employees)

Page 33

Automatic Target Recognition (ATR)

• Finding mines, tanks, etc.

• Billion dollar a year industry
  • Lockheed Martin, TSR, Northrop Grumman, other aerospace contractors.

• Various types of imagery:
  • Synthetic Aperture Radar (SAR), Sonar, hyper-spectral imagery (more than 3 colors)

Page 34

Aerial Photo Interpretation

• Also referred to as “automated cartography”

• Classification of land-use: forest, vegetation, water

• Identification of man-made objects: buildings, roads, etc.

Page 35

Better Security Cameras

• Cameras that are responsive to the environment
  • Track and zoom on moving objects
  • Automatic adjustment of contrast

Page 36

Medical imagery

• Medical image libraries for study and diagnosis

• Image overlay to guide surgeons

Page 37

History

• 1980’s ~100 companies – manufacturing applications mostly

• Early 1990’s less than 10 companies

• Late 1990’s ~100 companies – face recognition, intelligent teleconferencing, inspection, digital libraries, medical imaging


Page 39

Image Processing: Filtering

Enhancing an image’s quality for human viewing, e.g., in medical imaging or in telescopic views of space
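The slide names no particular filter. As one common example of filtering for visual quality, here is a minimal sketch of a 3x3 median filter, which removes isolated "salt and pepper" noise pixels. The choice of a median filter, the function name, and the toy image are all assumptions made for illustration.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Replace each pixel by the (low) median of its 3x3 neighborhood.

    `image` is a list of rows of grayscale values; windows are clipped at the borders.
    """
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            result[r][c] = median_low(window)
    return result


noisy = [
    [10, 10, 10],
    [10, 255, 10],  # a single bright noise pixel
    [10, 10, 10],
]
print(median_filter_3x3(noisy))
```

The outlier in the center is replaced by the value of its neighbors, while a uniform region is left unchanged.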

Page 40

Image Processing: Compression

• Lossless – No loss in quality: GIF (for images with up to 256 colors), TIFF (with a lossless codec such as LZW)

• Lossy – Original image cannot be reconstructed exactly: JPEG

• New work on advancing lossy compression strategies with fewer visual artifacts: JPEG 2000 and wavelet transformations
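To make the lossless/lossy distinction concrete, here is a small sketch using `zlib` from the Python standard library. It is not how GIF, TIFF, or JPEG work internally: zlib stands in for a lossless codec, and a crude quantization step (an assumption chosen for brevity) stands in for the information a lossy codec discards.

```python
import zlib


def lossless_roundtrip(data):
    """Compress and decompress with zlib; the result equals the input exactly."""
    return zlib.decompress(zlib.compress(data))


def quantize(data, step=16):
    """Lossy step: snap every byte down to a multiple of `step`."""
    return bytes((value // step) * step for value in data)


ramp = bytes(range(256))  # toy 256-pixel grayscale gradient
coarse = quantize(ramp)

print(lossless_roundtrip(ramp) == ramp)  # exact reconstruction
print(coarse == ramp)                    # detail has been discarded
print(len(zlib.compress(ramp)), len(zlib.compress(coarse)))
```

The quantized data cannot be turned back into the original gradient, but it compresses to fewer bytes, which is the trade a lossy scheme makes.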

Page 41

Image Processing: Watermarking

• Information hiding
  • Protecting copyright
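As a toy illustration of information hiding, the sketch below writes a few bits into the least significant bit of successive pixel values. Least-significant-bit embedding is a deliberately simple stand-in chosen for this example: it does not survive lossy compression or editing, and nothing in the slides says the systems mentioned use it. All names are hypothetical.

```python
def embed_bits(pixels, bits):
    """Return a copy of `pixels` with `bits` stored in the lowest bit of the first pixels."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to hold the message")
    marked = list(pixels)
    for index, bit in enumerate(bits):
        marked[index] = (marked[index] & ~1) | bit  # clear the low bit, then set it
    return marked


def extract_bits(pixels, count):
    """Read back the lowest bit of the first `count` pixels."""
    return [value & 1 for value in pixels[:count]]


original = [200, 201, 54, 77, 130, 9]
message = [1, 0, 1, 1]
marked = embed_bits(original, message)
print(marked)
print(extract_bits(marked, len(message)))
```

Each marked pixel differs from the original by at most one gray level, so the change is invisible to a viewer while the message can still be read back.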

Page 42

Image Processing: Transformation

• Transforming an image can make it easier to analyze

Wavelet transform of image

Page 43

Wavelet Coefficients

Horizontal LP, Vertical HP

Horizontal HP, Vertical HP

Horizontal HP, Vertical LP

Horizontal LP, Vertical LP

Page 44

5/3 Linear Phase Wavelets

Linear phase 5/3:
c[n] = {-1, 2, 6, 2, -1}, d[n] = {1, -2, 1}
g[n] = {1, 2, -6, 2, 1}, f[n] = {1, 2, 1}
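The slide lists the filter taps without labeling their roles or scaling. The sketch below assumes that c[n] is the analysis low-pass filter (scaled by 1/8) and d[n] the analysis high-pass filter (scaled by 1/2), which is the usual convention for the 5/3 pair, and it uses mirrored borders. It runs one level of a one-dimensional decomposition; applying the same step along rows and then columns of an image would yield four subbands like those labeled on the previous page. Function names are illustrative.

```python
C = [-1, 2, 6, 2, -1]  # slide's c[n]; assumed analysis low-pass, scaled by 1/8
D = [1, -2, 1]         # slide's d[n]; assumed analysis high-pass, scaled by 1/2


def reflect(index, length):
    """Mirror an out-of-range index back into [0, length) without repeating the edge."""
    if length == 1:
        return 0
    period = 2 * (length - 1)
    index %= period
    return index if index < length else period - index


def filter_centered(signal, taps, scale):
    """Apply symmetric `taps`, centered on each sample, with mirrored borders."""
    half = len(taps) // 2
    n = len(signal)
    return [
        sum(t * signal[reflect(i + k - half, n)] for k, t in enumerate(taps)) / scale
        for i in range(n)
    ]


def analyze(signal):
    """One decomposition level: low-pass at even samples, high-pass at odd samples."""
    low = filter_centered(signal, C, 8)[0::2]
    high = filter_centered(signal, D, 2)[1::2]
    return low, high


print(analyze([5] * 8))                        # flat signal
print(analyze([0, 0, 0, 0, 10, 10, 10, 10]))   # step edge
```

A flat signal produces zero high-pass coefficients, while the step produces a single nonzero high-pass coefficient at the edge. That concentration of detail into a few coefficients is what makes the transformed image easier to analyze and compress.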

Page 45

Computer Vision: 3D Shape Reconstruction

• Use images to build 3D model of object or site

3D site model built from laser range scans collected by CMU autonomous helicopter

Page 46

Computer Vision: Guiding Motion

• Visually guided manipulation
  • Hand-eye coordination

• Visually guided locomotion
  • robotic vehicles

CMU NavLab II

Page 47

Computer Vision: Recognition & Classification

Page 48

Challenges in Object Recognition

245 267 234 142 22 28 38

121 156 187 98 73 32 12

123 21 21 38 209 237 121

99 87 59 197 216 244

Page 49

Object Recognition Research

[Diagram labels]
• Quality/Quantity Issues
• Object Detection Issues
• Low Image Quality
• Large Quantity of Data
• Intra-class Object Variation
• Large number of Object Classes
• Automated Learning
• Robust Algorithms
• Advanced Image Enhancement
• Segmentation and Hierarchical Analysis
• Object Detection: Lips, Face, Text, Building, Hand Gesture, Vehicle, Clock, License Plate

Page 50: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

© Copyright 2002 Michael G. Christel and Alexander G. Hauptmann 50 Carnegie Mellon

Intra-Class VariationIntra-Class Variation

Page 51: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

© Copyright 2002 Michael G. Christel and Alexander G. Hauptmann 51 Carnegie Mellon

Lighting VariationLighting Variation

Page 52: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Geometric Variation

Page 53: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Simpler Problem: Classification

• Fixed size input

• Fixed object size, orientation, and alignment

Decision: “Object is present” (at fixed size and alignment) or “Object is NOT present” (at fixed size and alignment)

Page 54: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Detection: Apply Classifier Exhaustively

Search in position

Search in scale
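The exhaustive search can be sketched as a loop over scales and window positions that calls a fixed-size classifier on every patch. This is only an illustration under stated assumptions: the names `downscale` and `detect`, the nearest-neighbour resampling, the window size, and the scale step are all hypothetical and are not taken from the CMU detector.

```python
def downscale(image, factor):
    """Nearest-neighbour resampling of a grayscale image (list of rows).
    Assumption: stands in for a proper image pyramid."""
    rows, cols = len(image), len(image[0])
    height = max(1, int(rows / factor))
    width = max(1, int(cols / factor))
    return [
        [image[min(rows - 1, int(r * factor))][min(cols - 1, int(c * factor))]
         for c in range(width)]
        for r in range(height)
    ]


def detect(image, classify, window=4, scale_step=2.0, stride=1):
    """Apply a fixed-size classifier at every position and scale.

    classify(patch) -> bool is the fixed-size classifier; it always sees
    a window x window patch. Returns (top, left, size) tuples expressed
    in the coordinates of the original image.
    """
    detections = []
    factor = 1.0
    while True:
        scaled = downscale(image, factor)
        if len(scaled) < window or len(scaled[0]) < window:
            break  # image is now smaller than the classifier input
        # Search in position
        for top in range(0, len(scaled) - window + 1, stride):
            for left in range(0, len(scaled[0]) - window + 1, stride):
                patch = [row[left:left + window]
                         for row in scaled[top:top + window]]
                if classify(patch):
                    detections.append(
                        (int(top * factor), int(left * factor), int(window * factor)))
        # Search in scale: shrink the image so larger objects fit the window
        factor *= scale_step
    return detections


def all_bright(patch):
    """Toy classifier (assumption): 'object' means a nearly all-bright patch."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values) > 0.9
```

Shrinking the image rather than growing the window keeps the classifier input at a fixed size, which matches the "fixed size input" assumption of the classification slide.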

Page 55: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

View-based Classifiers

Face Classifier #1

Face Classifier #2

Face Classifier #3

Page 56: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

1) Apply Local Operators

f1(0, 1) = #3214

f1(0, 0) = #5710

fk(n, m) = #723

Page 57: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

2) Look Up Probabilities

f1(0, 1) = #3214

f1(0, 0) = #5710

fk(n, m) = #723

P1( #5710, 0, 0 | obj) = 0.53

P1( #5710, 0, 0 | non-obj) = 0.56

P1( #3214, 0, 1 | obj) = 0.57

P1( #3214, 0, 1 | non-obj) = 0.48

Pk( #723, n, m | obj) = 0.83

Pk( #723, n, m | non-obj) = 0.19

Page 58: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

3) Make Decision

P1( #5710, 0, 0 | obj) = 0.53

P1( #5710, 0, 0 | non-obj) = 0.56

P1( #3214, 0, 1 | obj) = 0.57

P1( #3214, 0, 1 | non-obj) = 0.48

Pk( #723, n, m | obj) = 0.83

Pk( #723, n, m | non-obj) = 0.19

$$\frac{0.53 \times 0.57 \times \cdots \times 0.83}{0.56 \times 0.48 \times \cdots \times 0.19} > \lambda$$

(λ is the decision threshold.)
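The three steps (apply local operators, look up probabilities, compare the ratio of products to a threshold) can be sketched as below. The table layout, the function names, and the default threshold of 1.0 are assumptions made for illustration; only the three probability pairs printed on the slides are filled in, and the symbolic indices k, n, m are kept as strings. Summing logarithms instead of multiplying is a common way to avoid numerical underflow when many probabilities are combined; the slides do not say how the original detector handles this.

```python
import math

# Hypothetical lookup tables keyed by (operator, quantized value, row, col).
# Values are the ones shown on the slides.
P_OBJ = {
    (1, 5710, 0, 0): 0.53,
    (1, 3214, 0, 1): 0.57,
    ("k", 723, "n", "m"): 0.83,
}
P_NON_OBJ = {
    (1, 5710, 0, 0): 0.56,
    (1, 3214, 0, 1): 0.48,
    ("k", 723, "n", "m"): 0.19,
}

# Step 1 output (assumed already computed): one key per operator response.
FEATURES = list(P_OBJ)


def log_likelihood_ratio(features):
    """Step 2: look up both probabilities for each feature and sum the logs."""
    return sum(math.log(P_OBJ[f]) - math.log(P_NON_OBJ[f]) for f in features)


def likelihood_ratio(features):
    """The ratio of products written on the slide."""
    return math.exp(log_likelihood_ratio(features))


def is_object(features, threshold=1.0):
    """Step 3: declare 'object' when the ratio exceeds the threshold."""
    return log_likelihood_ratio(features) > math.log(threshold)
```

With only the three factors shown, the ratio is about 4.91, so `is_object(FEATURES)` is true for a threshold of 1.0 and false for a threshold of 10.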

Page 59: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Two Classifiers Trained for Faces

Page 60: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Eight Classifiers Trained for Cars

Page 61: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

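Probabilities Estimated Off-Line

Count: each operator output on a training image increments a histogram bin.

$$f_1(0, 0) = \#567 \;\Rightarrow\; H_1(\#567, 0, 0) \leftarrow H_1(\#567, 0, 0) + 1$$

$$f_k(n, m) = \#350 \;\Rightarrow\; H_k(\#350, n, m) \leftarrow H_k(\#350, n, m) + 1$$

Normalize: each probability is a bin count divided by the total count at that position.

$$P_1(\#567, 0, 0) = \frac{H_1(\#567, 0, 0)}{\sum_i H_1(\#i, 0, 0)}$$

$$P_k(\#350, n, m) = \frac{H_k(\#350, n, m)}{\sum_i H_k(\#i, n, m)}$$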
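The counting and normalizing described on this slide can be sketched as follows. The function name `estimate_tables` and the data layout are hypothetical, and no smoothing of empty bins is applied, which a practical system would probably need. The function would be run once on object images and once on non-object images to obtain the two tables.

```python
from collections import Counter, defaultdict


def estimate_tables(training_features):
    """Estimate P(value, row, col) per operator from training images.

    training_features: iterable with one entry per training image, each a
    list of (operator, value, row, col) tuples (assumed layout).
    Returns a dict mapping (operator, value, row, col) to a probability.
    """
    # One histogram per (operator, row, col): value -> count
    histograms = defaultdict(Counter)
    for features in training_features:
        for operator, value, row, col in features:
            histograms[(operator, row, col)][value] += 1  # H <- H + 1

    probabilities = {}
    for (operator, row, col), histogram in histograms.items():
        total = sum(histogram.values())  # sum over all values i of H(#i, row, col)
        for value, count in histogram.items():
            probabilities[(operator, value, row, col)] = count / total
    return probabilities
```

For example, if operator 1 at position (0, 0) yields #567 on two training images and #12 on a third, the estimate for #567 is 2/3.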

Page 62: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Training Classifiers

• Cars: 300-500 images per viewpoint

• Faces: 2,000 images per viewpoint

• ~1,000 synthetic variations of each original image
  • background scenery, orientation, position, frequency

• 2,000 non-object images
  • Samples selected by bootstrapping

• Minimization of classification error on training set
  • AdaBoost algorithm (Freund & Schapire ’97, Schapire & Singer ’99)
  • Iterative method
  • Determines weights for samples
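To illustrate how an iterative method can determine weights for samples, here is one round of the textbook discrete AdaBoost weight update. It is a generic sketch, not the training code of the detector described here; the function name and the ±1 label encoding are assumptions, and the weights passed in are assumed to sum to 1.

```python
import math


def adaboost_round(weights, labels, predictions):
    """One round of discrete AdaBoost sample reweighting.

    weights:     current sample weights, assumed to sum to 1
    labels:      true classes, each +1 (object) or -1 (non-object)
    predictions: output of the current weak classifier, each +1 or -1
    Returns (alpha, new_weights).
    """
    # Weighted error of the weak classifier on the training set
    error = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)

    # Vote of this weak classifier in the final combination
    alpha = 0.5 * math.log((1 - error) / error)

    # Raise the weight of misclassified samples, lower the rest
    updated = [w * math.exp(-alpha * y * h)
               for w, y, h in zip(weights, labels, predictions)]
    total = sum(updated)
    return alpha, [w / total for w in updated]
```

After the update, the misclassified samples together carry half of the total weight, so the next weak classifier is pushed toward the examples the current one gets wrong.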

Page 63: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)
Page 64: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)
Page 65: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Web-based Demo of Face Detector

http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi

Page 66: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)
Page 67: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)
Page 68: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

CMU Face Detector in Commercial Product

CMU Face Detector

Page 69: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Applications of Face Detection

• Automatic red-eye removal from photographs

• Automatic color balancing in photo-finishing

• Intelligent teleconferencing

• Component in face identification system

Page 70: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Difficulty Increases with Complexity of Object

• 2D vs. 3D

• Specific objects – e.g. my coffee mug

• A category of objects – e.g. all coffee mugs

• Amount of intra-category variation
  • Rigid or semi-rigid structure, e.g. face
  • Articulated objects, e.g. human body
  • Functionally defined objects, e.g. chairs

Page 71: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Outline

• Defining Image Processing and Computer Vision

• Emerging Technology
  • Digitization of documents
  • Digitization of images/photographs
  • Biometrics
  • Management of images on computers
  • Other: manufacturing, military, games, …

• Research in Image Processing and Computer Vision
  • Automatically Finding Faces and Cars
  • Content-based Image Retrieval

Page 72: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Find Images With Similar Colors

Page 73: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Find Images with Similar Shape

Page 74: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Goal: Find Images with Similar Content

Page 75: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Spectrum of Content-Based Image Retrieval

[Diagram: retrieval goals in order of increasing degree of difficulty, each paired with its technique]

• Similar color distribution: Histogram matching

• Similar texture pattern: Texture analysis

• Similar shape/pattern: Image segmentation, pattern recognition

• Similar real content: Life-time goal :-)
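Histogram matching, the easiest end of this spectrum, can be sketched with a coarse color histogram and histogram intersection as the similarity score. The binning (4 bins per RGB channel), the function names, and the choice of intersection as the measure are assumptions for illustration; the systems named in this lecture may use different color spaces and distance measures.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram.

    pixels: non-empty list of (r, g, b) tuples with values in 0..255.
    Returns a list of bins_per_channel ** 3 fractions that sum to 1.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    step = 256 // bins_per_channel
    last = bins_per_channel - 1
    histogram = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Quantize each channel, clamping in case 256 is not a multiple of the bin count
        r_bin = min(last, r // step)
        g_bin = min(last, g // step)
        b_bin = min(last, b // step)
        histogram[(r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin] += 1
    return [count / len(pixels) for count in histogram]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 for identical color distributions, 0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Because the histogram discards where each color occurs, two images with the same colors in different arrangements score as identical, which is why color matching sits at the low-difficulty end and says little about real content.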

Page 76: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Status of Image Search

• Typical Search Features
  • Color
  • Texture
  • Shape
  • Spatial attributes (local color regions, less common than global color, texture, shape metrics)

• Commercial Activity
  • eVision (notes that “visual search engine market segment is projected to reach $1.4 billion by 2005 according to the McKenna Group”, http://www.evisionglobal.com/about/index.html)
  • Virage (www.virage.com)
  • IBM (QBIC part of database toolset)

Page 77: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Reference: “A Review of CBIR”

Recommended reading:

A Review of Content-Based Image Retrieval Systems

Colin C. Venters and Dr. Matthew Cooper, University of Manchester

Available at http://www.jisc.ac.uk/jtap/htm/jtap-054.html

This review lists features from a number of image retrieval systems, along with heuristic evaluations on the interfaces for a subset of these systems.

Page 78: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Search Engines Used by 2001 Multimedia Class

• Search Engines used for 2001 multimedia retrieval homework (15 others answered a single query each):

[Bar chart: Queries Answered (axis 0 to 60) per search engine: Google, AltaVista, Lycos, Yahoo, Alltheweb, CNN, Corbis, Findsounds, 3dcafe, Excite, VastVideo, Vivisimo, Mamma]

Page 79: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Search Engines Used in This 2002 Class

Also answering 1 query each were: Excite+, Rexfeature, Webseek+, search.netscape.com+, animalplanet.com+, ask.com, naver.com+

[Bar chart: Queries Answered (axis 0 to 50) per search engine: Google, AltaVista, alltheweb.com, Lycos+, corbis.com, Singingfish.com+, Gettyimage+, Yahoo, CNN, Webshots.com+]

Page 80: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

For Further Reading on Texture Search

• Texture Search: “Texture features for browsing and retrieval of image data”, B.S. Manjunath and W.Y. Ma, IEEE Trans. on Pattern Analysis and Machine Intelligence 18(8), Aug. 1996, pp. 837-842.

• Texture search via http://www.engin.umd.umich.edu/ceep/tech_day/2000/reports/ECEreport2/ECEreport2.htm (texture features include coarseness, average gray scale value, and number of horizontal and vertical extrema of a specific image region)

• For QBIC, texture search works on global coarseness, contrast and directionality features

Page 81: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

For Further Exploration of Image Segmentation

• BlobWorld work at UC Berkeley

• Papers, description, sample system available at http://elib.cs.berkeley.edu/photos/blobworld/

Page 82: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Further Reading on Wavelet Compression and JPEG 2000

• http://www.gvsu.edu/math/wavelets/student_work/EF/how-works.html

• http://www-ise.stanford.edu/class/psych221/00/shuoyen/

• Henry Schneiderman Ph.D. Thesis “A Statistical Approach to 3D Object Detection Applied to Faces and Cars”, http://www.ri.cmu.edu/pub_files/pub2/schneiderman_henry_2000_2/schneiderman_henry_2000_2.pdf

• http://www.jpeg.org/JPEG2000.html

Page 83: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Summary: Image Processing & Computer Vision

• Not as mature as speech recognition
  • Technology not as reliable
  • Fewer companies, fewer products

• Success on limited problems, e.g., documents

• More applicable to fault tolerant problems

• Technology will grow
  • Emergence of digital camera
  • Improved methods

Page 84: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Decomposition in Resolution/Frequency

[Figure: image decomposed into coarse, intermediate, and fine resolution/frequency bands]

Page 85: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Wavelet Decomposition

Vertical subbands (LH)

Page 86: Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, hws)

Wavelet Decomposition

Horizontal subbands (HL)