
3D Computer Vision and Video Computing: 3D Vision, Topic 3 of Part II, Stereo Vision. CSc I6716, Spring 2011. Zhigang Zhu, City College of New York, zhu@cs.ccny.cuny.edu


Post on 19-Dec-2015


TRANSCRIPT

  • Slide 1
  • 3D Computer Vision and Video Computing. 3D Vision, Topic 3 of Part II: Stereo Vision. CSc I6716, Spring 2011. Zhigang Zhu, City College of New York, zhu@cs.ccny.cuny.edu
  • Slide 2
  • Stereo Vision
      • Problem
          • Infer the 3D structure of a scene from two or more images taken from different viewpoints
      • Two primary sub-problems
          • Correspondence problem (stereo matching) -> disparity map
              • Matching points look similar rather than identical
              • Occlusion problem: some parts of the scene are visible to only one eye
          • Reconstruction problem -> 3D
              • What we need to know about the camera parameters
              • Often a stereo calibration problem
      • Lectures on Stereo Vision
          • Stereo Geometry: Epipolar Geometry (*)
          • Correspondence Problem (*): two classes of approaches
          • 3D Reconstruction Problems: three approaches
  • Slide 3
  • A Stereo Pair
      • Problems
          • Correspondence problem (stereo matching) -> disparity map
          • Reconstruction problem -> 3D
      • Image source: CMU CIL Stereo Dataset, Castle sequence, http://www-2.cs.cmu.edu/afs/cs/project/cil/ftp/html/cil-ster.html
      • [Figure: a stereo pair; what is the 3D structure?]
  • Slide 4
  • More Images
      • Problems
          • Correspondence problem (stereo matching) -> disparity map
          • Reconstruction problem -> 3D
  • Slide 9
  • Part I. Stereo Geometry
      • A Simple Stereo Vision System
          • Disparity Equation
          • Depth Resolution
          • Fixated Stereo System
              • Zero-disparity Horopter
      • Epipolar Geometry
          • Epipolar lines: where to search for correspondences
              • Epipolar Plane, Epipolar Lines and Epipoles
              • http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html
          • Essential Matrix and Fundamental Matrix
              • Computing E & F by the Eight-Point Algorithm
              • Computing the Epipoles
      • Stereo Rectification
  • Slide 10
  • Stereo Geometry
      • Converging axes: the usual setup of human eyes
      • Depth is obtained by triangulation
      • Correspondence problem: p_l and p_r are the left and right projections of P(X, Y, Z), respectively
  • Slide 11
  • A Simple Stereo System
      • Left camera: the left image is the reference
      • Right camera: the right image is the target
      • [Figure labels: Z_w = 0, elevation Z_w, depth Z, disparity, baseline]
  • Slide 12
  • Disparity Equation
      • Stereo system with parallel optical axes
      • Left camera: optical center O_l, image plane, image point p_l(x_l, y_l)
      • Right camera: optical center O_r, image plane, image point p_r(x_r, y_r)
      • f = focal length, B = baseline
      • Scene point P(X, Y, Z), at depth Z
      • Disparity: dx = x_r - x_l
  • Slide 13
  • Disparity vs. Baseline
      • Stereo system with parallel optical axes
      • Left camera: optical center O_l, image plane, image point p_l(x_l, y_l)
      • Right camera: optical center O_r, image plane, image point p_r(x_r, y_r)
      • f = focal length, B = baseline
      • Scene point P(X, Y, Z), at depth Z
      • Disparity: dx = x_r - x_l
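The disparity equation itself did not survive in the transcript. The sketch below assumes the standard relation for a parallel-axis rig, Z = f B / |dx|, with f in pixels and B in meters; the sign of dx depends on the image-coordinate convention, so only its magnitude is used here. The function name `depth_from_disparity` is hypothetical, and the numbers reuse the example values given on the Depth Accuracy slide.

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / |d| for a parallel-axis stereo rig (assumed form)."""
    d = np.abs(np.asarray(disparity_px, dtype=float))
    # Zero disparity corresponds to a point at infinity.
    with np.errstate(divide="ignore"):
        return np.where(d > 0, focal_px * baseline_m / d, np.inf)

f_px = 16 * 512 / 8   # focal length in pixels, from the slides' example
B_m = 0.5             # baseline in meters, from the slides' example
depths = depth_from_disparity([512.0, 51.2, 0.0], f_px, B_m)
```

Larger disparities map to nearer points, which is the inverse relation between disparity and depth that the following slides rely on.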
  • Slide 14
  • Depth Accuracy
      • Given the same image localization error
          • Angle of the cones in the figure
      • Depth accuracy (depth resolution) vs. baseline
          • Depth error ∝ 1/B (baseline length)
          • PROS of a longer baseline
              • Better depth estimation
          • CONS
              • Smaller common FOV
              • Correspondence is harder due to occlusion
      • Depth accuracy (depth resolution) vs. depth
          • Disparity (> 0) ∝ 1/depth
          • Depth error ∝ depth^2
          • The nearer the point, the better the depth estimation
      • An example
          • f = 16 x 512/8 pixels, B = 0.5 m
          • Depth error vs. depth
      • [Figure: two viewpoints O_l and O_r observing points at depths Z_1 and Z_2, with Z_2 > Z_1; plots of absolute error and relative error]
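The proportionalities above follow from first-order error propagation on the assumed relation Z = f B / d: a disparity error δd produces a depth error of about Z^2 / (f B) · δd. The sketch below evaluates this with the slide's example values; the function name `depth_error` and the one-pixel disparity error are assumptions made for illustration.

```python
import numpy as np

def depth_error(depth_m, focal_px, baseline_m, disparity_error_px=1.0):
    """First-order depth error: dZ ~ Z^2 / (f * B) * dd."""
    depth = np.asarray(depth_m, dtype=float)
    return depth**2 / (focal_px * baseline_m) * disparity_error_px

f_px = 16 * 512 / 8          # 1024 pixels, from the slide's example
B_m = 0.5                    # baseline in meters, from the slide's example
Z = np.array([1.0, 2.0, 4.0, 8.0, 16.0])

abs_err = depth_error(Z, f_px, B_m)           # grows with Z^2
rel_err = abs_err / Z                         # grows linearly with Z
abs_err_long = depth_error(Z, f_px, 2 * B_m)  # doubling B halves the error
```

Doubling the depth quadruples the absolute error, while doubling the baseline halves it, matching the two trends listed on the slide.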
  • Slide 15
  • Stereo with Converging Cameras
      • Stereo with parallel axes
          • Short baseline
              • Large common FOV
              • Large depth error
          • Long baseline
              • Small depth error
              • Small common FOV
              • More occlusion problems
      • Two optical axes intersect at the fixation point (converging angle)
          • The common FOV increases
      • [Figure: FOV of the left and right cameras]
  • Slide 17
  • Stereo with Converging Cameras
      • Two optical axes intersect at the fixation point (converging angle)
          • The common FOV increases
      • Disparity properties
          • Disparity uses angle instead of distance
          • Zero disparity at the fixation point
              • and on the zero-disparity horopter
          • Disparity increases with the distance of objects from the fixation point
              • > 0: outside of the horopter
      • [Figure labels: d > 0, horopter]
  • Essential Matrix
      • Essential matrix E = RS
          • A natural link between the stereo point pair and the extrinsic parameters of the stereo system
              • One correspondence -> one linear equation in the 9 entries of E
              • Given 8 pairs of (p_l, p_r) -> E
          • The mapping between points and epipolar lines that we are looking for
              • Given p_l and E -> p_r lies on a projective line in the right image plane
              • The equation represents the epipolar line of p_r (or p_l) in the right (or left) image
      • Note
          • p_l and p_r are in the camera coordinate system, not the pixel coordinates that we can measure
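The equation on this slide is missing from the transcript. The sketch below assumes the common textbook convention P_r = R (P_l - T), with S the skew-symmetric matrix of T, under which E = R S satisfies p_r^T E p_l = 0 and E p_l gives the epipolar line in the right image. The helper names and the numeric values of R, T and P_l are illustrative assumptions.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix S such that S @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def essential_matrix(R, T):
    """E = R S, assuming the convention P_r = R (P_l - T)."""
    return R @ skew(T)

# Hypothetical extrinsics: 5 degree rotation about Y, mostly horizontal baseline.
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
T = np.array([0.5, 0.0, 0.02])
E = essential_matrix(R, T)

# One scene point seen by both cameras, in camera (not pixel) coordinates.
P_l = np.array([0.3, -0.2, 4.0])
P_r = R @ (P_l - T)
p_l = P_l / P_l[2]
p_r = P_r / P_r[2]

residual = p_r @ E @ p_l      # epipolar constraint, should vanish
epipolar_line_r = E @ p_l     # line coefficients (a, b, c) in the right image
singular_values = np.linalg.svd(E, compute_uv=False)
```

Each correspondence contributes one such scalar equation, linear in the nine entries of E, which is why eight pairs suffice to determine E up to scale.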
  • Slide 28
  • Fundamental Matrix
      • Mapping between points and epipolar lines in the pixel coordinate systems
          • With no prior knowledge of the stereo system
      • From camera to pixels: matrices of intrinsic parameters
          • Rank(M_int) = 3
      • Questions
          • What are f_x, f_y, o_x, o_y?
          • How do we measure p_l in images?
  • Slide 29
  • Fundamental Matrix
      • Fundamental matrix
          • Rank(F) = 2
          • Encodes information on both the intrinsic and extrinsic parameters
          • Enables full reconstruction of the epipolar geometry
          • Works in pixel coordinate systems, without any knowledge of the intrinsic and extrinsic parameters
          • Linear equation in the 9 entries of F
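The formula linking F to E is not preserved in the transcript. A minimal sketch, assuming pixel points are obtained as M_int p from camera-coordinate points p and that M_int has the upper-triangular form below with positive focal lengths (some textbooks use negative ones), is F = M_r^{-T} E M_l^{-1}. All numeric values and function names are illustrative assumptions.

```python
import numpy as np

def intrinsic_matrix(fx, fy, ox, oy):
    """Assumed form of M_int: focal lengths in pixels and principal point."""
    return np.array([[fx, 0.0, ox],
                     [0.0, fy, oy],
                     [0.0, 0.0, 1.0]])

def fundamental_from_essential(E, M_l, M_r):
    """F = M_r^{-T} E M_l^{-1}, so that pix_r^T F pix_l = 0."""
    return np.linalg.inv(M_r).T @ E @ np.linalg.inv(M_l)

# Hypothetical rig: pure horizontal translation, convention P_r = R (P_l - T).
T = np.array([0.5, 0.0, 0.0])
R = np.eye(3)
S = np.array([[0.0, -T[2], T[1]],
              [T[2], 0.0, -T[0]],
              [-T[1], T[0], 0.0]])
E = R @ S

M_l = intrinsic_matrix(800.0, 800.0, 320.0, 240.0)
M_r = intrinsic_matrix(820.0, 820.0, 310.0, 250.0)
F = fundamental_from_essential(E, M_l, M_r)

P_l = np.array([0.3, -0.2, 4.0])
P_r = R @ (P_l - T)
pix_l = M_l @ (P_l / P_l[2])   # homogeneous pixel coordinates, left image
pix_r = M_r @ (P_r / P_r[2])   # homogeneous pixel coordinates, right image
residual = pix_r @ F @ pix_l
svals = np.linalg.svd(F, compute_uv=False)
```

Because M_int is invertible (rank 3), F inherits the rank-2 property of E.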
  • Slide 30
  • Computing F: The Eight-Point Algorithm
      • Input: n point correspondences (n >= 8)
          • Construct the homogeneous system Ax = 0 from the n correspondences
              • x = (f_11, f_12, f_13, f_21, f_22, f_23, f_31, f_32, f_33): the entries of F
              • Each correspondence gives one equation
              • A is an n x 9 matrix
          • Obtain the estimate F^ by SVD of A
              • x (up to a scale) is the column of V corresponding to the least singular value
          • Enforce the singularity constraint, since Rank(F) = 2
              • Compute the SVD of F^
              • Set the smallest singular value to 0: D -> D'
              • Corrected estimate of F: F' = U D' V^T
      • Output: the estimate of the fundamental matrix, F'
      • Similarly, we can compute E given the intrinsic parameters
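The steps above can be sketched in NumPy as follows. Note one addition that is not on the slide: the points are first translated and scaled (Hartley-style normalization) so that the linear system is well conditioned, and the result is mapped back to pixel coordinates at the end. The function names, the synthetic cameras and the random scene are all assumptions made for the example.

```python
import numpy as np

def normalize_points(pts):
    """Zero-mean the points and scale their mean distance to sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2.0) / np.linalg.norm(pts - centroid, axis=1).mean()
    N = np.array([[scale, 0.0, -scale * centroid[0]],
                  [0.0, scale, -scale * centroid[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (N @ homog.T).T, N

def eight_point(pts_l, pts_r):
    """Estimate F (unit Frobenius norm) with pix_r^T F pix_l = 0."""
    q_l, N_l = normalize_points(pts_l)
    q_r, N_r = normalize_points(pts_r)
    # One row per correspondence; x holds the entries of F in row-major order.
    A = np.einsum("ni,nj->nij", q_r, q_l).reshape(-1, 9)
    _, _, Vt = np.linalg.svd(A)
    F_hat = Vt[-1].reshape(3, 3)          # least singular value of A
    U, D, Vt_f = np.linalg.svd(F_hat)
    D[-1] = 0.0                           # enforce Rank(F) = 2
    F_norm = U @ np.diag(D) @ Vt_f
    F = N_r.T @ F_norm @ N_l              # undo the normalization
    return F / np.linalg.norm(F)

# Synthetic, noise-free test scene (hypothetical cameras and points).
rng = np.random.default_rng(0)
n = 20
P_l = np.column_stack([rng.uniform(-1, 1, n), rng.uniform(-1, 1, n),
                       rng.uniform(4, 8, n)])
angle = np.deg2rad(3.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
T = np.array([0.5, 0.0, 0.05])
P_r = (R @ (P_l - T).T).T                 # convention P_r = R (P_l - T)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])

def project(P):
    return (K @ (P / P[:, [2]]).T).T[:, :2]

pts_l, pts_r = project(P_l), project(P_r)
F = eight_point(pts_l, pts_r)
h_l = np.column_stack([pts_l, np.ones(n)])
h_r = np.column_stack([pts_r, np.ones(n)])
residuals = np.einsum("ni,ij,nj->n", h_r, F, h_l)
svals = np.linalg.svd(F, compute_uv=False)
```

With noisy correspondences the residuals would not vanish, and the rank-2 correction step is what keeps all estimated epipolar lines passing through a common epipole.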
  • Slide 31
  • Locating the Epipoles from F
      • Reasoning: e_l lies on all the epipolar lines of the left image, so the epipolar constraint holds at e_l for every p_r; since F is not identically zero, F e_l = 0
      • Input: fundamental matrix F
          • Find the SVD of F
          • The epipole e_l is the column of V corresponding to the null singular value (as shown above)
          • The epipole e_r is the column of U corresponding to the null singular value
      • Output: epipoles e_l and e_r
      • [Figure: epipolar plane, epipolar lines, and epipoles e_l, e_r for a point P with projections p_l, p_r and optical centers O_l, O_r]
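A minimal sketch of this procedure, assuming the constraint is written pix_r^T F pix_l = 0 and the convention P_r = R (P_l - T). Under those assumptions the left epipole is the image of the right optical center, K T, and the right epipole is the image of the left optical center, K R T (up to scale), which gives a ground truth to compare against. The camera values and the function name are hypothetical.

```python
import numpy as np

def epipoles_from_F(F):
    """Return (e_l, e_r) as homogeneous unit vectors: F e_l = 0, F^T e_r = 0."""
    U, _, Vt = np.linalg.svd(F)
    e_l = Vt[-1]        # column of V for the null singular value
    e_r = U[:, -1]      # column of U for the null singular value
    return e_l, e_r

# Build F from a hypothetical calibrated rig so the result can be checked.
angle = np.deg2rad(3.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
T = np.array([0.5, 0.0, 0.05])
S = np.array([[0.0, -T[2], T[1]],
              [T[2], 0.0, -T[0]],
              [-T[1], T[0], 0.0]])
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K)
F = K_inv.T @ (R @ S) @ K_inv

e_l, e_r = epipoles_from_F(F)
e_l_pix = e_l / e_l[2]              # pixel coordinates (finite epipole)
e_r_pix = e_r / e_r[2]
expected_l = K @ T                  # image of O_r in the left camera
expected_r = K @ R @ T              # image of O_l in the right camera
```

For a rectified pair the third coordinate of each epipole is zero (epipoles at infinity), so the division by `e_l[2]` only makes sense for converging or otherwise unrectified cameras.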
  • Slide 32
  • Break
      • Homework #4 online, due on May 03 before class
  • Slide 33
  • Stereo Rectification
      • Rectification
          • Given a stereo pair and the intrinsic and extrinsic parameters, find the image transformation that yields a stereo system with horizontal epipolar lines
          • A simple algorithm: assuming calibrated stereo cameras
      • Stereo system with parallel optical axes
          • Epipoles are at infinity
          • Horizontal epipolar lines
      • [Figure: point P with projections p_l and p_r, optical centers O_l and O_r, camera axes (X_l, Y_l, Z_l) and (X_r, Y_r, Z_r), and translation T]
  • Slide 34
  • Stereo Rectification
      • Algorithm
          • Rotate both the left and right cameras so that they share the same X axis: O_r - O_l = T
          • Define a rotation matrix R_rect for the left camera
          • The rotation matrix for the right camera is R_rect R^T
          • The rotation can be implemented by an image transformation
      • Axes of the rotated left camera: X_l = T, Y_l = X_l x Z_l, Z_l = X_l x Y_l
      • [Figure: point P with projections p_l and p_r, optical centers O_l and O_r, the camera axes of both cameras, and the extrinsic parameters R, T]
  • Slide 36
  • Stereo Rectification
      • Algorithm
          • Rotate both the left and right cameras so that they share the same X axis: O_r - O_l = T
          • Define a rotation matrix R_rect for the left camera
          • The rotation matrix for the right camera is R_rect R^T
          • The rotation can be implemented by an image transformation
      • After rectification: T = (B, 0, 0), P_r = P_l - T
      • [Figure: rectified cameras with parallel Z axes and a shared X axis along T]
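One common way to build R_rect is sketched below, assuming the convention P_r = R (P_l - T): the new X axis points along T, the new Y axis is perpendicular to both the new X axis and the old optical axis, and the new Z axis completes a right-handed frame. The slide's cross-product order may differ by a sign; the right-handed choice here is an assumption, as are the function name and numeric values.

```python
import numpy as np

def rectifying_rotation(T):
    """Rows are the new camera axes expressed in the old left-camera frame."""
    e1 = T / np.linalg.norm(T)                        # new X axis: along the baseline
    # New Y axis: z x e1, normalized (assumes T is not parallel to the optical axis).
    e2 = np.array([-T[1], T[0], 0.0]) / np.hypot(T[0], T[1])
    e3 = np.cross(e1, e2)                             # new Z axis
    return np.vstack([e1, e2, e3])

# Hypothetical extrinsics of an unrectified rig.
angle = np.deg2rad(4.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
T = np.array([0.5, 0.05, 0.02])

R_rect = rectifying_rotation(T)
R_l = R_rect              # rotation applied to left-camera coordinates
R_r = R_rect @ R.T        # rotation applied to right-camera coordinates

P_l = np.array([0.3, -0.2, 4.0])
P_r = R @ (P_l - T)
P_l_rect = R_l @ P_l
P_r_rect = R_r @ P_r
B = np.linalg.norm(T)
```

In the rectified frames the two cameras differ only by a shift of B along X, so corresponding points share the same Y and Z and the epipolar lines become horizontal.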
  • Slide 37
  • Epipolar Geometry: Summary
      • Purpose
          • Where to search for correspondences
      • Epipolar plane, epipolar lines, and epipoles
          • Known intrinsic (f) and extrinsic (R, T) parameters
              • Co-planarity equation
          • Known intrinsic but unknown extrinsic parameters
              • Essential matrix
          • Unknown intrinsic and extrinsic parameters
              • Fundamental matrix
      • Rectification
          • Generate a stereo pair (in software) with parallel optical axes and thus horizontal epipolar lines
  • Slide 38
  • Part II. Correspondence Problem
      • Three questions
          • What to match?
              • Features: point, line, area, structure?
          • Where to search for correspondences?
              • Epipolar line?
          • How to measure similarity?
              • Depends on the features
      • Approaches
          • Correlation-based approach
          • Feature-based approach
      • Advanced topics
          • Image filtering to handle illumination changes
          • Adaptive windows to deal with multiple disparities
          • Local warping to account for perspective distortion
          • Sub-pixel matching to improve accuracy
          • Self-consistency to reduce false matches
          • Multi-baseline stereo
  • Slide 39
  • Correlation Approach
      • For each point (x_l, y_l) in the left image, define a window centered at that point
      • [Figure: left image]
  • Slide 40
  • Correlation Approach
      • Search for its corresponding point within a search region in the right image
      • [Figure: right image with the location (x_l, y_l) marked]
  • Slide 41
  • Correlation Approach
      • The disparity (dx, dy) is the displacement at which the correlation is maximum
      • [Figure: right image with (x_l, y_l), the displacement dx, and the match (x_r, y_r)]
  • Slide 42
  • Correlation Approach
      • Elements to be matched
          • Image window of fixed size centered at each pixel in the left image
      • Similarity criterion
          • A measure of similarity between windows in the two images
          • The corresponding element is given by the window that maximizes the similarity criterion within a search region
      • Search regions
          • Theoretically, the search region can be reduced to a 1-D segment along the epipolar line, within the disparity range
          • In practice, search a slightly larger region to allow for errors in calibration
  • Slide 43
  • Correlation Approach
      • Equations
          • Disparity
          • Similarity criterion
              • Cross-correlation
              • Sum of squared differences (SSD)
              • Sum of absolute differences (SAD)
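The slide's equations are not preserved, so the sketch below uses the usual definitions of the three criteria (with a zero-mean normalized variant of cross-correlation) and a 1-D search along the same image row, as for a rectified pair. SSD and SAD are costs to minimize, while cross-correlation is a score to maximize. The function names, window size, search range and the synthetic image pair are assumptions.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences (lower is better)."""
    return float(np.sum((a - b) ** 2))

def sad(a, b):
    """Sum of absolute differences (lower is better)."""
    return float(np.sum(np.abs(a - b)))

def ncc(a, b):
    """Zero-mean normalized cross-correlation (higher is better)."""
    a0, b0 = a - a.mean(), b - b.mean()
    return float(np.sum(a0 * b0) / np.sqrt(np.sum(a0**2) * np.sum(b0**2)))

def window(img, x, y, half):
    return img[y - half:y + half + 1, x - half:x + half + 1]

def match_along_row(left, right, x, y, half=2, max_disp=8):
    """Return dx = x_r - x_l minimizing SSD along row y of the right image."""
    ref = window(left, x, y, half)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        x_r = x - d                      # search to the left in the right image
        if x_r - half < 0:
            break
        cost = ssd(ref, window(right, x_r, y, half))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return -best_d

# Synthetic textured pair: the right image is the left one shifted by 3 pixels.
rng = np.random.default_rng(1)
left = rng.random((20, 40))
right = np.roll(left, -3, axis=1)
x, y = 20, 10
dx = match_along_row(left, right, x, y)
score = ncc(window(left, x, y, 2), window(right, x + dx, y, 2))
cost_sad = sad(window(left, x, y, 2), window(right, x + dx, y, 2))
```

Repeating the search at every pixel yields a dense disparity map; on textureless regions the cost curve is flat and the minimum becomes unreliable.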
  • Slide 44
  • Correlation Approach
      • PROS
          • Easy to implement
          • Produces a dense disparity map
          • May be slow
      • CONS
          • Needs textured images to work well
          • Inadequate for matching image pairs from very different viewpoints, due to illumination changes
          • The window may cover points with quite different disparities
          • Inaccurate disparities on the occluding boundaries
  • Slide 45
  • Correlation Approach
      • A stereo pair of the UMass campus: texture, boundaries and occlusion
  • Slide 46
  • Feature-based Approach
      • Features
          • Edge points
          • Lines (length, orientation, average contrast)
          • Corners
      • Matching algorithm
          • Extract features in the stereo pair
          • Define a similarity measure
          • Search for correspondences using the similarity measure and the epipolar geometry
  • Slide 47
  • Feature-based Approach
      • For each feature in the left image
      • [Figure: left image with corner, line, and structure features]
  • Slide 48
  • Feature-based Approach
      • Search in the right image: the disparity (dx, dy) is the displacement at which the similarity measure is maximum
      • [Figure: right image with corner, line, and structure features]
  • Slide 49
  • Feature-based Approach
      • PROS
          • Relatively insensitive to illumination changes
          • Good for man-made scenes with strong lines but weak texture or textureless surfaces
          • Works well on the occluding boundaries (edges)
          • Could be faster than the correlation approach
      • CONS
          • Only a sparse depth map
          • Feature extraction may be tricky
              • Lines (edges) might be only partially extracted in one image
              • How to measure the similarity between two lines?
  • Slide 50
  • Break
      • Homework #4 online, due on May 03 before class
  • Slide 51
  • Advanced Topics
      • Mainly used in the correlation-based approach, but can also be applied to feature-based matching
      • Image filtering to handle illumination changes
          • Image equalization
              • Makes the two images more similar in illumination
          • Laplacian filtering (2nd-order derivative)
              • Uses the derivative rather than the intensity (or original color)
  • Slide 52
  • Advanced Topics
      • Adaptive windows to deal with multiple disparities
          • Adaptive window approach (Kanade and Okutomi)
              • A statistically adaptive technique that selects, at each pixel, the window size that minimizes the uncertainty in the disparity estimate
              • T. Kanade and M. Okutomi, A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment, Proc. 1991 IEEE International Conference on Robotics and Automation, Vol. 2, April 1991, pp. 1088-1095
          • Multiple window algorithm (Fusiello et al.)
              • Uses 9 windows instead of just one to compute the SSD measure
              • The point with the smallest SSD error among the 9 windows and the various search locations is chosen as the best estimate for the given point
              • A. Fusiello, V. Roberto and E. Trucco, Efficient stereo with multiple windowing, IEEE CVPR, pp. 858-863, 1997
  • Slide 53
  • Advanced Topics
      • Multiple windows to deal with multiple disparities
      • [Figure labels: smooth regions, corners, edges; near, far]
  • Slide 54
  • Advanced Topics
      • Sub-pixel matching to improve accuracy
          • Find the peak in the correlation curves
      • Self-consistency to reduce false matches, especially for occlusions
          • Check the consistency of matches from L to R and from R to L
      • Multiple-resolution approach
          • From coarse to fine, for efficiency in searching for correspondences
      • Local warping to account for perspective distortion
          • Warp a small patch from one view to the other, given an initial estimate of the (planar) surface normal
      • Multi-baseline stereo
          • Improves both correspondences and 3D estimation by using more than two cameras (images)
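Two of the refinements above are small enough to sketch. The first fits a parabola through the matching score at the best integer disparity and its two neighbors, one common way to locate the peak (or cost minimum) with sub-pixel precision. The second is a left-right consistency check on one image row, using the convention dx = x_r - x_l for the left-to-right map and the reverse for the right-to-left map. The function names, the tolerance and the toy data are assumptions.

```python
import numpy as np

def subpixel_offset(c_minus, c_zero, c_plus):
    """Vertex of the parabola through scores at d-1, d, d+1, relative to d."""
    denom = c_minus - 2.0 * c_zero + c_plus
    if denom == 0.0:
        return 0.0                      # flat curve: keep the integer estimate
    return 0.5 * (c_minus - c_plus) / denom

def left_right_consistent(d_lr, d_rl, tol=1):
    """True where mapping L->R and then R->L returns to the starting column."""
    xs = np.arange(len(d_lr))
    x_r = xs + d_lr                     # column reached in the right image
    valid = (x_r >= 0) & (x_r < len(d_rl))
    ok = np.zeros(len(d_lr), dtype=bool)
    x_back = x_r[valid] + d_rl[x_r[valid]]
    ok[valid] = np.abs(x_back - xs[valid]) <= tol
    return ok

# Sub-pixel demo: cost curve (d - 0.3)^2 sampled at d = -1, 0, 1.
offset = subpixel_offset(1.69, 0.09, 0.49)

# Consistency demo: a constant disparity of -2 with one bad right-to-left match.
d_lr = np.full(8, -2)
d_rl = np.full(8, 2)
d_rl[3] = 0
mask = left_right_consistent(d_lr, d_rl)
```

Pixels that fail the check are typically occluded in one view or matched incorrectly, so they are usually discarded or filled in from their neighbors.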
  • Slide 55
  • 3D Reconstruction Problem
      • What we have done
          • Correspondences, using either correlation-based or feature-based approaches
          • Epipolar geometry from at least 8 point correspondences
      • Three cases of 3D reconstruction, depending on the amount of a priori knowledge of the stereo system
          • Both intrinsic and extrinsic parameters known -> the reconstruction problem can be solved unambiguously by triangulation
          • Only intrinsic parameters known -> recover structure and extrinsic parameters up to an unknown scaling factor
          • Only correspondences -> reconstruction only up to an unknown, global projective transformation (*)
  • Slide 56
  • Reconstruction by Triangulation
      • Assumption and problem
          • Both intrinsic and extrinsic parameters are known
          • Compute the 3-D location of a point from its projections p_l and p_r
      • Solution
          • Triangulation: the two rays are known and their intersection can be computed
          • Problem: the two rays will not actually intersect in space, due to errors in calibration and correspondences, and to pixelization
          • Solution: find the point in space with minimum distance from both rays
      • [Figure: rays from O_l through p_l and from O_r through p_r toward P]
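A minimal sketch of the minimum-distance solution, assuming the convention P_r = R (P_l - T) and points given in camera (not pixel) coordinates. The left ray is a·p_l and the right ray, expressed in the left frame, is T + b·R^T p_r; a least-squares solve finds the closest pair of points on the two rays and the estimate is their midpoint. The function name and numeric values are hypothetical.

```python
import numpy as np

def triangulate_midpoint(p_l, p_r, R, T):
    """Midpoint of the shortest segment joining the two viewing rays."""
    d_l = p_l                 # left ray direction, through O_l = 0
    d_r = R.T @ p_r           # right ray direction in the left frame, through O_r = T
    A = np.column_stack([d_l, -d_r])
    # Minimize |a * d_l - (T + b * d_r)| over the ray parameters a and b.
    (a, b), *_ = np.linalg.lstsq(A, T, rcond=None)
    point_on_left = a * d_l
    point_on_right = T + b * d_r
    gap = np.linalg.norm(point_on_left - point_on_right)
    return 0.5 * (point_on_left + point_on_right), gap

angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
T = np.array([0.5, 0.0, 0.02])

P_true = np.array([0.3, -0.2, 4.0])
P_r = R @ (P_true - T)
p_l = P_true / P_true[2]
p_r = P_r / P_r[2]

P_est, gap = triangulate_midpoint(p_l, p_r, R, T)
# Perturb the right image point slightly: the rays no longer intersect exactly.
P_noisy, gap_noisy = triangulate_midpoint(p_l, p_r + np.array([0.0, 1e-3, 0.0]), R, T)
```

The returned gap is the residual distance between the rays, which can serve as a rough quality measure for each reconstructed point.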
  • Slide 57
  • Reconstruction up to a Scale Factor
      • Assumption and problem statement
          • Only the intrinsic parameters and at least 8 point correspondences are given
          • Compute the 3-D locations from their projections p_l and p_r, as well as the extrinsic parameters
      • Solution
          • Compute the essential matrix E from at least 8 correspondences
          • Estimate T (up to a scale and a sign) from E (= RS) using the orthogonality constraint on R, and then estimate R
              • This yields four different estimates of the pair (T, R)
          • Reconstruct the depth of each point, and pick the correct signs of R and T
          • Result: reconstructed 3D points (up to a common scale)
          • The scale can be determined if the distance between two points in space is known
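The sketch below illustrates the four-candidate ambiguity and the depth test. It swaps in the widely used SVD factorization of E rather than the orthogonality-based derivation the slide refers to, and it assumes the convention E = R S with P_r = R (P_l - T). T is recovered only as a unit vector, so depths come out in units of the baseline. Function names and numeric values are hypothetical.

```python
import numpy as np

def decompose_essential(E):
    """Four candidate (R, T) pairs with |T| = 1, via the SVD of E."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:        # keep proper rotations; E is defined up to sign
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    T_dir = Vt[2]                   # right null vector of E, since S T = 0
    rotations = [U @ W @ Vt, U @ W.T @ Vt]
    return [(R, s * T_dir) for R in rotations for s in (1.0, -1.0)]

def depths(p_l, p_r, R, T):
    """Depths (z_l, z_r) solving z_l * p_l = T + z_r * R^T p_r."""
    A = np.column_stack([p_l, -(R.T @ p_r)])
    (z_l, z_r), *_ = np.linalg.lstsq(A, T, rcond=None)
    return z_l, z_r

def pick_solution(E, p_l, p_r):
    """Keep the candidate that puts the point in front of both cameras."""
    for R, T in decompose_essential(E):
        z_l, z_r = depths(p_l, p_r, R, T)
        if z_l > 0 and z_r > 0:
            return R, T
    return None

# Hypothetical ground truth used to synthesize E and one correspondence.
angle = np.deg2rad(5.0)
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
T_true = np.array([0.5, 0.0, 0.02])
S = np.array([[0.0, -T_true[2], T_true[1]],
              [T_true[2], 0.0, -T_true[0]],
              [-T_true[1], T_true[0], 0.0]])
E = R_true @ S

P_l = np.array([0.3, -0.2, 4.0])
P_r = R_true @ (P_l - T_true)
p_l, p_r = P_l / P_l[2], P_r / P_r[2]

candidates = decompose_essential(E)
R_est, T_est = pick_solution(E, p_l, p_r)
```

In practice the depth test would be applied to all correspondences and the candidate with the most positive depths kept, since noise can flip the sign for individual points.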
  • Slide 58
  • Reconstruction up to a Projective Transformation
      • Assumption and problem statement
          • Only n (>= 8) point correspondences are given
          • Compute the 3-D locations from their projections p_l and p_r
      • Solution
          • Compute the fundamental matrix F from at least 8 correspondences, and the two epipoles
          • Determine the projection matrices
              • Select five points (from the correspondence pairs) as the projective basis
          • Compute the projective reconstruction
              • Unique up to the unknown projective transformation fixed by the choice of the five points
      • (* not required for this course; needs advanced knowledge of projective geometry)
  • Slide 59
  • Summary
      • Fundamental concepts and problems of stereo
      • Epipolar geometry and stereo rectification
      • Estimation of the fundamental matrix from 8 point pairs
      • Correspondence problem and two techniques: correlation-based and feature-based matching
      • Reconstruction of 3-D structure from image correspondences, given
          • Fully calibrated stereo cameras
          • Partially calibrated stereo cameras
          • Uncalibrated stereo cameras (*)
  • Slide 60
  • Next
      • Understanding 3D structure and events from motion (Motion)
      • Homework #4 online, due on May 03 before class