stereo vision

Page 1: Stereo Vision

December 5, 2013 Computer Vision Lecture 20: Hidden Markov Models/Depth

Stereo Vision

Due to the limited resolution of images, increasing the baseline distance b gives us a more precise estimate of depth z. However, the greater b, the more different are the two viewing angles, and the more difficult it can become to determine the correspondence between the two images. This brings us to the main problem in stereo vision: How can we find the conjugate pairs in our stereo images? This problem is called stereo matching.
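
To illustrate why a larger baseline helps, here is a minimal sketch that assumes the standard parallel-axis stereo model z = f * b / d, with focal length f and disparity d both measured in pixels. The formula, the function names, and the numbers are assumptions for illustration and are not taken from the slide. If the disparity is only known to the nearest pixel, the resulting depth step shrinks as b grows.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth z = f * b / d for parallel-axis stereo (assumed camera model)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_step(focal_px, baseline_m, depth_m):
    """Depth change caused by a one-pixel disparity error near depth_m."""
    exact_disparity = focal_px * baseline_m / depth_m
    return depth_m - depth_from_disparity(focal_px, baseline_m, exact_disparity + 1.0)


if __name__ == "__main__":
    # Hypothetical camera: f = 500 px, object at 5 m, three baselines.
    for b in (0.1, 0.2, 0.4):
        step = depth_step(500.0, b, 5.0)
        print(f"b = {b:.1f} m -> one-pixel depth step at 5 m: {step:.3f} m")
```

The depth step roughly halves each time the baseline doubles, which is the precision gain the slide refers to.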

Page 2: Stereo Vision

Stereo Matching

In stereo matching, we have to solve a problem that is still under investigation by many researchers, called the correspondence problem. It can be phrased like this: For each point in the left image, find the corresponding point in the right image. The idea underlying all stereo matching algorithms is that these two points should be similar to each other. So we need a measure for similarity. Moreover, we need to find matchable features.

Page 3: Stereo Vision

Stereo Matching

A straightforward approach to stereo matching uses pyramids, i.e., representations of the two camera images at various resolutions. The low-resolution versions of two corresponding rows are used to determine the “rough” matching, i.e., large patterns that match between the images. The precise disparity is determined in the high-resolution images. As a measure of similarity, we could simply use pieces of a row in the left image as convolution filters and apply them to the corresponding row in the right image.
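
The coarse-to-fine idea can be sketched for a single pair of rows as follows. This is a minimal sketch under stated assumptions: it uses only two pyramid levels, and it swaps the filter-response measure mentioned on the slide for a sum of squared differences (SSD), which is easier to read in a short example. All function names are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_match(patch, row, lo, hi):
    """Start index within row[lo..hi] where patch has the smallest SSD."""
    lo = max(lo, 0)
    hi = min(hi, len(row) - len(patch))
    return min(range(lo, hi + 1),
               key=lambda s: ssd(patch, row[s:s + len(patch)]))


def downsample(row):
    """Halve the resolution by averaging neighbouring pixel pairs."""
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row) - 1, 2)]


def coarse_to_fine_disparity(left, right, start, width):
    """Disparity of left[start:start + width] within the right row."""
    if start % 2 or width % 2:
        raise ValueError("this sketch expects an even start and width")
    # Coarse level: search the whole half-resolution row.
    left2, right2 = downsample(left), downsample(right)
    coarse = best_match(left2[start // 2:(start + width) // 2],
                        right2, 0, len(right2))
    # Fine level: search only a small window around the coarse match.
    fine = best_match(left[start:start + width], right,
                      2 * coarse - 2, 2 * coarse + 2)
    return start - fine


if __name__ == "__main__":
    left = [0, 0, 0, 0, 0, 0, 9, 7, 5, 3, 0, 0, 0, 0, 0, 0]
    right = [0, 0, 9, 7, 5, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    print(coarse_to_fine_disparity(left, right, start=6, width=4))
```

The full-resolution search only visits five candidate positions, which is the saving the pyramid provides.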

Page 4: Stereo Vision

Generating Interesting Points

Using interpolation to determine the depth of points within large homogeneous areas may cause large errors. To overcome this problem, we can generate additional interesting points that can be matched between the two images. The idea is to use structured light, i.e., project a pattern of light onto the visual scene. This creates additional variance in the brightness of pixels and increases the number of interesting points.

Page 5: Stereo Vision

Generating Interesting Points

Page 6: Stereo Vision

Shape from Shading

Besides binocular disparity, there are many different ways of depth estimation based on monocular information. For example, if we know the reflective properties of the surfaces in our scene and the position of the light source, we can use shape from shading techniques: Basically, since the amount of light reflected by a surface depends on its angle towards the light source, we can estimate the orientation of surfaces based on their intensity. More sophisticated methods also use the contours of shadows cast by objects to estimate the shape and orientation of those objects.

Page 7: Stereo Vision

Photometric Stereo

To improve the coarse estimates of orientation derived from shape from shading methods, we can use photometric stereo. This technique uses three light sources that are located at different, known positions. Three images are taken, one for each light source, with the other two light sources being turned off. This way we determine three different intensities for each surface in the scene. These three values put narrow constraints on the possible orientation of a surface and allow a rather precise estimation.
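
How three intensities constrain the orientation can be sketched under two assumptions that the slide does not state: the surface is Lambertian (intensity = albedo * dot(light direction, normal)), and the lights are far enough away that each has a single direction. The three measurements then form a 3x3 linear system, solved here with Cramer's rule. Function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def surface_normal(light_dirs, intensities):
    """Return (unit normal, albedo) for one surface point.

    light_dirs: three unit vectors towards the lights (rows of a 3x3 matrix).
    intensities: the three brightness values measured under those lights.
    """
    d = det3(light_dirs)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    scaled = []                      # albedo * normal, one component per column
    for col in range(3):
        m = [row[:] for row in light_dirs]
        for r in range(3):
            m[r][col] = intensities[r]   # Cramer's rule: swap in the measurements
        scaled.append(det3(m) / d)
    albedo = math.sqrt(sum(c * c for c in scaled))
    if albedo == 0.0:
        raise ValueError("surface reflects no light; orientation is undefined")
    return [c / albedo for c in scaled], albedo


if __name__ == "__main__":
    lights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    normal, albedo = surface_normal(lights, [0.3, 0.0, 0.4])
    print(normal, albedo)
```

With noisy measurements or more than three lights, a least-squares solution would be the usual replacement for the exact solve.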

Page 8: Stereo Vision

Shape from Texture

As we discussed before, the texture gradient gives us valuable monocular depth information. At any point in the image showing texture, the texture gradient is a two-dimensional vector pointing towards the steepest increase in the size of texture elements. The texture gradient across a surface allows a good estimate of the spatial orientation of that surface. Of course, it is important for this technique that the image has high resolution and a precise method of texture size (granularity) measurement is used.

Page 9: Stereo Vision

Shape from Motion

The shape from motion technique is similar to binocular stereo, but it uses only one camera. This camera is moved while it takes images of the visual scene. This way two images with a specific baseline distance can be obtained, and depth can be computed just like for binocular stereo. We can even use more than two images in this computation to get more robust measurements of depth. The disadvantages of shape from motion techniques are the technical overhead for moving the camera and the reduced temporal resolution.

Page 10: Stereo Vision

Range Imaging

If we can determine the depth of every pixel in an image, we can make this information available in the form of a range image. A range image has exactly the same size and number of pixels as the original image. However, each pixel does not specify color or intensity, but the depth of that pixel in the original image, encoded as grayscale. Usually, the brighter a pixel in a range image is, the closer the corresponding pixel is to the observer. By providing both the original and the range image, we basically define a 3D image.
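
A range image in the "brighter is closer" convention could be produced from a grid of depth values roughly like this. The linear mapping to 8-bit gray values and the near/far clipping limits are assumptions made for the sketch; the slide does not prescribe a particular encoding.

```python
def to_range_image(depth, near, far):
    """Map a 2D grid of depths to gray values: near -> 255, far -> 0."""
    if far <= near:
        raise ValueError("far must be greater than near")
    image = []
    for row in depth:
        gray_row = []
        for z in row:
            z = min(max(z, near), far)          # clamp to the encoded range
            gray_row.append(round(255 * (far - z) / (far - near)))
        image.append(gray_row)
    return image


if __name__ == "__main__":
    depth_m = [[1.0, 2.0],
               [5.0, 9.0]]                      # hypothetical depths in metres
    print(to_range_image(depth_m, near=1.0, far=5.0))
```

The output has the same number of rows and columns as the input, matching the slide's point that a range image has the same size as the original image.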

Page 11: Stereo Vision

Sample Range Image of a Mug

Page 12: Stereo Vision

Range Imaging Through Triangulation

How can we obtain precise depth information for every pixel in the image of a scene? One precise but slow method uses a laser that can rotate around its vertical axis and can also assume different vertical positions. This laser systematically and sequentially illuminates points in the image. A scene camera determines the position of every single point in its picture. The trick is that this camera looks at the scene from a different direction than does the laser pointer. Therefore, the depth of every point can be easily and precisely determined through triangulation:

Page 13: Stereo Vision

Range Imaging Through Triangulation
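
The triangulation step can be sketched as follows, assuming a simplified planar setup that is not spelled out in the transcript: laser and camera sit at the two ends of a baseline of known length, and each reports the angle between the baseline and its ray to the illuminated point. The depth is then the height of the resulting triangle above the baseline. Names and geometry conventions are assumptions.

```python
import math


def triangulate_depth(baseline, laser_angle, camera_angle):
    """Perpendicular distance of the lit point from the laser-camera baseline.

    Both angles are in radians and are measured between the baseline and
    the respective ray (interior angles of the triangle).
    """
    if laser_angle <= 0.0 or camera_angle <= 0.0:
        raise ValueError("angles must be positive")
    apex = laser_angle + camera_angle
    if apex >= math.pi:
        raise ValueError("rays do not intersect in front of the baseline")
    # Law of sines gives the ray length; its sine component is the depth.
    return (baseline * math.sin(laser_angle) * math.sin(camera_angle)
            / math.sin(apex))


if __name__ == "__main__":
    z = triangulate_depth(2.0, math.radians(45.0), math.radians(45.0))
    print(f"depth: {z:.3f}")
```

Small errors in either angle matter more when the two rays are nearly parallel, which is one reason the camera should view the scene from a clearly different direction than the laser.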

Page 14: Stereo Vision

Range Imaging Through Triangulation

Obviously, this is a very slow process and not suitable for dynamic scenes. To speed things up, we can use a laser that projects a vertical line of light onto the scene. This laser rotates around its vertical axis and thereby moves the vertical line of light across the scene. Since only the horizontal positions of points vary and give us depth information, the vertical order of points is preserved. This allows us to compute the depth of each point along the vertical line without ambiguities.

Page 15: Stereo Vision

Range Imaging Through Triangulation

Page 16: Stereo Vision

Range Imaging Through Triangulation

Although this method is faster, it still requires a complete horizontal scan before a depth image is complete. Maybe we should use a pattern of many vertical lines that only needs to be shifted by the distance between neighboring lines? The disadvantage of this idea is that we could confuse points in different vertical lines, i.e., associate points with incorrect projection angles. However, we can overcome this problem by taking multiple images of the same scene with the pattern in the same position. In each picture, a different subset of lines is projected.

Page 17: Stereo Vision

Range Imaging Through Triangulation

Then each line can be uniquely identified by its pattern of presence/absence across the images. For example, for 7 vertical lines we need a series of 3 images to do this encoding:

          Line #1  Line #2  Line #3  Line #4  Line #5  Line #6  Line #7
Image a   off      off      off      on       on       on       on
Image b   off      on       on       off      off      on       on
Image c   on       off      on       off      on       off      on

With this technique we can encode up to (n – 1) lines using log2(n) images, where n is a power of two; the all-off pattern is not used because such a line would never be visible. Therefore, this method is more efficient than the single-line scanning technique.
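
The on/off table is simply the binary representation of the line number, with the first image carrying the most significant bit. A minimal sketch of generating the projection schedule and decoding an observed pattern (function names are hypothetical):

```python
def projection_schedule(num_images):
    """For each image, list which of the 2**num_images - 1 lines are on."""
    num_lines = 2 ** num_images - 1
    schedule = []
    for img in range(num_images):
        bit = num_images - 1 - img          # first image = most significant bit
        schedule.append([bool((line >> bit) & 1)
                         for line in range(1, num_lines + 1)])
    return schedule


def decode_line(observed):
    """Recover the line number from its on/off observations, image by image."""
    line = 0
    for is_on in observed:
        line = line * 2 + (1 if is_on else 0)
    return line


if __name__ == "__main__":
    for name, row in zip("abc", projection_schedule(3)):
        print("Image", name, " ".join("on " if on else "off" for on in row))
    # A point lit in images a and c but dark in image b lies on line 5.
    print(decode_line([True, False, True]))
```

A point that is dark in every image decodes to 0, which is why that pattern cannot be assigned to a real line.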

Page 18: Stereo Vision

Range Imaging Through Triangulation

Page 19: Stereo Vision

Now we will talk about… Motion Analysis

Page 20: Stereo Vision

Motion analysis

Motion analysis deals with three main groups of motion-related problems:
• Motion detection.
• Moving object detection and location.
• Derivation of 3D object properties.

Motion analysis and object tracking combine two separate but inter-related components:
• Localization and representation of the object of interest (target).
• Trajectory filtering and data association.

One or the other may be more important based on the nature of the motion application.

Page 21: Stereo Vision

Motion analysis

Page 22: Stereo Vision

Differential Motion Analysis

A simple method for motion detection is the subtraction of two or more images in a given image sequence. Usually, this method results in a difference image d(i, j), in which non-zero values indicate areas with motion. For given images f1 and f2 and a threshold ε, d(i, j) can be computed as follows:

$$
d(i, j) =
\begin{cases}
0 & \text{if } |f_1(i, j) - f_2(i, j)| \le \varepsilon \\
1 & \text{otherwise}
\end{cases}
$$
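
A direct transcription of the difference image into code might look like the sketch below. It assumes grayscale images stored as lists of rows and a comparison against a threshold, called eps here; the function name is hypothetical.

```python
def difference_image(f1, f2, eps):
    """Binary difference image: 1 where |f1 - f2| exceeds eps, else 0."""
    if len(f1) != len(f2) or any(len(a) != len(b) for a, b in zip(f1, f2)):
        raise ValueError("images must have the same size")
    return [[0 if abs(a - b) <= eps else 1 for a, b in zip(row1, row2)]
            for row1, row2 in zip(f1, f2)]


if __name__ == "__main__":
    frame1 = [[10, 10], [200, 50]]
    frame2 = [[12, 90], [200, 20]]
    print(difference_image(frame1, frame2, eps=25))
```

Raising eps suppresses more sensor noise but also hides slow or low-contrast motion.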

Page 23: Stereo Vision

Differential Motion Analysis

Page 24: Stereo Vision

Difference Pictures

Another example of a difference picture that indicates the motion of objects (threshold ε = 25).

Page 25: Stereo Vision

Difference Pictures

Applying a size filter (size 10) to remove noise from a difference picture.
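
The transcript does not define the size filter. One common reading, assumed here, is that connected regions of fewer than a given number of pixels are treated as noise and erased. The sketch below uses 4-connectivity; both that choice and the function name are assumptions.

```python
def size_filter(binary, min_size):
    """Keep only 4-connected regions of 1-pixels with at least min_size pixels."""
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not binary[r0][c0] or seen[r0][c0]:
                continue
            # Flood-fill one region, collecting all of its pixels.
            region, stack = [], [(r0, c0)]
            seen[r0][c0] = True
            while stack:
                r, c = stack.pop()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and binary[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            if len(region) >= min_size:         # large enough: copy to the output
                for r, c in region:
                    out[r][c] = 1
    return out


if __name__ == "__main__":
    noisy = [[1, 1, 0, 0],
             [1, 1, 0, 1],
             [0, 0, 0, 0]]
    print(size_filter(noisy, min_size=3))
```

In the small example the isolated pixel on the right is removed while the 2x2 block survives.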

Page 26: Stereo Vision

Difference Pictures

The differential method can rather easily be “tricked”. Here, the indicated changes were induced by changes in the illumination instead of object or camera motion (again, threshold ε = 25).

Page 27: Stereo Vision

Differential Motion Analysis

In order to determine the direction of motion, we can compute the cumulative difference image for a sequence f1, …, fn of more than two images:

$$
d_{cum}(i, j) = \sum_{k=2}^{n} a_k \, |f_1(i, j) - f_k(i, j)|
$$

Here, f1 is used as the reference image, and the weight coefficients ak can be used to give greater weight to more recent frames and thereby highlight the current object positions.

Page 28: Stereo Vision

Cumulative Difference Image

$$
d_{cum}(i, j) = \sum_{k=2}^{n} a_k \, |f_1(i, j) - f_k(i, j)|
$$

Example: Sequence of 4 images:

f1         f2         f3         f4
0 1 1 0    0 0 0 0    0 0 0 0    0 0 0 0
0 1 1 0    0 1 1 0    0 0 0 0    0 0 0 0
0 1 1 0    0 1 1 0    0 1 1 0    0 0 0 0
0 0 0 0    0 1 1 0    0 1 1 0    0 1 1 0
           a2 = 1     a3 = 2     a4 = 4

Result:
0 7 7 0
0 6 6 0
0 4 4 0
0 7 7 0
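
The example can be checked with a short sketch of the cumulative difference image. It assumes the first frame is the reference and that one weight is supplied for each later frame (a2, a3, a4 in the example); the four frames below are the example's images as read from the slide, and the function name is hypothetical.

```python
def cumulative_difference(frames, weights):
    """Weighted sum of absolute differences between frame 1 and each later frame.

    frames:  [f1, f2, ..., fn], each a list of rows; f1 is the reference.
    weights: [a2, ..., an], one weight per non-reference frame.
    """
    if len(weights) != len(frames) - 1:
        raise ValueError("need exactly one weight per non-reference frame")
    reference = frames[0]
    d_cum = [[0] * len(reference[0]) for _ in reference]
    for a_k, f_k in zip(weights, frames[1:]):
        for i, row in enumerate(f_k):
            for j, value in enumerate(row):
                d_cum[i][j] += a_k * abs(reference[i][j] - value)
    return d_cum


if __name__ == "__main__":
    f1 = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
    f2 = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
    f3 = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 1, 0]]
    f4 = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 1, 1, 0]]
    for row in cumulative_difference([f1, f2, f3, f4], [1, 2, 4]):
        print(*row)
```

Because the weights are powers of two, each result value also records which frames differed from the reference at that pixel: 6 = 2 + 4 means frames 3 and 4 differed but frame 2 did not.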

Page 29: Stereo Vision

Differential Motion Analysis

Generally speaking, while differential motion analysis is well-suited for motion detection, it is not ideal for the analysis of motion characteristics.