Stereo Vision Based Object Detection



M.E. (Electronics & Telecommunication) Stereo Vision based Object Detection

ABSTRACT

This report presents an analysis of color-stereo approaches to object detection. Stereo vision has many applications, such as pedestrian detection, 3D face detection, automated systems, robotics, and aerial surveys. A stereo view of a scene is generally obtained using two cameras, in analogy with the human visual system: just as two eyes allow humans to judge depth, disparity maps computed from two camera images allow the distance to an object to be estimated. From the distance of objects to the camera, a 3D view of the scene can be reconstructed in a computer. This capability is especially valuable in automation, where robots need an understanding of their 3D environment. Different approaches and techniques for analyzing stereo views with the help of disparity maps derived from triangulation geometry are discussed. From the disparity maps, the v-disparity and u-disparity images capture the vertical and horizontal alignment of disparities, from which object locations can be fixed. Disparity maps are generated from the two camera images by dense stereo matching, using the correspondence-matching algorithm given by Konolige. Object detection then proceeds by candidate-bounding-box generation from the disparity images, followed by a final stage of candidate filtering and merging. This approach is reported to be very accurate for object detection.

SKNCOE – Electronics & Telecommunication Engineering – 2010


CHAPTER 1

INTRODUCTION

1.1 BACKGROUND

Computing speed and storage capacity have recently improved spectacularly, enabling advanced image processing at video rate. Many video applications have therefore been widely studied, and robot vision has become one of the most active topics [1]. With an aging society and fewer children, nursing-care robots are being developed by several companies. To realize advanced autonomous robot control in such complex living environments, visual information plays an important role. In particular, recognition of moving objects is essential for avoiding collisions and recognizing human gestures.

There are many methods for detecting moving objects from a camera [2], but most of them do not simultaneously extract the shape of the moving objects. As described in the following section, some systems have already been developed to detect moving objects from a camera; however, they either restrict how object locations can be detected or tend to be expensive because they employ multiple sensing elements. If the recognition system can be realized using only camera images, it can be mounted on various systems easily. Therefore, this seminar presents a method for not only detecting but also extracting moving objects using only a stereo camera.

1.2 RELEVANCE

There are several approaches to object detection. Using only one camera it is possible to detect a moving object, but the problem arises in determining how far the object is from the camera: object distance cannot be obtained with this approach [3]. It may also fail for long-distance object detection or for locating the exact position of an object.

To detect moving objects, background subtraction and inter-frame subtraction are well-known methods. Background subtraction computes the difference between a pre-taken background image and an input image; with this method, moving objects can be detected almost completely. Inter-frame subtraction computes the difference between a previous image and the input image, and is more adaptive to dynamic changes of the environment than background subtraction [6].


A problem arises, however, when an object is not moving but is near the camera: the above approaches fail to detect it. This motivates the stereo approach, which is discussed in the following chapters.

1.3 ORGANIZATION OF SEMINAR REPORT

Our aim is to detect objects using stereo cameras with the help of disparity maps. This is done with the following system modules [7].

1. Dense-Stereo Matching: The first step is to perform dense stereo matching to yield disparity estimates of the imaged scene. The correspondence-matching algorithm by Konolige [4] is used for stereo matching.

2. u- and v-Disparity Image Generation: The u- and v-disparity images are histograms that bin the disparity values d for each column or row of the image, respectively. They are used to locate objects in the next stage: the v-disparity histogram indicates the density of disparities for each image row v, whereas the u-disparity image shows the density of disparities for each image column u.

3. Object-Bounding-Box Generation: Objects are extracted from regions of interest (ROIs) in the u- and v-disparity images. The ROIs in the u-disparity image are extracted by scanning the rows of the image for continuous spans where the histogram value exceeds a given threshold.

4. Object Filtering and Merging: Overlapping bounding-box objects are merged if their overlap is significant and the disparities associated with the bounding boxes are close.

5. Object Extraction: This stage extracts object information such as shape, direction, and path.

1.4 SUMMARY

This chapter has introduced the stereo approach to object detection and discussed how it gives more accuracy and information about objects than single-camera approaches.


CHAPTER 2

LITERATURE SURVEY

2.1 INTRODUCTION

There are several approaches to object detection. Using only one camera it is possible to detect a moving object, but the problem arises in determining how far the object is from the camera: object distance cannot be extracted with this approach [3]. It may also fail for long-distance object detection or for locating the exact position of an object.

Some approaches fail when an object is not moving but is near the camera. The stereo approach is therefore commonly used for object detection, as it gives basic information about an object's location in the 3D environment. Many systems also combine the stereo approach with other techniques to improve both object detection and object extraction.

2.2 LITERATURE SURVEY

There are many approaches to object detection. The basic ones are explained below.

To detect moving objects, background subtraction and inter-frame subtraction are well-known methods [6]. Background subtraction computes the difference between a pre-taken background image and an input image and can detect moving objects almost completely. Inter-frame subtraction computes the difference between a previous image and the input image and is more adaptive to dynamic changes of the environment than background subtraction. However, when the camera itself moves, it is difficult to detect moving objects with these methods because there is little distinction between the background and the moving objects.
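As a minimal NumPy illustration of the two subtraction schemes (a sketch only, not the cited systems; the threshold value of 25 is an arbitrary assumption):

```python
import numpy as np

def background_subtraction(background, frame, threshold=25):
    """Mask of pixels that differ from a pre-taken background image."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

def interframe_subtraction(prev_frame, frame, threshold=25):
    """Mask of pixels that changed since the previous frame."""
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Tiny grayscale example: a bright "object" enters an otherwise static scene.
bg = np.zeros((4, 4), dtype=np.uint8)
cur = bg.copy()
cur[1:3, 1:3] = 200
mask = background_subtraction(bg, cur)
print(mask.sum())  # 4 changed pixels
```

Note that both functions fail exactly as described above when the camera itself moves, since then almost every pixel changes between frames.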


In such cases, optical flow becomes the key to extracting moving objects. Frazier proposes a method using complex logarithmic mapping, and Takeda proposes a method using the residual error of focus-of-expansion (FOE) estimation. In the first case, the computational cost is large because the mapping is generated every frame, and the camera motion is restricted to the parallel direction. In the latter case, the camera motion is not restricted, but it is difficult to calculate the FOE from optical flow.

Ogale proposes a method using 2-D optical flow, in which moving objects are classified into three classes. When objects move in a different direction from the camera, they can be detected using motion-based clustering. On the other hand, when an object moves in the same direction as the camera, optical flow alone is not sufficient to detect it, so ordinal depth conflict is needed as an additional constraint. When an object is occluded behind another object, the order of the objects can be observed; at the same time, the optical-flow lengths of background objects are inversely proportional to their distance from the camera. Therefore, if an ordinal depth conflict occurs, the moving object can be detected. However, to detect an independent moving object, i.e., one with no other object overlapping it, another source of information is said to be needed.

In reference [6], 3-D optical flow is employed to extract all moving objects in images from a freely moving stereo camera. First, successive left and right images are captured as input from the stereo camera. Second, feature points are obtained from three successive right images, along with the 3-D coordinates at these points from the stereo images. Third, feature points are related between the current and previous frames, and the camera motion is estimated by analyzing the change of world coordinates at the feature points. The successive images are then corrected for the estimated camera motion. Finally, moving objects are extracted using information from inter-frame subtraction, edges, and stereo. The process is shown in the diagram below.


Fig. 1: Compensation of Camera Motion and Object Detection

A problem still arises when an object is not moving but is near the camera: the above approach fails to detect it. Because of all the above problems, it is preferable to adopt a stereo approach for object detection.

A fundamental step in analyzing objects with stereo imagery is to detect obstacles in the scene and localize their position in 3D space from the disparity maps generated by stereo correspondence matching [7]. The disparity images derived from stereo analysis can be used to generate a list of object regions in the scene. We adapt a classical approach to obstacle detection in stereo imagery proposed by Labayrade et al. [5] that uses the concept of v-disparity to identify potential obstacles. Essentially, v-disparity is a histogram of the disparity image that counts the occurrence of disparity values for each row of the image; it can be used to detect the ground plane in the scene and isolate regions that contain obstacles. Variations of this approach to detecting objects in stereo imagery have been implemented. This paper, however, illustrates a generalized framework that obtains dense stereo correspondences and robust ground-plane estimates with both color and infrared stereo imagery. The technique consists of the following stages:

1. Dense-Stereo Matching

2. u- and v-Disparity Image Generation

3. Object-Bounding-Box Generation

4. Object Filtering & Merging

5. Object Extraction

One more approach is to match the stereo images using the Sum of Absolute Differences (SAD) correlation algorithm to establish correspondence between image features in the different views of the scene [8]. This is used to produce a stereo disparity image containing


information about the depth of objects away from the camera. A geometric projection algorithm is then used to generate a 3-dimensional (3D) point map, placing pixels of the disparity image in 3D space. This is then converted to a 2-dimensional (2D) depth map that allows objects in the scene to be viewed. The disparity mapping is produced by the SAD block-matching algorithm. This assistive technology, using stereoscopic cameras, is intended for automated obstacle detection, path planning and following, and collision avoidance during navigation.

Stereo Disparity

The purpose of stereo vision is to perform range measurements based on the left and right images obtained from stereoscopic cameras. Basically, an algorithm is implemented to establish the correspondence between image features in the different views of the scene and then calculate the relative displacement between feature coordinates in each image. To produce a disparity image, the SAD correlation algorithm built into the PGR software is used to compare a neighborhood in one stereo image to a number of neighborhoods in the other stereo image [9]:

SAD(d) = Σ(i = -M..M) Σ(j = -M..M) | IL(x+i, y+j) - IR(x+i+d, y+j) |,  dmin ≤ d ≤ dmax

where a window of size (2M+1)×(2M+1), called a correlation mask, is centered at the coordinates of the matching pixels (i, j), (x, y) in one stereo image, IL and IR are the intensity functions of the left and right images, and dmin and dmax are the minimum and maximum disparities. A disparity of dmin (zero pixels) often corresponds to an infinitely far away object, while the disparity dmax denotes the closest position of an object. Through trials comparing range and matching accuracy, the disparity range was tuned to cover distances between 0.5 m and 4.75 m, which provided adequate mapping accuracy and distance for responding to obstacles.

3D Point Map Generation

Once a disparity image is produced from the processed left and right camera images, a 3D point map can be created that projects the depth-determined pixels of the disparity image into 3D space. The PGRView software then allows the 3D point cloud to be rotated and observed from different viewpoints. This is a very useful feature for determining where there was noise and


which calibration settings improved the 3D point cloud for the purposes of accurate obstacle

detection and depth analysis.

The result of the above experiment is shown in the diagrams below.
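The back-projection step can be sketched as follows (a minimal illustration, not the PGR software; the focal length, baseline, and principal point below are assumed calibration values):

```python
import numpy as np

def disparity_to_points(disparity, f, baseline, cx, cy):
    """Back-project each valid disparity pixel into 3D camera coordinates.

    Z = f * b / d (triangulation); X and Y follow from the pinhole model.
    f is the focal length in pixels, baseline in metres, (cx, cy) is the
    principal point."""
    v, u = np.nonzero(disparity > 0)          # valid (non-zero) pixels only
    d = disparity[v, u].astype(np.float64)
    Z = f * baseline / d
    X = (u - cx) * Z / f
    Y = (v - cy) * Z / f
    return np.stack([X, Y, Z], axis=1)        # N x 3 point map

# One pixel at the principal point with disparity 10:
pts = disparity_to_points(np.array([[0, 0], [0, 10]]), f=500.0,
                          baseline=0.1, cx=1.0, cy=1.0)
print(pts)   # [[0. 0. 5.]]  ->  Z = 500 * 0.1 / 10 = 5 m
```

A 2D depth map, as described above, is then just the Z component written back to the pixel grid.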


Another paper presents a method to solve the correspondence problem in matching stereo images using the Sum of Absolute Differences (SAD) algorithm [10]. The computer-vision application in that paper is an autonomous vehicle with a stereo pair mounted on top. The range of each detected object or obstacle is estimated using the curve fitting tool (cftool) provided by Matlab. The disparity mapping is again produced by the SAD block-matching algorithm.

A. Image Rectification

Rectification of stereo image pairs can be carried out under the condition of a calibrated camera. To search quickly and accurately for corresponding points along the scan lines, the stereo pairs are rectified so that corresponding epipolar lines are parallel to the horizontal scan lines and the difference in the vertical direction is zero. Image rectification is the undistortion according to the calibration parameters calculated during camera calibration; after all intrinsic and extrinsic camera parameters are calculated, they can be used to rectify images according to the epipolar constraint [4]. The rectification process is shown in Fig. 5. The process starts by acquiring the stereo images, after which the image-processing software, Matlab, enhances the images using histogram equalization. The next step is finding the matching points to be rectified, which poses a correspondence problem. The matched points and the camera calibration information are then used to reconstruct the stereo images into rectified images.

The following equation is used to rectify the images in Matlab:

Inew(x0, y0) = a1·Iold(x1, y1) + a2·Iold(x2, y2) + a3·Iold(x3, y3) + a4·Iold(x4, y4)

Fig. 5 Rectification Process

SKNCOE – Electronics & Telecommunication Engineering – 2010

9

Page 10: Stereo Vision Based Object Detection

M.E. (Electronics & Telecommunication) Stereo Vision based Object Detection

Fig. 6 Original Image (a) and Image after Rectification (b)

Here Iold and Inew are the original and rectified images, and the blending coefficients ai are separate for each camera. Figure 6(a) and (b) show the image before and after rectification. The output size of the rectified stereo images is 320×240. The horizontal line across both images indicates that the left and right images are horizontally aligned, in contrast to Figure 6(a).
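The rectification equation above is a bilinear blend of four source pixels; a minimal sketch (illustrative only, not Matlab's implementation; the sample image and coordinates are made-up values):

```python
import numpy as np

def bilinear_sample(I, x, y):
    """Evaluate Inew(x0, y0) = a1*Iold(x1,y1) + ... + a4*Iold(x4,y4):
    blend the four pixels around the (generally non-integer) source
    coordinate produced by the rectifying warp."""
    x1, y1 = int(np.floor(x)), int(np.floor(y))
    fx, fy = x - x1, y - y1
    # Blending coefficients a1..a4 from the fractional position.
    a = [(1 - fx) * (1 - fy), fx * (1 - fy), (1 - fx) * fy, fx * fy]
    pts = [(y1, x1), (y1, x1 + 1), (y1 + 1, x1), (y1 + 1, x1 + 1)]
    return sum(ai * I[py, px] for ai, (py, px) in zip(a, pts))

I = np.array([[0., 10.], [20., 30.]])
print(bilinear_sample(I, 0.5, 0.5))   # 15.0, the average of the four corners
```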

B. Stereo Correspondence

Assume from now on that the images are rectified; that is, the epipolar lines are parallel to the rows of the images. In the rectified images, denote the horizontal coordinate by x and the vertical coordinate by y. With this geometry, given a pixel at coordinate xl in the left image, the problem of stereo matching is to find the coordinate xr of the corresponding pixel in the same row of the right image. The difference d = xr - xl is called the disparity at that pixel. The basic matching approach is to take a window W centered at the left pixel, translate that window by d, and compare the intensity values in W in the left image with those of the translated window in the right image. The comparison metric typically has the form:


SAD(d) = Σ(x, y)∈W | Il(x, y) - Ir(x+d, y) |

The SAD function measures the difference between pixel values. The disparity is computed at every pixel in the image and for every possible disparity: for each pixel in the left image, the absolute differences between the intensities in its surrounding neighborhood and those of the corresponding neighborhood in the right image are summed. The candidate along the row in the right image with the minimum SAD is chosen as the best matching pixel, and the disparity is then the actual horizontal pixel difference. The output is a disparity image, which can be interpreted as the inverse of depth (larger disparity for points closer to the cameras).

Fig. 7 SAD Block Matching Process

To calculate the stereo correspondence of stereo images, there are some simple standard algorithms using block matching and matching criteria. The blocks are usually defined on an epipolar line for ease of matching. Each block from the left image is matched to a block in the right image by shifting the left block over the search area of pixels in the right image, as shown in the figure above. At each shift, the sum of a comparison parameter such as the intensity or color of the two blocks is computed and saved; this sum is called the "match strength". The shift giving the best result under the matching criterion is taken as the best match or correspondence.

The SAD algorithm works in the same way: each block from the left image is matched to a block in the right image by shifting the left block over the search area of pixels in the right image. Ideally, for every pixel mask within the original image there should be a single mask within the second image that is nearly identical to the original, so the SAD for this comparison should be zero.
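The block-matching procedure described above can be sketched directly (an unoptimized winner-take-all illustration using the report's Ir(x+d) convention; the synthetic image pair is an assumption for demonstration):

```python
import numpy as np

def sad_disparity(left, right, max_disp=16, M=1):
    """Winner-take-all SAD block matching on rectified grayscale images.

    For each left-image pixel, a (2M+1)x(2M+1) window is compared against
    windows at column x+d in the same row of the right image; the d with
    the smallest SAD wins."""
    H, W = left.shape
    L, R = left.astype(np.int32), right.astype(np.int32)
    disp = np.zeros((H, W), dtype=np.int32)
    for y in range(M, H - M):
        for x in range(M, W - M):
            win = L[y - M:y + M + 1, x - M:x + M + 1]
            best_sad, best_d = None, 0
            for d in range(0, min(max_disp, W - 1 - M - x) + 1):
                cand = R[y - M:y + M + 1, x + d - M:x + d + M + 1]
                sad = int(np.abs(win - cand).sum())
                if best_sad is None or sad < best_sad:
                    best_sad, best_d = sad, d
            disp[y, x] = best_d
    return disp

# Synthetic rectified pair: the right image is the left image shifted by
# 2 px, so interior pixels should all match at d = 2.
left = (7 * np.arange(12)[None, :] + 3 * np.arange(8)[:, None]) % 23
right = np.roll(left, 2, axis=1)
disp = sad_disparity(left, right, max_disp=5, M=1)
print(disp[4, 1:9])   # [2 2 2 2 2 2 2 2]
```

The exact-shift test pair makes the minimum SAD zero at the true disparity, matching the "ideally the SAD should be zero" remark above.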


C. Disparity Mapping

Together with the stereo camera parameters from calibration, the disparity between corresponding stereo points allows distances in the stereo images to be retrieved. To find corresponding pairs of stereo points, candidates first have to be compared for different disparities, after which the best matching pairs can be determined. The maximum range at which stereo vision can be used for detecting obstacles depends on the image and depth resolution. Absolute differences of pixel intensities are used in the algorithm to compute stereo similarities between points: by computing the sum of the absolute differences for pixels in a window surrounding the points, the similarity values for stereo points can be compared. The disparity associated with the smallest SAD value is selected as the best match. The figure below shows the disparity mapping produced by the SAD block-matching algorithm.

D. Range Estimation using Curve Fitting Tool:

The range of an obstacle is estimated using the curve fitting tool in Matlab, which determines the range according to the pixel values. Each pixel in the disparity mapping is passed through the fitted curve; the horizontal coordinate refers to the left image.

Fig. 8 using Tsai’s method


The equation of the distance estimation is:

Range = a*exp(b*x) + c*exp(d*x)

• a = 0.339

• b = -3.525

• c= 0.9817

• d= -0.4048

where a, b, c, and d are constant values produced by the curve fitting tool, and x is the pixel value in the disparity mapping. The fitted curve is shown in the figure below: the x-axis represents the disparity value (pixel density) and the y-axis shows the distance (range) in meters for each disparity value.

Fig. 9 Curve Fitting tool Window
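The fitted model can be evaluated directly from the constants quoted above (this sketch reproduces only the model evaluation, not the cftool fitting itself):

```python
import math

# Constants reported from the curve fitting tool (cftool) in the text.
a, b, c, d = 0.339, -3.525, 0.9817, -0.4048

def estimated_range(x):
    """Range in metres for a disparity value x (pixel units),
    using the fitted two-term exponential model."""
    return a * math.exp(b * x) + c * math.exp(d * x)

print(round(estimated_range(0.0), 4))   # 1.3207 (a + c) at zero disparity
print(round(estimated_range(4.0), 4))   # a larger disparity gives a shorter range
```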

2.4 SUMMARY

This chapter discussed various approaches to detecting objects using stereo camera images. The first approach used background subtraction and inter-frame subtraction to detect moving objects. Another computed stereo disparity using the Sum of Absolute Differences (SAD) algorithm together with 3D point map generation. A further approach applied image rectification before the SAD algorithm.


CHAPTER 3

INTRODUCTION

After discussing the various approaches used in previous papers, the basic steps of the approach are listed below.

1. Dense-Stereo Matching

2. u- and v-Disparity Image Generation

3. Object-Bounding-Box Generation

4. Object Filtering & Merging

5. Object Extraction

In this chapter, each of the above steps is discussed in detail.

3.1 DENSE-STEREO MATCHING:

The first step is to perform dense stereo matching to yield disparity estimates of the imaged scene. The images from the stereo camera are processed (matched) together, and the disparity images are found with the help of various algorithms. The background behind stereo matching is triangulation geometry, explained below [12].

Triangulation Geometrics

The technique for gauging depth information given two offset images is called triangulation. Triangulation makes use of a number of variables: the center points of the cameras (C1, C2), the cameras' focal length (F), the angles (O1, O2), the image planes (IP1, IP2), and the image points (P1, P2). The following example shows how the triangulation technique works.


Fig. 10 Object Detection Algorithm


Fig. 11 Triangulation Geometrics

For any point P of some object in the real world, P1 and P2 are the pixel representations of P in the images IP1 and IP2 taken by cameras C1 and C2. F is the focal length of the camera (the distance between lens and film), and B is the offset distance between cameras C1 and C2. V1 and V2 are the horizontal placements of the pixel points with respect to the centers of the cameras. The disparity of the points P1 and P2 from image to image can be calculated by taking the difference of V1 and V2; this is equivalent to the horizontal shift of point P1 to P2 in the image planes. Using this disparity, one can calculate the actual distance of the point in the real world from the images. The following formula can be derived from the geometric relation above:

Distance of point in real world = (base offset × focal length) / disparity,  i.e.,  D = B·F / d

This formula gives the real-world distance of a point. If we are interested in the relative distance of points rather than their exact distance, even less information suffices: the base offset and focal length of the camera are the same for both images, so the distance of different points in the images varies solely with the disparity component. We can therefore gauge the relative distance of points in images without knowing the base offset and focal length.
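The formula D = B·F/d and the relative-distance observation can be checked numerically (the baseline and focal length below are assumed example values):

```python
def depth_from_disparity(baseline_m, focal_px, disparity_px):
    """Triangulation: D = B * F / d."""
    return baseline_m * focal_px / disparity_px

# With an assumed 10 cm baseline and 700 px focal length, a 35 px
# disparity places the point 2 m away; halving the disparity doubles
# the distance, illustrating that relative depth follows from disparity alone.
print(depth_from_disparity(0.10, 700.0, 35.0))   # 2.0
print(depth_from_disparity(0.10, 700.0, 17.5))   # 4.0
```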

Triangulation works under the assumption that points P1 and P2 represent the same point P in the real world, so an algorithm for matching these two points must be performed. This can be done by taking small regions in one image and comparing them to regions in the other image. Each comparison is given a score, and the best match is used in calculating the disparity. The technique for scoring region matches varies, but is usually based on the number of pixels


that are the same on an exact or near-exact point basis. Both the triangulation technique for stereo image matching and the technique for point matching within a region are successfully implemented in the "Cooperative Algorithm for Stereo Matching and Occlusion Detection".

One stereo-vision experiment gives the stereo-matching output shown in the figures below [11].

Fig. 12 Dense Stereo Matching Result

In the figure above, there is only a very small difference between the left and right camera images; it is this small difference that the algorithm computes. Here the correspondence-matching algorithm by Konolige [4] is selected for its ease of use and reliable disparity generation for both color-stereo and infrared-stereo imagery. There are two processes: one is low-pass filtering and calibration, the other is stereo processing. In the stereo processing, the matching is based on correlation, and the following equation is applied to obtain the disparity map [6]:


d(x, y) = arg min(dmin ≤ d ≤ dmax) Σ(i, j)∈m×m window | Ileft(x+i, y+j) - Iright(x+i+d, y+j) |

where dmax and dmin delimit the disparity range, m is the matching window size, Iright and Ileft are the right and left images, and (x, y) is the point of interest. Passing through these processes, we obtain the depth image. In this paper, dmax equals 40, dmin equals 0, and m equals 11.

3.2 U- AND V-DISPARITY IMAGE GENERATION:

The u- and v-disparity images are histograms that bin the disparity values d for each column or row of the image, respectively. The resulting v-disparity histogram indicates the density of disparities for each image row v, whereas the u-disparity image shows the density of disparities for each image column u. Fig. 14 shows examples of the u- and v-disparity images generated from the color-stereo disparity map in Fig. 13.

Fig. 13 Disparity Image


Fig. 14 U- & V-Disparity Image from Color-Stereo Images

Notice that the u-disparity image in Fig. 14 shows three distinct horizontal regions corresponding to the three pedestrians in the scene; these are the regions we wish to detect in order to build object areas. The region spanning the entire length at the top of the u-disparity image indicates the background plane and can be filtered out of processing. Similarly, the v-disparity image in Fig. 14 shows vertical peaks of high density for both the background plane and the range of disparities containing pedestrians; these regions also need to be detected to build objects. Additionally, the downward-sloping trend across the rows of the v-disparity image is exploited to estimate the ground plane in the scene [13].
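The u- and v-disparity histograms can be sketched as follows (a minimal illustration; the toy disparity map and bin count are assumed values):

```python
import numpy as np

def uv_disparity(disparity, num_bins):
    """Build u- and v-disparity histograms from an integer disparity map.

    u-disparity: for each image column u, a histogram over disparity bins.
    v-disparity: for each image row v, a histogram over disparity bins."""
    H, W = disparity.shape
    u_disp = np.zeros((num_bins, W), dtype=np.int32)   # rows index disparity
    v_disp = np.zeros((H, num_bins), dtype=np.int32)   # cols index disparity
    for v in range(H):
        for u in range(W):
            d = disparity[v, u]
            if 0 < d < num_bins:          # ignore invalid (zero) disparities
                u_disp[d, u] += 1
                v_disp[v, d] += 1
    return u_disp, v_disp

# A 4x4 scene with a "near object" (d = 3) occupying two columns:
dmap = np.zeros((4, 4), dtype=np.int32)
dmap[1:3, 1:3] = 3
u_disp, v_disp = uv_disparity(dmap, num_bins=8)
print(u_disp[3])      # [0 2 2 0]  -> the object spans columns 1-2
print(v_disp[:, 3])   # [0 2 2 0]  -> the object spans rows 1-2
```

An object at a roughly constant disparity shows up as a horizontal run in the u-disparity image and a vertical peak in the v-disparity image, exactly the patterns described above.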

3.3 OBJECT BOUNDING BOX GENERATION:

Object bounding boxes can be extracted from regions of interest (ROIs) in the u- and v-disparity images. ROIs in the u-disparity image are extracted by scanning the rows of the image for continuous spans where the histogram value exceeds a given threshold; Fig. 14(a) and (b) overlays the extracted regions on the u-disparity image. ROIs are extracted from the v-disparity image by selecting columns where the sum of the histogram values above the ground plane exceeds the threshold. Each such ROI spans from the ground plane to the highest point in the column that exceeds the threshold; Fig. 14(a) and (b) shows the extracted regions on the v-disparity image. Object bounding boxes are then selected from the ROIs in the u- and v-disparity images based on their disparity values: for a given disparity d, the widths of the bounding boxes are determined by the ROIs found in the u-disparity image, and the heights are derived from the ROIs in the v-disparity image. Large bounding boxes associated with background regions are filtered out, and the remaining objects are shown in Fig. 15.
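A minimal sketch of this scanning procedure is given below. This is hypothetical Python, not the paper's implementation; the threshold value and the minimum span width are assumed free parameters.

```python
import numpy as np

def u_disparity_rois(u_disp, thresh, min_width=3):
    """Scan each row d of the u-disparity image for continuous spans of
    columns whose histogram value exceeds thresh.
    Returns a list of (d, col_start, col_end) regions of interest."""
    rois = []
    n_cols = u_disp.shape[1]
    for d in range(u_disp.shape[0]):
        start = None
        for col in range(n_cols):
            if u_disp[d, col] > thresh:
                if start is None:
                    start = col           # a span begins here
            else:
                if start is not None and col - start >= min_width:
                    rois.append((d, start, col - 1))
                start = None
        if start is not None and n_cols - start >= min_width:
            rois.append((d, start, n_cols - 1))  # span reaching the image edge
    return rois

def bounding_boxes(u_rois, v_disp, thresh, ground_row=None):
    """For each u-disparity ROI at disparity d, take the vertical extent from
    column d of the v-disparity image: from the highest row exceeding thresh
    down to the ground row. Returns (x0, y0, x1, y1, d) boxes."""
    ground = v_disp.shape[0] - 1 if ground_row is None else ground_row
    boxes = []
    for d, c0, c1 in u_rois:
        rows = np.nonzero(v_disp[:, d] > thresh)[0]
        if rows.size:
            boxes.append((c0, int(rows.min()), c1, ground, d))
    return boxes
```

Passing the ground row estimated from the v-disparity image as `ground_row` anchors each box to the ground plane, as the text describes.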


Fig. 15 Bounding-box objects with color-stereo images.

3.4 OBJECT FILTERING AND MERGING:

As shown in Fig. 15, multiple overlapping object bounding boxes are often generated. This occurs because the disparities associated with a single object span a range of values, particularly as the object moves closer to the camera. We merge significantly overlapping objects if the disparities associated with their bounding boxes are close. The final object bounding boxes are shown in Fig. 16; notice how the overlapping candidates have been merged into the correct bounding boxes corresponding to the pedestrians (objects) in the scene.

Fig. 16 Final selection of objects after bounding-box merging with color-stereo images.
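One way to implement such a merge step is sketched below in Python. This is a hypothetical illustration; the overlap criterion (intersection-over-union) and the disparity tolerance are assumed parameters, not values taken from the report.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def merge_boxes(boxes, iou_thresh=0.5, disp_tol=2):
    """Repeatedly fuse pairs of boxes (x0, y0, x1, y1, d) that overlap
    significantly and whose disparities differ by at most disp_tol."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                a, b = boxes[i], boxes[j]
                if abs(a[4] - b[4]) <= disp_tol and iou(a[:4], b[:4]) >= iou_thresh:
                    # union of the two boxes; keep the larger (nearer) disparity
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]),
                                max(a[4], b[4]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes
```

The outer loop restarts after every fusion so that chains of overlapping candidates collapse into a single box per object.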


3.5 OBJECT EXTRACTION:

After an object has been detected, it is possible to extract further information about it, such as its shape, location, and direction of motion. By applying segmentation to the output, the shapes of the objects can be obtained, as shown in the figures below.

Fig. 17 Outlined foreground extraction for color images using color segmentation.

Further, it is possible to measure the distance of an object from the camera with the help of the disparity map. Since the disparity map encodes an object's distance in its brightness, we can determine whether an object is near or far away. This information is useful in applications such as pedestrian detection and tracking.
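From the triangulation geometry, depth is inversely proportional to disparity, Z = f·B/d. A small hypothetical helper makes this concrete; the focal length and baseline values in the example below are assumed for illustration, not taken from the report.

```python
def depth_from_disparity(d_pixels, focal_px, baseline_m):
    """Triangulation: depth Z = f * B / d, with the focal length f in pixels,
    the stereo baseline B in metres, and the disparity d in pixels.
    Returns None for an invalid (non-positive) disparity."""
    if d_pixels <= 0:
        return None
    return focal_px * baseline_m / d_pixels
```

For instance, with an assumed focal length of 800 px and a 12 cm baseline, a disparity of 64 px gives Z = 800 × 0.12 / 64 = 1.5 m; halving the disparity doubles the estimated distance, which is why bright (large-disparity) regions of the disparity map correspond to near objects.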

For shape extraction we can also use image background subtraction or inter-frame subtraction, as discussed earlier in reference [6].

This stereo approach can also be applied to face detection in a stereo environment [14]. With the help of disparity maps it is possible to create a 3D depth map of an object represented by a grid; from the grid map of a face, a 3D face can then be reconstructed, as shown in the diagram below.

Fig. 18 Creating a 3D face using disparity maps: (a) left frame of a stereo image, (b) reconstructed disparity map, (c) corresponding depth map on a 3D grid, (d) reconstructed 3D head.

It is also possible to create a 3D view of any object from its disparity map, but the map must be of very high quality, encoding a wide range of distances in its intensity values.


3.6 SUMMARY

This chapter has discussed one approach for detecting objects using stereo cameras with the help of disparity maps. It described how the disparity map is obtained from triangulation geometry. The disparity image is then divided into u- and v-disparity maps, which are histogram-like projections of the disparity map and help locate objects precisely in the scene. Bounding boxes are placed over objects through bounding-box generation followed by object filtering and merging. Finally, we have seen how an object's shape and distance can be calculated from the disparity maps, providing further information about the object.


CHAPTER 4

CONCLUSION

We have presented a technique for object detection with the help of stereo cameras. This report discussed various approaches to finding objects in the scene captured by the cameras. The disparity map was obtained with the correspondence-matching algorithm by Konolige [4]. Stereo vision makes it possible to achieve higher accuracy in object detection than previous approaches, so this technique not only detects the object but also provides 3D information about the object and about the scene as a whole. Results of this technique are reported in reference [7].

This stereo vision technique can therefore be used in applications such as automated systems, robotics, pedestrian detection for cars, 3D face detection [14], and aerial surveys for the calculation of contour maps, geometry extraction for 3D building mapping, or the calculation of 3D heliographic information such as that obtained by the NASA STEREO project.


CHAPTER 5

REFERENCES

[1] R. Cucchiara, C. Grana, A. Prati, G. Tardini, R. Vezzani, "Using computer vision techniques for dangerous situation detection in domotic applications," Proc. IDSS04, London, UK, pp. 1–5, Feb. 2004.

[2] R. Fablet, P. Bouthemy, M. Gelgon, "Moving object detection in color image sequences using region-level graph labeling," Proc. ICIP, 1999.

[3] A. Ess, B. Leibe, K. Schindler, L. van Gool, "Moving obstacle detection in highly dynamic scenes," IEEE International Conference on Robotics and Automation, 2009.

[4] K. Konolige, "Small vision systems: Hardware and implementation," Proc. 8th Int. Symp. Robot. Res., 1997, pp. 111–116.

[5] R. Labayrade, D. Aubert, J.-P. Tarel, "Real time obstacle detection in stereovision on non flat road geometry through 'v-disparity' representation," IEEE Intelligent Vehicles Symposium, 2002.

[6] M. Morimoto, Y. Mito, K. Fujii, "An object detection and extraction method using stereo camera," World Automation Congress (WAC), 2008.

[7] S. J. Krotosky, M. M. Trivedi, "On color-, infrared-, and multimodal-stereo approaches to pedestrian detection," IEEE Transactions on Intelligent Transportation Systems, vol. 8, no. 4, Dec. 2007.

[8] J. S. Nguyen, T. H. Nguyen, H. T. Nguyen, "Semi-autonomous wheelchair system using stereoscopic cameras," 31st Annual International Conference of the IEEE EMBS, Minneapolis, 2009.

[9] T. H. Nguyen, J. S. Nguyen, D. M. Pham, H. T. Nguyen, "Real-time obstacle detection for an autonomous wheelchair using stereoscopic cameras," 29th Annual International Conference of the IEEE EMBS, 2007.

[10] R. A. Hamzah, R. A. Rahim, Z. M. Noh, "Sum of absolute differences algorithm in stereo correspondence problem for stereo matching in computer vision application," IEEE, 2010.

[11] Z.-F. Wang, Z.-G. Zheng, "A region based stereo matching algorithm using cooperative optimization."

[12] http://disparity.wikidot.com/

[13] K. Konolige, "Small vision systems: Hardware and implementation," Proc. 8th Int. Symp. Robot. Res., 1997.

[14] S. Kosov, K. Scherbaum, K. Faber, T. Thormählen, H.-P. Seidel, "Rapid stereo-vision enhanced face detection," Proc. ICIP, 2009.
