ISSN: 2278 – 909X International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE)
Volume 5, Issue 3, March 2016
759
All Rights Reserved © 2016 IJARECE
“Density based traffic Management using DIP”
ROOPADEVI G HOSUR, M.TECH, DIGITAL ELECTRONICS, BITM, BALLARI
HEMANTHAKUMAR.K
ASS PROF ELECTRONICS AND COMMUNICATION
BITM BALLARI
CHAPTER 1
Abstract
This project develops a density-based dynamic traffic signal
system. As the world moves towards smart cities, one of the
major problems faced by all cities is vehicular road traffic
congestion. Traffic congestion causes many critical problems
in the most populated cities: people lose time, miss
opportunities, and get frustrated. A conventional traffic
light system allots a fixed time to each side of the
junction, and this time cannot be varied as the traffic
density varies, which is a major drawback.
This paper attempts to address the problem of traffic
congestion at traffic signals. Traffic monitoring is based on
the density of vehicles: the traffic control system is
improved by calculating the density of vehicles on the road.
Key words: Background modeling, foreground detection,
frame differencing, RGB model.
INTRODUCTION
The number of vehicles on the roads increases every day, so
the flow of traffic must be managed. Nowadays traffic
congestion is a considerable problem in fast-growing urban
areas. Population growth in large urban areas increases
vehicular density, leading to traffic jams and related
problems. These problems include increased commuting time,
higher transportation cost, delayed services and increased
fuel consumption.
Poor maintenance and control of traffic signals leads to
inefficient traffic flow. For example, consider two lanes,
one with many vehicles and the other with few. If both get
the same green-signal period, time is lost on the lightly
loaded lane. If instead the green-signal period is longer for
the lane with more vehicles, time is used more efficiently.
Older traffic control methods, which involve magnetic loop
detectors, infrared and radar sensors, give very little
information about traffic, so separate systems are needed for
traffic counting and surveillance. Even a cost-effective
solution such as the inductive loop detector, when used on
poor road surfaces, suffers a high failure rate, reduces
pavement life and obstructs traffic during maintenance.
During fog, infrared sensors are affected more than video
cameras and cannot be used for surveillance.
In manual traffic control, traffic personnel are stationed at
each junction. Although this approach is the most reliable,
because the traffic flow is controlled based on priority, it
raises health concerns. The present traffic control technique
is a timer-based automatic system: the traffic flow is
analyzed for a certain period of time and the resulting
timings are embedded in the traffic lights. It is not
reliable, because it cannot respond to the actual traffic
flow. For example, a lane with zero density still gets a
green signal, because the signals change only with the time
interval.
This paper attempts to address the problem of traffic
congestion at traffic signals. Traffic monitoring is based on
the density of vehicles: the traffic control system is
improved by calculating the density of vehicles on the road.
1.1 Problem Definition
With the world moving towards smart cities, a major problem
faced by all cities is vehicular road traffic congestion.
Traffic congestion causes many critical problems in the most
populated cities. The present traffic control technique is a
timer-based automatic system: the traffic flow is analyzed
for a certain period of time and the resulting timings are
embedded in the traffic lights. It is not reliable, because
it cannot respond to the actual traffic flow.
The solution to the above problem is traffic monitoring based
on the density of vehicles, which improves the traffic
control system by calculating the density of vehicles on the
road.
1.2 Proposed Methodologies
The methodology comprises image processing algorithms
implemented in MATLAB code: moving object detection from the
video, background subtraction, tracking, classification, and
counting the number of vehicles.
1.3 Objectives
The main objective is to detect the presence of vehicles on
each lane and to count the vehicles present. The traffic
signals are changed depending on the vehicle count. We first
perform background subtraction, using segmentation on
surveillance video, to detect and track vehicles.
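The paper does not state the rule that maps vehicle counts to signal timings. As a minimal sketch of the idea, assuming a fixed cycle length that is shared among lanes in proportion to their counts (the function name, the 120 second cycle and the 10 second minimum green are all hypothetical choices, not values from this work):

```python
def allocate_green_times(counts, cycle_seconds=120, min_green=10):
    """Split one signal cycle among lanes in proportion to vehicle counts.

    Assumption: every lane first receives `min_green` seconds and the
    rest of the cycle is shared in proportion to the counts. If no lane
    has any vehicle, the cycle is split equally.
    """
    lanes = len(counts)
    spare = cycle_seconds - min_green * lanes
    if spare < 0:
        raise ValueError("cycle too short for the minimum green times")
    total = sum(counts)
    if total == 0:
        return [cycle_seconds / lanes] * lanes
    return [min_green + spare * c / total for c in counts]


# Four lanes with 30, 10, 0 and 20 counted vehicles.
green = allocate_green_times([30, 10, 0, 20])
print(green)
```

Setting `min_green` to 0 in this sketch gives an empty lane no green time at all, which is the behaviour the timer-based system cannot provide.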
CHAPTER 2
LITERATURE SURVEY
Intensive research has been done on traffic control systems,
and a large number of articles and research papers have been
published on this topic over the last few decades. Most of
these systems, however, use inductive loop detectors,
infrared and radar sensors, which provide very limited
traffic information and are subject to a high failure rate
when installed in road surfaces. Video monitoring has long
been used to monitor security-sensitive areas such as banks,
department stores, crowded public places and borders.
Nowadays it is also used in automatic traffic monitoring
systems to control traffic.
Current traffic control systems are based on tracking,
classification and activity analysis.
In [2], Bhadra et al. use agent-based fuzzy logic technology
for traffic control situations involving multiple approaches
and vehicle movements, in order to provide a real-time
dynamic traffic system. Decisions are made considering clock
time and the density of vehicles on a specific lane at a
specific time. A lane has one of three statuses: busy,
moderate or idle. Real-time data is collected at specific
time intervals. The mathematical model of fuzzy logic is
used to implement the agent technology.
In [3], R. Tina aims to enhance the efficiency of the present
automatic traffic signalling system by integrating the
traditional system with an automated signalling system. A
digital camera mounted on a motor provides artificial vision
of all the lanes and thus detects the vehicular density on
the road. A PC controls the direction of the camera for each
lane through a microprocessor. The image obtained for each
lane is then processed to estimate the vehicular density.
In [4], Farheena Shaikh describes an idea to overcome the
problem of traffic congestion at intersections by introducing
a traffic signal system. The primary objective is to
calculate the total number of vehicles present on the road,
so that vehicular traffic flows smoothly without jams. The
other objective is to give emergency vehicles priority over
the other vehicles on the lane. The system helps to overcome
traffic jams by reducing delay and avoiding congestion. It
also helps emergency services such as fire brigade vehicles,
ambulances or police in pursuit to arrive at the right time.
Traffic signal management, when properly designed, operated
and maintained, yields significant benefits such as less
congestion and lower fuel consumption. Vehicle emissions are
also reduced, which improves air quality.
Pezhman Niksaz et al. [5] propose a system which uses image
processing techniques to estimate the density of vehicles and
displays a message reporting the vehicle count on the
highway. The operations involved are image acquisition,
transformation of RGB to gray scale, image enhancement and
morphological operations. The camera captures consecutive
frames, which are compared with the first frame. Finally, if
the density of vehicles is more than a threshold, a message
is shown. From this message the amount of reduction in the
traffic jam can be predicted.
CHAPTER – 3
GENERIC VIDEO PROCESSING FRAMEWORK
3.1.1 Video Frame: The input video format is AVI (Audio Video
Interleave). AVI files store audio and video data in the RIFF
(Resource Interchange File Format) container format. Audio
data is stored in an uncompressed PCM format with various
parameters. Video data is stored in a compressed format with
various codecs and parameters.
The aviread, aviinfo and mmreader functions are used to read
the input AVI video. The videoinput function is used to read
video from a webcam.
3.1.2 Preprocessing:
Color Models: Standard color models specify particular colors
by defining a 3D coordinate system and a subspace that
contains all constructible colors within a particular model.
Each color model is oriented towards either specific hardware
(RGB, CMY, YIQ) or image processing applications.
The RGB Model: In the RGB model an image is made up of three
independent primary colors: red, green and blue. The standard
wavelengths of the three primaries are shown in Figure 3.2.
A particular color is specified by the amount of each primary
component present. Figure 3.3 shows the geometry of the RGB
color model in a Cartesian coordinate system.
Figure 3.2: The standard wavelengths of RGB.
Figure 3.3: The RGB Model.
Color image smoothing: The concept of image smoothing can be
extended to full-color images, with the principal difference
that component vectors must be considered instead of scalar
intensity values. Let c represent an arbitrary vector in RGB
color space:
$$c(x, y) = \begin{bmatrix} R(x, y) \\ G(x, y) \\ B(x, y) \end{bmatrix}$$
The average of the RGB component vectors in a neighborhood
S_{xy} centered at (x, y) and containing K pixels is given by
$$\bar{c}(x, y) = \frac{1}{K} \sum_{(s, t) \in S_{xy}} c(s, t)$$
Because the sum acts on each component of the vector
separately, smoothing by neighborhood averaging can be done
on a per-color-plane basis.
a) Original image
b) Red component
c) Green component
d) Blue component
Figure 3.4: Result of processing each RGB component.
Figure 3.5: Image smoothing by 5×5 averaging mask
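The per-color-plane averaging described above can be sketched in plain Python. This is an illustration only, not the MATLAB code used in this work; the nested-list image layout and the border handling (averaging over the part of the window that lies inside the image) are assumptions.

```python
def smooth_rgb(image, size=5):
    """Average each RGB plane over a size x size neighborhood.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    Border pixels average over the part of the window inside the image.
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            window = [image[t][s]
                      for t in range(max(0, y - half), min(rows, y + half + 1))
                      for s in range(max(0, x - half), min(cols, x + half + 1))]
            k = len(window)  # K, the number of pixels in the neighborhood
            out_row.append(tuple(sum(p[ch] for p in window) / k
                                 for ch in range(3)))
        out.append(out_row)
    return out


# A single bright pixel is spread evenly over its 5x5 neighborhood.
spot = [[(0, 0, 0)] * 5 for _ in range(5)]
spot[2][2] = (25, 50, 75)
smoothed = smooth_rgb(spot)
print(smoothed[2][2])
```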
Color image sharpening
Images can be sharpened by applying the Laplacian operator.
The Laplacian of a vector is defined as the vector whose
components are equal to the Laplacian of the individual
scalar components of the input vector:
$$\nabla^2[c(x, y)] = \begin{bmatrix} \nabla^2 R(x, y) \\ \nabla^2 G(x, y) \\ \nabla^2 B(x, y) \end{bmatrix}$$
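A short sketch of per-channel Laplacian sharpening follows. The 4-neighbor discrete Laplacian and the sharpening rule (subtracting the Laplacian from the image, then clipping to 0..255) are standard textbook choices assumed here; the paper does not say which kernel it uses.

```python
def laplacian_rgb(image):
    """Per-channel 4-neighbor Laplacian; border pixels are left at zero."""
    rows, cols = len(image), len(image[0])
    lap = [[(0, 0, 0)] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            lap[y][x] = tuple(
                image[y - 1][x][ch] + image[y + 1][x][ch]
                + image[y][x - 1][ch] + image[y][x + 1][ch]
                - 4 * image[y][x][ch]
                for ch in range(3))
    return lap


def sharpen_rgb(image):
    """Sharpen by subtracting the Laplacian, clipped to the range 0..255."""
    lap = laplacian_rgb(image)
    return [[tuple(min(255, max(0, p[ch] - l[ch])) for ch in range(3))
             for p, l in zip(img_row, lap_row)]
            for img_row, lap_row in zip(image, lap)]


# A slightly brighter centre pixel is emphasised by sharpening.
bump = [[(50, 50, 50)] * 3 for _ in range(3)]
bump[1][1] = (60, 60, 60)
sharp = sharpen_rgb(bump)
print(sharp[1][1])
```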
Color segmentation
Segmentation is the process of partitioning an image into
regions. The objective here is to segment objects of a
specified color range in an RGB image. Better results are
generally obtained by segmenting in RGB space. Given a set of
sample color points representative of the colors of interest,
we obtain an estimate of the "average" color that we need to
segment; let this average color be denoted by the RGB vector
a. Classifying each pixel as inside or outside the color
range requires a measure of similarity. The simplest measure
is the Euclidean distance. Let z denote an arbitrary point in
RGB space. The distance between z and a is given by
$$D(z, a) = \lVert z - a \rVert = \left[ (z - a)^T (z - a) \right]^{1/2} = \left[ (z_R - a_R)^2 + (z_G - a_G)^2 + (z_B - a_B)^2 \right]^{1/2}$$
where the subscripts R, G, and B denote the RGB components of
the vectors a and z.
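The distance measure above can be turned into a small segmentation sketch. The distance threshold `d0` and the function names are hypothetical; the paper does not give a threshold value.

```python
import math


def average_color(samples):
    """Estimate the 'average' color a from sample (R, G, B) points."""
    n = len(samples)
    return tuple(sum(s[ch] for s in samples) / n for ch in range(3))


def color_distance(z, a):
    """Euclidean distance D(z, a) between two RGB vectors."""
    return math.sqrt(sum((zc - ac) ** 2 for zc, ac in zip(z, a)))


def segment_by_color(image, a, d0):
    """Mark a pixel 1 when its color lies within distance d0 of a."""
    return [[1 if color_distance(z, a) <= d0 else 0 for z in row]
            for row in image]


a = average_color([(200, 0, 0), (220, 20, 0)])
mask = segment_by_color([[(205, 5, 0), (0, 0, 255)]], a, d0=25)
print(a, mask)
```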
3.1.3 Moving Object Detection
Different video processing applications have different needs
and thus require different treatment, but moving objects are
common to all of them. Detecting the regions that correspond
to moving objects is the basic first step of a vision system,
because it provides a focus of attention and simplifies the
analysis in subsequent steps. Changes in illumination,
weather and climate, and repetitive motions that cause
clutter, make motion detection a difficult problem.
Background subtraction, temporal differencing and optical
flow are techniques used to detect moving objects.
Background Subtraction
Background subtraction is the most commonly used technique
for motion segmentation in static scenes. It detects moving
regions such as vehicles and people by subtracting the
current image pixel by pixel from a given background image.
The pixels where the difference is above a threshold are
classified as foreground. Morphological post-processing
operations such as erosion, dilation and closing are then
applied to reduce the effects of noise and enhance the
detected regions. The reference background is updated with
new images over time to adapt to scene changes.
The background image B_t is updated by the use of an Infinite
Impulse Response (IIR) filter as follows:
$$B_{t+1} = \alpha I_t + (1 - \alpha) B_t$$
where α ∈ [0.0, 1.0] is a learning constant that specifies
how much information from the incoming image I_t is put into
the background.
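The update rule can be sketched for grayscale frames stored as nested lists. The default value of `alpha` is an arbitrary illustrative choice, not a value reported in this work.

```python
def update_background(background, frame, alpha=0.05):
    """One IIR update step: B_{t+1} = alpha * I_t + (1 - alpha) * B_t."""
    return [[alpha * i + (1 - alpha) * b
             for i, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


background = [[0, 100]]
frame = [[100, 100]]
updated = update_background(background, frame, alpha=0.5)
print(updated)
```

A small `alpha` makes the background adapt slowly, so a briefly stopped vehicle is not absorbed into it; a large `alpha` follows illumination changes quickly but lets slow objects leak into the background.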
Temporal Differencing
Temporal differencing detects moving entities in a video by
taking the pixel-by-pixel difference of consecutive frames.
This method is highly adaptive to scene changes; however, it
usually fails to detect all the relevant pixels of a moving
object.
Optical Flow
Optical flow methods use the flow vectors of moving entities
over time to find the moving regions in an image. Optical
flow can detect motion in video sequences even from a moving
camera, but most optical flow methods are computationally
complex and cannot be used in real time without specialized
hardware.
3.1.4 Object Classification
Moving areas found in video may correspond to different
real-world objects such as pedestrians, vehicles, clutter,
etc. The type of each detected object must be recognized so
that it can be tracked reliably and analyzed.
The two main methods used for moving object classification
are 1) shape-based and
2) motion-based methods.
Shape-based Classification
The bounding rectangle, area, silhouette and gradient of the
detected object regions are the common features used in
shape-based classification. Here shape-based classification
is used to classify vehicles into low and heavy vehicles:
cars and bikes are low vehicles, and trucks and buses are
heavy vehicles. The dispersedness of an object is computed
from its perimeter and area:
$$\text{Dispersedness} = \frac{\text{Perimeter}^2}{\text{Area}}$$
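As a small worked example of this measure, the sketch below evaluates it for two axis-aligned rectangles of equal area. The rectangle helper is a hypothetical convenience for illustration; a real blob's perimeter and area would come from the segmented foreground mask.

```python
def dispersedness(perimeter, area):
    """Dispersedness = Perimeter^2 / Area."""
    if area <= 0:
        raise ValueError("area must be positive")
    return perimeter ** 2 / area


def rectangle_dispersedness(width, height):
    """Dispersedness of an axis-aligned width x height rectangle."""
    return dispersedness(2 * (width + height), width * height)


compact = rectangle_dispersedness(10, 10)   # square blob
elongated = rectangle_dispersedness(2, 50)  # thin blob of the same area
print(compact, elongated)
```

The measure is scale-independent: a compact shape gives a small value and an elongated or ragged shape gives a large one, whatever its size.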
A neural network classifier can use view-dependent visual
features to distinguish humans, human groups, vehicles and
clutter.
Motion-based Classification
Temporal motion features are used to determine the class of
an object. They help to separate objects into rigid and
non-rigid classes. An object that exhibits periodic motion
also exhibits a periodic self-similarity measure.
3.1.5 Object Tracking
Object tracking is a difficult problem, yet it continues to
arouse interest among computer vision researchers. Object
tracking establishes the correspondence of objects and object
parts between consecutive frames of video. It provides
cohesive temporal data about moving objects, which is used to
enhance lower-level processing and to enable higher-level
data extraction. Tracking is hard to apply in congested
situations because the objects are segmented inaccurately.
Erroneous segmentation is commonly caused by long shadows and
by partial or full occlusion of objects by each other and by
stationary things in the scene. There are two approaches to
tracking objects as a whole:
1) one is based on correspondence matching, and
2) the other carries out explicit tracking by making use of
position prediction or motion estimation.
Vehicle tracking is done by comparing and matching features
such as size and shape, and by calculating the centroid of
the object.
CHAPTER - 4
MOTION SEGMENTATION USING BACKGROUND
SUBTRACTION
The four main steps of background subtraction are
preprocessing, background modeling, foreground detection and
data validation. Preprocessing takes the raw input video and
converts it into a format that can be used by the later
steps. Background modeling uses each new video frame to
calculate and update a background model. Foreground detection
identifies the pixels of the video frame that cannot be
adequately explained by the background model, and outputs
them as a binary candidate foreground mask. Data validation
examines the candidate mask, eliminates those pixels that do
not correspond to actual moving objects, and outputs the
final foreground mask.
4.1 PREPROCESSING
In preprocessing, the video is smoothed to remove noise and
the RGB color model is then selected. Simple temporal or
spatial smoothing is used in the early stage of processing in
computer vision systems to reduce camera noise. Smoothing
also removes environmental noise. Frame-size and frame-rate
reduction are used to reduce the data processing rate. When
several cameras are used at different positions, image
registration between frames is needed before background
modeling. One main problem in preprocessing is the data
format used by background subtraction. Many algorithms use
luminance intensity, which is one scalar value per pixel.
Color images, in either RGB or HSV color space, have become
more popular in the background subtraction literature. In
addition to color, spatial and temporal derivatives are used
to incorporate edge and motion information.
Figure 4.2: Pixel-level noise removal. (a) Estimated
background image (b) Current image (c) Detected
foreground regions before noise removal (d) Foreground
regions after noise removal
In Fig 4.2, morphological operations such as dilation and
erosion are applied to remove noise caused by the camera.
These operations remove noisy foreground pixels that do not
correspond to actual foreground regions, and remove noisy
background pixels near and inside object regions that are
actually foreground pixels.
4.2 BACKGROUND SUBTRACTION
Background subtraction relies on background modeling. It is
very sensitive, so it detects all moving elements, but the
background is hard to estimate when changes occur in the
environment. Background modeling methods are either
non-recursive or recursive. The non-recursive methods are
used here because they are highly adaptive; methods that
require significant resources for initialization are
excluded. I_t(x, y) and B_t(x, y) denote the luminance pixel
intensity and its background estimate at spatial location
(x, y) and time t.
Figure 4.3: Background Subtraction
In Fig 4.3, the background B(x, y, t) at time t is first
estimated by a non-recursive technique and then subtracted
from the input frame I(x, y, t). Each pixel where the
absolute difference is greater than the threshold (Th) is
marked in the foreground mask.
Non-recursive Techniques
Non-recursive techniques use a sliding-window approach for
background estimation. They store a buffer of the previous L
video frames and estimate the background image based on the
temporal variation of each pixel within the buffer.
Non-recursive techniques are highly adaptive because they do
not depend on the history beyond the frames stored in the
buffer.
Frame Differencing
Frame differencing determines the presence of moving objects
by calculating the difference between two consecutive images.
The estimated background is simply the previous frame. The
calculation is simple and easy to implement, but it works
only under particular conditions of object speed and frame
rate and is very sensitive to the threshold (Th). Depending
on the object structure, speed, frame rate and global
threshold, this approach may or may not be useful.
$$B(x, y, t) = I(x, y, t - 1)$$
Here the estimated background is the previous frame, so a
pixel is classified as foreground when
$$|I(x, y, t) - I(x, y, t - 1)| > Th$$
In this equation, I(x, y, t) represents the intensity value
at position (x, y) at time instance t, I(x, y, t-1)
represents the intensity value at position (x, y) at time
instance t-1, and the per-pixel threshold Th is initially set
to a pre-determined value. Frame differencing between
consecutive frames is shown in Figure 4.4.
Figure 4.4: Frame Differencing
Figure 4.5: Frame Differencing at different thresholds
(a)Th = 25,(b)Th = 50,(c)Th = 100,(d)Th=200.
Fig 4.5 shows the effect of the threshold in frame
differencing. When the threshold is 200, far fewer foreground
pixels are found than when the threshold is 25. As the
threshold increases, the number of foreground pixels found
decreases, so it is difficult to get the complete outline of
the moving vehicle. Thus the identification of the moving
vehicle is not accurate with the frame differencing
technique.
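The effect of the threshold can be reproduced on toy data. The sketch below applies frame differencing to two small grayscale frames at the four thresholds of Figure 4.5; the pixel values are invented for illustration.

```python
def frame_difference_mask(current, previous, th):
    """Foreground where |I(x, y, t) - I(x, y, t-1)| > Th."""
    return [[1 if abs(c - p) > th else 0 for c, p in zip(cur_row, prev_row)]
            for cur_row, prev_row in zip(current, previous)]


def count_foreground(mask):
    """Number of pixels marked as foreground."""
    return sum(sum(row) for row in mask)


previous = [[10, 10, 10, 10],
            [10, 10, 10, 10]]
current = [[10, 40, 90, 250],
           [10, 10, 140, 250]]

counts = {th: count_foreground(frame_difference_mask(current, previous, th))
          for th in (25, 50, 100, 200)}
print(counts)  # {25: 5, 50: 4, 100: 3, 200: 2}
```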
Median Filtering
Median filtering is one of the most commonly used background
modeling techniques. The background estimate is defined as
the median, at each pixel location, of all the frames in the
buffer. The assumption is that the pixel stays in the
background for more than half of the frames in the buffer.
Median filtering has been extended to color by replacing the
median with the medoid. The complexity of computing the
median is O(n log n) for each pixel, where n is the number of
previous video frames stored in the buffer.
If I(x, y, t-i) represents the frame at time t-i, the
background image is obtained by taking the median over the
buffer:
$$B(x, y, t) = \operatorname{median}\{ I(x, y, t - i) \}$$
and a pixel is classified as foreground when
$$|I(x, y, t) - \operatorname{median}\{ I(x, y, t - i) \}| > Th$$
where i ∈ {0, ..., n − 1} and n is the number of previous
frames.
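A minimal sketch of median background estimation on a buffer of grayscale frames follows, using only the Python standard library. The frame values and the threshold are invented for illustration.

```python
from statistics import median


def median_background(frames):
    """Background estimate: per-pixel median over the buffered frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[y][x] for f in frames) for x in range(cols)]
            for y in range(rows)]


def foreground_mask(frame, background, th):
    """Foreground where |I - B| > Th."""
    return [[1 if abs(i - b) > th else 0 for i, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]


# Buffer of n = 5 one-row frames; a vehicle crosses pixels 1 and 2 briefly.
frames = [[[20, 20, 20]],
          [[20, 20, 22]],
          [[20, 200, 21]],
          [[20, 20, 19]],
          [[20, 20, 200]]]
background = median_background(frames)
mask = foreground_mask(frames[-1], background, th=30)
print(background, mask)
```

Each pixel shows the vehicle in only one of the five frames, so the median ignores it, which is the "more than half of the frames" assumption at work.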
For example, when n = 10, the estimated background and
foreground masks are as shown in Figure 4.6.
Figure 4.6: Estimated background and foreground
masks for n=10
In Fig 4.6 the number of frames taken for background
estimation is 10, so the foreground pixels are not clearly
found. The background estimate is the median at each pixel
location of all the frames in the buffer.
Figure 4.7: Estimated background and foreground
masks for n=20
In Fig 4.7 the number of frames for background registration
is 20, so the foreground objects are detected better than for
n=10.
In Fig 4.8 the number of frames for background registration
is 50, so the foreground objects are detected more accurately
than for n=10 and n=20.
4.3 FOREGROUND DETECTION
Foreground detection identifies candidate foreground pixels
by comparing the input video frame with the background model.
The most commonly used approach is to check whether the input
pixel is significantly different from the corresponding
background estimate. The threshold is chosen experimentally
and may be a function of the spatial location (x, y). A
refinement uses two thresholds. The first step is to identify
"strong" foreground pixels, whose absolute difference from
the background exceeds a large threshold. Foreground regions
are then grown from the strong foreground pixels by including
neighboring pixels whose absolute difference is greater than
a smaller threshold.
If I_t(x, y) and B_t(x, y) represent the input frame and the
background frame respectively, the foreground objects are
obtained by subtracting the background frame from the input
frame and keeping the pixels that satisfy equation (3.3):
$$|I_t(x, y) - B_t(x, y)| > Th \qquad (3.3)$$
where Th is the threshold.
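The two-threshold variant (strong pixels first, then region growing) can be sketched as below. The names `th_high` and `th_low`, the 4-connected neighborhood and the sample values are assumptions made for illustration.

```python
from collections import deque


def hysteresis_foreground(frame, background, th_high, th_low):
    """Grow foreground regions from strong pixels using two thresholds."""
    rows, cols = len(frame), len(frame[0])
    diff = [[abs(frame[y][x] - background[y][x]) for x in range(cols)]
            for y in range(rows)]
    mask = [[0] * cols for _ in range(rows)]
    queue = deque()
    # Step 1: "strong" foreground pixels exceed the large threshold.
    for y in range(rows):
        for x in range(cols):
            if diff[y][x] > th_high:
                mask[y][x] = 1
                queue.append((y, x))
    # Step 2: add 4-connected neighbors that exceed the small threshold.
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < rows and 0 <= nx < cols
                    and not mask[ny][nx] and diff[ny][nx] > th_low):
                mask[ny][nx] = 1
                queue.append((ny, nx))
    return mask


background = [[0, 0, 0, 0, 0, 0]]
frame = [[0, 30, 90, 30, 0, 30]]
mask = hysteresis_foreground(frame, background, th_high=80, th_low=20)
print(mask)  # [[0, 1, 1, 1, 0, 0]]
```

The weak pixel at the far right is not attached to any strong pixel, so it is treated as noise rather than foreground.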
Fig 4.9 shows an example in which a given background image
(Figure 4.3(b)) is subtracted from an input frame.
Figure 4.9: Foreground detection. a) Input frame for the
given background image b) Input frame after subtraction of
the background image, showing foreground objects.
4.4 DATA VALIDATION
Data validation is the process of improving the candidate
foreground mask based on information obtained from outside
the background model. The background models have three main
limitations:
1) correlation between neighboring pixels is ignored,
2) the rate of adaptation may not match the moving speed of
the foreground objects, and
3) non-stationary pixels from moving leaves, or shadows cast
by moving objects, are easily mistaken for true foreground
objects.
The first limitation typically shows up as small
false-positive or false-negative areas scattered across the
candidate mask. These regions are eliminated by combining
morphological filtering and connected-component grouping.
Morphological filtering is applied to the foreground mask to
eliminate isolated foreground pixels and to merge nearby
disconnected foreground regions. A moving entity of interest
should also be larger than a certain size, so
connected-component grouping is used to identify all
connected foreground areas and remove those that are too
small to correspond to a moving entity. Large areas of false
foreground often occur when the background model adapts at a
slower rate than the foreground scene. If the background
model adapts too fast, it fails to detect the part of a
foreground entity that has corrupted the background model. A
simple approach to reduce these issues is to run several
background models at different adaptation rates and
periodically cross-validate between them to improve
performance.
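The connected-component grouping step can be sketched as a flood fill that discards small regions. The 4-connected neighborhood and the `min_size` parameter are assumptions; the paper does not state the size limit it uses.

```python
def remove_small_regions(mask, min_size):
    """Keep only 4-connected foreground regions with >= min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    cleaned = [[0] * cols for _ in range(rows)]
    for y0 in range(rows):
        for x0 in range(cols):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Flood fill to collect one connected region.
            seen[y0][x0] = True
            stack, region = [(y0, x0)], []
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(region) >= min_size:
                for y, x in region:
                    cleaned[y][x] = 1
    return cleaned


candidate = [[1, 1, 0, 0, 1],
             [1, 1, 0, 0, 0],
             [0, 0, 0, 1, 0]]
final = remove_small_regions(candidate, min_size=3)
print(final)
```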
CHAPTER 5
VEHICLE TRACKING, DETECTION, COUNTING
AND CLASSIFICATION
The video in AVI format is processed by selecting the RGB
model and smoothing with a 5×5 averaging mask. After
background modeling by median filtering, the vehicles are
detected, tracked and classified. The speed of each vehicle
is also estimated.
Figure 5.1: Proposed Architecture flow diagram.
In the proposed architecture flow diagram shown in Fig 5.1,
the video is acquired through a stationary camera and median
filtering is used for background modeling. The processing
includes:
1) Automatically estimating the initial location of each
moving entity,
2) Extracting feature information from all moving entities
within the scene,
3) Tracking the identified objects by their features and
4) Categorizing the moving entities into two groups: heavy
vehicle and low vehicle.
The tracking system integrates spatial position, motion,
shape and color. This integration makes the tracker
insensitive to changes in the background and to
interruptions in the motion and position of objects. The
system segments the moving object blobs, detects the motion
and background variation, and then compares the similarity of
each object blob with different templates, thereby tracking
the objects.
5.1 MOVING OBJECT DETECTION
The primary objective is to separate the object from its
background. Frame differencing and background subtraction
are the two common methods. Frame differencing basically
thresholds the difference between the current image and the
preceding images in the sequence, assuming that the
background does not change over successive frames. Problems
occur when tracking many entities or when an entity stops, in
which case the moving object is not accurately detected.
Hence background subtraction is used, at the cost of having
to update the background. The background is built from the
median at each pixel, on the assumption that a moving entity
does not stay at the same point for more than half of L
frames, where L is the number of frames stored in the buffer.
The background model for pixel x_t(m, n) at frame t using a
length-L median filter is given by the equation
$$x_t(m, n) = \operatorname{median}_L \left( x_{t - 0.5L}(m, n), \ldots, x_{t + 0.5L}(m, n) \right)$$
This retains the stationary pixels in the background. Storing
the L frames requires a large amount of memory; when a large
new blob is identified, the median filter is used to rebuild
the background.
Background subtraction is done in the RGB color model.
The RGB Color Model
The RGB (Red, Green, Blue) color model uses a Cartesian
coordinate system and forms a unit cube, as shown in Figure
5.2. The dotted line marks the gray levels, where red, green
and blue are present in equal amounts. This diagonal is
referred to as the gray diagonal. Image capturing, processing
and rendering devices use the RGB model, which is
hardware-oriented.
Figure 5.2: RGB Model in Cartesian coordinate system
The foreground and the background are subtracted in each RGB
color channel, and the maximum absolute value of the three
differences is taken as the difference value Diff_c in color
space:
$$\mathrm{Diff}_c = \max \{ |R_f - R_b|, |G_f - G_b|, |B_f - B_b| \}$$
This equation gives the difference between the foreground and
the background over the RGB color channels.
The binary foreground pixels F(x, y) are produced by the
equation
$$F(x, y) = \begin{cases} 1 & \text{if } \mathrm{Diff}_c > Th \\ 0 & \text{otherwise} \end{cases}$$
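The two equations above can be sketched directly for images stored as nested lists of (R, G, B) tuples. The sample colors and the threshold are invented for illustration.

```python
def color_difference(fg_pixel, bg_pixel):
    """Diff_c: largest absolute difference over the R, G and B channels."""
    return max(abs(f - b) for f, b in zip(fg_pixel, bg_pixel))


def binary_foreground(frame, background, th):
    """F(x, y) = 1 if Diff_c > Th, else 0."""
    return [[1 if color_difference(f, b) > th else 0
             for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]


background = [[(90, 120, 25), (90, 120, 25)]]
frame = [[(100, 50, 20), (92, 118, 25)]]
mask = binary_foreground(frame, background, th=30)
print(color_difference(frame[0][0], background[0][0]), mask)
```

Taking the maximum over channels means a vehicle that differs from the road in only one channel is still detected.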
The resulting foreground contains noise due to clutter in the
background. Noise is removed by the 'close' binary
morphological operator. We assume that each initial blob
contains a moving entity, which may be a human being, a bike,
a car or a group of people. A rectangular box is drawn over
each moving entity that is found. Figure 5.3 shows sample
foreground areas before and after region connecting, labeling
and boxing.
Figure 5.3: Connected component labeling sample. (a)
Calculated background (b) Present image (c) Filtered
foreground pixels and connected and labeled regions with
bounding boxes
5.1.2 Noise Removal
The difference image obtained after background subtraction
contains the motion regions as well as a large amount of
noise. Morphological operations are used to remove this
noise, which is caused by environmental factors, illumination
changes, and transmission of the video from the camera to the
processing stage.
Mathematical morphology is a tool for extracting image
components that are useful in the representation and
description of region shape. Morphological techniques for
pre- and post-processing include morphological filtering,
thinning, and pruning.
Morphology is a broad set of image processing operations that
process images based on shapes. Morphological operations
apply a structuring element to an input image, creating an
output image of the same size. Dilation and erosion are the
basic morphological operations.
Figure 5.4: Original Picture
Fig 5.4 shows the original picture with a moving car. The
background image in Figure 5.5 is subtracted from this image.
The result of the subtraction is shown in Figure 5.6, which
contains the moving blobs along with noise.
Figure 5.5: Background Picture
Figure 5.6: Moving blobs
Figure 5.6 shows the moving object with noise. Various
factors cause noise in foreground detection, such as:
1. Camera noise
2. Reflectance noise
3. Background colored object noise
Figure 5.7: Denoised Image
Fig 5.7 is the image after performing morphological
operations to remove the noise. These operations are applied
to remove noisy foreground pixels that do not correspond to
actual foreground regions.
Figure 5.8: Original image Figure 5.9:
Dilation followed by erosion
For example, Figure 5.8 is the original image and Figure
5.9 shows the result of dilation followed by erosion, in
which only the non-background noise (NBN) is removed. If
erosion is followed by dilation instead, the non-foreground
noise (NFN) regions are eliminated but the NBN is not,
because the holes inside objects cannot be closed.
Figure 5.10: Original image
Figure 5.11: Erosion followed by dilation
For example, Figure 5.10 is the original image and Figure
5.11 shows the result of erosion followed by dilation, in
which only the non-foreground noise is removed.
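The difference between the two orderings can be reproduced on a small synthetic mask. The sketch below assumes a 3x3 structuring element and zero padding, and uses the standard names closing (dilation followed by erosion) and opening (erosion followed by dilation); the test image is invented for illustration.

```python
def _morph(mask, reducer):
    """Apply `reducer` (any or all) over each pixel's zero-padded 3x3 neighbourhood."""
    rows, cols = len(mask), len(mask[0])

    def at(r, c):
        return mask[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [[int(reducer(at(r + dr, c + dc)
                         for dr in (-1, 0, 1) for dc in (-1, 0, 1)))
             for c in range(cols)] for r in range(rows)]


def dilate(mask):
    return _morph(mask, any)


def erode(mask):
    return _morph(mask, all)


def closing(mask):
    """Dilation followed by erosion: fills holes inside blobs (NBN)."""
    return erode(dilate(mask))


def opening(mask):
    """Erosion followed by dilation: removes isolated specks (NFN)."""
    return dilate(erode(mask))


mask = [[0] * 14 for _ in range(7)]
for r in range(1, 6):
    for c in range(1, 8):
        mask[r][c] = 1    # vehicle blob
mask[3][2] = 0            # hole inside the blob (non-background noise)
mask[3][11] = 1           # isolated speck (non-foreground noise)

closed = closing(mask)
opened = opening(mask)
```

Here `closed` has the hole filled but still contains the speck, while `opened` has lost the speck but still has the hole. Opening also erodes part of the blob near the hole, which is one reason the two operations are often combined.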
5.2 FEATURE EXTRACTION
Two types of features are extracted from each moving blob:
the object centroid and its color. The centroid gives the
spatial position of the blob and is an important feature
for tracking an entity. The centroid (X, Y) of a blob in
the binary image I(x, y) is calculated by equation 4.6.
\[ X = \frac{1}{A} \sum_{(x,y) \in R} x, \qquad Y = \frac{1}{A} \sum_{(x,y) \in R} y \qquad (4.6) \]
Where A is the number of pixels in the blob R.
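A minimal sketch of the centroid calculation in equation 4.6 is given below. It assumes that x is the column index and y is the row index, and that the mask passed in contains a single blob R; the function name is hypothetical.

```python
def centroid(mask):
    """Return (X, Y) of the nonzero pixels of a binary mask, as in equation 4.6."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) for v in row if v]
    area = len(xs)  # A: number of pixels in the blob
    if area == 0:
        raise ValueError("the mask contains no blob pixels")
    return sum(xs) / area, sum(ys) / area


blob = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
cx, cy = centroid(blob)
```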
An object's next position lies in the neighborhood of its
current position. A match occurs with the template whose
distance to the entity is minimum, provided that this
distance is less than a specified low threshold. If the
distance is greater than a certain high threshold, the
entity is taken to be a new object. The high and low
thresholds are predetermined, per-pixel values.
Shape-based information is provided by the shape feature,
which uses the length, width, and area of the object:
1)
\[ L = \max x(t) - \min x(t), \qquad W = \max y(t) - \min y(t) \]
where x(t) is the pixel along the x-direction and y(t) is the
pixel along the y-direction.
2)
\[ A = \sum\sum_{(x,y) \in R} I(x, y) \]
where R is the region of the moving blob.
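The shape features above can be computed directly from the pixel coordinates of a blob. The sketch below assumes the blob is given as a list of (x, y) pairs; the function name is hypothetical.

```python
def shape_features(pixels):
    """Return (L, W, A) for a blob given as a list of (x, y) pixel coordinates."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    length = max(xs) - min(xs)   # L = max x(t) - min x(t)
    width = max(ys) - min(ys)    # W = max y(t) - min y(t)
    area = len(pixels)           # A: sum of I(x, y) over R, with I = 1 inside R
    return length, width, area


# A blob covering x = 2..5 and y = 1..2.
blob = [(x, y) for x in range(2, 6) for y in range(1, 3)]
L, W, A = shape_features(blob)
```

Note that with these definitions L and W are differences of extreme coordinates, so a blob spanning four pixel columns has L = 3.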
When the shape information is not reliable, color is used
for tracking, since it is independent of the object size.
5.3 OBJECT TRACKING
The tracking operation compares the feature vector R_{i,t}
with all templates T_{i,t-1} (i = 1, 2, ..., M). If a match
is found, the template is updated for the next step through
an adaptive filter, as given in equation 4.7, where
β ∈ [0.0, 1.0] is a learning constant that specifies how much
of the previous template is retained; the remaining fraction
(1 − β) is taken from the incoming image.
\[ T_{i,t} = \beta T_{i,t-1} + (1 - \beta) R_{i,t} \qquad (4.7) \]
If no match is obtained in the next frames, a new template
T_{M+1} is created. The matching operation is executed on
the centroid, shape, and color features, in that order.
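The matching and update steps can be sketched as below, using only the centroid as the feature vector. The threshold values, the value of β, and the handling of distances that fall between the two thresholds are all assumptions made for illustration; the paper does not specify them.

```python
import math

LOW_THRESHOLD = 10.0    # assumed: maximum distance (pixels) for a match
HIGH_THRESHOLD = 40.0   # assumed: beyond this distance the object is new
BETA = 0.7              # assumed learning constant in [0.0, 1.0]


def update_template(template, feature, beta=BETA):
    """Equation 4.7: T_t = beta * T_{t-1} + (1 - beta) * R_t, element-wise."""
    return tuple(beta * t + (1 - beta) * r for t, r in zip(template, feature))


def track(templates, feature):
    """Match one feature vector against all templates; return the template index.

    Returns None when the nearest template is neither close enough to match
    nor far enough to count as a new object (an assumed policy).
    """
    if templates:
        dists = [math.dist(t, feature) for t in templates]
        i = min(range(len(dists)), key=dists.__getitem__)
        if dists[i] < LOW_THRESHOLD:
            templates[i] = update_template(templates[i], feature)
            return i
        if dists[i] <= HIGH_THRESHOLD:
            return None
    templates.append(tuple(feature))   # new template T_{M+1}
    return len(templates) - 1


templates = [(10.0, 10.0)]
matched = track(templates, (12.0, 10.0))     # close: updates template 0
created = track(templates, (100.0, 100.0))   # far: creates template 1
unclear = track(templates, (30.0, 10.0))     # in between: no decision
```

A fuller version would extend the feature vector with the shape and color features and compare them in the order given above.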
5.4 CLASSIFICATION
Vehicle classification is based on the shape of the object,
by calculating its length and area. Light vehicles such as
bikes and cars have a smaller area than heavy vehicles such
as trucks.
Vehicles are classified into two categories: cars and
non-cars. Shape-based techniques are used to separate, say,
SUVs from pickup trucks. At the top level the classification
is coarse and based on dimensions; finer classification is
done at the lower levels. The final aim of the system is to
classify vehicles at several levels of granularity.
To classify vehicles according to their dimensions, we
calculate the actual length and height of each vehicle.
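A coarse, dimension-based classification of this kind might look like the sketch below. The threshold values and the function name are purely illustrative assumptions; the paper does not report the limits it used.

```python
def classify_vehicle(length_m, height_m, max_car_length=5.5, max_car_height=2.0):
    """Top-level car / non-car decision from length and height in metres.

    Both limits are assumed values chosen only for this example.
    """
    if length_m <= max_car_length and height_m <= max_car_height:
        return "car"
    return "non-car"


small = classify_vehicle(4.2, 1.5)
large = classify_vehicle(12.0, 3.5)
```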
5.5 VEHICLE COUNTING
The tracked binary image mask1 forms the input image for
counting. The image is scanned from top to bottom to detect
the presence of an object. Two variables, count and count
register, maintain the information about registered
entities. When an entity is encountered, the buffer is
checked to see whether the entity has already been
registered. If the entity is not registered, it is taken as
a new entity and the count is increased; if it is already in
the buffer, it is ignored.
This method is applied to every image and the entity count
is increased accordingly. The count is reasonably accurate,
although objects sometimes merge and are treated as a single
entity.
Steps to count vehicles:
1. Traverse the mask to detect an object.
2. When an entity is encountered, check whether it is
already registered in the count register.
3. If the entity is new, increase the count and label the
count register with the new count.
4. Repeat steps 2 and 3 until the cross-verification is
complete.
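One way to realize the steps above is to scan the mask from top to bottom and register every pixel of an entity the first time it is met. In the sketch below the count register is modeled as a set of pixel coordinates and entities are taken to be 4-connected blobs; both choices are assumptions, not details given in the paper.

```python
def count_vehicles(mask):
    """Count 4-connected blobs in a binary mask by scanning top to bottom."""
    rows, cols = len(mask), len(mask[0])
    registered = set()   # plays the role of the count register / buffer
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in registered:
                continue          # background, or entity already registered
            count += 1            # new entity: increase the count
            registered.add((r, c))
            stack = [(r, c)]
            while stack:          # register every pixel of this entity
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and (ny, nx) not in registered):
                        registered.add((ny, nx))
                        stack.append((ny, nx))
    return count


separate = [[1, 1, 0, 0, 0],
            [1, 1, 0, 1, 1],
            [0, 0, 0, 1, 1]]
merged = [[1, 1, 0, 0, 0],
          [1, 1, 1, 1, 1],
          [0, 0, 0, 1, 1]]
n_separate = count_vehicles(separate)
n_merged = count_vehicles(merged)
```

The second mask shows the merging problem mentioned above: a single connecting pixel makes two vehicles count as one.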
Chapter 6
RESULTS AND DISCUSSION
A database of traffic images is created, and the vehicle
count for the images is obtained by taking the median of the
pixels and performing background subtraction.
Figure 6.1: Background subtraction for traffic images. (a) Image 1 (b) Background image (c) Result of image 1 after background subtraction (d) Image 2 (e) Result of image 2 (f) Result of image 3
Figure 6.1 shows background subtraction for the traffic
images: 6.1(a) is input image 1, 6.1(b) is the background
image, and 6.1(c) is the image after background subtraction,
with a vehicle count of 2. Figure 6.1(f) shows the result
for image 3, in which the actual number of vehicles is 6 but
the number counted after background subtraction is 4,
because of occlusion.
Image no.   Vehicles in original image   Vehicles found   Error
1           2                            2                0
2           3                            3                0
3           2                            2                0
4           1                            1                0
5           3                            3                0
6           1                            1                0
7           1                            1                0
8           2                            2                0
9           1                            1                0
10          1                            1                0
11          3                            2                1
12          2                            2                0
13          4                            1                3
14          2                            2                0
15          3                            2                1
16          6                            4                2
17          3                            3                0
18          3                            3                0
19          3                            2                1
20          6                            4                2
21          4                            4                0
22          4                            3                1
23          3                            1                2
24          4                            3                1
25          3                            2                1
Table 1: Vehicles found in images
Original number of vehicles: 70
Vehicles found: 55
Accuracy: 78.57%
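The totals and the accuracy figure can be checked from the rows of Table 1 with a few lines of Python. The accuracy is taken here as the ratio of vehicles found to the original number of vehicles, which reproduces the reported value.

```python
# (original, found) vehicle counts for the 25 images of Table 1
table1 = [(2, 2), (3, 3), (2, 2), (1, 1), (3, 3), (1, 1), (1, 1), (2, 2),
          (1, 1), (1, 1), (3, 2), (2, 2), (4, 1), (2, 2), (3, 2), (6, 4),
          (3, 3), (3, 3), (3, 2), (6, 4), (4, 4), (4, 3), (3, 1), (4, 3),
          (3, 2)]

original = sum(o for o, _ in table1)
found = sum(f for _, f in table1)
accuracy = 100.0 * found / original

print(f"{original} original, {found} found, accuracy = {accuracy:.2f}%")
# prints "70 original, 55 found, accuracy = 78.57%"
```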
Vehicle tracking and counting in still images is a difficult
task and produces erroneous results. The count is not
reliable because the background image is fixed while the
times at which the images are taken vary. Since the
background image and the foreground image are captured at
different times, the vehicle count is incorrect, and the
traffic signals cannot be controlled properly. Table 1 shows
that the original number of vehicles in the images is 70 and
the number of vehicles found is 55, which gives an accuracy
of 78.57%. Hence we move to background subtraction in video,
which gives better video-based traffic surveillance.
6.2 FRAME DIFFERENCING BACKGROUND
SUBTRACTION.
The presence of moving objects is found by taking the
difference between consecutive frames, so the background is
simply the previous frame. The method works only under
particular conditions of object speed and frame rate. A
threshold must be specified initially, and the foreground
pixels are found depending on this threshold. Hence this
method is not used in practice, and fast-moving vehicles
cannot be detected with it.
It is difficult to obtain the complete outline of the moving
object.
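The outline problem can be seen even on a one-row toy example. The sketch below is illustrative only: the pixel values, the function name, and the two thresholds are assumptions, and real frames would be two-dimensional.

```python
def frame_difference(prev_frame, curr_frame, threshold):
    """Foreground = pixels that changed by more than `threshold` since the previous frame."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# A uniformly dark car (value 40) moving one pixel to the right on a road (value 120).
prev_frame = [[120, 40, 40, 40, 40, 120, 120]]
curr_frame = [[120, 120, 40, 40, 40, 40, 120]]

low = frame_difference(prev_frame, curr_frame, threshold=25)
high = frame_difference(prev_frame, curr_frame, threshold=100)
```

With the low threshold only the trailing and leading edges of the car are marked, so its interior is empty; with the high threshold nothing is detected at all. Both outcomes depend on how far the object moves between frames, which is why the result is sensitive to speed and frame rate.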
Figure 6.2 shows frame differencing performed on the
standard visiontraffic.avi video. The result in Figure 6.2
shows that it is difficult to obtain the complete outline of
the moving object, so that the object appears empty at a
threshold of 200. As a result, the detection of moving
objects is not accurate. Because the outcome depends on the
object structure, speed, frame rate, and global threshold,
this method is not useful.
Figure 6.2: Frame differencing at different thresholds. (a) Threshold = 25 (b) Threshold = 50 (c) Threshold = 100 (d) Threshold = 200
6.3 BACKGROUND SUBTRACTION BY MEDIAN
FILTERING.
Background modeling by median filtering is first carried out
on standard traffic videos, for which vehicle detection,
vehicle tracking, speed calculation, vehicle classification,
and vehicle counting are performed.
Video 1 has 530 frames, with width 640 and height 360.
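A minimal sketch of background modeling by a per-pixel temporal median is shown below. It assumes grayscale frames stored as lists of lists and takes the median over all supplied frames; how many frames the paper used to build the background is not stated, so the frame set here is invented.

```python
from statistics import median


def median_background(frames):
    """Per-pixel median over a list of equally sized grayscale frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]


# Five one-row frames: a dark car (40) crosses a road of brightness 120,
# occupying a different pixel in each of the first four frames.
frames = [
    [[40, 120, 120, 120]],
    [[120, 40, 120, 120]],
    [[120, 120, 40, 120]],
    [[120, 120, 120, 40]],
    [[120, 120, 120, 120]],
]
background = median_background(frames)
```

Because each pixel shows the road in most frames, the median recovers the empty road even though a vehicle is present in four of the five frames.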
Figure 6.3: Video 1 with count 1 at frame 160
Figure 6.3 shows Video 1 with a vehicle count of 1 at frame
160. Fig 6.3(a) shows input frame 160 of Video 1, Fig 6.3(b)
is the background frame, Fig 6.3(c) shows the difference
values after background subtraction, Fig 6.3(d) is the
binary mask, which contains noise, Fig 6.3(e) is the
denoised mask after the noise has been removed by
morphological operations, and Fig 6.3(f) shows the output
with a vehicle count of 1, together with the speed at which
the vehicle moves.
Figure 6.4: Video 1 with count 2 at frame 227
Figure 6.4 shows Video 1 with a vehicle count of 2 at frame
227. Fig 6.4(a) shows input frame 227 of Video 1, Fig 6.4(b)
is the background frame, Fig 6.4(c) shows the difference
values after background subtraction, Fig 6.4(d) is the
binary mask, which contains noise, Fig 6.4(e) is the
denoised mask after the noise has been removed by
morphological operations, and Fig 6.4(f) shows the output
with a vehicle count of 2, together with the speed at which
the vehicle moves.
Video 3 (viptraffic.avi) has 120 frames, with width 160 and
height 120.
Figure 6.5: Video 2 with count 1 at frame 60
Figure 6.5 shows Video 3 with a vehicle count of 1 at frame
60. Fig 6.5(a) shows input frame 60 of Video 3, Fig 6.5(b)
is the background frame, Fig 6.5(c) shows the difference
values after background subtraction, Fig 6.5(d) is the
binary mask, which contains noise, Fig 6.5(e) is the
denoised mask after the noise has been removed by
morphological operations, and Fig 6.5(f) shows the output
with a vehicle count of 1, together with the speed at which
the vehicle moves.
Input video   Format   Frames   Actual no. of vehicles   Detected no. of vehicles   Accuracy
Video 1       RGB      530      4                        4                          100%
Video 2       RGB      100      4                        4                          100%
Table 2: Accuracy of counting in videos
Table 2 shows the accuracy of vehicle counting in traffic
videos recorded through a camera. Fairly good accuracy is
obtained, but sometimes, due to occlusion, two vehicles
merge and are treated as a single entity.
Chapter 7
CONCLUSION AND SCOPE FOR FUTURE WORK
Background registration using median filtering and frame
differencing has been studied. Because selecting a threshold
for frame differencing is problematic, median filtering is
chosen. Noise removal using morphological operators has also
been studied. The project work considers ideal conditions.
A system has been built to detect, track, and count vehicles
in a road lane efficiently. We integrate domain knowledge
about entity classes with time-domain statistical measures
to find entities, and use various morphological operations
to remove unwanted clutter.
Table 1 shows that the actual number of vehicles is 70 but
the number of vehicles found is 55, so the accuracy is
78.57%. Vehicle tracking and counting in still images is a
difficult task and produces erroneous results. The count is
not reliable because the background image is fixed while the
times at which the images are taken vary. For the videos,
however, we obtain 100% accuracy, because here background
subtraction is performed against a background modeled from
the video itself.
AUTHOR NAME
ROOPADEVI G HOSUR
M.TECH, DIGITAL ELECTRONICS
BITM, BELLARY
HEMANTHAKUMAR.K ASS.PROF ELECTRONICS AND
COMMUNICATIONS
BITM, BELLARY