REAL-TIME VIDEO SEGMENTATION USING COLOR INFORMATION AND CONNECTED COMPONENT LABELING:
APPLICATION TO ROAD SIGN DETECTION AND RECOGNITION
Lydia Ubong Jau
Master of Science 2011
REAL-TIME VIDEO SEGMENTATION USING COLOR INFORMATION AND CONNECTED COMPONENT LABELING:
APPLICATION TO ROAD SIGN DETECTION AND RECOGNITION
LYDIA UBONG JAU
A thesis submitted in fulfillment of the requirement for the Degree of
Master of Science (Cognitive Science)
Faculty of Cognitive Sciences and Human Development UNIVERSITI MALAYSIA SARAWAK
2011
ACKNOWLEDGMENTS
First and foremost, I would like to thank my supervisor, Assoc. Prof. Dr. Teh Chee
Siong, for his guidance and motivation throughout the project. His help and precious
advice broadened my knowledge of this area of research. I am also greatly grateful to my
co-supervisor, Dr. Ng Giap Weng, who gave useful help and comments on my research.
The time he spent in our discussions enabled me to conduct this research. Both of my
supervisors were helpful in many ways; without their valuable collaboration, this project
would not have been completed.
I am deeply grateful to my parents, Mr. Paul Jau Eng and Mdm. Cristina anak
Sawang, for their love, support and care over all these years. Their encouragement has
given me the strength to continue my studies to a higher degree. To my colleagues, I am
thankful for their openhandedness with time, ideas and advice throughout my research. I
appreciate their help and support, which carried me along my research.
Last but not least, I would like to thank the Zamalah Pascasiswazah UNIMAS for
providing financial support with a special grant under the Postgraduate Scholarship
Scheme. I truly appreciate this support, for it helped me greatly in conducting this
research.
i
ABSTRAK
In computer vision applications, object feature extraction and object recognition
play an important role. This thesis concerns the development of an object region labeling
technique and a segmentation technique for the processes of object feature extraction and
object recognition in color video images. The focus of this study is the use of color
information and the Connected Component Labeling (CCL) technique in object-based
video segmentation applications. The study was carried out in two stages. The first stage
shows how a color segmentation technique is used to segment object regions based on
color information. The second stage shows how the CCL technique is used to label the
segmented regions. A new CCL technique based on a two-scan approach is proposed in
this study. The theory and the strategies for labeling regions in color video images using
the proposed technique are elaborated in detail. Several experiments were conducted to
evaluate the performance of the proposed technique. Based on the experiments conducted,
the proposed CCL technique was found to be usable in object-based video segmentation
applications as an object region labeling technique. The benefits of using the proposed
object region labeling technique and the segmentation technique for object feature
extraction and object recognition are revealed. The applicability of the proposed
procedures is also demonstrated in a real-time road sign detection and recognition
application.
ii
ABSTRACT
Feature extraction and pattern recognition play an important part in computer
vision applications. The concern of this study is the development of region labeling and
segmentation techniques for feature extraction and pattern recognition in color video
images. The focus of this research is the use of color information and the Connected
Component Labeling (CCL) technique in object-based video segmentation applications.
This study is carried out in two stages. The first stage shows how a color segmentation
technique is used to segment the region of interest from color video images based on color
information. This is followed by the CCL technique, which labels the segmented region. A
novel two-scan approach to the CCL technique is proposed in this study. The theoretical
foundations and operational strategies of labeling regions in color video images using the
two-scan approach are elaborated in detail. Several experimental studies evaluating its
performance are reported. Based on the experimental results, it is shown that the new
CCL can be applied in object-based video segmentation as a region labeling technique.
The benefits of using the proposed region labeling and segmentation techniques for
feature extraction and pattern recognition are revealed. In addition, the applicability of
the suggested methods is demonstrated in a real-time road sign detection and recognition
application.
iii
TABLE OF CONTENTS
ACKNOWLEDGMENTS ......................................................................... i
ABSTRAK ........................................................................................... ii
ABSTRACT ......................................................................................... iii
TABLE OF CONTENTS ....................................................................... iv
LIST OF FIGURES .............................................................................. ix
LIST OF TABLES ................................................................................ xi
LIST OF ABBREVIATIONS ................................................................. xii
CHAPTER 1
INTRODUCTION
1.1 Preliminaries ................................................................................. 1
1.2 Image Processing ........................................................................... 2
1.3 Segmentation ................................................................................. 4
1.4 Object-Based Video Segmentation .................................................. 7
1.5 Color Segmentation ........................................................................ 8
1.6 Connected Component Labeling ..................................................... 10
1.7 Problems and Motivation ............................................................... 11
1.8 Research Objectives ....................................................................... 14
1.9 Scopes and Limitations .................................................................. 15
1.10 Research Design ........................................................................... 16
1.11 Thesis Outline .............................................................................. 16
CHAPTER 2
LITERATURE REVIEW
2.1 Introduction ................................................................................... 19
2.2 Object-Based Video Segmentation .................................................. 19
2.3 Color Segmentation ........................................................................ 21
2.4 Connected Component Labeling ..................................................... 25
iv
2.5 Road Sign Detection and Recognition ............................................. 28
2.6 Summary ....................................................................................... 30
CHAPTER 3
COLOR SEGMENTATION
3.1 Introduction ................................................................................... 32
3.2 Color Segmentation ........................................................................ 32
3.2.1 Components of the Proposed Color Segmentation ................... 33
3.2.1.1 Hardware and Software ............................................. 33
3.2.1.2 Color Model ............................................................... 34
3.2.1.3 Color Segmentation Algorithm ................................... 35
3.2.1.4 Output Image ............................................................ 36
3.2.2 RGB and HSI Color Segmentation on the Real-Time Color Video Image Sequence ....... 37
3.3 Experimental Studies ..................................................................... 38
3.3.1 Experiment 1 ........................................................................ 39
3.3.1.1 Experimental Procedures .......................................... 39
3.3.1.2 Experimental Results and Discussions ...................... 40
3.3.2 Experiment 2 ........................................................................ 42
3.3.2.1 Experimental Procedures .......................................... 42
3.3.2.2 Experimental Results and Discussions ...................... 43
3.4 Comparison Study .......................................................................... 44
3.5 Summary ....................................................................................... 45
CHAPTER 4
CONNECTED COMPONENT LABELING
4.1 Introduction ................................................................................... 47
4.2 Connected Component Labeling ..................................................... 47
v
4.2.1 Components of the Proposed CCL ......................................... 48
4.2.1.1 Input Image .............................................................. 48
4.2.1.2 Connectivity Mask ..................................................... 48
4.2.1.3 Image Scan ............................................................... 49
4.2.1.4 Equivalent Table ....................................................... 49
4.2.1.5 Neighborhood Function ............................................. 50
4.2.2 Region Labeling on the Real-Time Color Video Image Sequence ....... 51
4.3 Experimental Studies ..................................................................... 52
4.3.1 Experiment 1 ........................................................................ 52
4.3.1.1 Experimental Procedures .......................................... 52
4.3.1.2 Experimental Results and Discussions ...................... 54
4.3.2 Experiment 2 ........................................................................ 56
4.3.2.1 Experimental Procedures .......................................... 56
4.3.2.2 Experimental Results and Discussions ...................... 56
4.4 Summary ....................................................................................... 59
CHAPTER 5
FEATURE EXTRACTION AND PATTERN RECOGNITION
5.1 Introduction ................................................................................... 61
5.2 Object Segmentation ...................................................................... 61
5.2.1 Components of the Proposed Segmentation Technique ........... 62
5.2.1.1 Boundary Establishment ........................................... 62
5.2.2 Object Segmentation on the Real-Time Color Video Image Sequence ....... 63
5.3 Feature Extraction ......................................................................... 63
5.3.1 Components of the Proposed Feature Extraction .................... 64
5.3.1.1 Order Moments ......................................................... 64
5.3.1.2 Center of Gravity ....................................................... 64
5.3.1.3 Central Moment ........................................................ 65
5.3.1.4 AMIs Features ........................................................... 65
vi
5.3.2 Feature Extraction on the Real-Time Color Video Image Sequence ....... 66
5.4 Pattern Recognition ....................................................................... 67
5.4.1 Components of the Proposed Pattern Recognition ................... 68
5.4.1.1 BP Neural Network Architecture ............................... 68
5.4.1.2 Input Layer ............................................................... 69
5.4.1.3 Hidden Layer ............................................................ 69
5.4.1.4 Output Layer ............................................................ 70
5.4.1.5 Connection Weights ................................................... 72
5.4.2 Training Using BP Neural Network ....................................... 74
5.4.3 Pattern Recognition Using BP Neural Network on the Real-Time Color Video Image Sequence ....... 75
5.5 Experimental Studies ..................................................................... 75
5.5.1 Experiment 1 ........................................................................ 76
5.5.1.1 Experimental Procedures .......................................... 76
5.5.1.2 Experimental Results and Discussions ...................... 77
5.5.2 Experiment 2 ........................................................................ 79
5.5.2.1 Experimental Procedures .......................................... 79
5.5.2.2 Experimental Results and Discussions ...................... 80
5.5.3 Experiment 3 ........................................................................ 82
5.5.3.1 Experimental Procedures .......................................... 82
5.5.3.2 Experimental Results and Discussions ...................... 85
5.6 Summary ....................................................................................... 86
CHAPTER 6
ROAD SIGN DETECTION AND RECOGNITION
6.1 Introduction ................................................................................... 88
6.2 Road Sign Detection and Recognition ............................................. 88
6.2.1 Road Sign Detection ............................................................. 89
6.2.2 Road Sign Recognition .......................................................... 89
6.3 Experimental Study ....................................................................... 90
6.3.1 Experimental Procedures ...................................................... 91
6.3.2 Experimental Results and Discussions .................................. 93
6.4 Summary ....................................................................................... 96
CHAPTER 7
SUMMARY AND FUTURE WORK
7.1 Summary ....................................................................................... 98
7.2 Future Work .................................................................................. 100
REFERENCES ..................................................................................... 102
PUBLICATIONS .................................................................................. 112
viii
LIST OF FIGURES
Figure 1.1: Computer vision structure ................................................. 3
Figure 1.2: Application of image segmentation in medical imaging ........ 6
Figure 1.3: Video segmentation applications ........................................ 6
Figure 2.1: Connectivity mask .............................................................. 25
Figure 3.1: Scene acquisition and scene representation for the proposed color segmentation ....... 34
Figure 3.2: RGB and HSI video image buffers used by the proposed color segmentation ....... 35
Figure 3.3: Dataset of Experiment 1 .................................................... 40
Figure 3.4: Results of Experiment 1 .................................................... 41
Figure 3.5: Results of Experiment 2 .................................................... 43
Figure 4.1: 8-connectivity mask used by the proposed CCL .................. 48
Figure 4.2: Equivalent table used by the proposed CCL ....................... 49
Figure 4.3: Dataset of Experiment 1 .................................................... 53
Figure 4.4: Results of Experiment 1 .................................................... 54
Figure 4.5: Dataset of Experiment 2 .................................................... 56
Figure 4.6: Results of Experiment 2 .................................................... 57
Figure 5.1: A three-layer BP neural network ....................................... 68
Figure 5.2: Dataset of Experiment 1 .................................................... 77
Figure 5.3: Results of Experiment 1 .................................................... 78
Figure 5.4: Dataset of Experiment 2 .................................................... 80
ix
Figure 5.5: Results of Experiment 2 .................................................... 81
Figure 5.6: Sample of training dataset used in Experiment 3 ............... 83
Figure 5.7: Sample of testing dataset used in Experiment 3 ................. 84
Figure 5.8: Results of Experiment 3 .................................................... 85
Figure 6.1: Dataset used in the experimental study ............................. 91
Figure 6.2: Results of the experimental study ...................................... 93
x
LIST OF TABLES
Table 5.1: Sample of neural network training dataset containing normalized AMIs
and the corresponding class for road signs used in Experiment 3 .......... 83
Table 6.1: Sample of training dataset containing normalized AMIs features and the
corresponding class for road signs investigated in the experimental study ....... 92
xi
LIST OF ABBREVIATIONS
2D Two-dimensional
3D Three-dimensional
AMIs Affine Moment Invariants
ANN Artificial Neural Network
BP Backpropagation
CAT Computer-Aided Tomography
CCL Connected Component Labeling
CMOS Complementary Metal Oxide Semiconductor
CPU Central Processing Unit
GB Gigabyte
GHz Gigahertz
HSI Hue, Saturation and Intensity
HSV Hue, Saturation and Value
RAM Random Access Memory
RGB Red, Green and Blue
SDK Software Development Kit
xii
CHAPTER 1
INTRODUCTION
1.1 PRELIMINARIES
In this day and age, computer vision has become a familiar field to a large
percentage of the general public. It has the capability of duplicating the effect of human
vision: it performs tasks that are tedious for people to perform, require work in a hostile
environment, require a high rate of processing, or require access to and use of a large
database of information. With such capability, it has become a self-contained system
capable of automatic information extraction, such as moving object extraction and
tracking. The result is a fully or semi-automated surveillance system, potentially cutting
the cost of the human resources observing the output from cameras. Example applications
include both civilian and military scenarios such as traffic control, security surveillance
in banks and antiterrorism systems (Hongtu, Viktor & Hakan, 2006).
The core part of computer vision is image processing. Working by electronically
perceiving and understanding an image, computer vision uses image processing to process
the captured image. Image analysis is one of the many fields in image processing used to
examine image data to smooth the progress of solving a vision problem. It has necessitated
the development of segmentation algorithms, a crucial step in all computer vision
applications. In the state of the art, many segmentation algorithms can be found
(Ng et al., 2006; Chien, Huang & Chen, 2003; Xiao, Shi & Chang, 2006). In this thesis, the
concern is with the investigation of real-time object-based video segmentation.
1
In the following sections, an introduction to image processing, segmentation and
object-based video segmentation is presented. This is followed by color segmentation and
connected component labeling (CCL), which are two of the many theories in image
processing. The challenges of object-based video segmentation and the motivation for
using color segmentation and the CCL technique in object-based video segmentation
design are then discussed in this chapter. Subsequently, the objectives of this research
work are defined and an overview of the organization of this thesis is presented.
1.2 IMAGE PROCESSING
The history of image processing goes back to the 1960s. Digital image
processing was carried out to fix camera geometric distortions in digital images of the
lunar surface. The operation was successfully done, at significant expense, using large
mainframe computers; digital computers have been part of the image processing world
ever since (Baxes, 1994). Nowadays, image processing has become a familiar field.
Many image processing theories have been established since the 1960s and have been
extended to various technical applications that directly interact with human beings
(Rosenfeld, 1969).
In Baxes (1994), image processing refers to the manipulation and analysis of
pictorial information in the form of a two-dimensional (2D) image. Such tasks require the
pictorial information to first be acquired by a sensing device such as a photosensitive
resistive device or a video camera (Low, 1991). These devices map the three-dimensional
(3D) visual world into a 2D image by sampling and quantizing the captured 2D signals
(Acharya & Ray, 2005). Depending on the capturing device, the input image may be a
still 2D image or a joint sequence of still images, commonly known as video images (Reed,
2005). The input image is defined by brightness values (Baxes, 1994) and is represented
in numerical form for digital computation (Dougherty, 1994). The output of the
computation can be acted upon by a human being or by the computer (Umbaugh, 1998).
Figure 1.1 shows the general computer vision structure. Some applications of digital
image processing are character recognition, blood cell analysis, automatic screening of
chest X-rays, registration of nuclear medicine lung scans, computer-aided tomography
(CAT), chromosome classification, land-use identification, parts inspection for industrial
quality assurance, part identification, automatic navigation based on passive sensing,
and target acquisition and range-finding (Vernon, 1991).
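The sampling and quantization step described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not part of the thesis: the function name, level count and sample values are chosen here for demonstration only.

```python
import numpy as np

def quantize(image, levels):
    """Map 8-bit brightness values (0-255) onto a small number of
    discrete levels; each brightness band collapses to its midpoint."""
    step = 256 / levels
    return (np.floor(image / step) * step + step / 2).astype(np.uint8)

# A tiny 2x2 "image": quantizing to 4 levels keeps only coarse brightness.
frame = np.array([[12, 200], [90, 255]], dtype=np.uint8)
print(quantize(frame, 4))  # values collapse to [[32, 224], [96, 224]]
```

In a real capture pipeline, sampling fixes the spatial grid (the matrix of pixels) while quantization, as sketched here, fixes the set of representable brightness values.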
[Figure 1.1 depicts the general pipeline: a sensing device produces an electric signal, a digitizer converts it into a digital signal, the signal is stored as an N×N matrix of pictorial information, and an image processing operation produces the output.]
Figure 1.1: Computer vision structure
In Umbaugh (1998), the essence of image processing lies in the human visual
system. It is studied not only to help maximize the effectiveness of image processing
operations but also to support cases in which the output of image processing will be
used by human beings. In Baxes (1994), there are five fundamental classes of image
processing, namely image enhancement, restoration, analysis, compression and synthesis.
The image analysis class is commonly utilized in computer vision applications
(Umbaugh, 1998). It has necessitated the development of segmentation, which plays a
critical role in computer vision applications and is one of the many areas in image
processing continuously studied by many researchers (Acharya & Ray, 2005).
1.3 SEGMENTATION
Segmentation refers to a process of partitioning a digital image into regions that
have a strong correlation with objects or areas of the real world contained in the image
(Sonka, Hlavac & Boyle, 1999). It is often described by analogy to visual processes as a
foreground/background separation, implying that the selection procedure concentrates on
a single kind of feature and discards the rest (Russ, 2002). The definition of a feature
generally depends on an application's particular requirements (Baxes, 1994). The
segmentation result undergoes subsequent processing such as object classification and
scene description to achieve the goal of the system (Acharya & Ray, 2005). Therefore
segmentation is important as the first step in image analysis and interpretation.
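The foreground/background separation described above can be illustrated with the simplest possible segmentation: a brightness threshold. This is a generic sketch for intuition only; the function name and threshold value are illustrative and are not the segmentation method developed in this thesis.

```python
import numpy as np

def segment_by_threshold(gray, threshold=128):
    """Partition a grayscale image into foreground (1) and background (0):
    every pixel strictly brighter than `threshold` becomes foreground."""
    return (gray > threshold).astype(np.uint8)

gray = np.array([[10, 200, 30],
                 [220, 40, 250]], dtype=np.uint8)
print(segment_by_threshold(gray))  # mask: [[0, 1, 0], [1, 0, 1]]
```

Here brightness is the "single kind of feature" being selected; later chapters use color as that feature instead.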
According to Sonka, Hlavac and Boyle (1999), there are two types of segmentation,
known as complete segmentation and partial segmentation. A complete segmentation
results in a set of disjoint regions corresponding uniquely to objects in the input image,
while in a partial segmentation the resulting regions do not correspond directly to the
objects. To achieve complete segmentation, cooperation with higher processing levels that
use specific knowledge of the problem domain (i.e. an object model) is necessary. On the
other hand, to achieve partial segmentation, the image needs to be divided into separate
regions that are homogeneous with respect to a chosen property or feature such as
brightness, color or texture. According to Sonka, Hlavac and Boyle (1999), approaches to
segmentation can be based on global knowledge, edges or regions. Lighting issues, image
noise and computation cost are some of the major issues in image segmentation.
From the literature, two groups of segmentation can be found: image segmentation,
which uses a still image as its input, and video segmentation, which uses a joint sequence
of still images, or video images, as its input. According to Umbaugh (1998), the goal of
image segmentation is to help higher-level processing, such as recognition, by locating the
position of objects of interest within the image. For example, in the practical application
of medical imaging (see Figure 1.2), image segmentation is used to simplify the
appearance of the original medical image so that the locations of the bones and kidneys
can be identified. Figure 1.2(a) shows the original medical image consisting of bones and
kidneys, while Figure 1.2(b) shows image segmentation performed on the original medical
image, revealing the positions of the bones and kidneys. On the other hand, video
segmentation can be further divided into shot-based and object-based (Liu & Fan, 2005).
Shot-based segmentation uses a set of key-frames to represent a video shot (see Figure
1.3(a)), while object-based segmentation partitions a video shot into objects and
background (see Figure 1.3(b)). Figure 1.3(a) shows video annotation (i.e. shot-based
video segmentation), while Figure 1.3(b) shows tracking of a moving object in the scene
(i.e. object-based video segmentation).
5
(a) (b)
Figure 1.2: Application of image segmentation in medical imaging (Figure adopted from Mannan, 2001)
(a) (b)
Figure 1.3: Video segmentation applications (Figure adopted from YouTube Korea Blog, 2008 and Mitsubishi Electric
Research Laboratory, 2005)
In particular, the goal of object-based video segmentation is to locate and track the
position of objects of interest within the video image sequence, so that subsequent
information regarding the tracked objects can be provided (Wang et al., 2004; Liu & Fan,
2005; Hongtu, Viktor & Hakan, 2006). With such capabilities, the result is a fully or
semi-automated system, potentially cutting the cost of the human resources observing the
output from cameras (Hongtu, Viktor & Hakan, 2006). Nowadays, object-based video
segmentation applications have gained much interest in our society (Mani, 2003; Wang
et al., 2004; Liu & Fan, 2005). Object tracking systems, for example, are among the many
examples of object-based video segmentation applications (Bruce, Balch & Veloso, 2000;
Wang et al., 2005).
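The locate-and-track goal above can be sketched, under strong simplifying assumptions (a static camera and purely brightness-based motion), with frame differencing between consecutive video frames. This is an illustrative baseline only, not the technique developed in this thesis; the function name and threshold are chosen here for demonstration.

```python
import numpy as np

def moving_object_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose brightness changed by more than `threshold`
    between two consecutive grayscale frames as moving-object pixels."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8)

prev_frame = np.zeros((3, 3), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[1, 1] = 200          # one "object" pixel brightens between frames
print(moving_object_mask(prev_frame, curr_frame))
```

A practical tracker would then group the marked pixels into connected regions (the role played by CCL in this thesis) and associate regions across frames.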
1.4 OBJECT-BASED VIDEO SEGMENTATION
An object-based video segmentation application requires the detection and
recognition of objects of interest from the video images, the tracking of such objects, and
the analysis of object tracks to recognize their behavior, in order to achieve the goal of the
system (Yilmaz, Javed & Shah, 2006). Features related to the object of interest, such as
color, texture, motion and geometric properties, are commonly used to detect and
recognize objects of interest within the video image (Khan & Shah, 2001; Mani, 2003).
But according to Yilmaz, Javed and Shah (2006), object tracking in general is a
challenging problem due to abrupt object motion, changing appearance patterns of both
the object and the scene, non-rigid object structures, object-to-object and object-to-scene
occlusions, and camera motion. Typically, assumptions are made to constrain the tracking
problem in the context of a particular application, such as assuming that the motion of
the object is smooth with no abrupt changes, and/or giving prior knowledge about the
number, size, shape or appearance of the objects to constrain the object motion.
7
The choice of representation used in video processing is fundamental (Reed, 2005),
because a proper representation makes the features of interest apparent and significantly
facilitates the operations that follow. Time is another important factor in object-based
video segmentation applications (Ganssle, 2000). A real-time system requires its
processing tasks to be completed within a specified timeline; therefore, great computing
power and knowledge of programming with video images in a real-time manner must
exist to cope with the time challenge.
Consequently, it can be seen that there is much work to be done to cope with the
challenges of object-based video segmentation applications. Coping with these challenges
is necessary because object-based video segmentation applications continue to grow in our
society, as they offer remarkable advantages to human beings. Some examples of
object-based video segmentation applications nowadays are pedestrian and vehicle
tracking and surveillance, bubble tracking and soccer player tracking (Litzenberger
et al., 2006), advanced driver-assistance systems (de la Escalera et al., 2004) and face
detection and recognition systems (Ping, 2008).
1.5 COLOR SEGMENTATION
According to Gonzalez and Wood (2002) and Lucchese and Mitra (2001), color
segmentation is a process that partitions an image into regions based on color. It identifies
color using any available color space/model and performs color segmentation using any
available color segmentation algorithm. According to Deng, Manjunath and Shin (1999), color
segmentation has been used by many object-based video segmentation applications because
it can lead to the identification of regions of interest and objects in the scene, which is very
beneficial to the subsequent image analysis or annotation. Gavrila (1999) also noted
that color can provide a very immediate focusing mechanism for detecting objects, since
every object has color information. The road sign detection and recognition application, for
example, has used color segmentation as the method to detect the position of a road
sign in the image (Malik, Khurshid & Ahmad, 2007). Consequently, color is one of the many
important low-level features used for content-based retrieval of information in images and
videos (Lucchese & Mitra, 2001), and it has been used as a powerful descriptor that can
simplify object identification and extraction from a scene (Gonzalez & Wood, 2002).
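As a minimal illustrative sketch of this idea (not the implementation used in this thesis), the following Python function performs the simplest form of color segmentation: thresholding in RGB space. The image representation (a grid of RGB tuples) and the threshold values are assumptions chosen purely for illustration; practical systems often threshold in other color spaces such as HSV to reduce sensitivity to illumination.

```python
def color_mask(image, lower, upper):
    """Return a binary mask marking pixels whose RGB channels all fall
    within the inclusive per-channel range [lower, upper].

    image: H x W grid (nested lists) of (R, G, B) tuples.
    lower, upper: (R, G, B) threshold bounds (illustrative values only).
    """
    return [[1 if all(lo <= ch <= hi
                      for ch, lo, hi in zip(px, lower, upper)) else 0
             for px in row]
            for row in image]

# Example: isolate strongly red pixels (hypothetical thresholds)
frame = [[(200, 20, 30), (10, 10, 10)],
         [(180, 40, 50), (250, 5, 5)]]
mask = color_mask(frame, lower=(150, 0, 0), upper=(255, 80, 80))
```

The resulting binary mask identifies candidate regions (e.g. red road-sign areas), which a subsequent stage can then group into connected regions and analyze.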
In the state of the art, color segmentation faces various challenges in its process.
The spectral power distribution of the illuminant and the surface reflectance properties of
the object have become the classical challenges of color segmentation (Yilmaz, Javed & Shah,
2006). For example, changes in weather conditions during the day (e.g. sunny or
cloudy days) often cause illumination changes that can affect the appearance of an object's
color (Estevez & Kehtarnavaz, 1996; Miura, Kanda & Shirai, 2000; Malik, Khurshid &
Ahmad, 2007). They can also make it difficult for the sensing device to determine regions
with the desired color (Benallal & Meunier, 2003; Nguwi & Kouzani, 2006a). Color
segmentation must also be able to segment the correct region under conditions such as deteriorated
objects and objects under shadows (de la Escalera et al., 2004; Malik, Khurshid & Ahmad,
2007; Vicen-Bueno et al., 2008). Another challenge for color segmentation is that
many objects have colors similar to the color of the object of interest (Ghica, Si & Yuan,
1996; Miura, Kanda & Shirai, 2000; Fang, Chen & Fuh, 2003). The object of interest may
appear in a fixed location, including behind other objects, which makes it difficult to segment
the whole object based on color information alone (Fleyeh, 2006). Additionally, color
segmentation is a computationally expensive task (Volker, Raimund & Lutz, 1995), and the
color spaces used in color segmentation are sensitive to noise (Yilmaz, Javed & Shah,
2006).
Consequently, it can be seen that much work remains to be done to cope with
the challenges in color segmentation. Coping with these challenges
is necessary because color is a powerful descriptor that can simplify object identification
and extraction from a scene. "Color Segmentation Robust to Brightness Variations by Using
B-spline Curve Modeling" by Kim, You and Kim (2006), for example, is one of many
works addressing the challenges in the color segmentation operation.
1.6 CONNECTED COMPONENT LABELING
Introduced in 1966 by Rosenfeld and Pfaltz, CCL is an approach that labels
regions within an image with unique labels. The region labeling technique involves
assigning a label to pixels of the same connected component of a region, ensuring that pixels
within the same connected component share the same label while those in different connected
components have different labels. With the region labeling result, the boundaries of objects
and components of regions can be established, and the number of blobs in the image can be
counted (Park, Looney & Chen, 2000). Higher-level image analysis can also use the region
labeling result to identify objects in the image (Ito & Nakano, 2008). Some CCL applications
can be found in image-based applications such as fingerprint identification, character
recognition, automated inspection, target recognition, face identification, medical image
analysis and computer-aided diagnosis (He, Chao & Suzuki, 2008).
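The labeling idea described above can be sketched in a few lines of Python. This is an illustrative flood-fill (breadth-first) labeler with 4-connectivity, not the classical two-pass algorithm of Rosenfeld and Pfaltz nor the implementation used in this thesis; the binary-image representation as nested lists is an assumption made for the sketch.

```python
from collections import deque

def label_components(binary):
    """Label the 4-connected foreground components of a binary image.

    binary: H x W grid (nested lists) of 0/1 values.
    Returns (labels, count): labels is a same-sized grid in which every
    pixel of one connected region carries the same positive integer, and
    count is the number of regions (blobs) found.
    """
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and labels[r][c] == 0:
                next_label += 1                 # start a new region
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:                    # breadth-first flood fill
                    y, x = queue.popleft()
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label
```

Running this on a small mask with two separate foreground regions, for example, yields a count of two blobs, and each blob's pixels share one label, which is exactly the property that later stages use to establish object boundaries and count candidate regions.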