Ensemble Color Segmentation, Spring 2009, Ben-Gurion University of the Negev
TRANSCRIPT
Sensor Fusion Spring 2009
Ensemble Color Image Segmentation
The performance of present-day image segmentation algorithms is low. The aim is to improve their performance by
running the segmentation algorithm on an ensemble of images formed by transforming the input RGB image into different color spaces.
Color Spaces
Four families:
Primary systems: RGB, XYZ, rgb
Luminance-Chrominance: Luv, Lab
Perceptual systems: HSI, HSV, HLS, IHLS
Statistical: I1I2I3, H1H2H3
K-means Clustering
We segment each color-space representation using the K-means clustering algorithm: Each pixel (m,n) in the input image is characterized by a color vector. In RGB
space this is c(m,n) = (R(m,n), G(m,n), B(m,n)).
Select K centers (centroids) in the given color space.
Iterate until convergence. For this process we require a cost function ("distance") between a centroid
and a pixel. For RGB this is easy: there is no difference between the axes and we can use the
Euclidean distance:
d(c(m,n), c_k) = sqrt( (R(m,n) - R_k)^2 + (G(m,n) - G_k)^2 + (B(m,n) - B_k)^2 )
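A minimal sketch of this RGB K-means segmentation (illustrative code, not from the lecture; the farthest-point initialization is an added assumption so the K centers start well spread):

```python
import numpy as np

def kmeans_segment(image, K, n_iter=20, seed=0):
    """Segment an (M, N, 3) RGB image into K clusters; returns an
    (M, N) label map. Plain Euclidean distance, as on the slide."""
    M, N, C = image.shape
    pixels = image.reshape(-1, C).astype(float)
    rng = np.random.default_rng(seed)
    # First centroid at random, the rest by farthest-point (assumption)
    centroids = [pixels[rng.integers(len(pixels))]]
    for _ in range(1, K):
        d = np.min([np.linalg.norm(pixels - c, axis=1) for c in centroids], axis=0)
        centroids.append(pixels[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        # Assign each pixel to the nearest centroid (Euclidean distance)
        d = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Move each centroid to the mean of its assigned pixels
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = pixels[labels == k].mean(axis=0)
    return labels.reshape(M, N)
```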
K-means Clustering
For other spaces we cannot use the Euclidean distance.
E.g. for HSV space, hue is measured in angles and is a circular variable. One approach is to establish a window W(m,n) at each pixel. In each window we create a histogram of (H,S,V) values. We do this by quantizing the hue, saturation and value into small
quantized units and counting how many pixels have an (H,S,V) value which falls in a given quantized unit.
Histogram of HSV values
In each window create a histogram of (H,S,V) values by quantizing H, S and V and counting how many
pixels have an (H,S,V) value which falls in a given quantized unit.
Range of hue is [0-360 deg]. Quantize H in units of 60 deg, i.e. 6 units.
Range of saturation is [0-255]. Quantize S in units of 100, i.e. 3 units.
Range of value is [0-255]. Quantize V in units of 150, i.e. 2 units.
Total # quantization units for (H,S,V) = 6x3x2 = 36 units, i.e. Hist is [H1 H2 ... H36]
where H1 = no. of pixels with 0<=H<60, 0<=S<100, 0<=V<150
H2 = no. of pixels with 60<=H<120, 0<=S<100, 0<=V<150
...
H36 = no. of pixels with 300<=H<=360, 200<=S<=255, 150<=V<=255
Histogram of HSV values
Example: use a 3x3 window. At a given pixel the H, S and V planes are as shown.
Hist is [H1 H2 ... H36]
e.g. H1 = no. of pixels with 0<=H<60, 0<=S<100, 0<=V<150,
i.e. H1 = 4 pixels
e.g. H2 = no. of pixels with 60<=H<120, 0<=S<100, 0<=V<150,
i.e. H2 = 2 pixels
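The quantization just described can be sketched as follows (the bin ordering follows the slide, with hue varying fastest; the function name is illustrative):

```python
import numpy as np

def hsv_window_histogram(H, S, V):
    """H, S, V: equal-shaped arrays for one window (H in degrees,
    S and V in [0, 255]). Returns the flat 36-bin count histogram
    [H1 ... H36] with 6 hue x 3 saturation x 2 value bins."""
    h_bin = np.clip(np.asarray(H, int) // 60, 0, 5)    # 6 bins of 60 deg
    s_bin = np.clip(np.asarray(S, int) // 100, 0, 2)   # 3 bins of 100
    v_bin = np.clip(np.asarray(V, int) // 150, 0, 1)   # 2 bins of 150
    hist = np.zeros((2, 3, 6), dtype=int)   # (value, saturation, hue)
    np.add.at(hist, (v_bin, s_bin, h_bin), 1)
    return hist.ravel()   # hue varies fastest: H1, H2, ... as on the slide
```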
K-means Clustering
By converting the HSV image into a 3-dimensional histogram cube we can now perform K-means clustering. In this case we use a distance measure based on matching histograms, e.g. the chi-squared distance, the Kullback-Leibler distance, etc.
The result of the K-means clustering is that we now have an ensemble of segmentation (cluster) images:
one obtained by clustering the image in RGB space,
one obtained by clustering the image in HSV space,
...
one obtained by clustering the image in I1I2I3 space, etc.
Technical language: We have an ensemble of segmented images. We now wish to fuse them together.
Important: All are obtained from the same input image. Therefore there is no need for spatial alignment. But
we do need semantic equivalence.
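The two histogram distances mentioned can be sketched as follows (the eps guard against empty bins is an added numerical assumption):

```python
import numpy as np

def chi2_distance(h1, h2, eps=1e-10):
    """Chi-squared distance between two count histograms."""
    h1 = np.asarray(h1, float)
    h2 = np.asarray(h2, float)
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def kl_distance(p, q, eps=1e-10):
    """Kullback-Leibler divergence between two histograms,
    normalized here so they act as probability distributions."""
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return np.sum(p * np.log(p / q))
```

Note the chi-squared distance is symmetric while the KL divergence is not, which matters when plugging either into K-means.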
K-means Clustering
We have segmented the input picture in 6 different ways, one for each color space.
K-means Ensemble Clustering
We wish to fuse together the 6 segmented images. However, the clusters are not semantically equivalent:
a pixel which has label #3 in one segmented image is not necessarily the same as label #3 in another.
In fact there may be no relationship between the labels: a label #3 in one segmented image
may have no equivalent label in another. In order to fuse the segmented images we must transform them into a
semantically equivalent space. There are three ways of doing this.
Semantic Equivalence: Co-Association Matrix
Convert each MxN segmented image into an MNxMN co-association matrix A, where A(i,j) = 1 if pixels i and j carry the same label and 0 otherwise.
Fuse the co-association matrices, e.g. form the mean co-association matrix,
and then find the corresponding cluster map. The problem with this is that it is not at all easy to go from a co-association
matrix to a corresponding cluster map. In order to do this we require the expected number of labels
(clusters) in the fused image.
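A sketch of the co-association construction and mean fusion (function names are illustrative):

```python
import numpy as np

def co_association(labels):
    """labels: flat length-MN label vector of one segmented image.
    Returns the MN x MN matrix with A[i, j] = 1 iff pixels i and j
    carry the same label (whatever the label's numeric value)."""
    labels = np.asarray(labels).ravel()
    return (labels[:, None] == labels[None, :]).astype(float)

def mean_co_association(label_maps):
    """Fuse an ensemble of segmentations by averaging their
    co-association matrices."""
    return np.mean([co_association(l) for l in label_maps], axis=0)
```

Because only "same label or not" enters the matrix, two segmentations that agree up to a relabeling produce identical co-association matrices, which is exactly the semantic-equivalence property needed here.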
Semantic Equivalence: Co-Association Matrix
The rank of a co-association matrix is equal to the number of clusters, i.e. labels, in the segmented image. The nonzero eigenvalues are equal to the number of samples in each
label.
Example:
Eigenvalues e = (3 3 1), i.e. 3 labels, where two labels have three
samples and one label has 1 sample.
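This eigenvalue property can be checked numerically; the 7-pixel label vector below is a hypothetical example chosen to be consistent with the eigenvalues quoted above:

```python
import numpy as np

# Hypothetical label vector: two labels with 3 samples, one with 1.
labels = np.array([1, 1, 1, 2, 2, 2, 3])
A = (labels[:, None] == labels[None, :]).astype(float)   # co-association

rank = np.linalg.matrix_rank(A)                # = number of labels
e = np.sort(np.linalg.eigvalsh(A))[::-1]       # eigenvalues, descending
# The leading eigenvalues are the cluster sizes (3, 3, 1); the rest are 0.
```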
Semantic Equivalence: Co-Association Matrix
The above properties hold only approximately for the mean co-association matrix. Why? Because the individual segmentations do not agree exactly, so averaging their co-association matrices blurs the block structure.
Assuming the properties hold, we can estimate the number of labels in the fused image
as the number of eigenvalues whose values are greater than (say) 2. Given this number we find the fused cluster map using a spectral clustering algorithm.
Semantic Equivalence: Hypothesize
The second approach to fusing the segmented images, and thus finding the fused
image, is to hypothesize as follows: suppose we hypothesize a given fused segmentation. How do we know it is a good solution? Simple: measure the average
error between the hypothesis and the individual segmented images.
But we cannot take this difference directly because the segmented images are not semantically equivalent.
So we convert the hypothesis and each segmented image into co-association matrices and compare those instead.
The main problem is hypothesizing the fused segmentation, but there are optimization algorithms for this.
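A sketch of the hypothesis-scoring step; the error measure here (mean absolute difference of co-association matrices) is one reasonable choice, not necessarily the lecture's:

```python
import numpy as np

def coassoc(labels):
    labels = np.asarray(labels).ravel()
    return (labels[:, None] == labels[None, :]).astype(float)

def hypothesis_error(hypothesis, segmentations):
    """Average disagreement between the hypothesized segmentation's
    co-association matrix and those of the ensemble. Comparing
    co-association matrices makes the score independent of how
    each segmentation happens to number its labels."""
    H = coassoc(hypothesis)
    return float(np.mean([np.abs(H - coassoc(s)).mean()
                          for s in segmentations]))
```

An optimizer would then search over candidate label maps for the one minimizing this error.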
Semantic Equivalence: Concatenated Histogram
The third alternative is to use the idea of a concatenated histogram. This time we create a local histogram of labels for each segmented image.
Example: Suppose we have a given segmented image where
the number of labels is equal to the number of clusters K we used in the K-means clustering algorithm.
Consider a pixel (m,n) and its 8 surrounding neighbors.
We may characterize this pixel by the histogram of its local labels: a K-bin vector counting how many of the 9 pixels in the 3x3 window carry each label.
Semantic Equivalence: Concatenated Histogram
We now concatenate all the histograms together to form a "fused" histogram. Thus for a given pixel (m,n) we concatenate its local label histograms from all 6 segmented images.
Given the concatenated histogram we can simply cluster it using the K-means algorithm as before.
Important: By concatenating we have side-stepped the problem that the individual histograms are not semantically equivalent.
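A sketch of the concatenated-histogram construction (3x3 windows; the function names and edge-replication padding are illustrative assumptions). The resulting per-pixel feature vectors can be fed to the same K-means routine used earlier:

```python
import numpy as np

def local_label_histograms(label_map, K):
    """For each pixel of an (M, N) label map with labels 0..K-1,
    count the labels in its 3x3 neighborhood (edges padded by
    replication). Returns an (M, N, K) array of counts."""
    M, N = label_map.shape
    padded = np.pad(label_map, 1, mode='edge')
    hists = np.zeros((M, N, K), dtype=int)
    for dm in range(3):            # slide the 3x3 window
        for dn in range(3):
            window = padded[dm:dm + M, dn:dn + N]
            for k in range(K):
                hists[:, :, k] += (window == k)
    return hists

def concatenated_histograms(label_maps, Ks):
    """Concatenate the per-space local histograms into one feature
    vector per pixel: an (MN, K1+...+Kn) matrix ready for K-means."""
    feats = [local_label_histograms(l, k).reshape(-1, k)
             for l, k in zip(label_maps, Ks)]
    return np.concatenate(feats, axis=1)
```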