1 image databases conventional relational databases, the user types in a query and obtains an...
Post on 19-Dec-2015
226 views
TRANSCRIPT
11
Image Databases Conventional relational databases, the user types in a query
and obtains an answer in response It is different in image databases
a police officer may issue a query: “ Retrieve all pictures from the image database that are “similar” to this person and give the identities of the people.”
This query is fundamentally different from ordinary queries for 2 reasons: 1. The query includes a picture as part of the query 2. The query asks about similar pictures and therefore uses a
notion of “imprecise match”
22
Raw images the content of an image consists of all “interesting” objects
in that image each object is characterized by
a shape descriptor: that describes the shape/location of the region within which the object is located inside the image
a property descriptor that describes the properties of individual pixels (e.g. RGB values of the pixel, RGB values aggregated over a group of pixels, grayscale levels)
a property consists of a property name, e.g., red, green, blue, texture a property domain - range of values that a property can assume {0, 1, ..7}
33
Images Every image is associated with a pair of positive integers
(m,n), called grid-resolution, which divides the image into (mn) cells of equal size (called image grid)
Each cell consists of a collection of pixels A cell property: (Name, Values, Method)
Example (bwcolor, {b,w}, bwalgo}, where the possible values are b(black) and
w(white), and bwalgo is an algorithm that takes a cell as an input and returns either black or white by somehow combining the black/white levels of the pixels in the cell
(graylevel, [0,1], grayalgo), where the possible values are real numbers within the interval [0,1].
44
Image Database Image Database: (GI,Prop,Rec)
GI is a set of gridded images (Image,m,n) Prop is a set of cell properties Rec is a mapping that associates with each image, a set of
rectangles denoting objects (in fact this does not necessarily have to be rectangle)
55
Problems with image databases Images are often very large
infeasible to explicitly store the properties on a pixel by pixel basis
This led to a family of image “compression” techniques: attempt to compress the image into one containing fewer pixels
There is a need to determine the “features” of the image (compressed or raw) done by “segmentation” : breaking up the image into a set of
homogeneous rectangular regions called segments Need to support “match” operations that compare either a
whole image or a segmented image against another
66
Image Compression Lossy Compression
Image may contain details that human eye cannot recognize get rid of those details
DCT(Discrete Cosine Transform) DFT(Discrete Fourier Transform) DWT(Discrete Wavelet Transform)
• convert images from time domain(Spatial) to frequency domain• get rid of the frequencies which do not contain information.
Transforms DCT and DFT are similar concepts
From time domain to signal domain Given a signal of length “n”, these transforms return a sequence of n
frequencies. • Sample1, Sample2, . . . . . . . , Sample n transforms to :• Freq1, Freq2, . . . . . . . . . , Freq n.
77
Why do we use the transform Noise removal is easier in the frequency domain Various filters are easier to implement in frequency domain Compression (gathers similar values together)
88
Desirable Properties of Transforms DFT
Invertibility: It is possible to get back the original image I from its DFT representation. (useful for decompression)
Note: practical implementations of DFT often use DFT with other non-invertible operations: thus sacrifice invertibility
Distance preservation: DFT preserves Euclidean distance. This is important in image matching applications where we often use distance
measures to represent similarity levels
DCT DCT preserves all the above a given signal can be represented with fewer frequencies
DWT DFT and DCT have no temporal locality
a change in one single part of data changes all frequencies wavelets introduce locality
99
Distance preservation
1010
Distance preservation
1111
Fractal Compression Transform-based approaches benefit from the difference in
visual perception in different frequencies What else can we use for compression ?
Self similarity We can find self similarities in a given image and describe the image in
terms of these similarities.
1212
Fractal Compression
1313
Image Processing: Segmentation A process of taking an image as input and cutting up the
image into disjoint homogeneous regions Connected region (R):
is a set of cells C1 .. Cn in R such that the Euclidean distance between Ci and Ci+1 for all i < n is 1
Example R1,R2,R3 is connected R1 R2 is connected R2 R3 is connected R1R2 R3 is connected R1 R3 is not connected Because the Euclidian distance between (2,3) and (3,4) is 2>1
R3
R1R2
1 2 3 4
4
3
2 1
1414
Measuring Homogeneity Homogeneity predicate: is a function H that takes any
connected region as input and returns either true or false Example 1:
Suppose is some real number between 0 and 1 H
bw can be defined as Hbw (R) is true if over (100*)% of cells in
R have the same color
Region # of black #of white cells cellsR1 800 200R2 900 100R3 100 900
Region H0.8bw H0.89
bw H0.92bw
R1 true false falseR2 true true false R3 true true false
1515
Measuring Homogeneity Example 1:
Suppose each cell has a real value between 0, 1, this value is bw-level
Suppose f assigns a value between 0 and 1 to each cell Assume is the noise factor and a threshold H,f,(R) is true if {(x,y)| |bwlevel(x,y)-f(x,y)|< }/(mn) >
1616
Segmentation Given an image I with (mn) cells, a segmentation of I wrt
a homogeneity predicate P is a set of R1, .Rk such that Ri Rj = for all 1 i j k I = R1 .. Rk H(Ri) = true for all i j k for all distinct i,j, 1 I, j n such that Ri Rj is a connected
region, it is the case that H(Ri Rj) = false
1717
An Example of Segmentation
Row/Col 1 2 3 4 1 0.1 0.25 0.5 0.52 0.05 0.30 0.6 0.63 0.35 0.30 0.55 0.84 0.6 0.63 0.85 0.90
For Hdyn,0.03(R) of the following (44) image will yield the following segmentation R1 = {(1,1),(1,2)} R2 = {(1,3),(2,1),(2,2),(2,3)} R3 = {(3,1),(3,2),(3,3),(4,1),(4,2)} R4 = {(3,4),(4,3),(4,4)} R5 = {(1,4),(2,4)}
Row/Col 1 2 3 4 1 0.1 0.25 0.5 0.52 0.05 0.30 0.6 0.63 0.35 0.30 0.55 0.84 0.6 0.63 0.85 0.90
1818
Segmentation Algorithm Split:
if the whole image is homogeneous, we are done otherwise, split the image into two parts and recursively repeat
this process till we find a set of R1 .. Rn such that each region is homogeneous
Merge: check whether any of the Ri’s can be merged together at the end of this step, we obtain a valid segmentation R1, ..Rk
1919
Similarity Based Retrieval
2020
Similarity Based Retrieval
2121
Similarity Based Retrieval The Metric Approach:
Uses a distance measure d that can compare tow images The smaller the distance, the more similar they are I.e., given an input image I, find the “nearest neighbor” of I in the
image archive The Transformation Approach:
The metric approach assumes that the notion of similarity is fixed Whereas the transformation approach computes the cost of
transforming one image into another based on user-specified cost functions that may vary from one query to another
2222
The Metric Approach We define a distance function on a k dimensional space
(k=n+2) the distance function satisfies the following properties
d(x,y) = d(y,x) d(x,z) d(x,z) + d(z,y) d(x,x) = 0
Example: Let the image object consists of (256256) cells with 3 attributes (red,green,blue) each of which assumes a value from {0,…7} di(o1,o2) = (diffr[i,j]+diffg[i,j]+diffb[i,j]) where diffr[i,j] = (o1[i,j].red - o2[i,j].red)2
diffg[i,j] = (o1[i,j].green - o2[i,j].green)2
diffb[i,j] = (o1[i,j].blue - o2[i,j].blue)2
Such computations can be cumbersome (65536 expressions being computed inside the sum)
2323
The Metric Approach How can this massive similarity computation be avoided? Through feature extraction! Use a good feature extraction function fe and use it to map
objects into single points in a s-dimensional space where s would typically be pretty small compared to n+2
This leads to two reductions an object is originally is a set of points in an (n+2) dimensional
space. In contrast, fe(0) is a single point fe(o) is a point in s-dimensional space where s << (n+2)
The feature extraction mapping must preserve the distance relationships in the original space
(n+2) dim space s-dim space indexing algorithm index object repository (could be quadtree, R-tree for s-dim data)
2424
Searching Finding the best matches
find the nearest neighbors of fe(o) in the tree using a nearest neighbor search technique.
Finding sufficiently similar objects execute a range query in the tree with center fe(o) and radius
2525
The Transformation Approach The main principle
the level of dis-similarity between o1,o2 is proportional to the cost of transforming o1 into o2, or vice-versa
Transformation operators translation rotation scaling (uniform and nonuniform) excision
Transformation of o into o’ is a sequence of transformation operations (to1,to2, ..tor) such that to1(o) = o1 …... To(or) = o’ Cost of transformation, cost(TS) = cost(toi)
2626
Example
2727
Example
2828
Example
2929
Transformation vs. Metric Advantages of the transformation model
user can setup his own notion of similarity by specifying certain transformation operators
user may associate a cost function with each transformation operator
Advantages of the metric model by forcing the user to use only one similarity metric, the system
can facilitate the indexing of data so as to optimize nearest neighbor search