TRANSCRIPT
CS 556 – Computer Vision
Image Basics & Review
What is an Image?
• Image: a representation, resemblance, or likeness
• An image is a signal: a function carrying information
• Thus, an image is a function, f, from ℝ² to ℝ: f(x, y) gives the amount of some value at position (x, y)
In practice, an image is only defined over a finite rectangular domain and with a finite range:
f : [a, b] × [c, d] → [vmin, vmax]
Images as Functions
• An image is a signal: a function carrying information
• Functions have domains and ranges: f(ν)
  Domain examples: (t), (x, y), (x, y, t), (x, y, z), (x, y, z, t)
  Range examples: sound (air pressure), graylevel (light intensity), color (RGB, HSL), LANDSAT (7 bands)
Images as Functions
• A color image is a “vector-valued” function created by pasting three functions together:
  f(x, y) = [ r(x, y), g(x, y), b(x, y) ]
• Spatial image: function of two or three spatial dimensions
  f(x, y): images (grayscale, color, multi-spectral)
  f(x, y, z): medical scans or image volumes (CT, MRI)
• Spatio-temporal image: 2/3-D space, 1-D time
  f(x, y, t): videos, movies, animations
What do the Range Values Mean?
• May be visible light:
  Intensity (gray-level)
  Color (RGB)
• May be quantities we cannot sense:
  Radio waves (e.g., Doppler radar)
  Magnetic resonance
  Range images
  Ultrasound
  X-rays (e.g., CT)
What intensity does the value “213” represent?
Digital Images: Domains & Ranges
• The real world is analog: continuous domain and range
• Computers operate on digital (discrete) data
• Converting from continuous to discrete:
  Domains: selection of discrete points is called sampling
  Ranges: selection of discrete values is called quantization
[Figure: domain sampling and range quantization]
Digital Image Formation
• To create a digital image:
  Sample the 2-D space on a regular grid
  Quantize each sample (round to nearest integer)
• If the samples are Δ apart, we can write this as:
  f[i, j] = Quantize{ f(iΔ, jΔ) }
• The image can now be represented as a matrix of integer values
62 79 23 119 120 105 4 0
10 10 9 62 12 78 34 0
10 58 197 46 46 0 0 48
176 135 5 188 191 68 0 49
2 1 1 39 26 37 0 77
0 89 144 147 187 102 62 208
255 252 0 166 123 62 0 208
166 63 127 17 1 0 99 30
(rows indexed by i, columns by j)
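The sampling-and-quantization process above can be sketched as follows; the continuous function f below is a hypothetical stand-in for an analog scene, and the grid spacing delta is an arbitrary example value:

```python
import numpy as np

# Digital image formation: sample a continuous function f(x, y) on a
# regular grid with spacing delta, then quantize each sample by rounding
# to the nearest integer. f is a made-up analog scene for illustration.
def f(x, y):
    return 127.5 * (1 + np.sin(x) * np.cos(y))   # values in [0, 255]

delta = 0.5                                       # assumed grid spacing
i, j = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
samples = f(i * delta, j * delta)                 # sampling the domain
digital = np.round(samples).astype(int)           # quantizing the range

print(digital.shape)   # (8, 8)
```

The result is exactly the matrix-of-integers representation described above.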
Resolution
• Ability to discern detail – both domain & range
• Not simply the number of samples/pixels
• Determined by the averaging or spreading of information when sampled or reconstructed
Apertures
• Point measurements are impossible
• Have to make measurements using a (weighted) average over some aperture: time window, spatial area, etc.
• Size determines resolution: smaller → better resolution; larger → worse resolution
Apertures
Lenses allow a physically larger aperture with an effectively smaller one
[Figure: lens focusing light from a large physical aperture onto a sensor, yielding a smaller effective aperture]
Image Transformations
An image processing operation typically defines a new image g in terms of an existing image f
• We can transform either the domain or the range of f
• Range transformation (a.k.a. level operations):
  g(x, y) = t( f(x, y))
What kinds of operations can this perform?
Image Transformations
Some operations preserve the range but change the domain of f (a.k.a. geometric operations):
g(x, y) = f (tx(x, y), ty(x, y))
What kinds of operations can this perform?
• Many image transforms operate both on the domain and the range
Linear Transforms
A general and very useful class of transforms are linear transforms
Properties of linear transforms:
  Multiplying input f(x) by a constant value multiplies the output by the same constant:
    t(a f(x)) = a t( f(x))
Adding two inputs causes corresponding outputs to add:
t( f(x) + h(x)) = t( f(x)) + t(h(x))
Linearity: the transform t is linear iff
t(a f(x) + b h(x)) = a t( f(x)) + b t(h(x))
Linear Transforms
A linear transform of a discrete signal/image f can be defined by a matrix M using matrix multiplication:

  g[i] = Σ_{j=0}^{n−1} M[i, j] f[j]

Note that matrix and vector indices start at 0 instead of 1
Does M(a f + b h) = a M f + b M h?
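The question above can be checked numerically; M, f, h, a, and b below are arbitrary example values chosen for illustration:

```python
import numpy as np

# Checking linearity of a matrix transform: does M(a f + b h) equal
# a M f + b M h? For matrix multiplication this always holds, because
# matrix-vector products distribute over scaled sums.
rng = np.random.default_rng(0)
M = rng.integers(-3, 4, size=(8, 8)).astype(float)
f = rng.integers(0, 8, size=8).astype(float)
h = rng.integers(0, 8, size=8).astype(float)
a, b = 2.0, -3.0

lhs = M @ (a * f + b * h)
rhs = a * (M @ f) + b * (M @ h)
print(np.allclose(lhs, rhs))   # True
```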
Linear Transforms: Examples
Let’s start with a discrete 1-D image (a “signal”):
  f[x] = [4, 4, 0, 0, 2, 4, 6, 6]
[Figure: bar plot of f[x] versus x]
Linear Transforms: Examples
Identity transform:
  M = I (the 8×8 identity matrix)
  M f = I f = f
  g = [4, 4, 0, 0, 2, 4, 6, 6]
Linear Transforms: Examples
Scale:
  M = a I (a times the identity)
  M f = a I f = a f
  g = [4a, 4a, 0, 0, 2a, 4a, 6a, 6a]
Linear Transforms: Examples
Shift (translate):
  M = the identity matrix with its ones moved two places below the diagonal
  g[x] = f[x − 2]
  g = [0, 0, 4, 4, 0, 0, 2, 4]
Linear Transforms: Examples
Derivative (finite difference):
  M has rows with the pattern [−1 1] along the diagonal (first row all zeros)
  g[x] = f[x] − f[x − 1]
  g = [0, 0, −4, 0, 2, 2, 2, 0]
Linear Transforms: Examples
The transformation matrix doesn’t have to be square
  M = (1/2) × a 4×8 matrix whose rows sum adjacent pairs:
    [ 1 1 0 0 0 0 0 0 ]
    [ 0 0 1 1 0 0 0 0 ]
    [ 0 0 0 0 1 1 0 0 ]
    [ 0 0 0 0 0 0 1 1 ]
  g[i] = ( f[2i] + f[2i + 1] ) / 2 (averaging/downsampling)
  g = [4, 0, 3, 6]
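The example transforms above can be reproduced directly, building each matrix M and applying it to the slide’s signal f = [4, 4, 0, 0, 2, 4, 6, 6]:

```python
import numpy as np

# The slide's example signal.
f = np.array([4, 4, 0, 0, 2, 4, 6, 6], dtype=float)
n = len(f)

# Shift right by two samples: ones on the second subdiagonal.
M_shift = np.eye(n, k=-2)

# Finite difference g[x] = f[x] - f[x-1]; first row left as zeros,
# matching the slide.
M_deriv = np.eye(n) - np.eye(n, k=-1)
M_deriv[0, :] = 0

# Non-square 4x8 matrix that averages adjacent pairs (downsampling).
M_down = np.zeros((4, n))
for i in range(4):
    M_down[i, 2 * i] = M_down[i, 2 * i + 1] = 0.5

print(M_shift @ f)   # [0. 0. 4. 4. 0. 0. 2. 4.]
print(M_deriv @ f)   # [ 0.  0. -4.  0.  2.  2.  2.  0.]
print(M_down @ f)    # [4. 0. 3. 6.]
```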
Fourier Transform
One important linear transform is the Fourier transform
• Basic idea: any function can be written as the sum of (complex-valued) sinusoids of different frequencies
• Euler’s equation: e^{i2πsx} = cos(2πsx) + i sin(2πsx)
  Note: i is the imaginary number √−1
• To get the weights (amount of each frequency):

  F[s] = (1/n) Σ_{x=0}^{n−1} f[x] e^{−i2πsx/n}
Fourier Transform
In matrix form, F = (1/n) W f, where the matrix W has entries W[s, x] = W_n^{sx}:

  W =
    [ 1  1          1            ...  1              ]
    [ 1  W_n        W_n^2        ...  W_n^(n−1)      ]
    [ 1  W_n^2      W_n^4        ...  W_n^(2(n−1))   ]
    [ ...                                            ]
    [ 1  W_n^(n−1)  W_n^(2(n−1)) ...  W_n^((n−1)^2)  ]

where W_n = e^{−i2π/n}
The frequency increases with the row number
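Building the DFT matrix from this definition and comparing against NumPy’s FFT confirms they agree (NumPy omits the 1/n factor, so its result is divided by n here):

```python
import numpy as np

# Construct the DFT matrix W with entries W[s, x] = e^{-i 2 pi s x / n}
# and apply it to the slide's example signal.
f = np.array([4, 4, 0, 0, 2, 4, 6, 6], dtype=float)
n = len(f)

s, x = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
W = np.exp(-2j * np.pi * s * x / n)

F = (W @ f) / n
print(np.allclose(F, np.fft.fft(f) / n))   # True
```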
Linear Shift-Invariant Transforms
A special class of linear transforms are shift invariant
• Shift invariance: an operation is invariant to translation
• Implication: shifting the input produces the same output with an equal shift:
  if g(x) = t( f(x))
  then t( f(x + x0)) = g(x + x0)
Filters
Filter: linear, shift-invariant transform
Often applied to operations that are not technically filters (e.g., median “filter”)
Transformation matrix M:
  Shifted copy of some pattern applied to each row
  Pattern is (usually) centered on (or near) the diagonal
  Pattern is called a filter, kernel, or mask and is represented by a vector h

  M =
    [ * * 0 0 0 0 0 0 ]
    [ a b c 0 0 0 0 0 ]
    [ 0 a b c 0 0 0 0 ]
    [ 0 0 a b c 0 0 0 ]
    [ 0 0 0 a b c 0 0 ]
    [ 0 0 0 0 a b c 0 ]
    [ 0 0 0 0 0 a b c ]
    [ 0 0 0 0 0 0 * * ]

  h[x] = [a b c]
Filters
Filter operations can be written (for a kernel size of 2k + 1) as:

  g[x] = Σ_{j=−k}^{k} h[j] f[x + j]

Assumes negative kernel indices. . . Actual implementation may need to use h[j + k] instead of h[j]
• Can think of it as a dot (or inner) product of h with a portion of f
• Since 2k + 1 is often much less than n, this computation is more efficient than the full matrix multiplication (it ignores summing terms that are multiplied by 0)
Cross-Correlation & Convolution
Filtering operations come in two (very similar) types:
Cross-correlation (already seen):

  g[x] = h[x] ⊗ f[x] = Σ_{j=−k}^{k} h[j] f[x + j]

Convolution:

  g[x] = h[x] * f[x] = Σ_{j=−k}^{k} h[j] f[x − j]

Convolution is cross-correlation where either the kernel or signal is flipped first
How do the results differ for cross-correlation and convolution if the kernel is symmetric? Anti-symmetric?
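The symmetric/anti-symmetric question above can be answered empirically; the two small kernels below are example values:

```python
import numpy as np

# Cross-correlation sums h[j] f[x + j]; convolution flips the kernel
# and sums h[j] f[x - j]. For a symmetric kernel the two agree; for an
# anti-symmetric kernel they differ only by a sign.
f = np.array([4, 4, 0, 0, 2, 4, 6, 6], dtype=float)

h_sym  = np.array([1.0, 2.0, 1.0])    # symmetric example kernel
h_anti = np.array([-1.0, 0.0, 1.0])   # anti-symmetric example kernel

corr_sym = np.correlate(f, h_sym, mode="same")
conv_sym = np.convolve(f, h_sym, mode="same")
print(np.allclose(corr_sym, conv_sym))       # True

corr_anti = np.correlate(f, h_anti, mode="same")
conv_anti = np.convolve(f, h_anti, mode="same")
print(np.allclose(corr_anti, -conv_anti))    # True
```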
2-D Linear Transforms
A 2-D discrete image (in matrix form) can form a 1-D vector by concatenating the rows into one long vector:

  f̂[x] = f[x / n, x % n]

However, it is usually easier to think about it in terms of the computation for an individual value of g[u, v]:

  g[u, v] = Σ_{x=0}^{n−1} Σ_{y=0}^{m−1} w_{u,v}[x, y] f[x, y]
2-D Transforms: Fourier Transform
The 2-D discrete Fourier Transform is given by:

  F[u, v] = (1/nm) Σ_{x=0}^{n−1} Σ_{y=0}^{m−1} f[x, y] e^{−i2π(ux/n + vy/m)}

where the weight values w_{u,v} have been replaced with

  w_{u,v}[x, y] = (1/nm) e^{−i2π(ux/n + vy/m)}
Fourier Transform: Examples
[Figure: example images and their Fourier transforms]
2-D Filtering
A 2-D image f[x, y] can be filtered by convolving (or cross-correlating) it with a 2-D kernel h[x, y] to produce an output image g[x, y]:

  g[x, y] = Σ_{i=−k}^{k} Σ_{j=−k}^{k} h[i, j] f[x + i, y + j]

As with the 1-D case, actual implementation may need to use h[i + k, j + k] instead of h[i, j] to adjust for negative indices
• Filtering is useful for many reasons such as noise reduction and edge detection
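The 2-D filtering formula, with the h[i + k, j + k] index adjustment, can be implemented directly; the single-bright-pixel test image and mean kernel below are example inputs:

```python
import numpy as np

# Direct implementation of 2-D cross-correlation:
#   g[x, y] = sum over i, j of h[i, j] * f[x + i, y + j],
# using h[i + k, j + k] to handle negative kernel indices, with
# zero-padding at the borders.
def filter2d(f, h):
    k = h.shape[0] // 2                 # kernel size is 2k + 1
    g = np.zeros_like(f, dtype=float)
    fp = np.pad(f, k)                   # zero-pad the borders
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            for i in range(-k, k + 1):
                for j in range(-k, k + 1):
                    g[x, y] += h[i + k, j + k] * fp[x + i + k, y + j + k]
    return g

f = np.zeros((5, 5)); f[2, 2] = 9.0     # a single bright pixel
h = np.ones((3, 3)) / 9.0               # 3x3 mean kernel
g = filter2d(f, h)                       # spreads the 9 into a 3x3 block of ~1s
```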
Noise
• Unavoidable/undesirable fluctuation from the “correct” value: the nemesis of signal/image processing and computer vision
• Usually random: modeled as a statistical distribution
  Mean (μ) at the “correct” value
  Measured sample varies from μ according to the distribution’s standard deviation (σ)
• Signal-to-Noise Ratio (SNR) = μ/σ:
  Measures how “noise free” the acquired signal is
  “Signal” can refer to absolute or relative value
Noise
Filtering is useful for noise reduction. . .
• Common types of noise:
  Salt and pepper: random occurrences of black and white pixels
  Impulse: random occurrences of white pixels
  Gaussian: intensity variations drawn from a normal distribution
What kind of filter (i.e., kernel) reduces noise? Why?
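The noise models above can be sketched on a constant gray image; the image size, gray level, σ, and 5% corruption rate are assumed example values:

```python
import numpy as np

# Gaussian noise adds a normal perturbation to every pixel; salt and
# pepper replaces a random fraction of pixels with 0 or 255.
rng = np.random.default_rng(42)
clean = np.full((64, 64), 128.0)        # constant gray "correct" image

gaussian = clean + rng.normal(0.0, 10.0, clean.shape)   # sigma = 10

salt_pepper = clean.copy()
mask = rng.random(clean.shape)
salt_pepper[mask < 0.05] = 0.0          # pepper: ~5% of pixels
salt_pepper[mask > 0.95] = 255.0        # salt:   ~5% of pixels

snr = clean.mean() / gaussian.std()     # SNR = mu / sigma, about 128/10
print(round(snr, 1))
```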
Noise Reduction: Mean Filter
What does a 3×3 mean (i.e., averaging) kernel look like?
What does it do to the salt and pepper noise?
What does it do to the edges of the white box?

f[x, y]:
   0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0
   0  0  0 90 90 90 90 90  0  0
   0  0  0 90 90 90 90 90  0  0
   0  0  0 90 90 90 90 90  0  0
   0  0  0 90  0 90 90 90  0  0
   0  0  0 90 90 90 90 90  0  0
   0  0  0  0  0  0  0  0  0  0
   0  0 90  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0
Noise Reduction: Mean Filter

f[x, y]:
   0  0  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0
   0  0  0 90 90 90 90 90  0  0
   0  0  0 90 90 90 90 90  0  0
   0  0  0 90 90 90 90 90  0  0
   0  0  0 90  0 90 90 90  0  0
   0  0  0 90 90 90 90 90  0  0
   0  0  0  0  0  0  0  0  0  0
   0  0 90  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0  0  0

g[x, y]:
   0 10 20 30 30 30 20 10
   0 20 40 60 60 60 40 20
   0 30 60 90 90 90 60 30
   0 30 50 80 80 90 60 30
   0 30 50 80 80 90 60 30
   0 20 30 50 50 60 40 20
  10 20 30 30 30 30 20 10
  10 10 10  0  0  0  0  0

h[x, y] = (1/9) ×
  1 1 1
  1 1 1
  1 1 1
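Applying the 3×3 mean kernel to the slide’s noisy image can be done directly; the “pepper” pixel inside the 90-block (row 5, column 4) is pulled up to 8 × 90 / 9 = 80, and the stray “salt” pixel becomes 90 / 9 = 10:

```python
import numpy as np

# The slide's 10x10 input image.
f = np.zeros((10, 10))
f[2:7, 3:8] = 90.0
f[5, 4] = 0.0        # pepper pixel inside the white box
f[8, 2] = 90.0       # salt pixel in the background

# 3x3 mean filter over the valid (interior) region: each output pixel
# is the average of its 3x3 neighborhood.
g = np.zeros_like(f)
for x in range(1, 9):
    for y in range(1, 9):
        g[x, y] = f[x - 1:x + 2, y - 1:y + 2].sum() / 9.0

print(g[5, 4])   # 80.0 : pepper is averaged away, but not removed
print(g[8, 2])   # 10.0 : salt is spread into its neighborhood
```

Note that the edges of the white box are blurred: the 0/90 transition becomes a 0, 30, 60, 90 ramp.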
Mean Filter: Effect
[Figure: 3×3, 5×5, and 7×7 mean filters applied to images with Gaussian noise and salt-and-pepper noise]
Noise Reduction: Gaussian Filter
A Gaussian kernel gives less weight to pixels further from the center of the window:

h[x, y] = (1/16) ×
  1 2 1
  2 4 2
  1 2 1

This kernel is an approximation of a Gaussian function:

  h(x, y) = (1 / (2πσ²)) e^{−(x² + y²) / (2σ²)}
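The approximation can be verified by sampling the Gaussian on a 3×3 grid and normalizing; σ ≈ 0.85 is chosen here (an assumption, not from the slide) so that the edge-to-center ratio e^{−1/(2σ²)} is 1/2, matching the kernel’s 2:4 ratio:

```python
import numpy as np

# Sample h(x, y) = 1/(2 pi sigma^2) exp(-(x^2 + y^2) / (2 sigma^2))
# on a 3x3 grid and normalize the weights to sum to 1.
sigma = 0.85
x, y = np.meshgrid(np.arange(-1, 2), np.arange(-1, 2), indexing="ij")
h = np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
h /= h.sum()   # normalization makes the leading constant irrelevant

approx = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16.0
print(np.abs(h - approx).max() < 0.01)   # True
```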
Mean vs. Gaussian Filtering
[Figure: comparison of mean-filtered and Gaussian-filtered images]
Non-Linear Operations
They are often mistakenly called “filters”; strictly speaking, non-linear operators are not filters
They can be useful, though
Examples:
  Order statistics (e.g., median filter)
  Iterative algorithms (e.g., CLEAN)
  Anisotropic diffusion
  Non-uniform convolution-like operations
Median “Filter”
Instead of a local neighborhood weighted average, compute the median of the neighborhood
• Advantages:
  Removes noise like low-pass filtering does
  Value is from actual image values
  Removes outliers – doesn’t average (blur) them into the result (“despeckling”)
  Edge preserving
• Disadvantages:
  Not linear
  Not shift invariant
  Slower to compute
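A minimal sketch of the median “filter”, run on the same noisy image as the mean-filter example; unlike the mean, isolated salt-and-pepper pixels are removed entirely rather than blurred into their neighbors:

```python
import numpy as np

# Replace each interior pixel with the median of its (2k+1)x(2k+1)
# neighborhood; border pixels are left unchanged for simplicity.
def median_filter(f, k=1):
    g = f.copy()
    for x in range(k, f.shape[0] - k):
        for y in range(k, f.shape[1] - k):
            g[x, y] = np.median(f[x - k:x + k + 1, y - k:y + k + 1])
    return g

f = np.zeros((10, 10))
f[2:7, 3:8] = 90.0
f[5, 4] = 0.0        # pepper pixel inside the white box
f[8, 2] = 90.0       # salt pixel in the background

g = median_filter(f)
print(g[5, 4])   # 90.0 : the pepper pixel is restored
print(g[8, 2])   # 0.0  : the salt pixel is removed
```

The box edges stay sharp (each edge pixel’s neighborhood majority wins), illustrating the edge-preserving property listed above.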
Comparison: Salt & Pepper Noise
[Figure: 3×3 and 7×7 Gaussian, mean, and median filters applied to an image with salt & pepper noise]
Comparison: Gaussian Noise
[Figure: 3×3 and 7×7 Gaussian, mean, and median filters applied to an image with Gaussian noise]