intelligent visual prosthesis
TRANSCRIPT
![Page 1: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/1.jpg)
![Page 2: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/2.jpg)
Depth image-based obstacle
detection
Intelligent Visual ProsthesisOrientation
sensor (IMU)
Depth
camera Wide-
angle RGB
camera
du
Simultaneous
object
recognition,
localization, and
hand tracking
New projects:
• Barcode reading
• Face recognition
• Text reading
• Currency recognition
![Page 3: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/3.jpg)
Will Povell ‘20
University memorial service
4pm today
Leung Family Gallery in the Stephen Robert ’62 Campus Center
Class will end early.
![Page 4: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/4.jpg)
![Page 5: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/5.jpg)
![Page 6: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/6.jpg)
Capture Frequency - Rolling `Shutter’
James Hays
![Page 7: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/7.jpg)
High pass vs low pass filters
![Page 8: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/8.jpg)
I understand frequency as in waves…
…but how does this relate to the complex
signals we see in natural images?
…to image frequency?
![Page 9: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/9.jpg)
FOURIER SERIES & FOURIER TRANSFORMS
Another way of thinking about frequency
![Page 10: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/10.jpg)
Fourier series
A bold idea (1807):Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.
Hays
Jean Baptiste Joseph Fourier (1768-1830)
Our building block:
Add enough of them to get any signal g(t) you want!
𝑛=1
∞
𝑎𝑛 cos(𝑛𝑡) + 𝑏𝑛 sin(𝑛𝑡)
![Page 11: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/11.jpg)
g(t) = (1)sin(2πf t) + (1/3)sin(2π(3f) t)
= +
Slides: Efros
Co
effic
ien
t
t = [0,2], f = 1
t
g
![Page 12: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/12.jpg)
Square wave spectra
![Page 13: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/13.jpg)
= +
=
Square wave spectra
![Page 14: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/14.jpg)
= +
=
Square wave spectra
![Page 15: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/15.jpg)
= +
=
Square wave spectra
![Page 16: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/16.jpg)
= +
=
Square wave spectra
![Page 17: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/17.jpg)
= +
=
Square wave spectra
![Page 18: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/18.jpg)
=
Square wave spectra
Co
effic
ien
t
𝑛=1
∞1
𝑛sin(𝑛𝑡)
For periodic signals
defined over [0, 2𝜋]
0 2𝜋
![Page 19: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/19.jpg)
Jean Baptiste Joseph Fourier (1768-1830)
A bold idea (1807):Any univariate function can be rewritten as a weighted sum of sines and cosines of different frequencies.
Don’t believe it?
– Neither did Lagrange, Laplace, Poisson and other big wigs
– Not translated into English until 1878!
But it’s (mostly) true!
– Called Fourier Series
– Applies to periodic signals
...the manner in which the author arrives atthese equations is not exempt of difficultiesand...his analysis to integrate them stillleaves something to be desired on the scoreof generality and even rigour.
Laplace
LagrangeLegendre
Hays
Reviewer #2
![Page 20: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/20.jpg)
Wikipedia – Fourier transform
![Page 21: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/21.jpg)
Sine/cosine and circle
Wikipedia – Unit Circle
sin t
cos t
![Page 22: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/22.jpg)
Square wave (approx.)
Mehmet E. Yavuz
![Page 23: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/23.jpg)
Sawtooth wave (approx.)
Mehmet E. Yavuz
![Page 25: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/25.jpg)
Amplitude-phase form
Add phase term to shift cos values into sin values
𝑛=1
𝑁
𝑎𝑛 sin(𝑛𝑥 + ∅𝑛)
Phase
Morse
[Peppergrower; Wikipedia]
![Page 26: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/26.jpg)
Amplitude-phase form
Add component of infinite frequency= mean of signal over period
= value around which signal fluctuates
𝑎02+
𝑛=1
𝑁
𝑎𝑛 sin(𝑛𝑥 + ∅𝑛)
Average of signal
over period Electronics signal processing
calls this the ‘DC offset’
𝑎02
![Page 27: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/27.jpg)
Spatial
domain
Frequency
domainFrequencyA
mplit
ude
Equivalent
amplitude image
representation
![Page 28: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/28.jpg)
How to read Fourier transform images
We display the space such that inf-frequency coefficient is in the center.
Fourier transform component image
X
𝑎02+
𝑛=1
𝑁
𝑎𝑛 sin(𝑛𝑥 + ∅𝑛)
𝑎0 location
Y
![Page 29: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/29.jpg)
How to read Fourier transform images
Image is rotationally symmetric about center because of negative frequencies
Fourier transform component image
X
𝑎02+
𝑛=1
𝑁
𝑎𝑛 sin(𝑛𝑥 + ∅𝑛)𝑎1 positive
location
Y𝑎1 negative
location
e.g., wheel rotating
one way or the other
For real-valued signals, positive
and negative frequencies are
complex conjugates (see
additional slides).
![Page 30: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/30.jpg)
How to read Fourier transform images
Image is as large as maximum frequency
by resolution of input under Nyquist frequency
Fourier transform component image
X
𝑎02+
𝑛=1
𝑁
𝑎𝑛 sin(𝑛𝑥 + ∅𝑛)
Y
N
Nyquist frequency is half the
sampling rate of the signal.
Sampling rate is size of X (or Y),
so Fourier transform images are
(2X+1,2Y+1).
![Page 31: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/31.jpg)
Fourier analysis in imagesSpatial domain images
Fourier decomposition amplitude images
http://sharp.bu.edu/~slehar/fourier/fourier.html#filtering More: http://www.cs.unm.edu/~brayer/vision/fourier.html
![Page 32: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/32.jpg)
Signals can be composed
+ =
http://sharp.bu.edu/~slehar/fourier/fourier.html#filtering More: http://www.cs.unm.edu/~brayer/vision/fourier.html
Spatial domain images
Fourier decomposition amplitude images
![Page 33: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/33.jpg)
Periodicity
• Fourier decomposition assumes periodic signals with infinite extent.
– E.G., our image is assumed to repeated forever
• ‘Fake’ signal is image wrapped around
Hays
Convention is to pad with
zeros to reduce effect.
![Page 34: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/34.jpg)
Natural image
What does it mean to be at pixel x,y?
What does it mean to be more or less bright in the Fourier decomposition
image?
Natural image
Fourier decomposition
Amplitude image
![Page 35: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/35.jpg)
Think-Pair-Share
Match the spatial domain image to the Fourier amplitude image
1 54
A
32
CB D E
Hoiem
![Page 36: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/36.jpg)
Fourier Bases
This change of basis is the Fourier Transform
Teases away ‘fast vs. slow’ changes in the image.
Blue = sine
Green = cosine
Hays
![Page 37: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/37.jpg)
Basis reconstruction
Danny Alexander
![Page 38: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/38.jpg)
Fourier Transform
• Stores the amplitude and phase at each frequency:– For mathematical convenience, this is often notated in terms of real
and complex numbers
– Related by Euler’s formula
Hays
![Page 39: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/39.jpg)
Fourier Transform
• Stores the amplitude and phase at each frequency:– For mathematical convenience, this is often notated in terms of real
and complex numbers
– Related by Euler’s formula
22 )Im()Re( +=A
)Re(
)Im(tan 1
−=
Hays
Phase encodes spatial information
(indirectly):
Amplitude encodes how much signal
there is at a particular frequency:
![Page 40: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/40.jpg)
What about phase?
Efros
![Page 41: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/41.jpg)
What about phase?
Amplitude Phase
Efros
![Page 42: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/42.jpg)
What about phase?
Efros
![Page 43: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/43.jpg)
What about phase?
Amplitude Phase
Efros
![Page 44: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/44.jpg)
John Brayer, Uni. New Mexico
• “We generally do not display PHASE images because most people who see them shortly thereafter succumb to hallucinogenics or end up in a Tibetan monastery.”
• https://www.cs.unm.edu/~brayer/vision/fourier.html
![Page 45: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/45.jpg)
Think-Pair-Share
• In Fourier space, where is more of the information that we see in the visual world?
– Amplitude
– Phase
![Page 46: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/46.jpg)
Cheebra
Zebra phase, cheetah amplitude Cheetah phase, zebra amplitude
Efros
![Page 47: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/47.jpg)
• The frequency amplitude of natural images are quite similar
– Heavy in low frequencies, falling off in high frequencies
– Will any image be like that, or is it a property of the world we live in?
• Most information in the image is carried in the phase, not the amplitude
– Not quite clear why
Efros
![Page 48: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/48.jpg)
What is the relationship to phase in audio?
• In audio perception, frequency is important but phase is not.
• In visual perception, both are important.
• ??? : (
![Page 49: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/49.jpg)
Brian Pauw demo
• Live Fourier decomposition images
– Using FFT2 function
• I hacked it a bit for MATLAB
• http://www.lookingatnothing.com/index.php/archives/991
![Page 50: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/50.jpg)
Properties of Fourier Transforms
• Linearity
• Fourier transform of a real signal is symmetric about the origin
• The energy of the signal is the same as the energy of its Fourier transform
See Szeliski Book (3.4)
)](F[)](F[)]()(F[ tybtxatbytax +=+
![Page 51: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/51.jpg)
The Convolution Theorem
• The Fourier transform of the convolution of two functions is the product of their Fourier transforms
• Convolution in spatial domain is equivalent to multiplication in frequency domain!
]F[]F[]F[ hghg =
]]F[][F[F* 1 hghg −=
Hays
![Page 52: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/52.jpg)
Filtering in spatial domain
-101
-202
-101
* =
Hays
Convolution
![Page 53: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/53.jpg)
Filtering in frequency domain
Fourier
transform
Inverse
Fourier transform
Slide: Hoiem
Element-wise
product
Fourier
transform
![Page 54: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/54.jpg)
Now we can edit frequencies!
![Page 55: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/55.jpg)
Low and High Pass filtering
![Page 56: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/56.jpg)
Removing frequency bands
Brayer
![Page 57: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/57.jpg)
High pass filtering + orientation
![Page 58: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/58.jpg)
Why does the Gaussian filter give a nice smooth image, but the square filter give edgy artifacts?
Gaussian Box filter
Hays
![Page 59: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/59.jpg)
Why do we have those lines in the image?
• Sharp edges in the image need _all_ frequencies to represent them.
= 1
1sin(2 )
k
A ktk
=
co
effic
ien
t
Efros
![Page 60: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/60.jpg)
• What is the spatial representation of the hard cutoff (box) in the frequency domain?
• http://madebyevan.com/dft/
Box filter / sinc filter dualityHays
![Page 61: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/61.jpg)
Evan Wallace demo
• Made for CS123
• 1D example
• Forbes 30 under 30
– Figma (collaborative design tools)
• http://madebyevan.com/dft/
with Dylan Field
![Page 62: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/62.jpg)
• What is the spatial representation of the hard cutoff (box) in the frequency domain?
• http://madebyevan.com/dft/
Box filter / sinc filter duality
Spatial Domain Frequency Domain
Frequency Domain Spatial Domain
Box filter Sinc filter sinc(x) = sin(x) / x
Hays
![Page 63: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/63.jpg)
Box filter (spatial)Frequency domain
magnitude
![Page 64: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/64.jpg)
Gaussian filter duality
• Fourier transform of one Gaussian……is another Gaussian (with inverse variance).
• Why is this useful?– Smooth degradation in frequency components– No sharp cut-off– No negative values– Never zero (infinite extent)
Box filter (spatial)Frequency domain
magnitude
Gaussian filter
(spatial)
Frequency domain
magnitude
![Page 65: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/65.jpg)
Ringing artifacts -> ‘Gibbs effect’
Where infinite series can never be reached
Gaussian Box filter
Hays
![Page 66: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/66.jpg)
Is convolution invertible?
• If convolution is just multiplication in the Fourier domain, isn’t deconvolution just division?
• Sometimes, it clearly is invertible (e.g. a convolution with an identity filter)
• In one case, it clearly isn’t invertible (e.g. convolution with an all zero filter)
• What about for common filters like a Gaussian?
![Page 67: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/67.jpg)
Convolution
* =
FFT FFT
.x =
iFFT
Hays
![Page 68: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/68.jpg)
Deconvolution?
iFFT FFT
./=
FFT
Hays
![Page 69: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/69.jpg)
But under more realistic conditions
iFFT FFT
./=
FFT
Random noise, .000001 magnitude
Hays
![Page 70: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/70.jpg)
But under more realistic conditions
iFFT FFT
./=
FFT
Random noise, .0001 magnitude
Hays
![Page 71: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/71.jpg)
But under more realistic conditions
iFFT FFT
./=
FFT
Random noise, .001 magnitude
Hays
![Page 72: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/72.jpg)
Deconvolution is hard.
• Active research area.
• Even if you know the filter (non-blind deconvolution), it is still hard and requires strong regularization to counteract noise.
• If you don’t know the filter (blind deconvolution),then it is harder still.
![Page 73: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/73.jpg)
A few questions
How is the Fourier decomposition computed?
Intuitively, by correlating the signal with a set of waves of increasing frequency!
Notes in hidden slides.
Plus: http://research.stowers-institute.org/efg/Report/FourierAnalysis.pdf
![Page 74: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/74.jpg)
Earl F. Glynn
![Page 75: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/75.jpg)
Earl F. Glynn
![Page 76: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/76.jpg)
Index cards before you leave
Fourier decomposition is tricky
I want to know what is confusing to you!
- Take 1 minute to talk to your partner about the lecture
- Write down on the card something that you’d like clarifying.
![Page 77: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/77.jpg)
How is it that a 4MP image can be compressed to a few hundred KB without a noticeable change?
Thinking in Frequency - Compression
![Page 78: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/78.jpg)
Lossy Image Compression (JPEG)
Block-based Discrete Cosine Transform (DCT)
Slides: Efros
The first coefficient B(0,0) is the DC
component, the average intensity
The top-left coeffs represent low
frequencies, the bottom right
represent high frequencies
8x8 blocks
8x8 blocks
![Page 79: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/79.jpg)
Image compression using DCT
• Compute DCT filter responses in each 8x8 block
• Quantize to integer (div. by magic number; round)– More coarsely for high frequencies (which also tend to have smaller values)
– Many quantized high frequency values will be zero
Quantization divisers (element-wise)
Filter responses
Quantized values
![Page 80: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/80.jpg)
JPEG Encoding
• Entropy coding (Huffman-variant)Quantized values
Linearize B
like this.Helps compression:
- We throw away
the high
frequencies (‘0’).
- The zig zag
pattern increases
in frequency
space, so long
runs of zeros.
![Page 81: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/81.jpg)
Color spaces: YCbCr
Y(Cb=0.5,Cr=0.5)
Cb(Y=0.5,Cr=0.5)
Cr(Y=0.5,Cb=05)
Y=0 Y=0.5
Y=1Cb
Cr
Fast to compute, good for
compression, used by TV
James Hays
![Page 82: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/82.jpg)
Most JPEG images & videos subsample chroma
![Page 83: Intelligent Visual Prosthesis](https://reader030.vdocuments.mx/reader030/viewer/2022012808/61bde38f7a01b100b33e623f/html5/thumbnails/83.jpg)
JPEG Compression Summary
1. Convert image to YCrCb
2. Subsample color by factor of 2– People have bad resolution for color
3. Split into blocks (8x8, typically), subtract 128
4. For each blocka. Compute DCT coefficients
b. Coarsely quantize• Many high frequency components will become zero
c. Encode (with run length encoding and then Huffman coding for leftovers)
http://en.wikipedia.org/wiki/YCbCr
http://en.wikipedia.org/wiki/JPEG