"computational photography: understanding and expanding the capabilities of standard...
Copyright © 2016 NVIDIA 1
Computational Photography: Understanding and expanding the capabilities of standard cameras
Orazio Gallo, Senior Research Scientist, NVIDIA Research
05/03/2016
Visual Computing Research @ NVIDIA
Research areas: low-level vision, geometric vision, high-level vision, machine learning
Today, most of us are used to pressing the shutter button and getting great pictures out.
This is a more accurate rendition of the output of the sensor. There is a lot going on
between the collection of photons and the picture being saved to disk.
Why do we care for computer vision?
What is computational photography?
Image processing: input image(s) → processed image(s)
• Denoising
• Image enhancement
• Super-resolution
• Morphological operators
• …
Computer vision: input image(s) → semantic info
• Object detection/recognition
• Optical flow/tracking
• Segmentation
• 3D scene reconstruction
• …

• Deblurring
• Edge detection
• …
Generalized optics
Lytro, Pelican
Generalized illumination
[11] Petschnigg et al., 2004
Generalized sensors
[9] Sims et al., 2016
Combining images
HDR image
New cameras
Image processing: input image(s) → processed image(s)
Computer vision: input image(s) → semantic info
Computational photography
Outline
• Interpreting the pixel value:
  • Noise involved in the image formation process,
  • What happens under the hood, and
  • What to do when using RAW or JPEG images.
• Overcoming the main limitations of standard cameras with computation.
Image formation and on-camera processing
Know your types of noise!
[Figure: image formation pipeline. Contributions added along the way: photon shot noise, vignetting, dark current, saturation, decimation, readout noise, ADC noise. The readout logic outputs (1) RAW (10-16 bits, linear); the ISP, after LOTS of processing, outputs (2) JPEG.]
What do you need to know if you use…
RAW images: know your types of noise!
Photon shot noise
• Due to the quantized nature of light:
(for a large number of photons)
• Recall that:
Solution: Expose To The Right (ETTR), or average multiple frames
(Adapted from Wikipedia)
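A standard way to write the shot-noise model, given here as a sketch: assume the number of photons $N$ collected by a pixel is Poisson distributed with mean $\lambda$. The symbols $N$, $\lambda$, and $K$ are assumed notation, not taken from the slide.

```latex
P(N = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad
\mathrm{E}[N] = \mathrm{Var}[N] = \lambda, \qquad
N \approx \mathcal{N}(\lambda, \lambda) \quad \text{for large } \lambda

\mathrm{SNR} = \frac{\mathrm{E}[N]}{\sqrt{\mathrm{Var}[N]}} = \sqrt{\lambda}, \qquad
\mathrm{SNR}_{\text{average of } K \text{ frames}} = \sqrt{K \lambda}
```

Under this model both remedies act on the same quantity: ETTR collects more photons per frame (larger $\lambda$), while averaging $K$ frames collects $K$ times as many photons in total.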
Dark current (a.k.a. dark current shot noise, or thermal noise)
• Thermal energy can free electrons even with no incoming photons!
• Function of:
  • Sensor’s temperature
  • Exposure time
Solution: Cool the sensor or subtract a dark frame
(Figure: Widenhorn et al. [1])
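Dark-frame subtraction can be sketched as follows. This is a minimal illustration, assuming the dark frames were captured with the lens cap on at the same exposure time and temperature as the picture; the function names and the toy 2x2 data are hypothetical, and images are plain nested lists to keep the example self-contained.

```python
from statistics import median

def master_dark(dark_frames):
    """Per-pixel median of several dark frames, to suppress noise in the dark estimate."""
    rows, cols = len(dark_frames[0]), len(dark_frames[0][0])
    return [[median(f[r][c] for f in dark_frames) for c in range(cols)]
            for r in range(rows)]

def subtract_dark(raw, dark):
    """Subtract the dark frame and clamp at zero, since counts cannot be negative."""
    return [[max(p - d, 0) for p, d in zip(raw_row, dark_row)]
            for raw_row, dark_row in zip(raw, dark)]

# Toy 2x2 dark frames; the bottom-right pixel has a much higher dark signal.
darks = [[[10, 12], [11, 50]],
         [[12, 12], [9, 52]],
         [[11, 13], [10, 51]]]
raw = [[110, 212], [15, 151]]

dark = master_dark(darks)
calibrated = subtract_dark(raw, dark)
```

Taking the median over a few dark frames, rather than using a single one, avoids adding the noise of one dark frame to every picture.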
• Photo-response non-uniformity (PRNU)
• Vignetting
Solution: Flat field image
• Readout noise:
  • reset noise, occurring during charge-to-voltage transfer;
  • white noise and flicker noise during voltage amplification;
  • quantization noise during analog-to-digital conversion.
Solution: Expose To The Right (ETTR), or average multiple frames
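The effect of averaging multiple frames can be checked with a small simulation. This is a sketch under stated assumptions: shot noise is approximated as Gaussian with variance equal to the signal, readout noise is Gaussian and independent from frame to frame, and the signal level, noise level, and function names are made up for illustration.

```python
import random
from statistics import pstdev

def simulate_pixel(signal, read_noise_sigma, rng):
    """One noisy reading: Gaussian approximation of shot noise plus Gaussian readout noise."""
    shot = rng.gauss(signal, signal ** 0.5)
    return shot + rng.gauss(0.0, read_noise_sigma)

def noise_after_averaging(signal, read_noise_sigma, n_frames, n_trials=2000, seed=0):
    """Standard deviation of the average of n_frames readings, estimated over n_trials."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_trials):
        readings = [simulate_pixel(signal, read_noise_sigma, rng) for _ in range(n_frames)]
        averages.append(sum(readings) / n_frames)
    return pstdev(averages)

single = noise_after_averaging(signal=100.0, read_noise_sigma=5.0, n_frames=1)
stacked = noise_after_averaging(signal=100.0, read_noise_sigma=5.0, n_frames=16)
```

With independent noise, averaging 16 frames should reduce the standard deviation by a factor of about 4 (the square root of 16), which is what the ratio `single / stacked` approximates.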
• All in all, RAW images are great for computer vision because of their dynamic range and because they are a simple affine transformation of the photon count:
• Just remember:
  • ETTR
  • Correct for stuck/broken pixels
  • Capture a dark frame (frame with lens-cap on)
  • Capture a flat field (image of a uniform background)
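The checklist above can be sketched in code. Assume the affine model is raw = gain × photons + offset, with a per-pixel gain (PRNU, vignetting) and a per-pixel offset (black level, dark current); this notation is an assumption, not the slide's. The dark frame then estimates the offset and the flat field estimates the relative gain. Function and variable names are hypothetical.

```python
def calibrate(raw, dark, flat):
    """Flat-field correction: (raw - dark) divided by the normalized (flat - dark), per pixel."""
    flat_minus_dark = [[f - d for f, d in zip(flat_row, dark_row)]
                       for flat_row, dark_row in zip(flat, dark)]
    n_pixels = sum(len(row) for row in flat_minus_dark)
    flat_mean = sum(sum(row) for row in flat_minus_dark) / n_pixels
    out = []
    for raw_row, dark_row, gain_row in zip(raw, dark, flat_minus_dark):
        # Pixels with no flat-field response cannot be corrected; set them to 0.
        out.append([(p - d) * flat_mean / g if g > 0 else 0.0
                    for p, d, g in zip(raw_row, dark_row, gain_row)])
    return out

dark = [[10, 10], [10, 10]]
flat = [[110, 60], [110, 60]]      # toy vignetting: right column gets half the light
raw = [[210, 110], [410, 210]]     # each row shows a uniform surface, darkened on the right

corrected = calibrate(raw, dark, flat)
```

After correction the two columns agree within each row, as expected for a uniform surface. Stuck or broken pixels are not handled here; they are typically replaced by a value interpolated from their neighbors.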
JPEG images
Camera Imaging Pipeline
[Figure: pipeline blocks: Denoise, Demosaic, Bad Pixel Correction, Image Enhancing, Tone Mapping, Lens Correction, Black Level, Metering, AF/AE]
ISP Basics
• Black level subtraction
• Correct for stuck pixels
• Lens shading compensation
• White balance
• Demosaic
• Color-space conversion (usually sRGB)
• Color correction matrix (not the same as white balance)
• Gamma compression
• JPEG compression
• …
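A few of these steps can be sketched for a single pixel. This is a toy illustration, not a real ISP: it assumes the pixel is already demosaicked, uses the standard sRGB transfer function for gamma compression, and skips stuck-pixel correction, lens shading, and JPEG compression. The black level, white level, gains, and matrix are made-up values.

```python
def srgb_encode(c):
    """Standard sRGB transfer function for a linear value, clipped to [0, 1]."""
    c = min(max(c, 0.0), 1.0)
    return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def toy_isp(raw_rgb, black_level, white_level, wb_gains, ccm):
    """Turn one demosaicked RGB triplet of raw counts into 8-bit output values."""
    # 1. Black level subtraction, then normalization to [0, 1].
    lin = [max(v - black_level, 0) / (white_level - black_level) for v in raw_rgb]
    # 2. White balance: one gain per channel.
    lin = [v * g for v, g in zip(lin, wb_gains)]
    # 3. Color correction matrix (3x3), applied after white balance.
    lin = [sum(m * v for m, v in zip(row, lin)) for row in ccm]
    # 4. Gamma compression and quantization to 8 bits.
    return [round(255 * srgb_encode(v)) for v in lin]

identity_ccm = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # placeholder: no color correction
pixel = toy_isp([304, 544, 384], black_level=64, white_level=1023,
                wb_gains=[2.0, 1.0, 1.5], ccm=identity_ccm)
```

The raw triplet is far from neutral, but after black level subtraction and white balance the three channels are equal, so the output is a gray pixel. This also shows why the black level must be subtracted before any gain is applied.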
Put everything together: Camera Response Function
[Figure: camera response curve; x-axis: Input (10-16 bits), y-axis: Output (8 bits)]
Radiometric calibration
And this function changes with the camera!
(Figure: Grossberg and Nayar [3])
• Changes significantly from camera to camera…
• … and even scene to scene!
• Usually estimated using images with different exposures.
For a review see Gallo and Sen [6]
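One classic formulation, following Debevec and Malik [4], is sketched here. Let $Z_{ij}$ be the value of pixel $i$ in the image taken with exposure time $\Delta t_j$, $E_i$ the irradiance at that pixel, and $f$ the camera response function, assumed monotonic and therefore invertible. The notation is assumed for illustration and does not come from the slide.

```latex
Z_{ij} = f\left(E_i \, \Delta t_j\right)
\quad \Longrightarrow \quad
g(Z_{ij}) = \ln E_i + \ln \Delta t_j,
\qquad g = \ln f^{-1}
```

Given several exposures of the same static scene, $g$ and the $E_i$ can be estimated jointly in the least-squares sense, up to a common scale factor.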
Why gamma-like? Because of our visual system
[Figure: gamma-like response curve; x-axis: Input (10-16 bits), y-axis: Output (8 bits); regions labeled "Very sensitive" and "Not so sensitive"]
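A quick calculation shows what a gamma-like curve buys. As a sketch, assume a pure power law with gamma 2.2 (real camera curves differ) and count how many of the 8-bit output codes are spent on the darkest 10% of the linear input range. The function name is hypothetical.

```python
def codes_for_darkest(fraction, gamma, n_codes=256):
    """Output codes spanned by linear inputs in [0, fraction] when out = in ** (1 / gamma)."""
    return (n_codes - 1) * fraction ** (1.0 / gamma)

linear_codes = codes_for_darkest(0.1, gamma=1.0)   # straight linear quantization
gamma_codes = codes_for_darkest(0.1, gamma=2.2)    # assumed gamma of 2.2
```

Linear quantization spends about 25 codes on the darkest tenth of the range, while the gamma curve spends roughly 90, which puts the precision where our visual system is most sensitive.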
So about our question before…
Alternatives to standard ISPs
ISP Alternatives
[Figure: the camera imaging pipeline blocks (Denoise, Demosaic, Bad Pixel Correction, Image Enhancing, Tone Mapping, Lens Correction, Black Level, Metering, AF/AE)]
Beefy NVIDIA GPU!
FlexISP: A flexible camera image processing framework
[2] Heide et al., 2014
$\arg\min_x \|z - Ax\|_2^2 + \lambda(x)$
• $\|z - Ax\|_2^2$: data fidelity
• $\lambda(x)$: regularization
• $A$: PSF, CFA, noise
• $z$: observation; $x$: image to recover
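The structure of this objective can be illustrated on a toy 1-D problem. This is a sketch only, and not FlexISP itself: $A$ is taken to be the identity (pure denoising), the regularizer is swapped for a simple quadratic smoothness penalty, and the solver is plain gradient descent. The weight `lam`, the step size, and the data are arbitrary choices.

```python
def objective(x, z, lam):
    """Data fidelity |z - x|^2 (with A = identity) plus a quadratic smoothness regularizer."""
    fidelity = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
    smoothness = sum((x[i + 1] - x[i]) ** 2 for i in range(len(x) - 1))
    return fidelity + lam * smoothness

def gradient(x, z, lam):
    """Gradient of the objective with respect to x."""
    g = [2.0 * (xi - zi) for xi, zi in zip(x, z)]
    for i in range(len(x) - 1):
        d = 2.0 * lam * (x[i + 1] - x[i])
        g[i] -= d
        g[i + 1] += d
    return g

def reconstruct(z, lam=2.0, step=0.05, n_iters=500):
    """Minimize the objective by gradient descent, starting from the observation z."""
    x = list(z)
    for _ in range(n_iters):
        g = gradient(x, z, lam)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

z = [1.0, 1.4, 0.7, 1.2, 0.8, 1.1]   # noisy observation of a roughly constant signal
x = reconstruct(z)
```

The iterations run at reconstruction time, once per image, which is the run-time cost that motivates looking for alternatives with fast inference.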
Neural Network ISP
The problem with FlexISP is that the optimization happens at run time.
Why not use a neural network (NN), where inference is very quick?
We are working on this!
Pushing the limits of modern cameras
Limited Dynamic Range
[Figure: world log-luminance axis (log cd/m²), with ticks at -6, -2, 4, and 8]
High-dynamic-range Imaging (HDR)
HDR Imaging Issues
• Biggest issue is image registration (camera motion and dynamic scenes)
• Need non-rigid registration
[Figure: input stack and output stack, from Hu et al., 2013 [7]]
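Setting registration aside, the merging step of stack-based HDR can be sketched as a weighted average. The assumptions are strong and stated up front: the scene is static, the images are already aligned and linear (RAW-like, normalized to [0, 1]), and the hat-shaped weight with its thresholds is an arbitrary choice. Images are flattened to 1-D lists for brevity, and all names are hypothetical.

```python
def hat_weight(v, low=0.05, high=0.95):
    """Trust mid-range values; give zero weight to nearly black or nearly saturated ones."""
    if v <= low or v >= high:
        return 0.0
    return min(v - low, high - v)

def merge_hdr(stack, exposure_times):
    """Per-pixel weighted average of the radiance estimates v / t over the exposure stack."""
    n_pixels = len(stack[0])
    hdr = []
    for p in range(n_pixels):
        num = den = 0.0
        for image, t in zip(stack, exposure_times):
            w = hat_weight(image[p])
            num += w * image[p] / t
            den += w
        hdr.append(num / den if den > 0 else None)   # None: no reliable sample
    return hdr

# Toy scene with relative radiances 0.5 and 40, captured with two exposures.
times = [1.0, 0.01]
stack = [[0.5, 1.0],      # long exposure: the second pixel saturates
         [0.005, 0.4]]    # short exposure: the first pixel is nearly black
hdr = merge_hdr(stack, times)
```

Each pixel ends up using only the exposure in which it is well exposed. With camera motion or a dynamic scene, the samples at index `p` no longer correspond to the same scene point, which is why registration is the biggest issue.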
Limited field of view
• 50mm lens has 47-degree FoV
• 110mm lens has 22-degree FoV
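These numbers can be reproduced with the thin-lens field-of-view formula, FoV = 2·atan(d / 2f). The sketch below assumes a full-frame 36 × 24 mm sensor and measures the FoV along its diagonal; the slide does not state the sensor size, so this is an assumption, although it does match the quoted figures.

```python
import math

def field_of_view_deg(focal_length_mm, sensor_size_mm):
    """Angular field of view of an ideal thin lens focused at infinity."""
    return math.degrees(2.0 * math.atan(sensor_size_mm / (2.0 * focal_length_mm)))

FULL_FRAME_DIAGONAL_MM = math.hypot(36.0, 24.0)   # assumed 36 x 24 mm sensor

fov_50 = field_of_view_deg(50.0, FULL_FRAME_DIAGONAL_MM)
fov_110 = field_of_view_deg(110.0, FULL_FRAME_DIAGONAL_MM)
```

A smaller sensor behind the same lens gives a narrower field of view, since `sensor_size_mm` shrinks while the focal length stays the same.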
Image stitching
(Figure from Wikipedia)
Google Jump + GoPro Odyssey
GoPro Omni
Jaunt
• Biggest issue is parallax
  • Rotate the camera around the center of projection of the lenses,
  • or hope everything is at infinity,
  • or do full 3D reconstruction.
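The size of the parallax can be estimated with a pinhole model. As a sketch, assume the center of projection translates by a small baseline parallel to the image plane between two shots; a point at depth Z then shifts by f·b/Z pixels, where f is the focal length in pixels. The function name and the numbers are made up for illustration.

```python
def parallax_px(focal_length_px, baseline_m, depth_near_m, depth_far_m=float("inf")):
    """Relative image shift, in pixels, between a near and a far point for a given baseline."""
    near_shift = focal_length_px * baseline_m / depth_near_m
    far_shift = focal_length_px * baseline_m / depth_far_m   # 0.0 for a point at infinity
    return near_shift - far_shift

close_object = parallax_px(3000.0, baseline_m=0.05, depth_near_m=1.0)
distant_scene = parallax_px(3000.0, baseline_m=0.05, depth_near_m=1000.0)
pure_rotation = parallax_px(3000.0, baseline_m=0.0, depth_near_m=1.0)
```

The three cases mirror the options above: with no translation of the center of projection the parallax vanishes, for distant scenes it drops below a pixel, and otherwise depth has to be known to align the images.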
Limited control over depth-of-field
https://commons.wikimedia.org/w/index.php?curid=330435
Extended depth-of-field (when is small)
[8] Agarwala et al., 2004
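The idea of combining a focal stack can be sketched with a much simpler rule than the one in [8]: for each pixel, copy the value from the frame that is locally sharpest. This per-pixel selection is a stand-in chosen for brevity, not the photomontage method of Agarwala et al. It assumes the frames are aligned, and uses the absolute Laplacian as a hypothetical focus measure.

```python
def sharpness(img, r, c):
    """Absolute 4-neighbor Laplacian as a simple local focus measure (borders are clamped)."""
    rows, cols = len(img), len(img[0])

    def px(rr, cc):
        return img[min(max(rr, 0), rows - 1)][min(max(cc, 0), cols - 1)]

    return abs(4 * px(r, c) - px(r - 1, c) - px(r + 1, c) - px(r, c - 1) - px(r, c + 1))

def focus_stack(stack):
    """For each pixel, copy the value from the frame that is sharpest at that location."""
    rows, cols = len(stack[0]), len(stack[0][0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = max(stack, key=lambda img: sharpness(img, r, c))
            out[r][c] = best[r][c]
    return out

near_focus = [[0, 10, 0, 5, 5, 5]]   # left half has detail, right half is blurred flat
far_focus = [[5, 5, 5, 0, 10, 0]]    # the opposite
all_in_focus = focus_stack([near_focus, far_focus])
```

Independent per-pixel decisions tend to produce visible seams on real images, which is one reason to prefer methods that choose the source frame consistently over regions.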
Synthetic shallow depth-of-field (for small aperture)
Photo credit: Marc Levoy
Take home lessons
• RAW vs JPEG (or non-linear):
  • RAW:
    • Better model of incoming light, with a larger dynamic range
    • Compensate for the types of noise involved in the image formation process.
  • JPEG:
    • Better picture quality thanks to the manufacturer's processing
    • Be aware of the non-linear transformation and undo it when needed!
• You can deal with the limitations of the camera modules you are given by means of computation
  • E.g., stack-based HDR, extended depth-of-field, etc.
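Undoing the non-linear transformation can be sketched as follows, under a strong assumption: that the camera applied exactly the standard sRGB curve. Real camera response functions vary from camera to camera, so for accurate work the response should be calibrated instead. The function name is hypothetical.

```python
def srgb_to_linear(code, n_codes=256):
    """Invert the standard sRGB curve for an integer code; returns a linear value in [0, 1]."""
    v = code / (n_codes - 1)
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

# Averaging two pixel values: in the encoded domain vs. in the linear domain.
dark_code, bright_code = 50, 250
encoded_average = srgb_to_linear((dark_code + bright_code) // 2)
linear_average = (srgb_to_linear(dark_code) + srgb_to_linear(bright_code)) / 2
```

The two averages differ noticeably, which matters for any algorithm that assumes pixel values are proportional to light, such as HDR merging or photometric comparisons between exposures.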
That’s all!
My contact info
References
[1] Widenhorn et al., “Temperature dependence of dark current in a CCD,” Electronic Imaging, 2002.
[2] Heide et al., “FlexISP: A flexible camera image processing framework,” SIGGRAPH Asia, 2014.
[3] Grossberg and Nayar, “What is the space of camera response functions?” CVPR, 2003.
[4] Debevec and Malik, “Recovering high dynamic range radiance maps from photographs,” SIGGRAPH, 1997.
[5] Grossberg and Nayar, “Determining the camera response from images: What is knowable?” PAMI, 2003.
[6] Gallo and Sen, “Stack-Based algorithms for HDR capture and reconstruction,” chapter in “High Dynamic Range Video: From Acquisition, to Display and Applications,” Academic Press, 2016.
[7] Hu, Gallo, Pulli, and Sun, “HDR Deghosting: How to deal with Saturation?” CVPR, 2013.
[8] Agarwala, Dontcheva, Agrawala, Drucker, Colburn, Curless, Salesin, and Cohen, “Interactive Digital Photomontage,” SIGGRAPH, 2004.
[9] Sims, Yue, and Nayar, “Towards Flexible Sheet Cameras: Deformable Lens Arrays with Intrinsic Optical Adaptation,” ICCP, 2016.
[10] Asif, Ayremlou, Sankaranarayanan, Veeraraghavan, and Baraniuk, “FlatCam: Thin, Bare-Sensor Cameras using Coded Aperture and Computation,” arXiv, 2015.
[11] Petschnigg, Agrawala, Hoppe, Szeliski, Cohen, and Toyama, “Digital Photography with Flash and No-Flash Image Pairs,” SIGGRAPH, 2004.