
Copyright © 2016 NVIDIA

Computational Photography: Understanding and expanding the capabilities of standard cameras

Orazio Gallo, Senior Research Scientist, NVIDIA Research

05/03/2016


Research Areas

• LOW-LEVEL VISION

• GEOMETRIC VISION

• MACHINE LEARNING

• HIGH-LEVEL VISION

VISUAL COMPUTING RESEARCH @ NVIDIA


Today, most of us are used to pressing the shutter button and getting great pictures out.


This is a more accurate rendition of the output of the sensor. There is a lot going on between the collection of photons and the picture being saved to disk.


Why do we care about this for computer vision?


What is computational photography?

Image processing: Input image(s) → Processed image(s)

Computer vision: Input image(s) → Semantic info

• Denoising

• Image enhancement

• Super-resolution

• Morphological operators

• …

• Object detection/recognition

• Optical flow/tracking

• Segmentation

• 3D scene reconstruction

• …

• Deblurring

• Edge detection

• …



Generalized optics

Lytro, Pelican



Generalized illumination

[11] Petschnigg et al., 2004



Generalized sensors

[9] Sims et al. 2016



Combining images

HDR image




New cameras



Image processing: Input image(s) → Processed image(s)

Computer vision: Input image(s) → Semantic info

Computational photography


Outline

• Interpreting the pixel value:

  • Noise involved in the image formation process,

  • What happens under the hood, and

  • What to do when using RAW or JPEG images.

• Overcoming the main limitations of standard cameras with computation.


Image formation and on-camera processing


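Know your types of noise!

[Figure: image formation pipeline. Photon shot noise, vignetting, dark current, saturation, decimation, readout noise, and ADC noise enter (+) along the path through the sensor and the readout logic, yielding (1) RAW (10-16 bits, linear). The ISP then applies LOTS of processing, yielding (2) JPEG.]

What do you need to know if you use…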


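RAW images: Know your types of noise!

Photon shot noise

• Due to the quantized nature of light: photon counts follow a Poisson distribution, which is well approximated by a Gaussian for a large number of photons.

• Recall that, for a Poisson distribution, the variance equals the mean, so the signal-to-noise ratio grows as the square root of the photon count.

Solution: Expose To The Right (ETTR), or average multiple frames

[Figure adapted from Wikipedia]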
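A minimal sketch of why both remedies help, assuming an idealized pixel limited only by photon shot noise (Poisson statistics, no other noise source). The function names are hypothetical and only serve the illustration.

```python
import math

def shot_noise_snr(mean_photons: float) -> float:
    """SNR of a pixel limited only by photon shot noise.

    For Poisson-distributed counts the variance equals the mean,
    so SNR = mean / sqrt(mean) = sqrt(mean).
    """
    return mean_photons / math.sqrt(mean_photons)

def averaged_snr(mean_photons: float, num_frames: int) -> float:
    """SNR after averaging num_frames independent, identical exposures.

    Averaging divides the noise variance by num_frames,
    so the SNR grows by a factor sqrt(num_frames).
    """
    return shot_noise_snr(mean_photons) * math.sqrt(num_frames)

if __name__ == "__main__":
    # Exposing to the right collects more photons per pixel.
    for photons in (100, 400, 1600):
        print(photons, "photons: SNR =", shot_noise_snr(photons))
    # Averaging frames is the alternative when exposure cannot grow.
    print("100 photons, 4 frames: SNR =", averaged_snr(100, 4))
```

Collecting four times the photons, or averaging four frames, doubles the SNR in this model; real sensors add the other noise sources on top.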



Dark current (a.k.a. dark current shot noise, or thermal noise)

• Thermal energy can free electrons even with no incoming photons!

• Function of:

  • Sensor’s temperature

  • Exposure time

Solution: Cool the sensor or subtract a dark frame

Widenhorn et al. [1]


• Photo-response non-uniformity (PRNU)

• Vignetting

Solution: Flat field image

• Readout noise:

  • reset noise, occurring during charge-to-voltage transfer;

  • white noise and flicker noise during voltage amplification;

  • quantization noise during analog-to-digital conversion.

Solution: Expose To The Right (ETTR), or average multiple frames
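A rough noise budget can show when readout noise matters. The sketch below assumes the usual simplification that the sources are independent, so their variances add; it ignores PRNU and quantization, and all names and numbers are hypothetical.

```python
import math

def total_noise_electrons(signal_e, dark_rate_e_per_s, exposure_s, read_noise_e):
    """Standard deviation, in electrons, of the combined temporal noise.

    Photon shot noise and dark current are Poisson (variance = mean);
    read noise is given directly as a standard deviation.
    Independent sources add in variance.
    """
    variance = signal_e + dark_rate_e_per_s * exposure_s + read_noise_e ** 2
    return math.sqrt(variance)

def snr(signal_e, dark_rate_e_per_s, exposure_s, read_noise_e):
    """Signal-to-noise ratio for the budget above."""
    noise = total_noise_electrons(signal_e, dark_rate_e_per_s, exposure_s, read_noise_e)
    return signal_e / noise

if __name__ == "__main__":
    # Dim pixel: read noise is a large part of the budget.
    print("dim pixel SNR:", snr(16, 0.0, 1.0, 3.0))
    # Bright pixel: shot noise dominates and read noise barely matters.
    print("bright pixel SNR:", snr(10000, 0.0, 1.0, 3.0))
```

This is why exposing to the right helps with readout noise too: the read noise is roughly fixed, so a larger signal makes it a smaller fraction of the total.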


• All in all, RAW images are great for computer vision because of their dynamic range and because they are a simple affine transformation of the photon count.

• Just remember:

  • ETTR

  • Correct for stuck/broken pixels

  • Capture a dark frame (a frame with the lens cap on)

  • Capture a flat field (an image of a uniform background)
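The dark frame and flat field in the checklist above could be applied as in the following sketch. It assumes linear RAW values stored as flat Python lists, and that the dark frame, the flat field, and the image were all captured with the same exposure time and temperature. The function name is hypothetical.

```python
import math

def calibrate_raw(raw, dark, flat):
    """Dark-frame subtraction followed by flat-field correction.

    raw:  linear RAW pixel values of the scene
    dark: frame captured with the lens cap on (offset + dark current)
    flat: image of a uniform background (captures vignetting and PRNU)
    """
    # Remove the offset from the flat field as well.
    flat_signal = [f - d for f, d in zip(flat, dark)]
    mean_flat = sum(flat_signal) / len(flat_signal)
    corrected = []
    for r, d, f in zip(raw, dark, flat_signal):
        gain = f / mean_flat  # relative sensitivity of this pixel
        if gain <= 0.0:
            corrected.append(math.nan)  # dead pixel: flag it, do not divide
        else:
            corrected.append((r - d) / gain)
    return corrected

if __name__ == "__main__":
    # Three pixels looking at a uniform scene, with different sensitivities.
    dark = [10, 12, 8]
    flat = [210, 112, 158]
    raw = [110, 62, 83]
    print(calibrate_raw(raw, dark, flat))
```

After calibration the three pixels report the same value, as they should for a uniform scene. In practice, both the dark frame and the flat field are usually averages of many captures so that their own noise is not injected into the result.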



JPEG images

[Figure: Camera Imaging Pipeline, with blocks for Denoise, Demosaic, Bad Pixel Correction, Image Enhancing, Tone Mapping, Lens Correction, Black Level, and Metering (AF/AE)]




ISP Basics

• Black level subtraction

• Correct for stuck pixels

• Lens shading compensation

• White balance

• Demosaic

• Color-space conversion (usually sRGB)

• Color correction matrix (not the same as white balance)

• Gamma compression

• JPEG compression

• …
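A few of the steps above can be sketched for a single, already demosaicked pixel. This is a simplified illustration, not a real ISP: the black and white levels, the gains, and the matrix are hypothetical inputs, and the sRGB transfer function stands in for the gamma compression step.

```python
def srgb_encode(linear):
    """sRGB transfer function for a linear value in [0, 1]."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055

def process_pixel(raw_rgb, black_level, white_level, wb_gains, ccm):
    """Turn one linear RAW RGB triplet into an 8-bit output triplet."""
    # 1. Black level subtraction and normalization to [0, 1].
    scale = white_level - black_level
    lin = [(v - black_level) / scale for v in raw_rgb]
    # 2. White balance: one gain per channel.
    lin = [v * g for v, g in zip(lin, wb_gains)]
    # 3. Color correction matrix: a 3x3 mix of the channels,
    #    which is why it differs from white balance.
    lin = [sum(ccm[i][j] * lin[j] for j in range(3)) for i in range(3)]
    # 4. Clip, gamma-compress, and quantize to 8 bits.
    out = []
    for v in lin:
        v = min(max(v, 0.0), 1.0)
        out.append(round(255 * srgb_encode(v)))
    return out

if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(process_pixel([300, 500, 280], 64, 1023, [1.9, 1.0, 1.6], identity))
```

Bad pixel correction, lens shading, demosaicking, and JPEG compression are left out because they need neighboring pixels or whole-image context.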


Put everything together: Camera Response Function

[Figure: camera response function. Horizontal axis: Input (10-16 bits). Vertical axis: Output (8 bits).]


Radiometric calibration

And this function changes with the camera!

Grossberg and Nayar [3]


Radiometric calibration

• The camera response function changes significantly from camera to camera…

• … and even scene to scene!

• Usually estimated using images with different exposures.

For a review see Gallo and Sen [6]
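Once a response function is known, undoing it is straightforward. The sketch below assumes a pure gamma curve with exponent 1/2.2, which is only a stand-in: real camera response functions differ and have to be estimated. All names are hypothetical.

```python
GAMMA = 2.2  # assumed for illustration; a real CRF must be estimated

def crf(exposure):
    """Hypothetical gamma-like response: linear exposure in [0, 1] to 8 bits."""
    x = min(max(exposure, 0.0), 1.0)
    return round(255 * x ** (1 / GAMMA))

def inverse_crf(pixel_value):
    """Map an 8-bit pixel value back to linear exposure in [0, 1]."""
    return (pixel_value / 255) ** GAMMA

def relative_irradiance(pixel_value, exposure_time):
    """Linearize, then divide by the exposure time."""
    return inverse_crf(pixel_value) / exposure_time

if __name__ == "__main__":
    irradiance = 0.2  # same scene point in both shots
    short = crf(irradiance * 1.0)  # 1 s exposure
    long_ = crf(irradiance * 2.0)  # 2 s exposure
    print("pixel values:", short, long_)
    print("recovered:", relative_irradiance(short, 1.0), relative_irradiance(long_, 2.0))
```

The two pixel values are not in a 2:1 ratio even though the exposure doubled, but after linearization both shots agree on the irradiance up to quantization error. That consistency across exposures is what calibration methods exploit.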


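Why gamma-like? Because of our visual system

[Figure: gamma-like curve. Horizontal axis: Input (10-16 bits). Vertical axis: Output (8 bits). The input range is annotated "Very sensitive" and "Not so sensitive".]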
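One way to see the effect of a gamma-like curve is to count how many 8-bit codes are spent on the dark part of the range. This sketch assumes a pure power law with exponent 1/2.2 as a stand-in for an actual camera curve; the function name is hypothetical.

```python
def highest_code_below(linear_threshold, gamma):
    """Largest 8-bit code used by linear values up to linear_threshold.

    gamma = 1.0 is a linear encoding; gamma = 2.2 is a typical
    gamma-like compression (an assumption, not a specific camera).
    """
    return int(255 * linear_threshold ** (1 / gamma))

if __name__ == "__main__":
    darkest = 1 / 16  # four stops below white
    print("linear encoding:", highest_code_below(darkest, 1.0))
    print("gamma encoding: ", highest_code_below(darkest, 2.2))
```

With a linear encoding only a handful of codes describe the darkest sixteenth of the range; the gamma-like encoding spends several times more codes there, which is where differences are most visible.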


So about our question before…


Alternatives to standard ISPs


ISP Alternatives

[Figure: the same camera imaging pipeline blocks (Denoise, Demosaic, Bad Pixel Correction, Image Enhancing, Tone Mapping, Lens Correction, Black Level, Metering AF/AE)]


Beefy NVIDIA GPU!


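FlexISP: A flexible camera image processing framework

[2] Heide et al., 2014

$$\hat{x} = \arg\min_x \; \|z - Ax\|_2^2 + \lambda(x)$$

• z: observed image; x: reconstructed image

• A: image formation model (PSF, CFA, noise)

• First term: data fidelity. Second term: regularization.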
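The structure of this objective can be illustrated on a tiny 1D signal. The sketch below is not the FlexISP solver or its priors: it swaps in a 3-tap blur for A, a quadratic smoothness penalty weighted by a scalar `lam` for the regularizer, and plain gradient descent for the optimizer. All names are hypothetical.

```python
def blur_matrix(n):
    """A: 3-tap box blur with replicated borders, as an explicit n x n matrix."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in (i - 1, i, i + 1):
            A[i][min(max(k, 0), n - 1)] += 1.0 / 3.0
    return A

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def objective(A, z, x, lam):
    """Data fidelity |z - Ax|^2 plus lam * sum of squared differences."""
    data = sum((zi - yi) ** 2 for zi, yi in zip(z, matvec(A, x)))
    reg = sum((x[i + 1] - x[i]) ** 2 for i in range(len(x) - 1))
    return data + lam * reg

def gradient(A, At, z, x, lam):
    residual = [yi - zi for yi, zi in zip(matvec(A, x), z)]
    g = [2.0 * v for v in matvec(At, residual)]  # gradient of the data term
    for i in range(len(x) - 1):                  # gradient of the regularizer
        d = 2.0 * lam * (x[i + 1] - x[i])
        g[i] -= d
        g[i + 1] += d
    return g

def reconstruct(z, lam=0.1, step=0.1, iters=200):
    """Minimize the objective by gradient descent, starting from z."""
    A = blur_matrix(len(z))
    At = transpose(A)
    x = list(z)
    for _ in range(iters):
        g = gradient(A, At, z, x, lam)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

if __name__ == "__main__":
    z = [0.0, 0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0, 1.0]  # a blurred step edge
    print(reconstruct(z))
```

Changing the camera then means changing A, and changing the assumptions about natural images means changing the regularizer, while the optimization loop stays the same.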




Neural Network ISP

The problem with FlexISP is that the optimization happens at run time.

Why not use a neural network (NN), where inference is very quick?

We are working on this!


Pushing the limits of modern cameras


Limited Dynamic Range

[Figure: world log-luminance axis (log cd/m²), with ticks at -6, -2, 4, and 8]


High-dynamic-range Imaging (HDR)
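A minimal sketch of stack-based HDR for a single pixel, assuming linear (RAW-like) values, a static scene, and an already registered stack. The hat-shaped weight is one common choice among many, and the names are hypothetical.

```python
def hat_weight(value, max_value):
    """Trust mid-range samples; distrust values near 0 (noisy) or max (saturated)."""
    mid = max_value / 2
    return 1.0 - abs(value - mid) / mid

def merge_hdr(pixel_values, exposure_times, max_value=1023):
    """Weighted average of the per-exposure radiance estimates value / time."""
    num = 0.0
    den = 0.0
    for v, t in zip(pixel_values, exposure_times):
        w = hat_weight(v, max_value)
        num += w * v / t
        den += w
    if den == 0.0:
        # Every sample is at 0 or saturated: fall back to the shortest exposure.
        t_min = min(exposure_times)
        return pixel_values[exposure_times.index(t_min)] / t_min
    return num / den

if __name__ == "__main__":
    # The same bright scene point in three exposures; the longest one saturates.
    values = [20, 200, 1023]
    times = [0.01, 0.1, 1.0]
    print(merge_hdr(values, times))
```

The saturated sample gets zero weight, so the estimate comes from the two shorter exposures. With JPEG inputs the values would first need to be linearized with the inverse camera response.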


HDR Imaging Issues

• Biggest issue is image registration (camera motion and dynamic scenes)

• Need non-rigid registration

[Figure: input stack and output stack, from [7] Hu et al., 2013]


Limited field of view

• A 50 mm lens has a 47-degree FoV

• A 110 mm lens has a 22-degree FoV
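These numbers follow from the pinhole relation FoV = 2·atan(d / 2f). The sketch below assumes that d is the diagonal of a 36 × 24 mm full-frame sensor; the slide does not state the sensor size, but that choice reproduces both figures.

```python
import math

# Assumption: full-frame sensor, 36 x 24 mm, measured along the diagonal.
FULL_FRAME_DIAGONAL_MM = math.hypot(36.0, 24.0)

def field_of_view_deg(focal_length_mm, sensor_size_mm=FULL_FRAME_DIAGONAL_MM):
    """Angular field of view for a lens focused at infinity (pinhole model)."""
    return math.degrees(2.0 * math.atan(sensor_size_mm / (2.0 * focal_length_mm)))

if __name__ == "__main__":
    for focal in (50, 110):
        print(focal, "mm:", round(field_of_view_deg(focal)), "degrees")
```

Passing a smaller `sensor_size_mm` shows why the same lens gives a narrower field of view on a smaller sensor.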


Image stitching

From Wikipedia


Image stitching

Google Jump + GoPro Odyssey

GoPro Omni

Jaunt


Limited field of view

• Biggest issue is parallax

  • Rotate the camera around the center of projection of the lenses,

  • or hope everything is at infinity,

  • or do full 3D reconstruction.
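The size of the parallax problem can be estimated with a pinhole model. The sketch assumes the camera translates sideways by a small baseline between shots instead of rotating about its center of projection; the names and numbers are hypothetical.

```python
import math

def parallax_shift_px(focal_length_px, baseline_m, depth_m):
    """Image shift, in pixels, of a point at depth_m when the camera
    moves sideways by baseline_m (pinhole model)."""
    return focal_length_px * baseline_m / depth_m

if __name__ == "__main__":
    focal_px = 1000.0
    baseline = 0.1  # 10 cm between the two camera positions
    for depth in (1.0, 10.0, math.inf):
        print("depth", depth, "m:", parallax_shift_px(focal_px, baseline, depth), "px")
```

Points at different depths shift by different amounts, so no single alignment fits both foreground and background. The shift vanishes only when the baseline is zero or the scene is at infinity, which matches the first two options in the list.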


Limited control over depth-of-field

https://commons.wikimedia.org/w/index.php?curid=330435


[8] Agarwala et al. 2004

Extended depth-of-field (when is small)


Photo credit: Marc Levoy

Synthetic shallow depth-of-field (for small aperture)


Take home lessons

• RAW vs JPEG (or non-linear):

  • RAW:

    • Better model of incoming light, and larger dynamic range

    • Compensate for the types of noise involved in the image formation process.

  • JPEG:

    • Better picture quality thanks to the manufacturer’s processing

    • Be aware of the non-linear transformation and undo it when needed!

• You can deal with the limitations of the camera modules you are given by means of computation

  • E.g., stack-based HDR, extended depth-of-field, etc.


That’s all!

My contact info


References

[1] Widenhorn et al., “Temperature dependence of dark current in a CCD,” Electronic Imaging, 2002.

[2] Heide et al., “FlexISP: A flexible camera image processing framework,” SIGGRAPH Asia, 2014.

[3] Grossberg and Nayar, “What is the space of camera response functions?” CVPR, 2003.

[4] Debevec and Malik, “Recovering high dynamic range radiance maps from photographs,” SIGGRAPH, 1997.

[5] Grossberg and Nayar, “Determining the camera response from images: What is knowable?” PAMI, 2003.

[6] Gallo and Sen, “Stack-based algorithms for HDR capture and reconstruction,” chapter in “High Dynamic Range Video: From Acquisition, to Display and Applications,” Academic Press, 2016.

[7] Hu, Gallo, Pulli, and Sun, “HDR Deghosting: How to deal with Saturation?” CVPR, 2013.

[8] Agarwala, Dontcheva, Agrawala, Drucker, Colburn, Curless, Salesin, and Cohen, “Interactive Digital Photomontage,” SIGGRAPH, 2004.

[9] Sims, Yue, and Nayar, “Towards Flexible Sheet Cameras: Deformable Lens Arrays with Intrinsic Optical Adaptation,” ICCP, 2016.

[10] Asif, Ayremlou, Sankaranarayanan, Veeraraghavan, and Baraniuk, “FlatCam: Thin, Bare-Sensor Cameras using Coded Aperture and Computation,” arXiv, 2015.

[11] Petschnigg, Agrawala, Hoppe, Szeliski, Cohen, and Toyama, “Digital Photography with Flash and No-Flash Image Pairs,” SIGGRAPH, 2004.