Layered Depth Images

Layered Depth Images. Jonathan Shade (University of Washington), Steven Gortler (Harvard University), Li-wei He (Stanford University), Richard Szeliski (Microsoft Research)




Page 1: Layered Depth Images

Layered Depth Images

Jonathan Shade University of Washington

Steven Gortler Harvard University

Li-wei He Stanford University

Richard Szeliski Microsoft Research

Presented by Chung, Shuo-Heng

Page 2: Layered Depth Images

Introduction

The most familiar image-based rendering method is texture mapping; however…

1. Aliasing, e.g. an infinite checkerboard.

2. Speed is limited by the number of surfaces the texture is applied to,

e.g. a tree with thousands of leaves.

Page 3: Layered Depth Images

Introduction

Two extensions have been presented to address these two difficulties:

1. Sprite

•If the new view is near the original view, the new image can be created by altering the original sprite

•cannot reproduce parallax well

2. Depth image

•gaps will be introduced due to visibility changes, when some portion of the scene becomes unoccluded or a surface is magnified.

Page 4: Layered Depth Images

Introduction

The paper introduces two extensions:

•sprite with depth

•layered depth image.

Page 5: Layered Depth Images

Previous Work

Max:

•Uses a representation similar to an LDI.

•Focuses on high-quality anti-aliasing.

•Warps multiple input LDIs with different cameras.

Mark et al. and Darsa et al.:

•Create triangulated depth maps from input images with per-pixel depth.

•Take advantage of graphics hardware pipelines.

Page 6: Layered Depth Images

Previous Work

Shade et al. and Schaufler et al.:

•Render complex portions of a scene onto sprites.

•Reuse these sprites in subsequent frames.

Lengyel and Snyder extended the above work:

•Apply an affine transformation to fit a set of sample points.

•Transforms are allowed to change as the sample points change.

Horry et al.:

•Use a single input image and some user-supplied information.

•Are able to provide approximate 3D cues.

Page 7: Layered Depth Images

Previous Work

McMillan’s ordering algorithm: pixels are warped in an occlusion-compatible (back-to-front) order determined by the position of the epipole in the image, so correct visibility is obtained without depth comparisons.

Page 8: Layered Depth Images

Sprites

Texture maps, or images with alpha, rendered onto a planar surface.

1.Used directly as drawing primitives.

2.Used to generate new views by warping.

Page 9: Layered Depth Images

Sprites

• Where is pixel (x, y) from image 1 located in image 2?

• Assume image 1 is a picture of the x-y coordinate plane at z = 0. The plane can be placed arbitrarily, but setting z = 0 allows us to ignore the z coordinate.

• C1 = WPV takes a point (x, y) on the plane to pixel (x1, y1).

[Figure: the plane z = 0 viewed by cam1 and cam2; the plane point (x, y) projects to (x1, y1) in image 1 and (x2, y2) in image 2.]

Page 10: Layered Depth Images

Sprites

$$\begin{bmatrix} w_1 x_1 \\ w_1 y_1 \\ w_1 \end{bmatrix} = C_1 \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} w_2 x_2 \\ w_2 y_2 \\ w_2 \end{bmatrix} = C_2 \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = C_2\, C_1^{-1} \begin{bmatrix} w_1 x_1 \\ w_1 y_1 \\ w_1 \end{bmatrix}$$

w = w2 / w1
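As a concrete illustration, the plane-to-plane sprite warp above can be sketched in a few lines of numpy. The camera matrices here are hypothetical 3x3 stand-ins for C1 = WPV, not values from the paper:

```python
import numpy as np

def warp_sprite_point(C1, C2, p1):
    """Map a pixel from image 1 to image 2 through the shared plane.

    C1, C2: 3x3 matrices taking plane points (x, y, 1) to homogeneous
    pixels.  p1: pixel (x1, y1) in image 1.  Returns (x2, y2).
    """
    H = C2 @ np.linalg.inv(C1)            # homography C2 * C1^-1
    q = H @ np.array([p1[0], p1[1], 1.0]) # homogeneous warp
    return q[0] / q[2], q[1] / q[2]       # divide by w = w2 / w1
```

With identity cameras a point maps to itself; a C2 that translates by 5 in x maps (3, 4) to (8, 4).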

Page 11: Layered Depth Images

Sprites with Depth

With a per-pixel displacement $d_1(x_1, y_1)$ added to the plane, the warp becomes

$$\begin{bmatrix} w_2 x_2 \\ w_2 y_2 \\ w_2 \end{bmatrix} = H_{1,2} \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} + d_1(x_1, y_1)\, e_{1,2}$$

where $H_{1,2}$ is the planar homography between the two views and $e_{1,2}$ is the epipole.

Page 12: Layered Depth Images

Sprites with Depth

The forward map has a problem: when the view changes significantly, gaps may be introduced.

Page 13: Layered Depth Images

Sprites with Depth

In order to deal with this problem, the following steps are used:

1. Forward map the displacement map $d_1(x_1, y_1)$, using only the parallax component, to get $d_3(x_3, y_3)$:

$$\begin{bmatrix} w_3 x_3 \\ w_3 y_3 \\ w_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} + d_1(x_1, y_1)\, e^*_{1,2}, \qquad e^*_{1,2} = H_{1,2}^{-1}\, e_{1,2}$$

=> $d_3(x_3, y_3) = d_1(x_1, y_1)$

$H_{1,2}$ is obtained by dropping the third row and column of $C_2 C_1^{-1}$; $e_{1,2}$ is the corresponding third column of $C_2 C_1^{-1}$ (the epipole). This maps each $(x_1, y_1)$ to $(x_3, y_3)$.

Page 14: Layered Depth Images

Sprites with Depth

2. Backward map $d_3(x_3, y_3)$ to obtain $d_2(x_2, y_2)$: for each output pixel $(x_2, y_2)$,

$$\begin{bmatrix} w_3 x_3 \\ w_3 y_3 \\ w_3 \end{bmatrix} = H_{1,2}^{-1} \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix}$$

$d_2(x_2, y_2) = d_3(x_3, y_3)$

Page 15: Layered Depth Images

Sprites with Depth

3. Backward map the original sprite colors:

$$\begin{bmatrix} w_1 x_1 \\ w_1 y_1 \\ w_1 \end{bmatrix} = H_{2,1} \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix} + d_2(x_2, y_2)\, e_{2,1}$$

Assign the color at $(x_1, y_1)$ in the input image to $(x_2, y_2)$ in the output image.
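The identity that justifies splitting the warp into a pure-parallax step followed by a homography, $H(x + d\,H^{-1}e) = Hx + d\,e$, can be checked numerically. The homography and epipole below are made-up values for illustration only:

```python
import numpy as np

# Hypothetical homography and epipole, chosen only for the demo.
H12 = np.array([[1.0, 0.1, 2.0],
                [0.0, 1.0, 1.0],
                [0.0, 0.1, 1.0]])
e12 = np.array([0.5, 0.2, 0.05])
x1 = np.array([3.0, 4.0, 1.0])   # homogeneous source pixel
d1 = 0.7                          # out-of-plane displacement

# One-step warp: H12 * x1 + d1 * e12
one_step = H12 @ x1 + d1 * e12

# Factored form: pure parallax first (x3 = x1 + d1 * H12^-1 e12),
# then the homography H12 -- the basis of the multi-step algorithm.
e_star = np.linalg.inv(H12) @ e12
x3 = x1 + d1 * e_star
two_step = H12 @ x3

assert np.allclose(one_step, two_step)
```

Because the factorization is exact algebra, the two paths agree up to floating-point round-off for any invertible homography.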

Page 16: Layered Depth Images

Sprites with Depth

These steps (first forward mapping the displacements, then backward mapping with the new displacements) have the following advantages:

1. Small errors in displacement-map warping are not as evident as errors in warping the sprite image itself.

2. We can give the forward-warping step a simpler form by factoring out the planar perspective warp.

Page 17: Layered Depth Images

Sprites with Depth

We can rewrite the equations for the last two steps in terms of $(x_3, y_3)$:

$$\begin{bmatrix} w_3 x_3 \\ w_3 y_3 \\ w_3 \end{bmatrix} = H_{1,2}^{-1} \begin{bmatrix} x_2 \\ y_2 \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} w_1 x_1 \\ w_1 y_1 \\ w_1 \end{bmatrix} = \begin{bmatrix} w_3 x_3 \\ w_3 y_3 \\ w_3 \end{bmatrix} - d_3(x_3, y_3)\, e^*_{1,2}$$

Page 18: Layered Depth Images

Sprites with Depth

Another faster but less accurate variant:

In the first step, set $u_3(x_3, y_3) = x_1 - x_3$ and $v_3(x_3, y_3) = y_1 - y_3$.

In the third step, we use

$$\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = \begin{bmatrix} x_3 \\ y_3 \end{bmatrix} + \begin{bmatrix} u_3(x_3, y_3) \\ v_3(x_3, y_3) \end{bmatrix}$$

instead of

$$\begin{bmatrix} w_1 x_1 \\ w_1 y_1 \\ w_1 \end{bmatrix} = \begin{bmatrix} w_3 x_3 \\ w_3 y_3 \\ w_3 \end{bmatrix} - d_3(x_3, y_3)\, e^*_{1,2}$$

Page 19: Layered Depth Images

Sprites with Depth

Input color (sprite) image

Warped by homography only (no parallax)

Warped with homography and crude parallax (d1)

Warped with homography and true parallax (d2)

Warped with all three steps

Input depth map d1(x1, y1)

Pure parallax warped depth map d3(x3, y3)

Forward warped depth map d2(x2, y2)

Forward warped depth map without parallax correction

Sprite with “pyramid” depth map

Page 20: Layered Depth Images

Recovering sprite from image sequences

We can use computer vision techniques to extract sprites from image sequences:

1. Segment the sequence into coherently moving regions with a layered motion estimation algorithm.

2. Compute a parametric motion estimate (a planar perspective transformation) for each layer.

3. Determine the plane equation associated with each region by tracking feature points from frame to frame.

Page 21: Layered Depth Images

Recovering sprite from image sequences

The third of five images

Initial segmentation into six layers

Recovered depth map

Page 22: Layered Depth Images

Recovering sprite from image sequences

The five layer sprites

Residual depth image for fifth layer

Page 23: Layered Depth Images

Recovering sprite from image sequences

Re-synthesized third image

Novel view without residual depth

Novel view with residual depth

Page 24: Layered Depth Images

Layered Depth Image

A layered depth image can handle more general disocclusions and large amounts of parallax as the viewpoint moves. The paper presents three ways to construct an LDI:

1. LDIs from multiple depth images.

2. LDIs from a modified ray tracer.

3. LDIs from real images.

Page 25: Layered Depth Images

Layered Depth Image

The structure of an LDI:

DepthPixel =

    ColorRGBA: 32-bit integer

    Z: 20-bit integer

    SplatIndex: 11-bit integer

LayeredDepthPixel =

    NumLayers: integer

    Layers[0..NumLayers-1]: array of DepthPixel

LayeredDepthImage =

    Camera: camera

    Pixels[0..xres-1, 0..yres-1]: array of LayeredDepthPixel
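A minimal sketch of this structure in Python; the field widths are noted in comments, and the camera field is a stand-in type rather than a concrete camera model:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class DepthPixel:
    color_rgba: int    # packed 32-bit RGBA
    z: int             # depth quantized to 20 bits
    splat_index: int   # 11-bit index into the splat-size table

@dataclass
class LayeredDepthPixel:
    layers: List[DepthPixel] = field(default_factory=list)

    @property
    def num_layers(self) -> int:
        return len(self.layers)

@dataclass
class LayeredDepthImage:
    camera: Optional[object]   # camera parameters (stand-in type)
    xres: int
    yres: int
    pixels: Optional[list] = None   # pixels[y][x] -> LayeredDepthPixel

    def __post_init__(self):
        if self.pixels is None:
            self.pixels = [[LayeredDepthPixel() for _ in range(self.xres)]
                           for _ in range(self.yres)]
```

Each screen pixel holds a variable-length list of depth samples, which is what lets the LDI store surfaces hidden in the original view.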

Page 26: Layered Depth Images

LDI from Multiple Depth Images

An LDI can be constructed by warping n depth images into a common camera view.

Page 27: Layered Depth Images

LDI from a Modified Ray Tracer

[Figure: cube face A and frustum face B.]

Rays can be cast from any point on cube face A to any point on frustum face B.

Page 28: Layered Depth Images

LDI from a Modified Ray Tracer

•“Uniformly” sample the scene.

•Any object intersection hit by a ray is reprojected into the LDI.

•If the new sample is within a tolerance in depth of an existing depth pixel, the new sample’s color is averaged with the existing depth pixel. Otherwise a new depth pixel is created.
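The insertion rule can be sketched as follows; the list-of-[z, color] layout, the averaging, and the tolerance value are illustrative, not the paper's implementation:

```python
def insert_sample(layers, z, color, tol=0.01):
    """Insert a reprojected sample into one LDI pixel's layer list.

    If the new sample lies within `tol` of an existing layer's depth,
    average the colors; otherwise append a new layer.  `layers` is a
    list of [z, color] pairs; color is a single float for simplicity.
    """
    for layer in layers:
        if abs(layer[0] - z) <= tol:
            layer[1] = (layer[1] + color) / 2.0   # simplified averaging
            return
    layers.append([z, color])

pixel = []
insert_sample(pixel, 1.0, 10.0)
insert_sample(pixel, 1.005, 20.0)   # within tolerance: averaged
insert_sample(pixel, 2.0, 30.0)     # outside tolerance: new layer
# pixel is now [[1.0, 15.0], [2.0, 30.0]]
```

A production version would keep a running sample count per layer so the average is weighted rather than pairwise.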

Page 29: Layered Depth Images

LDI from Real Images

Use the voxel coloring algorithm to obtain the LDI directly from the input images.

Page 30: Layered Depth Images

LDI from Real Images

This is a dinosaur model reconstructed from 21 photographs.

Page 31: Layered Depth Images

Space Efficient Representation

It is important to maintain the spatial locality of depth pixels to exploit the CPU cache.

1. Reorganize the depth pixels into a linear array, ordered from bottom to top and left to right in screen space, and back to front along each ray.

2. An accumulated depth-pixel count is maintained for each scanline, so a depth pixel can be retrieved with its layer number as an index.
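A rough sketch of this linearization, assuming each pixel stores (z, color) samples and that back-to-front means decreasing z:

```python
def linearize(ldi_pixels):
    """Flatten an LDI into one linear array: scanlines bottom to top,
    pixels left to right within a scanline, and samples back to front
    (decreasing z) within a pixel.

    ldi_pixels[y][x] is a list of (z, color) samples; y = 0 is the
    bottom scanline.  Also returns a per-scanline sample count so a
    scanline's samples can be located in the flat array.
    """
    flat, counts = [], []
    for row in ldi_pixels:                      # bottom to top
        n = 0
        for layers in row:                      # left to right
            for sample in sorted(layers, key=lambda s: -s[0]):
                flat.append(sample)             # back to front along the ray
                n += 1
        counts.append(n)
    return flat, counts
```

Storing the samples contiguously in traversal order means the warping loop walks the array sequentially, which is what keeps cache misses low.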

Page 32: Layered Depth Images

Incremental Warping Computation

Recall that:

$$\begin{bmatrix} w_2 x_2 \\ w_2 y_2 \\ w_2 z_2 \\ w_2 \end{bmatrix} = C_2\, C_1^{-1} \begin{bmatrix} x_1 \\ y_1 \\ z_1 \\ 1 \end{bmatrix} = T_{1,2} \begin{bmatrix} x_1 \\ y_1 \\ z_1 \\ 1 \end{bmatrix}$$

Set $T_{1,2} = C_2\, C_1^{-1}$.

Page 33: Layered Depth Images

Incremental Warping Computation

start and depth are computed once per layered depth pixel:

$$T_{1,2} \begin{bmatrix} x_1 \\ y_1 \\ z_1 \\ 1 \end{bmatrix} = T_{1,2} \begin{bmatrix} x_1 \\ y_1 \\ 0 \\ 1 \end{bmatrix} + z_1\, T_{1,2} \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} = \mathrm{start} + z_1 \cdot \mathrm{depth}$$

Page 34: Layered Depth Images

Incremental Warping Computation

$$T_{1,2} \begin{bmatrix} x_1 + 1 \\ y_1 \\ 0 \\ 1 \end{bmatrix} = \mathrm{start} + T_{1,2} \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \mathrm{start} + \mathrm{xincr}$$

We can simply increment the start to get the warped position of the next layered depth pixel along a scanline.
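The incremental scheme rests on the linearity of T1,2, which a small numpy check makes explicit; the warp matrix here is a hypothetical stand-in for C2 C1^-1:

```python
import numpy as np

# Hypothetical 4x4 warp T = C2 * C1^-1, fixed for the demo.
T = np.array([[1.0, 0.0, 0.2, 3.0],
              [0.0, 1.0, 0.1, 1.0],
              [0.0, 0.0, 1.0, 0.5],
              [0.0, 0.0, 0.05, 1.0]])

x1, y1, z1 = 2.0, 5.0, 4.0
start = T @ np.array([x1, y1, 0.0, 1.0])   # warped pixel at depth 0
depth = T @ np.array([0.0, 0.0, 1.0, 0.0]) # per-unit-depth increment
xincr = T @ np.array([1.0, 0.0, 0.0, 0.0]) # per-pixel scanline increment

# Depth layers of one pixel: add z1 * depth instead of a full multiply.
full = T @ np.array([x1, y1, z1, 1.0])
assert np.allclose(start + z1 * depth, full)

# Next pixel on the scanline: add xincr instead of a full multiply.
next_full = T @ np.array([x1 + 1.0, y1, 0.0, 1.0])
assert np.allclose(start + xincr, next_full)
```

Each layered depth pixel therefore costs one vector add per layer and one per step along the scanline, rather than a 4x4 matrix multiply.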

Page 35: Layered Depth Images

Splat Size Computation

To splat the LDI into the output image, we roughly approximate the projected area of the warped pixel.

[Figure: sampling densities res1 = 1/(w1 h1) for the LDI camera and res2 = 1/(w2 h2) for the output camera.]

Output camera

Page 36: Layered Depth Images

Splat Size Computation

The square root in the size computation can be approximated more efficiently.

Page 37: Layered Depth Images

Splat Size Computation

The square root can be further approximated by a lookup table:

5 bits for $d_1$

6 bits for the normal (3 bits for $n_x$, 3 bits for $n_y$)

Total 11 bits => 2048 possible values.

$$\mathrm{size}_2 = z_2 \cdot \mathrm{lookup}[n_x, n_y, d_1]$$

Page 38: Layered Depth Images

Splat Size Computation

The paper uses four splat sizes: 1x1, 3x3, 5x5 and 7x7.

Each pixel in a footprint has an alpha value to approximate a Gaussian splat kernel.

These alpha value are rounded to 1, ½, or ¼, so the alpha blending can be done with integer shift and add.
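A sketch of shift-and-add blending, assuming a simple lerp formulation out = dst + (src - dst) * alpha with alpha in {1, 1/2, 1/4} encoded as a right-shift amount; the encoding is illustrative, not the paper's exact kernel:

```python
def blend_channel(dst, src, alpha_shift):
    """Blend one 8-bit channel using only integer shift and add.

    alpha_shift encodes alpha: 0 -> 1, 1 -> 1/2, 2 -> 1/4.
    out = dst + (src - dst) * alpha, with the multiply replaced by an
    arithmetic right shift.
    """
    return dst + ((src - dst) >> alpha_shift)
```

For example, blending dst=100 toward src=200 with alpha 1/2 gives 150, all without a multiply or a floating-point operation.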

Page 39: Layered Depth Images

Depth Pixel Representation

To fit four depth pixels into a single cache line (32 bytes on the Pentium Pro and Pentium II):

1. Convert the floating-point Z value into a 20-bit integer.

2. The splat table index is 11 bits.

3. The R, G, B, and alpha values fill out the other 4 bytes.

This yields a 25 percent improvement in rendering speed.
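The 8-byte packing can be sketched with Python bit operations; the exact bit order within the second 32-bit word is an assumption, since the slide only gives the field widths:

```python
def pack_depth_pixel(rgba, z20, splat11):
    """Pack a depth pixel into two 32-bit words (8 bytes total):
    word 0 = packed RGBA, word 1 = 20-bit z | 11-bit splat index.
    Bit layout within word 1 is illustrative."""
    assert 0 <= z20 < (1 << 20) and 0 <= splat11 < (1 << 11)
    word1 = (z20 << 11) | splat11
    return (rgba & 0xFFFFFFFF, word1)

def unpack_depth_pixel(words):
    """Recover (rgba, z, splat_index) from the packed pair."""
    rgba, word1 = words
    return rgba, word1 >> 11, word1 & 0x7FF
```

Since 20 + 11 = 31 bits, z and the splat index share one 32-bit word, so four such pixels fill a 32-byte cache line exactly.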

Page 40: Layered Depth Images

Clipping

• Split the LDI frustum into two segments, a near one and a far one.

• The near and far segments are clipped individually.

• The near segment is kept smaller than the far segment.

• Intersect the view frustum with the frustum of the LDI.

• Only visible pixels are rendered.

• Render the far segment first, then the near segment.

• This speeds up rendering by a factor of 2 to 4.

Page 41: Layered Depth Images

Clipping

Page 42: Layered Depth Images

Warping summary

Page 43: Layered Depth Images

Result

Page 44: Layered Depth Images

Result

Page 45: Layered Depth Images

Result

Page 46: Layered Depth Images

Future Work

•Explore representations and rendering algorithms that combine several IBR techniques.

•Develop automatic techniques for taking a 3D scene and re-representing it in the most appropriate fashion for image-based rendering.