
The authors are solely responsible for the content of this technical presentation. The technical presentation does not necessarily reflect the official position of the Society of Motion Picture and Television Engineers (SMPTE), and its printing and distribution does not constitute an endorsement of views which may be expressed. This technical presentation is subject to a formal peer-review process by the SMPTE Board of Editors, upon completion of the conference. Citation of this work should state that it is a SMPTE meeting paper. EXAMPLE: Author's Last Name, Initials. 2010. Title of Presentation, Meeting name and location.: SMPTE. For information about securing permission to reprint or reproduce a technical presentation, please contact SMPTE at [email protected] or 914-761-1100 (3 Barker Ave., White Plains, NY 10601).

SMPTE Meeting Presentation

Live Holographic TV: From Misconceptions to Engineering

V. Michael Bove, Jr. MIT Media Lab, 20 Ames St., Room E15-448, Cambridge MA USA, [email protected]

Written for presentation at the 2011 SMPTE International Conference on Stereoscopic 3D for Media and Entertainment

Abstract. Futurists have recently proposed that “holographic telepresence” will be one of the life-changing technologies of the next few years. But what exactly do we mean by holographic TV? Did Al Gore really appear as a hologram at Live Earth Tokyo? What technologies and processes would it require to display live holograms of real scenes? Do we have to capture directly and transmit actual holographic diffraction patterns or can we use a camera array or a rangefinding camera (like a Kinect) and compute the hologram from that? And what does holographic TV have to do with standardization efforts for other sorts of 3-D TV? In this presentation I answer these questions and more.

Keywords. 3-D, stereoscopic television, holographic television, holograms, visual perception, displays, telepresence


Introduction

Why Are We Talking About Holograms at a TV Conference, Anyway?

Although Gabor pioneered holography in 1948, it was the “off-axis” holograms developed by Leith and Upatnieks in the early 1960s that made holographic images of real 3-D scenes practical. Not long after announcing their holograms, Leith and Upatnieks’ team presented a paper at the 1965 SMPTE technical conference on how holographic television might work (Fig. 1). [1]

Figure 1. Proposed holographic television transmitter, SMPTE Journal, 1965.

While the proposed system would not have been possible with 1965 analog technology, and wouldn’t really be practical even with 2011 digital technology, one might question the often-expressed notion that holography, like photography, fully matured as a “wet” process before becoming electronic, and then digital. Indeed, holography came of age along with the computer, and within three years of the above SMPTE paper researchers were already synthesizing holograms on computers. [2]

On the Proper Definition of “Hologram”

In late 2010, IBM futurists included “holographic telepresence” as one of five technologies that they predicted would change people’s lives within five years. [3] But what is a “hologram”? Writers and marketers sometimes use the term generically for any multiview autostereoscopic (no glasses) display, including parallax barrier or lenticular screens, volumetric displays, multi-rear-projection displays with angularly selective diffusers, and “real” holograms; even some 2-D technologies like Musion’s high-definition version of the old “Pepper’s Ghost” [4] illusion are referred to as holograms (do an Internet search on “Al Gore hologram” or “Prince Charles hologram”).

To a holographer, a hologram is a diffraction pattern that can recreate the light wavefronts that come from a desired visual scene. In an optically-captured hologram this pattern is made through interference between a reference laser beam and laser light reflected from the scene, while a computed hologram can be a mathematical simulation of the interference, or it can be built up by superposing smaller patterns that direct light in desired directions.
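For concreteness, here is a minimal Python/NumPy sketch of the simulated-interference case: it computes the fringe pattern formed on the hologram plane by an off-axis plane-wave reference beam and the spherical wave from a single object point. The wavelength, sample pitch, and geometry are illustrative assumptions, not parameters of any particular system.

```python
import numpy as np

# Minimal sketch: interference between an off-axis plane-wave reference
# and the spherical wave from one object point. All parameters below
# are illustrative assumptions.

wavelength = 633e-9           # red HeNe illumination, meters
k = 2 * np.pi / wavelength    # wavenumber
pitch = 1e-6                  # hologram-plane sample pitch, meters
n = 1024                      # samples per side

# Hologram-plane coordinates
xs = (np.arange(n) - n / 2) * pitch
x, y = np.meshgrid(xs, xs)

# Object wave: spherical wavefront from a point 0.1 m behind the plane
z_obj = 0.1
r = np.sqrt(x**2 + y**2 + z_obj**2)
obj = np.exp(1j * k * r) / r

# Reference wave: plane wave tilted by 5 degrees about the y axis
theta = np.deg2rad(5.0)
ref = np.exp(1j * k * np.sin(theta) * x)

# Recorded fringes: intensity of the superposed fields
fringes = np.abs(obj + ref) ** 2
```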

Because holograms have a great deal of control over the behavior of the light coming from them (unlike a 2-D display, they can independently control not only its intensity but also its direction and the curvature of its wavefronts, i.e. the apparent distances from which it is emitted), they can provide the appearance of 3-D and also avoid the common problem in stereoscopic displays known as accommodation-vergence mismatch. In most stereoscopic displays there is a decoupling between the distance to which the viewer’s eyes are converged and the distance at which they are focused. Although accommodation (focusing) works as a perceptual cue to depth only up to a few meters of distance, significant mismatch can be unpleasant to viewers, and thus viewer comfort may require limiting the usable range of stereoscopic depth to less than what a display is capable of showing, particularly at shorter viewing distances (as in mobile devices). [5],[6] Holograms also offer continuous motion parallax (the view changes smoothly with viewer position), which is not generally practical with stereographic displays.

Making Holographic TV Happen

In a previous presentation at this conference [7] I reviewed design aspects of a holographic television display. As noted in that paper, much work remains to be done on displays suitable for the home, but here I will concentrate on some practical considerations for an end-to-end system for live holographic TV.

Analog Examples

The 1965 SMPTE paper referred to above noted two significant problem areas in the design of such a system, both of which remain problematic to the present day: scene capture and bandwidth. Because holographic capture requires coherent light and scene immobility to within a fraction of a wavelength of light during the exposure interval, extremely powerful pulsed lasers and exposures measured in nanoseconds would be required for television production (the late holographic pioneer and colleague Stephen Benton remarked that this was why most holograms in circulation were of “small dead things”). Further, the large amount of data in a hologram makes real-time transmission impractical over the usual channels.
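To put rough numbers on the stability requirement (a back-of-envelope illustration, not figures from the 1965 paper): if scene motion during the exposure must stay below roughly a tenth of a wavelength, then at a HeNe wavelength of 633 nm,

\[
\Delta x \lesssim \frac{\lambda}{10} \approx 63\,\mathrm{nm},
\qquad
t_{\mathrm{exp}} \lesssim \frac{\Delta x}{v} = \frac{63 \times 10^{-9}\,\mathrm{m}}{1\,\mathrm{m/s}} \approx 63\,\mathrm{ns},
\]

so even a subject moving at a modest 1 m/s forces the exposure into the tens of nanoseconds, hence the need for powerful pulsed lasers.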

Notwithstanding the capture and bandwidth issues, starting in the mid-1960s researchers attempted to build analog holographic TV systems. Generally these involved reducing bandwidth by having small images and extremely narrow view angles; in a hologram the diffraction angle is inversely related to the pixel pitch (for view angles in the tens of degrees, physics dictates a pixel size similar to the wavelength of the illumination!). A group at Bell Laboratories in 1966 captured a holographic interference fringe pattern on a vidicon and transmitted it to a cathode-ray tube, where it was photographed and the resulting transparency illuminated by a laser to reconstruct the image. [8] A CBS Laboratories team in 1972 directly wrote the signal from a vidicon to a reusable thermoplastic material to create a phase hologram. [9] When liquid-crystal light modulators designed for projectors became available in the 1990s, several groups used these with laser illumination as the displays for analog real-time holographic TV systems. [10],[11]
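The pitch-angle relationship mentioned parenthetically above follows from the grating equation: the finest fringe a modulator with pixel pitch $p$ can represent has period $2p$ (the Nyquist limit), so the maximum diffraction half-angle is

\[
\sin\theta_{\max} = \frac{\lambda}{2p}
\quad\Longrightarrow\quad
p = \frac{\lambda}{2\sin\theta_{\max}}.
\]

For a ±30° view angle at $\lambda = 532$ nm this gives $p = 532$ nm, a pixel pitch equal to the wavelength itself.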

Digital

Holographic telepresence/TV research in the digital realm has differed from the analog examples above in two major ways. First is the move away from scene capture using coherent light and interference. The availability of tiny and inexpensive cameras has enabled relatively easy construction of multiview camera arrays, both horizontal-parallax-only (HPO) and full-parallax; also, digital camera resolution has increased to the degree that lenslet arrays can be attached to single sensors, permitting the digital capture of integral images. While digital holograms can be computed from 3-D computer graphics models, it is also possible to make holo-stereograms from sets of parallax 2-D views, and the latter approach is typically used in current holographic TV systems.

The 1960s holographic television pioneers could hardly have predicted it, but the graphics processors (GPUs) in current personal computers and game consoles have sufficient processing power to generate holographic stereograms at SDTV resolutions in real time. [12] Given the ubiquity of such computing power, it makes sense not to transmit holograms but instead to transmit the underlying views – since they are essentially the information content of the holograms – and compress them using standards like the multiview extension to H.264/MPEG-4 AVC. [13] An additional advantage of this approach to the bandwidth problem is that holograms must be generated to match the physical characteristics of the display (such as the pixel pitch of the light modulator and the wavelengths of the color light sources), so supporting a range of display sizes and architectures becomes practical only when the holograms are generated by the receivers. This approach further places holographic displays in the same ecosystem as other multiview autostereoscopic displays that use optical elements rather than diffraction; the same source material could be streamed to either sort of display (although the holographic display could support much denser sets of parallax views if available).

A holographic stereogram can be thought of as very much like any other sort of stereogram, in which an array of apertures permits the eyes to see the appropriate parallax-view pixels in different directions. In the case of the hologram, this is done not with physical apertures or lenses but by segmenting the hologram into a set of regions called hogels, each of which contains a set of diffraction patterns that send light in specific directions; these are modulated by the intensities of the parallax views in the corresponding directions. Thus, if the parallax views are available, generating the hologram is effectively a matter of multiplying precomputed basis functions by the view intensities and summing the results into the hologram.
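As a schematic illustration of that basis-times-intensity superposition, the following Python sketch assembles one scanline of an HPO holo-stereogram. The hogel size, view count, and random placeholder basis table are my own illustrative assumptions; a real system would derive the basis fringes from the display's pixel pitch and illumination wavelength.

```python
import numpy as np

# Schematic holo-stereogram assembly: each hogel is a weighted sum of
# precomputed directional basis fringes, with weights taken from the
# parallax-view intensities. Sizes and the basis table are placeholders.

hogel_len = 256        # samples per (1-D, HPO) hogel
n_views = 16           # parallax views / emission directions
n_hogels = 128         # hogels across one scanline

rng = np.random.default_rng(0)
# Placeholder basis: one fringe pattern per emission direction.
basis = rng.standard_normal((n_views, hogel_len))

# views[v, h]: intensity that hogel h should send toward direction v
views = rng.random((n_views, n_hogels))

# Assemble: each hogel is the view-weighted sum of the basis fringes;
# concatenating the hogels yields the hologram scanline.
scanline = np.einsum('vh,vl->hl', views, basis).reshape(-1)
```

Because the basis functions are fixed per display, this reduces per-frame work to multiply-accumulate operations, which is what makes real-time GPU generation practical.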

Figure 2. In a typical holographic stereogram (left), a hogel emits light in multiple directions with the same wavefront curvature but different intensities. In a Diffraction Specific Coherent Panoramagram (right), both intensity and curvature can vary with direction, eliminating accommodation-vergence mismatch and giving smooth motion parallax with fewer views.

While the above sounds wonderfully simple, in one sense it is too simple. In return for its computational simplicity, the holo-stereogram gives up one of the main advantages of holograms: providing consistent vergence and accommodation cues to the viewer. The holo-stereogram is so similar to other stereograms that it suffers the same mismatch problem. Seeking to remedy this, recent research in the Object-Based Media Group at the MIT Media Lab produced the Diffraction Specific Coherent Panoramagram (DSCP), a new form of stereogram that is not much more computationally complex than the above method (it can still be computed in real time) but is free of accommodation-vergence mismatch. [14] The DSCP requires knowing the distance to each point in the parallax views, which is no problem when the views are generated from a graphics model (the z-buffer values can be retained); for real imagery it is necessary to use range-sensing cameras, but these are quickly becoming available, as will be discussed below. In the DSCP, the basis functions multiplied by the view intensities are selected not just by angle but also by the 3-D locations of points in the parallax views, with the result that the wavefront curvature can be varied, preserving the correct accommodation cue (Fig. 2). An interesting side effect of having the correct wavefront curvatures is that smooth motion parallax is possible with many fewer views than in a regular stereogram – about a tenth along each dimension in our experience. Thus, instead of the hundreds of camera positions that a typical HPO stereogram would require for the appearance of continuous parallax, the DSCP works well with a much more manageable number.
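As a concrete, simplified illustration of the depth-dependent basis selection, the sketch below builds chirped fringes whose linear phase term sets the emission direction and whose quadratic (Fresnel) phase term sets the wavefront curvature for a given point depth. All parameters are illustrative assumptions, not the actual DSCP implementation.

```python
import numpy as np

# DSCP idea in miniature: instead of a flat fringe per direction, use
# a chirped fringe whose curvature matches each scene point's depth,
# so reconstructed light appears to diverge from the right distance.

wavelength = 532e-9
k = 2 * np.pi / wavelength
pitch = 0.5e-6
hogel_len = 512

xs = (np.arange(hogel_len) - hogel_len / 2) * pitch

def dscp_basis(angle_rad, depth_m):
    """Chirped fringe: linear phase sets direction, quadratic phase
    (Fresnel approximation) sets wavefront curvature for the depth."""
    linear = k * np.sin(angle_rad) * xs
    quadratic = k * xs**2 / (2.0 * depth_m)
    return np.cos(linear + quadratic)

# One hogel: basis fringes weighted by view intensity, with curvature
# chosen per point depth (here one depth per direction, for brevity;
# depths would come from a z-buffer or rangefinding camera)
angles = np.deg2rad(np.linspace(-10, 10, 8))
depths = np.full(8, 0.25)          # meters
intensities = np.random.default_rng(1).random(8)

hogel = sum(w * dscp_basis(a, d)
            for w, a, d in zip(intensities, angles, depths))
```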

One direction that has been taken by some recent developers of holographic television systems involves capture of integral images and conversion to holograms. In integral photography, developed by Lippmann in 1908, film is placed at the focal plane of a sheet of tiny convex lenslets, each of which images its own perspective onto a small region of the picture. Following a replication step that converts the pseudoscopic image to one with correct parallax, the resulting image is viewed through a lenslet array, which recreates the discrete perspectives in the appropriate directions and causes the viewer to perceive a 3-D image. Electronic image sensors have grown in resolution to the point that it is now possible to acquire integral images electronically. Because of the small size of electronic sensors, getting a good range of parallax views requires placing a large field lens and an aperture between the lenslet array and the sensor. Each subimage (a region of the overall captured image covered by one lenslet) can be processed independently into a small hologram (a process which in essence consists of computing fast Fourier transforms), and a tiled array of these holograms is displayed on a light modulator illuminated by a monochromatic light source (or three, in the case of color images). The conversion to a hologram can be done on dedicated digital signal processing hardware or on GPUs. [15],[16]
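The per-subimage conversion lends itself to a compact sketch: each lenslet's subimage is Fourier-transformed independently and the resulting elementary holograms are tiled. The array sizes and the Fourier-hologram formulation below are illustrative assumptions; the cited systems differ in detail.

```python
import numpy as np

# Sketch of integral-image-to-hologram conversion: one FFT per lenslet
# subimage, elementary holograms tiled back into a full frame.
# Sizes are illustrative placeholders.

lenslets = 32          # lenslets per side
sub = 16               # pixels per subimage side

rng = np.random.default_rng(2)
integral_img = rng.random((lenslets * sub, lenslets * sub))

tiles = []
for i in range(lenslets):
    row = []
    for j in range(lenslets):
        patch = integral_img[i*sub:(i+1)*sub, j*sub:(j+1)*sub]
        # Elementary (Fourier) hologram of one subimage
        row.append(np.fft.fftshift(np.fft.fft2(patch)))
    tiles.append(np.hstack(row))
hologram = np.vstack(tiles)

# A physical display would encode np.angle(hologram) or
# np.real(hologram) onto the modulator, depending on whether it is a
# phase or amplitude device.
```

Because every subimage is independent, the transforms parallelize trivially, which is why DSP hardware and GPUs handle this conversion in real time.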

Another branch of current holographic television/telepresence work is based on capturing and transmitting arrays of parallax images with cameras and converting them to holo-stereograms for display. A University of Arizona group captured and transmitted images from 16 small cameras, and displayed holo-stereograms on a 100 by 100 cm refreshable photorefractive polymer display at a rate of 0.5 frames/second. [17] Zebra Imaging claims an earlier demonstration of telepresence from a camera array to a holographic display, but has not published details as of this writing. The Object-Based Media Group at the MIT Media Lab has transmitted intensity+depth images from the Microsoft Kinect rangefinding camera over a computer network to a PC, which converted them to HPO DSC Panoramagrams in real time at 15 frames/second (Figs. 3, 4). [18] The resulting holograms have been displayed on both our 30 frames/second display based on acousto-optic light modulation and Arizona’s photorefractive polymer display. It is important to note that a hologram can be created from a single rangefinding camera, but occluded regions will be missing when the viewer looks far to the left or right of the original camera viewpoint. Active rangefinding cameras like the Kinect, which project infrared patterns onto the scene, pose problems when more than one is used simultaneously because they interfere with one another, but we have found that multiple Kinects can be used if the angle between them is large enough.


Figure 3. Kinect camera captures both range and intensity images, which are converted in real time to holograms on the MIT acousto-optic display (right).

Figure 4. Frame from a holographic sequence generated from the Kinect camera, displayed on the Arizona display (not at full video rate). Courtesy of University of Arizona College of Optical Sciences.


Conclusion

Holographic television is no longer science fiction, and it’s also not just a marketing gimmick. It is proving possible to create and display actual holograms of moving scenes in real time. While research remains to be done on practical displays suitable for consumers, problems in other parts of the holographic television chain are being solved by technologies and standards originally developed for other sorts of consumer imaging.

Acknowledgements

The author gratefully acknowledges the late Steve Benton and the many alumni of the Spatial Imaging Group who started the holo-video project at the MIT Media Lab that led to our current research. Thanks to Jim Barabas, Sunny Jolly, Dan Smalley, and Quinn Smithwick from the author’s research group. Thanks to the University of Arizona College of Optical Sciences for their collaborative support. This work has been supported in part by the Digital Life, Things That Think, and CELab consortia at the MIT Media Lab. This research was also funded in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), through the AFRL contract FA8650-10-C-7034. All statements of fact, opinion or conclusions contained herein are those of the authors and should not be construed as representing the official views or policies of IARPA, the ODNI, or the U.S. Government. Thanks also to NVIDIA for providing graphics hardware used in this research.

References

1. E. N. Leith, J. Upatnieks, B. P. Hildebrand, and K. Haines, “Requirements for a wavefront reconstruction facsimile system,” Journal of the Society of Motion Picture and Television Engineers, 74:10, pp. 893-896, Oct. 1965.

2. L. B. Lesem, P. Hirsch, and J. A. Jordan, Jr., “Holographic display of digital images,” Communications of the ACM, 11:10, pp. 661-674, Oct. 1968.

3. “IBM reveals five innovations that will change our lives in the next five years,” IBM press release, Dec. 27, 2010, http://www-03.ibm.com/press/us/en/pressrelease/33304.wss

4. R. Yu, “Holograms could soon give virtual meetings new life,” USA Today, p. 5B, February 24, 2011.

5. D. M. Hoffman, A. R. Girshick, K. Akeley, and M. S. Banks, “Vergence-accommodation conflicts hinder visual performance and cause visual fatigue,” Journal of Vision, 8:3, pp. 1-30, March 2008.

6. J. Hecht, “3-D TV and movies: exploring the hangover effect,” Optics and Photonics News, 22:2, pp. 20-27, Feb. 2011.

7. V. M. Bove, Jr., “Holographic television: what and when?” SMPTE Motion Imaging Journal, 120:4, May-June 2011.

8. L. H. Enloe, J. A. Murphy, and C. B. Rubinstein, “Hologram transmission via television,” Bell Syst. Tech. J., 45:2, pp. 335-339, Feb. 1966.

9. R. J. Doyle and W. E. Glenn, “Remote real-time reconstruction of holograms using the Lumatron,” Applied Optics, 11:5, pp. 1261-1264, 1972.

10. K. Sato, K. Higuchi, and H. Katsuma, “Holographic television by liquid-crystal device,” Proc. SPIE Practical Holography VI, vol. 1667, pp. 19-31, 1992.


11. N. Hashimoto, K. Hoshino, and S. Morokawa, “Improved real-time holography system with LCDs,” Proc. SPIE Practical Holography VI, vol. 1667, pp. 2-7, 1992.

12. Q. Y. J. Smithwick, J. Barabas, D. E. Smalley, and V. M. Bove, Jr., “Real-time shader rendering of holographic stereograms,” Proc. SPIE Practical Holography XXIII, vol. 7233, p. 723302, 2009.

13. A. Vetro, T. Wiegand, and G. J. Sullivan, “Overview of the stereo and multiview video coding extensions of the H.264/MPEG-4 AVC standard,” Proc. of the IEEE, 99:4, pp. 626-642, April 2011.

14. Q. Y. J. Smithwick, J. Barabas, D. E. Smalley, and V. M. Bove, Jr., “Interactive holographic stereograms with accommodation cues,” Proc. SPIE Practical Holography XXIV, vol. 7619, p. 761903, 2010.

15. R. Oi, T. Mishina, K. Yamamoto, and M. Okai, “Real-time IP-hologram conversion hardware based on floating point DSPs,” Proc. SPIE Practical Holography XXIII, vol. 7233, p. 723305, 2009.

16. K. Yamamoto, T. Mishina, R. Oi, T. Senoh, and T. Kurita, “Real-time color holography system for live scene using 4K2K video system,” Proc. SPIE Practical Holography XXIV, vol. 7619, p. 761906, 2010.

17. P.-A. Blanche, A. Bablumian, R. Voorakaranam, C. Christenson, W. Lin, T. Gu, D. Flores, P. Wang, W.-Y. Hsieh, M. Kathaperumal, B. Rachwal, O. Siddiqui, J. Thomas, R. A. Norwood, M. Yamamoto, and N. Peyghambarian, “Holographic three-dimensional telepresence using large-area photorefractive polymer,” Nature, 468:7320, pp. 80-83, 2010.

18. J. Barabas, S. Jolly, D. E. Smalley, and V. M. Bove, Jr., “Diffraction specific coherent panoramagrams of real scenes,” Proc. SPIE Practical Holography XXV, vol. 7957, p. 795702, 2011.