Underwater 3D Capture using a Low-Cost Commercial Depth Camera


Sundara Tejaswi Digumarti (Disney Research Zurich, ETH Zurich) dtejaswi@mavt.ethz.ch
Aparna Taneja (Disney Research Zurich)
Amber Thomas (Walt Disney World) Amber.X.Thomas.-ND@disney.com
Gaurav Chaurasia (Disney Research Zurich) gaurav.chaurasia@disneyresearch.com
Roland Siegwart (ETH Zurich) rsiegwart@ethz.ch
Paul Beardsley (Disney Research Zurich) pab@disneyresearch.com

    Abstract

This paper presents underwater 3D capture using a commercial depth camera. Previous underwater capture systems use ordinary cameras, and it is well known that a calibration procedure is needed to handle refraction. The same is true for a depth camera being used underwater. We describe a calibration method that corrects the depth maps for refraction effects. Another challenge is that depth cameras use infrared (IR) light, which is heavily attenuated in water. We demonstrate that scanning is possible with commercial depth cameras for ranges of up to 20 cm in water.

The motivation for using a depth camera under water is the same as in air: it provides dense depth data and higher quality 3D reconstruction than multi-view stereo. Underwater 3D capture is being increasingly used in marine biology and oceanology; our approach offers exciting prospects for such applications.

To the best of our knowledge, ours is the first approach that successfully demonstrates underwater 3D capture using low-cost depth cameras like the Intel RealSense. We describe a complete system, including a protective housing for the depth camera which is suitable for handheld use by a diver. Our main contribution is an easy-to-use calibration method, which we evaluate on exemplar data as well as on 3D reconstructions in a lab aquarium. We also present initial results of an ocean deployment.

    1. Introduction

Depth cameras have grown in importance since the introduction of the first Kinect [3], a general-purpose, low-cost technology alongside ordinary cameras. The underlying principle is to emit infrared (IR) light and capture the light reflected by scene objects. The two common approaches for using IR light to measure depth are time-of-flight (TOF) and structured light, wherein a structured pattern of IR light is projected onto the surface and the distortion in the reflected pattern is used to calculate surface geometry.

(Footnote: The presented work was done when this author was at Disney Research Zurich. The author is currently at Google, Zurich.)

The advantage of a depth camera is that it produces dense and reliable depth measurements, albeit over a limited range. While the Kinect was developed for gaming, it has also been demonstrated for 3D scene capture [20]. Google Tango is a prototype mobile phone with a depth camera that is targeted at 3D scene capture [5]. Our goal is to use an underwater depth camera to capture 3D models of submerged objects like marine flora and fauna. Applications that make use of this data require centimeter to millimeter accuracy in order to capture minute changes in geometry. Our use of depth cameras is motivated by such high-accuracy requirements.

Figure 1. (a) The coral nursery, (b) close-up of suspended corals, and (c) the ecological volume is the volume of the enclosing elliptic cylinder for the coral.

The main motivation for our work comes from coral reef research being conducted at Disney's Castaway Cay and Great Abaco in the Bahamas. Pieces of coral that naturally break off a reef, through storm or wave activity or impact, are taken to a coral nursery where they are suspended from a frame as shown in Fig. 1(b). Measurements of coral volume are taken at six-month intervals, and corals which show healthy growth are transplanted back to the reef. The current method for measuring coral size is to manually estimate the ecological volume [17], i.e. the enclosing elliptic cylinder as shown in Fig. 1(c). We seek to automate and improve the accuracy of volume measurement by capturing a 3D model of the coral using a depth sensor, and estimating its true volume. Keeping this application in mind, our target is to develop a cheap and compact solution that enables handheld scanning of marine life by divers.
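For reference, if $d_1$ and $d_2$ denote the measured diameters of the elliptic cross-section and $h$ the height of the cylinder (our notation, not that of [17]), the ecological volume is

$$V_{eco} = \pi \, \frac{d_1}{2} \, \frac{d_2}{2} \, h.$$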

There are two challenges in underwater depth sensing. First, the IR light used by depth cameras is heavily attenuated by water. We demonstrate a system that is capable of capturing underwater surfaces within a range of 20 cm, which is compatible with our application. Second, the images or depth scans captured by any camera underwater do not follow the principles of perspective projection [14] because of refraction at the transparent interface of the housing of the camera. A calibration procedure is required to account for the refraction. There are existing methods for calibrating ordinary cameras for underwater use [24], but to the best of our knowledge, there is no analogue for underwater calibration of the new generation of commercial depth cameras. We present a model for refraction in our setup and an easy-to-use calibration method which requires only a single depth image of a plane. Our approach extends straightforwardly to multiple images if needed for improved conditioning. We have tested our calibration method on two different depth cameras: the Intel RealSense [6] as an example of a structured light camera, and the Creative Senz3D [4] as an example of a TOF camera.

In the remainder of the paper, we discuss previous work in Sec. 2. We describe our scanning hardware and housing for the cameras in Sec. 3, and the refraction model along with the calibration algorithm in Sec. 4. We present our results for the Intel RealSense in lab as well as ocean environments in Sec. 5.

We have experimented with both structured light and TOF depth cameras; here we focus on structured light cameras like the Intel RealSense. We describe how to adapt our approach to TOF cameras like the Creative Senz3D in Appendix A.

    2. Related Work

The success of terrestrial 3D reconstruction for visualizing natural and man-made environments has spurred a similar movement in marine science. The XL Catlin Seaview Survey [13] is a scientific initiative for capturing imagery of the world's coral reefs. The Computer Vision Coral Ecology Project [9] is focused on classification and automatic image annotation of coral reef images, although it is not concerned with 3D reconstruction. The latter project is also associated with CoralNet [1], a citizen science website that allows users to upload and label coral images. Hydrous [2] is a scientist-artist initiative in which 3D models of corals are created from images using AutoDesk. These conservation and visualization related projects require high quality 3D reconstructions of the ocean bed, marine life, etc.

The most prominent work on underwater 3D reconstruction includes monocular [25] and stereo vision [8, 7]. Vision-based approaches have difficulty generating dense and accurate 3D point clouds for complex geometry; depth cameras often give much better quality, at least for terrestrial scenes. Our intuition is that if we overcome the complexity of refraction and IR attenuation, underwater depth cameras can significantly improve reconstruction quality over stereo vision.

Some previous approaches project structured light [19, 10] or laser [18] patterns onto underwater surfaces and compute the surface geometry from the reflected patterns captured by ordinary cameras. Their results are better than stereo vision, but their use of visible light makes them impractical for our purpose: such approaches can only be used in lab conditions where the ambient lighting can be controlled. Dancu et al. [12] used a Kinect to reconstruct an underwater surface. However, they had to hold the Kinect outside the water because it is only effective at distances greater than 50 cm, while IR attenuation under water renders the Kinect ineffective beyond 20 cm or so. All these approaches account for some of the refraction-borne complexity, but they are not usable for actual underwater scanning in the wild.

Our target is to design a system that is effective within the underwater IR attenuation range of circa 20 cm. To this end, we have experimented with the Intel RealSense and Creative Senz3D cameras. Instead of designing a custom hardware setup, we use off-the-shelf depth cameras and account for attenuation and refraction issues in software. The end result is an affordable device that can be used by a diver for handheld scanning of the ocean bed or coral reefs.

3. Capture Device

Figure 2. Capture device. (a) Exploded schematic of the hardware. The red oblong is the Intel RealSense and the cyan rectangle on the top-right is the screen. (b) The device being tested underwater.

Our capture hardware is a self-contained device (see Fig. 2) in a waterproof housing, suitable for handheld use by a diver. The housing is made of acrylic, sealed with silicone. The rear cover is removable for accessing the interior when outside of the water. The system consists of:

• an Intel RealSense depth camera,
• an Intel NUC mini-PC,
• a Waveshare 7" screen,
• an SSD for captured data,
• LiPo batteries, and
• magnetic switches.

The magnetic switches allow a diver to activate or deactivate a small number of functions, such as start/stop recording, from outside the housing. The screen allows the diver to see the current view of the depth camera. The Intel RealSense generates a stream of both RGB images and depth images, which are recorded to the SSD. The 3D reconstruction is performed offline (see Sec. 4); we plan to attempt online 3D reconstruction in the future. The housing has an external attachment point for ballast. The entire assembly measures 25 cm × 18 cm.

4. Calibration of a Depth Sensor for Underwater Operation

The pinhole camera model is the de facto standard in computer vision applications. This model is not valid for underwater captures, as light rays are refracted at multiple interfaces. An underwater camera requires a transparent protective housing; light is refracted in the housing material as well as in water. Previous approaches described solutions in the context of color cameras [24]. Some approaches circumvent this problem by placing the camera at the center of a hemispherical housing; light rays from/to the camera's pinhole are orthogonal to the housing and water interfaces, thereby undergoing no refraction. It is physically challenging to construct such a housing for the Intel RealSense because it has two pinholes: the projector and the receiver (sensor). Instead of relying on custom-fabricated hardware, we present a generic solution to this problem by accounting for refraction in our mathematical model.

The Intel RealSense is an infrared (IR) structured light camera that consists of a projector and a receiver. The structured pattern of IR rays emitted by the projector undergoes refraction at the air-housing interface and then at the housing-water interface, as shown in Fig. 3. The reflected rays undergo the same refractions in reverse order. We use this 4-stage refraction path instead of a pinhole camera model for the RealSense (see Fig. 4). The parameters required to describe the complete path of an IR ray are:

• $n$: the normal to the interface,
• $d_h$: the distance from the pinhole to the housing,
• $t_h$: the thickness of the housing,
• $b$: the baseline between the projector and the receiver.

Figure 3. Refractions in an underwater camera. A ray through the pinhole is refracted at the air-housing and the housing-water interfaces. Different rays in water, when traced back into air, do not meet at the pinhole.

In the following sections, we review the mathematical details of ray traversal in our model and then describe the calibration method used to obtain the above parameters.

4.1. Refraction Model

Let the pinhole of the receiver be $O_r$ and the center of the projector be $O_p$ (see Fig. 4). We assume that the receiver of the depth camera is already calibrated in air and that its intrinsic matrix is known. We model the projector as a virtual camera with the same intrinsic matrix as the receiver. Furthermore, the image planes of the projector and receiver are parallel, with no vertical disparity.

The depth map captured by the receiver contains one depth measurement $d_{meas}$ for each pixel $(u_r, v_r)$. The depth measurement is the distance along the principal axis of the receiver. Using the intrinsic matrix of the receiver, every pixel $(u_r, v_r)$ can be back-projected to compute the direction of the incident ray in air, $X_{ra}$.
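This back-projection is standard. As a minimal sketch in Python/NumPy (function and variable names are ours, assuming a conventional 3×3 intrinsic matrix $K$):

```python
import numpy as np

def back_project(K, u, v):
    """Direction X_ra of the ray through pixel (u, v), in the camera frame.

    K is the 3x3 intrinsic matrix of the receiver, calibrated in air.
    """
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # homogeneous pixel -> ray
    return ray / np.linalg.norm(ray)                # unit direction
```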

This ray intersects the air-housing interface at $x_{ri}$. The angle of incidence $\theta_a$ and the angle of refraction $\theta_h$ (with respect to the normal $n$) at the interface can be computed as:

$$\theta_a = \arccos\left(\frac{n \cdot X_{ra}}{\|X_{ra}\|}\right) \qquad (1)$$

$$\theta_h = \arcsin\left(\frac{\mu_a \sin \theta_a}{\mu_h}\right) \qquad (2)$$

where $\mu_a$ and $\mu_h$ are the refractive indices of air and the housing material, respectively.
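Eqs. (1) and (2) translate directly to code. A minimal NumPy sketch (our names; the value $\mu_h \approx 1.49$ for acrylic is our assumption, since the housing in Sec. 3 is acrylic):

```python
import numpy as np

def refraction_angles(n, X_ra, mu_a=1.0, mu_h=1.49):
    """Angle of incidence theta_a and angle of refraction theta_h, Eqs. (1)-(2).

    n is the unit normal to the interface; X_ra is the ray direction in air.
    """
    theta_a = np.arccos(np.dot(n, X_ra) / np.linalg.norm(X_ra))  # Eq. (1)
    theta_h = np.arcsin(mu_a * np.sin(theta_a) / mu_h)           # Eq. (2), Snell's law
    return theta_a, theta_h
```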

Figure 4. Refraction model showing the complete path of a structured IR ray in an Intel RealSense along with all model parameters. The ray starting from the projector is refracted multiple times before reaching the receiver.

The direction $X_{rh}$ of the refracted ray inside the housing material can then be computed by rotating the ray $X_{ra}$ by an angle $\theta_{ha} = \theta_h - \theta_a$ about an axis $n_{ha}$ perpendicular to both the interface normal and the incident ray:

$$n_{ha} = \frac{n \times X_{ra}}{\|X_{ra}\| \sin \theta_a} \qquad (3)$$

The point $x_{ri}$ can thus be computed as:

$$x_{ri} = O_r + \lambda_h \frac{X_{ra}}{\|X_{ra}\|}, \qquad \lambda_h = \frac{d_h}{\cos \theta_a}, \qquad (4)$$

where $d_h$ is the distance from the pinhole of the receiver to the housing interface.
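Eqs. (3) and (4) amount to a Rodrigues rotation plus a ray-plane intersection. A sketch under the same assumptions as before (our names, not the paper's code):

```python
import numpy as np

def rotate_about_axis(v, axis, angle):
    """Rodrigues rotation of vector v about a unit axis by the given angle."""
    return (v * np.cos(angle)
            + np.cross(axis, v) * np.sin(angle)
            + axis * np.dot(axis, v) * (1.0 - np.cos(angle)))

def ray_into_housing(O_r, X_ra, n, d_h, theta_a, theta_h):
    """Entry point x_ri (Eq. 4) and refracted direction X_rh via Eq. (3)."""
    x_ri = O_r + (d_h / np.cos(theta_a)) * X_ra / np.linalg.norm(X_ra)   # Eq. (4)
    n_ha = np.cross(n, X_ra) / (np.linalg.norm(X_ra) * np.sin(theta_a))  # Eq. (3)
    X_rh = rotate_about_axis(X_ra, n_ha, theta_h - theta_a)
    return x_ri, X_rh
```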

The ray $X_{rh}$ inside the housing material intersects the housing-water interface at $x_{ro}$ and undergoes another refraction. The point $x_{ro}$ can be computed as:

$$x_{ro} = x_{ri} + \lambda_w \frac{X_{rh}}{\|X_{rh}\|}, \qquad \lambda_w = \frac{d_h + t_h - n \cdot x_{ri}}{\cos \theta_h} \qquad (5)$$
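The step into water mirrors the previous one. Below is a sketch combining Eq. (5) with the refraction into water described next; it reuses rotate_about_axis from the sketch above, and $\mu_w \approx 1.33$ for water is our assumption:

```python
import numpy as np

def ray_into_water(x_ri, X_rh, n, d_h, t_h, mu_h=1.49, mu_w=1.33):
    """Exit point x_ro (Eq. 5) and ray direction X_rw in water."""
    X_rh_hat = X_rh / np.linalg.norm(X_rh)
    theta_h = np.arccos(np.dot(n, X_rh_hat))                  # angle to the normal
    lam_w = (d_h + t_h - np.dot(n, x_ri)) / np.cos(theta_h)   # Eq. (5)
    x_ro = x_ri + lam_w * X_rh_hat
    theta_w = np.arcsin(mu_h * np.sin(theta_h) / mu_w)        # Snell's law again
    axis = np.cross(n, X_rh_hat) / np.sin(theta_h)            # analogous to Eq. (3)
    X_rw = rotate_about_axis(X_rh_hat, axis, theta_w - theta_h)
    return x_ro, X_rw
```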

The ray direction in water, $X_{rw}$, can be computed using Snell's law and the refractive index of water $\mu_w$, as before. The true 3D point $x_w$ lies on this ray $X_{rw}$. This point is also the point of intersection of a ray $X_{pw}$ from the projector with the ray $X_{rw}$. Therefore, if we find the ray $X_{pw}$, we can find the true 3D point. This ray can be found by using the working principle of a structured light camera. Structured light cameras project a sequence of light patterns or gray codes which is unique for each column. Assuming a row-aligned projector-receiver system, the depth of a 3D point can be computed as follows under no refraction:

$$d = \frac{f_x \, b}{u_p - u_r} \qquad (6)$$
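For intuition, with illustrative values $f_x = 600$ px, $b = 0.05$ m and a disparity $u_p - u_r = 15$ px (numbers chosen for illustration, not the RealSense's actual calibration), Eq. (6) gives $d = 600 \times 0.05 / 15 = 2$ m.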

2D to 3D computation. We have the depth measurement $d_{meas}$ for each pixel $(u_r, v_r)$ on the receiver. Under no refraction, substituting this into Eq. (6) gives the column index, and hence the pixel $(u_p, v_r)$ on the projector. In our refraction model above, we compute this intersection using a ray casting algorithm: we shoot rays from the projector's center along the column $u_p$, apply all refractions, and compute the set of rays $S = \{X_{pw}\}$ in water. We then select the ray from this set that intersects the ray $X_{rw}$ from the receiver. The point of intersection gives the true 3D point $x_w$.
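In practice two back-projected rays rarely intersect exactly, so the intersection can be approximated by the midpoint of the common perpendicular between the receiver ray and a candidate projector ray; among the set $S$, one would keep the candidate with the smallest gap. A minimal sketch of this test (our formulation; the paper does not spell out the intersection computation):

```python
import numpy as np

def ray_midpoint(o1, d1, o2, d2):
    """Midpoint of the common perpendicular between two non-parallel rays.

    o1, o2 are ray origins; d1, d2 are direction vectors.
    Returns the approximate intersection point and the gap between the rays.
    """
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    b = np.dot(d1, d2)
    w = o1 - o2
    denom = 1.0 - b * b                      # zero only for parallel rays
    t1 = (b * np.dot(d2, w) - np.dot(d1, w)) / denom
    t2 = (np.dot(d2, w) - b * np.dot(d1, w)) / denom
    p1, p2 = o1 + t1 * d1, o2 + t2 * d2      # closest point on each ray
    return 0.5 * (p1 + p2), np.linalg.norm(p1 - p2)
```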

    4.2. Calibration

We require a calibration step to measure the parameters of our refractive model from Sec. 4.1. The fundamental principle is that the depth measurements of a true...
