CENTER FOR MACHINE PERCEPTION, CZECH TECHNICAL UNIVERSITY
Research Report (Výzkumná zpráva), ISSN 1213-2365

3D Metric Reconstruction from Uncalibrated Omnidirectional Images

Branislav Mičušík, Daniel Martinec and Tomáš Pajdla
{micusb1, martid1, pajdla}@cmp.felk.cvut.cz

CTU–CMP–2003–20
13th October 2003

Available at ftp://cmp.felk.cvut.cz/pub/cmp/articles/martinec/Micusik-TR-2003-20.pdf

This research was supported by the grants CTU 0306713, CTU 0306813, GACR 102/01/0971, BENOGO-IST-2001-39184, Aktion 34p24, MSMT Kontakt 22-2003-04, and MSM 212300013.

Research Reports of CMP, Czech Technical University in Prague, No. 20, 2003

Published by Centrum strojového vnímání, Katedra kybernetiky, Fakulta elektrotechnická ČVUT, Technická 2, 166 27 Praha 6
fax: (02) 2435 7385, tel: (02) 2435 7637, www: http://cmp.felk.cvut.cz



3D Metric Reconstruction from Uncalibrated Omnidirectional Images

ABSTRACT

We show that it is possible to obtain a very complete 3D metric reconstruction of the surrounding scene from two or more uncalibrated omnidirectional images. In particular, we demonstrate that omnidirectional images with an angle of view above 180° can be reliably autocalibrated. We also show that wide angle images provide reliable information about their camera positions and orientations. We link together a method for simultaneous omnidirectional camera model and epipolar geometry estimation and a method for factorization-based 3D reconstruction, in order to obtain a metric reconstruction of an unknown scene observed by uncalibrated omnidirectional images. The 3D reconstruction is done from automatically established image correspondences only. Besides rigidity, no assumptions about the scene are made (e.g. the presence of a calibration object). We demonstrate our method in experiments with the Nikon FC–E8 and Sigma 8mm-f4-EX fish-eye lenses. Nevertheless, the proposed method can be used for a large class of non-perspective central omnidirectional cameras.

1. INTRODUCTION

In comparison to standard cameras with a narrow view angle, omnidirectional cameras capture a larger part of the surrounding scene. A large angle of view often makes it possible to establish more widely distributed point correspondences, which leads to a more complete 3D reconstruction from fewer images. Notice in Fig. 1 how large a part of a scene can be reconstructed from only two omnidirectional images. Degenerate scenes (e.g. when only a single plane is observed in the image) are less probable with omnidirectional images, and therefore a more stable ego-motion estimation is often achieved.

3D metric reconstruction from uncalibrated omnidirectional cameras calls for searching for correspondences, camera model calibration, epipolar geometry estimation, and 3D reconstruction. These tasks are similar to the ones for standard directional cameras [8] and will be explained in more detail.

We show that 3D metric reconstruction of the surrounding scene from two or more uncalibrated omnidirectional images can be performed very similarly as with standard perspective cameras. First, the omnidirectional camera is calibrated using image correspondences and the epipolar constraint [9]. Secondly, a projective factorization-based reconstruction from many images handling occlusions [9] is used, and finally an upgrade to a metric reconstruction is performed. The proposed linear estimation techniques give a good starting point for a non-linear bundle adjustment enforcing metric constraints on the reconstruction and including the lens nonlinearity.

Fig. 1. 3D metric reconstruction of a Venice yard from two uncalibrated omnidirectional images with automatically detected point correspondences. Bottom: Two omnidirectional images acquired by the Sigma 8mm-f4-EX fish-eye converter mounted on the Canon EOS-1Ds digital camera. In the right image the point matches are depicted.


In what follows, we assume a rigid scene, an elliptical (usually circular) view field, and an approximately known corresponding view angle. We concentrate on fish-eye lenses, namely the Nikon FC–E8 and the Sigma 8mm-f4-EX. However, the proposed method for 3D reconstruction can be applied to a large class of non-perspective central omnidirectional cameras based on lenses as well as on mirrors. Fish-eye lenses with a view angle larger than 180° are, at least in terms of image point representation and image formation non-linearity, the same as central panoramic catadioptric cameras [15].



Fig. 2. (a) The Nikon FC–E8 fish-eye converter. (b) The optical axis of the lens is marked by the dash-dotted line and the optical center from which rays emanate is shown as the red dot. The angle between a ray and the optical axis is denoted by θ. (c) The image taken by the lens to the planar sensor π can be represented by intersecting a spherical retina ρ with camera half-rays.


Previous works on the calibration of omnidirectional catadioptric cameras assumed the presence of lines in the scene [6], planar motion of the camera [7], or used some information about the shape of the mirrors [4]. In [11], the calibration was performed from point correspondences and the epipolar constraint through minimizing an objective function. Our approach is similar in its goal but introduces a new method that provides a closed-form solution of the camera model parameters and the epipolar geometry as a solution of a quadratic eigenvalue problem. This allows our method to be incorporated into the RANSAC robust estimation technique, handling mismatches in automatically established point correspondences.

Previous work related to 3D reconstruction from omnidirectional images assumed uncalibrated [5] or, more usually, calibrated catadioptric sensors. In [5], the para-catadioptric camera calibration is performed from an image of the absolute conic; it is shown that Euclidean reconstruction is feasible from two views with constant parameters. Relations that exist between multiple views of a static scene, where the views can be taken by any mixture of para-catadioptric, perspective or affine cameras, were described in [16], and the usage of this theory for motion estimation, 3D reconstruction or (self-) calibration was indicated. 3D reconstruction from large sets of calibrated omnidirectional images with the help of GPS was introduced in [14]. Similarly, in [1], a multi-baseline stereo algorithm for 3D reconstruction of an environment from a set of panoramic images with known camera positions was described. There were other works, different in principle from our method, assuming calibrated omnidirectional cameras for 3D reconstruction [3, 12] and structure from motion [2].

The main contribution of our method consists in the extension of multiple view metric 3D reconstruction from many uncalibrated images to omnidirectional cameras with a large field of view and a highly nonlinear image projection.

The structure of the paper is the following. The omnidirectional camera model, the epipolar geometry, and the camera auto-calibration method are described in Sec. 2. A projective factorization-based 3D reconstruction and the upgrade to a metric reconstruction are explained in Sec. 3 and 4, respectively. The method for automatic searching for point correspondences is discussed in Sec. 5. Experiments and a summary are given in Sec. 6 and 7, respectively.

2. OMNIDIRECTIONAL CAMERA

Rays of the image will be represented as a set of unit vectors in $\mathbb{R}^3$ such that one vector corresponds to exactly one image of a scene point, see Fig. 2.

Let us assume that $\mathbf{u} = (u, v)^\top$ are the coordinates of a point in a pre-calibrated image (explained later) with the origin of the coordinate system in the center $(u_0, v_0)^\top$ of the view field circle. The radius $r$ of an image point is transformed to the angle $\theta$ of a corresponding 3D vector, see Fig. 2, by a nonlinear function. For the Nikon and Sigma fish-eye lenses, respectively, we use the models

$$\theta = \frac{a\,r}{1 + b\,r^2}, \quad \text{resp.} \quad \theta = \frac{1}{b}\arcsin\frac{b\,r}{a}, \qquad (1)$$

where $\theta$ is the angle between a ray and the optical axis, $r = \sqrt{u^2 + v^2}$ is the radius of a point in the image plane w.r.t. $(u_0, v_0)^\top$, and $a$, $b$ are the parameters of the models. The models may have various forms determined by the lens or the mirror construction.

The relationship between the 3D vector $\mathbf{x}$ emanating from the optical center towards a scene point and the image point $\mathbf{u}$ can be expressed up to scale as

$$\mathbf{x} \simeq g(\mathbf{u}, a, b) = \begin{pmatrix}\mathbf{u}\\ f(\mathbf{u}, a, b)\end{pmatrix} = \begin{pmatrix}\mathbf{u}\\ \frac{r}{\tan\theta}\end{pmatrix} = \begin{pmatrix}\mathbf{u}\\ \frac{r}{\tan\frac{a r}{1 + b r^2}}\end{pmatrix}, \qquad (2)$$

where $f(\mathbf{u})$ is a rotationally symmetric function w.r.t. the point $(u_0, v_0)^\top$.
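As a concrete illustration (our own sketch, not code from the paper), the mapping of Eqs. (1) and (2) from a pre-calibrated image point to a unit ray can be written as follows, here for the first (division) model of Eq. (1); all function and variable names are ours:

```python
import numpy as np

def pixel_to_ray(u, v, u0, v0, a, b):
    """Map a pre-calibrated image point (u, v) to a unit ray direction,
    using theta = a*r/(1 + b*r^2) from Eq. (1) and the vector g(u, a, b)
    of Eq. (2). (u0, v0) is the view field center; a, b model parameters."""
    du, dv = u - u0, v - v0               # coordinates w.r.t. the center
    r = np.hypot(du, dv)                  # image-plane radius
    if r < 1e-12:                         # point on the optical axis
        return np.array([0.0, 0.0, 1.0])
    theta = a * r / (1.0 + b * r**2)      # angle between ray and optical axis
    x = np.array([du, dv, r / np.tan(theta)])
    return x / np.linalg.norm(x)          # unit vector representing the ray
```

For the Sigma lens, the arcsin model of Eq. (1) would be substituted in the computation of θ; note that for θ > 90° the third coordinate correctly becomes negative, representing rays behind the image plane.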

2.1. Calibration from the epipolar geometry

The epipolar geometry can be formulated for omnidirectional central panoramic cameras [15] as well as for omnidirectional cameras with fish-eye converters, which have a single center of projection.

By the calibration of the omnidirectional camera we understand the determination of the affine transformation $\mathsf{A}$ from the view field ellipse to a circle, of the symmetry center $(u_0, v_0)^\top$, and of the camera model parameters $a$, $b$, see Fig. 3. The point $(u_0, v_0)^\top$ is estimated as the center of the elliptical view field. After applying the calibration matrix $\mathsf{A}$, a pre-calibrated image with square pixels and a radially symmetric non-linear mapping is obtained. The parameters $a$, $b$ remain unknown and will be estimated from the epipolar geometry.
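A minimal sketch of this pre-calibration, under the assumption (ours, not the paper's) that the view field can be segmented from the dark surround by intensity thresholding; OpenCV's findContours/fitEllipse serve as stand-ins for whatever ellipse estimation the authors applied:

```python
import cv2
import numpy as np

def precalibrate(img, thr=10):
    """Estimate the view field center (u0, v0) and an affine map A that
    turns the view field ellipse into a circle (illustrative sketch)."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    mask = (gray > thr).astype(np.uint8)           # segment the view field
    cnts, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_NONE)
    (u0, v0), (d1, d2), ang = cv2.fitEllipse(max(cnts, key=cv2.contourArea))
    t = np.deg2rad(ang)
    R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    S = np.diag([1.0, d1 / d2])    # equalize the two ellipse axes
    A = R @ S @ R.T                # defined up to rotation and scale
    return np.array([u0, v0]), A   # apply as u' = A @ (u - [u0, v0])
```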




Fig. 3. Image formation and calibration of an omnidirectional camera. An image (a) on a sensor plane in a metric coordinate system is digitized to a digital image (b), related by an affine transformation which can be estimated by transforming an ellipse to a circle (c), up to a rotation around the center of the circle and a scale factor. From this pre-calibrated image, 3D vectors (d) are computed by applying the nonlinear function $f$.

The function $f(\mathbf{u}, a, b)$ in Eq. (2) can be linearized around an initial estimate $(a_0, b_0)$, and the ray direction vector $\mathbf{x}$ can then be written, using Eq. (2), as

$$\mathbf{x} \simeq \begin{pmatrix}\mathbf{u}\\ f(\cdot) - a_0 f_a(\cdot) - b_0 f_b(\cdot)\end{pmatrix} + a \begin{pmatrix}\mathbf{0}\\ f_a(\cdot)\end{pmatrix} + b \begin{pmatrix}\mathbf{0}\\ f_b(\cdot)\end{pmatrix} \equiv \mathbf{s} + a\,\mathbf{t} + b\,\mathbf{w},$$

where $\mathbf{s}$, $\mathbf{t}$, and $\mathbf{w}$ are known vectors computed from the image coordinates, $a$ and $b$ are the unknown parameters, and $f_a$, $f_b$ are the partial derivatives of $f(\cdot)$ w.r.t. $a$ and $b$.

Using the epipolar constraint for the vectors $\mathbf{x}_l$ in the left and $\mathbf{x}_r$ in the right image,

$$\mathbf{x}_l^\top \mathsf{E}\,\mathbf{x}_r = (\mathbf{s}_l + a\,\mathbf{t}_l + b\,\mathbf{w}_l)^\top\, \mathsf{E}\, (\mathbf{s}_r + a\,\mathbf{t}_r + b\,\mathbf{w}_r) = 0,$$

leads, after arranging the unknown parameters, to a Quadratic Eigenvalue Problem (QEP) [18]

$$(\mathsf{D}_1 + a\,\mathsf{D}_2 + a^2\,\mathsf{D}_3)\,\mathbf{h} = \mathbf{0}, \qquad (3)$$

which can be solved, e.g., by MATLAB using the function polyeig. The parameters $a$, $b$ and the matrix $\mathsf{E}$ can thus be computed simultaneously. A robust technique based on RANSAC with bucketing, introduced in [9, 19], can be applied.

The parameters of the camera model described in Eq. (2) and the matrix $\mathsf{E}$ are recovered for an image pair. The angles between the rays and the camera optical axis become known; $\mathsf{E}$ is therefore an essential matrix, and a calibrated camera is obtained. The reader is referred to [9] for a more detailed explanation.
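Outside MATLAB, the QEP of Eq. (3) can be solved by the standard companion linearization [18]. A Python sketch (the matrices D1, D2, D3 are assumed to be already assembled from the point correspondences; the function name is ours):

```python
import numpy as np
from scipy.linalg import eig

def solve_qep(D1, D2, D3):
    """Solve (D1 + a*D2 + a^2*D3) h = 0 via companion linearization,
    i.e. a generalized eigenproblem A z = a B z with z = [h; a*h];
    a Python stand-in for MATLAB's polyeig(D1, D2, D3)."""
    n = D1.shape[0]
    I, Z = np.eye(n), np.zeros((n, n))
    A = np.block([[Z, I], [-D1, -D2]])
    B = np.block([[I, Z], [Z, D3]])
    vals, vecs = eig(A, B)
    return vals, vecs[:n]      # candidate parameters a and vectors h
```

Real eigenvalues in a physically plausible range yield candidate values of $a$; the corresponding vectors $\mathbf{h}$ carry the entries of the matrix $\mathsf{E}$.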

3. PROJECTIVE RECONSTRUCTION

Since the parameters $a$, $b$ in Eq. (2) have been obtained from the calibration, a vector $\mathbf{x}$ satisfying the multi-view constraints can be constructed for every image point. This makes it possible to perform a projective reconstruction of the surrounding scene.

Suppose a set of $n$ 3D points is observed by $m$ central omnidirectional cameras. Not all points are visible in all views, and there may be outliers, i.e. mismatches in the correspondences. The goal is to reject the outliers and to recover the 3D structure (point locations) and motion (camera locations) from the remaining image measurements.

Let $\mathbf{X}_p$ be the unknown homogeneous coordinate vectors of the 3D points, $\mathsf{P}^i$ the unknown $3 \times 4$ projection matrices, and $\mathbf{u}_p^i$ the corresponding coordinate vectors of the measured image points, where $i = 1, \ldots, m$ labels images and $p = 1, \ldots, n$ labels points. Due to occlusions, $\mathbf{u}_p^i$ are unknown for some $i$ and $p$.

The basic image projection equation says that the $\mathbf{u}_p^i$ are projections of the $\mathbf{X}_p$ up to unknown scale factors $\lambda_p^i$, which will be called (projective) depths:

$$\lambda_p^i\,\mathbf{u}_p^i = \mathsf{P}^i \mathbf{X}_p.$$

All projections can be gathered into a matrix equation

$$\begin{pmatrix}
\lambda_1^1 \mathbf{u}_1^1 & \ldots & \lambda_n^1 \mathbf{u}_n^1\\
\times & \ddots & \vdots\\
\lambda_1^m \mathbf{u}_1^m & \times & \lambda_n^m \mathbf{u}_n^m
\end{pmatrix}
=
\underbrace{\begin{pmatrix}\mathsf{P}^1\\ \vdots\\ \mathsf{P}^m\end{pmatrix}}_{\mathsf{P}}
\underbrace{\begin{pmatrix}\mathbf{X}_1 & \ldots & \mathbf{X}_n\end{pmatrix}}_{\mathsf{X}},$$

where the marks $\times$ stand for unknown elements which could not be measured due to occlusions, and $\mathsf{X}$ and $\mathsf{P}$ stand for structure and motion, respectively. The $3m \times n$ matrix $[\lambda_p^i \mathbf{u}_p^i]_{i=1..m,\,p=1..n}$ will be called the measurement matrix, shortly MM. The MM may have missing elements and outliers.

As a result of Sec. 2.1, matches between pairs of images satisfying the epipolar constraint have been obtained. They are not guaranteed to all be true correspondences; therefore, an outlier detection technique may still be needed to reject the remaining outliers, which can only be found using more than two images. Image pairs were read in a sequence, and the matches between image pairs were placed into the MM so that overlapping matches were joined.
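The layout of the MM can be made concrete with a small sketch (our illustration of the data structure, with NaN playing the role of the × marks above):

```python
import numpy as np

def empty_mm(m, n):
    """A 3m x n measurement matrix; NaN marks unmeasured (occluded) entries."""
    return np.full((3 * m, n), np.nan)

def insert_observation(mm, i, p, x, depth=1.0):
    """Place the depth-scaled ray x (a 3-vector) of point p in image i."""
    mm[3 * i:3 * i + 3, p] = depth * np.asarray(x)

# Example: 3 cameras, 2 tracks; point 0 is occluded in camera 2.
mm = empty_mm(3, 2)
insert_observation(mm, 0, 0, [0.1, 0.2, 0.97])
insert_observation(mm, 1, 0, [0.1, 0.1, 0.99])
```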

The main idea of the outlier detection is that minimal configurations of points in triples of images are sufficient to validate inliers reliably. The RANSAC paradigm is used. Trifocal tensors are computed from randomly selected minimal six-tuples of points in triples of images. Ray orientations had to be estimated so that the rays projected to the proper image points. The quasi-affine upgrade [8] was computed for the six points in the three images only. After this, the three camera matrices had the right signs w.r.t. the six vectors. Moreover, it turned out in our experiments that most¹ of the points reconstructed using such cameras were properly signed (see more in Sec. 4). If sufficiently many points are consistent with the three cameras², the consistent points not used to estimate the trifocal tensor are validated as inliers. Sampling is repeated until the set of inliers is sufficient to make the projective reconstruction described below; a skeleton of this loop is sketched below.
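A skeleton of that validation loop (our sketch; estimate_trifocal and reproj_error are hypothetical placeholders for the tensor estimation, quasi-affine upgrade and reprojection test described above):

```python
import numpy as np

def validate_inliers(tracks, n_iter=500, thr=1.0, min_support=20):
    """RANSAC over triples of images: sample minimal six-tuples, fit a
    trifocal tensor, and validate the points consistent with it."""
    rng = np.random.default_rng()
    best = set()
    for _ in range(n_iter):
        sample = set(rng.choice(len(tracks), size=6, replace=False).tolist())
        T = estimate_trifocal([tracks[i] for i in sample])  # hypothetical helper
        inliers = {i for i, t in enumerate(tracks)
                   if i not in sample and reproj_error(T, t) < thr}  # hypothetical helper
        if len(inliers) >= min_support and len(inliers) > len(best):
            best = inliers
    return best
```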

Projective reconstruction is done by factorization of the MM with occlusions into structure and motion [9]. This method can jointly handle perspective as well as any central views, and occlusions. The projective depths of the image points are estimated by the method of Sturm & Triggs [17] using the epipolar geometry. Occlusions are solved by an extension of the method by Jacobs [10] for filling in the missing data; the extension exploits both points with known and unknown depths.

¹ We have not examined how this differed for the right tensors and the ones contaminated by an outlier. Nevertheless, the quasi-affine upgrade existed for the projective reconstruction from all validated points in our experiments.

² Image points are consistent if all reprojection errors of a reconstructed world point are below a given threshold.


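Once the depths are estimated and the missing entries filled in, the core factorization reduces to a rank-4 truncated SVD. A minimal sketch, with normalization and the occlusion handling of [10] omitted:

```python
import numpy as np

def factorize_mm(mm):
    """Split a complete, depth-scaled 3m x n measurement matrix into
    motion P (3m x 4) and structure X (4 x n), in the spirit of
    Sturm & Triggs [17]."""
    U, s, Vt = np.linalg.svd(mm, full_matrices=False)
    P = U[:, :4] * s[:4]    # stacked 3x4 projection matrices
    X = Vt[:4]              # homogeneous 3D points, one per column
    return P, X
```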

A set of inliers consistent with the projective reconstruction is found in the following way. In a column of the MM, a pair of image points consistent with the corresponding cameras is searched for; such a pair forms a track. An image point is joined into a track if the point and the track as a whole are consistent with the cameras. A column may also contain several tracks, or none.

It turned out that it is possible to correctly merge tracks with close reprojections even if they are from different columns of the MM. If there are several candidates, the longest one is chosen for stability. If more than one still remains, the one from the same column is preferred, because there may be some correct pair-wise match by which the tracks got into the same column during the building of the MM. Single image points (tracks of length one) are also merged this way.

By track merging, the tracks become significantly longer, as shown in Fig. 4c. In fact, this technique generates new matches. This is significant for closed sequences, whose MM structure is strictly diagonal at the beginning, but whose first and last views join after track merging, as shown in Fig. 4d.

4. UPGRADE TO METRIC RECONSTRUCTION

The vector $\mathbf{x}$ in Eq. (2) is determined up to a rotation $\mathsf{R}$ and a scalar $\lambda$ with respect to the directional vector of the corresponding ray in a Cartesian world coordinate system [9]:

$$\mathbf{x}'' = \lambda\,(\mathbf{X} - \mathbf{C}), \qquad \mathsf{P} \simeq [\mathsf{R}'' \mid -\mathsf{R}''\mathbf{C}''].$$

The estimated vector $\mathbf{x}$ is thus related to a scene point $\mathbf{X}$, expressed in a metric coordinate system, by $\mathbf{x} \simeq \mathsf{R}^\top [\mathsf{I} \mid -\mathbf{C}'']\,\mathbf{X}$. The matrix $\mathsf{R}^\top$ represents a rotation of the estimated coordinates of the camera in the world coordinate frame. It is clear that the calibration matrix [8] is $\mathsf{K} = \mathrm{diag}(1, 1, 1)$.

In contrast to standard cameras, the orientation of rays in omnidirectional cameras must be taken into account, because rays with opposite orientations project into different image points. Therefore, to obtain a metric upgrade after the projective reconstruction, the orientation of the rays has to be properly estimated. This was done using the so-called oriented-projective, or quasi-affine, reconstruction [8]. Ray orientations also had to be estimated in the outlier detection stage [9] after the tensor estimation, so that the rays projected to the proper image points.

The final step of the metric upgrade was done by finding a transformation into a Euclidean basis. Some arbitrarily chosen image pair $(i, j)$ was used to estimate the essential matrix [8]. However, there can be too few correspondences between the two images to estimate the essential matrix; therefore, it was estimated from the reprojections of all reconstructed points into the image pair. New projection matrices $\mathsf{P}$, $\mathsf{P}'$ were directly estimated from the essential matrix up to a two-fold ambiguity [8, page 240]³. The quasi-affine reconstruction was transformed to a Euclidean one by a projective transformation $\mathsf{H}$, so that the corresponding camera matrices became aligned with $[\mathsf{P}^\top \mid \mathsf{P}'^\top]^\top$ in some four rows. Nevertheless, due to noise in the data, and hence also in the quasi-affine reconstruction, the remaining two rows were not aligned. Consequently, the internal parameter matrices $\mathsf{K}$ could not be identity. To put the $\mathsf{K}$s as close to identity as possible, $\mathsf{H}$ was iteratively improved by the linear projection in which the new projection matrices are obtained as $\mathsf{R}_i[\mathsf{I} \mid -\mathbf{C}_i]$, where $\mathsf{P}_i \simeq \mathsf{K}_i \mathsf{R}_i [\mathsf{I} \mid -\mathbf{C}_i]$⁴. It turned out that this process converges in five or ten iterations for the data used in the paper. After that, the $\mathsf{K}$s were very close to identity, so setting $\mathsf{P}_i = \mathsf{R}_i[\mathsf{I} \mid -\mathbf{C}_i]$ increased the reprojection error only slightly, even without bundle adjustment. There still remains the two-fold ambiguity of the camera matrices. The two solutions differ exactly by the sign of the last column of $\mathsf{P}$, i.e. they are related by the projective transformation $\mathrm{diag}(1, 1, 1, -1)$. The one leading to the higher number of scene points with a positive fourth coordinate is chosen.

³ The four-fold ambiguity from [8, page 240] was reduced to a two-fold one by employing a positivity constraint on the $3 \times 3$ determinants of the first three columns of the projection matrices.

⁴ For stability even in the (hypothetical) case of a huge number of cameras, the iterative process was first applied on the image pair used to estimate the essential matrix and right afterwards on all the images.
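One iteration of this alignment can be sketched as follows, under our reading of the procedure: each camera matrix is decomposed by RQ factorization and its internal matrix K is discarded (the function name is ours):

```python
import numpy as np
from scipy.linalg import rq

def drop_internal_params(P):
    """Decompose P ~ K R [I | -C] and return R [I | -C], i.e. the same
    camera with its internal parameter matrix K replaced by identity."""
    M = P[:, :3]
    K, R = rq(M)                       # M = K R, K upper triangular
    S = np.diag(np.sign(np.diag(K)))   # make the diagonal of K positive
    K, R = K @ S, S @ R
    C = -np.linalg.solve(M, P[:, 3])   # camera center, from M C + p4 = 0
    return R @ np.hstack([np.eye(3), -C[:, None]])
```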

5. FINDING CORRESPONDENCES

The automatic search for correspondences in omnidirectional images is more complicated than in perspective images, because the affine invariance of corresponding features, used by most methods, is preserved only approximately for large motions of the camera.

Nevertheless, in many practical situations, omnidirectional images can be matched by the technique [13] that was developed for conventional wide baseline perspective images. The method [13] looks for maximally stable extremal regions (MSER), which are matched between pairs of images. In the experiments presented here, only moderate motions of the camera were made, and only the smaller regions in the images were used to establish tentative correspondences.
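Today, the region detection step can be reproduced with OpenCV's MSER implementation; a sketch of extracting the tentative-correspondence anchors (the region matching of [13] itself is not shown):

```python
import cv2
import numpy as np

def mser_centers(gray):
    """Centers of gravity of MSER regions in a grayscale image: the
    tentative features used for correspondence search."""
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(gray)
    return np.array([r.mean(axis=0) for r in regions])
```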

6. EXPERIMENTS

In this section, the proposed method is applied to real data. In all experiments, the tentative correspondences, i.e. the centers of gravity of the regions, were obtained by [13]. As a result of applying the calibration method described in Sec. 2.1, the camera model and the essential matrix were obtained and most outliers rejected. Partial correspondences from image pairs were integrated into the measurement matrix as described in Sec. 3. Final 3D reconstructions were improved by a non-linear bundle adjustment tuning all 3D points $\mathbf{X}_p$, camera matrices $\mathsf{P}_i$ and camera model parameters $(u_0, v_0)^\top$, $a$, and $b$, and enforcing the same internal parameters for all cameras. To show the quality of the 3D reconstruction, some correspondences, like corners on the walls, were established manually. The estimated camera matrices were used for reconstructing these points. Finally, textures were mapped on the planes created by the reconstructed 3D points.





In our first experiment, one image pair was selected from the Venice Yard QY dataset, acquired by the Sigma 8mm-f4-EX fish-eye lens with a 180° view angle mounted on the Canon EOS–1Ds digital camera with a resolution of 4064×2704 pxl. The obtained calibrated cameras and validated point matches were used for a 3D reconstruction achieved by a linear technique [8]. See Fig. 1 for the result. Notice how precise and complete a 3D reconstruction can be obtained from only two omnidirectional images. The RMS of the reprojection error was 0.… pxl.

In our second experiment, all images from the Venice Yard QY dataset were used. The obtained camera model was used in the factorization-based 3D reconstruction described in Sec. 3, followed by the metric upgrade given in Sec. 4. One pixel was set as the threshold on the accuracy of the outlier detection.

Fig. 4ab shows the metric reconstruction of the scene points and camera positions. The table shows the number of images and the sizes of the cropped images containing the view field circle, the number of found correspondences, the amount of missing data, the amount of image points used in the multi-view projective reconstruction (these were already consistent with the epipolar geometries), the amount of detected inliers consistent with the multi-view geometry, the amounts of reconstructed, partially reconstructed, and not-reconstructed correspondences, and the Euclidean reprojection errors of the reconstruction without outliers. In the structure of the MM of the Venice scene, see Fig. 4d, dots stand for inliers and outliers (distinguished by color in the original figure) and blanks stand for the missing data.

In the third experiment, the Nikon FC–E8 fish-eye lens was mounted on the Nikon COOLPIX digital camera with 1600×1200 pxl. The camera was moving along a line in constant steps (… cm), capturing the scene in a direction perpendicular to the motion. 11 images were acquired. Notice in Fig. 4a that the estimated trajectory is really straight and the distances between the cameras are equal. It can be seen in the top view of Fig. 4a that all reconstructed points are points in the field of view of the cameras. See the table in Fig. 4a for the results.

7. SUMMARY AND CONCLUSIONS

The paper presented a 3D metric reconstruction technique from uncalibrated omnidirectional images. As the main contribution, the paper shows that omnidirectional cameras with a highly non-linear projection can be used for 3D reconstruction in the same manner as standard perspective cameras with a narrow view angle. The omnidirectional cameras were calibrated from image correspondences only, and then a 3D reconstruction from two images and a factorization-based 3D reconstruction from many images with a metric upgrade were performed. It is an advantage over standard cameras that a more complete and very stable 3D reconstruction can be obtained from fewer images. The experiments suggest that our method is useful for structure from motion with good accuracy.

Using a camera with a resolution of 1200×1200 pxl and a lens with a 183° field of view is equivalent, in the sense of the pxl/angle ratio, to using a camera with a resolution of 300×300 pxl and a standard lens with a 45° field of view. Our proposed method shows that a very accurate reconstruction of the camera positions and an accurate reconstruction of the scene points can be obtained with a relatively small (4 times lower) resolution in comparison to a standard camera. The accuracy of the reconstruction of the scene points can be improved by using a camera with a higher resolution, e.g. the Canon EOS–1Ds as in the first two experiments.
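As a quick check of this ratio:

$$\frac{1200\ \mathrm{pxl}}{183^\circ} \approx 6.6\ \mathrm{pxl}/^\circ, \qquad \frac{300\ \mathrm{pxl}}{45^\circ} \approx 6.7\ \mathrm{pxl}/^\circ,$$

so both setups devote roughly the same number of pixels to each degree of view angle.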

8. REFERENCES

[1] R. Bunschoten and B. Krose. Robust scene reconstruction from an omnidirectional vision system. IEEE Transactions on Robotics and Automation, 2002.

[2] P. Chang and M. Hebert. Omni-directional structure from motion. In Proc. of the IEEE Workshop on Omnidirectional Vision, pages 127–133, June 2000.

[3] P. Doubek and T. Svoboda. Reliable 3D reconstruction from a few catadioptric images. In Proc. of the IEEE Workshop on Omnidirectional Vision, pages 71–78, 2002.

[4] J. Fabrizio, J.-P. Tarel, and R. Benosman. Calibration of panoramic catadioptric sensors made easier. In Proc. of the IEEE Workshop on Omnidirectional Vision, pages 45–52, 2002.

[5] C. Geyer and K. Daniilidis. Structure and motion from uncalibrated catadioptric views. In Proc. CVPR, pages I: 279–286, 2001.

[6] C. Geyer and K. Daniilidis. Paracatadioptric camera calibration. PAMI, 24(5):687–695, May 2002.

[7] J. Gluckman and S. K. Nayar. Ego-motion and omnidirectional cameras. In Proc. ICCV, pages 999–1005, 1998.

[8] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge, UK, 2000.

[9] Hidden to conceal the authors' identity.

[10] D. Jacobs. Linear fitting with missing data: Applications to structure from motion and to characterizing intensity images. In CVPR, pages 206–212, 1997.

[11] S. B. Kang. Catadioptric self-calibration. In Proc. CVPR, pages I: 201–207, June 2000.

[12] S. B. Kang and R. Szeliski. 3-D scene data recovery using omnidirectional multibaseline stereo. IJCV, 25(2), Nov. 1997.

[13] J. Matas, O. Chum, M. Urban, and T. Pajdla. Robust wide baseline stereo from maximally stable extremal regions. In Proc. of the British Machine Vision Conference, pages I: 384–393, UK, 2002.

[14] J. Mellor. Geometry and texture from thousands of images. IJCV, 51(1), January 2003.


Scene                                      Lab side motion: 11 images […×…]   Venice: 13 images […×…]
Correspondences / missing data             1119 / 62.62 %                     1846 / 77.58 %
Image points consistent with EG            4614 (48.57 % of all regions)      5385 (53.92 % of all regions)
Image points consistent with MVG           4591 (99.50 % of 4614)             5301 (98.44 % of 5385)
Outliers                                   23 (0.50 % of 4614)                84 (1.56 % of 5385)
Rec. / partially rec. / not-rec. corresp.  1117 / 16 / 2 of 1119              1833 / 54 / 13 of 1846
Mean / maximal reprojection error          0.23 / 1.00 pxl (thr = 1 pxl)      0.28 / 1.00 pxl (thr = 1 pxl)

Length of reconstructed tracks   11  10   9   8   7   6   5    4    3     2   all
No track merging                  0   0  13   6  14  19  44  148  440  1391  2075
After track merging              11   1   4  19  40  52  79  175  357  1112  1850

(d) Measurement matrix for the Venice sequence: 13 images × 1846 correspondences.

Fig. 4. Metric reconstruction from a sequence of images. (a) The first, the middle, and the last image from the lab side motion sequence with inliers, and three views of the reconstructed scene points with camera positions. Notice that all reconstructed points are points in the view field of the cameras and that the cameras lie on a line. (b) Reconstruction of the Venice Yard QY sequence with camera positions. For comparison, only the reconstructed points are shown on top, whereas only the textures from the same view-point are shown in the middle. The summary table is given for both sequences. (c) Lengths of the reconstructed tracks without and with track merging. Notice the number of longer tracks after merging. (d) Measurement matrix for the Venice sequence. Without track merging, the matrix would be dominantly diagonal, i.e. no correspondences between the first and last views would be used.


[15] T. Pajdla, T. Svoboda, and V. Hlavac. Epipolar geometry of central panoramic cameras. In R. Benosman and S. B. Kang, editors, Panoramic Vision: Sensors, Theory, and Applications, pages 85–114. Springer Verlag, Berlin, Germany, 1st edition, 2001.

[16] P. Sturm. Mixing catadioptric and perspective cameras. In Proc. of the IEEE Workshop on Omnidirectional Vision, pages 60–67, 2002.

[17] P. Sturm and B. Triggs. A factorization based algorithm for multi-image projective structure and motion. In ECCV96(II), pages 709–720, 1996.

[18] F. Tisseur and K. Meerbergen. The quadratic eigenvalue problem. SIAM Review, 43(2):235–286, 2001.

[19] Z. Zhang, R. Deriche, O. Faugeras, and Q.-T. Luong. A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry. Artificial Intelligence, 78(1-2):87–119, 1995.