3d modeling of complex environments - columbia universityallen/photopapers/el-hakim.model.pdf ·...

12
SPIE Proceedings Vol 4309, Videometrics VII, San Jose, Jan 21-26, 2001 3D Modeling of Complex environments Sabry F. El-Hakim Visual Information Technology (VIT) Group Institute For Information Technology, National Research Council Canada ABSTRACT Creating geometrically correct and complete 3D models of complex environments remains a difficult problem. Techniques for 3D digitizing and modeling have been rapidly advancing over the past few years although most focus on single objects or specific applications such as architecture and city mapping. The ability to capture details and the degree of automation vary widely from one approach to another. One can safely say that there is no single approach that works for all types of environment and at the same time is fully automated and satisfies the requirements of every application. In this paper we show that for complex environments, those composed of several objects with various characteristics, it is essential to combine data from different sensors and information from different sources. Our approach combines models created from multiple images, single images, and range sensors. It can also use known shapes, CAD, existing maps, survey data, and GPS. 3D points in the image-based models are generated by photogrammetric bundle adjustment with or without self-calibration depending on the image and point configuration. Both automatic and interactive procedures are used depending on the availability of reliable automated process. Producing high quality and accurate models, rather than full automation, is the goal. Case studies in diverse environments are used to demonstrate that all the aforementioned features are needed for environments with a significant amount of complexity. Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual Reality, Geometric accuracy 1. INTRODUCTION Sensors and techniques for 3D modeling of small and medium size single objects, for example to the size of a human, have reached an advanced stage so that these models can be created accurately and fully automatically 1-3 . However, the situation is different for more complex large objects and environments or sites. Although there is no clear definition, a complex environment may consist of multiple objects of different types and may require a large number of images or scans to completely reconstruct in 3D. Some successful, application specific, examples do exist. For example urban or city models 4-6 have been successfully created with semi-automatic techniques. The success of those systems increases with the use of multiple sensors and the availability of CAD models describing all possible house or roof shapes. A flexible approach that can be applied to any type of complex site or structure remains elusive. The research presented in this paper is an attempt to develop some of the tools needed for such approach. The original focus of our work was to allow the creation of high quality, complete, and accurate models of complex environments by in-house intuitive easy to use software tools. The next aim was to investigate and apply full automation of each phase. 1.1. 3D Reconstruction Paradigms Techniques for 3D digitizing and modeling have been rapidly advancing over the past few years. The ability to capture details, the degree of automation, and geometric accuracy vary widely from one approach to another. Therefore, it is a challenging task to find a single approach that works for all types of environment and at the same time is fully automated and satisfies the requirements of every application such as accuracy, realistic look, cost, effort, and time constraints. The process of creating 3D models from real scenes has a few well-known steps: data collection, data registration, and modeling (geometry, texture, and lighting). There are many variations within each step depending on the sensor used and the data collection procedure. Approaches that skip the geometric modeling step, such as image-based rendering 7 , are popular for applications that require only visualization or some walkthrough. However, the lack of geometric model impedes the Email: [email protected]; http://www.vit.iit.nrc.ca/elhakim/home.html

Upload: others

Post on 23-Mar-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

SPIE Proceedings Vol 4309, Videometrics VII, San Jose, Jan 21-26, 2001

3D Modeling of Complex environments

Sabry F. El-Hakim∗

Visual Information Technology (VIT) Group Institute For Information Technology, National Research Council Canada

ABSTRACT Creating geometrically correct and complete 3D models of complex environments remains a difficult problem. Techniques for 3D digitizing and modeling have been rapidly advancing over the past few years although most focus on single objects or specific applications such as architecture and city mapping. The ability to capture details and the degree of automation vary widely from one approach to another. One can safely say that there is no single approach that works for all types of environment and at the same time is fully automated and satisfies the requirements of every application. In this paper we show that for complex environments, those composed of several objects with various characteristics, it is essential to combine data from different sensors and information from different sources. Our approach combines models created from multiple images, single images, and range sensors. It can also use known shapes, CAD, existing maps, survey data, and GPS. 3D points in the image-based models are generated by photogrammetric bundle adjustment with or without self-calibration depending on the image and point configuration. Both automatic and interactive procedures are used depending on the availability of reliable automated process. Producing high quality and accurate models, rather than full automation, is the goal. Case studies in diverse environments are used to demonstrate that all the aforementioned features are needed for environments with a significant amount of complexity. Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual Reality, Geometric accuracy

1. INTRODUCTION Sensors and techniques for 3D modeling of small and medium size single objects, for example to the size of a human, have reached an advanced stage so that these models can be created accurately and fully automatically1-3. However, the situation is different for more complex large objects and environments or sites. Although there is no clear definition, a complex environment may consist of multiple objects of different types and may require a large number of images or scans to completely reconstruct in 3D. Some successful, application specific, examples do exist. For example urban or city models4-6 have been successfully created with semi-automatic techniques. The success of those systems increases with the use of multiple sensors and the availability of CAD models describing all possible house or roof shapes. A flexible approach that can be applied to any type of complex site or structure remains elusive. The research presented in this paper is an attempt to develop some of the tools needed for such approach. The original focus of our work was to allow the creation of high quality, complete, and accurate models of complex environments by in-house intuitive easy to use software tools. The next aim was to investigate and apply full automation of each phase. 1.1. 3D Reconstruction Paradigms Techniques for 3D digitizing and modeling have been rapidly advancing over the past few years. The ability to capture details, the degree of automation, and geometric accuracy vary widely from one approach to another. Therefore, it is a challenging task to find a single approach that works for all types of environment and at the same time is fully automated and satisfies the requirements of every application such as accuracy, realistic look, cost, effort, and time constraints. The process of creating 3D models from real scenes has a few well-known steps: data collection, data registration, and modeling (geometry, texture, and lighting). There are many variations within each step depending on the sensor used and the data collection procedure. Approaches that skip the geometric modeling step, such as image-based rendering7, are popular for applications that require only visualization or some walkthrough. However, the lack of geometric model impedes the

∗ Email: [email protected]; http://www.vit.iit.nrc.ca/elhakim/home.html

Page 2: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

accuracy and the freedom to render the environment from arbitrary viewpoints and thus will not be considered here. Only 3D textured models allow unrestricted walkthrough and close inspection of the site details. Image-based modeling methods can be divided into two categories. The first uses widely separated images, interactive correspondence, and a priori knowledge about the scene 8. It applies basic photogrammetry for image registration followed by stereo matching to add details. The second category uses image sequence, projective geometry and automatic correspondence, and no or little knowledge about the scene 9-11. These methods require images taken close to each other (short baseline) in order for the automatic correspondence to succeed. This makes them more noise sensitive and numerically unstable. Errors in the range of 2.3% to 5.4% on a single object have been reported10. However these methods can be useful for applications that do not require high geometric accuracy but need realistic looking and easy to create models. Techniques based on a single image plus constraints were developed for specific objects such as buildings12,13. The Main Advantage of all image-based methods is that the sensors are inexpensive and very portable. However, due to the need for features, incomplete models may result particularly on irregular geometry or sculptured featureless surfaces. Active sensors such as laser scanners have the advantage of automatically acquiring dense 3D points14-16 without the need for any features. They also produce organized points suitable for automatic modeling2,3. However, the sensors can be costly, bulky, and affected by surface reflective properties. Also a range sensor is usually designed for a specific range, thus a sensor designed for a close range cannot be used for a long range. In addition, it is particularly difficult to find active sensing technology suitable or accurate enough for the medium range of between 3m and 20m. For relatively simple objects, structures, or environments, most existing methods are capable of successfully creating 3D models, albeit at varying degree of automation, level of details, effort, cost, and accuracy. Many researchers have presented examples of those types of model in the past several years. However, when it comes to complex environments the only proven methods so far are those using positioning devices, CAD or existing models and operator in the loop, such as city modeling application4-6. 1.2. Summary of the Approach Our approach boasts features contained in all the above technologies by combining image-based methods with range data and other information such as known shapes or CAD, surveying, and GPS. The high accuracy of the image-based approach is achieved with rigorous photogrammetric techniques. In this approach, points are extracted interactively. Once the images are registered using bundle adjustment, corners are automatically extracted and matched using our hierarchical stereo matching approach17. However, this will usually not produce sufficient points for complete 3D reconstruction. Points in selected locations need to be interactively added using multi-image triangulation. However, there are usually many parts of the scene where multi-image triangulation is not possible due to occlusions or lack of features. These parts can be reconstructed in the 3D space from the coordinates in a single image and the mathematical model of the surface determined by fitting a function to existing surface points. We now have a sampled geometry in the form of points in the three-dimensional space however the connectivity or the topology is not known and must be determined someway. Our approach relies on interactive point segmentation followed by automatic mesh generation. Counting on images for modeling may not be sufficient because the features that can be extracted are usually fewer then the required level of details. To overcome this problem two options are implemented. First, large number of points can be automatically added to surfaces of known shape, such as spheres, cylinders and quadrics, using a polygon subdivision method. The second option integrates data from range sensors that can densely digitize the surface where filled out details are required. Another important feature of the approach is the ability to combine independently created models. In almost every complex site we modeled, we found that it was not practical, or even sometimes possible, to cover the whole site with a single set of images that can be registered together. Also some parts required images at much larger scale than the remaining parts of the site in order to capture the required details. Registering and combining these various models proved to be essential. The process can be divided into several phases, mainly the selection of sensors and sensor placement, feature extraction and matching for image registration (bundle adjustment), feature extraction and matching for 3D modeling, and determining connectivity between points for modeling. These phases can be carried out either interactively or automatically. Although full automation is the ultimate goal especially for complex environments where human interaction is time consuming, this goal is yet to be achieved. This is particularly true for scene understanding required for determining sparse-points connectivity for automatic reconstruction. Currently it is very difficult, or at least not yet convincingly demonstrated, to automatically and correctly segment the scene into regions that contain the surfaces and objects appropriate for modeling. The exception is dense 3D data, such as those obtained by active range sensors or closely spaced image sequences, where successful automatic

Page 3: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

segmentation is possible. Combining such data with color and intensity improves the segmentation success rate18. However, for complex environments and unorganized sparse 3D points most methods are still experimental. Therefore, in order to produce 3D models for today’s applications, our philosophy is to design a system that is partially automatic and partially relays on human interaction for specific operations. This system provides the infra structure for a future fully automated system that reaches this goal incrementally. Once fully automated methods are proven reliable for any of the phases of the procedure, it may replace the current interactive method. The idea is not to design an approach for the mass user market, although this remains a desirable goal, but for users who are prepared to take some care in image acquisition and point selection in order to achieve the best possible results from the existing technologies. More details of the approach, with several examples, are given in the next section. In the third section, accuracy analysis and geometric considerations are discussed. We present several case studies using three on-going projects in the forth section. Finally conclusions and our future research directions are summarized.

2. THE APPROACH

(A) The overall procedure

(B) Image-based procedure

(C) Active range-sensor procedure

Figure 1: Simplified block diagram of the main steps.

Our approach to 3D modeling of complex environments can be summarized as follows (Figure 1): 1. It creates accurate models from multiple images using photogrammetric techniques, mainly bundle adjustment with self-

calibration - a simultaneous optimization of all camera interior and exterior parameters and object-space point coordinates.

2. Implements proper network design of the images to increase accuracy and stability and reduce error propagation. 3. Registers and integrates models created from independent sets of images into a non-redundant single model. 4. Registers and integrates models obtained by range data with image-based models. 5. Computes 3D coordinates from single images for surfaces of known shapes or whose relationships with each other are

known.

Page 4: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

6. Makes use of available non-imaging information such as surveyed points, existing maps, known shapes, CAD, known camera positions, and GPS.

7. Automatically extracts and matches features such as corners after image registration. 8. On certain types of surface, 3D points are automatically generated without measuring new features using subdivision

techniques. We emphasize that for complex environments high geometric accuracy is very important since any compromise will result in visibly incorrect relationships between objects and surfaces in the scene. 2.1. The Main Procedure Steps In this section we discuss briefly the image-based procedure, figure 1-B (the active range-sensor procedure is summarized n figure 1-C). We use photogrammetric bundle adjustment with or without self-calibration19,20 depending on network design. It provides the most flexible solution, most rigorous statistical error model, and statistical analysis all through the process to verify quality of registration and 3D point estimation. If conditions are not suitable for self-calibration21, such as having a small number of images (3 or less) containing only few points (10 or less), the focusing of the camera lens should not change during the image taking. The camera is then calibrated separately at the same focus setting. This usually provides a comparable accuracy to self-calibration since the distortion parameters, particularly radial lens distortion, are stable22. When self-calibration is used, first a bundle adjustment is performed without self-calibration. The resulting points can be thought of as reference points with approximately known 3D coordinates for the self-calibration procedure that follows. 2.2. 3D From a Single Image In complex environments and structures, there are always parts of the scene that will be visible from only one image due to occlusions. Also lack of features on some surfaces makes it difficult to obtain 3D coordinates from image correspondence. Therefore, an approach to extract 3D information from a single image is necessary. Our approach applies the equation of the surface as a constraint, along with the camera parameters, to the single-image coordinates to compute the corresponding 3D coordinates. For example in the structure shown in figure 2 many parts of its walls appear in only one image. However, the walls are planes that either parallel or perpendicular to each other. The equations of some of the planes can be determined from points that appear in multiple images, such as the corners of windows or walls. The remaining plane equations are determined using the knowledge that they are either perpendicular or parallel to one of the planes already determined. With a little effort, the equations of all the planes defining the structure cab be computed. From these equations and the known camera parameters for each image, we can easily determine 3D coordinates of any point or pixel from a single image. See also figure 4 for another example that uses plane and cylinder equations to compute 3D coordinates from a single image.

Figure 2: Images and model of ‘t Huys te Warmont, Warmond, Holland.

Page 5: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

2.3. Surface Segmentation, Subdivision, and Extension The 3D points generated from image-based methods may not be sufficient for complete reconstruction. Also the connectivity, or the topology, is unknown. Three interrelated operations are needed in order to add sufficient points and organize them to create a complete 3D model. Segmenting or grouping 3D points into sets each belonging to a different surface is the first step. Most existing automatic modeling methods were developed for organized 3D points, such as the range images obtained from a laser scanner3, or unorganized points belonging to specific types of object2. Sparsely distributed points obtained from features on various surfaces on different objects are almost impossible to model automatically since they are subject to many possible interpretations. In our approach, the scene is visually divided into surface patches, each is triangulated and texture mapped separately. Although this is specified manually by a human operator, it is easy to do since all that is required is to draw, with the mouse, a window around the points belonging to the same surface set. Each set may be on a different surface, or the same surface may be divided into several sets depending on the complexity of its shape. Once this is done, the modeling is then carried out automatically. Using the existing features on the surface set, 3D-point computation of selected points is first done interactively. These 3D-points are then used to compute the function defining the surface using least squares fitting. The function is in turn used to automatically generate new points on the surface. On more complex surfaces where only B-splines can describe the shape, we can subdivide the existing triangles into a large number of smaller triangles by subdivision techniques23, thus creating a much smoother surface. For partially occluded surfaces, the surface equation can be used to extend them into the occluded regions. For example, in figure 3, we can extend the sides of the buildings (side 1 and 2) and the street (surface 3) by fitting planes using corners of windows on the surface or the visible building corners. The same can be done for the front building (surface 4 in figure 3) by using a cylinder equation rather than plane

Figure 3: Example of surface extension and subdivision.

(A) (B) Figure 4: Two culture-heritage sites: (A) Pomposa, Italy (B) Dazu, China

2.4. Adding Range Data This involves matching and integrating local detailed points obtained by a laser scanner to the global model obtained by the image-based method. This is best described by an example. In figure 4 A and B, most of the structure is easy to model by images taken with digital camera. However, parts of the surface contain fine geometric details that will be very difficult or impractical to model from images, such as the enlarged sections shown. Those parts are best captured by a laser scanner and

Page 6: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

added to the global model created from the images. To register the detailed model we measure 3D coordinates of several features, usually 10, using the images then extract the 3D coordinates of the same features from the scanned data. This is done interactively using the intensity images generated by the laser scanner. The transformation parameters are then used to register the two data sets. 2.5. Combining Models Created by Independent Sets of Images An approach to combine models created by different sets of images (also applies to models created by different types of sensor data) into a coherent model of the complete site has been developed. The first step is properly registering the models into the same coordinates system using common points, existing maps or surveys, or positioning devices such as GPS. The transformation between models allows for scale variations since the model points are often generated with free network adjustment with only an approximate single distance to determine the scale. The next step is the integration. In some cases, each model can be left as a separate self-contained node (e.g. two separate objects or a scanned object setting on top of a surface model created by images). However, in many cases more effort is required for example to re-triangulate the overlapped region and produce a non-redundant mesh. In figure 5, images 1,2, and 3 are part of a set of images that are used to create the overall model of the structure. Close up images such as images 4 and 5 are used to get more details like the entrance to building. Any unwanted features, such as shadows, trees, and humans, may be digitally cut from the images prior to texture mapping onto the geometric model.

1

2

3

The combined model

4 5

Figure 5: Independent sets of images used to create a complete detailed model (Chapel of the Scrovegni, Padova, Italy).

Page 7: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

3. GEOMETRIC CONSIDERATIONS AND ACCURACY ANALYSYS Geometric accuracy is critical for complex environments. The relationships between objects and surfaces, such as parallelism and perpendicularity, and relative sizes of the numerous details can be significantly altered by errors in point coordinates. Two geometric issues must be considered for accurate image-based modeling: • Camera configuration (network design) • Point selection Network configuration design has been extensively studied in photogrammetry24, therefore we will only show one case study here (see more accuracy analysis in a previous paper25).

1 2 3 -2977.020 -160.400

2422.941 -30.460

-5833.178 11.870

-2458.418 -165.960

2463.202 -21.300

-5570.078 7.530

5080.373 -165.910

2127.957 32.740

-5691.127 -8.250

Figure 6. Images, camera exterior parameters, and the 3D model of locks on the Dutch waterways. Camera 2-3 Camera 1-3 Camera 1-2

X -68.048 -65.863 -86.829

Y 1043.812 1044.401 1053.609

Z -25.120 -27.464 -69.132

The points altered by 1 pixel in x &y in image 1: X -62.736 -124.730

Y 1048.112 1077.913

Z -30.073 -154.881

Table 1: XYZ coordinate computation at different camera configurations

Figure 7. Configuration of camera-object for the 3 images in figure 6.

Page 8: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

We will use the flowing example to illustrate the effect of camera configuration on the geometric accuracy. The site shown in figure 6 consists of several structures of various shapes. We purposely took the images at the configuration shown in figure 7. Images 1 and 2 have a short baseline (B1) relative to the object standoff distance. Images 2 and 3 were taken at a much larger baseline and at highly convergent angle. To illustrate the difference in accuracy between the two configurations, we introduced an error of 1 pixel to a point in image 1. If we compute the 3D coordinates with the error using only image 1 and 2, this 1 pixel error resulted in a total of 2% error in the object space. Using images 1 and 3 the error in object space is only 0.07%, a factor of 30. Considering that automated methods require short baselines such as the one between image 1 and 2 in order to have successful correspondence, it is obvious that accuracy will be significantly compromised. Point selection is also important. The points should be distributed over the entire image, be at different planes, and at least 15 points should be used. Points on steep surfaces (such as the sides of the buildings in figure 3 and 5) should be avoided. The main reason is that any small pointing error in the image space results in a large triangulation error in the object space even with large baseline. We have tested this by repeating the measurement several times on points on the steep walls and found that the resulting variations in the 3D coordinates, mainly the Z-coordinate which is close to the camera optical axis, was in the 2-3% range.

4. CASE STUDIES We described the fundamentals of our approach and discussed some geometric accuracy considerations. In addition to the examples used in the above sections, we present here three case studies that further illustrate the approach. 4.1 The Site of the Pyramids of Egypt

We used an exiting map of the site, and some surveys of the pyramids that are publicly available (see for example the famous book by P. Tompkins26). We then proceeded to the site and planned and acquired digital images to create models of the following structures:

• Each of the three main pyramids • The Sphinx • Details on parts of the pyramids (for example

the top of Chephren pyramid.)

Each of the pyramids required more than 12 high- resolution (3.3 Mega-Pixels) images to capture the overall shape (some are shown in figure 9). The sphinx required several more images. The coordinates of the corners and the top of each pyramid in a single spatial coordinate system were computed from the available survey data (figure 8). The coordinates of the four corners of the base of the sphinx were also computed. None of these corners were well defined thus the accuracy of positioning these structures will not be very high (about 1m). Using the same points on the individual models, all the four structures were registered in the same coordinate system. The detailed close-up models (figure 9-B) were combined with the overall models as described in section 2.5.

Figure 8: Layout / map of the pyramids and the sphinx

Having built the pyramids and sphinx models, the next stage involved the landscape or the terrain. To do this it was necessary to obtain aerial photographs of the site. We are currently in the process of acquiring such images in order to create a complete model of the whole site.

Page 9: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

(A)

(B)

Figure 9: Some of the images of the second pyramid and its model, including the close up on the top. 4.2. Virtualized Art Gallery In this project we had to use multi-models, single image, and laser scanner data. No non-imaging data were available. Each room in the gallery, including room contents, was modeled separately. These room models were then registered together using the common doors between them. We will briefly describe one representative room as a case study.

Figure 10: Sample images taken at the art gallery.

In this room, called the water court, we used two separate models and one single image to create a complete model of the walls, the floor, and all the stands. Images 1 to 6 in figure 11 were used to create the model of the three shown walls, while images 7 to 9 were used to create the model of the remaining wall and parts of the two adjacent walls. This overlaps with the first model at the doors. These doors were used to register the two models to create one complete model of all the walls and parts of the floor. The image of the floor (top right image in figure 10) was registered with this model, again using common points, in order to provide the complete floor, which has a pool of water/fountain in the middle. Many parts of the model, such as sides of the stands scattered around the pool, appear in single images only. Therefore the single-image 3D approach, which takes advantage of the perpendicularity and parallelism of planes, has been used extensively. The art sculptures, which are placed on the floor and on the bases shown, are best modeled with a laser scanner27.

Figure 11: Camera locations in one of the art courts.

Page 10: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

Any other scanned sculptures that are not on display in the actual gallery can be placed on the stands in the virtualized gallery. By replacing the sculptures in the virtualized gallery as desired, one can imagine that a much larger collection of art than possible in the real art gallery can be visualized. 4.3. Busy Culture Heritage Site This site in the heart of the city of Trento, Italy, consists of quite a few structures with varied facades that do not have obvious constraints, such as perpendicularity or parallelism (figure 12 shows some of the images and parts of the 3D mode). High geometric accuracy from images was particularly critical for this project since it is almost impossible to detect these relationships in advance and enforce them when creating the model. Whatever relationships exist between the structures must be computationally evaluated from the 3D data rather than a priori assumed. Since this is a very busy tourist site, actual surveying or measurements at the site was difficult. Therefore, the best source of 3D data was high-resolution digital images. However, since these images are the only source of 3D information care had to be exercised in the geometric configurations (network design) of the camera positions to achieve the highest possible accuracy from the photogrammetric bundle adjustment. Any loss in geometric accuracy may result in incorrect relationships between the structures. We used only 11 images (3.3 Mega-Pixels resolution). The points required for image registration were extracted interactively then additional points were extracted by stereo matching after the registration. Additional interactive 3D measurements were performed, and surface fitting was also used to verify type of surfaces and the relationships between them. Some surfaces were determined from single images once the surface equations were determined. Although this site may lend itself to image-based rendering (IBR) visualization or panoramas, the many depth variations and the presence of several bends and alleys make using a geometric model appreciably more effective than IBR.

Figure 12: Top row: some of the images, bottom row: views from the model of the facades.

5. CONCLUSIONS AND FUTURE WORK An approach to create accurate and complete 3D models of complex environments has been presented. The method combines image-based approaches, both multi-image and single-image approaches, and active-sensor data, and utilizes non-imaging data such as CAD, surveying, or GPS. High geometric accuracy is critical for modeling complex environments. Therefore, care must me taken in image and sensor placement and in selecting points for image registration. Some of the operations are performed interactively while others are carried out automatically. The accuracy achieved by applying a complete camera model and simultaneous photogrammetric global adjustment of bundles is sufficient for most applications. Although this interactive strategy can be used to model a wide spectrum of objects and sites, it is of course highly desirable to reduce human intervention, particularly when using a large number of images. Automation is above all needed for view

Page 11: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

planning and image acquisition, point extraction and matching before registration especially for widely spaced and highly convergent camera angles, and determining point connectivity by segmentation of 3D points into groups. All the three items are very difficult to automate, however more effort should be focused on the first and last item since the second takes only few minutes to perform interactively. Occlusions and variations in illumination between images affect existing automatic methods for correspondence and image registration. Therefore, those methods need images taken at close intervals, which result in a large number of images as well as reduced geometric accuracy, as shown in this paper. In addition, the resulting 3D points are not likely to be suitable for modeling, thus significant human input is still required. Therefore, improved automated methods that do not suffer from these shortcomings are the subject of our current and future research. ACKNOWLEDGMENTS I would like to thank my colleagues Angelo Beraldin, Eric Paquet, John Taylor, and Luc Cornouyer for providing the range sensor data and taking the digital images in China, Egypt, and Italy. References 1. G. Eckert: “Automatic shape reconstruction of rigid 3-D objects from multiple calibrated images”, Proc. of Eusipco

2000, Tampere, Finland, September 4-8, 2000. 2. H. Hoppe, “Surface reconstruction from unorganized points”, PhD Thesis, Department of Computer Science and

Engineering, University of Washington, June 1994 3. M. Soucy, D. Laurendeau. "Multiresolution surface modeling based on hierarchical triangulation", Computer Vision and

Image Understanding 63(1), pp. 1–14, January, 1996. 4. C. Brenner, “Towards fully automatic generation of city models“, International Archives of Photogrammetry and

Remote Sensing, Volume 33, Part B3, Commission III, pp 85-92, Amsterdam, July 16-23, 2000. 5. A. Gruen, “Semi-automatic approaches to site recording and modeling”, International Archives of Photogrammetry and

Remote Sensing, Volume 33, Part B5A, Commission V, pp 309-318, Amsterdam, July 16-23, 2000. 6. S. Teller, “Towards urban acquisition from geo-located images”, 6th Pacific Conference on Computer Graphics and

Applications, Pacific Graphics’98, IEEE Press, pp. 45-51, Oct. 26-29, 1998. 7. S.B. Kang, “Survey of image-based rendering techniques”, SPIE Vol. 3641, Videometrics VI, pp. 2-16, 1999. 8. P.E. Debevec, C.J. Taylor, J. Malik, “Modeling and rendering architecture from photographs: A hybrid geometry and

image-based approach”, SIGGRAPH’96, pp.11–20, 1996. 9. M. Pollefeys, R. Koch, M. Vergauwen., L. Van Gool, “Hand-held acquisition of 3D models with a video camera”, IEEE

proceedings of 2nd. Int. Conf. On 3D Digital Imaging and Modeling (3DIM’99), pp. 14- 23, 1999. 10. M. Pollefeys, R. Koch, L. Van Gool, “Self-calibration and metric reconstruction in spite of varying and unknown

intrinsic camera parameters”, International J. of Computer Vision, 32(1), pp. 7-25, 1999. 11. Z. Zhang, “Image-based geometrically-correct photorealistic scene/object modeling (IBPhM): A review”, Proc. Of the

Asian Conference on Computer Vision (ACCV’98), Hong Kong, January 8-11, 1998. 12. F.A. van den Heuvel, “3D reconstruction from a single image using geometric constraints”, ISPRS Journal for

Photogrammetry and Remote Sensing, 53(6), pp. 354-368, December, 1998 13. D. Liebowitz, A. Criminisi, A. Zisserman, A., ”Creating Architectural Models from Images”, EUROGRAPHICS ’99,

18(3), 1999. 14. J.-A. Beraldin, F. Blais, L. Cornouyer., M. Rioux, S.F. El-Hakim, R. Rodella, F. Bernier, N. Harrison “3D imaging

system for rapid response on remote sites”, IEEE proceedings of 2nd. Int. Conf. On 3D Digital Imaging and Modeling (3DIM’99), pp. 34- 43, 1999.

15. A. Johnson, R. Hoffman, J. Osborn, M. Hebert, “A system for semi-automatic modeling of complex environments”, IEEE proceedings of the. Int. Conf. On 3D Digital Imaging and Modeling (3DIM’97), pp. 213- 220, 1999.

16. V. Sequeira, K. Ng, E. Wolfart., J.G.M. Goncalves, D. Hogg, “Automated reconstruction of 3D models from real environments”, ISPRS Journal for Photogrammetry and Remote Sensing, 54(1), pp. 1-22, January, 1999.

17. S.F. El-hakim, “A hierarchical approach to stereo vision”, Photogrammetric Engineering and Remote Sensing, 55(4), pp. 443-448, April, 1989.

18. B. Maxwell, S.A. Shafer, “Physics-based segmentation of complex objects using multiple hypotheses of image formation”, Computer Vision and Image Understanding, 65 (2), pp. 269-295, February, 1997.

19. D.C. Brown, “The bundle adjustment - Progress and prospective.” International Archives of Photogrammetry, 21(3): paper no. 3-03-041, 33 pages, ISP Congress, Helsinki, Finland, 1976.

Page 12: 3D Modeling of Complex environments - Columbia Universityallen/PHOTOPAPERS/el-hakim.model.pdf · Keywords: 3D modeling, Photogrammetry, Complex environments, Visualization, Virtual

20. C.S. Fraser, “Digital camera self-calibration”, ISPRS Journal for Photogrammetry and Remote Sensing, 52(4), pp. 149-159, August, 1997.

21. A. Gruen, H.A. Beyer, “System calibration through self-calibration”, Workshop on Calibration and Orientation of Cameras in Computer Vision, Congress of ISPRS, Washington, D.C., 1992.

22. M.R. Shortis, S. Robson, T. Short, “ Multiple focus calibration of a still video camera”, International Archives of Photogrammetry and Remote Sensing, Vol. 31, Part B5, pp. 534-539, Vienna, July 1996.

23. D.N. Zorin, Subdivision and Multiresolution Surface Representation. Ph.D. Thesis, Caltech, California, 1997. 24. C.S. Fraser, “Network design considerations for non-topographic photogrammetry”, Photogrammetric Engineering and

Remote Sensing, 50(8); 1115-1126, 1994. 25. S.F. El-Hakim, “A practical approach to creating precise and detailed 3D models from single and multiple views",

International Archives of Photogrammetry and Remote Sensing, Volume 33, Part B5A, Commission V, pp 122-129, Amsterdam, July 16-23, 2000.

26. Tompkins, P. Secrets of the great pyramid. Harper Colophon Books, New York, 1978. 27. S.F. El-Hakim, C. Brenner, G. Roth, "A multi-sensor approach to creating accurate virtual environments", ISPRS Journal

for Photogrammetry and Remote Sensing, 53(6), pp. 379-391, December, 1998.