
Proceedings 5th EARSeL Workshop on Imaging Spectroscopy. Bruges, Belgium, April 23-25 2007 1

IMAGE PROCESSING WORKFLOWS FOR AIRBORNE REMOTE SENSING

Jan Biesemans1, Sindy Sterckx1, Els Knaeps1, Kristin Vreys1, Stefan Adriaensen1, Jef Hooyberghs1, Koen Meuleman1, Pieter Kempeneers1, Bart Deronde1, Jurgen Everaerts1, Daniel Schläpfer2, Jens Nieke2

1. VITO, Department of Remote Sensing and Earth Observation Processes, Mol, Belgium; [email protected]

2. University of Zurich, Remote Sensing Laboratories, Department of Geography, Zurich, Switzerland; [email protected]

ABSTRACT In support of the hyperspectral remote sensing campaigns and the APEX and MEDUSA sensor development projects, VITO has developed a dedicated experimental Central Data Processing Center (CDPC) for airborne earth observation data. The processing of imagery originating from airborne whiskbroom point scanners, pushbroom line scanners, photogrammetric digital frame cameras and video cameras encompasses the following steps: (a) the archiving of the incoming image data, image metadata and telemetry (i.e. the generation of Level-1B user products), (b) the Level-2/3 processing, which consists of geometric (orthorectification) and atmospheric correction, and (c) the Level-4 processing, which generates image composites from multiple lower-level scenes. All products are distributed as self-descriptive HDF5 files, containing the losslessly compressed image data and all metadata needed for further processing.

The Level-1B archiving workflow and the Level-2/3/4 processing workflow are the two main workflows constituting the CDPC. These workflows are distributed software systems focused on the full automation of all processing steps. For disaster management applications, the workflows are able to generate the user products in near real-time (latency in the order of minutes). However, given the complexity involved in a state-of-the-art atmospheric correction, it was necessary to design the workflows as reactive software systems, reacting to user/operator-specific tuning of the atmospheric correction parameters. This paper gives an overview of the hardware and software layout of the workflows and of their algorithmic components.

INTRODUCTION The assembly of an experimental Central Data Processing Center (CDPC) for airborne remote sensing was triggered by the recurrent hyperspectral campaigns funded by the Belgian Federal Science Policy (BELSPO), the APEX project (funded by ESA-PRODEX) and the PEGASUS programme (funded by the Flemish Government and ESA-PRODEX).

In preparation of the APEX operations, and to support scientific research on further downstream product development based on hyperspectral imagery, BELSPO has initiated recurrent hyperspectral missions since 2002: CASI-SASI in 2002, CASI-ATM in 2003, HYMAP in 2004 and an AHS160 campaign in 2005.

The Airborne Prism Experiment (APEX) is currently being built by a joint Swiss/Belgian consortium. The spectrometer consists of two sensors: one sensitive in the VNIR (Visible/Near-InfraRed, 380-1000 nm) and one sensitive in the SWIR (ShortWave InfraRed, 930-2500 nm) wavelength range. The incoming light is dispersed onto 1000 spatial pixels across-track for both sensors, with 312 spectral rows in the VNIR and 195 spectral rows in the SWIR. Flexible, re-programmable on-chip binning allows the spectral bands to be summarized into a maximum of 312 spectral rows for both detectors. The data throughput of the instrument is estimated at 500 Mbit/s.
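On-chip binning amounts to combining groups of adjacent unbinned spectral rows into output bands. A minimal sketch of that operation, summing consecutive rows; the class and method names are hypothetical and the binning factor is purely illustrative, not an APEX configuration:

```java
/** Illustrative sketch of spectral binning: consecutive groups of unbinned
 *  spectral rows are summed into a smaller number of output bands.
 *  Hypothetical names; not part of the APEX software. */
public class SpectralBinning {
    /** Sum consecutive groups of `factor` spectral rows into one band. */
    public static double[] bin(double[] spectralRows, int factor) {
        int outBands = spectralRows.length / factor;
        double[] binned = new double[outBands];
        for (int b = 0; b < outBands; b++) {
            for (int r = 0; r < factor; r++) {
                binned[b] += spectralRows[b * factor + r];
            }
        }
        return binned;
    }

    public static void main(String[] args) {
        // Four unbinned rows summed pairwise into two bands.
        double[] binned = bin(new double[]{1, 2, 3, 4}, 2);
        System.out.println(binned[0] + " " + binned[1]); // prints 3.0 7.0
    }
}
```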

The PEGASUS (Policy support for European Governments by Acquisition of information from Satellite and UAV borne Sensors) initiative, launched by VITO (the Flemish Institute for Technological Research) in 2000, is based on very specific user needs, i.e. the need for image data of decimetre resolution for (a) large-scale photogrammetric and environmental mapping applications and (b) event-monitoring purposes. This inherently requires better coverage in both the space and time domain than current platforms can deliver. In response to these user needs, VITO proposes to use a High-Altitude Long-Endurance (HALE) Unmanned Aerial Vehicle (UAV) as an innovative remote sensing platform. This choice is based on the following advantages of HALE UAV systems: (a) longer mission lengths than traditional airborne systems, (b) continuous observation of target spots at high update rates, (c) high-resolution imagery combined with regional coverage, (d) flexible trajectories (the platform operates where there are no clouds) and (e) better photogrammetric quality, since operating at high altitude results in smaller view angles and hence less geometric distortion.

For the Mercator-1 UAV platform that is currently being assembled, a Belgian consortium under the lead of VITO is building an RGB frame camera system within the framework of the MEDUSA ESA-PRODEX project (i). The sensor system of the MEDUSA payload is composed of two CMOS imaging devices (a PAN and an RGB frame), each having a dimension of 10000 by 1200 pixels. Because there is no on-board data storage, the data throughput is limited by the S-band downlink characteristics and has an upper limit of 20 Mbit/s. Since each CMOS frame is read out every 0.7 seconds, the imagery will be lossy-compressed by means of a JPEG2000 filter to fit within this 20 Mbit/s limit.
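The need for on-board compression follows from simple arithmetic. Assuming, for illustration, 10 bits per pixel (the MEDUSA pixel depth is not stated here), two 10000 × 1200 frames every 0.7 s amount to roughly 343 Mbit/s of raw data, so a compression ratio of about 17:1 is needed to fit the 20 Mbit/s downlink:

```java
/** Back-of-the-envelope downlink budget for the MEDUSA payload. Frame size,
 *  readout interval and the 20 Mbit/s limit are from the text; the 10-bit
 *  pixel depth is an assumption for illustration only. */
public class DownlinkBudget {
    public static double requiredCompressionRatio(
            long widthPx, long heightPx, int bitsPerPixel,
            int frames, double readoutSeconds, double linkMbit) {
        // Raw data rate in Mbit/s for `frames` sensors read out each interval.
        double rawMbitPerSecond =
                widthPx * heightPx * (double) bitsPerPixel * frames
                / readoutSeconds / 1e6;
        return rawMbitPerSecond / linkMbit;
    }

    public static void main(String[] args) {
        double ratio = requiredCompressionRatio(10000, 1200, 10, 2, 0.7, 20.0);
        System.out.printf("required compression ratio ~ %.0f:1%n", ratio); // ~ 17:1
    }
}
```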

Given these impressive throughputs, and given the requirement of fast or even near real-time product availability (disaster management, environmental security), VITO has invested in a dedicated experimental hardware system and in the development of custom-made software workflows, to (a) simulate operations and (b) support the development of algorithmic image processing components within an operational context. Furthermore, since the APEX and MEDUSA camera systems will trigger applied research for the production of better downstream user products, the CDPC workflows must allow for scientific experimentation without hampering the operational production system. This paper presents an overview of the CDPC hardware and software design.

Currently, the CDPC software system contains an archiving workflow (Level 0 to Level 1B) and a product processing workflow (Level 1B to Level 2/3/4). A description of the product levels can be found in Table 1. According to the CEOS definition, products which map variables that are not directly measured by the instruments (but are derived from these measurements) are classified as Level 4. In the CEOS philosophy, this implies that before such maps can be produced, a Level 3 product has to be generated first, i.e. a spatially and/or temporally resampled product (such resampling may include averaging and compositing). As illustrated in (ii), all algorithms in an optimal workflow perform their work on the raw scene geometry, and image resampling to a certain projection system and projection datum is done at the very end. Therefore, as indicated in Table 1, the CEOS processing level definition is modified in support of the findings presented in (ii).

The current workflows contain two major algorithmic components: orthorectification and atmospheric correction. Since virtually all users of the CDPC-generated products use earth observation derived material for their assessment and modeling applications (amongst others: food security, vegetation dynamics, spatial epidemiology, land use and land cover mapping, indirect mapping of potential soil pollution, atmospheric modeling), these corrections must comply with very specific user requirements with respect to the models used to convert the raw sensor digital numbers to physical quantities (ii). This means that the software system must allow the users to submit their pre-processed in-situ measurements (e.g. columnar water vapor content, visibility, etc.) to further customize the Level-2/3/4 products. This paper also presents an overview of all algorithmic components currently integrated in the CDPC in support of the hyperspectral remote sensing campaigns. Note that these algorithmic components cover the current user requirements, but will not necessarily be used in the APEX Level 2/3/4 processing workflows. For the APEX instrument, the algorithmic module selection will be the responsibility of the APEX Science Centre (ii).

Table 1: CDPC Product Levels. The product definitions were adopted from CEOS (Committee on Earth Observing Satellites, http://www.ceos.org/); it is annotated where additions and modifications were necessary in support of the optimised workflow for the processing of hyperspectral imagery.

Level 0 Raw data. If a product remains in a Level 0 state, the product is missing image data and/or image metadata.

Level 1B Raw user product – Level 1B data products are reconstructed, unprocessed instrument data at full resolution, time-referenced, and annotated with ancillary information, including radiometric and geometric calibration coefficients and georeferencing parameters (e.g. platform ephemeris, external and internal sensor orientation parameters), computed and appended but not applied. A bitmap quick look is added to the archive file. Any and all communications artifacts (e.g. synchronization frames, communications headers, duplicate data) are removed. Platform/sensor exterior orientation parameters appear in both unprocessed and dGPS-corrected format (if base station GPS time series were available). Consequently, the Level 1B file is a completely self-descriptive file, enabling a full radiometric, atmospheric and geometric correction.

Level 2 User product – Level 2 data products are geometrically and atmospherically corrected sensor values of individual scenes. The projection system and projection datum depend on the user request. The geometric correction is calculated but not applied (however, to fit the user needs, a Level 2 HDF5 file generated by the CDPC will contain resampled and projected data layers). Addition to the CEOS definition: the following sub-levels are defined in the APEX workflow: Level 2A: geometrically indexed sensor data. Level 2B: 2A + surface reflectance (including correction for atmospheric BRDF). Level 2C: 2B + spectral albedo resulting from an additional correction for target BRDF.

Level 3 User product – Under CEOS, Level 3 products are directly measured variables mapped on uniform space-time grid scales, usually with some completeness and consistency, and are the result of combining multiple scenes to cover the user’s region of interest. Modification to this CEOS definition: here, Level 3 products are variables that are not directly measured by the instruments but are derived from these measurements (e.g. aerosol parameters, vegetation parameters, vegetation species maps, materials classification, snow parameters, water parameters and concentrations of the gases NOx, H2O, O2, O3). The geometric correction is calculated but, again, not applied. Under the CEOS definition, these products would be classified as Level 4.

Level 4 User product – Under CEOS, Level 4 data products are model output or results from analyses of lower-level data (i.e. variables that are not directly measured by the instruments, but are derived from these measurements). The product is not necessarily a map, but can also be a table or figure. Modification to this CEOS definition: here, Level 4 products are variables mapped on uniform space-time grid scales, usually with some completeness and consistency, and are the result of combining multiple scenes to cover the user’s region of interest.

METHODS This section presents an overview of the CDPC hardware system, the CDPC workflow software system, the Level0 to Level1 algorithmic components and the Level1 to Level2/3 algorithmic components. The Level1 to Level2/3 algorithmic components described here are currently being used for product generation in support of VITO’s airborne remote sensing campaigns. For the APEX instrument, the Level2/3/4 algorithmic module selection will be the responsibility of the APEX Science Centre (ii).

CDPC Hardware System Since the APEX and MEDUSA sensor systems are not operational yet, a dedicated experimental hardware system has been established to (a) simulate the operations and (b) process the incoming imagery from the recurrent hyperspectral missions.

The CDPC experimental hardware system is a dedicated cluster of 25 dual-processor machines (3.2 GHz Intel XEON) and about 45 TB of iSCSI SAN (Storage Area Network) storage (see Figure 1). The hard disk arrays and the workstation nodes are interconnected via two 1 Gbit/s iSCSI interfaces, and the partitions of the ARCHIVE system (i.e. the archive and user-order database system, Figure 1) are managed through the Linux LVM (Logical Volume Management) software, which allows on-line enlargement of the storage capacity of a logical volume (i.e. without having to shut down the CDPC software system). The preferred operating system is Linux; however, for some dedicated third-party software (e.g. Applanix POSPAC for the dGPS/IMU corrections) the Microsoft Windows operating system is necessary.

Given the expected data volume, the hard disk arrays are the system’s main potential bottleneck; therefore the archive storage is separated from the storage used during computations. Since it is bad practice to run file serving daemons and computations on one node, every sub-cluster has a dedicated file serving node, mainly serving NFS (Network File System) links and iSCSI connection(s). A fully configured file server has three Ethernet-based connections: one 1 Gbit/s iSCSI storage connection and two 1 Gbit/s Ethernet connections.

By simulating non-stop airborne missions of several weeks, and thus putting the computation nodes and file servers under continuous severe stress, it was observed that the Ethernet interfaces go down once in a while (see Figure 2). Choosing other Ethernet interface hardware vendors and/or other Linux distributions does not help: under long-lasting severe stress, the probability of an Ethernet interface failure is considerable. Therefore, on every node, the Ethernet interfaces are continuously checked by a shell script, to detect and recover from lost connections. This script is automatically activated at boot time, and stress testing showed it to be robust enough to guarantee the needed production progress.
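The watchdog deployed in the CDPC is a shell script; the sketch below restates the idea in Java, with the decision logic factored out so it can be exercised without a real network interface. The `ip link` commands are standard Linux iproute2; the interface name, the `state UP` check and the single-pass main are illustrative assumptions:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Sketch of the boot-time link watchdog idea: query an Ethernet interface
 *  and decide whether it needs to be brought back up. Illustrative only;
 *  the deployed CDPC watchdog is a shell script. */
public class LinkWatchdog {

    /** Decide from the output of `ip link show <iface>` whether the
     *  interface has lost its link and should be restarted. */
    static boolean needsRestart(String ipLinkOutput) {
        return !ipLinkOutput.contains("state UP");
    }

    /** Run `ip link show <iface>` and capture its output. */
    static String ipLinkShow(String iface) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("ip", "link", "show", iface).start();
        byte[] out = p.getInputStream().readAllBytes();
        p.waitFor();
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        String iface = args.length > 0 ? args[0] : "eth0";
        // One check cycle; the deployed script loops forever and would call
        // `ip link set <iface> up` (as root) when a restart is needed.
        try {
            String status = ipLinkShow(iface);
            System.out.println(iface + (needsRestart(status)
                    ? ": down, restart needed" : ": up"));
        } catch (IOException e) {
            System.out.println("could not query " + iface + ": " + e.getMessage());
        }
    }
}
```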

Figure 1: Layout of the experimental hardware system. DMZ is a ‘demilitarised zone’ in which all network connectivity is controlled by a firewall. Note that the computing nodes can be reassigned to other sub-cluster(s) according to the ad-hoc processing needs.

Figure 2: Ethernet failure events during continuous stress testing in March 2007.

CDPC Workflow Software System Given the volume of the expected data stream, introducing parallelism is inevitable to meet the user requirement of near real-time product availability (in the case of disaster management applications). Since the processing of hyperspectral imagery or photogrammetric camera images is very data intensive, it was decided to combine the task/data decomposition pattern with a master/worker program structure pattern to implement concurrency (iii).

Choosing third-party middleware packages to implement concurrency still requires job construction and job dependency to be programmed. Generally, third-party middleware packages include a supervisor unit. The supervisor controls the workstation farm centrally and has to perform the following tasks: (1) load balancing, (2) collecting the accept/reject decisions from the workstations, and (3) system monitoring and management. Third-party middleware software typically distributes jobs over the workstations by spreading them evenly according to the computing power of each workstation: job pushing. The supervisor is thus a complex central point. As a result, it is difficult to prevent the supervisor from becoming a bottleneck, performance-wise and reliability-wise (iv). Moreover, since there is a wealth of third-party software, it is a challenging task to pick the package that will fit our future hardware and operating system layout.

An alternative to a centrally managed job pushing system is job pulling: the workstations request jobs from a job queue at the moment they have the CPU power available to process another job. Job pulling has the following advantages over job pushing: (a) load balancing, (b) fault tolerance and (c) simplicity.

Load balancing: The load on a workstation strongly depends on the characteristics of the images being analysed, and these characteristics only become clear during the image analysis. Job pulling results in a load-balancing scheme that takes the CPU load of each workstation into account. With job pushing, this is significantly more complex: the component that sends the job typically has little information to determine the load of the workstation to which the job is pushed, and mechanisms that make the load information available to the supervisor are complex and require third-party middleware software. Job pulling inherently allows these differences in CPU time to be taken into account, and it automatically adapts to the computing power of each workstation.

Fault tolerance: Workstations that have crashed, or workstations where an Ethernet interface went down, are unable to request further jobs. The load is therefore automatically balanced over the remaining operational workstations. With job pushing, the supervisor needs a mechanism to determine whether workstations are operational or not.

Simplicity: With job pulling, the supervisor needs to know neither the CPU power of the different workstations nor the types of jobs they are executing. Nor does the supervisor need to know which workstations it is supervising, or whether they are operational.

Given the above considerations, the following decisions were made: (a) third-party middleware shall not be used to schedule the jobs, and (b) a job pulling strategy will be implemented conforming to the master/worker pattern, using Java (v). The latter encompasses (see also Figure 3):

• The production of a very lightweight, platform-independent Java “worker” (workers contain no intelligence: they simply ask the master for a job and launch the C++, F77 or IDL algorithmic programs on the client to do the processing).

• The production of a platform-independent Java “master”, which decomposes the problem at hand into parts that can be executed concurrently (according to the task/data decomposition pattern) and determines the dependencies between all these smaller parts.

• The production of a lightweight, platform-independent Java “monitoring tool”, which can communicate with all running workers and the master on a subnet over a socket. This software module is intended to present the workflow operator with a quick overview of the workflow status.
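The job-pulling interplay can be sketched in-process: a shared queue stands in for the master's job queue, and each worker pulls a new job whenever it is free, which is exactly what gives the implicit load balancing. The real CDPC workers are separate processes talking to the master over TCP sockets; all names here are illustrative:

```java
import java.util.List;
import java.util.concurrent.*;

/** In-process sketch of job pulling under the master/worker pattern.
 *  Illustrative only; the CDPC master and workers are separate processes
 *  exchanging messages over TCP sockets. */
public class JobPulling {
    /** Workers pull from a shared queue until it is drained; a slow worker
     *  simply pulls fewer jobs, which is the load-balancing property. */
    static int runJobs(List<String> jobs, int nWorkers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(jobs);
        ConcurrentLinkedQueue<String> done = new ConcurrentLinkedQueue<>();
        ExecutorService workers = Executors.newFixedThreadPool(nWorkers);
        for (int w = 0; w < nWorkers; w++) {
            workers.submit(() -> {
                String job;
                while ((job = queue.poll()) != null) {
                    done.add(job); // stand-in for launching a C++/F77/IDL program
                }
            });
        }
        workers.shutdown();
        try {
            workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.size();
    }

    public static void main(String[] args) {
        // The master decomposes the work; here: ten dummy scene jobs.
        List<String> jobs = new java.util.ArrayList<>();
        for (int i = 0; i < 10; i++) jobs.add("scene-" + i);
        System.out.println("processed " + runJobs(jobs, 3) + " jobs"); // processed 10 jobs
    }
}
```

A crashed worker simply stops pulling, so the remaining workers drain the queue — the fault-tolerance property described above falls out of the same structure.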

An overview of the functional flow is presented in Figure 4. From that figure it can be derived that there will be two major workflows: a Level1B product generation workflow (archiving workflow) and a Level2/3/4 product generation workflow (processing workflow).

The “archiving workflow” produces, in near real-time, self-descriptive Level1B image files from the incoming sensor data streams. Besides the losslessly compressed image data, these HDF5 files contain all relevant metadata, such as: sensor interior orientation, sensor exterior orientation as measured by the sensor-integrated GPS and IMU, boresight angles (offset angles between the IMU and sensor reference frames), raw and/or dGPS-corrected IMU time series, sensor spectral response curves, and orthorectified quick-looks. The production of self-descriptive Level1 files (a) allows near real-time image consultation by the end-user after data acquisition and (b) delivers a starting point for Level2/3/4 product generation (i.e. a further value-adding of the raw imagery, see ii). The dGPS/IMU component corrects the incoming IMU time series for drifts by using base-station GPS time series.

Figure 3: Message passing (over reliable TCP/IP sockets) distributed software system, developed in Java according to the Master/Worker and Task/Data decomposition patterns for parallelism and applied in the Archiving and Processing workflows.

Figure 4: CDPC functional flow diagram.

The dGPS/IMU software system can be used for automatic or manual processing. In Flanders, VITO has a direct link to the FLEPOS base station FTP server (FLEPOS is composed of 37 fixed base stations, with a mean distance between the stations of 17 km, http://www.agiv.be/gis/diensten/flepos/). For disaster management applications, the incoming imagery can first be archived using the raw IMU/GPS data, and when the base station data becomes available (latency of approximately 1 hour) the already archived files can be updated. During missions relying on ad-hoc installed GPS base stations, the dGPS/IMU correction is executed manually by an operator before the Level1B file generation processes can start. If the imagery has a resolution in the order of decimetres, an IMU/GPS solution is not always sufficient; this then requires an additional (and currently rather time-consuming) processing step, namely a semi-automatic block bundle adjustment using commercial software.

The “processing workflow” is capable of generating user-configurable products. The user can browse the image database using a WWW interface, which can also be used to define and submit actions on the selected images. The WWW user interface then adds new records to the relevant tables of the database system. The master or masters (a master can be configured to only listen to orders submitted by certain users or user groups), which constantly check the database system for new incoming product orders, adjust their job queues when they detect new orders. The workers installed on the working nodes pull jobs from the master queue and return the process return value to the master.
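The master's order-polling behaviour can be sketched as follows, with a trivial in-memory store standing in for the database tables the WWW interface writes to; the interface, class and job names are hypothetical, and the two-job decomposition per order is only an illustration:

```java
import java.util.*;

/** Sketch of the order-polling loop: a master repeatedly checks an order
 *  store for newly submitted product orders and turns them into queued jobs.
 *  Illustrative names; the real CDPC master polls a relational database. */
public class OrderPollingMaster {
    /** Stand-in for the database table the WWW interface inserts into. */
    interface OrderStore { List<String> fetchNewOrders(); }

    private final OrderStore store;
    private final Deque<String> jobQueue = new ArrayDeque<>();

    OrderPollingMaster(OrderStore store) { this.store = store; }

    /** One polling cycle: new orders are decomposed into jobs and queued.
     *  Returns the queue size, i.e. the jobs workers can now pull. */
    int pollOnce() {
        for (String order : store.fetchNewOrders()) {
            // The real master applies task/data decomposition with dependencies;
            // here each order becomes two illustrative jobs.
            jobQueue.add(order + "/orthorectify");
            jobQueue.add(order + "/atmospheric-correction");
        }
        return jobQueue.size();
    }

    public static void main(String[] args) {
        List<String> pending = new ArrayList<>(List.of("order-1", "order-2"));
        OrderPollingMaster master = new OrderPollingMaster(() -> {
            List<String> batch = new ArrayList<>(pending);
            pending.clear(); // orders are consumed once detected
            return batch;
        });
        System.out.println("queued jobs: " + master.pollOnce()); // queued jobs: 4
    }
}
```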

The job queue and the job dependencies of the processing workflow are worked out according to the principles described in (ii), meaning (a) that all algorithms perform their work on the raw scene geometry, with image resampling to a certain projection system and projection datum done at the very end, and (b) that the layout of the processing scheme can be completely configured by the workflow operator.

Level2/3/4 products will only be generated on-demand and not on a continuous basis (except in the framework of dedicated service contracts).

Level0 to Level1 Algorithmic Components – dGPS/IMU correction If configured in “automatic” mode, the dGPS processing workflow takes already archived files as input and updates the raw exterior orientation parameters stored within these archive files. After the dGPS correction, the archive files are labelled as full Level 1B. The workflow is designed conforming to the master/worker pattern and permanently scans the database for (a) Level1 files that are marked as still needing a dGPS correction and (b) GPS base station files. When relevant data sets are found, the raw IMU time series is corrected towards a so-called “Smoothed Best Estimated Trajectory” (SBET) file. The algorithmic core of the dGPS workflow is constituted by the Applanix POSPAC software package.

Level0 to Level1 Algorithmic Components – Sensor Radiometric and Geometric Calibration The sensor radiometric and geometric calibration encompasses the conversion from digital numbers to at-sensor radiances (in physical units, such as W/m²/sr/µm) and the correction for smile and frown effects.
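The digital-number-to-radiance step is conventionally a linear per-band transform, L = gain · DN + offset. A minimal sketch with placeholder coefficients; real calibration uses per-band coefficients from the instrument's calibration files, which are not reproduced here:

```java
/** Minimal sketch of per-band radiometric calibration: raw digital numbers
 *  are converted to at-sensor radiance (W/m²/sr/µm) with the usual linear
 *  gain/offset model. Coefficient values below are placeholders, not real
 *  calibration data for any of the sensors discussed. */
public class RadiometricCalibration {
    /** L = gain * DN + offset, applied sample by sample for one band. */
    public static double[] toRadiance(int[] dn, double gain, double offset) {
        double[] radiance = new double[dn.length];
        for (int i = 0; i < dn.length; i++) {
            radiance[i] = gain * dn[i] + offset;
        }
        return radiance;
    }

    public static void main(String[] args) {
        // Placeholder gain/offset for a single hypothetical band.
        double[] l = toRadiance(new int[]{100, 200, 400}, 0.05, 1.2);
        System.out.printf(java.util.Locale.ROOT,
                "%.2f %.2f %.2f%n", l[0], l[1], l[2]); // prints 6.20 11.20 21.20
    }
}
```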

If the airborne missions are carried out with sensor(s) owned by third-party subcontractors, there is a serious dependency on the calibration efforts of these subcontractors. In our experience, the incoming data is often contaminated by spectral shifts in central wavelengths: spectral instrument changes accumulate during flight due to changes in instrument temperature and pressure, and to misalignment induced by mechanical vibrations. With hyperspectral sensors, which have narrow bandwidths, small shifts of the central wavelength of those channels located in atmospheric absorption regions will lead to errors in the retrieved reflectance spectra. These errors are visible as spikes and dips near the atmospheric gas absorption features. An example is given in Figure 5.

For the APEX instrument, with its hundreds of very narrow spectral bands, the usability of the data strongly depends on the calibration. Reference is made to (vi) for a full description of the APEX calibration strategy. RSL has made dedicated APEX image calibration software, which will be integrated in the operational VITO processing chains during the AIT (Assembly, Integration & Testing) phase of the APEX contract.

Figure 5: CASI reflectance spectra of sand pixels: (left) original, with a wrong wavelength calibration file: reflectance is too high, and changing the aerosol type and/or visibility did not solve the problem; (right) after recalibration of the sensor by the subcontractor: the reflectance spectra in the blue region are correct and fewer spikes are observed near the absorption bands (although some spikes are still visible near the O2 band).

Level1 to Level2/3 Algorithmic Components – Direct Georeferencing Direct georeferencing (DG) is the direct measurement of the position (by means of a GPS) and orientation parameters (by means of an IMU) of a sensor; it is a technique increasingly used in airborne mapping applications because of its economical advantages (vii). For the validation of the DG methodology, ADS40 (pushbroom line scanner), CASI2 (pushbroom line scanner), CASI550 (pushbroom line scanner), AHS160 (whiskbroom point scanner), HYMAP (whiskbroom point scanner), DSS (frame CCD camera) and UltraCamD (frame CCD camera) images were used.

Figure 6 presents a graphical overview of the output of the DG orthorectification module integrated in the CDPC. The output grid features 8 information layers, containing all pixel-dependent position and viewing geometry parameters of importance for the atmospheric correction of the observed radiance. All these data layers are stored in the Level2/3 HDF5 product file.

Figure 6: The eight output layers of the CDPC direct georeferencing module.

Level1 to Level2/3 Algorithmic Components – Atmospheric Correction The objective of any atmospheric correction is the extraction of physical earth surface parameters such as reflectance, emissivity and temperature (viii). The solar radiation at the top of the atmosphere is known, the at-sensor radiance is measured, the atmospheric influence on the radiances is traced back by a radiative transfer code (RTC), and finally the surface radiance of each pixel on the earth surface is calculated. Within the CDPC, the MODTRAN4 RTC model (ix) is integrated. With respect to the physical and technical details of the procedure, reference can be made to (x), (ix) and (xi). The latter references contain an elaborate description of how an RTC can be used to calculate correction coefficients that determine the at-surface reflected and emitted radiation for land, and the sub-surface reflected radiation for water.

The integrated atmospheric correction technology is equivalent to the ATCOR4 theory (viii) and is internally identified by the term WATCOR. WATCOR contains an extension of the ATCOR4 model, which features an additional correction procedure over salt and fresh water bodies. Full reference can be made to (xi) for the technical details concerning this feature.

After the geometric correction, the orientation of every pixel relative to the sun and the sensor position is known (Figure 6). Based on these orientation parameters and optional in-situ measured atmospheric parameters, MODTRAN4 can be configured and executed, resulting in the information necessary to apply the correction. If spectral measurements of clearly identifiable targets are available, atmospherically corrected imagery can be created iteratively by the CDPC operators. In this iterative process, the MODTRAN4 parameters describing the atmospheric composition (e.g. visibility, water vapour content and aerosol type) are altered to better describe the atmosphere, until the calculated target reflectance is in good accordance with the measured reflectance.
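The operator's iterative tuning loop can be sketched as a root search on one atmospheric parameter. This is a minimal illustration, not the CDPC implementation: the `retrieved_reflectance` function is a stand-in for a full MODTRAN4-based correction, and its functional form is invented purely to keep the example self-contained and monotone.

```python
def retrieved_reflectance(at_sensor_radiance, visibility):
    """Stand-in for the RTC-based correction: maps at-sensor radiance to
    surface reflectance for a given visibility (km). A real run would
    configure and execute MODTRAN4; this simple monotone model only
    mimics that the haze term shrinks as visibility grows."""
    path_radiance = 15.0 / visibility
    gain = 0.004 * (1.0 - 1.0 / visibility)
    return gain * (at_sensor_radiance - path_radiance)

def tune_visibility(radiance, measured_reflectance, lo=5.0, hi=120.0, tol=1e-4):
    """Bisect on visibility until the retrieved target reflectance matches
    the in-situ measurement (the operator's iteration, automated)."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if retrieved_reflectance(radiance, mid) < measured_reflectance:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

vis = tune_visibility(radiance=100.0, measured_reflectance=0.35)
print(round(vis, 2))
```

In practice the operator may also vary water vapour content and aerosol type, so the search is usually interactive rather than a one-dimensional bisection.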

The geometry is pixel-dependent. In principle the atmospheric correction should be performed for each pixel independently, but this is not practical given the amount of computing time required. As a workaround, one often uses pre-calculated look-up tables (LUT): these are produced by running the RTC for a discrete set of samples in the geometry space and saved to disk. Later in the atmospheric correction the LUT is combined with an interpolation technique. LUT values depend on the atmospheric state and are sensor-dependent due to the specific spectral bands. Hence, the pre-calculated LUT approach is non-generic. Therefore, the atmospheric correction in the CDPC is equipped with a direct interpolation method: for each image and each spectral band, a band-specific and image-geometry-specific LUT is created in memory during the atmospheric correction. A number of samples is taken from the relevant geometry space, a number of RTC runs are executed for these samples just before the atmospheric correction, and the correction is then performed by interpolating the RTC results in the geometry space.
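The sample-then-interpolate idea can be sketched in a few lines. This is a deliberately reduced illustration, assuming a one-dimensional geometry space (view zenith only) and a toy stand-in for the RTC; the real in-memory LUT spans several geometry dimensions per band.

```python
import numpy as np

def rtc_path_radiance(view_zenith_deg):
    """Stand-in for one RTC run: path radiance as a function of view zenith.
    Each call is expensive in reality, which is why only a few samples of
    the geometry space are evaluated."""
    return 12.0 / np.cos(np.radians(view_zenith_deg))

# 1) Sample the geometry space relevant to this image (few RTC runs).
samples = np.linspace(0.0, 50.0, 6)                 # view zenith, degrees
lut = np.array([rtc_path_radiance(z) for z in samples])

# 2) Correct every pixel by interpolating in the in-memory, band- and
#    image-specific LUT instead of re-running the RTC per pixel.
pixel_zenith = np.array([3.7, 12.0, 27.5, 49.9])
path = np.interp(pixel_zenith, samples, lut)
print(np.round(path, 2))
```

Because the LUT is rebuilt per image and per band, the sampling range can be tailored to exactly the geometry that image actually contains, which is what makes the approach sensor-generic.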

Hence, the CDPC does not use the traditional approach of a disk-stored LUT, but performs the MODTRAN4 configuration “on the fly”: during the image processing, MODTRAN4 configuration files are created, the needed parameters are determined from the given image geometry and possible in-situ measurements, the MODTRAN4 runs are performed and finally the MODTRAN4 output is used to calculate the atmospheric correction using the approach of (viii) and (xi).

To enable an accurate correction in the water absorption bands, the methodology of (xii) is integrated. In the absence of in-situ observations, this algorithm determines the columnar water vapour concentration, which is then used in the subsequent atmospheric correction. If this algorithm cannot be applied because of the spectral configuration of the sensor and no in-situ observations are available, a default value of 1.5 g/cm² is used. Of course, the operator can change this default value when the image processing order is composed in the WWW interface.

The visibility parameter is crucial in the atmospheric correction of visible and near-infrared spectral bands. For the automatic determination of the visibility, the methodology of (xiii) can be used. However, this algorithm does not always converge to a solution; if no pixels in the image yield a valid visibility value and no in-situ measurements are available, the default value of 23.0 km is used. Again, the operator can change this default value during the image ordering process.
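The fallback logic for both parameters follows the same pattern and can be written as a small helper. One plausible reading of the paper's order of preference is sketched below (retrieval or measurement first, then the operator-adjustable default); the function name and exact precedence are illustrative assumptions.

```python
def resolve_parameter(image_estimate, in_situ, default):
    """Return the first available value in order of preference:
    image-based retrieval, in-situ measurement, operator default."""
    for value in (image_estimate, in_situ, default):
        if value is not None:
            return value

# Defaults as stated in the paper; the retrieval algorithms may fail
# (no convergence, or an unsuitable spectral configuration).
water_vapour = resolve_parameter(image_estimate=None, in_situ=None,
                                 default=1.5)    # g/cm^2
visibility = resolve_parameter(image_estimate=None, in_situ=None,
                               default=23.0)     # km
print(water_vapour, visibility)  # 1.5 23.0
```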

The ATCOR4 LUT has a limited number of dimensions (generally six): sensor elevation, ground elevation, solar zenith, water vapour, visibility and aerosol type. Since MODTRAN4 has about 176 tuneable parameters, the major advantage of running MODTRAN4 ‘on the fly’ is that in principle all 176 parameters can be customized. Consequently, all MODTRAN4 functionality becomes available to the research expert operating the CDPC, who can submit custom parameters through the WWW interface to the image database system.

Within the Level1B to Level2/3/4 processing workflow, the user can activate or deactivate the options to take the viewing geometry (view azimuth, solar azimuth, view zenith, solar zenith) into account. Viewing the same object from various angles may lead to significant variations of the path-scattered radiance component, an effect known as atmospheric BRDF (xiv). An example is shown in Figure 7.

Within the CDPC, a simple nadir normalization algorithm is available. For pushbroom line scanners and whiskbroom point scanners, this algorithm first calculates the column means for all or a selected set of rows in the at-surface reflectance image of the spectral band to correct. Then, a second-order polynomial trend line is fitted using the column number as independent variable and the column means as dependent variable. The correction factors can then be determined according to the following equation:

CF(c) = PolyR(c_nadir) / PolyR(c)

where CF is the resulting correction factor as a function of the column number c, PolyR is the column-averaged mean as determined by the fitted polynomial trend line and c_nadir is the column having the smallest view angle. The nadir-normalized reflectance value for every column can then be calculated using:


R’(r,c) = R(r,c) · CF(c)

where R’(r,c) is the normalized reflectance at row r and column c and R(r,c) is the original reflectance value.
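The procedure above maps directly onto a few lines of array code. The sketch below assumes the correction factor normalizes each column to the nadir column's polynomial value; the synthetic test band is invented to make the effect visible.

```python
import numpy as np

def nadir_normalize(refl, c_nadir):
    """Nadir normalization of one spectral band: fit a 2nd-order polynomial
    to the column means, then apply CF(c) = PolyR(c_nadir) / PolyR(c)."""
    cols = np.arange(refl.shape[1])
    col_means = refl.mean(axis=0)        # all rows; a row subset also works
    coeffs = np.polyfit(cols, col_means, deg=2)
    cf = np.polyval(coeffs, c_nadir) / np.polyval(coeffs, cols)
    return refl * cf                     # R'(r,c) = R(r,c) * CF(c)

# Synthetic band with a quadratic across-track brightness gradient.
rows, cols, c_nadir = 50, 101, 50
trend = 1.0 + 1e-4 * (np.arange(cols) - c_nadir) ** 2
refl = 0.3 * np.tile(trend, (rows, 1))
flat = nadir_normalize(refl, c_nadir)
print(np.round(np.ptp(flat.mean(axis=0)), 6))  # prints 0.0: trend removed
```

Because the correction is a single per-column scale factor, it removes the across-track brightness trend but, as the paper notes, it is not a per-target BRDF correction.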

However, although this nadir normalization algorithm is useful for enhancing the viewing quality of an RGB bitmap, it can hardly be considered a full-fledged target BRDF correction. State-of-the-art BRDF correction encompasses: (a) LULC (Land Use Land Cover) classification and (b) combining the classification map with the class-specific BRDF functions and the viewing angles (view zenith, solar zenith, view azimuth and solar azimuth). This algorithmic feature is currently being developed for integration in the CDPC algorithmic software.

Figure 7: Atmospheric BRDF effects in an AHS160 image (050618_HYECD_P01MX) with a solar zenith of 40.7°, a view zenith ranging from 3.7° up to 49.9°, a solar azimuth of 242.5° and a relative azimuth ranging from -148.7° up to 27.6°. The dependent axis displays the ratio of the image column-averaged values of the viewing-geometry-independent atmospheric correction (Nadir) over the viewing-geometry-dependent correction (Off-nadir).

Level2/3 to Level4 Algorithmic Components

The workflow operator/user can select, from within the WWW interface to the image database, a custom projection system, projection datum and image resampling method (nearest neighbour, bilinear or cubic interpolation). Currently, no algorithms are integrated in the experimental CDPC allowing for the automatic and sensor-generic generation of image composites. At the moment of writing, composite generation modules are still under development.

RESULTS

Archiving Workflow Stress Test

To test the near real-time product delivery capabilities of the hardware and software system, two stress tests were performed:

• A two-week non-stop UAV mission with a PAL video camera was simulated with the following properties: 25 frames/sec, 720 x 576 pixels/frame, 30 Mbit/s MPEG-compressed data stream. The archiving actions were: Level 1B HDF5 generation and orthorectification of 1 frame/sec at full resolution.

• A two-week non-stop UAV mission was simulated with a medium-format photogrammetric camera (i.e. imagery from an Applanix DSS camera) with the following properties: 1 frame per 5 seconds (throughput of 80 Mbit/s), 4092 x 4077 pixels/frame, 25 cm ground resolution. The archiving actions were: Level 1B HDF5 generation and orthorectification at 1 meter resolution.

For each test, one FTP node constantly pushed image data and image metadata towards the input folder system of the archiving workflow, with one master node (constantly checking for new incoming data sets) and five worker nodes (which ask the master for data sets that are ready to archive). Upon the successful generation of the HDF5 archive file, the database system containing the list of products was updated, and the imagery then became available to the user through a WWW interface to the image database.
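The master/worker pattern described here can be reduced to a standard work-queue sketch. The code below is a toy illustration with invented names: a shared queue stands in for the master's folder scanning, and appending a filename stands in for the actual HDF5 creation.

```python
import queue
import threading

incoming = queue.Queue()   # stands in for the master's input folder scan
archived = []              # stands in for the product database
lock = threading.Lock()

def worker():
    """Worker node: pull data sets from the master, archive, repeat."""
    while True:
        dataset = incoming.get()
        if dataset is None:                      # sentinel: shut down
            break
        with lock:
            archived.append(f"{dataset}.h5")     # stand-in for HDF5 creation

threads = [threading.Thread(target=worker) for _ in range(5)]  # 5 workers
for t in threads:
    t.start()
for name in ["frame_0001", "frame_0002", "frame_0003"]:
    incoming.put(name)                           # the FTP push side
for _ in threads:
    incoming.put(None)                           # one sentinel per worker
for t in threads:
    t.join()
print(sorted(archived))
```

In the real system the "queue" is a folder structure polled by the master, and workers are separate machines, but the flow of data sets is the same.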

The latency between sending and the actual product availability on the Internet was in the order of minutes. The only problem that occurred during these system stress tests was that once in a while an Ethernet interface went down (see Figure 2). However, the software and hardware systems were designed to automatically recover from these failures.

Upon failure of an Ethernet interface, the running processes on the affected machines enter a “wait” status and resume their activity when the interface comes back up. As such, an Ethernet interface failure event does not automatically generate a job failure. Although the master software can be configured to reschedule a failed job, the masters are currently not configured this way; this allows an operator to fully analyse the probability and the types of job failures of the workflow in a simulated operational setting. Further stress testing will be done to optimise the strategies for job rescheduling after job failure. During both tests, no more than 20 incoming images were placed in the “failed folder structure” (i.e. the folder to which the master moves the files upon unsuccessful HDF5 creation). An operator simply moves these file sets back to the input folder system, which makes them available again for archiving.
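The wait-and-resume behaviour can be sketched as a small wrapper around a job. All names below are illustrative assumptions; the point is only that a down link makes the job poll and wait rather than fail.

```python
import time

def run_with_wait(action, is_link_up, poll_seconds=0.01, max_polls=1000):
    """Run `action`, but if the network link is down, wait and poll until
    it comes back up instead of raising a job failure immediately."""
    polls = 0
    while not is_link_up():
        polls += 1
        if polls > max_polls:
            raise RuntimeError("link did not come back")
        time.sleep(poll_seconds)
    return action()

# Simulate an interface that is down for the first three polls.
state = {"checks": 0}
def link_up():
    state["checks"] += 1
    return state["checks"] > 3

result = run_with_wait(lambda: "job done", link_up)
print(result)  # job done
```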

During these stress tests it was also analysed how the system reacts to an on-line enlargement of the archive LVM-managed partitions. The on-line enlargement by 1 TB caused a file system resizing process on the file server of about 1 hour. During this process, the bandwidth to the file system decreased and the workers could not keep up with the incoming data rate. However, after 4 hours, the backlog was cleared.

Processing Workflow Stress Test

Table 2 presents the stress test result of the Level1 to Level2/3/4 processing workflow based on the full AHS160 dataset from the summer 2005 campaign funded by BELSPO. Using 5 dual-processor worker nodes, the total necessary processing time was about 1 day and 2 hours.

Since all algorithms are rather I/O bound, the worker software module installed on each of the worker nodes was configured to allow at most three jobs to run concurrently, in order to balance the disk I/O. As such, no more than 15 jobs could run concurrently over the entire cluster used in this experiment.

Note that the orthorectification of one image is subdivided into more than one job (data decomposition pattern; in this test, 1 job per 500 scan lines). Upon success of all orthorectification jobs, the results are appended into one output raster. Note also the significant difference between the maximum needed processing time (22:14) and the mean needed processing time (03:18). During this stress test, the Ethernet interfaces went down several times (but could always be brought back up without operator interaction), causing considerable delays for some jobs, but none of the jobs failed.
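The data decomposition step is simple to state precisely. A minimal sketch, assuming half-open (start, end) line ranges with the last job absorbing the remainder:

```python
def scanline_jobs(total_lines, lines_per_job=500):
    """Data-decomposition pattern used for orthorectification: split one
    image into jobs of at most `lines_per_job` scan lines, returned as
    half-open (start, end) ranges."""
    return [(start, min(start + lines_per_job, total_lines))
            for start in range(0, total_lines, lines_per_job)]

jobs = scanline_jobs(5431)  # a hypothetical image of 5431 scan lines
print(len(jobs), jobs[0], jobs[-1])  # 11 (0, 500) (5000, 5431)
```

The Table 2 figure of 590 orthorectification jobs for 53 images totalling 287,872 scan lines is roughly consistent with this scheme, since each image rounds up to a whole number of jobs.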


Table 2: Stress test timing result (in hours:minutes:seconds) of the Level1B to Level2/3/4 processing workflow. The processing cluster was composed of 5 dual-processor worker nodes (3.2 GHz Intel XEON). The dataset was composed of 53 AHS160 images with 63 spectral bands in the VNIR and SWIR. There are 750 columns in a scan line and the total dataset equals 287872 scan lines.

Job type                                    Quantity   Mean       Maximum    Total
File copies                                      159   00:05:06   00:22:53   13:33:28
Extract metadata from Level1B HDF5 file         3604   < 1 sec    00:00:10   00:35:44
Extract image data from Level1B HDF5 file       3339   00:00:08   00:03:31   07:57:05
Orthorectification (using LIDAR DEMs)            590   00:03:18   00:22:14   32:30:04
Append binary files                              106   00:00:02   00:00:27   00:03:34
Append binary GIS grids                           53   00:01:20   00:10:13   01:11:19
Image-based visibility extraction                 53   00:20:15   01:07:21   17:53:47
Image-based water vapour extraction               53   01:01:25   01:48:33   54:15:39
Atmospheric correction                          3339   00:01:29   00:10:55   83:23:32
Image resampling                                3339   00:00:35   00:11:28   32:31:43
Level2 HDF5 file creation                         53   00:19:06   00:42:25   16:52:56
Total                                          14688                         10 days, 20 hours, 49 minutes
Total using 5 dual-processor nodes (3.2 GHz XEON)                            1 day, 2 hours, 5 minutes
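As a quick consistency check on Table 2 (a rough calculation, not from the paper), the ratio of the serial total to the achieved wall-clock time gives the effective speedup:

```python
# Serial total vs. wall-clock time on 5 dual-processor nodes (10 CPUs).
serial_minutes = 10 * 24 * 60 + 20 * 60 + 49   # 10 days, 20 h, 49 min
wall_minutes = 1 * 24 * 60 + 2 * 60 + 5        # 1 day, 2 h, 5 min
speedup = serial_minutes / wall_minutes
print(round(speedup, 1))  # 10.0
```

A speedup of about 10 on 10 CPUs suggests the I/O-balanced limit of 15 concurrent jobs kept the cluster essentially fully utilised.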

CONCLUSIONS

The intention of this paper was to give a comprehensive overview of all major aspects involved in the development process towards operational image processing workflows for airborne remote sensing. Especially for hyperspectral sensors, users have very specific requirements with respect to the algorithms used in the generation process of their products. For these sensors, it is of utmost importance that the processing workflows allow users to tune the modelling parameters based on their in-situ measurements or field knowledge. With the software and hardware system described in this paper, VITO has demonstrated that it has a production chain which allows for the on-demand generation of user-customisable products. Furthermore, since the workflow software system allows for multiple masters (the masters can be configured to only handle orders submitted by specific users or user groups), one can separate the standard operational activities from the scientific experimentation. Both the R&D scientists and the production operators have the same look-and-feel and the same image database at hand to select images from and to define processing actions on these images. This significantly enhances the scientific experimentation possibilities (which will be needed to fully exploit the information delivered by the APEX sensor), as a scientist can directly experiment with newly developed algorithms in a semi-operational environment. This feature will enhance the productivity with respect to the integration of R&D outputs in an operational environment where the fault tolerance has to be restricted to an absolute minimum.

The experimental CDPC can handle all currently available digital aerial sensors, such as hyperspectral pushbrooms or whiskbrooms and photogrammetric frame cameras. The CDPC depends heavily on the DG method for its georeferencing. If good quality LIDAR DEMs or DSMs are available and the image resolution is in the order of meter(s), pixel or sub-pixel accuracy can be obtained with dGPS-corrected IMU time series.

However, if the resolution of the imagery is in the order of decimetres, DG is currently not always sufficient to obtain sub-pixel spatial accuracy. Block bundle adjustment, automatic tie-point generation algorithms and surface model generation methods are needed to generate products of photogrammetric quality that can be used in high-precision mapping applications. It is theoretically possible to fully automate these procedures, and the MEDUSA project aims to deliver the sensor allowing for further experimentation in the automated photogrammetric production workflow. This will be one of the major development actions planned in the near future.

REFERENCES

i Van Achteren T, B Delauré & J Everaerts, 2006. Instrument Design for the PEGASUS HALE UAV Payload. In: 2nd International Workshop "The Future of Remote Sensing", (ISPRS Inter-Commission Working Group I/V Autonomous Navigation, Antwerp, Belgium).

ii Schläpfer D, J Nieke, F Dell'Endice, A Hueni, J Biesemans, K Meuleman & K I Itten, 2007. Optimizing the workflow for APEX Level2/3 processing. In: 5th EARSeL Workshop on Imaging Spectroscopy, (Bruges, Belgium).

iii Mattson T G, B A Sanders & B L Massingill, 2004. Patterns for Parallel Programming (Addison-Wesley) 355p.

iv Boosten M, 2003. Fine-Grain Parallel Processing on a Commodity Platform: a Solution for the ATLAS Second Level Trigger (Ph.D. Thesis, Eindhoven University of Technology) 243p.

v Biesemans J & J Everaerts, 2006. Image Processing Workflow for the PEGASUS HALE UAV Payload. In: 2nd International Workshop "The Future of Remote Sensing", (ISPRS Inter-Commission Working Group I/V Autonomous Navigation, Antwerp, Belgium).

vi Performance and Calibration requirements for APEX, ESA/ESTEC contract no 14906/00/NL/DC, 2000- 2001.

vii Honkavaara E, 2004. Calibration in Direct Georeferencing: Theoretical Considerations and Practical Results. Photogrammetric Engineering and Remote Sensing, 70: 1207-1208.

viii Richter R, 2004. Atmospheric/Topographic Correction for Airborne Imagery. ATCOR-4 User Guide Version 3.1 (DLR, Wessling, Germany) 75p.

ix Berk A, G P Anderson, P K Acharya, J H Chetwynd, L S Bernstein, E P Shettle, M W Matthew & S M Adler-Golden, 1999. MODTRAN4 User's Manual (Air Force Research Laboratory, Hanscom, USA) 93p.

x De Haan J F & J M M Kokke, 1996. Remote sensing algorithm development toolkit I: Operationalization of atmospheric correction methods for tidal and inland waters (Netherlands Remote Sensing Board (BCRS) publication, Rijkswaterstaat Survey Dept. Technical Report) 91p.

xi The T H, 2004. Handleiding WATCOR versie 2. (Metropolis Software, Technical Consultancy Report for VITO) 31p.

xii Rodger A & M J Lynch, 2001. Determining atmospheric column water vapour in the 0.4-2.5 µm spectral region. In: Proceedings of the AVIRIS Workshop 2001 (Pasadena, California, USA).

xiii Richter R, D Schläpfer & A Müller, 2006. An automatic atmospheric correction algorithm for visible/NIR imagery. International Journal of Remote Sensing, 27(10): 2077-2085.

xiv Schläpfer D & J Nieke, 2005. Operational Simulation of At Sensor Radiance Sensitivity using the MODO/MODTRAN4 Environment. In: 4th EARSeL Workshop on Imaging Spectroscopy, edited by B Zagajewski, M Sobczak & M Wrzesień (EARSeL, Warsaw, Poland), 561-569.