spie proceedings [spie photonics asia - beijing, china (monday 5 november 2012)] optoelectronic...

An illumination and affine invariant descriptor for aerial image registration

Zhaoxia Liu * a, Yaxuan Wang a, Yu Jing a, Jing Zhao a and Jingjing Wang b

a School of Software, Dalian University of Foreign Languages, Dalian, P.R.C, 116044
b Vocational Education Department, Liaoning Police Academy, Dalian, P.R.C, 116036

ABSTRACT

An illumination and affine invariant descriptor is proposed for registering aerial images with large illumination changes and affine transformations, low overlapping areas, monotonous backgrounds or similar features. First, triangle regions are detected from the K-nearest neighbors (K-NN) graph built on the initial matches produced by the Scale-Invariant Feature Transform (SIFT). To improve accuracy, region growth is applied to enlarge small and slender triangles. An illumination and affine invariant descriptor is then defined to describe the triangle regions and measure their similarity. The descriptor, named IIMSA, combines Multiscale Autoconvolution (MSA) with multiscale retinex (MSR). Its performance is evaluated on optical aerial images, and the experimental results demonstrate that IIMSA is more distinctive than MSA and SIFT.

Keywords: feature descriptor, illumination and affine invariant, feature matching

1. INTRODUCTION

Image registration has been widely applied in many fields such as remote sensing, medical image analysis, cartography, computer vision and pattern recognition [1]. One of its critical steps is finding correspondences by describing features. Although many studies have introduced descriptors for establishing reliable matches, mismatches remain, especially for images that contain repetitive patterns.

Krystian Mikolajczyk and Cordelia Schmid compared the performance of several descriptors and concluded that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best [2]. However, although SIFT performed best in that comparison, it does not work well on aerial images captured over the sea, which are monotonous and contain similar features in the reference and sensed images.

To resolve the ambiguity in feature point matching, it is necessary to take the information of the region around the feature points into consideration. Rahtu et al. proposed an affine invariant descriptor called Multiscale Autoconvolution (MSA), which combines standard point-based invariants with probabilistic ideas to describe regions [3]. However, MSA does not sufficiently account for nonuniform illumination distortion.

Considering these pros and cons, we propose an illumination and affine invariant descriptor for describing the triangle regions obtained from the K-nearest neighbors (K-NN) graph. Some of these triangles are small and slender, which leads to interpolation errors. To improve accuracy, region growth is adopted. An illumination and affine invariant descriptor is then defined to describe the triangle regions and measure their similarity. The descriptor combines MSA and multiscale retinex (MSR). Illumination invariance is obtained by using MSR to remove the illuminance of the image; the resulting reflectance image replaces the original image. The MSR algorithm is an extension of the single-scale retinex algorithm proposed by Jobson et al. [4], which is based on a model of the lightness and color perception of human vision. Experiments demonstrate that the new descriptor is distinctive for aerial images with low overlapping areas and monotonous backgrounds or similar features.

Optoelectronic Imaging and Multimedia Technology II, edited by Tsutomu Shimura, Guangyu Xu, Linmi Tao, Jesse Zheng, Proc. of SPIE Vol. 8558, 85580Z · © 2012 SPIE

CCC code: 0277-786/12/$18 · doi: 10.1117/12.999387

2. AFFINE REGION GENERATION

A graph structure can be constructed by Delaunay triangulation. However, the initial triangulation graphs are similar only if the nonrigid transform is slight, so Delaunay triangulation is not suitable for large affine transformations. In contrast, the K-NN graph and its corresponding adjacent graph are invariant to affine transformation [5]. In this paper, they are used to describe the structures of the points. Suppose two point sets from two affinely transformed images are denoted by $P = \{p_1, p_2, \ldots, p_i, \ldots, p_n\}$ and $Q = \{q_1, q_2, \ldots, q_i, \ldots, q_n\}$ respectively, where $p_i$ and $q_i$ are the initial corresponding points.

Two graphs $G_P$ and $G'_Q$ are established according to the method described in [5] and shown in Figure 1, where $G_P$ is the K-NN graph constructed from point set $P$ and $G'_Q$ is the adjacent graph corresponding to $G_P$, constructed from point set $Q$. The two graphs are invariant to affine transformation.

Figure 1 Initial feature points and their graphs.

The feature points $p_i$ and $q_i$ are local extreme points. The intensity distribution of the regions around these points is uneven and rich in information, and so are the regions inside the triangles formed by the points. Consequently, the triangle regions in the two graphs are used to evaluate the similarity of the corresponding points. They are considered affine regions since the graph structures are invariant to affine transformation.
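The construction of triangle regions from the two point sets can be sketched as follows. This is a minimal illustration and not the exact procedure of [5]: it assumes that the K-NN graph is treated as undirected, that the corresponding adjacent graph simply reuses the edges of the K-NN graph of P on the points of Q with the same indices, and that a triangle region is any triple of mutually connected points. The function names `knn_edges`, `triangles` and `triangle_pairs` are hypothetical.

```python
import math
from itertools import combinations


def knn_edges(points, k):
    """Return the undirected edge set of the K-NN graph of `points`."""
    edges = set()
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to p, keep the k nearest.
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def triangles(edges, n):
    """List index triples (i < j < k) whose three sides are all graph edges."""
    return [t for t in combinations(range(n), 3)
            if all(pair in edges for pair in combinations(t, 2))]


def triangle_pairs(P, Q, k):
    """Pair each triangle of the K-NN graph of P with the triangle formed by
    the same indices in Q (assumed form of the corresponding adjacent graph)."""
    tris = triangles(knn_edges(P, k), len(P))
    return [([P[i] for i in t], [Q[i] for i in t]) for t in tris]
```

Because the edges of the second graph are copied by index rather than recomputed from Q, a wrong initial match shows up as a pair of triangles whose contents differ, which is what the region descriptor is later asked to detect.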

3. REGION GROWTH

Small and slender triangles inevitably occur in real applications. As Figure 2 illustrates, pixels on the edges of such triangles may be interpolated inaccurately, which makes the intensity distributions of corresponding triangles in different images quite different. To improve accuracy, three means of region growth are designed, all of which are invariant to affine transformation. Since the regions covered by triangles formed by the feature points are uneven and rich in information, region growth also increases the difference between triangle regions that do not match, which is a further advantage. In addition, the errors of matched triangles are reduced after the triangles are grown.

Three ways of region growth are used for three types of special triangles, as shown in Figure 3. The corresponding triangles in the reference image and the sensed image grow at the same time and in the same proportion, as follows.

First, as shown in Figure 3 (a), if two of the three angles of a triangle are smaller than 15 degrees, the center line starting from the largest angle is extended at the speed of $(1/2)^n$, where $n$ is the smallest integer that keeps the end of the line inside the image after region growth. If both angles are smaller than 10 degrees, the three points are considered collinear and the RAMI will not be calculated, so the triangle does not need to be expanded.

Second, if only one angle is smaller than 15 degrees, the shortest edge is extended, as shown in Figure 3 (b).

Third, if all angles of the triangle are larger than 15 degrees but the area is small, the lines connecting the center to the three vertices are extended at the speed of $(1/2)^n$, as shown in Figure 3 (c). In particular, if the triangle is at the edge of the image, it is expanded toward the center of the image. A triangle keeps growing by the above measures until all angles of the corresponding triangles are large enough.
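The choice among the three growth measures can be sketched as a small decision function. The 15 degree and 10 degree thresholds come from the text above; the area threshold `min_area` is a hypothetical parameter, since the text does not state when an area counts as small, and the returned labels are illustrative names only. The sketch only selects the measure and does not perform the growth itself.

```python
import math


def interior_angles(a, b, c):
    """Interior angles in degrees at the vertices a, b and c of a triangle."""
    def angle(p, q, r):
        # Angle at p between the rays p->q and p->r.
        v1 = (q[0] - p[0], q[1] - p[1])
        v2 = (r[0] - p[0], r[1] - p[1])
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        return math.degrees(math.atan2(abs(cross), dot))
    return angle(a, b, c), angle(b, c, a), angle(c, a, b)


def growth_rule(a, b, c, min_area, slender_deg=15.0, collinear_deg=10.0):
    """Select the growth measure for triangle (a, b, c).

    min_area is an assumed threshold; the paper gives no numeric value.
    """
    small = [x for x in interior_angles(a, b, c) if x < slender_deg]
    if len(small) == 2:
        if all(x < collinear_deg for x in small):
            return "skip_collinear"        # treated as collinear, not grown
        return "extend_center_line"        # first measure, Figure 3 (a)
    if len(small) == 1:
        return "extend_shortest_edge"      # second measure, Figure 3 (b)
    area = abs((b[0] - a[0]) * (c[1] - a[1])
               - (b[1] - a[1]) * (c[0] - a[0])) / 2.0
    if area < min_area:
        return "grow_from_center"          # third measure, Figure 3 (c)
    return "no_growth"
```

Since a triangle may still be slender after one measure has been applied, a caller would re-evaluate `growth_rule` after each growth step, in line with the statement that triangles keep growing until all angles are large enough.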

Region growth stops if any vertex of the two triangles falls outside the image. Therefore, no matter how much the corresponding triangles grow, they do not leave the overlapping area. Figure 4 shows the process of region growth in real images. In Figure 4 (a), one angle is smaller than 15 degrees; two pink lines and one yellow line compose the original triangle, and the second and third measures are applied. In (b), two angles are smaller than 15 degrees; two yellow lines and one pink line compose the original triangle, and the first and third measures are applied. Case (c) is the same as (b), with two yellow lines and one blue line composing the original triangle, but all three measures are applied because one angle is still smaller than 15 degrees after the center line is extended.

Figure 2 Interpolating the pixels on the edge of a too-slender triangle can reduce accuracy, which region growth is intended to improve.

(a) (b) (c)

Figure 3 Three means of region growth. Red lines represent the original triangle; black dashed lines represent the changed edges of the triangle.

(a) (b) (c)

Figure 4 The process of region growth in different situations.

4. ILLUMINATION AND AFFINE INVARIANT DESCRIPTOR

Define the affine transformation $A = A\{T, t\}$ by $x' = A(x) = Tx + t$, where $t, x \in \mathbb{R}^2$ and $T$ is a $2 \times 2$ nonsingular matrix.

Suppose $f(x)$ is an image intensity function corresponding to a gray-scale image in $\mathbb{R}^2$. We may apply an affine transformation $A$ and an illumination change $G$ to this image, which gives a new gray-scale image in $\mathbb{R}^2$ with the image function $g(x')$, where $g(x') = G(f(A(x)))$. We call $g$ the affine and illumination transformed version of $f$.

For affine transformation, Rahtu et al. presented an area-based affine invariant descriptor, MSA [3], based on the idea that corresponding regions have the same intensity distribution and that their mathematical expectations are invariant to affine transformation. MSA is given by the following equation.

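$$F(\alpha, \beta) = \frac{1}{\|f\|_{L^1}^{3}} \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} \int_{\mathbb{R}^2} f(u)\, \frac{1}{\alpha^2 \beta^2 \gamma^2}\, f\!\left(\frac{x}{\alpha}\right) f\!\left(\frac{y}{\beta}\right) f\!\left(\frac{u - x - y}{\gamma}\right) dx\, dy\, du \qquad (1)$$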

where f is an image intensity function.
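Equation (1) has a probabilistic reading, following [3]: substituting $x = \alpha x_1$, $y = \beta x_2$ and $u - x - y = \gamma x_0$ shows that $F(\alpha, \beta)$ is the expected value of $f(\alpha X_1 + \beta X_2 + \gamma X_0)$, where $X_0$, $X_1$, $X_2$ are independent random points with density $f / \|f\|_{L^1}$ and $\gamma = 1 - \alpha - \beta$. The sketch below estimates this expectation by Monte Carlo sampling on a pixel grid. It is an illustration only, not the implementation used in the experiments: the function name `msa_monte_carlo`, the sample count, the fixed seed and the nearest-pixel rounding are all assumptions, and the image is a list of rows of non-negative values.

```python
import random
from itertools import accumulate


def msa_monte_carlo(image, alpha, beta, samples=20000, seed=0):
    """Estimate F(alpha, beta) = E[f(alpha*X1 + beta*X2 + gamma*X0)]."""
    gamma = 1.0 - alpha - beta
    height, width = len(image), len(image[0])
    coords = [(r, c) for r in range(height) for c in range(width)]
    # Cumulative intensities let us draw pixels with probability f / ||f||.
    cum = list(accumulate(image[r][c] for r, c in coords))
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        (r0, c0), (r1, c1), (r2, c2) = rng.choices(coords, cum_weights=cum, k=3)
        # Affine combination of the three points, rounded to the nearest pixel.
        r = round(alpha * r1 + beta * r2 + gamma * r0)
        c = round(alpha * c1 + beta * c2 + gamma * c0)
        if 0 <= r < height and 0 <= c < width:
            total += image[r][c]  # f is taken as zero outside the image
    return total / samples
```

For an image with a single nonzero pixel, all three sampled points coincide with that pixel, so the estimate equals the pixel value for any $(\alpha, \beta)$. For general images the estimate carries sampling noise that shrinks as `samples` grows.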



In the Fourier domain, the convolutions in equation (1) become products, as follows.

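$$F(\alpha, \beta) = \frac{1}{\hat{f}(0)^{3}} \int_{\mathbb{R}^2} \hat{f}(-\varepsilon)\, \hat{f}(\alpha\varepsilon)\, \hat{f}(\beta\varepsilon)\, \hat{f}(\gamma\varepsilon)\, d\varepsilon \qquad (2)$$

where $\hat{f}(\varepsilon) = \int_{\mathbb{R}^2} e^{-j 2\pi x \cdot \varepsilon} f(x)\, dx$. For more detailed definitions of MSA, readers can refer to [3].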
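The step from equation (1) to equation (2) can be made explicit with a short derivation. The scaled copies $f_a$ are notation introduced here for illustration only, and the derivation assumes that $\alpha$, $\beta$ and $\gamma$ are nonzero and that $f$ is non-negative, so that $\|f\|_{L^1} = \hat{f}(0)$.

```latex
\begin{align*}
f_a(x) &= \frac{1}{a^2}\, f\!\left(\frac{x}{a}\right)
  \quad\Longrightarrow\quad
  \hat{f}_a(\varepsilon) = \hat{f}(a\varepsilon), \\
F(\alpha,\beta)
  &= \frac{1}{\|f\|_{L^1}^{3}} \int_{\mathbb{R}^2} f(u)\,
     (f_\alpha * f_\beta * f_\gamma)(u)\, du
     && \text{(equation (1), inner integrals as convolutions)} \\
  &= \frac{1}{\hat{f}(0)^{3}} \int_{\mathbb{R}^2} \hat{f}(-\varepsilon)\,
     \hat{f}_\alpha(\varepsilon)\, \hat{f}_\beta(\varepsilon)\,
     \hat{f}_\gamma(\varepsilon)\, d\varepsilon
     && \text{(convolution theorem and Parseval's relation)} \\
  &= \frac{1}{\hat{f}(0)^{3}} \int_{\mathbb{R}^2} \hat{f}(-\varepsilon)\,
     \hat{f}(\alpha\varepsilon)\, \hat{f}(\beta\varepsilon)\,
     \hat{f}(\gamma\varepsilon)\, d\varepsilon .
\end{align*}
```

The first line follows from the change of variables $x = ay$, whose Jacobian in two dimensions is $a^2$ and cancels the factor $1/a^2$.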

MSA, which is based on intensity distribution, is suitable for matching images related by affine transformations. However, it is likely to be affected by large nonuniform illumination changes, which may make its performance unstable.

Here, we give the definition of the illumination and affine invariant descriptor. In general, an image $I(x, y)$ is regarded as the product $I(x, y) = R(x, y) L(x, y)$, where $R(x, y)$ is the reflectance and $L(x, y)$ is the illuminance at each point $(x, y)$ [6]. Computing the reflectance and illuminance fields from real images is an ill-posed problem. Therefore, various assumptions and simplifications about $L$, $R$, or both have been proposed in an attempt to solve it. Jobson et al. proposed a multiscale surround retinex and defined a method of color restoration to obtain the reflectance $R$ from the input image $I$ [4]; it is given by the following equation:

$$R_{\mathrm{MSR}} = \sum_{n=1}^{N} w_n R_n \qquad (3)$$

where $R_{\mathrm{MSR}}$ is the MSR output, which is invariant to illumination to a certain extent and can therefore be used to represent the original image. $R_n(x, y) = \log I(x, y) - \log[F_n(x, y) * I(x, y)]$ is the retinex output of the $n$th scale, where $F_n(x, y) = K e^{-r^2 / c_n^2}$ is the surround function and $w_n$ is the weight associated with the $n$th scale.
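Equation (3) can be illustrated with a small, unoptimized sketch in plain Python. It is not the implementation of [4] or of this paper. It assumes strictly positive pixel values (for example, intensities offset by 1), truncates the Gaussian surround at a radius of $3 c_n$, and renormalizes the surround weights at the image border so that they sum to one, which plays the role of the constant $K$. The function names and these choices are assumptions made for illustration.

```python
import math


def single_scale_retinex(image, c):
    """R_n = log I - log(F_n * I), Gaussian surround F_n = K exp(-r^2 / c^2)."""
    height, width = len(image), len(image[0])
    radius = int(math.ceil(3 * c))  # assumed truncation of the surround
    kernel = {(dy, dx): math.exp(-(dx * dx + dy * dy) / (c * c))
              for dy in range(-radius, radius + 1)
              for dx in range(-radius, radius + 1)}
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = norm = 0.0
            for (dy, dx), w in kernel.items():
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width:
                    acc += w * image[yy][xx]
                    norm += w
            # Dividing by norm normalizes the surround (the role of K).
            out[y][x] = math.log(image[y][x]) - math.log(acc / norm)
    return out


def multiscale_retinex(image, scales, weights):
    """Weighted sum of single-scale outputs, as in equation (3)."""
    outputs = [single_scale_retinex(image, c) for c in scales]
    height, width = len(image), len(image[0])
    return [[sum(w * r[y][x] for w, r in zip(weights, outputs))
             for x in range(width)]
            for y in range(height)]
```

Multiplying every pixel by the same positive constant leaves the output unchanged, because the constant cancels between the two logarithms. For spatially varying illumination the cancellation is only approximate, which matches the statement that the MSR output is invariant to illumination to a certain extent.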

The new illumination and affine invariant descriptor is defined as follows:

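$$F(\alpha, \beta) = \frac{1}{\hat{f}(0)^{3}} \int_{\mathbb{R}^2} \hat{R}_{\mathrm{MSR}}(-\varepsilon)\, \hat{R}_{\mathrm{MSR}}(\alpha\varepsilon)\, \hat{R}_{\mathrm{MSR}}(\beta\varepsilon)\, \hat{R}_{\mathrm{MSR}}(\gamma\varepsilon)\, d\varepsilon \qquad (4)$$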

A vector $D = \{F(\alpha_1, \beta_1), \ldots, F(\alpha_i, \beta_i), \ldots\}$ is obtained by varying $\alpha$, $\beta$ and $\gamma$, which should satisfy $\alpha + \beta + \gamma = 1$. In the following experiment, $(\alpha, \beta)$ is set to $(0, -0.5)$, $(0.4, 0.8)$, $(-0.2, 0.9)$ and $(0.3, -0.5)$. The vector $D$ is therefore four-dimensional and is used as an affine descriptor to describe image regions and evaluate their similarity.
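As a worked example of the constraint $\alpha + \beta + \gamma = 1$, the value of $\gamma$ implied by each of the four experimental pairs is computed below. The values of $\gamma$ are not listed in the text and follow only from $\gamma = 1 - \alpha - \beta$.

```latex
\begin{align*}
(\alpha, \beta) = (0, -0.5):    &\quad \gamma = 1 - 0 + 0.5 = 1.5 \\
(\alpha, \beta) = (0.4, 0.8):   &\quad \gamma = 1 - 0.4 - 0.8 = -0.2 \\
(\alpha, \beta) = (-0.2, 0.9):  &\quad \gamma = 1 + 0.2 - 0.9 = 0.3 \\
(\alpha, \beta) = (0.3, -0.5):  &\quad \gamma = 1 - 0.3 + 0.5 = 1.2
\end{align*}
```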

5. EXPERIMENTAL RESULTS

In this paper, we use visible images to evaluate the performance of the proposed IIMSA, as shown in Figure 5. The images were taken during the unprecedented sea lettuce bloom at TsingTao, China, just one month before the sailing competition of the 2008 Summer Olympic Games. K is set to 4 empirically. Experimental results show that the proposed descriptor is feasible for matching point sets from different images.

The sea lettuce images have a low overlapping area, contain many outliers in the non-overlapping area and exhibit illumination changes. Figure 5 shows that the initial graphs built from the initial SIFT matches contain many outliers. These are removed by comparing their IIMSA descriptors. MSA, however, fails in this situation because its distinctive power is limited under nonuniform illumination changes. The experimental results show that IIMSA can successfully match feature points for images with monotonous backgrounds, low overlapping areas, illumination changes, transformation or many outliers.



(a) (b) (c)

Figure 5 Initial graph and matching results of IIMSA and MSA. (a) shows initial graphs. (b) shows final graphs of IIMSA. (c) shows final graphs of MSA.

6. CONCLUSION

In this paper, we have proposed a new illumination and affine invariant descriptor, IIMSA, for registering aerial images with large illumination changes and affine transformations, low overlapping areas, monotonous backgrounds or similar features. In this algorithm, triangle regions are obtained from the K-NN graph and its corresponding adjacent graph. To improve accuracy, region growth is applied to enlarge small and slender triangles. An illumination and affine invariant descriptor is then defined to evaluate the similarity of these regions. Tested on optical aerial images, the proposed descriptor matched point pair sets with many outliers from images with large illumination changes, similar patterns, low overlapping areas and affine transformation. Compared with MSA, the proposed descriptor achieves higher precision and is more robust to illumination changes.

ACKNOWLEDGMENTS

This work is supported by the National Natural Science Foundation (No. 61201454) of P. R. China.




REFERENCES

[1] Zitova B., Flusser J., “Image registration methods: a survey,” Image and Vision Computing, 21 (11), 977-1000 (2003).

[2] Mikolajczyk K., Schmid C., “A performance evaluation of local descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 27 (10), 1615-1630 (2005).

[3] Rahtu E., Salo M., Heikkila J., “Affine invariant pattern recognition using multiscale autoconvolution,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 27 (6), 908-918 (2005).

[4] Jobson D.J., Rahman Z., Woodell G.A., “A multiscale retinex for bridging the gap between color images and the human observations of scenes,” IEEE Transactions on Image Processing, 6 (7), 965-976 (1997).

[5] Liu Z., An J., Jing Y., “A simple and robust feature point matching algorithm based on restricted spatial order constraints for aerial image registration,” IEEE Transactions on Geoscience and Remote Sensing, 50 (2), 514-527 (2012).

[6] Horn B., Robot Vision, MIT Press (1986).


