Page 1: 1 Semantic Geo-Image Classification Using Image ... Semantic Geo-Image Classification Using Image Processing Techniques Annai College Of Engineering & Technology Kumbakonam Department


Semantic Geo-Image Classification Using Image Processing Techniques

Annai College of Engineering & Technology, Kumbakonam
Department of Information Technology
Guided by: Mrs. S. Vidivelli, M.E., Ph.D. (AP/IT)

Prabha.M ([email protected]), Sabitha.M ([email protected]), Vinothini.K ([email protected])

Abstract:

Satellite image classification is one of the most significant applications in remote sensing. Remote sensing data obtained from different optical sensors have been commonly used to characterize and quantify land information. However, conventional optical remote sensing is limited by weather conditions. Synthetic aperture radar (SAR), with its all-weather and all-time advantages, is important in the domain of earth observation. Polarimetric SAR (PolSAR) images can provide more target information and facilitate improvement of the land cover classification accuracy. Therefore, land cover classification for PolSAR images is important in remote sensing, especially for areas that change drastically with the seasons. This project implements an algorithm for land cover classification of PolSAR images. We propose a multilevel semantic features approach that extracts high-level features such as the Entropy/Anisotropy/Alpha values, which describe physical scattering properties, and applies a latent Dirichlet allocation scheme to discover high-level semantics and provide a histogram for each pixel. Finally, KNN classification is used to classify the PolSAR images with class labels such as water, land, and other properties. Experimental results validate the feasibility of the proposed method for land cover classification of various places, i.e., the overall accuracy reaches up to 90.91%, while that of the method based on the Wishart distance is 85.01%, which shows the superiority of the proposed method over state-of-the-art classification on various geospatial data.

I. INTRODUCTION

The multitude of modern remote sensing sensors allows us to analyze tremendous amounts of high-resolution earth observation (EO) images. Therefore, developing new content-based image retrieval (CBIR) systems being able to extract user-desired information from existing image databases is highly demanded. In the state-of-the-art literature, various CBIR systems have been proposed for EO image mining, such as Intelligent Interactive Image Knowledge Retrieval System (I3KR) [1], Knowledge-Driven Information Mining System [2], and its accelerated variant [3]. For the existing systems, semantic image interpretations are usually provided through either manual image annotation or user acceptance (in active learning scenarios), which require much human effort and time, and bias the systems toward user perspectives [4]. Moreover, due to the differences between human image understanding and how computers interpret and process them (the so-called semantic gap [4], [5]), many image mining results provided by computers are still unsatisfactory.

Fig. 1. Stepwise low- to high-level semantics generation.

In this letter, we propose a multilevel semantics discovery approach for bringing computer interpretation of a particular remote sensing image type, namely, polarimetric synthetic aperture radar (PolSAR) images, closer to human semantics. This helps computers to discover existing semantic relationships within images and employ them for analyzing user queries (which are based on semantics) and provide them with semantically meaningful relevant results. Fig. 1 shows an overview of the proposed approach.

PolSAR images supply information with respect to the physical scattering properties of the recorded ground targets, retrieved by applying coherent or incoherent target decomposition theorems to the first- and second-order polarimetric representations [6]. A widely employed decomposition method is Entropy/Anisotropy/Alpha (H/A/α), resulting in three parameters describing the physics behind the scattering processes. These parameters lead to a superior pixel-based unsupervised classification scheme, the H/A/α classification method [7]. A modification to this method, namely, the H/A/α-Wishart classification method [8], shows that the complex Wishart distribution parameters improve the classification substantially.

In our proposed approach, we employ the H/A/α-Wishart method for discovering the low-level semantics of PolSAR images as a set of classes, representing targets by their physical scattering properties (e.g., low-entropy surface scattering (SS) with high anisotropy). The images are then tiled into patches, and each image patch is modeled as a bag-of-words (BoW) by generating a histogram of its assigned class labels, where the labels are the words in the BoWs (see Fig. 2). After that, a generative statistical model, namely, latent Dirichlet allocation (LDA) [9], is applied to the BoW histograms in order to discover the latent semantics behind the image patches as a set of topics. Our validation of the topics based on ground truth (Google Earth, https://www.google.com/earth/) images demonstrates that they provide high-level semantics, which are close to human semantics used for identifying land-cover types (e.g., woody vegetation).

Fig. 2. H/α classification scheme (adapted from [7]). From the 18 classes, only 1–16 are feasible and are assigned as labels L1–L16 in Table I.

Fig. 3. Grayscale conversion, median filter.

As the topics are the basic land-cover types existing in the image patches, further land-cover types can be defined as combinations of the basic land-cover types (e.g., a shoreline is a combination of a water body, grassland, and woody vegetation). Thus, the image patches can be modeled by vectors of topic mixtures, the so-called bag-of-topics (BoT) model [10]. The topic vectors form a multidimensional Euclidean space in which each image patch is represented as a point. Since the dimensions of this space are semantically meaningful, it can be easily explored and assessed through immersive visualization techniques to discover existing semantic relationships and identify new semantic categories (e.g., mixed land-cover types). While a topic representation of images can adapt computer image interpretation to human semantics, the semantic relationships can be used for designing rule-based land-cover categorization methods.

Section II reviews H/A/α-Wishart classification. Section III describes low-level semantics discovery. Section IV briefly introduces LDA. Sections V and VI explain how to discover high-level semantics and explore it for finding new semantics. Section VII concludes this letter.

ISSN: 2348 - 8387 www.internationaljournalssrg.org Page 76
SSRG International Journal of Computer Science and Engineering- (ICET'17) - Special Issue - March 2017

II. ENTROPY/ANISOTROPY/ALPHA-WISHART CLASSIFICATION

Our test image is an F-Synthetic Aperture Radar System airborne complex-valued data set (courtesy of Rolf Scheiber from DLR's Microwaves and Radar Institute) comprising four polarization planes (VV, HH, HV, and VH) [11]. The data were multilooked with a factor of 5 in azimuth direction. In order to reduce the inherent speckle noise, PolSAR data are usually delivered in a multilooked pixel-wise coherency matrix format T. This matrix is a 3 × 3 Hermitian positive semidefinite matrix, which can be written as

$$T = U \Lambda U^{-1} \tag{1}$$

where Λ is a diagonal matrix composed of the eigenvalues of T (λ1, λ2, and λ3 in descending order). The columns of U contain the corresponding eigenvectors to the eigenvalues (u1, u2, and u3), where each ui can be further decomposed into [6]

$$u_i = \left[\cos\alpha_i,\; \sin\alpha_i \cos\beta_i\, e^{j\delta_i},\; \sin\alpha_i \sin\beta_i\, e^{j\gamma_i}\right]^{T}. \tag{2}$$

From (1) and (2), the following parameters can be computed:

$$H = -\sum_{i=1}^{3} p_i \log_3 p_i, \qquad A = \frac{p_2 - p_3}{p_2 + p_3}, \qquad \bar{\alpha} = \sum_{i=1}^{3} p_i \alpha_i \tag{3}$$

where $p_i = \lambda_i / \sum_{j=1}^{3} \lambda_j$ denotes the probability of the eigenvalue λi. These parameters refer to the physics behind the scattering processes. The entropy H discriminates pure and distributed scatterers; the anisotropy A characterizes different types of scattering [6], and the mean α angle shows the dominant scattering mechanism.
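The computation in (1)–(3) can be sketched for a single pixel as follows. This is a minimal illustration, not the authors' implementation: the function name `h_a_alpha` is hypothetical, NumPy is assumed to be available, and each α_i is recovered from the magnitude of the first eigenvector component, which equals cos α_i in (2).

```python
import numpy as np

def h_a_alpha(T):
    """Return (H, A, mean alpha in degrees) for one 3x3 coherency matrix T.

    Assumes T is Hermitian positive semidefinite and not all zeros.
    """
    # eigh is meant for Hermitian input; it returns real eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(T)
    order = np.argsort(eigvals)[::-1]            # reorder to descending, as in (1)
    lam = np.clip(eigvals[order], 0.0, None)     # guard against tiny negative round-off
    U = eigvecs[:, order]

    p = lam / lam.sum()                          # pseudo-probabilities p_i
    nz = p > 0                                   # skip zero terms (0 * log 0 is taken as 0)
    H = float(-np.sum(p[nz] * np.log(p[nz])) / np.log(3))   # log base 3

    denom = p[1] + p[2]
    A = float((p[1] - p[2]) / denom) if denom > 0 else 0.0

    # First component of u_i is cos(alpha_i) up to a phase factor, see (2).
    alpha_i = np.arccos(np.clip(np.abs(U[0, :]), 0.0, 1.0))
    alpha_mean = float(np.degrees(np.sum(p * alpha_i)))
    return H, A, alpha_mean
```

For example, `h_a_alpha(np.eye(3, dtype=complex))` describes a fully random scatterer (equal eigenvalues), whereas a rank-one matrix describes a single pure scattering mechanism with zero entropy.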

Combinations of these parameters can lead to very good classification schemes. For example, the scheme in Fig. 2 divides the H/α plane into nine zones (classes), from which only eight are feasible. Including anisotropy will double the number of classes: nine for A ≤ 0.5 (denoted by odd labels) and nine for A > 0.5 (denoted by even labels). The H/A/α classification method has its own drawbacks, such as the fixed boundaries of the classes, which do not dynamically adapt to the input data. Dealing with this issue, Lee et al. [8] proposed to use the parameters of the complex Wishart distribution of the coherency matrix. This approach employs the H/A/α classification for initializing the classes, and then uses the Wishart parameters in an iterative procedure to refine the classification. Its main advantages are that it considers the magnitude of the coherency matrix, which is important in detecting SS, and allows a dynamic adaptation of the class boundaries to the input data.
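The iterative refinement can be sketched as below, assuming the commonly used Wishart distance d(T, V_m) = ln|V_m| + Tr(V_m^{-1} T) between a pixel's coherency matrix T and a class centre V_m. The function names, the fixed iteration cap, and the stopping rule (no label changes) are illustrative assumptions; the text above does not specify them, and class centres are assumed to be non-singular.

```python
import numpy as np

def wishart_distance(T, V):
    """d(T, V) = ln|V| + Tr(V^-1 T) for 3x3 Hermitian matrices (V non-singular)."""
    _, logdet = np.linalg.slogdet(V)
    return logdet + np.trace(np.linalg.solve(V, T)).real

def wishart_refine(Ts, labels, n_classes, n_iter=5):
    """Refine initial class labels by reassigning each pixel to its nearest class centre.

    Ts: (N, 3, 3) complex array of coherency matrices.
    labels: (N,) initial classes, e.g. from the H/A/alpha zones.
    """
    labels = np.asarray(labels).copy()
    for _ in range(n_iter):
        # Class centre = mean coherency matrix of the pixels currently in the class.
        centres = {m: Ts[labels == m].mean(axis=0)
                   for m in range(n_classes) if np.any(labels == m)}
        new = np.array([min(centres, key=lambda m: wishart_distance(T, centres[m]))
                        for T in Ts])
        if np.array_equal(new, labels):   # converged: no pixel changed class
            break
        labels = new
    return labels
```

Because the class centres are re-estimated from the data in every pass, the class boundaries move with the input instead of staying at the fixed H/A/α thresholds.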

III. LOW-LEVEL SEMANTICS DISCOVERY

We apply the H/A/α-Wishart classification, based on 5 × 5 pixel windows, to our PolSAR test image, shown in Fig. 3 (left), in order to retrieve a classification map with 16 classes. The class labels are then considered as low-level semantics, which refer to the scattering properties of the recorded targets. Fig. 3 (right) shows the classification results, with Table I defining the color coding.

TABLE I
LOW-LEVEL SEMANTICS

The labels in Table I can be categorized into three main scattering mechanisms. The first six labels (L1–L6) refer to multiple scattering (MS), which usually occurs in urban and forested areas. The next six labels (L7–L12) refer to volume scattering (VS), which is usually observed in forested and vegetated areas, while the labels L13–L16 refer to SS, which can usually be seen on rough land surfaces. These scattering mechanisms can be further divided into several subcategories depending on H, which hints at the number of scatterers present in each resolution cell, and on A, which shows the importance of the secondary scatterers.

IV. LATENT DIRICHLET ALLOCATION

A widely used statistical generative model for the discovery of hidden semantic structures behind collections of images is LDA [9]. Assuming each image wd is a combination of Nd visual word tokens, wd = {wd1, wd2, . . . , wdNd}, LDA discovers its latent semantics as a set of k topics, where the topics are distributions over a fixed dictionary of NV visual words. They are supposed to reflect semantic categories. Therefore, each image containing different targets is represented as a mixture of the topics.

For estimating the required model parameters a and B, and inferring the posterior distributions (i.e., the topic distributions in the images), LDA uses approximate inference algorithms, such as variational expectation maximization [9].

Having the model parameters and the posterior distributions, LDA can generate every nth visual word token (wdn) of each image wd through the generative process

$$p(w_{dn} \mid a, B) = \int p(\theta_d \mid a) \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, B)\, d\theta_d. \tag{4}$$

In this process, LDA chooses a k-dimensional Dirichlet random variable θd ∼ Dir(a), corresponding to wd, where a determines the Dirichlet distribution's prior [9]. Then, it selects a topic token zdn from the topic mixture θd and picks wdn from the multinomial probability distribution conditioned on the selected topic, p(wdn | zdn, B). The matrix B, of size NV × k, parameterizes the visual word probabilities within the topics.

Fig. 4. Topics discovered by LDA. The x-axis denotes the 16 labels in Table I (e.g., 1 → L1), while the y-axis shows their probabilities.

Fig. 5. Topic assignments to image pixels (left) and to image patches (right).

TABLE II
HIGH-LEVEL SEMANTICS

V. HIGH-LEVEL SEMANTICS DISCOVERY

Since the low-level semantics only refer to physical scattering properties, it is difficult to associate them with common land-cover categories. Therefore, in this section, we employ LDA to discover higher level semantics. To this end, we split each image into adjacent patches of 64 × 64 pixels and represent each image patch as a BoW by generating a histogram of the 16 labels in Table I assigned to that image patch; the selected patch size of 64 × 64 pixels resulted as a compromise between small patches, which keep the semantic analysis simple, and bigger patches, which capture the spatial context of objects.
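The tiling and BoW step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `patch_histograms` is hypothetical, the classification map is assumed to be a 2-D integer array with values 1 to 16 (for L1–L16), and incomplete tiles at the image border are simply dropped, which the text does not specify.

```python
import numpy as np

def patch_histograms(label_map, patch=64, n_labels=16):
    """Tile a label map into adjacent patches and count the labels in each patch.

    label_map: 2-D int array with values 1..n_labels.
    Returns an (n_patches, n_labels) array of counts, patches in row-major order.
    """
    rows, cols = label_map.shape
    hists = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            tile = label_map[r:r + patch, c:c + patch]
            # minlength fixes the bin layout; bin 0 is unused because labels start at 1.
            counts = np.bincount(tile.ravel(), minlength=n_labels + 1)[1:n_labels + 1]
            hists.append(counts)
    return np.array(hists)
```

Each row of the result plays the role of one document's word counts when the patches are passed to a topic model.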

By considering each image patch as a document, we apply LDA to the BoW models to find latent semantics as a set of topics. In our experiments, the number of topics to be discovered was set to six based on an educated guess. The resulting topics are shown in Fig. 4 as probability distributions over the 16 labels. Fig. 5 shows the topic assignments to each image pixel and patch, where the assignments were made according to

$$\text{word}_i \rightarrow \arg\max_{k}\, p(w_i \mid z_k) \tag{5}$$

$$\text{patch}_j \rightarrow \arg\max_{k}\, p(z_k \mid \theta_j). \tag{6}$$
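Given a fitted topic model, the two assignment rules (5) and (6) reduce to an argmax over topics. The sketch below assumes the topic-word probabilities and the per-patch topic mixtures are already available as NumPy arrays; the function names and array layouts are illustrative assumptions, and topic indices are zero-based (index 0 corresponds to T1).

```python
import numpy as np

def assign_topics(word_given_topic, topic_given_patch):
    """Apply rules (5) and (6).

    word_given_topic: (k, n_words) array, row k holds p(w | z_k).
    topic_given_patch: (n_patches, k) array, row j holds p(z_k | theta_j).
    Returns (topic index per word, topic index per patch).
    """
    word_topic = np.argmax(word_given_topic, axis=0)      # rule (5)
    patch_topic = np.argmax(topic_given_patch, axis=1)    # rule (6)
    return word_topic, patch_topic

def pixel_topic_map(label_map, word_topic):
    """Map each pixel's label (1..n_words) to the topic assigned to that label."""
    return word_topic[np.asarray(label_map) - 1]
```

The pixel-level map corresponds to the left panel of Fig. 5 and the patch-level assignment to the right panel.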

By analyzing Figs. 4 and 5, and by using precisely coregistered Google Earth images as correct ground truth, it is possible to reveal the semantics of the topics and the rules connecting them to the lower-level semantics. As shown in Table II, the probabilities of the respective types of scattering can be written as percentages.
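One way to obtain such percentages is to sum a topic's label probabilities within each scattering group from Section III (L1–L6 as MS, L7–L12 as VS, L13–L16 as SS). The sketch below is an assumption about how the table entries could be derived, not a confirmed description of the authors' procedure; the function name is hypothetical and the example values are made up for illustration.

```python
def scattering_percentages(topic):
    """Aggregate a topic's distribution over L1..L16 into MS, VS and SS percentages.

    topic: sequence of 16 non-negative probabilities (index 0 corresponds to L1).
    """
    groups = {"MS": range(0, 6), "VS": range(6, 12), "SS": range(12, 16)}
    total = sum(topic)
    return {name: 100.0 * sum(topic[i] for i in idx) / total
            for name, idx in groups.items()}

# Illustrative topic: half of the mass on L5, a quarter each on L9 and L16.
example = [0.0] * 16
example[4], example[8], example[15] = 0.5, 0.25, 0.25
print(scattering_percentages(example))
```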


Fig. 6. High-level feature extraction.

1) T1 is mostly assigned to forested regions. The high MS and VS values are due to the presence of trunks and rich canopies of woody vegetation. In addition, the low SS value may be caused by canopy tops or ground. According to Fig. 4, T1's main component is L5, which is referred to in [7] as double-bounce scattering in a high-entropy environment, such as vegetation, which has a well-developed branch and crown structure. Therefore, we assign the semantic term woody vegetation to T1.

2) T2 is most often assigned to regions covered by trees and shrubs. The very small SS value indicates the absence of rich canopies, which allows radar waves to scatter directly back to the sensor. The relatively large MS value shows the small number of tree trunks. T2's main component is L9, which [7] refers to as vegetated surfaces with anisotropic scatterers and moderate correlation of scatterer orientations. Thus, we assign the semantic term mixed woody vegetation and shrubs to T2.

3) T3 is mostly characterized by MS caused by building walls and tree trunks. In addition, VS and SS in this topic can be caused by tree canopies and smooth surfaces (e.g., streets and roofs), respectively. T3 is the only topic with nonzero L1 and L2. These are referred to in [7] as double-bounce scattering events provided by isolated dielectric and metallic dihedral scatterers. Therefore, we assign the semantic term artificial/man-made structures to T3.

4) T4 is characterized only by VS, indicating that it usually occurs in regions covered by vegetation with thin stems, which do not backscatter the incident wave. Thus, we assign the semantic term herbaceous vegetation to T4.

5) T5 is distinguished by a very large SS and a very small VS value. This reveals thin land cover allowing radar waves to reach the ground. Therefore, we assign the semantic term smooth surface to T5.

6) T6 is completely composed of SS. Therefore, we assign the semantic term specular surface to it.

Fig. 6 demonstrates the ground truth (Google Earth images) of some image patches, which are mainly characterized by a single topic, in order to visualize the semantics of each topic.

VI. ASSESSMENT OF HIGH-LEVEL SEMANTICS

In this section, we assess the discovered high-level semantics by modeling each image patch as a vector of the extracted topics (BoT model). These vectors form a multidimensional Euclidean space, the so-called topic space, in which each image patch is represented as a point positioned according to its topic distribution. Since the topic space dimensions are semantically meaningful (e.g., woody vegetation), the image patch distribution within this space is easily understandable. When these topics are the basic land-cover types, different topic mixtures result in various mixed land-cover types. Thus, by navigating through the topic space, users can easily reach the image patches containing their desired land-cover types and define new categories of mixed land-cover types.

In order to visualize topic spaces, we select three topics and pick up the image patches containing them. After discarding the other topics, the chosen image patches are displayed in a 3-D normalized topic space. We explore the topic space and pick groupings of image patches from different regions and validate their semantics via their ground truth.
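The selection of a three-topic subspace and the picking of groupings around a location can be sketched as below. The function names are hypothetical, and the normalization is an assumption: the text does not state how the 3-D space is normalized, so the sketch keeps the original mixture proportions by default and offers an optional rescaling so that the three coordinates sum to one.

```python
import numpy as np

def topic_subspace(theta, topics, renormalise=False):
    """Keep the patches that contain any of three chosen topics.

    theta: (n_patches, k) topic mixtures, one row per patch.
    topics: three zero-based topic indices, e.g. [0, 4, 5] for (T1, T5, T6).
    Returns (indices of the kept patches, their 3-D coordinates).
    """
    sub = theta[:, topics]                 # discard the other topics
    mass = sub.sum(axis=1)
    keep = np.flatnonzero(mass > 0)        # patches containing the chosen topics
    coords = sub[keep]
    if renormalise:                        # optional: coordinates sum to one
        coords = coords / mass[keep, None]
    return keep, coords

def patches_near(coords, point, radius):
    """Indices (into coords) of patches within a Euclidean radius of a point."""
    d = np.linalg.norm(coords - np.asarray(point, dtype=float), axis=1)
    return np.flatnonzero(d <= radius)
```

A grouping is then the set of patches returned by `patches_near` for a chosen location, which can be checked against its ground truth.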

A. Natural Land-Cover Variation

Our aim is to discover different natural land-cover types as mixtures of the basic land-cover types: woody vegetation (T1), smooth land surfaces (T5), and specular surfaces (T6). Fig. 7 shows the topic space of these topics in which the coordinates are (T1, T5, T6). Here, the selected image patches are highlighted and their ground truth is shown in Fig. 8.

In Fig. 7, starting from the point (0, 0, 1), we expect to have image patches of specular surfaces, such as water bodies, which complies with Fig. 8 (grouping G1). By moving toward the point (1, 0, 0), the proportion of the water bodies decreases while that of the forest increases, as demonstrated in Fig. 8 (G2) and (G3). Reaching the point (1, 0, 0), the image patches are totally covered by forest, as shown in Fig. 8 (G4). Furthermore, around the point (0.3, 0.3, 0.3), a mixture of the three basic land-cover types is expected, which is true according to Fig. 8 (G5). By moving toward the point (0, 1, 0), the land-cover types change, containing higher proportions of grasslands, as shown in Fig. 8 (G6) and (G7).

B. Natural and Man-Made Land-Cover Variation

The idea is to discover various natural and man-made land-cover types as mixtures of the basic land-cover types: mixed woody vegetation and shrubs (T2), man-made structures (T3), and smooth surfaces (T5). Fig. 9 shows the topic space of the image patches in which the coordinates are (T2, T3, T5).

As shown in Fig. 9, we selected three groupings of image patches from various regions: one grouping around the location (0.5, 0.1, 0.1), one grouping around the location (0.3, 0.3, 0.3), and one grouping around the location (0.1, 0.6, 0.1). The visualization of their ground truth in Fig. 10 indicates that the image patches of the first grouping are mostly covered by sparse forest. The image patches of the second grouping contain mixtures of all the three basic land-cover types, while the third grouping mostly represents residential areas, including man-made structures (e.g., houses) and trees.

Fig. 8. Examples of new semantic classes spanned by T5, T6, and T1 corresponding to the highlighted points in Fig. 7.

Fig. 9. Three-dimensional topic space spanned by T5, T2, and T3. The highlighted points are visualized in Fig. 10.

Fig. 10. Examples of new semantic classes spanned by T5, T2, and T3 corresponding to the highlighted points in Fig. 9.

Altogether, representing images by their high-level semantics helps us to better understand their contents and discover new semantic categories as mixtures of high-level semantics. This further allows us to categorize images by selecting a set of coordinates and to partition the topic space.
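Partitioning the topic space by a set of selected coordinates can be sketched as a nearest-reference-point rule. This is one possible reading, not the authors' stated method: the function name, the category names, and the choice of Euclidean nearest neighbour are assumptions, while the reference coordinates are the (T1, T5, T6) locations discussed in Section VI-A.

```python
import math

def categorize(point, anchors):
    """Return the name of the reference coordinate closest to a topic-space point."""
    return min(anchors, key=lambda name: math.dist(point, anchors[name]))

# Hypothetical category names attached to the (T1, T5, T6) locations of Section VI-A.
ANCHORS = {
    "forest": (1.0, 0.0, 0.0),
    "grassland": (0.0, 1.0, 0.0),
    "water": (0.0, 0.0, 1.0),
    "mixed": (0.3, 0.3, 0.3),
}

print(categorize((0.9, 0.05, 0.05), ANCHORS))
```

Each reference coordinate then owns the region of the topic space that is closer to it than to any other, which gives one simple partition of the space.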

VII. CONCLUSION

In this letter, we propose a multilevel approach for semantic relationship discovery in PolSAR images. First, we extract low-level semantics (physical properties of the recorded targets) by using the H/A/α-Wishart classification method. Then, the images are tiled into patches and modeled as BoWs by generating histograms of the class labels assigned to the image patches. Next, LDA is applied to the BoWs to discover higher level semantics as a set of topics. Our analysis shows that the topics refer to basic land-cover types and the image patches covered by other land-cover types (e.g., mixed land-cover types) are described as mixtures of the topics. Thus, modeling image patches by the topics (BoT) helps to better understand the existing semantic relationships and identify new land-cover type categories. The semantic relationships can be used in designing parameter-based land-cover categorization methods. Moreover, a topic representation of PolSAR images helps image mining systems to adapt their results to human semantics, thus narrowing the semantic gap.

REFERENCES

[1] M. Datcu, R. Bahmanyar, Radu Tanas, and Gottfried Schwarz, “Discovery of semantic relationships in PolSAR images using latent Dirichlet allocation,” IEEE Trans., vol. 14, no. 2, pp. 237–241, Dec. 2016.

[2] M. Datcu et al., “Information mining in remote sensing image archives: System concepts,” IEEE Trans. Geosci. Remote Sens., vol. 41, no. 12, pp. 2923–2936, Dec. 2003.

[3] C. Chang, B. Moon, A. Acharya, C. Shock, A. Sussman, and J. Saltz, “Titan: A high performance remote sensing database,” in Proc. 13th Int. Conf. Data Engineering, pp. 375–384, 1997.

[4] R. Bahmanyar, A. M. M. de Oca, and M. Datcu, “The semantic gap: An exploration of user and computer perspectives in Earth observation images,” IEEE Geosci. Remote Sens. Lett., vol. 12, no. 10, pp. 2046–2050, Oct. 2015.

[5] D. Bratasanu, I. Nedelcu, and M. Datcu, “Bridging the semantic gap for satellite image annotation and automatic mapping applications,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 4, no. 1, pp. 193–204, Mar. 2011.

[6] S. R. Cloude and E. Pottier, “A review of target decomposition theorems in radar polarimetry,” IEEE Trans. Geosci. Remote Sens., vol. 34, no. 2, pp. 498–518, Mar. 1996.

[7] R. Touzi and A. Lopes, “The principle of speckle filtering in polarimetric SAR imagery,” IEEE Trans. Geosci. Remote Sens., vol. 32, no. 5, pp. 1110–1114, Sep. 1994.

[8] J.-S. Lee, M. R. Grunes, T. L. Ainsworth, L.-J. Du, D. L. Schuler, and S. R. Cloude, “Unsupervised classification using polarimetric decomposition and the complex Wishart classifier,” IEEE Trans. Geosci. Remote Sens., vol. 37, no. 5, pp. 2249–2258, Sep. 1999.

[9] D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” J. Mach. Learn. Res., vol. 3, pp. 993–1022, Mar. 2003.

[10] R. Bahmanyar, S. Cui, and M. Datcu, “A comparative study of bag-of-words and bag-of-topics models of EO image patches,” IEEE Geosci. Remote Sens. Lett., vol. 12, no. 6, pp. 1357–1361, Jun. 2015.

[11] R. Horn, A. Nottensteiner, A. Reigber, J. Fischer, and R. Scheiber, “F-SAR—DLR’s new multifrequency polarimetric airborne SAR,” in Proc. IEEE IGARSS, vol. 2, Jul. 2009, pp. 902–905.
