
International Journal of Applied Earth Observation and Geoinformation 11 (2009) 413–422

Exploring uncertainty in remotely sensed data with parallel coordinate plots

Yong Ge a,*, Sanping Li a, V. Chris Lakhan b, Arko Lucieer c

a State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences & Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China

b Department of Earth and Environmental Sciences, University of Windsor, Windsor, ON N9B 3P4, Canada

c School of Geography and Environmental Studies, University of Tasmania, Private Bag 76, Hobart 7001, Tasmania, Australia

ARTICLE INFO

Article history:

Received 9 April 2009

Accepted 20 August 2009

Keywords:

Parallel coordinate plots (PCP)

Remotely sensed data

Shannon’s entropy

Uncertainty

Interactive visualization

Brushing

ABSTRACT

The existence of uncertainty in classified remotely sensed data necessitates the application of enhanced techniques for identifying and visualizing the various degrees of uncertainty. This paper, therefore, applies the multidimensional graphical data analysis technique of parallel coordinate plots (PCP) to visualize the uncertainty in Landsat Thematic Mapper (TM) data classified by the Maximum Likelihood Classifier (MLC) and Fuzzy C-Means (FCM). The Landsat TM data are from the Yellow River Delta, Shandong Province, China. Image classification with MLC and FCM provides the probability vector and fuzzy membership vector of each pixel. Based on these vectors, the Shannon’s entropy (S.E.) of each pixel is calculated. PCPs are then produced for each classification output. The PCP axes denote the posterior probability vector and fuzzy membership vector and two additional axes represent S.E. and the associated degree of uncertainty. The PCPs highlight the distribution of probability values of different land cover types for each pixel, and also reflect the status of pixels with different degrees of uncertainty. Brushing functionality is then added to PCP visualization in order to highlight selected pixels of interest. This not only reduces the visualization uncertainty, but also provides invaluable information on the positional and spectral characteristics of targeted pixels.

© 2009 Elsevier B.V. All rights reserved.

journal homepage: www.elsevier.com/locate/jag

* Corresponding author. Tel.: +86 10 64888967; fax: +86 10 64889630.

E-mail addresses: [email protected] (Y. Ge), [email protected] (V.C. Lakhan), [email protected] (A. Lucieer).

0303-2434/$ – see front matter © 2009 Elsevier B.V. All rights reserved.

doi:10.1016/j.jag.2009.08.004

1. Introduction

A major problem that needs to be addressed in remote sensing is the analysis, identification and visualization of the uncertainties arising from the classification of remotely sensed data with classifiers such as the Maximum Likelihood Classifier (MLC) and Fuzzy C-Means (FCM). While the estimation and mapping of uncertainty has been discussed by several authors (for example, Shi and Ehlers, 1996; van der Wel et al., 1998; Dungan et al., 2002; Foody and Atkinson, 2002; Lucieer and Kraak, 2004; Ibrahim et al., 2005; Ge and Li, 2008a), very little research has been done on identifying, targeting and visualizing pixels with different degrees of uncertainty. This paper, therefore, applies parallel coordinate plots (PCP) (Inselberg, 1985, 2009; Inselberg and Dimsdale, 1990) to visualize the uncertainty in sample data and in data classified with MLC and FCM. A PCP is a multivariate visualization tool that plots multiple attributes on the X-axis against their values on the Y-axis and has been widely applied to data mining and visualization (Inselberg and Dimsdale, 1990; Guo, 2003; Edsall, 2003; Gahegan et al., 2002; Andrienko and Andrienko, 2004; Guo et al., 2005; Inselberg, 2009). The PCP is useful for providing a representation of high dimensional objects for visualizing uncertainty in remotely sensed data compared with two-dimensional, three-dimensional, animation and other visualization techniques (Ge and Li, 2008b). Several advantages of the PCP technique for visualizing multidimensional data have been outlined by Siirtola and Räihä (2006).

Data for the PCPs are from a 1999 Landsat Thematic Mapper (TM) image acquired over the Yellow River Delta, Shandong Province, China. After classifying the data the paper emphasizes the uncertainties arising from the classification process. The probability vector and fuzzy membership vector of each pixel, obtained with classifiers MLC and FCM, are then used in the calculation of Shannon’s entropy (S.E.), a measure of the degree of uncertainty. Axes on the PCP illustrate S.E. and the degrees of uncertainty. Brushing is then added to PCP visualization in order to highlight selected pixels. As demonstrated by Siirtola and Räihä (2006), the brushing operation enhances PCP visualization by allowing interaction with PCPs whereby polylines can be selected and highlighted.

Fig. 1. Pseudo-color composition image of the study area.

Fig. 2. Sample data in the region of interest.

2. Remarks on parallel coordinate plots and brushing

Parallel coordinate plots, a data analysis tool, can be applied to a diverse set of multidimensional problems (Inselberg and Dimsdale, 1990). The basis of PCP is to place coordinate axes in parallel, and to present a data point as a connection line between coordinate values. A given row of data is represented by drawing a line that connects the value of that row on each corresponding axis (Kreuseler, 2000; Shneiderman, 2004). A single set of connected line segments representing one multidimensional data item is called a polyline (Siirtola and Räihä, 2006). An entire multidimensional dataset can be viewed whereby all observations are plotted on the same graph. All attributes could, therefore, be distinguished in two-dimensional feature space, thereby allowing for the recognition of uncertainties and outliers in the dataset.
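The construction of polylines described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the Parbat implementation: the function name `polyline_vertices`, the min-max scaling of each axis to [0, 1], and the sample values are all assumptions made for the example.

```python
def polyline_vertices(rows):
    """Turn each data row into a polyline: one (axis_index, scaled_value) vertex per axis.

    Each axis is min-max scaled to [0, 1] independently (an assumed convention),
    so that axes with different units can share one vertical extent.
    """
    if not rows:
        return []
    n_axes = len(rows[0])
    lows = [min(row[a] for row in rows) for a in range(n_axes)]
    highs = [max(row[a] for row in rows) for a in range(n_axes)]
    polylines = []
    for row in rows:
        vertices = []
        for a, value in enumerate(row):
            span = highs[a] - lows[a]
            # A constant axis has no spread; place its values mid-axis.
            y = 0.5 if span == 0 else (value - lows[a]) / span
            vertices.append((a, y))
        polylines.append(vertices)
    return polylines


# Three hypothetical data items with three attributes each.
pixels = [[10, 200, 1], [20, 100, 9], [30, 150, 5]]
lines = polyline_vertices(pixels)
```

Each inner list in `lines` holds the vertices of one polyline; drawing straight segments between consecutive vertices gives the plot.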

The PCP can be used to visualize not only numerical multidimensional data but also non-numerical multidimensional data. Jang and Yang (1996) discussed applications of PCP, especially its usefulness as a dynamic graphics tool for multivariate data analysis. In this paper the PCP is applied to Landsat TM multidimensional data. This paper follows the procedures of Lucieer (2004) and Lucieer and Kraak (2004) who adopted the PCP from Hauser et al. (2002) and Ledermann (2003) to represent the uncertainties in the spectral characteristics of pixels and their corresponding fuzzy membership. To enhance the visualization of the PCP, interactive brushing functionality is employed. In the brushing operation, pixels of interest are selected and then highlighted. Brushing permits not only highlighting but also masking and deletion of specific pixels and spatial areas.

3. Use of PCP to explore uncertainty in sample data

In the supervised classification process of remotely sensed imagery, the quantity of samples is a major factor affecting the accuracy of the image classification. In this section, the uncertainty in the sample data is, therefore, first explored with PCP.

3.1. Data acquired

In this paper, the Landsat TM image representing the study area was acquired on August 28, 1999, and covers an area of the Yellow River Delta. The image area is at the intersection of the terrain between Dongying and Binzhou, Shandong Province. The upper left longitude and latitude coordinates of the image are 118°0′34.07″E and 37°22′24.00″N, respectively. The lower right longitude and latitude coordinates are 118°10′52.83″E and 37°13′58.13″N, respectively. The image size is 515 by 515 pixels, with each pixel having a resolution of 30 m. Fig. 1 is a pseudo-color composite image of bands 5, 4 and 3.

3.2. Exploring the uncertainty in sample data

The image includes six land cover types, namely Water, Agriculture_1, Agriculture_2 (Agriculture_1 and Agriculture_2 are different crops), Urban, Bottomland (the channel of the Yellow River), and Bareground. Sample data are selected from the region of interest (see Fig. 2), and represent a total of 26,639 pixels.

The Parbat software developed by Lucieer (2004) and Lucieer and Kraak (2004) is used to produce the PCP. The PCP depicts the multidimensional characteristics of pixels in the remote sensing image through a set of parallel axes and polylines in a two-dimensional plane (Fig. 3). It is noticeable that there is a clear representation of sample data from different land cover types, as shown by clustering of spectral signatures, and the dispersion and overlapping of spectral signatures from different land cover types. The digital numbers (DNs) of all pixels in the land cover type, Bottomland, are very concentrated within a narrow band. The range of DNs of pixels in the Water class is also narrow, except for band 3. The land cover types of Water and Bottomland in Fig. 3 can be easily distinguished from the other land covers in bands 5 and 7. Further differentiation is provided in band 3. The radiation responses for Agriculture_1 and Agriculture_2 have close similarity in bands 1, 2, 3, 6 and 7, with a degree of overlap in bands 4 and 5. There is an almost perfect positive correlation between bands 1 and 2 for all categories. This occurrence presents difficulties in clearly differentiating pixels for Agriculture_1 and Agriculture_2. Hence, it is evident that there is uncertainty in differentiating pixels for the land cover types Agriculture_1 and Agriculture_2.
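The correlation between two adjacent PCP axes, read visually above from near-parallel line segments between bands 1 and 2, can also be checked numerically. The sketch below computes the Pearson correlation coefficient; the DN values are hypothetical and are not taken from the study image.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical DNs of five sample pixels in two bands.
band1 = [60, 62, 65, 70, 74]
band2 = [25, 26, 28, 30, 32]
r_12 = pearson(band1, band2)
```

A value of `r_12` close to 1 corresponds to segments between the two axes that rarely cross, which is the pattern described for bands 1 and 2.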

4. Classification and measurement of uncertainty in classified remote sensing images

4.1. Uncertainties arising from the classification process

The supervised classifiers, MLC and FCM, are applied to the image, with the condition that no pixel is assigned to a null class. The classified images are shown in Fig. 4a and b.

Comparison of Fig. 4a and b reveals that the classified results from MLC and FCM are not identical for the data pertaining to the same region of interest (ROI). The difference images between these two classified results are presented in Fig. 5a and b, with Fig. 5a showing the MLC classification of the difference pixels and Fig. 5b showing their FCM classification. The difference pixels total 16,416. These difference pixels are distributed mainly on the banks of the river, and in mixed areas of Bareground, Agriculture_1, and Agriculture_2. For the MLC classified result, 57.1% of the difference pixels are classified as Agriculture_1, and 36.7% are classified as Agriculture_2. The FCM classified results, however, demonstrate that 90.5% of difference pixels are Bareground while 9.3% are Agriculture_2.

The number of pixels in the ROI and in each of the classification categories from MLC and FCM are illustrated in Fig. 6. Evidently, there is a significant difference in the number of pixels for Agriculture_1, Agriculture_2, and Bareground, while the number of pixels for Water and Bottomland are very similar. This is also demonstrated in Fig. 5a and b. Based on the classified results from MLC and FCM it is possible to claim that there are relatively high uncertainties in identifying Agriculture_1, Agriculture_2 and Bareground. There are, however, lower uncertainties in the identification of Water and Bottomland.

Fig. 3. PCP of sample data.

Fig. 4. (a) Classification result from MLC; (b) classification result from fuzzy C-means.

Fig. 5. Spatial distribution of difference pixels between MLC and FCM: (a) difference pixels in the classified result from MLC; (b) difference pixels in the classified result from FCM.

Fig. 6. Comparison of numbers of pixels in the ROI; each category from MLC and FCM.

Fig. 7. Water class: (a) posterior probability from MLC; (b) fuzzy membership from FCM.
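The comparison of the two classified maps in Section 4.1 amounts to collecting the pixels whose labels disagree and tabulating how each classifier labels them. A minimal sketch follows, assuming both maps are available as flattened label lists in the same pixel order; the function name and the five-pixel example are hypothetical.

```python
from collections import Counter


def difference_summary(labels_a, labels_b):
    """Count pixels labelled differently and give each map's class shares among them."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both label maps must cover the same pixels")
    diff = [(a, b) for a, b in zip(labels_a, labels_b) if a != b]
    n = len(diff)
    counts_a = Counter(a for a, _ in diff)
    counts_b = Counter(b for _, b in diff)
    share_a = {cls: k / n for cls, k in counts_a.items()} if n else {}
    share_b = {cls: k / n for cls, k in counts_b.items()} if n else {}
    return n, share_a, share_b


# Hypothetical labels for five pixels from two classifiers.
mlc = ["Water", "Agriculture_1", "Agriculture_1", "Agriculture_2", "Bareground"]
fcm = ["Water", "Bareground", "Bareground", "Bareground", "Bareground"]
n_diff, mlc_share, fcm_share = difference_summary(mlc, fcm)
```

Applied to the real label maps, the first returned value would correspond to the reported total of difference pixels and the two dictionaries to the per-class percentages quoted for MLC and FCM.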

4.2. Measurement

4.2.1. Probability/fuzzy membership

FCM produces a fuzzy membership vector for each pixel. This fuzzy membership can be taken as the area proportion within a pixel (Bastin et al., 2002). It is possible for pixels to have the same class type, but their posterior probabilities or fuzzy memberships could be different. Hence, the probability vector or fuzzy membership is normally used as a measure of uncertainty on a pixel scale. The posterior probability and fuzzy membership of Water and Agriculture_2 are illustrated in Figs. 7 and 8, respectively. The uncertainty in spatial distributions can be clearly observed in Figs. 7 and 8. In Fig. 7, pixels belonging to the Water class have larger posterior probability or fuzzy membership, thereby indicating that these pixels have smaller uncertainties. While there are insignificant variations between Fig. 7a and b there are, however, noticeable differences between Fig. 8a and b, especially for the class, Agriculture_2.
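For readers unfamiliar with how a fuzzy membership vector arises, the sketch below applies the standard fuzzy c-means membership formula to one pixel, given fixed class centers. The paper does not state which FCM variant, distance measure or fuzzifier was used, so the Euclidean distance, the fuzzifier `m = 2` and the two-band example values are assumptions for illustration only.

```python
import math


def fcm_memberships(pixel, centers, m=2.0):
    """Standard fuzzy c-means memberships of one pixel for given class centers.

    u_i = d_i^(-2/(m-1)) / sum_j d_j^(-2/(m-1)), with d_i the Euclidean
    distance from the pixel to center i. The memberships sum to 1.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [math.dist(pixel, center) for center in centers]
    # A pixel lying exactly on one or more centers gets full membership there.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [value / total for value in inverse]


# Hypothetical two-band pixel and two class centers.
u = fcm_memberships([1.0, 0.0], [[0.0, 0.0], [4.0, 0.0]])
```

Here the pixel is three times closer to the first center than to the second, so its membership in the first class is nine times larger.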

4.2.2. Shannon entropy (S.E.)

Entropy is a measure of uncertainty and information formulated in terms of probability theory, which expresses the relative support associated with mutually exclusive alternative classes (Foody, 1996; Brown et al., 2009). Shannon’s entropy (Shannon, 1948), applied to the measurement of the quality of a remote sensing image, is defined as the amount of information required to determine that a pixel belongs completely to one category. It expresses the overall uncertainty information of the probability or membership vector, and all elements of the vector are used in its calculation (Maselli et al., 1994). Therefore, it is an appropriate method to measure the uncertainty for each pixel in classified remotely sensed images (van der Wel et al., 1998). The use of S.E. in this paper is described as follows.

Given U as the universe of discourse in the remotely sensed imagery, U contains all pixels in this image and is partitioned by $\{X_1, X_2, \ldots, X_n\}$, where $n$ is the number of classes. The probability of each partition is denoted as $p_i = P(X_i)$, giving the S.E. as

$$H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i \qquad (1)$$

where $H(X)$ is the information entropy of the information source. When $p_i = 0$, the corresponding term is taken as $0 \log_2 0 = 0$ (Zhang et al., 2001; Liang and Li, 2005). It is accepted that $0 \le H(X) \le \log_2 n$.
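Eq. (1) and its bounds can be illustrated with a short Python sketch. The function name and the example vectors are hypothetical; the convention that zero-probability terms contribute nothing follows the text above.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (base 2) of a probability or membership vector, as in Eq. (1)."""
    if any(x < 0 for x in p) or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("the vector must be non-negative and sum to 1")
    # Terms with p_i = 0 are skipped, which applies the convention 0 log 0 = 0.
    return sum(-x * math.log2(x) for x in p if x > 0)


# A pixel assigned entirely to one of six classes: lower bound, H = 0.
h_certain = shannon_entropy([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
# A pixel shared equally among six classes: upper bound, H = log2(6).
h_uniform = shannon_entropy([1 / 6] * 6)
```

The two examples reproduce the limits stated above: zero entropy for a pixel that belongs completely to one class, and the maximum of log2 n when all n classes are equally supported.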

On the basis of Bayesian decision rules, MLC determines the class type of every pixel according to its maximum posterior probability in a probability vector. The classification process could, therefore, be associated with uncertainty. S.E. derived from a probability vector or fuzzy membership vector can represent the variation in the posterior probability vector and can be taken as one of the measures on a pixel scale (van der Wel et al., 1998). Similarly, applying FCM to remotely sensed data can produce fuzzy membership values in all land cover classes. Hence, fuzzy membership can also be considered as a measure of uncertainty on a pixel scale. On the basis of the fuzzy membership of each pixel the corresponding S.E. can be calculated. To compare the MLC and FCM methods it is necessary to normalize the computed S.E. values.

Fig. 8. Agriculture_2 class: (a) posterior probability from MLC; (b) fuzzy membership from FCM.

Fig. 9.

From Eq. (1) S.E. can be calculated through posterior probability and fuzzy memberships. Fig. 9a and b displays normalized S.E. values of classified pixels from MLC and FCM, respectively. When the grey value of a pixel is zero then the uncertainty is zero, and when the grey value is 255 then the uncertainty will be at the maximum value of 1. From Fig. 9a and b it can be observed that the classes of Water and Bottomland have lower uncertainties while the classes of Bareground, Agriculture_1 and Agriculture_2 have higher uncertainties. These results emphasize that S.E. values calculated from MLC are comparatively higher than those obtained from FCM. Obviously, FCM produces more information about the end members within a pixel than MLC.
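The normalization and grey-value mapping described above can be sketched as follows. The paper does not spell out the normalization, so dividing by the maximum entropy log2 n is an assumption, as are the function names and the linear, rounded mapping to grey values.

```python
import math


def normalized_entropy(p):
    """Shannon entropy divided by its maximum log2(n), giving a value in [0, 1].

    Dividing by log2(n) is an assumed normalization, not confirmed by the paper.
    """
    n = len(p)
    if n < 2:
        raise ValueError("at least two classes are needed")
    h = sum(-x * math.log2(x) for x in p if x > 0)
    return h / math.log2(n)


def to_grey(uncertainty):
    """Map a normalized uncertainty in [0, 1] linearly to a grey value in 0..255."""
    clamped = min(max(uncertainty, 0.0), 1.0)
    return round(255 * clamped)


grey_certain = to_grey(normalized_entropy([1.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
grey_uniform = to_grey(normalized_entropy([1 / 6] * 6))
```

With this mapping a pixel with no uncertainty is rendered black and a pixel with maximum uncertainty is rendered white, matching the description of Fig. 9.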

4.2.3. Degree of uncertainty

While S.E. provides information on the uncertainties of pixels it is, however, known that when there is a large range of greyscale values representing brightness values [0, 255] the subtle differences in greyscales are not easily discernible to humans. Hence, there are difficulties in differentiating the degree of uncertainty. To overcome this problem the S.E. is, therefore, discretized equidistantly. For instance, the S.E. is discretized into the following intervals: 0.00, (0.00, 0.20], (0.20, 0.40], (0.40, 0.60], (0.60, 0.80], (0.80, 1.00]. Measurements falling into the same interval have the same degree of uncertainty. By assigning a color to each degree of uncertainty, a pixel-based uncertainty visualization is produced (see Fig. 10a and b). This discretization clearly highlights the degrees of uncertainty in the classified remotely sensed image. In Fig. 10a, representing MLC, the degrees of uncertainty for most of the pixels are 0, 1 and 2 while in Fig. 10b, associated with FCM, the degrees of uncertainty for most of the pixels are 1, 2 and 3 and occasionally 4. It is worthwhile to note that the interval map of S.E. permits a comparison of the degrees of uncertainty in classified results from different classifiers.
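The equidistant discretization above can be written as a small lookup. In this sketch the degrees are numbered 0 to 5, with degree 0 reserved for an S.E. of exactly zero; the function name and the numbering are assumptions consistent with the intervals listed in the text.

```python
from bisect import bisect_left

# Upper bounds of the half-open intervals (0.00, 0.20], ..., (0.80, 1.00].
UPPER_BOUNDS = [0.20, 0.40, 0.60, 0.80, 1.00]


def degree_of_uncertainty(se):
    """Map a normalized S.E. value in [0, 1] to a degree of uncertainty 0..5."""
    if not 0.0 <= se <= 1.0:
        raise ValueError("normalized S.E. must lie in [0, 1]")
    if se == 0.0:
        return 0
    # bisect_left returns the index of the first bound >= se, so a value equal
    # to an upper bound stays in the interval that the bound closes.
    return 1 + bisect_left(UPPER_BOUNDS, se)


degrees = [degree_of_uncertainty(v) for v in (0.0, 0.2, 0.21, 0.5, 1.0)]
```

Comparing against the listed bounds directly, rather than multiplying by the number of intervals, avoids rounding surprises at the interval edges.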

5. PCP and brushing

The PCP is useful for visually exploring the degree of dispersion or aggregation of the DN values of pixels in each band, and can be conveniently used to investigate the reasons contributing to uncertainty. With the brushing operation, the pixels of interest could be selected and highlighted.

5.1. PCP

From the MLC results two sets of pixels have been randomly selected, one from the Water class and one from the Agriculture_2 class, to be represented on a PCP. The Parbat software (Lucieer, 2004; Lucieer and Kraak, 2004) is used to produce a PCP of Water (see Fig. 11) and Agriculture_2 (see Fig. 12) classified by MLC. The first six axes are posterior probabilities of the six land cover types and the last two axes are the S.E. and degree of uncertainty. For instance, Fig. 12 illustrates the uncertainties of the pixels selected from the class type of Agriculture_2 and their distributions. The posterior probability of Agriculture_2 for each pixel is higher than that of the other categories. Of significance, there is a negative correlation between Agriculture_1 and Agriculture_2. In this example, the S.E. is equally divided into five intervals to obtain the degree of uncertainty. The uncertainties of the pixels selected from Water and Agriculture_2, respectively, and their distributions are illustrated by Figs. 11 and 12.

Figs. 13 and 14 are obtained by placing all the DNs of these pixels, fuzzy memberships, S.E., and associated degree of uncertainty as attribute dimensions on the horizontal axis of the PCP. Similarly, pixels are randomly selected to be represented on the PCP, and the S.E. is divided into five equal intervals to obtain the degree of uncertainty. Fig. 13 highlights the uncertainty characteristics associated with the spectral signatures of pixels from the Water class while Fig. 14 shows the spectral signatures of pixels from the Agriculture_2 class and their related uncertainty characteristics. B.1–B.7 and C.1–C.6 denote the seven bands of the TM image and their fuzzy memberships in the six land cover types, which are Water, Agriculture_1, Agriculture_2, Urban, Bottomland, and Bareground, respectively. The S.E. from the FCM classifier and the degree of uncertainty within the range 0–5 are denoted by ShE and UnL. The red line within bands 1–7 emphasizes pixels with a degree of uncertainty of zero. Different colors denote different degrees of uncertainty in the PCP. From the PCP it is possible to discern the distribution of fuzzy memberships and Shannon entropies of pixels with different degrees of uncertainty.

A comparison of Figs. 11 and 13 reveals that, for MLC, the non-zero posterior probabilities of pixels classified as Water are concentrated in the Bareground class. For FCM (see Fig. 13) most of the classified pixels in the Water class have a fuzzy membership close to 1. The fuzzy memberships of some pixels in the Bareground, Agriculture_1 and Agriculture_2 classes are away from zero. It is, therefore, apparent that pixels with a high degree of uncertainty are mixed pixels in the Water and Bareground classes, namely the boundary pixels between Water and Bareground. As expected, the spectral response for these pixels contains characteristics of both land cover types.

Fig. 13 provides additional information on the spectral characteristics of Water and its uncertainty distribution. For instance, the degrees of dispersion in bands 2, 3, 4 and 5 are high, and the distance to the red line in these four bands is also relatively high. As such, their uncertainties are high. Pixels with a degree of uncertainty of 3, represented by the blue line, are relatively far from pixels with a degree of uncertainty of zero, which are represented by the red line in bands 3 and 4. The fuzzy memberships of pixels for Bareground are in the range 0.2–0.4, thereby giving rise to a high degree of uncertainty. Pixels with a degree of uncertainty of 4, represented by the pink line, are relatively far from those pixels with a degree of uncertainty of zero, as shown by the red line in band 4. There are two independent distribution ranges in band 3. In the case of the Water class pixels with a high degree of uncertainty, assuming their fuzzy membership on Bottomland is high, then the DNs in band 3 will be greater than the DNs of pixels with a degree of uncertainty of zero.

Fig. 10. Degree of uncertainty: (a) derived from MLC; (b) derived from FCM.

Fig. 11. PCP: posterior probability, Shannon entropy and uncertainty of pixels in Water classified from MLC.

Fig. 12. PCP: posterior probability, Shannon entropy and uncertainty of pixels in Agriculture_2 classified from MLC.

Fig. 13. Spectral features of the Water class and their uncertainty characteristics.

Fig. 14. Spectral features of pixels from the Agriculture_2 class and their uncertainty characteristics.

Fig. 15. (a) Class of Water: pixels with a low degree of uncertainty in the PCP; (b) class of Water: pixels with a high degree of uncertainty in the PCP.
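The axis layout used for Figs. 13 and 14 (B.1–B.7, C.1–C.6, ShE and UnL) can be represented as one record per pixel. The sketch below only shows how such a 15-attribute record might be assembled; the function name and all numeric values are hypothetical and do not come from the study data or from Parbat.

```python
# Axis names in the order described in the text: seven TM bands, six fuzzy
# memberships, then the Shannon entropy and the degree of uncertainty.
AXES = [f"B.{i}" for i in range(1, 8)] + [f"C.{i}" for i in range(1, 7)] + ["ShE", "UnL"]


def pcp_record(dns, memberships, entropy, degree):
    """Combine one pixel's attributes into a dict keyed by PCP axis name."""
    if len(dns) != 7 or len(memberships) != 6:
        raise ValueError("expected 7 band DNs and 6 class memberships")
    values = list(dns) + list(memberships) + [entropy, degree]
    return dict(zip(AXES, values))


# Hypothetical pixel: DNs for bands 1-7, memberships for the six classes.
record = pcp_record(
    dns=[72, 30, 28, 15, 9, 120, 5],
    memberships=[0.80, 0.02, 0.03, 0.00, 0.05, 0.10],
    entropy=0.41,
    degree=3,
)
```

Keeping the axis order in one list makes it easy to draw every polyline with the same left-to-right sequence of attributes.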

A comparison of Figs. 12 and 14 reveals that the classes of Agriculture_1 and Bareground contribute the highest uncertainty to Agriculture_2. The difference between the PCPs from MLC and FCM arises because, in MLC, the influence of Agriculture_1 on Agriculture_2 is greater than the influence of Bareground on Agriculture_2. For FCM the two influences on Agriculture_2 are almost equal. When the fuzzy memberships of pixels for Agriculture_1 are greater, their DNs in bands 4 and 5 are lower than those of pixels with a degree of uncertainty of zero. The DNs in all bands are dispersed for pixels with large fuzzy membership for Bareground.
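One way to quantify which classes influence a target class, as read visually from the PCPs above, is to tally the class holding the second-largest probability or membership for each pixel. This is an illustrative sketch rather than the authors' procedure; the function names and the three example vectors are hypothetical.

```python
from collections import Counter

CLASSES = ["Water", "Agriculture_1", "Agriculture_2", "Urban", "Bottomland", "Bareground"]


def runner_up(vector):
    """Return the class with the second-largest value in a 6-element vector."""
    order = sorted(range(len(vector)), key=lambda i: vector[i], reverse=True)
    return CLASSES[order[1]]


def confusing_classes(vectors):
    """Tally the runner-up class over a set of pixels."""
    return Counter(runner_up(v) for v in vectors)


# Hypothetical membership vectors of three pixels classified as Agriculture_2.
pixels = [
    [0.05, 0.30, 0.55, 0.00, 0.00, 0.10],
    [0.00, 0.10, 0.50, 0.00, 0.05, 0.35],
    [0.00, 0.35, 0.60, 0.00, 0.00, 0.05],
]
tally = confusing_classes(pixels)
```

Running the same tally on MLC posterior vectors and on FCM membership vectors would give a numerical counterpart to the visual comparison of the two PCPs.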

5.2. Brushing

From Figs. 11 and 14 it is demonstrated that when different colors are used to represent different degrees of uncertainty, a certain amount of overlap develops between color lines, especially on the axes where the distribution of polylines is concentrated. The superposition of polylines definitely increases the difficulty in visual uncertainty analysis. A new “visual” uncertainty is, thereby, introduced. To improve on this visualization the brushing operation is introduced to the PCP (Hauser et al., 2002; Ledermann, 2003). The difference from a conventional approach is that the user selects pixels of interest, and then highlights them with a brush, instead of using colored polylines. This is a suitable and convenient method for conducting targeted analysis on the spectral characteristics of pixels and their associated uncertainty.

Brushing is applied to a set of pixels with a low degree of uncertainty (see Fig. 15a) and a set of pixels with a high degree of uncertainty (see Fig. 15b). The pixels targeted for investigation belong to the class type, Water. A green polyline denotes the pixels being brushed, while a grey polyline represents the distribution characteristics of all pixels belonging to the class type, Water. The red line for bands 1–7 represents pixels with a zero degree of uncertainty. In Fig. 15a, there is a strong negative correlation between C.5 (Bottomland) and C.6 (Bareground). This means that the larger the memberships of pixels in the class of Bottomland, the smaller the memberships of pixels in the class of Bareground. To further investigate the class type, Agriculture_2, PCP and brushing are used to visualize pixels with low uncertainty (see Fig. 16a) and pixels with high uncertainty (see Fig. 16b). From the figures it is noticeable that when PCP is combined with brushing the user could focus on the spectral characteristics and the uncertainty distribution of pixels of interest. This effectively reduces the uncertainty introduced by the visualization of the remotely sensed data.
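At its simplest, the brushing operation described above is a range selection on one axis that splits the polylines into a highlighted set and a context set. The sketch below assumes each pixel is stored as a dict keyed by axis name; the function name, the records and the brush range are hypothetical and do not reproduce the Parbat interface.

```python
def brush(records, axis, low, high):
    """Split records into (selected, context) by a closed range on one axis."""
    selected, context = [], []
    for rec in records:
        if low <= rec[axis] <= high:
            selected.append(rec)   # would be drawn highlighted
        else:
            context.append(rec)    # would be drawn in grey as background
    return selected, context


# Hypothetical Water-class pixels with two membership axes and two uncertainty axes.
water_pixels = [
    {"C.5": 0.02, "C.6": 0.05, "ShE": 0.10, "UnL": 1},
    {"C.5": 0.30, "C.6": 0.10, "ShE": 0.55, "UnL": 3},
    {"C.5": 0.05, "C.6": 0.35, "ShE": 0.70, "UnL": 4},
]
high_uncertainty, rest = brush(water_pixels, "UnL", 3, 5)
```

Drawing `rest` first in grey and `high_uncertainty` on top in a single highlight color avoids the overlap of many colored polylines noted in Section 5.2.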

Fig. 16. (a) Class of Agriculture_2: pixels with a low degree of uncertainty in the PCP; (b) class of Agriculture_2: pixels with a high degree of uncertainty in the PCP.

6. Discussion and conclusion

A major unresolved problem in image processing is how to identify and visualize the uncertainty arising out of the classification of remotely sensed data. Without doubt, targeting uncertainties will not only permit better visualization of features in geographical space but also enhance the capabilities of policymakers who have to make reliable decisions on a broad range of geospatial issues. This paper has demonstrated the effectiveness of the combined PCP and brushing operation to explore and visualize the uncertainties in remotely sensed data classified with MLC and FCM.

The MLC and FCM results demonstrate that Water class pixels with a high degree of uncertainty have a high posterior probability or fuzzy membership for the class type Bottomland. Furthermore, some Agriculture_2 class pixels with a high degree of uncertainty have a high fuzzy membership for the Bareground class type. Therefore, it is possible to compare the pixel distribution for Water and Bottomland, or Agriculture_2 and Bareground, and to analyze further the possible causes of the uncertainty by investigating the spectral characteristics for all bands. Essentially, it is necessary to use the probability vector and fuzzy membership vector of each pixel to compute the S.E. The degree of uncertainty of each pixel can then be represented on a PCP. As illustrated in this paper, two axes on the PCP represent Shannon’s entropy and the degree of uncertainty.
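The computation of S.E. from a per-pixel probability or fuzzy membership vector can be sketched as follows. This is a minimal illustration, assuming the standard form of Shannon's entropy with a base-2 logarithm; the normalization to the range [0, 1] is an additional assumption introduced here for convenience and is not necessarily the definition of the degree of uncertainty used in this paper. The function names and example vectors are hypothetical.

```python
from math import log2

def shannon_entropy(probs):
    """Shannon's entropy (in bits) of a probability or membership vector.

    Zero entries are skipped, following the convention 0 * log(0) = 0.
    """
    return sum(-p * log2(p) for p in probs if p > 0)

def normalized_entropy(probs):
    """Entropy divided by its maximum, log2(number of classes).

    Assumption: this scaling is only one possible way to map entropy
    onto a 0-1 uncertainty axis.
    """
    n = len(probs)
    if n < 2:
        return 0.0
    return shannon_entropy(probs) / log2(n)

# Hypothetical vectors for a four-class problem.
certain_pixel = [1.0, 0.0, 0.0, 0.0]        # all weight on one class
ambiguous_pixel = [0.25, 0.25, 0.25, 0.25]  # weight spread evenly

low = normalized_entropy(certain_pixel)
high = normalized_entropy(ambiguous_pixel)
```

A pixel assigned entirely to one class yields an entropy of zero, whereas a pixel whose probabilities are spread evenly over all classes yields the maximum value. Values of this kind are what an entropy axis on a PCP would display for each polyline.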

The PCP technique is also advantageous for highlighting the distribution of probability values of different land covers of each pixel, and also reflects the status of pixels with different degrees of uncertainty. Moreover, a PCP can be produced for the spectral characteristics of sample data and uncertainty attributes of classified data. The class type of the sample data can be included in the PCP to evaluate the quality of the data. In addition, the sample data can then be compared to the classified data to evaluate whether the sample data are a reasonable reflection of the spectral characteristics of all bands. The identification of any dissimilarities or uncertainties is a definite indication of improvement in the visualization process. This paper demonstrates that there could be enhancements in PCP visualization with the addition of the brushing operation. Rather than relying on color polylines, as done with previous approaches, brushing permits the user to select pixels of interest and highlight them directly. Evidently, brushing facilitates targeted analysis of the spectral characteristics of pixels and any associated uncertainty. It could, therefore, be concluded that the integration of PCP with the brushing operation is beneficial for not only visualizing uncertainty but also gaining insights on the spectral characteristics and attribute information of pixels of interest. By interacting with the PCP through the brushing operation it is possible to conduct an exploration of uncertainty, even at the sub-pixel level.



Acknowledgements

This research received partial support from the National Natural Science Foundation of China (Grant No. 40671136) and the National High Technology Research and Development Program of China (Grant No. 2006AA120106).

References

Andrienko, G., Andrienko, N., 2004. Parallel coordinates for exploring properties of subsets CMV. In: Roberts, J. (Ed.), Proceedings International Conference on Coordinated & Multiple Views in Exploratory Visualization, London, England, July 13, 2004, pp. 93–104.

Bastin, L., Fisher, P.F., Wood, J., 2002. Visualizing uncertainty in multi-spectral remotely sensed imagery. Computers & Geosciences 28, 337–350.

Brown, K.M., Foody, G.M., Atkinson, P.M., 2009. Estimating per-pixel thematic uncertainty in remote sensing classifications. International Journal of Remote Sensing 30, 209–229.

Dungan, J.L., Kao, D., Pang, A., 2002. The uncertainty visualization problem in remote sensing analysis. In: Proceedings of IEEE International Geoscience and Remote Sensing Symposium, Toronto, Canada, June 2, 2002, pp. 729–731.

Edsall, R.M., 2003. The parallel coordinate plot in action: design and use for geographic visualization. Computational Statistics and Data Analysis 43, 605–619.

Foody, G.M., Atkinson, P.M., 2002. Uncertainty in Remote Sensing and GIS. Wiley Blackwell, London.

Foody, G.M., 1996. Approaches for the production and evaluation of fuzzy land cover classifications from remotely-sensed data. International Journal of Remote Sensing 17, 1317–1340.

Gahegan, M., Takatsuka, M., Wheeler, M., and Hardisty, F., 2002. Introducing geoVISTA studio: an integrated suite of visualization and computational methods for exploration and knowledge construction in geography. Computers, Environment and Urban Systems 26, 267–292.

Ge, Y., Li, S.P., 2008a. Visualizing uncertainty in classified remote sensing images. Geo-information Science 10, 88–96 (in Chinese, with English Abstr.).

Ge, Y., Li, S.P., 2008b. Parallel coordinate plot: an application to visualize the uncertainty in thematic classifications of remotely sensed imagery. In: Proceedings of IEEE International Geoscience and Remote Sensing Symposium. Boston, USA, July 6 (two-page abstract).

Guo, D., 2003. Coordinating computational and visualization approaches for interactive feature selection and multivariate clustering. Information Visualization 2, 232–246.

Guo, D., Gahegan, M., MacEachren, A.M., Zhou, B., 2005. Multivariate analysis and geovisualization with an integrated geographic knowledge discovery approach. Cartography and Geographic Information Science 32, 113–132.

Hauser, H., Ledermann, F., Doleisch, H., 2002. Angular brushing for extended parallel coordinates. In: Proceedings of the IEEE Symposium on Information Visualization (InfoVis’02), 2002. IEEE Computer Society, Washington, DC, USA, pp. 127–130.

Ibrahim, M.A., Arora, M.K., Ghosh, S.K., 2005. Estimating and accommodating uncertainty through the soft classification of remote sensing data. International Journal of Remote Sensing 26, 2995–3007.

Inselberg, A., 1985. The plane with parallel coordinates. Computational Geometry of the Visual Computer 1, 69–97.

Inselberg, A., Dimsdale, B., 1990. Parallel coordinates: a tool for visualizing multi-dimensional geometry. In: Proceedings of Visualization ’90, San Diego, CA, IEEE Computer Society, Los Alamitos, CA, pp. 361–378.

Inselberg, A., 2009. Parallel Coordinates: Visual Multidimensional Geometry and Its Applications. Springer, New York.

Jang, D.H., Yang, S.J., 1996. The dynamic parallel coordinate plot and its applications. Korean Journal of Applied Statistics 9, 45–52 (in Korean).

Kreuseler, M., 2000. Visualization of geographically related multidimensional data in virtual 3D scenes. Computers & Geosciences 26, 101–108.

Ledermann, F., 2003. Parvis—A Tool For Parallel Coordinates (PC) Visualisation of Multidimensional Data Sets. http://home.subnet.at/flo/mv/parvis/.

Liang, J.Y., Li, D.Y., 2005. Uncertainty and Knowledge Acquisition in the Information System. Science Press, Beijing, China (in Chinese).

Lucieer, A., 2004. Uncertainties in Segmentation and their Visualization, Ph.D. Dissertation. International Institute for Geo-information Science and Earth Observation, Enschede, The Netherlands, http://parbat.lucieer.net/.

Lucieer, A., Kraak, M.J., 2004. Interactive and visual fuzzy classification of remotely sensed imagery for exploration of uncertainty. International Journal of Geographical Information Science 18, 491–512.

Maselli, F., Conese, C., Petkov, L., 1994. Use of probability entropy for the estimation and graphical representation of the accuracy of maximum likelihood classifications. ISPRS Journal of Photogrammetry and Remote Sensing 49, 13–20.

Shannon, C.E., 1948. A mathematical theory of communication. Bell System Technical Journal 27 (379–423), 623–656.

Shi, W.Z., Ehlers, M., 1996. Determining uncertainties and their propagation in dynamic change detection based on classified remotely-sensed images. International Journal of Remote Sensing 17, 2729–2741.

Shneiderman, B., 2004. Designing The User Interface: Strategies for Effective Human–Computer Interaction, 4th edition. Addison Wesley.

Siirtola, H., Räihä, K.-J., 2006. Interacting with parallel coordinates. Interacting with Computers 18, 1278–1309.

van der Wel, F.J.M., van der Gaag, L.C., Gorte, B.G.H., 1998. Visual exploration of uncertainty in remote sensing classification. Computers & Geosciences 24, 335–343.

Zhang, W.X., Liang, J.Y., Wu, W.Z., Li, D.Y., 2001. Theory and Method of Rough Sets. Science Press, Beijing, China (in Chinese).