

Expert Systems with Applications 39 (2012) 1747–1752


A visualization metric for dimensionality reduction

Flora S. Tsai
School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore

Keywords: Visualization; Dimensionality reduction; Metric; Manifold; Multidimensional scaling; Isomap; Locally Linear Embedding

doi:10.1016/j.eswa.2011.08.080

E-mail address: [email protected]

Abstract

Data visualization of high-dimensional data is possible through the use of dimensionality reduction techniques. However, in deciding which dimensionality reduction techniques to use in practice, quantitative metrics are necessary for evaluating the results of the transformation and visualization of the lower dimensional embedding. In this paper, we propose a manifold visualization metric based on the pairwise correlation of the geodesic distance in a data manifold. This metric is compared with other metrics based on the Euclidean distance, Mahalanobis distance, City Block metric, Minkowski metric, cosine distance, Chebychev distance, and Spearman distance. The results of applying different dimensionality reduction techniques on various types of nonlinear manifolds are compared and discussed. Our experiments show that our proposed metric is suitable for quantitatively evaluating the results of the dimensionality reduction techniques if the data lies on an open planar nonlinear manifold. This has practical significance in the implementation of knowledge-based visualization systems and the application of knowledge-based dimensionality reduction methods.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Dimensionality reduction is the search for a small set of features to describe a large set of observed dimensions. Dimensionality reduction is useful in visualizing data, discovering a compact representation, and decreasing computational processing time. In addition, reducing the number of dimensions can separate the important features or variables from the less important ones, thus providing additional insight into the nature of the data that may otherwise be left undiscovered.

When analyzing large data of multiple dimensions, it may be necessary to perform dimensionality reduction (i.e. projection or feature extraction) techniques to transform the data into a smaller, more manageable set. By reducing the data set, we hope to uncover hidden structure that aids in the understanding as well as visualization of the data. Dimensionality reduction techniques such as Principal Component Analysis (PCA) (Pearson, 1901) and multidimensional scaling (MDS) (Cox & Cox, 2000; Davison, 2000; Kruskal & Wish, 1978) have existed for quite some time, but most are capable only of handling data that is inherently linear in nature. Recently, some unsupervised nonlinear techniques for dimensionality reduction such as Locally Linear Embedding (LLE) (Roweis & Saul, 2000), Hessian LLE (HLLE) (Donoho & Grimes, 2003), Isometric Feature Mapping (Isomap) (Tenenbaum, de Silva, & Langford, 2000), Local Tangent Space Alignment (LTSA) (Zhang & Zha, 2005), Kernel PCA (Schölkopf, Smola, & Müller, 1998), diffusion maps (Lafon & Lee, 2006), and multilayer autoencoders (Hinton & Salakhutdinov, 2006) have achieved remarkable results for data that fit certain types of topological manifolds. However, when performing these dimensionality reduction techniques, it is necessary to quantitatively evaluate the results of the lower dimensional embedding. This paper compares some common measures for judging the quality of the low-dimensional embedding results, and proposes a new measure based on the geodesic pairwise correlation distance. The manifold visualization metric is shown to have better results if the data lies on a nonlinear manifold.

2. Manifold and topology learning

A manifold is defined as a topological space which is locally Euclidean. Basic categories of manifolds include topological manifolds, differentiable manifolds, Riemannian manifolds, Finsler manifolds, and Lie groups. As a topological space, a manifold can be compact or noncompact, connected or disconnected. A compact space is a space that resembles a closed and bounded subset of Euclidean space $R^n$ in that it is "small" in a certain sense and "contains all its limit points". The modern general definition calls a topological space compact if every open cover of it has a finite subcover. A topological space is said to be connected if it cannot be divided into two disjoint nonempty open sets whose union is the entire space. Equivalently, it cannot be divided into two disjoint nonempty closed sets (since the complement of an open set is closed) (Kirby & Siebenmann, 1977).

A manifold can be classified according to the number of dimensions. A 1-dimensional manifold is often called a curve, while a 2-dimensional manifold is called a surface. 2-manifolds can be classified according to the number of holes they have (e.g., the surface of a ball has zero holes, the surface of a doughnut has one hole, etc.). This means that if two surfaces have the same number of holes, no matter how different they might otherwise look, they are mathematically the same. The two manifolds are thus topologically equivalent; that is, one manifold can be topologically transformed into the other one using a continuous transformation, such as bending, stretching, compressing, or twisting (Devlin, 1997). Higher dimensional manifolds are usually just called n-manifolds. For example, manifolds of dimension n = 3, 4, or 5 are referred to as 3-manifolds, 4-manifolds, and 5-manifolds.

For manifolds definable by a particular choice of atlas, we can have the following groupings: piecewise linear (PL) manifolds, smooth (differentiable) manifolds, and complex manifolds. PL manifolds consist of piecewise linear structures. Smooth (differentiable) manifolds have a globally defined differentiable structure. Complex manifolds possess neighborhoods of the complex n-space. These types of manifolds are not necessarily mutually exclusive. For example, smooth manifolds have PL structures. For PL manifolds, linear or local linear dimensionality reduction techniques can be used to recover their lower dimensional structure. For smooth manifolds, it depends on the actual topology. In this paper, we focus on a particular type of smooth manifold, called the open planar manifold.

2.1. Open planar manifold

An open planar manifold is defined as any open manifold that, when "folded" out, is a planar structure. An example is a piece of paper that is distorted along a curvature. In an open planar structure, the surface can be nonlinear, but when extended or smoothed out, it is a two-dimensional planar surface. This is also known as an open isometric embedding. Examples of manifolds with open planar structures (S-Curve and Swiss Roll) are shown in Fig. 1. The unique characteristic of this type of manifold is that it is topologically equivalent to a flat plane, and, because of this, it is very suitable for lower-dimensional embeddings. Manifold learning algorithms such as the Isomap (Tenenbaum et al., 2000) algorithm should perform very well for manifolds that are open planar. Another manifold learning algorithm, the Locally Linear Embedding (LLE) algorithm (Roweis & Saul, 2000), can also perform well for open planar structures, but only if the surface is a smooth curve and the neighborhood chosen is small enough. LLE falls into the general category of local linear transformations (Kirby, 2000). Therefore, manifold learning algorithms such as LLE and Isomap should be able to handle manifolds with the open planar topology, given the right choice of parameters. When we need to examine nonlinear data that lie on surfaces that are not open planar, the performance of Isomap and LLE may be compromised.

Fig. 1. Examples of open planar manifolds.
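For readers who wish to reproduce such manifolds, the two surfaces in Fig. 1 can be sampled synthetically. The following is a minimal illustrative sketch in Python, assuming scikit-learn's built-in samplers (the paper's own experiments used the Matlab Toolbox for Dimensionality Reduction instead):

```python
# Illustrative sketch: sampling the two open planar manifolds of Fig. 1.
# Assumes scikit-learn's built-in samplers; the paper's experiments used
# the Matlab Toolbox for Dimensionality Reduction instead.
from sklearn.datasets import make_s_curve, make_swiss_roll

# Each sampler returns N points in R^3 together with a 1-D coordinate t
# along the "unrolled" direction, reflecting the underlying planar structure.
X_s, t_s = make_s_curve(n_samples=2000, random_state=0)
X_roll, t_roll = make_swiss_roll(n_samples=2000, random_state=0)

print(X_s.shape, X_roll.shape)  # (2000, 3) (2000, 3)
```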

2.2. Topology learning

High-dimensional data exploration of manifolds relies on topology learning (Aupetit, 2006) to visually represent the topological manifolds. Although unsupervised manifold learning algorithms have achieved remarkable results for data that fit certain types of topological manifolds, many of the manifold learning techniques perform poorly on nonlinear data with somewhat different topologies (Saul & Roweis, 2003). In addition, the techniques tend to be extremely sensitive to noise. Thus, effective classification of data topologies is critical for choosing the ideal techniques and parameters for nonlinear dimensionality reduction. The conditions to ensure provably correct topology with respect to the data may depend on a number of factors, especially when faced with noisy, multi-scale, multidimensional or incomplete datasets. Therefore, learning the topologies of different datasets (Aupetit, 2006) can significantly aid in the selection of optimal learning algorithms for data which may fit the various nonlinear topologies. We try to solve these problems by first comparing and assessing the quality of lower dimensional representations using different visualization metrics, and propose a metric that is able to outperform other metrics for a variety of nonlinear manifolds.

3. Visualization metrics

In this section, we describe different visualization metrics for evaluating the quality of visualization results. The goal of visualization is to represent the intrinsic structure of the input data as faithfully as possible. Many previous studies (Tsai & Chan, 2007, 2009) use visual inspection to assess the quality of dimensionality reduction techniques, which can only be used to visualize at most three dimensions. In our study, we used different distance functions to compute the pairwise distances between points in the original data, and computed the correlation coefficient between these and the pairwise Euclidean distances between points in the two-dimensional embedding results. The results were compared across different dimensionality reduction techniques to assess which metric performed best. The programs were run using the Matlab Toolbox for Dimensionality Reduction (Maaten, 2007).
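The evaluation procedure just described can be expressed compactly; the following is an illustrative Python/SciPy sketch, not the Matlab code actually used, and the function name visualization_score is our own:

```python
# Illustrative sketch of the evaluation procedure described above:
# correlate the pairwise distances of the original data (under a chosen
# metric) with the pairwise Euclidean distances of the 2-D embedding.
import numpy as np
from scipy.spatial.distance import pdist

def visualization_score(X_high, Y_low, metric="euclidean"):
    """Pearson correlation between the two condensed distance vectors."""
    d_high = pdist(X_high, metric=metric)      # distances in the input space
    d_low = pdist(Y_low, metric="euclidean")   # distances in the embedding
    return np.corrcoef(d_high, d_low)[0, 1]
```

A call such as visualization_score(X, Y, metric="chebyshev") then corresponds, in spirit, to one row of the tables in Section 4.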

3.1. Euclidean distance metric

A distance metric based on the Euclidean distance was used in Geng, Zhan, and Zhou (2005), which was based on the correlation coefficient between the distance vectors of the true structure and that of the recovered structure. The rationale is that when the distances between all pairs of data points are simultaneously changed by a linear transformation, the relationship of the points will not change. Therefore, the correlation coefficient can provide a good measurement of the validity of the visualization procedure.

The correlation coefficient $\rho_{X,Y}$ between two random variables $X$ and $Y$ with expected values $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$ is defined as:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E\big((X-\mu_X)(Y-\mu_Y)\big)}{\sigma_X \sigma_Y} \qquad (1)$$

where $E$ is the expected value operator and $\operatorname{cov}$ means covariance. The correlation is defined only if both of the standard deviations are finite and both of them are nonzero. The correlation is 1 in the case of an increasing linear relationship, $-1$ in the case of a decreasing linear relationship, and some value in between in all other cases, indicating the degree of linear dependence between the variables. The closer the coefficient is to either $-1$ or 1, the stronger the correlation between the variables. If the variables are independent then the correlation is 0, but the converse is not true, because the correlation coefficient detects only linear dependencies between two variables.
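As a quick numerical check of Eq. (1) (an illustrative sketch; the sample data are arbitrary):

```python
# Numerical check of Eq. (1): rho = cov(X, Y) / (sigma_X * sigma_Y).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2.0 * x + rng.normal(size=1000)  # an increasing linear relationship

# ddof=1 so that the standard deviations match np.cov's normalization.
rho = np.cov(x, y)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
print(rho, np.corrcoef(x, y)[0, 1])  # identical values, close to +1
```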


Fig. 2. Results of reducing dimension on S-curve, N = 2000.

Table 1
Results of visualization metrics for S-Curve.

Metric        MDS      ISO      LLE      HLLE     LTSA     KPCA      DM       AE
Manifold      0.7938   0.9999   0.5286   0.9003   0.9003   0.4236    0.7022   0.5645
Chebychev     0.9499   0.7705   0.8244   0.8970   0.8974   0.6296    0.9048   0.7706
City Block    0.9160   0.7779   0.8309   0.8725   0.8728   0.5836    0.8656   0.7517
Cosine        0.5835   0.5774   0.5322   0.5374   0.5385   -0.0809   0.5663   0.4932
Euclidean     0.9746   0.8104   0.8645   0.9255   0.9259   0.6339    0.9269   0.7953
Mahalanobis   0.7991   0.7367   0.7606   0.8108   0.8110   0.5275    0.7540   0.6858
Minkowski     0.9746   0.8104   0.8645   0.9255   0.9259   0.6339    0.9269   0.7953
Spearman      0.4868   0.4037   0.4607   0.3884   0.3894   -0.0296   0.4521   0.2831



In this paper, we tried a variety of distance metrics based on the correlation coefficient. The distance metrics differ in the measurement of the pairwise correlation between all distances in the initial manifold.

For the Euclidean distance, the distance is defined as

$$D_e = \sqrt{a^2 + b^2 + c^2 + \cdots} \qquad (2)$$

3.2. City Block metric

If one is walking between opposite corners of a block filled with buildings, the sidewalk distance is given by the simple sum of adjacent sides of the block. This distance rule is aptly named the "city-block" metric:

$$D_{cb} = a + b + c + \cdots \qquad (3)$$

where $D$ is the direct distance and $a$ and $b$ are the distances along the coordinate axes.

3.3. Minkowski distance metric

Both Euclidean and City Block are special cases of the Minkowski metric:

$$D_{\min} = \left(a^N + b^N + c^N + \cdots\right)^{1/N} \qquad (4)$$

where N = 1 for the city-block metric and N = 2 for the Euclidean metric.
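A small sketch confirming these special cases (illustrative Python; the test vectors are arbitrary):

```python
# Eqs. (2)-(4): the Minkowski distance with N = 1 (City Block) and
# N = 2 (Euclidean) as special cases.
import numpy as np

def minkowski(p, q, N):
    """Eq. (4) with a, b, c, ... the absolute coordinate differences."""
    return np.sum(np.abs(p - q) ** N) ** (1.0 / N)

p, q = np.array([1.0, 2.0, 3.0]), np.array([4.0, 6.0, 3.0])
print(minkowski(p, q, 1))   # 7.0 -- City Block, Eq. (3)
print(minkowski(p, q, 2))   # 5.0 -- Euclidean, Eq. (2)
print(np.linalg.norm(p - q, ord=1), np.linalg.norm(p - q))  # same values
```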

3.4. Cosine distance metric

The cosine distance metric calculates the cosine of the angle between the two distance vectors. By the usual procedure for finding the angle between two vectors, the uncentered cosine distance is:

$$D_{\cos} = \cos\theta = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} \qquad (5)$$
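An illustrative sketch of Eq. (5); note that some libraries (e.g. SciPy's 'cosine' metric) instead report $1 - \cos\theta$:

```python
# Eq. (5): the uncentered cosine of the angle between two vectors.
import numpy as np

def uncentered_cosine(x, y):
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

x, y = np.array([1.0, 0.0]), np.array([1.0, 1.0])
print(uncentered_cosine(x, y))  # cos(45 degrees), about 0.7071
```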

3.5. Mahalanobis distance metric

Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on correlations between variables by which different patterns can be identified and analyzed. It is a useful way of determining the similarity of an unknown sample set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set and is scale-invariant, i.e. not dependent on the scale of measurements.


Fig. 3. Results of reducing dimension on Swiss Roll, N = 2000.

Table 2
Results of visualization metrics for Swiss Roll.

Metric        MDS      ISO      LLE      HLLE     LTSA     KPCA      DM       AE
Manifold      0.2672   0.9999   0.3906   0.7672   0.7673   0.1818    0.3452   -0.0167
Chebychev     0.8120   0.2308   0.2969   0.3851   0.3861   0.4285    0.7879   0.6036
City Block    0.8209   0.2639   0.4044   0.4901   0.4909   0.4644    0.7956   0.6016
Cosine        0.7335   0.1082   0.1843   0.1824   0.1830   -0.0788   0.6688   0.7762
Euclidean     0.8535   0.2599   0.3760   0.4676   0.4685   0.4729    0.8298   0.6317
Mahalanobis   0.7986   0.2444   0.4215   0.5055   0.5063   0.4871    0.7692   0.6359
Minkowski     0.8535   0.2599   0.3760   0.4676   0.4685   0.4729    0.8298   0.6317
Spearman      0.3807   0.1806   0.1298   0.2553   0.2562   -0.1039   0.3135   0.4376



Formally, the Mahalanobis distance from a group of values with mean $\mu = (\mu_1, \mu_2, \mu_3, \ldots, \mu_p)^T$ and covariance matrix $\Sigma$ for a multivariate vector $x = (x_1, x_2, x_3, \ldots, x_p)^T$ is defined as:

$$D_M(x) = \sqrt{(x - \mu)^T \Sigma^{-1} (x - \mu)} \qquad (6)$$

Mahalanobis distance can also be defined as a dissimilarity measure between two random vectors $\vec{x}$ and $\vec{y}$ of the same distribution with the covariance matrix $\Sigma$:

$$d(\vec{x}, \vec{y}) = \sqrt{(\vec{x} - \vec{y})^T \Sigma^{-1} (\vec{x} - \vec{y})} \qquad (7)$$

If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, then the resulting distance measure is called the normalized Euclidean distance:

$$d(\vec{x}, \vec{y}) = \sqrt{\sum_{i=1}^{p} \frac{(x_i - y_i)^2}{\sigma_i^2}} \qquad (8)$$

where $\sigma_i$ is the standard deviation of the $x_i$ over the sample set.
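A sketch verifying Eqs. (6)–(8) on toy vectors (illustrative Python; the vectors and covariance matrices are arbitrary):

```python
# Eqs. (6)-(8): Mahalanobis distance, reducing to the Euclidean distance
# for an identity covariance and to the normalized Euclidean distance
# for a diagonal covariance.
import numpy as np

def mahalanobis(x, y, cov):
    d = x - y
    return np.sqrt(d @ np.linalg.inv(cov) @ d)   # Eq. (7)

x, y = np.array([1.0, 2.0]), np.array([3.0, 5.0])
print(mahalanobis(x, y, np.eye(2)))              # identity Sigma
print(np.linalg.norm(x - y))                     # plain Euclidean: same value

sigma2 = np.array([4.0, 9.0])                    # diagonal Sigma
print(mahalanobis(x, y, np.diag(sigma2)))        # Eq. (8) ...
print(np.sqrt(np.sum((x - y) ** 2 / sigma2)))    # ... computed directly
```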

3.6. Spearman distance metric

Spearman's rank correlation coefficient, or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter $\rho$ or as $r_s$, is a non-parametric measure of correlation; that is, it assesses how well an arbitrary monotonic function could describe the relationship between two variables, without making any assumptions about the frequency distribution of the variables.

In principle, $\rho$ is simply a special case of the Pearson product-moment coefficient in which two sets of data $X_i$ and $Y_i$ are converted to rankings $x_i$ and $y_i$ before calculating the coefficient (Myers & Well, 2003). In practice, however, a simpler procedure is normally used to calculate $\rho$. The raw scores are converted to ranks, and the differences $d_i$ between the ranks of each observation on the two variables are calculated.

If there are no tied ranks, i.e.

$$\neg \exists\, i, j:\ i \neq j \wedge (X_i = X_j \vee Y_i = Y_j),$$

then $\rho$ is given by:

$$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \qquad (9)$$

where $d_i = x_i - y_i$ is the difference between the ranks of corresponding values $X_i$ and $Y_i$, and $n$ is the number of values in each data set (the same for both sets).
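An illustrative check of Eq. (9) against a library implementation (the data here are synthetic and tie-free by construction):

```python
# Eq. (9): Spearman's rho from rank differences (no tied ranks),
# checked against SciPy's implementation.
import numpy as np
from scipy.stats import rankdata, spearmanr

rng = np.random.default_rng(1)
X = rng.permutation(20).astype(float)   # distinct values, so no ties
Y = X ** 3 + 5.0                        # monotonic but nonlinear in X

d = rankdata(X) - rankdata(Y)
n = len(X)
rho = 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1))
print(rho, spearmanr(X, Y).correlation)  # both exactly 1.0 here
```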


Fig. 4. Results of reducing dimension on Swiss Roll with Hole, N = 1000.

Table 3
Results of visualization metrics for Swiss Roll with Hole.

Metric        MDS      ISO      LLE      HLLE     LTSA     KPCA      DM       AE
Manifold      0.2790   0.9943   0.7362   0.7827   0.7835   0.2583    0.3332   -0.0460
Chebychev     0.7789   0.2532   0.4734   0.4244   0.4255   0.4681    0.7424   0.5137
City Block    0.7766   0.2789   0.5356   0.5126   0.5140   0.5066    0.7534   0.5183
Cosine        0.6135   0.0763   0.1919   0.1609   0.1612   -0.0915   0.5316   0.2951
Euclidean     0.8148   0.2794   0.5362   0.4998   0.5012   0.5164    0.7863   0.5423
Mahalanobis   0.7693   0.2644   0.5457   0.5141   0.5155   0.5084    0.7383   0.5639
Minkowski     0.8148   0.2794   0.5362   0.4998   0.5012   0.5164    0.7863   0.5423
Spearman      0.2478   0.1323   0.2542   0.2109   0.2119   -0.1092   0.1888   0.0775


3.7. Chebychev distance metric

Chebychev distance (maximum coordinate difference) is a metric defined on a vector space where the distance between two vectors is the greatest of their differences along any coordinate dimension (Abello, Pardalos, & Resende, 2002). It is also known as chessboard distance, since in the game of chess the minimum number of moves needed by a king to go from one square on a chessboard to another equals the Chebyshev distance between the centers of the squares, if the squares have side length one, as represented in 2-D spatial coordinates with axes aligned to the edges of the board (Heijden, Duin, de Ridder, & Tax, 2004).

The Chebyshev distance between two vectors or points $p$ and $q$, with standard coordinates $p_i$ and $q_i$, respectively, is

$$D_{\mathrm{Chebyshev}} = \max_i |p_i - q_i| = \lim_{k \to \infty} \left( \sum_{i=1}^{n} |p_i - q_i|^k \right)^{1/k} \qquad (10)$$

It is an example of an injective metric. In two dimensions, if the points $p$ and $q$ have Cartesian coordinates $(x_1, y_1)$ and $(x_2, y_2)$, their Chebyshev distance is

$$D_{\mathrm{Chess}} = \max(|x_2 - x_1|, |y_2 - y_1|) \qquad (11)$$

Under this metric, a circle of radius $r$, which is the set of points with Chebyshev distance $r$ from a center point, is a square whose sides have length $2r$ and are parallel to the coordinate axes.
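A small sketch of Eqs. (10) and (11), approximating the limit with a large finite $k$ (the choice k = 50 and the test points are arbitrary):

```python
# Eqs. (10) and (11): Chebyshev distance as the maximum coordinate
# difference, approximated by the Minkowski form with a large finite k.
import numpy as np

p, q = np.array([1.0, 5.0]), np.array([4.0, 3.0])
d_cheb = np.max(np.abs(p - q))                       # Eq. (11): max(3, 2)
d_mink = np.sum(np.abs(p - q) ** 50) ** (1.0 / 50)   # k = 50 approximation
print(d_cheb, d_mink)                                # 3.0 and roughly 3.0
```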

3.8. Manifold visualization metric

A manifold visualization metric is proposed based on the correlation coefficient that computes the pairwise geodesic distance vector instead of the Euclidean distance vector. A geodesic distance is the shortest path between two points in a curved space or manifold. For estimating the geodesic distance, we use Dijkstra's algorithm (Dijkstra, 1959) for shortest paths on a weighted graph, instead of Euclidean distances. For a given source vertex (node) in the graph, the algorithm finds the path with the lowest cost (i.e. the shortest path) between that vertex and every other vertex. By computing the geodesic distance on a manifold, the manifold visualization metric should provide better results for data that lies on a nonlinear manifold.
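A minimal sketch of the proposed metric is given below. It assumes, as in Isomap-style practice, that geodesic distances are approximated by Dijkstra shortest paths over a k-nearest-neighbor graph with Euclidean edge weights; the neighborhood size k and the helper name are our own choices, since the paper does not fix these implementation details:

```python
# Sketch of the proposed manifold visualization metric: approximate the
# geodesic distances of the input data by Dijkstra shortest paths on a
# k-nearest-neighbor graph with Euclidean edge weights, then correlate
# them with the pairwise Euclidean distances of the embedding.
import numpy as np
from scipy.sparse.csgraph import dijkstra
from scipy.spatial.distance import pdist
from sklearn.neighbors import kneighbors_graph

def manifold_visualization_metric(X_high, Y_low, k=10):
    G = kneighbors_graph(X_high, n_neighbors=k, mode="distance")
    G = G.maximum(G.T)                   # symmetrize the kNN graph
    geo = dijkstra(G, directed=False)    # all-pairs geodesic estimates
    iu = np.triu_indices_from(geo, k=1)  # same ordering as pdist's output
    d_geo, d_low = geo[iu], pdist(Y_low)
    ok = np.isfinite(d_geo)              # guard against disconnected graphs
    return np.corrcoef(d_geo[ok], d_low[ok])[0, 1]
```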

4. Experiments and results

In order to evaluate and compare the various techniques, experiments were conducted using dimensionality reduction techniques on various nonlinear manifolds. We examined the results of the visualization metrics on different types of nonlinear manifolds.
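The experimental setup can be approximated with scikit-learn stand-ins for several of the compared techniques (an assumption; the paper ran the Matlab Toolbox for Dimensionality Reduction, and diffusion maps and autoencoders are omitted here, with n_neighbors=10 an assumed parameter). The printed score is the Euclidean pairwise-correlation metric of Section 3.1:

```python
# Sketch of the experimental setup: embed an open planar manifold into
# two dimensions with several techniques and score each embedding by the
# correlation between original and embedded pairwise Euclidean distances.
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.datasets import make_s_curve
from sklearn.decomposition import KernelPCA
from sklearn.manifold import MDS, Isomap, LocallyLinearEmbedding

X, _ = make_s_curve(n_samples=2000, random_state=0)
d_high = pdist(X)  # Euclidean distances in the original 3-D space

methods = {
    "MDS": MDS(n_components=2, random_state=0),
    "ISO": Isomap(n_neighbors=10, n_components=2),
    "LLE": LocallyLinearEmbedding(n_neighbors=10, n_components=2),
    "HLLE": LocallyLinearEmbedding(n_neighbors=10, n_components=2,
                                   method="hessian"),
    "LTSA": LocallyLinearEmbedding(n_neighbors=10, n_components=2,
                                   method="ltsa"),
    "KPCA": KernelPCA(n_components=2, kernel="rbf"),
}
for name, model in methods.items():
    Y = model.fit_transform(X)
    print(name, np.corrcoef(d_high, pdist(Y))[0, 1])
```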

4.1. Results of dimensionality reduction for nonlinear manifolds

Fig. 2 shows the lower dimensional embedding results obtained using classical MDS, Isomap, LLE, HLLE, LTSA, Kernel PCA, diffusion maps, and autoencoders on the S-Curve dataset, with 2000 data points. The correlation metric shown in the graph is the manifold correlation distance.

Table 1 shows the other metrics corresponding to the previous graph. The manifold metric is the best at evaluating the quality of the visualization because it showed the highest values in Table 1 for Isomap, HLLE, and LTSA, which correspond to the quality of the visualization results shown in the previous figure.

Fig. 3 shows the lower dimensional embedding results obtained using classical MDS, Isomap, LLE, HLLE, LTSA, Kernel PCA, diffusion maps, and autoencoders on the Swiss Roll dataset, with 2000 data points. The correlation metric shown in the graph is the manifold correlation distance.

Table 2 shows the other metrics corresponding to the previous graph. The manifold metric is the best at evaluating the quality of the visualization because it showed the highest values for Isomap, HLLE, and LTSA, which correspond to the quality of the visualization results shown in the previous figure.

Fig. 4 shows the lower dimensional embedding results obtained using classical MDS, Isomap, LLE, HLLE, LTSA, Kernel PCA, diffusion maps, and autoencoders on the Swiss Roll with Hole dataset, with 1000 data points. The correlation metric shown in the graph is the manifold correlation distance.

Table 3 shows the other metrics corresponding to the previous graph. For this dataset, the manifold metric seemed better than the other metrics at evaluating the quality of the results for the cases of Isomap, HLLE, and LTSA. However, in comparing the results of Isomap, HLLE, and LTSA, the quality produced by HLLE and LTSA appears to be higher than that of Isomap, but the quantitative measures did not agree with this assessment. Therefore, in contrast to the two previous datasets, which are both in the category of open planar manifolds, the manifold metric did not appear as good at evaluating the Swiss Roll with Hole dataset, which is a special case of the open planar manifolds. However, compared to the other metrics, the manifold metric still appears the most consistent overall in evaluating the lower dimensional embedding results.

5. Conclusion

The growth of high-dimensional data creates a need for dimensionality reduction techniques to transform the data into a smaller, more manageable set which can be easily visualized. The goal of visualization is to represent the intrinsic structure of the input data as faithfully as possible. Many previous studies use visual inspection to assess the quality of dimensionality reduction techniques, which can only be used to visualize at most three dimensions. In our study, we used different distance functions to compute the pairwise distances between points in the original data, and computed the correlation coefficient between these and the pairwise Euclidean distances between points in the two-dimensional embedding results. The results were compared across different dimensionality reduction techniques to assess which metric performed best.

Our proposed metric, the manifold visualization metric for dimensionality reduction, was compared to other types of visualization metrics for reducing the dimension of basic types of nonlinear manifolds, and was found to produce a better quantitative measure of visualization results if the data lies on an open planar nonlinear manifold. For the nonlinear manifolds that we explored, the dimensionality reduction techniques that produced the best visualization results overall were Isomap, HLLE, and LTSA, and this also corresponded to the quantitative assessments produced by our manifold visualization metric. Therefore, the manifold visualization metric is a better judge of the quality of lower dimensional embedding results in two dimensions, and can be applied to assess the quality of embedding results for higher dimensional manifolds.

References

Abello, J., Pardalos, P. M., & Resende, M. G. C. (2002). Preface.
Aupetit, M. (2006). Learning topology with the generative Gaussian graph and the EM algorithm. In Y. Weiss, B. Schölkopf, & J. Platt (Eds.), Advances in neural information processing systems (Vol. 18, pp. 83–90). Cambridge, MA: MIT Press.
Cox, T. F., & Cox, M. A. A. (2000). Multidimensional scaling (2nd ed.). New York: Chapman & Hall/CRC.
Davison, M. (2000). Multidimensional scaling. Florida: Krieger Publishing Company.
Devlin, K. (1997). Mathematics: The science of patterns. New York: Scientific American Library.
Dijkstra, E. (1959). A note on two problems in connection with graphs. Numerische Mathematik, 1, 269–271.
Donoho, D. L., & Grimes, C. (2003). Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. PNAS, 100(10), 5591–5596.
Geng, X., Zhan, D.-C., & Zhou, Z.-H. (2005). Supervised nonlinear dimensionality reduction for visualization and classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 35(6), 1098–1107.
Heijden, F., Duin, R., de Ridder, D., & Tax, D. (2004). Classification, parameter estimation and state estimation: An engineering approach using MATLAB. West Sussex, England: Wiley.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
Kirby, M. (2000). Geometric data analysis: An empirical approach to dimensionality reduction and the study of patterns. New York: John Wiley & Sons.
Kirby, R. C., & Siebenmann, L. C. (1977). Foundational essays on topological manifolds, smoothings, and triangulations. Annals of Mathematics Studies (Vol. 88). Princeton: Princeton University Press.
Kruskal, J., & Wish, M. (1978). Multidimensional scaling. London: Sage Publications.
Lafon, S., & Lee, A. B. (2006). Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning, and data set parameterization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9), 1393–1403.
Maaten, L. (2007). An introduction to dimensionality reduction using Matlab. Tech. rep., Maastricht, The Netherlands.
Myers, J. L., & Well, A. (2003). Research design and statistical analysis. New Jersey: Lawrence Erlbaum.
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2(11), 559–572.
Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323–2326.
Saul, L. K., & Roweis, S. T. (2003). Think globally, fit locally: Unsupervised learning of low dimensional manifolds. Journal of Machine Learning Research, 4, 119–155.
Schölkopf, B., Smola, A., & Müller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5), 1299–1319.
Tenenbaum, J., de Silva, V., & Langford, J. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500), 2319–2323.
Tsai, F. S., & Chan, K. L. (2007). Dimensionality reduction techniques for data exploration. In 2007 6th international conference on information, communications and signal processing (ICICS) (pp. 1–5).
Tsai, F. S., & Chan, K. L. (2009). Blog data mining for cyber security threats. In Data mining for business applications (pp. 169–182). Springer.
Zhang, Z., & Zha, H. (2005). Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM Journal on Scientific Computing, 26(1), 313–338.