
[IEEE 2006 IEEE Symposium on Computational Intelligence and Bioinformatics and Computational Biology - Toronto, ON, Canada (2006.09.28-2006.09.29)]

A Fuzzy Kohonen SOM Implementation and Clustering of Bio-active Compound Structures for

Drug Discovery

Jehan Zeb Shah and Naomie bt Salim

Abstract--Hierarchical methods like Ward's and Group Average and non-hierarchical methods like Jarvis-Patrick's and k-means are the preferred methods for clustering diverse sets of compounds for a number of drug targets (using fingerprint-based descriptors). In this work the application of the fuzzy Kohonen neural network and other self-organizing map (SOM) algorithms to the clustering of chemical datasets is investigated. SOM networks usually possess a number of parameters, such as the learning rate and neighborhood size, which are selected heuristically; in the fuzzy self-organizing map, an optimization technique that automatically selects the best parameters using fuzzy membership functions is used instead. The results of the fuzzy SOM neural network are compared with those of other SOM neural networks (namely Kohonen, neural gas and enhanced neural gas) and of the Ward's and Group Average methods for the clustering of different biologically active chemical structures using topological descriptors. The results show that the fuzzy SOM method not only outperforms the other SOM networks with their best heuristics, but also gives better results than the Ward's and Group Average methods.

Index Terms--cluster analysis, chemoinformatics, Kohonen self-organizing map, fuzzy Kohonen clustering, topological indices

I. INTRODUCTION

Cluster analysis is used to group similar compounds, after which a representative compound from each cluster is chosen to obtain a smaller subset that is representative of the structural content of the original compound database [1]. Compound clustering can also be used for property prediction, where a dataset that includes compounds with unknown properties is clustered and the unknown properties of those compounds are predicted from the properties of other compounds in the same cluster [2]. In the context of selecting compounds for biological screening, the ability to group together molecules with comparable properties or activities ______________________________________________________________

This work was supported by the Malaysian Ministry of Science, Technology and Innovation under grant no. 74252.

Naomie Salim is Assoc. Prof. of Chemical Information Systems in the Faculty of Computer Science & Information Systems, UTM, Malaysia. Tel: (+60) 7 553208; Fax: (+60) 75565044; Email: naomie@fsksm.utm.my

Jehan Zeb Shah is currently pursuing his Ph.D. studies in the Faculty of Computer Science & Information Systems, UTM, Malaysia. (Email: [email protected].)

has been used as a quantitative measure of effectiveness.

In chemical information systems, most of the clustering methods are hierarchical, like the single and complete linkage, Ward's and Group Average algorithms. The traditional non-hierarchical methods like k-means and Jarvis-Patrick's nearest-neighbor method are not very popular because of their inferior cluster quality.

Among the hierarchical methods, the best results were produced by Ward's hierarchical agglomerative method, while Jarvis-Patrick's produced the best results among the non-hierarchical methods tested [3].

However, in another study [4] the performance of Jarvis-Patrick's method proved to be very poor, while the performance of Ward's method was the best.

The good performance of Ward's was confirmed in another study [5], where the Ward's, Jarvis-Patrick's (fixed- and variable-length nearest-neighbor lists), Group Average and Minimum Diameter (fixed and variable diameter) methods were evaluated on a large dataset of 21,000 compounds comprising four biologically active groups. The performance of Ward's was found to be the best across all the descriptors and datasets; the Group Average and Minimum Diameter methods were slightly inferior. The performance of the Jarvis-Patrick method was very poor for fixed as well as variable-length nearest-neighbor lists.

Fuzzy clustering methods have also been applied recently to the clustering of chemical datasets. In [6] and [7], it has been shown that fuzzy clustering methods can outperform the Ward's, Group Average and hard k-means methods. But since fuzzy methods are computationally very intensive, they are not considered suitable for the clustering of large databases.

Self-organizing map (SOM) neural networks have already been applied to various problems in chemoinformatics and bioinformatics, like protein classification [8-12], clustering [13], secondary structure mapping [14], and molecular surface unfolding [15-17]. In [18], a SOM was used to discriminate dopamine from benzodiazepine agonists in a training set of 172 compounds using autocorrelation descriptors. Recently, SOM methods like the Kohonen and neural gas networks have been used for the clustering of chemical structures, but their results are not convincing due to the use of heuristics for parameter selection [19].

In this work a fuzzy SOM neural network has been employed for the clustering of a small dataset of bioactive


1-4244-0623-4/06/$20.00 ©2006 IEEE


compounds, and the performance has been evaluated in comparison with other neural methods like the Kohonen, neural gas and enhanced neural gas algorithms. The fuzzy SOM uses an optimization technique that optimizes parameters like the learning rate and neighborhood size, and thus overcomes some of the limitations of other SOM neural networks. The results are also compared with the traditional Ward's and Group Average clustering algorithms. The next section describes the dataset and descriptors used in this work, Section III gives some details about the methods used, and Section IV is devoted to results and discussion. Section V concludes the work.

II. DATA AND DESCRIPTORS

In this study a subset of the MDDR database comprising around 15 biologically active groups of compounds has been used. Most of the activities chosen are highly diverse, and the first four categories can be regarded as the most heterogeneous compared to the rest of the compounds. This experiment used 1,388 molecules from the MDL Drug Data Report (MDDR) database, which contains molecules of drugs launched or under development, as referenced in the patent literature, conference proceedings, and other sources [20]. The main groups, their subgroups and their aggregate activities are summarized in Table 1.

In order to classify compound structures based on their biological activity, we need structural features with good ability to represent the structure-activity relationship. The stronger the structure-activity relationship, the better the results will be in terms of practical importance.

Topological indices (TIs) are a set of features that characterize the arrangement and composition of the vertices, edges and their interconnections in a molecular bonding topology. These indices are calculated from the matrix representation of the molecular structure using some mathematical formula. Among hundreds of possible descriptors, a few have been found useful in the characterization of molecular properties and activities, such as the Wiener index [21-23], Harary index [24], MTI index [25, 26], Balaban index [27, 28], and Zagreb group indices [29, 30]. Topological indices have been used in classification, clustering and QSAR/QSPR studies. For instance, in [31] the heats of formation for 60 hydrocarbons were satisfactorily predicted, and in [32] a QSAR study modeled DNA binding affinity with the help of distance-matrix-based TIs. Although there are many methods available for the quantification of molecular structures, TIs are the most suitable and popular, are easily obtainable from the molecular structure and, according to one study [33], are the first effective choice for QSAR studies.

Here we have calculated 99 topological descriptors for our dataset using Milano Chemometrics' Dragon software [34]. Principal component analysis (PCA) [35] was then carried out to remove the unnecessary linearly dependent features, so that each molecule is characterized by only 10 variables that account for 98% of the variance in our selected dataset. PCA was carried out using MVSP 3.13 [36]. The 10 features selected are listed in Table 2.
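The reduction step described above can be sketched as follows. This is a minimal NumPy illustration (not the MVSP tool used in the paper), with hypothetical variable names, that projects a descriptor matrix onto just enough principal components to retain a target share of the variance:

```python
import numpy as np

def reduce_descriptors(X, target_variance=0.98):
    """Project a descriptor matrix onto its leading principal
    components, keeping just enough of them to explain
    `target_variance` of the total variance."""
    Xc = X - X.mean(axis=0)                      # mean-center each descriptor
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)        # variance share per component
    k = int(np.searchsorted(np.cumsum(explained), target_variance)) + 1
    return Xc @ Vt[:k].T                         # n_molecules x k score matrix

# toy data: 100 "molecules", 9 descriptors, 6 of which are linear
# combinations of the first 3 (so the intrinsic rank is 3)
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 3))
X = np.hstack([base, base @ rng.normal(size=(3, 6))])
scores = reduce_descriptors(X)
print(scores.shape)
```

Because the toy matrix has rank 3, at most three components are needed to reach 98% of the variance; on the real 99-descriptor matrix the same procedure would retain 10.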

III. METHODS

Although a large number of methods are available for clustering chemical datasets, a few are more important because of their good performance and efficiency. As discussed in the introduction, Ward's and Group Average are the two hierarchical methods that outperform all the remaining classical methods. The Kohonen self-organizing map, neural gas and fuzzy Kohonen neural network methods that we have used in this work are described below.

TABLE 1
GROUPS AND ACTIVITIES OF THE DATASET

S.No  Group                        Activity                        No. of molecules
1     Interacting on 5HT receptor  5HT antagonists                 48
                                   5HT1 agonists                   66
                                   5HT1C agonists                  57
                                   5HT1D agonists                  100
2     Antidepressants              MAO A inhibitors                84
                                   MAO B inhibitors                174
3     Antiparkinsonians            Dopamine (D1) agonists          32
                                   Dopamine (D2) agonists          104
4     Antiallergic/antiasthmatic   Adenosine A3 antagonists        73
                                   Leukotriene B4 antagonists      150
5     Agents for Heart Failure     Phosphodiesterase inhibitors    100
6     Antiarrhythmics              Potassium channel blockers      100
                                   Calcium channel blockers        100
7     Antihypertensives            ACE inhibitors                  100
                                   Adrenergic (alpha 2) blockers   100
      Total molecules                                              1,388

TABLE 2
SELECTED TOPOLOGICAL INDICES

TI       Description
Gnar     Narumi geometric topological index
Xt       Total structure connectivity index
Dz       Pogliani index
SMTI     Schultz Molecular Topological Index
PW3      Path/walk 3 - Randic shape index
PW4      Path/walk 4 - Randic shape index
PW5      Path/walk 5 - Randic shape index
PJI2     2D Petitjean shape index
CSI      Eccentric connectivity index
D/Dr03   Distance/detour ring index of order 3


A. Self-Organizing Kohonen Neural Networks

The self-organizing map (SOM) is an unsupervised neural network proposed by Kohonen [37] which consists of only two layers of neurons: the input layer and the output Kohonen layer, the latter usually designed as a 2- or 3-dimensional map of neurons.

It is basically a competitive network with the characteristics of self-organization, where similar sub-maps of output-layer neurons designate a class or group of the input dataset. If Z_k = [z_k1, z_k2, …, z_kp] is a p-dimensional data object, then W_l = [w_l1, w_l2, …, w_lp] is the weight vector associated with neuron l of the output layer. The objective of the Kohonen learning law is to find the winning neuron, or a neighborhood of neurons, closest to the object presented as the input stimulus. The winning neuron is then moved closer to the input object by some proportion of the distance between the input and the winning neuron (or neurons).

For each compound k of the dataset, the distance d_k between the weight vector W_l of each neuron and the input compound Z_k is computed. The neuron (or neurons) having the smallest distance from the input is the winner, and its weights are then updated using a learning rule. Usually, the Euclidean distance is used for distance computations. The index q of the winning neuron (or neurons) is determined as follows:

q = arg min_l || W_l(t) − Z_k ||                                  (1)

The weights of the winning neuron are updated as follows:

W_q(t+1) = W_q(t) + α(t) Ω(t) [ Z_k − W_q(t) ]                    (2)
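One training step of this rule can be sketched as follows. This is a minimal illustration, assuming for concreteness a Gaussian neighborhood for Ω(t) and linearly decaying schedules for the learning rate and neighborhood radius; the actual schedule choices are the heuristics discussed later in the paper:

```python
import numpy as np

def som_step(W, z, t, max_t, grid, alpha0=0.5, sigma0=2.0):
    """One Kohonen update: find the winner (Eq. 1) and move it and
    its map neighbors toward the input z (Eq. 2). `grid` holds the
    2-D map coordinate of each output neuron."""
    q = int(np.argmin(np.linalg.norm(W - z, axis=1)))  # winning neuron (Eq. 1)
    alpha = alpha0 * (1.0 - t / max_t)                 # decaying learning rate
    sigma = sigma0 * (1.0 - t / max_t) + 1e-6          # shrinking neighborhood
    d2 = np.sum((grid - grid[q]) ** 2, axis=1)         # map distance to winner
    omega = np.exp(-d2 / (2.0 * sigma ** 2))           # Gaussian neighborhood
    W += (alpha * omega)[:, None] * (z - W)            # Eq. 2 for all neurons
    return W, q

# a 3x3 map with 2-D inputs; all weights start at the origin
grid = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
W = np.zeros((9, 2))
W, q = som_step(W, np.array([1.0, 1.0]), t=0, max_t=100, grid=grid)
```

With all weights initially equal, the first neuron wins the tie and moves halfway toward the input (alpha = 0.5, omega = 1 at the winner), while its map neighbors move by smaller amounts.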

where α(t) is the learning-rate parameter and Ω(t) is the neighborhood-size parameter, both of which are functions of time and are decreased continuously with each iteration. The value of Ω(t) is some positive value between 0 and 1 inside the map neighborhood; for a neuron outside the neighborhood of the winning neuron it is 0. This means that in the Kohonen learning rule only the neurons inside the neighborhood are learning neurons, and so only these are updated.

B. Neural Gas Clustering Method

The neural gas algorithm is an important neural network, first introduced by Martinetz [38] for the prediction of time series and then applied successfully to the clustering of various databases [39], vector quantization and pattern recognition [40, 41], and topology representation [42]. The neural gas algorithm has a number of advantages [38]: (1) it converges quickly to low distortion errors; (2) it reaches a distortion error lower than that of k-means clustering, maximum entropy clustering (for a practically feasible number of iterations) and the Kohonen feature map; and (3) at the same time it obeys a gradient descent on an energy surface (like maximum entropy clustering, and in contrast to Kohonen's feature map).

The neural gas algorithm generates, for each input pattern Z_k, a ranking of the weight vectors, giving the weights a descending order based on closeness to the input pattern: W_k0 is the closest weight vector to the input pattern, W_k1 the second closest, and W_ki, i = 1, 2, …, c−1, is the weight vector for which there are i vectors W_j with || Z_k − W_j || < || Z_k − W_ki ||. If the ranking index associated with each vector W_ki is denoted by R(Z_k, W_ki), which depends on Z_k and the whole set (W_k0, W_k1, …, W_k,c−1) of weight vectors, then the adaptation step employed for updating the weights is given by

W_ki(t+1) = W_ki(t) + η(t) h_λ( R(Z_k, W_ki) ) [ Z_k − W_ki(t) ]      (3)

The learning-rate parameter η(t) ∈ [0, 1] describes the overall extent of the modification and is usually taken as an exponentially decreasing function of time:

η(t) = η_i ( η_f / η_i )^( t / Maxiteration )                          (4)

where η_i and η_f are the initial and final values of the learning rate, respectively, which are initialized in advance. Maxiteration is a constant specifying the maximum number of steps t, also initialized at the start of the algorithm.

As already stated, the ranking index R(Z_k, W_ki) depends on the input pattern Z_k and the whole set (W_k0, W_k1, …, W_k,c−1) of weight vectors, which also serve as the prototypes of the dataset Z. The value of R(Z_k, W_ki) is zero for the closest weight vector and maximal for the farthest weight vector. The ranking adaptation parameter h_λ is a function of the ranking index R and lies between 0 and 1. Martinetz [38] has suggested an exponentially decreasing function:

h_λ(R) = exp( −R / λ )                                                 (5)

The value of h_λ thus depends on the rank of the weight vector: as the rank increases the update rate decreases, and vice versa.
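One neural-gas step under these equations can be sketched as follows. The parameter values are illustrative; the double argsort is a standard trick that computes the rank R of every prototype at once:

```python
import numpy as np

def neural_gas_step(W, z, t, max_t, eta_i=0.5, eta_f=0.005, lam=1.0):
    """One neural-gas update: rank all prototypes by distance to the
    input z (rank 0 = closest), then move every prototype toward z
    with a rate that decays exponentially in its rank (Eqs. 3-5)."""
    eta = eta_i * (eta_f / eta_i) ** (t / max_t)       # Eq. 4
    dists = np.linalg.norm(W - z, axis=1)
    ranks = np.argsort(np.argsort(dists))              # R(z, W_i) per prototype
    h = np.exp(-ranks / lam)                           # Eq. 5
    W += eta * h[:, None] * (z - W)                    # Eq. 3
    return W

# three prototypes; the closest one receives the largest update
W = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
W = neural_gas_step(W, np.array([1.0, 0.0]), t=0, max_t=100)
```

At t = 0 the closest prototype moves halfway toward the input (eta = 0.5, h = 1), the second closest by a factor e^(−1) less, and so on; unlike the Kohonen rule, the neighborhood here is defined in input space through the ranking rather than on a fixed map grid.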

Another variant of the neural gas algorithm is the enhanced neural gas algorithm [43], which performs well in the presence of noise and outliers and is less affected by the order in which examples are presented. The algorithm employs an outlier-resistant mechanism by keeping historical distance information for each cluster. When a new compound is presented, its distance from the cluster is compared with the historical distance from the previous iteration. If it is greater than or equal to the historical distance, the historical distance is updated as the harmonic mean of the two; otherwise as the arithmetic mean.

C. Fuzzy Kohonen Self-Organizing Feature Map

Nearly all of the self-organizing-map-based neural networks, such as the Kohonen, neural gas and enhanced neural gas networks, are heuristic algorithms which are not based on the minimization of some objective functional like fuzzy c-means. The resultant maps in all these cases are not always fully labeled because of the heuristic nature of the parameter selection strategy (for parameters like the learning rate and neighborhood size). Since fuzzy c-means is an optimization procedure [44], the combination of fuzzy c-means and the Kohonen self-organizing feature map was first considered by Huntsberger [45] to make the Kohonen algorithm an optimization procedure. The algorithm combines


the fuzzy membership values of fuzzy c-means for the learning-rate and neighborhood-size parameters with the update rules of the Kohonen feature map.

The update rule for the fuzzy Kohonen algorithm can be given as:

W_{i,t+1} = W_{i,t} + α_{ik,t} ( Z_k − W_{i,t} ),   i = 1, 2, …, c;  k = 1, 2, …, n      (6)

where W_{i,t} is the centroid of the ith cluster at iteration t, Z_k is the kth data point (or compound feature set), and α_{ik} is the only parameter of the algorithm. According to Huntsberger,

α_{ik,t} = ( U_{ik,t} )^{m(t)}                                          (7)

where m(t) is an exponent like the fuzzification index in fuzzy c-means and U_{ik,t} is the membership value of compound Z_k in cluster i. Both of these quantities vary with each iteration t and are given as:

U_ik = [ Σ_{j=1}^{c} ( D_ik / D_jk )^{2/(m−1)} ]^{−1},   1 ≤ i ≤ c,  1 ≤ k ≤ n      (8)

D_ik = ( Z_k − W_i )^T ( Z_k − W_i )                                     (9)

m(t) = m_0 − Δm · t                                                      (10)

Δm = ( m_0 − m_f ) / maxiter                                             (11)

where D_ik is the distance and m_0 is a constant greater than the final value m_f of the fuzzification parameter m. The final value m_f should not be less than 1.1, in order to avoid a divide-by-zero error in equation (8).
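Equations (6)–(11) can be sketched as follows. This is a vectorized illustration with hypothetical parameter values, not the paper's exact implementation; the small constant added to D only guards against division by zero when a centroid coincides with a compound:

```python
import numpy as np

def fuzzy_som_step(W, Z, t, m0=2.0, mf=1.1, max_iter=100):
    """One iteration of the fuzzy Kohonen update (Eqs. 6-11):
    memberships U follow the fuzzy c-means formula, the fuzzifier
    m(t) is annealed linearly from m0 toward mf, and each centroid
    moves toward each compound in proportion to alpha = U**m(t)."""
    m = m0 - ((m0 - mf) / max_iter) * t                # Eqs. 10-11
    # D_ik: squared distance between centroid i and compound k (Eq. 9)
    D = ((W[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1) + 1e-12
    U = D ** (-2.0 / (m - 1.0))                        # Eq. 8, unnormalized
    U /= U.sum(axis=0, keepdims=True)                  # memberships sum to 1 per compound
    A = U ** m                                         # Eq. 7: alpha_ik,t
    for k in range(Z.shape[0]):                        # Eq. 6, one compound at a time
        W = W + A[:, k, None] * (Z[k] - W)
    return W

# two obvious groups of compounds; centroids should drift toward them
Z = np.array([[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]])
W = np.array([[0.0, 1.0], [5.0, 4.0]])
W = fuzzy_som_step(W, Z, t=0)
```

Note that no learning rate or neighborhood size appears anywhere: the memberships U and the annealed exponent m(t) play both roles, which is exactly what removes the heuristic parameter selection of the other SOM variants.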

IV. RESULTS AND DISCUSSION

Each clustering method was repeated ten times for 10, 20, …, 100 clusters. The purpose of the analysis carried out on each of these sets of clusters is to determine the extent to which the actives have been separated from the inactives. For each of the groups in our dataset, we have taken the compounds belonging to that group as the active compounds, and the compounds belonging to the rest of the groups as inactives. This scheme was repeated for each group in the dataset and an average percentage proportion of actives was determined. An active cluster is defined as one which contains at least one active compound, and a cluster which does not contain any active compound is an inactive cluster. The subset of the dataset containing all the compounds in active clusters is termed the active cluster subset [13]. The active cluster subset must not contain the structures in the active singletons, as this would inflate the proportion of actives incorrectly. The proportion of actives in the active cluster subset gives the degree of separation between actives and inactives.
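This evaluation measure can be sketched as follows (illustrative helper name; cluster labels and activity flags are given per compound):

```python
from collections import Counter

def proportion_actives(labels, active):
    """Proportion of active compounds in the 'active cluster subset':
    all compounds in non-singleton clusters that contain at least one
    active compound (active singletons are excluded)."""
    sizes = Counter(labels)
    with_active = {c for c, a in zip(labels, active) if a}
    subset = {c for c in sizes if c in with_active and sizes[c] > 1}
    total = sum(sizes[c] for c in subset)
    actives = sum(1 for c, a in zip(labels, active) if a and c in subset)
    return actives / total if total else 0.0

# cluster 0 mixes two actives with one inactive, cluster 1 is all
# inactive, cluster 2 is an active singleton (excluded from the subset)
labels = [0, 0, 0, 1, 1, 2]
active = [1, 1, 0, 0, 0, 1]
print(proportion_actives(labels, active))  # 2 of the 3 compounds in the subset
```

A perfect clustering that isolates the actives in pure non-singleton clusters would score 1.0, while a clustering that mixes actives evenly among the inactives would score close to the overall active fraction of the dataset.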

In Kohonen SOM neural network clustering, some parameters have to be initialized in advance: the learning rate and the neighborhood map shape and size. The learning-rate schedule can be linear, inverted or non-linear (power), where the learning rate decreases exponentially. The neighborhood radius function has a number of options, but the results here are only shown for the Gaussian, as it gives the best results among all the available options. The shape of the map was kept rectangular. For the learning rate a number of strategies have been


Fig. 1. Performance of the clustering methods: the non-singleton clusters containing at least one active molecule are combined to make one cluster called the active cluster subset. The percent proportion is the average proportion over all 15 major activities found in the dataset (shown in Table 1). GPAV - Group Average method, WARD'S - Ward's clustering method, SOM - Kohonen self-organizing NN with linear learning rate, SOM_POW - Kohonen self-organizing NN with exponential learning rate, NG - neural gas network, ENG - enhanced neural gas network, and F-SOM - fuzzy Kohonen self-organizing NN.

TABLE 3
NUMBER OF UNLABELLED CELLS

No. of    Fuzzy   Kohonen   Neural   Enhanced     Ward's   Group
Clusters  SOM     SOM       Gas      Neural Gas            Average
10        0       2         2        2            0        0
20        0       0         5        3            0        0
30        0       1         8        14           0        0
40        0       2         9        16           0        0
50        0       4         11       32           0        0
60        0       8         16       34           0        0
70        0       8         21       38           0        0
80        0       12        26       49           0        0
90        0       15        29       47           0        0
100       0       20        34       62           0        0


applied, such as the linear schedule, but the results obtained with the exponential schedule were the best.

Fig. 1 shows the performance of the fuzzy Kohonen, Kohonen SOM (with linear and exponential learning rates), neural gas and enhanced neural gas networks in comparison with the Ward's and Group Average methods. The data is sampled at every 10 clusters. The x-axis plots the number of clusters, and the y-axis the proportion of actives in the active cluster subset. The performance of the fuzzy Kohonen network is not good for smaller numbers of clusters, but improves as the number of clusters increases. The SOM with the exponential learning rate is better than the one with the linear learning rate.

It has been observed that, as the number of clusters is increased, the number of unlabelled cells of the self-organizing maps increases in the neural methods. The performance of the enhanced neural gas is the worst due to its tendency to reduce the number of clusters: the algorithm tries to prevent noise and outliers from being accommodated in clusters of their own and to confine them to the clusters closest to them, which results in fewer clusters. The number of unlabelled cells is almost zero for the fuzzy SOM, which shows the best convergence and a good choice of learning rate and neighborhood size. However, if the number of clusters is increased greatly, some unlabelled cells do occur; e.g., for 400 clusters there were 6 unlabelled cells and for 1000 clusters there were 106. This is described in Table 3. The convergence of the algorithm remained very uniform and was not affected by the variation in the number of clusters, as shown in Fig. 3.

It has been observed that as we increase the number of clusters, the proportion of actives in the active cluster subsets increases. The best proportions obtained for the fuzzy SOM, Kohonen SOM, SOM-POW, neural gas, enhanced neural gas, Group Average and Ward's methods were 23.32, 16.39, 17.03, 16.54, 15.14, 23.26 and 23.07 respectively, when the number of clusters was set to 100.

V. CONCLUSION

In this work a dataset of drug-like compounds has been clustered using four neural and two hierarchical methods and the performance compared. The results show that the fuzzy SOM method performs better than the other neural methods. The prediction accuracy of the fuzzy SOM is also slightly higher than that of the traditional Ward's and Group Average methods for larger numbers of clusters. Additionally, the fuzzy SOM is more efficient and converges to lower errors in only 100 iterations for any number of clusters.

It is recommended that the fuzzy SOM method be evaluated on large chemical datasets of tens of thousands of compounds, where other descriptors like physicochemical properties and 3D descriptors can also be used.

REFERENCES

[1] J. B. Dunbar, "Cluster-based selection," Perspectives in Drug Discovery and Design, vol. 7, pp. 51-63, 1997.
[2] R. D. Brown and Y. C. Martin, "The information content of 2-D and 3-D structural descriptors relevant to ligand-receptor binding," Journal of Chemical Information and Computer Sciences, vol. 37, pp. 1-9, 1997.
[3] P. Willett, Similarity and Clustering in Chemical Information Systems. Letchworth: Research Studies Press, 1987.
[4] G. M. Downs, P. Willett, and W. Fisanick, "Similarity searching and clustering of chemical structure databases using molecular property data," Journal of Chemical Information and Computer Sciences, vol. 34, pp. 1094-1102, 1994.
[5] R. D. Brown and Y. C. Martin, "Use of structure-activity data to compare structure-based clustering methods and descriptors for use in compound selection," Journal of Chemical Information and Computer Sciences, vol. 36, pp. 572-584, 1996.
[6] J. D. Holliday, S. L. Rodgers, and P. Willett, "Clustering files of chemical structures using the fuzzy k-means clustering method," Journal of Chemical Information and Computer Sciences, vol. 44, pp. 894-902, 2004.
[7] J. Z. Shah and N. Salim, "FCM and G-K clustering of chemical dataset using topological indices," First International Symposium on Bio-Inspired Computing, Johor Bahru, Malaysia, 2005.
[8] T. F. Betham, T. Engan, J. Krane, and D. Axelson, "Analysis and classification of proton NMR spectra of lipoprotein from healthy volunteers and patients with cancer or CHD," Anticancer Research, vol. 20, pp. 2393-2408, 2000.
[9] E. A. Ferran, "Self-organized neural maps of human protein sequences," Protein Science, vol. 3, pp. 507-521, 1994.
[10] W. Huichun, J. Dopazo, L. G. de la Fraga, Y. P. Zhu, and J. M. Carazo, "Self-organizing tree growing network for classification of protein sequences," Protein Science, vol. 7, pp. 2613-2622, 1998.
[11] J. Kaartinen, Y. Hiltunen, P. T. Kovanen, and M. Ala-Korpela, "Application of self-organizing maps for the detection and classification of human blood plasma lipoprotein lipid profiles on the basis of 1H NMR spectroscopy data," NMR in Biomedicine, vol. 11, pp. 168-176, 1998.
[12] J. Schuchhardt, "Local structural motifs of protein backbones are classified by self-organizing neural networks," Protein Engineering, vol. 9, pp. 833-842, 1996.
[13] P. Somervuo and T. Kohonen, "Clustering and visualization of large protein sequence databases by means of an extension of the self-organizing map," presented at the 3rd International Conference on Discovery Science, Berlin, Germany, 2000.

Fig. 3. Convergence of the fuzzy Kohonen SOM network. The y-axis plots the root mean square (RMS) error on a logarithmic scale. Err100 - error for 100 clusters, Err90 - error for 90 clusters, and so on.

[14] R. M. MacCallum, "Computational analysis of protein sequence and structure," Ph.D. thesis, Department of Biochemistry and Molecular Biology, University College London, London, 1997.

[15] J. Gasteiger and A. Uschold, "The beauty of molecular surfaces as revealed by self organizing neural networks," Journal of Molecular Graphics, vol. 12, pp. 90-97, 1994.

[16] J. Polanski and B. Walczak, "Comparative molecular surface analysis (CoMSA): a novel tool for molecular design," Computers and Chemistry, vol. 24, pp. 615-625, 2000.

[17] P. Unneberg, J. J. Merelo, P. Chacon, and F. Moran, "SOMCD: method for evaluating protein secondary structure from UV circular dichroism spectra," Proteins: Structure, Function, and Genetics, vol. 42, pp. 460-470, 2001.

[18] H. Bauknecht, A. Zell, H. Bayer, P. Levi, M. Wagener, J. Sadowski, and J. Gasteiger, "Locating Biologically Active Compounds in Medium-Sized Heterogeneous Datasets by Topological Autocorrelation Vectors: Dopamine and Benzodiazepine Agonists," Journal of Chemical Information and Computer Sciences, vol. 36, pp. 1205-1213, 1996.

[19] J. Z. Shah and N. Salim, "Clustering of Biologically Relevant Compounds using Neural Networks and Topological Indices," presented at the IEEE International Conference on Industrial and Information Systems, Peradeniya, Sri Lanka, 2006.

[20] "MDL’s Drug Data Report," Elsevier MDL. http://www.mdli.com/products/knowledge/drug_data_report/index.jsp

[21] H. Wiener, "Correlation of Heats of Isomerization, and Differences in Heats of Vaporization of Isomers, Among the Paraffin Hydrocarbons," Journal of the American Chemical Society, vol. 69, pp. 2636-2638, 1947.

[22] H. Wiener, "Relation of the Physical Properties of the Isomeric Alkanes to Molecular Structure. Surface Tension, Specific Dispersion, and Critical Solution Temperature in Aniline," Journal of Physical Chemistry, vol. 52, pp. 1082-1089, 1948.

[23] H. Wiener, "Structural Determination of Paraffin Boiling Points," Journal of the American Chemical Society, vol. 69, pp. 17-20, 1947.

[24] D. Plavsic, S. Nikolic, N. Trinajstic, and Z. Mihalic, "On the Harary index for the characterization of chemical graphs.," Journal of Mathematical Chemistry, vol. 12, pp. 235-250, 1993.

[25] J. V. Knop, W. R. Mueller, K. Szymanski, and N. Trinajstic, "Use of small computers for large computations: enumeration of polyhex hydrocarbons," Journal of Chemical Information and Computer Sciences, vol. 30, pp. 159-160, 1990.

[26] H. P. Schultz, "Topological organic chemistry. 1. Graph theory and topological indices of alkanes," Journal of Chemical Information and Computer Sciences, vol. 29, pp. 227-228, 1989.

[27] A. T. Balaban, "Highly Discriminating Distance-Based Topological Index," Chemical Physics Letters, vol. 89, pp. 399-404, 1982.

[28] M. Randic, "Characterization of molecular branching," Journal of the American Chemical Society, vol. 97, pp. 6609-6615, 1975.

[29] I. Gutman, B. Ruscic, N. Trinajstic, and J. C. F. Wilcox, "Graph theory and molecular orbitals. XII. Acyclic polyenes," Journal of Chemical Physics, vol. 62, pp. 3399-3405, 1975.

[30] I. Gutman and N. Trinajstic, "Graph theory and molecular orbitals. Total π-electron energy of alternant hydrocarbons," Chemical Physics Letters, vol. 17, pp. 535-538, 1972.

[31] A. Mercader, E. A. Castro, and A. A. Toropov, "Maximum Topological Distances Based Indices as Molecular Descriptors for QSPR. 4. Modeling the Enthalpy of Formation of Hydrocarbons from Elements," International Journal of Molecular Science, vol. 2, pp. 121-132, 2001.

[32] A. Thakur, M. Thakur, N. Kakani, A. Joshi, S. Thakur, and A. Gupta, "Application of topological and physicochemical descriptors: QSAR study of phenylamino-acridine derivatives," ARKIVOC, vol. xiv, 2004.

[33] S. C. Basak, B. D. Gute, and G. D. Grunwald, "A hierarchical approach to the development of QSAR models using topological, geometrical and quantum chemical parameters," in Topological Indices and Related Descriptors in QSAR and QSPR, J. Devillers and A. T. Balaban, Eds.: Gordon and Breach Science Publishers, 1999, p. 692.

[34] "Dragon," Talete srl, Milano, chemoinformatics. http://www.talete.mi.it

[35] I. Jolliffe, Principal Component Analysis. New York: Springer-Verlag, 1986.

[36] "MVSP 3.13," Kovach Computing Services. http://www.kovcomp.com/

[37] T. Kohonen, "The self-organizing map," Proceedings of the IEEE, vol. 78, pp. 1464-1480, 1990.

[38] T. M. Martinetz, S. G. Berkovich, and K. J. Schulten, "'Neural-gas' network for vector quantization and its application to time-series prediction," IEEE Transactions on Neural Networks, vol. 4, pp. 558-569, 1993.

[39] A. K. Qin and P. N. Suganthan, "Robust growing neural gas algorithm with application in cluster analysis," Neural Networks, vol. 17, pp. 1135-1148, 2004.

[40] F. Camastra and A. Vinciarelli, "Combining neural gas and vector quantization for cursive character recognition," Neurocomputing, vol. 51, pp. 147-159, 2003.

[41] B. L. Zhang, M. Y. Fu, and H. Yan, "Handwritten character verification based on neural gas based vector quantization," IEEE Proc., 1998.

[42] Z. Cselenyi, "Mapping the dimensionality, density and topology of data: the growing adaptive neural gas," Computer Methods and Programs in Biomedicine, vol. 78, pp. 141-156, 2005.

[43] A. K. Qin and P. N. Suganthan, "Enhanced neural gas network for prototype-based clustering," Pattern Recognition, vol. 38, pp. 1275-1288, 2005.

[44] J. C. Bezdek, E. C.-K. Tsao, and N. R. Pal, "Fuzzy Kohonen clustering networks," presented at the IEEE International Conference on Fuzzy Systems, 1992.

[45] T. Huntsberger and P. Ajjimarangsee, "Parallel self-organizing feature maps for unsupervised pattern recognition," International Journal of General Systems, vol. 16, pp. 357-372, 1989.