[Studies in Computational Intelligence] Computational Intelligence in Bioinformatics Volume 94 ||

Download [Studies in Computational Intelligence] Computational Intelligence in Bioinformatics Volume 94 ||

Post on 16-Mar-2017

239 views

Category:

Documents

5 download

Embed Size (px)

TRANSCRIPT

<ul><li><p>Arpad Kelemen, Ajith Abraham and Yuehui Chen (Eds.)</p><p>Computational Intelligence in Bioinformatics</p></li><li><p>Studies in Computational Intelligence, Volume 94</p><p>Editor-in-chiefProf. Janusz KacprzykSystems Research InstitutePolish Academy of Sciencesul. Newelska 601-447 WarsawPolandE-mail: kacprzyk@ibspan.waw.pl</p><p>Further volumes of this series can be found on ourhomepage: springer.com</p><p>Vol. 73. Petra Perner (Ed.)Case-Based Reasoning on Images and Signals, 2008ISBN 978-3-540-73178-8</p><p>Vol. 74. Robert SchaeferFoundation of Global Genetic Optimization, 2007ISBN 978-3-540-73191-7</p><p>Vol. 75. Crina Grosan, Ajith Abraham and Hisao Ishibuchi(Eds.)Hybrid Evolutionary Algorithms, 2007ISBN 978-3-540-73296-9</p><p>Vol. 76. Subhas Chandra Mukhopadhyay and Gourab SenGupta (Eds.)Autonomous Robots and Agents, 2007ISBN 978-3-540-73423-9</p><p>Vol. 77. Barbara Hammer and Pascal Hitzler (Eds.)Perspectives of Neural-Symbolic Integration, 2007ISBN 978-3-540-73953-1</p><p>Vol. 78. Costin Badica and Marcin Paprzycki (Eds.)Intelligent and Distributed Computing, 2008ISBN 978-3-540-74929-5</p><p>Vol. 79. Xing Cai and T.-C. Jim Yeh (Eds.)Quantitative Information Fusion for HydrologicalSciences, 2008ISBN 978-3-540-75383-4</p><p>Vol. 80. Joachim DiederichRule Extraction from Support Vector Machines, 2008ISBN 978-3-540-75389-6</p><p>Vol. 81. K. SridharanRobotic Exploration and Landmark Determination, 2008ISBN 978-3-540-75393-3</p><p>Vol. 82. Ajith Abraham, Crina Grosan and WitoldPedrycz (Eds.)Engineering Evolutionary Intelligent Systems, 2008ISBN 978-3-540-75395-7</p><p>Vol. 83. Bhanu Prasad and S.R.M. Prasanna (Eds.)Speech, Audio, Image and Biomedical Signal Processingusing Neural Networks, 2008ISBN 978-3-540-75397-1</p><p>Vol. 84. Marek R. Ogiela and Ryszard TadeusiewiczModern Computational Intelligence Methodsfor the Interpretation of Medical Images, 2008ISBN 978-3-540-75399-5</p><p>Vol. 85. Arpad Kelemen, Ajith Abraham and Yulan Liang(Eds.)Computational Intelligence in Medical Informatics, 2008ISBN 978-3-540-75766-5</p><p>Vol. 86. Zbigniew Les and Mogdalena LesShape Understanding Systems, 2008ISBN 978-3-540-75768-9</p><p>Vol. 87. Yuri Avramenko and Andrzej KraslawskiCase Based Design, 2008ISBN 978-3-540-75705-4</p><p>Vol. 88. Tina Yu, David Davis, Cem Baydar and RajkumarRoy (Eds.)Evolutionary Computation in Practice, 2008ISBN 978-3-540-75770-2</p><p>Vol. 89. Ito Takayuki, Hattori Hiromitsu, Zhang Minjieand Matsuo Tokuro (Eds.)Rational, Robust, Secure, 2008ISBN 978-3-540-76281-2</p><p>Vol. 90. Simone Marinai and Hiromichi Fujisawa (Eds.)Machine Learning in Document Analysisand Recognition, 2008ISBN 978-3-540-76279-9</p><p>Vol. 91. Horst Bunke, Kandel Abraham and Last Mark (Eds.)Applied Pattern Recognition, 2008ISBN 978-3-540-76830-2</p><p>Vol. 92. Ang Yang, Yin Shan and Lam Thu Bui (Eds.)Success in Evolutionary Computation, 2008ISBN 978-3-540-76285-0</p><p>Vol. 93. Manolis Wallace, Marios Angelides and PhivosMylonas (Eds.)Advances in Semantic Media Adaptation andPersonalization, 2008ISBN 978-3-540-76359-8</p><p>Vol. 94. Arpad Kelemen, Ajith Abraham and Yuehui Chen(Eds.)Computational Intelligence in Bioinformatics, 2008ISBN 978-3-540-76802-9</p></li><li><p>Arpad KelemenAjith AbrahamYuehui Chen(Eds.)</p><p>Computational Intelligencein Bioinformatics</p><p>123</p><p>With 104 Figures and 34 Tables</p></li><li><p>Arpad Kelemen</p><p>Ajith AbrahamCentre for Quantifiable Quality of Service</p><p>in Communication Systems (Q2S)Centre of ExcellenceNorwegian University of Science</p><p>and TechnologyO.S. Bragstads plass 2EN-7491 TrondheimNorway</p><p>ajith.abraham@ieee.org</p><p>Yuehui ChenSchool of Information</p><p>Science and EngineeringJinan UniversityJiwei Road 106, Jinan 250022P.R. China</p><p>yhchen@ujn.edu.cn</p><p>ISBN 978-3-540-76802-9 e-ISBN 978-3-540-76803-6</p><p>Studies in Computational Intelligence ISSN 1860-949X</p><p>Library of Congress Control Number: 2007940153</p><p>c 2008 Springer-Verlag Berlin Heidelberg</p><p>This work is subject to copyright. All rights are reserved, whether the whole or part of the materialis concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broad-casting, reproduction on microfilm or in any other way, and storage in data banks. Duplication ofthis publication or parts thereof is permitted only under the provisions of the German Copyright Lawof September 9, 1965, in its current version, and permission for use must always be obtained fromSpringer-Verlag. Violations are liable to prosecution under the German Copyright Law.</p><p>The use of general descriptive names, registered names, trademarks, etc. in this publication does notimply, even in the absence of a specific statement, that such names are exempt from the relevant</p><p>Department of NeurologyBuffalo Neuroimaging Analysis CenterThe Jacobs Neurological Institute</p><p>NY 14203, U.S.A.</p><p>akelemen@buffalo.edu</p><p>University at Buffalo, the State Universityof New York, 100 High Street, Buffalo </p><p>Printed on acid-free paper</p><p>9 8 7 6 5 4 3 2 1</p><p>springer.com</p><p>Cover Design: Deblik, Berlin, Germany</p><p>protective laws and regulations and therefore free for general use.</p></li><li><p>Preface</p><p>Bioinformatics involve the creation and advancement of algorithms using techniquesincluding computational intelligence, applied mathematics and statistics, informat-ics, and biochemistry to solve biological problems usually on the molecular level.Major research efforts in the field include sequence analysis, gene finding, genomeannotation, protein structure alignment analysis and prediction, prediction of geneexpression, protein-protein docking/interactions, and the modeling of evolution.</p><p>Computational intelligence is a well-established paradigm, where new theorieswith a sound biological understanding have been evolving. Defining computationalintelligence is not an easy task. In a nutshell, which becomes quite apparent in lightof the current research pursuits, the area is heterogeneous with a combination ofsuch technologies as neural networks, fuzzy systems, rough set, evolutionary com-putation, swarm intelligence, probabilistic reasoning, multi-agent systems etc. Therecent trend is to integrate different components to take advantage of complementaryfeatures and to develop a synergistic system.</p><p>This book deals with the application of computational intelligence in bioinfor-matics. Addressing the various issues of bioinforatics using different computationalintelligence approaches is the novelty of this edited volume. This volume comprisesof 13 chapters including some introductory chapters giving the fundamental defini-tions and some important research challenges. Chapters were selected on the basisof fundamental ideas/concepts rather than the thoroughness of techniques deployed.The thirteen chapters are organized as follows.</p><p>In the introductory Chapter, Tasoulis et al. present neural networks, evolution-ary algorithms and clustering algorithms and their application to DNA microar-ray experimental data analysis. Authors also discus different dimension reductiontechniques.</p><p>Chapter 2 by Kaderali and Radde provide an overview of the reconstructionof gene regulatory networks from gene expression measurements. Authors presentseveral different approaches to gene regulatory network inference, discuss theirstrengths and weaknesses, and provide guidelines on which models are appropriateunder what circumstances.</p></li><li><p>VI Preface</p><p>Chapter 3 by Donkers and Tuyls introduce Bayesian belief networks and describetheir current use within bioinformatics. The goal of the chapter is to help the readerto understand and apply belief networks in the domain of bioinformatics. Authorspresent the current state-of-the-art by discussing several real-world applications inbioinformatics, and also discuss some available software tools.</p><p>Das et al. in Chapter 4 explore the role of swarm intelligence algorithms in cer-tain bioinformatics tasks like micro-array data clustering, multiple sequence align-ment, protein structure prediction and molecular docking. This chapter begins withan overview of the basic concepts of bioinformatics along with their biological basisand then provides a detailed survey of the state of the art research centered aroundthe applications of swarm intelligence algorithms in bioinformatics.</p><p>Liang and Kelemen in the fifth Chapter propose a time lagged recurrent neuralnetwork with trajectory learning for identifying and classifying the gene functionalpatterns from the heterogeneous nonlinear time series microarray experiments. Op-timal network architectures with different memory structures were selected basedon Akaike and Bayesian information criteria using two-way factorial design. Theoptimal model performance was compared to other popular gene classification algo-rithms, such as nearest neighbor, support vector machine, and self-organized map.</p><p>In Chapter 6, Busa-Fekete et al. suggest two algorithms for protein sequenceclassification that are based on a weighted binary tree representation of protein simi-larity data. TreeInsert assigns the class label to the query by determining a minimumcost necessary to insert the query in the (precomputed) trees representing the vari-ous classes. Then TreNN assigns the label to the query based on an analysis of thequerys neighborhood within a binary tree containing members of the known classes.The algorithms were tested in combination with various sequence similarity scoringmethods using a large number of classification tasks representing various degrees ofdifficulty.</p><p>In Chapter 7, Smith compares the traditional dynamic programming RNA genefinding methodolgy with an alternative evolutionary computation approach. Exper-iment results indicate that dynamic programming returns an exact score at the costof very large computational resource usage, while the evolutionary computing ap-proach allows for faster approximate search, but uses the RNA secondary structureinformation in the covariance model from the start.</p><p>Schaefer et al. in Chapter 8, illustrate how fuzzy rule-based classification canbe applied successfully to analyze gene expression data. The generated classifierconsists of an ensemble of fuzzy if-then rules, which together provide a reliable andaccurate classification of the underlying data.</p><p>In Chapter 9, Huang and Chow overview the existing gene selection approachesand summarize the main challenges of gene selection. Using a typical gene selectionmodel, authors further illustrate the implementation of these strategies and evaluatetheir contributions.</p><p>Cao et al. in Chapter 10, propose a fuzzy logic based novel gene regulatorynetwork. The key motivation for this algorithm is that genes with regulatory rela-tionships may be modeled via fuzzy logic, and the strength of regulations may berepresented as the length of accumulated distance during a period of time intervals.</p></li><li><p>Preface VII</p><p>One unique feature of this algorithm is that it makes very limited a priori assumptionsconcerning the modeling.</p><p>In Chapter 11, Haavisto and Hytyniemi apply linear multivariate regressiontools for microarray gene expression data. Two examples comprising of yeast cellresponse to environmental changes and expression during the cell cycle, are used todemonstrate the presented subspace identification method for data-based modelingof genome dynamics.</p><p>Navas-Delgado et al. in Chapter 12 present an architecture for the developmentof Semantic Web applications, and the way it is applied to build an applicationfor systems biology. The architecture is based on an ontology-based system withconnected biomodules that could be globally analyzed as far as possible.</p><p>In the last Chapter Han and Zhu illustrate the various methods of encoding infor-mation in DNA strands and present the corresponding DNA algorithms, which willbenefit the further research on DNA computing.</p><p>We are very much grateful to the authors of this volume and to the reviewers fortheir tremendous service by critically reviewing the chapters. The editors would liketo thank Dr. Thomas Ditzinger (Springer Engineering Inhouse Editor) and ProfessorJanusz Kacprzyk (Editor-in-Chief, Springer Studies in Computational IntelligenceSeries) and Ms. Heather King (Springer Verlag, Heidelberg) for the editorial as-sistance and excellent cooperative collaboration to produce this important scientificwork. We hope that the reader will share our excitement to present this volume onComputational Intelligence in Bioinformatics and will find it useful.</p><p>Arpad Kelemen, Ajith Abraham and Yuehui Chen (Editors)September 2007</p></li><li><p>Contents</p><p>1 Computational Intelligence Algorithms and DNA MicroarraysD.K. Tasoulis, V.P. Plagianakos, and M.N. Vrahatis . . . . . . . . . . . . . . . . . . . . . . 11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.2 Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2</p><p>1.2.1 The Rprop Neural Network Training Algorithm . . . . . . . . . . . . . . . . . 31.2.2 The Adaptive Online Neural Network Training Algorithm . . . . . . . . . 4</p><p>1.3 Evolutionary Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5The Differential Evolution Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 5</p><p>1.4 Feature Selection and Dimension Reduction Techniques . . . . . . . . . . . . . . . 71.4.1 Principal Component Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.4.2 Reducing the Dimensions using Clustering . . . . . . . . . . . . . . . . . . . . . 7</p><p>Unsupervised kWindows Clustering Algorithm . . . . . . . . . . . . . . . . . 9The DBSCAN Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 10The Fuzzy cMeans Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . 12</p><p>1.4.3 The PDDP Clustering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121.4.4 Growing Neural Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131.4.5 A Hybrid Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13</p><p>1.5 Experimental Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141.5.1 DNA Microarray Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141.5.2 Evaluation of the Clustering Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 15</p><p>Clustering Based on Supervised Gene Selection . . . . . . . . . . . . 15Clustering Based on Unsupervised Gene Selection . . . . . . . . . 18</p><p>1.5.3 Using Clustering and FNN classifiers . . . . . . . . . . . . . . . . . . . . . . . . . . 211.5.4 Using Evolutionary Algorithms and FNN classifiers . . . . . . . . . . . . . . 24</p><p>1.6 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27</p><p>2 Inferring Gene Regulatory Networks from Expression DataLars Kaderali and Nicole Radde . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . ....</p></li></ul>