Beginning Microarray Data Analysis: A Biologist's Guide to Analysis of DNA Microarray Data
Book Reviews 1649
A Biologist’s Guide to Analysis of DNA Microarray Data
by Steen Knudsen
John Wiley & Sons (2002) 144 pages. ISBN 0471224901
US$44.95
DNA microarrays have revolutionized biology. Instead of studying one gene or one protein at a time, scientists are now studying many simultaneously. This global approach has created many new opportunities to study human disease. For example, a number of microarray studies have demonstrated the existence of different clinical subtypes of cancer with different prognoses from those identified by other methods (Alizadeh et al., 2000; Bittner et al., 2000). Many biologists have jumped on this bandwagon and started performing their own microarray experiments. However, data analysis can often be confusing, because, on the one hand, this field is evolving quickly, and, on the other, the modern data mining techniques may appear to be daunting and intractable. Although there are several microarray books on the market, and few are dedicated to data analysis (Leung, 2002), there is no single book tailored to biologists.
A Biologist’s Guide to Analysis of DNA Microarray Data is a good starting point for biologists new to data analysis. Written by Steen Knudsen, the book is composed of 14 chapters. The book starts with an introductory chapter explaining the main principles and usage of DNA microarrays. This is followed by a chapter presenting an overview of data analysis, in which all the methods are summarized in a simple flowchart. This useful chart clearly shows the basic workflow of microarray data analysis and includes the experimental setups.
Most contemporary data analysis methods are discussed in chapters 3-8, in which the underlying principles are illustrated with vivid and simple examples. In chapter 3, Knudsen describes the basic data analysis methods, including scaling the measurements in the sample and the control, calculating the change in expression level of a gene and determining the significance of the expression level of a gene by using Student’s t-tests, ANOVA or non-parametric tests. Potential problems, such as outliers and multiple testing, are discussed. Chapters 4-8 introduce various data processing and mining methods: principal component analysis for dimensionality reduction; cluster analysis, including hierarchical clustering, K-means clustering and self-organizing maps; various distance measures and their effects on data clustering; normalization methods to correct systematic biases; mining functions of orphan proteins and regulatory relationships between genes; reverse engineering of regulatory networks by time-series and steady-state approaches; constructing molecular classifiers, including nearest neighbor, neural networks and support vector machines. The constraints of these data analysis methods are emphasized and discussed in detail. This is very helpful for those without a strong background in statistics, as the limitations of statistical analysis methods are often overlooked.
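To make the chapter 3 workflow concrete for readers new to it, the following is a minimal Python sketch of the steps named above: computing the change in expression between sample and control, testing its significance, and adjusting the threshold for multiple testing. It is not taken from the book, which uses Awk and R. The gene names and intensity values are hypothetical, Welch's variant of the t statistic is assumed, and Bonferroni is used here only as one simple way to address multiple testing.

```python
import math
from statistics import mean, variance


def log2_fold_change(sample, control):
    """Log2 ratio of mean expression in the sample versus the control."""
    return math.log2(mean(sample) / mean(control))


def welch_t(sample, control):
    """Welch's t statistic (two groups, variances not assumed equal)."""
    se = math.sqrt(variance(sample) / len(sample)
                   + variance(control) / len(control))
    return (mean(sample) - mean(control)) / se


def bonferroni_threshold(alpha, n_genes):
    """Per-gene significance threshold when n_genes tests are run."""
    return alpha / n_genes


# Hypothetical, already-scaled intensities: (sample replicates, control replicates)
genes = {
    "geneA": ([200.0, 220.0, 210.0], [100.0, 110.0, 105.0]),
    "geneB": ([50.0, 55.0, 45.0], [52.0, 48.0, 50.0]),
}

for name, (sample, control) in genes.items():
    lfc = log2_fold_change(sample, control)
    t = welch_t(sample, control)
    print(f"{name}: log2 fold change = {lfc:.2f}, t = {t:.2f}")

print("per-gene threshold for 10000 genes:", bonferroni_threshold(0.05, 10000))
```

Converting the t statistic to a p-value requires the t distribution, which a statistics package such as R provides; the sketch stops at the statistic to stay within the Python standard library.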
In chapter 9, Knudsen discusses the various considerations that need to be taken into account when selecting the appropriate probes for arrays. In chapter 10, the limitations of expression analysis are outlined. In particular, microarray expression study, transcriptomics, is primarily focused on gene expression and neglects many other aspects of cellular dynamics, such as alternative splicing, protein translation, post-translational modifications and degradation. Users need to be very cautious before making bold conclusions on the basis of their expression data. The genotyping array, a close relative to expression array, is briefly discussed in chapter 11. The discussion is largely concentrated on the author’s interest in neural network sequence prediction.
Cell biologists often want to know which software is best for microarray data analysis. Chapters 12 and 13 provide a quick overview of the issues related to the choice of software. Often commercial software gives a false sense of security: it has inherent limitations, such as making implicit analysis assumptions for you. Therefore, Knudsen advocates the use of open source/free software for data analysis. There are a few important take-home messages regarding software: standardizing the data format will greatly assist data sharing and comparability; learning a scripting language like Awk or Perl will allow you to manipulate your data with ease; and learning an open source statistical language, such as R, will allow you to run different analyses. In addition, with R there are numerous extension modules, libraries, that are written specifically for microarray data analysis, and almost all are free. A great feature of this book is that it shows a number of simple Awk scripts and R commands for various statistical analyses. Therefore, the reader can follow these step-by-step codes to experience first-hand command-line-driven programs.
There are some drawbacks to this book. Firstly, the background to these various statistical analyses is only briefly discussed; therefore, it requires some statistical training to appreciate many of the chapters. This conflicts with the book’s objective of guiding biologists
Beginning Microarray Data Analysis
1650 Journal of Cell Science 116 (9)
All about Drosophila eye development (almost)
individual topics. These range from the earliest establishment of the eye-field to retinal connections, colour vision and Drosophila as a model for disease, and together give a fairly comprehensive overview of the current areas of interest in this field. Reading the book from cover to cover is slightly problematic. There was quite a lot of repetition in the early chapters, which describe early patterning genes; TGF-β and Hedgehog signalling featured in several places too. These chapters created a modicum of confusion since the conclusions seemed to differ slightly. Beyond these early chapters we were hooked by a wide range of contributions that each touch on a different problem. Particularly refreshing were those topics that we had not come across in other reviews, such as the evolution of colour vision and applications to human disease modelling. The latter chapter starts with a good, succinct overview of some of the major contributions made through studies of the eye and would be a useful chapter to give to upper-level undergraduates to illustrate the diversity of applications of this model.
The book, therefore, provides a valuable introduction to an important paradigm in developmental biology. As a whole it might not be of immediate relevance to cell biologists, although chapters on regulation of growth and proliferation, protein stability, and programmed cell death would be of interest to cell and developmental biologists alike. In addition there is one gem by Don Ready about the emergence of form in the eye, which describes the progression in cell shapes and cell contacts that occur as the eye develops. This short essay highlights some of the amazing changes in cell morphology and considers how these could contribute forces that shape the geometrical regularity of the Drosophila compound eye.
There is a danger that the book will become dated as the field progresses, and so those chapters with a well-rounded historical perspective are likely to be the ones that better stand the test of time. Some of the chapters tend to focus on the most recent findings, whereas others, even amongst the better known topics, manage to achieve a balance between the two. Occasionally, we found ourselves wanting more debate on
without special training through the analysis step. However this is an unavoidable trade-off to make the book easier to read. Secondly, the book does not emphasize enough the experimental design, which could significantly affect the data analysis in the later stages of the experiment. Without thorough planning and an understanding of the analysis methods, microarray analysis risks being a ‘fishing expedition’. But with a careful and critical approach experiments can be quite the opposite. Finally, some of the chapters in this book are just too short to be justified as such. For example, chapters 9-11 are only between three and six pages long. More discussion of the issues raised would be welcome, even in an introductory text.
Nonetheless this book is a good starting point for cell biologists who are interested in analysis of DNA microarrays. It provides a background to microarray data analysis and a quick overview of the current trends. A Biologist’s Guide to Analysis of DNA Microarray Data does a marvelous job of introducing biologists into the realm of genomic data analysis.
References
Alizadeh, A. A., Eisen, M. B., Davis, R. E., Ma, C., Lossos, I. S., Rosenwald, A., Boldrick, J. C., Sabet, H., Tran, T. and Yu, X. (2000). Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403, 503-511.
Bittner, M., Meltzer, P., Chen, Y., Jiang, Y., Seftor, E., Hendrix, M., Radmacher, M., Simon, R., Yakhini, Z. and Ben-Dor, A. (2000). Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature 406, 536-540.
Leung, Y. F. (2002). Microarray data analysis for dummies... and experts too? Trends Biochem. Sci. 27, 433-434.
Bao Jian Fan 1 and Yuk Fai Leung 2
1Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong
2Bauer Center for Genomics Research, Harvard University, 7 Divinity Avenue, Cambridge, MA 02138, USA
Journal of Cell Science 116, 1649-1651 © 2003 The Company of Biologists Ltd
doi:10.1242/jcs.00436
Drosophila Eye Development
edited by Kevin Moses
Springer-Verlag (2002) 282 pages. ISBN 3-540-42590-X
£97.50/$149
Knowledge of Drosophila eye development has grown almost exponentially over the past few decades. Not only are the mechanisms that account for the formation of this multifaceted structure intrinsically interesting in their own right, but they have also contributed enormously to our understanding of general developmental paradigms and molecular pathways. The explosion in research into the Drosophila eye was sparked principally by the groups of Seymour Benzer and Gerry Rubin, and it is primarily their offspring who have contributed the chapters to the recent volume Drosophila Eye Development (Vol. 37 in the series Results and Problems in Cell Differentiation), edited by Kevin Moses.
Each chapter in the book is a stand-alone review making it possible to read up on
the differing current models and controversies, but the extensive reference lists should allow readers to explore these for themselves. The quality of figures used to illustrate each chapter also varies considerably. Those comparing vertebrate and Drosophila eye development are exceptionally good, particularly the colour diagrams. The chapter on cell death is also nicely illustrated, but in some others the figures are thin on the ground, which makes sections a bit dry or hard to follow.
With contributions from many key researchers in the field, this book provides an excellent reference text for those already working with the Drosophila eye and, for those about to, it conveys a fascination for the eye and the intricacy of its development. The only major shortcoming is its cover price (almost £100), which means that a book that many might like to have on their personal bookshelves will instead be confined largely to library shelves.
Sarah Bray and Ruth Johnson
Department of Anatomy, University of Cambridge, UK
Journal of Cell Science 116, 1649-1651 © 2003 The Company of Biologists Ltd
doi:10.1242/jcs.00435