Cancer Diagnostics with DNA Microarrays
Steen Knudsen
Medical Prognosis Institute, Hørsholm, Denmark
A John Wiley & Sons, Inc., Publication
Copyright © 2006 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data is available.
ISBN-13: 978-0-471-78407-4
ISBN-10: 0-471-78407-9
Printed in the United States of America.
10 9 8 7 6 5 4 3 2 1
To Tarja
Contents
Preface
Acknowledgments
1. Introduction to DNA Microarray Technology
1.1 Hybridization
1.2 The Technology Behind DNA Microarrays
1.2.1 Affymetrix GeneChip Technology
1.2.2 Spotted Arrays
1.2.3 Digital Micromirror Arrays
1.2.4 Inkjet Arrays
1.2.5 Bead Arrays
1.2.6 Electronic Microarrays
1.2.7 Serial Analysis of Gene Expression (SAGE)
1.3 Parallel Sequencing on Microbead Arrays
1.3.1 Emerging Technologies
1.4 Summary
Further Reading
2. Image Analysis
2.1 Gridding
2.2 Segmentation
2.3 Intensity Extraction
2.4 Background Correction
2.5 Software
2.5.1 Free Software for Array Image Analysis
2.5.2 Commercial Software for Array Image Analysis
2.6 Summary
3. Basic Data Analysis
3.1 Normalization
3.1.1 One or More Genes Assumed Expressed at Constant Rate
3.1.2 Sum of Genes Is Assumed Constant
3.1.3 Subset of Genes Is Assumed Constant
3.1.4 Majority of Genes Assumed Constant
3.1.5 Logit Normalization
3.1.6 Spike Controls
3.2 Dye Bias, Spatial Bias, Print Tip Bias
3.3 Expression Indices
3.3.1 Average Difference
3.3.2 Signal
3.3.3 Model-Based Expression Index
3.3.4 Robust Multiarray Average
3.3.5 Position-Dependent Nearest Neighbor Model
3.3.6 Logit-t
3.4 Detection of Outliers
3.5 Fold Change
3.6 Significance
3.6.1 Multiple Conditions
3.6.2 Nonparametric Tests
3.6.3 Correction for Multiple Testing
3.6.4 Example I: t-Test and ANOVA
3.6.5 Example II: Number of Replicates
3.7 Mixed Cell Populations
3.8 Summary
Further Reading
4. Visualization by Reduction of Dimensionality
4.1 Principal Component Analysis
4.2 Supervised Principal Component Analysis
4.3 Independent Component Analysis
4.4 Example I: PCA on Small Data Matrix
4.5 Example II: PCA on Real Data
4.6 Summary
Further Reading
5. Cluster Analysis
5.1 Hierarchical Clustering
5.2 K-Means Clustering
5.3 Self-Organizing Maps
5.4 Distance Measures
5.4.1 Example: Comparison of Distance Measures
5.5 Time-Series Analysis
5.6 Gene Normalization
5.7 Visualization of Clusters
5.8 Summary
Further Reading
6. Molecular Classifiers for Cancer
6.1 Supervised versus Unsupervised Analysis
6.2 Feature Selection
6.3 Validation
6.4 Classification Schemes
6.4.1 Nearest Neighbor
6.4.2 Nearest Centroid
6.4.3 Neural Networks
6.4.4 Support Vector Machine
6.5 Performance Evaluation
6.6 Example I: Classification of SRBCT Cancer Subtypes
6.7 A Network Approach to Molecular Classification
6.7.1 Construction of HG-U95Av2-Chip Based Gene Network
6.7.2 Construction of HG-U133A-Chip Based Gene Network
6.7.3 Example: Lung Cancer Classification with Mapped Subnetwork
6.7.4 Example: Brain Cancer Classification with Mapped Subnetwork
6.7.5 Example: Breast Cancer Classification with Mapped Subnetwork
6.7.6 Example: Leukemia Classification with Mapped Subnetwork
6.8 Summary
Further Reading
7. Survival Analysis
7.1 Kaplan–Meier Approach
7.2 Cox Proportional Hazards Model
7.3 Software
Further Reading
8. Meta-Analysis
8.1 Gene Matching
8.2 Combining Measurements
8.2.1 Intensity
8.2.2 Effect Size
8.2.3 P-Values
8.2.4 Selected Features
8.2.5 Vote Counting
8.3 Meta-Classification
8.4 Summary
Further Reading
9. The Design of Probes for Arrays
9.1 Gene Finding
9.2 Selection of Regions Within Genes
9.3 Selection of Primers for PCR
9.3.1 Example: Finding PCR Primers for Gene AF105374
9.4 Selection of Unique Oligomer Probes
9.5 Remapping of Probes
Further Reading
10. Software Issues and Data Formats
10.1 Standardization Efforts
10.2 Databases
10.3 Standard File Format
10.4 Software for Clustering
10.5 Software for Statistical Analysis
10.5.1 Example: Statistical Analysis with R
10.5.2 The affy Package of Bioconductor
10.5.3 Survival Analysis
10.5.4 Commercial Statistics Packages
10.6 Summary
Further Reading
11. Breast Cancer
11.1 Introduction
11.1.1 Anatomy of the Breast
11.1.2 Breast Tumors
11.2 Current Diagnosis
11.2.1 Staging
11.2.2 Histopathological Grading
11.2.3 Clinical Markers
11.3 Current Therapy
11.3.1 Surgery
11.3.2 Chemotherapy
11.3.3 Hormone Therapy
11.3.4 Monoclonal Antibody Therapy
11.4 Microarray Studies of Breast Cancer
11.4.1 The Stanford Group
11.4.2 The Netherlands Cancer Institute Group
11.4.3 The Duke University Group
11.4.4 The Lund Group
11.4.5 The Karolinska Group
11.4.6 The Tokyo Group
11.4.7 The Veridex Group
11.4.8 The National Cancer Institute Group
11.5 Meta-Classification of Breast Cancer
11.6 Summary
Further Reading
12. Leukemia
12.1 Current Diagnosis
12.2 Current Therapy
12.3 Microarray Studies of Leukemia
12.3.1 The Boston Group
12.3.2 The Austria Group
12.3.3 The Utah Group
12.3.4 The Memphis Group
12.3.5 The Japan Group
12.3.6 The Munich Group
12.3.7 The Stanford Group
12.3.8 The Copenhagen Group
12.3.9 The Netherlands Group
12.4 Summary
Further Reading
13. Lymphoma
13.1 Microarray Studies of Lymphoma
13.1.1 The Stanford Group
13.1.2 The Boston Group
13.1.3 The NIH Group
13.1.4 The NCI Group
13.2 Meta-Classification of Lymphoma
13.2.1 Matching of Components
13.2.2 Blind Test Set
13.3 Summary
Further Reading
14. Lung Cancer
14.1 Microarray Studies of Lung Cancer
14.1.1 The Harvard/Whitehead Group
14.1.2 The Minnesota Group
14.1.3 The Vanderbilt Group
14.1.4 The Tokyo Group
14.1.5 The Michigan Group
14.1.6 The Mayo Clinic
14.1.7 The Toronto Group
14.1.8 The NIH Group
14.1.9 The Stanford Group
14.1.10 The Israel Group
14.2 Meta-Classification of Lung Cancer
14.3 Summary
Further Reading
15. Bladder Cancer
15.1 Microarray Studies of Bladder Cancer
15.1.1 The Aarhus Group
15.1.2 The New York Group
15.1.3 The Dusseldorf Group
15.2 Summary
Further Reading
16. Colon Cancer
16.1 Microarray Studies of Colon Cancer
16.1.1 The Princeton Group
16.1.2 The Maryland Group
16.1.3 The Aarhus Group
16.1.4 The Marseilles Group
16.1.5 The San Diego Group
16.1.6 The Helsinki Group
16.2 Summary
Further Reading
17. Ovarian Cancer
17.1 Microarray Studies of Ovarian Cancer
17.1.1 Duke University Medical Center
17.1.2 The Stanford Group
17.1.3 The London Group
17.1.4 The UCLA Group
17.1.5 The NIH Group
17.1.6 The Novartis Group
17.2 Summary
Further Reading
18. Prostate Cancer
18.1 Microarray Studies of Prostate Cancer
18.1.1 University of Michigan Medical School
18.1.2 Memorial Sloan-Kettering Cancer Center
18.1.3 The NIH Group
18.1.4 The Harvard Group
18.1.5 The Australian Group
18.1.6 The Stanford Group
18.2 Summary
Further Reading
19. Melanoma
19.1 Microarray Studies of Melanoma
19.1.1 Memorial Sloan-Kettering Cancer Center
19.1.2 The NIH Group
19.2 Summary
Further Reading
20. Brain Tumors
20.1 Microarray Studies of Brain Tumors
20.1.1 Harvard Medical School
20.1.2 M. D. Anderson Cancer Center
20.1.3 UCLA School of Medicine
20.1.4 The Lausanne Group
20.1.5 Deutsches Krebsforschungszentrum
20.1.6 Children’s Hospital, Boston
20.2 Summary
Further Reading
21. Organ or Tissue Specific Classification
Further Reading
22. Sample Collection and Stability
22.1 Tissue Samples
22.1.1 Stability of Tissue After Surgical Removal
22.1.2 Stability of Sample in Storage
22.1.3 Paraffin-Embedded Tissue Samples
22.2 Blood Samples
22.3 Sample Heterogeneity
22.4 Summary
Further Reading
References
Index
Preface
A new technology is about to enter cancer diagnostics. DNA microarrays are currently showing great promise in all the medical research projects to which they are being applied. This book presents the current status of the area as well as reviews and summaries of the results from specific cancer types. For types where several comparable studies have been published, a meta-analysis of the results is presented.
This book is intended for a wide audience from the practicing physician to the statistician. Both will find the chapters where I review areas of their expertise superficial, but I hope they will find other introductory chapters useful in understanding the many aspects of microarray technology applied to cancer diagnostics.
I first describe the current state of the technology as well as emerging technologies. Then I describe the statistical analysis that is necessary to interpret the data. Next I cover some of the major human cancer types where microarrays have been applied with success, including studies that I have been a part of. I conclude that for several cancer types the results are so good and consistent that DNA microarrays are ready to be deployed in clinical practice. The clinical application in question is helping to select patients for adjuvant chemotherapy after surgery by determining the prognosis more accurately than what is possible today.
Chapters 1–6 on technology and statistical analysis and Chapters 9 and 10 on chip design and software are updated and expanded versions of chapters in my previous Wiley book, Guide to Analysis of DNA Microarray Data, Second Edition (2004). The remaining 13 chapters are new.
Steen Knudsen
Birkerød, Denmark
October 2005
Acknowledgments
My previous books, on which some of the chapters in this book are based, were written while I worked for the Technical University of Denmark. I am grateful to the University leadership, the funding agencies, my group members, and collaborators for their significant role in that endeavor. The review of individual cancer types, to a large extent based on information from the American Cancer Society, was also written during my employment at the Technical University of Denmark.
The remaining chapters were written while I was employed by the Medical Prognosis Institute. The original research on meta-analysis of cancer classification as well as subnetwork mapping of individual cancer types has been patented by Medical Prognosis Institute, and I am grateful for being allowed to present the results here.
S. K.