Pathway analysis of coronary atherosclerosis
Jennifer Y. King, Rossella Ferrara, Raymond Tabibiazar, Joshua M. Spin, Mary M. Chen, Allan Kuchinsky, Aditya Vailaya, Robert Kincaid, Anya Tsalenko, David Xing-Fei Deng, Andrew Connolly, Peng Zhang, Eugene Yang, Clifton Watt, Zohar Yakhini, Amir Ben-Dor, Annette Adler, Laurakay Bruhn, Philip Tsao, Thomas Quertermous and Euan A. Ashley
Physiol. Genomics 23:103-118, 2005. First published Jun 7, 2005; doi:10.1152/physiolgenomics.00101.2005

This article cites 55 articles, 26 of which you can access free at: http://physiolgenomics.physiology.org/cgi/content/full/23/1/103#BIBL
Updated information and services, including high-resolution figures, can be found at: http://physiolgenomics.physiology.org/cgi/content/full/23/1/103
Additional material and information about Physiological Genomics can be found at: http://www.the-aps.org/publications/pg
This information is current as of October 3, 2005.

Physiological Genomics publishes results of a wide variety of studies from human and from informative model systems with techniques linking genes and pathways to physiology, from prokaryotes to eukaryotes. It is published quarterly in January, April, July, and October by the American Physiological Society, 9650 Rockville Pike, Bethesda MD 20814-3991. Copyright © 2005 by the American Physiological Society. ISSN: 1094-8341, ESSN: 1531-2267. Visit our website at http://www.the-aps.org/.

Pathway analysis of coronary atherosclerosis

Jennifer Y. King,1,* Rossella Ferrara,1,* Raymond Tabibiazar,1 Joshua M. Spin,1 Mary M. Chen,1 Allan Kuchinsky,2 Aditya Vailaya,2 Robert Kincaid,2 Anya Tsalenko,2 David Xing-Fei Deng,2 Andrew Connolly,1 Peng Zhang,1 Eugene Yang,1 Clifton Watt,1 Zohar Yakhini,2 Amir Ben-Dor,2 Annette Adler,2 Laurakay Bruhn,2 Philip Tsao,1 Thomas Quertermous,1,* and Euan A. Ashley1,*

1Donald W. Reynolds Cardiovascular Research Center, Division of Cardiovascular Medicine, Falk Cardiovascular Research Center, Stanford University, Stanford; and 2Agilent Laboratories, Palo Alto, California

Submitted 2 May 2005; accepted in final form 29 May 2005

King, Jennifer Y., Rossella Ferrara, Raymond Tabibiazar, Joshua M. Spin, Mary M. Chen, Allan Kuchinsky, Aditya Vailaya, Robert Kincaid, Anya Tsalenko, David Xing-Fei Deng, Andrew Connolly, Peng Zhang, Eugene Yang, Clifton Watt, Zohar Yakhini, Amir Ben-Dor, Annette Adler, Laurakay Bruhn, Philip Tsao, Thomas Quertermous, and Euan A. Ashley. Pathway analysis of coronary atherosclerosis. Physiol Genomics 23: 103–118, 2005. First published June 7, 2005; 10.1152/physiolgenomics.00101.2005.

Large-scale gene expression studies provide significant insight into genes differentially regulated in disease processes such as cancer. However, these investigations offer limited understanding of multisystem, multicellular diseases such as atherosclerosis. A systems biology approach that accounts for gene interactions, incorporates nontranscriptionally regulated genes, and integrates prior knowledge offers many advantages. We performed a comprehensive gene-level assessment of coronary atherosclerosis using 51 coronary artery segments isolated from the explanted hearts of 22 cardiac transplant patients. After histological grading of vascular segments according to American Heart Association guidelines, isolated RNA was hybridized onto a customized 22-K oligonucleotide microarray, and significance analysis of microarrays and gene ontology analyses were performed to identify significant gene expression profiles. Our studies revealed that loss of differentiated smooth muscle cell gene expression is the primary expression signature of disease progression in atherosclerosis. Furthermore, we provide insight into the severe form of coronary artery disease associated with diabetes, reporting an overabundance of immune and inflammatory signals in diabetics. We present a novel approach to pathway development based on connectivity, determined by language parsing of the published literature, and ranking, determined by the significance of differentially regulated genes in the network. In doing this, we identify highly connected “nexus” genes that are attractive candidates for therapeutic targeting and follow-up studies. Our use of pathway techniques to study atherosclerosis as an integrated network of gene interactions expands on traditional microarray analysis methods and emphasizes the significant advantages of a systems-based approach to analyzing complex disease.

pathways; networks; systems biology; gene expression profiling; microarray; cardiovascular disease; coronary arterial disease

THE DEVELOPMENT of microarray technology has grown from modest beginnings to the present day, where expression profiling of whole genomes is routine. However, high-throughput gene expression profiling presents a unique difficulty in the need to identify and distinguish significant changes in gene expression from among the tens of thousands of genes that can be assayed simultaneously. The challenge of multiple testing has been met by many statisticians, but two approaches have become standard: significance analysis of microarrays (SAM) and hierarchical clustering. SAM determines statistically significant changes in gene expression by applying a modified t-test (54), controlling false discovery through a permutation technique. Hierarchical clustering applies statistical algorithms to group genes according to the similarity in gene expression pattern (10), where “similarity” is commonly defined by Euclidean distance or a correlation coefficient. Although these approaches are useful in identifying and distinguishing sets of statistically significant genes from the thousands of other genes on an array, they do not lend structure or integration to the results. Hierarchical clustering has been used as a pathway discovery tool (changes in expression of genes in activated networks are expected to correlate) (21), and there has been interest in extending this approach to model topological and dynamic properties of the networks that control the behavior of the cell, especially in organisms such as Escherichia coli (3, 25, 28, 32). Although successful in lower-order organisms, such approaches have had limited application to human disease. Furthermore, the pathway-based approach we present here can take into account prior knowledge by expanding the context beyond the genes and gene changes in the current experiment (14).

Atherosclerosis, an inflammatory disease stemming from genetic and environmental factors, is the primary disease of coronary arteries (27, 38, 41, 42). It is the number one killer in the United States (1) and every year claims more lives than the next five leading causes of death combined (1). The role of genetics in atherosclerosis pathophysiology has been recognized for some time: inheritance of risk factors was first shown in classical twin (11, 18, 20, 35) and family history studies (40, 56). Diabetes (4, 57), hypercholesterolemia (26), hypertension (57), obesity (7), smoking (57), and physical inactivity (2) are also known risk factors for the disease. Although interventional cardiology procedures such as balloon angioplasty, stenting (39), and atherectomy (44) have been successful in combating local coronary artery disease (CAD), this has not been met by equivalent success in interrupting the underlying disease at the molecular level (pleiotropic effects of statins and aspirin notwithstanding). There is, for example, no treatment currently on the market designed to target the molecular interactions of the disease process itself.

Traditional approaches to the molecular study of atherosclerosis target one molecule or gene thought to have an important role in the development of the disease and then manipulate it through knockout, knockdown, or transgenic technology (13, 34). Although much has been learned about atherosclerosis through these studies, there remains a lack of understanding of the disease as an integrated whole. With this in mind, we undertook a comprehensive gene-level assessment of coronary vascular disease taking a network, or pathway-based, approach to analysis. A customized microarray platform was designed to assay the expression profiles of vessels from human hearts explanted at the time of orthotopic heart transplant. Statistical tools mined the resulting data for differentially regulated genes, and further study was conducted with pathway generation and network analysis tools developed specifically for these data. Our findings have immediate significance for the investigation of pharmacological approaches to the prevention and interruption of atherosclerosis.

* J. Y. King and R. Ferrara, and T. Quertermous and E. A. Ashley, contributed equally to this work.

Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).

Address for reprint requests and other correspondence: T. Quertermous or E. A. Ashley, Donald W. Reynolds Cardiovascular Research Center, Div. of Cardiovascular Medicine, Falk Cardiovascular Research Center, Stanford Univ., 300 Pasteur Dr., Stanford, CA 94305 (e-mail: [email protected] [email protected]).

Physiol Genomics 23: 103–118, 2005. First published June 7, 2005; 10.1152/physiolgenomics.00101.2005.

1094-8341/05 $8.00 Copyright © 2005 the American Physiological Society

MATERIALS AND METHODS

Development of the Array Platform

In developing the array platform in an era that preceded “whole genome” arrays, our intention was to include the widest possible array of genes expressed in the cardiovascular system. To do this, we used a strategy that included sequencing clones from stimulated vascular cells in culture (to provide potentially unknown genes), searching the literature and databases, and combining the results with a substantial commercial clone set (Incyte).

Stanford clone set. Human aortic smooth muscle cells (HASMCs) and human aortic endothelial cells (HAECs) (Clonetics; San Diego, CA) were serum starved and stimulated separately with 10 ng/ml TNF-α (R&D Systems; Minneapolis, MN). HASMCs were also stimulated with 3 ng/ml transforming growth factor (TGF)-β (R&D Systems) and 20 ng/ml PDGF-BB (R&D Systems). Cells collected at 30-min, 3-h, and 24-h time points were pooled for poly(A)+ RNA isolation, and suppression subtraction was performed as previously described (17). A total of 6,954 cDNAs was cloned into plasmid, miniprepped, sequenced, and matched to GenBank accession numbers, which were collapsed into UniGene clusters and annotated using RefSeq where possible.

Search strategy. A team of six investigators compiled lists of genes relevant to the cardiovascular system under subheadings that included “atherosclerosis,” “smooth muscle cell,” “endothelial cell,” “apoptosis,” “cytokine,” and “adhesion molecule.” Genes were drawn from PubMed and interest groups such as the Cytokine Family cDNA Database (http://cytokine.medic.kumamoto-u.ac.jp/). Genes were then identified within the National Institutes of Health (NIH) UniGene database with preference given to genes curated in RefSeq.

Bioinformatics. The Stanford clone set and search-output gene lists were then collated. All sequences were checked for quality with Phred and Phrap software and then masked for repeats and vectors (RepeatMasker: http://repeatmasker.genome.washington.edu/). Sequences were then compared using the BLAST algorithm on a local Linux server to the most up-to-date UniGene database. Hits with E scores � 20 were discarded. Hits on the curated RefSeq database were preferred. In the case of no hits, long expressed sequence tags (ESTs) with good quality were kept as sequences. After the literature and Stanford clone lists were combined, redundancy checking was carried out. Finally, an algorithm was written to check for splice variants. Our approach to splice variants was to use the consensus sequence wherever possible. To generate the 22-K feature array used for this study, 60-mer oligonucleotide sequences from the Agilent human 1A (V2) and human 1B arrays (Agilent; Palo Alto, CA) were used in cases where there was a match to sequences from the collated and filtered Stanford clone set and search-output gene list. In cases where gene sequences from the Stanford list were not represented on either of these arrays, Agilent’s custom microarray probe design algorithms were applied to design custom 60-mer probes. The resulting customized array was printed using inkjet technology.

Vessel Harvest

Major epicardial coronary arteries were carefully dissected from the explanted hearts of 22 patients undergoing orthotopic heart transplantation. Investigators were on call with the surgical team and collected the organ at the time of explant. After dissection, which occurred immediately after explant, vessels were dissected longitudinally to expose the endoluminal surface. Lesions were identified and scored by inspection through a dissecting microscope. Arteries were then divided into 1.0- to 2.0-cm segments macroscopically designated as disease or disease-free segments. Midportions of each segment were separated and stored both as frozen sections (Optimal Cutting Temperature) and in formalin for later embedding, sectioning, and staining. Both Institutional Review Board protocol approval and informed consent from transplant patients were obtained for this study.

Histology and Immunohistochemistry

Cryostat sections of 4 µm were fixed in acetone for 2 min, stained with hematoxylin for 30 s, stained with eosin for 5 s, dehydrated in graded alcohols, and coverslipped with permanent mounting solution after being cleared with xylene.

Immunohistochemistry was carried out using the following primary antibodies: CD14 (CBL453, Chemicon; Temecula, CA), α-actinin (MAB 1682, Chemicon), and IL-2 receptor-� (AB9496, Abcam; Cambridge, MA). Frozen sections as described above were rehydrated with distilled water and then blocked with peroxidase (3%) and protein (PK-6102, Vectastain ABC Kit, Vector Labs; Burlingame, CA). Primary antibodies were titrated at 1:500 and 1:1,000 dilution at 200 µl/slide for 1 h at room temperature. After slides were washed, the secondary antibody (PK-6102, Vectastain ABC Kit) was added for 30 min at room temperature. Horseradish peroxidase (PK-6102, Vectastain ABC Kit) was added for 30 min at room temperature followed by diaminobenzidine (DAB) substrate for 5 min. Slides were counterstained with hematoxylin and dehydrated with graded alcohol.

American Heart Association Classification

In 32 samples out of our total 51 hybridized samples, we obtained histological grading according to American Heart Association (AHA) classification guidelines (50–52). Type I or initial lesions are recognized by an increase in the number of intimal macrophages and the appearance of lipid droplet-filled macrophages, or foam cells. Type III lesions are intermediate lesions that appear rich in lipid-laden cells and contain scattered collections of extracellular lipid droplets and particles that disrupt the coherence of some intimal smooth muscle cells. Type V or advanced lesions are rich in fibrous connective tissue. They usually contain a lipid core and thick layers of connective tissue. The prototypical lesion is designated as fibroatheroma or a type Va lesion. Advanced lesions with prominent calcification are designated as type Vb. Advanced lesions with little accumulated lipid, minimal or absent calcification, and a predominance of connective tissue are designated type Vc. We did not find samples with lesion types II, IV, or VI, and only two samples were classified as “no disease” (see Table 2) (51, 52).

RNA Isolation and Quantification

Total RNA was isolated from segments of coronary artery using a combination of TRIzol reagent (Invitrogen; Carlsbad, CA) and RNeasy Mini kit (Qiagen; Valencia, CA) techniques. One-centimeter-long coronary artery segments were flash frozen in liquid nitrogen and then placed into a spring-loaded hammer mechanism designed to crush frozen tissue into a fine powder by freeze fracture. The powder was collected, and TRIzol reagent was added before tissue homogenization with a 7 × 120-mm generator (Pro Scientific; Oxford, CT) attached to a fixed motor. After complete disruption of the tissue, chloroform was added, and the sample was shaken thoroughly before a brief incubation at room temperature. The samples were then centrifuged, and the supernatant (containing the RNA) was carefully removed without disturbing the cellular debris below it. The supernatant was placed into a fresh tube, and a mixture of Buffer RLT (Qiagen) and β-mercaptoethanol as well as 100% ethanol was added. The resulting solution was centrifuged through RNeasy Mini columns, allowing for binding of the RNA to the column membrane. Afterward, contaminants were removed by repeated wash spins of the membrane. The RNA was then eluted with RNase-free water into a new microcentrifuge tube.

Extensive quality control testing of all RNA was done before hybridization to the array. Samples were quantitated with a NanoDrop ND-1000 Spectrophotometer (NanoDrop; Wilmington, DE), which is able to detect small amounts of RNA with a high degree of sensitivity. RNA integrity was also assessed using Agilent’s 2100 Bioanalyzer System and RNA 6000 Pico LabChip Kit. The Bioanalyzer is a microfluidics-based system that allows for rapid visualization of RNA quality and quantity even with extremely small amounts of RNA. The system detects biomolecules intercalated with fluorescent dye by laser-induced fluorescence.

Direct Labeling and Oligonucleotide Array Hybridization

Agilent’s array technology involves the use of dual-dye channels, one for sample RNA and the other for reference RNA. By pilot testing for maximum yield of positive reference data, it was determined that the best reference RNA was a mixture of 80% HeLa RNA and 20% human umbilical vein endothelial cell (HUVEC) RNA. Ten micrograms of RNA for both the sample and reference were used for hybridization. Reference and sample RNA were labeled separately and then combined later during the hybridization process. A DNA primer was annealed to both sample and reference RNA during a brief incubation period. Fluorescent-labeled cDNA was then reverse transcribed from the RNA using SuperScript II (Invitrogen) and either cyanine-3-dCTP or cyanine-5-dCTP (Perkin-Elmer/NEN; Boston, MA). In our case, reference RNA was labeled with cyanine-3-dCTP, and sample RNA was labeled with cyanine-5-dCTP. After reverse transcription, any remaining RNA was degraded by adding RNase A (Amersham; Piscataway, NJ). Each labeled sample cDNA was then combined with its corresponding labeled reference cDNA and purified through QIAquick PCR Purification Kit spin columns (Qiagen) to remove unincorporated dye-labeled nucleotides. After a wash spin to remove contaminants, labeled cDNA (from both initial sample and reference RNA) was eluted into a new microcentrifuge tube. After the addition of 10× control targets (Agilent) and 2× hybridization buffer (Agilent), the resulting hybridization solution was dispensed onto a gasket slide (Agilent), and the array was carefully lowered on top. The sandwiched slides were placed into SureHyb hybridization chambers (Agilent), which were hand tightened to ensure a tight seal between the gasket slide and the array. The arrays were incubated for 16 h in a specialized 60°C hybridization oven, which kept the hybridization chambers in constant rotation so that the hybridization solution would be in continuous moving contact with the entire surface of the array. After incubation, the arrays were removed from their chambers, separated from their gasket slides, and washed in two different wash buffers before being spin dried in a centrifuge and scanned.

Scanning and Feature Extraction of Arrays

All arrays were scanned with Agilent’s G2565AA Microarray Scanner System. The scanner uses two lasers, a single harmonic generator-yttrium aluminum garnet laser and a helium-neon laser, to excite and measure fluorescence from the cyanine-3- and cyanine-5-labeled cDNA.

Feature extraction software (Agilent) calculates log ratios and P values for valid features on each array and provides a confidence measure of gene differential expression by performing outlier removal, background subtraction, and dye normalization for each feature. The software filters features that are not positive and significant with respect to background or features that are saturated. It then fits a normalization curve across the array using the locally weighted linear regression curve fit (LOWESS) algorithm to detect and correct dye bias.

Feature values excluded according to the above criteria were estimated using the k-nearest neighbor (KNN) imputation method of Troyanskaya et al. (53). Imputed values were used when missing values constituted <30% of row features. Otherwise, the row mean was used.
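The imputation rule above can be illustrated with a minimal sketch. It is not the authors' or Troyanskaya et al.'s implementation: the function name `impute_missing`, the neighbor count `k=2`, the root-mean-square distance over shared columns, and the unweighted neighbor average are all assumptions made for illustration, and the 30% cutoff is passed in as a parameter. Missing values are represented as `None` in a genes-by-samples matrix.

```python
import math

def impute_missing(matrix, k=2, max_missing_frac=0.30):
    """Fill None entries row by row (rows = genes, columns = samples).

    Rows with a missing fraction below max_missing_frac use the mean of the
    k nearest rows (hypothetical simplification of KNN imputation);
    all other rows fall back to their own row mean.
    """
    filled = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        missing = [j for j, v in enumerate(row) if v is None]
        observed = [v for v in row if v is not None]
        if not missing or not observed:
            continue  # nothing to fill, or nothing to base an estimate on
        row_mean = sum(observed) / len(observed)
        use_knn = len(missing) / len(row) < max_missing_frac
        for j in missing:
            estimate = row_mean
            if use_knn:
                candidates = []
                for m, other in enumerate(matrix):
                    if m == i or other[j] is None:
                        continue
                    # Distance uses only columns observed in both rows.
                    shared = [(a, b) for a, b in zip(row, other)
                              if a is not None and b is not None]
                    if shared:
                        dist = math.sqrt(
                            sum((a - b) ** 2 for a, b in shared) / len(shared))
                        candidates.append((dist, other[j]))
                candidates.sort(key=lambda c: c[0])
                nearest = [value for _, value in candidates[:k]]
                if nearest:
                    estimate = sum(nearest) / len(nearest)
            filled[i][j] = estimate
    return filled

# Toy log-ratio matrix (made-up numbers, not data from the study).
toy = [
    [1.0, 2.0, None, 4.0],   # 25% missing: KNN estimate
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.2, 4.1],
    [9.0, 9.0, 9.0, 9.0],
    [5.0, None, None, 7.0],  # 50% missing: row mean
]
filled = impute_missing(toy)
```

The published KNN method weights neighbors by similarity; the plain average here keeps the sketch short.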

Data Analysis

Significance analysis of microarrays. Our primary difference measure was SAM (54). For each gene, the algorithm computes a t statistic ($d_i$) with the denominator modified by the addition of a small positive constant so that the variance of $d_i$ is independent of gene expression level:

$$d_i = \frac{\bar{x}_{i\text{-post}} - \bar{x}_{i\text{-pre}}}{S_i + S_0}$$

where $d_i$ is the SAM score for gene $i$ (referred to as the “d score” in the text), $\bar{x}_{i\text{-post}}$ is the mean expression level of gene $i$ in group post, $\bar{x}_{i\text{-pre}}$ is the mean expression level of gene $i$ in group pre, $S_i$ is the standard deviation for the numerator calculation, and $S_0$ is a small positive constant. Genes are ranked according to this statistic, and error is controlled by a permutation procedure. After random permutation of values between groups within each gene, resulting SAM scores are ranked across the data set, and the process is repeated several hundred times. The null distribution is then represented by the average SAM score across these permutations for a given rank position.
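As a rough illustration of the d score and the rank-wise permutation null described above, the following sketch may help. It is not the SAM software: the function names, the pooled form used for $S_i$ (the usual two-group pooled standard error, assumed here because the text does not spell it out), the value of `s0`, and the toy data are all assumptions.

```python
import math
import random
import statistics

def d_score(pre, post, s0):
    """Modified t statistic: mean difference over (S_i + S_0).

    S_i is computed as a pooled two-group standard error (an assumption).
    """
    n_pre, n_post = len(pre), len(post)
    mean_pre, mean_post = statistics.fmean(pre), statistics.fmean(post)
    ss = (sum((x - mean_pre) ** 2 for x in pre)
          + sum((x - mean_post) ** 2 for x in post))
    s_i = math.sqrt((1 / n_pre + 1 / n_post) * ss / (n_pre + n_post - 2))
    return (mean_post - mean_pre) / (s_i + s0)

def expected_null_scores(data, n_pre, s0, n_perm=200, seed=0):
    """Average d score at each rank position over random permutations.

    Values are shuffled between groups within each gene, scores are ranked
    across genes, and the ranked scores are averaged over permutations.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(data)
    for _ in range(n_perm):
        scores = []
        for values in data:
            shuffled = list(values)
            rng.shuffle(shuffled)
            scores.append(d_score(shuffled[:n_pre], shuffled[n_pre:], s0))
        scores.sort()
        for rank, score in enumerate(scores):
            totals[rank] += score
    return [total / n_perm for total in totals]

# Toy data: three genes, three "pre" then three "post" samples (made up).
toy_data = [
    [1.0, 1.2, 0.9, 3.1, 2.9, 3.3],
    [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    [5.0, 4.8, 5.2, 1.1, 0.9, 1.0],
]
observed = [d_score(row[:3], row[3:], s0=0.1) for row in toy_data]
null = expected_null_scores(toy_data, n_pre=3, s0=0.1, n_perm=200, seed=1)
```

Comparing the sorted observed scores with `null` rank by rank is the basis for calling genes significant; estimating a false discovery rate from that comparison is left out of this sketch.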

Heatmap builder. Heatmaps were generated by Stanford software developed by the authors and freely available to the academic community (http://mozart.stanford.edu/heatmap.htm). Heatmaps are presented as row normalized (maximum and minimum values are calculated for each gene). If $x_{ij}$ is the expression level for gene $i = 1, 2, \ldots, i_{max}$ and sample $j = 1, 2, \ldots, j_{max}$, and a particular red-green-blue color, color $m$, is divided into $k_{max}$ shades ($m_1, m_2, \ldots, m_{k_{max}}$), where $k_{max}$ is the user-defined number of bins, then $x_{ij}$ is assigned a color (denoted color $x_{ij}$) according to the following algorithm: for $k = 1, 2, \ldots, k_{max}$, if $x_{ij}$ lies in the interval

$$\left[\min_j x_{ij} + \frac{k-1}{k_{max}}\left(\max_j x_{ij} - \min_j x_{ij}\right),\ \min_j x_{ij} + \frac{k}{k_{max}}\left(\max_j x_{ij} - \min_j x_{ij}\right)\right],$$

then color $x_{ij} = m_k$.

Gene ontology and curated pathway analysis. Each list of differentially expressed genes was analyzed in the context of gene ontology (GO) to identify groups of genes with similar functions or processes. GO annotation for genes represented on the Stanford-Agilent cardiovascular oligonucleotide microarray was obtained using the Biomolecule Naming Service (BNS) (23). BNS is a high-speed directory service developed at Agilent, which can resolve differences between alias and official gene symbols as well as between different gene identifier schemes, and links to publicly available databases. Gene $g$ was considered linked to a GO term $t$ if the GO annotation for gene $g$ contained term $t$ or a child of term $t$. For each GO term $t$, we tested the hypothesis that term $t$ is over- or underrepresented in a given list of genes against the null hypothesis that the distribution of terms is random. This term-to-gene-list association was measured using the hypergeometric distribution as the probability of observing $k$ or more genes annotated by term $t$ in the set of $n$ genes, when there are $K$ genes annotated by term $t$ in the whole set of $N$ genes. Namely, for each term $t$ and list of genes $l$, the P value was calculated as follows:

$$P_{t,l} = 1 - \sum_{y=0}^{k-1} \frac{\binom{n}{y}\binom{N-n}{K-y}}{\binom{N}{K}}$$

where $N$ is the total number of unique genes represented in the microarray, $n$ is the total number of unique genes represented in set $l$, $K$ is the total number of entries in $N$ genes mapping to GO term $t$, and $k$ is the number of entries in $n$ genes mapping to a specific GO term.

Also, for an ordered list of genes of length $L$, we computed minimal hypergeometric statistics

$$P^{min}_{t,l} = \min_{1 \le n \le L}\left(1 - \sum_{y=0}^{k_n - 1} \frac{\binom{n}{y}\binom{N-n}{K-y}}{\binom{N}{K}}\right),$$

where $k_n$ is the number of genes annotated by term $t$ in the top $n$ genes in list $l$.
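The two hypergeometric statistics can be sketched in a few lines. This is an illustrative reading of the formulas, not the authors' code: the function names and the toy numbers are hypothetical, and the tail is taken as the probability of observing $k$ or more annotated genes, as stated in the text. Exact integer arithmetic is used before the final division to avoid cancellation error.

```python
from math import comb

def enrichment_p(k, n, K, N):
    """P(X >= k): chance of k or more term-annotated genes in a list of n,
    when K of the N genes on the array carry the term."""
    total = comb(N, K)
    below = sum(comb(n, y) * comb(N - n, K - y) for y in range(k))
    return (total - below) / total

def min_hypergeometric(annotated_flags, K, N):
    """Minimum of enrichment_p over every top-n prefix of an ordered list.

    annotated_flags[i] is True when the gene at rank i + 1 carries the term.
    """
    best = 1.0
    k_n = 0
    for n, hit in enumerate(annotated_flags, start=1):
        if hit:
            k_n += 1
        best = min(best, enrichment_p(k_n, n, K, N))
    return best

# Toy case: 10 genes on the array, 3 carry the term, and all 3 sit at the
# top of the ranked list.
p_top3 = enrichment_p(3, 3, 3, 10)
p_min = min_hypergeometric([True, True, True, False], K=3, N=10)
```

Because the minimum is taken over many prefixes, `p_min` is not itself a calibrated P value; how that was corrected for in the study is not described in this section.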

A similar algorithm was used to identify curated pathways that were significantly overrepresented in our data. In this case, significant over- or underrepresentation of differentially regulated genes was determined in a master list of genes derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG) (22). The KEGG database is a collection of graphical diagrams (pathway maps) representing molecular interaction networks in various cellular processes. Each reference pathway is manually drawn and updated with a standard notation. There is a strong representation of metabolic pathways.

Connectivity analysis and visualization. We used a metasearch tool, as described in Vailaya et al. (55), for automatically querying multiple text-based search engines to aid biologists with the task of manually searching and extracting associations among genes/proteins of interest. The tool launches user queries to multiple search engines, and the retrieved results (documents) are fetched from their respective sources. Each document is then parsed into sentences. An association is extracted for every sentence containing at least two gene names and one verb. Associations are then converted into interactions, which are further grouped into a computational network representation. Sentences and source hyperlinks for each association are further stored as attributes of the corresponding interactions.
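The sentence-level extraction rule (at least two gene names and one interaction verb) can be sketched as follows. This is a toy stand-in for the metasearch tool, not its implementation: the function name, the regular-expression sentence splitter, the exact-token matching, and the short gene and verb lists are assumptions, whereas the actual tool relied on user context files and BNS to recognize names.

```python
import re
from itertools import combinations

def extract_associations(text, gene_names, verbs):
    """Return (gene_a, gene_b, verb, sentence) for each qualifying sentence."""
    genes = {g.lower(): g for g in gene_names}
    verb_set = {v.lower() for v in verbs}
    associations = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = re.findall(r"[A-Za-z0-9\-]+", sentence.lower())
        found_genes = sorted({genes[t] for t in tokens if t in genes})
        found_verbs = sorted({t for t in tokens if t in verb_set})
        if len(found_genes) >= 2 and found_verbs:
            for a, b in combinations(found_genes, 2):
                # Keep the sentence as an attribute of the association.
                associations.append((a, b, found_verbs[0], sentence))
    return associations

# Hypothetical abstract text and vocabularies, for illustration only.
abstract = ("TNF activates NFKB1 in endothelial cells. "
            "IL6 is a cytokine. "
            "VCAM1 binds ITGA4.")
found = extract_associations(
    abstract,
    gene_names=["TNF", "NFKB1", "IL6", "VCAM1", "ITGA4"],
    verbs=["activates", "binds", "inhibits"],
)
```

The second sentence yields nothing because it names only one gene and no interaction verb. Aliases and multiword names would need a dictionary service such as the BNS described above.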

In this way, a large association network was constructed from over 350,000 PubMed abstracts by running a query for each microarray gene and retrieving the 100 most recent articles. The process retrieved ~150,000 PubMed articles. Automated queries were also run for 16 well-known immune diseases (such as atherosclerosis, rheumatoid arthritis, lupus, Type I diabetes, etc.). The process retrieved around 200,000 additional PubMed abstracts. We used user context files and BNS (23) for identifying interaction verbs and gene/protein names in text. The entire list of 350,000 abstracts yielded a large association network consisting of ~5,200 concepts (genes/proteins) and 19,200 associations among them.

A set of discriminatory genes at a given false discovery rate (FDR) was constructed using SAM analysis (see above). A subnetwork was extracted for each gene, from the large association network, consisting only of genes in the discriminatory set. A cumulative SAM d score was computed by summing the d scores of all other genes in the set of interactions that contained the candidate gene as a participant. This score allows an immediate sense of the change significance of a given network within the analysis of interest. To control for the disproportionate effect of genes with very large networks, we also calculated an average score (the cumulative score divided by the number of connections). In this way, networks with high overall significance could be identified. We refer to the genes at the center of these networks as “nexus” genes as they potentially regulate a large number of genes causing the disease phenotype. We chose the term nexus to distinguish it from the concept of a “hub” gene. According to the hub concept (5), certain genes have highly correlated gene expression and exist in scale-free topology networks. Although our networks also conform to scale-free topology, no such assumption of correlated gene expression is made for the nexus gene and its network partners. Indeed, it is the fact that the nexus gene has an experimentally based functional relationship with its connected genes rather than a simple gene expression correlation that is the key to its distinction from a hub gene. What these concepts do share, however, is that both hub and nexus genes are attractive candidates for therapeutic targeting and follow-up studies.
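A minimal sketch of the cumulative and average nexus scores follows. The function name, the edge-list representation, and the toy scores are hypothetical. Two points are assumptions because the text leaves them open: each partner gene is counted once, and signed d scores are summed (absolute values would be an equally plausible reading).

```python
def nexus_scores(associations, d_scores):
    """Map each gene to (cumulative score, average score) over its partners.

    associations: iterable of (gene_a, gene_b) pairs from the literature network.
    d_scores: SAM d scores for the discriminatory gene set only; pairs that
    involve any gene outside this set are ignored.
    """
    partners = {}
    for a, b in associations:
        if a in d_scores and b in d_scores and a != b:
            partners.setdefault(a, set()).add(b)
            partners.setdefault(b, set()).add(a)
    scores = {}
    for gene, linked in partners.items():
        cumulative = sum(d_scores[p] for p in linked)
        scores[gene] = (cumulative, cumulative / len(linked))
    return scores

# Toy network: gene X is connected but not in the discriminatory set.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("A", "X")]
toy_d = {"A": 1.0, "B": 2.0, "C": 4.0, "D": -3.0}
ranked = nexus_scores(toy_edges, toy_d)
```

Sorting `ranked` by the cumulative or the average score gives a candidate ordering of nexus genes; the average dampens the advantage of genes that simply have many literature connections.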

The extracted subnets for the genes of interest were visualized using Cytoscape to display the network diagrammatically. Cytoscape is an open-source bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other data (47). Each connection was validated manually. With the use of data overlay techniques, SAM d scores for genes in the network are superimposed, via color encoding, upon the nodes of the diagram that represent genes in the network. Additionally, for each node in the network diagram, the expression values for the gene represented by that node are superimposed as a “heatstrip” visualization beneath the node. In the heatstrip visualization, a rectangular area below the node is divided into a set of vertical strips of equal width, with each strip containing a color-coded vertical bar. The width of each bar is equal to the width of the rectangular display area, in pixels, divided by the number of columns in the corresponding heatmap. The vertical bars extend either upward or downward from an imaginary centerline that bisects the rectangular area. Upregulated values are encoded as bars that extend upward from the centerline, whereas downregulated values are encoded as bars that descend from the centerline. The bars are color coded to distinguish between experimental classes.

RESULTS

Patient Population

Fifty-one coronary artery samples obtained from twenty-two patients were hybridized to the array. The distribution of cardiovascular risk factors in each patient can be found in Table 1. In particular, 11 (50%) patients presented with hypercholesterolemia, 6 (27%) with diabetes, and 12 (55%) with history of hypertension. With respect to drug therapy, 10 (45%) patients were under treatment with angiotensin-converting enzyme inhibitors, 7 (32%) with angiotensin II receptor blockers, 12 (55%) with β-blockers, and 10 (45%) with statins. None of the patients was implanted with a left ventricular assist device.

Histological grading according to AHA criteria for histopathological grading of atherosclerosis was obtained for 32 of the collected coronary artery samples (Table 2). Fifteen (47%) samples were classified as advanced lesions (grades Va, Vb, and Vc), eight (25%) as intermediate lesions (grade III), and seven (22%) as initial lesions (grade I). Only two (6%) samples were classified as disease free. Representative images of the different disease states can be seen in Fig. 1, C–F.

106 PATHWAY ANALYSIS OF CORONARY ATHEROSCLEROSIS

Physiol Genomics • VOL 23 • www.physiolgenomics.org


Differential Gene Expression According to Vascular Disease

Our first set of analyses focused on identifying differential gene expression among various disease severity classes. In these analyses, the two samples classified as disease free, one grade III sample, and two grade V samples were excluded because the RNA obtained did not meet our quality standards for hybridization. The lack of entirely normal arteries, such as might be obtained from young donor hearts, serves as a limitation to our discovery of early stage candidate genes.

We first analyzed gene expression according to histological grading. A grade I (7 samples) versus grade V (13 samples) SAM analysis revealed 168 differentially regulated genes with an FDR of 0.4% (Fig. 1A). Seventy-two genes were upregulated in grade V advanced lesions. Specifically, inflammatory genes such as chemokine (C-C motif) receptor-like 2, chemokine-like factor 7, and myeloid differentiation primary response gene showed the highest d scores (d > 4). We also found cell cycle regulatory genes such as TBC1 domain family member 2, the most significantly upregulated gene in advanced grade V lesions (d score > 4), complement 32, and branched chain aminotransferase 1, cytosolic (d scores > 3), and genes that regulate lipid metabolism such as human prostaglandin endoperoxide synthase, dolichyl-phosphate N-acetylglucosaminephosphotransferase 1, and hexosaminidase B (d scores > 3). GO analysis confirmed these results by indicating “mitotic cell cycle” and “metabolism” (both overrepresented in the top 415 genes with P values < 0.001) as significant processes in grade V advanced stage lesions (Table 3). In initial stage lesions, genes that regulate cytoskeleton organization such as α1-actinin, muscle-specific genes such as smoothelin, sarcoglycan, and vinculin, and genes that regulate cell growth such as TGF-3, endometrial bleeding-associated factor, fibroblast growth factor 1 (acidic), and myocardin were found to be upregulated with d scores greater than or equal to −3. Once again, GO analysis was consistent with these results, revealing terms such as “morphogenesis” and “muscle development” (both overrepresented in the top 336 genes with P values < 0.001) to be the most significant biological processes in grade I initial stage lesions (Table 3).

A comparison of AHA grade III and V samples revealed 169 differentially regulated genes (Fig. 1B). Almost all genes found were downregulated in the advanced lesion sample class except for tropomyosin 3 and laminin-�2, which were downregulated in the intermediate lesion class. Of the 167 genes downregulated in the advanced lesion class, the most significant (d scores greater than or equal to −3) were genes that regulated cell proliferation and growth, the apoptotic process [integrin-linked kinase 2 and insulin-like growth factor (IGF) binding protein 3] and cell-to-cell and cell-to-matrix interactions (collagen type XVIII-α1 and thrombospondin 2). GO analysis of downregulated genes in advanced stage grade V lesions showed “regulation of cell growth” and “cell adhesion” (both overrepresented in the top 162 genes with P values < 0.001) as the most significant biological processes and “structural molecule activity” and “IGF binding” (both overrepresented in the top 174 genes with P values < 0.001) as the most significant molecular function terms regulated by these genes (Table 4).

In contrast, an analysis of grade I versus grade III samples yielded no significant differences in gene expression, with very small numbers of genes at a high FDR. Our findings indicate that the microscopic progression of disease severity in atherosclerosis is not linearly related to the expression profile; instead, grade I and III lesions are more similar to one another than either is to grade V lesions. This finding is consistent with the fact that the presence of lipid pools (rather than cell type changes) is the distinguishing feature of grade III disease.

Table 1. Patient demographics including clinical history and drug therapy

Characteristic | Overall | Ischemic | IDCM | Other
Total number of patients | 22 | 9 | 9 | 4
Age, yr | 49 (46.3±7.8) | 60 (57.5±9.3) | 49 (46.7±7.8) | 15 (20±12.08)
Male | 17 (77%) | 8 (89%) | 6 (67%) | 3 (75%)
Ethnicity
  Caucasian | 13 (59%) | 6 (67%) | 4 (44%) | 3 (75%)
  Black | 3 (14%) | NA | 3 (33%) | NA
  Hispanic | 2 (9%) | NA | 1 (11%) | 1 (25%)
Hypertension | 12 (55%) | 8 (89%) | 3 (33%) | 1 (25%)
Hypercholesterolemia | 11 (50%) | 7 (78%) | 3 (33%) | 1 (25%)
Family history | 6 (27%) | 2 (22%) | 3 (33%) | 1 (25%)
Tobacco | 9 (41%) | 3 (33%) | 5 (56%) | NA
Diabetes | 6 (27%) | 2 (22%) | 3 (33%) | 1 (25%)
Previous ACS | 7 (32%) | 7 (78%) | NA | NA
Drug therapy
  ACE inhibitors | 10 (45%) | 2 (22%) | 6 (67%) | 2 (50%)
  ARB | 7 (32%) | 6 (67%) | 1 (11%) | NA
  β-Blocker | 12 (55%) | 6 (67%) | 5 (56%) | 1 (25%)
  Digoxin | 14 (64%) | 5 (56%) | 7 (78%) | 2 (50%)
  Statin | 10 (45%) | 8 (89%) | 2 (22%) | NA

Clinical history and drug therapy statistics for all 22 patients in the study are shown. IDCM, idiopathic dilated cardiomyopathy; ACS, acute coronary syndrome; ACE, angiotensin-converting enzyme; ARB, angiotensin II receptor blockers; NA, not applicable.

Table 2. AHA histological classification of samples

AHA Grade | No. of Samples
No disease | 2
I | 7
III | 8
V
  Va | 2
  Vb | 12
  Vc | 1
Total | 32

The number of histology samples obtained for each American Heart Association (AHA) grade class is shown. Initial lesions are type I lesions, intermediate lesions are type III lesions, and advanced lesions are type Va, Vb, and Vc lesions. A total number of 32 samples was graded.

Differential Gene Expression According to Clinical Variables

Comparisons of samples from patients known to suffer from clinical CAD with those from patients with no clinically manifest CAD (24 vs. 27 samples) revealed only minor significant differences in gene expression between the classes (a 5% FDR was associated with only 77 differentially regulated genes). The lack of significant expression patterns that could predict clinical manifestation in the face of ubiquitous disease may be explained by the heterogeneity of our clinical group (patients with predominantly stable, flow-obstructing CAD were not differentiated from those with a predominantly acute coronary syndrome presentation). Similarly, samples from patients with a history of hypertension were not different from those without such a history (10 vs. 41 samples, 11 differentially regulated genes with an FDR of 18% as the lowest possible value). Again, the lack of proximate hypertension by definition in transplant candidates (as opposed to the categorization by history) is likely explanatory, masking effects that may have been revealed at an earlier time point.

Diabetes is known to be a strong risk factor for CAD, but the disease mechanism is not well understood. We were interested in studying the diabetes signal in our data and conducted an analysis comparing samples from diabetic patients with those from patients without diabetes. Prior analysis (not shown) demonstrated strikingly similar findings across all disease severity classes, so only the combined data are presented here. Nineteen diabetic samples were compared with thirty-two nondiabetic samples. Our analysis revealed 653 upregulated genes in the no diabetes class and 37 upregulated genes in the diabetes class with an FDR of 0.08% (Fig. 2). Among the genes significantly upregulated in the diabetes class, IGF-1 was the top gene, with a d score of 5.3. IL-1 receptor (fibroblast type) and IL-2 receptor-α were also significantly upregulated (d > 3) and are genes that mediate cytokine-induced immune and inflammatory responses. In addition, CD14 antigen, a mediator of the innate immune response that leads to NF-κB pathway activation and inflammation, was found to be upregulated (d > 3). The presence of these genes suggests that inflammation is more prominent in diabetic CAD than in nondiabetic CAD. Importantly, GO analysis confirmed this finding with processes such as “immune response” and “humoral defense mechanism” (both overrepresented in the top 271 genes with P values < 0.001) as some of the most significant terms for the diabetes class (Table 4).

Table 3. GO-discovered categories for disease severity grade I vs. grade V analysis

GO ID | GO Name | P Value | Actual Count | Expected Count | Number of Annotation Genes

Categories found to be highly significant in grade V lesions

Cellular component
5622 | Intracellular | 8.6e−10 | 271 | 215.64 | 373
5737 | Cytoplasm | 5.4e−7 | 153 | 109.38 | 373
5739 | Mitochondrion | 1.1e−6 | 45 | 21.25 | 373
30894 | Replisome | 1.6e−4 | 6 | 0.88 | 373
5657 | Replication fork | 2.1e−4 | 6 | 0.92 | 373
5660 | δ-DNA polymerase cofactor complex | 3.2e−4 | 3 | 0.18 | 373

Molecular function
3824 | Catalytic activity | 8.8e−9 | 205 | 149.21 | 424
4527 | Exonuclease activity | 1.5e−5 | 10 | 1.91 | 424
16616 | Oxidoreductase activity, acting on CH-OH group of donors with NAD or NADP as acceptor | 2.1e−5 | 12 | 2.85 | 424
16614 | Oxidoreductase activity, acting on CH-OH group of donors | 4.5e−5 | 12 | 3.06 | 424
16779 | Nucleotidyltransferase activity | 1.5e−4 | 12 | 3.44 | 424
3690 | Double-stranded DNA binding | 2.9e−4 | 5 | 0.64 | 424

Biological process
278 | Mitotic cell cycle | 9.1e−8 | 27 | 8.53 | 415
7049 | Cell cycle | 8.6e−7 | 54 | 27.29 | 415
8152 | Metabolism | 9.9e−7 | 224 | 176.36 | 415
82 | G1/S transition of mitotic cell cycle | 1.2e−5 | 11 | 2.29 | 415
67 | DNA replication and chromosome cycle | 6.1e−5 | 19 | 6.94 | 415
7582 | Physiological process | 6.4e−5 | 316 | 280.65 | 415
6260 | DNA replication | 2.6e−4 | 15 | 5.32 | 415
84 | S phase of mitotic cell cycle | 2.6e−4 | 15 | 5.32 | 415
6139 | Nucleobase, nucleoside, nucleotide, and nucleic acid metabolism | 2.6e−4 | 74 | 49.84 | 415
8283 | Cell proliferation | 2.8e−4 | 64 | 41.62 | 415

Categories found to be significant in grade I lesions

Cellular component
15629 | Actin cytoskeleton | 6.4e−5 | 17 | 5.79 | 297
5884 | Actin filament | 2.9e−4 | 5 | 0.63 | 297

Biological process
9653 | Morphogenesis | 2.0e−6 | 47 | 23.09 | 336
7517 | Muscle development | 2.3e−6 | 14 | 3.130 | 336
7275 | Development | 3.2e−5 | 70 | 43.72 | 336
9887 | Organogenesis | 1.9e−4 | 39 | 21.41 | 336
6928 | Cell motility | 3.7e−4 | 20 | 8.57 | 336
7154 | Cell communication | 6.6e−4 | 121 | 93.98 | 336

Gene ontology (GO) analysis was applied to significance analysis of microarrays (SAM)-derived results from the severity analysis of grade I versus grade V lesions. Calculated P values for each category are also shown.

Table 4. GO-discovered categories for diabetes analysis

GO ID | GO Name | P Value | Actual Count | Expected Count | Number of Annotation Genes

Cellular component
5576 | Extracellular | 9.3e−5 | 45 | 25.45 | 247

Biological process
9605 | Response to external stimulus | 9.6e−8 | 54 | 25.80 | 271
6952 | Defense response | 1.8e−7 | 46 | 20.66 | 271
9607 | Response to biotic stimulus | 3.0e−7 | 48 | 22.44 | 271
6955 | Immune response | 4.3e−7 | 42 | 18.57 | 271
16064 | Humoral defense mechanism | 8.2e−7 | 15 | 3.27 | 271
6956 | Complement activation | 1.3e−5 | 7 | 0.83 | 271
9613 | Response to pest/pathogen/parasite | 1.5e−5 | 27 | 11.02 | 271

Genes identified as significantly upregulated in diabetic blood vessels were found to fall into immune GO categories such as defense response, immune response, humoral defense mechanism, and complement activation. Calculated P values for each category are also shown.

Fig. 1. Gene expression profiles and American Heart Association (AHA) histological classification of atherosclerotic lesions. AHA-graded samples from coronary artery segments were hybridized onto a 22-K oligonucleotide microarray. Significance analysis of microarrays (SAM) indicated differences in gene expression pattern between initial (grade I) and advanced (grade V) lesions (A) and between intermediate (grade III) and advanced (grade V) lesions (B). The row-normalized heatmaps display individual experiments by column and genes ordered by decreasing significance. Color intensity is scaled within each row so that the highest expression value corresponds to bright red and the lowest to bright green. Unique gene identifiers and gene names are listed to the right. In A, only downregulated genes are shown for clarity. C–F: slides were acetone-fixed, 4-μm cryostat sections stained with hematoxylin and eosin. C: no disease. D: grade I. E: grade III. F: grade V. Magnification, ×100.

Fig. 2. Genes found to be upregulated in diabetic coronary arteries with SAM. A two-class, unpaired SAM analysis was performed using diabetes as a supervising vector across all samples. Nineteen samples were in the diabetes class, and thirty-two samples were in the nondiabetes class. Thirty-seven genes were significantly upregulated in the diabetes class, with a false discovery rate (FDR) of 0.08%. The row-normalized heatmaps display individual experiments by column and genes ordered by decreasing significance. Color intensity is scaled within each row so that the highest expression value corresponds to bright red and the lowest to bright green. Unique gene identifiers and gene names are listed to the right.
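The row normalization described in the captions of Figs. 1 and 2 can be sketched roughly as follows. This is an illustrative reading of the captions, not the plotting code used for the figures: the function name is hypothetical, "decreasing significance" is assumed to mean decreasing absolute d score, and a row with no variation is arbitrarily placed at the midpoint of the color scale.

```python
def row_normalized_heatmap(expr, d_scores):
    """Order genes by decreasing |d| and min-max scale each row to [0, 1].

    expr:     dict mapping gene -> list of expression values (one per sample)
    d_scores: dict mapping gene -> SAM d score
    A scaled value of 1.0 would be drawn bright red, 0.0 bright green.
    """
    ordered = sorted(expr, key=lambda gene: abs(d_scores[gene]), reverse=True)
    heatmap = []
    for gene in ordered:
        row = expr[gene]
        low, high = min(row), max(row)
        span = high - low
        if span == 0:
            scaled = [0.5] * len(row)   # assumption: flat rows sit mid-scale
        else:
            scaled = [(value - low) / span for value in row]
        heatmap.append((gene, scaled))
    return heatmap


# Hypothetical expression values for three genes across three samples.
demo_expr = {"a": [1, 2, 3], "b": [5, 5, 5], "c": [0, 10, 5]}
demo_d = {"a": 2.0, "b": -4.0, "c": 3.0}
demo_heatmap = row_normalized_heatmap(demo_expr, demo_d)
```

Because each row is scaled independently, colors are comparable across samples within a gene but not across genes.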

Protein Validation and Cellular Origin of Transcripts of Interest

A potential limitation of expression profiling of whole tissue segments is the lack of insight into the cell type of message origin. Although techniques such as laser capture microscopy are beginning to provide one answer to this question, a more detailed answer for specific targets is provided by immunohistochemistry. This has the dual benefit of validating the presence of the ultimate product of transcription, the protein, while at the same time localizing it to cell type. We carried out immunohistochemistry for a series of targets to illustrate the utility of this approach (Fig. 3). In many cases, the cell type is predictable from previous knowledge. For example, α-actinin is a component of the thin filament of smooth muscle (31). As such, it is not surprising to see clear staining of smooth muscle cells in the media and intima of two diseased human arteries (Fig. 3, A and B). Similarly, the role of CD14 in lipopolysaccharide-dependent macrophage activation is known (9), and the presence of CD14 in typical cells in atheroma as well as in subendothelial cells in early disease fatty streak lesions is consistent with this (Fig. 3, C and D). However, immunohistochemistry can also provide important information on cellular localization when there is little prior data to guide expectation or when expectations are confounded. The α-subunit of the IL-2 receptor (CD25) might be expected to be found on T lymphocytes migrating into the vascular wall and, indeed, an example of this can be seen in Fig. 3E. However, an extension of staining to endothelial cells is also seen (Fig. 3F), which has not been previously described. Certainly, endothelial cells are increasingly recognized to play an important role in innate immune processes (15) and antigen presentation (43), making this an intriguing finding worthy of further study.

Pathway Identification by Overabundance

Looking for overabundance of differentially regulated genes in curated pathways can add structure to genomic data. We carried out overabundance analysis according to the hypergeometric distribution for disease severity and diabetes groups of genes for pathways within KEGG. This database features a large number of metabolic pathways. We identified several as overrepresented in our data. Within the disease severity group (genes differentially regulated between AHA groups I and V), the cell cycle pathway was the most significantly overrepresented (Table 5), consistent with an actively changing cell phenotype and providing further support for the idea that smooth muscle dedifferentiation is a key process in disease progression. This pathway is shown in Fig. 4 with differentially regulated genes highlighted. Within the diabetes group, ubiquitin-mediated proteolysis is the most overrepresented pathway, with components of the proteasome also highly ranked.
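For readers unfamiliar with overabundance testing, the calculation can be sketched as below. The exact z score formula used for Table 5 is not given in the text, so this sketch assumes the usual standardization of the observed count by the hypergeometric mean and variance; the function name and the toy numbers are hypothetical.

```python
from math import comb, sqrt


def hypergeom_enrichment(total_genes, pathway_genes, selected_genes, overlap):
    """Overrepresentation of a pathway among selected genes.

    total_genes:    genes on the array with pathway annotation (N)
    pathway_genes:  genes belonging to the pathway (K)
    selected_genes: differentially regulated genes (n)
    overlap:        selected genes that fall in the pathway (k)
    Returns (expected count, z score, upper-tail P value).
    """
    N, K, n, k = total_genes, pathway_genes, selected_genes, overlap
    frac = K / N
    expected = n * frac
    variance = n * frac * (1 - frac) * (N - n) / (N - 1)
    z = (k - expected) / sqrt(variance) if variance > 0 else 0.0
    # Exact P(X >= k) under the hypergeometric distribution.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    p_upper = tail / comb(N, n)
    return expected, z, p_upper


# Toy example: 20 genes, 5 in the pathway, 4 selected, 2 of them in the pathway.
demo_expected, demo_z, demo_p = hypergeom_enrichment(20, 5, 4, 2)
```

A positive z score indicates more pathway members among the selected genes than expected by chance, and a negative z score indicates fewer, which matches the mix of positive and negative values reported for the KEGG pathways.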

Pathway Discovery by Connectivity Analysis

Although the identification of overrepresented curated pathways offers structure to genomic data and some insight into the significance of differentially regulated genes, new information is limited to already well-recognized pathways. In addition, the binary nature of the hypergeometric statistical test, where a gene is either “present” or “absent” in the pathway, does not allow for the uncertainty present in any significance call and gives undue prominence to the choice of the FDR used to determine presence or absence. To this end, we developed a tool that creates a master network of multiple connections drawn from language parsing of the scientific literature and then estimates the extent to which connected genes are differentially regulated in a particular data set, expressing this as a numerical score. This not only allows novel connections and networks to be recognized but also provides a sense of the significance of each network while at the same time allowing the inclusion of genes that may be regulated at a posttranslational level.

Using a SAM significant gene list derived from our disease severity analysis, we generated connectivity networks for every gene. Networks can then be ranked according to d score, cumulative network d score, or average network d score in an interactive spreadsheet. At this point, networks of interest are mapped using the alfa network viewer (55) to create a live interactive network in which all literature references can be validated. For reproduction in print, we developed a module using Cytoscape architecture (Figs. 5 and 6) (47). Nodes are colored from a gradient of bright red to black to bright green, with bright red nodes representing high d score genes in the analysis (in this case, genes upregulated in grade V lesions) and bright green nodes representing low d score genes (those downregulated in grade V lesions). Black nodes represent genes with no significant differential gene expression in either class. Beneath each node is a heatstrip corresponding to the representative row from the heatmap of the analysis. In this analysis, the blue bars represent expression of the class V samples, and the brown bars represent expression of the class I samples. The height and direction of the bars represent the magnitude of the log ratio and whether it is positive or negative. This format allows for simultaneous visualization of multiple interactions, significance levels of gene expression changes, and raw data. We labeled these genes as nexus genes to indicate their fundamental role within a generated network (and to differentiate them from hub genes, as discussed above).

Table 5. Overrepresentation of differentially regulated genes in curated pathways

Pathway | z Score

Disease severity pathway
Cell cycle | 6.56484
Galactose metabolism | 4.778853
Biotin metabolism | 4.16849
Pantothenate and CoA biosynthesis | 4.16849
Purine metabolism | 3.799147
Tetrachloroethene degradation | 3.793941
Globoside metabolism | 3.556855
Aminosugars metabolism | 3.436999
Sulfur metabolism | 2.682408
Synthesis and degradation of ketone bodies | 2.682408
Terpenoid biosynthesis | 2.682408
Glycerolipid metabolism | 1.506697
Tricarboxylic acid cycle | 2.760081
Lysine degradation | 2.575302
β-Alanine metabolism | −3.51485
Taurine and hypotaurine metabolism | −3.14233

Diabetes pathway
Proteasome | 5.592628
Ubiquitin-mediated proteolysis | 4.649652
ATP synthesis | 3.568731
Ethylbenzene degradation | 2.73551
Lysine biosynthesis | 2.73551
Biotin metabolism | 2.73551
Oxidative phosphorylation | 2.375255
Terpenoid biosynthesis | −1.19895
Porphyrin and chlorophyll metabolism | −1.39263
γ-Hexachlorocyclohexane degradation | −2.32695
Amyotrophic lateral sclerosis | −2.63123
Benzoate degradation via hydroxylation | −2.784
1,4-Dichlorobenzene degradation | −2.784
Taurine and hypotaurine metabolism | −3.14313
Nitrogen metabolism | −3.3083
Chondroitin heparan sulfate biosynthesis | −3.98487
Tryptophan metabolism | −4.54521

Genes identified from SAM were tested for overrepresentation within the pathways for the Kyoto Encyclopedia of Genes and Genomes under the hypergeometric distribution assumption; z scores are provided for each pathway.

Fig. 3. Immunohistochemistry of selected proteins. Antibodies raised against selected proteins identified by SAM were used for immunohistochemistry of human coronary arteries cut in frozen sections. A and B: α-actinin staining of medial and intimal smooth muscle cells (brown staining). C and D: CD14 staining of rare cells typical for macrophages within atheroma and of subendothelial cells (arrow) in early lesions. E and F: staining of the α-subunit of the IL-2 receptor (CD25) on lymphocytes (thick arrow) and endothelial cells (thin arrow). The counterstain is hematoxylin.

Table 6. Connectedness ranking for disease severity network

Gene Symbol | Gene Name | Connections | d Score | Cumulative d Score | Average d Score
foxc1 | Forkhead box C1 | 5 | −2.82792 | 8.687528 | 1.737506
anapc11 | APC11 anaphase-promoting complex subunit 11 homolog (yeast) | 5 | −2.27386 | 7.955029 | 1.591006
rfc4 | Replication factor C (activator 1) 4, 37 kDa | 5 | 2.461017 | 7.623374 | 1.524675
rfc2 | Replication factor C (activator 1) 2, 40 kDa | 5 | 2.751432 | 7.623374 | 1.524675
bclaf1 | BCL2-associated transcription factor 1 | 9 | 2.512801 | 12.18545 | 1.353939
faim | Fas apoptotic inhibitory molecule | 5 | 2.144176 | 6.724575 | 1.344915
ccnb1 | Cyclin B1 | 10 | 2.361627 | 12.90826 | 1.290826
abcc3 | ATP-binding cassette, subfamily C, member 3 | 6 | 4.195204 | 7.51136 | 1.251893
bcat1 | Branched chain aminotransferase 1, cytosolic | 5 | 3.819329 | 5.999074 | 1.199815
cspg2 | Chondroitin sulfate proteoglycan 2 (versican) | 5 | −2.92815 | 5.985945 | 1.197189
m6prbp1 | Mannose-6-phosphate receptor binding protein 1 | 7 | 2.051108 | 7.749187 | 1.107027
il17d | Interleukin-17D | 9 | −2.63673 | 9.860027 | 1.095559
slc1a3 | Solute carrier family 1, member 3 | 6 | 2.554206 | 6.571425 | 1.095238
ryr2 | Ryanodine receptor 2 (cardiac) | 5 | −3.23865 | 5.325963 | 1.065193
bckdhb | Branched chain keto acid dehydrogenase E1, β-polypeptide | 10 | 2.168964 | 10.38178 | 1.038178
gpx1 | Glutathione peroxidase 1 | 12 | 2.376621 | 12.00668 | 1.000556

Connectivity rankings were calculated for all genes significantly differentially regulated across disease severity classes. Connections are derived from language parsing of the published literature and d scores are based on significance values (d statistic value). Genes displayed are ranked according to the average d score. The cumulative score represents the total of all the significance values within a given gene network, and the average score represents the cumulative score divided by the number of connections. Only genes with more than four connections are shown for clarity.
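The connectedness ranking can be illustrated with a short sketch. This is not the tool developed for the study; it assumes that the cumulative score is the sum of the absolute d scores of a gene's direct literature-derived neighbors, that connected genes without a measured d score contribute zero, and that the nexus gene's own d score is not included. The function name and the toy network are hypothetical.

```python
def rank_networks(edges, d_scores, min_connections=5):
    """Rank candidate nexus genes by the average |d| of their direct neighbors.

    edges:    iterable of (gene_a, gene_b) literature-derived connections
    d_scores: dict mapping gene -> SAM d score (genes may be missing)
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    rows = []
    for gene, linked in neighbors.items():
        if len(linked) < min_connections:   # "more than four connections"
            continue
        # Assumption: unmeasured neighbors are kept in the network but add 0.
        cumulative = sum(abs(d_scores.get(g, 0.0)) for g in linked)
        rows.append({
            "gene": gene,
            "connections": len(linked),
            "d": d_scores.get(gene),
            "cumulative": cumulative,
            "average": cumulative / len(linked),
        })
    rows.sort(key=lambda row: row["average"], reverse=True)
    return rows


# Toy network: one hub with five neighbors, plus one extra edge.
demo_edges = [("hub", g) for g in "abcde"] + [("a", "b")]
demo_d = {"hub": -2.0, "a": 3.0, "b": -1.0, "c": 2.0, "d": 0.5, "e": -1.5}
demo_ranking = rank_networks(demo_edges, demo_d)
```

The average is simply the cumulative score divided by the number of connections, which is consistent with Table 6 (for foxc1, 8.687528 / 5 ≈ 1.737506). It also shows why very large networks can carry a high cumulative score but a low average score.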

Disease severity. Networks generated from genes differentially regulated between AHA class I and class V disease also featured cell cycle and immune connections (Table 6). For clarity, we have only shown networks where more than four connections were discovered. Networks are ranked by average d score. Two key networks are illustrated in Fig. 5. Cyclin B1 is a cell cycle protein essential for the control at the G2/M transition. It was found to be coordinately regulated with topoisomerase II-� (a DNA topoisomerase) and inverse coordinately regulated with matrix metalloproteinases (MMPs) such as MMP-1, MMP-8, and MMP-9 (33). We also show a subnetwork likely active in T lymphocytes present in the vessel wall. IL-27 (IL17d) is known to support proliferation of naive CD4 T cells and enhance interferon-γ production by activated T cells and natural killer cells (16, 30). Furthermore, it induces phosphorylation of signal transducer and activator of transcription (STAT)1 and STAT3 in both human and murine cell lines, providing an excellent example of a posttranslational relationship that could not have been appreciated from RNA expression data alone.

Diabetes, inflammation, and immunity. A similar connectivity analysis was carried out for the diabetes analysis using genes identified as significant by SAM statistics (Table 7 and Fig. 6). In this case, bright red nodes indicate genes found to be upregulated in diabetics, and bright green nodes denote downregulated genes in diabetics. In the heatstrip below each node, brown bars represent expression of the diabetes class, and blue bars represent expression of the no diabetes class. Genes with the highest cumulative d scores were interferon-γ (248 connections, nonsignificant expression change, cumulative d score = 94.7) and IL-6 (196 connections, d score = 2.34, cumulative d score = 86.1). Notably, however, the size of the network means that the average d score for these nexus genes is low. Interestingly, genes with high average d score ratings (Table 7) include a large number clearly related to diabetes (for example, insulin receptor, glucose-related protein, insulin-degrading enzyme, IGF receptor 1, IGF-2, and IGF binding protein 3).
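The node coloring used in the network diagrams (bright red for upregulated genes, black for no significant change, bright green for downregulated genes) can be sketched as a simple linear gradient. The saturation value `d_max`, the linear interpolation, and the function names are assumptions for illustration; the actual color mapping of the Cytoscape module is not specified in the text.

```python
def node_color(d_score, d_max=4.0):
    """Map a SAM d score to an (R, G, B) tuple on a red-black-green gradient.

    d_max is a hypothetical saturation point: |d| >= d_max gives full color.
    """
    t = max(-1.0, min(1.0, d_score / d_max))   # clamp to [-1, 1]
    if t >= 0:
        return (round(255 * t), 0, 0)          # upregulated: toward bright red
    return (0, round(255 * -t), 0)             # downregulated: toward bright green


def to_hex(rgb):
    """Format an (R, G, B) tuple as a hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


demo_colors = {d: to_hex(node_color(d)) for d in (5.3, 0.0, -4.0)}
```

With the default `d_max`, a gene with a d score of 5.3 saturates at bright red, a gene with a d score of 0 is black, and a gene with a d score of −4.0 is bright green.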

DISCUSSION

We employed a pathway-based approach toward the study of atherosclerosis by applying network development tools to a large, complex data set derived from human coronary artery tissue hybridization experiments. Significant genes were studied in the broader context of ontology and by mapping onto known and novel pathways. Discovered connections revealed insights into biology not drawn from basic gene lists alone; instead, highly connected nexus genes were identified and placed in the integrated context of the disease whole.

Fig. 4. Differential gene expression for a cell cycle curated pathway. Analysis of overrepresentation of differentially regulated genes identified by SAM in pathways from the Kyoto Encyclopedia of Genes and Genomes is shown. The cell cycle pathway is illustrated with significantly regulated genes highlighted in red.

Disease Severity

Analysis of histopathology graded samples according to national guidelines revealed that many genes expressed at a relatively high level in early disease were important marker genes of smooth muscle cell differentiation, regulation, or activation (49). Furthermore, many of these genes were categorized under ontology terms such as muscle development, actin cytoskeleton, and actin filament. Overrepresentation of cell cycle pathway components (KEGG pathway) also suggested active cellular differentiation processes were present, and our connectedness ranking (which adds a weighting according to the “significance” of connected gene expression) found cell cycle and immune networks most prominently weighted. Immunohistochemistry of selected proteins confirmed that the cell of origin of prominently downregulated genes such as α-actinin was indeed the smooth muscle cell.

Although many processes are known to be involved in the development of atherosclerosis, their relative significance has not been established. Our analysis, in which key smooth muscle genes and ontologies are prominent over and above those of other cells or recognized immune signals, suggests that the key process in the progression of atherosclerosis relates to smooth muscle cell dedifferentiation. Although it is possible that this observation reflects a different cellular milieu predefined by the classification system (the array could be acting simply as a sophisticated microscope), extension of the histological classification to multiple cell types, the presence of well-characterized changes in the phenotype of smooth muscle cells in culture, the lack of expression differences between classes I and III, and the breadth of potential ontologies argue against this. Indeed, the power of this approach lies in the absence of an a priori hypothesis. The strongest change in gene expression signal across disease severity might have originated from endothelial cells, macrophages, or migrating circulating immune cells. The advantage of examining the vessel wall as a whole is apparent: our analysis suggests a renewed focus on the changes in smooth muscle phenotype as a strategy for interrupting atherosclerosis at the molecular level.

Diabetes, Inflammation, and Immunity

Although atherosclerosis has been known for some time to be a disease characterized by inflammation, diabetic CAD has until this point been viewed simply as a particularly severe variant. The clinical observation of small lumen vessels with severe disease has not yet been adequately explained at the molecular level. In dividing our samples according to the diabetic status of our patients, we searched for an explanation and found that many genes upregulated in diabetic arteries fell into immune system-related categories such as immune response, defense response, response to pest/pathogen/parasite, and humoral defense mechanism. Pathway tools allowed us to develop networks that revealed relationships between genes not previously connected in this context. Although both an inflammatory state (12, 24, 46) and the innate immune system (8, 36, 37) are known to play important roles in the development of diabetes itself, no study to date has linked these ideas with the development of CAD in diabetics; a genome-scale comparison of expression in human diabetic and nondiabetic coronaries has not been reported (29). The pathway analysis presented here provides new insight into the mechanism behind the contribution of these systems to the development of the disease.

More specific insight is gained from the generated connectivity analysis. Gene networks with high average d scores included many genes obviously related to diabetes, insulin, and IGFs. Also prominent were immune mediators such as the IL-10 receptor. The strong immunity signal was most apparent, however, for the cumulative d scores. The top 30 networks included IL-1, -2, -4, -6, and -10, interferon-γ, and TNF superfamily members -1, -2, -5, and -6. Also of interest were IGF-1 and vascular endothelial growth factor.

Fig. 5. Connectivity analysis network for disease severity. A connectivity network was generated to identify nexus genes derived from genes differentially regulated across different levels of disease severity. Node color indicates gene significance (red, upregulated in grade V lesions; green, downregulated in grade V lesions). The heatstrip below each node represents the raw expression data: upregulated values are encoded as bars that extend upward from the centerline, whereas downregulated values are encoded as bars that descend downward from the centerline. The bars are color coded to distinguish between experimental classes: blue bars for grade I samples and brown bars for grade V samples.

Nexus genes function as pivotal nodes in a complex network. We hypothesize that altering the function of a nexus gene by mutation, environmental influence, or pharmaceutical targeting will have more profound consequences than for a less connected gene. As a critical validation of this concept, IL-6 has been implicated in the pathogenesis of atherosclerosis in Type 2 diabetics (36). Several lines of evidence suggest a proatherosclerotic role for IL-6: 1) oxidized LDL enhances the expression of proinflammatory cytokines such as IL-6 (26); 2) supraphysiological concentrations of exogenous IL-6 enhance atherosclerosis in the apolipoprotein E-deficient (ApoE−/−) mouse (19); and 3) plasma levels of IL-6 are enhanced in patients with unstable angina and predict the outcome of patients with acute coronary syndromes (6). However, the ApoE−/−/IL-6−/− double-knockout mouse actually demonstrates enhanced atherosclerotic lesion formation and increased serum cholesterol levels (45), suggesting a more complex picture. In fact, the finding that LDL receptor−/−/IL-6−/− double-knockout mice showed little difference from control mice indicates a complex balance of pro- and antiinflammatory factors and further validates the network model approach (48).

As a nexus gene, IL-6 would be a strong candidate for therapeutic targeting and follow-up study. In addition, our analysis suggests several little-studied nexus genes that would be potential candidates for further scrutiny and analysis. For example, very little is known of the role in atherosclerosis of the breakpoint cluster region gene (also highly ranked) or of leukemia inhibitory factor. Similarly, PubMed literature searches for the involvement of other nexus genes in atherosclerotic disease [such as IGF-1, IL-1 receptor, cyclin-dependent kinase inhibitor 2a (CDKN2a), and TALE family homeobox-induced factor (TGIF)] revealed few studies, suggesting that these genes would be strong candidates for further study in the disease modification of diabetes.

Fig. 6. Connectivity analysis network for diabetes. A connectivity network was generated to identify nexus genes derived from genes differentially regulated in diabetic patients. Node color indicates gene significance (red, upregulated in diabetes; green, downregulated in diabetes). The heatstrip below each node represents the raw expression data: upregulated values are encoded as bars that extend upward from the centerline, whereas downregulated values are encoded as bars that descend downward from the centerline. The bars are color coded to distinguish between experimental classes: brown bars for the diabetes class and blue bars for the no diabetes class.

A further benefit of our approach is the ability to account for a priori knowledge (in making connections) without prejudice in the process of significant gene discovery (transcriptional profiling is “blind,” in contrast with alternative approaches, which necessitate a prescribed narrow focus by definition). A potential drawback of the connectivity scores, however, is their reliance on published literature. Science is known to manifest a sociological component, and it could be argued that more is known about certain processes, not because these are more fundamental or biologically important but because they have been studied more (for example, NIH funding for cancer research and human immunodeficiency virus is greater than for cardiovascular medicine). Although this may hold to a limited extent in relation to the literature as a whole (connections have to have been studied at least once to be present in the literature), our approach both avoids this issue and, inasmuch as it meets it, fails to support it. By weighting our analysis by the significance of differentially regulated genes, we minimize the effect of publication bias (most highly connected genes were connected to �100 other genes). Furthermore, for discovered nexus genes, we found little relationship between the number of publications and connectedness. If publication bias were indeed a factor, a linear relationship between the nexus gene connectivity score and number of publications would result. In fact, this was not the case. IL-6, with one of the highest cumulative significance scores, was connected to 1,165 publications, whereas IGF-1, a lower ranked nexus gene, had over 17,560 publications associated with it.
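The publication-bias argument above amounts to asking whether connectivity scores track publication counts. The following is a minimal sketch of such a check, assuming that a plain Pearson correlation is an acceptable summary; the text does not state which statistic, if any, was computed. The function name `pearson_r` and the toy inputs are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy inputs invented purely for illustration (NOT values from the study).
publication_counts = [1, 2, 3, 4]
connectivity_scores = [1.0, -1.0, -1.0, 1.0]
r = pearson_r(publication_counts, connectivity_scores)
print(f"r = {r:.2f}")
```

Under this reading, a coefficient near 1 would be the signature of publication bias, whereas a coefficient near 0 would be consistent with the contrast drawn between IL-6 and IGF-1.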

In summary, insight into the true nature of a disease can only be gained by combining multiple approaches to investigation. Traditional approaches to the molecular genetics of atherosclerosis have focused on manipulating single genes with transgenic technology. Although this is ultimately revealing, it is time consuming, and choosing the pathways on which to focus this mechanistic effort is challenging. To provide an answer to this, we used transcription profiling, a technique that allows simultaneous assay of tens of thousands of genes, to investigate important pathways in atherosclerosis without the requirement for an a priori focus. We applied widely recognized techniques to control false discovery and then identified prominent signals in ontologies and known pathways, finally identifying nexus genes through the use of a connectivity analysis and visualization solution. Our approach takes advantage of the unmatched high throughput of expression profiling while accounting for its significant drawback: many genes are regulated at the posttranslational level. Because our approach connects genes whose gene expression is significantly changed but only weights with significance, overall patterns of pathway activation become apparent even if several members of the network are not regulated at the transcriptional level. By incorporating prior knowledge in the choice of nexus genes and using a comprehensive gene expression platform, new insights into disease processes can be drawn and significant power in the estimation of nexus genes harnessed.

Table 7. Connectedness ranking for diabetes network

| Gene Symbol | Gene Name | Connections | d Score | Cumulative d Score | Average d Score |
|---|---|---|---|---|---|
| insr | Insulin receptor | 6 | 2.165246 | 12.14521 | 2.024201 |
| casp9 | Caspase 9, apoptosis-related cysteine protease | 6 | 2.430773 | 10.99824 | 1.833041 |
| h2afx | H2A histone family, member X | 6 | 2.820754 | 8.785137 | 1.46419 |
| ran | RAN, member RAS oncogene family | 11 | 3.290324 | 15.99435 | 1.454032 |
| tmf1 | TATA element modulatory factor 1 | 11 | 3.026121 | 15.99435 | 1.454032 |
| mat2b | Methionine adenosyltransferase II, β | 7 | 2.772645 | 9.975451 | 1.425064 |
| hist2h2aa | Histone 2, H2aa | 8 | -2.2004 | 11.21635 | 1.402044 |
| surb7 | SRB7 suppressor of RNA polymerase B homolog (yeast) | 8 | 2.465227 | 10.99389 | 1.374236 |
| mre11a | MRE11 meiotic recombination 11 homolog A (S. cerevisiae) | 8 | 1.906797 | 10.98647 | 1.373308 |
| gnaq | Guanine nucleotide-binding protein (G protein), q polypeptide | 12 | 3.022406 | 16.35835 | 1.363196 |
| za20d2 | Zinc finger protein 216 | 6 | 2.163187 | 7.679124 | 1.279854 |
| ddx5 | Asp-Glu-Ala-Asp box polypeptide 5 | 6 | 2.305021 | 7.605856 | 1.267643 |
| serpinb1 | Serine (or cysteine) proteinase inhibitor, clade B, member 1 | 8 | -2.52357 | 10.02203 | 1.252754 |
| zmynd11 | Adenovirus 5 E1A binding protein | 6 | 2.90827 | 7.504926 | 1.250821 |
| rnf14 | Ring finger protein 14 | 13 | 2.075105 | 15.99435 | 1.230334 |
| gfap | Glial fibrillary acidic protein | 16 | -2.57461 | 19.4386 | 1.214912 |
| grp58 | Glucose-regulated protein, 58 kDa | 15 | 3.047668 | 18.06967 | 1.204645 |
| asip | Agouti signaling protein, nonagouti homolog (mouse) | 6 | -2.27216 | 7.170907 | 1.195151 |
| il10ra | IL-10 receptor-α | 6 | -2.71053 | 7.152875 | 1.192146 |
| mrps27 | Mitochondrial ribosomal protein S27 | 9 | 2.867958 | 10.66507 | 1.185008 |
| mapk9 | Mitogen-activated protein kinase 9 | 7 | 2.512859 | 8.28175 | 1.183107 |
| cop1 | Constitutive photomorphogenic protein | 12 | 1.928966 | 14.1273 | 1.177275 |
| dct | Dopachrome tautomerase | 7 | -2.41546 | 8.189752 | 1.169965 |
| ezh2 | Enhancer of zeste homolog 2 (Drosophila) | 6 | -2.76411 | 7.012886 | 1.168814 |
| tnfrsf11b | Basic transcription factor 3, like 1 | 7 | 2.70328 | 8.176898 | 1.168128 |
| nat2 | N-acetyltransferase 2 (arylamine N-acetyltransferase) | 11 | 1.860787 | 12.56943 | 1.142675 |
| pnr | Putative neurotransmitter receptor | 7 | 3.072439 | 7.81515 | 1.11645 |
| dlg3 | Discs, large homolog 3 (neuroendocrine-dlg, Drosophila) | 9 | 2.088173 | 9.936481 | 1.104053 |
| tpr | Translocated promoter region (to activated MET oncogene) | 15 | 2.476345 | 16.53732 | 1.102488 |
| xpa | Xeroderma pigmentosum, complementation group A | 8 | 3.698296 | 8.606495 | 1.075812 |
| bcar3 | Breast cancer antiestrogen resistance 3 | 6 | 2.157678 | 6.428479 | 1.071413 |
| hdac8 | Histone deacetylase 8 | 6 | 2.384186 | 6.391956 | 1.065326 |
| rpe | Ribulose-5-phosphate-3-epimerase | 10 | 2.452221 | 10.63192 | 1.063192 |
| amt | Aminomethyltransferase (glycine cleavage system protein T) | 8 | -2.37204 | 8.419163 | 1.052395 |
| sfrs1 | Splicing factor, arginine/serine-rich 1 | 17 | 2.418068 | 17.69482 | 1.040872 |
| ide | Insulin-degrading enzyme | 7 | 1.884278 | 7.273596 | 1.039085 |
| akr1c2 | Aldo-keto reductase family 1, member C2 | 6 | 2.472241 | 6.105277 | 1.017546 |

Connectivity rankings were calculated. Genes significantly differentially regulated across diabetes/no diabetes classes are displayed and ranked according to the average d score. Connections are derived from language parsing of the published literature, and d scores are based on significance values (d statistic value). The cumulative score represents the total of all the significance values within a given gene network, and the average score represents the cumulative score divided by the number of connections. Only genes with greater than five connections are displayed for clarity.
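The ranking rule stated in the Table 7 footnote (average d score equals the cumulative d score divided by the number of connections, restricted to genes with more than five connections) can be sketched as follows. This is an illustrative sketch only, not the authors' software: the names `NexusRow` and `rank_nexus_genes` are hypothetical, and the cumulative d scores are taken as given from four rows of Table 7 rather than recomputed from the underlying network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NexusRow:
    """One candidate nexus gene (hypothetical container for a Table 7 row)."""
    symbol: str
    connections: int
    cumulative_d: float

    @property
    def average_d(self) -> float:
        # Average d score = cumulative d score / number of connections.
        return self.cumulative_d / self.connections


def rank_nexus_genes(rows, min_connections=6):
    """Keep genes with more than five connections, ranked by average d score."""
    kept = [row for row in rows if row.connections >= min_connections]
    return sorted(kept, key=lambda row: row.average_d, reverse=True)


# Symbols, connection counts, and cumulative d scores copied from Table 7.
ROWS = [
    NexusRow("gfap", 16, 19.4386),
    NexusRow("insr", 6, 12.14521),
    NexusRow("casp9", 6, 10.99824),
    NexusRow("ran", 11, 15.99435),
]

ranked = rank_nexus_genes(ROWS)
for row in ranked:
    print(f"{row.symbol:6s} {row.connections:3d} {row.average_d:.6f}")
```

Ranking on the average rather than the cumulative score keeps a gene with many weakly significant neighbors (gfap here) from outranking one with fewer but more significant neighbors (insr).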

GRANTS

This work was supported by the Donald W. Reynolds Cardiovascular Clinical Research Center at Stanford University.

REFERENCES

1. American Heart Association. Heart Disease and Stroke Statistics–2004 Update. Dallas, TX: American Heart Association, 2003.

2. Armen J and Smith BW. Exercise considerations in coronary artery disease, peripheral vascular disease, and diabetes mellitus. Clin Sports Med 22: viii and 123–133, 2003.

3. Barabasi AL and Oltvai ZN. Network biology: understanding the cell’s functional organization. Nat Rev Genet 5: 101–113, 2004.

4. Beckman JA, Creager MA, and Libby P. Diabetes and atherosclerosis: epidemiology, pathophysiology, and management. JAMA 287: 2570–2581, 2002.

5. Bergmann S, Ihmels J, and Barkai N. Similarities and differences in genome-wide expression data of six organisms. PLoS Biol 2: E9, 2004.

6. Biasucci LM, Vitelli A, Liuzzo G, Altamura S, Caligiuri G, Monaco C, Rebuzzi AG, Ciliberto G, and Maseri A. Elevated levels of interleukin-6 in unstable angina. Circulation 94: 874–877, 1996.

7. Correia ML and Haynes WG. Leptin, obesity and cardiovascular disease. Curr Opin Nephrol Hypertens 13: 215–223, 2004.

8. Crook M. Type 2 diabetes mellitus: a disease of the innate immune system? An update. Diabet Med 21: 203–207, 2004.

9. Dobrovolskaia MA and Vogel SN. Toll receptors, CD14, and macrophage activation and deactivation by LPS. Microbes Infect 4: 903–914, 2002.

10. Eisen MB, Spellman PT, Brown PO, and Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 95: 14863–14868, 1998.

11. Evans A, Van Baal GC, McCarron P, DeLange M, Soerensen TI, De Geus EJ, Kyvik K, Pedersen NL, Spector TD, Andrew T, Patterson C, Whitfield JB, Zhu G, Martin NG, Kaprio J, and Boomsma DI. The genetics of coronary heart disease: the contribution of twin studies. Twin Res 6: 432–441, 2003.

12. Festa A, D’Agostino R Jr, Tracy RP, and Haffner SM. Elevated levels of acute-phase proteins and plasminogen activator inhibitor-1 predict the development of type 2 diabetes: the insulin resistance atherosclerosis study. Diabetes 51: 1131–1137, 2002.

13. Franz WM, Frey N, Muller O, Kubler W, and Katus HA. [A transgenic animal model: new possibilities for cardiovascular research]. Z Kardiol 84, Suppl 4: 17–32, 1995.

14. Ghazalpour A, Doss S, Yang X, Aten J, Toomey EM, Van Nas A, Wang S, Drake TA, and Lusis AJ. Thematic review series: the pathogenesis of atherosclerosis. Toward a biological network for atherosclerosis. J Lipid Res 45: 1793–1805, 2004.

15. Henneke P and Golenbock DT. Innate immune recognition of lipopolysaccharide by endothelial cells. Crit Care Med 30: S207–S213, 2002.

16. Hibbert L, Pflanz S, De Waal Malefyt R, and Kastelein RA. IL-27 and IFN-alpha signal via Stat1 and Stat3 and induce T-Bet and IL-12Rbeta2 in naive T cells. J Interferon Cytokine Res 23: 513–522, 2003.

17. Ho M, Yang E, Matcuk G, Deng D, Sampas N, Tsalenko A, Tabibiazar R, Zhang Y, Chen M, Talbi S, Ho YD, Wang J, Tsao PS, Ben-Dor A, Yakhini Z, Bruhn L, and Quertermous T. Identification of endothelial cell genes by combined database mining and microarray analysis. Physiol Genomics 13: 249–262, 2003.

18. Hong Y, de Faire U, Heller DA, McClearn GE, and Pedersen N. Genetic and environmental influences on blood pressure in elderly twins. Hypertension 24: 663–670, 1994.

19. Huber SA, Sakkinen P, Conze D, Hardin N, and Tracy R. Interleukin-6 exacerbates early atherosclerosis in mice. Arterioscler Thromb Vasc Biol 19: 2364–2367, 1999.

20. Iliadou A, Lichtenstein P, Morgenstern R, Forsberg L, Svensson R, de Faire U, Martin NG, and Pedersen NL. Repeated blood pressure measurements in a sample of Swedish twins: heritabilities and associations with polymorphisms in the renin-angiotensin-aldosterone system. J Hypertens 20: 1543–1550, 2002.

21. Johnson CD, Balagurunathan Y, Lu KP, Tadesse M, Falahatpisheh MH, Carroll RJ, Dougherty ER, Afshari CA, and Ramos KS. Genomic profiles and predictive biological networks in oxidant-induced atherogenesis. Physiol Genomics 13: 263–275, 2003.

22. Kanehisa M and Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28: 27–30, 2000.

23. Kincaid R, Kleusing D, and Vailaya A. BNS: an LDAP-Based Biomolecule Naming Service. Washington, DC: Objects in Bio- and Chem-Informatics, 2002.

24. Laaksonen DE, Niskanen L, Nyyssonen K, Punnonen K, Tuomainen TP, Valkonen VP, Salonen R, and Salonen JT. C-reactive protein and the development of the metabolic syndrome and diabetes in middle-aged men. Diabetologia 47: 1403–1410, 2004.

25. Larkin JE, Frank BC, Gaspard RM, Duka I, Gavras H, and Quackenbush J. Cardiac transcriptional response to acute and chronic angiotensin II treatments. Physiol Genomics 18: 152–166, 2004.

26. Libby P. Inflammation in atherosclerosis. Nature 420: 868–874, 2002.

27. Libby P, Ridker PM, and Maseri A. Inflammation and atherosclerosis. Circulation 105: 1135–1143, 2002.

28. Lopes N, Vasudevan SS, Alvarez RJ, Binkley PF, and Goldschmidt PJ. Pathophysiology of plaque instability: insights at the genomic level. Prog Cardiovasc Dis 44: 323–338, 2002.

29. Lu H, Raptis M, Black E, Stan M, Amar S, and Graves DT. Influence of diabetes on the exacerbation of an inflammatory response in cardiovascular tissue. Endocrinology 145: 4934–4939, 2004.

30. Lucas S, Ghilardi N, Li J, and de Sauvage FJ. IL-27 regulates IL-12 responsiveness of naive CD4+ T cells through Stat1-dependent and -independent mechanisms. Proc Natl Acad Sci USA 100: 15047–15052, 2003.

31. Marston SB and Smith CW. The thin filaments of smooth muscles. J Muscle Res Cell Motil 6: 669–708, 1985.

32. Mulvihill ER, Jaeger J, Sengupta R, Ruzzo WL, Reimer C, Lukito S, and Schwartz SM. Atherosclerotic plaque smooth muscle cells have a distinct phenotype. Arterioscler Thromb Vasc Biol 24: 1283–1289, 2004.

33. Nordskog BK, Blixt AD, Morgan WT, Fields WR, and Hellmann GM. Matrix-degrading and pro-inflammatory changes in human vascular endothelial cells exposed to cigarette smoke condensate. Cardiovasc Toxicol 3: 101–117, 2003.

34. Ohashi R, Mu H, Yao Q, and Chen C. Cellular and molecular mechanisms of atherosclerosis with mouse models. Trends Cardiovasc Med 14: 187–190, 2004.

35. Peltonen L. GenomEUtwin: a strategy to identify genetic influences on health and disease. Twin Res 6: 354–360, 2003.

36. Pickup JC. Inflammation and activated innate immunity in the pathogenesis of type 2 diabetes. Diabetes Care 27: 813–823, 2004.

37. Pickup JC and Crook MA. Is type II diabetes mellitus a disease of the innate immune system? Diabetologia 41: 1241–1248, 1998.

38. Poulter N. Coronary heart disease is a multifactorial disease. Am J Hypertens 12: 92S–95S, 1999.

39. Presbitero P, Zavalloni D, Scatturin M, Marisco F, Pagnotta P, and Boccuzzi G. Procedural and long-term results of sirolimus-eluting stent in patients at high risk for restenosis. Minerva Cardioangiol 52: 189–194, 2004.

40. Romaldini CC, Issler H, Cardoso AL, Diament J, and Forti N. [Risk factors for atherosclerosis in children and adolescents with family history of premature coronary artery disease]. J Pediatr 80: 135–140, 2004.

41. Rosal MC, Ockene JK, Luckmann R, Zapka J, Goins KV, Saperia G, Mason T, and Donnelly G. Coronary heart disease multiple risk factor reduction; providers’ perspectives. Am J Prev Med 27: 54–60, 2004.

42. Ross R. Atherosclerosis–an inflammatory disease. N Engl J Med 340: 115–126, 1999.

43. Rothermel AL, Wang Y, Schechner J, Mook-Kanamori B, Aird WC, Pober JS, Tellides G, and Johnson DR. Endothelial cells present antigens in vivo. BMC Immunol 5: 5, 2004.

44. Rubartelli P, Niccoli L, Alberti A, Giachero C, Ettori F, Missiroli B, Bernardi G, Maiello L, Reimers B, Cernigliaro C, Sardella G, and Bramucci E. Coronary rotational atherectomy in current practice: acute and mid-term results in high- and low-volume centers. Catheter Cardiovasc Interv 61: 463–471, 2004.

45. Schieffer B, Selle T, Hilfiker A, Hilfiker-Kleiner D, Grote K, Tietge UJ, Trautwein C, Luchtefeld M, Schmittkamp C, Heeneman S, Daemen MJ, and Drexler H. Impact of interleukin-6 on plaque development and morphology in experimental atherosclerosis. Circulation 110: 3493–3500, 2004.

46. Schmidt MI, Duncan BB, Sharrett AR, Lindberg G, Savage PJ, Offenbacher S, Azambuja MI, Tracy RP, and Heiss G. Markers of inflammation and prediction of diabetes mellitus in adults (Atherosclerosis Risk in Communities study): a cohort study. Lancet 353: 1649–1652, 1999.

47. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, and Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498–2504, 2003.

48. Song L and Schindler C. IL-6 and the acute phase response in murine atherosclerosis. Atherosclerosis 177: 43–51, 2004.

49. Spin JM, Nallamshetty S, Tabibiazar R, Ashley EA, King JY, Chen M, Tsao PS, and Quertermous T. Transcriptional profiling of in vitro smooth muscle cell differentiation identifies specific patterns of gene and pathway activation. Physiol Genomics 19: 292–302, 2004.

50. Stary HC. Natural history and histological classification of atherosclerotic lesions: an update. Arterioscler Thromb Vasc Biol 20: 1177–1178, 2000.

51. Stary HC, Chandler AB, Dinsmore RE, Fuster V, Glagov S, Insull W Jr, Rosenfeld ME, Schwartz CJ, Wagner WD, and Wissler RW. A definition of advanced types of atherosclerotic lesions and a histological classification of atherosclerosis. A report from the Committee on Vascular Lesions of the Council on Arteriosclerosis, American Heart Association. Circulation 92: 1355–1374, 1995.

52. Stary HC, Chandler AB, Glagov S, Guyton JR, Insull W Jr, Rosenfeld ME, Schaffer SA, Schwartz CJ, Wagner WD, and Wissler RW. A definition of initial, fatty streak, and intermediate lesions of atherosclerosis. A report from the Committee on Vascular Lesions of the Council on Arteriosclerosis, American Heart Association. Circulation 89: 2462–2478, 1994.

53. Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, and Altman RB. Missing value estimation methods for DNA microarrays. Bioinformatics 17: 520–525, 2001.

54. Tusher VG, Tibshirani R, and Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 98: 5116–5121, 2001.

55. Vailaya A, Bluvas P, Kincaid R, Kuchinsky A, Creech M, and Adler A. An architecture for biological information extraction and representation. Bioinformatics 21: 430–438, 2005.

56. Valentine RJ, Guerra R, Stephan P, Scoggins E, Clagett GP, and Cohen J. Family history is a major determinant of subclinical peripheral arterial disease in young adults. J Vasc Surg 39: 351–356, 2004.

57. Willerson JT and Ridker PM. Inflammation as a cardiovascular risk factor. Circulation 109: II2–II10, 2004.
