bioinformatics 2 - the university of edinburgh › teaching › courses › bio2 › lectures...1...
TRANSCRIPT
![Page 1: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/1.jpg)
1
Armstrong, 2007
Bioinformatics 2
From genomics & proteomics tobiological networks
Armstrong, 2007
Aims
• Briefly review functional genomics• Biological Networks in general• Genetic Networks• Briefly review proteomics• Protein Networks
![Page 2: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/2.jpg)
2
Armstrong, 2007
protein-geneinteractions
protein-proteininteractions
PROTEOME
GENOME
Citrate Cycle
METABOLISM
Bio-chemicalreactions
Bio-Map
Slide from http://www.nd.edu/~networks/
Armstrong, 2007
Biological Profiling
• Microarrays– cDNA arrays– oligonucleotide arrays– whole genome arrays
• Proteomics– yeast two hybrid– PAGE techniques
![Page 3: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/3.jpg)
3
Armstrong, 2007
Why microarrays?
• What genes are expressed in a tissue andhow does that tissue respond to one of anumber of factors:– change in physical environment– experience– pharmacological manipulation– influence of specific mutations
Armstrong, 2007
What do we actually get?
• A snap-shot of the mRNA profile in abiological sample
• With the correct experimental conditions wecan compare two situations
• Not all biological processes are regulatedthrough mRNA expression levels
![Page 4: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/4.jpg)
4
Armstrong, 2007
What can we learn?
• Identify functionally related genes• Find promoter regions (common regulation)• Predict genetic interactions• If we change one variable a network of gene
responses should compensate• Homeostasis is a fundamental principle of biology
- almost all biological systems exist in a controlledstate of negative feedback.
Armstrong, 2007
The Transcriptome
• Microarrays work by revealing DNA-DNAbinding.
• Transcriptional activators also bind DNA• Spot genomic DNA onto glass slides• Label protein extracts• Hybridise to the genomic probes• Reveals domains that include promoter
regions
![Page 5: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/5.jpg)
5
Armstrong, 2007
ChIP to ChipChromatin Immunoprecipitation to Microarray (Chip)
Protein-DNA interactionsde-novo prediction has many false positivesWhich DNA sites do actually bind a specific TF?Requires an antibody to the protein
Armstrong, 2007
http://proteomics.swmed.edu/chiptochip.htm
![Page 6: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/6.jpg)
6
Armstrong, 2007
http://proteomics.swmed.edu/chiptochip.htm
Armstrong, 2007
Biological networks
![Page 7: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/7.jpg)
7
Armstrong, 2007
Yeast protein networkNodes: proteinsLinks: physical interactions (binding)
P. Uetz, et al. Nature 403, 623-7 (2000).
Prot Interaction map
Slide from http://www.nd.edu/~networks/
Armstrong, 2007
Building networks…
• Biological Networks– Random networks– Scale free networks– Small worlds
• Metabolic Networks• Proteomic Networks• The Mammalian Synapse• Other synapse models?
![Page 8: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/8.jpg)
8
Armstrong, 2007
Biological Networks
• Genes - act in cascades• Proteins - form functional complexes• Metabolism - formed from enzymes and substrates• The CNS - neurons act in functional networks• Epidemiology - mechanics of disease spread• Social networks - interactions between individuals
in a population• Food Chains
Armstrong, 2007
Protein Interactions
• Individual Proteins form functionalcomplexes
• These complexes are semi-redundant• The individual proteins are sparsely
connected• The networks can be represented and
analysed as an undirected graph
![Page 9: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/9.jpg)
9
Armstrong, 2007
Large scaleorganisation
– Networks in biology generally modeled usingclassic random network theory.
– Each pair of nodes is connected withprobability p
– Results in model where most nodes have thesame number of links <k>
– The probability of any number of links pernode is P(k)≈e-k
Armstrong, 2007
![Page 10: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/10.jpg)
10
Armstrong, 2007
Non-biological networks
• Research into WWW, internet and humansocial networks observed different networkproperties– ‘Scale-free’ networks– P(k) follows a power law: P(k)≈k-γ
– Network is dominated by a small number ofhighly connected nodes - hubs
– These connect the other more sparselyconnected nodes
Armstrong, 2007
![Page 11: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/11.jpg)
11
Armstrong, 2007
Internet-Map
the internet
Armstrong, 2007
Small worlds
• General feature of scale-free networks– any two nodes can be connected by a relatively
short path– average between any two people is around 6
• What about SARS???– 19 clicks takes you from any page to any other
on the internet.
![Page 12: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/12.jpg)
12
Armstrong, 2007
Armstrong, 2007
![Page 13: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/13.jpg)
13
Armstrong, 2007
Biological organisation
• Pioneering work by Oltvai and Barabasi• Systematically examined the metabolic
pathways in 43 organisms• Used the WIT database
– ‘what is there’ database– http://wit.mcs.anl.gov/WIT2/– Genomics of metabolic pathways
Jeong et al., 2000 The large-scale organisation of metabolicnetworks. Nature 407, 651-654
Armstrong, 2007
Using metabolic substrates asnodes
archae
all 43eukaryote
bacteria
=scale free!!!
![Page 14: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/14.jpg)
14
Armstrong, 2007
Random mutations in metabolicnetworks
• Simulate the effect of random mutations ormutations targeted towards hub nodes.– Measure network diameter– Sensitive to hub attack– Robust to random
Armstrong, 2007
Consequences for scale freenetworks
• Removal of highly connected hubs leads to rapid increasein network diameter– Rapid degeneration into isolated clusters– Isolate clusters = loss of functionality
• Random mutations usually hit non hub nodes– therefore robust
• Redundant connectivity (many more paths between nodes)
![Page 15: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/15.jpg)
15
Armstrong, 2007
Network Motifs
• Do all types of connections exist innetworks?
• Milo et al studied the transcriptionalregulatory networks in yeast and E.Coli.
• Calculated all the three and four genecombinations possible and looked at theirfrequency
Armstrong, 2007
Milo et al. 2002 Network Motifs: Simple Building Blocks of ComplexNetworks. Science 298: 824-827
Biological Networks
Three node possibilities
![Page 16: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/16.jpg)
16
Armstrong, 2007
Gene sub networks
Heavy bias in both yeast and E.coli towards these two subnetwork architectures
Armstrong, 2007
![Page 17: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/17.jpg)
17
Armstrong, 2007
Gene networks and network inference
Armstrong, 2007
What is a gene network
• Genes do not act alone.• Gene products interact with other genes
– Inhibitors– Promoters
• The nature of genetic interactions in complex– Takes time– Can be binary, linear, stochastic etc– Can involve many different genes
![Page 18: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/18.jpg)
18
Armstrong, 2007
What makes boys boys and girls girls?
Sugar, Spice and synthetic Oestrogens?
Armstrong, 2007
Sex determination: a genecascade (in flies…)
RuntSisterlessScute
DaughterlessDeadpanExtramachrochaete
6 Genes detect X:A ratioFemales Males
![Page 19: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/19.jpg)
19
Armstrong, 2007
Sex determination(in flies…)
RuntSisterlessScute
DaughterlessDeadpanExtramachrochaete
6 Genes regulate ‘Sexlethal’
Sexlethal
+ effect
- effect
Armstrong, 2007
Sex determination(in flies…)
RuntSisterlessScute
DaughterlessDeadpanExtramachrochaete
Sexlethal can then regulate itself...
Sexlethal
![Page 20: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/20.jpg)
20
Armstrong, 2007
Sex determination(in flies…)
Downstream cascade builds...
RuntSisterlessScute
DaughterlessDeadpanExtramachrochaete
Sexlethal transformer doublesex
Armstrong, 2007
Gene expression and time1 Runt2 Sisterless3 Scute
4 Daughterless5 Deadpan6 Extramachrochaete
7 Sexlethal 8 transformer 9 doublesex
![Page 21: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/21.jpg)
21
Armstrong, 2007
Gene microarrays
time
Armstrong, 2007
Gene Network Inference
• Gene micro-array data• Learning from micro-array data• Unsupervised Methods• Supervised Methods• Edinburgh Methods
![Page 22: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/22.jpg)
22
Armstrong, 2007
Gene Network Inference
• Gene micro-array data– Time Series array data– Tests under ranges of conditions
• Unlike example - 1000s genes• Lots of noise• Clustering would group many of these genes
together• Aim: To infer as much of the network as possible
Armstrong, 2007
Learning from Gene arrays
• Big growth industry but difficult problem• Initial attempts based on unsupervised
methods:– Basic clustering analysis - related genes– Principal Component Analysis– Self Organising Maps– Bayesian Networks
![Page 23: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/23.jpg)
23
Armstrong, 2007
Bayesian ‘gene’ networks
• Developed by Nir Friedman and Dana Pe’er• Can be easily adapted to a supervised
method
Armstrong, 2007
Learning Gene Networks
• The field is generally moving towards moresupervised methods:– Bayesian networks can use priors– Support Vector machines– Neural Networks– Decision Trees
![Page 24: Bioinformatics 2 - The University of Edinburgh › teaching › courses › bio2 › lectures...1 Armstrong, 2007 Bioinformatics 2 From genomics & proteomics to biological networks](https://reader035.vdocuments.mx/reader035/viewer/2022070805/5f03d1127e708231d40aea03/html5/thumbnails/24.jpg)
24
Armstrong, 2007
• Scale free architecture– Chance of new edges is proportional to existing ones– Highly connected nodes may well be known to be
lethal
• Network motifs– Constrain the types of sub networks
• Prior Knowledge– Many sub networks already known
Can we combine networkknowledge with gene inference?
Armstrong, 2007
Conclusions
• Gene network analysis is a big growth area• Several promising fields starting to converge
– Complex systems analysis– Using prior knowledge– Application of advance machine learning algorithms– AI approaches show promise