David Gilbert: [email protected] BRC Glasgow 1
Bioinformatics Research Centre
University of Glasgow
David Gilbert
www.brc.dcs.gla.ac.uk
Department of Computing Science, University of Glasgow
Bioinformatics
• Bioinformatics - the study of the application of molecular biology, computer science, artificial intelligence, statistics and mathematics
- to model, organise, understand and discover interesting information associated with the large-scale molecular biology databases,
- to guide assays for biological experiments.
(Computational Biology - USA).
• Bio - Molecular Biology
• Informatics - Computer Science
Bioinformatics in context - a new discipline?
[Diagram labels: Computing; Maths & Stats; Life sciences; Physical sciences; Psychology?]
How can we analyse the flood of data?
Data: don't just store it, analyse it! By comparing sequences, one can find out about things like
• How organisms are related & evolution
• How proteins function
• Population variability
• How diseases occur
Dirty data?
Big Horn Sheep [Ovis canadensis]
The Big Horn Sheep [Ovis canadensis] is a large North American species with a brown coat, which turns to bluish-grey in winter. It is so named from the size of the horns of the ram, which often measure over 1 m/3.3 ft round the curve.
Classification: Ovis canadensis is in family Bovidae, order Artiodactyla
Data, information, knowledge …
• data: nucleotide sequence
• information: where are the “genes”. Found using a classifier, pattern or rule which has been mined/discovered
• knowledge: facts and rules
If a gene X has a weak psi-blast assignment to a function F
– and that gene is in an expression cluster
– and sufficient members of that cluster are known to have function F,
then believe assignment of F to X.
[Figure labels: control statement; TATA box; start; gene; termination (stop); control statement]
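The "knowledge" rule above can be written down as a small predicate. This is only a sketch: the `Gene` record, the field names and the `min_support` threshold (standing in for "sufficient members") are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gene:
    name: str
    known_function: Optional[str] = None  # curated function, if any
    weak_hit: Optional[str] = None        # function suggested by a weak PSI-BLAST hit


def believe_assignment(gene: Gene, cluster: List[Gene],
                       min_support: float = 0.5) -> Optional[str]:
    """Return the function to believe for `gene`, or None if the rule does not fire.

    `min_support` is an assumed threshold: the fraction of the other cluster
    members that must already be known to have the candidate function.
    """
    candidate = gene.weak_hit
    if candidate is None:
        return None
    others = [g for g in cluster if g.name != gene.name]
    if not others:
        return None
    support = sum(1 for g in others if g.known_function == candidate) / len(others)
    return candidate if support >= min_support else None


# Toy expression cluster (invented names and functions).
x = Gene("X", weak_hit="kinase")
cluster = [x, Gene("A", "kinase"), Gene("B", "kinase"), Gene("C", "transporter")]
print(believe_assignment(x, cluster))
```

Raising `min_support` makes the rule more conservative; with two of three neighbours supporting the hit, a threshold of 0.8 would leave X unassigned.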
Indexing
Ela Hunt [email protected]
• String indexing structures can be used to index DNA, proteins, XML and phylogenetic trees
• All data is read once; the index is created on disk
• The index reduces the search space of the query (we read only a fraction of the disk)
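As an illustration of how a string index narrows a query, here is a minimal in-memory suffix array. It is a stand-in chosen for brevity: the slides do not say which index structure is used, and a real disk-based index would be built and stored very differently.

```python
def build_suffix_array(text):
    """Start positions of all suffixes of `text`, in lexicographic order.

    Naive construction, fine for a short example but not for genome-scale data.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Binary-search the suffix array for every position where `pattern` occurs."""
    m = len(pattern)
    lo, hi = 0, len(suffix_array)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(suffix_array)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[start:lo])


dna = "GATTACAGATTACA"
sa = build_suffix_array(dna)
print(find_occurrences(dna, sa, "ATTA"))
```

Each query inspects only a logarithmic number of suffixes instead of scanning the whole text, which is the sense in which the index reduces the search space.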
Distributed databases and computation
Cardiovascular Functional Genomics
• £5.4 million project, 5 UK universities: Glasgow, Leicester, Edinburgh, Oxford, Imperial; + Maastricht
• Led by clinicians
• Combined studies:
– scientific models of disease (rat)
– parallel studies of patients
– large family and population DNA collections
• 3-pronged approach:
– Targeted transcript sequencing
– Microarray gene expression profiling
– Comparative genome analysis
• Data generated at each of the 5 sites and made available for analysis
• Issues of distributed data and computation
• Mapping gene sequences between rat, mouse and human: an added layer of complexity in the computation
Wellcome Trust: Cardiovascular Functional Genomics
[Diagram labels: Glasgow; Edinburgh; Leicester; Oxford; London; Netherlands; shared data; public curated data]
BRIDGES: BioMedical Research Informatics Delivered by Grid Enabled Services
• National e-Science Centre, Bioinformatics Research Centre, IBM UK Life Sciences
• Incrementally develop and explore database integration over 6 geographically distributed research sites within the framework of the large Wellcome Trust biomedical research project Cardiovascular Functional Genomics.
• Three classes of integration will be developed to support a sophisticated bioinformatics infrastructure supporting:
– data sources (both public and project generated),
– bioinformatics analysis and visualisation tools,
– research activities combining shared and private data.
• The inclusion of patient records and animal experiment data means that privacy and access control are particular concerns.
• An exploration of index factories accelerating sequence processing will test the hypothesis that the Grid makes a new class of e-Science indexes feasible. Both OGSA-DAI and IBM DiscoveryLink technology will be employed and a report will identify how each performed in this context.
Functional Genomics
~44,000 genes
~33% of genes have unknown function
Solution…
• Solve the problem of the twilight zone (sequence alignments below 30% sequence identity)
• How?
• Predict protein function using an alternative method to BLAST:
• Predict protein functional class from sequence, structural and phylogenetic features using machine learning
• Combination of these (computationally and statistically) would provide the biologists like yourselves with the most accurate functional prediction of proteins that fall in the twilight zone.
Ali Al-Shahib, Chao He, Mark Girolami
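To make the 30% figure concrete, the sketch below computes percent identity for two sequences that have already been aligned and flags pairs that fall in the twilight zone. The choice of denominator (columns without a gap in either sequence) is an assumption; several other conventions are in use, and the slides do not say which one is meant.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def in_twilight_zone(aligned_a, aligned_b, threshold=30.0):
    """True when identity is below the 30% threshold quoted on the slide."""
    return percent_identity(aligned_a, aligned_b) < threshold


# Invented toy alignments, not real proteins.
print(percent_identity("ACDE-FG", "ACDQWFG"))
print(in_twilight_zone("ACDEFGHIKL", "AWWWWWWWWL"))
```

Pairs flagged this way are the ones for which the feature-based machine learning prediction described above is intended to take over from alignment-based transfer of function.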
Locating genome duplications
Q: did one or more genome-wide events affect all gene families?
[Figure: a gene tree (Lamprey, Mouse, Mouse, Human, Human) marked "gene duplication", beside a species tree (Lamprey, Mouse, Human, Reptiles + Birds, Lungfish, Teleosts, Sharks & Rays) annotated "happened somewhere here"]
Molecular Evolution: A Phylogenetic Approach
Rod [email protected]
TOPS: Protein topology
David Gilbert, Juris Viksna, Gilleain Torrance (BRC, Glasgow),
David Westhead and Ioannis Michalopoulos (Leeds)
BBSRC/EPSRC funded
Structure comparison
2bop (probe) against (subset of) CATH
TOPS comparison server: www.tops.leeds.ac.uk
PDB file
TOPS diagram (graph)
Matches to motif library (v. fast)
Pairwise comparison to structures in database (slower)
Protein design
Design of a Novel Globular Protein Fold with Atomic-Level Accuracy
Brian Kuhlman,1 Gautam Dantas,1 Gregory C. Ireton,4 Gabriele Varani,1,2 Barry L. Stoddard,4 David Baker1,3
“A major challenge of computational protein design is the creation of novel proteins with arbitrarily chosen three-dimensional structures.
Here, we used a general computational strategy that iterates between sequence design and structure prediction to design a 93-residue α/β protein called Top7 with a novel sequence and topology.
Top7 was found experimentally to be folded and extremely stable, and the x-ray crystal structure of Top7 is similar (root mean square deviation equals 1.2 angstroms) to the design model.
The ability to design a new protein fold makes possible the exploration of the large regions of the protein universe not yet observed in nature.”
1 Department of Biochemistry, University of Washington, Seattle, WA 98195, USA.
2 Department of Chemistry, University of Washington, Seattle, WA 98195, USA.
3 Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
4 Division of Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
Science. 2003 Nov 21;302(5649):1364-8
Protein design
Generation of starting models.
“The target structure for the de novo design process can range from a detailed backbone model to a back-of-the-envelope sketch.”
“Because we aimed to create a novel protein fold, we selected a topology not present in the PDB according to the Topology of Protein Structure (TOPS) server (17).”
User = [email protected] at 20:29:51 on 3/06/03
Structure code = top7a
type = PDB (user declared), Database = atlas
Details of sheets etc (including all connected SSEs): Sheet: [6,7,4,1,2]
======================================================
Comparison time : 43 sec
Domain   Code            Rank
top7a    target_query    0
1bbi00   4.10.100.10.1   7
1pi200   4.10.100.10.1   7
1sro00   2.40.29.10.1    7
1atx00   2.20.20.10.1    9
2sh100   2.20.20.10.1    9
1vcc00   3.30.66.10.1    11
1hpm02   3.10.140.10.1   12
1csp00   2.40.50.40.1    13
2snv01   2.40.10.20.3    13
3tss02   2.40.50.50.3    13
1bcpF0   2.40.50.50.2    14
1bovA0   2.40.50.30.2    14
1tle00   2.10.25.10.1    14
1cdb00   2.60.40.10.1    15
1ckmA3   4.10.87.10.1    15
1kxf01   2.40.10.20.3    15
1svpA1   2.40.10.20.3    15
2pkaX0   2.40.10.20.1    15
1apo00   2.10.25.10.6    16
1ate00   2.10.40.10.1    16
1aww00   2.30.30.10.1    16
1cuk01   2.40.50.80.1    16
Use of TOPS for protein design
Top7a NEEheEC 1:2A 1:4A 2:4R 4:6R 4:7A 6:7A 1:4R 4:6R
Systems biology – some definitions
• Systems biology is the study of all the elements in a biological system (all genes, mRNAs, proteins, etc) and their relationships one to another in response to perturbations.
• Systems approaches attempt to study the behaviour of all of the elements in a system and relate these behaviours to the systems or emergent properties
A Framework for Systems Biology (Ideker, Galitski & Hood, 2001)
• Define all of the components of the system
• Systematically perturb and monitor components of the system
• Reconcile the experimentally observed responses with those predicted by the model
• Design and perform new perturbation experiments to distinguish between multiple or competing model hypotheses
New database technologies for storing the output from high-throughput biological experiments
Andrew Jones
• Proteomics – study the set of proteins expressed in a sample
• Complex, variable output:
– High-resolution images
– Numerical data generated by lab. equipment and software
– Human annotation
• The data is not suitable for storage in a standard relational database
• Storage, retrieval and exchange of data is important
• XML (Extensible Markup Language) is being investigated for storing such data
• Maintained by National Library of Medicine
• Free of charge, since 1997
• > 10 million references since 1971
• > 4000 biomedical journals
• > 80% in English
• > 80% have an abstract
"Biochemical Network Data Mined from Scientific Texts"
Te Ren (PhD student) with CXR Biosciences.
Data complexity
Methionine Biosynthesis in E. coli
[Figure: annotated pathway diagram. Labels recovered from the slide:
Metabolites: L-aspartate; L-aspartate-4-P; L-aspartate semialdehyde; L-homoserine; alpha-succinyl-L-homoserine; cystathionine; homocysteine; L-methionine; S-adenosyl-L-methionine
Reactions (EC numbers): 2.7.2.4; 1.2.1.11; 1.1.1.3; 2.3.1.46; 4.2.99.9; 4.4.1.8; 2.1.1.13; 2.1.1.14; 2.5.1.6
Co-substrates and by-products: ATP; ADP; NADPH, H+; NADP+, Pi; succinyl-SCoA; HSCoA; L-cysteine; succinate; H2O; pyruvate, NH4+; 5-methyl-THF; THF; Pi, PPi
Genes and products: asd (aspartate semialdehyde dehydrogenase); metA (homoserine-O-succinyltransferase); metB (cystathionine-gamma-synthase); metC (cystathionine-beta-lyase); metE (cobalamin-independent homocysteine transmethylase); metH (cobalamin-dependent homocysteine transmethylase); metL (aspartate kinase II/homoserine dehydrogenase II); metJ (aporepressor); metR (metR activator); metBL operon; holorepressor
Relation types: codes for; catalyzes; is part of; inhibits; represses; up-regulates; expression
Linked pathways: aspartate biosynthesis; lysine biosynthesis; threonine biosynthesis]
Biochemical networks
• Pathway navigation
• Pathway comparison
• Pathway motif discovery
• Pathway simulation
• High-level abstraction inferred from low-level descriptions
• Novel pathways from gene expression experiments
[Workflow diagram labels: DNA chip experiment; transcription profiles; clustering; clusters of co-regulated genes (functional meaning?); pathway extraction in metabolic reaction graph; putative metabolic pathways; matching against metabolic pathway database; known pathways; novel pathways; visualization]
A Software System for Pattern Matching and Motif Discovery in Biochemical Networks
Sebastian Oehm
• Design a suitable data model using bipartite graphs
• Define patterns and develop algorithms for pattern matching in biochemical networks
• Define pathway motifs and develop algorithms for motif searching in biochemical networks
• Develop algorithms for automated motif discovery
• Develop algorithms to search for the largest common part of two or more biochemical networks
• Develop a measure of similarity for pathway comparison
[Figure: methionine biosynthesis in S. cerevisiae and E. coli, side by side.
Shown for both: L-aspartate; L-aspartyl-4-P; L-aspartic semialdehyde; L-homoserine; homocysteine; L-methionine; S-adenosyl-L-methionine; EC 2.7.2.4, 1.2.1.11, 1.1.1.3, 2.1.1.14, 2.5.1.6
S. cerevisiae only: O-acetyl-homoserine; EC 2.3.1.31, 4.2.99.10
E. coli only: alpha-succinyl-L-homoserine; cystathionine; EC 2.3.1.46, 4.2.99.9, 4.4.1.8]
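One way to read the "bipartite graph" data model is with compounds and reactions as the two node sets, and edges running only between a compound and a reaction. The sketch below encodes the two pathways from the figure in that form. The step order is my reading of the flattened diagram, cofactors are left out, and the helper names are hypothetical; it is not the system described on the slide.

```python
from collections import defaultdict, deque

# (substrate, EC number, product) steps, as read off the figure.
COMMON_START = [
    ("L-aspartate", "2.7.2.4", "L-aspartyl-4-P"),
    ("L-aspartyl-4-P", "1.2.1.11", "L-aspartic semialdehyde"),
    ("L-aspartic semialdehyde", "1.1.1.3", "L-homoserine"),
]
COMMON_END = [
    ("homocysteine", "2.1.1.14", "L-methionine"),
    ("L-methionine", "2.5.1.6", "S-adenosyl-L-methionine"),
]
ECOLI = COMMON_START + [
    ("L-homoserine", "2.3.1.46", "alpha-succinyl-L-homoserine"),
    ("alpha-succinyl-L-homoserine", "4.2.99.9", "cystathionine"),
    ("cystathionine", "4.4.1.8", "homocysteine"),
] + COMMON_END
YEAST = COMMON_START + [
    ("L-homoserine", "2.3.1.31", "O-acetyl-homoserine"),
    ("O-acetyl-homoserine", "4.2.99.10", "homocysteine"),
] + COMMON_END


def build_bipartite(steps):
    """Successor map with edges compound -> reaction -> compound only."""
    succ = defaultdict(set)
    for substrate, ec, product in steps:
        succ[("compound", substrate)].add(("reaction", ec))
        succ[("reaction", ec)].add(("compound", product))
    return succ


def reachable_compounds(succ, start):
    """Names of compounds reachable from `start` by following the edges."""
    seen, queue = set(), deque([("compound", start)])
    while queue:
        node = queue.popleft()
        for nxt in succ.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return {name for kind, name in seen if kind == "compound"}


def shared_reactions(steps_a, steps_b):
    """EC numbers present in both pathways: a crude common-part measure."""
    return {ec for _, ec, _ in steps_a} & {ec for _, ec, _ in steps_b}


print(sorted(shared_reactions(ECOLI, YEAST)))
print("L-methionine" in reachable_compounds(build_bipartite(ECOLI), "L-homoserine"))
```

Set intersection over EC numbers is far weaker than the largest-common-subgraph search listed in the bullets, but it shows why the two-node-type model is convenient: compounds and reactions can be compared separately.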
Biochemical Pathway Simulator
A Software Tool for Simulation & Analysis of Biochemical Networks
Muffy Calder, David Gilbert, Walter Kolch, Keith van Rijsbergen, Brian Ross, Oliver Sturm
DTI ‘Beacon’ project, £0.9M, 4 years
Complexity: real bioinformatics
Closing the loop from wet lab to in-silico
[Architecture diagram labels:
Bio: lab / literature
Bioinformatics: tools, database, interface
Simulator: concurrency theory
Components: lab; literature; MAPK database; apoptosis database; data; text miner; pathway editor; user interface; web portal; simulator; analysis; rules; abstract model; human feedback (in-the-loop)
Inset pathway: mitogens / growth factors → receptor → Ras → Raf → MEK → ERK → cytoplasmic substrates; Elk, SAP → gene]
Proliferation (Cell division) vs Differentiation (Neurite outgrowth) in PC12 cell model
NGF (50 ng/ml): differentiation into nerve cell type; neurite outgrowth
EGF (50 ng/ml): proliferation; cell division stimulated without neurite outgrowth
Dynamic Behaviour of the Network
Panel 1 (Receptor, Ras, Raf-1, MEK1,2, ERK1,2): Raf-1 is expressed in all cells, and its activation induces ERK activation. Outcome: cell growth.
Panel 2 (adds cAMP, PKA): Many receptors that activate ERK also elevate cAMP levels, leading to activation of PKA. PKA inhibits Raf-1 and blocks ERK activation. Outcome: growth arrest.
Panel 3 (adds B-Raf): However, cAMP induces activation of B-Raf. In cells which express B-Raf, cAMP activates the ERK pathway despite Raf-1 inhibition. Outcome: cell growth.
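The three cases on this slide can be captured in a small Boolean model. This is a deliberately crude sketch: every species is either on or off, there is no timing or dosage, and the function and parameter names are invented for the example.

```python
def erk_active(receptor_on, camp_elevated, expresses_braf):
    """Boolean reading of the slide: is ERK activated?"""
    ras = receptor_on
    pka = camp_elevated                          # elevated cAMP activates PKA
    raf1 = ras and not pka                       # PKA inhibits Raf-1
    braf = expresses_braf and camp_elevated      # cAMP activates B-Raf where it is expressed
    mek = raf1 or braf                           # either Raf isoform can drive MEK
    return mek                                   # ERK follows MEK


def outcome(receptor_on, camp_elevated, expresses_braf):
    """Map ERK activity to the outcome labels used on the slide."""
    active = erk_active(receptor_on, camp_elevated, expresses_braf)
    return "cell growth" if active else "growth arrest"


for case in [(True, False, False), (True, True, False), (True, True, True)]:
    print(case, outcome(*case))
```

The same receptor signal gives opposite outcomes depending on whether the cell expresses B-Raf, which is the kind of context-dependent behaviour a pathway simulator is meant to expose.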
Mobility
Sometimes a signal sent in a communications network can change the connections or topology of that network. In the example below, a cell-phone is being carried out of range of Cell 1. The base station must send the frequency of the appropriate new Cell (Cell 2) to the phone. The phone connects to Cell 2 and discards its previous link to Cell 1.
[Diagram, before and after handover: Base; Cell 1; Cell 2; Cell 2 frequency; Conversation]
[Diagram labels: Ras-GDP; SoS; GDP; GTP; Ras-GTP; Raf]
In biochemical networks, a protein can be granted or denied the opportunity to interact with certain other molecules by exchange factors, effectively changing the network topology dynamically. In the example below, the protein Ras is bound to a molecule of GDP, which renders Ras inactive. A molecule of SoS can interact with this Ras-GDP complex, causing the GDP to be exchanged for GTP. The Ras-GTP complex is active, permitting interaction with the protein Raf.
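A toy state model makes the "changing topology" point explicit: the set of partners Ras can interact with depends on which nucleotide it carries. The class and function names are hypothetical, and the partner sets encode only what the paragraph above states (whether SoS also binds Ras-GTP is not modelled).

```python
class Ras:
    """Ras with a bound nucleotide, either 'GDP' (inactive) or 'GTP' (active)."""

    def __init__(self):
        self.nucleotide = "GDP"

    @property
    def active(self):
        return self.nucleotide == "GTP"

    def partners(self):
        """Interaction edges currently available to this Ras molecule."""
        return {"Raf"} if self.active else {"SoS"}


def sos_exchange(ras):
    """SoS acts on the Ras-GDP complex and swaps GDP for GTP."""
    if "SoS" in ras.partners():
        ras.nucleotide = "GTP"


ras = Ras()
print(ras.nucleotide, ras.partners())
sos_exchange(ras)
print(ras.nucleotide, ras.partners())
```

As in the cell-phone example, an interaction (here with SoS) rewires which links exist afterwards, rather than just passing a value along a fixed link.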
Reusable Subcomponents of a Solution for Offline Integration of 3rd party Databases
• By-products of the total process may correspond to other reusable sub-services
– Schema Translation – various schema definition languages are translated into one common, interpretable schema language.
– Record Matching – builds a cross-reference index that identifies records about the same entity and records the source and location of the matching records. Two or more records may match.
[Architecture diagram labels: aMaze DB; MAPK source data; cAMP PK source data; extracted literature data; input schemas; Schema Translator; translated local schemas; record matching rules; Record Matcher; cross-ref index; default values; conflict resolution rules; Record Merger; target schema; Integrator; Integrated Database]
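The Record Matching step can be sketched as building a dictionary from a normalised entity key to the (source, location) pairs of the records that share it. The matching rule used here (lower-cased name with punctuation removed) and all record contents are assumptions made for the example; the slide does not specify the actual matching rules.

```python
from collections import defaultdict


def normalise(name):
    """Assumed matching rule: compare names case-insensitively, ignoring punctuation."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_cross_reference(sources):
    """Map each entity key to the (source, record id) pairs that refer to it.

    `sources` is {source name: [ {"id": ..., "name": ...}, ... ]}.
    Only keys matched by two or more records are kept.
    """
    index = defaultdict(list)
    for source, records in sources.items():
        for record in records:
            index[normalise(record["name"])].append((source, record["id"]))
    return {key: hits for key, hits in index.items() if len(hits) >= 2}


# Invented records; source names follow the diagram labels.
sources = {
    "aMaze DB": [{"id": "a1", "name": "Raf-1"}],
    "MAPK source data": [{"id": "m7", "name": "RAF1"}, {"id": "m8", "name": "MEK"}],
    "cAMP PK source data": [{"id": "c3", "name": "raf 1"}, {"id": "c4", "name": "PKA"}],
}
print(build_cross_reference(sources))
```

Because the index stores only sources and locations, it can be reused by the merger or by other services without copying the underlying records.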
Validation
Drug target discovery: What is a good drug target? How do we select it?
Drug target validation: Does hitting the target change the biological response?
Side effects: What else is affected when the selected target is hit?
Lead Compound Selection: Which compounds should be taken further for development? What properties should the drug have?
Current Bottlenecks in Drug Development
EMPIRICAL, SLOW, EXPENSIVE
A robust Pathway Simulation Software can help to …
Select targets by defining their topology & function in the regulatory networks.
Validate the target by predicting how the biological response should change.
Predict side effects to allow early and targeted testing.
Predict the optimal drug profile to improve selection criteria.
Validation
What we propose …
[Diagram: PC12 cell model of neuronal differentiation. Rap / B-Raf and Ras / Raf-1 feed MEK and ERK. EGF: transient ERK activity, proliferation. NGF: sustained ERK activity, differentiation.]
Target Validation: Predict & test the effect of Raf-1 and B-Raf inhibitors on the biological response to EGF vs. NGF.
Lead Compound Selection: Predict & test which inhibitory efficacy is necessary and sufficient to achieve the desired biological response.
Bionanotechnology & Bioinformatics
[Diagram labels: nanofab & cell culture; bioinformatics; fab methodology; model of cell behaviour; external databases; other pathway data; measured cell behaviour; morphology; proteome; dynamic behaviour; adhesion; gene expression; cell shape; physical substrate; biochemical environment (other cells + biochemicals); genetic engineering]
Machine Learning for Bioinformatics
• Classification
• Clustering
• Characterisation
• Techniques:
– ensemble methods
– decision trees
– inductive logic programming
– pattern discovery
– statistical approaches
– SVMs
Cancer Classification Problem
ALL: acute lymphoblastic leukemia (lymphoid precursors)
AML: acute myeloid leukemia (myeloid precursor)
(Golub et al 1999)
Machine Learning Approach
[Diagram: gene expression profiles labelled ALL / AML; machine learning; classifier (C4.5, SVM, k-NN, ANN); predicted ALL / AML]
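Of the classifiers named in the diagram, k-NN is the simplest to show in a few lines. The sketch below classifies a new expression profile by majority vote among its nearest training profiles. The three-gene profiles are made-up numbers for illustration only; they are not the Golub et al. data.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training profiles.

    `train` is a list of (profile, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented toy profiles: expression levels of three hypothetical genes.
TRAIN = [
    ((2.1, 0.3, 1.0), "ALL"),
    ((1.9, 0.4, 1.2), "ALL"),
    ((2.3, 0.2, 0.9), "ALL"),
    ((0.4, 1.8, 0.1), "AML"),
    ((0.5, 2.1, 0.3), "AML"),
    ((0.3, 1.9, 0.2), "AML"),
]

print(knn_predict(TRAIN, (2.0, 0.5, 1.1)))
print(knn_predict(TRAIN, (0.6, 2.0, 0.2)))
```

Real expression data has thousands of genes per sample, so gene selection and cross-validation would normally come before any such classifier is trusted.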
Biological Data: Distributed and Heterogeneous!!
LPSYVDWRSA GAVVDIKSQG ECGGCWAFSA IATVEGINKI TSGSLISLSE QELIDCGRTQ NTRGCDGGYI TDGFQFIIND GGINTEENYP YTAQDGDCDV
[Figure labels: protein (sequence, structure, function); gene expression; morphology; microarray analysis]
Integrative Machine Learning
(Pratt Emotif)
Aik Choon Tan
What kind of computational approaches do we use?
• Operations over
– sequences (match)
– trees (e.g. suffix trees, supertree, joining, ...)
– graphs (sub-graph isomorphism, maximal common subgraph, path searching)
• Data modelling, databases, data conversion
• Machine learning, knowledge discovery, pattern discovery,...
• Clustering
• Theorem proving, concurrency analysis,…
• Integration: data, knowledge
• Data visualisation
• Web services, Grid, Coarse Grain parallelism, eScience,...
Latest from BRC
• New Systems Biology lab (March 9)
• Web services, www.brc.dcs.gla.ac.uk
• Research teams:
– Databases & Visualisation (Ela Hunt)
– Grid & eScience (Richard Sinnott)
– Functional genomics (David Leader)
– Machine learning (Mark Girolami)
– Structural bioinformatics (Pawel Herzyk)
– Systems biology (David Gilbert)
• Teaching: MScIT Bioinformatics Strand
BRC members
• Investigators:
– Yves Deville (Biochemical Networks), dcs
– David Gilbert (Systems biology, Protein structure), dcs
– Mark Girolami (Machine learning), dcs
– Pawel Herzyk (Protein structure), ibls
– Ela Hunt (Database indexing, Data integration, Visualisation,…), dcs
– David Leader (Visualisation tools), ibls
– Gerhard May (Signalling pathways), ibls
– Rod Page (Phylogenetic trees), ibls
– Richard Sinnott (Grid computing / eScience), dcs
– Juris Viksna (Graph algorithms), dcs
• Research Assistants: Micha Bayer, Rainer Breitling, Neil Hanlon, Derek Houghton, Richard Orton, Evangelos Pafilis, Oliver Sturm, Gilleain Torrance
• Research students: Ali Al-Shahib, David Cook, Iain Darroch, Amelie Gormand, Susan Fairley, Robert Japp, Andrew Jones, Julie Morrison, Te Ren, Aik Choon Tan, Tim Troup, Mallika Veeramalai
• Executive Assistant: Margaret Jackson
• Associated: Malcolm Atkinson, Ernst Wit, John McClure, Mathis Riehle, Des Higham, Oliver Sand
Funding sources
EPSRC
BBSRC
MRC
Wellcome Trust
DTI
Scottish Enterprise
Synergy
Carnegie Trust
Royal Society
Daiwa Foundation
SHEFCE
EU
Scottish Bioinformatics Forum
• Network of Bioinformatics researchers and industries in Scotland
• A vehicle for developing Scotland as a Centre of Bioinformatics Excellence
• Nodes in Glasgow, Edinburgh, Dundee, Aberdeen, ...
• Promoting collaborative research
• Development of a Bioinformatics educational programme
• www.sbforum.org, [email protected]
Visionary Meeting, 27 May (Zoology Building)
Keynote: Prof Thornton, Director of the European Bioinformatics Centre
www.brc.dcs.gla.ac.uk/events.html
Bioinformatics Research Centre
Davidson Building: 15 workstations + visitors’ facilities
[Infrastructure diagram labels: web server; file server; Unix app server; Microsoft app server; database server; firewall; cluster; ScotGrid+ (2x100 CPU, 5 TB); Sun Grid Engine; 3 TB; 1 TB; Boyd-Orr Building (backup); 17 Lilybank Gardens; Kelvin Building]
Where we are
Department of Computing Science
BRC (in Davidson Building)
BRC & Functional Genomics (Joseph Black)
Functional Genomics; Centre for Cell Engineering
Medicine & Therapeutics
Vet School
Beatson Institute
NeSC Hub
Bioinformatics Research Centre (230 m²)
[Floor plan labels: Gardiner lab (wet lab); visitors’ area]
The Future
Closing the loop from wet lab to in silico!
www.brc.dcs.gla.ac.uk
Collaboration!