![Page 1: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/1.jpg)
Systems analysis of innate immune mechanisms in infection – a role for HPC
Peter Ghazal
![Page 2: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/2.jpg)
What is Pathway Biology?
Pathway biology is….
A systems biology approach for understanding a biological process
- empirically by functional association of
multiple gene products & metabolites
- computationally by defining networks of cause-effect relationships.
Pathway Models link molecular; cellular; whole organism levels.
FORMAL MODELS --- ALLOW PREDICTING the outcome of Costly or Intractable Experiments
![Page 3: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/3.jpg)
Focus and outline of talk
• High through-put approaches to mapping and understanding host-response to infection.
• Targeting the host NOT the “bug” as anti-infective strategy
• Making HPC more accessible: SPRINT a new framework for high dimensional biostatistic computation
Story starts at the bed side
![Page 4: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/4.jpg)
Differentially expressed genes Differentially expressed genes in neonatesin neonatescontrol vs Infected (FDR p>1x10control vs Infected (FDR p>1x10-5-5, FC±4), FC±4)
![Page 5: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/5.jpg)
Dealing with HTP data: Impact of data variability
Replicate
Patient where,
j
iby ijiij
),0(~ 2bi Nb ),0(~ 2
Nij
222
variationTechnical
variationBiological
variationTotal
b
• Model for introducing biological and technical variation:
![Page 6: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/6.jpg)
Modelling patient variability and biomarkers for classification
Machine Learning methods:
Random Forest (RF)
Support Vector Machine (SVM)
Linear Discriminant Analysis (LDA)
K-Nearest Neighbour (K-NN)
How different data characteristics affect the misclassification errors?
Factors investigated:Data variability (biological and technical variations)
Training set sizeNumber of replications
Correlation between RNA biomarkers
Mizanur Khondoker
![Page 7: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/7.jpg)
Error rate vs. (number of biomarkers, total variation) An example of a simulation model to quantify number of biomarkers
and level of patient variability
![Page 8: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/8.jpg)
Conclusions from simulations• There is increased predictive value using multiple markers – although there is
no magic number that can be recommended as optimal in all situations.
• Optimal number greatly depends on the data under study.
• The important determining factors of optimal number of biomarkers are:
• The degree of differential expression (fold-change, p-values etc.)
• Amount of biological and technical variation in the data.
• The size of the training set upon which the classifier is to be built.
• The number of replication for each biomarkers.
• The degree of correlation between biomarkers.
• Now possible to predict optimal number through simulation.
![Page 9: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/9.jpg)
Rule of five: Criteria for pathogenesis based biomarkers
• Readily accessible• Multiple markers• Appropriately powered statistical association
• Physiological relevance• Causally linked to phenotype
Key challenge is mapping biomarkers into: biological context and understanding
Requires an experimental model system
![Page 10: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/10.jpg)
PluripotentStem Cell
MyeloidStem Cell
“Activated” Cytolytic Macrophage
Primed Macrophage
Resident Macrophage(immature)
ActivatedT-Lymphocyte
Promonocyte(Primary Signal)
InflammationIFN-gamma
(Secondary Signal)Endotoxin, IFN-gamma
Lymphokines
?
Monocyte?
![Page 11: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/11.jpg)
Transcriptional profile of MΦ activated by Ifng
![Page 12: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/12.jpg)
How do we tackle this?
PATHWAY BIOLOGYLiteratureData-mining
ModellingNetwork analysis
Experimentationgenetic screens
microarraysY2H
mechanism basedstudies
A sub-system study of cause effect relationships with a defined start (input) and end (output).
![Page 13: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/13.jpg)
![Page 14: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/14.jpg)
Mapping new nodes
PATHWAY BIOLOGYLiteratureData-mining
Experimentation
![Page 15: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/15.jpg)
Transcriptional profile of MΦinfected with CMV
![Page 16: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/16.jpg)
Hypothesis generation
• Blue zone vs red zone
![Page 17: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/17.jpg)
Down regulation of sterol pathway
![Page 18: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/18.jpg)
BUT… recorded changes are small – Do they have any
effect?
Next step modelling
PATHWAY BIOLOGY
Pure and applied modellingNetwork inference analysis
Experimentaldata
![Page 19: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/19.jpg)
Workflow
ODE model
Literature derived model
Known parameters
Unknown parameters
Vary parametersby an order of
magnitude
Order of magnitudeestimation
Ensemble of ODE models Results
Ensembleaverage
![Page 20: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/20.jpg)
Modelling
Where available, parameters obtained from the Brenda enzyme database
http://www.brenda-enzymes.info/
Cholesterol Synthesis
ODE model, Michaelis-Menten interactions• 57 Parameters
• 25 Known Parameters• 32 Unknown Parameters
Algorithm• Using the first three time points, calculate an equilibrium state
• Release model from equilibrium and simulate using enzyme data• For each unknown, consider this model across 3 orders of magnitude,
holding the other unknowns parameters fixed.
![Page 21: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/21.jpg)
Cholesterol (output of sterol pathway) results from simulation and expts
Cholesterol rate/flux Cholesterol levels
Predictions: Experiments:
![Page 22: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/22.jpg)
Lipidomic – mass spec results
![Page 23: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/23.jpg)
• Infection down regulate cholesterol biosynthesis pathway and free intra-cellular cholesterol.
• Can now predict the behaviour of the pathway.
• But?• Just as a good as UK (Met Office) weather predictions……
because……
![Page 24: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/24.jpg)
![Page 25: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/25.jpg)
Scalability issues related to increased complexity
• Increasing complexity and size of biological data
• Solution: High Performance Computing (HPC)?
HPC for High Throughput Post-Genomic Data
![Page 26: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/26.jpg)
Problems with large biological data sets
– Volume of data• Many research groups can now routinely generate high volumes of data
– Memory (RAM) handling:• Input data size is too big• Algorithms cause linear, exponential or other growth in
data volume
– CPU performance:• Routine analyses take too long
![Page 27: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/27.jpg)
Limitation examples: Clustering
• Gene clustering using R on a high-spec workstation:– 16,000 genes, k=12 gene clusters runs for ~30min– 16,000 genes, k=40 gene clusters runs for ~10hrs
Partitioning-Around-Medoids, n genes, k=12 clusters requested
Memory fail limit
![Page 28: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/28.jpg)
Outcome: Adverse effect on research
• Arbitrary size reduction of input data
• Batch processing of data
• Analyses in smaller steps
• Avoidance of some algorithms
• Failure to analyse
![Page 29: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/29.jpg)
Solution: High Performance Computing
• HPC takes many forms:– clusters, networks, supercomputer, grid, GPUs, “cloud”, ...
• Provides more computational power
• HPC is technically accessible for most: – Department own, Eddie, HECToR,...
However!
![Page 30: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/30.jpg)
HPC Access Hurdles
• Cost of access
• Time to adapt
• Complex, require specialist skills • Consultancy (e.g. EPCC) only feasible on
ad-hoc basis, not routinely
![Page 31: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/31.jpg)
HPC Access Hurdles
HPC is (currently) optimal for:- Specific problems that can be tackled as a project- Individuals who are familiar with parallelisation and
system architectures
HPC is not optimal for:- Routine/casual analyses of high-throughput data- Ad-hoc and ever-changing analyses algorithms- Data analysts without time or knowledge to sidestep
into parallelisation software/hardware.
![Page 32: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/32.jpg)
Need a step change (up!) to broaden HPC
access to all biologists
Challenge two fold!!
• Provide a generic solution
• Easy to use
![Page 33: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/33.jpg)
SPRINT
Post Genomic
Data
R
Biological Results
Very Large Post Genomic
Data
R
Very Large Post Genomic
Data
R
Biological Results
HPC(Eddie)
A solution for analyses using R
SPRINT (DPM & EPCC))
![Page 34: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/34.jpg)
SPRINTSPRINT has 2 components:1.HPC harness manages access to HPC2.Library of parallel R functions
e.g. cor (correlation) pam (clustering)
maxt (permutation
Allows non-specialists to make use of HPC resources, with analysis functions parallelised by us or the R community.
![Page 35: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/35.jpg)
data(golub)smallgd <- golub[1:100,] classlabel <- golub.cl
resT <- mt.maxT(smallgd, classlabel, test="t", side="abs")
quit(save="no")
library("sprint")
data(golub)smallgd <- golub[1:100,] classlabel <- golub.cl
resT <- pmaxT(smallgd, classlabel, test="t", side="abs")
pterminate()
quit(save="no")
Code comparison
![Page 36: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/36.jpg)
Permutation Benchmark
Input Array Data Size
Permutation Count
Estimatedmaxt 1 CPU
Pmaxt on 256 CPUs
(s)
36,612 x 76 500,000 6 hrs 73.18
36,612 x 76 1,000,000 12 hrs 46.64
73,224 x 76 500,000 10 hrs 148.46
100,000 x 320 1,000,000 20 hrs 294.61
![Page 37: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/37.jpg)
Correlation Benchmark
Input Array Data Size Output Array Data Size pcor() on 256 CPUs
(s)
11,000 x 320(27 MB)
923 MB 4.76
22,000 x 320
(54 MB)
3.6 GB 13.87
35,000 x 320
(85 MB)
9.1 GB 36.64
45,000 x 320
(110 MB)
15 GB 42.18
![Page 38: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/38.jpg)
Clustering Benchmarks
![Page 39: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/39.jpg)
Future
• Cloud (confidentiality issues)
• GPU (limitations is data size)
![Page 40: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/40.jpg)
New therapeutic and diagnostic opportunities
Viral Interaction Networks
Host Interaction Networks
Virus Antiviral HostSystemic Therapeutic
Bed-bench-models-almost back to bed
![Page 41: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/41.jpg)
THANK YOU&
Acknowlegments to our sponsors
![Page 42: Systems analysis of innate immune mechanisms in infection – a role for HPC Peter Ghazal](https://reader036.vdocuments.mx/reader036/viewer/2022062417/5515edfc550346dd6f8b52a3/html5/thumbnails/42.jpg)
Mathieu Blanc
Steven Watterson
Mizanur Khondoker
Paul Dickinson
Thorsten Forster
Muriel Mewissen
Terry Sloan
Jon Hill
Michal Piotrowski
Arthur Trew
Acknowledgement
EPCC