www.exeter.ac.uk/biomedicalhub biomedical informatics hub
WELLCOME TRUST ISSF BIOMEDICAL INFORMATICS HUB
Dr Konrad Paszkiewicz
Contents
Wellcome Trust ISSF
• Background
Seed Corn Fund
• Aims
Biomedical Informatics Hub
• Purpose
• Expertise
• Resources
• Training
• Seminars
• Project booking/Prioritisation
• Sustainability
Institutional Strategic Support Fund
A £1.5 million Wellcome Trust award, match-funded by the University
• to enhance institutional strategies for the biomedical sciences
• to support scientific progress, translation and interdisciplinary collaboration
• to encourage greater efficiency, effectiveness and accountability in the stewardship of Wellcome Trust funding
University of Exeter ISSF project
Create a virtual Biomedical Informatics Hub
• to maximise output from emerging technologies and large datasets
Establish an ISSF Seed Corn Fund
• to support highly-talented researchers to generate preliminary data in support of independent fellowship and grant applications
• 8 projects supported in 2012
University match-funding = new academic posts
• that will utilise the Hub to enable major advances in interdisciplinary biomedical research
Seed Corn Fund
Round 3: Who?
outstanding postdoctoral researchers - to generate preliminary data to support independent Fellowship applications
early career academics - to generate preliminary data in support of research grant applications
mid-career and senior academics seeking pump priming support for new activity that will lead to a Wellcome Trust application (first time applicants to the Trust particularly encouraged)
Round 3: What?
Must be in the Wellcome Trust’s remit
5 to 7 projects (up to 12 months)
Up to £20k (£30k for senior researchers)
Closing date = 1st Feb 2013
Assessment criteria
• track record
• proposed project
• outputs
Contact Allison McCaig ([email protected])
Biomedical Informatics Hub
Purpose
1. Supply informatics expertise in bioinformatics, mass-spectrometry, imaging and statistics
2. Train researchers through existing and newly developed programs to make best use of their data
3. Supply PhD students, postdocs and junior researchers with the tools and expertise to produce 4* research
4. Provide leadership, governance and coordination to leverage and extend Exeter's IT research infrastructure
5. Support researchers with access to centralised computational infrastructure and equipment
6. Act as a central point of contact for initial research ideas
Expertise
1. Bioinformatics
• Medical - Marcus Tuke, Anna Murray & Mike Weedon (Prof. Tim Frayling)
• Bioscience - Dr Christine Sambles (Sequencing Service)
2. Metabolomics/Proteomics
• Dr Venura Perera (Dr Hannah Florance)
3. Image analysis
• Dr Jeremy Metz (starting November) (Prof. Rob Beardmore)
4. Tomography and EM
• TBC (Prof. Gero Steinberg)
5. Statistics
• TBC (Prof. Tim Frayling)
6. Computational modelling/sensor-based networks
• Dr Mahmood Javaid (Prof. Ed Watkins)
Resources
Sequencing Service (Konrad Paszkiewicz, Karen Moore)
• Illumina HiSeq 2000
Mass-spectrometry Service (Hannah Florance)
• Agilent QQQ
• Agilent QTOF
• Metabolomics/Proteomics
Imaging (Martin Schuster, Gero Steinberg)
• Confocal, EM Tomography, SEM, TEM
Computational Resources
• Zeus cluster
Resources
Zeus Systems Biology cluster
• 19 nodes with 144 cores
• Nodes have between 24 GB and 192 GB RAM
• 150 TB of storage
• Designed for serial jobs, not MPI/OpenMP jobs
• Upcoming £150k upgrade
Contact Konrad for access
Hub Training Courses
Training courses currently offered:
1. Amazon EC2 cloud computing
2. Unix & Perl
3. Short read genomics
4. Phylogenetics
5. Use of R statistical package
6. RNA-seq
New courses in 2013
Seminars
Monthly seminars/workshop meetings
Scope: bioinformatics, statistics, image analysis, programming
Submitting requests
You can approach Hub members informally for initial discussions about projects
If you’re unsure – email me
Submitting requests
BUT before they undertake any work for you:
• You must apply via bioinformatics-hub.ex.ac.uk
• Await approval by the head of discipline (no more than a few days)
• You need to specify up-front outcomes in terms of grant applications and/or publications, and when you expect to achieve these
• Allison McCaig will review outputs. Users who use Hub time and then neither publish nor apply for grants will be penalised in future Hub applications.
Some screenshots.....
Project proposal submission
bio-informaticsHub.ex.ac.uk
Project proposal tracking
bio-informaticsHub.ex.ac.uk
Prioritisation
Wellcome Trust remit projects have priority
“to achieve extraordinary improvements in human and animal health... we support the brightest minds in biomedical research and the medical humanities.”
• See Research Challenges at www.wellcome.ac.uk
1. Seed Corn Fund projects
2. Existing Wellcome Trust funded projects
3. Wellcome Trust remit research
4. Other research
Sustainability
Funding via the Wellcome Trust ISSF for up to three years (annual approval)
To continue the Hub, we need to ensure members are costed into grant applications now
Any projects which use Hub expertise and subsequently apply for a grant need to cost relevant members
Hub team presentations
Ven Perera
Who Am I
Dr Venura Perera
Located: Room M01, Mezzanine Floor, Geoffrey Pope, Biosciences, College of Life and Environmental Sciences
Background:
PhD in Systems biology/Bioinformatics
• Using mathematical methods for analysis of untargeted mass profiling data sets
Applied mathematics
• Use of numerical methods for forecasting and predictive modeling of non-linear systems
• Data-driven techniques for parameter optimization
• Ordinary and partial differential equations for system modeling
My skills
Applied Mathematics
• Use of numerical models for data modeling
• Development of dynamic models using numerical methods
• Statistical modeling using non-parametric methods
Large Data Sets Analysis
• Focus on untargeted metabolite data
• New methods to solve any question …
Metabolite Pipeline ExSpec
• Development of metabolite centers untargeted pipeline
Programming
• Java: data modeling
• R: mainly for data visualization
• Matlab: data modeling and visualization
• HTML, PHP, MySQL and basic JavaScript
My research interests
• Further the use of MS techniques
• Numerical methods for data modelling
• Using probabilistic methods for data-driven coupled systems
• Development of techniques for MS fingerprinting
• Using MS methods for phylogenetic tree reconstruction
My Work
Small molecule analysis: untargeted profiling and targeted quantification
[Workflow diagram]
• Targeted / Quantitative Analysis: Multiple Reaction Monitoring
• Non-targeted Analysis: Extract Data, Molecular Feature Extraction (MFE) and data pre-processing
• Statistical analysis, Modelling and Clustering: Sample Comparison, Quantification, Exploratory
Example Project
Two groups of patient types:
• Healthy
• Cancer
Three patients
Project Aim: Use a variety of techniques, both targeted analysis and untargeted profiling, to determine key components which are affected by the cancer
Example Project
Patient 24 had a large amount of inflammation, causing the control and cancer tissue to share a great deal of feature similarity.
http://biosciences.exeter.ac.uk/facilities/spectrometry/
http://bio-massspeclocal.ex.ac.uk/
• Nick Smirnoff (Director of Mass Spectrometry) [email protected]
• Hannah Florance (MS Facility Manager) [email protected]
• Venura Perera (Bioinformatics and Mathematical Support) [email protected]
Numerical Dynamic Modeling
Reaction scheme:
• AK: Known substrate of B (can be removed if un-mapped)
• AU: Unknown substrate of B
• B: Substrate to model
• E: Enzyme(s) controlling the reaction
• C: Known product of B
• CU: Unknown product of B
• BU: Unknown substrate of C
[Reaction scheme diagram: nodes AK, AU, B, CU, C, E and BU; rate labels λ, α, β, μ and γ]
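The reaction scheme above can be sketched as a small system of ordinary differential equations. The slides do not say which rate label belongs to which reaction, so the mapping below (AK to B at rate alpha, AU to B at rate lam, B to C at rate beta, B to CU at rate mu, BU to C at rate gamma) is an assumption made purely for illustration, as are the first-order kinetics, the constant AU and BU pools, and the forward-Euler integrator. It is not the model used in the Hub's own work.

```python
def simulate_scheme(ak0=1.0, au=0.0, bu=0.0, alpha=1.0, lam=0.0,
                    beta=0.5, mu=0.1, gamma=0.0, dt=0.01, steps=2000):
    """Forward-Euler integration of an ASSUMED first-order reading of the scheme.

    Assumed mapping (not given in the slides):
      AK -> B at rate alpha, AU -> B at rate lam,
      B -> C at rate beta,   B -> CU at rate mu,
      BU -> C at rate gamma.
    AU and BU are treated as constant pools, and the enzyme E is folded
    into the rate constants.
    """
    ak, b, c, cu = ak0, 0.0, 0.0, 0.0
    trajectory = [(0.0, ak, b, c, cu)]
    for step in range(1, steps + 1):
        # Right-hand sides of the ODEs, evaluated at the current state.
        d_ak = -alpha * ak
        d_b = alpha * ak + lam * au - (beta + mu) * b
        d_c = beta * b + gamma * bu
        d_cu = mu * b
        # Explicit Euler update.
        ak += dt * d_ak
        b += dt * d_b
        c += dt * d_c
        cu += dt * d_cu
        trajectory.append((step * dt, ak, b, c, cu))
    return trajectory


trajectory = simulate_scheme()
# Time at which the modelled substrate B reaches its maximum.
t_peak_b = max(trajectory, key=lambda row: row[2])[0]
final_t, final_ak, final_b, final_c, final_cu = trajectory[-1]
```

With the unknown pools switched off (the default), material only moves from AK through B into C and CU, so the four quantities always sum to the starting amount. B rises and falls while C keeps accumulating, which is the kind of lagged substrate-product profile a fitted model would be compared against.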
Example of Substrate-Product combo
Substrate-Product combo:
• Switch profiles show delayed effect, indicating possible reaction scheme
• Profiles visually illustrate similarity
Anna Murray & Mike Weedon
Mike Weedon, PhD
Lecturer in Bioinformatics and Statistics, Exeter Medical School, St Luke’s Campus
Graduate in Biochemistry and Molecular Biology
PhD in molecular genetics of type 2 diabetes, Peninsula Medical School
Postdoc on genetics of complex traits, Peninsula Medical School
Current research interests – genome-wide investigations of monogenic and polygenic traits
Anna Murray, PhD
Senior Lecturer in Human Genetics, Exeter Medical School, St Luke’s Campus
Graduate in Biology
PhD in molecular biology of T cell receptor rearrangement in coeliac disease in Southampton
Postdoc on population genetics of Fragile X syndrome in Salisbury/Southampton
Current research interests – Genetics of female reproductive lifespan
Whole exome/genome sequencing
ENCODE and other projects to annotate non-coding genome
Genome-wide SNP and expression array studies
Biological pathway analysis
Statistical analysis of large datasets
Marcus Tuke
Background
Computer Science Graduate
University of Exeter Medical School
Supercomputer systems administrator
8 months working on projects at the Wellcome Trust Centre for Human Genetics
MSc Bioinformatics
Expertise I can offer
• Human Next Generation Sequence data processing and analysis
• Human RNA-Seq processing and analysis
• Computational expertise
Computational expertise
• Help with projects that could benefit from expertise in:
Programming/scripting
Database/web development
Linux command line and cluster computing
Human Next Generation Sequence data processing and analysis
• Align Next Generation Sequencing reads to reference genome
• Pipelines to process and call variation in aligned genome including:
• Pipelines to filter false positives and QC analysis
[Figure: reads aligned to the reference genome, illustrating a deletion and a heterozygous g>c SNP]
• Realign reads mis-aligned in genome due to indels
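As a toy illustration of what "calling variation" from aligned reads means, the sketch below piles up gapless reads against a short reference and reports positions where a non-reference base is seen often enough to look like a real variant. The function name `call_genotypes`, the depth cut-off and the allele-fraction window are all hypothetical choices for this example. Production pipelines rely on dedicated aligners and variant callers, and handle indels, base qualities and mapping qualities, none of which are modelled here.

```python
from collections import Counter


def call_genotypes(reference, reads, min_depth=4, het_range=(0.25, 0.75)):
    """Call simple SNP genotypes from gapless alignments.

    reads is a list of (start, sequence) pairs with 0-based start positions.
    Thresholds are illustrative assumptions, not recommended settings.
    """
    # Build a per-position count of observed bases (a minimal pileup).
    pileup = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    calls = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        ref = reference[pos]
        alt_counts = {b: n for b, n in counts.items() if b != ref}
        if not alt_counts:
            continue  # every read matches the reference
        alt, n_alt = max(alt_counts.items(), key=lambda kv: kv[1])
        frac = n_alt / depth
        if frac < het_range[0]:
            continue  # rare mismatch, treated as sequencing error
        genotype = "het" if frac <= het_range[1] else "hom_alt"
        calls.append((pos, ref, alt, genotype))
    return calls


reference = "ACGGTCA"
reads = [(0, "ACGGTCA"), (0, "ACGGTCA"), (0, "ACGCTCA"), (0, "ACGCTCA")]
calls = call_genotypes(reference, reads)
```

In this example half of the reads carry a C where the reference has a G, so the position is reported as a heterozygous G>C SNP, mirroring the case shown in the figure.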
Human RNA-Seq processing and analysis
Align RNA-Seq reads to Human Genome
Transcriptome assembly
Differential expression analysis
Research Interests
Understanding the genetic basis of human diseases and traits
Genome Wide Association Studies (GWAS) have identified numerous genomic regions associated with several major diseases
However, these studies have focussed mostly on common single nucleotide genetic variants
How much more disease-associated genetic variation can we discover from 'higher resolution' technologies such as next generation sequencing (NGS)?
- Rarer
- Structural
- Non-autosomal
- Sub-groups
Whole-genome sequencing in the InCHIANTI Study
Ongoing projects
• A longitudinal study of aging from the Chianti region of Tuscany, Italy
• 1453 individuals have been followed up over 4 waves since 1998
• Extensive phenotypic and biomedical information collected
• Hundreds of circulating biomarkers (e.g. interleukins, sex hormones, vitamins) measured
• RNA expression profiling in lymphocytes (Illumina 46K array)
• Methylation profiling (450K array)
• Already had the Illumina 550K SNP Chip genotyped
• We are performing low-pass (median 7X) whole genome sequencing on 680 of these individuals
http://www.inchiantistudy.net
RNA-Seq whole-transcriptome assembly and analysis
• Assemble transcriptomes for 3 primary human microvascular endothelial cells
• 2 treated with cathepsin L and D respectively, and 1 control
• Analyse whether there are any differences in expression of RNA between the two treatments and the control
Christine Sambles
Christine's prezi
Jeremy Metz
JEREMY METZ, PHD
PhD: Quantum computing - theory & simulations, Imperial College London
Postdoc: Biological image analysis & cellular modelling, Einstein College of Medicine, NY
Research Interests
• Image processing and computer vision applied to biological systems
  o Object segmentation
  o Tracking in range of dynamical scenarios
  o Quantitative analysis
• Modelling biological systems - analytical and numerical approaches
  o Monte Carlo simulations
  o Integration of data and insight from multiple scales of inquiry, e.g. AFM, single cell analysis, high throughput screening
Skills and Expertise
• Python, C++, Java, Matlab
• Image processing
  o Matlab - image processing toolkit
  o Python - SciPy and OpenCV bindings
  o Java - ImageJ plugins
• Simulation
  o C/C++ for fast low-level routines
  o Python as "glue" code and visualization
• Linux, shell scripting, Oracle Grid Engine
Past Projects - Image processing
Object tracking: Developed novel cross-correlation and Bayesian state estimation based object tracker
Object segmentation: (In progress) Object detection and segmentation based on Scale-space representation formalism
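To make the cross-correlation idea concrete, here is a minimal sketch that estimates how far a bright object has moved between two frames by trying every small shift and keeping the one with the highest correlation score. It is a brute-force, unnormalised correlation on nested lists and leaves out the Bayesian state estimation entirely, so it should not be read as the tracker described above. The names `best_shift` and `make_frame` are hypothetical helpers for this example.

```python
def best_shift(frame_a, frame_b, max_shift=3):
    """Return the (dy, dx) shift of frame_b relative to frame_a that
    maximises the unnormalised cross-correlation of the two frames."""
    rows, cols = len(frame_a), len(frame_a[0])
    best, best_score = (0, 0), float("-inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = 0.0
            for y in range(rows):
                for x in range(cols):
                    yy, xx = y + dy, x + dx
                    # Pixels shifted outside the frame contribute nothing.
                    if 0 <= yy < rows and 0 <= xx < cols:
                        score += frame_a[y][x] * frame_b[yy][xx]
            if score > best_score:
                best, best_score = (dy, dx), score
    return best


def make_frame(size, top, left):
    """Build a size x size frame of zeros with a 2 x 2 bright spot."""
    frame = [[0.0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 1.0
    return frame


frame_a = make_frame(8, top=2, left=2)
frame_b = make_frame(8, top=4, left=3)  # spot moved 2 rows down, 1 column right
shift = best_shift(frame_a, frame_b)
```

Real microscopy data would normally call for a normalised correlation and an FFT-based implementation, since the quadruple loop here scales poorly with image size.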
Simulation
Biological system: Kinesin diffusion-reaction during mitosis, spindle photobleached at t=0. Model using minimalist feature cell:
Check for presence of competition for binding sites between diffusing species
Quantum system: Noisy (open) atom-cavity system, laser illumination
How can we work together?
• With my expertise in the fields of biological data/image analysis, and mathematical & computational modeling, how can we combine our skills to produce an outstanding research project?
[Diagram: Experiments; Extract data; Model, Simulation; Theory]
Mahmood Javaid
Area of expertise:
• Computational Modelling
• Sensor-based Networks and Embedded Systems
• High Performance Computing
• Software Development
MAHMOOD JAVAID, PHD COMPUTER SCIENCE
EXPERIMENTAL OFFICER - RESEARCH COMPUTING
WELLCOME TRUST BIOMEDICAL INFORMATICS HUB
Computational Modelling
Agent-based modelling using agent-based simulation frameworks such as FLAME and SWARM
Benefits:
• Close association between the model entities and the real-world agents
• Heterogeneous agents within an environment
• Interaction between the agents through message passing
• Potential to uncover emergent behaviours
• Possibility of multi-scale modelling
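The benefits listed above (heterogeneous agents that interact only by passing messages) can be shown with a very small plain-Python sketch. It does not use the FLAME or SWARM APIs. The `Emitter` and `Responder` classes, the "signal" message and the one-step delivery delay are all assumptions invented for this illustration.

```python
class Emitter:
    """Agent type 1: sends a 'signal' message to each neighbour every step."""

    def __init__(self, name, neighbours):
        self.name = name
        self.neighbours = list(neighbours)

    def step(self, inbox):
        return [(n, "signal") for n in self.neighbours]


class Responder:
    """Agent type 2: activates after receiving `threshold` signals,
    then relays signals to its own neighbours."""

    def __init__(self, name, threshold, neighbours=()):
        self.name = name
        self.threshold = threshold
        self.neighbours = list(neighbours)
        self.received = 0
        self.active = False

    def step(self, inbox):
        self.received += sum(1 for message in inbox if message == "signal")
        if self.received >= self.threshold:
            self.active = True
        if self.active:
            return [(n, "signal") for n in self.neighbours]
        return []


def run(agents, steps):
    """Advance all agents; messages sent in one step arrive in the next."""
    inboxes = {agent.name: [] for agent in agents}
    for _ in range(steps):
        outboxes = {agent.name: [] for agent in agents}
        for agent in agents:
            for recipient, message in agent.step(inboxes[agent.name]):
                outboxes[recipient].append(message)
        inboxes = outboxes
    return agents


def build_agents():
    return [
        Emitter("e", ["r1"]),
        Responder("r1", threshold=2, neighbours=["r2"]),
        Responder("r2", threshold=1),
    ]


after_three = run(build_agents(), 3)
after_four = run(build_agents(), 4)
```

Activation spreads down the chain one agent at a time: after three steps only `r1` is active, and `r2` follows a step later. That delay comes from the interaction rules rather than from any single agent's code, which is a very simple instance of the emergent behaviour such models are used to explore.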
Some examples
Prospecting asteroid belt
Hypothesis testing
Economics and Systems biological applications of ABM
Eurace project
(Dept. of Computer Science, University of Sheffield.)
System Understanding of Microbial Oxygen Responses (SUMO)
Sensor-based networks, embedded systems
Involving proximity sensors such as Ultrasonic, Infrared, and Laser-based.
Inertial measurement units such as accelerometers, gyroscopes and magnetometers.
Interfacing between sensors and computation units using interconnects such as Serial ports, Bluetooth, and I2C.
PIC and AVR microcontroller based systems
Sensory augmentation device
In order to move around in the world safely and quickly most of us are highly reliant on our visual sense.
When vision is compromised, the problem of safely finding our way becomes much more difficult.
Possibility of augmenting our existing senses with a form of “remote touch” generated by artificial distance sensors and tactile stimulus.
Movement is critical to how we use our tactile sense.
We explore objects through touch by controlling the way that we move our sensory surfaces over them—stroking with the fingertips to investigate texture, for instance, or palpating to investigate shape.
Active Touch
Inspired by biology and biomimetics
Physical Components
First Prototype
High Performance Computing
Agent-based models running on Linux HPC clusters
Administration of Linux HPC cluster
Publishing legacy applications running on Linux clusters using web-services
Software Development
Desktop applications for Linux and Windows platforms including Matlab
Web-based applications and publishing legacy systems using web-services
Mobile Operating System applications
Online Depression and Mood Disorder Screener
www.exeter.ac.uk/biomedicalhub
Thanks for listening.....