Integrative Information Management for Systems Biology. Neil Swainston, Manchester Centre for Integrative Systems Biology. Data Integration in the Life Sciences, Gothenburg, Sweden, 27 August 2010.


Page 1: Integrative information management for systems biology

Integrative Information Management for Systems Biology

Neil Swainston
Manchester Centre for Integrative Systems Biology

Data Integration in the Life Sciences, Gothenburg, Sweden
27 August 2010

Page 2: Integrative information management for systems biology

The MCISB

• Pioneer the development of new experimental and computational technologies in systems biology

• Currently employs 9.5 multidisciplinary people
– Mathematicians, informaticians, experimentalists, etc.
– All share the same office and lab

• Develop kinetic models of yeast metabolism

Page 3: Integrative information management for systems biology

Metabolism

Page 4: Integrative information management for systems biology

Models

• Genome-scale SBML model of yeast metabolism
– Not kinetic / quantitative!

• Annotated model
– All >2000 molecules have unique database references
– MIRIAM standards have been followed (RDF)
– Should be entirely unambiguous for third party users
– Should be usable in third party tools
– Should allow experimental data to be imported easily

– Herrgård MJ, Swainston N, et al. A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology. Nat Biotechnol. 2008, 26, 1155-60.

Page 5: Integrative information management for systems biology

Bottom-up systems biology

• Steps in kinetic modeling:

• Identify the pathway or portion of a network that is to be modeled

• Associate the model with functions and parameter values that represent its dynamic behavior, either from databases or experimentation

• Analyze and/or simulate the resulting model to understand its properties

Page 6: Integrative information management for systems biology

Bottom-up systems biology

• In common practice, model construction is a manual process, in which a modeler associates a model with experimental data for simulation

• Such an approach can give rise to good quality models, but is more a cottage industry than a highly scalable production process

• Can this be automated?

Page 7: Integrative information management for systems biology

Automation of the process

• Experimental data is captured from instruments, and subject to primary analyses

• Experimental data and results of the primary analyses are archived in experimental data repositories

• The information required for modeling is extracted from the experimental data resources and stored in a Key Results Database (KRDB)

• A workflow obtains qualitative model information, represented using SBML, parameterizes this model with results in the KRDB, and analyses/simulates the resulting quantitative model

Page 8: Integrative information management for systems biology

[Diagram: data flow into the SBML model. Quantitative proteomics (PRIDE XML), quantitative metabolomics (MeMo) and enzyme kinetics (SABIO-RK) are accessed through web services and the KRDB. They supply the model's parameters (KM, kcat) and variables (metabolite and protein concentrations).]

Page 9: Integrative information management for systems biology

From instrument to result

• Raw data typically needs to be analysed before use

• Experimental data is often managed in an ad hoc way

• Experimentalists are not keen to spend time on data curation for archiving or sharing

• Try to capture necessary metadata as part of primary data analysis

Page 10: Integrative information management for systems biology

Requirements

• The experimental techniques share requirements:
– perform analyses on the raw experimental data to derive the secondary quantitative parameters required in the model
– store the raw experimental data along with relevant metadata and the derived parameters, thus providing the facility to trace back and reanalyze raw data should this be required

• Where possible, existing data standards and tools are reused, although in practice data standards tend to lag behind technique development, and tools tend to lag behind standards

Page 11: Integrative information management for systems biology

Data capture

• Software wizards have been developed that step experimentalists through the analysis of primary data
– QconCAT PrideWizard for proteomics
– KineticsWizard for enzyme kinetics

• Metadata collected along the way, as unobtrusively as possible
– Heavily reliant on database web services

Page 12: Integrative information management for systems biology

KineticsWizard

Page 13: Integrative information management for systems biology

QconCAT PrideWizard

Page 14: Integrative information management for systems biology

QconCAT PrideWizard

[Diagram: QconCAT PrideWizard workflow. Wizard steps: Identify, Quantify, Format, Upload. Components shown: Mascot, mzData, PRIDE Converter, PRIDE XML, eXist database, web / web service, browser, PRIDE.]

Page 15: Integrative information management for systems biology

Web interfaces

Page 16: Integrative information management for systems biology

From instrument to result

• All laboratories carry out primary analyses of experimental data

• All laboratories carry out some form of secondary analyses based on primary results

• Many laboratories struggle to manage the results of these processes in a systematic manner

• We see the key to obtaining manageable results as being to integrate data capture and management with necessary analyses

Page 17: Integrative information management for systems biology

But…

• …MCISB has to manage “only” three types of experiment
– Proteomics, metabolomics, enzyme kinetics

• Informatics team share office with experimentalists and modellers

• We’ve been doing this for years…
– Lots of time, lots of people, lots of resource
– Infrastructure development is part of our remit

Page 18: Integrative information management for systems biology

And…

• …many projects are far more diverse

• Informatics team separated from experimentalists, who are separated from modellers

• Less informatics resource

• Heavyweight approach of MCISB (bespoke tools for each experiment) not always applicable…

Page 19: Integrative information management for systems biology

So…

• …lightweight approach may be more suitable

• Store only secondary data necessary for modelling
– Not raw data

• Key Results Database (KRDB)
– More modeller-focussed

Page 20: Integrative information management for systems biology

Key Results Database

• Who, what, some how and why?
• Measure “something” under “some conditions”
• Measurements are generally a number but may be some other artifact
• Conditions may apply across entire experiment (Static Factors)
• Conditions may change across measurements (Variable Factors)
• Measurements may take place at a certain time
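The description above maps naturally onto a small data model. The following Python sketch is one possible reading of it, not the actual KRDB schema: the class and field names (`Experiment`, `Measurement`, `static_factors`, `variable_factors`) and all example values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class Measurement:
    """One measured "something", usually a number."""
    what: str                                   # what was measured
    value: Union[float, str]                    # a number, or a reference to some other artifact
    unit: Optional[str] = None
    variable_factors: dict = field(default_factory=dict)  # conditions specific to this measurement
    time: Optional[float] = None                # optional time point


@dataclass
class Experiment:
    """Who, what, and the conditions shared by every measurement."""
    who: str
    description: str
    static_factors: dict = field(default_factory=dict)    # conditions for the whole experiment
    measurements: list = field(default_factory=list)

    def conditions_for(self, m: Measurement) -> dict:
        """Merge experiment-wide and per-measurement conditions."""
        return {**self.static_factors, **m.variable_factors}


# Illustrative values only, not real data.
exp = Experiment(
    who="example experimentalist",
    description="metabolite concentration time course",
    static_factors={"organism": "yeast", "temperature_C": 30},
)
exp.measurements.append(
    Measurement(what="glucose", value=1.5, unit="mM",
                variable_factors={"replicate": 1}, time=0.0)
)
print(exp.conditions_for(exp.measurements[0]))
```

Keeping static and variable factors separate means a shared condition is recorded once per experiment rather than repeated on every measurement.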

Page 21: Integrative information management for systems biology

Key Results Database

Page 22: Integrative information management for systems biology

KRDB structure

Page 23: Integrative information management for systems biology

KRDB web interface

Page 24: Integrative information management for systems biology

KRDB web interface

Page 25: Integrative information management for systems biology

Key Results Database

• Deployed in Liverpool, MCISB, UCD

• Easily extensible interface

• eXist “lets it all hang out” as RESTful web services
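As a rough illustration of what RESTful access to eXist looks like, the sketch below builds a query URL in Python. The `_query` and `_howmany` parameters belong to eXist's REST interface. The server address, the collection path and the element and attribute names in the XPath are hypothetical and do not describe the deployed KRDB.

```python
from urllib.parse import urlencode, quote


def exist_query_url(base_url: str, collection: str, xquery: str, how_many: int = 10) -> str:
    """Build a GET URL that runs an XPath/XQuery against an eXist collection."""
    path = quote(collection.strip("/"))
    params = urlencode({"_query": xquery, "_howmany": how_many})
    return f"{base_url.rstrip('/')}/{path}?{params}"


url = exist_query_url(
    "http://localhost:8080/exist/rest",   # hypothetical server
    "/db/krdb",                            # hypothetical collection
    "//measurement[@what='glucose']",      # hypothetical element and attribute names
)
print(url)
```

Because the result is an ordinary URL, any HTTP client or workflow tool can retrieve the matching XML without a database driver.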

Page 26: Integrative information management for systems biology

Modelling infrastructure

Page 27: Integrative information management for systems biology

Taverna

http://taverna.sourceforge.net

Page 29: Integrative information management for systems biology

Modelling life-cycle workflows

Page 30: Integrative information management for systems biology

Qualitative model construction

Input: list of ORFs

1. Get reaction info

2. Create compartments

3. Create species

4. Create reactions

Output: SBML file
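The four steps can be sketched in Python using only the standard library. This is not the Taverna workflow itself. The lookup table `REACTIONS_BY_ORF`, the ORF identifiers and the species names are hypothetical stand-ins for the database query in step 1, and the output is a minimal SBML Level 2 Version 4 skeleton that has not been validated.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version4"

# Hypothetical lookup table standing in for the reaction database queried in step 1.
REACTIONS_BY_ORF = {
    "ORF_A": {"id": "R1", "compartment": "cytoplasm",
              "reactants": ["S1"], "products": ["S2"]},
    "ORF_B": {"id": "R2", "compartment": "cytoplasm",
              "reactants": ["S2"], "products": ["S3"]},
}


def build_qualitative_model(orfs):
    # 1. Get reaction info for each ORF (unknown ORFs are skipped).
    reactions = [REACTIONS_BY_ORF[o] for o in orfs if o in REACTIONS_BY_ORF]

    sbml = ET.Element("sbml", {"xmlns": SBML_NS, "level": "2", "version": "4"})
    model = ET.SubElement(sbml, "model", {"id": "qualitative_model"})

    # 2. Create one compartment per distinct location.
    compartments = ET.SubElement(model, "listOfCompartments")
    for c in sorted({r["compartment"] for r in reactions}):
        ET.SubElement(compartments, "compartment", {"id": c})

    # 3. Create each species once, in order of first appearance.
    species = ET.SubElement(model, "listOfSpecies")
    seen = set()
    for r in reactions:
        for s in r["reactants"] + r["products"]:
            if s not in seen:
                seen.add(s)
                ET.SubElement(species, "species",
                              {"id": s, "compartment": r["compartment"]})

    # 4. Create reactions that reference those species.
    rxns = ET.SubElement(model, "listOfReactions")
    for r in reactions:
        rxn = ET.SubElement(rxns, "reaction", {"id": r["id"]})
        for tag, ids in (("listOfReactants", r["reactants"]),
                         ("listOfProducts", r["products"])):
            lst = ET.SubElement(rxn, tag)
            for s in ids:
                ET.SubElement(lst, "speciesReference", {"species": s})

    return ET.tostring(sbml, encoding="unicode")


sbml_text = build_qualitative_model(["ORF_A", "ORF_B"])
print(sbml_text)
```

The result carries no rate laws or concentrations, which is what makes it a qualitative model. Those are added in the parameterisation stage.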

Page 31: Integrative information management for systems biology

Qualitative model construction

Page 32: Integrative information management for systems biology

Qual to quan: parameterisation

• Data requirements
– Qualitative SBML model
– Starting concentrations for enzymes and source metabolites (from the Key Results Database)
– Enzyme kinetics data (from the SABIO-RK database web service)
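A minimal Python sketch of how these three inputs might be combined is given below. It assumes irreversible Michaelis-Menten kinetics (the slides name only KM and kcat), and the function names, dictionary layouts and numbers are illustrative assumptions rather than the actual workflow.

```python
def parameterise(reactions, kinetics, concentrations):
    """Attach kinetic parameters and starting concentrations to each reaction.

    reactions:      {reaction_id: {"enzyme": ..., "substrate": ...}} from the qualitative model
    kinetics:       {reaction_id: {"kcat": ..., "KM": ...}}, e.g. as retrieved from SABIO-RK
    concentrations: {species_id: value}, e.g. as retrieved from the KRDB
    """
    model, missing = {}, []
    for rid, rxn in reactions.items():
        needed = [rxn["enzyme"], rxn["substrate"]]
        if rid not in kinetics or any(s not in concentrations for s in needed):
            missing.append(rid)   # report gaps rather than guessing values
            continue
        model[rid] = {**rxn, **kinetics[rid],
                      "E0": concentrations[rxn["enzyme"]],
                      "S0": concentrations[rxn["substrate"]]}
    return model, missing


def initial_rate(p):
    """Irreversible Michaelis-Menten rate: v = kcat * E0 * S0 / (KM + S0)."""
    return p["kcat"] * p["E0"] * p["S0"] / (p["KM"] + p["S0"])


# Illustrative numbers only.
reactions = {"R1": {"enzyme": "E1", "substrate": "S1"},
             "R2": {"enzyme": "E2", "substrate": "S2"}}
kinetics = {"R1": {"kcat": 10.0, "KM": 0.5}}
concentrations = {"E1": 0.25, "S1": 0.5, "E2": 0.1, "S2": 1.0}

model, missing = parameterise(reactions, kinetics, concentrations)
print(initial_rate(model["R1"]), missing)
```

Returning the list of reactions with missing data makes the gaps in the experimental record explicit, which matters when parameterisation is automated rather than done by hand.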

Page 33: Integrative information management for systems biology

Qual to quan: parameterisation

Page 34: Integrative information management for systems biology

Model parameterisation

Page 35: Integrative information management for systems biology

Model calibration

• Optional modification of parameters in reaction kinetics until the output of the model produces results similar to those obtained from experimentation

• Data requirements
– Parameterised SBML model
– Experimental data: metabolite concentrations from KRDB

• Calibration by COPASI web service
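To make the idea of calibration concrete, here is a toy Python example that adjusts a single parameter until model output matches data. It is a stand-in for, not a description of, the COPASI web service. It assumes Michaelis-Menten kinetics with KM held fixed, fits only Vmax, and uses synthetic data generated within the example.

```python
def mm_rate(vmax, km, s):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def calibrate_vmax(substrate, observed, km):
    """Least-squares estimate of Vmax with KM held fixed.

    The model is linear in Vmax, so minimising sum((v - Vmax * x)^2) with
    x = s / (KM + s) gives Vmax = sum(v * x) / sum(x * x).
    """
    xs = [s / (km + s) for s in substrate]
    return sum(v * x for v, x in zip(observed, xs)) / sum(x * x for x in xs)


# Synthetic "experimental" data generated from a known Vmax, for illustration only.
km_true, vmax_true = 0.5, 2.0
substrate = [0.1, 0.5, 1.0, 2.0, 5.0]
observed = [mm_rate(vmax_true, km_true, s) for s in substrate]

vmax_fit = calibrate_vmax(substrate, observed, km_true)
print(round(vmax_fit, 6))
```

Real calibration problems involve several parameters and noisy time-course data, so they need an iterative optimiser rather than a closed-form solution.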

Page 36: Integrative information management for systems biology

COPASI web service

Design and Architecture of Web Services for Simulation of Biochemical Systems. Dada JO, Mendes P. Data Integration in the Life Sciences, Manchester, UK (2009).

Page 37: Integrative information management for systems biology

Model calibration

Page 38: Integrative information management for systems biology

Model simulation

• The running of a parameterized (and calibrated?) model using a specified simulation operation
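A time course is one example of a simulation operation. The Python sketch below integrates a single Michaelis-Menten reaction S -> P with a fixed-step Euler scheme. The function name, parameter values and step size are assumptions chosen for illustration, and a dedicated simulator would use a more robust integrator.

```python
def simulate_time_course(s0, vmax, km, t_end, dt=0.01):
    """Fixed-step Euler integration of S -> P with Michaelis-Menten kinetics."""
    n_steps = int(round(t_end / dt))
    s, p = s0, 0.0
    course = [(0.0, s, p)]
    for i in range(1, n_steps + 1):
        flux = vmax * s / (km + s) * dt
        flux = min(flux, s)   # never consume more substrate than is present
        s -= flux
        p += flux
        course.append((i * dt, s, p))
    return course


course = simulate_time_course(s0=1.0, vmax=2.0, km=0.5, t_end=1.0)
t, s, p = course[-1]
print(f"t={t:.2f} S={s:.4f} P={p:.4f}")
```

Each row of `course` is a (time, substrate, product) triple, the kind of tabular result that a results format then has to represent.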

Page 39: Integrative information management for systems biology

Model simulation

Page 40: Integrative information management for systems biology

SBRML

• Simulation results are data too, and are represented in our case in SBRML
– Systems Biology Results Markup Language
– Developed by Joseph Dada, et al. (Manchester)

• Structured format for representing simulation results
– And experimental data?

• Dada JO, et al. SBRML: a markup language for associating systems biology data with models. Bioinformatics 2010, 26, 932-938.

Page 41: Integrative information management for systems biology

Model simulation

Page 42: Integrative information management for systems biology

Conclusion

• Classically, systems biology has been a cottage industry

• Experimental results are selected for use in modelling in an ad hoc manner

• Modellers develop and refine models using a time-consuming and only partially documented process

Page 43: Integrative information management for systems biology

Conclusion

• Large scale experimentation should lead to more systematic behaviour

• Data integration to support the construction and parameterisation of models

• Large scale computational experimentation to support the comparison of models and their results

Page 44: Integrative information management for systems biology

Thanks…
