data management and protein production – from lab notebooks to lims robert esnouf

40
Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf Oxford Protein Production Facility, University of Oxford… …and the PIMS development team EBI, 23/9/2008

Upload: cooper

Post on 04-Feb-2016

23 views

Category:

Documents


0 download

DESCRIPTION

Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf Oxford Protein Production Facility, University of Oxford… …and the PIMS development team. EBI, 23/9/2008. Introductory presentation. What is information management? What types of laboratory process are there? - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Data Management and Protein Production – from

Lab Notebooks to LIMSRobert Esnouf

Oxford Protein Production Facility,University of Oxford…

…and the PIMS development team

EBI, 23/9/2008

Page 2: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Introductory presentation■ What is information management?■ What types of laboratory process are there?■ What information should be recorded?■ Where can LIMS benefit protein scientists?■ What are the potential pitfalls?

■ Introduction to PIMS concepts

■ Linking data upstream and downstream Target selection and crystallization

Page 3: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

What is information management?

■ A way of storing information so that it can be retrieved later:■ Mechanism varies from human memory

(surprisingly common) to sophisticated relational database systems

■ Purpose of retrieval can include supporting the next experiment, sharing data with collaborators, publication and depositing data

■ In a laboratory setting information management is a branch of bioinformatics

■ Automated experiments may require electronic information management

Page 4: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Types of information management

■ Paper-based records■ Well suited to independent research■ Long-term archive

■ Electronic Laboratory Notebooks (ELN)■ Central repository of information■ Electronic version of paper systems

■ Laboratory Information Management Systems (LIMS)■ Relational database■ Model for laboratory processes■ Snapshot of current state of laboratory

Page 5: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

What information could be recorded?

■ Progress (target) tracking■ Records status of a project■ Progress through a (usually) linear workflow

■ Sample tracking■ Where is a sample (which fridge, freezer etc.)?■ Exchange of samples between labs

■ User auditing■ Who did what and when?■ Who authorized/checked work?

■ What is a sample?■ What experiments have been done?

Page 6: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

What is a sample?

■ Is it a reagent?■ Which batch number was it?■ Does it have use-by date?■ How much is left?

■ What target does it “belong” to?■ Does it belong to multiple targets (a complex)?

■ What is in the sample?■ Is it a single entity?■ Is it a mixture or a complex?■ Record a full chemical description or how to

recreate sample?

Page 7: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

What is an experiment?

■ Is it part of a fixed workflow?■ Suited to repetitive, well defined work■ Maps directly onto progress tracking■ Workflow should be relatively invariant

■ Is it an isolated process?■ Take input sample(s)■ Work on them■ Record results■ Produce output sample(s)■ No concept of a workflow

Page 8: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

(Electronic) laboratory notebooks

■ A store for unformatted information■ Easy to enter data■ Fairly easy for single user to retrieve data■ Difficult for others to retrieve data■ Difficult to search for data■ Difficult to share non-electronic data

■ Most data organized chronologically■ Single projects can become fragmented■ Data associated temporally■ Non-electronic data can prove date of

discovery

Page 9: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Blurring ELN and LIMS

■ Linking samples and targets■ Sharing data■ Controlled vocabulary

■ E. coli not Escherichia coli■ Sodium chloride not NaCl

■ Fixed descriptions of protocols

■ Moving towards the LIMS relational database■ Model for laboratory workflows and processes

Page 10: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Benefits of LIMS

■ Distributed projects■ Information can be accessed anywhere

■ Collaborative projects■ Different people record into same store

■ Miniaturized projects■ Labelling of samples becomes impossible

■ Automated projects■ Handling layouts in plates etc.

■ High-throughput processes■ System managed by computer with automated

sample tracking

Page 11: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Potential pitfalls of LIMS■ Data loss

■ Hardware failure – managable■ Data corruption – potentially catastrophic

■ Data integrity■ Data need to be described properly■ LIMS can default to being ELN

■ Extra burden of recording data■ Takes time for no immediate benefit■ Need easy and intuitive input – risk of sloppiness

■ Compliance■ Unrecorded data are lost■ Incomplete data may break data “chain”

Page 12: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

■ Example 96 constructs through PCR, Gateway cloning & expression screening with 2 cell lines & 2 protocols:

■ Uses 34 96-well plates and 36 24-well plates and generates 480 images of colony wells, 1536 lanes on agarose gels and 416 lanes on SDS–PAGE gels

Recording Gateway cloning protocols

Page 13: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

■ Example 96 constructs through PCR, Gateway cloning & expression screening with 2 cell lines & 2 protocols:

■ Uses 34 96-well plates and 36 24-well plates and generates 480 images of colony wells, 1536 lanes on agarose gels and 416 lanes on SDS–PAGE gels

Recording Gateway cloning protocols

Difficult to decide which data

need to be recorded…

Burden of recording unnecessary

data can be fatal!

Page 14: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

What is PIMS?■ BBSRC SPoRT funded two consortia:

■ Scottish Structural Proteomics Facility (SSPF)■ Membrane Protein Structure Initiative (MPSI)

■ PIMS is funded to develop a laboratory information management system (LIMS):■ Funded by the BBSRC SPoRT initiative■ Funding Jan 2005 – Dec 2009■ Supports SSPF, MPSI, OPPF & YSBL■ Developers in Daresbury, EBI, OPPF & YSBL■ Support from OPPF, Dundee & Daresbury

■ http://www.pims-lims.org/

Page 15: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Why develop PIMS?■ Longstanding need for rational data

management for protein production■ Complex, ever-changing workflow■ To exploit higher throughput■ To aid collaboration and make data public

■ Academic LIMS (and industrial?)■ LISA, HalX, SESAME, MOLE, (Beehive)■ Specific to one site, hard to maintain

■ PIMS is a collaborative effort to find a common solution■ Most laboratories have some similar processes■ All have some unique processes■ PIMS is fully featured LIMS, not target tracking

Page 16: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Some ancient history■ Starts with 2001 Airlie House agreement

■ To share protein production data■ TargetDB is limited implementation■ Detailed specification of terms

■ European-based projects■ eHTPX: data exchange models (dictionaries)■ HAL & HALX: LIMS concentrating on workflow■ MOLE: LIMS built on generic data model■ SPINE encouraged collaboration

■ Loose PIMS consortium formed to seek a common solution■ BBSRC SPoRT provided funding opportunity

Page 17: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Technologies used

■ PIMS is used from a web browser■ Mozilla Firefox or Internet Explorer■ No client software to install (perhaps plugins)■ Windows, Macintosh and Linux clients

■ PIMS requires a web and database server■ Typically the same machine■ Web server Apache Tomcat■ Development on free PostgreSQL■ Now available for Oracle■ Windows and Linux servers

■ Technologies used by developers■ Java1.5, Hibernate, JUnit, BioJava, dot, batik,

AJAX, ...

Page 18: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf
Page 19: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

PIMS will be properly introduced

by the other speakers…

Page 20: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Aspects of application developmentCrystallographic

applications

Graphicsapplications

LIMSRobotics

Page 21: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Basic concepts of PIMS

PIMS uses a few simple key concepts which can be linked together to model complex workflows

Targets■ Description of sequences, store annotations

Constructs■ Starting points for real experiments, link to targets

Samples■ Tracked samples made & used by experiments■ Samples have types, owners, locations etc.

Experiments■ Take one (or more samples), produce new sample(s) as outputs

Page 22: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Experiments and protocols

A protocol is a reusable user-defined template describing what you record for your experiments.

Parameters■ Numerical values, free text values, T/F. E.g. incubation temperature or the number of PCR cycles; details of incubation conditions; was reagent added?

Input Samples■ Samples or reagents used when performing an experiment that you wish to track

Output Samples■ Samples or reagents produced when performing an experiment that you wish to track

Page 23: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Typing of PIMS items

Typing helps PIMS offer sensible choices: only a plasmid can be used for transfection experiments…

Samples■ Typed to show what they are

Input/Output samples for protocols■ State what type of sample can be used and what is produced

Experiments and protocols■ An experiment type is defined by its protocol. A protocol type links similar protocols together

Page 24: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Experiments & samples → WorkflowsSample A

Expt 1

Sample B

Expt 2

Sample C

Expt 3

Sample D

Expt 4

Sample E2Sample E1

Page 25: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

The PIMS holder (plate experiments)

A holder groups samples. This allows PIMS to perform plate experiments in groups

Samples■ For plate experiments output samples of previous experiment are mapped to input samples of next. (Provided sample type matches!)

User interface for plate experiments■ Gives graphical and spreadsheet views. Allows editing, reformatting and spreadsheet upload

Page 26: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Can attach

files to

samples,

experiments

& targets

Page 27: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Basic protocols used at OPPFPCR

PCR Clean Up

InFusion

Transformation

Plasmid Prep

Trial Expression

Verification

Sequencing

Scale Up

Lysis

Purification

Concentration

Page 28: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

A workflow derived from PIMS

Page 29: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Experiments can read/write data

Page 30: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

How was a sample produced?

Page 31: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf
Page 32: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

PDF reports of how samples are made asa permanent record

Export to externaltarget tracker software

Page 33: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Target annotation at the OPPF

Albeck et al. (2006) Acta Cryst. D62, 1184

■ Used at OPPF and externally by YSBL & SSPF■ OPAL freely available over the web■ OPTIC DB managing annotations for 12226 targets

Page 34: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Example: predicting crystallizability

■ Simple to understand, calculated in advance■ Helps to set priorities for lab work

Page 35: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Example: improving construct design

Page 36: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

4°C, 1000 plates

21°C, 10000 platesOPPF crystallization facility robots

Page 37: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Automation of crystallization trials

■ Live since Jan 2002■ 22570 plates■ 51845521 images■ >28TB images■ 351 registered users

for OPPF site■ 100 OPPF/STRUBI■ 118 Other Oxford■ 50 Other UK■ 83 Elsewhere

Mayo et al. (2005) Structure 13, 175

Page 38: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

Controlled web-based access from anywhere, e.g. synchrotrons

OPPF crystallization management

Page 39: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

xtalPIMS managing high-throughput crystallization trials■ Fusion of PIMS, Vault and eHTPX work

■ Can support multiple imaging and storage robots

Page 40: Data Management and Protein Production – from Lab Notebooks to LIMS Robert Esnouf

AcknowledgmentsPIMS PIs■ Kim Henrick, Dave Stuart, Keith Wilson, Richard

Blake, Jim Naismith, Neil IsaacsPIMS supported■ Anne Pajon, Ed Daniel, Marc Savitsky, Susy Griffiths,

Katya PilichevaCCP4 supported■ Chris Morris, Bill LinOther projects■ OPPF: Jon Diprose■ MPSI & SSPF: Petr Troshin, Jo van Niekerk■ xtalPIMS: Ian Berry, Gael Seroul, Diederick De Vries