![Page 1: Enabling Systems Biology: Development and Implementation of Proteomics Standards … · 2017-08-16 · HUPO Proteomics Standards Initiative •Develop data format standards •Data](https://reader030.vdocuments.mx/reader030/viewer/2022040616/5f106f7d7e708231d44918c7/html5/thumbnails/1.jpg)
Enabling Systems Biology:
Development and Implementation of
Proteomics Standards and Services
Engineering 1850
•Nuts and bolts fit perfectly
together, but only if they
originate from the same
factory
•Standardisation proposal in
1864 by William Sellers
•It was not generally accepted until after WWII,
though …
Proteomics today
•Proteomics results are perfectly
compatible, but only if they are from
the same lab, from the same
software
•Fragmentation of proteomics data
•“Publish and vanish”
Proteomics Data Sharing
Incompleteness of the
public record
•Nucleotide sequences, protein sequences,
macromolecular 3D structures, DNA microarrays:
Database submission mandatory
•Proteomics: No standardised reporting, no standard
database submission
•Proteomics data is generated at a high rate, and lost at
a high rate
•Simple questions like “Give me all tissues in which my
protein of interest was identified” are currently
unanswerable
•Experiments are repeated unnecessarily, and the field
advances more slowly than necessary
The tide is turning, though …
•Bradshaw RA, Burlingame AL, Carr S, Aebersold R.
Reporting protein identification data: the next
generation of guidelines.
Mol Cell Proteomics. 2006 May;5(5):787-8.
•Wilkins et al.
Guidelines for the next 10 years of proteomics.
Proteomics. 2006 Jan;6(1):4-8.
•Nature Biotechnology 2006, Nov:
• Editorial: Standard Operating Procedures
• Burgoon LD. The need for standards, not guidelines, in biological data
reporting and sharing.
• Ball C. Are we stuck in standards?
•Nature Biotechnology: Community Consultation on
Standards: http://www.nature.com/nbt/consult/index.html
Community Consultation
•Nature Biotechnology community consultation
•http://www.nature.com/nbt/consult/index.html
•Currently nine “standards” papers on the NBT website for public
consultation, six of them from PSI
• MIAPE parent
• MIAPE MS
• MIAPE MS Informatics
• MIAPE Gel
• MIAPE MI
• PSI MO
HUPO Proteomics Standards Initiative
•Develop data format standards
•Data representation and annotation
standards
•Involve data producers, database providers,
software producers, publishers
•Open community initiative
PSI deliverables
•Minimum Information about a Proteomics
Experiment (MIAPE)
•XML schema
•Detailed controlled vocabularies
•Support tools
Document process
• Significant investments
into PSI standards require
formal process for PSI
standards
• Process ensures good
balance between expert
design and public scrutiny
• Document process
approved at PSI spring
meeting, San Francisco,
April 2006
• Now in implementation
PSI work groups
•PSI-MI: Molecular Interactions
•PSI-MS: Mass Spectrometry
•PSI-MOD
•Separations
MGED collaboration
•FuGE: Functional Genomics Experiment model
•PSI-MI: Molecular Interactions
•PSI-MS: Mass Spectrometry
•PSI-MOD
•Separations
•MGED: MIAME, MAGE-OM (microarray standard)
PSI work groups: MI
PSI-MI community standard
•Community standard for Molecular Interactions
•Jointly developed by major data providers: BIND, CellZome, DIP, GSK, HPRD, Hybrigenics, IntAct, MINT, MIPS,
Serono, U. Bielefeld, U. Bordeaux, U. Cambridge, and others
•XML schema
•Controlled vocabularies
•Tools
•Minimum requirements (submitted)
•Implemented by major data providers
PSI-MI controlled vocabularies
•PSI develops not only formats, but also
controlled vocabularies/ontologies where
necessary
•Example: more than 20 ways to write the same method: yeast two hybrid, Y2H, 2H, yeast-two-hybrid, two-hybrid, …
•Ca. 800 terms, fully defined and cross-referenced
•GO format
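A minimal sketch of what a controlled vocabulary buys in practice: the free-text spellings listed above are normalised and mapped to one term. The synonym table and the `normalise` and `to_cv_term` helpers are illustrative assumptions, not part of any PSI tool, and the identifier MI:0018 for "two hybrid" should be checked against the current PSI-MI vocabulary release.

```python
import re

# Hypothetical synonym table: every key is a normalised free-text spelling,
# every value is the single controlled-vocabulary term it maps to.
SYNONYMS = {
    "yeast two hybrid": ("MI:0018", "two hybrid"),
    "two hybrid": ("MI:0018", "two hybrid"),
    "y2h": ("MI:0018", "two hybrid"),
    "2h": ("MI:0018", "two hybrid"),
}


def normalise(text):
    """Lower-case, turn hyphens/underscores into spaces, collapse whitespace."""
    text = re.sub(r"[-_]+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def to_cv_term(text):
    """Return (identifier, name) for a known spelling, or None if unmapped."""
    return SYNONYMS.get(normalise(text))


if __name__ == "__main__":
    for spelling in ["yeast two hybrid", "Y2H", "2H", "yeast-two-hybrid", "two-hybrid"]:
        print(spelling, "->", to_cv_term(spelling))
```

Returning `None` for unmapped spellings keeps unknown methods visible to a curator instead of silently guessing a term.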
PSI-MI format development
•Iterative development:
Do the feasible first, leave the unfeasible for later
•Version 1.0 published in February 2004
• The HUPO PSI Molecular Interaction Format - A community standard for the
representation of protein interaction data.
Henning Hermjakob et al.,
Nature Biotechnology 2004, 22, 176-183.
•Version 2.5 released December 2005
• Technical improvements
• Quantitative parameters
• Additional interactor types: DNA, RNA, small molecules
• Additional, simplified tabular format
• Submitted
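As a rough illustration of why a simplified tabular format is convenient, the sketch below reads tab-delimited interaction rows into dictionaries. The 15-column layout is an assumption modelled on the MITAB 2.5 layout and is not specified on these slides; the column keys, the `parse_tabular` helper, and the example row (including its identifiers) are made up for illustration.

```python
import csv
import io

# Assumed column order of the tab-delimited format (15 columns).
COLUMNS = [
    "id_a", "id_b", "alt_id_a", "alt_id_b", "alias_a", "alias_b",
    "detection_method", "first_author", "publication", "taxid_a", "taxid_b",
    "interaction_type", "source_db", "interaction_id", "confidence",
]

# Made-up example row; "-" marks an empty field.
EXAMPLE = "\t".join([
    "db:A1", "db:B2", "-", "-", "-", "-",
    "two hybrid", "Doe J (2006)", "pubmed:0000000", "taxid:9606", "taxid:9606",
    "physical interaction", "exampledb", "exampledb:1", "-",
])


def parse_tabular(text):
    """Parse tab-delimited interaction rows into dicts keyed by COLUMNS."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment/header lines
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


if __name__ == "__main__":
    for row in parse_tabular(EXAMPLE):
        print(row["id_a"], row["id_b"], row["detection_method"])
```

One interaction per line means standard command-line and spreadsheet tools can filter the data without an XML parser.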
PSI-MI Support
•Data: DIP, HPRD, IntAct, MINT, MIPS, …
•Tools
•Conversion Tabular – PSI XML
•XML -> HTML
•Semantic validation
•Visualisation
•PimWalker ®: http://pim.hybrigenics.com/pimwalker
•ProViz: http://cbi.labri.fr/eng/proviz.htm
•Cytoscape: http://www.cytoscape.org
IntAct as an implementation of PSI MI
•Curated molecular interaction database
•128,000 binary interactions
•Open source
•Open data
•http://www.ebi.ac.uk/intact
IntAct curation
•Detailed, “deep” curation
•Based on full text papers
•Experimental conditions
•Detailed interactor identification
•Use of detailed controlled vocabularies
•Annotation of binding domains, protein modifications, etc.
The IMEx consortium
•International Molecular-Interaction Exchange
consortium
•DIP, IntAct, MINT, MIPS
are establishing an exchange of curated literature data
in PSI-MI format from summer 2006 onwards to
provide a network of stable, comprehensive resources
for molecular interaction data
•Aims:
•Consistent body of public data
•Avoid redundant curation
•http://imex.sf.net
IMEx data deposition
•Deposition of published data in one of the IMEx
databases is strongly encouraged
•Any dataset submitted to one of the IMEx databases
will be replicated to the other IMEx databases
•IMEx partners are already co-ordinating their curation
efforts
•Public guidelines: Orchard et al. The Minimum Information on a Molecular Interaction
Experiment (MIMIx).
Nature Biotechnology, accepted.
PSI work groups: MS
Mass spectrometry: PSI-MS
•mzData format as common instrument output format
•Format beta version accepted in Nice, April 2004
•EBI workshop July 2004
•Version 1.05 released January 4, 2005
•Next revision spring 2007, in collaboration with the Institute
for Systems Biology (ISB), merging mzData and mzXML
•Controlled vocabularies developed jointly with ASTM
•Key concept:
Request direct vendor support to avoid version problems due
to vendor API changes
•Move to mzML (merge of mzData and mzXML)
Current mzData support
•Applied Biosystems
•Bruker
•EBI
•GeneBio
• Insilicos
•Kratos
•MatrixScience
•Swiss Institute of Bioinformatics
•GPM
•Thermo Electron
•Waters
Mass spectrometry: PSI-MS
•analysisXML format as common search
engine output format
•Suggested in Nice, April 2004
•Further developed in Siena, April 2005
•Aim: Facilitate comparison and archiving of search
engine output, in particular in comparative projects
like the HUPO PPP
•Beta release under internal review
PSI-MS based data flow
[Diagram: mass spectrometer A and mass spectrometer B -> proprietary format -> converter -> mzData -> search engine A and search engine B -> analysisXML -> public repository]
PRIDE – Protein Identification Database
•Turns publicly available data into publicly accessible
data
•Protein identifications
•Experimental detail
•Peak lists
•Linkout to raw data
•Fully open source
•Fully open data
•Implementation of PSI standards as they are released
Data views
Experiment Comparison
PRIDE private mode
[Diagram: Lab A, Lab B, Lab C, and a Reviewer around private “Collaboration” data, next to publicly available data]
•Private mode allows data
analysis within a
collaboration
•PRIDE tools are already
accessible in private mode, in
particular experiment
comparison (alpha)
•On manuscript submission,
reviewers can access the data
in standard format
•On manuscript publication,
the data becomes public
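The three access steps above (collaboration only, reviewers on submission, everyone on publication) can be sketched as a small rule function. This is a hypothetical model for illustration only: the `Stage` enum, the role strings, and `can_access` are assumptions and do not describe how PRIDE implements access control.

```python
from enum import Enum


class Stage(Enum):
    PRIVATE = 1    # data shared within a collaboration only
    SUBMITTED = 2  # manuscript submitted, reviewers get access
    PUBLISHED = 3  # manuscript published, data is public


def can_access(role, stage, is_collaborator=False):
    """Hypothetical access rule mirroring the steps listed on the slide."""
    if stage is Stage.PUBLISHED:
        return True  # public data is open to everyone
    if is_collaborator:
        return True  # collaboration members see the data at every stage
    if role == "reviewer":
        return stage is Stage.SUBMITTED  # reviewers only after submission
    return False


if __name__ == "__main__":
    print(can_access("reviewer", Stage.PRIVATE))    # False
    print(can_access("reviewer", Stage.SUBMITTED))  # True
    print(can_access("public", Stage.PUBLISHED))    # True
```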
Data entry
•Register
•XML-based data deposition
• Target group: Larger labs with good bioinformatics support, large scale
data sets
•Generate PRIDE XML directly
•Supporting toolkit currently under development
• Fully automated, web-based submission
•Excel-based
• Target group: Smaller labs, low to medium throughput
• “Biologists love Excel”
•Advanced Excel spreadsheet will allow user input in “familiar” Excel
environment
•Spreadsheet supports use of controlled vocabularies and validation
•Automatic submission directly from the spreadsheet into PRIDE
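A minimal sketch of spreadsheet-style validation against controlled vocabularies, assuming each spreadsheet row has already been read into a dictionary. The column names, the allowed values in `ALLOWED`, and the `validate_rows` helper are hypothetical and are not taken from the PRIDE spreadsheet.

```python
# Hypothetical controlled vocabularies: column name -> allowed values.
ALLOWED = {
    "species": {"Homo sapiens", "Mus musculus"},
    "instrument": {"MALDI-TOF", "ion trap"},
}


def validate_rows(rows):
    """Return a list of (row_number, column, message) for every problem found."""
    problems = []
    for number, row in enumerate(rows, start=1):
        for column, allowed in ALLOWED.items():
            value = (row.get(column) or "").strip()
            if not value:
                problems.append((number, column, "missing value"))
            elif value not in allowed:
                problems.append((number, column, f"'{value}' is not a controlled term"))
    return problems


if __name__ == "__main__":
    rows = [
        {"species": "Homo sapiens", "instrument": "ion trap"},
        {"species": "human", "instrument": ""},
    ]
    for problem in validate_rows(rows):
        print(problem)
```

Reporting every problem with its row number, rather than stopping at the first error, matches how a submitter would fix a spreadsheet in one pass.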
Medium term vision
•Collaborate with regional or project centers for data collection and
analysis
•Establish data exchange and collaboration between PeptideAtlas,
GPMDB, PRIDE, PRIDE@NPC, …
•Provide a set of compatible, synchronized, public resources for protein
identification data
[Diagram: Regional Centers, PRIDE, PeptideAtlas, HUPO xPP, PRIDE@NPC]
Acknowledgements
•All PSI participants
• Luisa Montecchi-Palazzi
• Sandra Orchard
• Chris Taylor
• Randy Julian (Lilly)
• Patrick Pedrioli (ISB, ETH)
•PRIDE
• Phil Jones
• Lennart Martens
• Richard Cote
• Sebastian Klie
•BBSRC ISPIDER Grant
•BBSRC ProteomeHarvest Grant
•EU ProDaC grant
•Henning Hermjakob
•http://www.psidev.info
Resources
•http://psidev.sf.net
•http://imex.sf.net
•http://www.ebi.ac.uk/intact
•http://www.ebi.ac.uk/pride
•http://www.nature.com/nbt/consult/index.html