Data Integration, Mass Spectrometry Proteomics Software Development
![Page 1: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/1.jpg)
![Page 2: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/2.jpg)
Overview
• Quantitative proteomics
• Data integration in kinetic modelling in systems biology
![Page 3: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/3.jpg)
A typical proteomics experiment
• Various routes through this map
– Separating by size or charge in most cases
– Identify peptides as a proxy for proteins, comparing theoretical to experimental spectra
![Page 4: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/4.jpg)
Quantitative proteomics
• Approach described is qualitative
– Peptides / proteins identified but not quantified
• Mass spectrometry is not quantitative per se
– Different compounds have different physicochemical properties
– May ionise differently, more / less readily
• Therefore peak intensities cannot be compared between two different compounds
– Applies to peptides / proteins
![Page 5: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/5.jpg)
Quantitative proteomics
• BUT peak intensities can be compared between compounds sharing the same physicochemical properties
– Isotopes
• Same physicochemical properties
• Different molecular masses (ΔM = 6 Da)
![Page 7: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/7.jpg)
Quantitative proteomics
• Can apply the same principle for peptides:
– IDVAVDSTGVFK
– IDVAVDSTGVFK*
• Lysine (K) residue is labelled with ¹³C
– Same physicochemical properties
– Different molecular masses (ΔM = 6 Da)
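A mass spectrometer reports m/z rather than mass, so the 6 Da difference shows up as a smaller gap for multiply charged ions. The sketch below is illustrative only: the function name is made up here and the nominal value of 6 Da is taken from the slide. It is consistent with the peaks at m/z 516.28 and 519.29 in the spectrum elsewhere in these slides, which sit about 3 m/z apart, as expected for a doubly charged peptide.

```python
def mz_shift(delta_mass_da: float, charge: int) -> float:
    """Return the m/z gap between the light and heavy form of a peptide ion."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return delta_mass_da / charge


DELTA_M = 6.0  # Da, nominal mass difference quoted on the slide

for z in (1, 2, 3):
    print(f"z={z}: heavy peak expected about {mz_shift(DELTA_M, z):.2f} m/z above light")
```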
![Page 8: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/8.jpg)
Quantitative proteomics
• Absolute quantitative proteomics requires isotopically-labelled peptide of known concentration spiked into sample
• Labelled and unlabelled forms of a peptide behave consistently
– Comparable peak intensity, comparable retention time
• Ratio of labelled over non-labelled peptide can be used to determine absolute concentration of sample peptide
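The arithmetic behind the last bullet can be sketched as follows. This is a minimal illustration, not the pipeline's code: the function name, the units and the numbers are assumptions chosen for the example.

```python
def sample_amount(heavy_spiked_amount: float, heavy_to_light_ratio: float) -> float:
    """Estimate the amount of unlabelled (sample) peptide.

    heavy_spiked_amount  -- known amount of labelled peptide added (e.g. fmol)
    heavy_to_light_ratio -- measured ratio of labelled over non-labelled signal
    """
    if heavy_to_light_ratio <= 0:
        raise ValueError("ratio must be positive")
    # Both forms respond identically, so signal ratio equals amount ratio:
    # heavy / light = ratio  =>  light = heavy / ratio
    return heavy_spiked_amount / heavy_to_light_ratio


# Illustrative numbers only: 100 fmol of labelled peptide, measured H:L ratio of 2.0
print(sample_amount(100.0, 2.0))
```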
![Page 9: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/9.jpg)
Quantitative proteomics: QconCAT
[Figure: expected and observed ratio areas, 3 peptides. x-axis: proportion of light present (0 to 1); y-axis: area ratio H/(H+L) (0 to 1). Series: expected L/(L+H) with linear fit; area L/(L+H) for peptides 1, 2 and 3.]
[Figure: mass spectrum, DBKtest07 #1073, RT: 15.60, FTMS + c ESI Full ms [300.00-2000.00]. x-axis: m/z (516 to 524); y-axis: relative abundance. Main peaks at m/z 516.28, 516.78, 517.28 and 519.29, 519.79, 520.29. Mixture 40:60.]
Data: Kathleen Carroll (Orbitrap MS)
![Page 10: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/10.jpg)
Quantitative proteomics: QconCAT
• Requirements:
– Determine absolute protein concentrations under a given cellular condition
– Quantify a number (~50) of proteins simultaneously
• Apply QconCAT methodology
– Allows simultaneous introduction of many labelled peptides into sample
– Multiplexed absolute quantification for proteomics using concatenated signature peptides encoded by QconCAT genes. Pratt JM, et al. Nat Protoc. 2006, 1:1029-43.
![Page 11: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/11.jpg)
Quantitative proteomics: QconCAT
• Construct an artificial protein containing many peptides
– At least one from each protein of interest
– Ensure that the artificial protein is isotopically-labelled
![Page 12: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/12.jpg)
Quantitative proteomics: QconCAT
• Numerous absolute protein quantitations can be performed simultaneously
![Page 13: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/13.jpg)
…from instrument to browser
• From a QconCAT informatics perspective, there are three steps…
1. Selection of QconCAT peptides
2. Analysis and submission of data
3. Browsing / querying
![Page 14: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/14.jpg)
Selection of QconCAT peptides
Q. Given a protein, which peptides are suitable candidates for QconCAT peptides?
Must…
• Be unique across organism
• Be detectable (digestible, flyable)
Preferably…
• Be unmodified
![Page 15: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/15.jpg)
QconCAT Selection Wizard
• Takes protein accession numbers as input (and other parameters)
• Provides list of potential QconCAT peptides
– Downloads sequence
– Performs BLAST against species-specific UniProt (tests uniqueness)
– Filters peptides “appropriately”
– Applies score to peptide, using PeptideSieve (predicts flyability)
– Computational prediction of proteotypic peptides for quantitative proteomics. Mallick P, et al. Nat Biotechnol. 2007, 25:125-31.
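The selection steps above can be sketched roughly as below. This is not the wizard's code and rests on several assumptions: tryptic digestion (cleave after K or R, not before P), an exact-match comparison against a toy proteome standing in for the BLAST uniqueness test, and a length / Met / Cys filter standing in for the unspecified "appropriate" filtering. PeptideSieve scoring is not reproduced. The protein sequences are made up around the peptide shown in the slides.

```python
import re


def tryptic_peptides(sequence: str) -> list[str]:
    """In-silico tryptic digest: cleave after K or R unless followed by P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def candidate_peptides(target_id: str, proteome: dict[str, str],
                       min_len: int = 6, max_len: int = 20) -> list[str]:
    """Return peptides of the target protein that pass the toy filters."""
    digests = {pid: set(tryptic_peptides(seq)) for pid, seq in proteome.items()}
    candidates = []
    for pep in tryptic_peptides(proteome[target_id]):
        # Uniqueness: peptide must not occur in any other protein's digest
        unique = all(pep not in peps
                     for pid, peps in digests.items() if pid != target_id)
        # Assumed filter: sensible length, no modification-prone Met or Cys
        acceptable = min_len <= len(pep) <= max_len and not set(pep) & {"M", "C"}
        if unique and acceptable:
            candidates.append(pep)
    return candidates


# Hypothetical two-protein "proteome" for illustration only
TOY_PROTEOME = {
    "P1": "MKIDVAVDSTGVFKAAGR",
    "P2": "AAGRLLEEDSTGVK",
}

print(candidate_peptides("P1", TOY_PROTEOME))
```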
![Page 16: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/16.jpg)
![Page 17: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/17.jpg)
![Page 18: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/18.jpg)
![Page 19: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/19.jpg)
QconCAT…
Multiplexed absolute quantification for proteomics using concatenated signature peptides encoded by QconCAT genes. Pratt JM, et al. Nat Protoc. 2006, 1:1029-43.
![Page 20: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/20.jpg)
QconCAT data analysis
• Identify and quantify peptides / proteins of interest
• Generate results in standard data format
– Facilitates data sharing
– Exploits existing software tools
• PRIDE XML
– PRoteomics IDEntifications
– Community developed standard
– http://www.ebi.ac.uk/pride/
![Page 21: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/21.jpg)
QconCAT data analysis
[Pipeline diagram. Components: mzData, PRIDE Converter, PRIDE XML, Mascot, QconCAT Pride Wizard (Identify, Quantify, Format, Upload), PRIDE XML, eXist database, Web / web service, Browser]
![Page 22: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/22.jpg)
![Page 23: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/23.jpg)
Pride Converter
• PRIDE Converter (EBI) used to extract meta-data
– Who ran the sample, what was the sample, instrument used? etc.
– http://code.google.com/p/pride-converter/
– PRIDE Converter: making proteomics data-sharing easy. Barsnes H, et al. Nat Biotechnol. 2009, 27:598-9.
• Simple wizard allowing experimental data to be marked up with meta-data
![Page 25: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/25.jpg)
![Page 26: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/26.jpg)
![Page 27: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/27.jpg)
![Page 28: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/28.jpg)
QconCAT PrideWizard: Identify
• Goal: to identify heavy-labelled QconCAT peptides
– Uses Mascot
– http://www.matrixscience.com/search_form_select.html
• De facto standard database search engine for identifying peptides / proteins
![Page 29: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/29.jpg)
![Page 30: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/30.jpg)
QconCAT PrideWizard: Identify
• Mascot results are parsed to find labelled QconCAT peptides:
![Page 31: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/31.jpg)
![Page 32: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/32.jpg)
QconCAT PrideWizard: Quantify
• Goal: to quantify heavy-labelled QconCAT peptides
• We now know m/z and retention time of peak identified as a QconCAT peptide
• First step: extract mass chromatogram for both heavy (labelled) and light (unlabelled) peptide
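Extracting a mass chromatogram amounts to summing, scan by scan, the intensity that falls within a small m/z tolerance of the target. The sketch below assumes a made-up in-memory `Scan` type and tolerance; the wizard's actual data structures are not described in the slides. The m/z values 516.28 (light) and 519.29 (heavy) are borrowed from the spectrum in these slides, while the intensities are invented.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    retention_time: float              # minutes
    peaks: list[tuple[float, float]]   # (m/z, intensity) pairs


def extract_chromatogram(scans: list[Scan], target_mz: float,
                         tol: float = 0.01) -> list[tuple[float, float]]:
    """Return (retention time, summed intensity within target_mz +/- tol) per scan."""
    return [
        (s.retention_time,
         sum(i for mz, i in s.peaks if abs(mz - target_mz) <= tol))
        for s in scans
    ]


# Invented intensities for illustration
SCANS = [
    Scan(15.5, [(516.28, 100.0), (519.29, 150.0), (700.00, 5.0)]),
    Scan(15.6, [(516.28, 400.0), (519.29, 600.0)]),
    Scan(15.7, [(516.28, 120.0), (519.29, 180.0)]),
]

light_xic = extract_chromatogram(SCANS, 516.28)
heavy_xic = extract_chromatogram(SCANS, 519.29)
print(light_xic)
print(heavy_xic)
```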
![Page 33: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/33.jpg)
QconCAT PrideWizard: Quantify
• Extracted mass chromatograms
– Heavy and light peptide should overlay as they should have same retention time
![Page 34: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/34.jpg)
QconCAT PrideWizard: Quantify
• Could use peak areas to quantify heavy versus light
– BUT hard (and inaccurate) to determine start and end of the peak
![Page 35: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/35.jpg)
QconCAT PrideWizard: Quantify
• Alternative: extract individual scans showing isotopic clusters for both heavy and light
![Page 36: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/36.jpg)
QconCAT PrideWizard: Quantify
• Apply sliding window and plot heavy versus light:
![Page 37: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/37.jpg)
QconCAT PrideWizard: Quantify
• Final step: apply linear regression to determine heavy:light ratio (and an error):
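One way to read the regression step: pair the heavy and light intensities from each scan across the peak, fit a straight line, and take the slope as the heavy:light ratio with its standard error as the error estimate. The sketch below uses ordinary least squares with an intercept, which is an assumption; the slides do not say which regression variant is used, and the sliding window itself is not implemented here. The intensities are invented.

```python
import math


def heavy_light_ratio(light: list[float], heavy: list[float]) -> tuple[float, float]:
    """Fit heavy = intercept + slope * light; return (slope, standard error of slope)."""
    n = len(light)
    if n != len(heavy) or n < 3:
        raise ValueError("need at least 3 paired intensities")
    mean_x = sum(light) / n
    mean_y = sum(heavy) / n
    sxx = sum((x - mean_x) ** 2 for x in light)
    if sxx == 0:
        raise ValueError("light intensities must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(light, heavy))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(light, heavy))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, std_err


# Invented per-scan intensities across one chromatographic peak
light = [100.0, 400.0, 120.0, 50.0]
heavy = [150.0, 600.0, 180.0, 75.0]
ratio, error = heavy_light_ratio(light, heavy)
print(f"heavy:light = {ratio:.3f} +/- {error:.3f}")
```

Using the slope rather than a ratio of integrated areas avoids having to choose where the peak starts and ends, which is the difficulty noted on the earlier slide.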
![Page 38: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/38.jpg)
![Page 39: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/39.jpg)
![Page 40: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/40.jpg)
MCISB Proteome Database
• Searchable repository of quantitative proteomics data
• Geeky bit…
– eXist native XML database holding PRIDE XML
– JSP front end
– Querying extensible through XQuery
• Web and web-service interface
– Both human and computer-queryable
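To give a feel for what a computer-queryable XML repository allows, the sketch below pulls peptide ratios for one protein out of an XML record. It is a stand-in only: the database described here is queried with XQuery, not Python, and the XML fragment is a simplified, hypothetical record that only loosely resembles PRIDE XML. In particular the `heavy_light_ratio` parameter name is invented for this example.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical record; not a valid PRIDE XML document
DOC = """
<ExperimentCollection>
  <Experiment>
    <Title>Toy QconCAT run</Title>
    <GelFreeIdentification>
      <Accession>P00001</Accession>
      <PeptideItem>
        <Sequence>IDVAVDSTGVFK</Sequence>
        <additional>
          <userParam name="heavy_light_ratio" value="1.5"/>
        </additional>
      </PeptideItem>
    </GelFreeIdentification>
  </Experiment>
</ExperimentCollection>
"""


def peptide_ratios(xml_text: str, accession: str) -> dict[str, float]:
    """Map peptide sequence to stored ratio for one protein accession."""
    root = ET.fromstring(xml_text)
    ratios = {}
    for ident in root.iter("GelFreeIdentification"):
        if ident.findtext("Accession") != accession:
            continue
        for item in ident.iter("PeptideItem"):
            param = item.find("additional/userParam[@name='heavy_light_ratio']")
            if param is not None:
                ratios[item.findtext("Sequence")] = float(param.get("value"))
    return ratios


print(peptide_ratios(DOC, "P00001"))
```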
![Page 41: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/41.jpg)
![Page 42: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/42.jpg)
QconCAT informatics pipeline
• Reference:
• A QconCAT informatics pipeline for the analysis, visualization and sharing of absolute quantitative proteomics data. Swainston N, et al. Proteomics. 2011, 11:329-33.
![Page 43: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/43.jpg)
Data Integration
![Page 44: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/44.jpg)
Systems biology modelling
[Diagram: data sources feeding a Systems Biology Model, each through a web service. Quantitative proteomics (PRIDE XML) and quantitative metabolomics (MeMo) provide variables (metabolite, protein concentrations); enzyme kinetics (SABIO-RK, MeMo-RK) provide parameters (KM, kcat).]
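As an illustration of how the measured variables and kinetic parameters meet in a model, the sketch below evaluates a single reaction rate. The Michaelis-Menten rate law is an assumption made here because KM and kcat are its usual parameters; the slides do not state which rate laws the models use, and the numbers are invented.

```python
def michaelis_menten_rate(enzyme_conc: float, substrate_conc: float,
                          kcat: float, km: float) -> float:
    """Reaction rate v = kcat * [E] * [S] / (KM + [S]).

    enzyme_conc    -- protein concentration, e.g. from quantitative proteomics (mM)
    substrate_conc -- metabolite concentration, e.g. from metabolomics (mM)
    kcat           -- turnover number from enzyme kinetics (1/s)
    km             -- Michaelis constant from enzyme kinetics (mM)
    """
    if km + substrate_conc <= 0:
        raise ValueError("KM + [S] must be positive")
    return kcat * enzyme_conc * substrate_conc / (km + substrate_conc)


# Invented values: [E] = 0.001 mM, [S] = 2 mM, kcat = 100 1/s, KM = 2 mM
print(michaelis_menten_rate(0.001, 2.0, 100.0, 2.0))  # rate in mM/s
```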
![Page 45: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/45.jpg)
![Page 46: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/46.jpg)
![Page 47: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/47.jpg)
![Page 48: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/48.jpg)
Modelling life-cycle workflows
![Page 49: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/49.jpg)
From experiment to simulation
Kinetic models
Experimental data
Systematic integration of experimental data and models in systems biology. Li P, et al. BMC Bioinformatics. 2010, 11:582.
![Page 50: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/50.jpg)
Conclusion
• An informatics pipeline has been developed for analysis of quantitative proteomics data
– Data is associated with metadata, identified, quantified, and uploaded to database
– Community standards have been followed
• Experimental data can be incorporated in systems biology models
– Allows simulations of biological systems to be performed
![Page 51: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/51.jpg)
Thanks…
![Page 52: Data Integration, Mass Spectrometry Proteomics Software Development](https://reader035.vdocuments.mx/reader035/viewer/2022062703/554e9892b4c90573338b5201/html5/thumbnails/52.jpg)