Page 1: ASCQ ME: a new engine for peptide mass fingerprint ...jourdan/publi/ASCQ_ME_scba_2005_FINAL_BOISSON.pdf · 2 COM, Chimie Organique et Macromoléculaire , UMR CNRS 8009. 22/10/05 J.C

ASCQ_ME: a new engine for peptide mass fingerprint directly from mass spectrum without mass list extraction

Jean-Charles BOISSON (1), Laetitia JOURDAN (1), El-Ghazali TALBI (1), Cécile CREN-OLIVE (2) and Christian ROLANDO (2)

Université des Sciences et Technologies de Lille
(1) LIFL, Laboratoire d’Informatique Fondamentale de Lille, UMR 8022
(2) COM, Chimie Organique et Macromoléculaire, UMR CNRS 8009

22/10/05 J.C. Boisson (SCBA 2005) 2

Outline

Current identification methods (MS level).

Characteristics of our approach.

Global scheme.

Digestion algorithm.

Spectrum simulation (Fast Fourier Transform).

Scoring.

The ASCQ_ME application.

Performances.

Conclusions and perspectives.

The protein identification (MS level)

Data: an MS spectrum, i.e. a list of mass/intensity peaks.

Monoisotopic peak extraction phase: performed by a proprietary application. For the most interesting samples, human intervention is still required. It is time-consuming and carries a risk of information loss.

Identification with different scoring methods: Mascot, Sequest, ProteinProspector and, more recently, correlation of the three scoring methods using a metascoring algorithm.

Characteristics of our approach

Direct interrogation from the MS spectrum: suppression of the monoisotopic extraction step; identification by correlating the experimental spectrum with the theoretical one.

Complete algorithm, with no external source code.

Sources and algorithm proofs available (open source).

Global scheme

Protein databases (FASTA format) -> Digestion -> Set of peptides (chemical formula) -> Isotopic distribution computation -> Simulated spectrum

Simulated spectrum + Experimental spectrum -> Scoring

Digestion phase

Development of a linear iterative algorithm: time proportional only to the protein size.

Generic algorithm: no limitations on the configuration parameters (number of missed cleavages, enzyme used, …).

Dynamic grammar: no limitation on the complexity of rules; detection of consensus sequences (fixed or variable); the cleavage may be triggered by the amino acid composition after or before the cleavage site.

Proof of the completeness of the digestion tree available.
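The digestion step can be illustrated with a minimal sketch. It is not the ASCQ_ME algorithm or its grammar: the function names `cleavage_sites` and `digest` are hypothetical, and the rule used (cut after K or R unless the next residue is P, the usual trypsin convention) is an assumption chosen because it shows a cleavage that depends on the residues both before and after the site. Sites are found in one pass over the sequence; peptides are then built from consecutive fragments.

```python
import re

# Assumed trypsin-like rule: cut after K or R, except when followed by P.
TRYPSIN_LIKE = r"(?<=[KR])(?!P)"


def cleavage_sites(protein, pattern=TRYPSIN_LIKE):
    """Return the cut positions found in a single pass over the sequence."""
    return [m.start() for m in re.finditer(pattern, protein)
            if m.start() < len(protein)]  # ignore a cut at the very end


def digest(protein, max_missed=0, pattern=TRYPSIN_LIKE):
    """List every peptide spanning at most max_missed uncut sites."""
    bounds = [0] + cleavage_sites(protein, pattern) + [len(protein)]
    peptides = []
    for i in range(len(bounds) - 1):
        last = min(i + 1 + max_missed, len(bounds) - 1)
        for j in range(i + 1, last + 1):
            peptides.append(protein[bounds[i]:bounds[j]])
    return peptides


print(digest("AKRPGKCR", max_missed=1))
```

The enzyme rule is passed as a regular expression here only for brevity; a richer rule language, such as the dynamic grammar described above, would replace that argument.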

Example of digestion tree

Example of digestion

Bovine cytochrome C:
0 missed cleavages: 21 peptides.
1 missed cleavage: 41 peptides.
2 missed cleavages: 66 peptides.
…
10 missed cleavages: 176 peptides.

Titin (33 500 amino acids):
0 missed cleavages: 4 112 peptides (1 s).
10 missed cleavages: 45 177 peptides (15 s).
20 missed cleavages: 86 142 peptides (45 s).
Maximum missed cleavages: 8 456 328 peptides (2 h).
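These counts can be checked with a short calculation, assuming each peptide is a run of consecutive fragments and that a protein yielding n fragments at 0 missed cleavages has n - m peptides spanning exactly m missed sites. The function name `peptide_count` is hypothetical. Under this assumption the formula reproduces the reported 21, 41 and 176 peptides for cytochrome C and all four titin counts; it gives 60 rather than the reported 66 for 2 missed cleavages, so that row may follow a different rule or contain a typo.

```python
def peptide_count(fragments, max_missed):
    """Number of peptides made of 1 to max_missed + 1 consecutive fragments."""
    max_missed = min(max_missed, fragments - 1)
    return sum(fragments - missed for missed in range(max_missed + 1))


for missed in (0, 1, 2, 10):
    print("cytochrome C", missed, peptide_count(21, missed))
for missed in (0, 10, 20, 4111):
    print("titin", missed, peptide_count(4112, missed))
```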

The simulated spectra generator

Based on the algorithm proposed by A.L. Rockwood (1995) for the computation of isotopic distribution.

Use of Fourier transform.

Spectrum generation from the isotopic distribution of each peptide.

Exact algorithm whatever the number of atoms. No combinatorial explosion.

The mass of the monoisotopic peak is obtained by adding and multiplying atomic masses; the isotopic peak distribution and the corresponding masses come from the algorithm. The appropriate size for the FFT is approximately the size of the original MS spectrum.
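As an illustration of the "addition, multiplication of the atomic mass" step, the sketch below computes a monoisotopic mass from a chemical formula. The names `parse_formula` and `monoisotopic_mass` are hypothetical, the mass table is limited to a few elements with rounded standard values, and the formula parser handles only flat formulas such as `C2H5NO2` (no parentheses or charges).

```python
import re

# Rounded monoisotopic masses (in u) of the lightest isotope of each element.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}


def parse_formula(formula):
    """Turn a flat formula such as 'C2H5NO2' into {'C': 2, 'H': 5, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum of (atom count x mass of the lightest isotope) over all elements."""
    return sum(MONOISOTOPIC_MASS[element] * count
               for element, count in parse_formula(formula).items())


print(round(monoisotopic_mass("C2H5NO2"), 4))  # glycine
```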

Generation of the simulated spectra

Each atom has its basic isotopic distribution (example: Cn, Hm, …).

For each element: passage into the Fourier (frequency) space; multiplication of the element's basic isotopic distribution by itself according to the atom quantity; return to the Euclidean (mass) space; multiplication with the isotopic distribution already computed for the current peptide.
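The steps above can be sketched as follows. This is a simplified illustration in the spirit of the Fourier-transform approach, not the ASCQ_ME implementation: it works on nominal (integer) mass offsets only, uses a naive O(n²) discrete Fourier transform in place of an FFT to stay dependency-free, and the names `dft`, `ISOTOPES` and `isotopic_distribution` as well as the rounded abundance values are assumptions. An n-fold convolution in mass space becomes a power of n in Fourier space, which is why no combinatorial enumeration is needed.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform; a real FFT would be used in practice."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values))
        result.append(total / n if inverse else total)
    return result


# Approximate natural abundances, indexed by nominal mass offset from the
# lightest isotope (assumed, rounded values).
ISOTOPES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "Cl": [0.7576, 0.0, 0.2424],
}


def isotopic_distribution(formula, size=16):
    """formula maps element -> atom count; returns probability per mass offset.

    size must exceed the largest possible offset, otherwise the cyclic
    convolution wraps around.
    """
    spectrum = [1 + 0j] * size  # transform of a single peak at offset 0
    for element, count in formula.items():
        base = ISOTOPES[element]
        padded = base + [0.0] * (size - len(base))
        transformed = dft(padded)
        # Power in Fourier space = repeated convolution in mass space.
        spectrum = [s * t ** count for s, t in zip(spectrum, transformed)]
    return [x.real for x in dft(spectrum, inverse=True)]


print([round(p, 4) for p in isotopic_distribution({"Cl": 2})[:5]])
```

The constraint on `size` mirrors the remark that the FFT size should be chosen relative to the spectrum being simulated.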

A basic example : the Cl2 distribution

Figure: basic isotopic distribution of Cl -> Fourier transform -> multiplication according to the atom quantity (here 2) -> inverse Fourier transform -> Cl2 distribution.
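The Cl2 result can be checked by hand, since squaring in Fourier space is equivalent to convolving the Cl distribution with itself. The abundances below (about 75.76 % for 35Cl and 24.24 % for 37Cl) are rounded standard values and are an assumption, not figures taken from the slides.

```latex
\begin{aligned}
P(^{35}\mathrm{Cl}_2) &= 0.7576^2 \approx 0.5740 && (\text{nominal mass } 70)\\
P(^{35}\mathrm{Cl}\,^{37}\mathrm{Cl}) &= 2 \times 0.7576 \times 0.2424 \approx 0.3673 && (\text{nominal mass } 72)\\
P(^{37}\mathrm{Cl}_2) &= 0.2424^2 \approx 0.0588 && (\text{nominal mass } 74)
\end{aligned}
```

The three probabilities sum to 1, as expected for a complete distribution.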

Scoring

Study of the correlation between the theoretical peptides and the experimental spectrum.

Filtering to know which peptides are useful for the identification.

Visual representation for an easy interpretation of the results.

Scoring : explanations

Based on the correlation of the two spectra, computed as the scalar product of the two intensity vectors.

Correlation computed peptide by peptide, giving a partial score for each peptide.

First version: a “naïve” scoring with a fixed threshold for determining the significant peaks.
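A minimal sketch of such a first-version score is given below. It assumes both spectra are sampled on the same mass axis and that the fixed threshold is applied to the experimental intensities; neither detail is stated in the slides, and the names `partial_score` and `score_protein` are hypothetical.

```python
def partial_score(experimental, simulated_peptide, threshold=0.05):
    """Scalar product restricted to experimental points above the threshold."""
    return sum(e * s for e, s in zip(experimental, simulated_peptide)
               if e >= threshold)


def score_protein(experimental, peptide_spectra, threshold=0.05):
    """Return the total score and the partial score of each peptide."""
    partial = {name: partial_score(experimental, spectrum, threshold)
               for name, spectrum in peptide_spectra.items()}
    return sum(partial.values()), partial


experimental = [0.0, 1.0, 0.5, 0.02, 0.0]
peptides = {
    "peptide_A": [0.0, 0.6, 0.4, 0.0, 0.0],  # overlaps real peaks
    "peptide_B": [0.0, 0.0, 0.0, 1.0, 0.0],  # overlaps only low-level noise
}
total, partial = score_protein(experimental, peptides)
print(total, partial)
```

Keeping the partial scores makes it possible to see which peptides actually contribute to the identification.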

The ASCQ_ME application

Combination of the digestion algorithm and the scoring.

Entirely governed by a text-only configuration file (20 parameters for the first version).

On the web soon: a site for online identification requests, and download of the complete sources and the different libraries composing ASCQ_ME.

Example of configuration file

Results example (1/5) (score 1st version) Full spectrum display

Figure panels: MS spectrum of bovine cytochrome C (MALDI-TOF), simulated spectrum, correlation spectrum.

Results example (2/5) (score 1st version) significant peptide

Figure panels: MS spectrum of bovine cytochrome C (MALDI-TOF), simulated spectrum, correlation spectrum.

Results example (3/5) (score 1st version) peptides mix

Figure panels: MS spectrum of bovine cytochrome C (MALDI-TOF), simulated spectrum, correlation spectrum.

Results example (4/5) (score 1st version) peptide in the noise

Figure panels: MS spectrum of bovine cytochrome C (MALDI-TOF), simulated spectrum, correlation spectrum.

Results example (5/5) (score 1st version) Peptide participation in the scoring

Scoring : second version

Non-significant peptides arise from scoring against the noise in the experimental spectrum.

However, implementing a dynamic threshold is not straightforward, since for noisy spectra the distinction between signal and noise is not clear.

Incorporation of a threshold based on peak shape detection: the intensity variations of the experimental and the calculated spectra must be in a given ratio (1/3 to 3), otherwise the peak score is rejected.
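One possible reading of this shape test is sketched below. The slides do not define "intensity variation" precisely, so the sketch assumes it is the ratio between two consecutive intensity values within a peak window, and that the experimental and calculated variations are compared point by point. The function name `shape_compatible` and its parameters are hypothetical.

```python
def shape_compatible(experimental, simulated, low=1 / 3, high=3.0):
    """Accept a peak window only if both spectra rise and fall at similar rates.

    Assumed interpretation: the variation is the ratio of consecutive
    intensities, and the experimental/simulated ratio of variations must
    stay within [low, high] at every step.
    """
    if len(experimental) != len(simulated):
        raise ValueError("both windows must have the same length")
    if any(v <= 0 for v in list(experimental) + list(simulated)):
        return False  # a variation is undefined on empty points
    for i in range(len(experimental) - 1):
        exp_variation = experimental[i + 1] / experimental[i]
        sim_variation = simulated[i + 1] / simulated[i]
        if not low <= exp_variation / sim_variation <= high:
            return False
    return True


print(shape_compatible([10, 50, 20], [1, 6, 2]))  # similar peak shape
print(shape_compatible([10, 11, 12], [1, 6, 2]))  # flat noise vs. sharp peak
```

Because only ratios are compared, the test does not depend on the absolute intensity scale of either spectrum.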

Results example (1/4) (score 2nd version) Full spectrum (native human Apo AI)

Figure panels: MS spectrum of human Apo AI (MALDI-TOF), simulated spectrum, correlation spectrum.

Results example (2/4) (score 2nd version) significant peptide

Figure panels: MS spectrum of human Apo AI (MALDI-TOF), simulated spectrum, correlation spectrum.

Results example (3/4) (score 2nd version) peptide in the noise

Figure panels: MS spectrum of human Apo AI (MALDI-TOF), simulated spectrum, correlation spectrum.

Results example (4/4) (score 2nd version) peptide participation in the scoring

Result file for user viewing (1/2)

Result file for user viewing (2/2)

ASCQ_ME Performances

First release: the digestion and the isotopic distribution are computed for each protein (the most time-consuming task).

On average 1 second per protein (single-processor machine, Xeon 2 GHz, 2 GB of memory).

Test case: MS spectrum of cytochrome C, Swissprot databank (August 2005), 10 missed cleavages maximum for the digestion, BOVINE filter (about 1600 proteins).

28 min for the identification.

Conclusions and perspectives (1/2)

New approach for protein identification, based only on the MS spectrum, without monoisotopic peak extraction.

Digestion algorithm based on a formal proof.

Dynamic grammar including consensus sequence detection.

The whole algorithm, including post-translational modifications, is based on chemical formulae (and not on numerical masses).

Spectrum simulation with isotopic distribution.

Shape recognition in the scoring, to detect only significant peaks and not noise.

Conclusions and perspectives (2/2)

Optimization of the algorithm speed by generating a peptide library.

Scoring optimization to eliminate calibration errors.

Adaptation of the algorithm to MS/MS data.

Implementation of statistical tools in order to validate the results.

More realistic simulated spectrum by including a response factor according to the nature of the peptide (hydrophobic, basic).

Questions ?

Thank you for your attention