Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with Scaffold Q+ Brian C. Searle and Mark Turner Proteome Software Inc. Vancouver Canada, ASMS 2012 Creative Commons Attribution

Page 1: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with

Scaffold Q+

Brian C. Searle and Mark Turner

Proteome Software Inc.

Vancouver Canada, ASMS 2012

Creative Commons Attribution

Page 2: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

[Figure: reporter-ion channels 114, 115, 116, 117; label: Reference]

Page 3: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

ANOVA

[Figure: channels 114, 115, 116, 117 and 114 (2), 115 (2), 116 (2), 117 (2), each set with a reference (Ref) channel; axis from 0 to 2]

Oberg et al 2008 (doi:10.1021/pr700734f)

Page 4: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

“High Quality” Data

• Virtually no missing data

• Symmetric distribution

• High Kurtosis

Page 5: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

“Normal Quality” Data

• High Skew due to truncation

• >20% of intensities are missing in this channel!

• Either ignore channels with any missing data ($0.8^4 \approx 41\%$) …

Page 6: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

“Normal Quality” Data

… Or deal with very non-Gaussian data!

Page 7: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Contents

• A Simple, Non-parametric Normalization Model

• Refinement 1: Intelligent Intensity Weighting

• Refinement 2: Standard Deviation Estimation

• Refinement 3: Kernel Density Estimation

• Refinement 4: Permutation Testing

Page 8: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Simple, Non-parametric Normalization Model

Page 9: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Additive Effects on Log Scale

• Experiment: sample handling effects across MS acquisitions (LC and MS variation, calibration, etc.)

• Sample: sample handling effects between channels (pipetting errors, etc.)

• Peptide: ionization effects

• Error: variation due to imprecise measurements

$\log_2(\text{intensity}) = \text{experiment} + \text{sample} + \text{peptide} + \text{error}$

Oberg et al 2008 (doi:10.1021/pr700734f)

Page 10: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Additive Effects on Log Scale

Effect | Subtract | Add Back | Across
Experiment | median for all intensities in MS/MS | median for all intensities | entire experiment
Sample | median for each channel | median of all channels | each MS/MS
Peptide | summed intensity for each peptide | median summed intensity | each protein
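The Experiment row of the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Scaffold Q+ implementation: the function name `remove_experiment_effect` and the nested-list data layout (one inner list of log2 intensities per MS/MS spectrum) are hypothetical.

```python
from statistics import median

def remove_experiment_effect(spectra):
    """Subtract each spectrum's median, then add back the experiment-wide median.

    spectra: list of lists of log2 intensities, one inner list per MS/MS
    spectrum (hypothetical layout chosen for this sketch).
    """
    # Median over every intensity in the entire experiment.
    global_median = median(v for spectrum in spectra for v in spectrum)
    normalized = []
    for spectrum in spectra:
        spectrum_median = median(spectrum)
        normalized.append([v - spectrum_median + global_median for v in spectrum])
    return normalized

spectra = [[10.0, 11.0, 12.0, 13.0], [14.0, 15.0, 16.0, 17.0]]
result = remove_experiment_effect(spectra)
```

After the step, both example spectra share the same median, so the offset between acquisitions is gone while the spread inside each spectrum is unchanged.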

Page 11: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Median Polish

Remove Inter-Experiment Effects

Remove Intra-Sample Effects

Remove Peptide Effects

3x

“Non-Parametric ANOVA”
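For readers unfamiliar with median polish, here is a minimal two-way sketch in the style of Tukey's procedure: row and column medians are swept out alternately for a fixed number of passes. The slides sweep three effects (experiment, sample, peptide) three times; this simplified version uses only two effects, omits the overall term, and the function name `median_polish` and the table layout are assumptions made for illustration.

```python
from statistics import median

def median_polish(table, iterations=3):
    """Alternately sweep row and column medians out of a 2-D table.

    Returns (row_effects, col_effects, residuals). Simplified sketch:
    no separate overall effect is extracted.
    """
    residuals = [row[:] for row in table]
    n_rows, n_cols = len(residuals), len(residuals[0])
    row_effects = [0.0] * n_rows
    col_effects = [0.0] * n_cols
    for _ in range(iterations):
        # Sweep out row medians.
        for r in range(n_rows):
            m = median(residuals[r])
            row_effects[r] += m
            residuals[r] = [v - m for v in residuals[r]]
        # Sweep out column medians.
        for c in range(n_cols):
            m = median(residuals[r][c] for r in range(n_rows))
            col_effects[c] += m
            for r in range(n_rows):
                residuals[r][c] -= m
    return row_effects, col_effects, residuals

# A purely additive table: every cell is a row effect plus a column effect.
table = [[1.0, 11.0, 21.0], [2.0, 12.0, 22.0], [4.0, 14.0, 24.0]]
row_effects, col_effects, residuals = median_polish(table)
```

Because medians are used instead of means, a few outlying cells do not distort the estimated effects, which is the sense in which the procedure acts as a non-parametric analogue of ANOVA.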

Page 12: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Refinement 1: Intensity Weighting

Page 13: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Linear Intensity Weighting

Low Intensity, Low Weight

High Intensity, High Weight

Page 14: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Desired Intensity Weighting

Low Intensity, Low Weight

Most Data, High Weight

Saturated Data, Decreased Weight

Page 15: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Variance At Different Intensities

Page 16: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Estimate Confidence from Protein Deviation

Page 17: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Estimate Confidence from Protein Deviation

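• $P_{ij} = 2 \times \text{cumulative t-distribution}(t_{ij})$, where

i = raw intensity bin

j = each spectrum in bin i

$\bar{x}$ = protein median for spectrum j

• $t_{ij}$ = [formula garbled in transcript; surviving terms: $x_{ij}$, $\bar{x}$, $s$, $n$, $n - 1$]

• $P_i = \dfrac{\sum_j P_{ij}}{n_i}$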

Page 18: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Data Dependent Intensity Weighting

Low Intensity, Low Weight

Most Data, High Weight

Saturated Data, Decreased Weight

Page 19: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Desired Intensity Weighting

Low Intensity, Low Weight

Most Data, High Weight

Saturated Data, Decreased Weight

Page 20: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Data Dependent Intensity Weighting

Low Intensity, Low Weight

Most Data, High Weight

Page 21: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Algorithm Schematic

Remove Inter-Experiment Effects

Remove Intra-Sample Effects

Remove Peptide Effects

3x

Data Dependent Intensity Weighting

Page 22: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Refinement 2: Standard Deviation Estimation

Page 23: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Standard Deviation Estimation

i = intensity bin

j = each spectrum in bin i

$\bar{x}$ = protein median for spectrum j

$\text{Stdev}_i = \sqrt{\dfrac{\sum_j (x_{ij} - \bar{x})^2}{n_i}}$
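A per-bin standard deviation of this kind could be computed as in the sketch below. It assumes one reading of the slide: the value for intensity bin i is the root-mean-square deviation of each spectrum's log2 value from its protein median, divided by the bin count. The function name `bin_standard_deviations`, the tuple layout, and the bin edges are hypothetical.

```python
import math

def bin_standard_deviations(observations, bin_edges):
    """Root-mean-square deviation from the protein median, per intensity bin.

    observations: list of (raw_intensity, log2_value, protein_median) tuples
    bin_edges: ascending raw-intensity thresholds separating the bins
    Returns one value per bin (None for an empty bin).
    """
    n_bins = len(bin_edges) + 1
    squared_sums = [0.0] * n_bins
    counts = [0] * n_bins
    for intensity, value, protein_median in observations:
        # Bin index = number of edges the intensity has reached.
        i = sum(1 for edge in bin_edges if intensity >= edge)
        squared_sums[i] += (value - protein_median) ** 2
        counts[i] += 1
    return [math.sqrt(s / n) if n else None
            for s, n in zip(squared_sums, counts)]

observations = [
    (100.0, 1.0, 0.0), (150.0, -1.0, 0.0),     # low-intensity spectra, wide scatter
    (5000.0, 0.5, 0.0), (6000.0, -0.5, 0.0),   # high-intensity spectra, tight scatter
]
stdevs = bin_standard_deviations(observations, bin_edges=[1000.0])
```

In the toy data the low-intensity bin comes out with twice the standard deviation of the high-intensity bin, mirroring the idea that measurement variance depends on intensity.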

Page 24: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Data Dependent Standard Deviation Estimation

Page 25: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Data Dependent Standard Deviation Estimation

Page 26: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Algorithm Schematic

Remove Inter-Experiment Effects

Remove Intra-Sample Effects

Remove Peptide Effects

3x

Data Dependent Intensity Weighting

Data Dependent Standard Dev Estimation

Page 27: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Refinement 3: Kernel Density Estimation

Page 28: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Protein Variance Estimation

Page 29: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Protein Variance Estimation

Page 30: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Kernels

Page 31: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Kernels

$\text{Stdev}_i = \dfrac{\max - \min}{n}$

$P_i = 1.0$

Page 32: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Kernels

Page 33: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Kernel Density Estimation

Page 34: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Kernel Density Estimation

Page 35: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Kernel Density Estimation

Deviation that shifts distribution

0.3 shift on Log2 Scale

Page 36: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Improved Kernels

• We have a better estimate for Pi: the intensity-based weight!

• We have a better estimate for Stdevi: the intensity-based standard deviation!
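The two bullets above amount to a kernel density estimate in which every observation carries its own width and weight. A minimal sketch follows, assuming Gaussian kernels (the slides do not state the kernel shape); the function name `kernel_density` and the `(center, stdev, weight)` tuple layout are hypothetical.

```python
import math

def kernel_density(x, points):
    """Weighted sum of Gaussian kernels evaluated at x.

    points: list of (center, stdev, weight) tuples, where stdev plays the
    role of Stdev_i and weight the role of P_i for the observation's
    intensity bin. The result is normalized to integrate to 1.
    """
    total_weight = sum(weight for _, _, weight in points)
    density = 0.0
    for center, stdev, weight in points:
        z = (x - center) / stdev
        density += weight * math.exp(-0.5 * z * z) / (stdev * math.sqrt(2.0 * math.pi))
    return density / total_weight

# A confident, high-intensity observation and a noisy, low-intensity one.
points = [(0.0, 0.2, 1.0), (0.3, 0.5, 0.2)]
peak = kernel_density(0.0, points)
```

The noisy observation contributes a wide, low kernel, so it nudges the estimated distribution far less than the narrow, heavily weighted one.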

Page 37: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Improved Kernels

Page 38: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Improved Kernel Density Estimation

Page 39: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Improved Kernel Density Estimation

Page 40: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Improved Kernel Density Estimation

Significant Deviation Worth Investigating

Unimportant Deviation

Page 41: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Improved Kernel Density Estimation

1.0 shift on Log2 Scale = 2 Fold Change
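As a worked conversion, a shift on the log2 scale maps to a fold change by exponentiation, so the 1.0 shift here and the 0.3 shift shown for the basic kernels correspond to:

```latex
\text{fold change} = 2^{\text{shift}}, \qquad
2^{1.0} = 2, \qquad
2^{0.3} \approx 1.23
```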

Page 42: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Refinement 4: Permutation Testing

Page 43: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Why Use Permutation Testing?

• Why go through all this work to just use a t-test or ANOVA?

• Rank-based Mann-Whitney and Kruskal-Wallis tests “work”, but lack power

Page 44: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Basic Permutation Test

Original (T=4.84): 1.1 1.1 0.8 1.1 1.4 1.0 1.0 0.9 1.2 1.0 0.7 1.0 0.7 0.9 0.9 0.0 0.5 0.3 0.7 1.0

Page 45: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Basic Permutation Test

Original (T=4.84): 1.1 1.1 0.8 1.1 1.4 1.0 1.0 0.9 1.2 1.0 0.7 1.0 0.7 0.9 0.9 0.0 0.5 0.3 0.7 1.0

Permutation 1 (T=1.49): 0.5 1.1 1.1 0.0 1.0 0.8 1.0 1.0 1.1 0.3 1.0 0.7 0.7 1.0 0.7 1.4 0.9 0.9 1.2 0.9

Page 46: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Basic Permutation Test

Original (T=4.84): 1.1 1.1 0.8 1.1 1.4 1.0 1.0 0.9 1.2 1.0 0.7 1.0 0.7 0.9 0.9 0.0 0.5 0.3 0.7 1.0

Permutation 1 (T=1.49): 0.5 1.1 1.1 0.0 1.0 0.8 1.0 1.0 1.1 0.3 1.0 0.7 0.7 1.0 0.7 1.4 0.9 0.9 1.2 0.9

Permutation 2 (T=1.34): 0.5 0.9 1.0 0.7 0.7 1.1 1.2 1.0 1.1 1.1 1.0 0.9 0.3 1.0 0.8 1.0 0.7 0.0 1.4 0.9

Permutation 3 (T=1.14): 0.5 0.9 1.4 1.0 0.7 1.1 1.1 0.3 1.2 1.0 1.1 0.7 0.8 0.9 1.0 1.0 0.0 1.0 0.7 0.9

x1000

Page 47: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Basic Permutation Test

950 below, 50 above

$p\text{-value} = \dfrac{50}{1000} = 0.05$
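The counting procedure can be sketched as below. Several choices are assumptions made for illustration: the slides do not name the test statistic, so Welch's t is used (it will not necessarily reproduce the T values shown on the slides), the twenty slide values are split into the first ten and the last ten, and the function names are hypothetical.

```python
import math
import random
from statistics import mean, variance

def t_statistic(a, b):
    """Welch's t-statistic (assumed; the slides do not name the statistic)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))

def permutation_p_value(a, b, permutations=1000, seed=0):
    """Fraction of shuffled group assignments whose |t| reaches the observed |t|."""
    rng = random.Random(seed)
    observed = abs(t_statistic(a, b))
    pooled = list(a) + list(b)
    above = 0
    for _ in range(permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        shuffled_t = abs(t_statistic(pooled[:len(a)], pooled[len(a):]))
        if shuffled_t >= observed:
            above += 1
    return above / permutations

# Values from the slide; the ten/ten split is an assumption.
group_a = [1.1, 1.1, 0.8, 1.1, 1.4, 1.0, 1.0, 0.9, 1.2, 1.0]
group_b = [0.7, 1.0, 0.7, 0.9, 0.9, 0.0, 0.5, 0.3, 0.7, 1.0]
observed_t = t_statistic(group_a, group_b)
p_value = permutation_p_value(group_a, group_b)
```

With 1000 permutations the smallest nonzero p-value the count can report is 0.001, which is the precision limit the later slides address.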

Page 48: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Improvements…

• N is frequently very small

• Instead of randomizing N points, randomly select N points from Kernel Densities

• Expensive! What if you want more precision?
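Drawing N points from kernel densities, rather than reshuffling the N observed values, might look like the following sketch. It assumes Gaussian kernels described by hypothetical `(center, stdev, weight)` tuples; the function name `sample_from_kernels` is also an assumption.

```python
import random

def sample_from_kernels(points, n, rng):
    """Draw n values from a weighted mixture of Gaussian kernels.

    points: list of (center, stdev, weight) tuples (hypothetical layout)
    rng: a random.Random instance, passed in so results are reproducible
    """
    kernels = [(center, stdev) for center, stdev, _ in points]
    weights = [weight for _, _, weight in points]
    # Pick a kernel in proportion to its weight, then draw from that kernel.
    chosen = rng.choices(kernels, weights=weights, k=n)
    return [rng.gauss(center, stdev) for center, stdev in chosen]

points = [(0.0, 0.2, 1.0), (0.3, 0.5, 0.2)]
samples = sample_from_kernels(points, n=10, rng=random.Random(0))
```

Each draw costs more than a shuffle, which is the expense the last bullet refers to.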

Page 49: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Extrapolating Precision

Actual T-Statistic of 6.6?

Last Usable Permutation

1000 below, 0 above

$p\text{-value} = \dfrac{0}{1000} = \;?$

Page 50: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Extrapolating Precision

Actual T-Statistic of 6.6?

Knijnenburg, et al 2011 (doi:10.1186/1471-2105-12-411)

Page 51: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Extrapolating Precision

Last Usable Permutation

Page 52: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Extrapolating Precision

p-value = 0.0000018

Last Usable Permutation

Page 53: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Conclusions

Remove Inter-Experiment Effects

Remove Intra-Sample Effects

Remove Peptide Effects

3x

Data Dependent Intensity Weighting

Data Dependent Standard Dev Estimation

Kernel Density Estimation (Fold Changes)

Permutation Testing (P-Values)

Normalization

Interpretation

• All of these ideas work for SILAC/ICAT as well!

Page 54: Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with  Scaffold Q+

Acknowledgements

Proteome Software Team
– Bryan Head
– Jana Lee
– Audrey Lester
– Susan Ludwigsen
– Jimar Millar
– De’Mel Mojica
– Mark Turner
– Nick Vincent-Maloney
– Luisa Zini

Institute of Molecular Pathology
– Karl Mechtler

Colorado State University
– Jessica Prenni
– Karen Dobos

Mayo Clinic, MN
– Ann Oberg