Advancing Statistical Analysis of Multiplexed MS/MS Quantitative Data with Scaffold Q+
Brian C. Searle and Mark Turner
Proteome Software Inc.
Vancouver Canada, ASMS 2012
Creative Commons Attribution
[Figure: channels 114, 115, 116, 117 versus Reference]
[Figure: ANOVA across channels 114, 115, 116, 117 and 114 (2), 115 (2), 116 (2), 117 (2); axis from 0 to 2]
Oberg et al 2008 (doi:10.1021/pr700734f)
“High Quality” Data
• Virtually no missing data
• Symmetric distribution
• High kurtosis
“Normal Quality” Data
• High skew due to truncation
• >20% of intensities are missing in this channel!
• Either ignore channels with any missing data (0.8^4 = 41%) …
“Normal Quality” Data
• … or deal with very non-Gaussian data!
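The 41% figure can be checked with one line of arithmetic. This sketch assumes four channels that each lose about 20% of their intensities, and that the losses are independent between channels. The independence is an assumption made for illustration, not something the slides state.

```python
# Assumption: four reporter channels, each missing ~20% of intensities,
# with missingness independent between channels.
p_present = 0.8
n_channels = 4

# Probability that a spectrum has an intensity in every channel.
p_complete = p_present ** n_channels
print(f"{p_complete:.0%} of spectra keep all {n_channels} channels")
```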
Contents
• A Simple, Non-parametric Normalization Model
• Refinement 1: Intelligent Intensity Weighting
• Refinement 2: Standard Deviation Estimation
• Refinement 3: Kernel Density Estimation
• Refinement 4: Permutation Testing
Simple, Non-parametric Normalization Model
Additive Effects on Log Scale
• Experiment: sample handling effects across MS acquisitions (LC and MS variation, calibration etc)
• Sample: sample handling effects between channels (pipetting errors, etc)
• Peptide: ionization effects
• Error: variation due to imprecise measurements
log2(intensity) = experiment + sample + peptide + error
Oberg et al 2008 (doi:10.1021/pr700734f)
Additive Effects on Log Scale
Effect      Subtract                              Add Back                    Across
Experiment  median for all intensities in MS/MS   median for all intensities  entire experiment
Sample      median for each channel               median of all channels      each MS/MS
Peptide     summed intensity for each peptide     median summed intensity     each protein
Median Polish
Remove Inter-Experiment Effects
Remove Intra-Sample Effects
Remove Peptide Effects
3x
“Non-Parametric ANOVA”
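The three sweeps can be sketched in a few lines of standard-library Python. This is one possible reading of the subtract / add back scheme, not Scaffold Q+'s implementation. The record keys (`run`, `protein`, `log2`), the function name `normalize`, and the choice to treat each spectrum as one peptide measurement are all assumptions made for the sketch.

```python
from math import log2
from statistics import median

def normalize(spectra, rounds=3):
    """Median-sweep normalization of log2 reporter intensities.

    spectra: list of dicts with hypothetical keys
      'run'     - MS acquisition the spectrum came from
      'protein' - protein the spectrum is assigned to
      'log2'    - one log2 intensity per channel
    Returns new dicts; the input is left untouched.
    """
    rows = [dict(s, log2=list(s["log2"])) for s in spectra]
    runs = sorted({r["run"] for r in rows})
    proteins = sorted({r["protein"] for r in rows})
    n_channels = len(rows[0]["log2"])

    for _ in range(rounds):
        # Experiment effect: subtract each run's median, add back the
        # median over the entire experiment.
        overall = median(v for r in rows for v in r["log2"])
        for run in runs:
            members = [r for r in rows if r["run"] == run]
            shift = overall - median(v for r in members for v in r["log2"])
            for r in members:
                r["log2"] = [v + shift for v in r["log2"]]

        # Sample effect: within a run, subtract each channel's median and
        # add back the median over all channels of that run.
        for run in runs:
            members = [r for r in rows if r["run"] == run]
            run_median = median(v for r in members for v in r["log2"])
            for c in range(n_channels):
                shift = run_median - median(r["log2"][c] for r in members)
                for r in members:
                    r["log2"][c] += shift

        # Peptide effect: within a protein, subtract each spectrum's summed
        # intensity (on the log2 scale) and add back the protein's median.
        for protein in proteins:
            members = [r for r in rows if r["protein"] == protein]
            levels = [log2(sum(2.0 ** v for v in r["log2"])) for r in members]
            target = median(levels)
            for r, level in zip(members, levels):
                r["log2"] = [v - level + target for v in r["log2"]]
    return rows


# Toy data with no biology: every value is peptide + run + channel offset.
channel_offset = [0.0, 0.5, -0.3, 1.0]
toy = [
    {"run": run, "protein": "P1",
     "log2": [10.0 + pep + run_off + c for c in channel_offset]}
    for run, run_off in (("A", 0.0), ("B", 2.0))
    for pep in (0.0, 1.5, 3.0)
]
flat = [v for r in normalize(toy) for v in r["log2"]]
print(max(flat) - min(flat))  # spread left after removing all three effects
```

On the toy data every value is built purely from the three nuisance effects, so the spread left after normalization should be essentially zero. Real data would keep the biological differences between channels.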
Refinement 1: Intensity Weighting
Linear Intensity Weighting
Low Intensity, Low Weight
High Intensity, High Weight
Desired Intensity Weighting
Low Intensity, Low Weight
Most Data, High Weight
Saturated Data, Decreased Weight
Variance At Different Intensities
Estimate Confidence from Protein Deviation
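• $P_{ij} = 2 \cdot \text{cumulative t-distribution}(t_{ij})$, where
  i = raw intensity bin
  j = each spectrum in bin i
  $\bar{x}$ = protein median for spectrum j
• $t_{ij} = \dfrac{x_{ij} - \bar{x}}{s / \sqrt{n}}$, with $n - 1$ degrees of freedom
• $P_i = \dfrac{\sum_j P_{ij}}{n_i}$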
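A minimal sketch of the per-bin confidence, under stated assumptions. It reads the statistic as a one-sample t with n - 1 degrees of freedom, takes the two-sided p-value as 2 × CDF(-|t|), and treats s and n as the protein's standard deviation and spectrum count. The slide does not spell these out, so all three are guesses. The helper names `two_sided_t_p` and `bin_weight` are hypothetical, and the t-distribution is integrated numerically to keep the example free of third-party libraries.

```python
import math
from statistics import mean

def two_sided_t_p(t, df, steps=2000):
    """Two-sided Student's t p-value, 2 * CDF(-|t|), by Simpson's rule."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    norm = math.exp(log_norm) / math.sqrt(df * math.pi)

    def pdf(x):
        return norm * (1.0 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pdf(k * h)
    central = total * h / 3.0  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central)

def bin_weight(spectra):
    """Mean p-value P_i for one intensity bin.

    spectra: list of (x_ij, protein_median, s, n) tuples, where s and n are
    ASSUMED to be the protein's standard deviation and spectrum count.
    """
    p_values = []
    for x, protein_median, s, n in spectra:
        t = (x - protein_median) / (s / math.sqrt(n))
        p_values.append(two_sided_t_p(t, n - 1))
    return mean(p_values)


# Hypothetical bins: spectra that sit near their protein median earn a
# higher weight than spectra that scatter far from it.
quiet_bin = [(0.02, 0.0, 0.3, 8), (-0.05, 0.0, 0.3, 8)]
noisy_bin = [(0.9, 0.0, 0.3, 8), (-1.2, 0.0, 0.3, 8)]
print(bin_weight(quiet_bin) > bin_weight(noisy_bin))
```

Bins whose spectra agree with their protein medians get a weight near 1, while bins full of outliers get a weight near 0. This matches the low-weight behaviour wanted at low intensity and at saturation.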
Data Dependent Intensity Weighting
Low Intensity, Low Weight
Most Data, High Weight
Saturated Data, Decreased Weight
Desired Intensity Weighting
Low Intensity, Low Weight
Most Data, High Weight
Saturated Data, Decreased Weight
Data Dependent Intensity Weighting
Low Intensity, Low Weight
Most Data, High Weight
Algorithm Schematic
Remove Inter-Experiment Effects
Remove Intra-Sample Effects
Remove Peptide Effects
3x
Data Dependent Intensity Weighting
Refinement 2: Standard Deviation Estimation
Standard Deviation Estimation
i = intensity bin
j = each spectrum in bin i
$\bar{x}$ = protein median for spectrum j
$\text{Stdev}_i = \sqrt{\dfrac{\sum_j (x_{ij} - \bar{x})^2}{n_i}}$
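A short sketch of an intensity-dependent standard deviation, assuming the estimate is the root-mean-square deviation of each spectrum from its protein median, pooled within an intensity bin. The tuple layout, the function name `stdev_by_intensity_bin`, and the choice of whole log2 units as bins are illustrative assumptions.

```python
from collections import defaultdict
from math import floor, log2, sqrt

def stdev_by_intensity_bin(spectra):
    """Root-mean-square deviation from the protein median, per intensity bin.

    spectra: list of (raw_intensity, x_ij, protein_median) tuples.
    Bins are whole log2 units of raw intensity (an arbitrary choice here).
    """
    squared = defaultdict(list)
    for raw_intensity, x, protein_median in spectra:
        bin_index = floor(log2(raw_intensity))
        squared[bin_index].append((x - protein_median) ** 2)
    return {b: sqrt(sum(v) / len(v)) for b, v in sorted(squared.items())}


# Hypothetical spectra: weak signals scatter widely, strong signals do not.
example = [
    (300, 0.8, 0.0), (350, -0.6, 0.0),
    (40000, 0.1, 0.0), (45000, -0.1, 0.0),
]
for bin_index, stdev in stdev_by_intensity_bin(example).items():
    print(f"2^{bin_index}: stdev {stdev:.3f}")
```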
Data Dependent Standard Deviation Estimation
Algorithm Schematic
Remove Inter-Experiment Effects
Remove Intra-Sample Effects
Remove Peptide Effects
3x
Data Dependent Intensity Weighting
Data Dependent Standard Dev Estimation
Refinement 3: Kernel Density Estimation
Protein Variance Estimation
Kernels
$\text{Stdev}_i = \dfrac{\max - \min}{n}$
$P_i = 1.0$
Kernel Density Estimation
Deviation that shifts distribution
0.3 shift on Log2 Scale
Improved Kernels
• We have a better estimate for $P_i$: the intensity-based weight!
• We have a better estimate for $\text{Stdev}_i$: the intensity-based standard deviation!
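The improved kernels can be pictured as a weighted mixture of Gaussians, one per spectrum, each with its own width and weight. The sketch below assumes Gaussian kernels and a grid search for the peak. The function names `kernel_density` and `density_mode` and the `(log2_ratio, stdev, weight)` layout are hypothetical.

```python
from math import exp, pi, sqrt

def kernel_density(points, x):
    """Weighted Gaussian kernel density at x.

    points: list of (log2_ratio, stdev, weight) tuples, one kernel per
    spectrum; stdev and weight come from the spectrum's intensity bin.
    """
    total_weight = sum(w for _, _, w in points)
    density = 0.0
    for centre, stdev, weight in points:
        z = (x - centre) / stdev
        density += weight * exp(-0.5 * z * z) / (stdev * sqrt(2.0 * pi))
    return density / total_weight

def density_mode(points, lo=-4.0, hi=4.0, steps=1600):
    """Grid search for the log2 ratio where the density peaks."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda x: kernel_density(points, x))


# Hypothetical protein: three confident spectra near a 2-fold change and
# one low-intensity outlier with a wide kernel and a small weight.
spectra = [(1.0, 0.2, 1.0), (1.1, 0.2, 0.9), (0.9, 0.25, 0.8), (-0.5, 1.0, 0.1)]
print(f"mode at log2 ratio {density_mode(spectra):.2f}")
```

Because the outlier carries a small weight and a broad kernel, it flattens into the background and does not pull the peak away from the confident spectra.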
Improved Kernel Density Estimation
Significant Deviation Worth Investigating
Unimportant Deviation
1.0 shift on Log2 Scale = 2 Fold Change
Refinement 4: Permutation Testing
Why Use Permutation Testing?
• Why go through all this work to just use a t-test or ANOVA?
• Rank-based Mann-Whitney and Kruskal-Wallis tests “work”, but lack power
Basic Permutation Test
Observed  Perm 1  Perm 2  Perm 3
1.1       0.5     0.5     0.5
1.1       1.1     0.9     0.9
0.8       1.1     1.0     1.4
1.1       0.0     0.7     1.0
1.4       1.0     0.7     0.7
1.0       0.8     1.1     1.1
1.0       1.0     1.2     1.1
0.9       1.0     1.0     0.3
1.2       1.1     1.1     1.2
1.0       0.3     1.1     1.0
0.7       1.0     1.0     1.1
1.0       0.7     0.9     0.7
0.7       0.7     0.3     0.8
0.9       1.0     1.0     0.9
0.9       0.7     0.8     1.0
0.0       1.4     1.0     1.0
0.5       0.9     0.7     0.0
0.3       0.9     0.0     1.0
0.7       1.2     1.4     0.7
1.0       0.9     0.9     0.9
T=4.84    T=1.49  T=1.34  T=1.14
x1000
Basic Permutation Test
950 below, 50 above
p-value = 50/1000 = 0.05
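The counting step can be sketched as follows. For simplicity this version uses the absolute difference of group means as the test statistic rather than the T statistic on the slides, and the data are made up for illustration. The function name `permutation_p_value` and its parameters are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=1000, seed=0):
    """Fraction of label shuffles whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the group labels at random
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Illustrative values only, not the numbers from the slide.
control = [1.0, 1.1, 0.9, 1.2, 1.0, 1.1, 0.9, 1.0]
treated = [2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 1.9, 2.0]
print(permutation_p_value(control, treated))
```

With 1000 shuffles the smallest non-zero p-value this can report is 1/1000, which is the precision limit that more expensive permutation counts try to push down.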
Improvements…
• N is frequently very small
• Instead of randomizing N points, randomly select N points from Kernel Densities
• Expensive! What if you want more precision?
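Drawing points from kernel densities instead of reshuffling the few observed values might look like the sketch below. It assumes Gaussian kernels described by hypothetical `(centre, stdev, weight)` tuples: a kernel is picked with probability proportional to its weight, then one value is drawn from it. The function name `sample_from_kernels` is also hypothetical.

```python
import random

def sample_from_kernels(kernels, n, rng):
    """Draw n values from a weighted mixture of Gaussian kernels.

    kernels: list of (centre, stdev, weight) tuples (hypothetical layout).
    rng: a random.Random instance, so that runs are reproducible.
    """
    weights = [w for _, _, w in kernels]
    chosen = rng.choices(kernels, weights=weights, k=n)
    return [rng.gauss(centre, stdev) for centre, stdev, _ in chosen]


# Illustrative kernels for one protein in one channel.
kernels = [(1.0, 0.2, 1.0), (1.1, 0.2, 0.9), (0.9, 0.25, 0.8)]
rng = random.Random(1)
print(sample_from_kernels(kernels, 5, rng))
```

Each synthetic draw stays close to where the confident spectra sit, so a test with very small N is no longer limited to the handful of distinct rearrangements of the observed values.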
Extrapolating Precision
Actual T-Statistic of 6.6?
Last Usable Permutation
1000 below, 0 above
p-value = 0/1000 = ?
Knijnenburg, et al 2011 (doi:10.1186/1471-2105-12-411)
[Figure: extrapolation past the Last Usable Permutation]
p-value = 0.0000018
Conclusions
Remove Inter-Experiment Effects
Remove Intra-Sample Effects
Remove Peptide Effects
3x
Data Dependent Intensity Weighting
Data Dependent Standard Dev Estimation
Kernel Density Estimation (Fold Changes)
Permutation Testing (P-Values)
Normalization Interpretation
• All of these ideas work for SILAC/ICAT as well!
Acknowledgements
Proteome Software Team
– Bryan Head
– Jana Lee
– Audrey Lester
– Susan Ludwigsen
– Jimar Millar
– De’Mel Mojica
– Mark Turner
– Nick Vincent-Maloney
– Luisa Zini
Institute of Molecular Pathology
– Karl Mechtler
Colorado State University
– Jessica Prenni
– Karen Dobos
Mayo Clinic, MN
– Ann Oberg