Felix Naef & Marcelo Magnasco, GL meeting, Nov. 19 2001 [email protected]
Excursions into GeneChip data analysis
Outline
• Background subtraction
• Probeset statistics
Background estimation
• estimate both mean B and fluctuations σ
• needed in low-intensity regime
• includes light reflection from substrate, photodetector dark current, some cross-hybridization (i.e. small residues)
• by the CLT, background is expected to be a Gaussian variable
• idea: B is insensitive to MM and visible at low intensity
• select probes such that |PM-MM| < ε (locally?)
• use ε=50 (new) or 100 (old settings)
• P(PM) or P(MM) is convolution of Gaussian and step function
[Figure: schematic, step function “+” Gaussian = real P(PM); axis marks 0 and B]
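A minimal sketch of how the selection and fit described on this slide could be carried out, not the authors' implementation. It assumes that the histogram of PM over the selected pairs rises like a step at B smoothed by a Gaussian of width σ, and fits that shape by brute-force least squares. The names `estimate_background`, `smoothed_step`, `eps`, `x_max`, `n_bins`, the grid ranges, and the synthetic data are all hypothetical.

```python
import math
import random


def smoothed_step(x, b, sigma):
    """Unit step at b convolved with a Gaussian of width sigma (a Gaussian CDF)."""
    return 0.5 * (1.0 + math.erf((x - b) / (sigma * math.sqrt(2.0))))


def estimate_background(pm, mm, eps=50.0, x_max=400.0, n_bins=40):
    """Estimate (B, sigma) from the PM histogram of pairs with |PM - MM| < eps.

    Hypothetical helper: grid-search least-squares fit of a * smoothed_step
    to the rising edge of the histogram on [0, x_max).
    """
    width = x_max / n_bins
    counts = [0] * n_bins
    for p, m in zip(pm, mm):
        if abs(p - m) < eps and 0.0 <= p < x_max:
            counts[min(int(p / width), n_bins - 1)] += 1
    centers = [(i + 0.5) * width for i in range(n_bins)]

    best = None  # (sum of squared errors, B, sigma)
    for b in range(0, int(x_max), 5):
        for sigma in range(5, 105, 5):
            model = [smoothed_step(c, b, sigma) for c in centers]
            norm = sum(v * v for v in model)
            if norm == 0.0:
                continue
            # For fixed (B, sigma) the best amplitude has a closed form.
            a = sum(v * n for v, n in zip(model, counts)) / norm
            sse = sum((n - a * v) ** 2 for v, n in zip(model, counts))
            if best is None or sse < best[0]:
                best = (sse, float(b), float(sigma))
    return best[1], best[2]


# Synthetic demo (assumed data): flat signal plus Gaussian background.
random.seed(0)
signal = [random.uniform(0.0, 2000.0) for _ in range(20000)]
pm = [s + random.gauss(100.0, 20.0) for s in signal]
mm = [s + random.gauss(100.0, 20.0) for s in signal]
print(estimate_background(pm, mm))
```

The grid search keeps the sketch dependency-free; a nonlinear least-squares routine would be the natural replacement on real data.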
[Figure: example; dependence on …]
trick for dealing with negative values
PM vs. MM distribution
[Figure: PM vs. MM plot; the MM>PM region is marked “make a histogram in this region”; zoom]
PM vs. MM histogram
MM>PM across different chips
MM>PM not concentrated at low intensities: 27% of probe pairs with MM>PM are in the top quartile
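| Chip | Dros | HG85A | Mu11K | U74A | YG_S98 |
|---|---|---|---|---|---|
| # pairs | 14 | 16 | 20 | 16 | 16 |
| # samples | 36 | 86 | 24 | 12 | 4 |
| % MM>PM | 0.35 | 0.31 | 0.34 | 0.34 | 0.17 |
| % probesets with 1 MM>PM | 0.951 | 0.91 | 0.95 | 0.92 | 0.73 |
| % probesets with 5 MM>PM | 0.58 | 0.56 | 0.71 | 0.64 | 0.21 |
| % probesets with 10 MM>PM | 0.04 | 0.07 | 0.26 | 0.1 | 0.02 |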
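Statistics of this kind can be computed per chip with a few lines of code. The sketch below is illustrative only: the function name `mm_gt_pm_stats` and the input layout (a dict mapping probe set name to a list of (PM, MM) pairs) are hypothetical, and "probesets with k MM>PM" is read as "at least k pairs", which is the reading consistent with the fractions decreasing from k=1 to k=10.

```python
def mm_gt_pm_stats(probesets, ks=(1, 5, 10)):
    """Fractions of MM>PM pairs and of probe sets with at least k such pairs.

    probesets: dict mapping probe set name -> list of (PM, MM) pairs (one chip).
    """
    # Number of MM>PM pairs in each probe set.
    per_set = [sum(1 for pm, mm in pairs if mm > pm)
               for pairs in probesets.values()]
    n_pairs = sum(len(pairs) for pairs in probesets.values())
    result = {"frac_mm_gt_pm": sum(per_set) / n_pairs}
    for k in ks:
        result[f"frac_sets_ge_{k}"] = (
            sum(1 for c in per_set if c >= k) / len(per_set)
        )
    return result


# Toy input (made-up numbers, two probe sets).
toy = {
    "set_A": [(100, 50), (80, 90), (200, 300)],
    "set_B": [(10, 5), (20, 10)],
}
print(mm_gt_pm_stats(toy, ks=(1, 2, 3)))
```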
probe pairs trajectories (~80 chips)
• take all (PM, MM) for a given probe set
• center of mass (x,y)
• ellipsoid of inertia: … > … and …
• histogram the cm’s
• color code acc. to s = … / min(x, y)
~ noise detrending
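The center of mass and inertia ellipse of a (PM, MM) trajectory can be sketched as follows. This is an illustration under assumptions, not the authors' code: it uses the 2x2 covariance matrix of the points (whose principal axes coincide with those of the inertia tensor), and it defines eccentricity as the ratio of the principal axis lengths, which is only one possible definition. The function name `trajectory_shape` is hypothetical, and whether the slides work in linear or log coordinates is not stated.

```python
import math


def trajectory_shape(points):
    """Center of mass, principal variances, axis angle and axis ratio.

    points: list of (PM, MM) values of one probe pair across chips.
    """
    n = len(points)
    cx = sum(p for p, _ in points) / n
    cy = sum(m for _, m in points) / n
    sxx = sum((p - cx) ** 2 for p, _ in points) / n
    syy = sum((m - cy) ** 2 for _, m in points) / n
    sxy = sum((p - cx) * (m - cy) for p, m in points) / n

    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    mean = 0.5 * (sxx + syy)
    half_diff = math.hypot(0.5 * (sxx - syy), sxy)
    lam1, lam2 = mean + half_diff, mean - half_diff

    # Orientation of the major axis, measured from the PM axis.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)

    # Assumed definition: ratio of major to minor axis length.
    eccentricity = math.sqrt(lam1 / lam2) if lam2 > 0.0 else math.inf
    return (cx, cy), (lam1, lam2), angle, eccentricity


# Toy trajectory (made-up numbers): elongated along the PM axis.
print(trajectory_shape([(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]))
```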
all probe sets
blue: large s, green: mid, red: small
probes with ‘well-defined’ trajectories (eccentricity > 3)
~1/3 of probes
blue: large, green: mid, red: small
PM within a probe set
Is the brightness of the probes reasonably uniform? Or do different probes have very different hybridization efficiencies?
So what can possibly be happening?
• sequence-dependent hybridization efficiencies
- are kinetic effects important?
• cross-hybridization beyond what is detectable by MM probes
- this is hard to assess without sequence info
• sequence-dependent fabrication efficiencies?
- variable probe densities
Composite scores
What have we learned from previous slides?
• MM are not consistently behaving as expected
- What about not using them?
• The probe set intensities vary over decades
- difficult to estimate absolute intensities using ‘averages’ (alternative: Li and Wong)
- we focus on ratio scores
Outline of algorithm
1. estimate background (mean and std)
2. discard noisy and saturated probes; use either only PM or PM-MM as raw intensities
3. average the remaining log-ratios in an outlier-robust way (robust regression to intercept) and estimate the standard error (SE)
4. normalize by centering the (possibly local) log-ratio distribution
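Steps 2 to 4 can be sketched as below, taking the background estimate (B, σ) from step 1 as given. This is a rough illustration under stated assumptions, not the authors' algorithm: the robust average is a Huber-weighted location estimate (one common form of robust regression to an intercept), the SE formula is a crude approximation, the noise cut `n_sigma`, the log base 2, and the function names are all hypothetical, σ is assumed positive, and only global centering is shown.

```python
import math
import statistics


def probe_log_ratios(chip_a, chip_b, bg_a, bg_b, saturation,
                     n_sigma=2.0, use_mm=False):
    """Step 2: per-probe log2 ratios after dropping noisy/saturated probes.

    chip_a, chip_b: lists of (PM, MM) for the same probe set on two chips.
    bg_a, bg_b: (B, sigma) background estimates for each chip.
    """
    ratios = []
    for (pm_a, mm_a), (pm_b, mm_b) in zip(chip_a, chip_b):
        if pm_a >= saturation or pm_b >= saturation:
            continue  # saturated probe
        if use_mm:
            a, b = pm_a - mm_a, pm_b - mm_b  # background cancels in PM-MM
        else:
            a, b = pm_a - bg_a[0], pm_b - bg_b[0]
        if a < n_sigma * bg_a[1] or b < n_sigma * bg_b[1]:
            continue  # not distinguishable from background noise
        ratios.append(math.log2(a / b))
    return ratios


def huber_location(values, c=1.345, n_iter=50):
    """Step 3: robust average via iteratively reweighted least squares."""
    values = list(values)
    mu = statistics.median(values)
    scale = 1.4826 * statistics.median([abs(v - mu) for v in values])
    if scale == 0.0:
        return mu, 0.0
    weights = [1.0] * len(values)
    for _ in range(n_iter):
        # Huber weights: 1 inside c*scale, down-weighted outside.
        weights = [1.0 if v == mu else min(1.0, c * scale / abs(v - mu))
                   for v in values]
        new_mu = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        converged = abs(new_mu - mu) < 1e-9
        mu = new_mu
        if converged:
            break
    se = scale / math.sqrt(sum(weights))  # crude standard error
    return mu, se


def center(scores):
    """Step 4: normalize by subtracting the median score (global only)."""
    shift = statistics.median(scores)
    return [s - shift for s in scores]
```

A probe set score would then be `huber_location(probe_log_ratios(...))`, and `center` would be applied to the scores of all probe sets on the chip pair.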