Download - Object Orie’d Data Analysis, Last Time
![Page 1: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/1.jpg)
Object Orie’d Data Analysis, Last Time
• Organizational Mattershttp://www.stat-or.unc.edu/webspace/courses/marron/UNCstor891OODA-2007/Stor891-07Home.ht
ml
Note: 1st Part’t Pres’ns, need more…
• Matlab Software
• Time Series of Curves
• Chemometrics Data
• Mortality Data
![Page 2: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/2.jpg)
Data Object Conceptualization
Object Space Feature Space
Curves
Images Manifolds
Shapes Tree Space
Trees
d
![Page 3: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/3.jpg)
Functional Data Analysis, Toy EG I
Object Space Feature Space
![Page 4: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/4.jpg)
Limitation of PCA
PCA can provide useful projection directions
But can’t “see everything”…
Reason:
• PCA finds dir’ns of maximal variation
• Which may obscure interesting structure
![Page 5: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/5.jpg)
Limitation of PCA
Toy Example:
•Apple – Banana – Pear
•Obscured by “noisy dimensions”
•1st 3 PC directions only show noise
•Study some rotations, to find structure
![Page 6: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/6.jpg)
Limitation of PCA, Toy E.g.
![Page 7: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/7.jpg)
Limitation of PCA
Toy Example:
•Rotation shows Apple – Banana – Pear
•Example constructed as:
•1st make these in 3-d
•Add 3 dimensions of high s.d. noise
•Carefully watch axis labels
![Page 8: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/8.jpg)
Yeast Cell Cycle Data• “Gene Expression” – Micro-array data
• Data (after major preprocessing): Expression “level” of:
• thousands of genes (d ~ 1,000s)
• but only dozens of “cases” (n ~ 10s)
• Interesting statistical issue:
High Dimension Low Sample Size data
(HDLSS)
![Page 9: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/9.jpg)
Yeast Cell Cycle Data
Data from:
Spellman, P. T., Sherlock, G., Zhang, M.Q., Iyer, V.R., Anders, K., Eisen, M.B., Brown, P.O., Botstein, D. and Futcher, B. (1998), “Comprehensive Identification of Cell Cycle-regulated Genes of the Yeast Saccharomyces cerevisiae by Microarray Hybridization”, Molecular Biology of the Cell, 9, 3273-3297.
![Page 10: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/10.jpg)
Yeast Cell Cycle Data
Analysis here is from:
Zhao, X., Marron, J.S. and Wells, M.T. (2004) The Functional Data View of Longitudinal Data, Statistica Sinica, 14, 789-808
![Page 11: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/11.jpg)
Yeast Cell Cycle Data• Lab experiment:• Chemically “synchronize cell cycles”, of yeast
cells• Do cDNA micro-arrays over time• Used 18 time points, over “about 2 cell cycles”• Studied 4,489 genes (whole genome)• Time series view of data:
have 4,489 time series of length 18• Functional Data View:
have 18 “curves”, of dimension 4,489
![Page 12: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/12.jpg)
Yeast Cell Cycle Data, FDA View
Central question:Which genes are “periodic” over 2 cell cycles?
![Page 13: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/13.jpg)
Yeast Cell Cycle Data, FDA View
Periodic genes?
Naïve approach:Simple
PCA
![Page 14: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/14.jpg)
Yeast Cell Cycle Data, FDA View
• Central question: which genes are “periodic” over 2 cell cycles?
• Naïve approach: Simple PCA• No apparent (2 cycle) periodic structure?• Eigenvalues suggest large amount of
“variation”• PCA finds “directions of maximal
variation”• Often, but not always, same as
“interesting directions”• Here need better approach to study
periodicities
![Page 15: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/15.jpg)
Yeast Cell Cycle Data, FDA View
Approach• Project on Period 2 Components Only• Calculate via Fourier Representation• To understand, study Fourier Basis
Cute Fact: linear combos of sin and cos capture “phase”, since:
xcxcxxx sincossinsincoscoscos 21
![Page 16: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/16.jpg)
Fourier Basis
![Page 17: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/17.jpg)
Yeast Cell Cycle Data, FDA View
Approach• Project on Period 2 Components Only• Calculate via Fourier Representation• Project onto Subspace of Even
Frequencies• Keeps only 2-period part of data
(i.e. same over both cycles)• Then do PCA on projected data
![Page 18: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/18.jpg)
Fourier Basis
![Page 19: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/19.jpg)
Yeast Cell Cycles, Freq. 2 Proj.
PCA on
Freq. 2
Periodic
Component
Of Data
![Page 20: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/20.jpg)
Yeast Cell Cycles, Freq. 2 Proj.
PCA on periodic component of data • Hard to see periodicities in raw data• But very clear in PC1 (~sin) and PC2
(~cos)• PC1 and PC2 explain 65% of variation
(see residuals) • Recall linear combos of sin and cos
capture “phase”, since: xcxcxxx sincossinsincoscoscos 21
![Page 21: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/21.jpg)
Frequency 2 Analysis
• Important features of data appear
only at frequency 2,
• Hence project data onto 2-dim space
of sin and cos (freq. 2)
• Useful view: scatterplot
• Similar to PCA proj’ns, except
“directions” are now chosen, not “var
max’ing”
![Page 22: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/22.jpg)
Frequency 2 Analysis
![Page 23: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/23.jpg)
Frequency 2 Analysis• Project data onto 2-dim space of sin and cos
(freq. 2)
• Useful view: scatterplot
• Angle (in polar coordinates) shows phase
• Colors: Spellman’s cell cycle phase classification
• Black was labeled “not periodic”
• Within class phases approx’ly same, but notable differences
• Later will try to improve “phase classification”
![Page 24: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/24.jpg)
Batch and Source Adjustment
• For Stanford Breast Cancer Data (C. Perou)
• Analysis in Benito, et al (2004) Bioinformatics, 20, 105-114. https://genome.unc.edu/pubsup/dwd/
• Adjust for Source Effects– Different sources of mRNA
• Adjust for Batch Effects– Arrays fabricated at different times
![Page 25: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/25.jpg)
Idea Behind Adjustment
• Find “direction” from one to other• Shift data along that direction• Details of DWD Direction developed later
![Page 26: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/26.jpg)
Source Batch Adj: Raw Breast Cancer data
![Page 27: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/27.jpg)
Source Batch Adj: Source Colors
![Page 28: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/28.jpg)
Source Batch Adj: Batch Colors
![Page 29: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/29.jpg)
Source Batch Adj: Biological Class Colors
![Page 30: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/30.jpg)
Source Batch Adj: Biological Class Col. &
Symbols
![Page 31: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/31.jpg)
Source Batch Adj: Biological Class Symbols
![Page 32: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/32.jpg)
Source Batch Adj: Source Colors
![Page 33: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/33.jpg)
Source Batch Adj: PC 1-3 & DWD direction
![Page 34: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/34.jpg)
Source Batch Adj: DWD Source Adjustment
![Page 35: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/35.jpg)
Source Batch Adj: Source Adj’d, PCA view
![Page 36: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/36.jpg)
Source Batch Adj: Source Adj’d, Class
Colored
![Page 37: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/37.jpg)
Source Batch Adj: Source Adj’d, Batch
Colored
![Page 38: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/38.jpg)
Source Batch Adj: Source Adj’d, 5 PCs
![Page 39: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/39.jpg)
Source Batch Adj: S. Adj’d, Batch 1,2 vs. 3
DWD
![Page 40: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/40.jpg)
Source Batch Adj: S. & B1,2 vs. 3 Adjusted
![Page 41: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/41.jpg)
Source Batch Adj: S. & B1,2 vs. 3 Adj’d, 5 PCs
![Page 42: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/42.jpg)
Source Batch Adj: S. & B Adj’d, B1 vs. 2 DWD
![Page 43: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/43.jpg)
Source Batch Adj: S. & B Adj’d, B1 vs. 2 Adj’d
![Page 44: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/44.jpg)
Source Batch Adj: S. & B Adj’d, 5 PC view
![Page 45: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/45.jpg)
Source Batch Adj: S. & B Adj’d, 4 PC view
![Page 46: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/46.jpg)
Source Batch Adj: S. & B Adj’d, Class Colors
![Page 47: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/47.jpg)
Source Batch Adj: S. & B Adj’d, Adj’d PCA
![Page 48: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/48.jpg)
Source Batch Adj: Raw Data, Tree View
![Page 49: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/49.jpg)
Source Batch Adj: Raw Data, Array Tree
![Page 50: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/50.jpg)
Source Batch Adj: Raw Array Tree, Source Colored
![Page 51: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/51.jpg)
Source Batch Adj: Raw Array Tree, Batch Colored
![Page 52: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/52.jpg)
Source Batch Adj: Raw Array Tree, Class Colored
![Page 53: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/53.jpg)
Source Batch Adj: DWD Adjusted Data, Tree View
![Page 54: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/54.jpg)
Source Batch Adj: DWD Adjusted Data, Array
Tree
![Page 55: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/55.jpg)
Source Batch Adj: DWD Adjusted Data, Source
Colored
![Page 56: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/56.jpg)
Source Batch Adj: DWD Adjusted Data, Batch
Colored
![Page 57: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/57.jpg)
Source Batch Adj: DWD Adjusted Data, Class
Colored
![Page 58: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/58.jpg)
DWD: A look under the hood
• Distance Weighted Discrimination (DWD)– Modification of Support Vector
Machine– For HDLSS data– Uses 2nd Order Cone programming– Will study later
• Main Goal:– Find direction, that separates data
classes– In “best” possible way
![Page 59: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/59.jpg)
DWD: Why not PC1?
- PC 1 Direction feels variation, not classes
- Also eliminates (important?) within class variation
![Page 60: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/60.jpg)
DWD: Under the hood (cont.)
- Direction driven by classes
- “Sliding” maintains (important?) within class variation
![Page 61: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/61.jpg)
E. g. even worse for PCA
- PC1 direction is worst possible
![Page 62: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/62.jpg)
But easy for DWD
Since DWD uses class label information
![Page 63: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/63.jpg)
DWD does not solve all problems
Only handles means, not differing variation
![Page 64: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/64.jpg)
Interesting Benchmark Data Set
• NCI 60 Cell Lines
– Interesting benchmark, since same cells
– Data Web available:– http://discover.nci.nih.gov/datasetsNature2000.js
p
– Both cDNA and Affymetrix Platforms
• Different from Breast Cancer Data
– Which had no common samples
![Page 65: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/65.jpg)
NCI 60: Raw Data, Platform Colored
![Page 66: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/66.jpg)
NCI 60: Raw Data, Tree View
![Page 67: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/67.jpg)
NCI 60: Raw Data
![Page 68: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/68.jpg)
NCI 60: Raw Data, Before DWD Adjustment
![Page 69: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/69.jpg)
NCI 60: Before & After DWD adjustment
![Page 70: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/70.jpg)
NCI 60: Before & After, new scales
![Page 71: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/71.jpg)
NCI 60: After DWD
![Page 72: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/72.jpg)
NCI 60: DWD adjusted data
![Page 73: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/73.jpg)
NCI 60: Before Column Mean Adjustment
![Page 74: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/74.jpg)
NCI 60: Before & After Column Mean Adjustment
![Page 75: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/75.jpg)
NCI 60: Before & After Col. Mean Adj., Rescaled
![Page 76: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/76.jpg)
NCI 60: After DWD & Column Mean Adj.
![Page 77: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/77.jpg)
NCI 60: DWD & Column Mean Adjusted
![Page 78: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/78.jpg)
NCI 60: Before Column Stand. Dev. Adjustment
![Page 79: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/79.jpg)
NCI 60: Before and After Column S.D. Adjustment
![Page 80: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/80.jpg)
NCI 60: Before and After Col. S.D. Adj., Rescaled
![Page 81: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/81.jpg)
NCI 60: After Column Stand. Dev. adjustment
![Page 82: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/82.jpg)
NCI 60: Fully Adjusted Data
![Page 83: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/83.jpg)
NCI 60: Fully Adjusted Data, Platform Colored
![Page 84: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/84.jpg)
NCI 60: Fully Adjusted Data, Tree View
![Page 85: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/85.jpg)
NCI 60: Fully Adjusted Data, Melanoma Cluster
BREAST.MDAMB435BREAST.MDN MELAN.MALME3M MELAN.SKMEL2 MELAN.SKMEL5 MELAN.SKMEL28 MELAN.M14 MELAN.UACC62 MELAN.UACC257
![Page 86: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/86.jpg)
NCI 60: Adjusted, Melanoma Cluster, Tree
View
All Melanoma Lines + 2 Breast Lines
in Same Cluster
![Page 87: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/87.jpg)
NCI 60: Fully Adjusted Data, Leukemia Cluster
LEUK.CCRFCEM LEUK.K562 LEUK.MOLT4 LEUK.HL60 LEUK.RPMI8266LEUK.SR
![Page 88: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/88.jpg)
NCI 60: Adjusted, Leukemia Cluster, Tree
View
![Page 89: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/89.jpg)
Another DWD Appl’n: Visualization
• Recall PCA limitations• DWD uses class info• Hence can “better separate known
classes”• Do this for pairs of classes
(DWD just on those, ignore others)• Carefully choose pairs in NCI 60 data• Shows Effectiveness of Adjustment
![Page 90: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/90.jpg)
NCI 60: Views using DWD Dir’ns (focus on
biology)
![Page 91: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/91.jpg)
DWD Visualization of NCI 60 Data
• Most cancer types clearly distinct(Renal, CNS, Ovar, Leuk, Colon, Melan)
• Using these carefully chosen directions
• Others less clear cut– NSCLC (at least 3 subtypes)– Breast (4 published subtypes)
• DWD adjustment was very effective(very few black connectors visible)
![Page 92: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/92.jpg)
PCA Visualization of NCI 60 Data
• Can we find classes:(Renal, CNS, Ovar, Leuk, Colon, Melan)
• Using PC directions??• First try “unsupervised view”• I.e. switch off class colors• Then turn on colors, to identify
clusters• I.e. look at “supervised view”
![Page 93: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/93.jpg)
NCI 60: Can we find classes
Using PCA view?
![Page 94: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/94.jpg)
NCI 60: Can we find classes
Using PCA view?
![Page 95: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/95.jpg)
PCA Visualization of NCI 60 Data
Maybe need to look at more PCs?
Study array of such PCA projections:
129
1298585
12841854141
PC
vsPC
vsvsPC
![Page 96: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/96.jpg)
NCI 60: Can we find classes
Using PCA 5-8?
X
![Page 97: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/97.jpg)
NCI 60: Can we find classes
Using PCA 5-8?
X
![Page 98: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/98.jpg)
NCI 60: Can we find classes
Using PCA 9-12?
X
![Page 99: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/99.jpg)
NCI 60: Can we find classes
Using PCA 9-12?
X
![Page 100: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/100.jpg)
NCI 60: Can we find classes
Using PCA 1-4 vs. 5-8?
X
![Page 101: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/101.jpg)
NCI 60: Can we find classes
Using PCA 1-4 vs. 5-8?
X
![Page 102: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/102.jpg)
NCI 60: Can we find classes
Using PCA 1-4 vs. 9-12?
X
![Page 103: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/103.jpg)
NCI 60: Can we find classes
Using PCA 5-8 vs. 9-12?
X
![Page 104: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/104.jpg)
PCA Visualization of NCI 60 Data
Can we find classes using PC directions??• Found some, but not others• Not so distinct as in DWD view• Nothing after 1st five PCs • Rest seem to be noise driven• Orthogonality too strong a constraint???• Interesting dir’ns are nearly orthogonal
![Page 105: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/105.jpg)
NCI 60: Views using DWD Dir’ns (focus on
biology)
![Page 106: Object Orie’d Data Analysis, Last Time](https://reader035.vdocuments.mx/reader035/viewer/2022062314/56814724550346895db45cdf/html5/thumbnails/106.jpg)
PCA Visualization of NCI 60 Data
Finally• PCA view confirms quality of data
combo• Since connecting lines relatively short• Perhaps not so short as DWD views?• Suggests DWD adjustment
focuses on data classes?• Will see this again• And explain it later