Data Viz for GC High Dim Data - Robert Haralick, haralick.org › DV › ...
![Page 1: Data Viz for GC High Dim Data - Robert Haralickharalick.org › DV › Data_Viz_for_GC_High_Dim_Data.pdf · High-Dimensional Data Visualization How many dimensions? ~50 –tractable](https://reader033.vdocuments.mx/reader033/viewer/2022060414/5f125de4b2e115041e74cdc7/html5/thumbnails/1.jpg)
Michael Grossberg
Data Visualization: Multi- and High-Dimensional Data
Review: Data Types
Multi-Dimensional Data
❖ Tabular data, containing
❖ rows (records)
❖ columns (dimensions)
❖ rows >> columns
❖ identifiers introduce semantics
❖ Independent & dependent variables
❖ dependent are f(independent)
Multi-Dimensional Type
Limits to Displaying Dimensions
Limit? 4D? 5D? … 10D!?!
High-Dimensional Data Visualization
❖ How many dimensions?
❖ ~50: tractable with “just” vis
❖ ~1000: need analytical methods
❖ How many records?
❖ ~1000: “just” vis is fine
❖ >> 10,000: need analytical methods
❖ Homogeneity
❖ Same data type?
❖ Same scales?
Analytic Component
no / little analytics → strong analytics component
Scatterplot Matrices
Parallel Coordinates
Pixel-based visualizations/heat maps
Multi-dimensional Scaling
LDA
Geometric Methods
Parallel Coordinates
❖ Each axis represents a dimension
❖ Lines connecting the axes represent records
❖ Suitable for
❖ all tabular data types
❖ heterogeneous data
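
As a rough illustration of the encoding described on this slide, the sketch below turns each record into a polyline with one vertex per axis. The function name `parallel_coordinate_polylines` and the sample records are made up for this example, and the per-axis min-max scaling is one common choice, not something the slides prescribe.

```python
def parallel_coordinate_polylines(rows):
    """Return one polyline per record as a list of (axis_index, height) points.

    Each column is min-max scaled to [0, 1] on its own axis, so dimensions
    with different units can sit side by side.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    polylines = []
    for row in rows:
        points = []
        for axis, value in enumerate(row):
            span = highs[axis] - lows[axis]
            # A constant column has no spread; place it mid-axis.
            height = 0.5 if span == 0 else (value - lows[axis]) / span
            points.append((axis, height))
        polylines.append(points)
    return polylines


# Hypothetical records with three heterogeneous dimensions.
records = [
    (5.1, 200, 3),
    (6.3, 150, 1),
    (4.7, 250, 2),
]
lines = parallel_coordinate_polylines(records)
```

Drawing each list of points as a connected line, with the axes placed at x = 0, 1, 2, gives the basic plot; transparency and interaction would be layered on top by the rendering library.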
D3 Parallel Coordinates
http://bl.ocks.org/jasondavies/1341281
Parallel Coordinates
❖ Shows primarily relationships between adjacent axes
❖ Limited scalability (~50 dimensions, ~1-5k records)
❖ Transparency of lines
❖ Interaction is crucial
❖ Axis reordering
❖ Brushing
❖ Filtering
❖ Algorithmic approaches:
❖ Choosing dimensions
❖ Choosing order
❖ Clustering & aggregating records
500-Axis Parallel Coordinates
Ambiguities
Star Plot/Radar Plot
❖ Similar to parallel coordinates
❖ Axes radiate from a common origin
Coekin 1969
Trellis Plots
Multiple Plots (often with shared axes)
Small Multiples
❖ Like Trellis
❖ More plots
❖ Axis may not be important
❖ Each plot point on axis
❖ Use multiple views to show different partitions of a dataset
http://andrewgelman.com/2009/07/15/hard_sell_for_b/
public support for vouchers, Andrew Gelman 2009
Combining Various Charts
Scatterplot Matrices (SPLOM)
❖ Matrix of size d*d
❖ Each row/column is one dimension
❖ Each cell plots a scatterplot of two dimensions
http://mbostock.github.io/d3/talk/20111116/iris-splom.html
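
To make the d*d layout concrete, here is a small sketch that enumerates the cells of a scatterplot matrix and collects the point pairs each cell would plot. The helper name `splom_cells`, the dimension names, and the numbers are invented for this illustration.

```python
from itertools import product


def splom_cells(rows, names):
    """Map each (row_dimension, column_dimension) cell to its (x, y) points."""
    columns = dict(zip(names, zip(*rows)))
    cells = {}
    for y_name, x_name in product(names, repeat=2):
        # The column dimension goes on x, the row dimension on y.
        cells[(y_name, x_name)] = list(zip(columns[x_name], columns[y_name]))
    return cells


# Hypothetical table with d = 3 dimensions.
dimension_names = ["length", "width", "weight"]
table = [
    (1.0, 0.5, 10.0),
    (2.0, 0.7, 14.0),
    (3.0, 0.6, 19.0),
]
cells = splom_cells(table, dimension_names)
```

With d dimensions this produces d*d cells, which is why the technique stops scaling once d grows much beyond a couple of dozen.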
Scatterplot Matrices
❖ Limited scalability (~20 dimensions, ~500-1k records)
❖ Brushing is important
❖ Often combined with a “Focus Scatterplot” as a focus+context (F+C) technique
❖ Algorithmic approaches:
❖ Clustering & aggregating records
❖ Choosing dimensions
❖ Choosing order
SPLOM Aggregation
Datavore: http://vis.stanford.edu/projects/datavore/splom/
Combining Parallel Coordinates (PCs) & SPLOMs
Claessen & van Wijk 2011
Connected Charts
http://profs.etsmtl.ca/mmcguffin/research/2012-viau-connectedCharts/viau-eurovis2012-connectedCharts.pdf
C. Viau, M. J. McGuffin 2012
http://profs.etsmtl.ca/mmcguffin/research/
Data Reduction
Reducing Rows
Sampling, Clustering, Filtering
http://www.jasondavies.com/parallel-sets/
Later
Sampling
❖ show (random) subset
❖ Efficient for large dataset
❖ For display purposes
❖ Outlier-preserving approaches
Ellis & Dix, 2006
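
A minimal sketch of the plain random-subset case follows. The helper `sample_rows` and the synthetic `dataset` are assumptions made for this example; the outlier-preserving approaches mentioned on the slide are not covered here.

```python
import random


def sample_rows(rows, k, seed=0):
    """Return k rows chosen uniformly at random without replacement."""
    if k >= len(rows):
        return list(rows)
    # A dedicated, seeded generator keeps the displayed subset reproducible.
    rng = random.Random(seed)
    return rng.sample(list(rows), k)


# Hypothetical large table, reduced to 50 rows for display.
dataset = [(i, i * i) for i in range(1000)]
subset = sample_rows(dataset, 50, seed=42)
```

Fixing the seed matters for display purposes: the same subset is shown every time the view is redrawn, so points do not flicker in and out.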
Filtering
❖ Criteria to remove data
❖ minimum variability
❖ Range for dimension
❖ consistency in replicates
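
Two of these criteria, minimum variability and a range on one dimension, can be sketched as below. The function names, the threshold, and the sample measurements are all hypothetical; the replicate-consistency criterion depends on the experimental design and is left out.

```python
from statistics import pvariance


def columns_with_min_variance(rows, min_variance):
    """Return indices of columns whose population variance reaches the threshold."""
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) >= min_variance]


def rows_in_range(rows, column, low, high):
    """Keep rows whose value in the given column lies within [low, high]."""
    return [row for row in rows if low <= row[column] <= high]


# Hypothetical measurements: column 0 is constant, column 2 barely varies.
measurements = [
    (1.0, 10.0, 5.0),
    (1.0, 20.0, 5.1),
    (1.0, 30.0, 4.9),
]
informative = columns_with_min_variance(measurements, min_variance=0.5)
selected = rows_in_range(measurements, column=1, low=15.0, high=30.0)
```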
Clustering
Clustering
http://www.jasondavies.com/parallel-sets/
Later
Dimension Reduction
Simple Random Projection
Experiments with Random Projection, Sanjoy Dasgupta, 2000
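
The idea can be sketched in a few lines: draw two random directions and take the dot product of every record with each. This is only an illustration under simple assumptions (unnormalized Gaussian directions, a hypothetical helper `random_projection_2d`, synthetic data), not the exact construction studied in the cited paper.

```python
import random


def random_projection_2d(rows, seed=0):
    """Project N-dimensional rows onto two random Gaussian directions."""
    rng = random.Random(seed)
    dims = len(rows[0])
    # Two random directions; entries are independent standard normal draws.
    directions = [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(2)]
    return [
        tuple(sum(w * x for w, x in zip(direction, row)) for direction in directions)
        for row in rows
    ]


# Hypothetical 5-dimensional data.
data = [
    (1.0, 0.0, 2.0, 1.5, 0.5),
    (0.9, 0.1, 2.1, 1.4, 0.6),
    (5.0, 4.0, 0.0, 0.2, 3.0),
]
projected = random_projection_2d(data, seed=1)
```

Each output pair can be drawn directly as a scatterplot point. Changing the seed gives a different view of the same data.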
Grand Tour
Travel a dense path on 2-plane projections in N-space
Look at all the scatterplots (movie)
Interesting Projections
❖ Search for 2D scatter plots that maximize/minimize some attribute
❖ Clumpiness
❖ Variance
❖ Non-Gaussianness (entropy)
Scagnostics
TN Dang, L Wilkinson - 2014
Look for projections which look interesting
Searching for Projections
Principal Component Analysis (PCA)
1-D mean, stdev
Mean
Variance = (Standard Deviation)²
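
The slide's own formulas did not survive in this transcript. For reference, the standard definitions for samples $x_1, \dots, x_n$ are given below, assuming the population form with a $1/n$ factor; the original slide may have used $1/(n-1)$ instead.

$$
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2,
\qquad
\sigma = \sqrt{\sigma^2}
$$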
Normal (1-D ) distribution
N-D mean, (co-)variance
N-D Mean
N-D Covariance
Matrix Versions
rows: observations
columns: dimensions/attributes
N-D Covariance
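
The matrix formulas themselves are missing from the transcript. A standard way to write them, assuming a data matrix $X \in \mathbb{R}^{n \times d}$ whose rows are the observations $x_i^\top$, $\mathbf{1}$ the length-$n$ vector of ones, and a $1/n$ normalization (the slide may use $1/(n-1)$), is:

$$
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{1}{n} X^\top \mathbf{1},
\qquad
\Sigma = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^\top
       = \frac{1}{n}\,(X - \mathbf{1}\mu^\top)^\top (X - \mathbf{1}\mu^\top)
$$

Here $\mu$ is a $d$-vector and $\Sigma$ is a symmetric $d \times d$ matrix whose diagonal holds the per-dimension variances.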
N-D Normal Distribution
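
The density shown on this slide is not preserved in the transcript. The standard form for a $d$-dimensional normal distribution with mean vector $\mu$ and covariance matrix $\Sigma$ (assumed invertible) is:

$$
p(x) = \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma\rvert^{1/2}}
\exp\!\left(-\tfrac{1}{2}\,(x - \mu)^\top \Sigma^{-1} (x - \mu)\right)
$$

For $d = 1$, with $\Sigma = \sigma^2$, this reduces to the familiar 1-D density $\frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$.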
Diagonalization
Symmetric matrix
Diagonalization (SVD)
Eigenvectors
Eigenvalues
Variances
First principal component
Eigenvectors = Principal Components
First principal component = direction of greatest variance
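
One way to see this numerically is to build the covariance matrix and find its dominant eigenvector. The sketch below uses power iteration in plain Python instead of a library SVD; the function names, the $1/n$ normalization, the random starting vector, and the sample points are choices made for this example only.

```python
import math
import random


def covariance_matrix(rows):
    """Covariance of the columns of `rows`, using a 1/n normalization."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    centered = [[row[j] - means[j] for j in range(dims)] for row in rows]
    return [
        [sum(c[a] * c[b] for c in centered) / n for b in range(dims)]
        for a in range(dims)
    ]


def first_principal_component(rows, iterations=200, seed=0):
    """Return (unit direction, variance along it) via power iteration."""
    cov = covariance_matrix(rows)
    dims = len(cov)
    rng = random.Random(seed)
    # A random start is almost never orthogonal to the dominant eigenvector.
    vector = [rng.gauss(0.0, 1.0) for _ in range(dims)]
    norm = math.sqrt(sum(v * v for v in vector))
    vector = [v / norm for v in vector]
    for _ in range(iterations):
        image = [sum(cov[a][b] * vector[b] for b in range(dims)) for a in range(dims)]
        norm = math.sqrt(sum(p * p for p in image))
        if norm == 0.0:
            break  # zero covariance: every direction has zero variance
        vector = [p / norm for p in image]
    # The Rayleigh quotient v^T C v is the variance along the direction.
    image = [sum(cov[a][b] * vector[b] for b in range(dims)) for a in range(dims)]
    variance = sum(v * p for v, p in zip(vector, image))
    return vector, variance


# Hypothetical points lying exactly on the line y = 2x.
points = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, variance = first_principal_component(points)
```

For these points the direction comes out parallel to (1, 2), up to sign, which is the line the data lies on. In practice a library eigendecomposition or SVD would replace the hand-written loop.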
Least Squares
Data Oriented Coordinates
http://georgemdallas.wordpress.com/2013/10/30/principal-component-analysis-4-dummies-eigenvectors-eigenvalues-and-dimension-reduction/
PCA in Vis Reduction
PCA from Harvard CS171 student project
http://mu-8.com Mercer and Pandian
Many other Linear Component Reductions
Linear Discriminant Analysis (LDA)
Separate classes
Non-negative Matrix Factorization
Also
Factor Analysis
Dictionary Learning
Latent Semantic Analysis (LSA)
Projection Pursuit
Independent Component Analysis (ICA)
Make components independent
Maximize Entropy/Non-Gaussianness
Non-Linear Dimension Reduction
Multi-Dimensional Scaling (MDS)
Kernel PCA
Non-Linear map to lower dimensional space
Preserve distances between observations
Manifold Learning
Pixel Based Methods
Daniel Keim
❖ Pixel Based Methods
Designing Pixel-Oriented Visualization Techniques: Theory and Applications
Daniel A. Keim, 2000
3D Pitfall: Occlusion & Perspective
Gehlenborg and Wong, Nature Methods, 2012
3D Pitfall: Occlusion & Perspective
Gehlenborg and Wong, Nature Methods, 2012
Pixel Based Displays
❖ Each cell is a “pixel”, value encoded in color / value
❖ Meaning derived from ordering
❖ If no ordering inherent, clustering is used
❖ Scalable: 1 px per item
❖ Good for homogeneous data
❖ same scale & type
Bad Color Mapping
Good Color Mapping
Color is relative!
Machine Learning
❖ Supervised Learning
❖ A.K.A. Classification
❖ Known labels (subset of rows)
❖ Algorithms: label unlabeled rows
❖ Unsupervised Learning
❖ A.K.A. Clustering
❖ Algorithm: label based on similarity
❖ Semi-Supervised Learning
❖ Do both
Clustering
Partition Hierarchical
Bi-Partite Fuzzy
Machine Learning
❖ Supervised Learning
❖ A.K.A. Classification
❖ Known labels (subset of rows)
❖ Algorithms: label unlabeled rows
❖ Unsupervised Learning
❖ A.K.A. Clustering
❖ Algorithm: label based on similarity
❖ Semi-Supervised Learning
❖ Do both
Partition Clustering
Each Point in a Unique Class
Centroid Based Clustering
Example Algorithm: K-Means Clustering
Partition
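
A compact sketch of K-means (Lloyd's algorithm) is shown below to make the centroid-based idea concrete. The function names, the caller-supplied starting centroids, and the sample points are assumptions for this example; real implementations typically add smarter initialization and multiple restarts.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, initial_centroids, iterations=20):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return labels, centroids


# Hypothetical 2-D points forming two well-separated groups.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centroids = k_means(samples, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
```

Every point ends up with exactly one label, which is what makes this a partition clustering.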
Distribution Based Clustering
Expectation Maximization (EM)
Partition
Density Based Clustering
Group areas with Certain Density
OPTICS (more general)
DBSCAN
Mean Shift
Partition
Simple Graph Cutting Methods
(1) Similarity Score
(2) Pick Edge Threshold
(3) Cut (only connect stronger edges)
(4) Compute Connected components
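
The four steps on this slide can be sketched as follows. The similarity function (inverse of one plus distance), the threshold value, and the 1-D sample values are assumptions picked for the illustration; any pairwise similarity score could be plugged in.

```python
def threshold_clusters(items, similarity, threshold):
    """Label items by connected components of the thresholded similarity graph."""
    n = len(items)
    # Steps 1-3: score every pair and keep only edges at or above the threshold.
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if similarity(items[i], items[j]) >= threshold:
                neighbors[i].append(j)
                neighbors[j].append(i)
    # Step 4: connected components via depth-first search.
    labels = [None] * n
    current = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbors[node]:
                if labels[nxt] is None:
                    labels[nxt] = current
                    stack.append(nxt)
        current += 1
    return labels


def inverse_distance(a, b):
    """Hypothetical similarity score: 1 for identical values, falling toward 0."""
    return 1.0 / (1.0 + abs(a - b))


values = [0.0, 0.5, 1.0, 10.0, 10.4]
component_labels = threshold_clusters(values, inverse_distance, threshold=0.6)
```

Note that 0.0 and 1.0 end up in the same cluster even though their direct similarity is below the threshold, because both are connected through 0.5. This chaining behavior is characteristic of the connected-components approach.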
Partition
Graph Cut Based
Spectral Clustering
Partition
Hierarchical Clustering
Hierarchical
Example: Ward Clustering
Comparison
Different Clustering Based on Different Assumptions
Bipartite Clustering (Biclustering)
Find Blocks in Data Matrices
Bi-Partite
Fuzzy Clustering
Fuzzy
Membership in Cluster is Floating Point
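
To illustrate floating-point membership, the sketch below computes the membership weights used in fuzzy c-means for a single point, given fixed cluster centers. The slide does not name a specific algorithm, so the choice of fuzzy c-means, the helper name `fuzzy_memberships`, the fuzziness parameter, and the sample values are all assumptions.

```python
import math


def fuzzy_memberships(point, centers, fuzziness=2.0):
    """Return one membership weight per center; the weights sum to 1.

    Uses the fuzzy c-means rule u_j = d_j^(-e) / sum_k d_k^(-e)
    with e = 2 / (fuzziness - 1), where d_j is the distance to center j.
    """
    distances = [math.dist(point, center) for center in centers]
    if 0.0 in distances:
        # A point sitting exactly on a center belongs fully to it.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (fuzziness - 1.0)
    weights = [d ** -exponent for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical point at distance 1 from the first center and 3 from the second.
memberships = fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

Instead of a single label, the point gets a weight for every cluster, with the nearer center receiving the larger share.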
Clustering Applications
❖ Clusters can be used to
❖ order (pixel based techniques)
❖ brush (geometric techniques)
❖ aggregate
❖ Aggregation
❖ cluster more homogeneous than whole dataset
❖ statistical measures, distributions, etc. more meaningful
http://www.jasondavies.com/parallel-sets/
Clustered Heat Map
F+C Approach, with Dendrograms
Lex, PacificVis, 2010
Cluster Comparison
Caleydo Matchmaker
Lex, Streit, Partl, Kashofer, Schmalstieg 2010
https://www.youtube.com/watch?v=vi-G3LqHFZA
Aggregation
http://people.seas.harvard.edu/~alex/papers/2011_infovis_visbricks.pdf
VisBricks
Heterogeneous Data
Heterogeneous Variables
https://www.youtube.com/watch?v=s2ZofJ2GVHU
Caleydo StratomeX
Glyphs
Chernoff Faces
http://www.nytimes.com/2008/04/01/science/01prof.html
Fail?
Complex Glyphs for Bio Workflows
Maguire 2012