Graph Visualization: Extensions
Presented by Dave Fuhry and Yang Zhang
Outline
• Some Visualization Tools
• Why visualization? (Re-motivation)
• Challenges
• Information Visualization Data Types
• TreeMaps
• Handling high dimension
– PCA, Co-Clustering, Parallel Coordinates, Grand Tour
• PRISM-HD: APSS plot, CSV
• Applications 1: Disaster (Geodesic, content)
• Applications 2: Social Network Analysis
Some Visualization Tools
Gephi
Prefuse
Gnuplot
GraphViz
matplotlib
NodeXL
Pajek
d3
Sigma.js
Cobweb
InfoViz
Cytoscape
Guess
NetworkX
Force-Directed Graph
Interactive
GUI
Weka
Orange
Outline
• Same challenges as with graph layout
• Layout
– Representing items, their attributes, and structure.
• Scale
– “Pixel wall”, but Big Data scales to billions of records.
– Shneiderman ’08: Billion records into a Million pixels
• Interaction
– Enable user to explore and get insight
    Set A          Set B          Set C          Set D
  X      Y       X      Y       X      Y       X      Y
 10   8.04      10   9.14      10   7.46       8   6.58
  8   6.95       8   8.14       8   6.77       8   5.76
 13   7.58      13   8.74      13  12.74       8   7.71
  9   8.81       9   8.77       9   7.11       8   8.84
 11   8.33      11   9.26      11   7.81       8   8.47
 14   9.96      14   8.10      14   8.84       8   7.04
  6   7.24       6   6.13       6   6.08       8   5.25
  4   4.26       4   3.10       4   5.39      19  12.50
 12  10.84      12   9.11      12   8.15       8   5.56
  7   4.82       7   7.26       7   6.42       8   7.91
  5   5.68       5   4.74       5   5.73       8   6.89
[Anscombe 73]
Summary Statistics: μX = 9.0, σX = 3.317, μY = 7.5, σY = 2.03
Linear Regression: Y = 3 + 0.5 X, R² = 0.67
Slides courtesy: Jeffrey Heer @ Stanford: A Brief Introduction to Data Visualization
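The point of the four sets is that they share nearly identical summary statistics while looking very different when plotted. A minimal sketch that recomputes the statistics from the values transcribed on this slide; the function name `summarize` and the dictionary layout are illustrative choices, not part of the original deck.

```python
from math import sqrt

# Values transcribed from the table on this slide [Anscombe 73].
X_ABC = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
SETS = {
    "A": (X_ABC, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "B": (X_ABC, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.11, 7.26, 4.74]),
    "C": (X_ABC, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "D": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Mean, sample standard deviation, least-squares line, and R^2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "sd_x": sqrt(sxx / (n - 1)),
        "sd_y": sqrt(syy / (n - 1)),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
        "r2": sxy ** 2 / (sxx * syy),
    }


for name, (xs, ys) in SETS.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

All four sets should print statistics that round to roughly the same numbers, which is why the plots, not the summaries, reveal the differences.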
[Figure: four scatterplots, one each for Set A, Set B, Set C, and Set D, plotting Y against X]
Slides courtesy: Jeffrey Heer @ Stanford: A Brief Introduction to Data Visualization
Slide courtesy: Ben Shneiderman @ UMD: Information Visualization for Knowledge Discovery.
Wattenberg 1998
http://www.smartmoney.com/map-of-the-market/
[Shneiderman ’92]
Wattenberg 1998
rectangle size: market cap (Q)
rectangle position: market sector (N), market cap (Q)
color hue: loss vs. gain (N, O)
color value: magnitude of loss or gain (Q)
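To make the "rectangle size: market cap" encoding concrete, here is a minimal sketch of a slice-and-dice treemap layout, the simple alternating-direction scheme commonly associated with the original treemap work. It is not claimed to be the layout used by the map cited above, and the sector names and sizes below are made up purely for illustration.

```python
def total_size(node):
    """Size of a leaf, or the summed size of all leaves under a group."""
    if "children" in node:
        return sum(total_size(child) for child in node["children"])
    return node["size"]


def slice_and_dice(node, x, y, w, h, depth=0):
    """Return [(name, x, y, w, h)] per leaf; leaf area is proportional to size."""
    if "children" not in node:
        return [(node["name"], x, y, w, h)]
    rects = []
    total = total_size(node)
    offset = 0.0  # fraction of the parent rectangle already used
    for child in node["children"]:
        frac = total_size(child) / total
        if depth % 2 == 0:
            # Even depth: place children side by side, left to right.
            rects += slice_and_dice(child, x + offset * w, y, frac * w, h, depth + 1)
        else:
            # Odd depth: stack children top to bottom.
            rects += slice_and_dice(child, x, y + offset * h, w, frac * h, depth + 1)
        offset += frac
    return rects


# Hypothetical data: two sectors, three companies, made-up sizes.
market = {"name": "market", "children": [
    {"name": "Tech", "children": [{"name": "T1", "size": 60},
                                  {"name": "T2", "size": 20}]},
    {"name": "Energy", "children": [{"name": "E1", "size": 20}]},
]}

for rect in slice_and_dice(market, 0, 0, 100, 100):
    print(rect)
```

Position falls out of the hierarchy (sector first, then company), while color hue and value would be assigned separately from the gain or loss of each leaf.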
Dimensionality Reduction
• Multidimensional scaling, e.g. PCA
• Self-organizing map
Image credit: Matthias Scholz, http://www.nlpca.org/
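As an illustration of the PCA bullet, the sketch below projects high-dimensional rows onto their top principal axes using only the standard library. Power iteration with deflation is swapped in for a full eigen-decomposition to keep the example dependency-free; the function name `pca_project` and the toy data are assumptions for illustration.

```python
import random


def pca_project(rows, k=2, iters=200, seed=0):
    """Return (axes, coords): the top-k principal axes and the projected rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    axes = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            # Multiply by the covariance matrix ...
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            # ... then remove any component along axes already found.
            for u in axes:
                dot = sum(wi * ui for wi, ui in zip(w, u))
                w = [wi - dot * ui for wi, ui in zip(w, u)]
            norm = sum(wi * wi for wi in w) ** 0.5
            v = [wi / norm for wi in w]
        axes.append(v)
    coords = [[sum(c[j] * u[j] for j in range(d)) for u in axes] for c in centered]
    return axes, coords


# Toy 3-D data lying close to the direction (1, 2, 0).
rows = [[i, 2 * i + ((-1) ** i) * 0.1, (i % 3) * 0.01] for i in range(10)]
axes, coords = pca_project(rows, k=2)
print([round(a, 3) for a in axes[0]])
```

The sketch assumes the data has at least k directions of nonzero variance; otherwise the normalization step would divide by zero.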
Parallel Coordinates
• Draw a vertical axis for each dimension
• Each item is drawn as a line crossing every axis at its value
Figures from Xiang, Fuhry, Jin, Zhao, Huang: Visualizing Clusters in Parallel Coordinates…, PAKDD ’12
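The parallel coordinates construction can be sketched as a function that turns each item into a polyline: one evenly spaced vertical axis per dimension, with each value rescaled to that axis's min-max range. The function name and the min-max scaling are illustrative assumptions; the cited paper's cluster rendering is not reproduced here.

```python
def parallel_coordinates(items, width=1.0, height=1.0):
    """Return one polyline [(x, y), ...] per item (needs at least 2 dimensions)."""
    d = len(items[0])
    lows = [min(item[j] for item in items) for j in range(d)]
    highs = [max(item[j] for item in items) for j in range(d)]
    # One vertical axis per dimension, evenly spaced across the width.
    axis_x = [width * j / (d - 1) for j in range(d)]
    polylines = []
    for item in items:
        line = []
        for j in range(d):
            span = highs[j] - lows[j]
            # Rescale to [0, 1]; a constant dimension sits at mid-height.
            t = (item[j] - lows[j]) / span if span else 0.5
            line.append((axis_x[j], t * height))
        polylines.append(line)
    return polylines


items = [[1, 10, 100], [3, 20, 300], [2, 30, 200]]
for line in parallel_coordinates(items):
    print(line)
```

Each returned polyline can be handed directly to a line-drawing call in any plotting tool.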
Grand Tour
• Visualize high-dimensional data (HDD) with 2D scatterplots
• “Tour” randomly generated planes
• Smooth transition
[Asimov ’83] [Buja, Cook, Asimov, Hurley ’04]
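A rough sketch of the three bullets: pick random 2D planes, move between them in small steps, and project the data onto each intermediate plane. For simplicity it blends the basis vectors linearly and re-orthonormalizes with Gram-Schmidt, which is an approximation chosen for this sketch rather than the interpolation used in the cited work; all function names are hypothetical.

```python
import random


def orthonormalize(u, v):
    """Gram-Schmidt: return an orthonormal pair spanning the same plane."""
    norm_u = sum(a * a for a in u) ** 0.5
    u = [a / norm_u for a in u]
    dot = sum(a * b for a, b in zip(u, v))
    v = [b - dot * a for a, b in zip(u, v)]
    norm_v = sum(b * b for b in v) ** 0.5
    return u, [b / norm_v for b in v]


def random_plane(d, rng):
    """Random orthonormal pair (u, v) defining a 2D plane in d dimensions."""
    return orthonormalize([rng.gauss(0, 1) for _ in range(d)],
                          [rng.gauss(0, 1) for _ in range(d)])


def tour_frames(d, n_planes=3, steps=10, seed=0):
    """Yield (u, v) frames that move in small steps between random planes."""
    rng = random.Random(seed)
    current = random_plane(d, rng)
    for _ in range(n_planes):
        target = random_plane(d, rng)
        for s in range(1, steps + 1):
            t = s / steps
            # Linear blend of the two frames, then re-orthonormalize.
            u = [(1 - t) * a + t * b for a, b in zip(current[0], target[0])]
            v = [(1 - t) * a + t * b for a, b in zip(current[1], target[1])]
            yield orthonormalize(u, v)
        current = target


def project(point, frame):
    """2D scatterplot coordinates of a d-dimensional point in a frame."""
    u, v = frame
    return (sum(p * a for p, a in zip(point, u)),
            sum(p * b for p, b in zip(point, v)))


frames = list(tour_frames(d=6, n_planes=2, steps=5))
print(project([1, 0, 0, 0, 0, 0], frames[0]))
```

Drawing the projected points frame by frame gives the animated "tour"; smaller steps give smoother transitions.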
Grand Tour (Demo)
Projection of a grand tour of six-dimensional data. Source: GGobi software.
Social networks Protein Interactions Internet
VLSI networks Data dependencies Neighborhood graphs
PRISM-HD
• What?
– A novel mechanism for exploring complex data
• Why?
– User is often overwhelmed with characteristics of data
– Befuddled on where to start
• How?
– Given a similarity measure of interest
– Compute similarity graph at threshold (t)
• Key: Graphs are dimensionless
– Provide user graph visualization cues
• User determines next threshold and repeats
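The "How?" loop can be sketched as follows: build the similarity graph at a threshold t, report a few cues about it, and let the user pick the next threshold. This is only an illustration under stated assumptions: cosine similarity stands in for the similarity measure of interest, the all-pairs search is brute force, and the cues (edge count, density, connected components) are hypothetical choices, not the cues PRISM-HD actually provides.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity (assumed measure; any similarity function works)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity_graph(items, t, sim=cosine):
    """Edges (i, j, s) for every pair with similarity s >= t (brute force)."""
    edges = []
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            s = sim(items[i], items[j])
            if s >= t:
                edges.append((i, j, s))
    return edges


def graph_cues(n, edges):
    """Hypothetical cues: edge count, density, connected components."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, j, _ in edges:
        parent[find(i)] = find(j)
    components = len({find(a) for a in range(n)})
    density = len(edges) / (n * (n - 1) / 2) if n > 1 else 0.0
    return {"edges": len(edges), "density": density, "components": components}


items = [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0.9, 0.1], [0, 0, 1]]
for t in (0.9, 0.5, 0.05):  # high, moderate, low threshold
    print(t, graph_cues(len(items), similarity_graph(items, t)))
```

Lowering t adds edges and merges components, which is the kind of change a user would watch for when choosing the next threshold.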
[Figure: similarity graphs at HIGH THRESHOLD, MODERATE THRESHOLD, and LOW THRESHOLD]
Applications 1: Disaster Mgmt / Geodesic Overlays
Applications 2: Disaster Mgmt / Community Analysis
[Fuhry, Ruan, and Parthasarathy. WebSci’12]
Applications 3: Social Network Analysis
Applications 3: Social Network Analysis (2)
Appendix
Nominal, Ordinal and Quantitative
N - Nominal (labels)
– Fruits: Apples, oranges, …
O - Ordinal (rank-ordered)
– Quality of meat: Grade A, AA, AAA
Q - Interval (location of zero arbitrary)
– Dates: Jan 19, 2006; Location: (LAT 33.98, LONG -118.45)
– Like a geometric point. Cannot compare directly
– Only differences (i.e. intervals) may be compared
Q - Ratio (zero fixed)
– Physical measurement: Length, Mass, Temp, …
– Counts and amounts
– Like a geometric vector, origin is meaningful
S. S. Stevens, On the Theory of Scales of Measurement, 1946
Slide courtesy: Jeffrey Heer @ Stanford: A Brief Introduction to Data Visualization
[Figure: data cube with three dimensions: Age (0-19, 20-39, 40-59, 60+), Marital Status (Single, Married, Divorced, Widowed), and Year (1970, 1980, 1990, 2000). Sum along Marital Status gives All Marital Status; Sum along Age gives All Ages; Sum along Year gives All Years.]
Slide courtesy: Jeffrey Heer @ Stanford: A Brief Introduction to Data Visualization
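The roll-ups named in the data cube figure (sum along Marital Status, Age, or Year) amount to dropping one coordinate from each cell key and adding up the cells that collide. A minimal sketch, assuming the cube is stored as a dictionary keyed by (age, marital status, year); the counts below are made up for illustration and are not from the slide.

```python
from collections import defaultdict


def roll_up(cube, axis):
    """Sum a cube {key_tuple: count} along one axis, dropping that coordinate."""
    out = defaultdict(int)
    for key, count in cube.items():
        reduced = key[:axis] + key[axis + 1:]
        out[reduced] += count
    return dict(out)


# Hypothetical counts keyed by (age, marital status, year).
cube = {
    ("0-19", "Single", 1970): 5,
    ("0-19", "Married", 1970): 1,
    ("20-39", "Single", 1970): 3,
    ("20-39", "Married", 1980): 4,
}

by_age_year = roll_up(cube, axis=1)    # sum along Marital Status
by_year = roll_up(by_age_year, axis=0)  # then sum along Age
print(by_age_year)
print(by_year)
```

Applying `roll_up` once per dimension collapses the cube to a single grand total, matching the "All Ages / All Marital Status / All Years" corner of the figure.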
Position (x 2), Size, Value, Texture, Color, Orientation, Shape
Visual encoding variables
Position, Length, Area, Volume, Value, Texture, Color, Orientation, Shape, Transparency, Blur / Focus …
Visual encoding variables
Image courtesy: “Jer” of blprnt.com. “Just Landed”