a quick overview of some visualization techniques and suggestions for projects
DESCRIPTION
A Quick Overview of Some Visualization Techniques and Suggestions for Projects. Prof. Michael McGuffin ETS, Montreal, Canada http://profs.logti.etsmtl.ca/mmcguffin/. - PowerPoint PPT PresentationTRANSCRIPT
A Quick Overview of Some Visualization Techniques and
Suggestions for Projects
Prof. Michael McGuffin
ETS, Montreal, Canada
http://profs.logti.etsmtl.ca/mmcguffin/
There are two main kinds of data we can visualize:
multidimensional data (which includes tables, mathematical
relations, and functions),and network data
(which includes trees).
Multidimensional Data:Scatterplot Matrices
Multidimensional Data
Consider students’ marks in …• Physics: 90%• Math: 95%• French Literature: 65%• History: 70%
Each student can be thought of as a tuple• (90%, 95%, 65%, 70%)• Etc.
Scatterplot Matrix (SPLOM)
Physics
Math
FrenchLiterature
History
(90%, 95%, 65%, 70%)
FrenchLiterature
Math
Scatterplot Matrix (SPLOM)
Physics
Math
FrenchLiterature
History
(90%, 95%, 65%, 70%)
(30%, 20%, 90%, 90%)
FrenchLiterature
Math
Scatterplot Matrix (SPLOM)
Nik
las
Elm
qvis
t, P
ierr
e D
ragi
cevi
c, J
ean-
Da
niel
Fek
ete
(200
8).
“Rol
ling
the
Dic
e: M
ulti
dim
ens
iona
l Vis
ual E
xplo
ratio
n u
sin
g S
catt
erpl
ot M
atrix
Na
vig
atio
n”.
Pro
cee
ding
s of
Inf
oVis
200
8.
Within each scatterplot, we could be interested in seeing outliers, correlations, etc.
Notice: the upper triangular half is the same as the lower triangular half, and the diagonal is not very interesting.
Scatterplot Matrix (SPLOM)
Wilkinson, Anand, Grossman,“Graph-Theoretic Scagnostics”, 2005
Notice: the diagonal is used to show the names of dimensions.
Example from Matlab: “carbig.mat”
http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/mvplotdemo.html
SPLOM with histograms along the diagonal. Colors indicate the number of cylinders in the engine of each automobile.
Matrix of correlation coefficientsWhen we have many dimensions, we can summarize each scatterplot by computing its correlation coefficient and displaying only that, instead of
displaying all the individual data points. The below interface also allows the user to select one scatterplot and see a zoomed-in view for details.
Jinwook Seo and Ben Shneiderman, “A Rank-by-Feature Framework for …”, Proceedings of InfoVis 2004.
Corrgrams (Michael Friendly, 2002)
http://www.math.yorku.ca/SCS/Gallery/images/corrgram2t.gif
Multidimensional Data:Parallel Coordinates
(Alfred Inselberg and others)
Multidimensional Data
Consider students’ marks in …• Physics: 90%• Math: 95%• French Literature: 65%• History: 70%
Each student can be thought of as a tuple• (90%, 95%, 65%, 70%)• Etc.
Parallel Coordinates
100%
0%
Physics MathFrenchLiterature History
(90%, 95%, 65%, 70%)
Parallel Coordinates
100%
0%
Physics MathFrenchLiterature History
(90%, 95%, 65%, 70%)
(30%, 20%, 90%, 90%)
Parallel Coordinates
Johansson et al. 2005
Parallel Coordinates
Ellis, Bertini, Dix, “The Sampling Lens …”, 2005Ellis, Dix, “Enabling Automatic Clutter Reduction …”, 2006
Multidimensional Data:Scatterplot and Parallel Coordinate
Combinations
Example from Matlab: “carbig.mat”
http://www.mathworks.com/products/statistics/demos.html?file=/products/demos/shipping/stats/mvplotdemo.html
[Lopez, McGuffin, & Barford; not yet published]
Combinations of Parallel Coordinates and Scatterplots
Steed et al. 2009 Holten and van Wijk 2010
Yuan et al. 2009
Multidimensional Data:Glyphs
The fonction y = x^0.5:
x y--- --- 0 0 1 1 4 2 9 3...
The relation in a table of a relational database:
Client_name Product_purchased Price Date ...------------- ----------------- ------- ------------ -----Robert G. Trombone 500.00 2008 mars 7 .Robert G. Partitions vol. 1 45.00 2008 mars 7 .Lucie M. Flute 180.00 2007 nov 11 .Cynthia S. Partitions vol. 2 40.00 2008 juin 16Jules T. Piano 6000.00 2008 jan 10Jules T. Partitions vol. 1 45.00 2008 jan 13...
A video (for example, an .avi file):
x y time red green blue--- --- ------- ------- ------ ------ 0 0 0 255 0 0 0 1 0 200 10 6 ... 0 0 0.1 255 50 100 0 1 0.1 255 200 190 ...
Examples of relations. A relationis a subset of a cartesian product of two or more sets. Each relation can be thought of as a multidimensional data set. In the examples here, each row is a tuple, each column is a dimension.
Chernoff faces (1973)(an example of a "glyph")
Advantage: better than text to get a global impression of the data and to find interesting elements.
Disadvantage: the mapping from variables to faces has an effect on the saliency of each variable.
Disadvantage(?): redundancy of a symmetrical face.
http
://kspa
rk.kaist.a
c.kr/Hu
ma
n%
20
En
gin
ee
ring
.files/C
he
rno
ff/life_
in_
LA
.jpg
htt
p:/
/ma
pm
ake
r.ru
tge
rs.e
du
/35
5/C
he
rno
ff_
face
.gif
Other examples of glyphs
M. Ward (2002), “A Taxonomy of Glyph Placement Strategies for Multidimensional Data Visualization”, Information Visualization.
Interactive Presentation from the U.N.(United Nations Development Programme, Human Development Report)
See also Hans Rosling’s presentation on http://www.ted.com
Notice: the points are glyphs!
Multidimensional Data:Other techniques …
The fonction y = x^0.5:
x y--- --- 0 0 1 1 4 2 9 3...
The relation in a table of a relational database:
Client_name Product_purchased Price Date ...------------- ----------------- ------- ------------ -----Robert G. Trombone 500.00 2008 mars 7 .Robert G. Partitions vol. 1 45.00 2008 mars 7 .Lucie M. Flute 180.00 2007 nov 11 .Cynthia S. Partitions vol. 2 40.00 2008 juin 16Jules T. Piano 6000.00 2008 jan 10Jules T. Partitions vol. 1 45.00 2008 jan 13...
A video (for example, an .avi file):
x y time red green blue--- --- ------- ------- ------ ------ 0 0 0 255 0 0 0 1 0 200 10 6 ... 0 0 0.1 255 50 100 0 1 0.1 255 200 190 ...
Examples of relations. A relationis a subset of a cartesian product of two or more sets. Each relation can be thought of as a multidimensional data set. In the examples here, each row is a tuple, each column is a dimension.
Red
Blue
Green
A video
Network Data:Node-Link Diagrams
Note: in the context of this discussion, the word “graph” means the same thing as “network”, i.e., a “graph” is a topological structure with nodes and edges that can be drawn many different ways. In more
general English conversation, however, the word “graph” may simply mean a picture showing a mathematical function or something similar. To avoid confusion, it is often good to use the word “network” to refer to
the topological structure.
Generated with a “SugiyamaLayout”( http://www.ogdf.net/ogdf.php/tech:howto:hierlr )
Generated by Graphviz( http://www.graphviz.org/Gallery/directed/unix.html )
How to compute the layout of a graph?
• Many algorithms are available– See “Graph Drawing” annual conference proceedings– See book “Graph Drawing: Algorithms for the Visualization
of Graphs” by Di Battista et al. (1999)– Examples: Reingold-Tilford (1981) for binary trees,
Sugiyama et al. (1981) for directed acyclic graphs (DAGs), or examples on previous slide
• Most of these algorithms are not easy to implement• One class of algorithms that are easy to implement
(in their naïve form) and that are applicable to any graph: force-directed layout
Force-Directed Layout of a Network in 3D
http://profs.logti.etsmtl.ca/mmcguffin/research/graph3D/
•Pseudo-physical simulation of masses and springs that converges toward a final layout
•The nodes are masses that are mutually repelled by an electric force
•The edges are springs
Tree Data
A tree can be thought of as simply a special kind of network.
A naïve and easy-to-program layout: each subtree has an interval in x that is not overlapped by the neighboring subtrees. A postorder depth-first-traversal combines the intervals of subtrees to yield the interval of a parent node.
A "Reingold Tilford" style layout : saves space in x by moving subtrees together as much as possible. (For details, see section 3 of Christoph Buchheim, Michael Jünger and Sebastian Leipert, "Improving Walker's Algorithm to Run in Linear Time", Proceedings of Symposium on Graph Drawing (GD) 2002, pages 344-353.) The is more complicated to program.
Another easy-to-program layout: a preorder depth-first-traversal encounters nodes in order of their y coordinates, and the x coordinate of each node is proportional to its depth.
Classical/Layered Indented Outline List
A naïve and easy-to-program layout: each subtree has an interval in x that is not overlapped by the neighboring subtrees. A postorder depth-first-traversal combines the intervals of subtrees to yield the interval of a parent node.
A "Reingold Tilford" style layout : saves space in x by moving subtrees together as much as possible. (For details, see section 3 of Christoph Buchheim, Michael Jünger and Sebastian Leipert, "Improving Walker's Algorithm to Run in Linear Time", Proceedings of Symposium on Graph Drawing (GD) 2002, pages 344-353.) The is more complicated to program.
Another easy-to-program layout: a preorder depth-first-traversal encounters nodes in order of their y coordinates, and the x coordinate of each node is proportional to its depth.
Classical/Layered Indented Outline List
A naïve and easy-to-program layout: each subtree has an interval in x that is not overlapped by the neighboring subtrees. A postorder depth-first-traversal combines the intervals of subtrees to yield the interval of a parent node.
A "Reingold Tilford" style layout : saves space in x by moving subtrees together as much as possible. (For details, see section 3 of Christoph Buchheim, Michael Jünger and Sebastian Leipert, "Improving Walker's Algorithm to Run in Linear Time", Proceedings of Symposium on Graph Drawing (GD) 2002, pages 344-353.) The is more complicated to program.
Another easy-to-program layout: a preorder depth-first-traversal encounters nodes in order of their y coordinates, and the x coordinate of each node is proportional to its depth.
Classical/Layered Indented Outline List
Trees
Michael McGuffin and Jean-Marc Robert, 2010
Trees
Treemaps(Ben Shneiderman and others)
http://www.cs.umd.edu/hcil/treemap-history/
Martin Wattenberg, 1998and
http://www.smartmoney.com/marketmap/
Marc Smith and Andrew Fiore, 2001
Focus+Context Visualizations
Fisheye lens
Image: Keahey and Robertson
3D Visualizations
3D + interaction + animations :Christopher Collins and Sheelagh Carpendale, 2007
Project Ideas on the Wiki
Possible project topics:• visualization of multidimensional data (using parallel coordinates,
using scatterplot matrix, using glyphs,...) Example data sets:– http://archive.ics.uci.edu/ml/datasets/Iris– http://vis.stanford.edu/protovis/ex/cars.html– http://vis.stanford.edu/protovis/ex/flowers.html
• visualization of one tree (genealogy tree, evolutionary tree, files on harddrive,...)– Example: http://www.cs.umd.edu/hcil/treemap-history/
• visualization of differences between two trees• visualization of one network (using node-link diagram, using
adjacency matrix,...)• visualization of differences between two networks• visualization of time series data
Example: http://leebyron.com/else/streamgraph/• a 3D visualization• a zoomable user interface
Some libraries, APIs, software, and platforms:• InfoVis Toolkit (Java) http://ivtk.sourceforge.net/• Prefuse (Java) http://prefuse.org/• Flare (ActionScript) http://flare.prefuse.org/• Protovis (JavaScript) http://vis.stanford.edu/protovis/ http://eagereyes.org/tutorials/protovis-primer-part-1• Many Eyes http://manyeyes.alphaworks.ibm.com/• VTK (C++ / Java / Python) http://vtk.org/• Processing (Java) http://processing.org
For visualizing networks• Pajek http://pajek.imfm.si/• Graphviz http://www.graphviz.org/• Tulip http://www.tulip-software.org/• Walrus http://www.caida.org/tools/visualization/walrus/• JUNG http://jung.sourceforge.net/• NetworkX (Python) http://networkx.lanl.gov/
For multidimensional data• XmdvTool http://davis.wpi.edu/~xmdv/• Spotfire http://www.cs.umd.edu/hcil/spotfire/• Polaris http://window.stanford.edu/projects/polaris/• Tableau http://www.tableausoftware.com• Mondrian http://rosuda.org/Mondrian
Some websites and blogs:• http://infosthetics.com• http://flowingdata.com• http://eagereyes.org• http://www.visualcomplexity.com/vc/blog/• http://manyeyes.alphaworks.ibm.com/blog/• http://www.gapminder.org/
Some Books about Visualization• Jacques Bertin (1967), Sémiologie graphique: Les diagrammes, Les réseaux, Les
cartes.• Jacques Bertin (1977), La graphique et le traitement graphique de l'information.• John Wilder Tukey (1977), Exploratory Data Analysis.• Edward R. Tufte (1983), The Visual Display of Quantitative Information.• Edward R. Tufte (1990), Envisioning Information.• Edward R. Tufte (1997), Visual Explanations: Images and Quantities, Evidence and
Narrative.• Di Battista, Giuseppe and Peter Eades and Roberto Tamassia and Ioannis G. Tollis
(1999), Graph Drawing: Algorithms for the Visualization of Graphs.• Leland Wilkinson (1999), The Grammar of Graphics.• Stuart K. Card and Jock D. Mackinlay and Ben Shneiderman (1999), Readings in
Information Visualization: Using Vision to Think.• Colin Ware (2000), Information Visualization: Perception for Design.• Robert Spence (2001), Information Visualization.
A bibliography of books and papers
http://profs.logti.etsmtl.ca/mmcguffin/bib/vis.txt