Data Visualization - A Very Rough Guide
Ken Brodlie, University of Leeds
SDMIV

What is This Thing Called Visualization?
Visualization
– “Use of computer-supported, interactive, visual representations of data to amplify cognition” (Card, Mackinlay, Shneiderman)
– Born as a discipline in 1987 with publication of NSF Report
– Now widely used in computational science and engineering
Vis5D

Visualization – Twin Subjects
Scientific Visualization
– Visualization of physical data
Information Visualization
– Visualization of abstract data
Ozone layer around earth
Automobile web site - visualizing links

Scientific Visualization – Another Characterisation
Focus is on visualizing an entity measured in a multi-dimensional space
– 1D
– 2D
– 3D
– Occasionally nD
Underlying field is recreated from the sampled data
Relationship between variables well understood – some independent, some dependent
http://pacific.commerce.ubc.ca/xr/plot.html
Image from D. Bartz and M. Meissner

Scientific Visualization Model
Visualization represented as pipeline:
– Read in data
– Build model of underlying entity
– Construct a visualization in terms of geometry
– Render geometry as image
Realised as modular visualization environment
– IRIS Explorer
– IBM Open Visualization Data Explorer (DX)
– AVS
Pipeline diagram: data → model → visualize → render
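The four pipeline stages can be sketched as composable functions. This is a minimal illustration only, not how IRIS Explorer, DX or AVS are implemented: the function names, the hard-coded samples and the piecewise-linear model are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]

def read_data() -> List[Point]:
    """Read in data: hard-coded (x, value) samples stand in for a file."""
    return [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]

def build_model(samples: List[Point]) -> Callable[[float], float]:
    """Build a model of the underlying entity by linear interpolation."""
    def field(x: float) -> float:
        for (x0, y0), (x1, y1) in zip(samples, samples[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)
                return y0 + t * (y1 - y0)
        raise ValueError("x outside sampled range")
    return field

def visualize(field: Callable[[float], float],
              x_min: float, x_max: float, n: int) -> List[Point]:
    """Construct geometry: a polyline of n points along the modelled field."""
    step = (x_max - x_min) / (n - 1)
    return [(x_min + i * step, field(x_min + i * step)) for i in range(n)]

def render(polyline: List[Point]) -> str:
    """Render geometry as an 'image': here, just a text description."""
    return " -> ".join(f"({x:.1f}, {y:.1f})" for x, y in polyline)

# data -> model -> visualize -> render
geometry = visualize(build_model(read_data()), 0.0, 2.0, 5)
image = render(geometry)
```

Because each stage only consumes the output of the previous one, any stage can be swapped for another module with the same interface, which is the property the modular environments exploit.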

Extending the SciVis Model
The dataflow model has proved extremely flexible
Provides basis of collaborative visualization
– Implemented in IRIS Explorer as the COVISA toolkit
Extensible
– User code introduced as module in pipeline allows computational steering
Diagram labels (collaboration): data, model, visualize, render; collaborative server; internet; render
Diagram labels (steering): control, simulate, visualize, render

An e-Science Demonstrator
Emergency scenario: release of toxic chemical
– Simulation launched on Grid resource, steered from desktop using IRIS Explorer
– Collaborators linked in remotely using COVISA toolkit
Dispersion of pollutant studied under varying wind directions
A collaborator links in over the network

Other Metaphors
Other user interface metaphors have been suggested
Spreadsheet interface becoming popular
Allows audit trail of visualizations
Jankun-Kelly and Ma

Information Visualization
Focus is on visualizing set of observations that are multi-variate
Example of iris data set
– 150 observations of 4 variables (length, width of petal and sepal)
– Techniques aim to display relationships between variables

Dataflow for Information Visualization
Again we can express as a dataflow – but emphasis now is on data itself rather than underlying entity
First step is to form the data into a table of observations, each observation being a set of values of the variables
Then we apply a visualization technique as before
Pipeline diagram: data → data table → visualize → render
Data table diagram: columns are variables (A, B, C); rows are observations (1, 2, ..)
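A rough sketch of the first step, forming raw records into a table with one row per observation and one column per variable. The record values, the variable names and the helper functions `to_table` and `column` are hypothetical, chosen only to echo the iris example.

```python
from typing import Dict, List

# Hypothetical raw observations (illustrative values, not real measurements).
records: List[Dict[str, float]] = [
    {"sepal_length": 5.0, "sepal_width": 3.5, "petal_length": 1.5, "petal_width": 0.2},
    {"sepal_length": 6.5, "sepal_width": 3.0, "petal_length": 4.5, "petal_width": 1.5},
    {"sepal_length": 7.0, "sepal_width": 3.0, "petal_length": 6.0, "petal_width": 2.0},
]

VARIABLES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

def to_table(records: List[Dict[str, float]],
             variables: List[str]) -> List[List[float]]:
    """One row per observation, one column per variable, in a fixed order."""
    return [[rec[v] for v in variables] for rec in records]

def column(table: List[List[float]], variables: List[str],
           name: str) -> List[float]:
    """Extract all observed values of a single variable."""
    j = variables.index(name)
    return [row[j] for row in table]

table = to_table(records, VARIABLES)
petal_lengths = column(table, VARIABLES, "petal_length")
```

Every later technique in this sketch style takes such a table as input, mirroring the "data table" box in the dataflow.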

Multivariate Visualization
Software:
– XmdvTool (Matthew Ward)
Techniques designed for any number of variables
– Glyph techniques
– Parallel co-ordinates
– Scatter plot matrices
– Pixel-based techniques
Acknowledgement: Many of the images in the following slides are taken from Ward’s work, and also from IRIS Explorer!

Glyph Techniques
Star plots
– Each observation represented as a ‘star’
– Each spike represents a variable
– Length of spike indicates the value
Variety of possible glyphs
– Chernoff faces
Crime in Detroit
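The star plot construction can be sketched as a small geometry routine: one spike per variable, evenly spaced in angle, with length proportional to the value scaled into its range. The function name `star_glyph` and its parameters are assumptions for illustration; actual tools may lay out and scale spikes differently.

```python
import math
from typing import List, Sequence, Tuple

def star_glyph(values: Sequence[float], lo: Sequence[float], hi: Sequence[float],
               cx: float = 0.0, cy: float = 0.0,
               radius: float = 1.0) -> List[Tuple[float, float]]:
    """Return the spike end points of a star glyph for one observation.

    values[k] is the observation's value of variable k; lo[k] and hi[k]
    are that variable's minimum and maximum over the whole data set.
    """
    m = len(values)
    points = []
    for k, v in enumerate(values):
        span = hi[k] - lo[k]
        # Scale the value to [0, 1]; a constant variable gets a zero-length spike.
        length = radius * ((v - lo[k]) / span if span else 0.0)
        angle = 2 * math.pi * k / m  # spikes spread evenly around the centre
        points.append((cx + length * math.cos(angle),
                       cy + length * math.sin(angle)))
    return points

spikes = star_glyph([10, 5, 0, 5], lo=[0, 0, 0, 0], hi=[10, 10, 10, 10])
```

Drawing a line from the centre to each returned point gives the spikes; joining the points in order gives the familiar star outline.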

Parallel Co-ordinates
Each variate represented as vertical axis
Axes laid out uniformly
Observation represented as a polyline traversing all M axes, crossing each axis at the observed value of the variate
Detroit homicide data (7 variables, 13 observations)
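The layout described above can be sketched as follows: place the M axes at uniform x positions, scale each variate to the axis height, and turn every observation into a polyline. The function name and the min-max scaling are assumptions for this example; the sketch needs at least two variates.

```python
from typing import List, Sequence, Tuple

def parallel_coordinates(table: Sequence[Sequence[float]],
                         width: float = 1.0,
                         height: float = 1.0) -> List[List[Tuple[float, float]]]:
    """Return one polyline per observation, crossing all M vertical axes."""
    m = len(table[0])
    columns = list(zip(*table))
    lo = [min(c) for c in columns]
    hi = [max(c) for c in columns]
    # Axes laid out uniformly across the available width.
    xs = [width * j / (m - 1) for j in range(m)]
    polylines = []
    for row in table:
        line = []
        for j, v in enumerate(row):
            span = hi[j] - lo[j]
            # Scale to the axis height; a constant variate sits mid-axis.
            y = height * ((v - lo[j]) / span if span else 0.5)
            line.append((xs[j], y))
        polylines.append(line)
    return polylines

lines = parallel_coordinates([[1, 10, 100], [3, 20, 300], [2, 30, 200]])
```

Each inner list holds M points, one per axis, so an observation's polyline crosses each axis at its scaled value of that variate.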

Scatter Plot Matrices
Matrix of 2D scatter plots
– Each plot shows projection of data onto a 2D subspace of the variates
– Order M^2 plots
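To make the plot count concrete, the panels of a scatter plot matrix can be enumerated as below. The function name and the convention that the panel in row i, column j plots variate j on x against variate i on y are assumptions for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple

def scatter_matrix_panels(
    table: Sequence[Sequence[float]], names: Sequence[str]
) -> Dict[Tuple[str, str], List[Tuple[float, float]]]:
    """Project the data onto every ordered pair of variates.

    The panel keyed (names[i], names[j]) sits in row i, column j and
    holds (x, y) points with x from variate j and y from variate i.
    """
    m = len(names)
    panels = {}
    for i, j in product(range(m), repeat=2):
        panels[(names[i], names[j])] = [(row[j], row[i]) for row in table]
    return panels

panels = scatter_matrix_panels([[1, 10, 100], [2, 20, 200]], ["A", "B", "C"])
```

With M = 3 variates this yields 9 panels, so the number of plots grows quadratically with the number of variates.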

The Screen Space Problem
All techniques, sooner or later, run out of screen space
Parallel co-ordinates
– Usable for up to 150 variates
– Unworkable for more than 250 variates
Remote sensing: 5 variates, 16,384 observations

Brushing as a Solution
Brushing selects a restricted range of one or more variables
Selection then highlighted
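Brushing can be sketched as a range filter over the data table: an observation is selected only if it lies inside the brushed range of every brushed variable. The function name `brush` and the dictionary-of-ranges interface are assumptions for this example.

```python
from typing import Dict, List, Sequence, Tuple

def brush(table: Sequence[Sequence[float]], names: Sequence[str],
          ranges: Dict[str, Tuple[float, float]]) -> List[bool]:
    """Return one flag per observation: True if it falls inside the brush.

    ranges maps a variable name to an inclusive (low, high) interval;
    variables that are not mentioned are left unrestricted.
    """
    index = {name: k for k, name in enumerate(names)}
    return [
        all(low <= row[index[name]] <= high
            for name, (low, high) in ranges.items())
        for row in table
    ]

data = [[1, 10], [2, 20], [3, 30], [4, 40]]
selected = brush(data, ["A", "B"], {"A": (2, 4), "B": (10, 30)})
```

A renderer would then draw the flagged observations in a highlight colour and the rest in a muted one.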

Clustering as a Solution
Success has been achieved through clustering of observations
Hierarchical parallel co-ordinates
– Cluster by similarity
– Display using translucency and proximity-based colour
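A minimal sketch of clustering observations by similarity, assuming single-linkage agglomerative clustering with Euclidean distance; the slides do not say which clustering method is used. The per-cluster summary (minimum, mean, maximum of each variate) is one plausible basis for drawing a cluster as a translucent band around its mean, and that display choice is also an assumption here.

```python
import math
from typing import List, Sequence, Tuple

def agglomerate(table: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Merge the two closest clusters (single linkage) until k remain.

    Returns clusters as lists of row indices into the table.
    """
    clusters = [[i] for i in range(len(table))]

    def distance(a: List[int], b: List[int]) -> float:
        return min(math.dist(table[i], table[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: distance(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return clusters

def summarize(table: Sequence[Sequence[float]],
              cluster: List[int]) -> List[Tuple[float, float, float]]:
    """Per variate: (min, mean, max) over the cluster's observations."""
    columns = list(zip(*(table[i] for i in cluster)))
    return [(min(c), sum(c) / len(c), max(c)) for c in columns]

points = [[0, 0], [0, 1], [10, 10], [10, 11]]
groups = agglomerate(points, 2)
band = summarize(points, groups[1])
```

Drawing one band per cluster instead of one polyline per observation is what reduces the clutter; this brute-force merge loop is only suitable for small tables.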

Hierarchical Parallel Co-ordinates

Reduction of Dimensionality of Variate Space
Reduce number of variables, preserve information
Principal Component Analysis
– Transform to new co-ordinate system
– Hard to interpret
Hierarchical reduction of variate space
– Cluster variables where distance between observations is typically small
– Choose representative for each cluster
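As an illustration of the Principal Component Analysis bullet, the sketch below finds the first principal component by power iteration on the sample covariance matrix, then projects each observation onto it. Power iteration is a choice made here to keep the example dependency-free; the function name, the starting vector and the iteration count are assumptions.

```python
import math
from typing import List, Sequence, Tuple

def first_principal_component(
    table: Sequence[Sequence[float]], iterations: int = 200
) -> Tuple[List[float], List[float]]:
    """Return (direction, scores) for the first principal component."""
    n, m = len(table), len(table[0])
    means = [sum(row[j] for row in table) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in table]
    # Sample covariance matrix of the variates.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(m)]
           for a in range(m)]
    # Asymmetric start vector; a start exactly orthogonal to the dominant
    # direction would not converge, which this simple sketch does not guard against.
    v = [1.0 / (j + 1) for j in range(m)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate data: no variance along v
            break
        v = [x / norm for x in w]
    # Co-ordinate of each observation along the new axis.
    scores = [sum(r[j] * v[j] for j in range(m)) for r in centred]
    return v, scores

direction, scores = first_principal_component([[1, 1], [2, 2], [3, 3]])
```

The new axis is a weighted mix of the original variables, which is why the transformed co-ordinates are hard to interpret.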

Using a Dataflow System for Information Visualization
IRIS Explorer used to visualize data from BMW
– Five variables displayed using spatial arrangement for three, colour and object type for others
– Notice the clusters…
More later..
Kraus & Ertl

Scientific Visualization – Information Visualization
Scientific Visualization
– Focus is on visualizing an entity measured in a multi-dimensional space
– Underlying field is recreated from the sampled data
– Relationship between variables well understood
Information Visualization
– Focus is on visualizing set of observations that are multi-variate
– There is no underlying field – it is the data itself we want to visualize
– The relationship between variables is not well understood