Making sense of data visually:
A modern look at data visualization
VLADIMIR MILEV
NEW VENTURE SOFTWARE
Author Bio
Vladimir Milev
MCPD Enterprise
Speaker (DevReach, NTK Slovenia, and others)
DV Evangelist
Founder at New Venture Software
@vmilev
www.linkedin.com/in/vladimirmilev/
http://www.newventuresoftware.com/
Agenda
1. Big data and information overload
2. What problems DataViz solves
3. DataViz fundamental theory
4. Basic visualizations
5. Advanced visualizations
Information Overload
Twitter: 500 million tweets per day
Facebook: 55 million status updates per day
Facebook: 900 million interactions per day (comments, likes etc.)
Reddit:
Proliferation of smart devices
We are already living in a world dominated by smart devices
What does this mean?
◦ More connected, data is more accessible
◦ Less space for tables and text
◦ Must use visual communication
Making Sense of Data
Increasing amount of data available
Increasing number of data consumer devices
Obtaining data no longer a problem
We have an Information Overload issue
Quick data analysis is the new problem
But how quick?
A picture is worth 1,000 words
“With about 1,000,000 ganglion cells, the human retina would transmit data at roughly the rate of an Ethernet connection, or 10 million bits per second.”
- Vijay Balasubramanian, PhD, Professor of Physics at the University of Pennsylvania
OK – That’s a lot of bandwidth
BUT ARE WE USING IT EFFICIENTLY?
Efficiency
The best readers usually read up to about 300 words per minute
Average word length is 5.1 letters
300 * 5.1 = 1530 characters per minute
Or 1530 / 60 = 25.5 characters per second, rounded up to 26
1 character is usually stored as 8 bits
26 * 8 = 208 bits per second
Reading bandwidth is ~0.025 KiB/s
Or 0.00208% efficiency compared with the retina's 10 million bits per second
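The arithmetic on this slide can be checked with a short script. This is only a sketch: every input is a figure quoted on the slides, the constant names are made up for illustration, and rounding 25.5 up to 26 follows the slide's own step.

```python
import math

# Inputs taken from the slides (the names are illustrative, not from the talk).
WORDS_PER_MINUTE = 300               # upper bound for the best readers
AVG_WORD_LENGTH = 5.1                # letters per word
BITS_PER_CHAR = 8                    # one character stored as 8 bits
RETINA_BITS_PER_SECOND = 10_000_000  # figure from the quote on the earlier slide

chars_per_minute = WORDS_PER_MINUTE * AVG_WORD_LENGTH
chars_per_second = chars_per_minute / 60

# The slide rounds 25.5 characters per second up to 26 before converting to bits.
reading_bits_per_second = math.ceil(chars_per_second) * BITS_PER_CHAR

# Bits -> bytes -> KiB.
kib_per_second = reading_bits_per_second / 8 / 1024

# Share of the retina's bandwidth that reading uses, as a percentage.
efficiency_percent = reading_bits_per_second / RETINA_BITS_PER_SECOND * 100

print(f"{reading_bits_per_second} bits/s")
print(f"{kib_per_second:.3f} KiB/s")
print(f"{efficiency_percent:.5f} % of retinal bandwidth")
```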
So reading clearly isn’t the way to go…
BUT WHAT IS THE SOLUTION?
Using statistics
For most of the 20th century
Using the arithmetic mean (average), standard deviation
Variance, correlations, regressions
Turns out this is not good enough
Anscombe’s Quartet
    I            II           III          IV
 x      y     x      y     x      y     x      y
10   8.04    10   9.14    10   7.46     8   6.58
 8   6.95     8   8.14     8   6.77     8   5.76
13   7.58    13   8.74    13  12.74     8   7.71
 9   8.81     9   8.77     9   7.11     8   8.84
11   8.33    11   9.26    11   7.81     8   8.47
14   9.96    14   8.10    14   8.84     8   7.04
 6   7.24     6   6.13     6   6.08     8   5.25
 4   4.26     4   3.10     4   5.39    19  12.50
12  10.84    12   9.13    12   8.15     8   5.56
 7   4.82     7   7.26     7   6.42     8   7.91
 5   5.68     5   4.74     5   5.73     8   6.89
• Statistical properties are nearly identical:
◦ The mean of x (9.0) and the mean of y (about 7.5) are the same in all four sets
◦ Nearly the same variances, correlations and regression lines
• As far as summary statistics are concerned, these sets are almost the same
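The claim that the four sets look the same to summary statistics can be checked directly. The sketch below uses the values from the table and computes means, sample variances, Pearson correlation and the least-squares regression line with plain Python. The function and variable names are illustrative choices, not part of the talk.

```python
from math import sqrt

# Data copied from the table above; sets I-III share the same x values.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the summary statistics the slide refers to for one data set."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # least-squares fit of y = intercept + slope * x
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),  # sample variance
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),  # Pearson correlation
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in summaries.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Rounded to two decimals, the four rows of output are nearly indistinguishable, which is why the differences only become obvious once the sets are plotted.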
Anscombe’s Quartet
So DataViz is very powerful
But why does it work so well?
Gestalt Psychology
Seeing with the brain
The mind understands external stimuli as a whole rather than as the sum of their parts
We tend to order our experience in a manner that is regular, orderly, symmetric, and simple
Key principles of gestalt: reification, multistability, invariance
Gestalt laws of grouping: proximity, similarity, closure, symmetry
Gestalt Principles - Reification
Our minds tend to construct/generate information
Gestalt Principles - Multistability
The tendency of our mind to jump back and forth between ambiguous alternative interpretations
Examples: Spinning Girl, Rubin Vase
Gestalt Principles - Invariance
The tendency to perceive simple geometric objects independent of rotation, translation, and scale
Also elastic deformations, different lighting, and different component features
Gestalt Laws of Grouping - Similarity
We group objects based on visual similarity
Gestalt Laws of Grouping - Proximity
We group items based on spatial proximity
Gestalt Laws of Grouping - Closure
We perceive objects such as shapes, letters, pictures, etc., as being whole when they are not complete
Application in Data Visualization
Introducing the visual variables
Fundamental properties of objects which can encode information into a picture
Fundamental visual variables:
◦ Position
◦ Size
◦ Color
◦ Shape
◦ Orientation
Basis for all Data Visualization!
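One way to picture the five visual variables is as the output of a mapping from data fields to the properties of a drawn mark. The sketch below is purely illustrative: the `Mark` class, the field names (`age`, `income`, `population`, `gender`, `region`, `trend`), the value ranges and the colors are all assumptions made up for this example, not something from the talk.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    """One drawn object, described only by the five visual variables."""
    x: float            # position (horizontal, pixels)
    y: float            # position (vertical, pixels)
    size: float         # size (radius, pixels)
    color: str          # color
    shape: str          # shape
    orientation: float  # orientation (degrees)


def linear_scale(value, domain, out_range):
    """Map value from the data domain onto the output range linearly."""
    (d0, d1), (r0, r1) = domain, out_range
    return r0 + (value - d0) / (d1 - d0) * (r1 - r0)


# Hypothetical lookup tables for the categorical fields.
COLOR_BY_GENDER = {"female": "orange", "male": "teal"}
SHAPE_BY_REGION = {"north": "circle", "south": "square"}
ANGLE_BY_TREND = {"up": 0.0, "down": 180.0}


def encode(record):
    """Turn one data record into a mark; each field drives one visual variable."""
    return Mark(
        x=linear_scale(record["age"], (0, 100), (0, 400)),
        y=linear_scale(record["income"], (0, 100_000), (300, 0)),  # screen y grows downward
        size=linear_scale(record["population"], (0, 1000), (4, 20)),
        color=COLOR_BY_GENDER[record["gender"]],
        shape=SHAPE_BY_REGION[record["region"]],
        orientation=ANGLE_BY_TREND[record["trend"]],
    )


# A made-up record, used only to show the mapping.
example = encode({
    "age": 25, "income": 50_000, "population": 250,
    "gender": "female", "region": "north", "trend": "down",
})
print(example)
```

The chart types on the following slides can be read as different choices of which field drives which variable.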
Basic/Common Visualizations
Bar graphs
Line graphs
Area charts
Pie charts
Bar Graphs
• Using color correctly to encode gender
• Using position (ordering) to create an orderly scale
• Using size to encode the values
• Using orientation to differentiate gender again
Bar Graphs continued
• Labels are used
• Color is neutral and does not encode information
• Again, we have top-down ordering (position)
• And again size encodes the relative numeric value
Bars and Normal Distribution
Minimum passing grade
• Distribution of test scores for Polish “Matura” exam
• Normal Distribution is expected
• Red line shows normal distribution
• 30 is the minimum passing grade
• Detecting behavioral changes
• What happened?
Line Graphs
Confirming what we already know: paper media is declining rapidly.
• Shape encodes the value
• Color is not significant
• Design goal is to show a trend/change
Area Graphs
Effect of school year on Team Fortress 2 players
School starts
• Similar to line graph
• The design goal for area charts is to emphasize the value/quantity, not so much the trend
• You can see both
• Color has no meaning
Area Graphs continued
• This time color carries a meaning (legend)
• The graph is also good for displaying the ratio between data series over time
Pie Charts
Pie Charts
Golden Rules for Pie Charts
• Show the ratio of one piece to the whole
• Order the values
• Fewer than 6 pieces
• Avoid legends
• Sum up to 100%
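Three of these rules are mechanical enough to check before drawing anything. The helper below is a hypothetical sketch, not from the talk: it assumes the pieces are given as percentages in drawing order, accepts either ascending or descending order (the slide does not say which), and uses made-up example data.

```python
def pie_chart_problems(shares, tolerance=0.01):
    """Return the golden rules that a set of pie pieces breaks.

    shares maps a piece label to its percentage, in the order it will be drawn.
    """
    values = list(shares.values())
    problems = []
    if len(values) >= 6:
        problems.append("six or more pieces")
    if values != sorted(values) and values != sorted(values, reverse=True):
        problems.append("pieces are not ordered by value")
    if any(v < 0 for v in values) or abs(sum(values) - 100) > tolerance:
        problems.append("pieces do not sum to 100%")
    return problems


# Made-up data for illustration only.
good = {"A": 50, "B": 30, "C": 20}
bad = {"A": 10, "B": 40, "C": 20, "D": 10, "E": 10, "F": 5, "G": 10}

print(pie_chart_problems(good))
print(pie_chart_problems(bad))
```

The remaining two rules (showing part-to-whole ratios and labeling pieces directly instead of using a legend) are design decisions that a check like this cannot verify.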
Abusing Pie Charts
Don’t break the rules!
Maps
Plot millions of journal entries from 18th and 19th century ship logs, and you reveal a picture of ocean trade you've never seen before
• Visualization of routes
• Color saturation indicates heavily used routes
Maps are good with animations too
• Concentration of NO2 from 2005 to 2011
• Using both color and position to encode concentration
• Using continuous color scale
• Adding another dimension - time
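A continuous color scale is commonly built by interpolating between two end colors. The sketch below shows that idea with plain linear interpolation in RGB. The function name, the end colors and the value range are assumptions chosen for illustration, and the actual map may well use a different scale.

```python
def continuous_color(value, vmin, vmax, low=(255, 255, 204), high=(128, 0, 38)):
    """Map a value onto a color between low and high by linear interpolation.

    Values outside [vmin, vmax] are clamped to the nearest end color.
    """
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    rgb = tuple(round(a + t * (b - a)) for a, b in zip(low, high))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical concentration readings on an arbitrary 0-10 scale.
for reading in (0, 2.5, 5, 7.5, 10):
    print(reading, continuous_color(reading, 0, 10))
```

A discrete palette, as used on choropleth maps, would replace the interpolation with a lookup into a fixed list of colors.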
Choropleth Maps
Displaying the most popular name for a newborn in each state
• Using discrete palette to encode information
Heat Maps
• Excellent for plotting recurring values
• Color saturation/brightness encodes the values
• Position also encodes information
• Easy to spot concentrations and find patterns
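The two encodings listed here, position and color intensity, can be sketched in a few lines: events are counted into a grid (position), and each count is mapped to a darker or lighter character (standing in for saturation). Everything in this example is an assumption for illustration: the function names, the character ramp and the event data are made up.

```python
from collections import Counter

SHADES = " .:*#"  # light to dark; stands in for color saturation


def build_grid(events, n_rows, n_cols):
    """Count (row, col) events into an n_rows x n_cols grid."""
    counts = Counter(events)
    return [[counts[(r, c)] for c in range(n_cols)] for r in range(n_rows)]


def render(grid):
    """Turn counts into text rows, darker characters for higher counts."""
    peak = max(max(row) for row in grid) or 1  # avoid dividing by zero on an empty grid
    top = len(SHADES) - 1
    return ["".join(SHADES[value * top // peak] for value in row) for row in grid]


# Made-up events as (row, column) pairs, e.g. (weekday, time slot).
events = [(0, 1), (0, 1), (0, 1), (0, 1), (1, 1), (1, 1), (2, 0)]
grid = build_grid(events, n_rows=3, n_cols=3)
for line in render(grid):
    print("|" + line + "|")
```

Even in this tiny grid the concentration in the middle column stands out, which is the pattern-spotting effect the slide describes.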
Heat Maps in medicine/genetics
Tree Maps
• Excellent for representing hierarchical data
• Color carries a meaning
• Size carries a meaning as well
• Position is irrelevant
• Suitable for annotations
Parallel Coordinates Plot
• Interactive visualization
• Good at displaying relationships between different dimensions of data
• Position encodes dimension
• Color encodes scale
Parallel Coordinates Plot – in action
Selecting a subset of a dimension to display the relationships with the other dimensions
Chord Diagram
• Similar to Parallel Coordinates plot
• Color and Position used to encode data
• Design is different
• Filtering of dimensions is not a design goal
• Focuses on selecting a whole dimension
Some resources
http://www.reddit.com/r/dataisbeautiful/
http://blog.visual.ly/
http://flowingdata.com/
http://eagereyes.org/
http://www.perceptualedge.com/blog/
Thank You!