Spatial Reasoning, & Statistical Graphics
EMR 21: September 12, 2012
Map This
• Scenario:
• You are a graphics director at the New York Times
• Whole Foods story
• This is the 21st century mapmaking process…
Map this data… (handout)
45 Countries, various quantities
That moment
• When you have data/findings but you are wondering how to visualize...
Part I: Spatial Reasoning
What is it?
What is Spatial Reasoning?
Spatial Reasoning 1:
• Spatial reasoning is the ability we use to position and orient ourselves in everyday environments. This ability is present in all animals and humans to a greater or lesser extent. Spatial reasoning consists of two main abilities: spatial visualization, which is the ability to call up images in your mind, and the ability to reason with these images. (http://www.fibonicci.com/)
Spatial Reasoning 2:
• Probably the most common and basic form of human intelligence. From birth, people employ methods of spatial reasoning almost continuously to infer information about their environment, how it evolves over time and how we change our location in space…. (Golledge, 1998)
Spatial Reasoning 2:
• When applied to computation, spatial reasoning attempts to solve problems that deal with objects that occupy space. (Golledge 1998)
Spatial Reasoning 3:
• The ability to visualize spatial patterns and mentally manipulate them over a time-ordered sequence of spatial transformations…. (Wikipedia)
Spatial Reasoning
• Human Intelligence
• Mental Representations / Mental Maps
– Acquiring and organizing information
– Internal representations
• Environmental Knowledge
• Traditionally conducted at human scales
• Maps, GIS, and spatial analyses stimulate spatial reasoning about our world at non-human scales
Part II: Statistical Graphics
Statistical Graphics
• Central to and parallel with the “development of science”
• Most statistical analysis yields numeric or tabular forms
• Some techniques enable reporting in more pictorial fashion
• Big Data and Visualization
• Exploratory Data Analysis
– We don’t know what we’re going to find
Why do we “visualize” data?
Why do we visualize data?
• External representations
• To create a visual interface / We are visual creatures
• To detect structures that are otherwise hidden
• Data analysis without visualization risks missing important happenings
• To take advantage of our most powerful sense: vision
• To see things!
Seminal figure
• William Playfair (Scottish engineer and political economist)
William Playfair
• 22 September 1759 – 11 February 1823
• 3 or 4 MAJOR Contributions
• The Commercial and Political Atlas
• The founder of graphical methods for statistics…
• Invisible structures
William Playfair
• “Information Architect”
• Line Graph, Bar Chart, Pie Chart
Data Graphics
• Visually display measured quantities by means of the combined use of points, lines, a coordinate system, numbers, symbols, words, shading, and color
– Edward Tufte (The Visual Display of Quantitative Information)
Tufte Pages
Good Data Graphics Should:
• Show the data
• Induce viewer to think about phenomena
• Present many numbers in a small space
• Make large data sets coherent
• Facilitate comparison
• Work at multiple levels (to see vs. to read)
• Be closely integrated with text or other descriptors
Graphics Reveal Information
• An interface with data that mimics primitive environmental knowledge acquisition.
• We are visual creatures!
• Let’s look at some techniques…
Individual Variables
• Raw Table
• Grouped Frequency Table
• Dispersion Graphs
• Histogram
• Other Charts
Raw Table
• Header on top
• Name in the left column
• Attribute/variable on the right
Raw Table: Pros and Cons
• Pros
– Specific info
– Easy to retrieve min and max
• Cons
– Does not reveal overall distribution
– Hard to see patterns, clusters
Grouped Frequency Tables
• See pg. 37 of your book
• Table 3.2
• Half table / half histogram
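The "half table / half histogram" idea can be sketched in a few lines of Python. This is only an illustration: the data values and the class width of 10 are made up and are not taken from Table 3.2, and `grouped_frequency` is a hypothetical helper name.

```python
from collections import Counter

def grouped_frequency(values, class_width):
    """Count how many values fall into each class [lower, lower + class_width).

    Classes with no observations are simply omitted in this sketch.
    """
    counts = Counter((v // class_width) * class_width for v in values)
    return [(lower, lower + class_width, counts[lower]) for lower in sorted(counts)]

# Hypothetical data: 12 whole-number observations of some quantity.
data = [3, 7, 12, 15, 18, 21, 22, 24, 29, 35, 36, 48]

for lower, upper, n in grouped_frequency(data, 10):
    # The count is the "table" half; the row of asterisks is the "histogram" half.
    print(f"{lower:>3}-{upper - 1:<3} {n:>2} {'*' * n}")
```

Changing `class_width` shows how much the apparent shape of the distribution depends on the choice of classes.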
Point Graphs
• AKA 1-dimensional scatter plot
– Each data value plotted on a number line
– Open circles
Point Graphs
• Pros:
– Show distributions, clusters
• Cons:
– Overlapping symbols
– Hard to detect duplicates
Dispersion Graphs
• Like point graphs, but classed
• Stacked dots in each class
– Figure 3.1B in your book
Dispersion graphs
• Pros:
– Show distributions
– Easier to read
• Cons:
– Precision is lost
Histogram
• Conceptually similar to the dispersion graph
What’s the difference?
Histograms
• Fewer classes than dispersion graphs
• Much more common
Measures of Central Tendency
• Objective measures of averages:
– MODE
– MEAN
– MEDIAN
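A quick way to see how the three measures differ is to compute them on the same data. The sample below is hypothetical and deliberately includes one large value, so the mean is pulled away from the mode and median.

```python
import statistics

# Hypothetical sample with one large outlier (90).
values = [2, 3, 3, 4, 5, 6, 90]

mode = statistics.mode(values)      # most frequent value
mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value of the sorted data

print(f"mode={mode}, mean={mean:.2f}, median={median}")
```

Here the mode and median stay near the bulk of the data, while the mean is dragged upward by the single extreme value.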
Measures of Dispersion
• Range: max minus min
– Affected by outliers
• Interquartile range: 75th percentile minus 25th percentile
– Resistant to outliers
• Standard Deviation
– Affected by outliers
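The effect of an outlier on each measure can be checked directly. This is a sketch with made-up numbers; `dispersion` is a hypothetical helper, and the quartiles use the default method of Python's `statistics.quantiles`, which is only one of several quartile conventions in use.

```python
import statistics

def dispersion(values):
    """Return the range, interquartile range, and sample standard deviation."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "stdev": statistics.stdev(values),
    }

# Hypothetical data: identical except for the last value.
clean = [2, 3, 4, 5, 6, 7, 8]
with_outlier = [2, 3, 4, 5, 6, 7, 80]

for name, data in (("clean", clean), ("with outlier", with_outlier)):
    d = dispersion(data)
    print(f"{name:<13} range={d['range']:<3} iqr={d['iqr']:<4} stdev={d['stdev']:.2f}")
```

In this example the range and standard deviation grow sharply when the outlier is introduced, while the interquartile range does not change.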
Scatter plots
• What are they?
• What do they do?
Scatter plots
• AKA Scatter Diagram
• A graphing technique used to examine the relationship between two variables
Scatter plots
• The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis.
Scatter plots
• Cartesian Coordinates?
Scatter plots
• Cartesian Coordinates
• AKA the rectangular coordinate system
• Developed in the 1630s by René Descartes
Scatter plots
• Pros: Help determine if two attributes are correlated
• Cons: Hard to interpret with lots of observations (the more measurements, the more dots)
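A scatter plot shows correlation visually; a common numeric companion, not mentioned on the slides, is Pearson's correlation coefficient r. The sketch below computes it by hand on made-up data. `pearson_r` is a hypothetical helper name, and it assumes neither variable is constant.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: +1 is a perfect rising line, -1 a perfect falling line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises with x
y_down = [10, 8, 6, 4, 2]  # falls as x rises

print(pearson_r(x, y_up), pearson_r(x, y_down))
```

A value near zero would correspond to a scatter plot with no visible linear trend.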
Parallel Coordinate Plot
• What?
Parallel Coordinate Plot (PCP)
• A graphing technique for multivariate data (more than 2 attributes)
• A set of parallel vertical axes are drawn
– One vertical axis per variable (attribute)
– Each observation is a line connecting the vertical axes
Parallel Coordinate Plots
• If adjacent variables are perfectly positively correlated, no lines will intersect between the vertical axes
• See figures 3.8, 3.9
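The crossing behaviour between two adjacent axes can be checked numerically. The sketch below assumes each axis is min-max scaled to run from 0 to 1 and that no variable is constant; the data and the helper names `rescale` and `count_crossings` are hypothetical.

```python
from itertools import combinations

def rescale(values):
    """Min-max scale values onto a vertical axis running from 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def count_crossings(left, right):
    """Count pairs of observations whose lines cross between two adjacent axes."""
    a, b = rescale(left), rescale(right)
    return sum(
        1
        for i, j in combinations(range(len(a)), 2)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0  # the pair swaps order between the axes
    )

# Hypothetical observations on three variables.
x = [1, 2, 3, 4]
y_pos = [10, 20, 30, 40]  # perfectly positively correlated with x
y_neg = [40, 30, 20, 10]  # perfectly negatively correlated with x

print(count_crossings(x, y_pos), count_crossings(x, y_neg))
```

With a perfect positive relationship no pair of lines crosses; with a perfect negative relationship every pair does, which is why the order of the axes matters when reading a PCP.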
Parallel Coordinate Plots
• Pros: Powerful, Very effective within interactive statistical software
• Cons: Hard to identify individual values and individual observations
Some More Statistical Graphics
• Bar Graphs
• Box plots
• Multivariate ray glyphs
• Chernoff faces
Bar Graphs
• AKA Bar Charts
• What do they do?
Bar Graphs
• AKA Bar Charts
• Summarizes categorical data
• Horizontal axis represents categories
• Vertical axis represents counts or %
• Illustrates the differences in % or counts between categories
Bar Charts
• Look like histograms, BUT:
– The data is discrete, discontinuous (often qualitative), e.g. animal species
– Histograms are for continuous data (often quantitative), e.g. heights of a population, city populations
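For categorical data, the counts and percentages behind a bar chart come from simple tallying. A minimal sketch with made-up species sightings, drawing text bars instead of a real chart:

```python
from collections import Counter

# Hypothetical categorical observations (animal species sighted).
sightings = ["deer", "fox", "deer", "owl", "deer", "fox"]

counts = Counter(sightings)   # one count per category
total = sum(counts.values())

for species, n in counts.most_common():
    # One bar per category; its length is the count.
    print(f"{species:<5} {'#' * n} {n} ({n / total:.0%})")
```

Because the categories have no natural numeric order, the bars can be sorted by count, as here, without changing the meaning. That is not true of histogram classes.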
Box Plots
• AKA box-and-whisker diagram (similar in appearance to, but not the same as, a candlestick chart)
• What are they? When do we use them?
Box Plots: What are they?
• Graphic depictions of population distributions
• Indicate the population’s “5-number summary”
– Minimum
– Lower Quartile
– Median
– Upper Quartile
– Maximum
How They’re Made:
• 1st: Find the 5 values for each of your populations
• 2nd: Plot the values as points
• 3rd: Draw the boxes
Question:
• Which group has the smallest interquartile range?
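The first step, finding the five values, and the interquartile-range comparison can be sketched as follows. The two groups are made up for illustration and are not the groups shown on the slide. Quartiles use the default method of `statistics.quantiles`; other conventions can give slightly different box edges.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)

def iqr(values):
    """Interquartile range: the height of the box in a box plot."""
    _, q1, _, q3, _ = five_number_summary(values)
    return q3 - q1

# Hypothetical populations.
groups = {
    "A": [1, 2, 3, 4, 5, 6, 7],
    "B": [1, 3, 4, 4, 4, 5, 7],
}

for name, values in groups.items():
    print(name, five_number_summary(values), "IQR =", iqr(values))

smallest = min(groups, key=lambda name: iqr(groups[name]))
print("Smallest interquartile range:", smallest)
```

Both groups share the same minimum, median, and maximum, so only the box itself reveals that group B is more tightly clustered around its median.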
Box Plots:
• Pros:
– Easy to generate (esp. by hand)
– Easy to interpret
• Cons:
– Only show a summary of the population
– Some trends and clusters will be invisible
Pie Charts
• We all know what they are, but in this context, what is their purpose?
The point of a pie chart
• A circle divided into parts
– “Parts of a whole”
– Example: Harvard Student Body by Class
• Freshmen
• Sophomores
• Juniors
• Seniors
• Grad students
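"Parts of a whole" translates directly into slice angles: each category receives its share of 360 degrees. The enrollment counts below are invented for illustration and are not real Harvard figures; `pie_slices` is a hypothetical helper name.

```python
def pie_slices(counts):
    """Map each category to its slice angle in degrees (all slices sum to 360)."""
    total = sum(counts.values())
    return {label: 360 * n / total for label, n in counts.items()}

# Hypothetical enrollment counts; NOT real data.
enrollment = {
    "Freshmen": 100,
    "Sophomores": 100,
    "Juniors": 100,
    "Seniors": 100,
    "Grad students": 400,
}

for label, angle in pie_slices(enrollment).items():
    print(f"{label:<14} {angle:6.1f} degrees")
```

A pie chart only makes sense when the categories are mutually exclusive and together cover the whole, which is what makes the angles add up to a full circle.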
Multivariate Ray Glyphs
• Figure 18.8 of your book
Multivariate Ray Glyphs
• Extended rays from an interior circle
• Lengths of the rays proportional to value of each attribute
• If you connect the ends of the rays, the glyph is called a “snowflake”
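The geometry of a ray glyph can be sketched by computing where each ray ends. The code assumes rays are spaced evenly around the circle and that each attribute is scaled by a known maximum. The function name, the default radius and ray length, and the sample values are all hypothetical.

```python
import math

def ray_endpoints(values, max_values, inner_radius=1.0, max_ray_length=4.0):
    """Return the (x, y) tip of each ray, measured from the glyph's center.

    Ray k points at angle 2*pi*k/n; its length is proportional to values[k].
    """
    n = len(values)
    points = []
    for k, (v, vmax) in enumerate(zip(values, max_values)):
        angle = 2 * math.pi * k / n
        r = inner_radius + max_ray_length * v / vmax  # start at the circle's edge
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points

# Hypothetical observation with four attributes, each on a 0-10 scale.
tips = ray_endpoints([5, 10, 2.5, 10], [10, 10, 10, 10])
for x, y in tips:
    print(f"({x:6.2f}, {y:6.2f})")
```

Joining consecutive tips with straight lines, and closing the loop back to the first tip, gives the "snowflake" outline.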
Chernoff Faces
• Who’s heard of them?
Chernoff Faces
• Used to illustrate trends in multivariate data
• Different attributes are mapped to different facial features
– Face width
– Level of the ears
– Curvature of the mouth
– Length of the nose
– Shape of the eyes
– Eyebrows
Chernoff Faces
• Pros: Intuitive depiction of multivariate data
• Cons: Lose lots of numerical precision
Conclusion
• Lots of ways to summarize, share, organize, and communicate stats
• Choosing the right way is important
– Know your data
– Know your audience
– Know your medium
– Know your graphics
Conclusion
• Many techniques for looking at data
– Tables
– Graphed methods: Grouped-Frequency Tables, Dispersion Graphs, Histograms, Scatter plots, and PCPs
• Know what they’re for and when to use them
• Read Ch. 3
Graphs
Data Tables
• What are they?
• When to use them
• Design Issues
Pie Charts
• What are they?
• When to use them
• Design Issues
Bar Graphs
• What are they?
• When to use them
• Design Issues
Line Graphs
• What are they?
• When to use them
• Design Issues
Scatter Plots
• What are they?
• When to use them
• Design Issues