spatial reasoning, & statistical graphics emr 21: september 12, 2012

Post on 22-Dec-2015


Spatial Reasoning, & Statistical Graphics

EMR 21: September 12, 2012

Map This

• Scenario:

• You are a graphics director at the New York Times

• Whole Foods story

• This is the 21st-century mapmaking process…

Map this data… (handout)

45 Countries, various quantities

That moment

• When you have data/findings but you are wondering how to visualize...

Part I: Spatial Reasoning

What is it?

What is Spatial Reasoning?

Spatial Reasoning 1:

• Spatial reasoning is the ability we use to position and orient ourselves in everyday environments. This ability is present in all animals and humans to a greater or lesser extent. Spatial reasoning consists of two main abilities: spatial visualization, which is the ability to call up images in your mind, and the ability to reason with these images. (http://www.fibonicci.com/)

Spatial Reasoning 2:

• Probably the most common and basic form of human intelligence. From birth, people employ methods of spatial reasoning almost continuously to infer information about their environment, how it evolves over time and how we change our location in space…. (Golledge, 1998)

Spatial Reasoning 2:

• When applied to computation, spatial reasoning attempts to solve problems that deal with objects that occupy space. (Golledge 1998)

Spatial Reasoning 3:

• The ability to visualize spatial patterns and mentally manipulate them over a time-ordered sequence of spatial transformations…. (Wikipedia)

Spatial Reasoning

• Human Intelligence

• Mental Representations / Mental Maps

– Acquiring and organizing information

– Internal representations

• Environmental Knowledge

• Traditionally conducted at human scales

• Maps, GIS, and spatial analyses stimulate spatial reasoning about our world at non-human scales

Part II: Statistical Graphics

Statistical Graphics

• Central to and parallel with the “development of science”

• Most statistical analysis yields numeric or tabular forms

• Some techniques enable reporting in more pictorial fashion

• Big Data and Visualization

• Exploratory Data Analysis

– We don’t know what we’re going to find

Why do we “visualize” data?

Why do we visualize data?

• External representations

• To create a visual interface / We are visual creatures

• To detect structures that are otherwise hidden

• Data analysis without visualization risks missing important happenings

• To take advantage of our most powerful sense: vision

• To see things!

Seminal figure

• William Playfair (Scottish engineer and political economist)

William Playfair

• 22 September 1759 – 11 February 1823

• 3 or 4 MAJOR Contributions

• The Commercial and Political Atlas

• The founder of graphical methods for statistics…

• Invisible structures

William Playfair

• “Information Architect”

• Line Graph, Bar Chart, Pie Chart

Data Graphics

• Visually display measured quantities by means of the combined use of points, lines, a coordinate system, numbers, symbols, words, shading, and color

– Edward Tufte (The Visual Display of Quantitative Information)

Tufte Pages

Good Data Graphics Should:

• Show the data

• Induce viewer to think about phenomena

• Present many numbers in a small space

• Make large data sets coherent

• Facilitate comparison

• Work at multiple levels (to see vs. to read)

• Be closely integrated with text or other descriptors

Graphics Reveal Information

• An interface with data that mimics primitive environmental knowledge acquisition.

• We are visual creatures!

• Let’s look at some techniques…

Individual Variables

• Raw Table

• Grouped Frequency Table

• Dispersion Graphs

• Histogram

• Other Charts

Raw Table

• Header on top

• Name in the left column

• Attribute/variable on the right

Raw Table: Pros and Cons

• Pros

– Specific info

– Easy to retrieve min and max

• Cons

– Do not reveal overall distribution

– Hard to see patterns, clusters

Grouped Frequency Tables

• See pg. 37 of your book

• Table 3.2

• Half table / half histogram
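
The "half table / half histogram" idea can be sketched in a few lines. This is a minimal illustration, not Table 3.2 from the book: the data values, the class width of 10, and the function name `grouped_frequency` are all assumptions, and the sketch assumes integer data.

```python
def grouped_frequency(values, class_width):
    """Count how many integer values fall in each class of the given width."""
    start = (min(values) // class_width) * class_width
    n_classes = (max(values) - start) // class_width + 1
    counts = [0] * n_classes
    for v in values:
        counts[(v - start) // class_width] += 1
    # Return (lower class limit, count) pairs, including empty classes.
    return [(start + i * class_width, c) for i, c in enumerate(counts)]


# Hypothetical data, not taken from the handout.
sample = [3, 7, 12, 14, 15, 21, 22, 22, 29, 41]
table = grouped_frequency(sample, class_width=10)

for lower, count in table:
    # The row of asterisks is the "half histogram" part of the table.
    print(f"{lower:>3}-{lower + 9:<3} {count:>2} {'*' * count}")
```

Empty classes are kept on purpose, so gaps in the distribution stay visible.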

Point Graphs

• AKA 1-dimensional scatter plot

– Each data value plotted on a number line

– Open circles

Point Graphs

• Pros:

– Show distributions, clusters

• Cons:

– Overlapping symbols

– Hard to detect duplicates

Dispersion Graphs

• Like point graphs, but classed

• Stacked dots in each class

– Figure 3.1B in your book

Dispersion graphs

• Pros:

– Show distributions

– Easier to read

• Cons:

– Precision is lost

Histogram

• Conceptually similar to the dispersion graph

What’s the difference?

Histograms

• Fewer classes than dispersion graphs

• Much more common

Measures of Central Tendency

• Objective measures of averages:

– MODE

– MEAN

– MEDIAN
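
A quick sketch of the three measures, using Python's standard `statistics` module. The data values are made up for illustration; note how the single large value pulls the mean well above the median and the mode.

```python
import statistics

# Hypothetical values, chosen so that one large value skews the mean.
values = [2, 3, 3, 5, 7, 10, 40]

mean = statistics.mean(values)      # sum of values divided by their count
median = statistics.median(values)  # middle value once sorted
mode = statistics.mode(values)      # most frequent value

print(f"mean={mean}, median={median}, mode={mode}")
```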

Measures of Dispersion

• Range: max minus min

– Affected by outliers

• Interquartile range: 75th percentile minus 25th percentile

– Immune to outliers

• Standard Deviation

– Affected by outliers
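
The effect of an outlier on each measure can be checked directly. This is a minimal sketch with made-up data; the helper name `dispersion` is hypothetical. Quartile conventions differ between textbooks and software, so the default method of `statistics.quantiles` used here is one choice among several.

```python
import statistics


def dispersion(values):
    """Return range, interquartile range, and population standard deviation."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "stdev": statistics.pstdev(values),
    }


# Hypothetical data: the second list replaces the largest value with an outlier.
base = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000]

for name, data in (("base", base), ("with outlier", with_outlier)):
    print(name, dispersion(data))
```

In this example the range and standard deviation jump when the outlier is introduced, while the interquartile range does not move.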

Scatter plots

• What are they?

• What do they do?

Scatter plots

• AKA Scatter Diagram

• A graphing technique used to examine the relationship of variables against one another

Scatter plots

• The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis.

Scatter plots

• Cartesian Coordinates?

Scatter plots

• Cartesian Coordinates

• AKA the rectangular coordinate system

• Developed in the 1630s by René Descartes

Scatter plots

• Pros: Help determine if two attributes are correlated

• Cons: Hard to interpret with lots of observations (the more measurements, the more dots)
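
A scatter plot shows correlation visually. One common numeric companion, not named in these slides, is Pearson's correlation coefficient r, which runs from -1 (perfect negative) to +1 (perfect positive). The sketch below uses made-up data and a hypothetical helper name, `pearson_r`.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical attribute values for five observations.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # increases with xs
falling = [10, 8, 6, 4, 2]  # decreases as xs increases

print(pearson_r(xs, rising), pearson_r(xs, falling))
```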

Parallel Coordinate Plot

• What?

Parallel Coordinate Plot (PCP)

• A graphing technique for multivariate data (more than 2 attributes)

• A set of parallel vertical axes is drawn

– One vertical axis per variable (attribute)

– Each observation is a line connecting the vertical axes

Parallel Coordinate Plots

• If adjacent variables are perfectly positively correlated, no lines will intersect between the vertical axes

• See figures 3.8, 3.9
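
The no-crossing property can be checked without drawing anything. Two observations' lines cross between adjacent axes exactly when their order on the left axis is the reverse of their order on the right axis. The sketch below counts such pairs; it assumes both axes run low-to-high in the same direction, and the data and the name `count_crossings` are made up for illustration.

```python
from itertools import combinations


def count_crossings(left, right):
    """Count pairs of observations whose lines cross between two adjacent axes."""
    crossings = 0
    for (l1, r1), (l2, r2) in combinations(zip(left, right), 2):
        # Opposite ordering on the two axes means the two lines intersect.
        if (l1 - l2) * (r1 - r2) < 0:
            crossings += 1
    return crossings


# Hypothetical values of two adjacent attributes for four observations.
left = [1, 2, 3, 4]
same_order = [10, 20, 30, 40]      # rises with left
reversed_order = [40, 30, 20, 10]  # falls as left rises

print(count_crossings(left, same_order), count_crossings(left, reversed_order))
```

With a perfect negative relationship every pair of lines crosses, which is the opposite visual signature.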

Parallel Coordinate Plots

• Pros: Powerful, Very effective within interactive statistical software

• Cons: Hard to identify individual values and individual observations

Some More Statistical Graphics

• Bar Graphs

• Box plots

• Multivariate ray glyphs

• Chernoff faces

Bar Graphs

• AKA Bar Charts

• What do they do?

Bar Graphs

• AKA Bar Charts

• Summarizes categorical data

• Horizontal axis represents categories

• Vertical axis represents counts or %

• Illustrates the differences in % or counts between categories
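
The numbers behind a bar graph are just counts (or percentages) per category. A minimal text-only sketch, with made-up observations:

```python
from collections import Counter

# Hypothetical categorical observations, e.g. species sighted on a survey.
observations = ["deer", "fox", "deer", "owl", "deer", "fox"]

counts = Counter(observations)
percents = {species: 100 * n / len(observations) for species, n in counts.items()}

for species, n in counts.most_common():
    # One bar per category; bar length is the count.
    print(f"{species:<5} {'#' * n} {n} ({percents[species]:.0f}%)")
```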

Bar Charts

• Look like histograms, BUT:

– The data is discrete, discontinuous (often qualitative)

• Animal species

– Histograms are for continuous data (often quantitative)

• E.g. heights of a population, city populations

Box Plots

• AKA box-and-whisker diagram or candlestick chart

• What are they? When do we use them?

Box Plots: What are they?

• Graphic depictions of population distributions

• Indicate the population’s “5-number summary”

– Minimum

– Lower Quartile

– Median

– Upper Quartile

– Maximum
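
A sketch of computing the five numbers for one population. The data and the function name `five_number_summary` are made up; quartile conventions vary between textbooks and software, and the "inclusive" method of `statistics.quantiles` used here is one assumption among several possible.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Hypothetical population of nine measurements, in no particular order.
population = [7, 1, 9, 3, 5, 2, 8, 4, 6]
summary = five_number_summary(population)
print(summary)
```

The box spans the lower to the upper quartile, so its height is the interquartile range.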

How They’re Made:

• 1st: find the 5 values for each of your populations

How They’re Made:

• 2nd: Plot the values as points

How They’re Made:

• 3rd: Draw the boxes

Question:

• Which group has the smallest interquartile range?

Box Plots:

• Pros:

– Easy to generate (esp. by hand)

– Easy to interpret

• Cons:

– Only show a summary of population

– Some trends and clusters will be invisible

Pie Charts

• We all know what they are, but in this context, what is their purpose?

The point of a pie chart

• A circle divided into parts

– “Parts of a whole”

– Example: Harvard Student Body by Class

• Freshmen

• Sophomores

• Juniors

• Seniors

• Grad students
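
"Parts of a whole" means each slice's angle is its share of 360 degrees. A minimal sketch; the enrollment counts are invented for illustration and are not real figures.

```python
def slice_angles(counts):
    """Convert category counts into pie-slice angles in degrees."""
    total = sum(counts.values())
    return {label: 360 * n / total for label, n in counts.items()}


# Made-up counts, NOT actual enrollment data.
student_body = {
    "Freshmen": 25,
    "Sophomores": 25,
    "Juniors": 25,
    "Seniors": 25,
    "Grad students": 100,
}

angles = slice_angles(student_body)
for label, angle in angles.items():
    print(f"{label:<14} {angle:6.1f} degrees")
```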

Multivariate Ray Glyphs

• Figure 18.8 of your book

Multivariate Ray Glyphs

• Extended rays from an interior circle

• Lengths of the rays proportional to value of each attribute

• If you connect the ends of the rays, the glyph is called a “snowflake”
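
One way to lay out a ray glyph for a single observation is sketched below. This is an assumption-laden illustration, not the construction in Figure 18.8: the even angular spacing, the `inner_radius` and `max_ray` parameters, the scaling by each attribute's maximum, and the name `ray_endpoints` are all choices made here.

```python
import math


def ray_endpoints(values, max_values, inner_radius=1.0, max_ray=4.0):
    """Return the (x, y) outer endpoint of each ray for one observation.

    Rays are spaced evenly around the circle. Each ray starts on the
    interior circle and its length is proportional to the attribute value.
    """
    k = len(values)
    points = []
    for i, (value, vmax) in enumerate(zip(values, max_values)):
        angle = 2 * math.pi * i / k
        distance = inner_radius + max_ray * value / vmax
        points.append((distance * math.cos(angle), distance * math.sin(angle)))
    return points


# Hypothetical observation with two attributes, each with a maximum of 4.
endpoints = ray_endpoints([2, 4], [4, 4])
print(endpoints)
```

Joining consecutive endpoints with straight lines gives the closed outline of the glyph.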

Chernoff Faces

• Who’s heard of them?

Chernoff Faces

• Used to illustrate trends in multivariate data

• Different attributes are mapped to different facial features

– Face width

– Level of the ears

– Curvature of the mouth

– Length of the nose

– Shape of the eyes

– Eyebrows

Chernoff Faces

• Pros: Intuitive depiction of multivariate data

• Cons: Lose lots of numerical precision

Conclusion

• Lots of ways to summarize, share, organize, and communicate stats

• Choosing the right way is important

– Know your data

– Know your audience

– Know your medium

– Know your graphics

Conclusion

• Many techniques for looking at data

– Tables

– Graphed methods:

• Grouped-Frequency Tables

• Dispersion Graph, Histogram

• Scatter plots and PCPs

• Know what they’re for and when to use them

• Read Ch. 3

Graphs

Data Tables

• What are they?

• When to use them

• Design Issues

Pie Charts

• What are they?

• When to use them

• Design Issues

Bar Graphs

• What are they?

• When to use them

• Design Issues

Line Graphs

• What are they?

• When to use them

• Design Issues

Scatter Plots

• What are they?

• When to use them

• Design Issues
