“it is a capital mistake to theorize before one has data ...danielleszafir.com/dhrn.pdf · “it...
TRANSCRIPT
![Page 1: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/1.jpg)
“It is a capital mistake to theorize
before one has data. Insensibly one
begins to twist facts to suit theories,
instead of theories to suit facts.”
![Page 2: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/2.jpg)
![Page 3: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/3.jpg)
Data Science
![Page 4: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/4.jpg)
“Data Science” is thinking with data
![Page 5: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/5.jpg)
How to categorize data
How to computationally
explore data
How to visually explore data
![Page 6: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/6.jpg)
Please ask questions if you have them!!!
![Page 7: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/7.jpg)
How to categorize data
How to computationally
explore data
How to visually explore data
![Page 8: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/8.jpg)
What are the different properties of
the data
![Page 9: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/9.jpg)
Data falls into two categories:
Qualitative: Descriptions, categories, and observations
Quantitative: Numeric measures
![Page 10: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/10.jpg)
Data about this book:
Qualitative: Old
By Shakespeare
Published in London
Quantitative: 142 pages
20,000 words
1,700 nouns
![Page 11: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/11.jpg)
Data can also be:
Ordinal Categorical/Nominal
Discrete Continuous
![Page 12: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/12.jpg)
What are the different levels of detail
we can look at?
![Page 13: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/13.jpg)
Scales Overview
Detail
Overview: High-level patterns looking
across all the data
Detail: Low-level patterns looking at specific pieces of the data
![Page 14: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/14.jpg)
How to categorize data
How to computationally
explore data
How to visually explore data
![Page 15: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/15.jpg)
![Page 16: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/16.jpg)
Reduce the dataset using mathematics
and logic
![Page 17: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/17.jpg)
All models are wrong, but some are
useful. --George E. P. Box
![Page 18: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/18.jpg)
Use statistics to group the data into
manageable units
![Page 19: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/19.jpg)
data
data data
data
data
data
data
data
data
data
data
data
data
data data data
data
data
data
Algorithmically categorize dataset
based on properties of the data
Ma
th &
Lo
gic
![Page 20: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/20.jpg)
Topic Models: Identify words that categorize groups of texts in a
corpus
Clustering: Identify groups of datapoints with similar properties
Bayes Nets: Compute how likely it is that a text belongs to
different groups based on its properties
Explainers: Determine how similar different texts are to an
example text
![Page 21: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/21.jpg)
How to categorize data
How to computationally
explore data
How to visually explore data
![Page 22: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/22.jpg)
You need statistics to describe data, but
then visualization to see it in context.
-- Andy Kirk
![Page 23: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/23.jpg)
![Page 24: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/24.jpg)
Visualizations let us explore and communicate large
amounts of data visually
![Page 25: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/25.jpg)
1) Visually encode the data
2) Arrange the encoded data to highlight
patterns of interest
3) Design complementary methods for
looking at the data that can answer
complex analysis questions
4) Design ways for interacting with the
encoded data that support your
analysis
![Page 26: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/26.jpg)
1) Visually encode the data
2) Arrange the encoded data to highlight
patterns of interest
3) Design complementary methods for
looking at the data that can answer
complex analysis questions
4) Design ways for interacting with the
encoded data that support your
analysis
![Page 27: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/27.jpg)
A
B
A B
A B
A B
A B
A
A B
Position
Size
Shape
Value/Lightness
Color
Orientation
Texture
Visual Encodings: Ways to map data values to
visual marks
Different visual encodings
highlight different properties
in the data
![Page 28: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/28.jpg)
A B A B +
A B
Encodings can be combined to communicate
multiple properties of the data
![Page 29: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/29.jpg)
1) Visually encode the data
2) Arrange the encoded data to highlight
patterns of interest
3) Design complementary methods for
looking at the data that can answer
complex analysis questions
4) Design ways for interacting with the
encoded data that support your
analysis
![Page 30: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/30.jpg)
Juxtapositioning encoded data side-by-side
Superpositioning encoded data in the same
space
Explicitly encoding relationships of interest
between datapoints
Once data is encoded, we can
highlight relationships in the data by:
![Page 31: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/31.jpg)
Small Multiples: Juxtapose large
numbers of small
visualizations to
communicate high-
level patterns
Can either subdivide
the data or
properties of the
data
![Page 32: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/32.jpg)
1) Visually encode the data
2) Arrange the encoded data to highlight
patterns of interest
3) Design complementary methods for
looking at the data that can answer
complex analysis questions
4) Design ways for interacting with the
encoded data that support your
analysis
![Page 33: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/33.jpg)
Coordinated Views: Create multiple visualizations that work together to
support complex analysis
![Page 34: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/34.jpg)
Index Membership Freq Grouped Freq Pos in Reference In
dex
G
rou
ped
Fre
q
Pos
in R
efer
ence
Dynamic Remapping: Allow the user to change what data maps to which
visual channels to highlight different patterns
![Page 35: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/35.jpg)
1) Visually encode the data
2) Arrange the encoded data to highlight
patterns of interest
3) Design complementary methods for
looking at the data that can answer
complex analysis questions
4) Design ways for interacting with the
encoded data that support your
analysis
![Page 36: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/36.jpg)
Don’
Always connect back to the person:
how can we make insights meaningful?
![Page 37: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/37.jpg)
Interaction
![Page 38: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/38.jpg)
Some techniques for visualizing data…
![Page 39: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/39.jpg)
Bar Charts
Compare values
![Page 40: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/40.jpg)
Line Graphs
Identify trends
![Page 41: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/41.jpg)
Scatterplots
Identify clusters
![Page 42: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/42.jpg)
Pie Charts
Communicate proportions of a whole
![Page 43: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/43.jpg)
Heatmaps
Color to convey values compactly
![Page 44: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/44.jpg)
Histogram
Distribution over different properties
![Page 45: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/45.jpg)
Choropleths: Color to convey
values
Cartograms: Size to convey
values
![Page 46: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/46.jpg)
Networks/Node-Link Diagrams
Connect related objects
![Page 47: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/47.jpg)
Trees & Hierarchies
Communicate hierarchical relationships
![Page 48: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/48.jpg)
Learn More about Visualization
CS638/838: Visualization Prof. Michael Gleicher
11:00-12:15 Tu/Th
Visualization Reading Group 2pm every other Thursday
![Page 49: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/49.jpg)
1) Break into groups—mix “techies” and
“humanists”
2) Pick one dataset from your group to
talk about
3) Sketch how you might approach
analyzing this data
4) Rinse and repeat
5) Group critique
![Page 50: “It is a capital mistake to theorize before one has data ...danielleszafir.com/DHRN.pdf · “It is a capital mistake to theorize before one has data. Insensibly one begins to twist](https://reader033.vdocuments.mx/reader033/viewer/2022052019/60333e4b7f60c251fc6b29f0/html5/thumbnails/50.jpg)
What are the different properties of the data?
What are the interesting relationships between these
properties and why?
What are common or informative labels that can
describe different aspects of the data?
What, if any, questions do you want to explore in the
data?
What levels of detail are interesting?
What would be some interesting ways to “look” at this
data?
What patterns (or lack thereof) would you hope to find in
this data and what would they mean?