Visualization and Analysis of Text
DESCRIPTION
Visualization and Analysis of Text. Remco Chang, PhD, Assistant Professor, Department of Computer Science, Tufts University. December 17, 2010, Cologne, Germany. Introduction: Information Visualization (novel visual representations, storytelling, user-driven); Visual Analysis (data exploration).
TRANSCRIPT
1/41
Visualization and Analysis of Text
Remco Chang, PhD
Assistant Professor
Department of Computer Science
Tufts University
December 17, 2010
Cologne, Germany
2/41
CMVVis Examples P Topics
Introduction
• Information Visualization
– Novel visual representations
– Storytelling
– User-Driven
• Visual Analysis
– Data exploration
– Hypothesis generation
– Interactive visualization + Computation
3/41
Visualization
• Pre-attentive Processing
Examples courtesy of Chris Healey
4/41
Visualization
• This is helpful because:
– It allows us to process more information quickly
– We can see trends and patterns
5/41
Storytelling
• US Budget from 1961 - 2008
6/41
Storytelling
• Minard’s Map:
• Napoleon’s March to Moscow
7/41
Visualization
• Influences the thought…
Images courtesy of Barbara Tversky
8/41
Visual Encoding
• Affects:
– The types of possible operations
– The user’s thinking process
Zhang and Norman. The Representation Of Numbers. Cognition. (1995)
9/41
Classifying Numeric Systems
10/41
Example: Arithmetic
Slide courtesy of Pat Hanrahan
11/41
Example: Arithmetic
12/41
Example: Arithmetic
13/41
Example: Arithmetic
14/41
Examples of Text Visualization
• Wordle
Images Courtesy of Many Eyes
15/41
Examples of Text Visualization
• WordTree
16/41
Examples of Text Visualization
• WordTree
17/41
Examples of Text Visualization
• Phrase Net
18/41
Examples of Text Visualization
• Google Auto-Complete
19/41
Examples of Text Visualization
• Visualizing changes in Wikipedia
Images Courtesy of Info.fm
20/41
Examples of Text Visualization
• ThemeRiver
21/41
Visual Exploration
• Coordinated Multi-Views (CMV)
(Figure: coordinated views labeled Where, When, Who, What, Original Data, and EvidenceBox)
22/41
Coordinated Multi-Views
WHY?
This group’s attacks are bounded not by geographic location but by religious beliefs.
Its attack patterns changed as the group developed.
23/41
LIDAR Linked Feature Space
24/41
LIDAR Change Detection
25/41
Urban Model
26/41
Urban Visualization
27/41
Coordinated Multi-Views
• Financial Wire Fraud
– With Bank of America
– Discover suspicious international wire transactions
• Bridge Maintenance
– With US DOT
– Exploring subjective inspection reports
• Biomechanical Motion
– With U. Minnesota and Brown
– Interactive motion comparison methods
30/41
CMV + Text Analysis
31/41
Parallel Topics
• Task: Given the proposals submitted to the National Science Foundation (NSF), identify:
– Proposals that are interdisciplinary
– Proposals that are potentially transformative
– Proposals that are focused
32/41
Parallel Topics
• Approach:
– Apply topic modeling algorithms to identify latent topics (David Blei et al., “Latent Dirichlet Allocation”, 2003)
– Visualize the distribution of proposals based on the topics
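To make the first step of the approach concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. The sampler, the toy documents, and the names `lda_gibbs`, `alpha`, and `beta` are illustrative assumptions; the slides cite LDA but do not say which implementation or inference method was used.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative sketch).

    docs: list of token lists. Returns (theta, vocab, topic_word), where
    theta[d][k] is the weight of topic k in document d.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assign = []                                            # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid, k = word_id[w], assign[d][i]
                # Remove the token from the counts, then resample its topic.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed per-document topic weights; each row sums to 1.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    return theta, vocab, topic_word


# Hypothetical toy "proposals", already tokenized.
docs = [
    "graph network routing network".split(),
    "protein gene cell gene".split(),
    "network protein graph cell".split(),
]
theta, vocab, topic_word = lda_gibbs(docs, n_topics=2)
for row in theta:
    print([round(x, 2) for x in row])
```

Each printed row is one document's topic distribution, which is the data the second step of the approach would visualize.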
33/41
Topic Modeling
• Given a set of k documents, find n topics
– Each document is then described as:
• (W1 * Topic1, W2 * Topic2, W3 * Topic3, …, Wn * Topicn)
• W1 + W2 + W3 + … + Wn = 1

             Topic 1   Topic 2   …   Topic N
Document 1   0.12      0.68      …   0.005     ∑ = 1
Document 2   0.3       0.06      …   0.01      ∑ = 1
…
Document K
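The same description can be written compactly. The symbols $d_i$, $W_{ij}$, and $T_j$ are labels introduced here for the slide's documents, weights, and topics; they do not appear in the original.

```latex
d_i \;\approx\; \sum_{j=1}^{n} W_{ij}\, T_j,
\qquad W_{ij} \ge 0,
\qquad \sum_{j=1}^{n} W_{ij} = 1
\quad \text{for each document } i = 1, \dots, k .
```

Each row of the document-topic table is one vector $(W_{i1}, \dots, W_{in})$, which is why every row sums to 1.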
34/41
Topic Modeling
• A topic is a combination of keywords
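As a small illustration of that statement, a topic can be stored as keyword weights and summarized by its heaviest keywords. The topic names, keywords, and weights below are invented for the example.

```python
def top_keywords(topic, n=3):
    """Return the n highest-weighted keywords of a topic (a word -> weight dict)."""
    return sorted(topic, key=topic.get, reverse=True)[:n]


# Hypothetical topics; the weights within each topic sum to 1.
topics = {
    "topic_1": {"network": 0.40, "graph": 0.30, "routing": 0.20, "cell": 0.10},
    "topic_2": {"gene": 0.45, "protein": 0.35, "cell": 0.15, "graph": 0.05},
}

for name, topic in topics.items():
    print(name, top_keywords(topic))
```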
35/41
Parallel Topics
• Based on “Parallel Coordinates”
– Each vertical axis is a topic
– Each polyline running horizontally across the axes is a document
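A minimal sketch of that mapping, assuming topic weights in [0, 1]: each topic gets an evenly spaced vertical axis, and a document becomes the polyline through its weight on every axis. The function name `polyline` and the pixel dimensions are hypothetical; the slides do not describe the actual Parallel Topics layout.

```python
def polyline(weights, width=300.0, height=100.0):
    """Map one document's topic weights to (x, y) points, one per topic axis.

    x is the position of the topic's vertical axis; y is the weight scaled to
    the axis height (0 at the bottom, `height` at the top).
    """
    n = len(weights)
    step = width / (n - 1) if n > 1 else 0.0
    return [(i * step, w * height) for i, w in enumerate(weights)]


# Hypothetical document with weights over four topics.
doc = [0.1, 0.7, 0.15, 0.05]
print(polyline(doc))
```

Drawing one such polyline per document over shared axes gives the parallel-coordinates view; a plotting library would only need these points.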
36/41
Visual Signatures
(Figure panels: Single topic, Bi-topic, No salient topic)
• We identify different signatures for proposals:
– Single topic: focused research
– Bi-topic: interdisciplinary research
– No topic: potentially transformative research
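One possible way to turn these signatures into a rule, assuming a topic counts as salient when its weight reaches a cutoff. The cutoff value and the function name `signature` are assumptions for illustration; the slides do not state how salience was determined.

```python
def signature(weights, cutoff=0.3):
    """Classify a document by how many topic weights reach a salience cutoff."""
    salient = sum(1 for w in weights if w >= cutoff)
    if salient == 0:
        return "no salient topic"   # potentially transformative
    if salient == 1:
        return "single topic"       # focused
    if salient == 2:
        return "bi-topic"           # interdisciplinary
    return "multi-topic"


# Hypothetical topic weights for three proposals.
print(signature([0.85, 0.05, 0.05, 0.05]))
print(signature([0.45, 0.40, 0.10, 0.05]))
print(signature([0.25, 0.25, 0.25, 0.25]))
```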
37/41
Selecting Single Topic Proposals
             Topic 1   Topic 2   …   Topic N
Document 1   0.12      0.68      …   0.005     SD = 0.14
Document 2   0.3       0.06      …   0.01      SD = 0.06
…
Document K
Criterion: max SD
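The intuition behind ranking by standard deviation (SD) can be sketched in a few lines: a document dominated by one topic has widely spread weights and therefore a high SD, while a document with evenly shared topics has a low one. The weights below are hypothetical (the slide elides most of its values), and using the population standard deviation is an assumption.

```python
from statistics import pstdev

# Hypothetical document-topic weights; each row sums to 1.
weights = {
    "Document A": [0.85, 0.05, 0.05, 0.05],
    "Document B": [0.40, 0.30, 0.20, 0.10],
    "Document C": [0.25, 0.25, 0.25, 0.25],
}

# Standard deviation of each document's topic weights.
spread = {name: pstdev(row) for name, row in weights.items()}

# Highest SD first: the top entries are the single-topic candidates.
ranked = sorted(spread, key=spread.get, reverse=True)
for name in ranked:
    print(f"{name}: SD = {spread[name]:.2f}")
```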
38/41
Selecting Multi-Topic Proposals
(Figure labels: education, technology, interactive environment)
39/41
Selecting No-Topic Proposals
40/41
Recap
• Objective: To discover interdisciplinary and potentially innovative research proposals
• Parallel Topics – data-centric approach
• Approach: To support interactive selection of proposals based on their number of topics
41/41
Questions and Comments?
Thank you!!
[email protected]
www.cs.tufts.edu/~remco