metro maps of
![Page 1: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/1.jpg)
Metro Maps of
Dafna Shahaf
Carlos Guestrin
Eric Horvitz
![Page 2: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/2.jpg)
“The abundance of books is a distraction.”
Lucius Annaeus Seneca
4 BC – 65 AD
![Page 3: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/3.jpg)
… and it does not get any better
• 129,864,880 Books (Google estimate)
• Research:
– PubMed: 19 million papers (One paper added per minute!)
– Scopus: 40 million papers
![Page 4: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/4.jpg)
Papers
Innovative Papers
![Page 5: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/5.jpg)
So, you want to understand a research topic…
Now what?
![Page 6: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/6.jpg)
Search Engines are Great
• But do not show how it all fits together
![Page 7: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/7.jpg)
Timeline Systems
![Page 8: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/8.jpg)
Research is not Linear
![Page 9: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/9.jpg)
Metro Map
• A map is a set of lines of articles
• Each line follows a coherent narrative thread
• Temporal Dynamics + Structure
[Figure: example metro map with labels austerity, bailout, junk status, Germany, protests, strike, labor unions, Merkel]
![Page 10: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/10.jpg)
Map Definition
• A map M is a pair (G, P) where
– G = (V, E) is a directed graph
– P is a set of paths in G (metro lines)
– Each e ∈ E must belong to at least one metro line
[Figure: example metro map with labels austerity, bailout, junk status, protests, strike, Germany, labor unions, Merkel]
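The map definition can be checked mechanically. The following is a minimal sketch, assuming vertices are hashable labels, edges are (source, target) pairs, and each metro line is a list of vertices; the function name `is_valid_map` and the toy edges between stations are invented for illustration and are not taken from the slides.

```python
def is_valid_map(vertices, edges, lines):
    """Check M = (G, P): every line is a path in G and every edge is on a line."""
    edge_set = set(edges)
    # Every edge must connect two known vertices.
    if any(u not in vertices or v not in vertices for u, v in edge_set):
        return False
    covered = set()
    for line in lines:
        steps = list(zip(line, line[1:]))
        # A metro line may only follow existing directed edges.
        if any(step not in edge_set for step in steps):
            return False
        covered.update(steps)
    # Each edge must belong to at least one metro line.
    return covered == edge_set


# Hypothetical toy map: two lines that meet at "protests".
V = {"bailout", "austerity", "protests", "strike"}
E = [("bailout", "austerity"), ("austerity", "protests"), ("strike", "protests")]
P = [["bailout", "austerity", "protests"], ["strike", "protests"]]
print(is_valid_map(V, E, P))
```

Dropping the second line from `P` leaves the edge from "strike" to "protests" uncovered, so the check fails.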
![Page 11: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/11.jpg)
Game Plan
Objective
Algorithm
Does it work?
![Page 12: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/12.jpg)
Properties of a Good Map
1. Coherence
???
![Page 13: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/13.jpg)
Coherence: Main Idea
Connecting the Dots [S, Guestrin, KDD’10]
Coherence is not a property of local interactions:
[Figure: chain of articles 1 to 5 with word labels Greece, Europe, Italy, Republican, Protest, Debt default]
Incoherent: Each pair shares different words
![Page 14: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/14.jpg)
Coherence: Main Idea
Connecting the Dots [S, Guestrin, KDD’10]
A more-coherent chain:
[Figure: chain of articles 1 to 5 with word labels Greece, Austerity, Italy, Republican, Protest, Debt default]
Coherent: a small number of words captures the story
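The contrast between the two chains can be illustrated with a toy score. This is only a sketch under simplifying assumptions, not the formulation from the cited paper: each article is a set of words, at most `budget` words may be "active" for the whole chain, and the score is the weakest link, that is, the smallest number of active words shared by any consecutive pair. The function name and the example articles are invented.

```python
from itertools import combinations


def chain_coherence(chain, budget):
    """Best weakest-link score achievable with at most `budget` active words."""
    if len(chain) < 2:
        return 0
    vocab = sorted(set().union(*chain))
    best = 0
    for k in range(1, budget + 1):
        for active in combinations(vocab, k):
            active = set(active)
            # Weakest link: fewest active words shared by a consecutive pair.
            weakest = min(len(active & a & b) for a, b in zip(chain, chain[1:]))
            best = max(best, weakest)
    return best


# Each consecutive pair shares a word, but a different one each time.
incoherent = [{"greece", "europe"}, {"europe", "italy"},
              {"italy", "protest"}, {"protest", "republican"}]
# One word runs through the whole story.
coherent = [{"greece", "debt"}, {"greece", "debt", "austerity"},
            {"greece", "austerity", "protest"}, {"greece", "protest"}]
print(chain_coherence(incoherent, budget=2), chain_coherence(coherent, budget=1))
```

The search is exhaustive over word subsets, so it is only meant for tiny examples; the point is that local overlap alone does not make the first chain score well under a small word budget.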
![Page 15: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/15.jpg)
Words are too Simple
[Figure: chain of papers 1 to 3 with word labels Probability, Network, Cost and topics Sensor networks, Bayesian networks, Social networks]
![Page 16: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/16.jpg)
Using the Citation Graph
• Create a graph per word
– All papers mentioning the word
– Edge weight = strength of influence [El-Arini, Guestrin KDD’11]
[Figure: citation graph over papers 1 to 9 for the word “Network”]
Where did paper 8 get the idea?
Do papers 8 and 9 mean the same thing?
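A per-word graph of this kind could be sketched as follows. The assumptions are loud ones: citations are given as (cited, citing) pairs, edge weights (the strength of influence) are omitted because the slides do not specify them, and the paper numbers and word sets below are invented toy data.

```python
def word_graph(mentions, citations, word):
    """Keep only the papers mentioning `word` and the citations among them."""
    nodes = {p for p, words in mentions.items() if word in words}
    edges = {(src, dst) for src, dst in citations if src in nodes and dst in nodes}
    return nodes, edges


def ancestors(edges, paper):
    """All papers from which `paper` can be reached along (cited, citing) edges."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
    seen, stack = set(), [paper]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


# Hypothetical data: paper -> words it mentions, and (cited, citing) pairs.
mentions = {1: {"network", "sensor"}, 2: {"network", "bayesian"},
            5: {"kernel"}, 8: {"network", "sensor"}, 9: {"network", "bayesian"}}
citations = [(1, 8), (2, 9), (5, 8)]
nodes, edges = word_graph(mentions, citations, "network")
print(ancestors(edges, 8), ancestors(edges, 9))
```

In this toy data papers 8 and 9 trace the word back to disjoint sets of ancestors, which hints that they may use it in different senses.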
![Page 17: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/17.jpg)
Words are too Simple
[Figure: chain of papers 1 to 3 with word labels Probability, Network, Cost and topics Sensor networks, Bayesian networks, Social networks]
Incoherent
![Page 18: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/18.jpg)
Properties of a Good Map
1. Coherence
Is it enough?
![Page 19: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/19.jpg)
Max-coherence Map
Query: Reinforcement Learning
![Page 20: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/20.jpg)
Properties of a Good Map
1. Coherence
2. Coverage
Should cover diverse topics important to the user
![Page 21: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/21.jpg)
Coverage: What to Cover?
• Perhaps words?
• Not enough:
1. SVM in Oracle Database 10g, Milenova et al., VLDB '05
2. Support Vector Machines in Relational Databases, Ruping, SVM '02
![Page 22: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/22.jpg)
Similar Content
[Figure: papers 1 and 2 compared]
![Page 23: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/23.jpg)
Different Impact
Citing Venues and Authors:
• Affected more authors/venues
• Very little intersection
[Figure: citing venues and authors of papers 1 and 2]
![Page 24: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/24.jpg)
What to Cover?
• Instead of words…
• Cover papers
• A paper covers papers that it had an impact on
• High-coverage map: impact on a lot of the corpus
• Why descendants?
• Soft notion: [0,1]
![Page 25: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/25.jpg)
p has High Impact on q if…
• Many paths (especially short)
• coherent
[Figure: citation paths between p and q, some through r, with example citing sentences “Note that our protocol is different from previous work…” and “We use the algorithm of…”]
Formalize with coherent random walks
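The "many paths, especially short" intuition can be sketched with a plain random walk. This is an assumption-laden stand-in, not the coherent random walk of the slides: the walk starts at q, steps to a uniformly chosen cited paper, survives each step with probability `1 - stop`, and the impact of p on q is the probability of reaching p. The coherence requirement is left out, the citation graph is assumed acyclic, and all names and data are invented.

```python
from functools import lru_cache


def impact(cites, p, q, stop=0.5):
    """Probability that a decaying random walk from q over citations reaches p."""

    @lru_cache(maxsize=None)
    def reach(paper):
        if paper == p:
            return 1.0
        cited = cites.get(paper, ())
        if not cited:
            return 0.0
        # Survive one step, then average over the papers that `paper` cites.
        return (1.0 - stop) * sum(reach(c) for c in cited) / len(cited)

    return reach(q)


# Hypothetical citations: q cites r and p, r cites p.
cites = {"q": ["r", "p"], "r": ["p"], "p": []}
print(impact(cites, "p", "q"))
```

The score lies in [0, 1], grows when more of q's citation paths lead to p, and shrinks with path length because every step is discounted.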
![Page 26: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/26.jpg)
Map Coverage
• Documents cover pieces of the corpus:
[Figure: corpus with the pieces covered by the map's documents]
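One common way to combine soft per-document coverage into a map-level score is a probabilistic "at least one document covers it" rule. The sketch below assumes that rule, which the slides do not spell out; `cover[(d, q)]` is a hypothetical table of how strongly document d covers corpus paper q, with values in [0, 1].

```python
def map_coverage(map_docs, cover, corpus):
    """Average over the corpus of 1 - prod(1 - cover[d, q]) for d in the map."""
    total = 0.0
    for q in corpus:
        miss = 1.0
        for d in map_docs:
            # Chance that q is still uncovered after considering d.
            miss *= 1.0 - cover.get((d, q), 0.0)
        total += 1.0 - miss
    return total / len(corpus)


# Hypothetical soft coverage values.
cover = {("a", "x"): 0.5, ("b", "x"): 0.5, ("a", "y"): 1.0}
corpus = ["x", "y", "z"]
print(map_coverage(["a"], cover, corpus), map_coverage(["a", "b"], cover, corpus))
```

Under this rule, adding a document helps less when the map already covers the same papers, which is the diminishing-returns behavior that makes coverage suitable for submodular optimization.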
![Page 27: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/27.jpg)
High-coverage, Coherent Map
![Page 28: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/28.jpg)
Properties of a Good Map
1. Coherence
2. Coverage
3. Connectivity
![Page 29: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/29.jpg)
Definition: Connectivity
• Experimented with formulations
• Users do not care about connection type
• Encourage connections between pairs of lines
![Page 30: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/30.jpg)
Lines with No Intersection
Solution: Reward lines that had impact on each other
[Figure: two lines that share no paper: Perceptrons, SVM, Optimizing Kernels for SVM; and Face Detection, SVM for Facial Recognition]
![Page 31: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/31.jpg)
Tying it all Together: Map Objective
• Coherence
– Either coherent or not: Constraint
• Coverage
– Must have!
• Connectivity
– Nice to have
Consider all coherent maps with maximum possible coverage. Find the most connected one.
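The objective reads as a lexicographic selection: filter by coherence, maximize coverage, then break ties by connectivity. The sketch below spells that out over an explicit list of candidate maps, which is an assumption made for illustration only (a real candidate space would be far too large to enumerate). The three scoring functions are caller-supplied placeholders, and `shared_paper_pairs` is a simplified stand-in for connectivity that only counts lines sharing a paper.

```python
from itertools import combinations


def shared_paper_pairs(lines):
    """Stand-in connectivity: number of pairs of lines sharing a paper."""
    return sum(1 for a, b in combinations(lines, 2) if set(a) & set(b))


def best_map(candidates, is_coherent, coverage, connectivity, tol=1e-9):
    """Among coherent maps with maximum coverage, return the most connected."""
    coherent = [m for m in candidates if all(is_coherent(line) for line in m)]
    if not coherent:
        return None
    top = max(coverage(m) for m in coherent)
    finalists = [m for m in coherent if coverage(m) >= top - tol]
    return max(finalists, key=connectivity)


# Toy stand-ins: short lines count as coherent, coverage counts distinct papers.
candidates = [
    (("a", "b"), ("c", "d")),
    (("a", "b", "c"), ("c", "d")),
    (("a", "b", "c", "d", "e"),),
    (("a", "b"), ("b", "c")),
]
chosen = best_map(
    candidates,
    is_coherent=lambda line: len(line) <= 3,
    coverage=lambda m: len({p for line in m for p in line}),
    connectivity=shared_paper_pairs,
)
print(chosen)
```

The third candidate has the highest raw coverage but is rejected by the coherence constraint; of the two remaining maps with four covered papers, the one whose lines intersect wins.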
![Page 32: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/32.jpg)
Game Plan
Objective
Algorithm
Does it work?
![Page 33: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/33.jpg)
Approach Overview
Documents D
1. Coherence graph G
2. Coverage function f
f( ) = ?
3. Increase Connectivity
![Page 34: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/34.jpg)
Coherence Graph: Main Idea
• Vertices correspond to short coherent chains
• Directed edges between chains which can be conjoined and remain coherent
[Figure: vertices for the chains 1 2 3, 4 5 6, and 5 8 9; joining 1 2 3 with 5 8 9 gives the chain 1 2 3 5 8 9]
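The construction can be sketched as below, assuming short chains are tuples of article ids and that a coherence test is supplied by the caller. The predicate used in the example (article ids must strictly increase) is a toy placeholder chosen so that the result mirrors the figure; it is not the coherence measure from the talk.

```python
def coherence_graph(chains, is_coherent):
    """Vertices are short chains; edge u -> v if the joined chain stays coherent."""
    edges = {}
    for u in chains:
        edges[u] = [v for v in chains if v != u and is_coherent(u + v)]
    return edges


def path_to_chain(path):
    """Concatenate the short chains along a graph path into one long chain."""
    chain = []
    for vertex in path:
        chain.extend(vertex)
    return tuple(chain)


def increasing(chain):
    """Toy placeholder for a coherence test: ids strictly increase."""
    return all(a < b for a, b in zip(chain, chain[1:]))


chains = [(1, 2, 3), (4, 5, 6), (5, 8, 9)]
graph = coherence_graph(chains, increasing)
print(graph[(1, 2, 3)], path_to_chain([(1, 2, 3), (5, 8, 9)]))
```

Every path in the resulting graph then stands for a longer candidate chain, which is what lets the next step search over paths instead of over raw article sequences.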
![Page 35: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/35.jpg)
Finding High-Coverage Chains
• Paths correspond to coherent chains.
• Problem: find a path of length K maximizing coverage of underlying articles
[Figure: coherence graph with vertices 1 2 3, 4 5 6, and 5 8 9; is Cover(1 2 3 4 5 6) > Cover(1 2 3 5 8 9)?]
![Page 36: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/36.jpg)
Reformulation
• Paths correspond to coherent chains.
• Problem: find a path of length K maximizing coverage of underlying articles
• Submodular orienteering
– [Chekuri and Pal, 2005]
– Quasipolynomial time recursive greedy
– O(log OPT) approximation
Orienteering: the reward is a function of the nodes visited
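To make the problem statement concrete, here is an exhaustive-search baseline for tiny graphs. It is deliberately not the recursive greedy algorithm of Chekuri and Pal cited above; it simply enumerates every simple path with K vertices (taking "length K" to mean K vertices, an assumption) and scores it with a set function of the visited nodes. The graph, the `covers` table, and the reward are invented toy data.

```python
def best_path(edges, reward, length):
    """Try every simple path with `length` vertices; return (path, reward)."""
    best, best_value = None, float("-inf")

    def extend(path):
        nonlocal best, best_value
        if len(path) == length:
            value = reward(path)
            if value > best_value:
                best, best_value = list(path), value
            return
        for nxt in edges.get(path[-1], ()):
            if nxt not in path:
                path.append(nxt)
                extend(path)
                path.pop()

    for start in edges:
        extend([start])
    return best, best_value


# Hypothetical coherence graph and per-article coverage sets.
edges = {(1, 2, 3): [(4, 5, 6), (5, 8, 9)], (4, 5, 6): [], (5, 8, 9): []}
covers = {1: {"a"}, 2: {"a", "b"}, 3: {"b"}, 4: {"b"},
          5: {"c"}, 6: {"c"}, 8: {"d"}, 9: {"e"}}


def reward(path):
    """Number of distinct corpus items covered by the articles on the path."""
    return len(set().union(*(covers[a] for vertex in path for a in vertex)))


print(best_path(edges, reward, length=2))
```

Enumeration is exponential in K, which is why an approximation algorithm with a guarantee is needed for real corpora.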
![Page 37: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/37.jpg)
Approach Overview: Recap
Documents D
1. Coherence graph G
– Encodes all coherent chains as graph paths
2. Coverage function f
– Submodular orienteering [Chekuri & Pal, 2005]
– Quasipolynomial time recursive greedy
– O(log OPT) approximation
3. Increase Connectivity
![Page 38: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/38.jpg)
Example Map: Reinforcement Learning
[Figure: map with five lines, each labeled by its keywords]
• multi-agent cooperative joint team
• mdp states pomdp transition option
• control motor robot skills arm
• bandit regret dilemma exploration arm
• q-learning bound optimal rmax mdp
![Page 39: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/39.jpg)
Example Map Detail: SVM
![Page 40: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/40.jpg)
Game Plan
Objective
Algorithm
Does it work?
![Page 41: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/41.jpg)
User Study
• Tricky!
– No double-blind, no within-subject
– Domain: understandable yet unfamiliar
– Reinforcement Learning (RL)
![Page 42: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/42.jpg)
User Study
• 30 participants
• First-year grad student, Reinforcement Learning project
• Update a survey paper from 1996
• Identify research directions + relevant papers
– Google Scholar
– Map and Google Scholar
– Baselines: Map, Wikipedia
![Page 43: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/43.jpg)
Results (in a nutshell)
[Figure: two bar charts comparing Google and Us; axis label: Better]
Map users find better papers, and cover more important areas
![Page 44: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/44.jpg)
User Comments
• Helpful
• noticed directions I didn't know about
• great starting point
• … get a basic idea of what science is up to
• why don't you draw words on edges?
• Legend is confusing
• hard to get an idea from paper title alone
![Page 45: Metro Maps of](https://reader035.vdocuments.mx/reader035/viewer/2022062521/56816772550346895ddc60b1/html5/thumbnails/45.jpg)
Conclusions
• Formulated metrics characterizing good maps for the scientific domain
• Efficient methods with theoretical guarantees
• User studies highlight the promise of the method
• Website on the way!
• Personalization
Thank you!